Thank you everyone for coming. If you were expecting the Postgres talk, that was the one before, so you might need to watch the video stream instead.

So, Ansible best practices. I thought about calling it "Ansible: my best practices", so fair warning ahead: these are things I stumbled on while using Ansible over the last two to three years, and they are very specific things I found that worked very well for me.

About me: I also do freelance work, with a lot of Ansible in it, and I'm the Debian maintainer for Ansible together with Harlan Lieberman-Berg. If there are any bugs in the package, just report them.

The talk is roughly divided into four parts. The first part is about why you actually want to use configuration management at all, and why you specifically want to use Ansible. If you're still SSHing into machines and editing config files by hand, you're probably a good candidate for using Ansible. The second part is about role and playbook patterns that I have found work really well for me. The third is about typical antipatterns I've stumbled upon, either in my work with other people using Ansible or in the IRC support channel, for example. The fourth part is advanced tips and tricks, fun things you can do with Ansible.

A quick elevator pitch: what makes configuration management good? It also serves as documentation of the changes on your servers over time. If you put the whole configuration in a git repository and commit regularly, then when someone asks "Why doesn't this work? It used to work a year ago", you can actually check why. Most configuration management tools also have much better error reporting than your self-written bash scripts that do whatever. And you usually get very good reproducibility, and idempotency, meaning that if you run, for example, a playbook several times, you always get the same result. It's also great if you work in a small team, or if you're the admin
in the company and a few other people also work on things: it makes teamwork a lot easier, and you save a lot of time debugging when things break.

What makes Ansible good? Compared to Chef or Puppet, for example, it's really easy to set up: you start with two config files, you install it, and you're ready to go. It's also agentless, so the only things the machines you want to control really need are an SSH daemon and Python 2.6+, which covers virtually any Debian machine you have installed that is still supported in any way. Ansible also supports configuring many other things, like networking equipment or even Windows machines; those don't use SSH but WinRM. Ansible came a bit late to the game, though, so its coverage is still not as good as, for example, Puppet, which can configure practically any machine on the planet as long as it has a CPU.

Next, I will talk about good role patterns. If you've never worked with Ansible before, this is the point where you pause the video stream, work with Ansible for a few weeks, and then unpause.

A good role should ideally have the following layout. In the "roles" directory, you have the name of the role, and inside that, tasks/main.yml, with roughly this order of tasks. At the beginning of the role, you check preconditions, for example using the "assert" task to check that certain variables are defined, that things are set, that the host is maybe part of a certain group, whatever you want to verify. Then you usually install packages, using apt, or yum on CentOS machines, or you do a git checkout, or whatever. Then you usually template config files: the variables provide the abstraction, and they get rendered into the template to produce the actual config file.
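As a sketch, that layout might look like the following. The role name, variable names, and package are illustrative, not from the talk; the modules (assert, apt, template) are the real ones mentioned:

```yaml
# roles/webserver/tasks/main.yml -- "webserver" and its variables are illustrative
- name: Check preconditions
  assert:
    that:
      - webserver_domain is defined
      - "'webservers' in group_names"

- name: Install packages
  apt:
    name: nginx
    state: present

- name: Render the config file from a template
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: restart nginx
```

The `group_names` magic variable holds the groups the current host belongs to, which is one way to express the "part of a group" check.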
It's also good to point out that the template module has a "validate" parameter. That means you can run a command to check the rendered config file for syntax errors, and if that check fails, your playbook fails before the config file is actually deployed. You can, for example, run Apache with the right parameters to syntax-check the file. That way, you never end up in a state with a broken config. At the end, when you change things, you trigger handlers to restart the affected daemons.

If you use variables, I recommend putting sensible defaults in defaults/main.yml; then you only have to override those variables in specific cases. Ideally, the defaults alone should be enough to get whatever you want up and running.

When you work with this a bit more, you notice a few things. First, your role should ideally run in "check mode": "ansible-playbook --check" is basically a dry run of your complete playbook, and with --diff it will show you, for example, file content changes or file mode changes, without actually changing anything. So if you end up editing a lot of stuff, you can use that as a check. I'll get to some antipatterns later that break this.

Second, you should write your changes to files, configs, and states in such a way that once they are deployed, a second run reports no changes at all. If you write your roles sloppily, you end up with, say, 20 changes reported at the end of every run; you know that 18 of them are always there, and you miss the 2 that are important, the ones that actually broke your system. If you want to do it really well, make sure a second consecutive run reports zero changes.
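The validate-plus-handler pattern might look like this. The sudoers check is the canonical example from the Ansible documentation; the handler and daemon name are illustrative:

```yaml
# tasks/main.yml
- name: Install sudoers drop-in, refusing files that fail the syntax check
  template:
    src: sudoers.j2
    dest: /etc/sudoers.d/deploy     # illustrative file name
    mode: '0440'
    validate: 'visudo -cf %s'       # %s is the path of the rendered temp file
  notify: restart some-daemon       # handler name is illustrative

# handlers/main.yml -- runs only when the task above actually changed something
- name: restart some-daemon
  service:
    name: some-daemon
    state: restarted
```

Any checker that takes a file path works the same way; the rendered file is validated before it replaces the live config.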
Also, a thing to consider: you can define variables in the "defaults" folder and also in the "vars" folder, but if you look up variable precedence, you'll notice that variables from the "vars" folder are really hard to override, so you want to avoid them as much as possible.

The next, much larger section is about typical antipatterns I've noticed, and I'll come to the first one now: the shell and command modules. When people start using Ansible, the first thing they think is "Oh well, I know how to use wget" or "I know apt-get install", and they end up using the shell module to do just that. You usually don't want to use the shell or command module, and that's for several reasons. There are currently, I think, around 1300 different modules in Ansible, so there's a good chance that whatever you want to do, there's already a module that does exactly that thing.

But those two modules also have several problems. The shell module's argument gets interpreted by your actual shell, so if there are any special characters in there, you also have to take care of everything the shell interprets in that string. Then, one of the biggest problems: if you run your playbook in check mode, shell and command tasks don't get run at all, they just get skipped, so if you do anything substantial with them, your check mode and your real runs start diverging. The worst part is that these two modules always report "changed": you run a command, it exits 0, and Ansible says "Oh, it changed". To get the reporting right, you have to define for yourself what counts as a change, for example by registering the output and checking stderr for an actual error or change. Now I'll get to the actual examples.
On the left is a bad example of using the shell module; I've seen this a lot. It's basically: "I want the contents of this file, so I'll just run 'cat /path/file' and use the register parameter to capture the output." The output goes into "shell_cmd", and then we want to copy it to some other file somewhere else, so we use Jinja's "{{ }}" to define the content of the destination file. That is problematic because, first of all, in check mode this task gets skipped, the variable is undefined, and Ansible fails with an error, so you can't run the playbook in check mode at all. The other problem is that it always reports "changed", so the most sensible thing would probably be to add "changed_when: false" and acknowledge that this shell command never changes anything on the system.

The good example is to use the "slurp" module, which reads the whole file and base64-encodes it. You access the content as "path_file.content", base64-decode it, and write it to the destination. The nice thing is that slurp never reports a change, and it also works great in check mode.

Here's another quick example. On the left: wget again. The problem is that every time your playbook runs, this file gets downloaded, and of course if the file can't be retrieved from that URL, it throws an error, every single time. On the right is a cleaner version using the uri module: you define the URL to retrieve the file from, where you want to write it to, and the "creates" parameter to say "just skip the whole task if the file is already there".

Now "set_fact", my pet peeve. set_fact is a module that allows you to define variables during the playbook run, so you can say: set_fact, and this variable equals that variable plus a third variable, or whatever; you can do things with that.
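Since the slides aren't in the transcript, here is my reconstruction of the two pairs of examples; the file paths and URL are illustrative:

```yaml
# Anti-pattern: skipped in --check mode, and always reports "changed"
- shell: cat /path/file
  register: shell_cmd
- copy:
    content: "{{ shell_cmd.stdout }}"
    dest: /other/file

# Better: slurp works in check mode and never reports a change
- slurp:
    src: /path/file
  register: path_file
- copy:
    content: "{{ path_file.content | b64decode }}"
    dest: /other/file

# Anti-pattern: downloads (and can fail) on every single run
- shell: wget https://example.com/pkg.tar.gz -O /tmp/pkg.tar.gz

# Better: skip the task entirely once the file exists
- uri:
    url: https://example.com/pkg.tar.gz
    dest: /tmp/pkg.tar.gz
    creates: /tmp/pkg.tar.gz
```

The dedicated get_url module is another common choice for the download case; the talk uses uri with "creates", which achieves the same skip-if-present behaviour.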
It's very problematic, though, because your variables change during the playbook run, and that is a problem when you use the "--start-at-task" parameter of ansible-playbook. That parameter allows you to skip forward to a certain task in a role, skipping everything before it and continuing from there, which is really great for debugging; but if a variable is defined with set_fact and you skip over that task, the variable is simply not defined. If you use set_fact heavily, that makes prototyping really horrible. Another point: you can run "ansible -m setup" against a hostname to check which facts are actually defined for a specific host, and everything set with set_fact just isn't there.

In summary: avoid the shell module, avoid the command module, avoid set_fact as much as you can, and don't hide changes with "changed_when". The clean approach is always one task to check something and then a second task to actually execute something. Also a bad idea, in my opinion, is when people say "it's not important whether this throws an error, I'll just set 'failed_when: false'". That might work sometimes, but the problem is that if something really breaks, you'll never find out.

Advanced topics. This one is about templating. The usual approach, for example for a postfix role, would be the following. You define certain variables in, for example, group_vars/postfix_servers, so any host in that group inherits them; here that's a list of parameters for smtpd_recipient_restrictions, and this is just the smtpd_helo_required flag. So the usual approach is to define variables in host_vars or group_vars, or even in the defaults, and then have a template that checks every single variable: if it exists, the template puts the actual value in place.
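The check-then-act pattern from the summary can be sketched like this; the command names are hypothetical, and `check_mode: false` (so the read-only check also runs during a dry run) is my addition:

```yaml
# Step 1: a read-only check, explicitly marked as never changing anything
- name: Check whether the cache needs rebuilding
  command: myapp --cache-status     # illustrative command
  register: cache_status
  changed_when: false
  check_mode: false                 # safe to run even with --check

# Step 2: act only when the check said so; this task reports
# "changed" honestly, because it really does change the system
- name: Rebuild the cache
  command: myapp --rebuild-cache    # illustrative command
  when: "'stale' in cache_status.stdout"
```

This keeps check mode and real runs in sync and keeps the change reporting truthful, instead of papering over it with changed_when or failed_when.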
Here, I check whether this variable is set to true; if yes, put this string there, else put that string there. And for smtpd_recipient_restrictions, for example, I just iterate over the array and output the values, in order, into that list. The problem here is that every time upstream defines a new setting, you end up having to touch both the actual template file and the actual variables. So I thought: "Well, you have keys and values, strings and arrays and hashes on one side, and a config file is actually nothing but that, just in a different format."

So I came up with this. With Jinja2, you can also define functions (macros). I'll have to cut the explanation a bit short, but basically, up here a function is defined, and it's called down here at the bottom. All it does is iterate over the whole dictionary defined here, "postfix.main": it iterates over all the keys and values, and if a value is a string, it just emits "key = value"; if it's an array, it iterates over it and emits the elements in the format that postfix actually wants. You can do the same for haproxy, for example, and just serialize all the variables you defined.

The advantages of this: your template file stays the same and doesn't get messy as you add things. You have complete whitespace control; usually when you edit these files, you end up with an extra space or newline somewhere, and that changes the rendered files for all machines. You have all the settings in alphabetical order, so when you run it and look at the diff, things don't jump back and forth. And once you get the syntax of the template file right, you never have to touch it again, so you also can't introduce syntax errors by editing it.

That leads to the next one. You can set "hash_behaviour = merge" in the Ansible config, and that allows you to do the following.
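A sketch of the idea: the variable names follow the talk, but the macro body, filters, and whitespace handling are my reconstruction, not the speaker's exact slide:

```yaml
# group_vars/postfix_servers.yml -- data only, no formatting
postfix:
  main:
    smtpd_helo_required: true
    smtpd_recipient_restrictions:
      - permit_mynetworks
      - reject_unauth_destination
```

```jinja
{# templates/main.cf.j2 -- stays untouched when new settings are added #}
{% macro postfix_config(settings) %}
{% for key, value in settings | dictsort %}
{% if value is sameas true or value is sameas false %}
{{ key }} = {{ 'yes' if value else 'no' }}
{% elif value is string or value is number %}
{{ key }} = {{ value }}
{% else %}
{{ key }} = {{ value | join(', ') }}
{% endif %}
{% endfor %}
{% endmacro %}
{{ postfix_config(postfix.main) }}
```

The `dictsort` filter gives the alphabetical ordering mentioned as an advantage; in practice you would add Jinja2's whitespace-control markers (`{%- … -%}`) to get clean output.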
On the left here, you define a dictionary, say at group level, and then for a specific machine you define another setting in the same dictionary. Without merge, the second definition would simply override the first one and you'd end up with only that; with merge, Ansible does a deep merge of the hashes. The templating approach I just showed benefits a lot from this, so the combination of the two is really good. I'll skip the rest.

Further resources: Ansible has really good documentation, there's the IRC channel, and there's also DebOps, a project specific to Debian and derivatives.

That's it. [Applause] Thank you very much.