Thank you everyone for coming.
If you were expecting the Postgres talk,
that was the one before, so
you might need to watch the video stream.
So, Ansible best practices,
I thought about calling it "Ansible,
my best practices",
so, just a warning ahead: these are things I stumbled on using Ansible over the last two to three years, and they are very specific things that I found worked very well for me.
About me: I also do freelance work, with a lot of Ansible in there, and I'm the Debian maintainer for Ansible together with Harlan Lieberman-Berg.
If there are any bugs in the package,
just report them.
The talk will be roughly divided into
4 parts.
The first part will be about why you
actually want to use config management
and why you specifically want to use
Ansible.
So, if you're still SSHing into machines
and editing config files,
you're probably a good candidate
for using Ansible.
Then, the second part will be about good
role and playbook patterns
that I have found that work really well
for me.
The third chapter will be about typical
antipatterns I've stumbled upon,
either in my work with other people
using Ansible,
or the IRC support channel, for example.
The fourth part will be advanced tips and tricks, fun things you can do with Ansible.
Quick elevator pitch, what makes config
management good?
It also serves as documentation of the changes on your servers over time, so if you just put the whole config management in a git repo and commit regularly, you will actually be able to ask "Why doesn't this work? It used to work a year ago" and then go check why.
Also, most config management tools have
a lot better error reporting than
your self-written bash scripts that do
whatever.
And usually, you have a very good
reproducibility with config management
and also idempotency, meaning that if you
run, for example, a playbook several times
you will always get the same result.
Also, it's great if you work in a small team, or you admin machines in a company and you have some other people working on a few things too.
It makes team work a lot easier and
you will save a lot of time actually
debugging things when things break.
What makes Ansible good?
Comparing it to Chef or Puppet for example
it's really easy to set up,
you start with two config files, you have
it installed and you're ready to go.
It's also agentless, so whatever machines you actually want to control, the only thing they really need to have is an SSH daemon and Python 2.6 or newer, so that's virtually any Debian machine you have installed.
Ansible also supports configuration
of many things like
networking equipment or even Windows
machines,
they don't need SSH, they use WinRM instead. But Ansible came a bit late to the game, so it's still not as good in coverage as, for example, Puppet, with which you can literally configure any machine on the planet, as long as it has a CPU.
Next step, I will talk about good
role patterns.
If you've never worked with Ansible
before,
this is the point where, if you're watching the video stream, you pause it, go work with Ansible for a few weeks, and then unpause the video.
A good role should ideally have
the following layout.
So, in the "roles" directory, you have the name of the role and then tasks/main.yml. That file has the following rough layout.
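To give a rough picture, a standard role tree looks something like this (the postfix name and the extra files are just the usual convention, not something this talk prescribes):

    roles/
      postfix/
        tasks/main.yml        # the checks, package installs and templating described below
        handlers/main.yml     # handlers that restart services on change
        templates/main.cf.j2  # Jinja2 templates
        defaults/main.yml     # sensible default variables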
At the beginning of the role, you check for various conditions, for example using the "assert" task to check that certain variables are defined, that things are set, that the host is maybe part of a certain group, things like that which you actually want to check.
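A minimal sketch of such a check could look like this (the variable and group names are made up for illustration):

    - name: check preconditions
      assert:
        that:
          - mail_domain is defined
          - "'postfix_servers' in group_names"
        msg: "mail_domain must be set and the host must be in postfix_servers"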
Then, usually, you install packages; you can use apt, or yum on CentOS machines, or you can do a git checkout or whatever. Then usually you do some templating of files, where you have a certain abstraction and the variables are actually put into the template to make the actual config file.
It's also good to point out that the template module actually has a "validate" parameter, which means you can use a command to check your config file for syntax errors, and if that fails, your playbook will fail before actually deploying the config file. So you can, for example, run Apache with the right parameters to do a syntax check on the file. That way, you never end up in a state where there's a broken config.
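As a sketch, this is the classic sudoers example from the Ansible documentation; for Apache or postfix you would substitute the matching syntax-check command:

    - name: deploy sudoers drop-in
      template:
        src: sudoers.j2
        dest: /etc/sudoers.d/deploy
        validate: 'visudo -cf %s'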
In the end, usually, when you change things, you trigger handlers to restart any affected services.
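Sticking with postfix as an example (paths and names are just illustrative), that looks roughly like this:

    # tasks/main.yml
    - name: deploy postfix main.cf
      template:
        src: main.cf.j2
        dest: /etc/postfix/main.cf
      notify: restart postfix

    # handlers/main.yml
    - name: restart postfix
      service:
        name: postfix
        state: restarted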
If you use variables, I recommend putting
sensible defaults in
defaults/main.yml
and then you only have to override those variables in specific cases. Ideally, the defaults alone should be enough to get whatever you want up and running.
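For instance, something like this in defaults/main.yml (the variable names are invented for illustration):

    # defaults/main.yml
    postfix_relayhost: ""             # no relay host unless a host overrides this
    postfix_smtpd_helo_required: true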
When you start working with it a bit more, you notice a few things, and one is that your role should ideally run in "check mode". "ansible-playbook" has --check, which is basically just a dry run of your complete playbook, and with --diff it will actually show you, for example, file changes or file mode changes, stuff like that, without actually changing anything.
So if you end up editing a lot of stuff,
you can use that as a check.
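So a typical dry run looks like this, with site.yml standing in for whatever your playbook is called:

    ansible-playbook site.yml --check --diff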
I'll later get to some antipatterns that
actually break that thing.
And, ideally, the way you change files
and configs and states,
you should make sure that when the actual
changes are deployed,
and you run it a second time,
that Ansible doesn't report any changes
because if you write your roles fairly sloppily, you end up with a lot of changes, and then at the end of the report you have, like, 20 changes reported, and you know that 18 of them are always there, and you miss the 2 that are important, the ones that actually broke your system.
If you want to do it really well, you make
sure that it doesn't report any changes
when you run it twice in a row.
Also, a thing to consider is you can define
variables in the "defaults" folder
and also in the "vars" folder,
but if you look up how variables get
inherited, you'll notice that
the "vars" folder is really hard to
actually override,
so you want to avoid that as much as
possible.
The next, much larger section will be about typical anti-patterns I've noticed, and I'll come to the first one now.
It's the shell or command module.
When people start using Ansible, that's
the first thing they go
"Oh well, I know how to use wget or I know
'apt-get install' "
and then they end up using the shell module
to do just that.
You usually don't want to use the shell module or the command module, and that's for several reasons.
There's currently, I think, 1300 different
modules in Ansible
so there's a good chance that whatever you want to do, there's already a module that just does that thing.
But those two modules also have several problems. The shell module, of course, gets interpreted by your actual shell, so if you have any special variables in there, you also have to take care of how they get interpreted in the shell string.
Then, one of the biggest problems is if
you run your playbook in check mode,
the shell and the command modules
won't get run.
So if you're actually doing anything with them, they just get skipped, which means your check mode and the real mode will start diverging if you use the shell module a lot.
The worst part about this is that these two modules will always report "changed": you run a command, it exits 0, and it's like "Oh, it changed".
To get the reporting right on that module,
you'd actually have to define for yourself
when this is actually a change or not.
So you'd have to probably get the output
and then check, for example,
if there's something on stderr or something
to report an actual error or change.
Then I'll get to the actual examples.
On the left is a bad example of using the shell module;
I've seen that a lot, it's basically
"Yeah, I actually want this file, so just
use 'cat /path/file' and I'll use
the register parameter to get the output".
The actual output goes into "shell_cmd", and then we want to copy it to some other file somewhere else, so we use the Jinja "{{ }}" syntax to define the actual content of the file and put it into that destination file.
That is problematic because, first of all
if you run it in check mode,
this gets skipped and then this variable
is undefined and
Ansible will fail with an error, so you
won't be able to actually
run that in check mode.
The other problem is that this will always report a change, so the most sensible thing would probably be to just say "changed_when: false" and acknowledge that this shell command won't change anything on the system.
The good example would be to use the actual "slurp" module, which will just slurp the whole file and base64-encode it; you can access the actual content with "path_file.content" and then just b64decode it and write it there.
The nice thing is slurp will never return
any change, so it won't say it changed
and it also works great in check mode.
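Reconstructed roughly from the slide (paths and register names are placeholders), the two variants look something like this:

    # bad: skipped in check mode, and always reports a change
    - name: read file
      shell: cat /path/file
      register: shell_cmd

    - name: write it somewhere else
      copy:
        content: "{{ shell_cmd.stdout }}"
        dest: /other/path/file

    # good: works in check mode and reports no change by itself
    - name: read file
      slurp:
        src: /path/file
      register: path_file

    - name: write it somewhere else
      copy:
        content: "{{ path_file.content | b64decode }}"
        dest: /other/path/file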
Here's another quick example.
The example on the left, oh yeah wget.
Here's the problem, every time your playbook
runs, this file will get downloaded
and of course if the file can't be
retrieved from that URL
it will throw an error and that will
happen all the time.
The example on the right is cleaner, using the uri module.
You define a URL to retrieve a file from,
you define where you want to write it to
and you use the "creates" parameter to say
"Just skip the whole thing if the file is
already there".
"set_facts", that's my pet peeve.
set_facts is a module that allows you
to define variables
during your playbook run, so you can say
set_facts and then
this variable = that variable + a third
variable or whatever
you can do things with that.
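As a small illustration (the variable names are invented):

    - name: build a value out of other variables at runtime
      set_fact:
        app_url: "https://{{ app_host }}:{{ app_port }}/status"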
It's very problematic, though, because
you end up having your variables
changed during the playbook run
and that is a problem when you use the "--start-at-task" parameter of ansible-playbook. Because this parameter allows you to skip forward to a certain task,
so it skips everything until that point
and then continues running there
and that's really great for debugging
but if you define a variable with set_facts
and you skip over it,
that variable would just not be defined.
If you heavily use set_fact, that makes prototyping really horrible. Another point is that you can use "ansible -m setup" plus the hostname to check what variables are actually defined for a specific host, and everything set with set_fact is just not there.
In summary, avoid the shell module, avoid the command module, avoid set_fact as much as you can, and don't hide changes with "changed_when". The clean approach is always to use one task to check something and then a second task to actually execute something, as sketched below.
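When you really do have to fall back to the command module, that two-task pattern could look roughly like this (the check and load scripts are hypothetical):

    - name: check whether the app schema is already loaded
      command: /usr/local/bin/check-schema   # hypothetical read-only check
      register: schema_check
      changed_when: false
      failed_when: false

    - name: load the app schema
      command: /usr/local/bin/load-schema    # hypothetical, only runs when needed
      when: schema_check.rc != 0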
Also, a bad idea in my opinion is when people say "Oh well, it's not important if this fails, I'll just say 'failed_when: false'".
That might work sometimes, but the problem
there is, if something really breaks,
you'll never find out.
Advanced topics.
This is about the templating.
The usual approach, for example for
postfix role,
would be to do the following templating.
You define certain variables in, for example, group_vars/postfix_servers, so any host in that group would inherit these variables; this one is sort of a list of parameters for smtpd_recipient_restrictions, and this one is just smtpd_helo_required.
So the usual approach would be to
define variables
in the host_vars or group_vars, or even
in the defaults
and then you have a template where you just check every single variable; if it exists, you actually sort of put the actual value there in place. Here, I check if this variable is set to true, and if yes, put this string there, else put that string there,
and for example, smtpd_recipient_restrictions
I just iterate over this array
and just output these values in order
in that list.
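Reconstructed roughly from the slides (the concrete restriction values are placeholders), that conventional approach looks something like this:

    # group_vars/postfix_servers
    postfix_smtpd_helo_required: true
    postfix_smtpd_recipient_restrictions:
      - permit_mynetworks
      - reject_unauth_destination

    # templates/main.cf.j2
    smtpd_helo_required = {{ 'yes' if postfix_smtpd_helo_required else 'no' }}
    smtpd_recipient_restrictions =
    {% for restriction in postfix_smtpd_recipient_restrictions %}
        {{ restriction }}{{ ',' if not loop.last else '' }}
    {% endfor %}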
The problem here is that every time
upstream defines a new variable
you'll end up having to change the actual
template file and touch the actual variables
so I thought, "Well, you actually have keys and values, strings and arrays on one side, and actually a config file is nothing else than that, just in a different format".
So I came up with…
With Jinja2, you can also define functions (macros). I'll have to cut the explanation a bit short, but basically, up here a function is defined and it's called down here at the bottom. What it does is iterate over the whole dictionary defined here, "postfix.main": it goes over all the keys and values, and if the value is a string, it just puts "key = value", and if it's an array, it iterates over it and puts it there in the format that postfix actually wants.
Basically, you can do the same, for
example, for haproxy and
you can just deserialize all the variables
you actually defined.
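A sketch of that idea (not the speaker's exact slide, and with Jinja's whitespace-control markers left out for readability) could look like this:

    # group_vars/postfix_servers
    postfix:
      main:
        smtpd_helo_required: "yes"
        smtpd_recipient_restrictions:
          - permit_mynetworks
          - reject_unauth_destination

    {# templates/main.cf.j2 #}
    {% macro main_cf(settings) %}
    {% for key, value in settings | dictsort %}
    {% if value is string %}
    {{ key }} = {{ value }}
    {% else %}
    {{ key }} = {{ value | join(', ') }}
    {% endif %}
    {% endfor %}
    {% endmacro %}
    {{ main_cf(postfix.main) }}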
The advantages of this are: your template file just stays the same and it doesn't get messy if you start adding things.
You have complete whitespace control,
usually if you edit stuff,
you kind of get an extra space, a new
line in there, and that changes
the template files for all machines.
You have all the settings in alphabetical
order, so if you actually run it and
you see the diff, you don't end up having
things going back and forth.
If you get the syntax in the template file right, you don't have to touch it after that, and you also don't get any syntax errors from editing it.
That leads to the next one. You can actually set "hash_behaviour = merge" in the Ansible config, and that allows you to do the following.
On the left here, you define for example
a dictionary and this is, like, in a group
and then for a specific machine, you define another setting in this dictionary. If you didn't use merge, the second setting would just override the first one and that's all you'd end up with, but if you actually do the merge, it does a deep merge of the hash.
So the previous thing I showed would
actually benefit from that
so the combination of both is really good.
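Sketched out (the hostname and settings are invented), that looks like this:

    # ansible.cfg
    [defaults]
    hash_behaviour = merge

    # group_vars/postfix_servers
    postfix:
      main:
        smtpd_helo_required: "yes"

    # host_vars/mail01
    postfix:
      main:
        relayhost: smtp.example.org

    # with merge, mail01 ends up with both keys in postfix.main;
    # with the default behaviour, the host_vars dict would replace the group one entirely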
I'll skip that.
For resources, Ansible has really good documentation, there's the IRC channel, and there's also debops, which is a project specific to Debian and derivatives.
That's it.
[Applause]
Thank you very much.