Ansible best current practices


0:06 - 0:07

Thank you everyone for coming.
0:08 - 0:12

If you were expecting the Postgres talk,
that was the one before, so
0:12 - 0:15

you might need to watch the video stream.
0:17 - 0:18

So, Ansible best practices,
0:19 - 0:22

I thought about calling it "Ansible,
my best practices",
0:23 - 0:30

so, just a warning ahead: these are things
I stumbled on while using Ansible
0:30 - 0:32

for the last 2-3 years and
0:32 - 0:37

those are very specific things I found
that worked very well for me.
0:39 - 0:46

About me, I do also freelance work,
do a lot of Ansible in there,
0:46 - 0:52

I'm also the Debian maintainer for
Ansible with Harlan Lieberman-Berg
0:54 - 0:58

If there are any bugs in the package,
just report them.
1:06 - 1:10

The talk will be roughly divided into
4 parts.
1:15 - 1:20

The first part will be about why you
actually want to use config management
1:20 - 1:23

and why you specifically want to use
Ansible.
1:24 - 1:30

So, if you're still SSHing into machines
and editing config files,
1:30 - 1:34

you're probably a good candidate
for using Ansible.
1:36 - 1:41

Then, the second part will be about good
roles and playbook patterns
1:42 - 1:44

that I have found that work really well
for me.
1:47 - 1:53

The third chapter will be about typical
antipatterns I've stumbled upon,
1:53 - 1:58

either in my work with other people
using Ansible,
1:58 - 2:01

or the IRC support channel, for example.
2:03 - 2:09

The fourth part will be like advanced
tips and tricks you can use
2:09 - 2:11

like fun things you can do with Ansible.
2:13 - 2:16

Quick elevator pitch, what makes config
management good?
2:18 - 2:25

It actually also serves as documentation
of changes on your servers over time
2:25 - 2:29

so if you just put the whole config
management in a git repo
2:29 - 2:31

and just regularly commit,
2:31 - 2:33

you will actually be able to say
2:33 - 2:36

"Why doesn't this work? It used to work
a year ago"
2:36 - 2:39

You can actually check why.
2:41 - 2:50

Also, most config management tools have
a lot better error reporting than
2:50 - 2:53

your self-written bash scripts that do
whatever.
2:56 - 3:03

And usually, you have very good
reproducibility with config management
3:03 - 3:11

and also idempotency, meaning that if you
run, for example, a playbook several times
3:11 - 3:13

you will always get the same result.
3:15 - 3:24

Also, it's great if you work in a small team
or you admin ??? in the company
3:24 - 3:27

and you have some people working
on a few things too.
3:29 - 3:33

It makes team work a lot easier and
you will save a lot of time actually
3:33 - 3:36

debugging things when things break.
3:38 - 3:39

What makes Ansible good?
3:40 - 3:46

Comparing it to Chef or Puppet for example
it's really easy to set up,
3:46 - 3:50

you start with two config files, you have
it installed and you're ready to go.
3:52 - 3:56

It's also agentless, so whatever machines
you actually want to control,
3:56 - 4:05

the only thing they really need to have
is an SSH daemon and Python 2.6+
4:05 - 4:11

so that's virtually any Debian machine
you have installed and
4:11 - 4:13

that is still supported in any way.
4:15 - 4:22

Ansible also supports configuration
of many things like
4:22 - 4:26

networking equipment or even Windows
machines,
4:26 - 4:31

they don't need SSH but use
WinRM instead,
4:31 - 4:39

but Ansible came a bit late to the game
so Ansible's still not as good
4:39 - 4:41

in coverage as, for example, Puppet,
4:42 - 4:47

which literally, you can configure any
machine on the planet with that,
4:47 - 4:48

as long as it has a CPU.
4:50 - 4:54

Next step, I will talk about good
role patterns.
4:57 - 4:59

If you've never worked with Ansible
before,
4:59 - 5:02

this is the point when you watch
the video stream,
5:02 - 5:06

that you pause it, work with Ansible
for a few weeks
5:06 - 5:08

and then unpause the actual video.
5:13 - 5:18

A good role should ideally have
the following layout.
5:19 - 5:25

So, in the "roles" directory, you have
the name of the role and tasks/main.yml
5:26 - 5:29

You have the following rough layout.
5:32 - 5:39

At the beginning of the role, you check
for various conditions,
5:39 - 5:44

for example using the "assert" module to
check that
5:44 - 5:48

certain variables are defined, things
are set,
5:48 - 5:53

that it's maybe part of a group, things
like that you actually want to check.
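
A minimal sketch of such a precondition check at the top of a role. The role name, variable name, and group name are hypothetical and not taken from the talk:

```yaml
# roles/myrole/tasks/main.yml (role, variable and group names are hypothetical)
- name: Check preconditions before doing anything else
  ansible.builtin.assert:
    that:
      - myrole_listen_port is defined          # a required variable is set
      - "'webservers' in group_names"          # the host is in the expected group
    fail_msg: "myrole needs myrole_listen_port and a host in the webservers group"
```

If either condition is false, the play stops for that host before any package or file is touched.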
5:55 - 6:03

Then, usually, you install packages, you
can use apt, or on CentOS machines, yum
6:04 - 6:05

or you can do a git checkout or
whatever,
6:07 - 6:14

then usually you do some templating of
files where you have certain abstraction
6:14 - 6:19

and the variables are actually put into
the template and
6:19 - 6:21

make the actual config file.
6:22 - 6:27

It's also good to point out that
the template module actually has
6:27 - 6:30

a "validate" parameter,
6:30 - 6:36

that means you can actually use a command
to check your config files for syntax errors
6:36 - 6:44

and if that fails, your playbook will fail
before actually deploying that config file
6:44 - 6:53

so you can for example use Apache with
the right parameters to actually do
6:53 - 6:57

a check on the syntax of the file.
6:57 - 7:02

That way, you never end up with a state
where there's a broken config.
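
As an illustration of the "validate" parameter, here is a sketch that swaps in sshd instead of Apache, because sshd's syntax check is a simple one-liner. The template name is hypothetical, and the service name is assumed to be "ssh" as on Debian:

```yaml
# tasks/main.yml: the rendered file is only installed if sshd accepts it
- name: Deploy sshd_config
  ansible.builtin.template:
    src: sshd_config.j2                  # hypothetical template name
    dest: /etc/ssh/sshd_config
    owner: root
    group: root
    mode: "0644"
    validate: /usr/sbin/sshd -t -f %s    # %s is replaced by the temporary file
  notify: Restart ssh

# handlers/main.yml: runs only if the task above reported a change
- name: Restart ssh
  ansible.builtin.service:
    name: ssh                            # assumed service name (Debian)
    state: restarted
```

If the validation command exits non-zero, the task fails and the existing config file is left in place.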
7:04 - 7:05

In the end, you usually…
7:06 - 7:10

When you change things, you trigger
handlers to restart any daemons.
7:12 - 7:24

If you use variables, I recommend putting
sensible defaults in
7:24 - 7:27

defaults/main.yml
7:28 - 7:35

and then you only have to override
those variables in specific cases.
7:35 - 7:41

Ideally, the sensible defaults are all
you need to get whatever things
7:41 - 7:43

you want to have running.
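
A small sketch of what that can look like. The role name, variable names, and values are hypothetical:

```yaml
# roles/myrole/defaults/main.yml (hypothetical names and values)
myrole_listen_port: 8080
myrole_worker_count: 4

# host_vars/web01.yml would then override only what differs on that host:
# myrole_worker_count: 16
```

Because role defaults have the lowest variable precedence, any group_vars or host_vars entry with the same name wins without further effort.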
7:46 - 7:52

When you start working with it and do that
a bit more,
7:52 - 7:58

you notice a few things and that is
7:58 - 8:02

your role should ideally run in "check mode".
8:02 - 8:08

"ansible-playbook" has --check that
basically is just a dry run of
8:08 - 8:12

your complete playbook
8:12 - 8:18

and with --diff, it will actually show you
for example file changes,
8:18 - 8:21

or file mode changes, stuff like that
8:21 - 8:24

and won't actually change anything.
8:24 - 8:32

So if you end up editing a lot of stuff,
you can use that as a check.
8:32 - 8:37

I'll later get to some antipatterns that
actually break that thing.
8:40 - 8:47

And, ideally, the way you change files
and configs and states,
8:47 - 8:51

you should make sure that when the actual
changes are deployed,
8:51 - 8:53

and you run it a second time,
8:53 - 8:58

that Ansible doesn't report any changes
8:58 - 9:03

because if you write your roles
fairly sloppily, you end up having
9:03 - 9:06

a lot of changes and then,
9:06 - 9:11

at the end of the report, you have like
20 changes reported and
9:11 - 9:15

you kind of then know those 18,
they're always there
9:15 - 9:18

and you kind of miss the 2 that are
important, that actually broke your system
9:18 - 9:25

If you want to do it really well, you make
sure that it doesn't report any changes
9:25 - 9:27

when you run it twice in a row.
9:31 - 9:38

Also, a thing to consider is you can define
variables in the "defaults" folder
9:38 - 9:40

and also in the "vars" folder,
9:41 - 9:46

but if you look up how variables get
inherited, you'll notice that
9:46 - 9:50

the "vars" folder is really hard to
actually override,
9:50 - 9:53

so you want to avoid that as much as
possible.
9:59 - 10:06

The next, much larger section will be about
typical anti-patterns I've noticed
10:06 - 10:10

and I'll come to the first one now.
10:12 - 10:15

It's the shell or command module.
10:17 - 10:20

When people start using Ansible, that's
the first thing they go
10:20 - 10:26

"Oh well, I know how to use wget or I know
'apt-get install' "
10:26 - 10:30

and then they end up using the shell module
to do just that.
10:31 - 10:35

If you use the shell module or the command
module, you usually don't want to use that
10:35 - 10:39

and that's for several reasons.
10:40 - 10:47

There's currently, I think, 1300 different
modules in Ansible
10:47 - 10:51

so there's a good chance that
whatever you want to do,
10:51 - 10:54

there's already a module for that, that
just does that thing.
10:55 - 11:03

But those two modules also have several
problems and that is
11:03 - 11:10

the shell module, of course, gets
interpreted by your actual shell,
11:10 - 11:13

so if you have any special variables
in there,
11:13 - 11:22

you'd actually also have to take care of
any variables you interpret in the shell string.
11:25 - 11:31

Then, one of the biggest problems is if
you run your playbook in check mode,
11:31 - 11:34

the shell and the command modules
won't get run.
11:35 - 11:38

So if you're actually doing anything
with that, they just get skipped
11:38 - 11:48

and that means check mode
and a real run
11:48 - 11:52

will start to diverge if you use
the shell module a lot.
11:56 - 12:01

Another bad part about this
is that these two modules,
12:01 - 12:04

they'll always report changed
12:04 - 12:06

like, you run a command and it exits 0
12:06 - 12:08

it's like "Oh, it changed"
12:11 - 12:18

To get the reporting right on that module,
you'd actually have to define for yourself
12:18 - 12:21

when this is actually a change or not.
12:22 - 12:29

So you'd have to probably get the output
and then check, for example,
12:29 - 12:35

if there's something on stderr or something
to report an actual error or change.
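
A sketch of defining change and failure yourself for a command task. The command path and the "Applied" output marker are hypothetical:

```yaml
- name: Run database migrations              # hypothetical command
  ansible.builtin.command: /usr/local/bin/myapp migrate
  register: migrate_result
  # Report "changed" only when the tool says it did something
  changed_when: "'Applied' in migrate_result.stdout"
  # Treat a non-zero exit code or anything on stderr as a failure
  failed_when: migrate_result.rc != 0 or migrate_result.stderr | length > 0
```

Without the two conditions, the task would report "changed" on every run that exits 0.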
12:38 - 12:41

Then I'll get to the actual examples.
12:41 - 12:46

The left is a bad example for using
the shell module,
12:46 - 12:49

I've seen that a lot, it's basically
12:49 - 12:57

"Yeah, I actually want this file, so just
use 'cat /path/file' and I'll use
12:57 - 13:00

the register parameter to get the output".
13:06 - 13:11

The actual output goes into the "shell_cmd"
and then
13:11 - 13:16

we want to copy it to some other file
somewhere else and
13:16 - 13:26

so we use the Jinja "{{ }}" to define
the actual content of the file
13:26 - 13:31

and then put it into that destination file
13:32 - 13:37

That is problematic because, first of all
if you run it in check mode,
13:37 - 13:41

this gets skipped and then this variable
is undefined and
13:41 - 13:45

Ansible will fail with an error, so you
won't be able to actually
13:45 - 13:47

run that in check mode.
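
The slide itself is not part of this transcript. A reconstruction of the anti-pattern as described might look roughly like this, where "shell_cmd" and "/path/file" come from the talk and the destination path is hypothetical:

```yaml
# Anti-pattern, shown only for illustration
- name: Read the file via the shell
  ansible.builtin.shell: cat /path/file
  register: shell_cmd          # skipped in check mode, so never populated there

- name: Write the contents somewhere else
  ansible.builtin.copy:
    content: "{{ shell_cmd.stdout }}"
    dest: /path/other_file     # hypothetical destination
```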
13:48 - 13:51

The other problem is this will always
report a change
13:52 - 13:55

so you'd probably have to…
13:57 - 14:01

the most sensible thing would probably
be to just say "changed_when: false"
14:02 - 14:06

and just acknowledge that that shell
command won't change anything on this system
14:08 - 14:14

The good example would be to use the
actual "slurp" module that will
14:14 - 14:17

just slurp the whole file and base64-encode it
14:18 - 14:28

and you can access the actual content with
"path_file.content" and you then just
14:28 - 14:31

base64-decode it and write it in there.
14:32 - 14:39

The nice thing is slurp will never return
any change, so it won't say it changed
14:39 - 14:43

and it also works great in check mode.
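
A sketch of the slurp variant. The module returns the file data base64-encoded under the "content" key; the destination path is hypothetical:

```yaml
- name: Read the file without running a command
  ansible.builtin.slurp:
    src: /path/file
  register: path_file

- name: Write the decoded contents somewhere else
  ansible.builtin.copy:
    content: "{{ path_file.content | b64decode }}"
    dest: /path/other_file     # hypothetical destination
```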
14:46 - 14:48

Here's another quick example.
14:50 - 14:53

The example on the left, oh yeah wget.
14:54 - 15:00

Here's the problem, every time your playbook
runs, this file will get downloaded
15:00 - 15:08

and of course if the file can't be
retrieved from that URL
15:08 - 15:13

it will throw an error and that will
happen all the time.
15:15 - 15:19

The right example is a cleaner example
using the uri module.
15:20 - 15:28

You define a URL to retrieve a file from,
you define where you want to write it to
15:28 - 15:31

and you use the "creates" parameter to say
15:31 - 15:35

"Just skip the whole thing if the file is
already there".
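
A sketch of the uri variant described here. The URL and file path are hypothetical:

```yaml
- name: Download the file once
  ansible.builtin.uri:
    url: https://example.com/files/archive.tar.gz   # hypothetical URL
    dest: /opt/archive.tar.gz                        # where to write the file
    creates: /opt/archive.tar.gz   # skip the task if this file already exists
```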
15:40 - 15:43

"set_fact", that's my pet peeve.
15:45 - 15:50

set_fact is a module that allows you
to define variables
15:50 - 15:57

during your playbook run, so you can say
set_fact and then
15:57 - 16:03

this variable = that variable + a third
variable or whatever
16:03 - 16:05

you can do things with that.
16:06 - 16:13

It's very problematic, though, because
you end up having your variables
16:13 - 16:15

changed during the playbook run
16:15 - 16:25

and that is a problem when you use
the "--start-at-task" parameter
16:25 - 16:26

from ansible-playbook.
16:30 - 16:36

Because this parameter allows you to
skip forward to a certain task in a role
16:36 - 16:40

so it skips everything until that point
and then continues running there
16:40 - 16:42

and that's really great for debugging
16:42 - 16:49

but if you define a variable with set_fact
and you skip over it,
16:49 - 16:51

that variable would just not be defined.
16:54 - 17:02

If you heavily use set_fact, that makes
prototyping really horrible.
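
One possible way around this, sketched under the assumption that the derived value only depends on other variables, is to define it as an ordinary templated variable instead of a task. All names here are hypothetical:

```yaml
# group_vars/webservers.yml (hypothetical names)
app_host: app.example.com
app_port: 8443
# Derived value as a plain variable instead of a set_fact task.
# It is templated when it is used, so it is still defined when earlier
# tasks are skipped with "ansible-playbook site.yml --start-at-task=...".
app_url: "https://{{ app_host }}:{{ app_port }}"
```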
17:05 - 17:08

Another point is that you can use
17:08 - 17:13

"ansible -m setup" and then the hostname
to check what variables are actually defined
17:13 - 17:19

for a specific host and everything set
with set_fact is just not there.
17:22 - 17:27

In summary, avoid the shell module,
avoid the command module,
17:27 - 17:30

avoid set_fact as much as you can,
17:30 - 17:37

and don't hide changes with "changed_when"
17:37 - 17:42

so the clean approach is always to use one
task to check something
17:42 - 17:46

and then a second task to actually execute
something for example.
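
A sketch of that two-task pattern, using a marker file as the thing to check. The marker path and the command are hypothetical:

```yaml
- name: Check whether the database has been initialised
  ansible.builtin.stat:
    path: /var/lib/myapp/initialised     # hypothetical marker file
  register: myapp_marker

- name: Initialise the database
  ansible.builtin.command: /usr/local/bin/myapp init   # hypothetical command
  when: not myapp_marker.stat.exists
```

The stat task runs in check mode and never reports a change, so the second task only shows up as a change when it really has work to do.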
17:48 - 17:52

Also, a bad idea in my opinion is when
people say
17:52 - 17:56

"Oh well, it's not important if this
throws an error or not,
17:56 - 17:59

I'll just say 'failed_when: false'"
18:00 - 18:06

That might work sometimes, but the problem
there is, if something really breaks,
18:06 - 18:08

you'll never find out.
18:09 - 18:11

Advanced topics.
18:14 - 18:17

This is about the templating.
18:19 - 18:22

The usual approach, for example for
a postfix role,
18:22 - 18:25

would be to do the following templating.
18:25 - 18:36

You define certain variables in for example
group_vars/postfix_servers
18:36 - 18:41

so any host in that group would inherit
these variables,
18:42 - 18:48

so this is sort of a list of parameters
for smtpd_recipient_restrictions
18:49 - 18:54

and this is just smtpd_helo_required.
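
The slide is not part of this transcript. Variables of this kind might be defined roughly as follows, where the variable names and values are illustrative guesses:

```yaml
# group_vars/postfix_servers.yml (variable names and values are illustrative)
postfix_smtpd_recipient_restrictions:
  - permit_mynetworks
  - reject_unauth_destination
postfix_smtpd_helo_required: true
```

In the usual approach, the template then needs one hand-written block per variable.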
18:55 - 18:58

So the usual approach would be to
define variables
18:58 - 19:03

in the host_vars or group_vars, or even
in the defaults
19:03 - 19:08

and then you have a template where
you just check every single variable
19:08 - 19:15

If it exists, you actually sort of put
the actual value there in place.
19:18 - 19:24

Here, I check if this variable is set true
and if yes, put the string there
19:24 - 19:27

else, put this string there
19:28 - 19:34

and for example, smtpd_recipient_restrictions
I just iterate over this array
19:34 - 19:38

and just output these values in order
in that list.
19:42 - 19:47

The problem here is that every time
upstream defines a new variable
19:47 - 19:57

you'll end up having to touch the actual
template file and touch the actual variables
19:57 - 20:04

so, I thought, "Well, you actually have
keys and values and strings and arrays
20:04 - 20:09

and hashes on one side, and actually,
a config file is nothing else than that,
20:10 - 20:12

just in a different format".
20:12 - 20:17

So I came up with…
20:18 - 20:24

With Jinja2, you can also define functions
20:24 - 20:29

I'll have to cut short a little bit on
explaining it but
20:29 - 20:36

basically, up here, a function is defined
and it's called here in the bottom
20:36 - 20:44

Basically, what it just does, it iterates
over the whole dictionary defined here,
20:44 - 20:47

"postfix.main", and it just goes…
20:49 - 20:52

It iterates over all the keys and values
and it goes…
20:53 - 20:58

If the value is a string, I'll just put
"key = value" and
20:58 - 21:04

if it's an array, I just iterate over it
and put it there in the format that
21:04 - 21:06

postfix actually wants.
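
The speaker's macro is not shown in this transcript. As a simplified stand-in for it, here is the same idea written as an inline loop in a task rather than a macro in a template file. The dictionary name "postfix.main" comes from the talk; the values are illustrative, and string values are quoted so that YAML does not turn "yes" into a boolean:

```yaml
---
# group_vars/postfix_servers.yml (values are illustrative)
postfix:
  main:
    smtpd_helo_required: "yes"
    smtpd_recipient_restrictions:
      - permit_mynetworks
      - reject_unauth_destination
---
# tasks/main.yml: inline stand-in for a template file
- name: Render main.cf from the postfix.main dictionary
  ansible.builtin.copy:
    dest: /etc/postfix/main.cf
    content: |
      {% for key, value in postfix.main | dictsort %}
      {% if value is string %}
      {{ key }} = {{ value }}
      {% else %}
      {{ key }} = {{ value | join(', ') }}
      {% endif %}
      {% endfor %}
```

This should yield one "key = value" line per setting, sorted by key, with list values joined by commas. A new upstream parameter then only needs a new dictionary entry, not a template change.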
21:08 - 21:12

Basically, you can do the same, for
example, for haproxy and
21:12 - 21:18

you can just serialize all the variables
you actually defined.
21:20 - 21:23

The advantages of this are:
21:23 - 21:28

your template file just stays the same
and it doesn't get messy
21:28 - 21:30

if you start adding things.
21:31 - 21:35

You have complete whitespace control,
usually if you edit stuff,
21:35 - 21:39

you kind of get an extra space, a new
line in there, and that changes
21:39 - 21:42

the template files for all machines.
21:44 - 21:49

You have all the settings in alphabetical
order, so if you actually run it and
21:49 - 21:55

you see the diff, you don't end up having
things going back and forth.
21:57 - 22:01

If you get the syntax on the template file
right, you don't have to touch it after that
22:01 - 22:06

and you also don't get any syntax errors
by editing them.
22:14 - 22:16

That leads to the next one.
22:18 - 22:24

You can actually set "hash_behaviour"
to "merge" in the Ansible config and
22:24 - 22:27

that allows you to do the following.
22:28 - 22:39

On the left here, you define for example
a dictionary and this is, like, in a group
22:39 - 22:45

and then for a specific machine, you define
another setting in this dictionary.
22:46 - 22:51

If you didn't use merge, the second
setting would just override the first one
22:51 - 22:54

and you'd end up with that, but if you
actually do the merge,
22:54 - 22:56

it does a deep merge of the hash.
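
A sketch of the difference, assuming "hash_behaviour = merge" is set in the [defaults] section of ansible.cfg. The host name and values are hypothetical:

```yaml
---
# group_vars/postfix_servers.yml
postfix:
  main:
    smtpd_helo_required: "yes"
---
# host_vars/mail01.yml (hypothetical host)
postfix:
  main:
    myhostname: mail01.example.com

# Effective value on mail01 with hash_behaviour = merge:
#   postfix:
#     main:
#       myhostname: mail01.example.com
#       smtpd_helo_required: "yes"
#
# With the default behaviour (replace), only the host_vars dictionary survives:
#   postfix:
#     main:
#       myhostname: mail01.example.com
```

The setting applies to every dictionary variable in the inventory, not only to this one.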
22:57 - 23:04

So the previous thing I showed would
actually benefit from that
23:04 - 23:06

so the combination of both is really good.
23:08 - 23:10

I'll skip that.
23:10 - 23:16

Further resources. Ansible just has
really good documentation,
23:16 - 23:23

there's the IRC and there's also debops
which is a project that is
23:23 - 23:28

specific to Debian and derivatives.
23:30 - 23:31

That's it.
23:32 - 23:37

[Applause]
23:39 - 23:41

Thank you very much.

Title:: Ansible best current practices
Description:: Talk given by Lee Garrett at Minidebconf Hamburg 2018
https://meetings-archive.debian.net/pub/debian-meetings/2018/miniconf-hamburg/2018-05-20/ansible_bcp.webm

Video Language:: English
Team:: Debconf
Project:: 2018_mini-debconf-hamburg
Duration:: 23:46
