Ansible best current practices

  • 0:06 - 0:07
    Thank you everyone for coming.
  • 0:08 - 0:12
    If you were expecting the Postgres talk,
    that was the one before, so
  • 0:12 - 0:15
    you might need to watch the video stream.
  • 0:17 - 0:18
    So, Ansible best practices,
  • 0:19 - 0:22
    I thought about calling it "Ansible,
    my best practices",
  • 0:23 - 0:30
    so, just warning ahead, this is things
    I stumbled on using Ansible
  • 0:30 - 0:32
    for the last 2-3 years and
  • 0:32 - 0:37
    those are very specific things I found
    that worked very well for me.
  • 0:39 - 0:46
    About me, I do also freelance work,
    do a lot of Ansible in there,
  • 0:46 - 0:52
    I'm also the Debian maintainer for
    Ansible with Harlan Lieberman-Berg
  • 0:54 - 0:58
    If there are any bugs in the package,
    just report them.
  • 1:06 - 1:10
    The talk will be roughly divided into
    4 parts.
  • 1:15 - 1:20
    The first part will be about why you
    actually want to use config management
  • 1:20 - 1:23
    and why you specifically want to use
    Ansible.
  • 1:24 - 1:30
    So, if you're still SSHing into machines
    and editing config files,
  • 1:30 - 1:34
    you're probably a good candidate
    for using Ansible.
  • 1:36 - 1:41
    Then, the second part will be about good
    roles and playbook patterns
  • 1:42 - 1:44
    that I have found that work really well
    for me.
  • 1:47 - 1:53
    The third chapter will be about typical
    antipatterns I've stumbled upon,
  • 1:53 - 1:58
    either in my work with other people
    using Ansible,
  • 1:58 - 2:01
    or the IRC support channel, for example.
  • 2:03 - 2:09
    The fourth part will be like advanced
    tips and tricks you can use
  • 2:09 - 2:11
    like fun things you can do with Ansible.
  • 2:13 - 2:16
    Quick elevator pitch, what makes config
    management good?
  • 2:18 - 2:25
    It actually also serves as a documentation
    of changes on your servers over time
  • 2:25 - 2:29
    so if you just put the whole config
    management in a git repo
  • 2:29 - 2:31
    and just regularly commit,
  • 2:31 - 2:33
    you will actually be able to say
  • 2:33 - 2:36
    "Why doesn't this work? It used to work
    a year ago"
  • 2:36 - 2:39
    You can actually check why.
  • 2:41 - 2:50
    Also, most config management tools have
    a lot better error reporting than
  • 2:50 - 2:53
    your self-written bash scripts that do
    whatever.
  • 2:56 - 3:03
    And usually, you have a very good
    reproducibility with config management
  • 3:03 - 3:11
    and also idempotency, meaning that if you
    run, for example, a playbook several times
  • 3:11 - 3:13
    you will always get the same result.
  • 3:15 - 3:24
    Also, it's great if you work in a small team
    or you admin ??? in the company
  • 3:24 - 3:27
    and you have some people working
    on a few things too.
  • 3:29 - 3:33
    It makes team work a lot easier and
    you will save a lot of time actually
  • 3:33 - 3:36
    debugging things when things break.
  • 3:38 - 3:39
    What makes Ansible good?
  • 3:40 - 3:46
    Comparing it to Chef or Puppet for example
    it's really easy to set up,
  • 3:46 - 3:50
    you start with two config files, you have
    it installed and you're ready to go.
  • 3:52 - 3:56
    It's also agentless, so whatever machines
    you actually want to control,
  • 3:56 - 4:05
    the only thing they really need to have
    is an SSH daemon and Python 2.6+
  • 4:05 - 4:11
    so that's virtually any Debian machine
    you have installed and
  • 4:11 - 4:13
    that is still supported in any way.
  • 4:15 - 4:22
    Ansible also supports configuration
    of many things like
  • 4:22 - 4:26
    networking equipment or even Windows
    machines,
  • 4:26 - 4:31
    they don't need SSH but they use the
    WinRM
  • 4:31 - 4:39
    but Ansible came a bit late to the game
    so Ansible's still not as good
  • 4:39 - 4:41
    in coverage like for example Puppet,
  • 4:42 - 4:47
    which literally, you can configure any
    machine on the planet with that,
  • 4:47 - 4:48
    as long as it has a CPU.
  • 4:50 - 4:54
    Next step, I will talk about good
    role patterns.
  • 4:57 - 4:59
    If you've never worked with Ansible
    before,
  • 4:59 - 5:02
    this is the point when you watch
    the video stream,
  • 5:02 - 5:06
    that you pause it and start working
    a few weeks with it
  • 5:06 - 5:08
    and then unpause the actual video.
  • 5:13 - 5:18
    A good role should ideally have
    the following layout.
  • 5:19 - 5:25
    So, in the "roles" directory, you have
    the name of the role and tasks/main.yml
  • 5:26 - 5:29
    You have the following rough layout.
  • 5:32 - 5:39
    At the beginning of the role, you check
    for various conditions,
  • 5:39 - 5:44
    for example using the "assert" task to
    for example check that
  • 5:44 - 5:48
    certain variables are defined, things
    are set,
  • 5:48 - 5:53
    that it's maybe part of a group, things
    like that you actually want to check.
  • 5:55 - 6:03
    Then, usually, you install packages, you
    can use apt, or on CentOS machines, yum
  • 6:04 - 6:05
    or you can do a git checkout or
    whatever,
  • 6:07 - 6:14
    then usually you do some templating of
    files where you have certain abstraction
  • 6:14 - 6:19
    and the variables are actually put into
    the template and
  • 6:19 - 6:21
    make the actual config file.
  • 6:22 - 6:27
    It's also good to point out that
    the template module actually has
  • 6:27 - 6:30
    a "validate" parameter,
  • 6:30 - 6:36
    that means you can actually use a command
    to check your config files for syntax errors
  • 6:36 - 6:44
    and if that fails, your playbook will fail
    before actually deploying that config file
  • 6:44 - 6:53
    so you can for example use Apache with
    the right parameters to actually do
  • 6:53 - 6:57
    a check on the syntax of the file.
  • 6:57 - 7:02
    That way, you never end up with a state
    where there's a broken config.
  • 7:04 - 7:05
    In the end, you usually…
  • 7:06 - 7:10
    When you change things, you trigger
    handlers to restart any daemons.
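
The layout described above could be sketched roughly as follows. This is a minimal sketch, not the speaker's slide: the role name `sshd`, the variable `sshd_listen_address` and the group `ssh_servers` are hypothetical, and sshd is swapped in for the Apache example because its syntax check (`sshd -t -f`) is simple to state.

```yaml
# roles/sshd/tasks/main.yml (role, variable and group names are hypothetical)
- name: Check preconditions before touching anything
  assert:
    that:
      - sshd_listen_address is defined
      - "'ssh_servers' in group_names"

- name: Install the package
  apt:
    name: openssh-server
    state: present

- name: Template the config file, checking its syntax before it is deployed
  template:
    src: sshd_config.j2
    dest: /etc/ssh/sshd_config
    validate: /usr/sbin/sshd -t -f %s
  notify: restart ssh
---
# roles/sshd/handlers/main.yml
- name: restart ssh
  service:
    name: ssh
    state: restarted
```

The `%s` in `validate` stands for the temporary file that Ansible renders; if the check command fails, the task fails and the live config file is left untouched.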
  • 7:12 - 7:24
    If you use variables, I recommend putting
    sensible defaults in
  • 7:24 - 7:27
    defaults/main.yml
  • 7:28 - 7:35
    and then you only have to override
    those variables on specific cases.
  • 7:35 - 7:41
    Ideally, you should have sensible defaults
    you want to have to get whatever things
  • 7:41 - 7:43
    you want to have running.
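
As an illustration of defaults plus a targeted override, assuming a hypothetical `sshd` role and a hypothetical host called `bastion1`:

```yaml
# roles/sshd/defaults/main.yml: sensible defaults that work out of the box
sshd_port: 22
sshd_permit_root_login: "no"
---
# host_vars/bastion1: override only what differs on this machine
sshd_port: 2222
```

Role defaults have the lowest variable precedence, which is what makes them easy to override from group_vars or host_vars.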
  • 7:46 - 7:52
    When you start working with it and do that
    a bit more,
  • 7:52 - 7:58
    you notice a few things and that is
  • 7:58 - 8:02
    your role should ideally run in "check mode".
  • 8:02 - 8:08
    "ansible-playbook" has --check that
    basically is just a dry run of
  • 8:08 - 8:12
    your complete playbook
  • 8:12 - 8:18
    and with --diff, it will actually show you
    for example file changes,
  • 8:18 - 8:21
    or file mode changes, stuff like that
  • 8:21 - 8:24
    and won't actually change anything.
  • 8:24 - 8:32
    So if you end up editing a lot of stuff,
    you can use that as a check.
  • 8:32 - 8:37
    I'll later get to some antipatterns that
    actually break that thing.
  • 8:40 - 8:47
    And, ideally, the way you change files
    and configs and states,
  • 8:47 - 8:51
    you should make sure that when the actual
    changes are deployed,
  • 8:51 - 8:53
    and you run it a second time,
  • 8:53 - 8:58
    that Ansible doesn't report any changes
  • 8:58 - 9:03
    because if you end up writing your roles
    fairly sloppy, you end up having
  • 9:03 - 9:06
    a lot of changes and then,
  • 9:06 - 9:11
    in the end of the report, you have like
    20 changes reported and
  • 9:11 - 9:15
    you kind of then know those 18,
    they're always there
  • 9:15 - 9:18
    and you kind of miss the 2 that are
    important, that actually broke your system
  • 9:18 - 9:25
    If you want to do it really well, you make
    sure that it doesn't report any changes
  • 9:25 - 9:27
    when you run it twice in a row.
  • 9:31 - 9:38
    Also, a thing to consider is you can define
    variables in the "defaults" folder
  • 9:38 - 9:40
    and also in the "vars" folder,
  • 9:41 - 9:46
    but if you look up how variables get
    inherited, you'll notice that
  • 9:46 - 9:50
    the "vars" folder is really hard to
    actually override,
  • 9:50 - 9:53
    so you want to avoid that as much as
    possible.
  • 9:59 - 10:06
    The next, much larger section will be about
    typical anti-patterns I've noticed
  • 10:06 - 10:10
    and I'll come to the first one now.
  • 10:12 - 10:15
    It's the shell or command module.
  • 10:17 - 10:20
    When people start using Ansible, that's
    the first thing they go
  • 10:20 - 10:26
    "Oh well, I know how to use wget or I know
    'apt-get install' "
  • 10:26 - 10:30
    and then they end up using the shell module
    to do just that.
  • 10:31 - 10:35
    If you use the shell module or the command
    module, you usually don't want to use that
  • 10:35 - 10:39
    and that's for several reasons.
  • 10:40 - 10:47
    There's currently, I think, 1300 different
    modules in Ansible
  • 10:47 - 10:51
    so there's likely a big chance that
    whatever you want to do,
  • 10:51 - 10:54
    there's already a module for that, that
    just does that thing.
  • 10:55 - 11:03
    But those two modules also have several
    problems and that is
  • 11:03 - 11:10
    the shell module, of course, gets
    interpreted by your actual shell,
  • 11:10 - 11:13
    so if you have any special variables
    in there,
  • 11:13 - 11:22
    you'd actually also have to take care of
    any variables you interpret in the shell string.
  • 11:25 - 11:31
    Then, one of the biggest problems is if
    you run your playbook in check mode,
  • 11:31 - 11:34
    the shell and the command modules
    won't get run.
  • 11:35 - 11:38
    So if you're actually doing anything
    with that, they just get skipped
  • 11:38 - 11:48
    and that would cause that your actual
    check mode and the real mode,
  • 11:48 - 11:52
    they will start diverging if you use
    a lot of shell module.
  • 11:56 - 12:01
    The worst, also, a bad part about this
    is that these two modules,
  • 12:01 - 12:04
    they'll always report changed
  • 12:04 - 12:06
    like, you run a command and it exits 0
  • 12:06 - 12:08
    it's like "Oh, it changed"
  • 12:11 - 12:18
    To get the reporting right on that module,
    you'd actually have to define for yourself
  • 12:18 - 12:21
    when this is actually a change or not.
  • 12:22 - 12:29
    So you'd have to probably get the output
    and then check, for example,
  • 12:29 - 12:35
    if there's something on stderr or something
    to report an actual error or change.
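
A rough sketch of what defining the change and error conditions yourself can look like. The script path and the string `updated` are hypothetical; the point is only that `register`, `changed_when` and `failed_when` let the task decide based on the command's output.

```yaml
- name: Run a command and decide ourselves whether it changed anything
  command: /usr/local/bin/sync-things   # hypothetical script
  register: sync_result
  # Report "changed" only when the (assumed) output says something was updated
  changed_when: "'updated' in sync_result.stdout"
  # Treat a non-zero exit code or anything on stderr as a failure
  failed_when: "sync_result.rc != 0 or sync_result.stderr | length > 0"
```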
  • 12:38 - 12:41
    Then I'll get to the actual examples.
  • 12:41 - 12:46
    The left is a bad example for using
    the shell module,
  • 12:46 - 12:49
    I've seen that a lot, it's basically
  • 12:49 - 12:57
    "Yeah, I actually want this file, so just
    use 'cat /path/file' and I'll use
  • 12:57 - 13:00
    the register parameter to get the output".
  • 13:06 - 13:11
    The actual output goes into the "shell_cmd"
    and then
  • 13:11 - 13:16
    we want to copy it to some other file
    somewhere else and
  • 13:16 - 13:26
    so we use the Jinja "{{ }}" to define
    the actual content of the file
  • 13:26 - 13:31
    and then put it into that destination file
  • 13:32 - 13:37
    That is problematic because, first of all
    if you run it in check mode,
  • 13:37 - 13:41
    this gets skipped and then this variable
    is undefined and
  • 13:41 - 13:45
    Ansible will fail with an error, so you
    won't be able to actually
  • 13:45 - 13:47
    run that in check mode.
  • 13:48 - 13:51
    The other problem is this will always
    report a change
  • 13:52 - 13:55
    so you'd probably have to…
  • 13:57 - 14:01
    the most sensible thing would probably
    be to say just "changed_when: false"
  • 14:02 - 14:06
    and just acknowledge that that shell
    command won't change anything on this system
  • 14:08 - 14:14
    The good example would be to use the
    actual "slurp" module that will
  • 14:14 - 14:17
    just slurp the whole file and base64encode it
  • 14:18 - 14:28
    and you can access the actual content with
    "path_file.content" and you then just
  • 14:28 - 14:31
    base64decode it and write in there.
  • 14:32 - 14:39
    The nice thing is slurp will never return
    any change, so it won't say it changed
  • 14:39 - 14:43
    and it also works great in check mode.
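
The slides are not part of the transcript, so the following is a reconstruction under assumptions rather than the speaker's exact code. The register names `shell_cmd` and `path_file` and the source path come from the talk; the destination path is hypothetical.

```yaml
# Anti-pattern: skipped in check mode, always reports "changed"
- name: Read the file through the shell
  shell: cat /path/file
  register: shell_cmd

- name: Write the output somewhere else
  copy:
    content: "{{ shell_cmd.stdout }}"
    dest: /path/other_file   # hypothetical destination

# Preferred: slurp runs in check mode and never reports a change
- name: Read the file with slurp
  slurp:
    src: /path/file
  register: path_file

- name: Write the decoded content somewhere else
  copy:
    # slurp returns the file base64-encoded under the "content" key
    content: "{{ path_file.content | b64decode }}"
    dest: /path/other_file   # hypothetical destination
```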
  • 14:46 - 14:48
    Here's another quick example.
  • 14:50 - 14:53
    The example on the left, oh yeah wget.
  • 14:54 - 15:00
    Here's the problem, every time your playbook
    runs, this file will get downloaded
  • 15:00 - 15:08
    and of course if the file can't be
    retrieved from that URL
  • 15:08 - 15:13
    it will throw an error and that will
    happen all the time.
  • 15:15 - 15:19
    The right example is a more clean example
    using the uri module.
  • 15:20 - 15:28
    You define a URL to retrieve a file from,
    you define where you want to write it to
  • 15:28 - 15:31
    and you use the "creates" parameter to say
  • 15:31 - 15:35
    "Just skip the whole thing if the file is
    already there".
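
A sketch of the two variants, assuming a hypothetical URL and target path (the original slide is not in the transcript):

```yaml
# Anti-pattern: downloads on every run and fails whenever the URL is unreachable
- name: Fetch the file with wget
  shell: wget -O /opt/archive.tar.gz https://example.org/archive.tar.gz

# Cleaner: the uri module writes to dest and is skipped once the file exists
- name: Fetch the file only if it is not already there
  uri:
    url: https://example.org/archive.tar.gz
    dest: /opt/archive.tar.gz
    creates: /opt/archive.tar.gz
```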
  • 15:40 - 15:43
    "set_fact", that's my pet peeve.
  • 15:45 - 15:50
    set_fact is a module that allows you
    to define variables
  • 15:50 - 15:57
    during your playbook run, so you can say
    set_fact and then
  • 15:57 - 16:03
    this variable = that variable + a third
    variable or whatever
  • 16:03 - 16:05
    you can do things with that.
  • 16:06 - 16:13
    It's very problematic, though, because
    you end up having your variables
  • 16:13 - 16:15
    changed during the playbook run
  • 16:15 - 16:25
    and that is a problem when you use
    the "--start-at-task" parameter
  • 16:25 - 16:26
    from ansible-playbook.
  • 16:30 - 16:36
    Because this parameter allows you to
    skip forward to a certain task in a role
  • 16:36 - 16:40
    so it skips everything until that point
    and then continues running there
  • 16:40 - 16:42
    and that's really great for debugging
  • 16:42 - 16:49
    but if you define a variable with set_fact
    and you skip over it,
  • 16:49 - 16:51
    that variable would just not be defined.
  • 16:54 - 17:02
    If you heavily use set_fact, that makes
    prototyping really horrible.
  • 17:05 - 17:08
    Another point is that you can use
  • 17:08 - 17:13
    "ansible -m setup" and then the hostname
    to check what variables are actually defined
  • 17:13 - 17:19
    for a specific host and everything set
    with set_fact is just not there.
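
To make the problem concrete, here is a small sketch with hypothetical variable names (`app_host`, `app_port`, `app_url`). The second document shows one possible alternative, which is an assumption on top of the talk: Ansible templates variables lazily, so a derived value can often live in defaults or group_vars instead.

```yaml
# Fragile: if the run starts at a later task, app_url is never set
- name: Build the URL at run time
  set_fact:
    app_url: "https://{{ app_host }}:{{ app_port }}"

- name: Use the URL
  debug:
    msg: "{{ app_url }}"
---
# roles/app/defaults/main.yml: the same value, defined without set_fact
app_host: app.example.org
app_port: 8443
app_url: "https://{{ app_host }}:{{ app_port }}"
```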
  • 17:22 - 17:27
    In summary, avoid the shell module,
    avoid the command module,
  • 17:27 - 17:30
    avoid set_fact as much as you can,
  • 17:30 - 17:37
    and don't hide changes with "changed_when"
  • 17:37 - 17:42
    so the clean approach is always to use one
    task to check something
  • 17:42 - 17:46
    and then a second task to actually execute
    something for example.
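
One way to read the check-then-execute pattern, sketched with a hypothetical marker file and init script. The check uses `stat`, which also runs in check mode, and the second task only fires when the check says it is needed.

```yaml
- name: Check whether the application has already been initialised
  stat:
    path: /var/lib/myapp/initialised   # hypothetical marker file
  register: myapp_marker

- name: Initialise the application only when the marker is missing
  command: /usr/local/bin/myapp-init   # hypothetical script
  when: not myapp_marker.stat.exists
```

Because the second task is skipped on later runs, a "changed" in the report means the initialisation actually ran.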
  • 17:48 - 17:52
    Also, a bad idea in my opinion is when
    people say
  • 17:52 - 17:56
    "Oh well, it's not important if this
    throws an error or not,
  • 17:56 - 17:59
    I'll just say 'failed_when: false'"
  • 18:00 - 18:06
    That might work sometimes, but the problem
    there is, if something really breaks,
  • 18:06 - 18:08
    you'll never find out.
  • 18:09 - 18:11
    Advanced topics.
  • 18:14 - 18:17
    This is about the templating.
  • 18:19 - 18:22
    The usual approach, for example for
    a postfix role,
  • 18:22 - 18:25
    would be to do the following templating.
  • 18:25 - 18:36
    You define certain variables in for example
    group_vars/postfix_servers
  • 18:36 - 18:41
    so any host in that group would inherit
    these variables,
  • 18:42 - 18:48
    so this is sort of a list of parameters
    for smtpd_recipient_restrictions
  • 18:49 - 18:54
    and this is just smtpd_helo_required.
  • 18:55 - 18:58
    So the usual approach would be to
    define variables
  • 18:58 - 19:03
    in the host_vars or group_vars, or even
    in the defaults
  • 19:03 - 19:08
    and then you have a template where
    you just check every single variable
  • 19:08 - 19:15
    If it exists, you actually sort of put
    the actual value there in place.
  • 19:18 - 19:24
    Here, I check if this variable is set to true
    and if yes, put the string there
  • 19:24 - 19:27
    else, put this string there
  • 19:28 - 19:34
    and for example, smtpd_recipient_restrictions
    I just iterate over this array
  • 19:34 - 19:38
    and just output these values in order
    in that list.
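
The variables for this "usual approach" might look roughly like this. The variable names are hypothetical (the slide is not in the transcript); the restriction values are ordinary Postfix ones.

```yaml
# group_vars/postfix_servers
postfix_smtpd_helo_required: true
postfix_smtpd_recipient_restrictions:
  - permit_mynetworks
  - reject_unauth_destination
```

The template then needs one hand-written conditional or loop per variable, which is the maintenance cost the speaker goes on to describe.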
  • 19:42 - 19:47
    The problem here is that every time
    upstream defines a new variable
  • 19:47 - 19:57
    you'll end up having to touch the actual
    template file and touch the actual variables
  • 19:57 - 20:04
    so, I thought, "Well, you actually have
    keys and values and strings and arrays
  • 20:04 - 20:09
    and hashes on one side, and actually,
    a config file is nothing else than that,
  • 20:10 - 20:12
    just in a different format".
  • 20:12 - 20:17
    So I came up with…
  • 20:18 - 20:24
    With Jinja2, you can also define functions
  • 20:24 - 20:29
    I'll have to cut short a little bit on
    explaining it but
  • 20:29 - 20:36
    basically, up here, a function is defined
    and it's called here in the bottom
  • 20:36 - 20:44
    Basically, what it just does, it iterates
    over the whole dictionary defined here,
  • 20:44 - 20:47
    "postfix.main", and it just goes…
  • 20:49 - 20:52
    It iterates over all the keys and values
    and it goes…
  • 20:53 - 20:58
    If the value is a string, I'll just put
    "key = value" and
  • 20:58 - 21:04
    if it's an array, I just iterate over it
    and put it there in the format that
  • 21:04 - 21:06
    postfix actually wants.
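
The data side of this generic approach can be sketched as a single dictionary whose keys are the Postfix parameter names themselves. The `postfix.main` name is from the talk; the values shown are illustrative assumptions.

```yaml
# group_vars/postfix_servers
postfix:
  main:
    myhostname: mail.example.org
    # string value: emitted by the template as "key = value"
    smtpd_helo_required: "yes"
    # list value: iterated and emitted in the form Postfix expects
    smtpd_recipient_restrictions:
      - permit_mynetworks
      - reject_unauth_destination
```

Adding a new Postfix setting then means adding one key here, with no change to the template.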
  • 21:08 - 21:12
    Basically, you can do the same, for
    example, for haproxy and
  • 21:12 - 21:18
    you can just serialize all the variables
    you actually defined.
  • 21:20 - 21:23
    The advantages of this is,
  • 21:23 - 21:28
    your template file just stays the same
    and it doesn't get messy
  • 21:28 - 21:30
    if you start adding things.
  • 21:31 - 21:35
    You have complete whitespace control,
    usually if you edit stuff,
  • 21:35 - 21:39
    you kind of get an extra space, a new
    line in there, and that changes
  • 21:39 - 21:42
    the template files for all machines.
  • 21:44 - 21:49
    You have all the settings in alphabetical
    order, so if you actually run it and
  • 21:49 - 21:55
    you see the diff, you don't end up having
    things going back and forth.
  • 21:57 - 22:01
    If you get the syntax on the template file
    right, you don't have to touch it after that
  • 22:01 - 22:06
    and you also don't get any syntax errors
    by editing them.
  • 22:14 - 22:16
    That leads to the next one.
  • 22:18 - 22:24
    You can actually set "hash_behaviour"
    to "merge" in the Ansible config and
  • 22:24 - 22:27
    that allows you to do the following.
  • 22:28 - 22:39
    On the left here, you define for example
    a dictionary and this is, like, in a group
  • 22:39 - 22:45
    and then in a specific machine, you define
    another setting in this dictionary.
  • 22:46 - 22:51
    If you didn't use merge, the second
    setting would just override the first one
  • 22:51 - 22:54
    and you'd end up with that, but if you
    actually do the merge,
  • 22:54 - 22:56
    it does a deep merge of the hash.
  • 22:57 - 23:04
    So the previous thing I showed would
    actually benefit from that
  • 23:04 - 23:06
    so the combination of both is really good.
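
A small illustration of the difference, using a hypothetical `myapp` dictionary, group and host. It assumes `hash_behaviour = merge` is set in the `[defaults]` section of ansible.cfg.

```yaml
# group_vars/webservers (hypothetical group): settings shared by the group
myapp:
  global:
    maxconn: 2000
  logging:
    level: info
---
# host_vars/web1 (hypothetical host): one extra setting for this machine
myapp:
  global:
    workers: 2
---
# Effective value on web1 with hash_behaviour = merge (deep merge)
myapp:
  global:
    maxconn: 2000
    workers: 2
  logging:
    level: info
```

With the default behaviour (replace), web1 would instead see only the host_vars dictionary, that is `myapp.global.workers`, and lose the group-level keys.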
  • 23:08 - 23:10
    I'll skip that.
  • 23:10 - 23:16
    Further resources. Ansible just has
    really good documentation,
  • 23:16 - 23:23
    there's the IRC and there's also debops
    which is a project that is
  • 23:23 - 23:28
    specific to Debian and derivatives.
  • 23:30 - 23:31
    That's it.
  • 23:32 - 23:37
    [Applause]
  • 23:39 - 23:41
    Thank you very much.
Description:

Talk given by Lee Garrett at Minidebconf Hamburg 2018
https://meetings-archive.debian.net/pub/debian-meetings/2018/miniconf-hamburg/2018-05-20/ansible_bcp.webm

Video Language:
English
Team:
Debconf
Project:
2018_mini-debconf-hamburg
Duration:
23:46
