meetings-archive.debian.net/.../Docker_Debian.webm

...called the Sunlight Foundation. I do government transparency and accountability. Who's paying whom how much, who's voting which way, stuff like that. I've been doing Ubuntu stuff for a while, Debian stuff for a little while, since 2009. I'm on the FTP Team, and whatever else. I've got other stuff that I'm doing, but that's not really as important.

Oh great, link not found! Cool! That is a giant beautiful picture of the Sunlight logo. This is just the generic intro to any of my slides, so I apologise, but if anyone's interested in Sunlight, feel free to talk to me. I will gladly tell you all about how awesome Sunlight is and how much fun I have working there.

Right, so, Docker.

What is Docker? This is sort of like the existential question. No one quite knows, right? Everyone is using Docker for all these different things, and it's kind of super confusing, and that's really disappointing.

Basically, Docker is a process-level isolation framework. That's all it is. It uses Linux kernel namespacing to isolate a process, a single process, and it tracks the processes inside the container using cgroups.

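As a minimal illustration of that single-process isolation (a sketch, not from the talk): because each container gets its own PID namespace, the one process you start sees itself as PID 1.

    # One-off container; ps reports itself as PID 1 because the
    # container has its own PID namespace (assumes procps is in the image)
    docker run debian:unstable ps ax
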
Also, I forgot to mention this because I dove right in: this talk is going to be on the short side, because I'm hoping that we're going to have a bit of discussion about Docker's role in Debian, in what ways we, as the Docker maintenance team, can help Debian, and in what ways this can flow back and forth.

Also, I don't work at Docker, I'm just the Debian hacker. I do this for fun, so I will cover some of the cons that maybe people don't talk about as much.

Right, so Docker provides a whole bunch of tools used to manage and wrap these processes. For instance, docker run, which lets you run code inside a container. Remember, it's for a single process, it's not a virtual machine. You just spawn it up and it wraps a process and keeps it semi-isolated so that it runs properly. Or stuff to pull images, if you have images in a central location on the index. So if you docker pull paultag/postgres, then you get my particular Postgres flavour.

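In shell terms, the workflow he's describing looks roughly like this (a sketch, using the image he names above):

    # Fetch an image from the central index, then run a process from it
    docker pull paultag/postgres
    docker run -d --name postgres paultag/postgres
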
Docker is higher-level than LXC, but lower-level than something like Ansible, Vagrant or Fabric. Docker provides these primitives to work on the system, these things that allow you to run processes in kind of a sane and normal way, but it's not there to solve all of the configuration management problems, and there's definitely configuration management to do once a Docker install is on your system.

So, one technique that I have is: all my containers are read-only, and any data that changes in the container, so for instance with Postgres you have /var/lib/postgres, that's volume mounted, which is like a bind mount out of the container, and that's on the host system in /srv, and then I can just snapshot that and keep that backed up. I often use Ansible to provision the stuff that's in /srv. So I won't provision anything inside the container, because it's only running a single process, you can't ssh in there, but using something like Ansible, Vagrant or Fabric you can coordinate a whole bunch of Docker containers to do some stuff that's pretty powerful.

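A sketch of that layout (the host paths here are illustrative, not from the talk): the container stays disposable while its mutable state lives under /srv on the host.

    # Keep the container's mutable data on the host via a volume mount
    docker run -d --name postgres \
        -v /srv/postgres:/var/lib/postgresql \
        paultag/postgres
    # Snapshot the host-side directory; the container itself stays disposable
    tar -czf /srv/backups/postgres-$(date +%F).tar.gz -C /srv postgres
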
Originally Docker wrapped LXC, just to kind of give you a sense of where it is on the stack, but that ended up getting reimplemented, just sort of in raw Go. So it no longer uses the LXC backend by default. I think you might be able to still turn that on. You can, but don't do it.

[laughs]

It's probably going to end up breaking stuff in a nasty way. There were a whole bunch of incompatibilities after a couple of versions. So basically Docker is slightly above LXC, but not quite at the level of Vagrant or anything like that.

[question]: Is Docker in jessie at the moment?

[Paul]: Docker is currently in jessie, yeah. Docker 1.0, which upstream assured me was stable, but they've never released security or patch fixes. So we're probably going to upload 1.2 pretty soon, because there was a bug in golang 1.3 which affected us until then. I'm waiting on a package in the NEW queue, ironically.

[laughter]

Yeah, it's embarrassing.

What Docker is not: Docker is not a virtual machine, and I cannot beat this point home enough. It should be a single process; if you start stressing that, weird stuff's going to start to happen and you're going to be in for a bad time. Some people use supervisord, or whatever else, to manage a whole bunch of different stuff, and if you're careful, that's fine. If you know what you're doing, fine. But in general, if you're just trying to "Dockerize" something, use a single process per container. So I have a Postgres container, then a webapp container, and they're linked so they can talk to one another. That's usually the architecture of standard, by-the-book deployments. It's not one process for the entire application. It's not like I 'docker run', what do people run nowadays, etherpad? That might be kind of outdated.

Yeah, there was a question.

[question]: Do you mean a single process, or?

[Paul]: So the question was, "Single process, question mark?" and the answer is "yes". Single PID.

Actually sorry, no, that's wrong. The Docker instance should be starting a single PID, but that can spawn other things. Perfectly fine. If for instance you have wsgi, and that has a whole bunch of workers, spawning off a whole bunch of workers is totally reasonable, if that's how it operates; that's sort of logically the same stuff. But not something like etherpad, where you're putting a database and the application in the same container.

Docker is not perfect isolation from the host; the goal is to isolate processes, not to prevent exploits. The docker group is root-equivalent, so if you're part of the docker group and you can launch Docker containers, it is trivial to get root on the host, because you can just start a new container with the root of the filesystem mounted inside the container, which you can chroot into and then be root. So don't think of this as a one-size-fits-all security system; it's just providing basic wrapping around the process to make sure it's running in an environment that you can hold down for a minute.

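That root-equivalence is easy to demonstrate concretely. A sketch of exactly the attack he describes, available to any member of the docker group:

    # Mount the host's root filesystem into a container, then chroot into it:
    # you now have a root shell on the host's filesystem.
    docker run -it -v /:/host debian:unstable chroot /host /bin/bash
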
Basically this lets my unstable server run Postgres from stable. The web apps that I'm really tightly controlling, because they're all running on Python 3.4, are in unstable containers. So I can have different things for different daemons, which is kind of neat.

So, why? Which is the bigger question. Why are we wasting all of our time wrapping all of this stuff in Docker containers? And that's a good question. Basically it lets you abstract the idea of a process and really not care about the host environment too much. So when I test something locally, on my local machine, and deploy it to one of my VPSs, I can be pretty sure that that process is going to run in roughly the same way. Obviously there might be differences in the kernel; if there are differences in the kernel, okay, fine, that's going to have some problems. But basically it lets you contain and abstract these processes, and it lets me trivially move stuff between servers or test on my local machine. You can reproduce the environment in a very lightweight way. Contrast that with something like a virtual machine, where you have the overhead of the entire operating system, and you're actually virtualizing the entire OS and all of the daemons included with that, and that's not entirely necessary in a lot of situations.

Essentially it doesn't matter what-

That's a typo, yeah, "containenr". [laughs]

I definitely wrote these kind of late, so I apologise. Essentially it means I can deploy my stuff on whatever host I'm given, because I'm cheap and I really don't like paying for servers. So if someone decides they want to give me access to a Fedora host, I can host my stuff inside stable containers and not stress about it too much. There's a little bit of stuff you have to worry about, but yeah, essentially this lets me run stable Postgres, use Python 3.4, play around with code, isolate it, move it around. That sort of thing.

The comparison that upstream makes a lot, where the name comes from, is ISO containers, the ones that you see on trucks, the big metal things. They're super cool. Hipsters are living in them now.

[laughter]

Basically you can just put stuff in them, just pack it full of whatever, seal it up, and it doesn't matter if you put it onto a boat or a truck, it's just going somewhere, you don't really care. The comparison here is that these are the ISO containers of the future, for processes and computers. Docker itself is the big ship full of ISO containers. So you can basically create hosts that host all of your code without really caring what's inside, because they all look the same to you. They all have the same docker run interface, they all have the same docker pull interface. Then inside the container you can just be concerned about how you pack it, but the host doesn't care, and that ends up being pretty important.

For super complex and hard-to-set-up software, this can help remove a lot of the complexity of the actual initial setup, because sometimes setting up processes like these can be extremely difficult, as I'm sure everyone here knows. So if you have this weird historic way of setting up an application, or it requires some weird configuration files but they're mostly kind of standard, then you can just make sure that's all in place.

In fact, at work I've Dockerized a whole bunch of scrapers. A large part of my day job is scraping terrible government websites that have about 3 or 4 tags, I'm not joking!

[laughter]

We're all laughing, but this is my life!

[laughter]

The sites are all complicated, and one of them times out every five minutes, so even if you're a human browsing it, it kicks you back to the main page. Augh, it's terrible! A lot of the scrape infrastructure is kind of gnarly, and setting up the actual scrapers can be a bit of a pain. Making sure that those scrapers run the same way in development and production is super handy, and while it's easy for me to get them going, because I wrote a huge chunk of the code, it's not as accessible for other people.

So one of the things I did recently, with all of the scrape infrastructure: I'm currently working on a daemon that will run the scrapers inside Docker containers. So essentially I've packaged up the particular scrapers we have, so I have state scrapers, which are state legislative scrapers, and nightly it will 'docker run paultag/scrapers-us-state-alabama', and it'll go off to Alabama, scrape all of the data down, and insert it into Postgres. This lets us build continuously from git, so as soon as I push, it'll rebuild the image, and that image will be used in the actual run later in the day.

So it doesn't require mucking around with, like- I don't know if anyone's used Bamboo, which is some non-free Atlassian stuff, which is what we're using, what I guess we're still using. Essentially it makes you rebuild an AMI, one of the Amazon images, every time you update the environment, which is horrendous. And it has a 30-minute, like Indiana Jones, wall that's coming down. After 30 minutes it shuts the machine down, because it thinks it's idle. So you have to make the change in 30 minutes, and then it shuts down, and then you're like, "God, gotta do it again". Before that we used Jenkins, which was good enough, but kind of a pain too. When everything's running in the same environment, it can be a bit of a pain.

So by Dockerizing all of this, essentially I can give anyone this scraper, and if they're interested in having the data, they can just docker run this command and everything just kind of works. It's like OpenGov in a box. Which is pretty awesome. I've been working on trying to Dockerize more and more of the periodic jobs that get run, and so far it's pretty thrilling, and the results are really, really promising, and I hope that we're going to continue to develop Docker to the point that that becomes a better use case. Because I think it's a really good one.

Now for the fun part: my opinions!

[laughter]

Docker can let you get away with murder. You can do some pretty gnarly stuff, and people do some pretty gnarly stuff. So I'm just going to bring up a couple of the things I care about. For instance, I only run my Docker containers off systemd unit files; actually, I do use upstart on a couple of machines. Essentially they look like this; here's the spec file for one of them. Basically, the spec file declares that this is for my nginx config, and right there we've got 'docker start nginx' if the container already exists; otherwise it has the setup of the actual Docker container. So it says to volume mount /srv/pault.ag/nginx/serve into the container at /serve, with the image, which is paultag/nginx, and the binary it is running, /usr/sbin/nginx, with a couple of flags. The stop command is 'docker stop -t 5 nginx', which gives it 5 seconds to shut down before it's killed.

That's kind of a lot and it's kind of ugly, I understand that, but that's okay. This basically lets the nginx in Docker be treated like any other system-level process. This means that nginx inside Docker is treated identically in nearly everything else that I do, because I just do 'sudo service nginx restart'. What does it matter? It's just launching commands, and the commands happen to isolate it in Docker. It's basically the same thing here for upstart, slightly cleaner actually, which is awesome: 'start on filesystem and started docker', source the file to do some work, and launch essentially the same thing. These are nearly identical.

I really don't like deploying Docker unless there's a startup script in place. I want all of my machines to be able to hard-shutdown in the middle of whatever they're doing, sometime in the transient ether have all of my Docker containers disappear, and when the machine starts up, have it come back up in a state in which I can use it. And having unit files and spec files like this really saves you a lot.

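The transcript doesn't include the actual unit file shown on screen, but reconstructed from the description above it would look something like this (paths, ports and flags are illustrative; his version apparently tries 'docker start nginx' first, while this sketch just recreates the container each time):

    [Unit]
    Description=nginx in a Docker container
    Requires=docker.service
    After=docker.service

    [Service]
    # Remove any stale container, then run nginx in the foreground so
    # systemd can supervise it; config is volume mounted in from /srv
    ExecStartPre=-/usr/bin/docker rm nginx
    ExecStart=/usr/bin/docker run --name nginx \
        -p 80:80 \
        -v /srv/pault.ag/nginx/serve:/serve \
        paultag/nginx /usr/sbin/nginx -g "daemon off;"
    ExecStop=/usr/bin/docker stop -t 5 nginx

    [Install]
    WantedBy=multi-user.target
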
As for whether or not systemd will replace Docker, I have no idea. I'm sure the systemd people think that. So 'sudo service docker restart' plays around with the Docker containers. Any questions on that part in particular? Because I feel like I moved a little fast. Asheesh, yes.

[Asheesh]: How did you put nginx into that Docker instance, and what is running in there, Debian something?

[Paul]: Yeah, totally. Thanks, Asheesh. Essentially- let's see if I have my Dockerfiles around. Ah yes, right, fonts. Unfortunately xfce-terminal does not let me use Ctrl++, which is disappointing. That's too big; that should be pretty good. Okay, this is gigantic, but we should be able to do something.

[laughter]

Aah, that's a little bit too big.

(inaudible)

CoC. Come on man, CoC. This is a little bit better. I'm just going to go a little bit smaller, sorry. Right, this'll do.

So essentially, you declare what base image you start from. This can be any arbitrary image. So I'm saying FROM the debian:unstable image. The debian:unstable image is maintained here by tianon, upstream. It's roughly similar to what you get from a debootstrap. There are some differences; the differences are documented in the creation script that's also shipped with Docker itself, if you want to create it yourself. The actual modifications, we've talked about moving them into a .deb before, but nothing has really come of that yet. If anyone's interested in making sure the differences in the Debian Docker image are better documented, I'm sure the Docker upstream, tianon in particular, and myself would love to talk to you about how to make that possible. Thumbs up! So I haven't said anything entirely wrong. Great!

So I'm saying FROM the debian:unstable image. The first part is the name of the image, the second part is a tag. Tags are very similar to how git tags work, except you're encouraged to change them often.

[laughter]

Tags essentially point to a given layer, and you can use them for nearly whatever you want. MAINTAINER: useless bit of metadata, not really important here. RUN means: when you're creating this image, run the following command, 'apt-get update' and then 'apt-get install -y nginx', which will actually install nginx into the container that we're currently building in. And then I 'rm -rf' all of the nginx/sites-*/* stuff, because that gets volume mounted in from my filesystem. So when I configure a new app, I just drop the file into the host's /srv/docker/nginx, and then I just kick the container and it sees it in its /etc/nginx/sites-enabled/.

And then the CMD, which is the default command that's run if no arguments are given. There's also ENTRYPOINT; ENTRYPOINT is sort of like CMD, except it's a little bit harder to override, and it's put before CMD. Confusing these two can get confusing. [laughs] Essentially it's this declarative style, sort of a declarative style, and it's powerful enough to basically do what you want. You can actually do this stuff manually in a container and then tag the resulting container, but it's generally good practice to use Dockerfiles, so that you can create what are known as automated builds. They used to be called 'trusted builds', but that name's terrible. Automated builds are basically builds that are being done on the Docker index routinely.

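Putting the pieces he just walked through together, the Dockerfile on screen would have looked roughly like this (a reconstruction from the description, not the actual file):

    FROM debian:unstable
    MAINTAINER Paul Tagliamonte <paultag@debian.org>
    # Install nginx into the image being built
    RUN apt-get update && apt-get install -y nginx
    # Site configs are volume mounted in from the host at run time,
    # so clear out anything the package shipped
    RUN rm -rf /etc/nginx/sites-available/* /etc/nginx/sites-enabled/*
    # Default command, used when 'docker run' is given no arguments
    CMD ["/usr/sbin/nginx", "-g", "daemon off;"]
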
[question]: Just a quick question for those who haven't played around with Docker yet. Does that mean that if your machine crashes and you boot up again, it'll detect "okay, I need this image", so you get new versions of packages from unstable and all hell breaks loose? Or is it fixed at some point? Somehow?

[Paul]: Yes, there are two different concepts here that I think I've failed to delineate. Essentially there's the concept of an image, and there's the concept of a container. A container is an instance of an image, so a container is always started from an image. So this declarative style of building things is building an image, and the resulting image here is called paultag/nginx. When I run this with a 'docker run paultag/nginx', it's assigned a pseudo-random name built on words, so like "feisty_turing" or "angry_stallman".

[laughter]

'stallman' was added recently, it's amazing! [laughs]

The actual individual containers are given these opaque names. So you start an image and you're given a container that's from that image. If my machine were to shut down and everything were to start up again, it would still be using the same image, unless I rebuilt it in the meantime, in which case that's probably expected behaviour.

Any other questions about this stuff so far?

[question]: So how do you deal with security updates?

[Paul]: Yes, security updates. That's great. Best practice here is to continuously rebuild your images, and the Docker index has support for this: give it a repo and it'll watch for changes, post-commit hooks. When you change something, it'll rebuild the image and put it up on the index, at which point you just pull it and kick your containers. If you don't use something like that and you're building locally, you can have something on a cron that rebuilds the images and kicks the containers that are currently active. So the idea is that by using something declarative like this, every time the debian:unstable image updates it's going to have the latest security fixes, so when we rerun this and retag the image locally, we're going to get the security updates as well.

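The local cron variant he mentions could be as simple as this sketch (the image, path, and service names are the illustrative ones from earlier):

    # Nightly: refresh the base image, rebuild on top of it,
    # then kick the running container so it picks up the new image
    docker pull debian:unstable
    docker build -t paultag/nginx /srv/dockerfiles/nginx
    service nginx restart
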
Essentially containers should be, in my opinion, always read-only and ephemeral. You shouldn't be making any changes inside the containers; anything you write should be on a mount from the host. So at any point I can just trash all of the containers, start them up again, and they then have the latest version, with minor interruptions. Which is similar enough. It's sort of the difference between immutable and mutable. Think of virtual machines as sort of mutable: you can update them, you can change their state. Docker containers, really, should be immutable. When you replace them, it should be an atomic replace. So, Lisp versus Python, who's ready?

[laughter]

Any questions about this so far? Okay, cool, I'm going to continue talking.

Basically the only reason I gave this talk was to use the Unicode heart (♥), to see if any of the software would crash. It didn't, which was a huge disappointment. So hopefully this turns into more of a discussion pretty soon.

Again, another strong opinion from myself: you should really only use container linking. SkyDock used to be something that I preferred, but it ended up being really buggy; it ate up all of the free memory on my system and OOM-killed nearly everything, which was not fun. It ended up taking about 2GB. That was kind of a bad day. So I generally use container linking. All Docker containers, when they're spawned, are given a private IP address on a docker0 interface, so they can all talk to each other behind the docker0 interface, and when you bind to a port in a container, it's bound to a container-local IP. Container linking basically rewrites /etc/hosts, which is a bit of a hack, but it works. It essentially rewrites /etc/hosts to point to another container's IP address. So it has the other container's 172.x.x.x IP address, and this lets two containers talk to each other. So my Postgres container is up, but it's not bound to my public IP, it's bound to its container IP. Then other containers talk to it using container linking. So my web apps know about Postgres, and you can connect to postgres://postgres@postgres:postgres.postgres.postgres/

[laughter]

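The linking setup he's describing, as a shell sketch (the container names and the web app image are illustrative):

    # Start postgres bound only to its container-local address
    docker run -d --name postgres paultag/postgres
    # --link adds an entry to the web app's /etc/hosts, so the hostname
    # "postgres" resolves to the postgres container's docker0 address
    docker run -d --name webapp --link postgres:postgres paultag/webapp
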
The Docker API. I have so many things to say about it; it's not great. Essentially, more and more stuff has been duct-taped onto it as time has gone on. So to correctly tell it what ports to map, I think you need to specify it in two places, the host config and the run config, which you need to pass during two different POSTs. And it's kind of a pain with mounting stuff in volumes. The API that Docker exposes is very much an implementation detail, more than a public-facing thing you should be playing around with. I've written plenty of Docker API clients; they are not fun. So if I could basically dissuade you in any way, I really want to. If you really want to play with it, put on a helmet. It's seriously good advice. This API can probably- for a while 'id' was spelt 3 different ways: there was all-uppercase 'ID', 'Id', and 'id' all lowercase.

Docker images are super cheap; they're all built on each other. Essentially you have different layers in an image. Every time you perform an action, you're pulling from all of the images below it. So when I say FROM debian:unstable, it's basing all of your changes on the debian:unstable layer. So if you only make a couple of minimal changes, it's really cheap. So the more and more layers you add, it's not really that bad. If you extend FROM debian:unstable in a couple of places, it's not actually duplicating that material on disk; it's just all in that one place, that one layer. You should definitely use images for as much as you can. Having good images is definitely a huge improvement over trying to do this stuff raw.

Asheesh has a question.

[Asheesh]: How are they cheap? Is it using copy-on-write, is it using aufs? Is it using a custom block layer? What? Huh?

[Paul]: Great. Thanks, Asheesh.

[laughter]

Yes, so they are written to the filesystem and mounted on top of each other in a variety of fun ways. You can either use device-mapper, you can use aufs, or you can use btrfs. device-mapper should not be used under any circumstances. I don't know why it's still in the tree; it's pretty bad. I used it on my-

What's that?

[comment]: (inaudible), aufs

Compared to-, yes- aufs is not great, but it is much better than device-mapper. So it is what I'm using until btrfs becomes a bit more stable. I want to switch to it, but I haven't had the chance to switch my VPS to btrfs. So right now the most stable backend, in my opinion, is aufs. Yes, it's deprecated, and there are plenty of operating systems that don't ship aufs bundled anymore, like Arch. So that turns out to be a problem. But whatever you do, avoid device-mapper. So essentially it uses copy-on-write for everything, including the containers. Everything is mounted on top of one another using a variety of different methods. So it's definitely, definitely cheap.

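You can see the layer sharing he's describing with stock Docker commands (a sketch, using the image from earlier):

    # Show each layer an image is built from, and its size;
    # images FROM the same base share those layers on disk
    docker history paultag/nginx
    # Superseded, unreferenced layers show up in listings as <none>
    docker images
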
Yes, basically ensure you can hard-reboot your machine, kill all the offline containers, start everything back up, and have it work.

[Russ]: So there's a question on IRC: if everything is layered on top of a base layer, what happens when you upgrade the base layer? Does everything on top of it break?

[Paul]: Yes, so this is- No, this is great. So every time that you create a new image, it's given a new hash, it's given a new layer ID. So you're recreating the image from something new, so essentially the immutability principle holds: you'll still have the old layers, which you'll still be based on. But they'll basically be unreferenced tags, commits that are hanging out that aren't being referenced by anything. They are given a super descriptive name in the Docker images output, which is "<none>". [laughs]

[laughter]

And these are essentially layers that are sitting around that have kind of been moved on from. So if you FROM debian:unstable and debian:unstable updates, then in a couple of weeks you're going to have an image based on IDs that aren't referenced by debian:unstable, which is why people like to continuously upgrade these things. Hopefully that answers the question.

OK, be sure you can start everything back up and have everything just work. The easiest way of doing this is treating them all as ephemeral, read-only process wrappers.

Some of the most interesting stuff- That was just a small review of Docker, for anyone who doesn't know. Now this is the good part. Docker is totally installable by running 'sudo apt-get install docker.io'. All of you guys should do that, it's great. Upstream, tianon in particular, has a super stripped-down Debian image, which is really good to base stuff off of; it's super lightweight and it's pullable from stock Docker. If you're interested in the changes from debootstrap, again, they're documented in a shell script: /usr/share/docker.io/contrib/mkimage-debootstrap.sh. Which I think might be the deprecated version, I can't remember. If you're doing a lot with Docker, feel free to check out what that's doing and make your own image.

For Debian development, because I feel like this is going to start coming up: don't use the Docker image from the index. Don't dput stuff that you build with that image. If you're really trying to use Docker to package stuff, build the base image yourself. I think that's pretty sane advice. Just like with pbuilder or sbuild, you wouldn't trust a chroot that you wget; don't just trust a Docker image that you're pulling from the Internet.

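Building your own trusted base image can be as simple as this sketch (the mirror URL and tag are illustrative; the mkimage script above does a more polished version of the same thing):

    # debootstrap a minimal unstable chroot, then import it as a local image
    sudo debootstrap unstable ./unstable-root http://ftp.debian.org/debian
    sudo tar -C ./unstable-root -c . | docker import - local/debian:unstable
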
Which brings me to another fun point: dbuilder, or something like that. Someone should totally do that. Having a backend that's as flexible as Docker would be really interesting. Having something with a pbuilder-like interface that uses Docker containers on the backend is something I've been interested in for a long time. You can even tag images with build-deps installed, so you don't have to have that warm-up time every time. All sorts of crazy stuff. If anyone's interested in doing that, I'd love to talk with you about how to do that.

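The build-deps trick he mentions might look like this as a sketch (the package and image names are hypothetical, and the base image would need deb-src entries for build-dep to work):

    # Bake the build-deps for a package into a reusable, tagged image
    docker run --name deps local/debian:unstable \
        sh -c 'apt-get update && apt-get build-dep -y hello'
    docker commit deps local/build-hello
    # Later builds skip the warm-up: mount the source tree and build
    docker run -v "$PWD":/build -w /build local/build-hello \
        dpkg-buildpackage -us -uc
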
Essentially I want to turn this BoF into "What can Docker do with Debian? What can Debian do with Docker?", because that's sort of what I'm interested in. I see a potential, and hopefully other people do too.

A quick overview of some future plans before we start more discussion. Nightly builds: check-ish. We have nightly builds going to PPAs. I need to set up a debile cluster to get nightly builds for Debian. These are mostly useful for myself and other people interested in testing nightlies, and for making sure packaging works continuously. That's something I've been interested in, something that's mostly kind of working; props to tianon.

Backports: we have a lot of stuff backported in a PPA. We need to upload that to Debian pretty soon, but it involves backporting Go, which means that we need to commit to maintaining Go in stable. So as you can probably guess, I'm not super on top of that.

I would love to see more Debian people push for content-based IDs of layers. The layers I was talking about aren't actually given IDs based on the content of the layer; they're just IDs. If we had content-based IDs, then we could do better stuff, such as verifying the integrity of an image, or signing of images, which would be really cool, so that we could gpg-sign an image and then assert that it's the image that we have. Or set up a Docker daemon somewhere that only runs images that are pgp-signed, which would be awesome. Basically limit the stuff to only stuff I've signed. Potentially a trusted Debian image, somehow? I'm not sure what that would look like, what the logistics of that would look like. For now I think decentralising this and pushing it to all the people probably makes sense.

Docker 1.2.0 was released this week, and I plan to upload it into unstable as soon as md2man is through NEW. So that should be really soon now. [laughs]

OK, right. Who's ready to flame?

[question4]: I've kind of been following Docker upstream development, and I've noticed the version numbers: 9 months ago it was 0.2, 0.3, 0.4, just jumping, and we're already at 1.2, and we're talking about a jessie freeze maybe this year. How do you plan to maintain that going forward, or keep up with upstream? Do you have any thoughts there?

[Paul]: I don't think there's a good answer for that. The 1.0 release was supposed to be something a little more stable and more maintained; it's not turned out that way. 1.2 is much more stable and much better supported than 1.0 right now. I can't imagine that'll be true in the future, but I'm hoping that if we can sync Ubuntu and Debian on a particular version, the collective user base will be enough to pressure upstream. Which I think would be something worthwhile. The Docker upstream is super friendly and they're all really awesome. I love them all dearly. I poke fun at them plenty, and I've definitely poked fun at them in this talk, and I'm sure I'm going to hear about it. They're definitely amazing and want good things for the world. So if there was definitely a use case in which this made sense, and I think a stable release of Debian and maybe a couple of versions of Ubuntu would be that, then I think we could probably pull off some support. It's a good point, fair point. But Docker 1.2 outclasses 1.0 in nearly every way, so it's definitely not worth keeping us on a "stable" version that's not better in any way.

Oh come on. Flame!

[laughter]

[question5]: So you said it's not suitable to prevent exploits. Is it basically the design of Docker, as in the tool, or is it rather the underlying interfaces provided by the kernel that are not sufficient to run, say, student submissions when assessing student work?

[Paul]: I'm trying to live-exploit Docker in front of you.

[question5]: I guess I wasn't quite clear. Something running inside a Docker container.

[Paul]: Oh, inside Docker. See, now I'm root on the host.

[question5]: Yes, but you screwed up by calling Docker. So if I'm calling Docker in a sensible way, is it reasonable to run untrusted code inside a well-prepared Docker container?

[Paul]: Oh, I see. Yes, if you change the user off root in the Docker container, there is much less of an attack surface. And yes, if you're not a user with permissions, it's a lot harder to do this. And it definitely provides some level of isolation; it's just that the kernel namespacing stuff, I don't think, was meant to provide bullet-proof security. It was meant to provide rough security, and I think it definitely does that pretty well, and if you keep users as non-root it's pretty non-trivial to exploit this. So yeah, you're right: this particular exploit works because I can run Docker, and the docker group is root-equivalent. But yeah, you should be fine.

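Concretely, dropping root inside the container is a one-flag change (a sketch; you can also bake a USER instruction into the image):

    # Run the workload as an unprivileged user instead of root
    docker run -u nobody debian:unstable id
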
[question6]: Just a quick comment on that. If you are running developers' code on production systems, you probably want to use SELinux in combination with Docker.

[Paul]: Yeah, that's good advice.

[question7]: With OpenShift, they use SELinux to isolate the containers from other things.

[Paul]: Awesome. Yes, SELinux sounds like it could be a solution.

[question8]: As somebody who helped maintain SELinux for a while, please don't trust SELinux as a single source of security.

[laughter]

I don't recommend it. It's a great thing as a part of a defence-in-depth strategy, but if it's the only thing lying between you and remote root, you're going to have a bad day.

[Paul]: So all software's terrible. [laughs]

[Russ]: So have you experimented at all with the various privilege isolation, system call limitation, and similar privilege separation stuff in systemd? Because you're using unit files around Docker, have you tried playing with adding that stuff in, to do the containerisation?

[Paul]: I have not, and that's a great idea. That would be awesome.

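What Russ is suggesting would layer systemd's own sandboxing onto the unit file from earlier. A sketch, not from the talk; these directives existed in systemd around that time, though whether they play well with the Docker client is exactly the experiment Paul says he hasn't run:

    [Service]
    # systemd-side hardening on top of Docker's isolation
    NoNewPrivileges=yes
    PrivateTmp=yes
    CapabilityBoundingSet=CAP_NET_BIND_SERVICE
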
Who else has great ideas on how to break Debian with Docker?

[question9]: The aufs backend for Docker has a 42-layer limit.

[Paul]: Well, that's fun.

[question9]: Yeah, you obviously haven't hit that one yet.

[Paul]: 127?

[question9]: 127 now.

[Paul]: I'm so confused. [laughs] So I guess, if it hurts, don't poke it.

[laughter]

[question?]: Trying to attract more flames: would it be reasonable to expect all Debian infrastructure to have Docker run commands, so we could run them on all machines easily and develop on them?

[Paul]: So I've been playing around with Dockerizing dak. Yeah right, um. [laughs] I haven't spent too much time on it, but it's definitely a goal of mine to 'docker run' 3 containers and have a working dak/debile setup. That would let you dput packages in source form to a directory and end up with an apt-get-able .deb directory somewhere else. It's something I'm definitely interested in, Dockerizing more of the Debian infrastructure, so people can run it and test it locally. And having the steps it takes to set it up in a Dockerfile is, like, perfect. It's exactly what I love Docker for. Having something like that, where you can make some changes, then do a 'docker build' of the current directory that you're working in, and be able to test it without worrying about setting it up on the host: that would be key, that would be awesome. I'd love to play with that.

[Asheesh]: Just to make the flame temperature increase: it seems like Docker, by promoting a world of process-based isolation, decreases the importance of things like Debian Policy, which are all about having programs be co-installable and not step on each other's toes. And this seems sort of consistent with, I don't know, the way that the San Francisco Bay Area based development community operates (of which I'm now a part), where we just sort of install some sort of base operating system and then just pour files all over the system.

[laughter]

But I guess I'm supposed to ask a question, so the question is:

[laughter]

[Paul]: Please form your flame in the form of a question.

[Asheesh]: Yeah, but the question is really: should Debian take more seriously the idea that things like Policy may be less important over the next 2-15 years, and alter Debian packaging accordingly? Russ!

[Russ]: So there- [laughs]

[laughter]

So there are several pieces to what Policy does for you. What I would say is, there's a set of problems that Debian has tried to deal with for many years, which are a bunch of the things that are in Policy, which, as you say, are about being able to install a bunch of stuff that, prior to Debian putting a bunch of work into it, would have actually conflicted with each other; and given all that Debian did, now they don't conflict with each other. There's a bunch of stuff like alternatives and diversions and all that kind of thing. I think that stuff is still going to be useful in a lot of cases. It's possible that it will not be useful inside the little Docker containers that you're using to run production infrastructure, and I think we'd all be pretty happy to see that happen; those are often workarounds for problems that are not as good as just having the one thing installed. For example, one of the things I want to use Docker for is to set up a test MIT KDC and a Heimdal KDC, so I can test Kerberos code against both of them, and right now the packages conflict for a bunch of reasons. And you can kind of fix that with alternatives, except you can't really fix that with alternatives, because the kadmin interface is completely different, and then you get into a big argument.

So there are parts of Policy like that that will be less important. But I think that even when you put everything inside Docker, having all of the binaries in /var/tmp is still not useful when something goes wrong and you want to find the command that went wrong, and you didn't think to look in /var/tmp for the command. [laughs] So I think there's still some role for "I installed this thing, now where the hell did all the bits of it go?", and "I want to configure this thing, and I would like all of the configuration files to be in the configuration file directory and not scattered off in root's home directory." So that part of Policy I don't think really changes.

[question?]: So what Paul gave us was a bunch of recommendations on top of what Docker, if you can call it that, describes. Isn't that something that would be useful as part of a Debian Docker policy, as in, how do you Dockerize applications for Debian? And in that case, you can still have alternatives and diversions and everything else that actually allows you to have packages co-exist inside that debian:unstable base image, and you still need that to build your base images, or any images for Docker. You could have some sane recommendations on how to lay things out with Docker on top of that.

[Paul]: Yeah, interesting. I hadn't really thought about that too much. If people would be happy with documenting best practices for Debian with Docker, I'd be happy to spend time and effort on it. I don't know if me dictating that kind of thing is the best idea, but I think if other people want to try to form coherent thoughts around this, that would be a lot of fun.

Oh come on, you've got more than that!

[question?]: Within the next 20 minutes, can we Dockerize Subsurface?

[Paul]: I've got 5 minutes left.

[question?]: There's a man here in 20 minutes who's going to be upset about the fact that Subsurface isn't using static linking.

[Paul]: Run 'sudo apt-get install subsurface'. Should be good.

[question?]: We solve all of our static linking problems that way? [laughs]

[question?]: You said that Docker was using aufs. Did you have some problems with the stability of aufs?

[Paul]: I have not. Most of my problems have been with using non-aufs backends. As a matter of fact, I can't even get the kernel to run on Linode, because the kernel there is built without aufs in it. I actually have a blog post where I load from Xen grub, to grub 0.9, to grub 2.0, to the Debian kernel, because the old Xen grub doesn't support .xz compression. Which is great.

[laughter]

Yeah it is. So if someone wants to get aufs working on Linode, there's a blog post somewhere.

Alright, I think I'm out of time, but we can keep talking about Docker stuff. Cool.

[applause]
  • Not Synced