reproducible_builds_status_update.webm

Next talk: Chris and Holger are going to talk to us again about
reproducible builds and tell us where they're up to.

Thanks very much.

The outline of this talk comes from last year, when we realised there
were a lot of questions. The rough plan is to quickly go over what
reproducible builds are. I guess everyone is up to speed, but getting
everyone on the same page would be a good idea. Then Holger's going to
jump in and give the status update, and then we're going to talk about
future work, questions etc.

    What is the actual problem we're
    solving here?
  • Nesynchronizované
    You can always inspect the source code of
    free software for malicious flaws
  • Nesynchronizované
    or just flaws as well.
  • Nesynchronizované
    Unfortunately distributions provide
    precompiled binaries to end users.
  • Nesynchronizované
    So can you actually trust this
    compilation process has not
  • Nesynchronizované
    introduced flaws of its own?
The problem is that, if you want to go after end users, it seems very
effective to go after developers. Because if you infect a developer's
machine, you will then infect all the users of the software they
generate. There are financial incentives. There always were, but even
more so these days with mobile phones etc.

You can also have very subtle flaws. This one in particular: there was
a root exploit in OpenSSH just by changing a compare-equal. That sort
of assembler jump thing, and it gives you root, but with only a single
bit difference in the binary. Which is not too shabby.

Then you have all sorts of cute demos where you load up the source
code in Vim and it just looks like 'Hello world', but when you compile
it with GCC your kernel rootkit just goes 'oh, I'm going to give you a
different file' and self-replicates, things like that. It's difficult
to trust the process.

And there's some recent history as well around XcodeGhost and iOS and
adverts and things like that. You can Google those things. Really
scary stuff.

The last example is actually coming from a CIA design paper from 2012,
which was then found in the wild in 2014. So these exploits are
actually happening. People are targeting developers to get users.
XcodeGhost had 20 million user installations. It was probably not the
CIA or NSA, but we don't know who it was. There are many people who do
these exploits in the wild.

Yeah, it's not just 'Here's this cute thing we can talk about'. It's
actually happening.

The motivation is to ensure no flaws are introduced during the build
process. We do this by ensuring the build always produces identical
results. Then multiple parties do the same thing. I build it, you
build it, your friends build it etc. An attacker would need to infect
everyone simultaneously, otherwise they'd be detected. For example, if
my machine was compromised I would suddenly come up with a different
result. I would come up with different binaries. And you'd be 'what's
going on here', and eventually we would discover that my machine was
rootkitted etc.

You probably know it, but 'identical' means bit-by-bit identical. As
that is really the same.

Yeah, bit, SHA, MD5, whatever you want.

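The comparison described above can be sketched in a few lines. This is only an illustration: the builder names and the byte strings standing in for binaries are made up, and SHA-256 is an arbitrary choice of digest.

```python
import hashlib
from typing import Mapping


def digest(artifact: bytes) -> str:
    """Hex digest of one builder's output."""
    return hashlib.sha256(artifact).hexdigest()


def builds_agree(artifacts: Mapping[str, bytes]) -> bool:
    """True only if every builder produced bit-for-bit identical output."""
    return len({digest(data) for data in artifacts.values()}) == 1


# Hypothetical stand-ins for real binaries.
honest = b"\x7fELF pretend binary"
tampered = bytes([honest[0] ^ 0x01]) + honest[1:]  # a single flipped bit

print(builds_agree({"me": honest, "you": honest}))
print(builds_agree({"me": tampered, "you": honest}))
```

A single differing bit changes the digest, so the compromised builder stands out as soon as anyone else publishes theirs.
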
There are a bunch of challenges here. The biggest one is timestamps. A
lot of software just loves to include timestamps everywhere:
documentation, the __DATE__ and __TIME__ macros, just all over the
place, in file names etc. Things like that.

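One common way to tame embedded timestamps is to take the date from the environment instead of the clock. The sketch below assumes the build honours the SOURCE_DATE_EPOCH variable, a convention that is not mentioned in this part of the talk; the function name is hypothetical.

```python
import os
import time


def build_date(environ=None):
    """Date to embed in build output.

    Uses SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set,
    and only falls back to the current time when it is not.
    """
    environ = os.environ if environ is None else environ
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    # Format in UTC so the builder's timezone does not leak in either.
    return time.strftime("%Y-%m-%d", time.gmtime(epoch))


print(build_date({"SOURCE_DATE_EPOCH": "1451606400"}))  # 2016-01-01
```
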
Builds often vary by locale and timezone. Different newlines,
different sorting orders, for example collations. Different versions
of libraries. I'm not sure what this refers to exactly. Moving on.

Non-deterministic file ordering: for example, shell globs are not
really defined to be, I say not really defined, they aren't defined to
come out in any normal order. Also the readdir syscall, it doesn't
actually promise any particular ordering.

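A minimal sketch of the usual fix, sorting a directory listing before the build consumes it. The file names are made up for the example.

```python
import os
import tempfile


def list_sources(directory):
    """Return directory entries in a fixed, locale-independent order."""
    # os.listdir() returns entries in arbitrary order, so sort explicitly.
    # Sorting Python strings compares code points, not locale collation.
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as workdir:
    for name in ("zeta.c", "alpha.c", "mid.c"):
        open(os.path.join(workdir, name), "w").close()
    print(list_sources(workdir))  # ['alpha.c', 'mid.c', 'zeta.c']
```
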
Dictionary/hash key ordering. So this is in things like Perl and
Python: you use a dict or a hash. If you iterate over the keys with
that, it's a non-deterministic order. If your build system loops over
such a hash or a dictionary, then the results from this build could be
non-reproducible and non-deterministic.

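A small sketch of the usual remedy, iterating over sorted keys when writing output. The option names are hypothetical. Current Python dicts iterate in insertion order, which still differs when two builds populate the dict differently, so the explicit sort is what makes the output stable.

```python
def render_config(options):
    """Serialise build options with keys in sorted order."""
    return "".join(f"{key}={options[key]}\n" for key in sorted(options))


# The same options, inserted in a different order, give the same text.
first = render_config({"prefix": "/usr", "debug": "no"})
second = render_config({"debug": "no", "prefix": "/usr"})
print(first == second)  # True
```
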
And also things like files that are part of the build process will
just absorb stuff from the surrounding environment, like umask and all
that kind of stuff that lives outside there.

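One way to stop an archive from absorbing the environment is to pin every piece of metadata explicitly. The following is a sketch under that assumption, not the tooling the project itself uses; the function name and the fixed values are illustrative.

```python
import io
import tarfile


def deterministic_tar(files):
    """Pack {name: bytes} into a tar archive with pinned metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in sorted(files):           # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mode = 0o644                 # not whatever the umask gives
            info.mtime = 0                    # not the current clock
            info.uid = info.gid = 0           # not the builder's account
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


one = deterministic_tar({"b.txt": b"two", "a.txt": b"one"})
two = deterministic_tar({"a.txt": b"one", "b.txt": b"two"})
print(one == two)  # True
```
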
Build paths is a very interesting one, which we cover in greater
detail on another slide. Also specifying the environment; we'll also
cover this one in the build info slides.

So not only are there privacy and security advantages of moving
towards reproducible builds, there are also technical advantages. It's
faster to build if you basically keep hitting cache. I'm pretty
certain this is why Google are interested in it. Because of the amount
of compilation they do, they're just going to save a whole bucketload
of money just by 'Oh, we don't need to rebuild this because it's the
same SHA' etc.

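The caching idea can be sketched as a store keyed by the hash of the inputs. This is a toy illustration, not a description of any particular build system; the class name and the stand-in compiler are hypothetical. It only works if the build is deterministic, because otherwise a cached result is not equivalent to a fresh one.

```python
import hashlib


class BuildCache:
    """Reuse build results for inputs that hash to the same value."""

    def __init__(self):
        self._store = {}
        self.builds = 0  # how many times we really had to compile

    def build(self, source: bytes, compile_fn):
        key = hashlib.sha256(source).hexdigest()
        if key not in self._store:
            self._store[key] = compile_fn(source)
            self.builds += 1
        return self._store[key]


cache = BuildCache()
fake_compile = lambda src: src.upper()  # stand-in for a real compiler
cache.build(b"int main() {}", fake_compile)
cache.build(b"int main() {}", fake_compile)  # same SHA, served from cache
print(cache.builds)  # 1
```
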
It's very nice for testing revisions and changes. I use all our tools:
when doing QA uploads or NMUs, you rebuild a package and then you
compare it to the previous one. The only things that have changed
should be the things that you've changed; there hasn't been all sorts
of random other nonsense being reordered, with timestamps added. You
can get rid of all that noise and just be 'oh yeah, brilliant, I can
see that the patch I've applied here has actually changed the
behaviour of the program', and only that. It hasn't done all sorts of
weird, weird stuff. So you have safer uploads in that sense.

Speaking of safety, a reproducible build won't go talking to the
Internet like a lot of modern package managers like to do. Maven-style
ones. Also, a reproducible build will typically not have any
non-deterministic failure modes. So there's a lot of tests and test
suites in Debian that will try and test things like 'Oh, is this
algorithm N squared or bigger than N'. It will try doing that by
running some sort of benchmark and fail if it doesn't meet some sort
of arbitrary time difference, and that's obviously not reliable. So we
get rid of all those nonsense things.

It also finds bugs in really weird locales. We build in French,
Swiss-French, and it just comes up with all sorts of nonsense. Or
timezones: if you build in UTC-12 then this date library doesn't work
anymore, and it's like 'you had one job, to be a date library'.

[audience laughter]

It's pretty scary, and there are some pretty cute bugs.

It also detects if your machine is, you just have a broken ??? [8:28].
We build a year and a month in the future. You find things like the
maintainer has added a pre-generated SSL certificate to their tests
and it expires within the year. And so it breaks. We're preemptively
detecting that fail to ??? [9:01] source.

Title:
reproducible_builds_status_update.webm
Video Language:
English
Team:
Debconf
Project:
2016_miniconf-cambridge16
Duration:
43:10
