
Stretching out for trustworthy reproducible builds creating bit by bit identical binaries

  • 0:01 - 0:02
    Welcome and good morning
  • 0:04 - 0:07
    This is the reproducible builds team,
    talking about
  • 0:07 - 0:10
    "Stretching out towards trustworthy
    computing"
  • 0:12 - 0:20
    [Applause]
  • 0:22 - 0:26
    We're 4 on stage, but actually this is a
    team effort.
  • 0:26 - 0:31
    All these people listed here have
    contributed to the project at one point.
  • 0:31 - 0:33
    The 4 of us, that's
  • 0:33 - 0:34
    Lunar − me
  • 0:34 - 0:35
    there's Dhole,
  • 0:35 - 0:36
    Chris Lamb − lamby
  • 0:36 - 0:38
    and Holger.
  • 0:39 - 0:43
    But actually, this is DebConf and so a lot
    more of us have been or are
  • 0:43 - 0:47
    currently here and so, if you want to
    thank anybody that is working on this
  • 0:47 - 0:49
    you need to actually thank all of
    these folks
  • 0:49 - 0:51
    'cause, yay.
  • 0:51 - 0:56
    [Applause]
  • 0:57 - 1:00
    [Holger] The people in blue are here.
  • 1:04 - 1:06
    [Lunar] Let's get started.
  • 1:06 - 1:08
    Quick recap on what we're talking
    about.
  • 1:08 - 1:11
    We have software, it's made from source.
  • 1:11 - 1:15
    Source is readable by humans or at least
    a good amount of humans.
  • 1:15 - 1:17
    In this room it's good.
  • 1:17 - 1:24
    Binary, readable by computer and some
    tiny fraction of humanity.
  • 1:24 - 1:30
    Going from source to binary is called
    build, or like building or compiling
  • 1:30 - 1:33
    and we're doing free software and
    free software is awesome because
  • 1:33 - 1:38
    we can actually run these binaries like
    we want
  • 1:38 - 1:44
    We can actually study the software, how
    it's been made by studying the source
  • 1:44 - 1:49
    and by studying the source we can assess
    that it does what it's supposed to do
  • 1:49 - 1:51
    and not something else, that it does not
  • 1:51 - 1:56
    have malware, or trojans or security bugs
  • 1:56 - 2:01
    So we have the binary that can be used,
    fine.
  • 2:01 - 2:04
    We have the source that can be verified.
  • 2:04 - 2:10
    Problem is that right now, the only way we
    know that a binary that we get…
  • 2:10 - 2:16
    We have to trust a website or a Debian
    repository that says
  • 2:16 - 2:18
    "Well, this binary has been made with this
    source"
  • 2:18 - 2:23
    But there's no way we can actually prove
    that.
  • 2:23 - 2:27
    This is actually a problem that has been
    well explained by
  • 2:27 - 2:34
    Mike Perry and Seth Schoen at the 31C3
    in Hamburg last December.
  • 2:34 - 2:41
    For example, Seth Schoen made a proof of
    concept exploit for the Linux kernel
  • 2:41 - 2:52
    that when GCC was called, the kernel would
    without modifying anything on the disk
  • 2:52 - 2:59
    when the kernel detects that GCC is going
    to read a C file, it will insert some
  • 2:59 - 3:06
    extra lines of code, and these lines of
    code can be a very bad thing
  • 3:06 - 3:09
    in the case of the 31C3 talk I was just
    recalling.
  • 3:09 - 3:18
    Actually, you can even have developers
    who are in very good faith, who have
  • 3:18 - 3:21
    totally secure dev machines, or they
    thought they have,
  • 3:21 - 3:24
    who have reviewed all their source code
    for any bugs
  • 3:24 - 3:31
    and we would still get totally owned as
    soon as their computer gets compromised
  • 3:31 - 3:34
    or one of the build daemons from Debian
    gets compromised for example.
  • 3:34 - 3:41
    This is not, like, hypothetical threats
    here we're discussing
  • 3:41 - 3:46
    A couple of months after Seth and Mike's
    talk at 31C3,
  • 3:46 - 3:49
    the Intercept revealed from the Snowden
    leaks
  • 3:49 - 3:56
    that at a CIA conference in 2012, one
    of the talks that happened
  • 3:56 - 3:59
    was about a project called Strawhorse.
  • 3:59 - 4:05
    Strawhorse is about modifying Apple Xcode,
    which is the development environment
  • 4:05 - 4:09
    for Mac OS X and iOS applications
  • 4:09 - 4:11
    and well, they were modifying Xcode so
    it would produce,
  • 4:11 - 4:13
    without the developer knowing,
  • 4:13 - 4:23
    binaries with trojans, malware,
    ??? binaries, lots of bad things.
  • 4:23 - 4:25
    So, solution:
  • 4:25 - 4:29
    enable anyone to reproduce identical
    binary packages from a given source.
  • 4:29 - 4:35
    Because if using a source, using the same
    environment,
  • 4:35 - 4:40
    multiple people on different computers, on
    different networks, at different times,
  • 4:40 - 4:43
    can all get the same thing
    from the same source
  • 4:43 - 4:45
    all the same binary, byte for byte,
  • 4:45 - 4:47
    then there's a good chance that…
  • 4:47 - 4:55
    Well, everybody could be owned,
    but let's be more joyful and say that
  • 4:55 - 4:59
    probably, if everybody gets the same
    result, there was actually no problem
  • 4:59 - 5:01
    and everybody is safe.
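
The comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming each independent builder hands back the bytes of the package it produced; the builder names and the `sha256_hex`, `builds_agree` and `dissenting_builders` helpers are hypothetical and not part of any Debian tool.

```python
import hashlib
from collections import Counter


def sha256_hex(blob: bytes) -> str:
    """Hex SHA-256 digest of one build artifact."""
    return hashlib.sha256(blob).hexdigest()


def builds_agree(artifacts: dict) -> bool:
    """True when every builder produced byte-for-byte identical output.

    `artifacts` maps a (hypothetical) builder name to the bytes it built.
    """
    digests = {sha256_hex(blob) for blob in artifacts.values()}
    return len(digests) == 1


def dissenting_builders(artifacts: dict) -> list:
    """Names of builders whose output differs from the most common digest."""
    digests = {name: sha256_hex(blob) for name, blob in artifacts.items()}
    if not digests:
        return []
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority)
```

As the talk notes, agreement does not prove that nobody was compromised; it only shows that all builders got the same result, which makes a single compromised build machine much easier to spot.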
  • 5:02 - 5:04
    We call that solution
    "reproducible builds"
  • 5:07 - 5:08
    Yay.
  • 5:08 - 5:11
    [Applause]
  • 5:13 - 5:15
    Actually, it's not only about security.
  • 5:15 - 5:19
    For Debian, we have, if you're doing
    "Multi-Arch: same" packages,
  • 5:19 - 5:25
    well, they need to have the same bytes even
    when they are built for different architectures,
  • 5:25 - 5:28
    the files in the package.
  • 5:28 - 5:34
    Debug packages, you can create at a later
    time, if you forgot to have debug packages
  • 5:34 - 5:36
    in the first place,
  • 5:36 - 5:42
    you can pass the nostrip option later and
    because the package is reproducible,
  • 5:42 - 5:47
    you will get the debug symbols that work
    for software that has been shipped already
  • 5:47 - 5:50
    We do early detection of FTBFS that way
  • 5:50 - 5:54
    because if we try pretty quickly
    to reproduce a build,
  • 5:54 - 5:55
    then it has to work.
  • 5:55 - 5:58
    It's useful for build profiles.
  • 5:58 - 6:02
    We can get smaller .deb deltas,
  • 6:02 - 6:05
    because from one version to the next we
    might have the same content.
  • 6:05 - 6:09
    We can do validation of cross-builds,
  • 6:09 - 6:12
    Helmut Grohne can talk to you about that.
  • 6:12 - 6:17
    And also, Niels Thykier told me that
  • 6:17 - 6:21
    he was very interested in reproducible
    builds because it would enable him to
  • 6:21 - 6:24
    test debhelper better, because
  • 6:24 - 6:29
    if the package builds reproducibly,
    then he makes a change to debhelper
  • 6:29 - 6:32
    he can rebuild the package ???
  • 6:32 - 6:36
    the same version of a package with a newer
    debhelper and see what has changed
  • 6:36 - 6:40
    and this change can be isolated to only
    what he has worked on in debhelper
  • 6:40 - 6:42
    for example.
  • 6:43 - 6:45
    And, oh my.
  • 6:45 - 6:48
    The whole world is watching us.
  • 6:48 - 6:56
    For two years or a year and a half now,
    everybody I meet at security conferences,
  • 6:56 - 6:59
    at hacker conferences, at free software
    conferences is like
  • 6:59 - 7:01
    "Oh you're working on that,
    that's awesome."
  • 7:01 - 7:09
    And, I mean, I've been the one doing quite
    a lot of talks, and everybody comes to me
  • 7:09 - 7:11
    and I'm like "Wow wow, this is way bigger",
  • 7:11 - 7:16
    but we're actually leading the field here.
  • 7:16 - 7:19
    Yay Debian.
  • 7:19 - 7:26
    [Applause]
  • 7:26 - 7:29
    So, we are not the only ones leading the
    field,
  • 7:29 - 7:33
    Bitcoin and Tor made their software
    reproducible before us,
  • 7:33 - 7:37
    Coreboot also succeeded, if you build
    Coreboot without any payload,
  • 7:37 - 7:39
    that's 100% reproducible.
  • 7:39 - 7:44
    FreeBSD has had a page on their wiki since
    2013
  • 7:44 - 7:49
    saying there are 5 reproducibility issues
    in their base system.
  • 7:49 - 7:52
    We're at the moment trying to
    confirm this.
  • 7:52 - 7:57
    On jenkins.debian.net, I've also set up
    now tests for FreeBSD, NetBSD,
  • 7:57 - 7:59
    Coreboot and OpenWrt.
  • 7:59 - 8:03
    So if you go to
    reproducible.debian.net/
  • 8:03 - 8:05
    you get that tested.
  • 8:05 - 8:08
    And there's more in the pipeline.
  • 8:08 - 8:11
    There are other projects interested
    as well.
  • 8:11 - 8:15
    NetBSD also has a variable ???
    which you can set
  • 8:15 - 8:17
    and that builds reproducibly.
  • 8:17 - 8:20
    Though they think "I'm keeping some
    timestamps ??? and then
  • 8:20 - 8:22
    filtering them out later".
  • 8:22 - 8:23
    We disagree.
  • 8:23 - 8:28
    So this is what Debian looks like,
    Debian Sid,
  • 8:28 - 8:30
    but this is a lie.
  • 8:30 - 8:32
    This is not the truth.
  • 8:32 - 8:34
    This is just our test setup.
  • 8:34 - 8:36
    Sid is not like this.
  • 8:36 - 8:40
    For Sid, it's all orange, there's zero
    reproducibility in Sid today.
  • 8:40 - 8:44
    But what we'll talk about now, and in the
    following round table,
  • 8:44 - 8:47
    is how to actually make Sid reproducible.
  • 8:47 - 8:52
    The current status is
  • 8:52 - 8:58
    we've been working on this in Debian for
    two years.
  • 8:58 - 9:02
    We have weekly reports about our project
    now since May
  • 9:02 - 9:07
    and we've given several talks, especially
    in the last year
  • 9:07 - 9:11
    and all these talks, presentations, and
    other stuff are linked in the wiki.
  • 9:11 - 9:15
    There's a page with information about
    Debian, these BSDs,
  • 9:15 - 9:19
    other Linuxes, upstream ???
    all on this wiki.
  • 9:23 - 9:27
    Since DebConf14, which is merely
    a year ago,
  • 9:27 - 9:29
    we've made quite some changes.
  • 9:29 - 9:33
    We have introduced
    strip-nondeterminism
  • 9:33 - 9:39
    which is called by dh at the end
    of the build of the package
  • 9:39 - 9:45
    and will normalize some things
    which Chris will explain later
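
To give a feel for the kind of thing that needs normalizing, here is a small Python sketch. It is not how strip-nondeterminism is implemented; it only assumes, for illustration, that a timestamp embedded in a gzip header is one typical source of nondeterminism. The helper names are hypothetical.

```python
import gzip


def compress_with_timestamp(data: bytes, mtime: int) -> bytes:
    """Gzip `data`, writing `mtime` into the gzip header's timestamp field."""
    return gzip.compress(data, mtime=mtime)


def compress_normalized(data: bytes) -> bytes:
    """Gzip `data` with the header timestamp pinned to zero.

    With the timestamp fixed, the same input gives the same bytes
    no matter when the build runs.
    """
    return gzip.compress(data, mtime=0)
```

Two calls to `compress_with_timestamp` with different `mtime` values produce different files even though the decompressed content is identical, which is exactly the situation a byte-for-byte comparison would flag.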
  • 9:45 - 9:50
    We have decided on a fixed build path
  • 9:50 - 9:54
    because the build path is leaked
    in the binaries and several things
  • 9:54 - 9:57
    We haven't found a way yet to make
    the build path arbitrary.
  • 9:57 - 10:03
    We designed a way to record the build
    environment
  • 10:03 - 10:08
    because to rebuild, you need to recreate
    the build environment.
  • 10:08 - 10:12
    We set up this Jenkins setup.
  • 10:12 - 10:17
    We wrote diffoscope which used to be
    called debbindiff
  • 10:17 - 10:21
    which shows differences between two
    packages or two directories or
  • 10:21 - 10:24
    two filesystems by now.
  • 10:24 - 10:31
    There's SOURCE_DATE_EPOCH, which is a way
    to expose to the tools
  • 10:31 - 10:34
    the last modification of the source.
  • 10:34 - 10:37
    Because the build date, people want to
    include the build date
  • 10:37 - 10:39
    because they think this is a meaningful
    indication:
  • 10:39 - 10:42
    when a build was done,
    which software used.
  • Not Synced
    but if the build always recreates
    the same results
  • Not Synced
    the build date becomes meaningless
  • Not Synced
    and the really interesting thing is
    the latest modification of the source
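
A tool that wants to embed a date could follow this idea roughly as sketched below. The sketch assumes SOURCE_DATE_EPOCH is an environment variable holding seconds since the Unix epoch; the `build_timestamp` and `build_date_string` helpers are hypothetical names used only for illustration.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=None) -> int:
    """Timestamp to embed in build output.

    Prefers SOURCE_DATE_EPOCH (last modification of the source) and only
    falls back to the current time when the variable is not set.
    """
    env = os.environ if environ is None else environ
    value = env.get("SOURCE_DATE_EPOCH")
    if value:
        return int(value)
    return int(time.time())


def build_date_string(environ=None) -> str:
    """Format the timestamp in UTC so the local timezone cannot leak in."""
    ts = build_timestamp(environ)
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
```

With the variable set, two builds of the same source on different days embed the same date; the fallback branch is the non-reproducible behaviour the talk argues against.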
Video Language:
English
Team:
Debconf
Project:
2015_debconf15
