Stretching out for trustworthy reproducible builds: creating bit-by-bit identical binaries
- 
0:01 - 0:02Welcome and good morning
- 
0:04 - 0:07This is the reproducible builds team,
 talking about
- 
0:07 - 0:10"Stretching out towards trustworthy
 computing"
- 
0:12 - 0:20[Applause]
- 
0:22 - 0:26We're 4 on stage, but actually this is a
 team effort.
- 
0:26 - 0:31All these people listed here have
 contributed to the project at one point.
- 
0:31 - 0:33The 4 of us, that's
- 
0:33 - 0:34Lunar − me
- 
0:34 - 0:35there's Dhole,
- 
0:35 - 0:36Chris Lamb − lamby
- 
0:36 - 0:38and Holger.
- 
0:39 - 0:43But actually, this is DebConf and so a lot
 more of us have been or are
- 
0:43 - 0:47currently here and so, if you want to
 thank anybody that is working on this
- 
0:47 - 0:49you need to actually thank all of
 these folks
- 
0:49 - 0:51'cause, yay.
- 
0:51 - 0:56[Applause]
- 
0:57 - 1:00[Holger] The people in blue are here.
- 
1:04 - 1:06[Lunar] Let's get started.
- 
1:06 - 1:08Quick recap on what we're talking
 about.
- 
1:08 - 1:11We have software, it's made from source.
- 
1:11 - 1:15Source is readable by humans or at least
 a good amount of humans.
- 
1:15 - 1:17In this room it's good.
- 
1:17 - 1:24Binary, readable by computer and some
 tiny fraction of humanity.
- 
1:24 - 1:30Going from source to binary is called
 build, or like building or compiling
- 
1:30 - 1:33and we're doing free software and
 free software is awesome because
- 
1:33 - 1:38we can actually run these binaries like
 we want
- 
1:38 - 1:44We can actually study the software, how
 it's been made by studying the source
- 
1:44 - 1:49and by studying the source we can assess
 that it does what it's supposed to do
- 
1:49 - 1:51and not something else, that it does not
- 
1:51 - 1:56have malware, or trojans, or security bugs.
- 
1:56 - 2:01So we have the binary that can be used,
 fine.
- 
2:01 - 2:04We have the source that can be verified.
- 
2:04 - 2:10Problem is that right now, the only way we
 know that a binary that we get…
- 
2:10 - 2:16We have to trust a website or a Debian
 repository that says
- 
2:16 - 2:18"Well, this binary has been made with this
 source"
- 
2:18 - 2:23But there's no way we can actually prove
 that.
- 
2:23 - 2:27This is actually a problem that has been
 well explained by
- 
2:27 - 2:34Mike Perry and Seth Schoen at the 31C3
 in Hamburg last December.
- 
2:34 - 2:41For example, Seth Schoen made a proof of
 concept exploit for the Linux kernel
- 
2:41 - 2:52that when GCC was called, the kernel would
 without modifying anything on the disk
- 
2:52 - 2:59when the kernel detects that GCC is going
 to read a C file, it will insert some
- 
2:59 - 3:06extra lines of code, and these lines of
 code can be a very bad thing
- 
3:06 - 3:09in the case of the 31C3 talk I was just
 recalling.
- 
3:09 - 3:18Actually, you can even have developers
 who are in very good faith, who have
- 
3:18 - 3:21totally secure dev machines, or they
 thought they have,
- 
3:21 - 3:24who have reviewed all their source code
 for any bugs
- 
3:24 - 3:31and we would still get totally owned as
 soon as their computer gets compromised
- 
3:31 - 3:34or one of the build daemons from Debian
 gets compromised for example.
- 
3:34 - 3:41These are not, like, hypothetical threats
 here we're discussing
- 
3:41 - 3:46A couple of months after Seth and Mike's
 talk at 31C3,
- 
3:46 - 3:49the Intercept revealed from the Snowden
 leaks
- 
3:49 - 3:56that at a CIA conference in 2012, one
 of the talks that happened
- 
3:56 - 3:59was about a project called Strawhorse.
- 
3:59 - 4:05Strawhorse is about modifying Apple Xcode,
 which is the development environment
- 
4:05 - 4:09for Mac OS X and iOS applications
- 
4:09 - 4:11and well, they were modifying Xcode so
 it would produce,
- 
4:11 - 4:13without the developer knowing,
- 
4:13 - 4:23binaries with trojans, malware,
 ??? binaries, lots of bad things.
- 
4:23 - 4:25So, solution:
- 
4:25 - 4:29enable anyone to reproduce identical
 binary packages from a given source.
- 
4:29 - 4:35Because if using a source, using the same
 environment,
- 
4:35 - 4:40multiple people on different computers, on
 different networks, at different times,
- 
4:40 - 4:43can all get the same thing
 from the same source
- 
4:43 - 4:45all the same binary, byte for byte,
- 
4:45 - 4:47then there's a good chance that…
- 
4:47 - 4:55Well, everybody could be owned,
 but let's be more joyful and say that
- 
4:55 - 4:59probably, if everybody gets the same
 result, there was actually no problem
- 
4:59 - 5:01and everybody is safe.
- 
5:02 - 5:04We call that solution
 "reproducible builds"
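
The check described above can be sketched as follows. This is a minimal illustration, not the project's tooling: the builder names and the byte strings standing in for built packages are hypothetical, and SHA-256 is assumed as the way to compare results byte for byte.

```python
import hashlib
from typing import Mapping


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a build artifact."""
    return hashlib.sha256(data).hexdigest()


def builds_agree(artifacts: Mapping[str, bytes]) -> bool:
    """True only if every builder produced byte-for-byte identical output."""
    digests = {sha256_of(blob) for blob in artifacts.values()}
    return len(digests) == 1


# Hypothetical results from three independent builders of the same source.
honest = {
    "alice": b"\x7fELF-demo",
    "bob": b"\x7fELF-demo",
    "carol": b"\x7fELF-demo",
}
# Same source, but one builder's toolchain was compromised.
tampered = dict(honest, bob=b"\x7fELF-demo-with-backdoor")

print(builds_agree(honest))
print(builds_agree(tampered))
```

A single differing digest does not say who was compromised, only that the builders should not all be trusted until the difference is explained.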
- 
5:07 - 5:08Yay.
- 
5:08 - 5:11[Applause]
- 
5:13 - 5:15Actually, it's not only about security.
- 
5:15 - 5:19For Debian, we have, if you're doing
 "Multi-Arch: same" packages,
- 
5:19 - 5:25well, they need to have the same bytes
 when they are built for different architectures,
- 
5:25 - 5:28the files in the package.
- 
5:28 - 5:34Debug packages, you can create at a later
 time, if you forgot to have debug packages
- 
5:34 - 5:36in the first place,
- 
5:36 - 5:42you can pass the no-strip option later and
 because the package is reproducible,
- 
5:42 - 5:47you will get the debug symbols that work
 for software that has been shipped already
- 
5:47 - 5:50We do early detection of FTBFS that way
- 
5:50 - 5:54because if we try pretty quickly
 to reproduce a build,
- 
5:54 - 5:55then it has to work.
- 
5:55 - 5:58It's useful for build profiles.
- 
5:58 - 6:02We can get smaller .deb deltas,
- 
6:02 - 6:05because from one version to the next we
 might have the same content.
- 
6:05 - 6:09We can do validation of cross-builds,
- 
6:09 - 6:12Helmut Grohne can talk to you about that.
- 
6:12 - 6:17And also, Niels Thykier told me that
- 
6:17 - 6:21he was very interested in reproducible
 builds because it would enable him to
- 
6:21 - 6:24test debhelper better, because
- 
6:24 - 6:29if the package builds reproducibly,
 then he makes a change to debhelper
- 
6:29 - 6:32he can rebuild the package ???
- 
6:32 - 6:36the same version of a package with a newer
 debhelper and see what has changed
- 
6:36 - 6:40and this change can be isolated to only
 what he has worked on debhelper
- 
6:40 - 6:42for example.
- 
6:43 - 6:45And, oh my.
- 
6:45 - 6:48The whole world is watching us.
- 
6:48 - 6:56For two years or a year and a half now,
 everybody I meet at security conferences,
- 
6:56 - 6:59at hacker conferences, at free software
 conferences is like
- 
6:59 - 7:01"Oh you're working on that,
 that's awesome."
- 
7:01 - 7:09And, I mean, I've been the one doing quite
 a lot of talks, and everybody comes to me
- 
7:09 - 7:11and I'm like "Wow wow, this is way bigger",
- 
7:11 - 7:16but we're actually leading the field here.
- 
7:16 - 7:19Yay Debian.
- 
7:19 - 7:26[Applause]
- 
7:26 - 7:29So, we are not the only ones leading the
 field,
- 
7:29 - 7:33Bitcoin and Tor made their software
 reproducible before us,
- 
7:33 - 7:37Coreboot also succeeded, if you build
 Coreboot without any payload,
- 
7:37 - 7:39that's 100% reproducible.
- 
7:39 - 7:44FreeBSD has had a page on their wiki since
 2013
- 
7:44 - 7:49saying there are 5 reproducibility issues
 in their base system.
- 
7:49 - 7:52We're at the moment trying to
 confirm this.
- 
7:52 - 7:57On jenkins.debian.net, I've also set up
 now tests for FreeBSD, NetBSD,
- 
7:57 - 7:59Coreboot and OpenWrt.
- 
7:59 - 8:03So if you go to
 reproducible.debian.net/
- 
8:03 - 8:05you get that tested.
- 
8:05 - 8:08And there's more in the pipeline.
- 
8:08 - 8:11There are other projects interested
 as well.
- 
8:11 - 8:15NetBSD also has a variable ???
 which you can set
- 
8:15 - 8:17and that builds reproducibly.
- 
8:17 - 8:20Though they think "I'm keeping some
 timestamps ??? and then
- 
8:20 - 8:22filtering them out later".
- 
8:22 - 8:23We disagree.
- 
8:23 - 8:28So this is what Debian looks like,
 Debian Sid,
- 
8:28 - 8:30but this is a lie.
- 
8:30 - 8:32This is not the truth.
- 
8:32 - 8:34This is just our test setup.
- 
8:34 - 8:36Sid is not like this.
- 
8:36 - 8:40For Sid, it's all orange, there's zero
 reproducibility in Sid today.
- 
8:40 - 8:44But what we'll talk about now and in the
 following round table
- 
8:44 - 8:47is how to actually make Sid reproducible.
- 
8:47 - 8:52The current status is
- 
8:52 - 8:58we've been working on this in Debian for
 two years.
- 
8:58 - 9:02We have weekly reports about our project
 now since May
- 
9:02 - 9:07and we've given several talks, especially
 in the last year
- 
9:07 - 9:11and all these talks, presentations, and
 other stuff are linked in the wiki.
- 
9:11 - 9:15There's a page with information about
 Debian, these BSDs,
- 
9:15 - 9:19other Linuxes, upstream ???
 all on this wiki.
- 
9:23 - 9:27Since DebConf14, which is merely
 a year ago,
- 
9:27 - 9:29we've made quite some changes.
- 
9:29 - 9:33We have introduced
 strip-nondeterminism
- 
9:33 - 9:39which is called by dh at the end
 of the build of the package
- 
9:39 - 9:45and will normalize some things
 which Chris will explain later
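
To give an idea of what "normalizing" can mean, here is a minimal sketch that writes a tar archive whose bytes depend only on the file names and contents. It is not how strip-nondeterminism is implemented, and the talk does not say at this point what gets normalized; archive member order, timestamps and ownership are assumed here as typical sources of variation, and `deterministic_tar` is a hypothetical helper.

```python
import io
import tarfile
from typing import Mapping


def deterministic_tar(files: Mapping[str, bytes], mtime: int = 0) -> bytes:
    """Build a tar archive with sorted members and fixed metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime  # fixed timestamp instead of "now"
            info.uid = info.gid = 0  # no builder-specific ownership
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


first = deterministic_tar({"b.txt": b"two", "a.txt": b"one"})
second = deterministic_tar({"a.txt": b"one", "b.txt": b"two"})
print(first == second)
```

The same inputs give the same archive bytes regardless of insertion order or the time of the build, which is the property a post-build normalization step aims for.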
- 
9:45 - 9:50We have decided on a fixed build path
- 
9:50 - 9:54because the build path is leaked
 in the binaries and several things
- 
9:54 - 9:57We haven't found a way yet to make
 the build path arbitrary.
- 
9:57 - 10:03We designed a way to record the build
 environment
- 
10:03 - 10:08because to rebuild, you need to recreate
 the build environment.
- 
10:08 - 10:12We set up this Jenkins setup.
- 
10:12 - 10:17We wrote diffoscope which used to be
 called debbindiff
- 
10:17 - 10:21which shows differences between two
 packages or two directories or
- 
10:21 - 10:24two filesystems by now.
- 
10:24 - 10:31There's SOURCE_DATE_EPOCH, which is a way
 that the tools expose
- 
10:31 - 10:34the last modification of the source.
- 
10:34 - 10:37Because the build date, people want to
 include the build date
- 
10:37 - 10:39because they think this is a meaningful
 indication:
- 
10:39 - 10:42when a build was done,
 which software used.
- 
Not Syncedbut if the build always recreates
 the same results
- 
Not Syncedthe build date becomes meaningless
- 
Not Syncedand the really interesting thing is
 the latest modification of the source
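
A minimal sketch of how a build tool might honour this, assuming SOURCE_DATE_EPOCH is exported as an environment variable holding seconds since the Unix epoch. The helper names `build_timestamp` and `build_date_string` are hypothetical, not part of any Debian tool.

```python
import os
import time
from typing import Mapping


def build_timestamp(environ: Mapping[str, str] = os.environ) -> int:
    """Use SOURCE_DATE_EPOCH when set, otherwise fall back to the clock."""
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)  # last modification of the source
    return int(time.time())  # non-reproducible fallback


def build_date_string(environ: Mapping[str, str] = os.environ) -> str:
    """Format the timestamp in UTC so the local timezone cannot leak in."""
    return time.strftime("%Y-%m-%d", time.gmtime(build_timestamp(environ)))


# Two builds at different times embed the same date if the variable is set.
print(build_date_string({"SOURCE_DATE_EPOCH": "1440000000"}))
```

Formatting in UTC matters as much as the fixed timestamp: a date rendered in the builder's local timezone would vary between machines again.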
              
