9:59:59.000,9:59:59.000 Next talk 9:59:59.000,9:59:59.000 Chris and Holger are going to talk to us again 9:59:59.000,9:59:59.000 about reproducible builds and tell us [br]where they're up to. 9:59:59.000,9:59:59.000 Thanks very much. 9:59:59.000,9:59:59.000 The outline of this talk is from last year [br]we realised there were a lot of questions. 9:59:59.000,9:59:59.000 The rough plan is to quickly go over [br]what reproducible builds are. 9:59:59.000,9:59:59.000 I guess everyone is up to speed 9:59:59.000,9:59:59.000 but getting everyone on the same page [br]would be a good idea. 9:59:59.000,9:59:59.000 Then Holger's going to jump in[br]and give the status update 9:59:59.000,9:59:59.000 and then we're going to talk about [br]future work, questions etc. 9:59:59.000,9:59:59.000 What is the actual problem we're [br]solving here? 9:59:59.000,9:59:59.000 You can always inspect the source code of [br]free software for malicious flaws 9:59:59.000,9:59:59.000 or just flaws as well. 9:59:59.000,9:59:59.000 Unfortunately distributions provide [br]precompiled binaries to end users. 9:59:59.000,9:59:59.000 So can you actually trust this [br]compilation process has not 9:59:59.000,9:59:59.000 introduced flaws of its own? 9:59:59.000,9:59:59.000 The problem is it seems very effective: if [br]you want to go after end users 9:59:59.000,9:59:59.000 you can go after developers. [br]Because if you go infect a developer's 9:59:59.000,9:59:59.000 machine you will then infect all the [br]users of the software they generate. 9:59:59.000,9:59:59.000 Financial incentives. There always were [br]but they are even more so these days 9:59:59.000,9:59:59.000 with mobile phones etc. 9:59:59.000,9:59:59.000 You can also have very subtle flaws. [br]This one in particular, there was a 9:59:59.000,9:59:59.000 root exploit in OpenSSH just by changing [br]a compare equal. 
9:59:59.000,9:59:59.000 That sort of assembler jump thing and it [br]gives you root 9:59:59.000,9:59:59.000 but with only a single bit difference in [br]the binary. 9:59:59.000,9:59:59.000 Which is not too shabby. 9:59:59.000,9:59:59.000 Then you have all sorts of cute demos [br]where you load up the source code in Vim 9:59:59.000,9:59:59.000 and it just looks like 'Hello world' but [br]when you compile it with GCC 9:59:59.000,9:59:59.000 your kernel rootkit just goes 'oh I'm [br]going to give you a different file' 9:59:59.000,9:59:59.000 and self-replicates and things like that. 9:59:59.000,9:59:59.000 Difficult to trust the process. 9:59:59.000,9:59:59.000 And there's some recent history as well [br]around XcodeGhost and iOS 9:59:59.000,9:59:59.000 and adverts and things like that. 9:59:59.000,9:59:59.000 You can Google those things. [br]Really scary stuff. 9:59:59.000,9:59:59.000 The last example is actually coming from [br]a CIA design paper from 2012. 9:59:59.000,9:59:59.000 Which was then found in the wild in 2014. [br]So these exploits are actually happening. 9:59:59.000,9:59:59.000 People are targeting developers to get [br]users. 9:59:59.000,9:59:59.000 XcodeGhost had 20 million user [br]installations. 9:59:59.000,9:59:59.000 It was probably not the CIA or NSA but [br]we don't know who it was. 9:59:59.000,9:59:59.000 There are many people who do these [br]exploits in the wild. 9:59:59.000,9:59:59.000 Yeah it's not just 'Here's this cute [br]thing we can talk about'. 9:59:59.000,9:59:59.000 It's actually happening. 9:59:59.000,9:59:59.000 The motivation is to ensure no flaws are [br]introduced during the build process. 9:59:59.000,9:59:59.000 We do this by ensuring the build always [br]produces identical results. 9:59:59.000,9:59:59.000 Then multiple parties do the same thing. 
9:59:59.000,9:59:59.000 I build it, you build it, your friends [br]build it etc. 9:59:59.000,9:59:59.000 An attacker would need to infect [br]everyone simultaneously 9:59:59.000,9:59:59.000 otherwise they'd be detected.[br]For example if my machine was compromised 9:59:59.000,9:59:59.000 I would suddenly come up with a [br]different result. 9:59:59.000,9:59:59.000 I would come up with different binaries. 9:59:59.000,9:59:59.000 And you'd be 'what's going on here' and [br]eventually we would discover 9:59:59.000,9:59:59.000 that my machine was rootkitted etc. 9:59:59.000,9:59:59.000 You probably know it but identically [br]means bit by bit identical. 9:59:59.000,9:59:59.000 As that is really the same. 9:59:59.000,9:59:59.000 Yeah, bit, SHA, MD5 whatever you want. 9:59:59.000,9:59:59.000 There are a bunch of challenges here. [br]The biggest one is timestamps. 9:59:59.000,9:59:59.000 A lot of software just loves to include [br]timestamps everywhere. 9:59:59.000,9:59:59.000 Documentation, __DATE__[br]and __TIME__ macros. 9:59:59.000,9:59:59.000 Just all over the place, in file names etc.[br]Things like that. 9:59:59.000,9:59:59.000 Builds often vary by locale and timezone. 9:59:59.000,9:59:59.000 Different new lines, different sorting [br]orders, for example collations. 9:59:59.000,9:59:59.000 Different versions of libraries. I'm not [br]sure what this refers to exactly. 9:59:59.000,9:59:59.000 Moving on. 9:59:59.000,9:59:59.000 Non-deterministic file ordering, for [br]example shell globs are not really defined 9:59:59.000,9:59:59.000 to be, I say not really defined, they [br]aren't defined to come out in normal order. 9:59:59.000,9:59:59.000 Also the readdir syscall, it doesn't actually [br]promise any particular ordering. 9:59:59.000,9:59:59.000 Dictionary/hash key ordering. So this is [br]in things like Perl and Python 9:59:59.000,9:59:59.000 you use a dictionary or a hash. If you iterate over the keys of that, it's a non-deterministic order. 
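[Editor's sketch, not part of the talk] The ordering problems just described can be illustrated roughly as follows. This is a minimal sketch under stated assumptions: the helper names `list_inputs` and `render_manifest` and the key=value manifest format are hypothetical, chosen only for illustration. The idea is to sort directory entries and mapping keys explicitly instead of relying on whatever order the filesystem or hash table happens to return.

```python
import os
import tempfile


def list_inputs(directory):
    """Return the file names in a directory in a stable order.

    os.listdir() returns entries in arbitrary order, so sort explicitly.
    """
    return sorted(os.listdir(directory))


def render_manifest(metadata):
    """Render key=value lines with the keys in sorted order.

    Hypothetical manifest format. Sorting means the output does not
    depend on the hash or insertion order of the mapping.
    """
    return "".join(f"{key}={metadata[key]}\n" for key in sorted(metadata))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create the files in a deliberately unsorted order.
        for name in ("b.c", "a.c", "c.c"):
            open(os.path.join(tmp, name), "w").close()
        print(list_inputs(tmp))
    print(render_manifest({"version": "1.0", "name": "demo"}), end="")
```

Python's `sorted` compares strings by code point, so the result should not vary with the locale of the build machine, which sidesteps the collation differences mentioned earlier.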
9:59:59.000,9:59:59.000 If your build system loops over such a [br]hash or a dictionary 9:59:59.000,9:59:59.000 then the results from this build could be [br]non-reproducible and non-deterministic. 9:59:59.000,9:59:59.000 And also things like files in part of [br]the build process will just absorb 9:59:59.000,9:59:59.000 stuff from the surrounding environment [br]like umask and all that kind of 9:59:59.000,9:59:59.000 stuff that lives outside there. 9:59:59.000,9:59:59.000 Build paths is a very interesting one which [br]we cover in greater detail on another slide. 9:59:59.000,9:59:59.000 Also specifying the environment, we'll also[br]cover this one in the build info slides. 9:59:59.000,9:59:59.000 So not only are there privacy and security[br]advantages of using, 9:59:59.000,9:59:59.000 moving towards reproducible builds, there[br]are also technical advantages. 9:59:59.000,9:59:59.000 It's faster to build if you basically [br]keep hitting cache. 9:59:59.000,9:59:59.000 I'm pretty certain this is why Google are [br]interested in it. 9:59:59.000,9:59:59.000 Because of the amount of [br]compilation they do 9:59:59.000,9:59:59.000 they're just going to save a whole bucket [br]load of money just by 9:59:59.000,9:59:59.000 'Oh we don't need to rebuild this because [br]it's the same SHA' etc. 9:59:59.000,9:59:59.000 It's very nice to test revisions and [br]changes. I use all our tools 9:59:59.000,9:59:59.000 when doing QA uploads or NMUs: you [br]rebuild a package 9:59:59.000,9:59:59.000 and then you compare to the previous one. [br]And as the only things that have changed 9:59:59.000,9:59:59.000 should be the things that you've changed, [br]there haven't been all sorts 9:59:59.000,9:59:59.000 of random other nonsense being [br]reordered with timestamps added. 
9:59:59.000,9:59:59.000 You can get rid of all that noise and [br]just be 'oh yeah brilliant I can see that 9:59:59.000,9:59:59.000 the patch I've applied here has actually [br]changed the behaviour of the program' 9:59:59.000,9:59:59.000 and only that. It hasn't done all sorts [br]of weird weird stuff. 9:59:59.000,9:59:59.000 So you have safer uploads in that sense. 9:59:59.000,9:59:59.000 Speaking of safety, a reproducible build [br]won't go talking to the Internet 9:59:59.000,9:59:59.000 like a lot of modern package managers [br]like to do. Maven-style ones. 9:59:59.000,9:59:59.000 Also a reproducible build will typically [br]not have any 9:59:59.000,9:59:59.000 non-deterministic failure modes. 9:59:59.000,9:59:59.000 So there's a lot of tests and test suites [br]in Debian that will 9:59:59.000,9:59:59.000 try and test things like 'Oh is this [br]algorithm N squared or bigger than N'. 9:59:59.000,9:59:59.000 And it will try doing that by running some sort of benchmark 9:59:59.000,9:59:59.000 and fail if it doesn't meet some sort of [br]arbitrary time difference and 9:59:59.000,9:59:59.000 that's obviously not reliable. So we [br]get rid of all those nonsense things. 9:59:59.000,9:59:59.000 It also finds bugs in really weird [br]locales. We build in French, 9:59:59.000,9:59:59.000 Swiss-French, and it just comes up with [br]all sorts of nonsense. 9:59:59.000,9:59:59.000 Or timezones, if you build in UTC-12 then [br]this date library doesn't work anymore 9:59:59.000,9:59:59.000 and it's like 'you had one job, [br]to be a date library'. 9:59:59.000,9:59:59.000 [audience laughter] 9:59:59.000,9:59:59.000 It's pretty scary and some pretty [br]cute bugs. 9:59:59.000,9:59:59.000 It also detects if your machine is, you [br]just have a broken ??? [8:28]. 9:59:59.000,9:59:59.000 We build a year and a month in the [br]future. 
You find things like the 9:59:59.000,9:59:59.000 maintainer has added a pre-generated SSL[br]certificate to their tests and 9:59:59.000,9:59:59.000 it expires in a year. And so it breaks. 9:59:59.000,9:59:59.000 We're preemptively detecting that fail to [br]??? [9:01] source.
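[Editor's sketch, not part of the talk] The core check behind everything above, that two independent builds are bit by bit identical, can be sketched as comparing cryptographic hashes of the two artifacts. This is only an illustration: the function names `sha256_of` and `builds_match` are hypothetical, SHA-256 is just one reasonable choice of hash, and it is not the tooling the speakers use.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(path_a, path_b):
    """True if the two build artifacts have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If two parties build the same source and `builds_match` is false for their artifacts, either the build is not yet reproducible or one of the build environments differs in a way worth investigating.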