1 99:59:59,999 --> 99:59:59,999 Next talk 2 99:59:59,999 --> 99:59:59,999 Chris and Holger are going to talk to us again 3 99:59:59,999 --> 99:59:59,999 about reproducible builds and tell us where they're up to. 4 99:59:59,999 --> 99:59:59,999 Thanks very much. 5 99:59:59,999 --> 99:59:59,999 The outline of this talk comes from last year, when we realised there were a lot of questions. 6 99:59:59,999 --> 99:59:59,999 The rough plan is to quickly go over what reproducible builds are. 7 99:59:59,999 --> 99:59:59,999 I guess everyone is up to speed, 8 99:59:59,999 --> 99:59:59,999 but getting everyone on the same page would be a good idea. 9 99:59:59,999 --> 99:59:59,999 Then Holger's going to jump in and give the status update 10 99:59:59,999 --> 99:59:59,999 and then we're going to talk about future work, questions etc. 11 99:59:59,999 --> 99:59:59,999 What is the actual problem we're solving here? 12 99:59:59,999 --> 99:59:59,999 You can always inspect the source code of free software for malicious flaws 13 99:59:59,999 --> 99:59:59,999 or just flaws as well. 14 99:59:59,999 --> 99:59:59,999 Unfortunately distributions provide precompiled binaries to end users. 15 99:59:59,999 --> 99:59:59,999 So can you actually trust this compilation process has not 16 99:59:59,999 --> 99:59:59,999 introduced flaws of its own? 17 99:59:59,999 --> 99:59:59,999 The problem is that it seems very effective: if you want to go after end users, 18 99:59:59,999 --> 99:59:59,999 you can go after developers. Because if you infect a developer's 19 99:59:59,999 --> 99:59:59,999 machine you will then infect all the users of the software they generate. 20 99:59:59,999 --> 99:59:59,999 There are financial incentives. There always were, but even more so these days 21 99:59:59,999 --> 99:59:59,999 with mobile phones etc. 22 99:59:59,999 --> 99:59:59,999 You can also have very subtle flaws. In this one in particular, there was a 23 99:59:59,999 --> 99:59:59,999 root exploit in OpenSSH just by changing a compare equal.
24 99:59:59,999 --> 99:59:59,999 That sort of assembler jump thing and it gives you root 25 99:59:59,999 --> 99:59:59,999 but with only a single bit difference in the binary. 26 99:59:59,999 --> 99:59:59,999 Which is not too shabby. 27 99:59:59,999 --> 99:59:59,999 Then you have all sorts of cute demos where you load up the source code in Vim 28 99:59:59,999 --> 99:59:59,999 and it just looks like 'Hello world' but when you compile it with GCC 29 99:59:59,999 --> 99:59:59,999 your kernel rootkit just goes 'oh I'm going to give you a different file' 30 99:59:59,999 --> 99:59:59,999 and self-replicates, and things like that. 31 99:59:59,999 --> 99:59:59,999 It's difficult to trust the process. 32 99:59:59,999 --> 99:59:59,999 And there's some recent history as well around XcodeGhost and iOS 33 99:59:59,999 --> 99:59:59,999 and adverts and things like that. 34 99:59:59,999 --> 99:59:59,999 You can Google those things. Really scary stuff. 35 99:59:59,999 --> 99:59:59,999 The last example is actually coming from a CIA design paper from 2012. 36 99:59:59,999 --> 99:59:59,999 Which was then found in the wild in 2014. So these exploits are actually happening. 37 99:59:59,999 --> 99:59:59,999 People are targeting developers to get users. 38 99:59:59,999 --> 99:59:59,999 XcodeGhost had 20 million user installations. 39 99:59:59,999 --> 99:59:59,999 It was probably not the CIA or NSA but we don't know who it was. 40 99:59:59,999 --> 99:59:59,999 There are many people who do these exploits in the wild. 41 99:59:59,999 --> 99:59:59,999 Yeah, it's not just 'Here's this cute thing we can talk about'. 42 99:59:59,999 --> 99:59:59,999 It's actually happening. 43 99:59:59,999 --> 99:59:59,999 The motivation is to ensure no flaws are introduced during the build process. 44 99:59:59,999 --> 99:59:59,999 We do this by ensuring the build always produces identical results. 45 99:59:59,999 --> 99:59:59,999 Then multiple parties do the same thing.
46 99:59:59,999 --> 99:59:59,999 I build it, you build it, your friends build it etc. 47 99:59:59,999 --> 99:59:59,999 An attacker would need to infect everyone simultaneously 48 99:59:59,999 --> 99:59:59,999 otherwise they'd be detected. For example, if my machine was compromised 49 99:59:59,999 --> 99:59:59,999 I would suddenly come up with a different result. 50 99:59:59,999 --> 99:59:59,999 I would come up with different binaries. 51 99:59:59,999 --> 99:59:59,999 And you'd be 'what's going on here' and eventually we would discover 52 99:59:59,999 --> 99:59:59,999 that my machine was rootkitted etc. 53 99:59:59,999 --> 99:59:59,999 You probably know it, but 'identical' means bit-for-bit identical. 54 99:59:59,999 --> 99:59:59,999 As that is really the same. 55 99:59:59,999 --> 99:59:59,999 Yeah, bit, SHA, MD5, whatever you want. 56 99:59:59,999 --> 99:59:59,999 There are a bunch of challenges here. The biggest one is timestamps. 57 99:59:59,999 --> 99:59:59,999 A lot of software just loves to include timestamps everywhere. 58 99:59:59,999 --> 99:59:59,999 Documentation, the __DATE__ and __TIME__ macros. 59 99:59:59,999 --> 99:59:59,999 Just all over the place, in file names etc. Things like that. 60 99:59:59,999 --> 99:59:59,999 Builds often vary by locale and timezone. 61 99:59:59,999 --> 99:59:59,999 Different newlines, different sorting orders, for example collations. 62 99:59:59,999 --> 99:59:59,999 Different versions of libraries. I'm not sure what this refers to exactly. 63 99:59:59,999 --> 99:59:59,999 Moving on. 64 99:59:59,999 --> 99:59:59,999 Non-deterministic file ordering: for example, shell globs are not really defined 65 99:59:59,999 --> 99:59:59,999 to be, I say not really defined, they aren't defined to come out in normal order. 66 99:59:59,999 --> 99:59:59,999 Also the readdir syscall, it doesn't actually promise any particular ordering. 67 99:59:59,999 --> 99:59:59,999 Dictionary/hash key ordering.
So this is in things like Perl and Python 68 99:59:59,999 --> 99:59:59,999 where you use a dictionary or a hash. If you iterate over the keys of that, it's in a non-deterministic order. 69 99:59:59,999 --> 99:59:59,999 If your build system loops over such a hash or a dictionary 70 99:59:59,999 --> 99:59:59,999 then the results from this build could be non-reproducible and non-deterministic. 71 99:59:59,999 --> 99:59:59,999 And also things like files that are part of the build process will just absorb 72 99:59:59,999 --> 99:59:59,999 stuff from the surrounding environment like umask and all that kind of 73 99:59:59,999 --> 99:59:59,999 stuff that lives outside there. 74 99:59:59,999 --> 99:59:59,999 Build paths is a very interesting one which we cover in greater detail on another slide. 75 99:59:59,999 --> 99:59:59,999 Also specifying the environment, we'll also cover this one in the build info slides. 76 99:59:59,999 --> 99:59:59,999 So not only are there privacy and security advantages of using, 77 99:59:59,999 --> 99:59:59,999 moving towards reproducible builds, there are also technical advantages. 78 99:59:59,999 --> 99:59:59,999 It's faster to build if you basically keep hitting the cache. 79 99:59:59,999 --> 99:59:59,999 I'm pretty certain this is why Google are interested in it. 80 99:59:59,999 --> 99:59:59,999 Because of the amount of compilation they do 81 99:59:59,999 --> 99:59:59,999 they're just going to save a whole bucketload of money just by 82 99:59:59,999 --> 99:59:59,999 'Oh we don't need to rebuild this because it's the same SHA' etc. 83 99:59:59,999 --> 99:59:59,999 It's very nice for testing revisions and changes. I use all our tools 84 99:59:59,999 --> 99:59:59,999 when doing QA uploads or NMUs: you rebuild a package 85 99:59:59,999 --> 99:59:59,999 and then you compare to the previous one.
And as the only things that have changed 86 99:59:59,999 --> 99:59:59,999 should be the things that you've changed, there haven't been all sorts 87 99:59:59,999 --> 99:59:59,999 of random other nonsense being reordered with timestamps added. 88 99:59:59,999 --> 99:59:59,999 You can get rid of all that noise and just be 'oh yeah brilliant I can see that 89 99:59:59,999 --> 99:59:59,999 the patch I've applied here has actually changed the behaviour of the program' 90 99:59:59,999 --> 99:59:59,999 and only that. It hasn't done all sorts of weird, weird stuff. 91 99:59:59,999 --> 99:59:59,999 So you have safer uploads in that sense. 92 99:59:59,999 --> 99:59:59,999 Speaking of safety, a reproducible build won't go talking to the Internet 93 99:59:59,999 --> 99:59:59,999 like a lot of modern package managers like to do. Maven-style ones. 94 99:59:59,999 --> 99:59:59,999 Also a reproducible build will typically not have any 95 99:59:59,999 --> 99:59:59,999 non-deterministic failure modes. 96 99:59:59,999 --> 99:59:59,999 So there's a lot of tests and test suites in Debian that will 97 99:59:59,999 --> 99:59:59,999 try and test things like 'Oh is this algorithm N squared or bigger than N'. 98 99:59:59,999 --> 99:59:59,999 And it will try doing that by running some sort of benchmark 99 99:59:59,999 --> 99:59:59,999 and fail if it doesn't meet some sort of arbitrary time difference and 100 99:59:59,999 --> 99:59:59,999 that's obviously not reliable. So we get rid of all those nonsense things. 101 99:59:59,999 --> 99:59:59,999 It also finds bugs in really weird locales. We build in French, 102 99:59:59,999 --> 99:59:59,999 Swiss-French, and it just comes up with all sorts of nonsense. 103 99:59:59,999 --> 99:59:59,999 Or timezones: if you build in UTC-12 then this date library doesn't work anymore 104 99:59:59,999 --> 99:59:59,999 and it's like 'you had one job: to be a date library'.
105 99:59:59,999 --> 99:59:59,999 [audience laughter] 106 99:59:59,999 --> 99:59:59,999 It's pretty scary, and there are some pretty cute bugs. 107 99:59:59,999 --> 99:59:59,999 It also detects if your machine is, you just have a broken ??? [8:28]. 108 99:59:59,999 --> 99:59:59,999 We build a year and a month in the future. You find things like the 109 99:59:59,999 --> 99:59:59,999 maintainer has added a pre-generated SSL certificate to their tests and 110 99:59:59,999 --> 99:59:59,999 it expires within the year. And so it breaks. 111 99:59:59,999 --> 99:59:59,999 We're preemptively detecting that fail to ??? [9:01] source.
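[Editor's illustration, not part of the talk] The challenges listed earlier (timestamps, file ordering, dictionary/hash key ordering) and the "I build it, you build it" comparison can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: `build_manifest` and `artifact_hash` are hypothetical names invented for this example, the "build" is just a text manifest of a flat directory, and the use of the `SOURCE_DATE_EPOCH` environment variable (with a fallback of 0) is an assumption, since that variable is not mentioned in this part of the talk.

```python
import hashlib
import os
import tempfile


def build_manifest(src_dir, metadata):
    """Hypothetical 'build' step: emit a manifest for a flat directory of files."""
    # Timestamps: read the build time from SOURCE_DATE_EPOCH instead of the
    # wall clock. Falling back to 0 is an assumption made for this sketch.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    lines = [f"built-at: {epoch}"]

    # File ordering: os.listdir() returns entries in arbitrary order,
    # so sort them explicitly before using them.
    for name in sorted(os.listdir(src_dir)):
        with open(os.path.join(src_dir, name), "rb") as fh:
            digest = hashlib.sha256(fh.read()).hexdigest()
        lines.append(f"file: {name} {digest}")

    # Dictionary ordering: iterate over sorted keys rather than relying on
    # whatever order the dictionary happens to yield.
    for key in sorted(metadata):
        lines.append(f"meta: {key}={metadata[key]}")

    return ("\n".join(lines) + "\n").encode("utf-8")


def artifact_hash(artifact):
    """Hash that independent builders can publish and compare."""
    return hashlib.sha256(artifact).hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as src:
        for name, content in [("b.txt", b"two"), ("a.txt", b"one")]:
            with open(os.path.join(src, name), "wb") as fh:
                fh.write(content)
        # Two "parties" build from the same source; their metadata dictionaries
        # were created in different key orders.
        mine = build_manifest(src, {"name": "demo", "version": "1.0"})
        yours = build_manifest(src, {"version": "1.0", "name": "demo"})
        print(artifact_hash(mine) == artifact_hash(yours))
```

If either party's output differed by even one bit, the two hashes would differ, which is the detection mechanism described in the talk.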