-
Not Synced
Next talk
-
Not Synced
Chris and Holger are going to talk to us again
-
Not Synced
about reproducible builds and tell us
where they're up to.
-
Not Synced
Thanks very much
-
Not Synced
The outline of this talk is from last year
we realised there were a lot of questions.
-
Not Synced
The rough plan is to quickly go over
what reproducible builds are
-
Not Synced
I guess everyone is up to speed
-
Not Synced
but getting everyone on the same page
would be a good idea.
-
Not Synced
Then Holger's going to jump in
and give the status update
-
Not Synced
and then we're going to talk about
future work, questions etc
-
Not Synced
What is the actual problem we're
solving here?
-
Not Synced
You can always inspect the source code of
free software for malicious flaws
-
Not Synced
or just flaws as well.
-
Not Synced
Unfortunately distributions provide
precompiled binaries to end users.
-
Not Synced
So can you actually trust this
compilation process has not
-
Not Synced
introduced flaws of its own?
-
Not Synced
The problem is it seems very effective if
you want to go after end users
-
Not Synced
you can go after developers.
Because if you go infect a developer's
-
Not Synced
machine you will then infect all the
users of the software they generate.
-
Not Synced
Financial incentives. There always were
but they are even more so these days
-
Not Synced
with mobile phones etc.
-
Not Synced
You can also have very subtle flaws.
This one in particular there was a
-
Not Synced
root exploit in OpenSSH just by changing
a compare equal.
-
Not Synced
That sort of assembler jump thing and it
gives you root
-
Not Synced
but with only a single bit difference in
the binary.
-
Not Synced
Which is not too shabby.
-
Not Synced
Then you have all sorts of cute demos
where you load up the source code in VIM
-
Not Synced
and it just looks like 'Hello world' but
when you compile it with GCC
-
Not Synced
your kernel rootkit just goes 'oh I'm
going to give you a different file'
-
Not Synced
and self-replicates, and things like that.
-
Not Synced
Difficult to trust the process.
-
Not Synced
And there's some recent history as well
around XcodeGhost and iOS
-
Not Synced
and adverts and things like that.
-
Not Synced
You can Google those things.
Really scary stuff.
-
Not Synced
The last example is actually coming from
a CIA design paper from 2012.
-
Not Synced
Which was then found in the wild in 2014.
So these exploits are actually happening.
-
Not Synced
People are targeting developers to get
users.
-
Not Synced
XcodeGhost had 20 million user
installations.
-
Not Synced
It was probably not the CIA or NSA but
we don't know who it was.
-
Not Synced
There are many people who do these
exploits in the wild.
-
Not Synced
Yeah it's not just 'Here's this cute
thing we can talk about'.
-
Not Synced
It's actually happening.
-
Not Synced
The motivation is to ensure no flaws are
introduced during the build process.
-
Not Synced
We do this by ensuring the build always
produces identical results.
-
Not Synced
Then multiple parties do the same thing.
-
Not Synced
I build it, you build it, your friends
build it etc
-
Not Synced
And an attacker would need to infect
everyone simultaneously
-
Not Synced
otherwise they'd be detected.
For example if my machine was compromised
-
Not Synced
I would suddenly come up with a
different result.
-
Not Synced
I would come up with different binaries.
-
Not Synced
And you'd be 'what's going on here' and
eventually we would discover
-
Not Synced
that my machine was rootkitted etc.
-
Not Synced
You probably know it, but 'identical'
means bit-by-bit identical.
-
Not Synced
As that is really the same.
-
Not Synced
Yeah, bit, SHA, MD5 whatever you want.
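
As a rough sketch of the "I build it, you build it" idea above: each party hashes the artifact it built, and any builder whose digest differs from the rest stands out. The builder names, the byte strings, and the majority-vote rule are illustrative assumptions, not something described in the talk.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def digest(artifact: bytes) -> str:
    """SHA-256 of the built artifact; identical bits give identical digests."""
    return hashlib.sha256(artifact).hexdigest()


def find_outliers(builds: dict[str, bytes]) -> list[str]:
    """Return the builders whose artifact differs from the most common digest.

    Majority voting is an assumption made for this sketch.
    """
    digests = {name: digest(data) for name, data in builds.items()}
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority)


# Hypothetical builders and artifacts; one build differs in its last byte.
builds = {
    "alice": b"\x7fELF...same bytes",
    "bob": b"\x7fELF...same bytes",
    "mallory": b"\x7fELF...same bytez",
}
print(find_outliers(builds))
```

A differing digest does not say which machine is compromised, only that the builds disagree and someone should investigate.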
-
Not Synced
There are a bunch of challenges here.
The biggest one is timestamps.
-
Not Synced
A lot of software just loves to include
timestamps everywhere.
-
Not Synced
Documentation, the __DATE__
and __TIME__ macros
-
Not Synced
Just all over the place, in file names etc
Things like that.
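
A minimal illustration of the timestamp problem, using Python's gzip module as a stand-in for a build step (an assumption for this sketch; the talk does not name a specific tool here). The gzip header carries a modification time, so the same input packed at two different moments gives different bytes unless the time is pinned.

```python
import gzip

payload = b"hello world\n"


def pack_with_timestamp(data: bytes, build_time: int) -> bytes:
    """Hypothetical build step that stamps the build time into its output."""
    return gzip.compress(data, mtime=build_time)


def pack_reproducibly(data: bytes) -> bytes:
    """Same step with the embedded time pinned to a fixed value."""
    return gzip.compress(data, mtime=0)


# Two builds one minute apart: same input, different output bytes.
first = pack_with_timestamp(payload, 1_500_000_000)
second = pack_with_timestamp(payload, 1_500_000_060)
print(first == second)

# With the timestamp pinned, the output is the same on every run.
print(pack_reproducibly(payload) == pack_reproducibly(payload))
```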
-
Not Synced
Builds often vary by locale and timezone.
-
Not Synced
Different new lines, different sorting
orders for example collations.
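
One way to keep the builder's timezone out of the output, sketched under the assumption that the build formats a date somewhere: always interpret the timestamp in UTC and use only numeric format fields, which do not depend on the locale. The function name is hypothetical.

```python
from datetime import datetime, timezone


def format_build_date(epoch_seconds: int) -> str:
    """Format a timestamp in UTC, never in the builder's local timezone."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    # Numeric fields only: no month or weekday names that vary by locale.
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(format_build_date(0))
```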
-
Not Synced
Different versions of libraries. I'm not
sure what this refers to exactly.
-
Not Synced
Moving on.
-
Not Synced
Non-deterministic file ordering: for
example, shell globs are not really defined
-
Not Synced
to be, I say not really defined they
aren't defined to come out in normal order.
-
Not Synced
Also the readdir call, it doesn't actually
promise any particular ordering.
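
A small sketch of the usual fix for directory ordering: never rely on the order the filesystem hands back, and sort explicitly. The helper name and the file names are made up for illustration.

```python
from __future__ import annotations

import os
import tempfile


def list_sources(directory: str) -> list[str]:
    """List directory entries in a stable order.

    os.listdir returns entries in arbitrary order, so sort them explicitly.
    Python compares strings by code point, independent of the locale.
    """
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("c.c", "a.c", "b.c"):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write("/* empty */\n")
    sources = list_sources(tmp)

print(sources)
```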
-
Not Synced
Dictionary/hash key ordering. So this is
in things like Perl and Python
-
Not Synced
you use a dictionary or a hash. If you iterate over its keys, the order is non-deterministic.
-
Not Synced
If your build system loops over such a
hash or a dictionary
-
Not Synced
then the results from this build could be
non-reproducible and non-deterministic.
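
To make the hash-ordering point concrete in Python: iterating over a set of strings can give a different order from one interpreter run to the next because string hashing is randomised, whereas Python dicts have kept insertion order since 3.7. A build step that writes such a collection out should sort it first. The function and feature names below are hypothetical.

```python
from __future__ import annotations


def render_manifest(features: set[str]) -> str:
    """Write one feature per line in sorted order.

    Iterating over the set directly could change the line order between runs.
    """
    return "\n".join(sorted(features)) + "\n"


manifest = render_manifest({"zlib", "ssl", "ipv6"})
print(manifest, end="")
```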
-
Not Synced
And also things like files that are part of
the build process will just absorb
-
Not Synced
stuff from the surrounding environment
like umask and all that kind of
-
Not Synced
stuff that lives outside there.
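
A sketch of stripping absorbed environment out of an archive, assuming the build produces a tar file from a flat directory. File modes (which follow the builder's umask), owners, and modification times are all reset to fixed values. The normalisation policy (0644 or 0755, uid 0, mtime 0) is an assumption for illustration.

```python
import io
import os
import tarfile


def normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Replace metadata inherited from the build machine with fixed values."""
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    info.mtime = 0
    # Keep only "executable or not"; drop whatever the umask left behind.
    info.mode = 0o755 if info.mode & 0o100 else 0o644
    return info


def pack(directory: str) -> bytes:
    """Pack a flat directory into an uncompressed tar with stable metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            tar.add(path, arcname=name, filter=normalize)
    return buffer.getvalue()
```

Two directories with the same file contents but different permissions and timestamps should then pack to identical bytes.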
-
Not Synced
Build paths is a very interesting one which
we cover in greater detail on another slide.
-
Not Synced
Also specifying the environment, we'll also
cover this one in the build info slides.
-
Not Synced
So not only are there privacy and security
advantages of using,
-
Not Synced
moving towards reproducible builds there
are also technical advantages.
-
Not Synced
It's faster to build if you basically
keep hitting cache.
-
Not Synced
I'm pretty certain this is why Google are
interested in it.
-
Not Synced
Because of the amount of
compilation they do
-
Not Synced
they're just going to save a whole bucket
load of money just by
-
Not Synced
'Oh we don't need to rebuild this because
it's the same SHA' etc
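
The caching argument can be sketched as a content-addressed build cache: hash the input, and only compile when that hash has not been seen before. This is only safe if the build is deterministic. `compile_stub` is a hypothetical stand-in for a real compiler, and the class is an illustration, not how any particular build system works.

```python
import hashlib


def compile_stub(source: bytes) -> bytes:
    """Hypothetical stand-in for a real, deterministic compiler."""
    return source.upper()


class BuildCache:
    """Skip rebuilding when the input hashes to something already built."""

    def __init__(self) -> None:
        self._store = {}
        self.compilations = 0

    def build(self, source: bytes) -> bytes:
        key = hashlib.sha256(source).hexdigest()
        if key not in self._store:
            # Cache miss: this is the only place real work happens.
            self.compilations += 1
            self._store[key] = compile_stub(source)
        return self._store[key]


cache = BuildCache()
cache.build(b"int main() {}")
cache.build(b"int main() {}")   # same SHA, served from cache
cache.build(b"int other() {}")  # different input, compiled
print(cache.compilations)
```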
-
Not Synced
It's very nice to test revisions and
changes. I use all our tools
-
Not Synced
when doing QA uploads or NMUs you
rebuild a package
-
Not Synced
and then you compare to the previous one.
And as the only things that have changed
-
Not Synced
should be the things that you've changed,
there haven't been all sorts
-
Not Synced
of random other nonsense being
reordered, with timestamps added.
-
Not Synced
You can get rid of all that noise and
just be 'oh yeah brilliant I can see that
-
Not Synced
the patch I've applied here has actually
changed the behaviour of the program'
-
Not Synced
and only that. It hasn't done all sorts
of weird, weird stuff.
-
Not Synced
So you have safer uploads in that sense.
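
A toy version of that comparison, assuming each build is represented as a mapping from file path to file contents (a simplification for this sketch; real packages need a much more thorough comparison). With a reproducible build, the list of differing paths should contain only what the patch touched.

```python
from __future__ import annotations


def changed_files(old: dict[str, bytes], new: dict[str, bytes]) -> list[str]:
    """Return paths that were added, removed, or whose contents differ."""
    paths = sorted(set(old) | set(new))
    return [path for path in paths if old.get(path) != new.get(path)]


# Hypothetical package contents before and after applying a patch.
previous = {"usr/bin/tool": b"v1", "usr/share/doc/README": b"docs"}
rebuilt = {"usr/bin/tool": b"v1-patched", "usr/share/doc/README": b"docs"}
print(changed_files(previous, rebuilt))
```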
-
Not Synced
Speaking of safety a reproducible build
won't go talking to the Internet
-
Not Synced
like a lot of modern package managers
like to do. Maven-style ones.
-
Not Synced
Also a reproducible build will typically
not have any
-
Not Synced
non-deterministic failure modes.
-
Not Synced
So there's a lot of tests and test suites
in Debian that will
-
Not Synced
try and test things like 'Oh is this
algorithm N squared or bigger than N'.
-
Not Synced
And it will try doing that by running some sort of benchmark
-
Not Synced
and fail if it doesn't meet some sort of
arbitrary time difference and
-
Not Synced
that's obviously not reliable. So we
get rid of all those nonsense things.
-
Not Synced
It also finds bugs in really weird
locales. We build in French,
-
Not Synced
Swiss-French, and it just comes up with
all sorts of nonsense.
-
Not Synced
Or timezones, if you build in UTC-12 then
this date library doesn't work anymore
-
Not Synced
and it's like 'you had one job
to be a date library'.
-
Not Synced
[audience laughter]
-
Not Synced
It's pretty scary and some pretty
cute bugs.
-
Not Synced
It also detects if your machine is, you
just have a broken ??? [8:28].
-
Not Synced
We build a year and a month in the
future. You find things like the
-
Not Synced
maintainer has added a pre-generated SSL
certificate to their tests and
-
Not Synced
it expires in the year. And so it breaks.
-
Not Synced
We're preemptively detecting that fail to
??? [9:01] source.