-
34c3 preroll music
Herald Angel: Today we have our speaker
-
Filippo Valsorda. He is a crypto-engineer
and he is specialized in Go. He is some
-
kind of Go wizard so to say and used to
work at CloudFlare. He actually shows now
-
today in the talk how he made an attack to
exploit a bug in the implementation of the
-
elliptic curve P256 in Go, that's why he's
a Go wizard. The reason is actually due to
-
a misplaced bit on the assembler level,
just due to the curves implementation. The
-
result is that in certain cases you can
retrieve the private key in the elliptic
-
curve Diffie-Hellman encryption scheme.
Please welcome Filippo Valsorda to his
-
talk, give him a round of applause. Thank
you very much.
-
Filippo Valsorda: Thank you.
applause
-
Thanks. I love the term crypto-golfer. Tony
came up with it and I just want business
-
cards with that on it. Anyway, okay, I'm
Filippo. As you heard I worked on Internet
-
infrastructure, I work on Go, I work on
cryptography. But this is a collaboration
-
with Sean Devlin, you might know him as
one of the authors of the Cryptopals or
-
the matasano crypto challenges. A few
months ago earlier this year a bug.. a
-
CloudFlare engineer was scanning CT logs.
CT logs are big logs of TLS certificates
-
and he was checking the ECDSA signatures
on these certificates and one of the
-
signatures was not verifying. Which is
weird, because CT logs check the
-
certificates before adding them to the
log. It turned out that the bug was in the
-
code that CloudFlare was using to verify
the certificates. And more specifically it
-
was in the Go standard library. It was in the
implementation of the NIST P256 curve,
-
which is a popular, very hard to implement,
elliptic curve that is used for example
-
for ECDSA TLS certificates. This curve has
an assembly implementation in the Go
-
standard library, of course, to squeeze
every bit of performance out of it, and
-
it's specifically optimized for x86-64
architecture. So that's where the bug was.
-
There was a carry propagation bug. It was
reported upstream and everyone agreed that
-
this was not obviously exploitable.
But Adam Langley also said that it would
-
be a cool paper though. And well I mean,
how do you pass on that? So Sean and I met
-
in Paris and spend a weekend and then some
eating Thai food and staring at this
-
assembly code to try to understand what is
it doing. And one month later we have a
-
CVE and two Go security releases. We found
a way to go from this single carry bit bug
-
to a full key recovery against protocols
that use elliptic curve Diffie-Hellman
-
within ephemeral static way. If this means
nothing to you, it's okay, I will try to
-
go over it. One of these protocols for
example is JWT. Jozy, or however you want
-
to call it - it has 15 different acronyms.
And it's often implemented in Go, so this
-
was exploitable against real-world
services. So, let's start by looking at
-
the code with the bug. Let's take it from
the beginning. This is the short assembly
-
function that does subtraction. When you
do elliptic curve math, you usually work
-
on a field, you work on math modulo some
prime p. So if it was subtraction you do a
-
minus b modulo p and this is what this
assembly snippet does. It sets a to a
-
minus b. Of course these are numbers way
too big to fit into a single register. So
-
how do you do math, when you can't fit
into a single register? You do multi-
-
precision math. And the thing is, you all
know how to do multi-precision math, you
-
learned that in elementary school. When
you would write numbers in columns and you
-
do math with register size of 10, because
for every digit you would subtract two
-
digits and carry the minus one if it went
negative and then subtract with the carry
-
and then carry the minus one. That's
exactly what this is doing, but it's doing
-
it instead with digits with 64-bit registers,
because this is a 64-bit architecture
-
. So we look at the first
lines: the first lines is nothing else
-
than subtract, subtract, subtract with
carry. And then the carry is finally
-
stored in that mul0 accumulator. But then
what do we do if it goes negative?
-
We said that this is modulo p so we can't
just let it wrap around modulo 2^256,
-
because that's how wide, you know, 4
registers are. But since we're doing
-
arithmetic modulo a number, we can just
add that number and the result is the
-
same, right? Adding p modulo p is a no-op
in the result. So that's what this code
-
does: It does a equal a minus b, it takes
a copy of the result and it adds p. And
-
then in constant time, it uses that final
carry to check, if it went negative or not
-
to decide in constant time, which one to
use. The one with b , so a minus b plus p
-
or a minus B - and that's subtraction.
Straightforward enough. Now the problem
-
with this code is that if you look closely
you can see something that might be weird
-
if you're not familiar with assembly. It
still trips me over. To use a condition
-
like these constant time conditionals
there. Which have to be constant time,
-
because you don't want to leak timings
based on the size of number. You have to
-
first operate on mul0, so that you set the
flags, the zero flag. So normally what you
-
do is either add 0 and 1 to your mul0
to check if to set the flags. But that's
-
not that's not an add, that's an add with
carry. It means that it's adding zero to
-
mul0 and the carry bit from this addition
here, which has nothing to do with it,
-
it's not supposed to be there. Like this
is adding p, but it doesn't matter, if it
-
overflows, because if it does, it wasn't
going to be picked here anyway. So it's
-
adding another bit into the operation that
wasn't supposed to be there. So it's
-
flipping the result, but then also the
conditions here are flipped, so
-
essentially it evens itself out. Except:
when that carry bit happens not to be set,
-
because the number a minus b is small
enough, that if you add P, you don't wrap
-
around. That happens once every 2^32
times, which is why it's so rare that
-
nobody had noticed so far.
So the code went in and nobody noticed for
-
a long time, until CloudFlare first
started scanning CT logs and had this
-
weird one signature that wasn't verifying
and they were lucky enough to actually
-
have it in the logs because you know if it
happens transiently you might just think..
-
I don't know, the connection broke, right?
So this is what we call a carry
-
propagation bug. Carry propagation is the
activity of making sure that you carry the
-
1. And here it's a bit weird, we didn't
forget to carry it, but we carried a bit
-
that we weren't supposed to carry into a
result. Most of the times that goes well,
-
sometimes that breaks. When that breaks:
wrong result! Wrong result, wrong point
-
computation and wrong point computation,
so what? Like how does forgetting to carry
-
the one lead to a full key recovery? This
is not one of those bugs, where like
-
buffer overflow you know what you're
trying to do even if you might have to
-
jump through so many hoops, because there
might be all these protections, you know
-
what your capabilities are, you can
overwrite some memory. Here instead it's
-
not clear what your capability is. So
today I'm going to try to explain that.
-
How we turn this very rare failed
computation into a complete key recovery
-
attack. But first I apologize that I have
to give a elliptic curve cryptography
-
crash course here at CCC. So we've seen
the field is nothing else than operations
-
modulo p, then there are the points: the
points are x and y, the coordinates. They
-
fit an equation, which we don't care
about, they just fit an equation. They are
-
integers, so we can work with them, and we
can use them to build a group. A group is
-
one of the core structures in modern
cryptography. To build a group we need a
-
zero point, a generator point, and an
addition over the group, over the
-
points. So we define an addition,
-
again, we don't care about how addition
works. It's just: you take two points, you
-
add them together, you get a result.
It has all the properties that you
-
remember from elementary school addition:
it's commutative, it's associative. And
-
then you have multiplication: You don't
actually define how to multiply a point,
-
but if I tell you that you have an
addition operation and I want five times
-
this point, what do you do? You take the
point and you add the point and you add
-
the point and you add the point,... so
this is called scalar multiplication:
-
doing multiplication by adding repeatedly
a certain point.
-
So now we have a group, which is given by
multiplying the point, a generator point, a
-
certain number of times and we have a
multiplication operation. So how do we
-
build cryptography out of it? This is all
awfully abstract so far? So we build
-
elliptic curve Diffie-Hellman. If you're
familiar with normal Diffie-Hellman, you
-
will see something snap at some point. The
private key is a giant number, this is
-
important. The private key in ECDH is just
a random giant 256 bit number. And then we
-
have a public key. A public key is that
giant number multiplied, the scalar
-
multiplication we just talked about, by a
generator point. If you know normal
-
Diffie-Hellman, this is like doing g to
the a. If you don't, it's okay, this is
-
just multiplying private key by a point.
And then, when I have my private key and
-
my public key, you send me your public
key. We need to agree on a shared secret,
-
how we do that, is that I take your public
key, which is a point. I take this and I
-
multiply it by my private key, here, so
again: it's my giant number private key
-
multiplied by your point. That comes
together: The two results are the same,
-
because if we do my private key times your
private key times g it's the same as your
-
private key times my private key times g,
because that's commutative. And we land on
-
a shared secret. And that's all we need to
know about elliptic curve cartography to
-
exploit this. However, there's one thing
that I glossed over: It's easy to multiply
-
by five, you add five times.
But if I'm asking you to multiply by a 256
-
bit number, you can't add 2 to the 256
times a point, right? So what do we do
-
there? Remember that here, what we're
trying to do is the multiplication: The
-
private key times the public key, the
point. We do something called double and
-
add: We take our private key and we string
it out like bits. We start from the most
-
significant bit. This is little endian,
because I have gotten this slide wrong the
-
first time. But, you know, you just claim
that you meant it to be the opposite
-
endianness. Anyway, that's the most
significant bit, the one that is two to
-
the 256, no, two to the 255. If you flip
it, you're adding or removing two to the
-
255. So you start with 0, that's zed, 0,
nothing, and you check the first bit, the
-
most important bit: Is that set, yes or
no? Yes, so you add the point Q, which is
-
the point we're trying to multiply by this
giant e. So we add the point and then we
-
move down one by doubling. Can you imagine
how we double something? Remember we only
-
have addition. We add it to itself, of
course. So we use addition to double the
-
point. And you might see, where we're
going with this. We double every time we
-
slide down one bit.
By the time we arrive at the end, how many
-
times did we double that first point that
we added, because the first bit was one?
-
255 times! That bit was worth 2 to the
255, so at the end that will have the
-
value it was supposed to have. And we keep
going, we check if this bit is 1: Is it 1?
-
No. So we do nothing, we double again to
move down one. We check if this bit is
-
one. This bit is 1: so we add the point.
So we did add the point, double, double,
-
add the point, double, moving down one and
we keep going like this. We keep going
-
like this, until we reach the least
significant bit. At the least significant
-
bit, if it's one, we add the point, if
it's not, we don't add the point. And at
-
the end we have the correct result and the
result comes from this sequence of
-
operations which at most are twice 255
operations, which is something that we can
-
do concretely. Now why did I explain this
very specific algorithm to you. Because to
-
understand this attack, you have to
recognize that each key, so each string of
-
bits here, converts into a very specific
sequence of operations. Okay, if you
-
change one bit, there will be an one more
addition one less addition and each key
-
has a very specific sequence. In this case
it's add, double, double, add, double,
-
add, double, double, and it keeps going.
So back to our bug. If you spaced out,
-
because we took a lot of crypto, I saw
you! But the two things you should take
-
away are: There's a giant number, it's the
private key, we want to multiply the giant
-
number by a point and we do that by doing
additions and doubles in an order that is
-
specified by the bits of the giant number.
That's what you need to have clear, the
-
only thing. So let's go back to see how we
use that to turn our very small carry bug
-
into a complete key recovery attack. First
thing we do is we bubble it up. That
-
function that breaks is called P256
subinternal. That's the function I showed
-
you earlier. P256 subinternal is used by P256
point add, which is what we spoke about
-
adding two points, the only important
operation. And adding two points, we've
-
seen, we use when we're multiplying, when
we're doing that scalar multiplication,
-
which is d times q, d times the point. And
how is scalar mult used? Here we finally
-
surfaced to a level that if you work with
cryptographic protocols you might start
-
recognizing pieces of. Scalar
multiplication is the operation that the
-
peer does in elliptic curve Diffie-
Hellman. There's a scalar, which is the
-
secret, which is the private key. There's
a point, which is the public key of the
-
other person, which might be the attacker.
So the scalar multiplication here,
-
speaking in InfoSec terms, has an attacker
supplied point and a secret scalar and the
-
result, the shared secret, is the session
key. For example: TLS, when you open a
-
connection with TLS and you're using
elliptic curve Diffie-Hellman, then you
-
will do this dance to agree on a session
key. If the session key is correct, the
-
connection will open and you will be able
to, I don't know, get send HTTP request.
-
If the bug is hit and the result is wrong,
so the result bubbles up into a wrong
-
shared secret, the session key is wrong.
And the session key is wrong you notice,
-
how do you notice? The connection breaks.
So this is what in cryptography we call an
-
oracle.
You have an oracle that you can call and
-
send a point, because that's your public
key, you're the attacker, and you send
-
that point and it gets multiplied by the
private key. And it gives you one bit of
-
information, did the bug trigger or did it
not? Most of the times it won't, because
-
remember, this is an extremely rare bug.
So you have an oracle that tells you: Bug
-
happen, bug didn't happen, based on the
point you sent. And let's assume that the
-
key stays the same, we'll talk about that.
Can you imagine how we use that to start
-
learning things about the key? Well, let's
say, that we can magically conjure a point
-
that in that sequence of operation, that's
why I told you the sequence of operation
-
was important, the bug happens very
specifically at that addition and we find
-
another point where the bug happens very
specifically at that double. If we know
-
already these bits of the key, and we
aren't sure about this one, what can we do
-
with these two points? We send them both,
one of them will break the TLS connection,
-
the other one will succeed. That means
that the one that broke triggered the bug,
-
the one that didn't break, didn't trigger
the bug. And we know which one corresponds
-
to which key. Because we magically made
special points that only break very
-
precisely at that point of the
computation. Okay, so we learned a bit of
-
the key. In cryptography as soon as you
learn one bit of the key, there's probably
-
a way all the way down. So we build what's
called an adaptive attack. Let's say we
-
have these bits, but we want to learn
these, too. We compute two points, one
-
that breaks, when the addition happens
exactly at that point in the double and
-
add a procedure, and one that triggers
only when the add doesn't happen at the
-
specific point in the double and add
sequence. We send them both, only one of
-
them breaks the TLS connection, well, then
we know a bit! We go back to our magic
-
point generator and we generate two new
points.
-
This time we don't look for things that
break here, we look for things that break
-
here. We make two points, we send them
both. One of them breaks the connection,
-
the other doesn't break the connection. We
learned one more bit of the key. We go
-
back, we make two points, we send them
both: One breaks the connection, one
-
doesn't. We keep going like that, once for
each bit. Every time we go back and we
-
adapt to what we learned so far. That's
why it's called an adaptive attack, we
-
can't pre-compute all these points. We
have to come up with them while we learn
-
the key. And the beautiful thing about
adaptive attacks is that they look exactly
-
like Hollywood, it's beautiful! Because
you see them flipping and going through
-
values, getting it right and moving to the
next one, which you all told it was fake,
-
it was not! Everything else is fake. Okay,
so, this attack we came up with that, we
-
told we had something extremely novel, we
went to the literature and everyone that
-
had picked an academic career in the
audience knows exactly what happened: We
-
found a paper that was doing exactly this.
However, you know, it was a little
-
different. It was still P256 and it was
still ECDH and, hmm.. okay it's really
-
similar. But it's an attack that depends a
lot on the implementation details, how you
-
pull it off. You can't suddenly just
repurpose the code, but the idea so far:
-
an adaptive attack where you send two
points and you check which one breaks is
-
the same as a BBB paper from a few years
ago when it worked against open SSL
-
instead. Instead of against this bug in
the Go standard library. So instead from
-
now on we're going to talk about how
exactly we implemented this against Go,
-
because this changes a lot based on the
implementation details of the library
-
you're working with. So this was the
general idea of how the attack works. All
-
that: You find points that break at the
right time, you send them both, and that
-
way you learn a bit of the key,
except I lied to you.
-
I lied to you, because I lied to you on a
lot of things: The first one is that Go
-
doesn't do double and add one bit at a
time, it does it five bits at a time! It's
-
called Booth multiplication, it took us
forever to figure it out. It's an 1980s
-
paper. Instead of adding one point or zero
points and then doubling, it adds between
-
minus 16 and plus 16 points and then
doubles five times moving down. It just
-
does it in blocks of five. So it splits up
the key and then looks at each block of
-
bits, picks a value from a pre-computed
table, which is just all the values from
-
one times the point to 16 times the point
and in the loop it doubles five times,
-
because it moved five bits down. And then
it chooses, which point between zero and
-
16 to use from the table and it adds that
to the rolling result, instead of adding 1
-
or 0. There's also a bit of constant time
arithmetic there, because the other thing
-
I lied to you on is that there's no such
thing as at 0 point. It's an imaginary
-
point that we add to make the math work,
but when you try to tell the code to
-
actually add the zero point it's like
asking you to divide by 0. It just won't
-
do it! But you know how you add 0, right?
You do nothing! So there's some constant
-
time code here that looks at it and if
it's 0, it does nothing. If it's not 0, it
-
adds the value.
So the first thing we had to do to adapt
-
to this format is that we worked in limbs.
When you hear me say limb, it just means
-
that we will look at each five bit block
on its own as it is zero to sixteen and a
-
sign value. That's much easier, because
it's actually kind of weird how the five
-
bits are extracted and I don't want to
bore you with it. So let's just consider
-
that we look at each group of five bits
converted into a value from zero to
-
sixteen and a one and a sign or plus minus
sixteen. So when you hear me talk about
-
limbs you just know that it means the five
bit value from the key. This is still the
-
giant d key that's the multiplier. So how
does the attack change? Not that much!
-
Instead of attacking one bit at a time,
you know two points - one that breaks for
-
zero, one that breaks for one - we attack
one limb at a time, one that breaks for
-
one, one that breaks for two, one that
breaks for three, sixteen, minus one,
-
minus two, minus 16. So, to recover five
bits of the key, we'll need on average
-
half the space: seventeen points, which is
a little less efficient than the bit-by-
-
bit one, because that would be five points
for five bits. So, however, how the attack
-
triggers is the same: We're looking for a
bug that happens in the five doubles at
-
the very right time or that happens at the
addition at the very right time. Now
-
that's still the high level of how we're
going to do it, but in practice I didn't
-
tell you how we're going to magically
generate these magic points that break
-
just at the right time. And I didn't tell
you because it's fuzzing. There is there's
-
no specific way to generate them
algebraically, so instead we just hooked
-
the assembly code with something that
would just set a boolean, you know, true,
-
false, if the conditions for the bug are
met. And then we run through a lot of
-
points. And if for each point we run it
through the limbs we already know,
-
remember this is an adaptive attack, so
we want to learn the next limb.
-
We learned a few limbs, we want to learn
the next one. We run through the ones we
-
know and then we try all the possible
values for ones that we don't know. If one
-
of them and only one of them breaks,
that's a useful point. Because if we send
-
that point and it breaks, we know exactly
what the value of the next limb is. Okay,
-
so this is going even more low-level. If
you're interested in optimizations, we had
-
to run through a lot of candidate points
and for each point we needed to know the d
-
value. Because we can find a magic point,
but if we don't know the private key, we
-
don't know if the entire protocol works
and our oracle doesn't work anymore. So to
-
work with that instead of multiplying a
new random private key every time we just
-
add that one to the private key and added
the point to the public key, this is just
-
an optimization. We called into the
various assembly of the standard library.
-
Don't do this, but it's beautiful. You can
go call all the unexported functions in
-
the standard library. I will never approve
it on code review, but I love it. And then
-
there's some post-processing to make sure
that the bug is actually there, because
-
sometimes there are false positives. So we
just run it through the broken code, the
-
fixed code, is the result the same?
Well, false positive, is it
-
different, good. Okay, so we have a way to
move through the attack. The only thing we
-
don't have yet is: How we figure out the
first limb. The first one, the most
-
important, the most significant one, when
we still don't know anything about the
-
key. The problem with this one is: It's
like this, so let's pick three, it's an
-
example okay, let's pretend that the limb
is three. So we do our usual thing, we do
-
our fuzzing and we find something that
breaks at the fifth doubling and we send
-
it and it breaks. It means that the first
limb is three, right? Wrong. Sadly, it
-
might mean that the limb is also 6 or 12.
Because how six is selected for example is
-
that three is taken, it's doubled and
saved in the precomputation table as six,
-
then it's taken out of the table, doubled
five times, but what happens after you
-
double six four times? What's the value?
The exact same as doubling three five
-
times, and the sequence is the exact same!
So we don't know, which one it is, because
-
we only know that that's the sequence but
that doesn't tell us anything. And the
-
same happens for 12, 12 is nothing else
than 3 double double. So if we double it
-
five times, at the third double it breaks
and we only know if it breaks or not. So
-
we can't move, so instead what we do is
that we find three points: one that breaks
-
after doubling three five times, one that
breaks doubling it six times, and one that
-
breaks after doubling it seven times. We
send them all and we look at which ones
-
break. Only this one? It's a three! The
first and the second but not the third
-
point? It's a six! All three? It's a 12!
This took me forever to wrap my head
-
around, this is like pure shaman insight.
Okay now I can feel that you're getting
-
bit tired, this is intensive, it's a lot
of math, so let's go for a change of pace
-
and let's talk about kangaroos instead.
I'm gonna tell you something that I
-
learned from a cryptographic paper, I
swear. And it's about how kangaroos jump.
-
Apparently kangaroos jump depending on how
the terrain on which they are jumping from
-
is. Depending on that terrain, if you put
two kangaroos on the exact same spot, they
-
will jump the same distance and
approximately the same direction. I don't
-
know, if this is true, but Pollard said so
in a paper and I am NOT arguing with
-
Pollard. So now, why is this useful? Well,
this makes for an extremely cool way to
-
catch kangaroos! I mean, did you expect
some math or crypto? I know, we're
-
catching kangaroos here! So you take a
kangaroo that you have on hand, because
-
you all have a kangaroo on hand. And you
put a GPS tracker on it and you let it
-
loose. This kangaroo jumps and it roams it
enjoys a brief stint of freedom and it
-
runs and at some point you go pick it up
and because you know where it is and you
-
place a trap there exactly where you find
it. What happens to the kangaroo that is
-
just passing by? If it steps at any point
on one of the points, where the other
-
kangaroo jumped from there on it will
follow the same path. Because each jump
-
depends on the ground. So this way if the
wild kangaroo lands on any of the prints
-
of the previous kangaroo you're catching
it because eventually it will end up in
-
the trap. Okay, this had nothing to do
with the talk, I just wanted to share
-
this!
Now, okay, so here is how this converts to
-
crypto. We can make elliptic curve points
jump like kangaroos, we just have to make
-
the jump deterministic based on the input,
based on the starting point. For example
-
we can take a hash, any hash, you design
the hash. Apparently that's popular in
-
cryptocurrencies to design your own hash
..
-
laughter
And you hash a point to another point and
-
when you want to do a jump you take the
point and you add it to its own hash, so
-
Q_N+1 depends only on QN, and this is just
like the kangaroo jump. How do you use
-
this for what we're doing? We want to
recover d, right? We want to recover the
-
multiplier, the discrete log it's often
called, of the public key. How we work
-
with this is that we take a tame kangaroo,
a point that we know the d off, and we
-
make it jump a lot. It keeps jumping and
we remember what the value is at the end.
-
We take that value at the end and we save
it. No need to keep all the points in
-
between, so we don't need some giant
memory construction and then we take the
-
point that we don't know the d of and we
make it jump a lot. What happens is that
-
it has much higher probability to
intersect one of the various prints of the
-
previous point. When it does, it will
eventually end up in our trap, it will end
-
up having the same value as the one we
calculated earlier. When that happens, we
-
know how far the wild point traveled,
because we were the ones making it jump.
-
So we can just walk backwards to the
starting point. It turns out that this is
-
extremely efficient to find the d value
when you know the interval of the d value.
-
The intuition there is that if the
kangaroo starts in a small area: You know
-
when it's been too much time that it
probably travelled past your trap. So you
-
have to rewind time, which in crypto you
can, and try again, because it went past
-
your trap. So the smaller the interval,
the more precise you can be, the more
-
efficient attack. This is called the
Pollard-kangaroo attack. It's described in
-
original paper from the 1980s, it was
described on Diffie-Hellman first, but it
-
works on any group, so it works on
elliptic curves. And the elliptic curve
-
version is then improved by a few papers
later and there is a chapter in the
-
elliptic and hyperelliptic handbook
that describes it.
-
So we have it all, we have how to start,
we have how to run through the attack and
-
we have a how to end. So now let's get
concrete, let's pick a target, as I said
-
this attack works against JWT. JWT made
... a lot of questionable choices. One of
-
them is to include as one of the public
key algorithms elliptic curve Diffie-
-
Hellman but not the version of Diffie-
Hellman you and I are familiar with. The
-
one where we both generate a new private
key every time, which makes this attack
-
impossible. Remember that this is an
adaptive attack. We need to query the
-
orcale for the same private key over and
over and over again. Instead there's a
-
variant called ephemeral static Diffie-
Hellman, where one of the two sides is
-
always the same. This is sometimes done as
an optimization, don't do that. OpenSSL
-
was doing that and it stopped doing that
after a bunch of attacks came from that.
-
So in TLS that usually doesn't happen and
the Go TLS stack thankfully never did
-
that, so the attack doesn't work against
TLS. But JWT is encoded it straight into
-
the standard, because if your public key
is fixed, so is your private key, always
-
the same. So if we have a service that
accepts things encrypted with a ECDHES
-
algorithm, we can use this attack. For
example against the popular implementation
-
Go-Jose, it's not Go-Jose's fault and Go
1.8.1.1, the latest vulnerable version,
-
and we can just check if the service can
decrypt what we're sending it. It can,
-
because it throws HTTP error when it
can't, because of different timings. This
-
changes in any case, but what you need is
the Oracle that tells you: did it work did
-
it not work? Did the bug trigger, so are
you right about your prediction of the
-
limb or are you not?
Then of course we need to do a lot of
-
work. If you don't have the resources of a
big corporation of where to spin up
-
things, you can just use EC2 spot
instances. How we architected that, is
-
that there will be a small piece of code
that do nothing intensive on your laptop
-
or anything. It would accept HTTP requests
from the workers. The beautiful thing
-
about this infrastructure is that you can
horizontally scale the workers, spin up as
-
many as you want on node.js platforms,
because the only thing that they need to
-
be able to do: they don't need to have
ports open, you don't have to worry about
-
NAT, you can even run it on your botnet,
because the only thing they have to do is
-
connect back and ask for work and then
after 30 cycles or when they find the
-
point, connect back and say I found
something, I didn't find anything, give me
-
some more work and send the result. This
was also useful, because if you get the
-
worker code wrong or if you want to change
the deployment, you can just redeploy the
-
workers without losing state on the
dispatcher. Because the dispatcher just
-
keeps running and it will just wait for
workers to come and ask. Specifically we
-
built this just with the small script that
you can start an EC2 machine with, because
-
we didn't even need to make a custom
image. So a few figures, a few numbers.
-
Each Key has 52 limbs, it will take a bit
less than that, because we have kangaroos.
-
But let's say approximately 52. Each limb
is 16 points on average. It would be 17,
-
but two of the points are faster to find,
so let's say 16. Each point takes
-
approximately 2 to the 26 candidate
points, before we find one that triggers
-
the bug at the right point. Since we need
16 candidate points and each takes 2 to
-
the 26 candidate points, that takes
approximately 85 CPU hours. That's like
-
one CPU running for an hour, 1 core. Turns
out that you can get 85 CPU hours from
-
spot instances for about a dollar and a
quarter, which makes the total cost of the
-
attack something like 60 bucks.
Which was a relief, because I had done the
-
math wrong first and the it came out as at
1000 and I run the demo tonight and I
-
didn't check the bill, yeah.. Anyway, it's
cheap. Now, I am not brave enough to run
-
the attack live, because, yes, it's a nice
infrastructure, but no, I don't trust it
-
that much. But also, because it takes a
while. If you don't want to spin up
-
infinite number of EC2 machines, you have
to accept that it would take about, I
-
think, our attack run in 12 hours. So
we're gonna look at a sped-up version of a
-
one-hour video in the next 45 minutes, you
have time, right?
-
laughter
No, it's a couple of minutes. So this is
-
the UI. It shouldn't be too confusing and
if anyone works at Hollywood and wants to
-
license it, we can talk. What you're
seeing is that the red values are the ones
-
that we found a point for from the workers
and we submitted. And when we submitted
-
that it resulted not to be the right limb
and the green ones are the ones that
-
instead broke, so they're the right limb.
Remember that here the target is a JWT
-
receiving application. And then you can
see the key slowly flipping from the
-
bottom and it's exactly like Hollywood, I
love that. So you can see the limbs
-
filling up as we find them and that
approximately there's 30 bits, so 2 to the
-
30 rounds, so 30 candidates work for each
round for each limb that we find. It
-
obviously depends on luck and yes the the
thing will probably keep running for a
-
little while, but this is already at limb
nine, it has to get to 52 and you don't
-
have that patience. So this was the attack
the code will be open-source soon. Leave
-
the limbs you lost, they belong to us now.
And: any questions?
-
applause
Herald: Valsorda, thank you very much for
-
a lovely talk and the kangaroos. So, we
have a question from the signal angel, go
-
ahead, please.
Signal Angel: Actually the internet wants
-
to know: Did you compare this bug to
implementation in other libraries.
-
Filippo: So I guess that means, if I
looked for similar bugs in other
-
implementations. We did not, because each
implementation is a bit different. Hanno
-
works on a lot of fuzzing of bigint
implementations and at one point he asked
-
me, like on Twitter just today, if I tried
fuzzing the Go implementation for example.
-
And sadly this is constant time code that
is specific to P256, so the answer is,
-
there's a lot of them and the bug can be
small and anywhere. It's not like you will
-
be looking for another bug in P256
subtraction, it can be anywhere in the
-
underlying math and we can turn that into
the same attack. So, no, we didn't look
-
for this specific one, but I think that
four CVEs in 2017 on open SSL have
-
descriptions that are very similar, but
they're about finite field Diffie-Hellman,
-
like normal DH. If you look for something
that says about it's barely doable,
-
because all the computation can be done
offline. That's this kind of attacks on
-
open SSL. Next?
Herald: Are there any other questions from
-
the Signal Angel? So please line up at the
microphone. Microphone one please.
-
Mic1: So why can't you determine the
points algebraically?
-
Filippo: Laziness? No, so it's entirely
assembly and there's a lot of points where
-
the value is then thrown out or it might
get corrected by .. it's essentially we
-
didn't see a clear path to this, and it's
$65 on ec2, so it doesn't really change
-
the feasibility to just fuzz them, so we
just went for the fastest path to the
-
entrance.
Herald: Are there any other questions?
-
Filippo: No one is asking about kangaroos,
people, I mean..
-
Herald: Yes, ask about kangaroos, lovely.
have you been to Australia?
-
Filippo: I haven't.
Herald: Okay, you have to.
-
Filippo: Yep, definitely.
Herald: I think, there aren't any other
-
questions. So Filippo Valsorda, big round
of applause for you. Thank you!
-
applause
-
34c3 outro
-
subtitles created by c3subtitles.de
in the year 2019. Join, and help us!