-
In this segment, we're gonna look at the
Diffie-Hellman protocol, which is our
-
first practical key exchange mechanism. So
let me remind you of the settings. Our
-
friends here, Alice and Bob, have never
met and yet they wanna exchange a shared
-
secret key that they can then use to
communicate securely with one another.
-
In this segment and the next, we're only
gonna be looking at eavesdropping
-
security. In other words, we only care
about eavesdroppers. The attackers are
-
actually not allowed to tamper with data
in the network. They're not allowed to
-
inject packets. They're not allowed to
change packets in any way. All they do is
-
to just eavesdrop on the traffic. And at
the end of the protocol, the key exchange
-
protocol our friends Alice and Bob should
have a shared secret key but the attacker
-
namely the eavesdropper has no idea what
that key's gonna be. In the previous
-
segment we looked at a key segment called
Merkle puzzles. That's just using block
-
ciphers or hash functions. And I showed
you that there that basically the attacker
-
has a quadratic gap compared to the
participants. In other words if the
-
participants spent time proportional to N
the attacker can break the protocol in
-
time proportional to N squared. And as a
result that protocol is to insecure to be
-
considered practical. In this segment we
are going to ask whether we can do the
-
same thing but now we'd like to achieve an
exponential gap between the attacker's
-
work Ended up in the participant's work.
In other words, Alice and Bob might do
-
work proportional to N, but to break the
protocol the attacker is gonna have to do
-
work proportional to some exponential in
N. So now there's an exponential gap
-
between the participants work and the
attacker's work. So as I said in the
-
previous segment we can't achieve this
exponential gap from blog ciphers like AES
-
or SHA-256. We have to use hard problems
that have more structure than just those
-
symmetric primitives. And so instead what
we're gonna do is use a little bit of algebra.
-
In this segment I'm gonna
describe the Diffie-Hellman protocol very
-
concretely and somewhat informally. When
we're gonna come back to Diffie-Hellman next week
-
and we're gonna describe the protocol more
abstractly and we're gonna do a much more
-
rigorous security analysis of this
protocol. Okay, so how does Diffie-Hellman
-
work? What we're gonna do is, first of
all, we're gonna fix some large prime
-
which I'm gonna call p. Actually p I'll
usually use to denote primes. And you
-
should be thinking of really gigantic
primes. Like, primes that are made up of
-
600 digits are so. And just for those of
you who like to think in binary, a 600
-
digit prime roughly corresponds to about
2000 bits. So just writing down the prime
-
takes about two kilo bits of data. And
then we're also gonna fix an integer g
-
that happens to live in the range one to
p. So these values p and g are parameters
-
of the Diffie-Hellman protocol. They are
chosen once and they're fixed forever. Now
-
the protocol works as follow. So here's
our friends Alice and Bob. So what Alice's
-
going to do is she's gonna starts off by
choosing some random number a in the range 1 to p-1
-
And then she is gonna compute
the number g to the power of a modulo p.
-
Okay, so she computes this exponention,
and reduces the result modular the prime p.
-
And we're actually going to see how to
compute this efficiently the next module,
-
so for now just believe me that this
computation can be done efficiently.
-
Now, let's call capital A the result of this
value. So, what Alice will do is she will
-
send capital A over to Bob. Now Bob is
going to do the same thing. He's going to
-
choose a random number b in the range 1
to p-1. He's going to compute g to
-
the b module of p and let's call this
number B and he's going to send this
-
number B over to Alice. So Alice sends A
to Bob. Bob sends B. To Alice. And now
-
they claim that they can generate a shared
secret key. So what's a shared secret key?
-
Well, it's defined as. Let's call it
K_AB. And it's basically defined as the
-
value g to the power of a times b. Now the
amazing observation of Diffie-Hellman had
-
back in 1976 is that in fact both parties
can compute this value g to the ab.
-
So let's see, Alice can compute this value
because she can take her value B that she
-
received from Bob. She can take this value
B, raise it to the power of a and, let's
-
see, what she'll get is g to the b to the
power of b. And by the rules of
-
exponentiation, g to the power of b to the
power of a is equal to g to the ab. Bob
-
turns out, can do something similar, so
his goal is to compute g to the a to the b,
-
which again, is g to the ab, so both
Alice and Bob will have computed K_AB, and
-
let me ask you, how does Bob actually
compute this quantity g to the ab?
-
Well so all he does is he takes the value A that
he received from Alice and he raises it to
-
his own secret b and that gives him the
shared secret g to the ab. So you can see
-
now that both parties even though they
seemingly computed different values. In
-
fact they both end up with the same value
namely g ab module p. I should mention by
-
the way that Martie Hellman and Wig
Diffiie came up with this protocol back in
-
1976. Martie Hellman was a professor here
at Stanford and Wig Diffie was his
-
graduate student. They came up with this
protocol and this protocol really changed
-
the world. In that it introduced this new
age in cryptography. Where now it's not just
-
about developing block ciphers but it's
actually about designing algebraic
-
protocols that have properties like the
one we see here. So you notice that what
-
makes this protocol works is basically
properties of exponentiation. Namely that,
-
g to the b to the power of a is the same
as g to the a to the power of b, okay?
-
These are just properties of
exponentiations. Now when I describe a
-
protocol like the one I just showed you.
The real question that should be in your
-
mind is not why the protocol works. In
other words, it's very easy to verify
-
that, in fact, both Alice and Bob end up
with the same secret key. The bigger
-
question is why is this protocol secure?
In other words, why is it that an
-
eavesdropper cannot figure out the same
shared key due to the AB that both Alice
-
and Bob computed by themselves? So let's
analyze this a little bit, then, like I
-
said, we're gonna do a much more in-depth
analysis next week. But, let's, for now,
-
just see intuitively why this protocol is
gonna be secure. Well, what does the
-
eavesdropper see? Well, he sees the prime
and the generator. These are just fixed
-
forever and everybody knows who they are.
He also sees the value that Alice sent to
-
Bob, namely this capital A. And he sees
the value that Bob sent to Alice, namely
-
this capital B. And the question is, can
the, can the eavesdropper compute g to the
-
ab just given these four values? So more
generally, what we can do is we can define
-
this Diffie-Hellman function. So we'll say
that the Diffie-Hellman function is defined
-
based on some value g. And the question is
given g to the a, and g to the b. Can you
-
compute g to the ab? And what we're
asking is how hard is it to compute this
-
function module over very large prime p.
Remember p was something like 600 digits.
-
So the real question we need to answer is
how hard is this, is this Diffie-Hellman
-
function? So let me show you what's known
about this. So suppose the prime happens
-
to be n bits long. Okay, so in our case
say n is 2,000 bits. It turns out the best
-
known algorithm for computing the
Diffie–Hellman function. Is actually a
-
more general algorithm that computes the
discrete log function, which we're gonna
-
talk about next week. But for now, let's
just say that this algorithm computes a
-
Diffie-Hellman function. The algorithm is
called a general number field sieve. I'm
-
not gonna describe how it works, although
if you'd want to hear how it works, let me
-
know, and that could be one of the special
topics that we cover at the end of the
-
course. But the interesting thing is that
it's running time is exponential in
-
roughly the cube root of n. In other
words, the running time is roughly e to
-
the power of o of cube root of n. So in
fact the exact expression for the running
-
time, of this algorithm is much more
complicated than this. I'm deliberately
-
simplifying it here just to kind of get
the point across. The point that I want to
-
emphasize is that in fact, this is not
quite an exponential time algorithm.
-
Exponential would be e to the power of n.
This is sometimes called a sub-exponential
-
algorithm because the exponent is really
just proportional to the cube root of n,
-
as opposed to linear n. What this says is
that even though this problem is quite
-
difficult, it's not really an exponential
time problem. The running time in the
-
exponent is gonna be just proportional to
the cube root of n. So let's look at some
-
examples. Suppose I happen to be looking
at a modulus that happens to be about a
-
thousand bits long. What this algorithm
says is that we can solve the
-
Diffie-Hellman problem in time that's
approximately e to the qube root of 1,024.
-
Now this is not really correct, in fact
there are more constants in the exponent.
-
But again, just to get, the point across,
we can say that the running time would be
-
roughly e to the qube root of 1,024; which
is actually very small, roughly e to the 10.
-
So the simple example shows you that
if you have a sub-exponential algorithm,
-
you see that even if the problem is quite
large, like 1,000 bits. Because of the
-
cube root effect the exponent actually is not
that big overall. Now to be honest I'm
-
actually lying here. General number field
sieve actually have many other
-
constants in the exponents so in fact this
is not going to be ten at all. It's
-
actually going to be something like 80.
Because of various constants in the
-
exponents. But nevertheless but you see
the problem is much harder than say 2 to
-
the 1000. Even though describing it takes
1,000 bits, solving it does not take time
-
to the 1,000. So here I roll down the
table that kind of shows you the rough
-
difficulty of breaking down the
Diffie-Hellman protocol compared to the
-
difficulty of breaking down a cipher with
a appropriate number of bits. For example,
-
if your cipher has 80 bit keys. That would
be roughly comparable to using 1,000 bit
-
modules. In other words breaking a cipher
with 80 bit keys takes time roughly 2 to the 80
-
which means that
breaking if you have Diffie-Hellman function with a 1,000
-
bit module will take time 2 to the 80.
-
If your cipher uses 128 bit keys like AES, you should be
really using a 3,000 bit modulus,
-
even though nobody really does this. In reality
people would use 2,000 bit modulus. And
-
then if your key is very large, like if
we're using a 256 bit AES key, then in
-
fact your modulus needs to be very, very
large. Again, you, you can really see the
-
cube root effect here. We doubled the size
of our key and because of the cube root
-
effect, what that means is we have to
increase the size of, of our modulus by a
-
factor of two cubed, namely a factor of
eight, which is where this 15,000 comes from.
-
from. And again I should mention there's
not exactly a factor of eight because of
-
the extra constants in the exponent. So
this is a nice table that shows you that
-
if you're gonna be using Diffie-Hellman to
exchange, a session key. And that session
-
key is gonna be used for a block cipher
with a certain key size, this table shows
-
you what modular size you need to use so
that the security of the key exchange
-
protocol is comparable to the security of
the block cipher you're gonna be using after.
-
Now this last row should
really be disturbing to you. I should tell
-
you that working with such a large modulus
is very problematic. This is actually
-
gonna be quite slow, and so the question
is whether there is anything better that
-
we can do? And it turns out there is. In
fact the way I describe the Diffie-Hellman
-
function is I describe it at the way
Diffie and Hellman described it in 1976
-
using arithmetic module of primes. The
problem with arithmetic modular primes is
-
that the Diffie-Hellman function is hard
to compute, but it's not as hard as you'd
-
like. There's this cube root effect that
makes it a little easier than what you'd
-
really like. And so the question is can we
run the Diffie-Hellman protocol in other
-
settings. And it turns out the answer is
yes. In fact we can literally translate
-
the Diffie-Hellman protocol from an
arithmetic model of primes to a different
-
type of algebraic object and solving the
Diffie-Hellman problem on that other
-
algebraic object is much, much harder.
This other algebraic object is what's
-
called an elliptic curve. And as I said,
computing the Diffie-Hellman function on
-
these elliptic curves is much harder than
computing the Diffie-Hellman modulo primes.
-
Because the problem is so much harder, now
we can use much smaller objects in
-
particular, you know we'd be using primes
that are only a 160 bits or 80 bit keys or
-
only 512 bits for 256 bit keys. So because
these module don't grow as
-
fast on elliptic curves, there's generally
a slow transition away from using module
-
arithmetic, to using elliptic curves. I'm
not going to describe elliptic curves
-
right now for you, but if this is
something you'd like to learn about I can
-
do that at the very last week of the
course, when we discuss more advanced
-
topics. But in fact when we come back to
Diffie-Hellman next week I'm gonna describe it
-
more abstractly so that it really doesn't
matter which algebraic structure you use
-
whether you use arithmetic model of primes
or whether you use a elliptic curve we
-
can kinda abstract that whole issue away
and then use the Diffie-Hellman concept a for
-
key exchange and for other things as well.
And as I said we'll see that more
-
abstractly next week. So as usual I wanna
show that this beautiful protocol that I
-
just showed you, the Diffie-Hellman protocol,
is as is, is actually completely insecure
-
against an active attack. Namely, it's
completely insecure against what's called
-
the man in the middle attack. We need to
do something more than this protocol to
-
make is secure against man in the middle.
And again we're gonna come back to Diffie
-
Hellman and make it secure against man in
the middle next week. Okay. So let's see
-
why the protocol that I just showed you is
insecure against active attacks. Well
-
suppose we have this man in the middle
that's trying to eavesdrop on the
-
conversation between Alice and Bob. Well
so, the protocol starts with Alice sending
-
g to the a over to Bob. Well, so the man
in the middle is gonna intercept that and
-
he's gonna take the message that Alice
sent and he's gonna replace it with his
-
own message. So it's called A', And
let's write that is g to the a'.
-
Okay? So the man in the middle chooses his
own a' and what he sends to Bob is
-
actually g to the a'. Now poor Bob
doesn't know that the man in the middle
-
actually did anything to this traffic, all
he sees is he got the value A'. As
-
far as he knows, that value came from
Alice. So what is he gonna do in response?
-
Well, he's gonna send back his value B out
which is g to the b back to Alice. Well
-
again the man in the middle is gonna
intercept this B. He's gonna generate his
-
own b' and what he actually sends
back to Alice is B' which is g to the b'.
-
Okay, now what happens? Well
Alice is gonna compute her part of the
-
secret key and she's gonna get g to the ab'. Bob is gonna compute his part of
-
the secret key and he's gonna get g to the
b times a'. Okay, these now you
-
notice these are not the same keys. In the
man in the middle because he knows both A'
-
and B' he can compute both g to
the ab' and g to ba'. Yeah it's
-
not difficult to see the man in the middle
knows both values. And as a result, now he
-
can basically play this man in the middle
and when Alice sends an encrypted message
-
to Bob the man in the middle can simply
decrypt this message because he knows the
-
secret key that Alice encrypted message
with, re-encrypt it using Bob's key. And
-
then send the message on over to Bob. And
this way Alice sent the message, Bob
-
received the message. Bob thinks the
message is secure. But in fact that
-
message went through the man in the
middle. The man in the middle decrypted
-
it, re-encrypted it for Bob. At the same
time he was able to completely read it,
-
modify it if he wants to, and so on. So,
the protocol becomes completely insecure
-
give n a man in the middle. And so as I
said we're gonna have to enhance the
-
protocol somehow to defend against men in
the middle but it turns out that it's
-
actually not that difficult to enhance and
prevent man in the middle attacks.
-
And we're gonna come back to that and see that
in a week or two. The last think I want to
-
do is show you an interesting property of
the Diffie-Hellman protocol. In fact, I
-
want to show you that this protocol can be
viewed as a non-interactive protocol. So,
-
what do I mean by that? So Imagine we have
a whole bunch of users, you know, millions
-
of users. Let's call them Alice, Bob,
Charlie, David and so on and so forth.
-
Each one of them is going to choose a
random, secret value, and then on their
-
Facebook profiles, they're gonna write
down, their contribution to the
-
Diffie-Hellman protocol. Alright so
everybody just writes down you know g to
-
the a, g to the b, g to the c and so on.
Now the interesting thing about this is,
-
if say Alice and Charlie wanna set up a
shared key they don't need to communicate
-
at all. Basically Alice would go and read
Charlie's public profile. Charlie would go
-
and read Alice's public profile. And now,
boom, they immediately have a secret key.
-
Namely, now, Alice knows, g to the c and
a. Charlie knows g to the a and с. And as
-
a result, both of them can compute п to
the ac. So, in some sense, once they've
-
posted their contributions to the
Diffie-Hellman protocol to their public
-
profiles, they don't need to communicate
with each other at all to set up a shared key.
-
They immediately have a shared key
and now they can start communicating
-
securely through Facebook or not with one
another. And notice that this is true for
-
any Facebook user. So as soon as I read
somebody's public profile, I immediately
-
have a shared key with them without ever
communicating with them. This property is
-
sometimes called a non-interactive
property of the Diffie-Hellman. So now, let
-
me show you an open problem. And this is
an open problem that's been open for ages
-
and ages and ages. So it'd be really cool
if one of you can actually solve it. The
-
question is, can we do this for more than
two parties? In other words, say we have
-
four parties. All of them post their
values to their Facebook profiles. And now
-
we'd like to make it that just by reading
Facebook profiles, all of them can set up
-
a shared secret key. In other words, Alice
is, what she'll, she'll do is she'll only
-
read the public profiles of, the three
friends, Bob, Charlie and David. And
-
already she can compute a shared key with
them. And similarly David is just gonna
-
read the public profile of Charlie. Bob
and Alice. And already he has a shared key
-
with all four of them. Okay, so the
question is whether we can kind of setup
-
non-interactively these, these shared keys
for groups that are larger than just two
-
people. So as I said, for n equals two,
this is just a Diffie-Hellman protocol. In
-
other words, if just two parties want to
set up a shared key without communicating
-
with one another, that's just
Diffie-Hellman. It turns out, for N equals
-
three, we also know how to do it, there's
a known protocol, it's called protocol due
-
to Joux. It already uses very, very fancy
mathematics, much more complicated
-
mathematics than what I just showed you.
And for N equals four, or five, or
-
anything above this, above four, this
problem is completely open. Nearly the
-
case where four people post something to
the public profiles and then everybody
-
else reads the public profile and then
they have a joint shared key, this is
-
something we don't know how to do even for
four people. And this is a problem that's
-
been open for ages and ages, it's kind of
a fun problem to think about and so see if
-
you can solve it, if you will, it's
instant fame in the crypto world. Okay, so
-
I'll stop here, and we'll continue with
another key exchange mechanism in the next segment.