In this segment, we're gonna look at the
Diffie-Hellman protocol, which is our

first practical key exchange mechanism. So
let me remind you of the settings. Our

friends here, Alice and Bob, have never
met and yet they wanna exchange a shared

secret key that they can then use to
communicate securely with one another.

In this segment and the next, we're only
gonna be looking at eavesdropping

security. In other words, we only care
about eavesdroppers. The attackers are

actually not allowed to tamper with data
in the network. They're not allowed to

inject packets. They're not allowed to
change packets in any way. All they do is

to just eavesdrop on the traffic. And at
the end of the protocol, the key exchange

protocol our friends Alice and Bob should
have a shared secret key but the attacker

namely the eavesdropper has no idea what
that key's gonna be. In the previous

segment we looked at a key segment called
Merkle puzzles. That's just using block

ciphers or hash functions. And I showed
you that there that basically the attacker

has a quadratic gap compared to the
participants. In other words if the

participants spent time proportional to N
the attacker can break the protocol in

time proportional to N squared. And as a
result that protocol is to insecure to be

considered practical. In this segment we
are going to ask whether we can do the

same thing but now we'd like to achieve an
exponential gap between the attacker's

work Ended up in the participant's work.
In other words, Alice and Bob might do

work proportional to N, but to break the
protocol the attacker is gonna have to do

work proportional to some exponential in
N. So now there's an exponential gap

between the participants work and the
attacker's work. So as I said in the

previous segment we can't achieve this
exponential gap from blog ciphers like AES

or SHA-256. We have to use hard problems
that have more structure than just those

symmetric primitives. And so instead what
we're gonna do is use a little bit of algebra.

In this segment I'm gonna
describe the Diffie-Hellman protocol very

concretely and somewhat informally. When
we're gonna come back to Diffie-Hellman next week

and we're gonna describe the protocol more
abstractly and we're gonna do a much more

rigorous security analysis of this
protocol. Okay, so how does Diffie-Hellman

work? What we're gonna do is, first of
all, we're gonna fix some large prime

which I'm gonna call p. Actually p I'll
usually use to denote primes. And you

should be thinking of really gigantic
primes. Like, primes that are made up of

600 digits are so. And just for those of
you who like to think in binary, a 600

digit prime roughly corresponds to about
2000 bits. So just writing down the prime

takes about two kilo bits of data. And
then we're also gonna fix an integer g

that happens to live in the range one to
p. So these values p and g are parameters

of the Diffie-Hellman protocol. They are
chosen once and they're fixed forever. Now

the protocol works as follow. So here's
our friends Alice and Bob. So what Alice's

going to do is she's gonna starts off by
choosing some random number a in the range 1 to p-1

And then she is gonna compute
the number g to the power of a modulo p.

Okay, so she computes this exponention,
and reduces the result modular the prime p.

And we're actually going to see how to
compute this efficiently the next module,

so for now just believe me that this
computation can be done efficiently.

Now, let's call capital A the result of this
value. So, what Alice will do is she will

send capital A over to Bob. Now Bob is
going to do the same thing. He's going to

choose a random number b in the range 1
to p-1. He's going to compute g to

the b module of p and let's call this
number B and he's going to send this

number B over to Alice. So Alice sends A
to Bob. Bob sends B. To Alice. And now

they claim that they can generate a shared
secret key. So what's a shared secret key?

Well, it's defined as. Let's call it
K_AB. And it's basically defined as the

value g to the power of a times b. Now the
amazing observation of Diffie-Hellman had

back in 1976 is that in fact both parties
can compute this value g to the ab.

So let's see, Alice can compute this value
because she can take her value B that she

received from Bob. She can take this value
B, raise it to the power of a and, let's

see, what she'll get is g to the b to the
power of b. And by the rules of

exponentiation, g to the power of b to the
power of a is equal to g to the ab. Bob

turns out, can do something similar, so
his goal is to compute g to the a to the b,

which again, is g to the ab, so both
Alice and Bob will have computed K_AB, and

let me ask you, how does Bob actually
compute this quantity g to the ab?

Well so all he does is he takes the value A that
he received from Alice and he raises it to

his own secret b and that gives him the
shared secret g to the ab. So you can see

now that both parties even though they
seemingly computed different values. In

fact they both end up with the same value
namely g ab module p. I should mention by

the way that Martie Hellman and Wig
Diffiie came up with this protocol back in

1976. Martie Hellman was a professor here
at Stanford and Wig Diffie was his

graduate student. They came up with this
protocol and this protocol really changed

the world. In that it introduced this new
age in cryptography. Where now it's not just

about developing block ciphers but it's
actually about designing algebraic

protocols that have properties like the
one we see here. So you notice that what

makes this protocol works is basically
properties of exponentiation. Namely that,

g to the b to the power of a is the same
as g to the a to the power of b, okay?

These are just properties of
exponentiations. Now when I describe a

protocol like the one I just showed you.
The real question that should be in your

mind is not why the protocol works. In
other words, it's very easy to verify

that, in fact, both Alice and Bob end up
with the same secret key. The bigger

question is why is this protocol secure?
In other words, why is it that an

eavesdropper cannot figure out the same
shared key due to the AB that both Alice

and Bob computed by themselves? So let's
analyze this a little bit, then, like I

said, we're gonna do a much more in-depth
analysis next week. But, let's, for now,

just see intuitively why this protocol is
gonna be secure. Well, what does the

eavesdropper see? Well, he sees the prime
and the generator. These are just fixed

forever and everybody knows who they are.
He also sees the value that Alice sent to

Bob, namely this capital A. And he sees
the value that Bob sent to Alice, namely

this capital B. And the question is, can
the, can the eavesdropper compute g to the

ab just given these four values? So more
generally, what we can do is we can define

this Diffie-Hellman function. So we'll say
that the Diffie-Hellman function is defined

based on some value g. And the question is
given g to the a, and g to the b. Can you

compute g to the ab? And what we're
asking is how hard is it to compute this

function module over very large prime p.
Remember p was something like 600 digits.

So the real question we need to answer is
how hard is this, is this Diffie-Hellman

function? So let me show you what's known
about this. So suppose the prime happens

to be n bits long. Okay, so in our case
say n is 2,000 bits. It turns out the best

known algorithm for computing the
Diffie–Hellman function. Is actually a

more general algorithm that computes the
discrete log function, which we're gonna

talk about next week. But for now, let's
just say that this algorithm computes a

Diffie-Hellman function. The algorithm is
called a general number field sieve. I'm

not gonna describe how it works, although
if you'd want to hear how it works, let me

know, and that could be one of the special
topics that we cover at the end of the

course. But the interesting thing is that
it's running time is exponential in

roughly the cube root of n. In other
words, the running time is roughly e to

the power of o of cube root of n. So in
fact the exact expression for the running

time, of this algorithm is much more
complicated than this. I'm deliberately

simplifying it here just to kind of get
the point across. The point that I want to

emphasize is that in fact, this is not
quite an exponential time algorithm.

Exponential would be e to the power of n.
This is sometimes called a sub-exponential

algorithm because the exponent is really
just proportional to the cube root of n,

as opposed to linear n. What this says is
that even though this problem is quite

difficult, it's not really an exponential
time problem. The running time in the

exponent is gonna be just proportional to
the cube root of n. So let's look at some

examples. Suppose I happen to be looking
at a modulus that happens to be about a

thousand bits long. What this algorithm
says is that we can solve the

Diffie-Hellman problem in time that's
approximately e to the qube root of 1,024.

Now this is not really correct, in fact
there are more constants in the exponent.

But again, just to get, the point across,
we can say that the running time would be

roughly e to the qube root of 1,024; which
is actually very small, roughly e to the 10.

So the simple example shows you that
if you have a sub-exponential algorithm,

you see that even if the problem is quite
large, like 1,000 bits. Because of the

cube root effect the exponent actually is not
that big overall. Now to be honest I'm

actually lying here. General number field
sieve actually have many other

constants in the exponents so in fact this
is not going to be ten at all. It's

actually going to be something like 80.
Because of various constants in the

exponents. But nevertheless but you see
the problem is much harder than say 2 to

the 1000. Even though describing it takes
1,000 bits, solving it does not take time

to the 1,000. So here I roll down the
table that kind of shows you the rough

difficulty of breaking down the
Diffie-Hellman protocol compared to the

difficulty of breaking down a cipher with
a appropriate number of bits. For example,

if your cipher has 80 bit keys. That would
be roughly comparable to using 1,000 bit

modules. In other words breaking a cipher
with 80 bit keys takes time roughly 2 to the 80

which means that
breaking if you have Diffie-Hellman function with a 1,000

bit module will take time 2 to the 80.

If your cipher uses 128 bit keys like AES, you should be
really using a 3,000 bit modulus,

even though nobody really does this. In reality
people would use 2,000 bit modulus. And

then if your key is very large, like if
we're using a 256 bit AES key, then in

fact your modulus needs to be very, very
large. Again, you, you can really see the

cube root effect here. We doubled the size
of our key and because of the cube root

effect, what that means is we have to
increase the size of, of our modulus by a

factor of two cubed, namely a factor of
eight, which is where this 15,000 comes from.

from. And again I should mention there's
not exactly a factor of eight because of

the extra constants in the exponent. So
this is a nice table that shows you that

if you're gonna be using Diffie-Hellman to
exchange, a session key. And that session

key is gonna be used for a block cipher
with a certain key size, this table shows

you what modular size you need to use so
that the security of the key exchange

protocol is comparable to the security of
the block cipher you're gonna be using after.

Now this last row should
really be disturbing to you. I should tell

you that working with such a large modulus
is very problematic. This is actually

gonna be quite slow, and so the question
is whether there is anything better that

we can do? And it turns out there is. In
fact the way I describe the Diffie-Hellman

function is I describe it at the way
Diffie and Hellman described it in 1976

using arithmetic module of primes. The
problem with arithmetic modular primes is

that the Diffie-Hellman function is hard
to compute, but it's not as hard as you'd

like. There's this cube root effect that
makes it a little easier than what you'd

really like. And so the question is can we
run the Diffie-Hellman protocol in other

settings. And it turns out the answer is
yes. In fact we can literally translate

the Diffie-Hellman protocol from an
arithmetic model of primes to a different

type of algebraic object and solving the
Diffie-Hellman problem on that other

algebraic object is much, much harder.
This other algebraic object is what's

called an elliptic curve. And as I said,
computing the Diffie-Hellman function on

these elliptic curves is much harder than
computing the Diffie-Hellman modulo primes.

Because the problem is so much harder, now
we can use much smaller objects in

particular, you know we'd be using primes
that are only a 160 bits or 80 bit keys or

only 512 bits for 256 bit keys. So because
these module don't grow as

fast on elliptic curves, there's generally
a slow transition away from using module

arithmetic, to using elliptic curves. I'm
not going to describe elliptic curves

right now for you, but if this is
something you'd like to learn about I can

do that at the very last week of the
course, when we discuss more advanced

topics. But in fact when we come back to
Diffie-Hellman next week I'm gonna describe it

more abstractly so that it really doesn't
matter which algebraic structure you use

whether you use arithmetic model of primes
or whether you use a elliptic curve we

can kinda abstract that whole issue away
and then use the Diffie-Hellman concept a for

key exchange and for other things as well.
And as I said we'll see that more

abstractly next week. So as usual I wanna
show that this beautiful protocol that I

just showed you, the Diffie-Hellman protocol,
is as is, is actually completely insecure

against an active attack. Namely, it's
completely insecure against what's called

the man in the middle attack. We need to
do something more than this protocol to

make is secure against man in the middle.
And again we're gonna come back to Diffie

Hellman and make it secure against man in
the middle next week. Okay. So let's see

why the protocol that I just showed you is
insecure against active attacks. Well

suppose we have this man in the middle
that's trying to eavesdrop on the

conversation between Alice and Bob. Well
so, the protocol starts with Alice sending

g to the a over to Bob. Well, so the man
in the middle is gonna intercept that and

he's gonna take the message that Alice
sent and he's gonna replace it with his

own message. So it's called A', And
let's write that is g to the a'.

Okay? So the man in the middle chooses his
own a' and what he sends to Bob is

actually g to the a'. Now poor Bob
doesn't know that the man in the middle

actually did anything to this traffic, all
he sees is he got the value A'. As

far as he knows, that value came from
Alice. So what is he gonna do in response?

Well, he's gonna send back his value B out
which is g to the b back to Alice. Well

again the man in the middle is gonna
intercept this B. He's gonna generate his

own b' and what he actually sends
back to Alice is B' which is g to the b'.

Okay, now what happens? Well
Alice is gonna compute her part of the

secret key and she's gonna get g to the ab'. Bob is gonna compute his part of

the secret key and he's gonna get g to the
b times a'. Okay, these now you

notice these are not the same keys. In the
man in the middle because he knows both A'

and B' he can compute both g to
the ab' and g to ba'. Yeah it's

not difficult to see the man in the middle
knows both values. And as a result, now he

can basically play this man in the middle
and when Alice sends an encrypted message

to Bob the man in the middle can simply
decrypt this message because he knows the

secret key that Alice encrypted message
with, re-encrypt it using Bob's key. And

then send the message on over to Bob. And
this way Alice sent the message, Bob

received the message. Bob thinks the
message is secure. But in fact that

message went through the man in the
middle. The man in the middle decrypted

it, re-encrypted it for Bob. At the same
time he was able to completely read it,

modify it if he wants to, and so on. So,
the protocol becomes completely insecure

give n a man in the middle. And so as I
said we're gonna have to enhance the

protocol somehow to defend against men in
the middle but it turns out that it's

actually not that difficult to enhance and
prevent man in the middle attacks.

And we're gonna come back to that and see that
in a week or two. The last think I want to

do is show you an interesting property of
the Diffie-Hellman protocol. In fact, I

want to show you that this protocol can be
viewed as a non-interactive protocol. So,

what do I mean by that? So Imagine we have
a whole bunch of users, you know, millions

of users. Let's call them Alice, Bob,
Charlie, David and so on and so forth.

Each one of them is going to choose a
random, secret value, and then on their

Facebook profiles, they're gonna write
down, their contribution to the

Diffie-Hellman protocol. Alright so
everybody just writes down you know g to

the a, g to the b, g to the c and so on.
Now the interesting thing about this is,

if say Alice and Charlie wanna set up a
shared key they don't need to communicate

at all. Basically Alice would go and read
Charlie's public profile. Charlie would go

and read Alice's public profile. And now,
boom, they immediately have a secret key.

Namely, now, Alice knows, g to the c and
a. Charlie knows g to the a and с. And as

a result, both of them can compute п to
the ac. So, in some sense, once they've

posted their contributions to the
Diffie-Hellman protocol to their public

profiles, they don't need to communicate
with each other at all to set up a shared key.

They immediately have a shared key
and now they can start communicating

securely through Facebook or not with one
another. And notice that this is true for

any Facebook user. So as soon as I read
somebody's public profile, I immediately

have a shared key with them without ever
communicating with them. This property is

sometimes called a non-interactive
property of the Diffie-Hellman. So now, let

me show you an open problem. And this is
an open problem that's been open for ages

and ages and ages. So it'd be really cool
if one of you can actually solve it. The

question is, can we do this for more than
two parties? In other words, say we have

four parties. All of them post their
values to their Facebook profiles. And now

we'd like to make it that just by reading
Facebook profiles, all of them can set up

a shared secret key. In other words, Alice
is, what she'll, she'll do is she'll only

read the public profiles of, the three
friends, Bob, Charlie and David. And

already she can compute a shared key with
them. And similarly David is just gonna

read the public profile of Charlie. Bob
and Alice. And already he has a shared key

with all four of them. Okay, so the
question is whether we can kind of setup

non-interactively these, these shared keys
for groups that are larger than just two

people. So as I said, for n equals two,
this is just a Diffie-Hellman protocol. In

other words, if just two parties want to
set up a shared key without communicating

with one another, that's just
Diffie-Hellman. It turns out, for N equals

three, we also know how to do it, there's
a known protocol, it's called protocol due

to Joux. It already uses very, very fancy
mathematics, much more complicated

mathematics than what I just showed you.
And for N equals four, or five, or

anything above this, above four, this
problem is completely open. Nearly the

case where four people post something to
the public profiles and then everybody

else reads the public profile and then
they have a joint shared key, this is

something we don't know how to do even for
four people. And this is a problem that's

been open for ages and ages, it's kind of
a fun problem to think about and so see if

you can solve it, if you will, it's
instant fame in the crypto world. Okay, so

I'll stop here, and we'll continue with
another key exchange mechanism in the next segment.