0:00:00.000,0:00:02.951
Before we start with the technical[br]material, I wanna give you a quick

0:00:02.951,0:00:06.487
overview of what cryptography is about and[br]the different areas of cryptography. So

0:00:06.487,0:00:10.487
the core of cryptography of course is[br]secure communication that essentially

0:00:10.487,0:00:14.539
consists of two parts. The first is[br]secured key establishment and then how do

0:00:14.539,0:00:18.697
we communicate securely once we have a[br]shared key. We already said that secured

0:00:18.697,0:00:22.854
key establishment essentially amounts to[br]Alice and Bob sending messages to one

0:00:22.854,0:00:26.906
another such that at the end of this[br]protocol, there's a shared key that they

0:00:26.906,0:00:30.906
both agree on, shared key K, and beyond[br]that, beyond just a shared key, in fact

0:00:30.906,0:00:35.274
Alice would know that she's talking to Bob[br]and Bob would know that he's talking to

0:00:35.274,0:00:39.964
Alice. But a poor attacker who listens in[br]on this conversation has no idea what the

0:00:39.964,0:00:44.011
shared key is. And we'll see how to do[br]that later on in the course. Now once they

0:00:44.011,0:00:47.657
have a shared key, they want exchange[br]messages securely using this key, and

0:00:47.657,0:00:51.698
we'll talk about encryption schemes that[br]allow them to do that in such a way that

0:00:51.698,0:00:55.491
an attacker can't figure out what messages[br]are being sent back and forth. And

0:00:55.491,0:00:59.630
furthermore an attacker cannot even tamper[br]with this traffic without being detected.

0:00:59.630,0:01:03.227
In other words, these encryption schemes[br]provide both confidentiality and

0:01:03.227,0:01:06.774
integrity. But cryptography does much,[br]much, much more than just these two

0:01:06.774,0:01:10.519
things. And I want to give you a few[br]examples of that. So the first example I

0:01:10.519,0:01:14.468
want to give you is what's called a[br]digital signature. So a digital signature,

0:01:14.468,0:01:18.892
basically, is the analog of the signature[br]in the physical world. In the physical

0:01:18.892,0:01:23.372
world, remember when you sign a document,[br]essentially, you write your signature on

0:01:23.372,0:01:27.740
that document and your signature is always[br]the same. You always write the same

0:01:27.740,0:01:32.164
signature on all documents that you wanna[br]sign. In the digital world, this can't

0:01:32.164,0:01:36.812
possibly work because if the attacker just[br]obtained one signed document from me, he

0:01:36.812,0:01:41.180
can cut and paste my signature unto some[br]other document that I might not have

0:01:41.180,0:01:45.247
wanted to sign. And so, it's simply not[br]possible in a digital world that my

0:01:45.247,0:01:49.590
signature is the same for all documents[br]that I want to sign. So we're gonna talk

0:01:49.590,0:01:53.830
about how to construct digital signatures[br]in the second half of the course. It's

0:01:53.830,0:01:58.123
really quite an interesting primitive and[br]we'll see exactly how to do it. Just to

0:01:58.123,0:02:02.098
give you a hint, the way digital[br]signatures work is basically by making the

0:02:02.098,0:02:06.232
digital signature via function of the[br]content being signed. So an attacker who

0:02:06.232,0:02:10.313
tries to copy my signature from one[br]document to another is not gonna succeed

0:02:10.313,0:02:14.541
because the signature. On the new document[br]is not gonna be the proper function of the

0:02:14.541,0:02:18.526
data in the new document, and as a result,[br]the signature won't verify. And as I said,

0:02:18.526,0:02:22.608
we're gonna see exactly how to construct[br]digital signatures later on and then we'll

0:02:22.608,0:02:27.193
prove that those constructions are secure.[br]Another application of cryptography that I

0:02:27.193,0:02:31.096
wanted to mention, is anonymous[br]communication. So, here, imagine user

0:02:31.096,0:02:35.828
Alice wants to talk to some chat server,[br]Bob. And, perhaps she wants to talk about

0:02:35.828,0:02:40.382
a medical condition, and so she wants to[br]do that anonymously, so that the chat

0:02:40.382,0:02:45.113
server doesn't actually know who she is.[br]Well, there's a standard method, called a

0:02:45.113,0:02:49.946
mixnet, that allows Alice to communicate[br]over the public internet with Bob through

0:02:49.946,0:02:54.856
a sequence of proxies such that at the end[br]of the communication Bob has no idea who he

0:02:54.856,0:02:59.537
just talked to. The way mixnets work[br]is basically as Alice sends her messages

0:02:59.537,0:03:03.818
to Bob through a sequence of proxies,[br]these messages get encrypted and

0:03:03.818,0:03:08.271
decrypted appropriately so that Bob has no[br]idea who he talked to and the proxies

0:03:08.271,0:03:12.724
themselves don't even know that Alice is[br]talking to Bob, or that actually who is

0:03:12.724,0:03:16.750
talking to whom more generally. One[br]interesting thing about this anonymous

0:03:16.750,0:03:20.498
communication channel is, this is[br]bi-directional. In other words, even

0:03:20.498,0:03:24.743
though Bob has no idea who he's talking[br]to, he can still respond to Alice and

0:03:24.743,0:03:29.153
Alice will get those messages. Once we[br]have anonymous communication, we can build

0:03:29.153,0:03:33.784
other privacy mechanisms. And I wanna give[br]you one example which is called anonymous

0:03:33.784,0:03:37.643
digital cash. Remember that in the[br]physical world if I have a physical

0:03:37.643,0:03:42.108
dollar, I can walk into a bookstore and[br]buy a book and the merchant would have no

0:03:42.108,0:03:46.876
idea who I am. The question is whether we[br]can do the exact same thing in the digital

0:03:46.876,0:03:50.963
world. In the digital world, basically,[br]Alice might have a digital dollar,

0:03:50.963,0:03:55.984
a digital dollar coin. And she might wanna[br]spend that digital dollar at some online

0:03:55.984,0:04:00.760
merchants, perhaps some online bookstore.[br]Now, what we'd like to do is make it so

0:04:00.760,0:04:05.539
that when Alice spends her coin at the[br]bookstore, the bookstore would have no

0:04:05.539,0:04:10.629
idea who Alice is. So we provide the same[br]anonymity that we get from physical cash.

0:04:10.629,0:04:15.470
Now the problem is that in the digital[br]world, Alice can take the coin that she

0:04:15.470,0:04:20.250
had, this one dollar coin, and before she spent[br]it, she can actually make copies of it.

0:04:20.250,0:04:24.086
And then all of a sudden instead of just[br]having just one dollar coin now all

0:04:24.093,0:04:27.936
of a sudden she has three dollar coins and[br]they're all the same of course, and

0:04:27.936,0:04:31.828
there's nothing preventing her from taking[br]those replicas of a dollar coin and

0:04:31.828,0:04:35.819
spending it at other merchants. And so the[br]question is how do we provide anonymous

0:04:35.819,0:04:39.849
digital cash? But at the same time, also[br]prevent Alice from double spending the

0:04:39.849,0:04:43.760
dollar coin at different merchants. In[br]some sense there's a paradox here where

0:04:43.760,0:04:47.879
anonymity is in conflict with security[br]because if we have anonymous cash there's

0:04:47.879,0:04:51.999
nothing to prevent Alice from double[br]spending the coin and because the coin is

0:04:51.999,0:04:56.244
anonymous we have no way of telling who[br]committed this fraud. And so the question

0:04:56.244,0:05:00.394
is how do we resolve this tension. And it[br]turns out, it's completely doable. And

0:05:00.394,0:05:04.757
we'll talk about anonymous digital cash[br]later on. Just to give you a hint, I'll

0:05:04.757,0:05:09.173
say that the way we do it is basically by[br]making sure that if Alice spends the coin

0:05:09.173,0:05:13.764
once then no one knows who she is, but if[br]she spends the coin more than once, all of

0:05:13.764,0:05:17.878
a sudden, her identity is completely[br]exposed and then she could be subject to

0:05:17.878,0:05:22.096
all sorts of legal problems. And so that's[br]how anonymous digital cash would

0:05:22.096,0:05:26.158
work at a high level and we'll see how to[br]implement it later on in the course.

0:05:26.158,0:05:30.219
Another application of cryptography has to[br]do with more abstract protocols, but

0:05:30.219,0:05:34.333
before I speak the general result, I[br]want to give you two examples. So the

0:05:34.333,0:05:38.343
first example has to do with election[br]systems. So here's the election problem.

0:05:38.343,0:05:42.656
Suppose ... we have two parties, party zero[br]and party one. And voters vote for these

0:05:42.656,0:05:47.101
parties. So for example, this voter could[br]have voted for party zero, this voter voted for

0:05:47.101,0:05:52.313
party one. And so on. So in this election,[br]party zero got three votes and party two

0:05:52.313,0:05:56.590
got two votes. So the winner of the[br]election, of course, is party zero. But

0:05:56.590,0:06:01.579
more generally, the winner of the election[br]is the party who receives the majority of

0:06:01.579,0:06:06.453
the votes. Now, the voting problem is the[br]following. The voters would somehow like

0:06:06.453,0:06:11.720
to compute the majority of the votes, but[br]do it in such a way such that nothing else

0:06:11.720,0:06:16.797
is revealed about their individual votes.[br]Okay? So the question is how to do that?

0:06:16.797,0:06:21.493
And to do so, we're gonna introduce an[br]election center who's gonna help us

0:06:21.493,0:06:26.633
compute the majority, but keep the votes[br]otherwise secret. And what the parties

0:06:26.633,0:06:32.027
will do is they will each send the funny[br]encryption of their votes to the election

0:06:32.027,0:06:36.949
center in such a way that at the end of[br]the election, the election center is able

0:06:36.949,0:06:41.615
to compute and output the winner of the[br]election. However, other than the winner

0:06:41.615,0:06:46.580
of the election, nothing else is revealed[br]about the individual votes. The individual

0:06:46.580,0:06:51.366
votes otherwise remain completely private.[br]Of course the election center is also

0:06:51.366,0:06:56.331
gonna verify that this voter for example[br]is allowed to vote and that the voter has

0:06:56.331,0:07:00.818
only voted once. But other than that[br]information the election center and the

0:07:00.818,0:07:05.484
rest of the world learned nothing else[br]about that voter's vote other than the

0:07:05.484,0:07:10.104
result of the election. So this is an[br]example of a protocol that involves six

0:07:10.104,0:07:14.430
parties. In this case, there are five[br]voters in one election center. These

0:07:14.430,0:07:19.417
parties compute amongst themselves. And at[br]the end of the computation, the result of

0:07:19.417,0:07:24.404
the election is known but nothing else is[br]revealed about the individual inputs. Now

0:07:24.404,0:07:29.156
a similar problem comes up in the context[br]of private auctions. So in a private

0:07:29.156,0:07:34.160
auction every bidder has his own bid that[br]he wants to bid. And now suppose the

0:07:34.160,0:07:39.356
auction mechanism that's being used is[br]what's called a Vickrey auction where the

0:07:39.356,0:07:45.287
definition of a Vickrey auction is that[br]the winner is the highest bidder. But the

0:07:45.287,0:07:50.099
amounts that the winner pays is actually[br]the second highest bid. So he pays the

0:07:50.099,0:07:54.850
second highest bid. Okay, so this is a[br]standard auction mechanism called the

0:07:54.850,0:08:00.028
Vickrey auction. And now what we'd like to[br]do is basically enable the participants to

0:08:00.028,0:08:04.779
compute, to figure out who the highest[br]bidder is and how much he's supposed to

0:08:04.779,0:08:09.165
pay, but other than that, all other[br]information about the individual bids

0:08:09.165,0:08:14.160
should remain secret. So for example, the[br]actual amount that the highest bidder bid

0:08:14.160,0:08:19.225
should remain secret. The only thing that[br]should become public is the second highest

0:08:19.225,0:08:23.526
bid and the identity of the highest[br]bidder. So again now, the way we will do

0:08:23.526,0:08:28.172
that is we'll introduce an auction center, and[br]in a similar way, essentially, everybody

0:08:28.172,0:08:32.588
will send their encrypted bids to the[br]auction center. The auction center will

0:08:32.588,0:08:37.119
compute the identity of the winner and in[br]fact he will also compute the second

0:08:37.119,0:08:41.822
highest bid but other than these two[br]values, nothing else is revealed about the

0:08:41.822,0:08:46.126
individual bids. Now, this is actually an[br]example of a much more general problem

0:08:46.126,0:08:50.264
called secure multi-party computation. Let[br]me explain what secure multi-party

0:08:50.264,0:08:54.618
computation is about. So here, basically[br]abstractly, the participants have a secret

0:08:54.618,0:08:58.649
inputs to themselves. So, in the case of[br]an election, the inputs would be the

0:08:58.649,0:09:02.787
votes. In the case of an auction, the[br]inputs would be the secret bids. And then

0:09:02.787,0:09:06.959
what they would like to do is compute some[br]sort of a function of their inputs.

0:09:06.959,0:09:10.840
Again, in the case of an election, the[br]function's a majority. In the case of

0:09:10.840,0:09:15.088
auction, the function happens to be the[br]second highest, largest number among x one

0:09:15.088,0:09:19.179
to x four. And the question is, how can[br]they do that, such that the value of the

0:09:19.179,0:09:23.375
function is revealed, but nothing else is[br]revealed about the individual votes? So

0:09:23.375,0:09:27.675
let me show you kind of a dumb, insecure[br]way of doing it. What we do is introduce a

0:09:27.675,0:09:31.774
trusted party. And then, this trusted[br]authority basically collects individual

0:09:31.774,0:09:36.223
inputs. And it kinda promises to keep the[br]individual inputs secret, so that only it

0:09:36.223,0:09:40.510
would know what they are. And then, it[br]publishes the value of the function, to

0:09:40.510,0:09:44.742
the world. So, the idea is now that the[br]value of the function became public, but

0:09:44.742,0:09:48.812
nothing else is revealed about the[br]individual inputs. But, of course, you got

0:09:48.812,0:09:52.990
this trusted authority that you got to[br]trust, and if for some reason it's not

0:09:52.990,0:09:57.168
trustworthy, then you have a problem. And[br]so, there's a very central theorem in

0:09:57.168,0:10:01.001
crypto and it really is quite a[br]surprising fact. That says that any

0:10:01.001,0:10:05.204
computation you'd like to do, any function[br]F you'd like to compute, that you can

0:10:05.204,0:10:09.302
compute with a trusted authority, you can[br]also do without a trusted authority.

0:10:09.302,0:10:13.559
Let me at a high level explain what this[br]means. Basically, what we're gonna do, is

0:10:13.559,0:10:17.816
we're gonna get rid of the authority. So[br]the parties are actually not gonna send

0:10:17.816,0:10:21.807
their inputs to the authority. And, in[br]fact, there's no longer going to be an

0:10:21.807,0:10:26.011
authority in the system. Instead, what the[br]parties are gonna do, is they're gonna

0:10:26.011,0:10:30.567
talk to one another using some protocol.[br]Such that at the end of the protocol all

0:10:30.567,0:10:34.890
of a sudden the value of the function[br]becomes known to everybody. And yet

0:10:34.890,0:10:39.390
nothing other than the value of the[br]function is revealed. In other words, the

0:10:39.390,0:10:43.639
individual inputs is still kept secret.[br]But again there's no authority, there's

0:10:43.639,0:10:47.867
just a way for them to talk to one another[br]such that the final output is revealed. So

0:10:47.867,0:10:51.846
this is a fairly general result, it's kind[br]of a surprising fact that is at all

0:10:51.846,0:10:56.024
doable. And in fact it is and towards the[br]end of the class we'll actually see how to

0:10:56.024,0:11:00.577
make this happen. Now, there are some[br]applications of cryptography that I can't

0:11:00.577,0:11:05.560
classify any other way other than to say[br]that they are purely magical. Let me give

0:11:05.560,0:11:10.240
you two examples of that. So the first is[br]what's called privately outsourcing

0:11:10.240,0:11:15.224
computation. So I'll give you an example[br]of a Google search just to illustrate the

0:11:15.224,0:11:20.329
point. So imagine Alice has a search query[br]that she wants to issue. It turns out that

0:11:20.329,0:11:25.434
there are very special encryption schemes[br]such that Alice can send an encryption of

0:11:25.434,0:11:30.368
her query to Google. And then because of[br]the property of the encryption scheme

0:11:30.368,0:11:35.304
Google can actually compute on the[br]encrypted values without knowing what the

0:11:35.304,0:11:40.368
plain texts are. So Google can actually[br]run its massive search algorithm on the

0:11:40.368,0:11:44.903
encrypted query and recover in encrypted[br]results. Okay. Google will send the

0:11:44.903,0:11:49.242
encrypted results back to Alice. Alice[br]will decrypt and then she will receive the

0:11:49.242,0:11:53.689
results. But the magic here is all Google[br]saw was just encryptions of her queries

0:11:53.689,0:11:57.493
and nothing else. And so, Google as a[br]result has no idea what Alice just

0:11:57.493,0:12:01.672
searched for and nevertheless Alice[br]actually learned exactly what she

0:12:01.672,0:12:05.812
wanted to learn. Okay so, these are[br]magical kind of encryption schemes. They're

0:12:05.812,0:12:09.985
fairly recent, this is only a new[br]development from about two or three years

0:12:09.985,0:12:14.436
ago, that allows us to compute on encrypted[br]data, even though we don't really know

0:12:14.436,0:12:18.667
what's inside the encryption. Now, before[br]you rush off and think about implementing

0:12:18.667,0:12:22.470
this, I should warn you that this is[br]really at this point just theoretical, in

0:12:22.470,0:12:26.422
the sense that running a Google search on[br]encryption data probably would take a

0:12:26.422,0:12:30.521
billion years. But nevertheless, just the[br]fact that this is doable is already really

0:12:30.521,0:12:34.473
surprising, and is already quite useful[br]for relatively simple computations. So in

0:12:34.473,0:12:38.671
fact, we'll see some applications of this[br]later on. The other magical application I

0:12:38.671,0:12:42.474
want to show you is what's called zero[br]knowledge. And in particular, I'll tell

0:12:42.474,0:12:46.080
you about something called the zero[br]knowledge proof of knowledge. So here ...

0:12:46.080,0:12:50.177
what happens is there's a certain[br]number N, which Alice knows. And the way

0:12:50.177,0:12:54.169
the number N was constructed is as a[br]product of two large primes. So, imagine

0:12:54.169,0:12:58.835
here we have two primes, P and Q. Each one[br]you can think of it as like 1000 digits.

0:12:58.835,0:13:03.892
And you probably know that multiplying[br]two 1000-digit numbers is fairly easy. But if

0:13:03.892,0:13:08.235
I just give you their product, figuring[br]out their factorization into primes is

0:13:08.235,0:13:12.427
actually quite difficult. And, in fact,[br]we're gonna use the fact that factoring is

0:13:12.427,0:13:16.566
difficult to build public key cryptosystems[br]in the second half of the course.

0:13:16.566,0:13:20.968
Okay, so Alice happens to have this number[br]N, and she also knows the factorization of

0:13:20.968,0:13:24.898
N. Now Bob just has the number N. He[br]doesn't actually know the factorization.

0:13:24.898,0:13:28.723
Now, the magical fact about the zero[br]knowledge proof of knowledge, is that

0:13:28.723,0:13:33.144
Alice can prove to Bob that she knows the[br]factorization of N. Yes, you can actually

0:13:33.144,0:13:37.457
give this proof to Bob, that Bob can[br]check, and become convinced that Alice

0:13:37.457,0:13:42.386
knows the factorization of N, however Bob[br]learns nothing at all. About the factors P

0:13:42.386,0:13:47.034
and Q, and this is provable. Bob[br]absolutely learns nothing at all about the

0:13:47.034,0:13:50.997
factors P and Q. And the statement[br]actually is very, very general. This is

0:13:50.997,0:13:55.275
not just about proving the factorization[br]of N. In fact, almost any puzzle that you

0:13:55.275,0:13:59.606
want to prove that you know an answer to,[br]you can prove it is your knowledge. So if

0:13:59.606,0:14:03.831
you have a crossword puzzle that you've[br]solved. Well, maybe crosswords is not the

0:14:03.831,0:14:07.845
best example. But if you have like a[br]Sudoku puzzle, for example, that you want

0:14:07.845,0:14:12.282
to prove that you've solved, you can prove[br]it to Bob in a way that Bob would learn

0:14:12.282,0:14:16.718
nothing at all about the solution, and yet[br]Bob would be convinced that you really do

0:14:16.718,0:14:20.930
have a solution to this puzzle. Okay. So[br]those are kind of magical applications.

0:14:20.930,0:14:25.000
And so the last thing I want to say is[br]that modern cryptography is a very

0:14:25.000,0:14:29.015
rigorous science. And in fact, every[br]concept we're gonna describe is gonna

0:14:29.015,0:14:33.129
follow three very rigorous steps, okay,[br]and we're gonna see these three steps

0:14:33.129,0:14:37.338
again and again and again so I wanna[br]explain what they are. So the first thing

0:14:37.338,0:14:41.493
we're gonna do when we introduce a new[br]primitive like a digital signature is

0:14:41.493,0:14:45.540
we're gonna specify precisely what the[br]threat model is. That is, what can an

0:14:45.540,0:14:49.534
attacker do to attack a digital signature[br]and what is his goal in forging

0:14:49.534,0:14:53.851
signatures? Okay, so we're gonna define[br]exactly what does it mean for a signature

0:14:53.851,0:14:57.760
for example to be unforgeable.[br]Unforgeable. Okay and I'm giving digital

0:14:57.760,0:15:01.998
signatures just as an example. For every[br]primitive we describe we're gonna

0:15:01.998,0:15:06.464
precisely define what the threat model is.[br]Then we're gonna propose a construction

0:15:06.464,0:15:10.931
and then we're gonna give a proof that any[br]attacker that's able to attack the

0:15:10.931,0:15:15.955
construction under this threat model. That[br]attacker can also be used to solve some

0:15:15.955,0:15:20.150
underlying hard problem. And as a result,[br]if the problem really is hard, that

0:15:20.150,0:15:24.350
actually proves that no attacker can break[br]the construction under the threat model.

0:15:24.350,0:15:27.843
Okay. But these three steps are actually[br]quite important. In the case of

0:15:27.843,0:15:31.928
signatures, we'll define what it means for[br]a signature to be, forgeable, then we'll

0:15:31.928,0:15:35.914
give a construction, and then for example[br]we'll say that anyone who can break our

0:15:35.914,0:15:39.801
construction can then be used to say[br]factor integers, which is believed to be a

0:15:39.801,0:15:43.541
hard problem. Okay, so we're going to[br]follow these three steps throughout, and

0:15:43.541,0:15:47.331
then you'll see how this actually comes[br]about. Okay, so this is the end of the

0:15:47.331,0:15:51.218
segment. And then in the next segment[br]we'll talk a little bit about the history

0:15:51.218,0:15:52.006
of cryptography.