WEBVTT

00:00:00.000 --> 00:00:03.827
In the last segment, we explained what is
a public key encryption system. And we

00:00:03.827 --> 00:00:07.557
defined what it means for a public key
encryption system to be secure. If you

00:00:07.557 --> 00:00:11.142
remember, we required security against
active attacks. And in particular, we

00:00:11.142 --> 00:00:15.211
defined chosen cipher text security as our
goal. This week, we're gonna construct two

00:00:15.211 --> 00:00:19.281
families of public key encryption systems
that are chosen cipher text secure. And in

00:00:19.281 --> 00:00:22.914
this segment, we're gonna start by
constructing public key encryptions from,

00:00:23.059 --> 00:00:27.124
a concept called a trapdoor
permutation. So let's start by defining a

00:00:27.124 --> 00:00:31.064
general concept called a trapdoor
function. So what is a trapdoor function?

00:00:31.064 --> 00:00:35.484
Well, a trapdoor function basically is a
function that goes from some set X to some

00:00:35.484 --> 00:00:39.585
set Y. And it's really defined by a triple
of algorithms. There's a generation

00:00:39.585 --> 00:00:43.685
algorithm, the function f, and the inverse
of the function f. So the generation

00:00:43.685 --> 00:00:47.892
algorithm, basically what it does when you
run it, it will generate a key pair, a

00:00:47.892 --> 00:00:52.098
public key and a secret key. The public
key is gonna define a specific function

00:00:52.098 --> 00:00:56.869
from the set X to the set Y. And then the
secret key is going to define the inverse

00:00:56.869 --> 00:01:01.639
function now from the set Y to the set
X. So the idea is that you can evaluate

00:01:01.639 --> 00:01:06.588
the function at any point using a public key
PK and then you can invert that function

00:01:06.588 --> 00:01:12.443
using the secret key, SK. So what do I
mean by inversion? More precisely, if we

00:01:12.443 --> 00:01:17.255
look at any public key and secret key pair
generated by the key generation algorithm

00:01:17.255 --> 00:01:21.727
G, then it so happens that if I evaluate
the function at the point X, and then I

00:01:21.727 --> 00:01:26.142
evaluate the inverse at the resulting
point, I should get the original point X

00:01:26.142 --> 00:01:30.670
back. So the picture you should have in
your mind, is there is this big set X and

00:01:30.670 --> 00:01:35.857
this big set Y And then, this function
will map any point in X to a point in Y,

00:01:35.857 --> 00:01:41.508
and this can be done using the public key.
So again any point in X can be mapped, to

00:01:41.508 --> 00:01:46.897
a point in Y. And then if someone has the
secret key, then basically they can go in

00:01:46.897 --> 00:01:53.758
the reverse direction by applying, this
secret key sk. So now that we

00:01:53.758 --> 00:01:58.289
understand what a trapdoor function is,
let's define what it means for a trapdoor

00:01:58.289 --> 00:02:02.652
function to be secure. And so we'll say
that this triple, (G, F, F inverse), is secure

00:02:02.652 --> 00:02:06.903
if in fact this function F(PK, .) is what's
called a one way function. And let me

00:02:06.903 --> 00:02:10.986
explain what a, what is a one way
function. The idea is that basically the

00:02:10.986 --> 00:02:15.516
function can be evaluated at any point,
but inverting it is difficult without the

00:02:15.516 --> 00:02:19.639
secret key SK. So let's define that more
precisely. As usual we define that using a

00:02:19.639 --> 00:02:23.764
game. So here we have our game between the
challenger and the adversary. And the game

00:02:23.764 --> 00:02:27.496
proceeds as follows. Basically the
challenger will generate a public key and

00:02:27.496 --> 00:02:31.622
a secret key. And then they will generate
a random X. It will send the public key

00:02:31.622 --> 00:02:36.116
over to the adversary and then it will
evaluate the function at the point X and

00:02:36.116 --> 00:02:40.160
send the resulting Y also to the
adversary. So all the adversary gets to

00:02:40.160 --> 00:02:44.653
see is just a public key, which defines
what the function is, and then he gets to

00:02:44.653 --> 00:02:49.483
see the image of this function on a random
point X and is goal is basically to invert

00:02:49.483 --> 00:02:54.097
the function at this point Y. Okay, so he
outputs some X prime. And we said that the

00:02:54.097 --> 00:02:58.507
trap door function is secure if the
probability that the ad, adversary inverts

00:02:58.507 --> 00:03:03.143
the given at point y is negligible. In
other words, given y the probability that

00:03:03.143 --> 00:03:07.271
the adversary's able to alter the pre
image of y is in fact a negligible

00:03:07.271 --> 00:03:11.907
probability and if that's true for all
efficient algorithms then we say that this

00:03:11.907 --> 00:03:17.882
trapdor function is secure. So again
abstractly, it's a really interesting

00:03:17.882 --> 00:03:21.885
concept in that you can evaluate the
function in the forward direction very

00:03:21.885 --> 00:03:26.150
easily. But then no one can evaluate the
function in the reverse direction unless

00:03:26.150 --> 00:03:30.311
they have this trapdoor, the secret key
SK, which then all of a sudden lets them

00:03:30.311 --> 00:03:35.424
invert the function very, very easily. So
using the concept of a trapdoor function,

00:03:35.424 --> 00:03:39.552
it's not too difficult to build a public key encryption system, and let me show you how

00:03:39.552 --> 00:03:43.528
to do it. So here we have we our trap door
function, (G, F, and F inverse). The other

00:03:43.528 --> 00:03:47.605
tool that we are going to need is a symmetric encryption scheme, and I'm going to

00:03:47.605 --> 00:03:51.531
assume that this encryption scheme is actually
secure against active attacks, so in

00:03:51.531 --> 00:03:55.350
particular I needed to provide
authenticated encryption. Notice that the

00:03:55.350 --> 00:04:00.726
symmetric encryption system takes keys in
K and the trapdoor function takes inputs

00:04:00.726 --> 00:04:05.790
in X. Those are two different sets and so
we're also gonna need the hash function.

00:04:05.790 --> 00:04:09.937
That goes from X to K. In other words, it
maps elements in the set X into keys for

00:04:09.937 --> 00:04:14.033
the symmetric encryption systems. And now
once we have these three components, we

00:04:14.033 --> 00:04:17.821
can actually construct the public key encryption system as follows: so the key

00:04:17.821 --> 00:04:21.764
generation for the public key encryption
system is basically exactly the same as

00:04:21.764 --> 00:04:25.655
the key generation for the trap door
function. So we run G for the trap door

00:04:25.655 --> 00:04:29.956
function, we get a public key and a secret
key. And those are gonna be the public and

00:04:29.956 --> 00:04:34.171
secret keys for the public key encryption
system. And how do we encrypt and decrypt? Let's

00:04:34.171 --> 00:04:38.978
start with encryption. So the encryption
algorithm takes a public key and a message

00:04:38.978 --> 00:04:43.898
as input. So what it will do is it will
generate a random X from the set capital

00:04:43.898 --> 00:04:48.545
X. It will then apply the trapdoor
function to this random X, to obtain Y. So

00:04:48.545 --> 00:04:53.130
Y is the image of X under the trapdoor
function. Then it will go ahead and

00:04:53.130 --> 00:04:58.272
generate a symmetric key by hashing X. So
this is a symmetric key for the symmetric

00:04:58.272 --> 00:05:03.290
key system. And then finally it encrypts
the plain text message 'm' using this key that was

00:05:03.290 --> 00:05:08.123
just generated. And then it outputs the
value Y that it just computed, which is

00:05:08.123 --> 00:05:13.260
the image of X, along the encryption under
the symmetric system of the message M. So

00:05:13.260 --> 00:05:18.366
that's how encryption works. And I want to
emphasize again that the trapdoor function

00:05:18.366 --> 00:05:23.112
is only applied to this random value X,
whereas the message itself is encrypted

00:05:23.112 --> 00:05:28.098
using a symmetric key system using a key
that was derived from the value X that we

00:05:28.098 --> 00:05:32.959
chose at random. So now that we understand
encryption, let's see how to decrypt.

00:05:32.959 --> 00:05:37.366
While the decryption algorithm takes a
secret key as input, and the ciphertext.

00:05:37.366 --> 00:05:41.551
The ciphertext itself contains two
components, the value Y and the value C.

00:05:41.551 --> 00:05:46.070
So the first step we're gonna do, is we're
gonna apply the inverse transformation,

00:05:46.070 --> 00:05:50.366
the inverse trap door function to the
value Y, and that will give us back the

00:05:50.366 --> 00:05:54.495
original X that was chosen during
encryption. So now let me ask you, how do

00:05:54.495 --> 00:06:00.042
we derive the symmetric decryption key K
from this X that we just obtained? Well,

00:06:00.042 --> 00:06:04.736
so that's an easy question. We basically
hash X again. That gives us K just as

00:06:04.736 --> 00:06:09.372
during encryption. And now that we have
this symmetric encryption key we can apply

00:06:09.372 --> 00:06:13.783
the, the symmetric decryption algorithm to
decrypt the ciphertext C. We get the

00:06:13.783 --> 00:06:17.741
original message M and that's what we
output. So, that's how the public key

00:06:17.741 --> 00:06:22.321
encryption system works were this trap
door function is only used for encrypting

00:06:22.321 --> 00:06:26.788
some sort of a random value X and the
actual message is encrypted using the

00:06:26.788 --> 00:06:31.244
symmetric system. So in pictures here, we
have the message M obviously the plain

00:06:31.244 --> 00:06:35.545
text could be quite large. So, here we
have the body of the deciphered text which

00:06:35.545 --> 00:06:39.953
can be quite long is actually encrypted
using the symmetric system. And then again

00:06:39.953 --> 00:06:44.039
I emphasize that the key for the
symmetric system is simply the hash of X.

00:06:44.039 --> 00:06:48.232
And then the header of ciphertext is simply
this application of the trapdoor

00:06:48.232 --> 00:06:52.641
function to this random X that we picked.
And so during decryption what happens is

00:06:52.641 --> 00:06:56.888
we first decrypt the header to get X and
then we decrypt the body using the

00:06:56.888 --> 00:07:01.829
symmetric system to actually get the
original plain text M. So as usual when I

00:07:01.829 --> 00:07:06.542
show you a system like this, obviously you
want to verify that decryption in fact is

00:07:06.542 --> 00:07:10.605
the inverse of encryption. But more
importantly you want to ask why is this

00:07:10.605 --> 00:07:14.963
system secure. And in fact there's a nice
security theorem here that says. That if

00:07:14.963 --> 00:07:18.900
the trap door function that we started
with is secure. In other words, that's a

00:07:18.900 --> 00:07:22.634
one way function if the adversary doesn't
have a secret key. The symmetric

00:07:22.634 --> 00:07:26.621
encryption system provides authenticated
encryption. And the hash function is a

00:07:26.621 --> 00:07:30.558
random oracle, which simply means that
it's a random function from the set X to

00:07:30.558 --> 00:07:34.696
the set of keys K. So a random oracle is
some sort of an idealization of, what a

00:07:34.696 --> 00:07:38.280
hash function is supposed to be. In
practice, of course, when you come to

00:07:38.280 --> 00:07:42.317
implement a system like this, you would
just use, SHA-256, or any of the

00:07:42.317 --> 00:07:47.252
other hash functions that we discussed in
class. So, under those three conditions in

00:07:47.252 --> 00:07:51.863
fact the system that we just described is
chosen cipher text secure so it is CCA

00:07:51.863 --> 00:07:56.416
secure, the little ro here just denote the
fact that security is set in whats called

00:07:56.416 --> 00:08:00.572
a random oracle model. But, that's a detail
that's actually not so important for

00:08:00.572 --> 00:08:05.012
discussion here, what I want you to
remember is that if the trap door function

00:08:05.012 --> 00:08:09.000
is in fact a secure trap door function. The
symmetric encryption system is secure

00:08:09.000 --> 00:08:13.017
against tampering so it provides
authenticated encryption. And H

00:08:13.017 --> 00:08:17.468
is in some sense a good hash function.
It's a random, function, which in practice

00:08:17.468 --> 00:08:22.245
you would just use SHA-256, then in
fact the system that we just showed is CCA

00:08:22.245 --> 00:08:27.615
secure, is chosen ciphertext secure. I should
tell you that there's actually an ISO

00:08:27.615 --> 00:08:31.752
standard that, defines this mode of
encryption, of public encryption. ISO

00:08:31.752 --> 00:08:35.781
stands for International Standards
Organization. So in fact this particular

00:08:35.781 --> 00:08:40.456
system has actually been standardized, and
this is a fine thing to use. I'll refer to

00:08:40.456 --> 00:08:44.947
this as the ISO encryption in the next few
segments. To conclude this segment, I want

00:08:44.947 --> 00:08:48.925
to warn you about an incorrect way of
using a trapdoor function to build a

00:08:48.925 --> 00:08:53.328
public key encryption system. And in fact
this method might be the first thing that

00:08:53.328 --> 00:08:57.572
comes to mind, and yet it's completely
insecure. So let me show you, how not to

00:08:57.572 --> 00:09:01.762
encrypt using a trapdoor function. Well
the first thing that might come to mind

00:09:01.762 --> 00:09:05.696
is, well, let's apply the trapdoor
function directly to the message M. So we

00:09:05.696 --> 00:09:10.047
encrypt simply by applying a function to
the message M, and we decrypt simply by

00:09:10.047 --> 00:09:14.180
applying F inverse to the ciphertext C to
recover the original message M. So

00:09:14.180 --> 00:09:18.639
functionally, this is in fact, decryption
is the inverse of encryption, and yet this

00:09:18.639 --> 00:09:22.881
is completely insecure for many, many
different reasons. The easiest way to see

00:09:22.881 --> 00:09:26.960
that this is insecure, is that it's
simply, this is deterministic encryption.

00:09:26.960 --> 00:09:30.944
You notice there is no randomness being
used here. When we encrypt a message

00:09:30.944 --> 00:09:34.154
M, and since it is
deterministic, it's cannot possibly be

00:09:34.154 --> 00:09:37.948
semantically secure. But in fact, as I
said, when we instantiate this trap door

00:09:37.948 --> 00:09:41.644
function with particular implementations,
for example with the RSA trap door

00:09:41.644 --> 00:09:44.951
function, then there are many, many
attacks that are possible on this

00:09:44.951 --> 00:09:48.794
particular construction, and so you should
never, ever, ever use it, and I'm gonna

00:09:48.794 --> 00:09:52.830
repeat this throughout this module, and in
fact in the next segment I'll show you a

00:09:52.830 --> 00:09:56.699
number of attacks on this particular
implementation. Okay so, what I would like

00:09:56.699 --> 00:10:00.717
you to remember is that you should be
using an encryption system like the ISO

00:10:00.717 --> 00:10:04.992
standard, and you should never apply the
trap door function directly to the message M.

00:10:04.992 --> 00:10:09.010
Although in the next segment we'll
see other ways to encrypt using a trap

00:10:09.010 --> 00:10:13.233
door function that are also correct, but
this particular method is clearly, clearly

00:10:13.233 --> 00:10:17.560
incorrect. Okay, so now that we understand
how to build public key encryption

00:10:17.560 --> 00:10:21.423
given a trap door function, the next
question is how to construct trap door

00:10:21.423 --> 00:10:24.360
functions, and we're going to do that in
the next segment.