The RSA trapdoor permutation (18 min)

Edit subtitles

0:00 - 0:04

In the previous segment we saw how to
build public key encryption from trapdoor
0:04 - 0:08

functions, in this segment we're going to
build a classic trapdoor function called
0:08 - 0:13

RSA. But first let's quickly review what a
trapdoor function is. So recall that a
0:13 - 0:17

trapdoor function is made up of three
algorithms. There is a key generation
0:17 - 0:21

algorithm, the function itself, and the
inverse of the function. The key
0:21 - 0:25

generation algorithm outputs a public key
and a secret key. And in this case, in
0:25 - 0:30

this lecture the public key is going to
define a function that maps the set X onto
0:30 - 0:34

itself. Which is why I call these things
trapdoor permutations, as opposed to
0:34 - 0:38

trapdoor functions, simply because the
function maps X onto itself, whereas a
0:38 - 0:43

trapdoor function might map the set X to
some arbitrary set Y. Now given the public
0:43 - 0:48

key, the function, as we say, basically
defines this function from the set X to
0:48 - 0:53

the set X. And then given the secret key,
we can invert this function. So this
0:53 - 0:57

function F evaluates the function in the
forward direction, and this function F
0:57 - 1:02

inverse, which means the secret key SK,
computes F in the reverse direction. Now
1:02 - 1:06

we say that the trapdoor permutation is
secure if the function defined by the
1:06 - 1:11

public key is in fact a one-way function,
which means that it's easy to evaluate,
1:11 - 1:15

but without the trapdoor, the secret
trapdoor, it's difficult to invert. Before
1:15 - 1:20

we look at our first example of a trapdoor
permutation, I want to do a quick review
1:20 - 1:24

of some necessary arithmetic facts that
we're gonna need. And in particular,
1:24 - 1:29

let's look at some arithmetic facts
modulo composites. So here we have our
1:29 - 1:33

modulus N, which happens to be a product
of two primes, and you should be thinking
1:33 - 1:38

of P and Q as roughly equal size numbers,
since particular P and Q are both roughly
1:38 - 1:42

on the size of square root of N. So both
are roughly the same size. Recall that we
1:42 - 1:47

denoted by ZN the set of integers from
zero to N minus one, and we said that we
1:47 - 1:51

can do addition and multiplication module N. We denoted by ZN star the set of
1:51 - 1:56

invertible elements in ZN. So these are
all the elements, which have a modular
1:56 - 2:01

inverse. And we said that actually X is
invertible if and only if it's relatively
2:01 - 2:06

prime to N. Moreover, I also told you that
the number of invertible elements in ZN
2:06 - 2:11

star is denoted by this function phi(N). So phi(N)
is the number of invertible elements in ZN
2:11 - 2:16

star, And I told you that when N is a
product of two distinct primes, then in
2:16 - 2:21

fact phi(N) is equal to (P - 1) times (Q - 1) and if you extend that out, you
2:21 - 2:26

see that this is really equal to (N - P - Q + 1). Now remember that I said
2:26 - 2:31

that P and Q are on the order of square
root of N which means that P + Q is
2:31 - 2:36

also on the order of square root of N.
Which means that really phi(N) is on the
2:36 - 2:41

order of N minus two square root of N. So,
in other words, it's very, very close to
2:41 - 2:45

N. Basically, subtracting the square root
of N from a number, this is from, N is
2:45 - 2:49

going to be a huge number in our case,
like 600 digits. And so subtracting from a
2:49 - 2:54

600 digit number, the square root of that
600 digit number, namely a 300 digit
2:54 - 2:58

number, hardly affects the size of the
number. Which means that phi(N) is really,
2:58 - 3:02

really, really close to N, and I want you
to remember that, because we will actually
3:02 - 3:06

be using this now and again. And so the
fact that phi(N) is really close to N, means
3:06 - 3:11

that if we choose a random element modulo
N It's very, very, very likely to be in ZN
3:11 - 3:16

star. So it's very unlikely that by
choosing a random element in ZN, we will
3:16 - 3:20

end up choosing an element that is not
invertable. Okay. So just remember that,
3:20 - 3:25

that in fact almost all elements in ZN are
actually invertible. And the last thing
3:25 - 3:31

that we'll need is Euler's theorem, which
we covered last week, which basically says
3:31 - 3:36

that for any element X in ZN star, if I
raise X to the power of phi(N), I get one, in
3:36 - 3:43

ZN. So in other words, I get 1 modulo N. I'll say it one more time because this
3:43 - 3:47

is gonna be critical to what's coming.
Again X to the phi(N) is equal to 1 modulo
3:47 - 3:52

N. So now that we have the necessary
background we can actually describe the
3:52 - 3:56

RSA trapdoor permutation. This is a classic,
classic, classic construction in
3:56 - 4:01

cryptography that was first published in
Scientific American back in 1977, this is
4:01 - 4:06

a very well known article in cryptography.
And ever since then, this was 35 years
4:06 - 4:10

ago, the RSA trapdoor permutation has been used
extensively in cryptographic applications.
4:10 - 4:15

For example, SSL and TLS use RSA both for
certificates and for key exchange. There
4:15 - 4:19

are many secure email systems and secure
file systems that use RSA to encrypt
4:19 - 4:24

emails and encrypt files in the file
system. And there are many, many, many
4:24 - 4:28

other applications of this system. So this
is a very, very classic, crypto
4:28 - 4:33

construction, and I'll show you how it
works. I should say that RSA is named for
4:33 - 4:37

its inventors, Ron Rivest, Adi Shamir and
Len Adleman, who were all at MIT at the
4:37 - 4:42

time they made this important discovery.
So now we're ready to describe the RSA
4:42 - 4:46

trapdoor permutation. To do that, I have
to describe the key generation algorithm,
4:46 - 4:50

the function ƒ and the function ƒ−1.
So let's see. So the way the key
4:50 - 4:55

generation algorithm works is as follows:
What we do is we generate two primes, P
4:55 - 4:59

and Q, each would be, say on the order of
1000 bits, so, you know, roughly 300
4:59 - 5:04

digits, and then the RSA modulus is simply
going to be the product of those two
5:04 - 5:09

primes. The next thing we do is we pick
two exponents, e and d, such that e times
5:09 - 5:14

d is 1 modulo phi(N). What this means is
that e and d first of all are relatively
5:14 - 5:19

prime to phi(N) and second of all they're
actually modular inverses of one another,
5:19 - 5:24

modulo phi(N). And then we output the public
key as the pair N,e and the
5:24 - 5:29

secret key is the secret key N,d. I should mention that the exponent e,
5:29 - 5:35

that the number e is sometimes called the
encryption exponent and the exponent d is
5:35 - 5:39

sometimes called the decryption exponent.
And you'll see why I keep referring to
5:39 - 5:43

these as exponents in just a second. Now
the way the RSA function itself is
5:43 - 5:47

defined is really simple. For simplicity
I'm gonna define it as the function
5:47 - 5:52

from ZN star to ZN star. And the way
the function is defined is simply given an
5:52 - 5:57

input X, all we do is we simply take X and
raise it to the power of e in ZN. So we
5:57 - 6:02

just compute X to the e, modulo N. That's
it. And then to decrypt, what we do is we
6:02 - 6:07

simply, given an input Y, we simply raise
Y to the power of d, modulo N. And that's
6:07 - 6:12

it. So now you can see why as I refer to e
and d as exponents. They're the things
6:12 - 6:18

that X and Y are being raised to. So let's
quickly verify that F inverse really does
6:18 - 6:23

invert the function F. So let's see what
happens when we compute Y to the d. So
6:23 - 6:28

suppose Y itself happens to be the RSA
function of some value X. In which case,
6:28 - 6:33

what Y to the d is, is really RSA of X to
the power of d. While X by itself is
6:33 - 6:39

simply gonna be X to the e modulo N, And
therefore, Y to the d is simply X to the e
6:39 - 6:45

times d modulo N. Again, just using rules
of exponentiation, the exponents multiply,
6:45 - 6:51

so we get X to the e times d. But what do
we know about e times d? e times d we said
6:51 - 6:57

is one modulo phi(N). And what that means is
there exists some integer such that e
6:57 - 7:04

times d is equal to K times phi(N) plus one.
This is what it means that e times d is
7:04 - 7:10

one modulo phi(N). So, we can simply replace e
times d by K times phi(N)+1. So, that's
7:10 - 7:14

what I wrote here. So, we have X to the
power of K times phi(N)+1. But now again
7:14 - 7:20

just using rules of exponentiation, we can
re-write this as X to the power of phi(N) to
7:20 - 7:25

the power of K times X. This is really the
same thing as K times phi(N)+1 as the
7:25 - 7:30

exponent. I just kind of separated the
exponent into it's different components.
7:30 - 7:35

Now, X to the phi(N) by Euler's theorem, we know
that X to the phi(N) is equal to one. So what
7:35 - 7:41

is this whole product equal to? Well since
X to the phi(N) is equal to one, one to
7:41 - 7:46

the K is also equal to one, so this whole
thing over here is simply equal to one.
7:46 - 7:51

And what we're left with is X. So what we
just proved is that if I take the output of
7:51 - 7:55

the RSA function and raise it to the power
of 'd', I get back X. Which means that
7:55 - 8:00

raising to the power of 'd' basically
inverts the RSA function, which is what we
8:00 - 8:05

wanted to show. So that's it, that's the
whole description of the function, we've
8:05 - 8:09

described the key generation. The function
itself, which is simply raising things to
8:09 - 8:13

the power of e modulo N, and the inversion
function which is simply raising things to
8:13 - 8:17

the power of d, also modulo N. The next
question is, why is this function secure?
8:17 - 8:22

In other words, why is this function one
way if all I have is just a public key,
8:22 - 8:26

but I don't have the secret key? And so to
argue that this function is one way we
8:26 - 8:31

basically state the RSA assumption which
basically says that the RSA function is a
8:31 - 8:37

one way permutation. And formally the way
we state that is that, basically for all
8:37 - 8:41

efficient algorithms, A, it so happens
that if I generate two primes p and q,
8:41 - 8:46

random primes p and q. I multiply them to
get to modulus N and then I choose a
8:46 - 8:51

random 'y' in ZN star. And now I give
the modulus, the exponent and this 'y' to
8:51 - 8:56

algorithm A, the probability that I'll get
the inverse of RSA at the point Y, mainly
8:56 - 9:00

I'll get Y to the power of one over E.
That's really what the inverse is. This
9:00 - 9:05

probability is negligible. So this
assumption is called the RSA assumption.
9:05 - 9:09

It basically states that RSA is a one way
permutation just given the public [key?]. And
9:09 - 9:14

therefore, it is a trapdoor permutation
because it has a trapdoor. And makes this
9:14 - 9:20

easy to invert for anyone who knows the
trap door. So now that we have a secure
9:20 - 9:24

trap door permutation, we can simply plug
that into our construction for public key
9:24 - 9:28

encryption, and get our first real world
public key encryption system. And so what
9:28 - 9:32

we'll do is we'll simply plug the, the RSA
trapdoor permutation into the iso standard
9:32 - 9:36

construction that we saw in the previous
segment. So, if you recall, that
9:36 - 9:40

construction was based on a symmetric
encryption system that had to provide
9:40 - 9:44

authenticated encryption. And it was also
based on a hash function. Then mapped
9:45 - 9:49

while transferring it into the world of
RSA, it maps elements in
9:49 - 9:54

ZN, into secret keys for the
symmetric key system. And now the
9:54 - 9:59

way the encryption scheme works is really
easy to describe. Basically algorithm G
9:59 - 10:04

just runs the RSA key generation algorithm
and produces a public key and a secret
10:04 - 10:08

key. Just as before. So you notice the
public key contains the encryption
10:08 - 10:12

exponent and the, secret key contains the
decryption exponent. And the way we
10:12 - 10:16

encrypt is as follows. Well, we're going
to choose a random X in ZN. We're going
10:16 - 10:21

to apply the RSA function to this X, we're
going to deduce a symmetric key from this
10:21 - 10:26

X by hashing it, so we compute H of X to
obtain the key K, and then we output this
10:26 - 10:31

Y along with the encryption of the message
under the symmetric key K. And in
10:31 - 10:36

practice, the hash function H would be
just implemented just using SHA-256, and
10:36 - 10:41

you would use the output of SHA-256 to
generate a symmetric key that could then
10:41 - 10:46

be used for the symmetric encryption
assistant. So that's how we would encrypt
10:46 - 10:50

and then the way we would decrypt is
pretty much as we saw in the previous
10:50 - 10:55

segment, where the first thing we would do
is we would use the secret key to invert
10:55 - 11:00

the header of the ciphertext. So we would
compute RSA invert of Y, that would give
11:00 - 11:05

us the value X. Then we would apply the
hash function to X so that this would give
11:05 - 11:09

us the key K. And then we would run the
decryption algorithm for the symmetric
11:09 - 11:15

system on the ciphertext and that would
produce the original message m. And then
11:15 - 11:19

we stated a theorem in the previous
segment to say that if the RSA trapdoor
11:19 - 11:23

permutation is secure, Es and Ds, the
symmetric encryption scheme [inaudible]
11:23 - 11:27

provides authenticated encryption. And as
we said, H is just random oracle. It's,
11:27 - 11:31

you know, kind of a random function from
ZN to the key space. Then, in fact, this
11:31 - 11:36

system is chosen cipher text secure, and
is a good public key system to use.
11:36 - 11:42

So now that we have our first example of a
good public key system to use, I wanna
11:42 - 11:47

quickly warn you about how not to use RSA
for encryption. And this again something
11:47 - 11:51

that we said in the previous segment. And
I'm going to repeat it here, except I'm
11:51 - 11:56

going to make it specific to RSA. So
I'd like to call this, textbook RSA.
11:56 - 11:59

Which basically is the first thing that
comes to mind when you think about
11:59 - 12:04

encrypting using RSA, namely, the secret
key and the public key will be as before.
12:04 - 12:08

But now instead of running through, a hash
function to generate a symmetric key, what
12:08 - 12:12

we would do is we would directly use RSA
to encrypt the given message M. And then
12:12 - 12:16

we would directly use the decryption
exponent to decrypt the cipher text and
12:16 - 12:21

obtain the plain text M. I'm going to call
this textbook RSA, because there are many
12:21 - 12:25

textbooks that describe RSA encryption in
this way. And this is completely wrong.
12:25 - 12:29

This is not how RSA encryption works.
It's an insecure system. In particular,
12:29 - 12:34

it's deterministic encryption, and so it
can't possibly be semantically secure, but
12:34 - 12:39

in fact there are many attacks exist that
I'm going to show you an attack in just a
12:39 - 12:43

minute, but the message that I want to
make clear here, is that RSA, all it is,
12:43 - 12:47

is a trap door permutation. By itself
it's not an encryption system. You have to
12:47 - 12:51

embellish it with this ISO standard for
example, to make it into an encryption
12:51 - 12:56

system. By itself, all it is, is just a
function. So let's see what goes wrong if
12:56 - 13:00

you try to use textbook RSA, In other
words, if you try to encrypt a message
13:00 - 13:05

using RSA directly. And I'll give you an
example attack from the world of the web.
13:05 - 13:10

So imagine we have a web server. The web
server has an RSA secret key. Here's it's
13:10 - 13:15

denoted by N and D. And here we have a web
browser who's trying to establish a secure
13:15 - 13:19

session, a secure SSL session, with the web
server. So the way SSL works is that the
13:19 - 13:23

web browser starts off by sending this
client hello message saying, hey, I want
13:23 - 13:28

to set up a secure session with you. The
web server responds with a server hello
13:28 - 13:32

message that contains the server's public
key And then the web browser will go ahead
13:32 - 13:37

and generate a random what's called a
premaster secret K, it will encrypt the
13:37 - 13:41

premaster secret using K and send the
result in ciphertext over to the web
13:41 - 13:45

server. The web server will decrypt and
then the web server will also get K, so
13:45 - 13:49

now the two have a shared key that they
can use to then secure a session between
13:49 - 13:54

them. So I want to show you what goes
wrong if we directly use the RSA function
13:54 - 13:58

for encrypting K. In other words, if
directly K is encrypted as K to the e
13:58 - 14:03

modulo N. Just for the sake of argument
let's suppose that K is a 64-bit key.
14:03 - 14:08

We're going to treat K as an integer in
the range as zero to 2 to the 64.
14:08 - 14:13

More precisely two to the 64 minus one,
and now what we're going to do is the
14:13 - 14:18

following. First of all, suppose it so
happens that K factors into a product of
14:18 - 14:24

roughly equal sized numbers. So K we can
write as K1 times K2, where K1 and K2 are
14:24 - 14:30

integers. And both are say less than two
to the 34. So, it turns out this happens
14:30 - 14:35

with probability roughly twenty percent so
one in five times K can actually be
14:35 - 14:40

written in this way. But, now if we plug
this K, K=K1 x K2 if we plug that into the
14:40 - 14:45

equation that defines the cipher text you
see that we can simply substitute K by K1 x k2
14:45 - 14:51

and then we can move k1 to the other side.
So then we end up with this equation here,
14:51 - 14:56

namely C over K1 to the e is equal to K2 to the e. You notice if I multiply both
14:56 - 15:01

sides by K1 to the e, I get that C is
equal to K1 times K2 to the e,
15:01 - 15:06

which is precisely this equation here.
Okay, so all I did is I just replaced K by
15:06 - 15:12

K1 times K2 and then divided by K1 to the
e. So by now this should look familiar to
15:12 - 15:16

you. So now we have two variables in this
equation, of course. C is known to the
15:16 - 15:20

attacker, E is known to the attacker, and
N is known to the attacker. The two
15:20 - 15:25

variables in this equation are K1 and
K2, and we've separated them into a
15:25 - 15:29

different side of the equation, and as a
result we can do now a meet in the middle
15:29 - 15:33

attack on this equation. So let's do the
meet in the middle attack. What we would
15:33 - 15:38

do is we would build a table of all
possible values Of the left-hand side. So
15:38 - 15:43

all possible values of C over K1 to the E,
there are 2 to the 34 of them. And then,
15:44 - 15:49

for all possible values on the right-hand
side, [inaudible] for all possible values
15:49 - 15:54

of K2 to the e, we're gonna check whether
this value here lives in the table that we
15:54 - 15:59

constructed in step one. And if it is then
we found a collision, and basically we
15:59 - 16:03

have a solution to this equation. So as
soon as we find an element of the form K2
16:03 - 16:08

to the E that lives in the table that
we built in step one, we've solved this
16:08 - 16:13

equation and we found K1 and K2. And
of course once we found K1 and K2,
16:13 - 16:17

we can easily recover K simply by
multiplying them. So then we multiply K1
16:17 - 16:21

and K2 and we get, the secret key
K. Okay? So, we've broken, basically,
16:21 - 16:26

this, this encryption system. And how long
did it take? Well, by brute force, we
16:26 - 16:30

could have broken it in time 2 to the 64,
simply by trying all possible keys. But
16:30 - 16:34

this attack, you notice, it took 2 to
the 34 time for step number one. Well, it
16:34 - 16:38

took a little bit more than 2 to the 34,
'cause we had to do these exponentiations.
16:39 - 16:43

It took 2 to the 34 time for step two
against slightly more than two to the 34
16:43 - 16:48

because of the exponentiations. So let's
say that the whole algorithm took time two
16:48 - 16:52

to the 40. The point is that this is much,
much less time due to the 64. So here you
16:52 - 16:57

have an example. Where if you encrypt
using RSA directly, in other words you
16:57 - 17:01

directly compute, K to the E, mod N,
instead of going through the ISO standard,
17:01 - 17:06

for example, then you get an attack that
runs much faster than exhaustive search.
17:06 - 17:11

You get an attack that runs in time two to
the 40, rather than time two to the 64.
17:11 - 17:15

Okay, so this is a cute example of how
things can break if you use the RSA
17:15 - 17:19

trapdoor permutation directly to
encrypt a message. So the message to
17:19 - 17:24

remember here, is never, ever, ever use
RSA directly to encrypt. You have to use to go
17:24 - 17:28

through an encryption mechanism. For
example, like the ISO standard. And in
17:28 - 17:32

fact, in the next segment we are going to
see other ways of using RSA to build
17:32 - 17:34

public key encryption.

Title:: The RSA trapdoor permutation (18 min)
Video Language:: English

	rodolfojcj edited English subtitles for The RSA trapdoor permutation (18 min)
	rodolfojcj edited English subtitles for The RSA trapdoor permutation (18 min)
	rodolfojcj edited English subtitles for The RSA trapdoor permutation (18 min)
	MrK edited English subtitles for The RSA trapdoor permutation (18 min)
	MrK edited English subtitles for The RSA trapdoor permutation (18 min)
	MrK edited English subtitles for The RSA trapdoor permutation (18 min)
	MrK edited English subtitles for The RSA trapdoor permutation (18 min)
	MrK edited English subtitles for The RSA trapdoor permutation (18 min)

Show all

English subtitles

Revisions

Revision 6

rodolfojcj

The RSA trapdoor permutation (18 min)

Revisions

Our website uses cookies

Operating cookies (Required)