In the previous segment we saw how to
build public key encryption from trapdoor
functions, in this segment we're going to
build a classic trapdoor function called
RSA. But first let's quickly review what a
trapdoor function is. So recall that a
trapdoor function is made up of three
algorithms. There is a key generation
algorithm, the function itself, and the
inverse of the function. The key
generation algorithm outputs a public key
and a secret key. And in this case, in
this lecture the public key is going to
define a function that maps the set X onto
itself. Which is why I call these things
trapdoor permutations, as opposed to
trapdoor functions, simply because the
function maps X onto itself, whereas a
trapdoor function might map the set X to
some arbitrary set Y. Now given the public
key, the function, as we say, basically
defines this function from the set X to
the set X. And then given the secret key,
we can invert this function. So this
function F evaluates the function in the
forward direction, and this function F
inverse, which means the secret key SK,
computes F in the reverse direction. Now
we say that the trapdoor permutation is
secure if the function defined by the
public key is in fact a one-way function,
which means that it's easy to evaluate,
but without the trapdoor, the secret
trapdoor, it's difficult to invert. Before
we look at our first example of a trapdoor
permutation, I want to do a quick review
of some necessary arithmetic facts that
we're gonna need. And in particular,
let's look at some arithmetic facts
modulo composites. So here we have our
modulus N, which happens to be a product
of two primes, and you should be thinking
of P and Q as roughly equal size numbers,
since particular P and Q are both roughly
on the size of square root of N. So both
are roughly the same size. Recall that we
denoted by ZN the set of integers from
zero to N minus one, and we said that we
can do addition and multiplication module N. We denoted by ZN star the set of
invertible elements in ZN. So these are
all the elements, which have a modular
inverse. And we said that actually X is
invertible if and only if it's relatively
prime to N. Moreover, I also told you that
the number of invertible elements in ZN
star is denoted by this function phi(N). So phi(N)
is the number of invertible elements in ZN
star, And I told you that when N is a
product of two distinct primes, then in
fact phi(N) is equal to (P - 1) times (Q - 1) and if you extend that out, you
see that this is really equal to (N - P - Q + 1). Now remember that I said
that P and Q are on the order of square
root of N which means that P + Q is
also on the order of square root of N.
Which means that really phi(N) is on the
order of N minus two square root of N. So,
in other words, it's very, very close to
N. Basically, subtracting the square root
of N from a number, this is from, N is
going to be a huge number in our case,
like 600 digits. And so subtracting from a
600 digit number, the square root of that
600 digit number, namely a 300 digit
number, hardly affects the size of the
number. Which means that phi(N) is really,
really, really close to N, and I want you
to remember that, because we will actually
be using this now and again. And so the
fact that phi(N) is really close to N, means
that if we choose a random element modulo
N It's very, very, very likely to be in ZN
star. So it's very unlikely that by
choosing a random element in ZN, we will
end up choosing an element that is not
invertable. Okay. So just remember that,
that in fact almost all elements in ZN are
actually invertible. And the last thing
that we'll need is Euler's theorem, which
we covered last week, which basically says
that for any element X in ZN star, if I
raise X to the power of phi(N), I get one, in
ZN. So in other words, I get 1 modulo N. I'll say it one more time because this
is gonna be critical to what's coming.
Again X to the phi(N) is equal to 1 modulo
N. So now that we have the necessary
background we can actually describe the
RSA trapdoor permutation. This is a classic,
classic, classic construction in
cryptography that was first published in
Scientific American back in 1977, this is
a very well known article in cryptography.
And ever since then, this was 35 years
ago, the RSA trapdoor permutation has been used
extensively in cryptographic applications.
For example, SSL and TLS use RSA both for
certificates and for key exchange. There
are many secure email systems and secure
file systems that use RSA to encrypt
emails and encrypt files in the file
system. And there are many, many, many
other applications of this system. So this
is a very, very classic, crypto
construction, and I'll show you how it
works. I should say that RSA is named for
its inventors, Ron Rivest, Adi Shamir and
Len Adleman, who were all at MIT at the
time they made this important discovery.
So now we're ready to describe the RSA
trapdoor permutation. To do that, I have
to describe the key generation algorithm,
the function ƒ and the function ƒ−1.
So let's see. So the way the key
generation algorithm works is as follows:
What we do is we generate two primes, P
and Q, each would be, say on the order of
1000 bits, so, you know, roughly 300
digits, and then the RSA modulus is simply
going to be the product of those two
primes. The next thing we do is we pick
two exponents, e and d, such that e times
d is 1 modulo phi(N). What this means is
that e and d first of all are relatively
prime to phi(N) and second of all they're
actually modular inverses of one another,
modulo phi(N). And then we output the public
key as the pair N,e and the
secret key is the secret key N,d. I should mention that the exponent e,
that the number e is sometimes called the
encryption exponent and the exponent d is
sometimes called the decryption exponent.
And you'll see why I keep referring to
these as exponents in just a second. Now
the way the RSA function itself is
defined is really simple. For simplicity
I'm gonna define it as the function
from ZN star to ZN star. And the way
the function is defined is simply given an
input X, all we do is we simply take X and
raise it to the power of e in ZN. So we
just compute X to the e, modulo N. That's
it. And then to decrypt, what we do is we
simply, given an input Y, we simply raise
Y to the power of d, modulo N. And that's
it. So now you can see why as I refer to e
and d as exponents. They're the things
that X and Y are being raised to. So let's
quickly verify that F inverse really does
invert the function F. So let's see what
happens when we compute Y to the d. So
suppose Y itself happens to be the RSA
function of some value X. In which case,
what Y to the d is, is really RSA of X to
the power of d. While X by itself is
simply gonna be X to the e modulo N, And
therefore, Y to the d is simply X to the e
times d modulo N. Again, just using rules
of exponentiation, the exponents multiply,
so we get X to the e times d. But what do
we know about e times d? e times d we said
is one modulo phi(N). And what that means is
there exists some integer such that e
times d is equal to K times phi(N) plus one.
This is what it means that e times d is
one modulo phi(N). So, we can simply replace e
times d by K times phi(N)+1. So, that's
what I wrote here. So, we have X to the
power of K times phi(N)+1. But now again
just using rules of exponentiation, we can
re-write this as X to the power of phi(N) to
the power of K times X. This is really the
same thing as K times phi(N)+1 as the
exponent. I just kind of separated the
exponent into it's different components.
Now, X to the phi(N) by Euler's theorem, we know
that X to the phi(N) is equal to one. So what
is this whole product equal to? Well since
X to the phi(N) is equal to one, one to
the K is also equal to one, so this whole
thing over here is simply equal to one.
And what we're left with is X. So what we
just proved is that if I take the output of
the RSA function and raise it to the power
of 'd', I get back X. Which means that
raising to the power of 'd' basically
inverts the RSA function, which is what we
wanted to show. So that's it, that's the
whole description of the function, we've
described the key generation. The function
itself, which is simply raising things to
the power of e modulo N, and the inversion
function which is simply raising things to
the power of d, also modulo N. The next
question is, why is this function secure?
In other words, why is this function one
way if all I have is just a public key,
but I don't have the secret key? And so to
argue that this function is one way we
basically state the RSA assumption which
basically says that the RSA function is a
one way permutation. And formally the way
we state that is that, basically for all
efficient algorithms, A, it so happens
that if I generate two primes p and q,
random primes p and q. I multiply them to
get to modulus N and then I choose a
random 'y' in ZN star. And now I give
the modulus, the exponent and this 'y' to
algorithm A, the probability that I'll get
the inverse of RSA at the point Y, mainly
I'll get Y to the power of one over E.
That's really what the inverse is. This
probability is negligible. So this
assumption is called the RSA assumption.
It basically states that RSA is a one way
permutation just given the public [key?]. And
therefore, it is a trapdoor permutation
because it has a trapdoor. And makes this
easy to invert for anyone who knows the
trap door. So now that we have a secure
trap door permutation, we can simply plug
that into our construction for public key
encryption, and get our first real world
public key encryption system. And so what
we'll do is we'll simply plug the, the RSA
trapdoor permutation into the iso standard
construction that we saw in the previous
segment. So, if you recall, that
construction was based on a symmetric
encryption system that had to provide
authenticated encryption. And it was also
based on a hash function. Then mapped
while transferring it into the world of
RSA, it maps elements in
ZN, into secret keys for the
symmetric key system. And now the
way the encryption scheme works is really
easy to describe. Basically algorithm G
just runs the RSA key generation algorithm
and produces a public key and a secret
key. Just as before. So you notice the
public key contains the encryption
exponent and the, secret key contains the
decryption exponent. And the way we
encrypt is as follows. Well, we're going
to choose a random X in ZN. We're going
to apply the RSA function to this X, we're
going to deduce a symmetric key from this
X by hashing it, so we compute H of X to
obtain the key K, and then we output this
Y along with the encryption of the message
under the symmetric key K. And in
practice, the hash function H would be
just implemented just using SHA-256, and
you would use the output of SHA-256 to
generate a symmetric key that could then
be used for the symmetric encryption
assistant. So that's how we would encrypt
and then the way we would decrypt is
pretty much as we saw in the previous
segment, where the first thing we would do
is we would use the secret key to invert
the header of the ciphertext. So we would
compute RSA invert of Y, that would give
us the value X. Then we would apply the
hash function to X so that this would give
us the key K. And then we would run the
decryption algorithm for the symmetric
system on the ciphertext and that would
produce the original message m. And then
we stated a theorem in the previous
segment to say that if the RSA trapdoor
permutation is secure, Es and Ds, the
symmetric encryption scheme [inaudible]
provides authenticated encryption. And as
we said, H is just random oracle. It's,
you know, kind of a random function from
ZN to the key space. Then, in fact, this
system is chosen cipher text secure, and
is a good public key system to use.
So now that we have our first example of a
good public key system to use, I wanna
quickly warn you about how not to use RSA
for encryption. And this again something
that we said in the previous segment. And
I'm going to repeat it here, except I'm
going to make it specific to RSA. So
I'd like to call this, textbook RSA.
Which basically is the first thing that
comes to mind when you think about
encrypting using RSA, namely, the secret
key and the public key will be as before.
But now instead of running through, a hash
function to generate a symmetric key, what
we would do is we would directly use RSA
to encrypt the given message M. And then
we would directly use the decryption
exponent to decrypt the cipher text and
obtain the plain text M. I'm going to call
this textbook RSA, because there are many
textbooks that describe RSA encryption in
this way. And this is completely wrong.
This is not how RSA encryption works.
It's an insecure system. In particular,
it's deterministic encryption, and so it
can't possibly be semantically secure, but
in fact there are many attacks exist that
I'm going to show you an attack in just a
minute, but the message that I want to
make clear here, is that RSA, all it is,
is a trap door permutation. By itself
it's not an encryption system. You have to
embellish it with this ISO standard for
example, to make it into an encryption
system. By itself, all it is, is just a
function. So let's see what goes wrong if
you try to use textbook RSA, In other
words, if you try to encrypt a message
using RSA directly. And I'll give you an
example attack from the world of the web.
So imagine we have a web server. The web
server has an RSA secret key. Here's it's
denoted by N and D. And here we have a web
browser who's trying to establish a secure
session, a secure SSL session, with the web
server. So the way SSL works is that the
web browser starts off by sending this
client hello message saying, hey, I want
to set up a secure session with you. The
web server responds with a server hello
message that contains the server's public
key And then the web browser will go ahead
and generate a random what's called a
premaster secret K, it will encrypt the
premaster secret using K and send the
result in ciphertext over to the web
server. The web server will decrypt and
then the web server will also get K, so
now the two have a shared key that they
can use to then secure a session between
them. So I want to show you what goes
wrong if we directly use the RSA function
for encrypting K. In other words, if
directly K is encrypted as K to the e
modulo N. Just for the sake of argument
let's suppose that K is a 64-bit key.
We're going to treat K as an integer in
the range as zero to 2 to the 64.
More precisely two to the 64 minus one,
and now what we're going to do is the
following. First of all, suppose it so
happens that K factors into a product of
roughly equal sized numbers. So K we can
write as K1 times K2, where K1 and K2 are
integers. And both are say less than two
to the 34. So, it turns out this happens
with probability roughly twenty percent so
one in five times K can actually be
written in this way. But, now if we plug
this K, K=K1 x K2 if we plug that into the
equation that defines the cipher text you
see that we can simply substitute K by K1 x k2
and then we can move k1 to the other side.
So then we end up with this equation here,
namely C over K1 to the e is equal to K2 to the e. You notice if I multiply both
sides by K1 to the e, I get that C is
equal to K1 times K2 to the e,
which is precisely this equation here.
Okay, so all I did is I just replaced K by
K1 times K2 and then divided by K1 to the
e. So by now this should look familiar to
you. So now we have two variables in this
equation, of course. C is known to the
attacker, E is known to the attacker, and
N is known to the attacker. The two
variables in this equation are K1 and
K2, and we've separated them into a
different side of the equation, and as a
result we can do now a meet in the middle
attack on this equation. So let's do the
meet in the middle attack. What we would
do is we would build a table of all
possible values Of the left-hand side. So
all possible values of C over K1 to the E,
there are 2 to the 34 of them. And then,
for all possible values on the right-hand
side, [inaudible] for all possible values
of K2 to the e, we're gonna check whether
this value here lives in the table that we
constructed in step one. And if it is then
we found a collision, and basically we
have a solution to this equation. So as
soon as we find an element of the form K2
to the E that lives in the table that
we built in step one, we've solved this
equation and we found K1 and K2. And
of course once we found K1 and K2,
we can easily recover K simply by
multiplying them. So then we multiply K1
and K2 and we get, the secret key
K. Okay? So, we've broken, basically,
this, this encryption system. And how long
did it take? Well, by brute force, we
could have broken it in time 2 to the 64,
simply by trying all possible keys. But
this attack, you notice, it took 2 to
the 34 time for step number one. Well, it
took a little bit more than 2 to the 34,
'cause we had to do these exponentiations.
It took 2 to the 34 time for step two
against slightly more than two to the 34
because of the exponentiations. So let's
say that the whole algorithm took time two
to the 40. The point is that this is much,
much less time due to the 64. So here you
have an example. Where if you encrypt
using RSA directly, in other words you
directly compute, K to the E, mod N,
instead of going through the ISO standard,
for example, then you get an attack that
runs much faster than exhaustive search.
You get an attack that runs in time two to
the 40, rather than time two to the 64.
Okay, so this is a cute example of how
things can break if you use the RSA
trapdoor permutation directly to
encrypt a message. So the message to
remember here, is never, ever, ever use
RSA directly to encrypt. You have to use to go
through an encryption mechanism. For
example, like the ISO standard. And in
fact, in the next segment we are going to
see other ways of using RSA to build
public key encryption.