< Return to Video

The RSA trapdoor permutation (18 min)

  • 0:00 - 0:04
    In the previous segment we saw how to
    build public key encryption from trapdoor
  • 0:04 - 0:08
    functions, in this segment we're going to
    build a classic trapdoor function called
  • 0:08 - 0:13
    RSA. But first let's quickly review what a
    trapdoor function is. So recall that a
  • 0:13 - 0:17
    trapdoor function is made up of three
    algorithms. There is a key generation
  • 0:17 - 0:21
    algorithm, the function itself, and the
    inverse of the function. The key
  • 0:21 - 0:25
    generation algorithm outputs a public key
    and a secret key. And in this case, in
  • 0:25 - 0:30
    this lecture the public key is going to
    define a function that maps the set X onto
  • 0:30 - 0:34
    itself. Which is why I call these things
    trapdoor permutations, as opposed to
  • 0:34 - 0:38
    trapdoor functions, simply because the
    function maps X onto itself, whereas a
  • 0:38 - 0:43
    trapdoor function might map the set X to
    some arbitrary set Y. Now given the public
  • 0:43 - 0:48
    key, the function, as we say, basically
    defines this function from the set X to
  • 0:48 - 0:53
    the set X. And then given the secret key,
    we can invert this function. So this
  • 0:53 - 0:57
    function F evaluates the function in the
    forward direction, and this function F
  • 0:57 - 1:02
    inverse, which means the secret key SK,
    computes F in the reverse direction. Now
  • 1:02 - 1:06
    we say that the trapdoor permutation is
    secure if the function defined by the
  • 1:06 - 1:11
    public key is in fact a one-way function,
    which means that it's easy to evaluate,
  • 1:11 - 1:15
    but without the trapdoor, the secret
    trapdoor, it's difficult to invert. Before
  • 1:15 - 1:20
    we look at our first example of a trapdoor
    permutation, I want to do a quick review
  • 1:20 - 1:24
    of some necessary arithmetic facts that
    we're gonna need. And in particular,
  • 1:24 - 1:29
    let's look at some arithmetic facts
    modulo composites. So here we have our
  • 1:29 - 1:33
    modulus N, which happens to be a product
    of two primes, and you should be thinking
  • 1:33 - 1:38
    of P and Q as roughly equal size numbers,
    since particular P and Q are both roughly
  • 1:38 - 1:42
    on the size of square root of N. So both
    are roughly the same size. Recall that we
  • 1:42 - 1:47
    denoted by ZN the set of integers from
    zero to N minus one, and we said that we
  • 1:47 - 1:51
    can do addition and multiplication module N. We denoted by ZN star the set of
  • 1:51 - 1:56
    invertible elements in ZN. So these are
    all the elements, which have a modular
  • 1:56 - 2:01
    inverse. And we said that actually X is
    invertible if and only if it's relatively
  • 2:01 - 2:06
    prime to N. Moreover, I also told you that
    the number of invertible elements in ZN
  • 2:06 - 2:11
    star is denoted by this function phi(N). So phi(N)
    is the number of invertible elements in ZN
  • 2:11 - 2:16
    star, And I told you that when N is a
    product of two distinct primes, then in
  • 2:16 - 2:21
    fact phi(N) is equal to (P - 1) times (Q - 1) and if you extend that out, you
  • 2:21 - 2:26
    see that this is really equal to (N - P - Q + 1). Now remember that I said
  • 2:26 - 2:31
    that P and Q are on the order of square
    root of N which means that P + Q is
  • 2:31 - 2:36
    also on the order of square root of N.
    Which means that really phi(N) is on the
  • 2:36 - 2:41
    order of N minus two square root of N. So,
    in other words, it's very, very close to
  • 2:41 - 2:45
    N. Basically, subtracting the square root
    of N from a number, this is from, N is
  • 2:45 - 2:49
    going to be a huge number in our case,
    like 600 digits. And so subtracting from a
  • 2:49 - 2:54
    600 digit number, the square root of that
    600 digit number, namely a 300 digit
  • 2:54 - 2:58
    number, hardly affects the size of the
    number. Which means that phi(N) is really,
  • 2:58 - 3:02
    really, really close to N, and I want you
    to remember that, because we will actually
  • 3:02 - 3:06
    be using this now and again. And so the
    fact that phi(N) is really close to N, means
  • 3:06 - 3:11
    that if we choose a random element modulo
    N It's very, very, very likely to be in ZN
  • 3:11 - 3:16
    star. So it's very unlikely that by
    choosing a random element in ZN, we will
  • 3:16 - 3:20
    end up choosing an element that is not
    invertable. Okay. So just remember that,
  • 3:20 - 3:25
    that in fact almost all elements in ZN are
    actually invertible. And the last thing
  • 3:25 - 3:31
    that we'll need is Euler's theorem, which
    we covered last week, which basically says
  • 3:31 - 3:36
    that for any element X in ZN star, if I
    raise X to the power of phi(N), I get one, in
  • 3:36 - 3:43
    ZN. So in other words, I get 1 modulo N. I'll say it one more time because this
  • 3:43 - 3:47
    is gonna be critical to what's coming.
    Again X to the phi(N) is equal to 1 modulo
  • 3:47 - 3:52
    N. So now that we have the necessary
    background we can actually describe the
  • 3:52 - 3:56
    RSA trapdoor permutation. This is a classic,
    classic, classic construction in
  • 3:56 - 4:01
    cryptography that was first published in
    Scientific American back in 1977, this is
  • 4:01 - 4:06
    a very well known article in cryptography.
    And ever since then, this was 35 years
  • 4:06 - 4:10
    ago, the RSA trapdoor permutation has been used
    extensively in cryptographic applications.
  • 4:10 - 4:15
    For example, SSL and TLS use RSA both for
    certificates and for key exchange. There
  • 4:15 - 4:19
    are many secure email systems and secure
    file systems that use RSA to encrypt
  • 4:19 - 4:24
    emails and encrypt files in the file
    system. And there are many, many, many
  • 4:24 - 4:28
    other applications of this system. So this
    is a very, very classic, crypto
  • 4:28 - 4:33
    construction, and I'll show you how it
    works. I should say that RSA is named for
  • 4:33 - 4:37
    its inventors, Ron Rivest, Adi Shamir and
    Len Adleman, who were all at MIT at the
  • 4:37 - 4:42
    time they made this important discovery.
    So now we're ready to describe the RSA
  • 4:42 - 4:46
    trapdoor permutation. To do that, I have
    to describe the key generation algorithm,
  • 4:46 - 4:50
    the function ƒ and the function ƒ−1.
    So let's see. So the way the key
  • 4:50 - 4:55
    generation algorithm works is as follows:
    What we do is we generate two primes, P
  • 4:55 - 4:59
    and Q, each would be, say on the order of
    1000 bits, so, you know, roughly 300
  • 4:59 - 5:04
    digits, and then the RSA modulus is simply
    going to be the product of those two
  • 5:04 - 5:09
    primes. The next thing we do is we pick
    two exponents, e and d, such that e times
  • 5:09 - 5:14
    d is 1 modulo phi(N). What this means is
    that e and d first of all are relatively
  • 5:14 - 5:19
    prime to phi(N) and second of all they're
    actually modular inverses of one another,
  • 5:19 - 5:24
    modulo phi(N). And then we output the public
    key as the pair N,e and the
  • 5:24 - 5:29
    secret key is the secret key N,d. I should mention that the exponent e,
  • 5:29 - 5:35
    that the number e is sometimes called the
    encryption exponent and the exponent d is
  • 5:35 - 5:39
    sometimes called the decryption exponent.
    And you'll see why I keep referring to
  • 5:39 - 5:43
    these as exponents in just a second. Now
    the way the RSA function itself is
  • 5:43 - 5:47
    defined is really simple. For simplicity
    I'm gonna define it as the function
  • 5:47 - 5:52
    from ZN star to ZN star. And the way
    the function is defined is simply given an
  • 5:52 - 5:57
    input X, all we do is we simply take X and
    raise it to the power of e in ZN. So we
  • 5:57 - 6:02
    just compute X to the e, modulo N. That's
    it. And then to decrypt, what we do is we
  • 6:02 - 6:07
    simply, given an input Y, we simply raise
    Y to the power of d, modulo N. And that's
  • 6:07 - 6:12
    it. So now you can see why as I refer to e
    and d as exponents. They're the things
  • 6:12 - 6:18
    that X and Y are being raised to. So let's
    quickly verify that F inverse really does
  • 6:18 - 6:23
    invert the function F. So let's see what
    happens when we compute Y to the d. So
  • 6:23 - 6:28
    suppose Y itself happens to be the RSA
    function of some value X. In which case,
  • 6:28 - 6:33
    what Y to the d is, is really RSA of X to
    the power of d. While X by itself is
  • 6:33 - 6:39
    simply gonna be X to the e modulo N, And
    therefore, Y to the d is simply X to the e
  • 6:39 - 6:45
    times d modulo N. Again, just using rules
    of exponentiation, the exponents multiply,
  • 6:45 - 6:51
    so we get X to the e times d. But what do
    we know about e times d? e times d we said
  • 6:51 - 6:57
    is one modulo phi(N). And what that means is
    there exists some integer such that e
  • 6:57 - 7:04
    times d is equal to K times phi(N) plus one.
    This is what it means that e times d is
  • 7:04 - 7:10
    one modulo phi(N). So, we can simply replace e
    times d by K times phi(N)+1. So, that's
  • 7:10 - 7:14
    what I wrote here. So, we have X to the
    power of K times phi(N)+1. But now again
  • 7:14 - 7:20
    just using rules of exponentiation, we can
    re-write this as X to the power of phi(N) to
  • 7:20 - 7:25
    the power of K times X. This is really the
    same thing as K times phi(N)+1 as the
  • 7:25 - 7:30
    exponent. I just kind of separated the
    exponent into it's different components.
  • 7:30 - 7:35
    Now, X to the phi(N) by Euler's theorem, we know
    that X to the phi(N) is equal to one. So what
  • 7:35 - 7:41
    is this whole product equal to? Well since
    X to the phi(N) is equal to one, one to
  • 7:41 - 7:46
    the K is also equal to one, so this whole
    thing over here is simply equal to one.
  • 7:46 - 7:51
    And what we're left with is X. So what we
    just proved is that if I take the output of
  • 7:51 - 7:55
    the RSA function and raise it to the power
    of 'd', I get back X. Which means that
  • 7:55 - 8:00
    raising to the power of 'd' basically
    inverts the RSA function, which is what we
  • 8:00 - 8:05
    wanted to show. So that's it, that's the
    whole description of the function, we've
  • 8:05 - 8:09
    described the key generation. The function
    itself, which is simply raising things to
  • 8:09 - 8:13
    the power of e modulo N, and the inversion
    function which is simply raising things to
  • 8:13 - 8:17
    the power of d, also modulo N. The next
    question is, why is this function secure?
  • 8:17 - 8:22
    In other words, why is this function one
    way if all I have is just a public key,
  • 8:22 - 8:26
    but I don't have the secret key? And so to
    argue that this function is one way we
  • 8:26 - 8:31
    basically state the RSA assumption which
    basically says that the RSA function is a
  • 8:31 - 8:37
    one way permutation. And formally the way
    we state that is that, basically for all
  • 8:37 - 8:41
    efficient algorithms, A, it so happens
    that if I generate two primes p and q,
  • 8:41 - 8:46
    random primes p and q. I multiply them to
    get to modulus N and then I choose a
  • 8:46 - 8:51
    random 'y' in ZN star. And now I give
    the modulus, the exponent and this 'y' to
  • 8:51 - 8:56
    algorithm A, the probability that I'll get
    the inverse of RSA at the point Y, mainly
  • 8:56 - 9:00
    I'll get Y to the power of one over E.
    That's really what the inverse is. This
  • 9:00 - 9:05
    probability is negligible. So this
    assumption is called the RSA assumption.
  • 9:05 - 9:09
    It basically states that RSA is a one way
    permutation just given the public [key?]. And
  • 9:09 - 9:14
    therefore, it is a trapdoor permutation
    because it has a trapdoor. And makes this
  • 9:14 - 9:20
    easy to invert for anyone who knows the
    trap door. So now that we have a secure
  • 9:20 - 9:24
    trap door permutation, we can simply plug
    that into our construction for public key
  • 9:24 - 9:28
    encryption, and get our first real world
    public key encryption system. And so what
  • 9:28 - 9:32
    we'll do is we'll simply plug the, the RSA
    trapdoor permutation into the iso standard
  • 9:32 - 9:36
    construction that we saw in the previous
    segment. So, if you recall, that
  • 9:36 - 9:40
    construction was based on a symmetric
    encryption system that had to provide
  • 9:40 - 9:44
    authenticated encryption. And it was also
    based on a hash function. Then mapped
  • 9:45 - 9:49
    while transferring it into the world of
    RSA, it maps elements in
  • 9:49 - 9:54
    ZN, into secret keys for the
    symmetric key system. And now the
  • 9:54 - 9:59
    way the encryption scheme works is really
    easy to describe. Basically algorithm G
  • 9:59 - 10:04
    just runs the RSA key generation algorithm
    and produces a public key and a secret
  • 10:04 - 10:08
    key. Just as before. So you notice the
    public key contains the encryption
  • 10:08 - 10:12
    exponent and the, secret key contains the
    decryption exponent. And the way we
  • 10:12 - 10:16
    encrypt is as follows. Well, we're going
    to choose a random X in ZN. We're going
  • 10:16 - 10:21
    to apply the RSA function to this X, we're
    going to deduce a symmetric key from this
  • 10:21 - 10:26
    X by hashing it, so we compute H of X to
    obtain the key K, and then we output this
  • 10:26 - 10:31
    Y along with the encryption of the message
    under the symmetric key K. And in
  • 10:31 - 10:36
    practice, the hash function H would be
    just implemented just using SHA-256, and
  • 10:36 - 10:41
    you would use the output of SHA-256 to
    generate a symmetric key that could then
  • 10:41 - 10:46
    be used for the symmetric encryption
    assistant. So that's how we would encrypt
  • 10:46 - 10:50
    and then the way we would decrypt is
    pretty much as we saw in the previous
  • 10:50 - 10:55
    segment, where the first thing we would do
    is we would use the secret key to invert
  • 10:55 - 11:00
    the header of the ciphertext. So we would
    compute RSA invert of Y, that would give
  • 11:00 - 11:05
    us the value X. Then we would apply the
    hash function to X so that this would give
  • 11:05 - 11:09
    us the key K. And then we would run the
    decryption algorithm for the symmetric
  • 11:09 - 11:15
    system on the ciphertext and that would
    produce the original message m. And then
  • 11:15 - 11:19
    we stated a theorem in the previous
    segment to say that if the RSA trapdoor
  • 11:19 - 11:23
    permutation is secure, Es and Ds, the
    symmetric encryption scheme [inaudible]
  • 11:23 - 11:27
    provides authenticated encryption. And as
    we said, H is just random oracle. It's,
  • 11:27 - 11:31
    you know, kind of a random function from
    ZN to the key space. Then, in fact, this
  • 11:31 - 11:36
    system is chosen cipher text secure, and
    is a good public key system to use.
  • 11:36 - 11:42
    So now that we have our first example of a
    good public key system to use, I wanna
  • 11:42 - 11:47
    quickly warn you about how not to use RSA
    for encryption. And this again something
  • 11:47 - 11:51
    that we said in the previous segment. And
    I'm going to repeat it here, except I'm
  • 11:51 - 11:56
    going to make it specific to RSA. So
    I'd like to call this, textbook RSA.
  • 11:56 - 11:59
    Which basically is the first thing that
    comes to mind when you think about
  • 11:59 - 12:04
    encrypting using RSA, namely, the secret
    key and the public key will be as before.
  • 12:04 - 12:08
    But now instead of running through, a hash
    function to generate a symmetric key, what
  • 12:08 - 12:12
    we would do is we would directly use RSA
    to encrypt the given message M. And then
  • 12:12 - 12:16
    we would directly use the decryption
    exponent to decrypt the cipher text and
  • 12:16 - 12:21
    obtain the plain text M. I'm going to call
    this textbook RSA, because there are many
  • 12:21 - 12:25
    textbooks that describe RSA encryption in
    this way. And this is completely wrong.
  • 12:25 - 12:29
    This is not how RSA encryption works.
    It's an insecure system. In particular,
  • 12:29 - 12:34
    it's deterministic encryption, and so it
    can't possibly be semantically secure, but
  • 12:34 - 12:39
    in fact there are many attacks exist that
    I'm going to show you an attack in just a
  • 12:39 - 12:43
    minute, but the message that I want to
    make clear here, is that RSA, all it is,
  • 12:43 - 12:47
    is a trap door permutation. By itself
    it's not an encryption system. You have to
  • 12:47 - 12:51
    embellish it with this ISO standard for
    example, to make it into an encryption
  • 12:51 - 12:56
    system. By itself, all it is, is just a
    function. So let's see what goes wrong if
  • 12:56 - 13:00
    you try to use textbook RSA, In other
    words, if you try to encrypt a message
  • 13:00 - 13:05
    using RSA directly. And I'll give you an
    example attack from the world of the web.
  • 13:05 - 13:10
    So imagine we have a web server. The web
    server has an RSA secret key. Here's it's
  • 13:10 - 13:15
    denoted by N and D. And here we have a web
    browser who's trying to establish a secure
  • 13:15 - 13:19
    session, a secure SSL session, with the web
    server. So the way SSL works is that the
  • 13:19 - 13:23
    web browser starts off by sending this
    client hello message saying, hey, I want
  • 13:23 - 13:28
    to set up a secure session with you. The
    web server responds with a server hello
  • 13:28 - 13:32
    message that contains the server's public
    key And then the web browser will go ahead
  • 13:32 - 13:37
    and generate a random what's called a
    premaster secret K, it will encrypt the
  • 13:37 - 13:41
    premaster secret using K and send the
    result in ciphertext over to the web
  • 13:41 - 13:45
    server. The web server will decrypt and
    then the web server will also get K, so
  • 13:45 - 13:49
    now the two have a shared key that they
    can use to then secure a session between
  • 13:49 - 13:54
    them. So I want to show you what goes
    wrong if we directly use the RSA function
  • 13:54 - 13:58
    for encrypting K. In other words, if
    directly K is encrypted as K to the e
  • 13:58 - 14:03
    modulo N. Just for the sake of argument
    let's suppose that K is a 64-bit key.
  • 14:03 - 14:08
    We're going to treat K as an integer in
    the range as zero to 2 to the 64.
  • 14:08 - 14:13
    More precisely two to the 64 minus one,
    and now what we're going to do is the
  • 14:13 - 14:18
    following. First of all, suppose it so
    happens that K factors into a product of
  • 14:18 - 14:24
    roughly equal sized numbers. So K we can
    write as K1 times K2, where K1 and K2 are
  • 14:24 - 14:30
    integers. And both are say less than two
    to the 34. So, it turns out this happens
  • 14:30 - 14:35
    with probability roughly twenty percent so
    one in five times K can actually be
  • 14:35 - 14:40
    written in this way. But, now if we plug
    this K, K=K1 x K2 if we plug that into the
  • 14:40 - 14:45
    equation that defines the cipher text you
    see that we can simply substitute K by K1 x k2
  • 14:45 - 14:51
    and then we can move k1 to the other side.
    So then we end up with this equation here,
  • 14:51 - 14:56
    namely C over K1 to the e is equal to K2 to the e. You notice if I multiply both
  • 14:56 - 15:01
    sides by K1 to the e, I get that C is
    equal to K1 times K2 to the e,
  • 15:01 - 15:06
    which is precisely this equation here.
    Okay, so all I did is I just replaced K by
  • 15:06 - 15:12
    K1 times K2 and then divided by K1 to the
    e. So by now this should look familiar to
  • 15:12 - 15:16
    you. So now we have two variables in this
    equation, of course. C is known to the
  • 15:16 - 15:20
    attacker, E is known to the attacker, and
    N is known to the attacker. The two
  • 15:20 - 15:25
    variables in this equation are K1 and
    K2, and we've separated them into a
  • 15:25 - 15:29
    different side of the equation, and as a
    result we can do now a meet in the middle
  • 15:29 - 15:33
    attack on this equation. So let's do the
    meet in the middle attack. What we would
  • 15:33 - 15:38
    do is we would build a table of all
    possible values Of the left-hand side. So
  • 15:38 - 15:43
    all possible values of C over K1 to the E,
    there are 2 to the 34 of them. And then,
  • 15:44 - 15:49
    for all possible values on the right-hand
    side, [inaudible] for all possible values
  • 15:49 - 15:54
    of K2 to the e, we're gonna check whether
    this value here lives in the table that we
  • 15:54 - 15:59
    constructed in step one. And if it is then
    we found a collision, and basically we
  • 15:59 - 16:03
    have a solution to this equation. So as
    soon as we find an element of the form K2
  • 16:03 - 16:08
    to the E that lives in the table that
    we built in step one, we've solved this
  • 16:08 - 16:13
    equation and we found K1 and K2. And
    of course once we found K1 and K2,
  • 16:13 - 16:17
    we can easily recover K simply by
    multiplying them. So then we multiply K1
  • 16:17 - 16:21
    and K2 and we get, the secret key
    K. Okay? So, we've broken, basically,
  • 16:21 - 16:26
    this, this encryption system. And how long
    did it take? Well, by brute force, we
  • 16:26 - 16:30
    could have broken it in time 2 to the 64,
    simply by trying all possible keys. But
  • 16:30 - 16:34
    this attack, you notice, it took 2 to
    the 34 time for step number one. Well, it
  • 16:34 - 16:38
    took a little bit more than 2 to the 34,
    'cause we had to do these exponentiations.
  • 16:39 - 16:43
    It took 2 to the 34 time for step two
    against slightly more than two to the 34
  • 16:43 - 16:48
    because of the exponentiations. So let's
    say that the whole algorithm took time two
  • 16:48 - 16:52
    to the 40. The point is that this is much,
    much less time due to the 64. So here you
  • 16:52 - 16:57
    have an example. Where if you encrypt
    using RSA directly, in other words you
  • 16:57 - 17:01
    directly compute, K to the E, mod N,
    instead of going through the ISO standard,
  • 17:01 - 17:06
    for example, then you get an attack that
    runs much faster than exhaustive search.
  • 17:06 - 17:11
    You get an attack that runs in time two to
    the 40, rather than time two to the 64.
  • 17:11 - 17:15
    Okay, so this is a cute example of how
    things can break if you use the RSA
  • 17:15 - 17:19
    trapdoor permutation directly to
    encrypt a message. So the message to
  • 17:19 - 17:24
    remember here, is never, ever, ever use
    RSA directly to encrypt. You have to use to go
  • 17:24 - 17:28
    through an encryption mechanism. For
    example, like the ISO standard. And in
  • 17:28 - 17:32
    fact, in the next segment we are going to
    see other ways of using RSA to build
  • 17:32 - 17:34
    public key encryption.
Title:
The RSA trapdoor permutation (18 min)
Video Language:
English

English subtitles

Revisions