-
In this segment, I want to tell you about
another form of encryption, called format
-
preserving encryption. This is again
something that comes up in practice quite
-
often, especially when it comes to
encrypting credit card numbers. And, so,
-
let's see how it works. Remember that a
typical credit card number is sixteen
-
digits, broken into four blocks of four
digits each. You've probably heard before
-
that the first six digits are what's
called the BIN number, which represent the
-
issuer ID. For example, all credit cards
issued by VISA always start with the digit
-
four; all credit cards issued by
MasterCard start with digits 51 to 55, and
-
so on and so forth. The next nine digits
are actually the account number that is
-
specific to the, to the particular
customer and the last digit is a check sum
-
that's computed from the previous fifteen
digits. So there are about 20,000 BIN
-
numbers and so if you do the calculation
it turns out there are about 2 to the 42
-
possible credit card numbers which
corresponds to about 42 bits of
-
information that you need to encode if you
want to represent a credit card number
-
compactly. Now the problem is this.
Suppose we wanted to encrypt credit card
-
numbers, so that when the user swipes the
credit card number at the point of sale
-
terminal, namely at the, you know,
the merchant's cash register. The cash
-
register, this is called a point of sale
terminal, goes ahead and encrypts the
-
credit card number using a key k and
it's encrypted all the way until it goes
-
to the acquiring bank or maybe even beyond
that, maybe even all the way when it goes
-
to Visa. Now, the problem is that these
credit card numbers actually pass through
-
many, many, many processing points. All of
them expect to get basically something
-
that looks like a credit card number as a
result. So if we simply wanted to encrypt
-
something at the end point, at one end
point, and decrypt it at the other end
-
point, it's actually not that easy because
if all we did was encrypt it using AES,
-
even if we just used deterministic AES,
the output of the encrypted credit card
-
number would be 128 bit block. Sixteen
bytes that would be, that would need to be
-
sent from one system to the next, until it
reaches its destination. But each one of
-
these processors, then, would have to be
changed, because they're all expecting to
-
get credit card numbers. And so the
question is, can we encrypt at the cash
-
register, such that the resulting
encryption itself looks like a credit card
-
number. And as a result, none of these
intermediate systems would have to be
-
changed. Only the endpoints would have to
be changed. And in doing so we would
-
actually obtain end-to-end encryption
without actually having to change a lot of
-
software along the way. So the question
then is, again, can we have this mechanism
-
called format preserving encryption, where
encrypting a credit card itself produces
-
something that looks like a credit card?
So that's our goal. The encrypted credit
-
card number should look like a regular
valid credit card number. So this is the
-
goal of format preserving encryption. More
abstractly what it is we're trying to do,
-
is basically build a pseudo random
permutation on the set zero to S minus one
-
for any given S. So for the set of credit
card numbers, S would be roughly, you
-
know, two to the 42. In fact, it's gonna
be, not exactly two to the 42. It's gonna
-
be some funny numbers that's around two to
the 42. And our goal is to build a PRP
-
that acts exactly on the interval, zero to
S minus one. And what we're given as input
-
is some PRF, say, something like AES, that
acts on much larger blocks, say, acts on
-
sixteen byte blocks. So we're trying to,
in some sense, shrink the domain of the
-
PRF to make it fit the data that we're
given. And the question is basically how
-
to do that. Well, once we have such a
construction we can easily use it to
-
encrypt credit card numbers. What we would
do is we would take the, a given credit
-
card number. We would encode it such that
now it's represented as a number between
-
zero and the total number of credit card
numbers. Then we would apply our PRP so we
-
would encrypt this credit card number, you
know, using construction number two from
-
the deterministic encryption segment. And
then we would map the final number back to
-
be a regular, to look like val--, regular,
valid credit card number and then send
-
this along the way. So this is, this
is again non expanding encryption. The
-
best we can do, as we said before is to
encrypt using a PRP except again the
-
technical challenge is we need a PRP that
acts on this particular funny looking set
-
from zero to S-1 for this prespecified
value of S. So my goal is to show you how
-
to construct this and once we see how to
do it, you will also know how to encrypt
-
credit card number so that the resulting
cipher text is itself a credit card
-
number. So the construction works in two
steps. The first thing we do is we shrink
-
our PRF from the set {0,1} to the n, say
{0,1} to the 128 in the case of AES,
-
to {0,1} to the t where t is the
closest power of two to the value S.
-
So say S is some crazy number around two to
the 41. T would basically be then 42
-
because that's the closest power of two
that's just above the value S. So the
-
construction is basically gonna use the
Luby-Rackoff construction. What we need
-
is a PRF F prime that acts on blocks of
size t over two. So imagine for example
-
in the credit card case, t would be 42
because two to the 42 we said is the
-
closest power of two that's bigger than,
than, than S. S is the number of credit,
-
total number of credit cards. Then we would
need a PRF now that acts on 21-bit
-
inputs. So one way to do that is simply to
take the AES block cipher, treat it as a
-
PRF on 128-bit inputs, and then simply
truncate the output to the least
-
significant 21 bits. Okay, so we were
given a 21 bit value. We would append
-
zeros to it so that we get 128 bits as a
result. We would apply AES to that and
-
then we would truncate back to 21 bits.
Now I should tell you that that's actually
-
a simple way to do it but it's actually
not the best way to do it. There are
-
actually better ways to truncate a PRF on
n bits to a PRF on t over two bits.
-
But for our pur--, for the purposes of this
segment, the truncation method I just said
-
is good enough. So let's just use that
particular method. Okay, so now, we've
-
converted our AES block cipher into a PRF
on t over two bits, say, on 21 bits. But
-
what we really want is a PRP. And so what
we'll do is we'll plug our new PRF, F prime,
-
directly into the Luby-Rackoff
construction. If you remember, this is
-
basically a Feistel construction. We saw
this when we talked about DES. It's a,
-
Luby-Rackoff used three rounds, and we
know that this converts a secure PRF into
-
a secure PRP on twice the block size. In
other words, we started from a secure PRF
-
on 21 bits, and that will give us, and
Luby-Rackoff will give us a secure PRF on
-
42 bits, which is what we wanted. Now, I
should tell you that the error parameters
-
in the Luby-Rackoff construction are not as
good as they could be. And, in fact, a
-
better thing to do is to use seven rounds
of Feistel. So in other words, we'll do
-
this seven times; every time we'll use a
different key. So you notice, here, we're
-
only using three keys. We should be using,
we should be doing this seven times using
-
seven different keys. And then there's a
nice result, due to Patarin, that
-
shows that the seven-round construction
basically has as good an error
-
terms as possible. So the error terms in
the security theorem will basically be two
-
to the T. Okay. So now we have a pseudo
random permutation that operates on close
-
power of two to the value of S. But that's
not good enough. We actually want to get a
-
pseudo random permutation on the set zero
to S minus one. So step two will take us
-
down from {0,1} to the T, to the actual
set zero to the S minus one that we're
-
interested in. And this construction is,
again, really, really cute, so let me show
-
you how it works. So, again, we're given
this PRP that operates on a power of two.
-
And we wanna go down to a PRP that
operates on something slightly smaller.
-
Okay, so here's on the construction works.
Basically we're given input X in the set
-
zero to S minus one that we want. And what
we're going to do is we're going to
-
iterate the following procedure again
and again. So first of all we map X into
-
this temporary variable Y. And now we're
just gonna encrypt the input X and put the
-
result into Y. If Y is inside of our
target set, we stop and we output Y. If
-
not we iterate this again and again and
again and again and again until finally Y
-
falls into our target set and then we
output that value. So in a picture, let me
-
explain how this works. So we start from a
point in our target set. And now, now we
-
apply the, the function E and we might be
mapped into this point outside our target
-
set, so we're not gonna stop. So now we
apply the function E again and we might
-
be mapped into this point and now we apply
the function E again and now all of a
-
sudden we map into this point, and then we
stop, and that's our output. Okay, so
-
that's how we map points between from zero
to S minus one, to other points in the
-
zero to S minus one. It should be pretty
clear that this is invertible. To invert,
-
all I'll do is I'll just decrypt and walk,
kind of, in the opposite direction. So
-
I'll decrypt, and then decrypt, and then
decrypt, until finally, I fall into the
-
set, zero to S minus one. And that gives
me the inverse of the point that I wanted
-
to. So this is, in fact, a PRP. The
question is whether it's a secure PRP, and
-
we'll see that in just a minute. But before
that, let me ask you, how many iterations
-
do you expect that we're gonna need? And
I wanna remind you again, before I ask you
-
that question, that, in fact, S is less than
two to the T, but is more than two to the T-1.
-
So in particular S is greater than two to
the T over two. And my question to you
-
now is how many iterations are we gonna
need, on average, until this process converges?
-
So the answer is we're going to need two iterations,
-
so really just a small
constant number of iterations. And so this
-
will give us a very, very efficient
mechanism that will move us down from
-
{0,1} to the T to zero to the S-1. So now
when we talk about security, it turns out
-
this step here is tight. In other words,
there is really no additional security
-
cost to going down from two to the T to
zero to S-1. And the reason that's true,
-
it's actually again a very cute theorem
to prove, but I, I won't do it here. Maybe
-
I'll leave it as an exercise for you guys
to argue. Which says that if you give me
-
any two sets Y and X, where Y is contained
inside of X, then applying the
-
transformation that we just saw to a
random permutation from X to X actually
-
gives a random permutation from Y to Y. So
this means that if we started with a truly
-
random permutation on X, you'll end up
with a truly random permutation on Y. And
-
since, well, the permutation we're
starting with is indistinguishable from
-
random on X, we'll end up with a
permutation that's indistinguishable from
-
random on Y. Okay, so this is a very tight
construction and as I said, the first step
-
actually is basically, the analysis is the
same as the Luby-Rackoff analysis. In
-
fact, it's better to use as I said, the
Patarin analysis using seven rounds. So
-
actually here, there's a bit of cost in
security. But, overall, we get a
-
construction that is a secure PRP for
actually, not such good security
-
parameters, but nevertheless, this is a
good secure PRP that we can construct that
-
actually will operate on the range zero to
S minus one. This in turn will allow us to
-
encrypt credit card numbers such that the
cipher text looks like a credit card
-
number. And again, I want to emphasize
that there's no integrity here. The best
-
we can achieve here is just
deterministic CPA security. No cipher text
-
integrity, and no randomness at all. Okay.
So this concludes this module. And as
-
usual I want to point to a few reading
materials that you can take a look at if
-
you're interested in learning more about
anything that we discussed in this module.
-
So first of all, the HKDF construction
that we talked about for key derivation is
-
described in a paper from 2010 by Hugo
Krawczyk. Deterministic encryption, The
-
SIV mode that we described is discussed in
this paper over here. The EME mode that we
-
described that allows us to build a, a
wider block pseudo random permutation is
-
described in this paper here by Halevi and
Rogaway. Tweakable block ciphers that
-
actually led to the XDS mode of operation
that's used for disk encryption is
-
described in this paper here. And finally,
a format preserving encryption is described
-
in this last paper that I point to over
here. Okay so this concludes this module.
-
And in the next module we gonna start
looking at how to do key exchange.