In this segment, I want to tell you about
another form of encryption, called format
preserving encryption. This is again
something that comes up in practice quite
often, especially when it comes to
encrypting credit card numbers. And, so,
let's see how it works. Remember that a
typical credit card number is sixteen
digits, broken into four blocks of four
digits each. You've probably heard before
that the first six digits are what's
called the BIN number, which represent the
issuer ID. For example, all credit cards
issued by VISA always start with the digit
four; all credit cards issued by
MasterCard start with digits 51 to 55, and
so on and so forth. The next nine digits
are actually the account number that is
specific to the, to the particular
customer and the last digit is a check sum
that's computed from the previous fifteen
digits. So there are about 20,000 BIN
numbers and so if you do the calculation
it turns out there are about 2 to the 42
possible credit card numbers which
corresponds to about 42 bits of
information that you need to encode if you
want to represent a credit card number
compactly. Now the problem is this.
Suppose we wanted to encrypt credit card
numbers, so that when the user swipes the
credit card number at the point of sale
terminal, namely at the, you know,
the merchant's cash register. The cash
register, this is called a point of sale
terminal, goes ahead and encrypts the
credit card number using a key k and
it's encrypted all the way until it goes
to the acquiring bank or maybe even beyond
that, maybe even all the way when it goes
to Visa. Now, the problem is that these
credit card numbers actually pass through
many, many, many processing points. All of
them expect to get basically something
that looks like a credit card number as a
result. So if we simply wanted to encrypt
something at the end point, at one end
point, and decrypt it at the other end
point, it's actually not that easy because
if all we did was encrypt it using AES,
even if we just used deterministic AES,
the output of the encrypted credit card
number would be 128 bit block. Sixteen
bytes that would be, that would need to be
sent from one system to the next, until it
reaches its destination. But each one of
these processors, then, would have to be
changed, because they're all expecting to
get credit card numbers. And so the
question is, can we encrypt at the cash
register, such that the resulting
encryption itself looks like a credit card
number. And as a result, none of these
intermediate systems would have to be
changed. Only the endpoints would have to
be changed. And in doing so we would
actually obtain end-to-end encryption
without actually having to change a lot of
software along the way. So the question
then is, again, can we have this mechanism
called format preserving encryption, where
encrypting a credit card itself produces
something that looks like a credit card?
So that's our goal. The encrypted credit
card number should look like a regular
valid credit card number. So this is the
goal of format preserving encryption. More
abstractly what it is we're trying to do,
is basically build a pseudo random
permutation on the set zero to S minus one
for any given S. So for the set of credit
card numbers, S would be roughly, you
know, two to the 42. In fact, it's gonna
be, not exactly two to the 42. It's gonna
be some funny numbers that's around two to
the 42. And our goal is to build a PRP
that acts exactly on the interval, zero to
S minus one. And what we're given as input
is some PRF, say, something like AES, that
acts on much larger blocks, say, acts on
sixteen byte blocks. So we're trying to,
in some sense, shrink the domain of the
PRF to make it fit the data that we're
given. And the question is basically how
to do that. Well, once we have such a
construction we can easily use it to
encrypt credit card numbers. What we would
do is we would take the, a given credit
card number. We would encode it such that
now it's represented as a number between
zero and the total number of credit card
numbers. Then we would apply our PRP so we
would encrypt this credit card number, you
know, using construction number two from
the deterministic encryption segment. And
then we would map the final number back to
be a regular, to look like val--, regular,
valid credit card number and then send
this along the way. So this is, this
is again non expanding encryption. The
best we can do, as we said before is to
encrypt using a PRP except again the
technical challenge is we need a PRP that
acts on this particular funny looking set
from zero to S-1 for this prespecified
value of S. So my goal is to show you how
to construct this and once we see how to
do it, you will also know how to encrypt
credit card number so that the resulting
cipher text is itself a credit card
number. So the construction works in two
steps. The first thing we do is we shrink
our PRF from the set {0,1} to the n, say
{0,1} to the 128 in the case of AES,
to {0,1} to the t where t is the
closest power of two to the value S.
So say S is some crazy number around two to
the 41. T would basically be then 42
because that's the closest power of two
that's just above the value S. So the
construction is basically gonna use the
Luby-Rackoff construction. What we need
is a PRF F prime that acts on blocks of
size t over two. So imagine for example
in the credit card case, t would be 42
because two to the 42 we said is the
closest power of two that's bigger than,
than, than S. S is the number of credit,
total number of credit cards. Then we would
need a PRF now that acts on 21-bit
inputs. So one way to do that is simply to
take the AES block cipher, treat it as a
PRF on 128-bit inputs, and then simply
truncate the output to the least
significant 21 bits. Okay, so we were
given a 21 bit value. We would append
zeros to it so that we get 128 bits as a
result. We would apply AES to that and
then we would truncate back to 21 bits.
Now I should tell you that that's actually
a simple way to do it but it's actually
not the best way to do it. There are
actually better ways to truncate a PRF on
n bits to a PRF on t over two bits.
But for our pur--, for the purposes of this
segment, the truncation method I just said
is good enough. So let's just use that
particular method. Okay, so now, we've
converted our AES block cipher into a PRF
on t over two bits, say, on 21 bits. But
what we really want is a PRP. And so what
we'll do is we'll plug our new PRF, F prime,
directly into the Luby-Rackoff
construction. If you remember, this is
basically a Feistel construction. We saw
this when we talked about DES. It's a,
Luby-Rackoff used three rounds, and we
know that this converts a secure PRF into
a secure PRP on twice the block size. In
other words, we started from a secure PRF
on 21 bits, and that will give us, and
Luby-Rackoff will give us a secure PRF on
42 bits, which is what we wanted. Now, I
should tell you that the error parameters
in the Luby-Rackoff construction are not as
good as they could be. And, in fact, a
better thing to do is to use seven rounds
of Feistel. So in other words, we'll do
this seven times; every time we'll use a
different key. So you notice, here, we're
only using three keys. We should be using,
we should be doing this seven times using
seven different keys. And then there's a
nice result, due to Patarin, that
shows that the seven-round construction
basically has as good an error
terms as possible. So the error terms in
the security theorem will basically be two
to the T. Okay. So now we have a pseudo
random permutation that operates on close
power of two to the value of S. But that's
not good enough. We actually want to get a
pseudo random permutation on the set zero
to S minus one. So step two will take us
down from {0,1} to the T, to the actual
set zero to the S minus one that we're
interested in. And this construction is,
again, really, really cute, so let me show
you how it works. So, again, we're given
this PRP that operates on a power of two.
And we wanna go down to a PRP that
operates on something slightly smaller.
Okay, so here's on the construction works.
Basically we're given input X in the set
zero to S minus one that we want. And what
we're going to do is we're going to
iterate the following procedure again
and again. So first of all we map X into
this temporary variable Y. And now we're
just gonna encrypt the input X and put the
result into Y. If Y is inside of our
target set, we stop and we output Y. If
not we iterate this again and again and
again and again and again until finally Y
falls into our target set and then we
output that value. So in a picture, let me
explain how this works. So we start from a
point in our target set. And now, now we
apply the, the function E and we might be
mapped into this point outside our target
set, so we're not gonna stop. So now we
apply the function E again and we might
be mapped into this point and now we apply
the function E again and now all of a
sudden we map into this point, and then we
stop, and that's our output. Okay, so
that's how we map points between from zero
to S minus one, to other points in the
zero to S minus one. It should be pretty
clear that this is invertible. To invert,
all I'll do is I'll just decrypt and walk,
kind of, in the opposite direction. So
I'll decrypt, and then decrypt, and then
decrypt, until finally, I fall into the
set, zero to S minus one. And that gives
me the inverse of the point that I wanted
to. So this is, in fact, a PRP. The
question is whether it's a secure PRP, and
we'll see that in just a minute. But before
that, let me ask you, how many iterations
do you expect that we're gonna need? And
I wanna remind you again, before I ask you
that question, that, in fact, S is less than
two to the T, but is more than two to the T-1.
So in particular S is greater than two to
the T over two. And my question to you
now is how many iterations are we gonna
need, on average, until this process converges?
So the answer is we're going to need two iterations,
so really just a small
constant number of iterations. And so this
will give us a very, very efficient
mechanism that will move us down from
{0,1} to the T to zero to the S-1. So now
when we talk about security, it turns out
this step here is tight. In other words,
there is really no additional security
cost to going down from two to the T to
zero to S-1. And the reason that's true,
it's actually again a very cute theorem
to prove, but I, I won't do it here. Maybe
I'll leave it as an exercise for you guys
to argue. Which says that if you give me
any two sets Y and X, where Y is contained
inside of X, then applying the
transformation that we just saw to a
random permutation from X to X actually
gives a random permutation from Y to Y. So
this means that if we started with a truly
random permutation on X, you'll end up
with a truly random permutation on Y. And
since, well, the permutation we're
starting with is indistinguishable from
random on X, we'll end up with a
permutation that's indistinguishable from
random on Y. Okay, so this is a very tight
construction and as I said, the first step
actually is basically, the analysis is the
same as the Luby-Rackoff analysis. In
fact, it's better to use as I said, the
Patarin analysis using seven rounds. So
actually here, there's a bit of cost in
security. But, overall, we get a
construction that is a secure PRP for
actually, not such good security
parameters, but nevertheless, this is a
good secure PRP that we can construct that
actually will operate on the range zero to
S minus one. This in turn will allow us to
encrypt credit card numbers such that the
cipher text looks like a credit card
number. And again, I want to emphasize
that there's no integrity here. The best
we can achieve here is just
deterministic CPA security. No cipher text
integrity, and no randomness at all. Okay.
So this concludes this module. And as
usual I want to point to a few reading
materials that you can take a look at if
you're interested in learning more about
anything that we discussed in this module.
So first of all, the HKDF construction
that we talked about for key derivation is
described in a paper from 2010 by Hugo
Krawczyk. Deterministic encryption, The
SIV mode that we described is discussed in
this paper over here. The EME mode that we
described that allows us to build a, a
wider block pseudo random permutation is
described in this paper here by Halevi and
Rogaway. Tweakable block ciphers that
actually led to the XDS mode of operation
that's used for disk encryption is
described in this paper here. And finally,
a format preserving encryption is described
in this last paper that I point to over
here. Okay so this concludes this module.
And in the next module we gonna start
looking at how to do key exchange.