-
In this segment we will look at how to use
block ciphers to encrypt multiple messages
-
using the same key. This comes up in
practice for example in file systems where
-
the same key's used to encrypt multiple
files. It comes up in networking protocols
-
where the same key is used to encrypt
multiple packets. So let's see how to do
-
it. The first thing we need to do is to
define what is mean for a cipher to be
-
secure when the same key is used to
encrypt multiple messages. When we use the
-
key more than once the result of that is
that the adversary gets to see many cyber
-
text encrypted using the same key. As a
result, when we define security, we're
-
gonna allow the adversary to mount what's
called a chosen plain text attack. In
-
other words, the adversary can obtain the
encryption of arbitrary messages of his
-
choice. So, for example, if the
adversary's interacting with Alice. The
-
adversary can ask Alice to encrypt
arbitrary messages of the adversary's
-
choosing. And Alice will go ahead and
encrypt those messages and give the
-
adversary the resulting cipher texts. You
might wonder why would Alice ever do this.
-
How could this possibly happen in real
life? But it turns out this is actually
-
very common in real life. And in fact,
this modeling is quite a conservative
-
modeling of real life. For example, the
adversary might send Alice an email. When
-
Alice receives the email, the writes it to
her encrypted disk, thereby encrypting the
-
adversary's email using her secret key. If
later the adversary steals this disc, then
-
he obtains the encryption of an email that
he sent Alice under Alice's secret key. So
-
that's an example of a chosen plain text
attack, where the adversary provided Alice
-
with a message and she encrypted that
message using her own key. And then later
-
the attacker was able to obtain the
resulting cipher text. So that's the
-
adversary's power. And then the
adversary's goal is basically to break
-
semantic security. So let's define this
more precisely. As usual, we're gonna
-
define semantic security under a chosen
plain text attack using two experiments,
-
experiment zero and experiment one, that
are modeled as a game between a challenger
-
and an adversary. When the game begins,
the challenger is gonna choose a random
-
key K. And now the adversary basically
gets to query the challenger. So the
-
adversary now begins by submitting a
semantic security query, namely, he
-
submits two messages, M zero and M one. I
added another index, but let me ignore
-
that extra index for a while. So the
adversary submits two messages, M zero and
-
M one, that happen to be of the same
length. And then the adversary receives
-
the encryption of one of those messages,
either of M zero or of M one. In
-
experiment zero, he receives the
encryption of M zero. In experiment one,
-
he receives the encryption of M one. So,
so far this would look familiar this looks
-
exactly like a standard semantic security
[inaudible]. However, plain text attack
-
the adversary can now repeat this query
again. So now you can issue a query with
-
two other plain texts, again of the same
length, and again you would receive the
-
encryption of one of them. In experiment
zero you would receive the encryption of M
-
zero. In experiment one you would receive
the encryption of M one. And the attacker
-
can continue issuing queries like this. In
fact we'll say that he can issue up to Q
-
queries of this type. And then, remember,
every time he issues a pair of messages.
-
That happen to be of the same length and
every time he either gets the encryption
-
of the left side or the right side again
in experiment zero he will always get the
-
encryption of the left message in
experiment one he will always get the
-
encryption of the left message. And, then
adversary's goal is, basically, to figure
-
out whether he's in experimental zero or
in experiment one. In other words, whether
-
he was constantly receiving the encryption
of the left message or the encryption of
-
the right message. So, in some sense, this
is a standard semantic security game just
-
iterated over many queries that the
attacker can issue to adaptively one after
-
the other. Now the chosen plain text
attack is captured by the fact that if the
-
attacker wants the encryption of a
particular message m. What he could do is,
-
for example, use query J for sum J, where
in this query J he'll set both the zero
-
message and the one message to be the
exactly same message M. In other words,
-
both the left message and the right
message are the same, and both are set to
-
the message M. In this case, what he will
receive, since both messages are the same,
-
he knows that he's gonna receive the
encryption of this message M that he was
-
interested in. So this is exactly what we
meant by a chosen [inaudible] attack.
-
Where the advisory can submit a message m
and receive the encryption of that
-
particular message m of his choice. So
some of his queries might be of this chose
-
plain text flavor where the message on the
left is equal to the message on the right,
-
but some of the queries might be standard
semantic security queries where the two
-
messages are distinct and that actually
gives him information on whether he's in
-
experiment zero or in experiment one. Now
by now you should be used to this
-
definition where we say that the system is
semantically secure under a chosen plain
-
text attack. If, for all efficient
adversaries, they cannot distinguish
-
experiment zero from experiment one. In
other words, the probability that, at the
-
end, the output, B Prime, which we're
gonna denote by the output of experiment
-
B. This output will be the same whether
[inaudible] experiment zero or experiment
-
one. So the attacker couldn't distinguish
between always receiving encryptions of
-
the left messages, versus always receiving
encryptions of the right messages. So in
-
your mind, I'd like you to be thinking of
an adversary that is able to mount a
-
chosen plaintext attack, namely, be given
the encryption of arbitrary messages of
-
his choice, and his goal is to break
semantic security for some other challenge
-
cipher texts. And as I said in this
[inaudible] model of the real world the
-
attacker is able to fool Alice into
encrypting for him messages of his choice
-
and then the attacker's goal is to somehow
break some challenge cypher text. So I
-
claim that all the ciphers that we've seen
up until now, namely deterministic counter
-
mode or the one time pad, are insecure
under a chosen plain text attack. More
-
generally, suppose we have an encryption
scheme that always outputs the same cipher
-
text for a particular message M. In other
words, if I ask the encryption scheme to
-
encrypt the message M once. And then I ask
the encryption scheme to encrypt the
-
message m again. If in both cases the
encryption scheme outputs the same cypher
-
text, then that system cannot possibly be
secure under a chosen plain text attack.
-
And both deterministic counter mode and
the one time pad were of that flavor. They
-
always output the same cipher text, given
the same message. And so let's see why
-
that cannot be chosen plain text secure.
And the attack is fairly simple, what the
-
attacker is gonna do, is he's gonna output
the same message twice. This just says.
-
That he really wants the encryption of M0.
So here the attacker is given C0 which is
-
the encryption of M0. So this was his
chosen plain text query where he actually
-
received the encryption of the message M0
of his choice. And now he's going to break
-
semantic security. So what he does is he
outputs two messages, M0 and M1 of the
-
same length, and he's going to be given
the encryption of MB. But low and behold,
-
we said that the encryption system. Always
outputs the same cipher text when its
-
encrypting the message, M0. Therefore, if
B is=to zero, we know that C, this
-
challenged cipher text, is simply=to CO,
because it's the encryption of M0.
-
However, if B is=to one. Then we know that
this challenge cypher text is the
-
encryption of M1 which is something other
than C zero so all the attacker does is he
-
just checks his C is = to C0 the output's
zero in other words he outputs one. So, in
-
this case, the attacker is able to
perfectly guess this bit B, so he knows
-
exactly [inaudible] given the encryption
of M0, or the encryption of M1. And as a
-
result, his advantage in winning this game
is one. Meaning that the system cannot
-
possibly be CPA secure. One is not a
negligible number. So this shows that the
-
deterministic encryption schemes cannot
possibly be CPA-secure, but you might
-
wonder well, what does this mean in
practice? Well in practice this means
-
again that every message is always
encrypted to the same cipher text. What
-
this means is if you're encrypting files
on disk, and you happen to be encrypting
-
two files that happen to be the same, they
will result in the same cipher text and
-
then the attacker by looking at the
encrypted disk, will learn that these two
-
files actually contain the same content.
The attacker might not learn what the
-
content is, but he will learn that these
two encrypted files are an encryption of
-
the same content and he shouldn't be able
to learn that. Similarly, if you send two
-
encrypted packets on the network that
happen to be the same, the attacker will
-
not learn the content of those packets,
but he will learn that those two packets
-
actually contain the same information.
Think for example of an encrypted voice
-
conversation. Every time there's quiet on
the line, the system will be sending
-
encryptions of zero. But since encryption
of zero are always mapped to the same
-
cypher text. An attacker looking at the
network will be able to identify exactly
-
the points in the conversation where
there's quiet because he will always see
-
those exact same cypher text every time.
So these are examples where deterministic
-
encryption cannot possibly be secure. And
as I say formerly we say that the
-
terministic encryption can not be
semantically secure under a chosen plain
-
text attack. So what do we do, well the
lesson here is if the secret keys gonna be
-
used to encrypt multiple messages, it had
better be the case that given the same
-
plain text to encrypt twice. The
encryption algorithm must produce
-
different cipher texts. And so there are
two ways to do that. The first method is
-
what's called randomized encryption. Here,
the encryption algorithm itself is going
-
to choose some random string during the
encryption process and it is going to
-
encrypt the message M using that random
string. So what this means is that a
-
particular message, M0 for example, isn't
just going to be mapped to one cipher text
-
but it's going to be mapped to a whole
ball of cipher texts. Whereon every
-
encryption, basically, we output one point
in this ball. So every time we encrypt, the
-
encryption algorithm chooses a random
string, and that random string leads to
-
one point in this ball. Of course, the
decryption algorithm, when it takes any
-
point in this ball, will always map the
result to M zero. Similarly cipher text M
-
one will be mapped to a ball, and every
time we encrypt M one, we basically output
-
one point in this ball. And these balls
have to be disjoint, so that the
-
encryption algorithm, when it obtains a
point in the ball corresponding to M one,
-
will always output the message M one. In
this way, since the encryption algorithm
-
uses randomness, if we encrypt the same
message twice, with high probability we'll
-
get different cipher texts. Unfortunately
this means that the cipher text
-
necessarily has to be longer than the
plain text because somehow the randomness
-
that was used to generate the cipher text
is now encoded somehow in the cipher text.
-
So the cipher text takes more space. And
roughly speaking, the cipher text size is
-
going to be larger than the plain text. By
basically the number of random bits that
-
were used during encryption. So if the
plain texts are very big, if the plain
-
texts are gigabytes long, the number of
random bits is going to be on the order of
-
128. So maybe this extra space doesn't
really matter. But if the plain texts are
-
very short, maybe they themselves are 128
bits, then adding an extra 128 bits to
-
every cipher text is going to double the
total cipher text size. And that could be
-
quite expensive. So as I say randomized
encryption is a fine solution but in some
-
cases it actually introduces quite a bit
of costs. So let's look at a simple example.
-
So imagine we have a pseudorandom
function that takes inputs in a certain
-
space r which is gonna be called a nonce
space. And outputs, outputs in the message
-
space. And, now, let's define the
following randomize encryption scheme
-
where we want to encrypt the message m
with the encryption of whatever it's gonna
-
do is first it's gonna generate a random r
in this nonce space R. And then it's going
-
to open a cypher text that consist of two
components, the first component is going
-
to be this value R and the second
component is going to be an evaluation of
-
the pseudo-random function at the point R
XOR with the message M. And my question to
-
you is, is this encryption system
semantically secure under a chosen plain
-
text attack. So the correct answer is yes.
But only if the nonce space R is large
-
enough so that little R never repeats with
very, very high probability. And let's
-
quickly argue why that's true. So first of
all, because F is a secure pseudo-random
-
function, we might as well replace it with
a truly random function. In other words,
-
this is indistinguishable from the case
where we encrypt the message M, using the
-
truly random function little F, evaluated
to point R, and then XOR with M.
-
But since this little r never repeats every
cypher text uses a different little r what
-
this means is that the values of F(r)
are random uniform independent strings
-
every time. So every time we encrypt a
message, we encrypt it essentially using a
-
new uniform random one time pad. And since
XORing a uniform string with any string
-
simply generates a new uniform string, the
resulting cipher text is distributed as
-
simply two random uniform strings. I'll
call them r and r prime. And so both in
-
experiment zero and in experiment one, all
the attacker gets to see are truly uniform
-
random strings r, r', and since in both
experiments the attacker is seeing the same
-
distribution, he cannot distinguish the
two distributions. And so since security
-
holds completely when we're using a truly
random function it's also gonna hold when
-
we're using a pseudorandom function. Okay,
so this is a nice example of how we use
-
the fact that the pseudo random function
behaves like a random function to argue
-
security of this particular encryption
scheme. Okay, so now we have a nice
-
example of randomized encryption. The
other approach to building chosen plain
-
text secure encryption schemes is what's
called a nonce based encryption. Now, in
-
a non-spaced encryption system, the
encryption algorithm actually takes three
-
inputs rather than two. As usual it takes
the key and the message. But it also takes
-
an additional input called a nonce. And
similarly, the decryption algorithm also
-
takes the nonce as input, and then produces
the resulting decrypted plain text. And
-
what is this nonce value n. This nonce is
a public value. It does not need to be
-
hidden from the adversary but the only
requirement is that the pair (k,n)
-
is only used to encrypt a single
message. In other words, this pair (k,n)
-
must change from message to message. And
there are two ways to change it. One way
-
to change it is by choosing a new random
key for every message. And the other way
-
is to keep using the same key all the time
but then we must choose a new nonce for
-
every message. And, and as I said, I wanna
emphasize again, this nonce need not
-
be secret, and it need not be random. The
only requirement is the nonce is unique.
-
And in fact, we're gonna use this
term throughout the course. A nonce
-
for us, means a unique value that doesn't
repeat. It does not have to be random. So
-
let's look at some examples of choosing an
nonce, well the simplest option is
-
simply to make the nonce of the
accounter so for example the networking
-
protocol you can imagine the nonce
being a packet counter that's incremented
-
every time a packet is sent by a sender or
received by the receiver this means that
-
the encrypter has to keep state from
message to message mainly that he has to
-
keep this counter around and increment it
after every message is transmitted.
-
Interestingly, if the decrypter actually
has the same state then there is no need
-
to include the nuance in the cipher text
since the nuance is implicit. Let's look
-
at an example. The https protocol is run
over a reliable transport mechanism which
-
means that packets sent by the sender are
assumed to be received in order at a
-
recipient. So if the sender sends packet #5 and then packet #6, the recipient
-
will receive packet #5 and then
packet #6 in that order. This
-
means that if the sender maintains a
packet counter, the recipient can also
-
maintain a packet counter and two counters
basically increment in sync. In this case
-
there is no reason to include the
nonce in the packets because the
-
nonce is implicit between the two
sides. However, in other protocols, for
-
example, in IPsec, IPsec has a protocol
designed to encrypt the IP layer. The IP
-
layer does not guarantee in order
delivery. And so the sender might send
-
packet #5 and then packet #6, but
those will be received in reverse order at
-
the recipient. In this case it's still
fine to use a packet counter as a nonce
-
but now the nonce has to be included in
the packet so that the recipient knows
-
which nonce to use to decrypt the
received packet. So as I say, nonce based
-
encryption is a very efficient way to
achieve CPA security. In particular if the
-
nonce is implicit, it doesn't even
increase the cipher text length. Of course
-
another method to generate a unique nonce
is simply to pick the nonce at random
-
assuming the nonce space is sufficiently
large so that with high probability the
-
nonce will never repeat for the life of
the key. Now in this case, nonce
-
based encryption simply reduces to
randomized encryption. However, the
-
benefit here is that the sender does not
need to maintain any state from message to
-
message. So this is very useful for
example if encryption happens to take
-
place on multiple devices. For example, I
might have both a laptop and a smart
-
phone. They might both use the same key.
But in this case if I require state full
-
encryption, then my laptop and the
smartphone would have to coordinate to
-
make sure that they never reuse the same
nonces. Whereas if both of them simply take
-
nonces at random, they don't need to
coordinate because it was very high
-
probability they'll simply never choose
the same nonce. Again assuming the nonce
-
space is big enough. So there are some
cases where stateless encryption is quite
-
important, in particular where the same
key is used by multiple machines. So I
-
wanted to find, more precisely, what
security means for nonce based
-
encryption. And in particular, I want to
emphasize that the system must remain
-
secure when the nonce are chosen by
the adversary. The reason it's important
-
to allow the adversary to choose the
nonces is because the adversary can
-
choose which cipher text it wants to
attack. So imagine the nonce happens to
-
be a counter and it so happens that when
the couter hits the value fifteen, maybe
-
at that point it's easy for the adversary
to break semantic security. So the
-
adversary will wait until the fifteenth
packet is sent and only then he will ask
-
to break semantic security. So when we
talk about nonce based encryption, we
-
generally allow the adversary to choose
the nonce and the system should remain
-
secure even under those settings. So let's
define the CPA game in this case and it's
-
actually very similar to the game before.
Basically the attacker gets to submit
-
pairs of messages MI, MI0, and MI1.
Obviously they both have to be of the same
-
length. And he gets to supply the nonce.
And in response, the adversary is given
-
the encryption of either MI0, or MI1. But
using the nonce that the adversary
-
chose. And of course, as usual, the
adversary's goal is to tell whether he was
-
given the encryption of the left plain
text or the right plain text. And as
-
before the adversary gets to iterate these
queries and he can issue as, as many
-
queries as he wants, we usually let q
denote the number of queries that the
-
adversary issues. Now the only restriction
of course, which is crucial, is that
-
although the adversary gets to choose the
nonces, he's restricted to choosing
-
distinct nonces. The reason we force him
to choose distinct nonces is because
-
that's the requirement in practice. Even
if the adversary fools Alice into
-
encrypting multiple messages for him,
Alice will never use the same nonce
-
again. As a result, the adversary will
never see messages encrypted using the
-
same nonce and therefore, even in the
game, we require that all nonce be
-
distinct. And then as usual we say that
the system is a nonce based encryption
-
system that's, semantically secure under a
chosen plain text attack if the adversary
-
cannot distinguish experiment zero where
he's given encryptions of the left
-
messages from experiment one where he's
given encryptions of the right messages.
-
So let's look at an example of a nonce
based encryption system. As before, we
-
have a secure PRF that takes inputs in the
nonce space R and outputs strings in the
-
message space M. Now when a new key is
chosen, we're going to reset our counter R
-
to be zero. And now we encrypt the
particular message M, what we will do is
-
we will increment our counter R, and then
encrypt the message M using the
-
pseudorandom function applied to this
value R. And as before, the cipher text is
-
going to contain two components, our
current value of the counter and then the
-
one time pad encryption of the message M.
And so my question to you is whether this
-
is a secure, non-spaced encryption system.
So the answer as before is yes, but only
-
if the nuance space is large enough. So as
we increment the counter R, it will never
-
cycle back to zero so that the nuances
will always, always be unique. We argue
-
security the same way as before. Because
the PRF is secure, we know that this
-
encryption system is indistinguishable
from using a truly random function. In
-
other words, if we apply a truly random
function to the counter and XOR the
-
results with, the plain text M. But now
since the nuance R never repeats, every
-
time we compute this F of R, we get a
truly random uniform and independent
-
string so that we're actually encrypting
every message using the one time pad. And
-
as a result, all the adversary gets to see
in both experiments are basically just a
-
pair of random strings. So both the
experiment zero and experiment one the
-
adversary get's to see exactly the same
distribution, namely, the responses to all
-
this chosen plain text queries are just
pairs of strings that are just uniformly
-
distributed and this is basically the same
in experiment zero and experiment one and,
-
therefore, the attacker cannot distinguish
the two experiments. And since he cannot
-
win the semantic security game with a
truly random function he, also, cannot win
-
the semantics security game with the
secure PRF, and, therefore, the scheme is
-
secure. So now we understand what it means
for a symmetric system to be secure when
-
the keys used to encrypt multiple messages
the requirement is that it be secure under
-
a chosen plan of attack. And we said that
basically, the only way to be secure under
-
a chosen plain text attack is either to
use randomized encryption, or to use, use
-
nonce spaced encryption where the
nonce never repeats. And then in the
-
next two segments, we're gonna build two
classic encryption systems that are secure
-
when the key is used multiple times.