-
My goal for the next few segments is to
show you that if we use a secure PRG We'll
-
get a secure stream safer. The first thing
we have to do is define is, what does it
-
mean for a stream safer to be secure? So
whenever we define security we always
-
define it in terms of what can the
attacker do? And what is the attacker
-
trying to do? In the context of
stream ciphers remember these are only used
-
with a onetime key, and as a result the
most the attacker is ever going to see is
-
just one cypher text encrypted using the
key that we're using. And so we're gonna
-
limit the attackers? ability to basically
obtain just one cypher text. And in
-
fact later on we're going to allow the
attacker to do much, much, much more, but
-
for now we're just gonna give him one
cypher text. And we wanted to find,
-
what does it mean for the cypher to
be secure? So the first definition that
-
comes to mind is basically to say, well
maybe we wanna require that the attacker
-
can't actually recover the secret key.
Okay so given ciphertext you
-
shouldn't be able to recover the secret
key. But that's a terrible definition
-
because think about the falling brilliant
cipher but the way we encrypt the
-
message using a key K is basically
we just output the message. Okay this
-
is a brilliant cipher yeah of course it
doesn't do anything given a message just
-
output a message as the cipher text.
This is not a particularly good encryption
-
scheme; however, given the cipher text,
mainly the message the poor attacker
-
cannot recover the key because he doesn't
know what the key is. And so as a result
-
this cipher which clearly is insecure,
would be considered secure under this
-
requirement for security. So this
definition will be no good. Okay so it's
-
recovering the secret key is the wrong way
to define security. So the next thing we
-
can kinda attempt is basically just say,
well maybe the attacker doesn't care about
-
the secret key, what he really cares about
are the plain text. So maybe it should be
-
hard for the attacker to recover the
entire plain text. But even that doesn't
-
work because let's think about the
following encryption scheme. So suppose
-
what this encryption scheme does is it
takes two messages, so I'm gonna use two
-
lines to denote concatenation of two
messages M0 line, line M1 means
-
concatenate M0 and M1. And imagine
what the scheme does is really it outputs
-
M0 in the clear and concatenate to
that the encryption of M1. Perhaps even
-
using the One Time Pad okay? And
so here the attacker is gonna be given one
-
cipher text. And his goal would be to
recover the entire plain texts. But the
-
poor attacker can't do that because here
maybe we've encrypted M1 using the One
-
Time Pad so the attacker can't
actually recover M1 because we know the
-
One Time Pad is secure given just
one cipher text. So this construction
-
would satisfy the definition but
unfortunately clearly this is not a secure
-
encryption scheme because we just leaked
half of the plain text. M0 is
-
completely available to the attacker so
even though he can't recover all of the
-
plain text he might be able to recover
most of the plain text, and that's clearly
-
unsecured. So of course we already know
the solution to this and we talked about
-
Shanon definition of security perfect
secrecy where Shannon's idea was that in
-
fact when the attacker intercepts a cipher
text he should learn absolutely no
-
information about the plain texts. He
shouldn't even learn one bit about the
-
plain text or even he shouldn't learn, he
shouldn't even be able to predict a little
-
bit about a bid of the plain text.
Absolutely no information about the plain text.
-
So let's recall very briefly
Shannon's concept of perfect secrecy
-
basically we said that you know given a
cipher We said the cipher has perfect
-
secrecy if given two messages of the same
length it so happens that the distribution
-
of cipher texts. Yet if we pick a random
key and we look into distribution of
-
cipher texts we encrypt M0 we get
exactly the same distribution as when we
-
encrypt M1. The intuition here was that if
the advisory observes the cipher texts
-
then he doesn't know whether it came from
the distribution the result of encrypting
-
M0 or it came from the distribution as
the result of encrypting M1. And as a
-
result he can't tell whether we encrypted
M0 or M1. And that's true for all
-
messages of the same length and as a
result a poor attacker doesn't really know
-
what message was encrypted. Of course we
already said that this definition is too
-
strong in the sense that it requires
really long keys if cipher has short
-
keys can't possibly satisfy this
definition in a particular stream ciphers
-
can satisfy this definition. Okay so let's
try to weaken the definition a little bit
-
and let's think to the previous segment,
and we can say that instead of requiring
-
the two distributions to be absolutely
identical what we can require is, is that
-
two distributions just be computationally
indistinguishable. In other words a poor,
-
efficient attacker cannot distinguish the
two distributions even though the
-
distributions might be very, very, very
different. That just given a sample for
-
one distribution and a sample for another
distribution the attacker can't tell which
-
distribution he was given a sample from.
It turns out this definition is actually
-
almost right, but it's still a little too
strong, that still cannot be satisfied, so
-
we have to add one more constraint, and
that is that instead of saying that this
-
definition should have hold for all M0 and M1. It is to hold for only pairs M0 M1
-
that the attackers could
actually exhibit. Okay so this actually
-
leads us to the definition of semantics
security. And so, again this is semantics
-
security for one time key in other words
when the attacker is only given one cipher
-
text. And so the way we define semantic
security is by defining two experiments,
-
okay we'll define experiment 0 and
experiment 1. And more generally we will
-
think of these as experiments parentheses
B, where B can be zero or one. Okay so the
-
way the experiment is defined is as
follows, we have an adversary that's
-
trying to break the system. An adversary
A, that's kinda the analog of statistical
-
tests in the world of pseudo random
generators. And then the challenger does
-
the following, so really we have two
challengers, but the challengers are so
-
similar that we can just describe them as
a single challenger that in one case takes
-
his inputs bits set to zero, and the other
case takes his inputs bits set to
-
one. And let me show you what these
challengers do. The first thing the
-
challenger is gonna do is it's gonna pick
a random key and then the adversary
-
basically is going to output two messages
M0 and M1. Okay so this is an explicit
-
pair of messages that the attacker wants
to be challenged on and as usual we're not
-
trying to hide the length of the messages,
we require that the messages be equal
-
length. And then the challenger basically
will output either the encryption of M0
-
or the encryption of M1, okay so in
experiment 0 the challenger will output
-
the encryption of M0. In experiment 1 the challenger will output the encryption
-
of M1. Okay so that the difference between
the two experiments. And then the
-
adversary is trying to guess basically
whether he was given the encryption of M0
-
or given the encryption of M1. Okay so
here's a little bit of notation let's
-
define the events Wb to be the events that
an experiment B the adversary output one.
-
Okay so that is just an event that
basically in experiment zero W0 means that
-
the adversary output one and in experiment
one W1 means the adversary output one. And
-
now we can define the advantage of this
adversary, basically to say that this is
-
called the semantics security advantage of
the adversary A against the scheme E,
-
to be the difference of the probability of
these two events. In other words we are
-
looking at whether the adversary behaves
differently when he was given the
-
encryption of M0 from when he was given
the encryption of M1. And I wanna make
-
sure this is clear so I'm gonna say it one
more time. So in experiment zero he was
-
given the encryption of M0 and in
experiment one he was given the encryption
-
of M1. Now we're just interested in
whether the adversary output 1 or not.
-
... In these experiments. If in both
experiments the adversary output 1 with
-
the same probability that means the
adversary wasn't able to distinguish the
-
two experiments. Experiments zero
basically looks to the adversary the same
-
as experiment one because in both cases
upload one with the same probability.
-
However if the adversary is able to output
1 in one experiment with significantly
-
different probability than in the other
experiment, then the adversary was
-
actually able to distinguish the two
experiments. Okay so... To say this more
-
formally, essentially the advantage again
because it's the difference of two
-
probabilities the advantage is a number
between zero and one. If the advantage is
-
close to zero that means that the
adversary was not able to distinguish
-
experiment zero from experiment one.
However if the advantage is close to one,
-
that means the adversary was very well
able to distinguish experiment zero from
-
experiment one and that really means that
he was able to distinguish an encryption
-
of M0 from an encryption of M1, okay?So that's out definition. Actually
-
that is just the definition of the
advantage and the definition is just what
-
you would expect basically we'll say that
a symmetric encryption scheme is
-
semantically secure if for all efficient
adversaries here I'll put these in quotes
-
again, "For all efficient adversaries the
advantage is negligible." In other words,
-
no efficient adversary can distinguish the
encryption of M0 from the encryption
-
of M1 and basically this is what it
says repeatedly that for these two
-
messages that the adversary was able to
exhibit he wasn't able to distinguish
-
these two distributions. Now I want to
show you this, this is actually a very
-
elegant definition. It might not seem so
right away, but I want to show you some
-
implications of this definition and you'll
see exactly why the definition is the way
-
it is. Okay so let's look at some
examples. So the first example is suppose
-
we have a broken encryption scheme, in
other words suppose we have an adversary A
-
that somehow given the cipher text he is
always able to deduce the least
-
significant bit of the plain text. Okay so
given the encryption of M0, this adversary
-
is able to deduce the least significant
bit of M0. So that is a terrible
-
encryption scheme because it basically
leaks the least significant bit of the
-
plain text just given the cipher text. So
I want to show you that this scheme is
-
therefore semantically secure so that kind
of shows that if a system is semantically
-
secure than there is no attacker of this
type. Okay so let's see why is the system
-
not semantically secure well so what we
are gonna do is we're gonna basically use
-
our adversary who is able to learn our
most insignificant bits, we're going to
-
use him to break semantic security so
we're going to use him to distinguish
-
experiment zero from experiment one Okay
so here is what we are going to do. We are
-
algorithm B, we are going to be algorithm
B and this algorithm B is going to use
-
algorithm A in its belly. Okay so the
first thing that is going to happen is of
-
course the challenger is going to choose a
random key. The first thing that is going
-
to happen is that we need to output two
messages. The messages that we are going
-
to output basically are going to have
differently significant bits. So one
-
message is going to end with zero and one
message is going to end with one. Now what
-
is the challenger going to do? The
challenger is going to give us the
-
encryption of either M0 or M1,
depending on whether in experiment 0 or
-
in experiment 1. And then we just
forward this cipher text to the adversary
-
okay so the adversary A. Now what is the
property of adversary A? Given the cipher
-
text, adversary A can tell us what the
least significant bits of the plain text is.
-
In other words the adversary is going
to output the least significant bits of M0 or M1
-
but low and behold that's
basically the bit B. And then we're just
-
going to output that as our guest so let?s
call this thing B prime Okay so now this
-
describes the semantic security adversary.
And now you tell me what is the semantic
-
security advantage of this adversary? Well
so let's see. In experiment zero, what is
-
the probability that adversary B outputs
one? Well in experiment zero it is always
-
given the encryption of M zero and
therefore adversary A is always output the
-
least significant bit of M zero which
always happens to be zero. In experiment
-
zero, B always outputs zero. So the
probability that outputs one is zero.
-
However in experiment one, we're given the
encryption of M1. So how likely is
-
adversary B to output one in experiment
one well it always outputs one, again by
-
the properties of algorithm A and
therefore the advantage basically is one.
-
So this is a huge advantage, it's as big
as it's gonna get. Which means that this
-
adversary completely broke the system.
Okay so we consider, so under semantic
-
security basically just deducing the least
significant bits is enough to completely
-
break the system under a semantic security
definition. Okay, now the interesting
-
thing here of course is that this is not
just about the least significant bit, in
-
fact of the message for
example the most significant bit, you know
-
bit number seven Maybe the XOR of all the bits in the message and so on
-
and so forth any kind of information, any
bit about the plain text they can be
-
learned basically would mean that the
system is not semantically secure. So
-
basically all the adversary would have to
do would be to come up with two messages
-
M0 and M1 such that under one thing that
they learned it's the value zero and then
-
the other thing, the value, is one. So for
example if A was able to learn the XOR
-
bits of the message then M0
and M1 would just have different
-
XOR for all the bits of their
messages and then this adversary A would
-
also be sufficient to break semantic
security. Okay so, basically for cipher
-
is semantically secure then no
bit of information is revealed to an
-
efficient adversary. Okay so this is
exactly a concept of perfect secrecy only
-
applied just efficient adversaries
rather than all adversaries. So the next
-
thing I wanna show you is that in fact the
one time pad in fact is
-
semantically secure, they better be
semantically secure because it's in fact,
-
it's more than that it's perfectly secure.
So let's see why that's true. Well so
-
again it's one of these experiments, so
suppose we have an adversary that claims
-
to break semantic security of the one time
pad. The first the adversary is gonna do
-
is he's gonna output two messages M0
and M1 of the same length.
-
Now what does he get back as a challenge? As a
challenge, he gets either the encryption
-
of M0 or the encryption of M1 under
the one time pad.
-
And he's trying to distinguish between those two possible
cipher texts that he gets, right?
-
In experiment zero, he gets the encryption of
M0 and in experiment one, he gets the
-
encryption of M1. Well so let me ask
you, so what is the advantage of adversary
-
A against the one time patent? So I
remember that the property of the ones I
-
had is that, that K, XOR M0 is
distributed identically to K, XOR M1.
-
In other words, these distributions are
absolutely identical distribution,
-
distributions, identical. This is a
property of XOR. If we XOR the random pad
-
K with anything, either M0 or M1,
we get uniform distribution. So in both
-
cases, algorithm A is given as in input
exactly the same distribution. Maybe the
-
uniform distribution on cipher texts. And
therefore it's gonna behave exactly the
-
same in both cases because it was given
the exact distribution as input. And as a
-
result, its advantage is zero, which means
that the onetime pad is semantically
-
secure. Now the interesting thing here is
not only is it semantically secure, it's
-
semantically secure for all adversaries.
We don't even have to restrict the
-
adversaries to be efficient. No adversary,
doesn't matter how smart you are, no
-
adversary will be able to distinguish K XOR M0 from K XOR M1 because the two
-
distributions are identical. Okay, so the
one time pad is semantically secure. Okay,
-
so that completes our definition of
semantic security so the next thing we're
-
going to do is prove to the secure PRG,
in fact implies that the stream cipher is
-
semantically secure.