-
In this segment, we're gonna construct
authenticated encryption systems. Since we
-
already have CPA secured encryption, and
we have secure MACs, the natural question
-
is whether we can combine the two somehow,
in order to get authenticated encryption.
-
And if, that's exactly what we're gonna do
in this segment. Authenticated encryption
-
was introduced in the year 2000, in two
independent papers that I point to at the
-
end of this module. But before then, many
crytpolibraries provided an API that
-
separately supported CPA secure
encryption, and MAC-ing. So there was one
-
function for implementing CPA secure
encryption. For example, CBC with a random
-
IV. And another function for implementing
a MAC. And then every developer that
-
wanted to implement encryption, had to,
himself, call separately the CPA secure
-
encryption scheme and the MAC scheme. In
particular, every developer had to invent
-
his own way of combining encryption and
MAC-ing to provide some sort of
-
authenticated encryption. But since the
goals of combining encryption and MAC-ing
-
wasn't well understood since authenticated
encryption hasn't yet been defined, it
-
wasn't really clear which combinations of
encryption and MAC-ing are correct and
-
which aren't. And so, every project as I
said had to invent its own combination.
-
And in fact, not all combinations were
correct. And I can tell you that the most
-
common mistake in software projects were
basically incorrectly combining the
-
encryption and integrity mechanisms. So
hopefully, by the end of this module, you
-
will know how to combine them correctly,
and you won't be making these mistakes
-
yourself. So let's look at some
combinations of CPA secure encryption and
-
MAC, that were introduced by different
projects. So here are three examples. So,
-
first of all, in all three examples,
there's a separate key for encryption, and
-
a separate key for MACing. These two
keys are independent of one another, and
-
both are generated at session setup time.
And we're gonna see how to generate these
-
two keys later on in the course. So the
first example is the SSL protocol. So the
-
way SSL combines encryption and MAC in the
hope of achieving authenticated encryption
-
is the following. Basically you take the
plain text, m, and then you compute a MAC
-
on the plain text, m. So you use your MAC
key, kI, to compute tag for this message
-
m. And then you can concatenate the tag to
the message and then you encrypt the
-
concatenation of the message and the tag
and what comes out is the actual final cipher text.
-
So that's option number one. The
second option is what IPsec does. So
-
here, you take the message. The first
thing you do is you encrypt the message.
-
And then, you compute a tag on the
resulting cipher text. So you notice the
-
tag itself is computed on the resulting
cipher text. A third option is what the
-
SSH protocol does. So here, the SSH takes
the message, and encrypts it using a CPA
-
secure encryption scheme. And then, to it,
it concatenates a tag of the message. The
-
difference between IPsec and SSH, is that
in IPsec, the tag is computed over the
-
cipher text, whereas, in SSH, the tag is
computed over the message. And so these
-
are three completely different ways of
combining encryption and MAC. And the
-
question is, which one of these is secure?
So, I will let you think about this for a
-
second, and then when we continue we'll do
the analysis together.
-
Okay. So let's start with the SSH method. So
in the SSH method you notice that the tag
-
is computed on the message and then
concatenated in the clear to the cipher text.
-
Now this is actually quite a problem
because MACs themselves are not designed
-
to provide confidentiality. MACs are only
designed for integrity. And in fact, there's
-
nothing wrong with a MAC that as part of
the tag outputs a few bits of the plain
-
text. Outputs a few bits of the message M.
That would be a perfectly fine tag. And yet if
-
we did that, that would completely break
CPA security here, because some bits of
-
the message are leaked in the cipher text.
And so the SSH approach, even though the
-
specifics of SSH are fine and the
protocol itself is not compromised by
-
this specific combination, generally it's
advisable not to use this approach. Simply
-
because the output of the MAC signing algorithm might leak bits of the message. So
-
now let's look at SSL and IPsec. As it
turns out, the recommended method actually
-
is the IPsec method. Because it turns out
no matter what CPA secure system and MAC
-
key you use the combination is always
gonna provide authenticated encryption.
-
Now let me very, very briefly explain why.
Basically what happens is once we encrypt
-
the message well the message contents now
is hidden inside the cipher text and now
-
when we compute a tag of the cipher text
basically we're locking, this tag locks
-
the cipher text and makes sure no one can
produce a different cipher text that would
-
look valid. And as a result this approach
ensures that any modifications to the
-
cipher text will be detected by the
decrypter simply because the MAC isn't
-
gonna verify. As it turns out, for the SSL
approach, there actually are kind of
-
pathological examples, where you combine
CPA secure encryption system with a secure
-
MAC. And the result is vulnerable to a
chosen cipher text attack, so that it does
-
not actually provide authenticated
encryption. And basically, the reason that
-
could happen, is that there's some sort of
a bad interaction between the encryption
-
scheme and the MAC algorithm. Such that,
in fact, there will be a chosen cipher
-
text attack. So if you're designing a new
project the recommendation now is to
-
always use encrypt then MAC because that
is secure no matter which CPA secure
-
encryption and secure MAC algorithm you're
combining. Now, just to set the
-
terminology, the SSL method is sometimes
called MAC-then-encrypt. And the
-
IPsec method is called encrypt-then-MAC.
The SSH method even though you're
-
not supposed to use it, is called encrypt-and-MAC. Okay, so I'll often refer to
-
encrypt-then-MAC, and MAC-then-encrypt to
differentiate SSL and IPsec. Okay, so
-
just to repeat what I've just said. The IPsec
method encrypt-then-MAC always
-
provides authenticated encryption. If you start
from a CPA secure cipher and a secure MAC
-
you will always get authenticated
encryption. As I said, MAC-then-encrypt in
-
fact, there are pathological cases where
the result is vulnerable to CCA attacks and
-
therefore does not provide authenticated
encryption. However, the story's a little
-
bit more interesting than that, in that,
it turns out, if you're actually using
-
randomized counter mode or randomized CBC,
then it turns out, for those particular
-
CPA secure encryption schemes, MAC-then-encrypt
actually does provide authenticated
-
encryption and therefore it is secure. In
fact, there's even a more interesting
-
twist here in that if you're using
randomized counter mode. Then, it's enough
-
that your MAC algorithm just be one time
secure. It doesn't have to be a fully
-
secure MAC. It just has to be secure when
a key is used to encrypt a single message,
-
okay? And when we talked about message
integrity, we saw that there are actually
-
much faster MACs that are one time secure
than MACs that are fully secure. As a
-
result, if you're using randomized counter
mode MAC-then-encrypt could actually
-
result in a more efficient encryption
mechanism. However, I'm going to repeat
-
this again. The recommendation is to use
encrypt-then-MAC and we're going to see a
-
number of attacks on systems that didn't
use encrypt-then-MAC. And so just to make
-
sure things are secure without you having
to think too hard about this. Again, I am
-
going to recommend that you always use
encrypt-then-MAC. Now, once the concept of
-
authenticated encryption became more
popular, a number of standardized
-
approaches for combining encryption and
MAC turned up. And those were even
-
standardized by the National Institute of
Standards. So I'm just gonna mention three
-
of these standards. Two of these were
standardized by NIST. And these are
-
called Galois counter mode and CBC counter
mode. And so let me explain what they do.
-
Galois counter mode basically uses counter
mode encryption, so a randomized counter
-
mode with a Carter-Wegman MAC, so a very
fact Carter-Wegman MAC. And the way the
-
Carter-Wegman MAC works in GCM is it's
basically a hash function of the message
-
that's being MACed. And then the result is
encrypted using a PRF. Now this hash
-
function in GCM is already quite fast to
the point where the bulk of the running
-
time of GCM is dominated by the counter
mode encryption and it's even made more so
-
in that Intel introduces a special
instruction PCLMULQDQ specifically
-
designed for the purpose of making the
hash function in GCM run as fast as possible.
-
Now CCM counter mode is another
NIST standard. It uses a CBC MAC and
-
then counter mode encryption. So this
mechanism, you know, this uses MAC, then
-
encrypt, like SSL does. So this is
actually not the recommended way of doing
-
things, but because counter mode
encryption is used. This is actually a
-
perfectly fine encryption mechanism. One
thing that I'd like to point out about
-
CCM, is that everything is based on AES.
You notice, it's using AES for the CBC
-
MAC, and it's using AES for the counter
mode encryption. And as a result, CCM can
-
be implemented with relatively little
code. Cause all you need is an AES engine
-
and nothing else. And because of this, CCM
actually was adopted by the Wi-Fi
-
alliance, and in fact, you're probably
using CCM on a daily basis if you're using
-
encrypted Wi-Fi 802.11i then you're
basically using CCM to encrypt traffic
-
between your laptop and the access point.
There's another mode called a EAX that
-
uses counter mode encryption, and then
CMAC. So, again you notice encrypt-then-MAC
-
and that's another fine mode to
use. We'll do a comparison of all these
-
modes in just a minute. Now I wanted to
mention that first of all, all these modes are
-
nonce-based. In other words, they don't
use any randomness but they do take as
-
input a nonce and the nonce has to be
unique per key. In other words, as you
-
remember, the pair (key, nonce)
should never ever, ever repeat. But the
-
nonce itself need not be random, so
it's perfectly fine to use a counter, for
-
example, as a nonce. And the other
important point is that, in fact, all
-
these modes are what's called
authenticated encryption with associated
-
data. This is an extension of
authenticated encryption, that comes
-
up very often in networking protocols. So
the idea between AEAD is that, in fact,
-
the message that's provided to the encryption
mode is not intended to be fully
-
encrypted. Only part of the message is
intended to be encrypted, but all of the
-
message is intended to be authenticated. A
good example of this is a network packet.
-
Think of like a IP packet where there's a
header and then there's a payload. And
-
typically the header is not gonna be
encrypted. For example, the header might
-
contain the destination of the packet, but
then the header had better not be
-
encrypted otherwise routers along the way
wouldn't know where to route the packet.
-
And so, typically the header is sent in
the clear, but the payload, of course, is
-
always encrypted, but what you'd like to
do is have the header be authenticated.
-
Not encrypted but authenticated. So this is
exactly what these AEAD modes do. They
-
will authenticate the header and then
encrypt the payload. But the header and
-
the payload are bound together in the
authentication so they can't
-
actually be separated. So this is not
difficult to do. What happens is in these
-
three modes GCM, CCM, and EAX, basically
the MAC is applied to the entire data. But
-
the encryption is only applied to the part
of the data that needs to be encrypted.
-
So I wanted to show you what an API
to these authenticated encryption with
-
associated data encryption schemes look
like. So here's what it looks like in OpenSSL.
-
For example, this is, an API
for GCM. So what you do is you call the
-
init function to initialize the encryption
mode, and you notice you give it a key and
-
the nonce. The nonce again,
doesn't have to be random, but it has to
-
be unique. And after initialization, you
would call this encrypt function, where
-
you see that you give it the associated
data that's gonna be authenticated, but
-
not encrypted. You give it the data, and
it's gonna be both authenticated and
-
encrypted. And it gives you back the full
cipher text, which is an encryption of the
-
data, but of course does not include the
AEAD, because the AEAD is gonna be sent in
-
the clear. So now that we understand
this mode of encrypt-then-MAC, we can go
-
back to the definition of MAC security and
I can explain to you something that might
-
have been a little obscure when we looked
at that definition. So if you remember,
-
one of the requirements that followed from
our definition of secure MACs meant that
-
given a message-MAC pair on a message M,
the attacker cannot produce another tag on
-
the same message M. In other words, even
though the attacker already has a tag for
-
the message M, he shouldn't be able to
produce a new tag for the same message M.
-
And it's really not clear, why does that
matter? Who cares, if the adversary already
-
has a tag on the message M, who cares if
he can produce another tag? Well, it turns
-
out if the MAC didn't have this property.
In other words, given a message-MAC pair
-
you can produce another MAC on
the same message, then that MAC would
-
result in an insecure encrypt-then-MAC mode.
And so if we want our encrypt-then-MAC to
-
have cipher text integrity, it's crucial
that our MAC security would imply this strong
-
notion of security, which, of course, it
does because we defined it correctly.
-
So let's see what would go wrong, if, in
fact, it was easy to produce this type of
-
forgery. So what I'll do is I'll show you
a chosen cipher text attack on the
-
resulting encrypt-then-MAC system. And
since the system has a chosen cipher text
-
attack on it, it necessarily means that it
doesn't provide authenticated
-
encryption. So let's see. So the
adversary's gonnna start by sending two
-
messages, M0 and M1. And he's gonna
receive, as usual, the encryption of one
-
of them, either the encryption of M0 or
the encryption of M1. And since we're
-
using encrypt-then-MAC, the adversary
receives the cipher text we'll call it C0
-
and a MAC on the cipher text C0.
Well now we said that given the MAC on
-
a message the adversary can produce
another MAC on the same message. So what
-
he's gonna do is he's gonna produce
another MAC on the message C0. Now he has
-
a new cipher text (C0,T'), which is a
perfectly valid cipher text. T' is a
-
valid MAC of C0. Therefore, the adversary
now can submit a chosen cipher text query
-
on C' and this is a valid chosen
cipher text query because it's different
-
from C. It's a new cipher text. The poor
challenger now is forced to decrypt this
-
cipher text C' so he's going to send
back the decryption of C'. It's a
-
valid cipher text therefore the decryption
of C prime is the message Mb but now the
-
attacker just learned the value of B
because he can test whether Mb is equal to
-
M0 or MB is equal to M1. As a result he
can just output B and he gets advantage
-
one in defeating the scheme. And so
again if our MAC security did not imply
-
this property here. Then, there would be a
chosen cipher text attack on encrypt-then-MAC.
-
And therefore, it would not be secure. So the
fact that we define MAC security correctly
-
means that encrypt-then-MAC really does
provide authenticated encryption. And
-
throughout all the MACs that we discussed
actually do satisfy this strong notion of
-
unforgeability. So, interestingly, this is
not the end of the story. So, as we said
-
before the concept of authenticated
encryption was introduced everyone was
-
just combining MACs and encryption in
various ways in the hope of achieving
-
some authenticated encryption. After
the notion of authenticated encryption
-
became formalized and rigorous, people
kind of started scratching their heads and said,
-
hey, wait a minute. Maybe we can achieve
authenticated encryption more efficiently
-
than by combining a MAC and an encryption
scheme. In fact, if you think about how
-
this combination of MAC and encryption
works, let's say we combine counter mode
-
with CMAC, then for every block of
plaintext, you first of all have to use
-
your block cipher for counter mode, and
then you have to use to your block cipher
-
again, for the CBC-MAC. This means that if
you're combining CPA secure encryption with a
-
MAC, for every block of plaintext, you
have to evaluate your block cipher twice,
-
once for the MAC and once for the
encryption scheme. So the natural question
-
was, can we construct an authenticated
encryption scheme directly from a PRP,
-
such that we would have to only evaluate
the PRP once per block? And it turns out
-
the answer is yes, and there's this
beautiful construction called OCB, that
-
pretty much does everything you want, and
is much faster than constructions that are
-
separately built from an encryption and a
MAC. So I wrote down, kind of a schematic
-
of OCB. I don't want to explain it in
detail. I'll just kind of explain it at a
-
high level. So here we have our input
plain text, here at the top. And you
-
notice that, first of all, OCB is
parallelizable, completely parallelizable.
-
So every block can be encrypted separately of
every other block. The other thing to
-
notice is that as I promised, you only
evaluate your block cipher once per plain
-
text block. And then you evaluate it one
more time at the end to build your
-
authentication tag and then the overhead
of OCB beyond just a block cipher is
-
minimal. All you have to do is evaluate a
certain very simple function P. The
-
nonce goes into the P you notice, the
key goes into this P and then there is a
-
block counter that goes into this P. So
you just evaluate this function P, twice
-
for every block and you XOR the result
before and after encryption using the
-
block cipher and that's it. That's all you
have to do and then you get a very fast
-
and efficient authenticated encryption
scheme built from a block cipher. So OCB
-
actually has a nice security theorem
associated with it and I am going to point
-
to a paper on OCB when we get to end of
this module where I list some further
-
reading papers that you can take a look
at. So you might be wondering if OCB is so
-
much better than everything you've seen so
far, all these three standards CCM, GCM and
-
EAX why isn't OCB being used or why isn't
OCB the standard? And the answer is a
-
little sad. The primary answer that
OCB is not being used is actually because
-
of various patents. And I'll just leave it
at that. So to conclude this section I
-
wanted to show you some performance
numbers. So here on the right I listed
-
performance numbers for modes that you
shouldn't be using. So this is for
-
randomized counter mode, and this is for
randomized CBC. And you can see also the
-
performance of CBC MAC is basically the
same as the performance of CBC encryption.
-
Okay. Now here are the authenticated
encryption modes, so these are the ones
-
that you're supposed to using, these
you're not supposed to be using on their
-
own, right. These two, you should never
ever use these two because they only
-
provide CPA security, they don't
actually provide security against active
-
attacks. You're only supposed to use
authenticated encryption for encryption.
-
And so I listed performance numbers
for the three standards. And let me remind
-
you that GCM basically uses a very fast
hash. And then it uses counter mode for
-
actual encryption. And you can see that
the overhead of GCM over counter mode is
-
relatively small. CCM and EAX both use a
block cipher based encryption and a
-
block cipher based MAC. And as a result
they're about twice as slow as counter
-
modes. You see that OCB is actually the
fastest of these, primarily because it
-
only use the block cipher once per message
block. So based on these performance
-
numbers, you would think that GCM is
exactly the right mode to always use. But
-
it turns out if you're on the space
constrained hardware, GCM is not ideal.
-
Primarily because its implementation
requires larger code than the other two
-
modes. However, as I said, Intel
specifically added instructions to speed
-
up GCM mode. And as a result, implementing
GCM on an Intel architecture takes
-
very little code. But on other hardware
platforms, say in smart cards or other
-
constrained environments, the code size
for implementing GCM would be considerably
-
larger than for the other two modes. But
if code size is not a constraint then GCM
-
is the right mode to use. So to summarize
this segment I want to say it one more
-
time that when you want to encrypt
messages you have to use an authenticated
-
encryption mode and the recommended way to
do it is to use one of the standards,
-
namely one of these three modes for
providing authenticated encryption.
-
Don't implement the encryption scheme yourself.
In other words don't implement
-
encrypt-then-MAC yourself. Just use one of these
three standards. Many crypto libraries
-
now provide standard API's for these three
modes and these are the one's you should
-
be using and nothing else. In the next
segment we're going to see what else can
-
go wrong when you try to implement
authenticated encryption by yourself.