Constructions from ciphers and MACs (21 min)

0:00 - 0:04

In this segment, we're gonna construct
authenticated encryption systems. Since we
0:04 - 0:08

already have CPA secure encryption, and
we have secure MACs, the natural question
0:08 - 0:13

is whether we can combine the two somehow,
in order to get authenticated encryption.
0:13 - 0:16

And in fact, that's exactly what we're gonna do
in this segment. Authenticated encryption
0:16 - 0:20

was introduced in the year 2000, in two
independent papers that I point to at the
0:20 - 0:26

end of this module. But before then, many
crypto libraries provided an API that
0:26 - 0:31

separately supported CPA secure
encryption, and MAC-ing. So there was one
0:31 - 0:36

function for implementing CPA secure
encryption. For example, CBC with a random
0:36 - 0:41

IV. And another function for implementing
a MAC. And then every developer that
0:41 - 0:46

wanted to implement encryption, had to,
himself, call separately the CPA secure
0:46 - 0:51

encryption scheme and the MAC scheme. In
particular, every developer had to invent
0:51 - 0:56

his own way of combining encryption and
MAC-ing to provide some sort of
0:56 - 0:59

authenticated encryption. But since the
goals of combining encryption and MAC-ing
0:59 - 1:04

weren't well understood, since authenticated
encryption hadn't yet been defined, it
1:04 - 1:08

wasn't really clear which combinations of
encryption and MAC-ing are correct and
1:08 - 1:13

which aren't. And so, every project as I
said had to invent its own combination.
1:13 - 1:17

And in fact, not all combinations were
correct. And I can tell you that the most
1:17 - 1:23

common mistake in software projects was
basically incorrectly combining the
1:23 - 1:27

encryption and integrity mechanisms. So
hopefully, by the end of this module, you
1:27 - 1:31

will know how to combine them correctly,
and you won't be making these mistakes
1:31 - 1:35

yourself. So let's look at some
combinations of CPA secure encryption and
1:35 - 1:39

MAC, that were introduced by different
projects. So here are three examples. So,
1:39 - 1:44

first of all, in all three examples,
there's a separate key for encryption, and
1:44 - 1:48

a separate key for MACing. These two
keys are independent of one another, and
1:48 - 1:52

both are generated at session setup time.
And we're gonna see how to generate these
1:52 - 1:57

two keys later on in the course. So the
first example is the SSL protocol. So the
1:57 - 2:03

way SSL combines encryption and MAC in the
hope of achieving authenticated encryption
2:03 - 2:08

is the following. Basically you take the
plain text, m, and then you compute a MAC
2:08 - 2:13

on the plain text, m. So you use your MAC
key, kI, to compute a tag for this message
2:13 - 2:18

m. And then you can concatenate the tag to
the message and then you encrypt the
2:18 - 2:23

concatenation of the message and the tag
and what comes out is the actual final cipher text.
2:23 - 2:27

So that's option number one. The
second option is what IPsec does. So
2:27 - 2:31

here, you take the message. The first
thing you do is you encrypt the message.
2:31 - 2:36

And then, you compute a tag on the
resulting cipher text. So you notice the
2:36 - 2:40

tag itself is computed on the resulting
cipher text. A third option is what the
2:40 - 2:45

SSH protocol does. So here, SSH takes
the message, and encrypts it using a CPA
2:45 - 2:51

secure encryption scheme. And then, to it,
it concatenates a tag of the message. The
2:51 - 2:56

difference between IPsec and SSH, is that
in IPsec, the tag is computed over the
2:56 - 3:00

cipher text, whereas, in SSH, the tag is
computed over the message. And so these
3:00 - 3:05

are three completely different ways of
combining encryption and MAC. And the
3:05 - 3:09

question is, which one of these is secure?
So, I will let you think about this for a
3:09 - 3:12

second, and then when we continue we'll do
the analysis together.
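To make the three options concrete, here is a minimal sketch of how each one assembles its output. Everything in it is an assumption made for illustration: the cipher is a toy stream cipher built from HMAC-SHA256 standing in for a CPA secure scheme such as randomized counter mode, the MAC is HMAC-SHA256, and the function names ssl_style, ipsec_style and ssh_style are hypothetical. It does not reproduce the real record formats of SSL, IPsec or SSH.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # random per-message value, assumed size
TAG_LEN = 32    # HMAC-SHA256 output size in bytes


def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter), truncated to n bytes."""
    out = b""
    counter = 0
    while len(out) < n:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:n]


def encrypt(k_enc: bytes, m: bytes) -> bytes:
    """Randomized stream encryption (stand-in for a CPA secure cipher)."""
    nonce = secrets.token_bytes(NONCE_LEN)
    ks = keystream(k_enc, nonce, len(m))
    return nonce + bytes(a ^ b for a, b in zip(m, ks))


def decrypt(k_enc: bytes, c: bytes) -> bytes:
    nonce, body = c[:NONCE_LEN], c[NONCE_LEN:]
    ks = keystream(k_enc, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, ks))


def mac(k_mac: bytes, data: bytes) -> bytes:
    return hmac.new(k_mac, data, hashlib.sha256).digest()


def ssl_style(k_enc: bytes, k_mac: bytes, m: bytes) -> bytes:
    # Option 1: tag the plaintext, then encrypt message || tag.
    return encrypt(k_enc, m + mac(k_mac, m))


def ipsec_style(k_enc: bytes, k_mac: bytes, m: bytes) -> bytes:
    # Option 2: encrypt first, then tag the resulting ciphertext.
    c = encrypt(k_enc, m)
    return c + mac(k_mac, c)


def ssh_style(k_enc: bytes, k_mac: bytes, m: bytes) -> bytes:
    # Option 3: encrypt the message, append a tag of the plaintext in the clear.
    return encrypt(k_enc, m) + mac(k_mac, m)
```

The only difference between the three functions is what the tag is computed over and whether the tag ends up inside or outside the encryption.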
3:13 - 3:17

Okay. So let's start with the SSH method. So
in the SSH method you notice that the tag
3:17 - 3:22

is computed on the message and then
concatenated in the clear to the cipher text.
3:22 - 3:26

Now this is actually quite a problem
because MACs themselves are not designed
3:26 - 3:31

to provide confidentiality. MACs are only
designed for integrity. And in fact, there's
3:31 - 3:36

nothing wrong with a MAC that as part of
the tag outputs a few bits of the plain
3:36 - 3:41

text. Outputs a few bits of the message M.
That would be a perfectly fine tag. And yet if
3:41 - 3:47

we did that, that would completely break
CPA security here, because some bits of
3:47 - 3:52

the message are leaked in the cipher text.
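As a rough illustration of that leak, the sketch below builds a deliberately contrived MAC that is still unforgeable but appends the first byte of the message to its tag, and then uses it in the SSH-style combination. The toy cipher, the name leaky_mac, and the guessing function are all assumptions made up for this example; real SSH does not use such a MAC.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16


def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream from HMAC-SHA256, standing in for a CPA secure cipher."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def encrypt(k_enc: bytes, m: bytes) -> bytes:
    nonce = secrets.token_bytes(NONCE_LEN)
    ks = keystream(k_enc, nonce, len(m))
    return nonce + bytes(a ^ b for a, b in zip(m, ks))


def leaky_mac(k_mac: bytes, m: bytes) -> bytes:
    """Contrived MAC: a normal HMAC tag followed by the first message byte.

    Forging a tag still requires forging the HMAC part, so integrity holds,
    but the tag reveals part of the message.
    """
    return hmac.new(k_mac, m, hashlib.sha256).digest() + m[:1]


def ssh_style(k_enc: bytes, k_mac: bytes, m: bytes) -> bytes:
    # Ciphertext followed by a tag of the plaintext, sent in the clear.
    return encrypt(k_enc, m) + leaky_mac(k_mac, m)


def guess_bit(ciphertext: bytes, m0: bytes, m1: bytes) -> int:
    """CPA adversary: read the leaked byte at the end of the ciphertext."""
    leaked = ciphertext[-1:]
    return 0 if leaked == m0[:1] else 1
```

With two equal-length messages that differ in their first byte, guess_bit identifies which one was encrypted every time, even though the adversary never touches the encryption key.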
And so the SSH approach, even though the
3:52 - 3:57

specifics of SSH are fine and the
protocol itself is not compromised by
3:57 - 4:01

this specific combination, generally it's
advisable not to use this approach. Simply
4:01 - 4:06

because the output of the MAC signing algorithm might leak bits of the message. So
4:06 - 4:11

now let's look at SSL and IPsec. As it
turns out, the recommended method actually
4:11 - 4:17

is the IPsec method. Because it turns out
no matter what CPA secure system and secure
4:17 - 4:21

MAC you use, the combination is always
gonna provide authenticated encryption.
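A minimal sketch of the IPsec-style combination, including the receiving side, might look as follows. The cipher is a toy HMAC-SHA256 stream cipher standing in for any CPA secure scheme, and the names seal and open_sealed are hypothetical. The point to notice is that the receiver checks the tag over the ciphertext first and refuses to decrypt anything that fails the check.

```python
import hashlib
import hmac
import secrets
from typing import Optional

NONCE_LEN = 16
TAG_LEN = 32


def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream from HMAC-SHA256 (assumed stand-in for counter mode)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def encrypt(k_enc: bytes, m: bytes) -> bytes:
    nonce = secrets.token_bytes(NONCE_LEN)
    ks = keystream(k_enc, nonce, len(m))
    return nonce + bytes(a ^ b for a, b in zip(m, ks))


def decrypt(k_enc: bytes, c: bytes) -> bytes:
    nonce, body = c[:NONCE_LEN], c[NONCE_LEN:]
    ks = keystream(k_enc, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, ks))


def seal(k_enc: bytes, k_mac: bytes, m: bytes) -> bytes:
    """Encrypt the message, then tag the ciphertext."""
    c = encrypt(k_enc, m)
    tag = hmac.new(k_mac, c, hashlib.sha256).digest()
    return c + tag


def open_sealed(k_enc: bytes, k_mac: bytes, ct: bytes) -> Optional[bytes]:
    """Verify the tag first; return None (reject) if it does not match."""
    if len(ct) < NONCE_LEN + TAG_LEN:
        return None
    c, tag = ct[:-TAG_LEN], ct[-TAG_LEN:]
    expected = hmac.new(k_mac, c, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return decrypt(k_enc, c)
```

Flipping any single bit of the output of seal, whether in the nonce, the ciphertext body, or the tag, makes open_sealed return None.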
4:21 - 4:26

Now let me very, very briefly explain why.
Basically what happens is once we encrypt
4:26 - 4:31

the message, well, the message contents now
are hidden inside the cipher text and now
4:31 - 4:36

when we compute a tag of the cipher text
basically we're locking, this tag locks
4:36 - 4:41

the cipher text and makes sure no one can
produce a different cipher text that would
4:41 - 4:45

look valid. And as a result this approach
ensures that any modifications to the
4:45 - 4:50

cipher text will be detected by the
decrypter simply because the MAC isn't
4:50 - 4:54

gonna verify. As it turns out, for the SSL
approach, there actually are kind of
4:54 - 4:59

pathological examples, where you combine
CPA secure encryption system with a secure
4:59 - 5:04

MAC. And the result is vulnerable to a
chosen cipher text attack, so that it does
5:04 - 5:08

not actually provide authenticated
encryption. And basically, the reason that
5:08 - 5:13

could happen, is that there's some sort of
a bad interaction between the encryption
5:13 - 5:17

scheme and the MAC algorithm. Such that,
in fact, there will be a chosen cipher
5:17 - 5:22

text attack. So if you're designing a new
project the recommendation now is to
5:22 - 5:26

always use encrypt then MAC because that
is secure no matter which CPA secure
5:26 - 5:31

encryption and secure MAC algorithm you're
combining. Now, just to set the
5:31 - 5:38

terminology, the SSL method is sometimes
called MAC-then-encrypt. And the
5:38 - 5:45

IPsec method is called encrypt-then-MAC.
The SSH method even though you're
5:45 - 5:52

not supposed to use it, is called encrypt-and-MAC. Okay, so I'll often refer to
5:52 - 5:57

encrypt-then-MAC, and MAC-then-encrypt to
differentiate SSL and IPsec. Okay, so
5:57 - 6:02

just to repeat what I've just said. The IPsec
method encrypt-then-MAC always
6:02 - 6:07

provides authenticated encryption. If you start
from a CPA secure cipher and a secure MAC
6:07 - 6:11

you will always get authenticated
encryption. As I said, MAC-then-encrypt in
6:11 - 6:16

fact, there are pathological cases where
the result is vulnerable to CCA attacks and
6:16 - 6:20

therefore does not provide authenticated
encryption. However, the story's a little
6:20 - 6:25

bit more interesting than that, in that,
it turns out, if you're actually using
6:25 - 6:29

randomized counter mode or randomized CBC,
then it turns out, for those particular
6:29 - 6:34

CPA secure encryption schemes, MAC-then-encrypt
actually does provide authenticated
6:34 - 6:38

encryption and therefore it is secure. In
fact, there's even a more interesting
6:38 - 6:42

twist here in that if you're using
randomized counter mode. Then, it's enough
6:42 - 6:47

that your MAC algorithm just be one time
secure. It doesn't have to be a fully
6:47 - 6:52

secure MAC. It just has to be secure when
a key is used to authenticate a single message,
6:52 - 6:56

okay? And when we talked about message
integrity, we saw that there are actually
6:56 - 7:01

much faster MACs that are one time secure
than MACs that are fully secure. As a
7:01 - 7:04

result, if you're using randomized counter
mode MAC-then-encrypt could actually
7:04 - 7:08

result in a more efficient encryption
mechanism. However, I'm going to repeat
7:08 - 7:12

this again. The recommendation is to use
encrypt-then-MAC and we're going to see a
7:12 - 7:16

number of attacks on systems that didn't
use encrypt-then-MAC. And so just to make
7:16 - 7:20

sure things are secure without you having
to think too hard about this. Again, I am
7:20 - 7:24

going to recommend that you always use
encrypt-then-MAC. Now, once the concept of
7:24 - 7:28

authenticated encryption became more
popular, a number of standardized
7:28 - 7:32

approaches for combining encryption and
MAC turned up. And those were even
7:32 - 7:36

standardized by the National Institute of
Standards and Technology (NIST). So I'm just gonna mention three
7:36 - 7:41

of these standards. Two of these were
standardized by NIST. And these are
7:41 - 7:46

called Galois counter mode (GCM) and CBC counter
mode (CCM). And so let me explain what they do.
7:46 - 7:51

Galois counter mode basically uses counter
mode encryption, so a randomized counter
7:51 - 7:56

mode with a Carter-Wegman MAC, so a very
fast Carter-Wegman MAC. And the way the
7:56 - 8:01

Carter-Wegman MAC works in GCM is it's
basically a hash function of the message
8:01 - 8:06

that's being MACed. And then the result is
encrypted using a PRF. Now this hash
8:06 - 8:12

function in GCM is already quite fast to
the point where the bulk of the running
8:12 - 8:16

time of GCM is dominated by the counter
mode encryption and it's even made more so
8:16 - 8:22

in that Intel introduced a special
instruction PCLMULQDQ specifically
8:22 - 8:27

designed for the purpose of making the
hash function in GCM run as fast as possible.
8:27 - 8:33

Now CCM counter mode is another
NIST standard. It uses a CBC MAC and
8:33 - 8:37

then counter mode encryption. So this
mechanism, you know, this uses MAC, then
8:37 - 8:41

encrypt, like SSL does. So this is
actually not the recommended way of doing
8:41 - 8:44

things, but because counter mode
encryption is used. This is actually a
8:44 - 8:48

perfectly fine encryption mechanism. One
thing that I'd like to point out about
8:48 - 8:54

CCM, is that everything is based on AES.
You notice, it's using AES for the CBC
8:54 - 8:59

MAC, and it's using AES for the counter
mode encryption. And as a result, CCM can
8:59 - 9:03

be implemented with relatively little
code. Cause all you need is an AES engine
9:03 - 9:08

and nothing else. And because of this, CCM
actually was adopted by the Wi-Fi
9:08 - 9:14

alliance, and in fact, you're probably
using CCM on a daily basis if you're using
9:14 - 9:19

encrypted Wi-Fi 802.11i then you're
basically using CCM to encrypt traffic
9:19 - 9:23

between your laptop and the access point.
There's another mode called EAX that
9:23 - 9:29

uses counter mode encryption, and then
CMAC. So, again you notice encrypt-then-MAC
9:29 - 9:32

and that's another fine mode to
use. We'll do a comparison of all these
9:32 - 9:37

modes in just a minute. Now I wanted to
mention that first of all, all these modes are
9:37 - 9:41

nonce-based. In other words, they don't
use any randomness but they do take as
9:41 - 9:46

input a nonce and the nonce has to be
unique per key. In other words, as you
9:46 - 9:51

remember, the pair (key, nonce)
should never ever, ever repeat. But the
9:51 - 9:54

nonce itself need not be random, so
it's perfectly fine to use a counter, for
9:54 - 9:58

example, as a nonce. And the other
important point is that, in fact, all
9:58 - 10:01

these modes are what's called
authenticated encryption with associated
10:01 - 10:05

data. This is an extension of
authenticated encryption, that comes
10:05 - 10:11

up very often in networking protocols. So
the idea behind AEAD is that, in fact,
10:11 - 10:15

the message that's provided to the encryption
mode is not intended to be fully
10:15 - 10:20

encrypted. Only part of the message is
intended to be encrypted, but all of the
10:20 - 10:24

message is intended to be authenticated. A
good example of this is a network packet.
10:24 - 10:29

Think of, like, an IP packet where there's a
header and then there's a payload. And
10:29 - 10:33

typically the header is not gonna be
encrypted. For example, the header might
10:33 - 10:37

contain the destination of the packet, but
then the header had better not be
10:37 - 10:41

encrypted otherwise routers along the way
wouldn't know where to route the packet.
10:41 - 10:45

And so, typically the header is sent in
the clear, but the payload, of course, is
10:45 - 10:50

always encrypted, but what you'd like to
do is have the header be authenticated.
10:50 - 10:56

Not encrypted but authenticated. So this is
exactly what these AEAD modes do. They
10:56 - 11:00

will authenticate the header and then
encrypt the payload. But the header and
11:00 - 11:04

the payload are bound together in the
authentication so they can't
11:04 - 11:08

actually be separated. So this is not
difficult to do. What happens is in these
11:08 - 11:14

three modes GCM, CCM, and EAX, basically
the MAC is applied to the entire data. But
11:14 - 11:19

the encryption is only applied to the part
of the data that needs to be encrypted.
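The idea can be sketched as a nonce-based encrypt-then-MAC scheme in which the tag covers both the header and the ciphertext. This is only an illustration under stated assumptions: it is not GCM, CCM or EAX, the cipher is a toy HMAC-SHA256 stream cipher, and the names aead_encrypt and aead_decrypt and their parameter order are hypothetical. The lengths of the associated data and nonce are included in the MAC input so that the boundary between the fields is unambiguous.

```python
import hashlib
import hmac
from typing import Optional

TAG_LEN = 32


def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream from HMAC-SHA256 (assumed stand-in for counter mode)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def _tag(k_mac: bytes, nonce: bytes, ad: bytes, c: bytes) -> bytes:
    # MAC over lengths, associated data, nonce and ciphertext together.
    lengths = len(ad).to_bytes(8, "big") + len(nonce).to_bytes(8, "big")
    return hmac.new(k_mac, lengths + ad + nonce + c, hashlib.sha256).digest()


def aead_encrypt(k_enc: bytes, k_mac: bytes, nonce: bytes, ad: bytes, m: bytes) -> bytes:
    """Encrypt m only; authenticate ad and the ciphertext. The (key, nonce) pair must never repeat."""
    ks = keystream(k_enc, nonce, len(m))
    c = bytes(a ^ b for a, b in zip(m, ks))
    return c + _tag(k_mac, nonce, ad, c)


def aead_decrypt(k_enc: bytes, k_mac: bytes, nonce: bytes, ad: bytes, ct: bytes) -> Optional[bytes]:
    """Return the payload, or None if the header, nonce or ciphertext was modified."""
    if len(ct) < TAG_LEN:
        return None
    c, tag = ct[:-TAG_LEN], ct[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(k_mac, nonce, ad, c)):
        return None
    ks = keystream(k_enc, nonce, len(c))
    return bytes(a ^ b for a, b in zip(c, ks))


# Usage: the header travels in the clear, the nonce is a simple packet counter.
k_enc, k_mac = b"\x01" * 32, b"\x02" * 32
header = b"dst=10.0.0.7"
nonce = (1).to_bytes(12, "big")
packet = aead_encrypt(k_enc, k_mac, nonce, header, b"secret payload")
```

Because the header is part of the MAC input, a packet whose header has been swapped for a different one fails verification even though the encrypted payload is untouched.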
11:19 - 11:23

So I wanted to show you what an API
to these authenticated encryption with
11:23 - 11:29

associated data encryption schemes looks
like. So here's what it looks like in OpenSSL.
11:29 - 11:34

For example, this is, an API
for GCM. So what you do is you call the
11:34 - 11:37

init function to initialize the encryption
mode, and you notice you give it a key and
11:37 - 11:41

the nonce. The nonce again,
doesn't have to be random, but it has to
11:41 - 11:44

be unique. And after initialization, you
would call this encrypt function, where
11:44 - 11:48

you see that you give it the associated
data that's gonna be authenticated, but
11:48 - 11:52

not encrypted. You give it the data, and
it's gonna be both authenticated and
11:52 - 11:56

encrypted. And it gives you back the full
cipher text, which is an encryption of the
11:56 - 12:00

data, but of course does not include the
associated data, because the associated data is gonna be sent in
12:00 - 12:05

the clear. So now that we understand
this mode of encrypt-then-MAC, we can go
12:05 - 12:10

back to the definition of MAC security and
I can explain to you something that might
12:10 - 12:14

have been a little obscure when we looked
at that definition. So if you remember,
12:14 - 12:19

one of the requirements that followed from
our definition of secure MACs was that
12:19 - 12:26

given a message-MAC pair on a message M,
the attacker cannot produce another tag on
12:26 - 12:30

the same message M. In other words, even
though the attacker already has a tag for
12:30 - 12:35

the message M, he shouldn't be able to
produce a new tag for the same message M.
12:35 - 12:39

And it's really not clear, why does that
matter? Who cares, if the adversary already
12:39 - 12:44

has a tag on the message M, who cares if
he can produce another tag? Well, it turns
12:44 - 12:49

out that if the MAC didn't have this property,
in other words, if given a message-MAC pair
12:49 - 12:54

you could produce another MAC on
the same message, then that MAC would
12:54 - 12:59

result in an insecure encrypt-then-MAC mode.
And so if we want our encrypt-then-MAC to
12:59 - 13:04

have cipher text integrity, it's crucial
that our MAC security would imply this strong
13:04 - 13:09

notion of security, which, of course, it
does because we defined it correctly.
13:09 - 13:14

So let's see what would go wrong, if, in
fact, it was easy to produce this type of
13:14 - 13:18

forgery. So what I'll do is I'll show you
a chosen cipher text attack on the
13:18 - 13:23

resulting encrypt-then-MAC system. And
since the system has a chosen cipher text
13:23 - 13:27

attack on it, it necessarily means that it
doesn't provide authenticated
13:27 - 13:31

encryption. So let's see. So the
adversary's gonna start by sending two
13:31 - 13:36

messages, M0 and M1. And he's gonna
receive, as usual, the encryption of one
13:36 - 13:40

of them, either the encryption of M0 or
the encryption of M1. And since we're
13:40 - 13:45

using encrypt-then-MAC, the adversary
receives the cipher text we'll call it C0
13:45 - 13:50

and a MAC T on the cipher text C0, so C = (C0,T).
Well now we said that given the MAC on
13:50 - 13:54

a message the adversary can produce
another MAC on the same message. So what
13:54 - 13:58

he's gonna do is he's gonna produce
another MAC on the message C0. Now he has
13:58 - 14:04

a new cipher text C' = (C0,T'), which is a
perfectly valid cipher text. T' is a
14:04 - 14:10

valid MAC of C0. Therefore, the adversary
now can submit a chosen cipher text query
14:10 - 14:14

on C' and this is a valid chosen
cipher text query because it's different
14:14 - 14:19

from C. It's a new cipher text. The poor
challenger now is forced to decrypt this
14:19 - 14:23

cipher text C' so he's going to send
back the decryption of C'. It's a
14:23 - 14:29

valid cipher text therefore the decryption
of C' is the message Mb but now the
14:29 - 14:32

attacker just learned the value of b
because he can test whether Mb is equal to
14:32 - 14:37

M0 or Mb is equal to M1. As a result he
can just output b and he gets advantage
14:37 - 14:43

one in defeating the scheme. And so
again if our MAC security did not imply
14:43 - 14:48

this property here. Then, there would be a
chosen cipher text attack on encrypt-then-MAC.
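The attack can be replayed in a few lines. In this sketch every name is hypothetical and the primitives are toys: the cipher is an HMAC-SHA256 stream cipher, and weak_mac is a contrived MAC whose tag carries one extra byte that verification ignores, so anyone holding a valid tag can produce a second valid tag for the same message. The Challenger class plays the chosen ciphertext game and only refuses to decrypt the exact challenge ciphertext.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16
WEAK_TAG_LEN = 33  # 32 bytes of HMAC plus one ignored byte


def keystream(key, nonce, n):
    out, counter = b"", 0
    while len(out) < n:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def encrypt(k_enc, m):
    nonce = secrets.token_bytes(NONCE_LEN)
    return nonce + bytes(a ^ b for a, b in zip(m, keystream(k_enc, nonce, len(m))))


def decrypt(k_enc, c):
    nonce, body = c[:NONCE_LEN], c[NONCE_LEN:]
    return bytes(a ^ b for a, b in zip(body, keystream(k_enc, nonce, len(body))))


def weak_mac(k_mac, data):
    # Unforgeable on new messages, but the last byte is never checked.
    return hmac.new(k_mac, data, hashlib.sha256).digest() + b"\x00"


def weak_verify(k_mac, data, tag):
    expected = hmac.new(k_mac, data, hashlib.sha256).digest()
    return len(tag) == WEAK_TAG_LEN and hmac.compare_digest(tag[:32], expected)


class Challenger:
    def __init__(self):
        self.k_enc = secrets.token_bytes(32)
        self.k_mac = secrets.token_bytes(32)
        self.b = secrets.randbelow(2)  # hidden bit
        self.challenge = None

    def encrypt_challenge(self, m0, m1):
        c0 = encrypt(self.k_enc, (m0, m1)[self.b])
        self.challenge = c0 + weak_mac(self.k_mac, c0)  # C = (C0, T)
        return self.challenge

    def decrypt_query(self, ct):
        if ct == self.challenge:
            raise ValueError("decrypting the challenge itself is not allowed")
        c0, tag = ct[:-WEAK_TAG_LEN], ct[-WEAK_TAG_LEN:]
        if not weak_verify(self.k_mac, c0, tag):
            return None
        return decrypt(self.k_enc, c0)


def adversary(ch, m0, m1):
    c = ch.encrypt_challenge(m0, m1)
    c_prime = c[:-1] + b"\x01"       # same C0, new valid tag T'
    m = ch.decrypt_query(c_prime)    # allowed, because C' differs from C
    return 0 if m == m0 else 1
```

The adversary recovers the hidden bit on every run, which is exactly why the MAC definition has to rule out producing a new tag on an already tagged message.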
14:48 - 14:53

And therefore, it would not be secure. So the
fact that we define MAC security correctly
14:53 - 14:57

means that encrypt-then-MAC really does
provide authenticated encryption. And
14:57 - 15:02

all the MACs that we discussed so far
actually do satisfy this strong notion of
15:02 - 15:06

unforgeability. So, interestingly, this is
not the end of the story. So, as we said,
15:06 - 15:10

before the concept of authenticated
encryption was introduced, everyone was
15:10 - 15:15

just combining MACs and encryption in
various ways in the hope of achieving
15:15 - 15:19

some authenticated encryption. After
the notion of authenticated encryption
15:19 - 15:24

became formalized and rigorous, people
kind of started scratching their heads and said,
15:24 - 15:28

hey, wait a minute. Maybe we can achieve
authenticated encryption more efficiently
15:28 - 15:33

than by combining a MAC and an encryption
scheme. In fact, if you think about how
15:33 - 15:37

this combination of MAC and encryption
works, let's say we combine counter mode
15:37 - 15:42

with CMAC, then for every block of
plaintext, you first of all have to use
15:42 - 15:46

your block cipher for counter mode, and
then you have to use your block cipher
15:46 - 15:51

again, for the CBC-MAC. This means that if
you're combining CPA secure encryption with a
15:51 - 15:56

MAC, for every block of plaintext, you
have to evaluate your block cipher twice,
15:56 - 16:01

once for the MAC and once for the
encryption scheme. So the natural question
16:01 - 16:05

was, can we construct an authenticated
encryption scheme directly from a PRP,
16:05 - 16:10

such that we would have to only evaluate
the PRP once per block? And it turns out
16:10 - 16:14

the answer is yes, and there's this
beautiful construction called OCB, that
16:14 - 16:18

pretty much does everything you want, and
is much faster than constructions that are
16:18 - 16:22

separately built from an encryption and a
MAC. So I wrote down, kind of a schematic
16:22 - 16:26

of OCB. I don't want to explain it in
detail. I'll just kind of explain it at a
16:26 - 16:30

high level. So here we have our input
plain text, here at the top. And you
16:30 - 16:35

notice that, first of all, OCB is
parallelizable, completely parallelizable.
16:35 - 16:40

So every block can be encrypted independently of
every other block. The other thing to
16:40 - 16:44

notice is that as I promised, you only
evaluate your block cipher once per plain
16:44 - 16:49

text block. And then you evaluate it one
more time at the end to build your
16:49 - 16:54

authentication tag and then the overhead
of OCB beyond just a block cipher is
16:54 - 16:59

minimal. All you have to do is evaluate a
certain very simple function P. The
16:59 - 17:03

nonce goes into the P you notice, the
key goes into this P and then there is a
17:03 - 17:08

block counter that goes into this P. So
you just evaluate this function P, twice
17:08 - 17:13

for every block and you XOR the result
before and after encryption using the
17:13 - 17:18

block cipher and that's it. That's all you
have to do and then you get a very fast
17:18 - 17:22

and efficient authenticated encryption
scheme built from a block cipher. So OCB
17:22 - 17:26

actually has a nice security theorem
associated with it and I am going to point
17:26 - 17:30

to a paper on OCB when we get to end of
this module where I list some further
17:30 - 17:34

reading papers that you can take a look
at. So you might be wondering if OCB is so
17:34 - 17:40

much better than everything you've seen so
far, all these three standards CCM, GCM and
17:40 - 17:46

EAX why isn't OCB being used or why isn't
OCB the standard? And the answer is a
17:46 - 17:51

little sad. The primary reason that
OCB is not being used is actually because
17:51 - 17:55

of various patents. And I'll just leave it
at that. So to conclude this section I
17:55 - 17:58

wanted to show you some performance
numbers. So here on the right I listed
17:58 - 18:02

performance numbers for modes that you
shouldn't be using. So this is for
18:02 - 18:08

randomized counter mode, and this is for
randomized CBC. And you can see also the
18:08 - 18:12

performance of CBC MAC is basically the
same as the performance of CBC encryption.
18:12 - 18:16

Okay. Now here are the authenticated
encryption modes, so these are the ones
18:16 - 18:20

that you're supposed to be using, these
you're not supposed to be using on their
18:20 - 18:24

own, right. These two, you should never
ever use these two because they only
18:24 - 18:28

provide CPA security, they don't
actually provide security against active
18:28 - 18:32

attacks. You're only supposed to use
authenticated encryption for encryption.
18:32 - 18:36

And so I listed performance numbers
for the three standards. And let me remind
18:36 - 18:40

you that GCM basically uses a very fast
hash. And then it uses counter mode for
18:40 - 18:44

actual encryption. And you can see that
the overhead of GCM over counter mode is
18:44 - 18:50

relatively small. CCM and EAX both use a
block cipher based encryption and a
18:50 - 18:55

block cipher based MAC. And as a result
they're about twice as slow as counter
18:55 - 18:59

mode. You see that OCB is actually the
fastest of these, primarily because it
18:59 - 19:04

only uses the block cipher once per message
block. So based on these performance
19:04 - 19:08

numbers, you would think that GCM is
exactly the right mode to always use. But
19:08 - 19:13

it turns out if you're on space
constrained hardware, GCM is not ideal.
19:13 - 19:17

Primarily because its implementation
requires larger code than the other two
19:17 - 19:21

modes. However, as I said, Intel
specifically added instructions to speed
19:21 - 19:26

up GCM mode. And as a result, implementing
GCM on an Intel architecture takes
19:26 - 19:30

very little code. But on other hardware
platforms, say in smart cards or other
19:30 - 19:35

constrained environments, the code size
for implementing GCM would be considerably
19:35 - 19:39

larger than for the other two modes. But
if code size is not a constraint then GCM
19:39 - 19:44

is the right mode to use. So to summarize
this segment I want to say it one more
19:44 - 19:48

time that when you want to encrypt
messages you have to use an authenticated
19:48 - 19:53

encryption mode and the recommended way to
do it is to use one of the standards,
19:53 - 19:57

namely one of these three modes for
providing authenticated encryption.
19:57 - 20:00

Don't implement the encryption scheme yourself.
In other words don't implement
20:00 - 20:06

encrypt-then-MAC yourself. Just use one of these
three standards. Many crypto libraries
20:06 - 20:11

now provide standard APIs for these three
modes and these are the ones you should
20:11 - 20:14

be using and nothing else. In the next
segment we're going to see what else can
20:14 - 20:18

go wrong when you try to implement
authenticated encryption by yourself.

