Timing attacks on MAC verification (9 min)

Edit subtitles

0:00 - 0:04

In the last segment in this module, I
wanna show you a general attack that
0:04 - 0:08

affects many implementations of Mac
algorithms. And there's a nice lesson to
0:08 - 0:12

be learned from an attack like this. So
let's look at a particular implementation
0:12 - 0:16

of HMAC verification. This happens to be
an implementation from the Keyczar library,
0:16 - 0:20

that happens to be written in Python. So
here's the code that's used to verify a
0:20 - 0:24

tag generated by HMAC. This code is
actually simplified. I just wanted to
0:24 - 0:28

kinda simplify it as much as I can to get
the point across. So basically, what the
0:28 - 0:32

inputs are, the key, the message, and the
tag bytes. The way we verify it is, we
0:32 - 0:38

re-compute the HMAC on the message and
then we compare say the resulting sixteen
0:38 - 0:43

bytes. To the actual given signature
bites. So this looks perfectly fine.
0:43 - 0:48

In fact, anyone might implement it like this.
And, in fact, many people have implemented
0:48 - 0:52

it like this. The problem is, that if you
look at how the comparison is done, the
0:52 - 0:57

comparison, as you might expect, is done
byte by byte. There's a loop inside of the
0:57 - 1:01

Python interpreter that loops over all
sixteen bytes. And it so happens that the
1:01 - 1:06

first time it finds an inequality, the
loop terminates and says the strings are
1:06 - 1:11

not equal. And the fact that the
comparator exits when the first inequality
1:11 - 1:16

is found introduces a significant timing
attack on this library. So let me show you
1:16 - 1:21

how one might attack it. So imagine you're
the attacker, and you have the message m,
1:21 - 1:26

for which you want to obtain a valid tag.
Now, your goal is to attack a server that
1:26 - 1:31

has an HMAC secret key stored in it. And
the server exposes an interface that
1:31 - 1:36

basically takes message MAC pairs. Checks
that the MAC is valid, if the MAC is valid
1:36 - 1:40

it does something with the message. And if
the MAC is not valid, it says reject.
1:40 - 1:45

Okay, so it's back to the originator or
the message rejects. So now this attacker
1:45 - 1:50

has an opportunity to basically submit
lots of message it appears and see if it
1:50 - 1:54

can deduce the tags of the particular
message for which it once attacked. Here's
1:54 - 1:59

how we might use the broken implementation
from the previous slide to do just that.
1:59 - 2:04

So what the attacker is gonna do is to
submit many message tag queries, where the
2:04 - 2:08

message is always the same. But with a
tag, he's gonna experiment with lots and
2:08 - 2:13

lots and lots of different tags. So in the
first query, what he's gonna do is just
2:13 - 2:17

submit a random tag along with the target
message. And he's gonna measure how long
2:17 - 2:22

the server took to respond. The next query
that he's gonna submit, is he's gonna try
2:22 - 2:26

all possible first bytes for the tags. Let
me explain what I mean by that. So the
2:26 - 2:30

remaining bytes of the tags that he
submits are just arbitrary, doesn't really
2:30 - 2:35

matter what they are. But for the first
bite, what he'll do is he'll submit a tag
2:35 - 2:39

starting with a byte zero. And then he's
gonna see whether the server took a little
2:39 - 2:44

bit longer to verify the tag than before.
If the server took exactly the same amount
2:44 - 2:49

of time to verify the tag as in step one,
then he's gonna try again, this time with
2:49 - 2:53

bytes at one. If still the server
responded very quickly, he's going to try
2:53 - 2:57

with byte sets of two. If the server
responded quickly then he's going to try
2:57 - 3:01

with byte sets of three, and so on until
finally, let's say, when the byte sets of
3:01 - 3:05

three the server takes a little bit longer
to respond. What that means is actually
3:05 - 3:09

when it did the comparison between the
correct MAC and the MAC submitted by the
3:09 - 3:14

attacker. The two matched on this byte,
and the rejection happened on the second
3:14 - 3:19

bytes. Aha. So now the attacker knows that
the first bite of the tag is set to three
3:19 - 3:23

and now he can mount exactly the same
attack on the second bite. So here. It's
3:23 - 3:29

going to submit the tag. And the second,
back here. Here This should go here. So
3:29 - 3:33

it's gonna submit a tag when the second
byte is set to zero. And it's gonna
3:33 - 3:37

measure whether this took a little bit
longer than in step two. If not, he's
3:37 - 3:41

gonna change this to be set to one, and
he's gonna measure if the server's
3:41 - 3:45

response time is a little longer than
before. Eventually, let's say when he sets
3:45 - 3:49

this to, I don't know. When the byte is
set to, to 53, say, all of a sudden, the
3:49 - 3:53

server takes a little bit longer to
respond. Which means that now, the
3:53 - 3:57

comparator matched on the first two bytes.
And now the attacker learned that the
3:57 - 4:01

first two bytes of the Mac are three and
53. And now he can move on and do the
4:01 - 4:05

exact same thing on the third byte and
then on the fourth byte and so on and so
4:05 - 4:09

forth. Until finally, the server says,
yes, I accept. You actually gave me the
4:09 - 4:14

right Mac. And then we'll go ahead and act
on this bogus message. That, attack our
4:14 - 4:19

supply. So this is a beautiful example of
how a timing attack can reveal the value
4:19 - 4:23

of a MAC, the correct value of the MAC.
Kind of byte by byte, until eventually,
4:23 - 4:28

the attacker obtains all the correct bytes
of the tag, and then he's able to fool the
4:28 - 4:33

server into accepting this message tag
pair. The reason I like this example is
4:33 - 4:37

this is a perfectly reasonable way of
implementing a MAC verification routine.
4:37 - 4:42

And yet, if you right it this way, it will
be completely broken. So what do we do? So
4:42 - 4:47

let me show you two defenses, the first
defense, I'll write it in again in python
4:47 - 4:51

is, is as follows. In fact the Keyczar
library exactly implemented this defense.
4:51 - 4:56

This code is actually taken out of the
updated version of the library. The first
4:56 - 5:00

thing we do is we test if the signature
bytes submitted by the attacker are of the
5:00 - 5:05

correct length, say for HMAC this would
be say, you know 96 bits or 128 bits, and
5:05 - 5:09

if not we reject that as an invalid MAC.
But now, if the signature bytes really do
5:09 - 5:13

have the correct length, what we do is
implement our own comparator. And it
5:13 - 5:18

always takes the same amount of time to
compare the two strings. So in particular,
5:18 - 5:22

this uses the zip function in Python,
which will, essentially, if you are giving
5:22 - 5:28

it two sixteen byte strings. It will
create sixteen pairs. Of bytes. So it'll
5:28 - 5:33

just create a, a list of sixteen elements,
where each element is a pair of bytes. One
5:33 - 5:37

taken from the left and one taken from the
right. And then you loop, you know, you
5:37 - 5:41

loop through this list of pairs. You
compute the XOR of the first pair, and the
5:41 - 5:45

OR into the result. Then you
compute the XOR of the second pair, and
5:45 - 5:50

you OR that into the result. And
you note that, if at any point in this
5:50 - 5:54

loop, two bytes happen to be not equal,
then the XOR will evaluate to something
5:54 - 5:59

that's non zero. And therefore, when we
OR'ed it into the result. The result
5:59 - 6:03

will also be counting on zero, and then
we'll return false, at the end of the
6:03 - 6:07

comparison. So the point here is that now
the comparator always takes the same
6:07 - 6:11

amount of time. Even if it finds
difference in byte number three, it will
6:11 - 6:15

continue running down the both strings
until the very end. And only then will it
6:15 - 6:20

return the results. So now the timing
attack supposedly is impossible. However,
6:20 - 6:25

this can be quite problematic, because
compilers tried to be too helpful here. So
6:25 - 6:30

an optimized compiler might look at this
code and say, hey, wait a minute. I can
6:30 - 6:35

actually improve this code by making the
four loop end. As soon as an incompatible
6:35 - 6:39

set of bytes is discovered. And so, an
optimizing compiler could be your, kind
6:39 - 6:44

of, Achilles heel when it comes to making
programs always take the same amount of
6:44 - 6:48

time. And so a different defense, which is
not as widely implemented, is to try and
6:48 - 6:53

hide from the adversary, what strings are
actually being compared. So let me show
6:53 - 6:57

you what I mean by that. So again, here we
have our verification algorithm. So it
6:57 - 7:02

takes as inputs, a key, a message, and a
candidate's MAC from the adversary. And
7:02 - 7:06

then, the way we do the comparison is we
first of all, compute the correct MAC on
7:06 - 7:10

the message. But then instead of directly
comparing the MAC and the signature
7:10 - 7:15

bytes adversary, what we're gonna do
is we're gonna hash one more time. So we
7:15 - 7:19

compute a hash here of the MAC. We compute
a hash of the signature bytes. Of course,
7:19 - 7:24

if these two happen to be the same, then
the resulting HMACs will also be the
7:24 - 7:28

same, so the comparison will actually
succeed. But the point is now, if sig
7:28 - 7:32

bytes happen to equal MAC on the first
byte, but not on the remaining bytes.
7:32 - 7:36

Then, when we do this additional hash
layer, it's likely that the two resulting
7:36 - 7:40

values are completely different. And as a
result, the byte by byte comparator will
7:40 - 7:44

just output on the first iteration. The
point here is that the adversary doesn't
7:44 - 7:47

actually know what strings are being
compared. And as a result, he can't
7:47 - 7:51

mount a timing attack that we
discussed earlier. Okay, so this is
7:51 - 7:55

another defense. At least now, you're not
at the mercy of an optimizing compiler.
7:55 - 8:00

The main lesson from all of this is that
you realize that people who even are
8:00 - 8:04

experts at implementing cryptolibraries,
get this stuff wrong. And the right code
8:04 - 8:08

that words perfectly fine and yet is
completely vulnerable to a timing attack
8:08 - 8:12

that completely undo all security of the
system. So the lesson here is of course
8:12 - 8:16

you should not be inventing your own
crypto but you shouldn't even be
8:16 - 8:20

implementing your own crypto because most
likely it'll be vulnerable to the side
8:20 - 8:24

channel attacks. Just use a standard
library like OpenSSL. Keyczar is actually a
8:24 - 8:28

fine library to use that would reduce the
chances that you're vulnerable to these
8:28 - 8:28

types of attacks.

Title:: Timing attacks on MAC verification (9 min)
Video Language:: English

	iracly.kv edited English subtitles for Timing attacks on MAC verification (9 min)
	iracly.kv edited English subtitles for Timing attacks on MAC verification (9 min)
	iracly.kv edited English subtitles for Timing attacks on MAC verification (9 min)
	stanford-bot edited English subtitles for Timing attacks on MAC verification (9 min)
	stanford-bot edited English subtitles for Timing attacks on MAC verification (9 min)
	stanford-bot added a translation

English subtitles

Revisions

Revision 2

iracly.kv

Timing attacks on MAC verification (9 min)

Revisions

Our website uses cookies

Operating cookies (Required)