1
00:00:00,000 --> 00:00:03,757
In the last segment in this module, I
wanna show you a general attack that

2
00:00:03,757 --> 00:00:07,615
affects many implementations of Mac
algorithms. And there's a nice lesson to

3
00:00:07,615 --> 00:00:11,727
be learned from an attack like this. So
let's look at a particular implementation

4
00:00:11,727 --> 00:00:15,941
of HMAC verification. This happens to be
an implementation from the Keyczar library,

5
00:00:15,941 --> 00:00:20,003
that happens to be written in Python. So
here's the code that's used to verify a

6
00:00:20,003 --> 00:00:23,709
tag generated by HMAC. This code is
actually simplified. I just wanted to

7
00:00:23,709 --> 00:00:27,822
kinda simplify it as much as I can to get
the point across. So basically, what the

8
00:00:27,822 --> 00:00:32,235
inputs are, the key, the message, and the
tag bytes. The way we verify it is, we

9
00:00:32,235 --> 00:00:38,082
re-compute the HMAC on the message and
then we compare say the resulting sixteen

10
00:00:38,082 --> 00:00:43,271
bytes. To the actual given signature
bites. So this looks perfectly fine.

11
00:00:43,271 --> 00:00:47,834
In fact, anyone might implement it like this.
And, in fact, many people have implemented

12
00:00:47,834 --> 00:00:52,231
it like this. The problem is, that if you
look at how the comparison is done, the

13
00:00:52,231 --> 00:00:56,740
comparison, as you might expect, is done
byte by byte. There's a loop inside of the

14
00:00:56,740 --> 00:01:01,373
Python interpreter that loops over all
sixteen bytes. And it so happens that the

15
00:01:01,373 --> 00:01:06,297
first time it finds an inequality, the
loop terminates and says the strings are

16
00:01:06,297 --> 00:01:10,972
not equal. And the fact that the
comparator exits when the first inequality

17
00:01:10,972 --> 00:01:16,146
is found introduces a significant timing
attack on this library. So let me show you

18
00:01:16,146 --> 00:01:21,257
how one might attack it. So imagine you're
the attacker, and you have the message m,

19
00:01:21,257 --> 00:01:26,368
for which you want to obtain a valid tag.
Now, your goal is to attack a server that

20
00:01:26,368 --> 00:01:31,230
has an HMAC secret key stored in it. And
the server exposes an interface that

21
00:01:31,230 --> 00:01:36,089
basically takes message MAC pairs. Checks
that the MAC is valid, if the MAC is valid

22
00:01:36,089 --> 00:01:40,450
it does something with the message. And if
the MAC is not valid, it says reject.

23
00:01:40,450 --> 00:01:45,039
Okay, so it's back to the originator or
the message rejects. So now this attacker

24
00:01:45,039 --> 00:01:49,685
has an opportunity to basically submit
lots of message it appears and see if it

25
00:01:49,685 --> 00:01:54,274
can deduce the tags of the particular
message for which it once attacked. Here's

26
00:01:54,274 --> 00:01:59,036
how we might use the broken implementation
from the previous slide to do just that.

27
00:01:59,036 --> 00:02:03,661
So what the attacker is gonna do is to
submit many message tag queries, where the

28
00:02:03,661 --> 00:02:08,044
message is always the same. But with a
tag, he's gonna experiment with lots and

29
00:02:08,044 --> 00:02:12,595
lots and lots of different tags. So in the
first query, what he's gonna do is just

30
00:02:12,595 --> 00:02:17,202
submit a random tag along with the target
message. And he's gonna measure how long

31
00:02:17,202 --> 00:02:21,674
the server took to respond. The next query
that he's gonna submit, is he's gonna try

32
00:02:21,674 --> 00:02:25,896
all possible first bytes for the tags. Let
me explain what I mean by that. So the

33
00:02:25,896 --> 00:02:30,011
remaining bytes of the tags that he
submits are just arbitrary, doesn't really

34
00:02:30,011 --> 00:02:34,557
matter what they are. But for the first
bite, what he'll do is he'll submit a tag

35
00:02:34,557 --> 00:02:39,392
starting with a byte zero. And then he's
gonna see whether the server took a little

36
00:02:39,392 --> 00:02:44,285
bit longer to verify the tag than before.
If the server took exactly the same amount

37
00:02:44,285 --> 00:02:49,062
of time to verify the tag as in step one,
then he's gonna try again, this time with

38
00:02:49,062 --> 00:02:52,976
bytes at one. If still the server
responded very quickly, he's going to try

39
00:02:52,976 --> 00:02:56,961
with byte sets of two. If the server
responded quickly then he's going to try

40
00:02:56,961 --> 00:03:01,101
with byte sets of three, and so on until
finally, let's say, when the byte sets of

41
00:03:01,101 --> 00:03:05,344
three the server takes a little bit longer
to respond. What that means is actually

42
00:03:05,344 --> 00:03:09,484
when it did the comparison between the
correct MAC and the MAC submitted by the

43
00:03:09,484 --> 00:03:14,334
attacker. The two matched on this byte,
and the rejection happened on the second

44
00:03:14,334 --> 00:03:19,073
bytes. Aha. So now the attacker knows that
the first bite of the tag is set to three

45
00:03:19,073 --> 00:03:23,435
and now he can mount exactly the same
attack on the second bite. So here. It's

46
00:03:23,435 --> 00:03:28,519
going to submit the tag. And the second,
back here. Here This should go here. So

47
00:03:28,519 --> 00:03:32,514
it's gonna submit a tag when the second
byte is set to zero. And it's gonna

48
00:03:32,514 --> 00:03:36,516
measure whether this took a little bit
longer than in step two. If not, he's

49
00:03:36,516 --> 00:03:40,502
gonna change this to be set to one, and
he's gonna measure if the server's

50
00:03:40,502 --> 00:03:44,758
response time is a little longer than
before. Eventually, let's say when he sets

51
00:03:44,758 --> 00:03:48,960
this to, I don't know. When the byte is
set to, to 53, say, all of a sudden, the

52
00:03:48,960 --> 00:03:52,677
server takes a little bit longer to
respond. Which means that now, the

53
00:03:52,677 --> 00:03:56,943
comparator matched on the first two bytes.
And now the attacker learned that the

54
00:03:56,943 --> 00:04:01,056
first two bytes of the Mac are three and
53. And now he can move on and do the

55
00:04:01,056 --> 00:04:05,274
exact same thing on the third byte and
then on the fourth byte and so on and so

56
00:04:05,274 --> 00:04:09,175
forth. Until finally, the server says,
yes, I accept. You actually gave me the

57
00:04:09,175 --> 00:04:13,858
right Mac. And then we'll go ahead and act
on this bogus message. That, attack our

58
00:04:13,858 --> 00:04:18,711
supply. So this is a beautiful example of
how a timing attack can reveal the value

59
00:04:18,711 --> 00:04:23,140
of a MAC, the correct value of the MAC.
Kind of byte by byte, until eventually,

60
00:04:23,140 --> 00:04:28,094
the attacker obtains all the correct bytes
of the tag, and then he's able to fool the

61
00:04:28,094 --> 00:04:32,640
server into accepting this message tag
pair. The reason I like this example is

62
00:04:32,640 --> 00:04:37,186
this is a perfectly reasonable way of
implementing a MAC verification routine.

63
00:04:37,186 --> 00:04:41,941
And yet, if you right it this way, it will
be completely broken. So what do we do? So

64
00:04:41,941 --> 00:04:46,509
let me show you two defenses, the first
defense, I'll write it in again in python

65
00:04:46,509 --> 00:04:51,020
is, is as follows. In fact the Keyczar
library exactly implemented this defense.

66
00:04:51,020 --> 00:04:55,588
This code is actually taken out of the
updated version of the library. The first

67
00:04:55,588 --> 00:05:00,328
thing we do is we test if the signature
bytes submitted by the attacker are of the

68
00:05:00,328 --> 00:05:04,896
correct length, say for HMAC this would
be say, you know 96 bits or 128 bits, and

69
00:05:04,896 --> 00:05:09,421
if not we reject that as an invalid MAC.
But now, if the signature bytes really do

70
00:05:09,421 --> 00:05:13,466
have the correct length, what we do is
implement our own comparator. And it

71
00:05:13,466 --> 00:05:17,895
always takes the same amount of time to
compare the two strings. So in particular,

72
00:05:17,895 --> 00:05:22,159
this uses the zip function in Python,
which will, essentially, if you are giving

73
00:05:22,159 --> 00:05:28,116
it two sixteen byte strings. It will
create sixteen pairs. Of bytes. So it'll

74
00:05:28,116 --> 00:05:32,666
just create a, a list of sixteen elements,
where each element is a pair of bytes. One

75
00:05:32,666 --> 00:05:37,051
taken from the left and one taken from the
right. And then you loop, you know, you

76
00:05:37,051 --> 00:05:41,326
loop through this list of pairs. You
compute the XOR of the first pair, and the

77
00:05:41,326 --> 00:05:45,492
OR into the result. Then you
compute the XOR of the second pair, and

78
00:05:45,492 --> 00:05:49,932
you OR that into the result. And
you note that, if at any point in this

79
00:05:49,932 --> 00:05:54,207
loop, two bytes happen to be not equal,
then the XOR will evaluate to something

80
00:05:54,207 --> 00:05:58,577
that's non zero. And therefore, when we
OR'ed it into the result. The result

81
00:05:58,577 --> 00:06:02,632
will also be counting on zero, and then
we'll return false, at the end of the

82
00:06:02,632 --> 00:06:06,578
comparison. So the point here is that now
the comparator always takes the same

83
00:06:06,578 --> 00:06:10,720
amount of time. Even if it finds
difference in byte number three, it will

84
00:06:10,720 --> 00:06:15,479
continue running down the both strings
until the very end. And only then will it

85
00:06:15,479 --> 00:06:20,244
return the results. So now the timing
attack supposedly is impossible. However,

86
00:06:20,244 --> 00:06:25,256
this can be quite problematic, because
compilers tried to be too helpful here. So

87
00:06:25,256 --> 00:06:30,143
an optimized compiler might look at this
code and say, hey, wait a minute. I can

88
00:06:30,143 --> 00:06:35,107
actually improve this code by making the
four loop end. As soon as an incompatible

89
00:06:35,107 --> 00:06:39,378
set of bytes is discovered. And so, an
optimizing compiler could be your, kind

90
00:06:39,378 --> 00:06:43,930
of, Achilles heel when it comes to making
programs always take the same amount of

91
00:06:43,930 --> 00:06:48,482
time. And so a different defense, which is
not as widely implemented, is to try and

92
00:06:48,482 --> 00:06:52,978
hide from the adversary, what strings are
actually being compared. So let me show

93
00:06:52,978 --> 00:06:57,417
you what I mean by that. So again, here we
have our verification algorithm. So it

94
00:06:57,417 --> 00:07:01,740
takes as inputs, a key, a message, and a
candidate's MAC from the adversary. And

95
00:07:01,740 --> 00:07:06,156
then, the way we do the comparison is we
first of all, compute the correct MAC on

96
00:07:06,156 --> 00:07:10,407
the message. But then instead of directly
comparing the MAC and the signature

97
00:07:10,407 --> 00:07:14,933
bytes adversary, what we're gonna do
is we're gonna hash one more time. So we

98
00:07:14,933 --> 00:07:19,459
compute a hash here of the MAC. We compute
a hash of the signature bytes. Of course,

99
00:07:19,459 --> 00:07:23,765
if these two happen to be the same, then
the resulting HMACs will also be the

100
00:07:23,765 --> 00:07:27,794
same, so the comparison will actually
succeed. But the point is now, if sig

101
00:07:27,794 --> 00:07:31,690
bytes happen to equal MAC on the first
byte, but not on the remaining bytes.

102
00:07:31,690 --> 00:07:35,607
Then, when we do this additional hash
layer, it's likely that the two resulting

103
00:07:35,607 --> 00:07:39,675
values are completely different. And as a
result, the byte by byte comparator will

104
00:07:39,675 --> 00:07:43,693
just output on the first iteration. The
point here is that the adversary doesn't

105
00:07:43,693 --> 00:07:47,258
actually know what strings are being
compared. And as a result, he can't

106
00:07:47,258 --> 00:07:50,809
mount a timing attack that we
discussed earlier. Okay, so this is

107
00:07:50,809 --> 00:07:55,447
another defense. At least now, you're not
at the mercy of an optimizing compiler.

108
00:07:55,447 --> 00:08:00,027
The main lesson from all of this is that
you realize that people who even are

109
00:08:00,027 --> 00:08:04,490
experts at implementing cryptolibraries,
get this stuff wrong. And the right code

110
00:08:04,490 --> 00:08:08,351
that words perfectly fine and yet is
completely vulnerable to a timing attack

111
00:08:08,351 --> 00:08:12,310
that completely undo all security of the
system. So the lesson here is of course

112
00:08:12,310 --> 00:08:15,775
you should not be inventing your own
crypto but you shouldn't even be

113
00:08:15,775 --> 00:08:19,785
implementing your own crypto because most
likely it'll be vulnerable to the side

114
00:08:19,785 --> 00:08:23,546
channel attacks. Just use a standard
library like OpenSSL. Keyczar is actually a

115
00:08:23,546 --> 00:08:27,605
fine library to use that would reduce the
chances that you're vulnerable to these

116
00:08:27,605 --> 00:08:28,447
types of attacks.