0:00:00.000,0:00:04.052
Now that we know what MACs are, let's go[br]ahead and build our first secure MACs.

0:00:04.052,0:00:08.469
First I want to remind you that a MAC is a[br]pair of algorithms. The first is a signing

0:00:08.469,0:00:12.922
algorithm that's given a message and a key[br]will generate a corresponding tag And the

0:00:12.922,0:00:17.103
second is verification algorithms are[br]given a key message and a tag while

0:00:17.103,0:00:21.736
outputs zero or one depending on whether[br]the tag is valid or not. And we said that

0:00:21.736,0:00:26.313
a MAC is secure if it is existentially[br]unforgeable under a chosen message attack.

0:00:26.313,0:00:30.890
In other words, the attackers allow to[br]mount a chosen message attack where he can

0:00:30.890,0:00:35.298
submit arbitrary messages of his choice[br]and obtain the corresponding tags for

0:00:35.298,0:00:39.520
those messages, and then despite the[br]ability to generate arbitrary tags The

0:00:39.520,0:00:43.616
attacker cannot create a new message-tag pair that was not given to him

0:00:43.616,0:00:47.976
during the chosen message attack. Okay so[br]we've already seen this definition in the

0:00:47.976,0:00:52.179
last segment and now the question is how[br]do we build secure MACs? So the first

0:00:52.179,0:00:57.217
example I want to give you is basically[br]showing that any secure PRF directly gives

0:00:57.217,0:01:01.952
us a secure MAC as well. So let's see how[br]we do it. So suppose we have a pseudo

0:01:01.952,0:01:06.808
random function so the pseudo random[br]function takes and inputs X and outputs in

0:01:06.808,0:01:12.173
Y. And let's define the following MAC. So[br]the way we sign a message 'M' is basically

0:01:12.173,0:01:17.182
by simply evaluating the function at the[br]point 'M' So the tag for the message M is

0:01:17.182,0:01:21.350
simply the value of the function at the[br]point M and then the way we verify a

0:01:21.350,0:01:26.006
message to that pair is by recomputing the[br]value of the function at the message M and

0:01:26.006,0:01:30.283
checking whether that is equal to the tag[br]given to us. We say yes if so and we

0:01:30.283,0:01:34.681
reject otherwise. So here you have in[br]pictures basically that when Alice wants

0:01:34.681,0:01:39.023
to send a message to Bob she computes a[br]tag by the value of PRF and then she

0:01:39.023,0:01:43.252
appends this tag to the message, Bob[br]receives the corresponding tab pair, he

0:01:43.252,0:01:47.820
recomputes the value of the function and[br]tests whether the given tag is actually

0:01:47.820,0:01:52.730
equal to the value of the function at the[br]point M. So let's look at a bad example of

0:01:52.730,0:01:57.832
this instruction. And so suppose that we[br]have a pseudo-random function that happens

0:01:57.832,0:02:02.873
to output only ten bits. Okay, so this is[br]a fine pseudo-random function and it just

0:02:02.873,0:02:07.668
so happens that for any message 'M' it[br]only outputs ten bits value. My question

0:02:07.668,0:02:12.463
to you is if we use this function 'F' to[br]construct a MAC, is that going to be a

0:02:12.463,0:02:17.184
secure MAC? So the answer is no, this MAC[br]is insecure. In particular because the

0:02:17.184,0:02:21.471
tags are too short. So consider the[br]following simple adversary. What the

0:02:21.471,0:02:26.119
adversary will do is simply choose an[br]arbitrary message M and just guess the

0:02:26.119,0:02:30.768
value of the MAC for that particular[br]message. Now, because the tag is only ten

0:02:30.768,0:02:35.175
bits long, the adversary has a chance of[br]one in two to the ten in guessing the MAC

0:02:35.175,0:02:40.004
correctly. In other words, the advantage[br]of the guessing adversary, one distinctly

0:02:40.004,0:02:44.351
guesses a random tag for a given message.[br]That adversary is going to have an

0:02:44.351,0:02:48.898
advantage against the mac that's basically[br]one over two to the ten which is one over

0:02:48.898,0:02:52.962
a thousand twenty four and that's[br]definitely non negligible. So the adversary

0:02:52.962,0:02:57.348
basically will successfully forge the mac[br]on a given message with probability one

0:02:57.348,0:03:01.841
on a thousand which is insecure. However[br]it turns out this is the only example that

0:03:01.841,0:03:06.280
where things can go wrong, only when the[br]output of the function is small can things

0:03:06.280,0:03:10.536
go wrong. If the output of the PRF is[br]large Then we get a secure MAC out of this

0:03:10.536,0:03:14.344
function. And let's state the security[br]theorem here. So suppose we have a

0:03:14.344,0:03:18.257
function 'F' that takes messages in 'X'[br]and outputs tags in 'Y', then the MAC

0:03:18.257,0:03:22.588
that's derived from this PRF in fact is a[br]secure MAC. In particular, if you look at

0:03:22.588,0:03:26.804
the security theorem here, you'll see very[br]clearly the era bounds, in other words

0:03:26.804,0:03:31.179
since the PRF is secure we know that this[br]quantity here is negligible. And so if we

0:03:31.179,0:03:35.395
want this quantity to be negligible, this[br]is what we want, we want to say that no

0:03:35.395,0:03:39.664
adversary can defeat the MAC 'I sub F',[br]that implies that we want this quantity to

0:03:39.664,0:03:43.722
be negligible, in other words we want the[br]output space to be large. And so for

0:03:43.722,0:03:48.096
example, taking a PRF that outputs eighty[br]bits is perfectly fine. That will generate

0:03:48.096,0:03:52.102
an eighty bit MAC and therefore the[br]advantage of any adversary will be at most

0:03:52.102,0:03:56.521
one over two to the eighty. So now the proof[br]of this theorem is really simple, so lets

0:03:56.521,0:04:00.906
just go ahead and do it. So in fact lets[br]start by saying that suppose instead of a

0:04:00.906,0:04:05.446
PRF we have a truly random function from[br]the message space to the tag space so this

0:04:05.446,0:04:10.087
is just a random function from X to Y[br]that's chosen at random from the set of

0:04:10.087,0:04:14.966
all such functions. Now lets see if that[br]function can give us a secure mac. So what

0:04:14.966,0:04:19.551
the adversary says is, 'I want a tag on[br]the message M1'. What he gets back is the

0:04:19.551,0:04:24.157
tag which just happens to be the function[br]evaluated at the point M1. Notice there is

0:04:24.157,0:04:28.489
no key here because F is just a truly[br]random function from X to Y. And then the

0:04:28.489,0:04:33.096
adversary gets to choose a message from M2[br]and he obtains the tag from M2, he choose

0:04:33.096,0:04:37.264
M3, M4 up to MQ and he obtains all the[br]corresponding tags. Now his goal is to

0:04:37.264,0:04:41.432
produce a message tag pair and we say that[br]he wins, remember that this is an

0:04:41.432,0:04:45.891
existential forgery, in other words first[br]of all T has to be equal to F of M This

0:04:45.891,0:04:49.968
means that 'T' is a valid tag for the[br]message 'M'. And second of all, the

0:04:49.968,0:04:54.685
message 'M' has to be new. So the message[br]'M' had better not be one of M-one to M-Q.

0:04:54.685,0:04:58.879
But let's just think about this for a[br]minute, what does this mean? So the

0:04:58.879,0:05:03.830
adversary got to see the value of a truly[br]random function at the points M-one to M-Q

0:05:03.830,0:05:08.800
and now he?s suppose to predict the value[br]of this function as some new point, M

0:05:08.800,0:05:13.411
However, for a truly random function, the[br]value of the function at the point M is

0:05:13.411,0:05:18.195
independent of its value at the points M-1[br]to M-Q. So the best the adversary can do

0:05:18.195,0:05:22.749
at predicting the value of the function at[br]the point M is just guess the value,

0:05:22.749,0:05:27.302
because he has no information about F of[br]M. And as a result his advantage if he

0:05:27.302,0:05:31.625
guesses the value of the function at the[br]point M he'll guess it right with

0:05:31.625,0:05:36.294
probability exactly one over Y. And then[br]the tag that he produced will be correct

0:05:36.294,0:05:40.582
with probability exactly one over Y. Okay,[br]again he had no information about the

0:05:40.582,0:05:44.801
value of the function of M and so the best[br]he could do is guess. And if he guesses,

0:05:44.801,0:05:49.347
he'll get it right with probability one[br]over Y. And now of course, because the

0:05:49.347,0:05:54.420
function capital F is a pseudo random[br]function The adversary is gonna behave the

0:05:54.420,0:05:58.565
same whether we give him the truly random[br]function or the pseudo random function.

0:05:58.565,0:06:02.659
The adversary can't tell the difference[br]and as a result even if we use a pseudo

0:06:02.659,0:06:06.600
random function, the adversary is gonna[br]have advantages at most one over Y in

0:06:06.600,0:06:10.774
winning the game. Okay, so you can see[br]exactly the security theorem, let's go

0:06:10.774,0:06:15.561
back there for just a second. Essentially[br]this is basically why we got an error term

0:06:15.561,0:06:20.005
of one over Y because of the guessing[br]attack and that's the only way that the

0:06:20.005,0:06:24.734
attacker can win the game. So now that we[br]know that any secure PRF is also a secure

0:06:24.734,0:06:29.122
MAC, we already know that we have our[br]first example MAC. In particular, we know

0:06:29.122,0:06:33.680
that AES, or at least we believe that AES[br]is a secure PRF, therefore, AES since it

0:06:33.680,0:06:38.011
takes sixteen byte inputs, right the[br]message space for AES is 128 bits, which

0:06:38.011,0:06:43.212
is sixteen bytes. Therefore the AES cipher[br]essentially gives us a MAC that can match

0:06:43.212,0:06:48.140
messages that are exactly sixteen bytes.[br]Okay, so that's our first example of a

0:06:48.140,0:06:53.257
MAC. But now the question is if we have a[br]PRF for small inputs like AES that only

0:06:53.257,0:06:58.564
acts on sixteen bytes, can we build a MAC[br]for big messages that can act on gigabytes

0:06:58.564,0:07:02.066
of data? Sometimes I call this the[br]McDonald's problem. Basically given a

0:07:02.066,0:07:05.871
small MAC and we build a big MAC out of[br]it. In other words, given a MAC for small

0:07:05.871,0:07:10.234
messages and we build a MAC for large[br]messages. So we're gonna look at two

0:07:10.234,0:07:14.835
constructions for doing so. The first[br]example is called a CBC MAC that again

0:07:14.835,0:07:19.315
takes PRF for small messages as input[br]and produces a PRF for very large

0:07:19.315,0:07:23.686
messages. As outputs. The second one we'll[br]see is HMAC which does the same thing

0:07:23.686,0:07:28.278
again takes a PRF for small inputs and[br]generates a PRF for very large inputs. Now

0:07:28.278,0:07:32.050
the two are used in very different[br]contexts. Cbc-mac is actually very

0:07:32.050,0:07:36.315
commonly used in the banking industry. For[br]example, there's a system called the

0:07:36.315,0:07:40.797
Automatic Clearing House, ACH, which banks[br]use to clear checks with one another and

0:07:40.797,0:07:44.788
that system, CBC-MAC, is used to ensure[br]integrity of the checks as they're

0:07:44.788,0:07:49.107
transferred from bank to bank. On the[br]Internet, protocols like SSL and IPsec and

0:07:49.107,0:07:53.173
SSH, those all use HMAC for integrity. Two[br]different MACs and were gonna discuss them

0:07:53.173,0:07:57.086
in the next couple of segments. And as I[br]said were also gonna start from a PRF for

0:07:57.086,0:08:01.274
small messages and produce PRF for[br]messages that are gigabytes long and in

0:08:01.274,0:08:06.043
particular they can both be substantiated[br]with AES as the underlying cipher. So the

0:08:06.043,0:08:10.597
last comment I want to make about these[br]PRF based MACs is that in fact their

0:08:10.597,0:08:15.269
output can be truncated. So suppose we[br]have a PRF that outputs N bit outputs. So

0:08:15.269,0:08:19.941
again for AES this would be a PRF that[br]outputs 128 bits as outputs. Its an easy

0:08:19.941,0:08:24.909
lemma to show that in fact if you have an[br]N bit PRF if you truncated, in other words

0:08:24.909,0:08:31.062
if you only output first key bits The[br]result is also a secure PRF and the

0:08:31.062,0:08:36.529
intuition here is if the big PRF outputs N[br]bits of randomness for any inputs that you

0:08:36.529,0:08:42.059
give to the PRF then certainly chopping it[br]off and truncating it to T bits is still

0:08:42.059,0:08:46.572
gonna look random. The attacker now gets[br]less information so his job of

0:08:46.572,0:08:51.657
distinguishing the outputs from random[br]just became harder. In other words, if the

0:08:51.657,0:08:56.742
N bit PRF is secure, then the T less than[br]N bit PRF, the truncated PRF, would also

0:08:56.742,0:09:01.177
be secure. So this is an easy lemma and[br]since any secure PRF also gives us a

0:09:01.177,0:09:05.993
secure MAC, what this means is if you give[br]me a MAC that's based on a PRF and what I

0:09:05.993,0:09:10.686
can do is I can truncate it to W bits,[br]however, because of the error term in the

0:09:10.864,0:09:15.379
MAC based PRF theorem we know that[br]truncating to W bits will only be secure

0:09:15.379,0:09:19.954
as long as one over two to the W is[br]negligible. So if you truncate the PRF to

0:09:19.954,0:09:24.410
only three bits, the resulting MAC is not[br]going to be secure. However, if you

0:09:24.410,0:09:29.222
truncate it to say 80 bits or maybe even[br]64 bits, then the resulting MAC is still

0:09:29.222,0:09:33.974
gonna be a secure MAC. Okay, so the thing[br]to remember here is that even though we

0:09:33.974,0:09:39.235
use AES to construct larger PRFs and the[br]output of these PRFs is gonna be 128 bits,

0:09:39.235,0:09:44.115
it doesn't mean that the MAC itself has to[br]produce 128 bit tags We can always

0:09:44.115,0:09:48.550
truncate the outputs to 90 bits or 80[br]bits, and as a result, we would get still

0:09:48.550,0:09:53.097
secure MACs but now the output tag is[br]gonna be more reasonable size and doesn't

0:09:53.097,0:09:57.702
have to be the full 128 bits. Okay, so in[br]the next segment we're gonna look at how

0:09:57.702,0:09:58.726
the CBC-MAC works.