1 00:00:00,000 --> 00:00:03,837 In this segment, we're gonna construct authenticated encryption systems. Since we 2 00:00:03,837 --> 00:00:08,250 already have CPA secured encryption, and we have secure MACs, the natural question 3 00:00:08,250 --> 00:00:12,824 is whether we can combine the two somehow, in order to get authenticated encryption. 4 00:00:12,824 --> 00:00:15,721 And if, that's exactly what we're gonna do in this segment. Authenticated encryption 5 00:00:15,721 --> 00:00:20,447 was introduced in the year 2000, in two independent papers that I point to at the 6 00:00:20,447 --> 00:00:25,915 end of this module. But before then, many crytpolibraries provided an API that 7 00:00:25,915 --> 00:00:31,215 separately supported CPA secure encryption, and MAC-ing. So there was one 8 00:00:31,215 --> 00:00:36,175 function for implementing CPA secure encryption. For example, CBC with a random 9 00:00:36,175 --> 00:00:41,136 IV. And another function for implementing a MAC. And then every developer that 10 00:00:41,136 --> 00:00:45,646 wanted to implement encryption, had to, himself, call separately the CPA secure 11 00:00:45,646 --> 00:00:50,552 encryption scheme and the MAC scheme. In particular, every developer had to invent 12 00:00:50,552 --> 00:00:55,697 his own way of combining encryption and MAC-ing to provide some sort of 13 00:00:55,697 --> 00:00:59,376 authenticated encryption. But since the goals of combining encryption and MAC-ing 14 00:00:59,376 --> 00:01:03,690 wasn't well understood since authenticated encryption hasn't yet been defined, it 15 00:01:03,690 --> 00:01:08,497 wasn't really clear which combinations of encryption and MAC-ing are correct and 16 00:01:08,497 --> 00:01:13,243 which aren't. And so, every project as I said had to invent its own combination. 17 00:01:13,243 --> 00:01:17,202 And in fact, not all combinations were correct. And I can tell you that the most 18 00:01:17,202 --> 00:01:22,556 common mistake in software projects were basically incorrectly combining the 19 00:01:22,556 --> 00:01:27,025 encryption and integrity mechanisms. So hopefully, by the end of this module, you 20 00:01:27,025 --> 00:01:31,260 will know how to combine them correctly, and you won't be making these mistakes 21 00:01:31,260 --> 00:01:35,174 yourself. So let's look at some combinations of CPA secure encryption and 22 00:01:35,174 --> 00:01:39,303 MAC, that were introduced by different projects. So here are three examples. So, 23 00:01:39,303 --> 00:01:43,816 first of all, in all three examples, there's a separate key for encryption, and 24 00:01:43,816 --> 00:01:47,774 a separate key for MACing. These two keys are independent of one another, and 25 00:01:47,774 --> 00:01:52,224 both are generated at session setup time. And we're gonna see how to generate these 26 00:01:52,224 --> 00:01:57,071 two keys later on in the course. So the first example is the SSL protocol. So the 27 00:01:57,071 --> 00:02:02,681 way SSL combines encryption and MAC in the hope of achieving authenticated encryption 28 00:02:02,681 --> 00:02:07,656 is the following. Basically you take the plain text, m, and then you compute a MAC 29 00:02:07,656 --> 00:02:13,415 on the plain text, m. So you use your MAC key, kI, to compute tag for this message 30 00:02:13,415 --> 00:02:17,787 m. And then you can concatenate the tag to the message and then you encrypt the 31 00:02:17,787 --> 00:02:22,580 concatenation of the message and the tag and what comes out is the actual final cipher text. 32 00:02:22,580 --> 00:02:26,710 So that's option number one. The second option is what IPsec does. So 33 00:02:26,710 --> 00:02:31,160 here, you take the message. The first thing you do is you encrypt the message. 34 00:02:31,160 --> 00:02:35,692 And then, you compute a tag on the resulting cipher text. So you notice the 35 00:02:35,692 --> 00:02:40,238 tag itself is computed on the resulting cipher text. A third option is what the 36 00:02:40,238 --> 00:02:45,429 SSH protocol does. So here, the SSH takes the message, and encrypts it using a CPA 37 00:02:45,429 --> 00:02:50,944 secure encryption scheme. And then, to it, it concatenates a tag of the message. The 38 00:02:50,944 --> 00:02:55,567 difference between IPsec and SSH, is that in IPsec, the tag is computed over the 39 00:02:55,567 --> 00:02:59,988 cipher text, whereas, in SSH, the tag is computed over the message. And so these 40 00:02:59,988 --> 00:03:04,536 are three completely different ways of combining encryption and MAC. And the 41 00:03:04,536 --> 00:03:09,090 question is, which one of these is secure? So, I will let you think about this for a 42 00:03:09,090 --> 00:03:12,105 second, and then when we continue we'll do the analysis together. 43 00:03:13,197 --> 00:03:17,164 Okay. So let's start with the SSH method. So in the SSH method you notice that the tag 44 00:03:17,164 --> 00:03:21,629 is computed on the message and then concatenated in the clear to the cipher text. 45 00:03:21,629 --> 00:03:26,407 Now this is actually quite a problem because MACs themselves are not designed 46 00:03:26,407 --> 00:03:30,784 to provide confidentiality. MACs are only designed for integrity. And in fact, there's 47 00:03:30,784 --> 00:03:36,008 nothing wrong with a MAC that as part of the tag outputs a few bits of the plain 48 00:03:36,008 --> 00:03:41,458 text. Outputs a few bits of the message M. That would be a perfectly fine tag. And yet if 49 00:03:41,458 --> 00:03:46,667 we did that, that would completely break CPA security here, because some bits of 50 00:03:46,667 --> 00:03:51,815 the message are leaked in the cipher text. And so the SSH approach, even though the 51 00:03:51,815 --> 00:03:56,595 specifics of SSH are fine and the protocol itself is not compromised by 52 00:03:56,595 --> 00:04:00,818 this specific combination, generally it's advisable not to use this approach. Simply 53 00:04:00,818 --> 00:04:05,928 because the output of the MAC signing algorithm might leak bits of the message. So 54 00:04:05,928 --> 00:04:11,481 now let's look at SSL and IPsec. As it turns out, the recommended method actually 55 00:04:11,481 --> 00:04:16,556 is the IPsec method. Because it turns out no matter what CPA secure system and MAC 56 00:04:16,556 --> 00:04:21,134 key you use the combination is always gonna provide authenticated encryption. 57 00:04:21,134 --> 00:04:26,070 Now let me very, very briefly explain why. Basically what happens is once we encrypt 58 00:04:26,070 --> 00:04:31,005 the message well the message contents now is hidden inside the cipher text and now 59 00:04:31,005 --> 00:04:35,761 when we compute a tag of the cipher text basically we're locking, this tag locks 60 00:04:35,761 --> 00:04:40,815 the cipher text and makes sure no one can produce a different cipher text that would 61 00:04:40,815 --> 00:04:45,314 look valid. And as a result this approach ensures that any modifications to the 62 00:04:45,314 --> 00:04:49,555 cipher text will be detected by the decrypter simply because the MAC isn't 63 00:04:49,555 --> 00:04:54,026 gonna verify. As it turns out, for the SSL approach, there actually are kind of 64 00:04:54,026 --> 00:04:59,348 pathological examples, where you combine CPA secure encryption system with a secure 65 00:04:59,348 --> 00:05:03,542 MAC. And the result is vulnerable to a chosen cipher text attack, so that it does 66 00:05:03,542 --> 00:05:07,978 not actually provide authenticated encryption. And basically, the reason that 67 00:05:07,978 --> 00:05:12,824 could happen, is that there's some sort of a bad interaction between the encryption 68 00:05:12,824 --> 00:05:17,319 scheme and the MAC algorithm. Such that, in fact, there will be a chosen cipher 69 00:05:17,319 --> 00:05:21,752 text attack. So if you're designing a new project the recommendation now is to 70 00:05:21,752 --> 00:05:26,162 always use encrypt then MAC because that is secure no matter which CPA secure 71 00:05:26,162 --> 00:05:30,607 encryption and secure MAC algorithm you're combining. Now, just to set the 72 00:05:30,607 --> 00:05:37,995 terminology, the SSL method is sometimes called MAC-then-encrypt. And the 73 00:05:37,995 --> 00:05:45,039 IPsec method is called encrypt-then-MAC. The SSH method even though you're 74 00:05:45,039 --> 00:05:51,898 not supposed to use it, is called encrypt-and-MAC. Okay, so I'll often refer to 75 00:05:51,898 --> 00:05:57,002 encrypt-then-MAC, and MAC-then-encrypt to differentiate SSL and IPsec. Okay, so 76 00:05:57,002 --> 00:06:02,112 just to repeat what I've just said. The IPsec method encrypt-then-MAC always 77 00:06:02,112 --> 00:06:07,160 provides authenticated encryption. If you start from a CPA secure cipher and a secure MAC 78 00:06:07,160 --> 00:06:11,110 you will always get authenticated encryption. As I said, MAC-then-encrypt in 79 00:06:11,110 --> 00:06:15,737 fact, there are pathological cases where the result is vulnerable to CCA attacks and 80 00:06:15,737 --> 00:06:20,308 therefore does not provide authenticated encryption. However, the story's a little 81 00:06:20,308 --> 00:06:24,653 bit more interesting than that, in that, it turns out, if you're actually using 82 00:06:24,653 --> 00:06:29,224 randomized counter mode or randomized CBC, then it turns out, for those particular 83 00:06:29,224 --> 00:06:33,625 CPA secure encryption schemes, MAC-then-encrypt actually does provide authenticated 84 00:06:33,625 --> 00:06:38,028 encryption and therefore it is secure. In fact, there's even a more interesting 85 00:06:38,028 --> 00:06:42,334 twist here in that if you're using randomized counter mode. Then, it's enough 86 00:06:42,334 --> 00:06:46,804 that your MAC algorithm just be one time secure. It doesn't have to be a fully 87 00:06:46,804 --> 00:06:51,561 secure MAC. It just has to be secure when a key is used to encrypt a single message, 88 00:06:51,561 --> 00:06:56,088 okay? And when we talked about message integrity, we saw that there are actually 89 00:06:56,088 --> 00:07:00,575 much faster MACs that are one time secure than MACs that are fully secure. As a 90 00:07:00,575 --> 00:07:04,454 result, if you're using randomized counter mode MAC-then-encrypt could actually 91 00:07:04,454 --> 00:07:08,187 result in a more efficient encryption mechanism. However, I'm going to repeat 92 00:07:08,187 --> 00:07:12,213 this again. The recommendation is to use encrypt-then-MAC and we're going to see a 93 00:07:12,213 --> 00:07:16,093 number of attacks on systems that didn't use encrypt-then-MAC. And so just to make 94 00:07:16,093 --> 00:07:20,120 sure things are secure without you having to think too hard about this. Again, I am 95 00:07:20,120 --> 00:07:24,118 going to recommend that you always use encrypt-then-MAC. Now, once the concept of 96 00:07:24,118 --> 00:07:27,759 authenticated encryption became more popular, a number of standardized 97 00:07:27,759 --> 00:07:31,609 approaches for combining encryption and MAC turned up. And those were even 98 00:07:31,609 --> 00:07:35,978 standardized by the National Institute of Standards. So I'm just gonna mention three 99 00:07:35,978 --> 00:07:41,031 of these standards. Two of these were standardized by NIST. And these are 100 00:07:41,031 --> 00:07:46,138 called Galois counter mode and CBC counter mode. And so let me explain what they do. 101 00:07:46,138 --> 00:07:51,245 Galois counter mode basically uses counter mode encryption, so a randomized counter 102 00:07:51,245 --> 00:07:56,164 mode with a Carter-Wegman MAC, so a very fact Carter-Wegman MAC. And the way the 103 00:07:56,164 --> 00:08:01,146 Carter-Wegman MAC works in GCM is it's basically a hash function of the message 104 00:08:01,146 --> 00:08:06,311 that's being MACed. And then the result is encrypted using a PRF. Now this hash 105 00:08:06,311 --> 00:08:11,562 function in GCM is already quite fast to the point where the bulk of the running 106 00:08:11,562 --> 00:08:15,845 time of GCM is dominated by the counter mode encryption and it's even made more so 107 00:08:15,845 --> 00:08:22,371 in that Intel introduces a special instruction PCLMULQDQ specifically 108 00:08:22,371 --> 00:08:27,432 designed for the purpose of making the hash function in GCM run as fast as possible. 109 00:08:27,432 --> 00:08:32,950 Now CCM counter mode is another NIST standard. It uses a CBC MAC and 110 00:08:32,950 --> 00:08:37,276 then counter mode encryption. So this mechanism, you know, this uses MAC, then 111 00:08:37,276 --> 00:08:40,906 encrypt, like SSL does. So this is actually not the recommended way of doing 112 00:08:40,906 --> 00:08:44,023 things, but because counter mode encryption is used. This is actually a 113 00:08:44,023 --> 00:08:47,945 perfectly fine encryption mechanism. One thing that I'd like to point out about 114 00:08:47,945 --> 00:08:53,799 CCM, is that everything is based on AES. You notice, it's using AES for the CBC 115 00:08:53,799 --> 00:08:58,778 MAC, and it's using AES for the counter mode encryption. And as a result, CCM can 116 00:08:58,778 --> 00:09:03,084 be implemented with relatively little code. Cause all you need is an AES engine 117 00:09:03,084 --> 00:09:08,343 and nothing else. And because of this, CCM actually was adopted by the Wi-Fi 118 00:09:08,343 --> 00:09:13,963 alliance, and in fact, you're probably using CCM on a daily basis if you're using 119 00:09:13,963 --> 00:09:19,093 encrypted Wi-Fi 802.11i then you're basically using CCM to encrypt traffic 120 00:09:19,093 --> 00:09:23,450 between your laptop and the access point. There's another mode called a EAX that 121 00:09:23,450 --> 00:09:28,922 uses counter mode encryption, and then CMAC. So, again you notice encrypt-then-MAC 122 00:09:28,922 --> 00:09:31,906 and that's another fine mode to use. We'll do a comparison of all these 123 00:09:31,906 --> 00:09:36,806 modes in just a minute. Now I wanted to mention that first of all, all these modes are 124 00:09:36,806 --> 00:09:41,190 nonce-based. In other words, they don't use any randomness but they do take as 125 00:09:41,190 --> 00:09:46,360 input a nonce and the nonce has to be unique per key. In other words, as you 126 00:09:46,360 --> 00:09:50,600 remember, the pair (key, nonce) should never ever, ever repeat. But the 127 00:09:50,600 --> 00:09:53,851 nonce itself need not be random, so it's perfectly fine to use a counter, for 128 00:09:53,851 --> 00:09:57,520 example, as a nonce. And the other important point is that, in fact, all 129 00:09:57,520 --> 00:10:01,384 these modes are what's called authenticated encryption with associated 130 00:10:01,384 --> 00:10:04,936 data. This is an extension of authenticated encryption, that comes 131 00:10:04,936 --> 00:10:10,884 up very often in networking protocols. So the idea between AEAD is that, in fact, 132 00:10:10,884 --> 00:10:15,223 the message that's provided to the encryption mode is not intended to be fully 133 00:10:15,223 --> 00:10:19,924 encrypted. Only part of the message is intended to be encrypted, but all of the 134 00:10:19,924 --> 00:10:24,157 message is intended to be authenticated. A good example of this is a network packet. 135 00:10:24,157 --> 00:10:29,408 Think of like a IP packet where there's a header and then there's a payload. And 136 00:10:29,408 --> 00:10:33,157 typically the header is not gonna be encrypted. For example, the header might 137 00:10:33,157 --> 00:10:36,905 contain the destination of the packet, but then the header had better not be 138 00:10:36,905 --> 00:10:40,950 encrypted otherwise routers along the way wouldn't know where to route the packet. 139 00:10:40,950 --> 00:10:45,225 And so, typically the header is sent in the clear, but the payload, of course, is 140 00:10:45,225 --> 00:10:49,500 always encrypted, but what you'd like to do is have the header be authenticated. 141 00:10:49,500 --> 00:10:55,907 Not encrypted but authenticated. So this is exactly what these AEAD modes do. They 142 00:10:55,907 --> 00:11:00,033 will authenticate the header and then encrypt the payload. But the header and 143 00:11:00,033 --> 00:11:04,177 the payload are bound together in the authentication so they can't 144 00:11:04,177 --> 00:11:07,803 actually be separated. So this is not difficult to do. What happens is in these 145 00:11:07,803 --> 00:11:14,170 three modes GCM, CCM, and EAX, basically the MAC is applied to the entire data. But 146 00:11:14,170 --> 00:11:18,852 the encryption is only applied to the part of the data that needs to be encrypted. 147 00:11:18,852 --> 00:11:22,983 So I wanted to show you what an API to these authenticated encryption with 148 00:11:22,983 --> 00:11:28,716 associated data encryption schemes look like. So here's what it looks like in OpenSSL. 149 00:11:28,716 --> 00:11:33,609 For example, this is, an API for GCM. So what you do is you call the 150 00:11:33,609 --> 00:11:37,404 init function to initialize the encryption mode, and you notice you give it a key and 151 00:11:37,404 --> 00:11:40,609 the nonce. The nonce again, doesn't have to be random, but it has to 152 00:11:40,609 --> 00:11:44,402 be unique. And after initialization, you would call this encrypt function, where 153 00:11:44,402 --> 00:11:48,169 you see that you give it the associated data that's gonna be authenticated, but 154 00:11:48,169 --> 00:11:51,794 not encrypted. You give it the data, and it's gonna be both authenticated and 155 00:11:51,794 --> 00:11:55,752 encrypted. And it gives you back the full cipher text, which is an encryption of the 156 00:11:55,752 --> 00:11:59,582 data, but of course does not include the AEAD, because the AEAD is gonna be sent in 157 00:11:59,582 --> 00:12:04,535 the clear. So now that we understand this mode of encrypt-then-MAC, we can go 158 00:12:04,535 --> 00:12:09,951 back to the definition of MAC security and I can explain to you something that might 159 00:12:09,951 --> 00:12:13,970 have been a little obscure when we looked at that definition. So if you remember, 160 00:12:13,970 --> 00:12:18,570 one of the requirements that followed from our definition of secure MACs meant that 161 00:12:18,570 --> 00:12:25,689 given a message-MAC pair on a message M, the attacker cannot produce another tag on 162 00:12:25,689 --> 00:12:30,386 the same message M. In other words, even though the attacker already has a tag for 163 00:12:30,386 --> 00:12:34,771 the message M, he shouldn't be able to produce a new tag for the same message M. 164 00:12:34,771 --> 00:12:39,382 And it's really not clear, why does that matter? Who cares, if the adversary already 165 00:12:39,382 --> 00:12:44,038 has a tag on the message M, who cares if he can produce another tag? Well, it turns 166 00:12:44,038 --> 00:12:48,828 out if the MAC didn't have this property. In other words, given a message-MAC pair 167 00:12:48,828 --> 00:12:53,618 you can produce another MAC on the same message, then that MAC would 168 00:12:53,618 --> 00:12:58,694 result in an insecure encrypt-then-MAC mode. And so if we want our encrypt-then-MAC to 169 00:12:58,694 --> 00:13:03,961 have cipher text integrity, it's crucial that our MAC security would imply this strong 170 00:13:03,961 --> 00:13:08,910 notion of security, which, of course, it does because we defined it correctly. 171 00:13:08,910 --> 00:13:13,613 So let's see what would go wrong, if, in fact, it was easy to produce this type of 172 00:13:13,613 --> 00:13:18,081 forgery. So what I'll do is I'll show you a chosen cipher text attack on the 173 00:13:18,081 --> 00:13:22,784 resulting encrypt-then-MAC system. And since the system has a chosen cipher text 174 00:13:22,784 --> 00:13:27,193 attack on it, it necessarily means that it doesn't provide authenticated 175 00:13:27,193 --> 00:13:31,458 encryption. So let's see. So the adversary's gonnna start by sending two 176 00:13:31,458 --> 00:13:35,861 messages, M0 and M1. And he's gonna receive, as usual, the encryption of one 177 00:13:35,861 --> 00:13:39,522 of them, either the encryption of M0 or the encryption of M1. And since we're 178 00:13:39,522 --> 00:13:44,907 using encrypt-then-MAC, the adversary receives the cipher text we'll call it C0 179 00:13:44,907 --> 00:13:49,883 and a MAC on the cipher text C0. Well now we said that given the MAC on 180 00:13:49,883 --> 00:13:53,827 a message the adversary can produce another MAC on the same message. So what 181 00:13:53,827 --> 00:13:57,994 he's gonna do is he's gonna produce another MAC on the message C0. Now he has 182 00:13:57,994 --> 00:14:03,532 a new cipher text (C0,T'), which is a perfectly valid cipher text. T' is a 183 00:14:03,532 --> 00:14:09,558 valid MAC of C0. Therefore, the adversary now can submit a chosen cipher text query 184 00:14:09,558 --> 00:14:14,492 on C' and this is a valid chosen cipher text query because it's different 185 00:14:14,492 --> 00:14:19,288 from C. It's a new cipher text. The poor challenger now is forced to decrypt this 186 00:14:19,288 --> 00:14:23,278 cipher text C' so he's going to send back the decryption of C'. It's a 187 00:14:23,278 --> 00:14:29,093 valid cipher text therefore the decryption of C prime is the message Mb but now the 188 00:14:29,093 --> 00:14:32,318 attacker just learned the value of B because he can test whether Mb is equal to 189 00:14:32,318 --> 00:14:37,371 M0 or MB is equal to M1. As a result he can just output B and he gets advantage 190 00:14:37,371 --> 00:14:43,468 one in defeating the scheme. And so again if our MAC security did not imply 191 00:14:43,468 --> 00:14:48,471 this property here. Then, there would be a chosen cipher text attack on encrypt-then-MAC. 192 00:14:48,471 --> 00:14:53,181 And therefore, it would not be secure. So the fact that we define MAC security correctly 193 00:14:53,181 --> 00:14:57,241 means that encrypt-then-MAC really does provide authenticated encryption. And 194 00:14:57,241 --> 00:15:01,728 throughout all the MACs that we discussed actually do satisfy this strong notion of 195 00:15:01,728 --> 00:15:06,146 unforgeability. So, interestingly, this is not the end of the story. So, as we said 196 00:15:06,146 --> 00:15:10,385 before the concept of authenticated encryption was introduced everyone was 197 00:15:10,385 --> 00:15:15,295 just combining MACs and encryption in various ways in the hope of achieving 198 00:15:15,295 --> 00:15:19,201 some authenticated encryption. After the notion of authenticated encryption 199 00:15:19,201 --> 00:15:23,553 became formalized and rigorous, people kind of started scratching their heads and said, 200 00:15:23,553 --> 00:15:28,130 hey, wait a minute. Maybe we can achieve authenticated encryption more efficiently 201 00:15:28,130 --> 00:15:32,722 than by combining a MAC and an encryption scheme. In fact, if you think about how 202 00:15:32,722 --> 00:15:37,366 this combination of MAC and encryption works, let's say we combine counter mode 203 00:15:37,366 --> 00:15:42,134 with CMAC, then for every block of plaintext, you first of all have to use 204 00:15:42,134 --> 00:15:46,419 your block cipher for counter mode, and then you have to use to your block cipher 205 00:15:46,419 --> 00:15:51,357 again, for the CBC-MAC. This means that if you're combining CPA secure encryption with a 206 00:15:51,357 --> 00:15:55,943 MAC, for every block of plaintext, you have to evaluate your block cipher twice, 207 00:15:55,943 --> 00:16:00,535 once for the MAC and once for the encryption scheme. So the natural question 208 00:16:00,535 --> 00:16:05,396 was, can we construct an authenticated encryption scheme directly from a PRP, 209 00:16:05,396 --> 00:16:10,345 such that we would have to only evaluate the PRP once per block? And it turns out 210 00:16:10,345 --> 00:16:14,117 the answer is yes, and there's this beautiful construction called OCB, that 211 00:16:14,117 --> 00:16:18,343 pretty much does everything you want, and is much faster than constructions that are 212 00:16:18,343 --> 00:16:22,467 separately built from an encryption and a MAC. So I wrote down, kind of a schematic 213 00:16:22,467 --> 00:16:26,290 of OCB. I don't want to explain it in detail. I'll just kind of explain it at a 214 00:16:26,290 --> 00:16:30,025 high level. So here we have our input plain text, here at the top. And you 215 00:16:30,025 --> 00:16:34,540 notice that, first of all, OCB is parallelizable, completely parallelizable. 216 00:16:34,540 --> 00:16:39,700 So every block can be encrypted separately of every other block. The other thing to 217 00:16:39,700 --> 00:16:44,338 notice is that as I promised, you only evaluate your block cipher once per plain 218 00:16:44,338 --> 00:16:49,318 text block. And then you evaluate it one more time at the end to build your 219 00:16:49,318 --> 00:16:53,887 authentication tag and then the overhead of OCB beyond just a block cipher is 220 00:16:53,887 --> 00:16:58,749 minimal. All you have to do is evaluate a certain very simple function P. The 221 00:16:58,749 --> 00:17:02,844 nonce goes into the P you notice, the key goes into this P and then there is a 222 00:17:02,844 --> 00:17:08,238 block counter that goes into this P. So you just evaluate this function P, twice 223 00:17:08,238 --> 00:17:12,748 for every block and you XOR the result before and after encryption using the 224 00:17:12,748 --> 00:17:17,553 block cipher and that's it. That's all you have to do and then you get a very fast 225 00:17:17,553 --> 00:17:22,241 and efficient authenticated encryption scheme built from a block cipher. So OCB 226 00:17:22,241 --> 00:17:26,065 actually has a nice security theorem associated with it and I am going to point 227 00:17:26,065 --> 00:17:29,567 to a paper on OCB when we get to end of this module where I list some further 228 00:17:29,567 --> 00:17:34,457 reading papers that you can take a look at. So you might be wondering if OCB is so 229 00:17:34,457 --> 00:17:40,168 much better than everything you've seen so far, all these three standards CCM, GCM and 230 00:17:40,168 --> 00:17:46,037 EAX why isn't OCB being used or why isn't OCB the standard? And the answer is a 231 00:17:46,037 --> 00:17:50,729 little sad. The primary answer that OCB is not being used is actually because 232 00:17:50,729 --> 00:17:54,567 of various patents. And I'll just leave it at that. So to conclude this section I 233 00:17:54,567 --> 00:17:58,216 wanted to show you some performance numbers. So here on the right I listed 234 00:17:58,216 --> 00:18:02,368 performance numbers for modes that you shouldn't be using. So this is for 235 00:18:02,368 --> 00:18:07,879 randomized counter mode, and this is for randomized CBC. And you can see also the 236 00:18:07,879 --> 00:18:12,460 performance of CBC MAC is basically the same as the performance of CBC encryption. 237 00:18:12,460 --> 00:18:16,221 Okay. Now here are the authenticated encryption modes, so these are the ones 238 00:18:16,221 --> 00:18:20,083 that you're supposed to using, these you're not supposed to be using on their 239 00:18:20,083 --> 00:18:23,795 own, right. These two, you should never ever use these two because they only 240 00:18:23,795 --> 00:18:27,858 provide CPA security, they don't actually provide security against active 241 00:18:27,858 --> 00:18:31,669 attacks. You're only supposed to use authenticated encryption for encryption. 242 00:18:31,669 --> 00:18:35,509 And so I listed performance numbers for the three standards. And let me remind 243 00:18:35,509 --> 00:18:40,109 you that GCM basically uses a very fast hash. And then it uses counter mode for 244 00:18:40,109 --> 00:18:43,770 actual encryption. And you can see that the overhead of GCM over counter mode is 245 00:18:43,770 --> 00:18:49,554 relatively small. CCM and EAX both use a block cipher based encryption and a 246 00:18:49,554 --> 00:18:54,627 block cipher based MAC. And as a result they're about twice as slow as counter 247 00:18:54,627 --> 00:18:59,196 modes. You see that OCB is actually the fastest of these, primarily because it 248 00:18:59,196 --> 00:19:03,820 only use the block cipher once per message block. So based on these performance 249 00:19:03,820 --> 00:19:08,328 numbers, you would think that GCM is exactly the right mode to always use. But 250 00:19:08,328 --> 00:19:12,659 it turns out if you're on the space constrained hardware, GCM is not ideal. 251 00:19:12,659 --> 00:19:16,709 Primarily because its implementation requires larger code than the other two 252 00:19:16,709 --> 00:19:21,323 modes. However, as I said, Intel specifically added instructions to speed 253 00:19:21,323 --> 00:19:26,064 up GCM mode. And as a result, implementing GCM on an Intel architecture takes 254 00:19:26,064 --> 00:19:30,315 very little code. But on other hardware platforms, say in smart cards or other 255 00:19:30,315 --> 00:19:34,745 constrained environments, the code size for implementing GCM would be considerably 256 00:19:34,745 --> 00:19:39,387 larger than for the other two modes. But if code size is not a constraint then GCM 257 00:19:39,387 --> 00:19:43,928 is the right mode to use. So to summarize this segment I want to say it one more 258 00:19:43,928 --> 00:19:48,267 time that when you want to encrypt messages you have to use an authenticated 259 00:19:48,267 --> 00:19:52,716 encryption mode and the recommended way to do it is to use one of the standards, 260 00:19:52,716 --> 00:19:57,037 namely one of these three modes for providing authenticated encryption. 261 00:19:57,037 --> 00:19:59,846 Don't implement the encryption scheme yourself. In other words don't implement 262 00:19:59,846 --> 00:20:05,842 encrypt-then-MAC yourself. Just use one of these three standards. Many crypto libraries 263 00:20:05,842 --> 00:20:10,513 now provide standard API's for these three modes and these are the one's you should 264 00:20:10,513 --> 00:20:14,347 be using and nothing else. In the next segment we're going to see what else can 265 00:20:14,347 --> 00:20:17,500 go wrong when you try to implement authenticated encryption by yourself.