1 00:00:04,680 --> 00:00:12,629 rc3 preroll music 2 00:00:12,629 --> 00:00:17,340 Herald: In the world of bad puns, everyone knows and loves the famous line from the 3 00:00:17,340 --> 00:00:22,810 cinematic masterpiece, where the IT security specialists ask the CPU architect 4 00:00:22,810 --> 00:00:30,050 "Warum leakt hier Strom?" or in English, "why is power leaking here?". In this talk 5 00:00:30,050 --> 00:00:35,660 our four speakers demonstrate how they can attack modern processors purely in 6 00:00:35,660 --> 00:00:43,079 software, relying on techniques from classical power side channel attacks. 7 00:00:43,079 --> 00:00:47,470 They'll explain how to use unprivileged access to energy monitoring 8 00:00:47,470 --> 00:00:53,960 features in modern Intel and AMD CPUs. Please welcome with a round of digital 9 00:00:53,960 --> 00:00:58,450 applause: Moritz Lipp, Michael Schwarz, Daniel Gruss and Andreas Kogler. 10 00:01:07,580 --> 00:01:11,456 Moritz: Warum leakt hier Strom? [Why is power leaking here?] laugh track 11 00:01:11,456 --> 00:01:13,707 Andreas: Und warum wendest du kein Masking an? [And why aren't you applying masking?] 12 00:01:13,707 --> 00:01:16,774 laugh track 13 00:01:16,774 --> 00:01:20,760 Daniel: But to understand how we got here, we have to go back to San Diego in May 14 00:01:20,760 --> 00:01:23,340 2017. A: This is a great, Moritz, this is 15 00:01:23,340 --> 00:01:26,029 a great talk title. We have to use this. laugh track 16 00:01:26,029 --> 00:01:29,739 M: Yeah, but actually, before we can do a talk, we should do some interesting 17 00:01:29,739 --> 00:01:32,010 research that we can present, right? laugh track 18 00:01:32,010 --> 00:01:35,629 A: Of course. Of course. But we have to remember this talk title, it's great. 19 00:01:35,628 --> 00:01:36,599 laugh track M: Yes. 20 00:01:36,599 --> 00:01:47,990 music 21 00:01:47,990 --> 00:01:51,258 Michael: Hey Moritz. Today I have found something really cool. 22 00:01:51,258 --> 00:01:54,650 Moritz: OK, what is it? 
Michael: Our computers, they give 23 00:01:54,650 --> 00:01:59,404 us the current energy consumption in microjoules and you can access that 24 00:01:59,404 --> 00:02:00,650 from userspace. laugh track 25 00:02:00,650 --> 00:02:05,200 Moritz: What? Are you for real? Michael: That, that basically means we 26 00:02:05,200 --> 00:02:08,545 could mount something like software-based power side channels. 27 00:02:08,545 --> 00:02:13,400 Moritz: Nice. We should try that out. Michael: Yes, I already did, because I 28 00:02:13,400 --> 00:02:15,700 thought you might not believe me. Moritz: OK. 29 00:02:15,700 --> 00:02:20,584 Michael: So this is one of the experiments I did. Here you can already see that. I 30 00:02:20,584 --> 00:02:23,719 measured the power consumption using that interface. 31 00:02:23,719 --> 00:02:26,323 Moritz: yeah Michael: First while doing nothing, idling 32 00:02:26,323 --> 00:02:28,052 around sleeping Moritz: like always 33 00:02:28,052 --> 00:02:34,594 Michael: and then I increased the CPU load, I just did an endless loop which 34 00:02:34,594 --> 00:02:38,253 accessed a bit of memory. It's nothing interesting but you can already see the 35 00:02:38,253 --> 00:02:42,123 difference for that. So you can see that there's a difference in doing nothing and 36 00:02:42,123 --> 00:02:47,283 doing a lot. That's pretty nice. Moritz: We should take a closer look 37 00:02:47,283 --> 00:02:49,823 at that, I think. Michael: Definitely. 38 00:02:49,823 --> 00:02:53,904 music 39 00:02:53,904 --> 00:02:57,194 Moritz: sings You can create my power trace 40 00:02:57,194 --> 00:02:59,009 Andreas: Oh, this is great. We already 41 00:02:59,009 --> 00:03:05,480 have a song for this paper now. Okay. Well, this is a great song that we can use 42 00:03:05,480 --> 00:03:06,530 for the paper... 43 00:03:06,530 --> 00:03:13,071 music 44 00:03:13,071 --> 00:03:16,541 Michael: Powertrace, like power analysis attacks? 
45 00:03:16,751 --> 00:03:20,840 Moritz: Yeah, but that would be an attack with physical access. 46 00:03:21,050 --> 00:03:23,184 Daniel: Software-only would be great 47 00:03:23,303 --> 00:03:26,361 Michael: Yes, I told you already, I found one can measure energy 48 00:03:26,361 --> 00:03:27,957 consumption in micro joules 49 00:03:27,957 --> 00:03:32,745 Moritz: Like attacking all server, desktop and laptop CPUs 50 00:03:32,745 --> 00:03:35,755 Daniel: Ideally with unprivileged access 51 00:03:35,755 --> 00:03:38,899 Michael: Imagine if you could distinguish different instructions 52 00:03:38,899 --> 00:03:42,399 or even observe the Hamming weights of operands and memory loads 53 00:03:42,399 --> 00:03:44,024 Daniel: Control flow monitoring 54 00:03:44,024 --> 00:03:47,919 Moritz: In physical attacks they often go for cryptographic keys. 55 00:03:47,919 --> 00:03:52,804 That would be great. Attacking AES-NI and RSA 56 00:03:52,804 --> 00:03:56,249 Daniel: There's just one problem: there is no such channel 57 00:03:56,249 --> 00:03:59,676 Michael: As I said, don't you listen, Daniel? 58 00:03:59,676 --> 00:04:04,659 It's like always, there is this RAPL register. This interface is already there 59 00:04:04,659 --> 00:04:07,083 and you can measure power consumption 60 00:04:07,083 --> 00:04:11,901 Daniel: Yes, but only on a very coarse granularity 61 00:04:14,777 --> 00:04:16,750 Moritz: But first, we need to get a bit 62 00:04:16,750 --> 00:04:21,013 more understanding of the CPU power management. The thermal design power, the 63 00:04:21,013 --> 00:04:26,810 TDP, is the power consumption under the maximum theoretical load of the processor. 64 00:04:26,810 --> 00:04:32,085 And you probably know that number from the CPU specification. And this gives 65 00:04:32,085 --> 00:04:38,430 integrators a target to find the proper thermal solution when you integrate CPU in 66 00:04:38,430 --> 00:04:46,220 a computer so that it doesn't run too hot. 
But for short periods of time, the CPU can 67 00:04:46,220 --> 00:04:52,919 consume more power than that. And this we can see in this graphic. So here for this 68 00:04:52,919 --> 00:04:58,879 Tau moment, the power consumption is much higher than for the rest of the CPU. 69 00:04:58,879 --> 00:05:05,520 Because usually a CPU is not instantly hot and thermal properties propagate over a 70 00:05:05,520 --> 00:05:12,119 bit of time. So on the other hand, you should also be able to save power. And you 71 00:05:12,119 --> 00:05:16,240 can do this in different ways. For instance, you could just shut down 72 00:05:16,240 --> 00:05:21,870 resources completely that you do not need at the moment, or you can reduce the 73 00:05:21,870 --> 00:05:27,169 voltage of the processor or those components and then it also consumes less 74 00:05:27,169 --> 00:05:32,870 power. And on top of that, you could also reduce the frequency of the processor and 75 00:05:32,870 --> 00:05:39,699 then it also consumes less power. And you need this for different scenarios. For 76 00:05:39,699 --> 00:05:44,810 instance, with your laptop, you need to budget the power consumption because you 77 00:05:44,810 --> 00:05:49,789 want to have a long run time. And you also know these options that you can change, 78 00:05:49,789 --> 00:05:54,449 like the performance level if it should run on high performance or to save 79 00:05:54,449 --> 00:05:57,219 battery. And you need this in different scenarios. 80 00:05:57,219 --> 00:06:01,930 Michael: Yes, Moritz, that's exactly what I showed you before. Do you remember? I 81 00:06:01,930 --> 00:06:07,269 showed you this intel running average power limit, short RAPL, that provides 82 00:06:07,269 --> 00:06:13,180 exactly that functionality. 
So with this Intel RAPL, you have the power limiting 83 00:06:13,180 --> 00:06:19,610 features so you can do exactly what you just described, reduce the power usage for 84 00:06:19,610 --> 00:06:25,999 your system or for parts of your system. And additionally, you also have the energy 85 00:06:25,999 --> 00:06:30,720 readings. So you know exactly how much power is currently used on a system which 86 00:06:30,720 --> 00:06:36,419 helps you do exactly the things you just mentioned before, like getting a better 87 00:06:36,419 --> 00:06:40,490 power performance balance. So this is already there. 88 00:06:40,490 --> 00:06:44,409 Moritz: Because the CPU needs to know in a way how much power it consumes, right? 89 00:06:44,409 --> 00:06:49,550 Michael: Exactly and the scheduler also uses that feature to ensure that you get a 90 00:06:49,550 --> 00:06:54,820 better battery runtime on your laptop, for example. And because this is an important 91 00:06:54,820 --> 00:07:00,370 feature you can directly get that from the operating system as well. On Linux, you 92 00:07:00,370 --> 00:07:04,379 can even get that as an unprivileged application. There's the powercap 93 00:07:04,379 --> 00:07:10,509 framework that you can directly access in this pseudo file system where you get the 94 00:07:10,509 --> 00:07:15,729 current power readings, you can directly see how much power your CPU currently 95 00:07:15,729 --> 00:07:17,729 consumes. Moritz: How convenient! 96 00:07:17,729 --> 00:07:22,879 Michael: On MacOS and on Windows you have a similar thing, but for that you first 97 00:07:22,879 --> 00:07:26,590 need to install a driver because usually you don't need that as a userspace 98 00:07:26,590 --> 00:07:32,250 application. But some drivers might want to have that and some drivers even expose 99 00:07:32,250 --> 00:07:36,819 that to you and you can use that. 
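The powercap readings Michael mentions can be turned into an average power figure with very little code. The following is a minimal sketch, not the speakers' measurement code: the sysfs directory `/sys/class/powercap/intel-rapl:0` and the helper names are assumptions for illustration, and the counter is assumed to wrap around at most once between two readings.

```python
from pathlib import Path

# Assumed location of the package-0 RAPL zone in the Linux powercap pseudo file system.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_energy_uj(rapl_dir: Path = RAPL_DIR) -> int:
    """Return the cumulative energy counter in microjoules."""
    return int((rapl_dir / "energy_uj").read_text())


def energy_delta_uj(before: int, after: int, max_range_uj: int) -> int:
    """Energy consumed between two readings, allowing for one counter wraparound."""
    if after >= before:
        return after - before
    # The counter wrapped: add the counter range (max_energy_range_uj in sysfs).
    return after + max_range_uj - before


def average_power_watts(before: int, after: int, seconds: float, max_range_uj: int) -> float:
    """Average power over the interval: microjoules to joules, divided by time."""
    return energy_delta_uj(before, after, max_range_uj) / 1e6 / seconds
```

On a real machine one would call `read_energy_uj()` twice, some known time apart, and pass both values to `average_power_watts`. On kernels patched after this research became public, the `energy_uj` file is typically readable by root only.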
So there are some drivers that are even 100 00:07:36,819 --> 00:07:41,300 preinstalled on some of the motherboards that expose that information to 101 00:07:41,300 --> 00:07:47,229 applications as well on Windows. Moritz: Interesting, but what can we do 102 00:07:47,229 --> 00:07:52,979 with this? So I ran some experiments because I wanted to know how good this 103 00:07:52,979 --> 00:07:58,580 energy consumption monitoring works. And in a first run we tried to distinguish 104 00:07:58,580 --> 00:08:04,090 instructions from each other. So we implemented a small program just running 105 00:08:04,090 --> 00:08:08,049 the same instructions all the time, and we measured its power consumption. And as we 106 00:08:08,049 --> 00:08:12,799 can see easily in this plot, different instructions need a different amount of 107 00:08:12,799 --> 00:08:19,419 power. So we can distinguish instructions from each other. In addition, what I 108 00:08:19,419 --> 00:08:23,559 tried, I changed the operands that different instructions used. For instance, 109 00:08:23,559 --> 00:08:28,749 for a multiplication, you can multiply different numbers with each other. And 110 00:08:28,749 --> 00:08:33,779 also here we see, depending on the bits that are set in the operand a different 111 00:08:33,779 --> 00:08:39,130 power consumption of the same instruction, but just depending on the operand so we 112 00:08:39,130 --> 00:08:44,607 can also distinguish them from each other. This could also come in handy later on. 113 00:08:44,607 --> 00:08:51,180 But I also tried to load data with an instruction and I wanted to know if I 114 00:08:51,180 --> 00:08:55,089 could see differences in the power consumption, depending on the data that 115 00:08:55,089 --> 00:09:00,860 has been loaded by the processor. And as you can see in this plot, the more bits 116 00:09:00,860 --> 00:09:07,970 that are set in the data that is loaded, the more power the CPU consumes. 
But let's 117 00:09:07,970 --> 00:09:14,209 be honest here, to record these measurements, it took more than 23 days, 118 00:09:14,209 --> 00:09:19,949 so it took quite some time to get to this granularity to see those differences, but 119 00:09:19,949 --> 00:09:23,190 in other cases, if you just... Michael: Still a fascinating result. 120 00:09:23,190 --> 00:09:27,461 Moritz: Yes, it's a very interesting result. And in other cases, Michael, you 121 00:09:27,461 --> 00:09:33,930 only want to know if one operand or one value is a zero or if it's not a zero. And 122 00:09:33,930 --> 00:09:40,310 to come to this result, you don't need that many measurements. And in the last 123 00:09:40,310 --> 00:09:45,540 experiment that we did, we wanted to know if we would see a difference in the 124 00:09:45,540 --> 00:09:51,000 energy consumption, depending on where data has been loaded from. For instance, as 125 00:09:51,000 --> 00:09:55,540 we've seen also at CCC in many different talks over the past years, there are 126 00:09:55,540 --> 00:09:59,920 cache attacks. And here in this experiment, we also were able to see a 127 00:09:59,920 --> 00:10:04,320 difference in the power consumption if your value has been loaded from the cache 128 00:10:04,320 --> 00:10:09,550 or if it has to be loaded from the main memory, because, of course, then DRAM is 129 00:10:09,550 --> 00:10:16,290 activated and it consumes more power. But these results are very nice. 130 00:10:16,290 --> 00:10:20,779 Michael: Yes, these are really fascinating results. So we should actually exploit 131 00:10:20,779 --> 00:10:25,959 them and build attacks from that. I mean, it's fascinating to see that all these 132 00:10:25,959 --> 00:10:29,860 measurements are possible, but we also want to do something security related. 133 00:10:29,860 --> 00:10:32,089 Moritz: Do you have any idea what we could do? 134 00:10:32,089 --> 00:10:36,969 Michael: Yes, I have an idea. I already showed you something before. 
If you 135 00:10:36,969 --> 00:10:41,240 remember from the office, this one measurement. And I extended that 136 00:10:41,240 --> 00:10:42,400 measurement. Moritz: Yes. 137 00:10:42,400 --> 00:10:47,560 Michael: Into a covert channel. So a covert channel is a communication channel 138 00:10:47,560 --> 00:10:52,290 between two parties that are usually not allowed to communicate with each other. So 139 00:10:52,290 --> 00:10:56,310 there might be different reasons for that. Maybe there's no interface, maybe there's a 140 00:10:56,310 --> 00:11:01,892 policy or a firewall or something that prevents them from communicating. And 141 00:11:01,892 --> 00:11:06,740 still, in this scenario, I want to communicate. So for that, I'm using 142 00:11:06,740 --> 00:11:11,590 exactly these power side channels and all this analysis you have done to actually 143 00:11:11,590 --> 00:11:17,940 communicate. And that is very simple to do, actually. I have two processes, a 144 00:11:17,940 --> 00:11:24,380 sender and a receiver, and the sender tries to send single bits, zeros and ones. 145 00:11:24,380 --> 00:11:31,120 And to send a one bit, I do something that uses a lot of energy, like accessing main 146 00:11:31,120 --> 00:11:37,379 memory. And if I want to send a zero bit, then I don't do anything. And now as a 147 00:11:37,379 --> 00:11:42,410 receiver, I just have to measure the power consumption and I see if the power 148 00:11:42,410 --> 00:11:47,961 consumption has a spike. Then I know the sender is sending a one. If there's 149 00:11:47,961 --> 00:11:53,870 nothing, the sender is apparently sending a zero and from that I can get the 150 00:11:53,870 --> 00:11:57,975 information the sender wants to send me. Moritz: But did you try that out? 151 00:11:57,975 --> 00:12:02,070 laugh track Michael: Yes, I also tried that and we can 152 00:12:02,070 --> 00:12:07,385 see that here in this graph. So this is the energy measurement. 
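The sender and receiver logic described here can be sketched as a small simulation. This is only an illustration under stated assumptions: the energy levels, the jitter, the number of samples per bit and the function names `send` and `receive` are made up, the two sides are assumed to agree on slot boundaries, and the message is assumed to contain at least one zero and one one so that a midpoint threshold makes sense. A real sender would access main memory in a loop and a real receiver would sample the energy counter.

```python
import random


def send(bits, samples_per_bit=8, idle=1.0, busy=2.0, jitter=0.2, seed=0):
    """Simulate the energy samples the receiver observes while bits are sent.

    A one bit is modelled as a busy (high-energy) slot, a zero bit as an idle slot.
    """
    rng = random.Random(seed)
    trace = []
    for bit in bits:
        level = busy if bit else idle
        trace.extend(level + rng.uniform(-jitter, jitter) for _ in range(samples_per_bit))
    return trace


def receive(trace, samples_per_bit=8):
    """Average each bit slot and threshold halfway between the lowest and highest slot."""
    slots = [trace[i:i + samples_per_bit] for i in range(0, len(trace), samples_per_bit)]
    means = [sum(slot) / len(slot) for slot in slots]
    threshold = (min(means) + max(means)) / 2
    return [1 if mean > threshold else 0 for mean in means]
```

Calling `receive(send(message))` on a list of bits returns the same list as long as the jitter stays well below the gap between the idle and busy levels.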
153 00:12:07,385 --> 00:12:11,010 Moritz: That's a very clean signal. Michael: Yes, it's the energy measurement 154 00:12:11,010 --> 00:12:16,080 on the receiver side. And we see exactly what I told you before. If there are one 155 00:12:16,080 --> 00:12:20,499 bits, then the energy consumption is higher. If there are zero bits, it's 156 00:12:20,499 --> 00:12:26,220 lower. And from that we can deduce the information that I wanted to send on the 157 00:12:26,220 --> 00:12:30,850 sender side. Pretty neat, huh? Moritz: Yeah, but this is just from one 158 00:12:30,850 --> 00:12:37,190 process to another process. Actually, I took your idea and used this in a 159 00:12:37,190 --> 00:12:43,463 hypervisor scenario where we attack the Xen hypervisor. So it's not limited to two 160 00:12:43,463 --> 00:12:49,781 processes. I installed the Xen hypervisor with two virtual machines. And what Xen 161 00:12:49,781 --> 00:12:56,018 does is it also exposes those RAPL registers to the virtual machine. So now 162 00:12:56,018 --> 00:13:01,079 as a virtual machine, I can have direct access to that and then I can establish a 163 00:13:01,079 --> 00:13:04,220 covert channel between two virtual machines in the cloud. 164 00:13:04,220 --> 00:13:08,110 Michael: That's even better. Moritz: And this is really working, as you 165 00:13:08,110 --> 00:13:13,410 can see here. I mean, here I'm just sending ones and zeros, but the signal is 166 00:13:13,410 --> 00:13:15,589 pretty clear. Michael: That's nice. 167 00:13:15,589 --> 00:13:20,959 Moritz: But is there more that we can do? Michael: Yes. I mean, covert channels are 168 00:13:20,959 --> 00:13:26,048 great to demonstrate something, that it actually works, across VMs, really great. I 169 00:13:26,048 --> 00:13:32,410 like that. That gives you a different threat model here, but still they are a 170 00:13:32,410 --> 00:13:37,579 bit boring. 
So I decided to have something more interesting as another example of 171 00:13:37,579 --> 00:13:43,320 what we can do. I always like to break kernel address space layout randomization, 172 00:13:43,320 --> 00:13:48,899 KASLR. With this kernel address space layout randomization, the kernel is mapped 173 00:13:48,899 --> 00:13:54,180 to different virtual locations every time I boot my computer to make it difficult to 174 00:13:54,180 --> 00:13:58,050 actually exploit something in the kernel because it's not predictable where the 175 00:13:58,050 --> 00:14:05,670 kernel is located. And I again use the energy consumption to figure out where 176 00:14:05,670 --> 00:14:12,589 this kernel is located. So how does that work? In this address space I have the 177 00:14:12,589 --> 00:14:17,980 kernel which is actually mapped using physical pages and I have a lot of nothing 178 00:14:17,980 --> 00:14:24,350 where no physical page is mapped. And if I try to access these addresses, I can't, of 179 00:14:24,350 --> 00:14:29,170 course, because I don't have the privileges for that. But I will still see 180 00:14:29,170 --> 00:14:33,600 differences when doing that because the CPU has to do different things depending 181 00:14:33,600 --> 00:14:38,340 on whether there's actually a page or not, whether this page can be cached, this 182 00:14:38,340 --> 00:14:42,649 translation, or whether this translation is always invalid because there's nothing 183 00:14:42,649 --> 00:14:47,780 there and it can't be cached. We can see that here in an illustration, if you're 184 00:14:47,780 --> 00:14:53,569 wondering how that really works. So it turns out the kernel can only be mapped to 185 00:14:53,569 --> 00:14:59,691 a limited number of places because it has to be aligned by two megabytes, so I only 186 00:14:59,691 --> 00:15:06,009 need to check the spots there where the kernel could be located. 
And for all these 187 00:15:06,009 --> 00:15:11,440 places in the address space, I just try to access it and measure how much energy that 188 00:15:11,440 --> 00:15:17,670 consumes. And if there's nothing mapped, it consumes quite a lot of energy because 189 00:15:17,670 --> 00:15:21,940 the CPU has to figure out that there's nothing mapped. It goes through the page 190 00:15:21,940 --> 00:15:26,899 tables, the page table walk, and at the end figures out, oh, there's nothing here, 191 00:15:26,899 --> 00:15:32,180 so I can't do anything, and aborts that. And that uses quite some energy. But if 192 00:15:32,180 --> 00:15:39,200 there's actually the kernel here, then this translation is valid. It works. There 193 00:15:39,200 --> 00:15:43,939 is something there. It will likely be already in the translation caches in the 194 00:15:43,939 --> 00:15:49,709 TLB, so the CPU has less work. It just needs to check the cache, sees: "Oh it's 195 00:15:49,709 --> 00:15:54,939 there. I know that. But wait a moment, you can't access it" and can immediately abort 196 00:15:54,939 --> 00:16:01,939 and that uses less energy. So just from the energy consumption, I can see if 197 00:16:01,939 --> 00:16:06,250 there's something mapped and with that see where the kernel is actually mapped. 198 00:16:06,250 --> 00:16:10,586 Moritz: And this is really working? Did you try it out or is this just some 199 00:16:10,586 --> 00:16:13,329 theoretical thing? Michael: You're always so skeptical. Of 200 00:16:13,329 --> 00:16:19,009 course I tried that and I brought the demo with me. So here you can see the demo 201 00:16:19,009 --> 00:16:24,149 running. This is on a real system. And you see it's super fast measuring the energy 202 00:16:24,149 --> 00:16:28,290 consumption going over the address space and finding the kernel. 203 00:16:28,290 --> 00:16:32,279 applause Moritz: But these attacks are boring, 204 00:16:32,279 --> 00:16:36,681 Michael. 
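The scan Michael describes, probing every 2 MB-aligned candidate and looking for the cheap accesses, can be sketched as follows. This is a hypothetical model rather than the demo shown in the talk: `KERNEL_BASE`, `NUM_SLOTS` and the callable `measure_energy` are assumptions for illustration (a 1 GiB candidate range is assumed), and a real attack would have to perform the faulting access and read the energy counter around it.

```python
from statistics import median

KERNEL_BASE = 0xFFFFFFFF80000000  # assumed start of the candidate range
SLOT_SIZE = 2 * 1024 * 1024       # 2 MB alignment, as stated in the talk
NUM_SLOTS = 512                   # assumed: candidates covering 1 GiB


def candidate_addresses():
    """All aligned addresses where the kernel could start under these assumptions."""
    return [KERNEL_BASE + i * SLOT_SIZE for i in range(NUM_SLOTS)]


def locate_kernel(measure_energy, candidates, repeats=5):
    """Return the lowest candidate whose probe consumes the least energy.

    measure_energy(addr) is a caller-supplied function returning the energy of
    one probe; the median over several repeats dampens measurement noise.
    """
    def cost(addr):
        return median(measure_energy(addr) for _ in range(repeats))

    # min() keeps the first minimum, so with ascending candidates this yields
    # the lowest mapped address, i.e. the start of the kernel image.
    return min(candidates, key=cost)
```

The median is used here only as one plausible way to reduce noise; the talk does not say how the demo aggregates its measurements.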
We want to attack something real, we want to be like real attackers, we want 205 00:16:36,681 --> 00:16:40,800 to attack crypto, we want to get keys. Michael: Crypto is complicated. That's … 206 00:16:40,800 --> 00:16:43,329 laugh track Moritz: No, no, no, just listen. So, for 207 00:16:43,329 --> 00:16:47,861 instance, with RSA, this is a widely used public-key cryptosystem. This is really 208 00:16:47,861 --> 00:16:53,710 easy because to encrypt some data, you have a public key. To decrypt the data you 209 00:16:53,710 --> 00:16:59,750 have a private key. And if we get the private key: profit, easy as that. What do 210 00:16:59,750 --> 00:17:03,189 you say? Michael: Yeah, I know how that works. So 211 00:17:03,189 --> 00:17:08,910 the theory is easy, that I have the two keys and I have a private key. But then 212 00:17:08,910 --> 00:17:12,540 the complicated part starts where you really have to understand the crypto to 213 00:17:12,540 --> 00:17:17,540 actually attack it. And that's really complicated. And I don't really want to do 214 00:17:17,540 --> 00:17:22,586 that. Maybe we can find a student who tries that but I'm out of here. laughter 215 00:17:22,586 --> 00:17:25,584 Andreas: Hi guys, I'm a student and I want a master's thesis. 216 00:17:25,584 --> 00:17:29,370 Moritz: This is perfect. Your name is Andreas, right? 217 00:17:29,370 --> 00:17:32,880 Andreas: Yeah, sure, I'm Andreas. laughter 218 00:17:32,880 --> 00:17:36,891 M: OK, I don't know if you have heard the last bits, but we want to attack some 219 00:17:36,891 --> 00:17:39,680 crypto with power side channel attacks. A: OK 220 00:17:39,680 --> 00:17:44,181 Moritz: And for instance, with RSA, we have the private key and the public key. 221 00:17:44,181 --> 00:17:50,970 Here we have M the message and C the ciphertext and d the private exponent. And 222 00:17:50,970 --> 00:17:56,160 of course, it's a computer. It consists of ones and zeros. 
And depending on the key 223 00:17:56,160 --> 00:18:01,970 bit if it's a one, for the computation of the algorithm, we do a square and the 224 00:18:01,970 --> 00:18:08,510 multiply operation. And if it's zero, we just do the square operation and we do 225 00:18:08,510 --> 00:18:14,110 this for the entire private key. A: Now OK, sounds easy enough. 226 00:18:14,110 --> 00:18:21,640 M: Yes. And if we can observe that we can extract the key. Sounds good. But I 227 00:18:21,640 --> 00:18:28,000 did some experiments and it didn't work out as well as I've expected it to be. So 228 00:18:28,000 --> 00:18:31,860 we need to get a bit more control and maybe a better threat model how to do 229 00:18:31,860 --> 00:18:40,100 that. And there comes Intel SGX into play. And this is an instruction set extension 230 00:18:40,100 --> 00:18:47,340 and it provides you with integrity and confidentiality of code and data even in 231 00:18:47,340 --> 00:18:55,600 untrusted environments. So with Intel SGX, you can run programs using protected areas 232 00:18:55,600 --> 00:19:02,950 of memory. And even in the case where the operating system is compromised and cannot 233 00:19:02,950 --> 00:19:07,300 be trusted at all. A: So basically we have the full 234 00:19:07,300 --> 00:19:11,500 access of all operating system features to attack, the enclave. 235 00:19:11,500 --> 00:19:14,900 M: Yes, exactly A: OK, that sounds quite powerful 236 00:19:14,900 --> 00:19:21,130 M: But there's still one issue. It's still just executing a program. So we have 237 00:19:21,130 --> 00:19:26,630 more power, but we need to make use of that. And there is this paper called 238 00:19:26,630 --> 00:19:34,892 SGX-Step, which gives you more control of enclaves and Jo Van Bulck the author maybe 239 00:19:34,892 --> 00:19:40,623 has time to explain this a bit to us so maybe we can give him a call. 240 00:19:40,623 --> 00:19:42,160 A: Sounds great. ringing sound 241 00:19:42,160 --> 00:19:48,760 M: Hi Jo, this is Moritz. 
I've seen the paper of yours, this SGX-Step paper. 242 00:19:48,760 --> 00:19:52,990 It might be the thing that we need, but can you explain a bit what it is about? 243 00:19:52,990 --> 00:19:59,910 Jo: Yes, surely Moritz, so SGX-Step I think in one sentence it's an enclave 244 00:19:59,910 --> 00:20:04,920 execution control framework. What I mean with that is that it allows you to 245 00:20:04,920 --> 00:20:09,308 precisely control the execution of the enclave so that you can interleave it with 246 00:20:09,308 --> 00:20:13,750 attacker code, as the name implies, you would do one step of the enclave, one step 247 00:20:13,750 --> 00:20:17,430 of the attacker again one step of the enclave, one step of the attacker, etc. 248 00:20:17,430 --> 00:20:19,890 M: That's perfect. J: That's the high level. 249 00:20:19,890 --> 00:20:23,580 Moritz: Can you expand it a bit on the technical point of view? How do you do 250 00:20:23,580 --> 00:20:26,000 that? J: Yes, I'm very excited about the 251 00:20:26,000 --> 00:20:32,100 technical details, Moritz. So let me walk you through. The first thing you should 252 00:20:32,100 --> 00:20:36,330 know about SGX-Step: it's completely open source and we build it on top of stock 253 00:20:36,330 --> 00:20:37,730 Linux environments. M: Nice 254 00:20:37,730 --> 00:20:43,240 J: So what you should start with always is to load a malicious kernel driver. And 255 00:20:43,240 --> 00:20:48,471 this is called the /dev/sgx-step driver. And from that moment on we kind of export 256 00:20:48,471 --> 00:20:54,540 all of the powers of the Linux kernel into the userspace. And the second component of 257 00:20:54,540 --> 00:20:58,830 SGX-step that's important is this small library operating system that we wrote. 258 00:20:58,830 --> 00:21:04,310 It's called libsgxstep and it sits just alongside of the library alongside in the 259 00:21:04,310 --> 00:21:09,382 userspace application. And libsgxstep allows you to do a number of cool things. 
260 00:21:09,382 --> 00:21:14,490 I think the most important thing being that you have direct access to the APIC 261 00:21:14,490 --> 00:21:19,660 x86 high-resolution timing device. So that sounds interesting for you, right Moritz? 262 00:21:19,660 --> 00:21:21,938 M: Yeah, but what do you do with the timer? 263 00:21:21,938 --> 00:21:26,348 J: Well, what you can do with the timer is essentially you can arm it just before 264 00:21:26,348 --> 00:21:30,170 you enter the enclave. And what would happen then is, let's have a look. You arm 265 00:21:30,170 --> 00:21:34,260 the timer, you start executing the enclave, then after a while an interrupt 266 00:21:34,260 --> 00:21:39,800 fires and you exit the enclave again. M: Hmm, so it's like a debugger like 267 00:21:39,800 --> 00:21:44,800 GDB, but for enclaves? J: Yes, it's a... it's exactly that, 268 00:21:44,800 --> 00:21:49,000 Moritz. It's like an attacker-controlled debugger without using any of the debug 269 00:21:49,000 --> 00:21:54,350 features, just using the raw x86 primitives and operating system files. And 270 00:21:54,350 --> 00:21:59,040 just as in a debugger, it allows you to do single stepping. So every instruction will 271 00:21:59,040 --> 00:22:03,420 be executed one at a time. At most one at a time I should say. 272 00:22:03,420 --> 00:22:09,440 M: But what happens if I, like, configure the timer a bit lower? Does it 273 00:22:09,440 --> 00:22:13,370 then like start executing an instruction? J: That's a very good question. And 274 00:22:13,370 --> 00:22:18,250 configuring the timer is the tricky thing about SGX-step. So it will indeed happen 275 00:22:18,250 --> 00:22:23,780 sometimes what we call a zero step event. So you will fire the timer before the 276 00:22:23,780 --> 00:22:28,290 enclave even had time to execute an instruction. And those are a kind of event 277 00:22:28,290 --> 00:22:32,920 that you can also detect with SGX-step. 
There is a trick to detect whether you had 278 00:22:32,920 --> 00:22:36,560 a single step or a zero step. M: Jo, this is perfect. This is 279 00:22:36,560 --> 00:22:40,060 exactly what we are looking for. Thank you so much for explaining that. 280 00:22:40,060 --> 00:22:43,250 J: I'm very happy to hear that. M: I'm looking forward to try it out 281 00:22:43,250 --> 00:22:44,850 now. J: Go. 282 00:22:44,850 --> 00:22:47,470 M: See you hopefully soon. J: Bye bye. 283 00:22:47,470 --> 00:22:48,850 M: Bye! 284 00:22:49,460 --> 00:22:54,950 M: So SGX-step to sum it up, it's an open source Linux kernel 285 00:22:54,950 --> 00:22:59,990 framework, and it allows us to configure the APIC timer interrupts so that we can 286 00:22:59,990 --> 00:23:06,400 interrupt the enclave execution to single and zero step it. And this is perfect 287 00:23:06,400 --> 00:23:11,760 because now we can combine it with the power measurements of Intel RAPL, and this 288 00:23:11,760 --> 00:23:17,080 gives us the possibility to measure the energy consumption of single instructions. 289 00:23:17,080 --> 00:23:21,710 Can you try it out Andi? A: OK, let me dig deeper into that. 290 00:23:21,710 --> 00:23:25,700 We have this really slow RAPL interface here and if you want to visualize it, we 291 00:23:25,700 --> 00:23:30,360 could imagine that it's like we have slots where we can fill the slots with 292 00:23:30,360 --> 00:23:35,390 instructions and the RAPL interface gives us the average power consumption over the 293 00:23:35,390 --> 00:23:40,050 slots. So in the default case, when we execute our target instruction, we have 294 00:23:40,050 --> 00:23:44,100 basically one slot filled with the target instruction and the remaining slots filled 295 00:23:44,100 --> 00:23:50,130 with other instructions we don't know. So basically noise. 
The best case for us 296 00:23:50,130 --> 00:23:54,210 would be if we repeat the target instruction indefinitely and fill every 297 00:23:54,210 --> 00:23:58,028 slot with the target instruction. M: This is exactly what I did 298 00:23:58,028 --> 00:24:02,060 in the experiments in the beginning. A: Yeah, exactly. That's the reason 299 00:24:02,060 --> 00:24:07,760 why we got so good measurements there. Another trick would be if we only used the 300 00:24:07,760 --> 00:24:11,890 target instruction in one slot and fill the remaining slots with instructions 301 00:24:11,890 --> 00:24:15,920 where we know the energy consumption of or we know the instruction of. Then it could 302 00:24:15,920 --> 00:24:20,840 do tricks to calculate the energy consumption of the target instruction. 303 00:24:20,840 --> 00:24:26,830 With SGX-step now we can use a hybrid solution here, where we use SGX-step the 304 00:24:26,830 --> 00:24:32,380 zero stepping mechanism to reissue this instruction and we can fill multiple slots 305 00:24:32,380 --> 00:24:37,260 with the same target instruction. Only drawback here is that we have a noise 306 00:24:37,260 --> 00:24:43,130 overhead of SGX-step itself, but this is probably the best solution we can go with. 307 00:24:43,860 --> 00:24:48,100 M: This sounds pretty good, so we should actually try that out. So we 308 00:24:48,100 --> 00:24:53,180 implement a toy cipher, which imitates square and multiply basically. So we can 309 00:24:53,180 --> 00:24:58,110 leave out all the rest, the overhead of a library that would be used otherwise. And 310 00:24:58,110 --> 00:25:02,700 we then just single step every instruction and measure its energy consumption and 311 00:25:02,700 --> 00:25:08,200 then we could plot this. Can you do that? A: I got already some results here 312 00:25:08,200 --> 00:25:13,156 for us. Basically here we use, as you explained, a toy example for square and 313 00:25:13,156 --> 00:25:18,580 multiply. 
And in both cases the square and the multiply, they execute exactly six 314 00:25:18,580 --> 00:25:23,860 instructions. And so basically we have a period of six here. And if you look at the 315 00:25:23,860 --> 00:25:29,550 results of the measurement here, we can see that we have patterns that repeat with 316 00:25:29,550 --> 00:25:34,460 a period of six and we can see that these different patterns correspond to either a 317 00:25:34,460 --> 00:25:40,400 square or a multiply instruction here. M: Nice, perfect, but this is just a 318 00:25:40,400 --> 00:25:42,400 toy cipher, right? laughter A: Yeah. 319 00:25:42,400 --> 00:25:44,370 M: Can we do like real crypto? laughter 320 00:25:44,370 --> 00:25:49,529 A: We can try. So the plan now is that we want to attack a real RSA 321 00:25:49,529 --> 00:25:54,310 implementation and the real implementation is not like a toy square and multiply 322 00:25:54,310 --> 00:25:59,320 algorithm. The real implementation needs to handle these huge numbers. So basically 323 00:25:59,320 --> 00:26:03,492 there's much more code involved and it's not feasible to single step every 324 00:26:03,492 --> 00:26:10,340 instruction there. So we must do a more clever approach here. If we observe the 325 00:26:10,340 --> 00:26:17,478 square multiply part here, we see that the square and the multiply function uses the 326 00:26:17,478 --> 00:26:25,420 AVX optimized memset function. So the energy consumption should also be more if 327 00:26:25,420 --> 00:26:30,910 we execute an AVX instruction because AVX instructions use much larger registers. So 328 00:26:30,910 --> 00:26:33,031 basically we should be able to observe that. 329 00:26:33,031 --> 00:26:36,040 M: Interesting. A: The only drawback here is that we 330 00:26:36,040 --> 00:26:43,470 cannot use the same approach as with the toy cipher because the square has a 331 00:26:43,470 --> 00:26:48,659 different number of instructions as the square and multiply function. 
So we need 332 00:26:48,659 --> 00:26:54,950 to do a trick here. So to understand what we did here, our target is that we 333 00:26:54,950 --> 00:27:00,280 reconstruct a key bit. And if the key bit is one we execute a square and multiply. 334 00:27:00,280 --> 00:27:09,260 If the key bit is zero, we execute a square. So to visualize how we execute 335 00:27:09,260 --> 00:27:14,470 zero and single stepping, we have to dig into the assembler a bit. So to test for 336 00:27:14,470 --> 00:27:18,690 the key bit, we execute like a test instruction and then we execute a 337 00:27:18,690 --> 00:27:24,730 conditional jump. And if we execute the square and multiply we have for instance, 338 00:27:24,730 --> 00:27:29,435 K instructions. And if we execute the square we have for instance L 339 00:27:29,435 --> 00:27:34,260 instructions. So we can see that these two numbers do not add up. They are different. 340 00:27:34,260 --> 00:27:40,050 So we cannot simply measure each Kth instruction and get the key out. So we 341 00:27:40,050 --> 00:27:45,030 need to do something different here. We can number the instructions after the jump 342 00:27:45,030 --> 00:27:52,980 instruction and then using single stepping to single step to the Nth instruction 343 00:27:52,980 --> 00:27:59,272 after the jump instruction. And on the left side, if you observe one, we hit then 344 00:27:59,272 --> 00:28:05,414 exactly the AVX instruction there, used in the AVX memset. And if you then use our 345 00:28:05,414 --> 00:28:10,044 measurement framework to measure exactly the nth instruction after the jump, we 346 00:28:10,044 --> 00:28:14,690 observe on the one hand a high energy consumption and on the other hand, we 347 00:28:14,690 --> 00:28:20,140 observe low energy consumption if the branch was not taken or a zero. 348 00:28:20,140 --> 00:28:22,910 M: It's very clever. 
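The decision described above, high energy at the Nth instruction when the branch was taken and low energy otherwise, can be sketched as a simple comparison against a cut-off value. This is a minimal Python sketch with made-up energy readings. The function name `recover_key_bits`, the cut-off of 4.2, and the use of a few samples per bit are assumptions for illustration, not the authors' implementation.

```python
from statistics import mean


def recover_key_bits(samples_per_bit, threshold):
    """Classify each key bit from its energy readings.

    samples_per_bit: one list of energy readings per key bit, taken at the
    chosen instruction offset after the conditional jump.
    A mean above the threshold is read as a 1 (AVX instruction executed),
    a mean at or below it as a 0.
    """
    return [1 if mean(samples) > threshold else 0 for samples in samples_per_bit]


# Hypothetical readings in arbitrary units, three samples per key bit.
readings = [
    [5.1, 5.3, 5.0],  # high: AVX instruction hit
    [3.2, 3.1, 3.4],  # low: other branch
    [5.2, 4.9, 5.4],
    [3.0, 3.3, 3.1],
]
bits = recover_key_bits(readings, threshold=4.2)
print(bits)  # [1, 0, 1, 0]
```

In practice the cut-off would have to be calibrated on the target machine, since absolute readings depend on the CPU and on the noise added by the stepping framework.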
A: So if we measure both 349 00:28:22,910 --> 00:28:28,490 instructions here, we can then combine these energy measurements and then use a 350 00:28:28,490 --> 00:28:35,490 simple threshold to reconstruct the key bit in the beginning. And then we do this 351 00:28:35,490 --> 00:28:39,270 iteratively for each key bit. M: This sounds pretty promising, but 352 00:28:39,270 --> 00:28:40,760 did you try it out? laughter 353 00:28:40,760 --> 00:28:45,149 A: Sure. Here are the results of that. And we can clearly see that we have 354 00:28:45,149 --> 00:28:48,735 different energy consumption or in this case voltage 355 00:28:48,735 --> 00:28:51,094 applause based on whether the 356 00:28:51,094 --> 00:28:56,160 AVX instruction is executed or the instruction at the same offset in the 357 00:28:56,160 --> 00:28:59,410 other branch is executed. M: How fast does this work, does this 358 00:28:59,410 --> 00:29:03,025 take like 5 days? A: Not quite that long. We have one 359 00:29:03,025 --> 00:29:08,445 problem here that the time per key bit increases the further or later the key bit 360 00:29:08,445 --> 00:29:14,040 is in the key. So basically the first key bit we can reconstruct very fast, but for 361 00:29:14,040 --> 00:29:18,230 the last key bit, we need to single step much further in the code to actually reach 362 00:29:18,230 --> 00:29:23,460 it. And this adds up. So basically the time increases linearly from key bit to key 363 00:29:23,460 --> 00:29:29,090 bit. But for our key here, our test key with 512 bits, it takes us about 3.5 364 00:29:29,090 --> 00:29:35,280 hours to reconstruct a complete key. Note here that we spent like 52 minutes 365 00:29:35,280 --> 00:29:39,790 only to find the target instruction. So basically, if we could optimize that, the 366 00:29:39,790 --> 00:29:45,688 attack would be much faster. In addition, we had to record like 3 samples per key 367 00:29:45,688 --> 00:29:50,199 bit.
But with the implementation, it should be possible to actually do that 368 00:29:50,199 --> 00:29:54,600 with 1 sample. And since we then only need one sample per key bit, we actually can do 369 00:29:54,600 --> 00:29:58,569 it with a single trace attack. But we did not try that out, unfortunately. 370 00:29:58,569 --> 00:30:03,375 Moritz: quite fast. Michael: So while all this sounded quite 371 00:30:03,375 --> 00:30:08,183 easy and straightforward in hindsight, this was actually a really long process. 372 00:30:08,183 --> 00:30:14,100 Starting at the beginning of 2017 when we discovered this interface, the RAPL 373 00:30:14,100 --> 00:30:18,713 interface. Then we had to come up with a title for this talk, of course, laughter 374 00:30:18,713 --> 00:30:25,677 and some lyrics for a song. We had the first toy attack on RSA at the end of 375 00:30:25,677 --> 00:30:34,463 2017. It took us until 2018 to finally get a KASLR break that was working and only in 376 00:30:34,463 --> 00:30:41,280 2019, by the end of 2019. After Andreas did his master's thesis on that, we were 377 00:30:41,280 --> 00:30:48,030 able to produce a full attack on RSA. And this is also the time when we submitted 378 00:30:48,030 --> 00:30:53,910 that as a paper to a conference and disclosed that to the CPU vendors so that 379 00:30:53,910 --> 00:30:59,552 they can fix that. And this is also the start of the embargo. This embargo for 380 00:30:59,552 --> 00:31:10,640 this vulnerability lasted almost one year. So from November 2019 to November 2020. It 381 00:31:10,640 --> 00:31:15,790 was just a few weeks ago that this embargo ended here. 382 00:31:15,790 --> 00:31:21,040 Moritz: But there's one thing missing. We really wanted to do crypto attacks, but 383 00:31:21,040 --> 00:31:28,067 not only with SGX-step as a compromised operating system, but also from userspace. 
384 00:31:28,067 --> 00:31:33,650 But as we've seen, it's so difficult to measure parts of the code without having 385 00:31:33,650 --> 00:31:39,653 SGX-step. But what we can do is we can measure the power consumption of the 386 00:31:39,653 --> 00:31:46,280 overall execution of an algorithm and there correlation power analysis comes in 387 00:31:46,280 --> 00:31:53,121 handy. And there what we do is we build a power consumption model of our device. As 388 00:31:53,121 --> 00:31:58,540 we've heard earlier, the Hamming Weight is the number of bits that is set in an 389 00:31:58,540 --> 00:32:05,580 operand or in the data. And we assume that if a bit is set, the computer takes more 390 00:32:05,580 --> 00:32:10,850 power to process it. In addition, what you can use as a different model is the 391 00:32:10,850 --> 00:32:17,768 Hamming distance. So from one operation to the other, how many bits change? And then 392 00:32:17,768 --> 00:32:24,690 we assume the more bits change, the more power is consumed. And we really want to 393 00:32:24,690 --> 00:32:30,700 try that out. So what we are targeting now is AES-NI, a side channel resistant 394 00:32:30,700 --> 00:32:37,320 instruction set of Intel. And we target it in a scenario where we can trigger the 395 00:32:37,320 --> 00:32:43,728 encryption and decryption of many, many blocks over long time so that the 396 00:32:43,728 --> 00:32:50,770 execution time is longer than the RAPL update rate, so that we can really see the 397 00:32:50,770 --> 00:32:55,640 power consumption in our measurement. And this is used, for instance, in disk 398 00:32:55,640 --> 00:33:05,340 encryption or decryption or if you seal or unseal the SGX enclave state. And we can 399 00:33:05,340 --> 00:33:10,840 now do that and record power measurements in different scenarios, right? 400 00:33:10,840 --> 00:33:17,390 Andreas: Sure, we can try that. 
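The two power models mentioned above are easy to write down. The following Python sketch shows the Hamming weight of a value and the Hamming distance between two consecutive values. The function names are chosen here for illustration only.

```python
def hamming_weight(value: int) -> int:
    """Number of bits set in a non-negative integer."""
    return bin(value).count("1")


def hamming_distance(before: int, after: int) -> int:
    """Number of bit positions that change between two values."""
    return hamming_weight(before ^ after)


print(hamming_weight(0b1011))        # 3 bits set
print(hamming_distance(0xFF, 0x0F))  # 4 bits flip
```

Under the Hamming weight model, an operand with more set bits is assumed to cost more power. Under the Hamming distance model, the assumed cost depends on how many bits flip from one operation to the next.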
So in our experiment, we recorded two million traces 401 00:33:17,390 --> 00:33:25,860 over 26 hours for the SGX environment. But we also tried to reconstruct it without SGX 402 00:33:25,860 --> 00:33:29,700 where we used the encryption inside a kernel module. And there we recorded 403 00:33:29,700 --> 00:33:36,951 4 million traces in 50 hours. And to understand the attack here, we have to 404 00:33:36,951 --> 00:33:42,030 look at this animation. So basically we have our computer where a secret key is 405 00:33:42,030 --> 00:33:49,500 stored somewhere internally. Then we use this key to encrypt some messages and we also 406 00:33:49,500 --> 00:33:54,240 have the power consumption here. And what we now did is we recorded the encrypted 407 00:33:54,240 --> 00:34:00,854 message and the power consumption it took to encrypt this message for many messages. 408 00:34:00,854 --> 00:34:07,540 And then we use a model of the CPU here to predict the energy consumption, to 409 00:34:07,540 --> 00:34:12,940 reconstruct the key. The key is usually split up into parts, where each of the 410 00:34:12,940 --> 00:34:20,887 parts can have a value between 0 and 255. So to reconstruct the key here, we simply 411 00:34:20,887 --> 00:34:28,819 use our measurements and the model and we try out one value for a key part and estimate 412 00:34:28,819 --> 00:34:35,809 the energy consumption for that key part. And then we store the correlation between 413 00:34:35,809 --> 00:34:42,619 the recorded power consumption and the prediction. And we do this for every one of the possible 414 00:34:42,619 --> 00:34:50,379 key values. And once we have found the key value with the highest correlation, we know 415 00:34:50,379 --> 00:34:56,909 that this key value corresponds to this part of the key. And we then simply repeat 416 00:34:56,909 --> 00:35:02,279 the process for each of the parts of the key until we get the final key. 417 00:35:02,279 --> 00:35:07,450 M: And we actually tried that out.
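The guess-and-correlate loop described above can be sketched for a single key byte. This Python sketch is a toy simulation, not the attack from the talk: the "device" leaks the Hamming weight of message XOR key plus Gaussian noise, which is an assumption made here so the example stays short (a real AES attack would model an intermediate value of the cipher instead). The names `pearson` and `best_key_byte`, the secret value, the noise level, and the trace count are all illustrative.

```python
import random


def hamming_weight(value):
    """Number of bits set in a non-negative integer."""
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_key_byte(messages, energies):
    """Try all 256 guesses and return the one whose predicted
    leakage correlates best with the measured energies."""
    scores = []
    for guess in range(256):
        predicted = [hamming_weight(m ^ guess) for m in messages]
        scores.append(pearson(predicted, energies))
    return max(range(256), key=lambda g: scores[g])


# Simulated measurements (assumption: toy XOR leakage plus noise).
rng = random.Random(1)
SECRET = 0x3C
messages = [rng.randrange(256) for _ in range(2000)]
energies = [hamming_weight(m ^ SECRET) + rng.gauss(0.0, 1.0) for m in messages]

recovered = best_key_byte(messages, energies)
print(hex(recovered))
```

Repeating the same loop for every byte position would yield the full key in this toy setting. With real RAPL readings the signal is far weaker, which fits the millions of traces mentioned in the talk.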
So here in our demo video, you see on the 418 00:35:07,450 --> 00:35:13,391 left where we test all the combinations and see what is the most likely key 419 00:35:13,391 --> 00:35:18,349 candidate at the moment, while for a single key byte on the right, you see 420 00:35:18,349 --> 00:35:23,730 every possible value and the correlation. So in the beginning, with not that many 421 00:35:23,730 --> 00:35:29,747 traces processed, it's not very clear which key candidate is the right one, 422 00:35:29,747 --> 00:35:34,849 because there's so much measurement noise introduced by measuring over the overall 423 00:35:34,849 --> 00:35:41,292 execution time. But over time, this signal gets more stable and we see on the right 424 00:35:41,292 --> 00:35:45,890 with the peak getting more and more distant from the other candidates that 425 00:35:45,890 --> 00:35:52,380 this is our correct key byte. And we do this, as Andreas said, for every possible 426 00:35:52,380 --> 00:35:57,230 key byte with every possible value. So in the end, we end up with the correct key. 427 00:35:57,230 --> 00:36:00,729 applause A: OK, but this seems like it's only 428 00:36:00,729 --> 00:36:05,930 Intel CPUs. Does this also affect others? M: Yes. So actually, we also checked 429 00:36:05,930 --> 00:36:10,858 other CPU vendors to see if they have similar interfaces. And for instance, AMD 430 00:36:10,858 --> 00:36:17,532 is affected as well. But we never really heard back from them after our disclosure. 431 00:36:17,532 --> 00:36:23,510 And the patch that tries to solve the problem with the driver is similar to the 432 00:36:23,510 --> 00:36:27,400 one that Intel has. A: You're right Moritz, it actually 433 00:36:27,400 --> 00:36:31,839 works. So I tried the same code on AMD. The one you showed before for 434 00:36:31,839 --> 00:36:37,080 distinguishing operands, and that also works on AMD. That's pretty nice. It's not 435 00:36:37,080 --> 00:36:41,440 an Intel-only issue. It affects at least AMD as well.
436 00:36:41,440 --> 00:36:45,230 M: Yes, but actually there are many other vendors as well that provide 437 00:36:45,230 --> 00:36:50,410 interfaces, even some of them unprivileged to user space where you could probably 438 00:36:50,410 --> 00:36:55,660 mount similar attacks. For instance, Nvidia, IBM, or Marvell and Ampere. 439 00:36:55,660 --> 00:37:00,906 A: So this is really an industry wide problem here. And we've also seen 440 00:37:00,906 --> 00:37:08,432 that from the media coverage. So not only German news brought about that like Heise 441 00:37:08,432 --> 00:37:13,788 or Golem, but it also went more international with ZDNET, Ars Technica, 442 00:37:13,788 --> 00:37:20,970 CSO, Tech Radar, Computer Weekly and many, many others that wrote about this new type 443 00:37:20,970 --> 00:37:28,599 of vulnerability that affects many computers out there. And I guess if it 444 00:37:28,599 --> 00:37:31,480 affects many computers, we should do something against that. 445 00:37:31,480 --> 00:37:35,779 M: Yes, you're right. We cannot only have an attack and no mitigation against 446 00:37:35,779 --> 00:37:41,470 it. This would not be right. And indeed, it's quite easy to fix that because we 447 00:37:41,470 --> 00:37:46,040 said in the beginning, you have unprivileged access to those registers. So 448 00:37:46,040 --> 00:37:51,930 we just restrict the access. And we are done, and this is exactly a one line patch 449 00:37:51,930 --> 00:37:59,480 for the Linux kernel. But as we've seen with the threat model of Intel SGX, which 450 00:37:59,480 --> 00:38:05,049 allows a compromised operating system. So this one line patch does not help there 451 00:38:05,049 --> 00:38:11,340 because I'm the operating system, I can do whatever I want to. We need more and more 452 00:38:11,340 --> 00:38:18,445 complex mitigations. And in this case, microcode updates are necessary. And what 453 00:38:18,445 --> 00:38:23,991 Intel does is to fall back to the model of the energy consumption. 
So they have an 454 00:38:23,991 --> 00:38:28,930 internal model of how much energy is consumed by an executed instruction and 455 00:38:28,930 --> 00:38:33,968 use that instead of the real measurement. And this no longer allows distinguishing 456 00:38:33,968 --> 00:38:40,895 data and operands from each other. So if your implementation is written 457 00:38:40,895 --> 00:38:47,220 correctly, if you use constant time, then you are protected against 458 00:38:47,220 --> 00:38:53,444 these attacks. And as we see here in the plot, we tried the mitigation out. So on 459 00:38:53,444 --> 00:38:58,020 the left, we were able to see differences depending on the Hamming weight of the 460 00:38:58,020 --> 00:39:03,700 operands. And on the right with the mitigation in place, it just does not work 461 00:39:03,700 --> 00:39:07,311 anymore and you cannot see any differences. applause 462 00:39:07,311 --> 00:39:11,142 Andreas: Nice. So you really can't read her power trace any more.
463 00:39:11,142 --> 00:39:35,547 Music: Poker Face by Lady Gaga 464 00:39:35,547 --> 00:39:39,641 sings I wanna probe 'em like in 1943 465 00:39:39,641 --> 00:39:43,116 touch 'em, measure wattage correlate and get the key 466 00:39:43,116 --> 00:39:44,005 I probe it 467 00:39:44,005 --> 00:39:47,368 Oscilloscopes are not the same without a probe 468 00:39:47,368 --> 00:39:52,219 And babe, if it's remote if it's not code, it cannot run 469 00:39:56,239 --> 00:39:59,731 I'll let him plot, let's see what he's got 470 00:40:04,251 --> 00:40:08,145 I'll let him plot, let's see what he's got 471 00:40:08,145 --> 00:40:10,389 Can't read my, can't read my 472 00:40:10,389 --> 00:40:14,091 No he can't read my power trace 473 00:40:14,091 --> 00:40:16,368 She's got the countermeasure 474 00:40:16,368 --> 00:40:18,283 Can't read my, can't read my 475 00:40:18,283 --> 00:40:21,907 No he can't read my power trace 476 00:40:21,907 --> 00:40:24,572 She's got the countermeasure 477 00:40:24,572 --> 00:40:27,649 P-p-p-power trace, p-p-power trace 478 00:40:28,530 --> 00:40:31,688 P-p-p-power trace, p-p-power trace 479 00:40:32,533 --> 00:40:35,658 P-p-p-power trace, p-p-power trace 480 00:40:36,691 --> 00:40:39,555 P-p-p-power trace, p-p-power trace 481 00:40:41,404 --> 00:40:43,728 applause 482 00:40:43,728 --> 00:40:45,910 Moritz: With all those nasty songs, we 483 00:40:45,910 --> 00:40:50,910 wrote them down in a scientific paper and the PLATYPUS paper has been accepted 484 00:40:50,910 --> 00:40:57,240 recently at a conference. And we also want to thank all the other coauthors who 485 00:40:57,240 --> 00:41:04,520 are not in this talk, like David Oswald, Catherine Easdon and Claudio Canella. To 486 00:41:04,520 --> 00:41:09,900 sum it up, what we have seen is that with power side-channel attacks, you can even 487 00:41:09,900 --> 00:41:16,630 exploit them from software. So there is no need to attach an oscilloscope to modern 488 00:41:16,630 --> 00:41:19,514 Intel CPUs.
489 00:41:19,514 --> 00:41:23,239 Michael: And what we've also seen is that since the SGX threat model allows for 490 00:41:23,239 --> 00:41:27,809 much more capable attackers, mitigating power sidechannel attacks on the SGX 491 00:41:27,809 --> 00:41:32,369 enclaves is much more work than simple software patches. 492 00:41:32,369 --> 00:41:34,604 Andreas: Yes, and that concludes 493 00:41:34,604 --> 00:41:39,696 our talk on PLATYPUS. Thank you all for listening. 494 00:41:39,696 --> 00:41:56,859 Applause and Music 495 00:41:59,077 --> 00:42:05,580 Herald: Thank you very much for your excuse me, nerdy talk and thank Moritz, 496 00:42:05,580 --> 00:42:13,140 Michael, Daniel and Andreas. We head over to our Q&A session and the first question 497 00:42:13,140 --> 00:42:21,059 would be, how does it come that you have so, let's say through the back door 498 00:42:21,059 --> 00:42:26,680 against CPU attack against the CPU idea, you mentioned you attack the through a 499 00:42:26,680 --> 00:42:31,910 power driver RSA. Could you tell me a little bit more about that? 500 00:42:31,910 --> 00:42:36,640 Moritz: Yes. So the basic idea of attacking cryptographic algorithms with 501 00:42:36,640 --> 00:42:41,339 power side channel attacks is not very new This was like one of the first things 502 00:42:41,339 --> 00:42:46,400 researchers have shown, but most of the time for like smaller devices, like smart 503 00:42:46,400 --> 00:42:52,740 cards, like your bank card, for instance. And for those attacks, you usually had 504 00:42:52,740 --> 00:42:57,472 like an oscilloscope that you needed to attach to the device to do the attack. But 505 00:42:57,472 --> 00:43:02,012 with modern processors, they have basically an oscilloscope built into the 506 00:43:02,012 --> 00:43:07,309 processor, which you can read out as the operating system. And in our case, there 507 00:43:07,309 --> 00:43:12,454 are like drivers that expose this interface, also to userspace. 
So from 508 00:43:12,454 --> 00:43:18,050 there as an unprivileged attacker, you can then try to exploit that. And yeah 509 00:43:18,050 --> 00:43:23,450 basically the best thing that we wanted to achieve with those attacks is to attack 510 00:43:23,450 --> 00:43:29,434 cryptographic algorithms and not to transmit some data between two processes. 511 00:43:29,434 --> 00:43:35,700 Herald: Cool, thank you. Our next question, you mentioned a little bit about 512 00:43:35,700 --> 00:43:44,259 ARM sorry, AMD, how about ARM? So not x86 architecture? 513 00:43:44,259 --> 00:43:49,350 Moritz: So there are many other vendors that have similar interfaces, some of them 514 00:43:49,350 --> 00:43:55,519 also provide drivers that expose them directly to userspace, but we hardly had 515 00:43:55,519 --> 00:44:01,390 any access to those devices, so we could not really fully evaluate if these attacks 516 00:44:01,390 --> 00:44:06,072 are also possible on them. But in the paper, we have an appendix where we 517 00:44:06,072 --> 00:44:10,440 describe them in a bit more detail so you can try it out on your own and let us know 518 00:44:10,440 --> 00:44:15,120 if it works. Herald: Cool. Thank you. So please, fellow 519 00:44:15,120 --> 00:44:20,470 hackers, try it out at your system, at home. Now, our next question is related to 520 00:44:20,470 --> 00:44:26,374 that. Is there a survey which hardware has the RAPL or similar weaknesses? Intel, 521 00:44:26,374 --> 00:44:33,045 AMD, ARM even. Moritz: I don't know if anyone else wants 522 00:44:33,045 --> 00:44:38,940 to answer that, I can also take the question. So the RAPL interface itself 523 00:44:38,940 --> 00:44:44,130 comes from Intel, but a similar interface is also implemented for AMD, and they also 524 00:44:44,130 --> 00:44:49,710 use basically the same name. They have a... 
For now, it's implemented in two ways 525 00:44:49,710 --> 00:44:54,420 for the Linux kernel, in the RAPL driver, but also in a separate driver called the AMD 526 00:44:54,420 --> 00:44:59,609 Energy driver, which has been included for a few months in the Linux kernel, in the 527 00:44:59,609 --> 00:45:05,074 upstream kernel. And for other vendors it works a bit differently. So some of them 528 00:45:05,074 --> 00:45:12,087 just give you similar measurements, but not in a way closely related to the RAPL 529 00:45:12,087 --> 00:45:16,220 interface, where they measure over a period of time and give you the average. 530 00:45:16,611 --> 00:45:21,560 Herald: OK, and.. Michael: Maybe to add one point here: On 531 00:45:21,560 --> 00:45:26,534 Intel, basically the high resolution sensors have been included since the Skylake 532 00:45:26,534 --> 00:45:31,308 microarchitecture. So something around 2015. 533 00:45:33,383 --> 00:45:40,180 Herald: I see. We have another related question to AMD. So did AMD issue any 534 00:45:40,180 --> 00:45:45,160 microcode update for the secure encrypted virtual machines case apart from 535 00:45:45,160 --> 00:45:53,469 restricting access to the MSRs? Moritz: Not as far as we know. But to 536 00:45:53,469 --> 00:45:58,271 our knowledge, to attack AMD CPUs, we need to wait for a new generation so that we 537 00:45:58,271 --> 00:46:02,931 can do similar attacks under a similar threat model as we can on Intel. 538 00:46:03,450 --> 00:46:09,390 Herald: Cool, thank you. So another, I think this is also related to it, you 539 00:46:09,390 --> 00:46:14,390 mentioned your Xen example where you attack through a hypervisor. Does it work 540 00:46:14,390 --> 00:46:18,440 on other hypervisors like KVM or Hyper-V as well? 541 00:46:18,440 --> 00:46:24,470 Moritz: So for KVM, I don't think so. For Windows I also don't know. I don't think 542 00:46:24,470 --> 00:46:29,509 they expose those MSRs directly to the virtual machines.
So the issue is really 543 00:46:29,509 --> 00:46:34,270 here that we can have access to those MSRs at the virtual machine where we should not 544 00:46:34,270 --> 00:46:40,859 have access to. Herald: OK, we have another question from, 545 00:46:40,859 --> 00:46:47,297 I think, the hardware section of our remote Congress. Someone wonders if the 546 00:46:47,297 --> 00:46:51,833 same could be achieved with external power measurement. 547 00:46:52,990 --> 00:46:57,640 Moritz: You mean if you could attach actually an oscilloscope or a different 548 00:46:57,640 --> 00:47:03,510 probe to the CPU? Yes, you can do that. And it has already been demonstrated in 549 00:47:03,510 --> 00:47:07,279 the past. Michael: But it turned out with external 550 00:47:07,279 --> 00:47:12,510 tools, it takes even longer than with software. You have more issues finding the 551 00:47:12,510 --> 00:47:20,630 right spot in measuring. And there is one paper, it took 14 days of collecting 552 00:47:20,630 --> 00:47:26,909 traces which are harder to probe, which is much longer than in software. But it can 553 00:47:26,909 --> 00:47:30,981 be done. Herald: And there's another follow up 554 00:47:30,981 --> 00:47:38,677 question, how external is external? Where do you measure power consumptions of an 555 00:47:38,677 --> 00:47:46,650 x86 server? Moritz: OK, you would need to get physical 556 00:47:46,650 --> 00:47:51,400 access to the data center, I guess. And if this is in your threat model, you probably 557 00:47:51,400 --> 00:47:57,740 have different things to worry about. Michael: Yeah, you still need to find the 558 00:47:57,740 --> 00:48:04,609 right spot on your mainboard. Herald: OK, so are there, let's say 559 00:48:04,609 --> 00:48:08,680 documentation's where to get that right spot. 
560 00:48:09,612 --> 00:48:14,700 Moritz: I think one can take a look at other research papers where they attached 561 00:48:14,700 --> 00:48:19,180 a probe, I think there are experts out there, but I don't know. 562 00:48:19,180 --> 00:48:26,690 Herald: OK, thank you. The next question, why is the power information exported in 563 00:48:26,690 --> 00:48:32,809 such detail to the kernel or userspace software? Why isn't it only available to 564 00:48:32,809 --> 00:48:37,700 the firmware or filtered to return an average, for example, a one-second power 565 00:48:37,700 --> 00:48:43,279 trace? Moritz: Good question. We did not 566 00:48:43,279 --> 00:48:48,140 implement that. I think the reason is... Andi? 567 00:48:48,140 --> 00:48:53,540 Andreas: The one-second power trace would only make the attack slower because you 568 00:48:53,540 --> 00:48:58,345 can still do exactly what we did with single stepping here, because RAPL is 569 00:48:58,345 --> 00:49:04,477 already very slow and we need a mechanism to replay instructions to get a good 570 00:49:04,477 --> 00:49:08,779 reading of the energy consumption of the instructions. So if you only increase the 571 00:49:08,779 --> 00:49:14,170 update interval there, the attacks would still be possible, but only take longer to 572 00:49:14,170 --> 00:49:22,819 record the traces there. So you have to... Yeah. So you have to find a tradeoff 573 00:49:22,819 --> 00:49:28,049 between your countermeasures there. Herald: Okay, so let's say with an 574 00:49:28,049 --> 00:49:33,180 average, your resolution is lower, but still it just takes more time to record 575 00:49:33,180 --> 00:49:38,420 it. And still it does work, right? Moritz: Yes.
And the other thing is that 576 00:49:38,420 --> 00:49:43,450 one needs to keep in mind those drivers are not written with security in mind, but 577 00:49:43,450 --> 00:49:48,779 for performance so that this can be used by other tools that like give you the best 578 00:49:48,779 --> 00:49:55,059 performance of your CPU. And in that case, it just has not been masked and you get 579 00:49:55,059 --> 00:49:58,710 the value directly as the operating system sees it. 580 00:49:59,106 --> 00:50:06,380 Herald: Crazy. Our second to last question, how long is the update interval 581 00:50:06,380 --> 00:50:13,046 for this measurement? I heard something about... 582 00:50:13,046 --> 00:50:17,224 Andreas: For the fastest register we observed, it's like 10 microseconds, for 583 00:50:17,224 --> 00:50:21,079 the slowest one... So there are different domains where you measure only parts of 584 00:50:21,079 --> 00:50:25,290 the CPU and for the whole package, this includes all the cores and the memory 585 00:50:25,290 --> 00:50:30,099 controller, it takes around one millisecond there. So this is already very 586 00:50:30,099 --> 00:50:35,311 slow, if you compare it to the frequency that CPUs are currently running at. 587 00:50:36,690 --> 00:50:43,539 Herald: Crazy. In this case, are there any other questions from the interwebs, from 588 00:50:43,539 --> 00:50:50,455 Twitter, from our IRC channel? Because otherwise we would head over to a more, 589 00:50:50,455 --> 00:50:56,178 let's say, personal interview. Let's give them a try. 590 00:51:07,727 --> 00:51:09,880 In this case, no more 591 00:51:09,880 --> 00:51:16,851 questions. So, again, thank you, Moritz, Michael, Daniel and Andreas, 592 00:51:16,851 --> 00:51:27,230 for this really interesting talk and for this Q&A session. The Internet tells 593 00:51:27,230 --> 00:51:35,622 me no questions. We head over to our personal interview. I asked you earlier 594 00:51:35,622 --> 00:51:43,670 before our talk.
So with all these, let's say, research things going on in the 595 00:51:43,670 --> 00:51:49,420 Corona time. So what's your personal experience? What changed in your work life 596 00:51:49,420 --> 00:51:56,001 balance in the last one year? Moritz: I think the biggest change is that 597 00:51:56,001 --> 00:52:02,105 most of the coffee breaks you do alone instead of with the colleagues. 598 00:52:04,211 --> 00:52:08,710 Herald: So how do you meet in your in your, let's say, lunch break? Do you have 599 00:52:08,710 --> 00:52:16,069 as well a lunch break break out session in Jitsi? Yeah, we started with Jitsi, but 600 00:52:16,069 --> 00:52:20,320 used different systems on the long way. And now it's like a fixed coffee meeting 601 00:52:20,320 --> 00:52:25,637 at 2:00 p.m. every day and try to meet everyone or have individual meetings, of 602 00:52:25,637 --> 00:52:28,758 course. Herald: And does this work? But so is 603 00:52:28,758 --> 00:52:35,323 everyone on time. So sharp 12? Moritz: No, but I think no one really 604 00:52:35,323 --> 00:52:40,500 cares. Herald: So it's just for socializing? 605 00:52:40,500 --> 00:52:47,168 Moritz: Yes. But we also discuss work related issues also in separate meetings. 606 00:52:47,168 --> 00:52:54,849 And yeah, I think time is different, but you get used to it. But let's hope it's 607 00:52:54,849 --> 00:53:02,108 over soon. Herald: What about the others, Michael? 608 00:53:02,108 --> 00:53:08,910 Michael: Yes, I'm in the same coffee breaks as Moritz. Sometimes every day, 609 00:53:08,910 --> 00:53:17,200 depends on the workload, so I feel quite lucky that we can still work full time and 610 00:53:17,200 --> 00:53:21,890 get our work done. And I don't have to fear that we lose our jobs in the in the 611 00:53:21,890 --> 00:53:30,609 short term. So I think that takes a lot of pressure off. But, yeah, I mean, it's 612 00:53:30,609 --> 00:53:35,859 different. 
I'm also missing the conferences, so I used to travel around a 613 00:53:35,859 --> 00:53:43,990 lot before Corona times and this year is basically nothing. So you really miss the 614 00:53:43,990 --> 00:53:49,910 social interactions and conferences, meeting other researchers, exchanging 615 00:53:49,910 --> 00:54:00,060 ideas, having that online is different and just not the same, but still it works. So 616 00:54:00,060 --> 00:54:05,289 I can still do a lot of research. The positive thing, you have less 617 00:54:05,289 --> 00:54:12,019 interruptions than when you're in the office. So that's a positive thing. But 618 00:54:12,019 --> 00:54:17,269 yeah, I also hope that it's over soon. Daniel: But then again, on the other side, 619 00:54:17,269 --> 00:54:22,476 you have way more conference calls because instead of writing emails, people ask for 620 00:54:22,476 --> 00:54:26,808 conference calls all the time. Michael: Yes, you are in meetings all the 621 00:54:26,808 --> 00:54:29,980 time. Herald: Yeah, Daniel you mentioned earlier 622 00:54:29,980 --> 00:54:37,299 you're, let's say, flightplan the last year. And as far as I understood it, you 623 00:54:37,299 --> 00:54:43,049 like to be in personal contact with your colleagues, also from others or from 624 00:54:43,049 --> 00:54:49,109 foreign countries. How does this work? So let's say topic exchange between different 625 00:54:49,109 --> 00:54:51,890 organizations, between different countries? 626 00:54:51,890 --> 00:54:59,930 Daniel: Yeah, it's more difficult. So in 2018, I had these 54 talks outside of Graz 627 00:54:59,930 --> 00:55:11,529 in 52 weeks and this year I had a single talk outside of, outside of Graz where I 628 00:55:11,529 --> 00:55:17,630 was in person of course. Of course more Online. Um yeah. 
So it's difficult 629 00:55:17,630 --> 00:55:24,210 to engage with people from other places, but it works of course in teams that 630 00:55:24,210 --> 00:55:29,869 you already have established in the past, for instance. So you can continue in 631 00:55:29,869 --> 00:55:36,720 teams that you've already built there. But also in some cases it works to start new 632 00:55:36,720 --> 00:55:40,900 collaborations. But it's of course more difficult than if you can just meet people 633 00:55:40,900 --> 00:55:46,643 in person like we did for this paper actually, David Oswald, one of the 634 00:55:46,643 --> 00:55:52,613 coauthors, we met with him in person and talked with him about the paper in person. 635 00:55:56,148 --> 00:56:02,210 Herald: Andreas, what's your, let's say, Corona year? 636 00:56:02,210 --> 00:56:06,569 Andreas: Yeah, since I'm one of the persons who was interrupting Michael all 637 00:56:06,569 --> 00:56:14,259 the time I am missing the office because it looks like the unscheduled flow, 638 00:56:14,259 --> 00:56:18,390 because it's sitting in an office and suddenly you have like a question or idea, 639 00:56:18,390 --> 00:56:24,110 you cannot or you don't have to write it. You can just ask it on the fly. So I'm a 640 00:56:24,110 --> 00:56:28,898 bit missing that side. On the other side, I gained a lot of time since I don't have 641 00:56:28,898 --> 00:56:36,544 to travel to work there. And often I got a bit better in writing stuff I want to 642 00:56:36,544 --> 00:56:40,290 know, asking questions much faster, like losing the clover and that 643 00:56:40,290 --> 00:56:48,660 stuff. And so I think it's both positive and negative. And I only joined since I 644 00:56:48,660 --> 00:56:55,539 think August, when I finished my master's thesis and in the first half of the year, 645 00:56:55,539 --> 00:57:00,220 I worked at a software company where the first lockdown was also handled very well.
646 00:57:00,220 --> 00:57:05,089 So we had like a smooth transition. So I'm kind of used to home office, but I miss 647 00:57:05,089 --> 00:57:17,470 interacting with people. Herald: I think that's the main thing 2020 648 00:57:17,470 --> 00:57:23,789 brings us: more remote work. Which is basically a good thing to work more from 649 00:57:23,789 --> 00:57:32,460 home, but we have some minutes left. And please excuse me myself. Did your mate 650 00:57:32,460 --> 00:57:41,030 consumption increase or decrease? Moritz: I think it's hard to say for 651 00:57:41,030 --> 00:57:45,950 coffee because I used to drink more coffee in the office than at home. Yeah, but 652 00:57:45,950 --> 00:57:56,785 now I see it when we go grocery shopping. laughs It's hard to say. 653 00:57:56,785 --> 00:58:02,150 Michael: I think it decreased for me because now if I'm tired, I can simply 654 00:58:02,150 --> 00:58:11,180 take a nap, that's easier. Herald: And just turn your instant 655 00:58:11,180 --> 00:58:15,890 messaging off. Michael: Yeah. 656 00:58:17,214 --> 00:58:23,930 Herald: So our time is over. Thank you again for the brilliant, for the amazing 657 00:58:23,930 --> 00:58:31,640 work, for this attack against CPUs, for the great puns you brought, for the nice 658 00:58:31,640 --> 00:58:36,990 interview and have a nice remote Congress 3. 659 00:58:36,990 --> 00:58:51,329 postroll music 660 00:58:51,329 --> 00:59:15,900 Subtitles created by c3subtitles.de in the year 2021. Join, and help us!