1
00:00:04,680 --> 00:00:12,629
rc3 preroll music
2
00:00:12,629 --> 00:00:17,340
Herald: In the world of bad puns, everyone
knows and loves the famous line from the
3
00:00:17,340 --> 00:00:22,810
cinematic masterpiece, where the IT
security specialists ask the CPU architect
4
00:00:22,810 --> 00:00:30,050
"Warum leakt hier Strom?" or in English,
"why is power leaking here?". In this talk
5
00:00:30,050 --> 00:00:35,660
our four speakers demonstrate how they can
attack modern processors purely in
6
00:00:35,660 --> 00:00:43,079
software, relying on techniques
from classical power side channel attacks.
7
00:00:43,079 --> 00:00:47,470
They'll explain how to use
unprivileged access to energy monitoring
8
00:00:47,470 --> 00:00:53,960
features on modern Intel and AMD CPUs.
Please welcome with a round of digital
9
00:00:53,960 --> 00:00:58,450
applause: Moritz Lipp, Michael Schwarz,
Daniel Gruss and Andreas Kogler.
10
00:01:07,580 --> 00:01:11,456
Moritz: Warum leakt hier Strom?
laugh track
11
00:01:11,456 --> 00:01:13,707
Andreas: Und warum wendest du kein Masking an?
[And why aren't you applying masking?]
12
00:01:13,707 --> 00:01:16,774
laugh track
13
00:01:16,774 --> 00:01:20,760
Daniel: But to understand how we got here,
we have to go back to San Diego in May
14
00:01:20,760 --> 00:01:23,340
2017.
A: This is great, Moritz, this is
15
00:01:23,340 --> 00:01:26,029
a great talk title. We have to use this.
laugh track
16
00:01:26,029 --> 00:01:29,739
M: Yeah, but actually, before we can
do a talk, we should do some interesting
17
00:01:29,739 --> 00:01:32,010
research that we can present, right?
laugh track
18
00:01:32,010 --> 00:01:35,629
A: Of course. Of course. But we have
to remember this talk title, it's great.
19
00:01:35,628 --> 00:01:36,599
laugh track
M: Yes.
20
00:01:36,599 --> 00:01:47,990
music
21
00:01:47,990 --> 00:01:51,258
Michael: Hey Moritz. Today I have found
something really cool.
22
00:01:51,258 --> 00:01:54,650
Moritz: OK, what is it?
Michael: Our computers, they give
23
00:01:54,650 --> 00:01:59,404
us the current energy consumption in
microjoules and you can access that
24
00:01:59,404 --> 00:02:00,650
from userspace.
laugh track
25
00:02:00,650 --> 00:02:05,200
Moritz: What? Are you for real?
Michael: That, that basically means we
26
00:02:05,200 --> 00:02:08,545
could mount something like software based
power side channels.
27
00:02:08,545 --> 00:02:13,400
Moritz: Nice. We should try that out.
Michael: Yes, I already did, because I
28
00:02:13,400 --> 00:02:15,700
thought you might not believe me.
Moritz: OK.
29
00:02:15,700 --> 00:02:20,584
Michael: So this is one of the experiments
I did. Here you can already see that. I
30
00:02:20,584 --> 00:02:23,719
measured the power consumption using that
interface.
31
00:02:23,719 --> 00:02:26,323
Moritz: yeah
Michael: First while doing nothing, idling
32
00:02:26,323 --> 00:02:28,052
around sleeping
Moritz: like always
33
00:02:28,052 --> 00:02:34,594
Michael: and then I increased the CPU
load, I just did an endless loop which
34
00:02:34,594 --> 00:02:38,253
accessed a bit of memory. It's nothing
interesting but you can already see the
35
00:02:38,253 --> 00:02:42,123
difference for that. So you can see that
there's a difference in doing nothing and
36
00:02:42,123 --> 00:02:47,283
doing a lot. That's pretty nice.
Moritz: We should take a closer look
37
00:02:47,283 --> 00:02:49,823
at that, I think.
Michael: Definitely.
38
00:02:49,823 --> 00:02:53,904
music
39
00:02:53,904 --> 00:02:57,194
Moritz: sings You can create
my power trace
40
00:02:57,194 --> 00:02:59,009
Andreas: Oh, this is great. We already
41
00:02:59,009 --> 00:03:05,480
have a song for this paper now. Okay.
Well, this is a great song that we can use
42
00:03:05,480 --> 00:03:06,530
for the paper...
43
00:03:06,530 --> 00:03:13,071
music
44
00:03:13,071 --> 00:03:16,541
Michael: Powertrace,
like power analysis attacks?
45
00:03:16,751 --> 00:03:20,840
Moritz: Yeah, but that would be
an attack with physical access.
46
00:03:21,050 --> 00:03:23,184
Daniel: Software-only would be great
47
00:03:23,303 --> 00:03:26,361
Michael: Yes, I told you already,
I found one can measure energy
48
00:03:26,361 --> 00:03:27,957
consumption in microjoules
49
00:03:27,957 --> 00:03:32,745
Moritz: Like attacking all server,
desktop and laptop CPUs
50
00:03:32,745 --> 00:03:35,755
Daniel: Ideally with unprivileged access
51
00:03:35,755 --> 00:03:38,899
Michael: Imagine if you could
distinguish different instructions
52
00:03:38,899 --> 00:03:42,399
or even observe the Hamming weights of
operands and memory loads
53
00:03:42,399 --> 00:03:44,024
Daniel: Control flow monitoring
54
00:03:44,024 --> 00:03:47,919
Moritz: In physical attacks they often go
for cryptographic keys.
55
00:03:47,919 --> 00:03:52,804
That would be great.
Attacking AES-NI and RSA
56
00:03:52,804 --> 00:03:56,249
Daniel: There's just one problem:
there is no such channel
57
00:03:56,249 --> 00:03:59,676
Michael: As I said,
don't you listen, Daniel?
58
00:03:59,676 --> 00:04:04,659
It's like always, there is this RAPL
register. This interface is already there
59
00:04:04,659 --> 00:04:07,083
and you can measure power consumption
60
00:04:07,083 --> 00:04:11,901
Daniel: Yes, but only on a
very coarse granularity
61
00:04:14,777 --> 00:04:16,750
Moritz: But first, we need to get a bit
62
00:04:16,750 --> 00:04:21,013
more understanding of the CPU power
management. The thermal design power, the
63
00:04:21,013 --> 00:04:26,810
TDP, is the power consumption under the
maximum theoretical load of the processor.
64
00:04:26,810 --> 00:04:32,085
And you probably know that number from the
CPU specification. And this gives
65
00:04:32,085 --> 00:04:38,430
integrators a target to find the proper
thermal solution when you integrate a CPU in
66
00:04:38,430 --> 00:04:46,220
a computer so that it doesn't run too hot.
But for short periods of time, the CPU can
67
00:04:46,220 --> 00:04:52,919
consume more power than that. And this we
can see in this graphic. So here for this
68
00:04:52,919 --> 00:04:58,879
Tau moment, the power consumption is much
higher than for the rest of the CPU.
69
00:04:58,879 --> 00:05:05,520
Because usually a CPU is not instantly hot
and thermal properties propagate over a
70
00:05:05,520 --> 00:05:12,119
bit of time. So on the other hand, you
should also be able to save power. And you
71
00:05:12,119 --> 00:05:16,240
can do this in different ways. For
instance, you could just shut down
72
00:05:16,240 --> 00:05:21,870
resources completely that you do not need
at the moment, or you can reduce the
73
00:05:21,870 --> 00:05:27,169
voltage of the processor or those
components and then it also consumes less
74
00:05:27,169 --> 00:05:32,870
power. And on top of that, you could also
reduce the frequency of the processor and
75
00:05:32,870 --> 00:05:39,699
then it also consumes less power. And you
need this for different scenarios. For
76
00:05:39,699 --> 00:05:44,810
instance, with your laptop, you need to
budget the power consumption because you
77
00:05:44,810 --> 00:05:49,789
want to have a long run time. And you also
know these options that you can change,
78
00:05:49,789 --> 00:05:54,449
like the performance level if it should
run on high performance or to save
79
00:05:54,449 --> 00:05:57,219
battery. And you need this in different
scenarios.
80
00:05:57,219 --> 00:06:01,930
Michael: Yes, Moritz, that's exactly what
I showed you before. Do you remember? I
81
00:06:01,930 --> 00:06:07,269
showed you this Intel Running Average
Power Limit, RAPL for short, that provides
82
00:06:07,269 --> 00:06:13,180
exactly that functionality. So with this
Intel RAPL, you have the power limiting
83
00:06:13,180 --> 00:06:19,610
features so you can do exactly what you
just described, reduce the power usage for
84
00:06:19,610 --> 00:06:25,999
your system or for parts of your system.
And additionally, you also have the energy
85
00:06:25,999 --> 00:06:30,720
readings. So you know exactly how much
power is currently used on a system which
86
00:06:30,720 --> 00:06:36,419
helps you do exactly the things you just
mentioned before, like getting a better
87
00:06:36,419 --> 00:06:40,490
power performance balance. So this is
already there.
88
00:06:40,490 --> 00:06:44,409
Moritz: Because the CPU needs to know in a
way how much power it consumes, right?
89
00:06:44,409 --> 00:06:49,550
Michael: Exactly and the scheduler also
uses that feature to ensure that you get a
90
00:06:49,550 --> 00:06:54,820
better battery runtime on your laptop, for
example. And because this is an important
91
00:06:54,820 --> 00:07:00,370
feature you can directly get that from the
operating system as well. On Linux, you
92
00:07:00,370 --> 00:07:04,379
can even get that as an unprivileged
application. There's the powercap
93
00:07:04,379 --> 00:07:10,509
framework that you can directly access in
this pseudo file system where you get the
94
00:07:10,509 --> 00:07:15,729
current power readings, you can directly
see how much power your CPU currently
95
00:07:15,729 --> 00:07:17,729
consumes.
Moritz: How convenient!
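As a rough illustration of the powercap readings Michael describes, here is a minimal Python sketch. It assumes the package-0 RAPL domain is exposed at `/sys/class/powercap/intel-rapl:0` with the counter files `energy_uj` and `max_energy_range_uj`; the exact path differs between machines, and on some systems reading the counter may require elevated privileges. The helper names are made up for this example.

```python
from pathlib import Path
import time

# Assumed sysfs location of the package-0 RAPL domain (Linux powercap framework).
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_counter(name: str) -> int:
    """Read one integer counter file (e.g. 'energy_uj') from the RAPL domain."""
    return int((RAPL_DIR / name).read_text().strip())


def energy_delta_uj(before: int, after: int, max_range: int) -> int:
    """Energy used between two readings, allowing for one counter wraparound."""
    if after >= before:
        return after - before
    # The counter wrapped; approximate the wrap point with max_range.
    return after + max_range - before


def average_power_watts(delta_uj: int, seconds: float) -> float:
    """Convert an energy difference in microjoules to average power in watts."""
    return delta_uj / 1_000_000 / seconds


if __name__ == "__main__":
    try:
        max_range = read_counter("max_energy_range_uj")
        start = read_counter("energy_uj")
        time.sleep(1.0)
        end = read_counter("energy_uj")
        delta = energy_delta_uj(start, end, max_range)
        print(f"{delta} uJ in 1 s, about {average_power_watts(delta, 1.0):.2f} W")
    except OSError as err:
        # Missing interface or insufficient permissions on this machine.
        print(f"could not read RAPL counters: {err}")
```

Sampling the counter before and after a piece of code, as in the guarded block at the bottom, is the basic measurement the later experiments build on.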
96
00:07:17,729 --> 00:07:22,879
Michael: On macOS and on Windows you have
a similar thing, but for that you first
97
00:07:22,879 --> 00:07:26,590
need to install a driver because usually
you don't need that as a userspace
98
00:07:26,590 --> 00:07:32,250
application. But some drivers might want
to have that and some drivers even expose
99
00:07:32,250 --> 00:07:36,819
that to you and you can use that. So there
are some drivers that are even
100
00:07:36,819 --> 00:07:41,300
preinstalled on some of the motherboards
that expose that information to
101
00:07:41,300 --> 00:07:47,229
applications as well on Windows.
Moritz: Interesting, but what can we do
102
00:07:47,229 --> 00:07:52,979
with this? So I ran some experiments
because I wanted to know how good this
103
00:07:52,979 --> 00:07:58,580
energy consumption monitoring works. And
in a first run we tried to distinguish
104
00:07:58,580 --> 00:08:04,090
instructions from each other. So we
implemented a small program just running
105
00:08:04,090 --> 00:08:08,049
the same instructions all the time, and we
measured its power consumption. And as we
106
00:08:08,049 --> 00:08:12,799
can see easily in this plot, different
instructions need a different amount of
107
00:08:12,799 --> 00:08:19,419
power. So we can distinguish instructions
from each other. In addition, what I
108
00:08:19,419 --> 00:08:23,559
tried, I changed the operands that
different instructions used. For instance,
109
00:08:23,559 --> 00:08:28,749
for a multiplication, you can multiply
different numbers with each other. And
110
00:08:28,749 --> 00:08:33,779
also here we see, depending on the bits
that are set in the operand a different
111
00:08:33,779 --> 00:08:39,130
power consumption of the same instruction,
but just depending on the operand so we
112
00:08:39,130 --> 00:08:44,607
can also distinguish them from each other.
This could also come in handy later on.
113
00:08:44,607 --> 00:08:51,180
But I also tried to load data with an
instruction and I wanted to know if I
114
00:08:51,180 --> 00:08:55,089
could see differences in the power
consumption, depending on the data that
115
00:08:55,089 --> 00:09:00,860
has been loaded by the processor. And as
you can see in this plot, the more bits
116
00:09:00,860 --> 00:09:07,970
that are set in the data that is loaded,
the more power the CPU consumes. But let's
117
00:09:07,970 --> 00:09:14,209
be honest here, to record these
measurements, it took more than 23 days,
118
00:09:14,209 --> 00:09:19,949
so it took quite some time to get to this
granularity to see those differences, but
119
00:09:19,949 --> 00:09:23,190
in other cases, if you just...
Michael: still a fascinating result.
120
00:09:23,190 --> 00:09:27,461
Moritz: Yes, it's a very interesting
result. And in other cases, Michael, you
121
00:09:27,461 --> 00:09:33,930
only want to know if one operand or one
value is a zero or if it's not a zero. And
122
00:09:33,930 --> 00:09:40,310
to come to this result, you don't need
that many measurements. And the last
123
00:09:40,310 --> 00:09:45,540
experiments that we did was we wanted to
know if we would see a difference in the
124
00:09:45,540 --> 00:09:51,000
energy consumption, depending where data
has been loaded from. For instance, as
125
00:09:51,000 --> 00:09:55,540
we've seen also at CCC in many different
talks over the past years, there are, like,
126
00:09:55,540 --> 00:09:59,920
cache attacks. And here in this
experiment, we also were able to see a
127
00:09:59,920 --> 00:10:04,320
difference in the power consumption if
your value has been loaded from the cache
128
00:10:04,320 --> 00:10:09,550
or if it has to be loaded from the main
memory, because, of course, then DRAM is
129
00:10:09,550 --> 00:10:16,290
activated and it consumes more power. But
these results are very nice.
130
00:10:16,290 --> 00:10:20,779
Michael: Yes, these are really fascinating
results. So we should actually exploit
131
00:10:20,779 --> 00:10:25,959
them and build attacks from that. I mean,
it's fascinating to see that all these
132
00:10:25,959 --> 00:10:29,860
measurements are possible, but we also
want to do something security related.
133
00:10:29,860 --> 00:10:32,089
Moritz: Do you have any idea what we
could do?
134
00:10:32,089 --> 00:10:36,969
Michael: Yes, I have that idea I already
showed you something from before. If you
135
00:10:36,969 --> 00:10:41,240
remember from the office, this one
measurement. And I extended that
136
00:10:41,240 --> 00:10:42,400
measurement.
Moritz: Yes.
137
00:10:42,400 --> 00:10:47,560
Michael: Into a covert channel. So a
covert channel is a communication channel
138
00:10:47,560 --> 00:10:52,290
between two parties that are usually not
allowed to communicate with each other. So
139
00:10:52,290 --> 00:10:56,310
there might be different reasons for that.
Maybe there's no interface, maybe there's a
140
00:10:56,310 --> 00:11:01,892
policy or a firewall or something that
prevents them from communicating. And
141
00:11:01,892 --> 00:11:06,740
still, in this scenario, I want to
communicate. So for that, I'm using
142
00:11:06,740 --> 00:11:11,590
exactly these power side channels and all
this analysis you have done to actually
143
00:11:11,590 --> 00:11:17,940
communicate. And that is very simple to
do, actually. I have two processes, a
144
00:11:17,940 --> 00:11:24,380
sender and a receiver, and the sender
tries to send single bits, zeros and ones.
145
00:11:24,380 --> 00:11:31,120
And to send a one bit, I do something that
uses a lot of energy, like accessing main
146
00:11:31,120 --> 00:11:37,379
memory. And if I want to send a zero bit,
then I don't do anything. And now as a
147
00:11:37,379 --> 00:11:42,410
receiver, I just have to measure the power
consumption and I see if the power
148
00:11:42,410 --> 00:11:47,961
consumption has a spike. Then I know the
sender is sending a one. If there's
149
00:11:47,961 --> 00:11:53,870
nothing the sender is apparently sending a
zero and from that I can get this
150
00:11:53,870 --> 00:11:57,975
information the sender wants to send me.
Moritz: But did you try that out?
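The sender and receiver logic described here can be sketched as a small simulation. This is not the code used in the talk: the energy values, the noise level, and the function names are invented, and the "channel" is just a list of simulated per-slot energy readings instead of real RAPL measurements.

```python
import random


def send_bits(bits, idle_uj=1000.0, busy_uj=3000.0, noise_uj=200.0, seed=0):
    """Simulate one energy reading per time slot.

    A 1 bit means the sender does energy-hungry work (e.g. main-memory
    accesses); a 0 bit means it idles. All numbers are made up.
    """
    rng = random.Random(seed)
    return [
        (busy_uj if bit else idle_uj) + rng.uniform(-noise_uj, noise_uj)
        for bit in bits
    ]


def receive_bits(samples):
    """Decode bits by thresholding halfway between the lowest and highest reading.

    This simple threshold assumes the message contains both zeros and ones.
    """
    threshold = (min(samples) + max(samples)) / 2
    return [1 if sample > threshold else 0 for sample in samples]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0]
    print(receive_bits(send_bits(message)) == message)
```

A real channel would also need the two sides to agree on slot timing, which this sketch leaves out.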
151
00:11:57,975 --> 00:12:02,070
laugh track
Michael: Yes, I also tried that and we can
152
00:12:02,070 --> 00:12:07,385
see that here in this graph. So this is
the energy measurement.
153
00:12:07,385 --> 00:12:11,010
Moritz: That's a very clean signal.
Michael: Yes, it's the energy measurement
154
00:12:11,010 --> 00:12:16,080
on the receiver side. And we see exactly
what I told you before. If there are one
155
00:12:16,080 --> 00:12:20,499
bits, then the energy consumption is
higher. If there are zero bits, it's
156
00:12:20,499 --> 00:12:26,220
lower. And from that we can deduce the
information that I wanted to send on the
157
00:12:26,220 --> 00:12:30,850
sender side. Pretty neat, huh?
Moritz: Yeah, but this is just from one
158
00:12:30,850 --> 00:12:37,190
process to another process. Actually, I
took your idea and used this in a
159
00:12:37,190 --> 00:12:43,463
hypervisor scenario where we attack the
Xen hypervisor. So it's not limited to two
160
00:12:43,463 --> 00:12:49,781
processes. I installed the Xen hypervisor
with two virtual machines. And what Xen
161
00:12:49,781 --> 00:12:56,018
does is it also exposes those RAPL
registers to the virtual machine. So now
162
00:12:56,018 --> 00:13:01,079
as a virtual machine, I can have direct
access to that and then I can establish a
163
00:13:01,079 --> 00:13:04,220
covert channel between two virtual
machines in the cloud.
164
00:13:04,220 --> 00:13:08,110
Michael: That's even better.
Moritz: And this is really working, as you
165
00:13:08,110 --> 00:13:13,410
can see here. I mean, here I'm just
sending ones and zeros, but the signal is
166
00:13:13,410 --> 00:13:15,589
pretty clear.
Michael: That's nice.
167
00:13:15,589 --> 00:13:20,959
Moritz: But is there more that we can do?
Michael: Yes. I mean, covert channels are
168
00:13:20,959 --> 00:13:26,048
great to demonstrate something, that it
actually works, across VM, really great. I
169
00:13:26,048 --> 00:13:32,410
like that. That gives you a different
threat model here, but still they are a
170
00:13:32,410 --> 00:13:37,579
bit boring. So I decided to have something
more interesting as another example of
171
00:13:37,579 --> 00:13:43,320
what we can do. I always like to break
kernel address space layout randomization,
172
00:13:43,320 --> 00:13:48,899
KASLR. With this kernel address space
layout randomization, the kernel is mapped
173
00:13:48,899 --> 00:13:54,180
to different virtual locations every time
I boot my computer to make it difficult to
174
00:13:54,180 --> 00:13:58,050
actually exploit something in the kernel
because it's not predictable where the
175
00:13:58,050 --> 00:14:05,670
kernel is located. And I again use the
energy consumption to figure out where
176
00:14:05,670 --> 00:14:12,589
this kernel is located. So how does that
work? In this address space I have the
177
00:14:12,589 --> 00:14:17,980
kernel which is actually mapped using
physical pages and I have a lot of nothing
178
00:14:17,980 --> 00:14:24,350
where no physical page is mapped. And if I
try to access these addresses, I can't, of
179
00:14:24,350 --> 00:14:29,170
course, because I don't have the
privileges for that. But I will still see
180
00:14:29,170 --> 00:14:33,600
differences when doing that because the
CPU has to do different things depending
181
00:14:33,600 --> 00:14:38,340
on whether there's actually a page or not,
whether this page can be cached, this
182
00:14:38,340 --> 00:14:42,649
translation, or whether this translation
is always invalid because there's nothing
183
00:14:42,649 --> 00:14:47,780
there and it can't be cached. We can see
that here in an illustration, if you're
184
00:14:47,780 --> 00:14:53,569
wondering how that really works. So it
turns out the kernel can only be mapped to
185
00:14:53,569 --> 00:14:59,691
a limited number of places because it has
to be aligned by two megabytes, so I only
186
00:14:59,691 --> 00:15:06,009
need to check the spots there where the
kernel could be located. And for all these
187
00:15:06,009 --> 00:15:11,440
places in the address space, I just try to
access it and measure how much energy that
188
00:15:11,440 --> 00:15:17,670
consumes. And if there's nothing mapped,
it consumes quite a lot of energy because
189
00:15:17,670 --> 00:15:21,940
the CPU has to figure out that there's
nothing mapped. It goes through the page
190
00:15:21,940 --> 00:15:26,899
tables, the page table walk, and at the
end figures out, oh, there's nothing here,
191
00:15:26,899 --> 00:15:32,180
so I can't do anything, and aborts that.
And that uses quite some energy. But if
192
00:15:32,180 --> 00:15:39,200
there's actually the kernel here, then
this translation is valid. It works. There
193
00:15:39,200 --> 00:15:43,939
is something there. It will likely be
already in the translation caches in the
194
00:15:43,939 --> 00:15:49,709
TLB, so the CPU has less work. It just
needs to check the cache, sees: "Oh it's
195
00:15:49,709 --> 00:15:54,939
there. I know that. But wait a moment, you
can't access it" and can immediately abort
196
00:15:54,939 --> 00:16:01,939
and that uses less energy. So just from
the energy consumption, I can see if
197
00:16:01,939 --> 00:16:06,250
there's something mapped and with that see
where the kernel is actually mapped.
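The scan Michael describes can be sketched as follows. The address range and the `measure_energy` callback are assumptions for illustration: the range is a typical x86-64 Linux kernel text region, and the callback stands in for "try to access this address and read the energy counter", which this sketch replaces with a fake measurement.

```python
# Assumed candidate region for the kernel image on x86-64 Linux.
KERNEL_START = 0xFFFF_FFFF_8000_0000
KERNEL_END = 0xFFFF_FFFF_C000_0000
ALIGNMENT = 2 * 1024 * 1024  # the kernel is aligned to 2 MiB


def candidate_addresses(start=KERNEL_START, end=KERNEL_END, step=ALIGNMENT):
    """All 2 MiB-aligned addresses where the kernel could start."""
    return range(start, end, step)


def find_kernel_base(measure_energy, candidates):
    """Return the first candidate whose access costs noticeably less energy.

    Unmapped addresses force a full page-table walk (more energy); mapped
    kernel addresses are assumed to be cheaper. Returns None if all readings
    look the same.
    """
    readings = [(addr, measure_energy(addr)) for addr in candidates]
    values = sorted(energy for _, energy in readings)
    threshold = (values[0] + values[-1]) / 2
    for addr, energy in readings:
        if energy < threshold:
            return addr
    return None


def fake_measure(addr, base=0xFFFF_FFFF_9A00_0000, size=16 * ALIGNMENT):
    """Hypothetical stand-in for a real measurement: mapped pages cost less."""
    return 40.0 if base <= addr < base + size else 100.0


if __name__ == "__main__":
    base = find_kernel_base(fake_measure, candidate_addresses())
    print(hex(base))
```

Because of the 2 MiB alignment, the assumed 1 GiB region contains only 512 candidates, which is why such a scan can finish quickly.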
198
00:16:06,250 --> 00:16:10,586
Moritz: And this is really working? Did
you try it out or is this just some
199
00:16:10,586 --> 00:16:13,329
theoretical thing?
Michael: You're always so skeptical. Of
200
00:16:13,329 --> 00:16:19,009
course I tried that and I brought the demo
with me. So here you can see the demo
201
00:16:19,009 --> 00:16:24,149
running. This is on a real system. And you
see it's super fast measuring the energy
202
00:16:24,149 --> 00:16:28,290
consumption going over the address space
and finding the kernel.
203
00:16:28,290 --> 00:16:32,279
applause
Moritz: But these attacks are boring,
204
00:16:32,279 --> 00:16:36,681
Michael. We want to attack something real,
we want to be like real attackers, we want
205
00:16:36,681 --> 00:16:40,800
to attack crypto, we want to get keys.
Michael: Crypto is complicated. That's …
206
00:16:40,800 --> 00:16:43,329
laugh track
Moritz: No, no, no, just listen. So, for
207
00:16:43,329 --> 00:16:47,861
instance, with RSA, this is a widely used
public-key cryptosystem. This is really
208
00:16:47,861 --> 00:16:53,710
easy because to encrypt some data, you
have a public key. To decrypt the data you
209
00:16:53,710 --> 00:16:59,750
have a private key. And if we get the
private key: profit, easy as that. What do
210
00:16:59,750 --> 00:17:03,189
you say?
Michael: Yeah, I know how that works. So
211
00:17:03,189 --> 00:17:08,910
the theory is easy, that I have the two
keys and I have a private key. But then
212
00:17:08,910 --> 00:17:12,540
the complicated part starts where you
really have to understand the crypto to
213
00:17:12,540 --> 00:17:17,540
actually attack it. And that's really
complicated. And I don't really want to do
214
00:17:17,540 --> 00:17:22,586
that. Maybe we can find a student who tries
that but I'm out of here. laughter
215
00:17:22,586 --> 00:17:25,584
Andreas: Hi guys, I'm a student and I want
a master thesis.
216
00:17:25,584 --> 00:17:29,370
Moritz: This is perfect. Your name is
Andreas, right?
217
00:17:29,370 --> 00:17:32,880
Andreas: Yeah, sure, I'm Andreas.
laughter
218
00:17:32,880 --> 00:17:36,891
M: OK, I don't know if you have heard
the last bits, but we want to attack some
219
00:17:36,891 --> 00:17:39,680
crypto with power side channel attacks.
A: OK
220
00:17:39,680 --> 00:17:44,181
Moritz: And for instance, with RSA, we
have the private key and the public key.
221
00:17:44,181 --> 00:17:50,970
Here we have M the message and C the
ciphertext and d the private exponent. And
222
00:17:50,970 --> 00:17:56,160
of course, it's a computer. It consists of
ones and zeros. And depending on the key
223
00:17:56,160 --> 00:18:01,970
bit if it's a one, for the computation of
the algorithm, we do a square and a
224
00:18:01,970 --> 00:18:08,510
multiply operation. And if it's zero, we
just do the square operation and we do
225
00:18:08,510 --> 00:18:14,110
this for the entire private key.
A: Now OK, sounds easy enough.
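To make the leak concrete, here is a textbook left-to-right square-and-multiply in Python that also records which operations it performs. It is a toy sketch, not the implementation attacked in the talk, and the operation trace stands in for whatever an attacker manages to observe. The small RSA numbers in the usage block are the usual classroom example.

```python
def square_and_multiply(base, exponent, modulus):
    """Compute base**exponent % modulus, logging 'S' (square) and 'M' (multiply)."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:  # exponent bits, most significant first
        result = (result * result) % modulus
        trace.append("S")
        if bit == "1":
            result = (result * base) % modulus
            trace.append("M")
    return result, trace


def key_from_trace(trace):
    """Rebuild the exponent from the trace: 'S' alone is a 0, 'S' then 'M' is a 1."""
    bits = []
    for op in trace:
        if op == "S":
            bits.append("0")
        else:  # an 'M' always follows the 'S' of a 1 bit
            bits[-1] = "1"
    return int("".join(bits), 2)


if __name__ == "__main__":
    n, d, c = 3233, 2753, 2790  # toy modulus, private exponent, ciphertext
    message, trace = square_and_multiply(c, d, n)
    print(message, key_from_trace(trace) == d)
```

Anyone who can tell squares from multiplies in the sequence recovers the private exponent bit by bit, which is what the power measurements are meant to reveal.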
226
00:18:14,110 --> 00:18:21,640
M: Yes. And if we can observe that we
can extract the key. Sounds good. But I
227
00:18:21,640 --> 00:18:28,000
did some experiments and it didn't work
out as well as I had expected. So
228
00:18:28,000 --> 00:18:31,860
we need to get a bit more control and
maybe a better threat model how to do
229
00:18:31,860 --> 00:18:40,100
that. And that's where Intel SGX comes into play.
And this is an instruction set extension
230
00:18:40,100 --> 00:18:47,340
and it provides you with integrity and
confidentiality of code and data even in
231
00:18:47,340 --> 00:18:55,600
untrusted environments. So with Intel SGX,
you can run programs using protected areas
232
00:18:55,600 --> 00:19:02,950
of memory. And even in the case where the
operating system is compromised and cannot
233
00:19:02,950 --> 00:19:07,300
be trusted at all.
A: So basically we have the full
234
00:19:07,300 --> 00:19:11,500
access of all operating system features to
attack, the enclave.
235
00:19:11,500 --> 00:19:14,900
M: Yes, exactly
A: OK, that sounds quite powerful
236
00:19:14,900 --> 00:19:21,130
M: But there's still one issue. It's
still just executing a program. So we have
237
00:19:21,130 --> 00:19:26,630
more power, but we need to make use of
that. And there is this paper called
238
00:19:26,630 --> 00:19:34,892
SGX-Step, which gives you more control of
enclaves and Jo Van Bulck the author maybe
239
00:19:34,892 --> 00:19:40,623
has time to explain this a bit to us so
maybe we can give him a call.
240
00:19:40,623 --> 00:19:42,160
A: Sounds great.
ringing sound
241
00:19:42,160 --> 00:19:48,760
M: Hi Jo, this is Moritz. I've seen
the paper of yours, this SGX-Step paper.
242
00:19:48,760 --> 00:19:52,990
It might be the thing that we need, but
can you explain a bit what it is about?
243
00:19:52,990 --> 00:19:59,910
Jo: Yes, surely Moritz, so SGX-Step I
think in one sentence it's an enclave
244
00:19:59,910 --> 00:20:04,920
execution control framework. What I mean
with that is that it allows you to
245
00:20:04,920 --> 00:20:09,308
precisely control the execution of the
enclave so that you can interleave it with
246
00:20:09,308 --> 00:20:13,750
attacker code, as the name implies, you
would do one step of the enclave, one step
247
00:20:13,750 --> 00:20:17,430
of the attacker again one step of the
enclave, one step of the attacker, etc.
248
00:20:17,430 --> 00:20:19,890
M: That's perfect.
J: That's the high level.
249
00:20:19,890 --> 00:20:23,580
Moritz: Can you expand it a bit on the
technical point of view? How do you do
250
00:20:23,580 --> 00:20:26,000
that?
J: Yes, I'm very excited about the
251
00:20:26,000 --> 00:20:32,100
technical details, Moritz. So let me walk
you through. The first thing you should
252
00:20:32,100 --> 00:20:36,330
know about SGX-Step: it's completely open
source and we build it on top of stock
253
00:20:36,330 --> 00:20:37,730
Linux environments.
M: Nice
254
00:20:37,730 --> 00:20:43,240
J: So what you should start with always
is to load a malicious kernel driver. And
255
00:20:43,240 --> 00:20:48,471
this is called the /dev/sgx-step driver.
And from that moment on we kind of export
256
00:20:48,471 --> 00:20:54,540
all of the powers of the Linux kernel into
the userspace. And the second component of
257
00:20:54,540 --> 00:20:58,830
SGX-Step that's important is this small
library operating system that we wrote.
258
00:20:58,830 --> 00:21:04,310
It's called libsgxstep and it sits just
alongside the library in the
259
00:21:04,310 --> 00:21:09,382
userspace application. And libsgxstep
allows you to do a number of cool things.
260
00:21:09,382 --> 00:21:14,490
I think the most important thing being
that you have direct access to the APIC
261
00:21:14,490 --> 00:21:19,660
x86 high resolution timing device. So that
sounds interesting for you, right, Moritz?
262
00:21:19,660 --> 00:21:21,938
M: Yeah, but what do you
do with the timer?
263
00:21:21,938 --> 00:21:26,348
J: Well, what you can do with the timer
is essentially you can arm it just before
264
00:21:26,348 --> 00:21:30,170
you enter the enclave. And what would
happen then is, let's have a look. You arm
265
00:21:30,170 --> 00:21:34,260
the timer, you start executing the
enclave, then after a while an interrupt
266
00:21:34,260 --> 00:21:39,800
fires and you exit the enclave again.
M: Hmm, so it's like a debugger like
267
00:21:39,800 --> 00:21:44,800
GDB, but for enclaves?
J: Yes, it's a... it's exactly that
268
00:21:44,800 --> 00:21:49,000
Moritz. It's like an attacker controlled
debugger without using any of the debug
269
00:21:49,000 --> 00:21:54,350
features, just using the raw x86
primitives and operating system files. And
270
00:21:54,350 --> 00:21:59,040
just as in a debugger, it allows you to do
single stepping. So every instruction will
271
00:21:59,040 --> 00:22:03,420
be executed one at a time. At most one at
a time I should say.
272
00:22:03,420 --> 00:22:09,440
M: But what happens if I, like,
configure the timer a bit lower? Does it
273
00:22:09,440 --> 00:22:13,370
then like start executing an instruction?
J: That's a very good question. And
274
00:22:13,370 --> 00:22:18,250
configuring the timer is the tricky thing
about SGX-Step. So it will indeed happen
275
00:22:18,250 --> 00:22:23,780
sometimes what we call a zero step event.
So you will fire the timer before the
276
00:22:23,780 --> 00:22:28,290
enclave even had time to execute an
instruction. And those are a kind of event
277
00:22:28,290 --> 00:22:32,920
that you can also detect with SGX-Step.
There is a trick to detect whether you had
278
00:22:32,920 --> 00:22:36,560
a single step or a zero step.
M: Jo, this is perfect. This is
279
00:22:36,560 --> 00:22:40,060
exactly what we are looking for. Thank you
so much for explaining that.
280
00:22:40,060 --> 00:22:43,250
J: I'm very happy to hear that.
M: I'm looking forward to trying it out
281
00:22:43,250 --> 00:22:44,850
now.
J: Go.
282
00:22:44,850 --> 00:22:47,470
M: See you hopefully soon.
J: Bye bye.
283
00:22:47,470 --> 00:22:48,850
M: Bye!
284
00:22:49,460 --> 00:22:54,950
M: So SGX-Step, to sum it up,
it's an open source Linux kernel
285
00:22:54,950 --> 00:22:59,990
framework, and it allows us to configure
the APIC timer interrupts so that we can
286
00:22:59,990 --> 00:23:06,400
interrupt the enclave execution to single
and zero step it. And this is perfect
287
00:23:06,400 --> 00:23:11,760
because now we can combine it with the
power measurements of Intel RAPL, and this
288
00:23:11,760 --> 00:23:17,080
gives us the possibility to measure the
energy consumption of single instructions.
289
00:23:17,080 --> 00:23:21,710
Can you try it out Andi?
A: OK, let me dig deeper into that.
290
00:23:21,710 --> 00:23:25,700
We have this really slow RAPL interface
here and if you want to visualize it, we
291
00:23:25,700 --> 00:23:30,360
could imagine that it's like we have slots
where we can fill the slots with
292
00:23:30,360 --> 00:23:35,390
instructions and the RAPL interface gives
us the average power consumption over the
293
00:23:35,390 --> 00:23:40,050
slots. So in the default case, when we
execute our target instruction, we have
294
00:23:40,050 --> 00:23:44,100
basically one slot filled with the target
instruction and the remaining slots filled
295
00:23:44,100 --> 00:23:50,130
with other instructions we don't know. So
basically noise. The best case for us
296
00:23:50,130 --> 00:23:54,210
would be if we repeat the target
instruction indefinitely and fill every
297
00:23:54,210 --> 00:23:58,028
slot with the target instruction.
M: This is exactly what I did
298
00:23:58,028 --> 00:24:02,060
in the experiments in the beginning.
A: Yeah, exactly. That's the reason
299
00:24:02,060 --> 00:24:07,760
why we got so good measurements there.
Another trick would be if we only used the
300
00:24:07,760 --> 00:24:11,890
target instruction in one slot and fill
the remaining slots with instructions
301
00:24:11,890 --> 00:24:15,920
where we know the energy consumption of or
we know the instruction of. Then we could
302
00:24:15,920 --> 00:24:20,840
do tricks to calculate the energy
consumption of the target instruction.
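The "tricks" for the known-filler case come down to simple arithmetic under a simplified model: assume one RAPL reading is the plain average over a fixed number of equally weighted slots, and that every filler slot costs the same known energy. Both assumptions, and the function name, are made up for this sketch; real readings are noisier.

```python
def target_energy(average, slots, filler_energy, target_slots=1):
    """Solve the averaging model for the per-slot energy of the target.

    Model: average = (target_slots * target
                      + (slots - target_slots) * filler_energy) / slots
    """
    total = average * slots
    return (total - (slots - target_slots) * filler_energy) / target_slots


if __name__ == "__main__":
    # 8 slots, 7 fillers at 2.0 units each, measured average of 2.5 units.
    print(target_energy(2.5, 8, 2.0))
    # Same target repeated in 4 of the 8 slots: the average moves much more.
    print(target_energy(4.0, 8, 2.0, target_slots=4))
```

The second call hints at why filling more slots with the target helps: the target's share of the average grows, so the same measurement noise matters less.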
303
00:24:20,840 --> 00:24:26,830
With SGX-Step now we can use a hybrid
solution here, where we use SGX-Step's
304
00:24:26,830 --> 00:24:32,380
zero stepping mechanism to reissue this
instruction and we can fill multiple slots
305
00:24:32,380 --> 00:24:37,260
with the same target instruction. Only
drawback here is that we have a noise
306
00:24:37,260 --> 00:24:43,130
overhead of SGX-Step itself, but this is
probably the best solution we can go with.
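[Editor's sketch, not part of the talk: the slot picture described above can be written down as a tiny averaging model. It is only an illustration under stated assumptions. The function names, the idea of a fixed number of equal slots, and the energy values are hypothetical and are not taken from the RAPL interface or from SGX-step.]

```python
def averaged_reading(target_energy, filler_energy, slots, target_slots):
    """Average energy per slot, as a slow averaging interface would report it."""
    if slots <= 0 or not 0 <= target_slots <= slots:
        raise ValueError("need slots > 0 and 0 <= target_slots <= slots")
    total = target_slots * target_energy + (slots - target_slots) * filler_energy
    return total / slots


def target_from_known_filler(reading, filler_energy, slots, target_slots):
    """Invert averaged_reading when the filler energy per slot is known."""
    if target_slots <= 0:
        raise ValueError("the target must occupy at least one slot")
    return (reading * slots - (slots - target_slots) * filler_energy) / target_slots


def reading_gap(energy_a, energy_b, filler_energy, slots, target_slots):
    """How far apart two candidate target instructions appear in the reading."""
    return abs(
        averaged_reading(energy_a, filler_energy, slots, target_slots)
        - averaged_reading(energy_b, filler_energy, slots, target_slots)
    )
```

[In this model, the gap between two candidate instructions grows with the number of slots the target occupies, which is why replaying the same instruction into many slots helps.]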
307
00:24:43,860 --> 00:24:48,100
M: This sounds pretty good, so we
should actually try that out. So we
308
00:24:48,100 --> 00:24:53,180
implement a toy cipher, which imitates
square and multiply basically. So we can
309
00:24:53,180 --> 00:24:58,110
leave out all the rest, the overhead of a
library that would be used otherwise. And
310
00:24:58,110 --> 00:25:02,700
we then just single step every instruction
and measure its energy consumption and
311
00:25:02,700 --> 00:25:08,200
then we could plot this. Can you do that?
A: I already got some results here
312
00:25:08,200 --> 00:25:13,156
for us. Basically here we use, as you
explained, a toy example for square and
313
00:25:13,156 --> 00:25:18,580
multiply. And in both cases the square and
the multiply, they execute exactly six
314
00:25:18,580 --> 00:25:23,860
instructions. And so basically we have a
period of six here. And if you look at the
315
00:25:23,860 --> 00:25:29,550
results of the measurement here, we can
see that we have patterns that repeat with
316
00:25:29,550 --> 00:25:34,460
a period of six and we can see that these
different patterns correspond to either a
317
00:25:34,460 --> 00:25:40,400
square or a multiply instruction here.
M: Nice, perfect, but this is just a
318
00:25:40,400 --> 00:25:42,400
toy cipher, right? laughter
A: Yeah.
319
00:25:42,400 --> 00:25:44,370
M: Can we do like real crypto?
laughter
320
00:25:44,370 --> 00:25:49,529
A: We can try. So the plan now is
that we want to attack a real RSA
321
00:25:49,529 --> 00:25:54,310
implementation and the real implementation
is not like a toy square and multiply
322
00:25:54,310 --> 00:25:59,320
algorithm. The real implementation needs
to handle these huge numbers. So basically
323
00:25:59,320 --> 00:26:03,492
there's much more code involved and it's
not feasible to single step every
324
00:26:03,492 --> 00:26:10,340
instruction there. So we must do a more
clever approach here. If we observe the
325
00:26:10,340 --> 00:26:17,478
square multiply part here, we see that the
square and the multiply function uses the
326
00:26:17,478 --> 00:26:25,420
AVX optimized memset function. So the
energy consumption should also be higher if
327
00:26:25,420 --> 00:26:30,910
we execute an AVX instruction because AVX
instructions use much larger registers. So
328
00:26:30,910 --> 00:26:33,031
basically we should be able to observe
that.
329
00:26:33,031 --> 00:26:36,040
M: Interesting.
A: The only drawback here is that we
330
00:26:36,040 --> 00:26:43,470
cannot use the same approach as with the
toy cipher because the square has a
331
00:26:43,470 --> 00:26:48,659
different number of instructions than the
square and multiply function. So we need
332
00:26:48,659 --> 00:26:54,950
to do a trick here. So to understand what
we did here, our target is that we
333
00:26:54,950 --> 00:27:00,280
reconstruct a key bit. And if the key bit
is one we execute a square and multiply.
334
00:27:00,280 --> 00:27:09,260
If the key bit is zero, we execute a
square. So to visualize how we execute
335
00:27:09,260 --> 00:27:14,470
zero and single stepping, we have to dig
into the assembler a bit. So to test for
336
00:27:14,470 --> 00:27:18,690
the key bit, we execute like a test
instruction and then we execute a
337
00:27:18,690 --> 00:27:24,730
conditional jump. And if we execute the
square and multiply we have for instance,
338
00:27:24,730 --> 00:27:29,435
K instructions. And if we execute the
square we have for instance L
339
00:27:29,435 --> 00:27:34,260
instructions. So we can see that these two
numbers do not add up. They are different.
340
00:27:34,260 --> 00:27:40,050
So we cannot simply measure each Kth
instruction and get the key out. So we
341
00:27:40,050 --> 00:27:45,030
need to do something different here. We
can number the instructions after the jump
342
00:27:45,030 --> 00:27:52,980
instruction and then use single stepping
to single step to the Nth instruction
343
00:27:52,980 --> 00:27:59,272
after the jump instruction. And on the
left side, if we observe a one, we then hit
344
00:27:59,272 --> 00:28:05,414
exactly the AVX instruction there, used in
the AVX memset. And if you then use our
345
00:28:05,414 --> 00:28:10,044
measurement framework to measure exactly
the nth instruction after the jump, we
346
00:28:10,044 --> 00:28:14,690
observe on the one hand a high energy
consumption and on the other hand, we
347
00:28:14,690 --> 00:28:20,140
observe low energy consumption if the
branch was not taken, that is, for a zero.
348
00:28:20,140 --> 00:28:22,910
M: It's very clever.
A: So if you measured both
349
00:28:22,910 --> 00:28:28,490
instructions here, we can then combine
these energy measurements and then use a
350
00:28:28,490 --> 00:28:35,490
simple threshold to reconstruct the key
bit in the beginning. And then we do this
351
00:28:35,490 --> 00:28:39,270
iteratively for each key bit.
M: This sounds pretty promising, but
352
00:28:39,270 --> 00:28:40,760
did you try it out?
laughter
353
00:28:40,760 --> 00:28:45,149
A: Sure. Here are the results of that.
And we can clearly see that we have
354
00:28:45,149 --> 00:28:48,735
different energy consumption or in this
case voltage
355
00:28:48,735 --> 00:28:51,094
applause
based on if the
356
00:28:51,094 --> 00:28:56,160
AVX instruction is executed or if the
instruction at the same offset in the
357
00:28:56,160 --> 00:28:59,410
other branch is executed.
M: How fast does this work, does this
358
00:28:59,410 --> 00:29:03,025
take like 5 days?
A: Not quite that long. We have one
359
00:29:03,025 --> 00:29:08,445
problem here that the time per key bit
increases the further or later the key bit
360
00:29:08,445 --> 00:29:14,040
is in the key. So basically the first key
bit we can reconstruct very fast, but for
361
00:29:14,040 --> 00:29:18,230
the last key bit, we need to single step
much further in the code to actually reach
362
00:29:18,230 --> 00:29:23,460
it. And this adds up. So basically the
time increases linearly between the key
363
00:29:23,460 --> 00:29:29,090
bits. But for our key here, our test key
with 512 bits that takes us about 3.5
364
00:29:29,090 --> 00:29:35,280
hours to reconstruct a complete key. Note
here that we spent like 52 minutes
365
00:29:35,280 --> 00:29:39,790
only to find the target instruction. So
basically, if we could optimize that, the
366
00:29:39,790 --> 00:29:45,688
attack would be much faster. In addition,
we had to record like 3 samples per key
367
00:29:45,688 --> 00:29:50,199
bit. But with the implementation, it
should be possible to actually do that
368
00:29:50,199 --> 00:29:54,600
with 1 sample. And since we then only need
one sample per key bit, we actually can do
369
00:29:54,600 --> 00:29:58,569
it with a single trace attack. But we did
not try that out, unfortunately.
370
00:29:58,569 --> 00:30:03,375
Moritz: Quite fast.
Michael: So while all this sounded quite
371
00:30:03,375 --> 00:30:08,183
easy and straightforward in hindsight,
this was actually a really long process.
372
00:30:08,183 --> 00:30:14,100
Starting at the beginning of 2017 when we
discovered this interface, the RAPL
373
00:30:14,100 --> 00:30:18,713
interface. Then we had to come up with a
title for this talk, of course, laughter
374
00:30:18,713 --> 00:30:25,677
and some lyrics for a song. We had the
first toy attack on RSA at the end of
375
00:30:25,677 --> 00:30:34,463
2017. It took us until 2018 to finally get
a KASLR break that was working and only in
376
00:30:34,463 --> 00:30:41,280
2019, by the end of 2019, after Andreas
did his master's thesis on that, we were
377
00:30:41,280 --> 00:30:48,030
able to produce a full attack on RSA. And
this is also the time when we submitted
378
00:30:48,030 --> 00:30:53,910
that as a paper to a conference and
disclosed that to the CPU vendors so that
379
00:30:53,910 --> 00:30:59,552
they can fix that. And this is also the
start of the embargo. This embargo for
380
00:30:59,552 --> 00:31:10,640
this vulnerability lasted almost one year.
So from November 2019 to November 2020. It
381
00:31:10,640 --> 00:31:15,790
was just a few weeks ago that this embargo
ended here.
382
00:31:15,790 --> 00:31:21,040
Moritz: But there's one thing missing. We
really wanted to do crypto attacks, but
383
00:31:21,040 --> 00:31:28,067
not only with SGX-step as a compromised
operating system, but also from userspace.
384
00:31:28,067 --> 00:31:33,650
But as we've seen, it's so difficult to
measure parts of the code without having
385
00:31:33,650 --> 00:31:39,653
SGX-step. But what we can do is we can
measure the power consumption of the
386
00:31:39,653 --> 00:31:46,280
overall execution of an algorithm and
there correlation power analysis comes in
387
00:31:46,280 --> 00:31:53,121
handy. And there what we do is we build a
power consumption model of our device. As
388
00:31:53,121 --> 00:31:58,540
we've heard earlier, the Hamming Weight is
the number of bits that is set in an
389
00:31:58,540 --> 00:32:05,580
operand or in the data. And we assume that
if a bit is set, the computer takes more
390
00:32:05,580 --> 00:32:10,850
power to process it. In addition, what you
can use as a different model is the
391
00:32:10,850 --> 00:32:17,768
Hamming distance. So from one operation to
the other, how many bits change? And then
392
00:32:17,768 --> 00:32:24,690
we assume the more bits change, the more
power is consumed. And we really want to
393
00:32:24,690 --> 00:32:30,700
try that out. So what we are targeting now
is AES-NI, a side channel resistant
394
00:32:30,700 --> 00:32:37,320
instruction set of Intel. And we target it
in a scenario where we can trigger the
395
00:32:37,320 --> 00:32:43,728
encryption and decryption of many, many
blocks over a long time so that the
396
00:32:43,728 --> 00:32:50,770
execution time is longer than the RAPL
update rate, so that we can really see the
397
00:32:50,770 --> 00:32:55,640
power consumption in our measurement. And
this is used, for instance, in disk
398
00:32:55,640 --> 00:33:05,340
encryption or decryption or if you seal or
unseal the SGX enclave state. And we can
399
00:33:05,340 --> 00:33:10,840
now do that and record power measurements
in different scenarios, right?
400
00:33:10,840 --> 00:33:17,390
Andreas: Sure, we can try that. So in our
experiment, we recorded two million traces
401
00:33:17,390 --> 00:33:25,860
over 26 hours for the SGX environment. But we
also tried to reconstruct it without SGX
402
00:33:25,860 --> 00:33:29,700
where we used the encryption inside a
kernel module. And there we recorded
403
00:33:29,700 --> 00:33:36,951
4 million traces in 50 hours. And to
understand the attack here, we have to
404
00:33:36,951 --> 00:33:42,030
look at this animation. So basically we
have our computer where a secret key is
405
00:33:42,030 --> 00:33:49,500
stored somewhere internally. Then we use this
key to encrypt some messages and we also
406
00:33:49,500 --> 00:33:54,240
have the power consumption here. And what
we now did is we recorded the encrypted
407
00:33:54,240 --> 00:34:00,854
message and the power consumption it took
to encrypt this message for many messages.
408
00:34:00,854 --> 00:34:07,540
And then we use a model of the CPU here to
predict the energy consumption, to
409
00:34:07,540 --> 00:34:12,940
reconstruct the key. The key is usually
split up into parts, where each of the
410
00:34:12,940 --> 00:34:20,887
parts can have a value between 0 and 255.
So to reconstruct the key here, we simply
411
00:34:20,887 --> 00:34:28,819
use our measurements and the model and we
try out one of the key parts and estimate
412
00:34:28,819 --> 00:34:35,809
the energy consumption for the key part.
And then we store the correlation between
413
00:34:35,809 --> 00:34:42,619
the recorded measurements and the prediction.
And we do this for every one of the possible
414
00:34:42,619 --> 00:34:50,379
key values. And once we found the key
value with the highest correlation, we know
415
00:34:50,379 --> 00:34:56,909
that this key value corresponds to the key
part of the key. And we then simply repeat
416
00:34:56,909 --> 00:35:02,279
the process for each of the parts of the
key until we get the final key.
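[Editor's sketch, not part of the talk: the guess-and-correlate loop described above can be illustrated with a toy simulation. This is not the attack on AES-NI from the talk. It assumes a simplified leakage model (the Hamming weight of an input byte XOR a key byte, plus Gaussian noise) and simulated traces instead of RAPL readings. All function names and parameters are hypothetical.]

```python
import random


def hamming_weight(value):
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def simulate_traces(key_byte, count, noise_sigma, rng):
    """Toy device: power = HW(input XOR key byte) + Gaussian noise."""
    inputs = [rng.randrange(256) for _ in range(count)]
    power = [hamming_weight(p ^ key_byte) + rng.gauss(0, noise_sigma) for p in inputs]
    return inputs, power


def recover_key_byte(inputs, power):
    """Try all 256 guesses and keep the one whose prediction correlates best."""
    best_guess, best_corr = 0, -2.0
    for guess in range(256):
        predicted = [hamming_weight(p ^ guess) for p in inputs]
        corr = pearson(predicted, power)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess, best_corr


rng = random.Random(1)
inputs, power = simulate_traces(0x3C, 2000, 2.0, rng)
recovered, correlation = recover_key_byte(inputs, power)
print(hex(recovered), round(correlation, 3))
```

[For a full key, the same loop would be repeated once per key byte, as described in the talk. A real attack would also need a leakage model that matches the cipher's intermediate values.]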
417
00:35:02,279 --> 00:35:07,450
M: And we actually tried that out. So
here in our demo video, you see on the
418
00:35:07,450 --> 00:35:13,391
left where we test all the combinations
and see what is the most likely key
419
00:35:13,391 --> 00:35:18,349
candidate at the moment, while for a
single key byte on the right, you see
420
00:35:18,349 --> 00:35:23,730
every possible value and the correlation.
So in the beginning, with not that many
421
00:35:23,730 --> 00:35:29,747
traces processed, it's not very clear
which key candidate is the right one,
422
00:35:29,747 --> 00:35:34,849
because there's so much measurement noise
introduced by measuring over the overall
423
00:35:34,849 --> 00:35:41,292
execution time. But over time, this signal
gets more stable and we see on the right
424
00:35:41,292 --> 00:35:45,890
with the peak getting more and more
distant from the other candidates that
425
00:35:45,890 --> 00:35:52,380
this is our correct key byte. And we do
this, as Andreas said, for every possible
426
00:35:52,380 --> 00:35:57,230
key byte with every possible value. So in
the end, we end up with the correct key.
427
00:35:57,230 --> 00:36:00,729
applause
A: OK, but this seems like it's only
428
00:36:00,729 --> 00:36:05,930
Intel CPUs. Does this also affect others?
M: Yes. So actually, we also tried
429
00:36:05,930 --> 00:36:10,858
out other CPU vendors, whether they have
similar interfaces. And for instance, AMD
430
00:36:10,858 --> 00:36:17,532
is affected as well. But we never really
heard back from them after our disclosure.
431
00:36:17,532 --> 00:36:23,510
And the patch they use to try to solve the
problem with the driver is similar to the
432
00:36:23,510 --> 00:36:27,400
one that Intel has.
A: You're right, Moritz, it actually
433
00:36:27,400 --> 00:36:31,839
works. So I tried the same code on AMD.
The one you showed before was
434
00:36:31,839 --> 00:36:37,080
distinguishing operands, and that also
works on AMD. That's pretty nice. It's not
435
00:36:37,080 --> 00:36:41,440
an Intel only issue. It also affects at
least AMD as well.
436
00:36:41,440 --> 00:36:45,230
M: Yes, but actually there are many
other vendors as well that provide
437
00:36:45,230 --> 00:36:50,410
interfaces, even some of them unprivileged
to user space where you could probably
438
00:36:50,410 --> 00:36:55,660
mount similar attacks. For instance,
Nvidia, IBM, or Marvell and Ampere.
439
00:36:55,660 --> 00:37:00,906
A: So this is really an industry
wide problem here. And we've also seen
440
00:37:00,906 --> 00:37:08,432
that from the media coverage. So not only
German news outlets reported on that, like Heise
441
00:37:08,432 --> 00:37:13,788
or Golem, but it also went more
international with ZDNET, Ars Technica,
442
00:37:13,788 --> 00:37:20,970
CSO, Tech Radar, Computer Weekly and many,
many others that wrote about this new type
443
00:37:20,970 --> 00:37:28,599
of vulnerability that affects many
computers out there. And I guess if it
444
00:37:28,599 --> 00:37:31,480
affects many computers, we should do
something against that.
445
00:37:31,480 --> 00:37:35,779
M: Yes, you're right. We cannot only
have an attack and no mitigation against
446
00:37:35,779 --> 00:37:41,470
it. This would not be right. And indeed,
it's quite easy to fix that because we
447
00:37:41,470 --> 00:37:46,040
said in the beginning, you have
unprivileged access to those registers. So
448
00:37:46,040 --> 00:37:51,930
we just restrict the access. And we are
done, and this is exactly a one line patch
449
00:37:51,930 --> 00:37:59,480
for the Linux kernel. But as we've seen
with the threat model of Intel SGX, which
450
00:37:59,480 --> 00:38:05,049
allows a compromised operating system,
this one line patch does not help there
451
00:38:05,049 --> 00:38:11,340
because I'm the operating system, I can do
whatever I want to. We need more and more
452
00:38:11,340 --> 00:38:18,445
complex mitigations. And in this case,
microcode updates are necessary. And what
453
00:38:18,445 --> 00:38:23,991
Intel does is to fall back to the model of
the energy consumption. So they have an
454
00:38:23,991 --> 00:38:28,930
internal model of how much energy is
consumed by an executed instruction and
455
00:38:28,930 --> 00:38:33,968
use that instead of the real measurement.
And this does not allow distinguishing
456
00:38:33,968 --> 00:38:40,895
data and operands from each other again.
So if your implementation is implemented
457
00:38:40,895 --> 00:38:47,220
correctly, if you use constant time, then
you are mitigated and protected against
458
00:38:47,220 --> 00:38:53,444
these attacks. And as we see here in the
plot, we tried the mitigation out. So on
459
00:38:53,444 --> 00:38:58,020
the left, we were able to see differences
depending on the Hamming weight of the
460
00:38:58,020 --> 00:39:03,700
operands. And on the right with the
mitigation in place, it just does not work
461
00:39:03,700 --> 00:39:07,311
anymore and you cannot see any
differences. applause
462
00:39:07,311 --> 00:39:11,142
Andreas: Nice. So you really
can't read her power trace any more.
463
00:39:11,142 --> 00:39:35,547
Music: Pokerface by Lady Gaga
464
00:39:35,547 --> 00:39:39,641
sings
I wanna probe 'em like in 1943
465
00:39:39,641 --> 00:39:43,116
touch 'em, measure wattage
correlate and get the key
466
00:39:43,116 --> 00:39:44,005
I probe it
467
00:39:44,005 --> 00:39:47,368
Oscilloscopes are not the same
without a probe
468
00:39:47,368 --> 00:39:52,219
And babe, if it's remote if it's not code,
it cannot run
469
00:39:56,239 --> 00:39:59,731
I'll let him plot, let's see what he's got
470
00:40:04,251 --> 00:40:08,145
I'll let him plot, let's see what he's got
471
00:40:08,145 --> 00:40:10,389
Can't read my, can't read my
472
00:40:10,389 --> 00:40:14,091
No he can't read my power trace
473
00:40:14,091 --> 00:40:16,368
She's got the countermeasure
474
00:40:16,368 --> 00:40:18,283
Can't read my, can't read my
475
00:40:18,283 --> 00:40:21,907
No he can't read my power trace
476
00:40:21,907 --> 00:40:24,572
She's got the countermeasure
477
00:40:24,572 --> 00:40:27,649
P-p-p-power trace, p-p-power trace
478
00:40:28,530 --> 00:40:31,688
P-p-p-power trace, p-p-power trace
479
00:40:32,533 --> 00:40:35,658
P-p-p-power trace, p-p-power trace
480
00:40:36,691 --> 00:40:39,555
P-p-p-power trace, p-p-power trace
481
00:40:41,404 --> 00:40:43,728
applause
482
00:40:43,728 --> 00:40:45,910
Moritz: With all those nasty songs, we
483
00:40:45,910 --> 00:40:50,910
wrote them down in a scientific paper and
the PLATYPUS paper has been accepted
484
00:40:50,910 --> 00:40:57,240
recently at a conference. And we also want
to thank all the other coauthors who
485
00:40:57,240 --> 00:41:04,520
are not in this talk, like David Oswald,
Catherine Easdon and Claudio Canella. To
486
00:41:04,520 --> 00:41:09,900
sum it up, what we have seen is that with
power sidechannel attacks, you can even
487
00:41:09,900 --> 00:41:16,630
exploit them from software. So there is no
need to attach an oscilloscope to modern
488
00:41:16,630 --> 00:41:19,514
Intel CPUs.
489
00:41:19,514 --> 00:41:23,239
Michael: And what we've also seen is
that since the SGX threat model allows for
490
00:41:23,239 --> 00:41:27,809
much more capable attackers, mitigating
power sidechannel attacks on the SGX
491
00:41:27,809 --> 00:41:32,369
enclaves is much more work than simple
software patches.
492
00:41:32,369 --> 00:41:34,604
Andreas: Yes, and that concludes
493
00:41:34,604 --> 00:41:39,696
our talk on PLATYPUS. Thank you all for
listening.
494
00:41:39,696 --> 00:41:56,859
Applause and Music
495
00:41:59,077 --> 00:42:05,580
Herald: Thank you very much for your,
excuse me, nerdy talk, and thanks to Moritz,
496
00:42:05,580 --> 00:42:13,140
Michael, Daniel and Andreas. We head over
to our Q&A session and the first question
497
00:42:13,140 --> 00:42:21,059
would be: how did you come up with
this, let's say, through-the-back-door
498
00:42:21,059 --> 00:42:26,680
attack against the CPU? You
mentioned that you attack RSA through a
499
00:42:26,680 --> 00:42:31,910
power driver. Could you tell me a
little bit more about that?
500
00:42:31,910 --> 00:42:36,640
Moritz: Yes. So the basic idea of
attacking cryptographic algorithms with
501
00:42:36,640 --> 00:42:41,339
power side channel attacks is not very new.
This was like one of the first things
502
00:42:41,339 --> 00:42:46,400
researchers have shown, but most of the
time for like smaller devices, like smart
503
00:42:46,400 --> 00:42:52,740
cards, like your bank card, for instance.
And for those attacks, you usually had
504
00:42:52,740 --> 00:42:57,472
like an oscilloscope that you needed to
attach to the device to do the attack. But
505
00:42:57,472 --> 00:43:02,012
with modern processors, they have
basically an oscilloscope built into the
506
00:43:02,012 --> 00:43:07,309
processor, which you can read out as the
operating system. And in our case, there
507
00:43:07,309 --> 00:43:12,454
are like drivers that expose this
interface, also to userspace. So from
508
00:43:12,454 --> 00:43:18,050
there as an unprivileged attacker, you can
then try to exploit that. And yeah
509
00:43:18,050 --> 00:43:23,450
basically the best thing that we wanted to
achieve with those attacks is to attack
510
00:43:23,450 --> 00:43:29,434
cryptographic algorithms and not to
transmit some data between two processes.
511
00:43:29,434 --> 00:43:35,700
Herald: Cool, thank you. Our next
question, you mentioned a little bit about
512
00:43:35,700 --> 00:43:44,259
ARM, sorry, AMD. How about ARM? So not x86
architecture?
513
00:43:44,259 --> 00:43:49,350
Moritz: So there are many other vendors
that have similar interfaces, some of them
514
00:43:49,350 --> 00:43:55,519
also provide drivers that expose them
directly to userspace, but we hardly had
515
00:43:55,519 --> 00:44:01,390
any access to those devices, so we could
not really fully evaluate if these attacks
516
00:44:01,390 --> 00:44:06,072
are also possible on them. But in the
paper, we have an appendix where we
517
00:44:06,072 --> 00:44:10,440
describe them in a bit more detail so you
can try it out on your own and let us know
518
00:44:10,440 --> 00:44:15,120
if it works.
Herald: Cool. Thank you. So please, fellow
519
00:44:15,120 --> 00:44:20,470
hackers, try it out on your system, at
home. Now, our next question is related to
520
00:44:20,470 --> 00:44:26,374
that. Is there a survey which hardware has
the RAPL or similar weaknesses? Intel,
521
00:44:26,374 --> 00:44:33,045
AMD, ARM even.
Moritz: I don't know if anyone else wants
522
00:44:33,045 --> 00:44:38,940
to answer that, I can also take the
question. So the RAPL interface itself
523
00:44:38,940 --> 00:44:44,130
comes from Intel, but a similar interface
is also implemented for AMD, and they also
524
00:44:44,130 --> 00:44:49,710
use basically the same name. They have
a... For now, it's implemented in two ways
525
00:44:49,710 --> 00:44:54,420
for the Linux kernel, also in the RAPL
driver, but also in a separate one called the AMD
526
00:44:54,420 --> 00:44:59,609
Energy Driver, which has been included for a
few months in the Linux kernel, in the
527
00:44:59,609 --> 00:45:05,074
upstream Kernel. And for other vendors it
works a bit differently. So some of them
528
00:45:05,074 --> 00:45:12,087
just give you similar measurements, but
not in a tightly related way to the RAPL
529
00:45:12,087 --> 00:45:16,220
Interface, which measures over a period of
time and gives you the average.
530
00:45:16,611 --> 00:45:21,560
Herald: OK, and..
Michael: Maybe to add one point here: On
531
00:45:21,560 --> 00:45:26,534
Intel, basically the high resolution
sensors are included since the Skylake
532
00:45:26,534 --> 00:45:31,308
micro architecture. So something around
2015.
533
00:45:33,383 --> 00:45:40,180
Herald: I see. We have another related
question to AMD. So did AMD issue any
534
00:45:40,180 --> 00:45:45,160
Microcode update for the secure encrypted
virtual machines case apart from
535
00:45:45,160 --> 00:45:53,469
restricting access to MSR?
Moritz: Not as far as we know. But from
536
00:45:53,469 --> 00:45:58,271
our knowledge to attack AMD CPUs, we need
to wait for a new generation so that we
537
00:45:58,271 --> 00:46:02,931
can do similar attacks from a similar
threat model as we can do on Intel.
538
00:46:03,450 --> 00:46:09,390
Herald: Cool, thank you. So another I
think this is also related to it, you
539
00:46:09,390 --> 00:46:14,390
mentioned your Xen example where you
attack through a hypervisor. Does it work
540
00:46:14,390 --> 00:46:18,440
on other hypervisors like KVM or Hyper-V as
well?
541
00:46:18,440 --> 00:46:24,470
Moritz: So for KVM, I don't think so. For
Windows I also don't know; I don't think
542
00:46:24,470 --> 00:46:29,509
they expose those MSRs directly to the
virtual machines. So the issue is really
543
00:46:29,509 --> 00:46:34,270
here that we can have access to those MSRs
in the virtual machine where we should not
544
00:46:34,270 --> 00:46:40,859
have access to.
Herald: OK, we have another question from,
545
00:46:40,859 --> 00:46:47,297
I think, the hardware section of our
remote Congress. Someone wonders if the
546
00:46:47,297 --> 00:46:51,833
same could be achieved with external power
measurement.
547
00:46:52,990 --> 00:46:57,640
Moritz: You mean if you could attach
actually an oscilloscope or a different
548
00:46:57,640 --> 00:47:03,510
probe to the CPU? Yes, you can do that.
And it has already been demonstrated in
549
00:47:03,510 --> 00:47:07,279
the past.
Michael: But it turned out with external
550
00:47:07,279 --> 00:47:12,510
tools, it takes even longer than with
software. You have more issues finding the
551
00:47:12,510 --> 00:47:20,630
right spot in measuring. And there is one
paper, it took 14 days of collecting
552
00:47:20,630 --> 00:47:26,909
traces which are harder to probe, which is
much longer than in software. But it can
553
00:47:26,909 --> 00:47:30,981
be done.
Herald: And there's another follow up
554
00:47:30,981 --> 00:47:38,677
question, how external is external? Where
do you measure power consumptions of an
555
00:47:38,677 --> 00:47:46,650
x86 server?
Moritz: OK, you would need to get physical
556
00:47:46,650 --> 00:47:51,400
access to the data center, I guess. And if
this is in your threat model, you probably
557
00:47:51,400 --> 00:47:57,740
have different things to worry about.
Michael: Yeah, you still need to find the
558
00:47:57,740 --> 00:48:04,609
right spot on your mainboard.
Herald: OK, so are there, let's say
559
00:48:04,609 --> 00:48:08,680
documentation on where to find that right
spot.
560
00:48:09,612 --> 00:48:14,700
Moritz: I think one can take a look at
other research papers where they attached
561
00:48:14,700 --> 00:48:19,180
a probe, I think there are experts out
there, but I don't know.
562
00:48:19,180 --> 00:48:26,690
Herald: OK, thank you. The next question,
why is the power information exported in
563
00:48:26,690 --> 00:48:32,809
such detail to the kernel or userspace
software? Why isn't it only available to
564
00:48:32,809 --> 00:48:37,700
the firmware or filtered to return an
average, for example, one second power
565
00:48:37,700 --> 00:48:43,279
trace?
Moritz: Good question. We did not
566
00:48:43,279 --> 00:48:48,140
implement that. I think the reason is...
Andi?
567
00:48:48,140 --> 00:48:53,540
Andreas: The one-second power trace would
make the attack only slower because you
568
00:48:53,540 --> 00:48:58,345
can still do exactly what we did with
single stepping here, because RAPL is
569
00:48:58,345 --> 00:49:04,477
already very slow and we need a mechanism
to replay instructions to get a good
570
00:49:04,477 --> 00:49:08,779
reading of the energy consumption of the
instructions. So if you only increase the
571
00:49:08,779 --> 00:49:14,170
update interval there, the attacks would still
be possible, but only take longer to
572
00:49:14,170 --> 00:49:22,819
record the traces there. So you have to...
Yeah. So you have to find a tradeoff
573
00:49:22,819 --> 00:49:28,049
between your countermeasures there.
Herald: Okay, so let's say with an
574
00:49:28,049 --> 00:49:33,180
average, your resolution is lower, but
still it just takes more time to record
575
00:49:33,180 --> 00:49:38,420
it. And still it does work, right?
Moritz: Yes. And the other thing is that
576
00:49:38,420 --> 00:49:43,450
one needs to keep in mind those drivers
are not written with security in mind, but
577
00:49:43,450 --> 00:49:48,779
for performance so that this can be used
by other tools that like give you the best
578
00:49:48,779 --> 00:49:55,059
performance of your CPU. And in that case,
it just has not been masked and you get
579
00:49:55,059 --> 00:49:58,710
the value directly as the operating system
sees.
580
00:49:59,106 --> 00:50:06,380
Herald: Crazy. Our second to last
question, how long is the update interval
581
00:50:06,380 --> 00:50:13,046
for this measurement? I heard something
about...
582
00:50:13,046 --> 00:50:17,224
Andreas: For the fastest register we
observed, it's like 10 microseconds, for
583
00:50:17,224 --> 00:50:21,079
the slowest one... So there are different
domains where you measure only parts of
584
00:50:21,079 --> 00:50:25,290
the CPU and for the whole package, this
includes all the cores and the memory
585
00:50:25,290 --> 00:50:30,099
controller, it takes around one
millisecond there. So this is already very
586
00:50:30,099 --> 00:50:35,311
slow, if you compare it to the frequency
where CPUs are currently running at.
587
00:50:36,690 --> 00:50:43,539
Herald: Crazy. In this case, are there any
other questions from the interwebs, from
588
00:50:43,539 --> 00:50:50,455
Twitter, from our IRC channel? Because
otherwise we would head over to more,
589
00:50:50,455 --> 00:50:56,178
let's say, personal interview. Let's give
them a try.
590
00:51:07,727 --> 00:51:09,880
In this case, no more
591
00:51:09,880 --> 00:51:16,851
questions. So, again, thank
you, Moritz, Michael, Daniel and Andreas,
592
00:51:16,851 --> 00:51:27,230
for this really interesting talk and
for this Q&A session. The Internet tells
593
00:51:27,230 --> 00:51:35,622
me no questions. We head over to our
personal interview. I asked you earlier
594
00:51:35,622 --> 00:51:43,670
before our talk. So with all these, let's
say, research things going on in the
595
00:51:43,670 --> 00:51:49,420
Corona time. So what's your personal
experience? What changed in your work life
596
00:51:49,420 --> 00:51:56,001
balance in the last one year?
Moritz: I think the biggest change is that
597
00:51:56,001 --> 00:52:02,105
most of the coffee breaks you do alone
instead of with the colleagues.
598
00:52:04,211 --> 00:52:08,710
Herald: So how do you meet in
your, let's say, lunch break? Do you have
599
00:52:08,710 --> 00:52:16,069
as well a lunch break breakout session in
Jitsi?
Moritz: Yeah, we started with Jitsi, but
600
00:52:16,069 --> 00:52:20,320
used different systems on the long way.
And now it's like a fixed coffee meeting
601
00:52:20,320 --> 00:52:25,637
at 2:00 p.m. every day and try to meet
everyone or have individual meetings, of
602
00:52:25,637 --> 00:52:28,758
course.
Herald: And does this work? But so is
603
00:52:28,758 --> 00:52:35,323
everyone on time? So sharp 12?
Moritz: No, but I think no one really
604
00:52:35,323 --> 00:52:40,500
cares.
Herald: So it's just for socializing?
605
00:52:40,500 --> 00:52:47,168
Moritz: Yes. But we also discuss work
related issues also in separate meetings.
606
00:52:47,168 --> 00:52:54,849
And yeah, I think time is different, but
you get used to it. But let's hope it's
607
00:52:54,849 --> 00:53:02,108
over soon.
Herald: What about the others, Michael?
608
00:53:02,108 --> 00:53:08,910
Michael: Yes, I'm in the same coffee
breaks as Moritz. Sometimes every day,
609
00:53:08,910 --> 00:53:17,200
depends on the workload, so I feel quite
lucky that we can still work full time and
610
00:53:17,200 --> 00:53:21,890
get our work done. And I don't have to
fear that we lose our jobs in the
611
00:53:21,890 --> 00:53:30,609
short term. So I think that takes a lot of
pressure off. But, yeah, I mean, it's
612
00:53:30,609 --> 00:53:35,859
different. I'm also missing the
conferences, so I used to travel around a
613
00:53:35,859 --> 00:53:43,990
lot before Corona times and this year is
basically nothing. So you really miss the
614
00:53:43,990 --> 00:53:49,910
social interactions and conferences,
meeting other researchers, exchanging
615
00:53:49,910 --> 00:54:00,060
ideas, having that online is different and
just not the same, but still it works. So
616
00:54:00,060 --> 00:54:05,289
I can still do a lot of research. The
positive thing, you have less
617
00:54:05,289 --> 00:54:12,019
interruptions than when you're in the
office. So that's a positive thing. But
618
00:54:12,019 --> 00:54:17,269
yeah, I also hope that it's over soon.
Daniel: But then again, on the other side,
619
00:54:17,269 --> 00:54:22,476
you have way more conference calls because
instead of writing emails, people ask for
620
00:54:22,476 --> 00:54:26,808
conference calls all the time.
Michael: Yes, you are in meetings all the
621
00:54:26,808 --> 00:54:29,980
time.
Herald: Yeah, Daniel you mentioned earlier
622
00:54:29,980 --> 00:54:37,299
your, let's say, flight plan of the last
year. And as far as I understood it, you
623
00:54:37,299 --> 00:54:43,049
like to be in personal contact with your
colleagues, also from others or from
624
00:54:43,049 --> 00:54:49,109
foreign countries. How does this work? So
let's say topic exchange between different
625
00:54:49,109 --> 00:54:51,890
organizations, between different
countries?
626
00:54:51,890 --> 00:54:59,930
Daniel: Yeah, it's more difficult. So in
2018, I had these 54 talks outside of Graz
627
00:54:59,930 --> 00:55:11,529
in 52 weeks and this year I had a single
talk outside of Graz where I
628
00:55:11,529 --> 00:55:17,630
was in person, of course. Of course more
online. Um, yeah. So it's, it's difficult
629
00:55:17,630 --> 00:55:24,210
to engage with people from other places,
but it works of course in teams that you,
630
00:55:24,210 --> 00:55:29,869
that you already have established in the
past, for instance. So you can continue in
631
00:55:29,869 --> 00:55:36,720
teams that you've already built there. But
also in some cases it works to start new
632
00:55:36,720 --> 00:55:40,900
collaborations. But it's of course more
difficult than if you can just meet people
633
00:55:40,900 --> 00:55:46,643
in person like we did for this paper
actually, David Oswald, one of the
634
00:55:46,643 --> 00:55:52,613
coauthors, we met with him in person and
talked with him about the paper in person.
635
00:55:56,148 --> 00:56:02,210
Herald: Andreas, what's your, let's say,
Corona year?
636
00:56:02,210 --> 00:56:06,569
Andreas: Yeah, since I'm one of the
persons who was interrupting Michael all
637
00:56:06,569 --> 00:56:14,259
the time, I am missing the office because
it looks like the unscheduled flow,
638
00:56:14,259 --> 00:56:18,390
because if you're sitting in an office and
suddenly you have like a question or idea,
639
00:56:18,390 --> 00:56:24,110
you can not or you don't have to write it.
You can just ask it on the fly. So I'm a
640
00:56:24,110 --> 00:56:28,898
bit missing that side. On the other side,
I gained a lot of time since I don't have
641
00:56:28,898 --> 00:56:36,544
to travel to work there. And often I got a
bit better in writing stuff I want to
642
00:56:36,544 --> 00:56:40,290
know, asking questions much
faster, like losing the clover and that
643
00:56:40,290 --> 00:56:48,660
stuff. And so I think it's both positive
and negative. And I only joined since I
644
00:56:48,660 --> 00:56:55,539
think August, when I finished my master's
thesis and in the first half of the year,
645
00:56:55,539 --> 00:57:00,220
I worked at a software company where the
first lockdown was also handled very well.
646
00:57:00,220 --> 00:57:05,089
So we had like a smooth transition. So I'm
kind of used to home office, but I miss
647
00:57:05,089 --> 00:57:17,470
interacting with people.
Herald: I think that's the main thing 2020
648
00:57:17,470 --> 00:57:23,789
brings us: more remote work. Which is
basically a good thing to work more from
649
00:57:23,789 --> 00:57:32,460
home, but we have some minutes left. And
please excuse me myself. Did your mate
650
00:57:32,460 --> 00:57:41,030
consumption increase or decrease?
Moritz: I think it's hard to say for
651
00:57:41,030 --> 00:57:45,950
coffee because I used to drink more coffee
in the office than at home. Yeah, but
652
00:57:45,950 --> 00:57:56,785
now I see it when we go grocery shopping.
laughs It's hard to say.
653
00:57:56,785 --> 00:58:02,150
Michael: I think it decreased for me
because now if I'm tired, I can simply
654
00:58:02,150 --> 00:58:11,180
take a nap, that's easier.
Herald: And just turn your instant
655
00:58:11,180 --> 00:58:15,890
messaging off.
Michael: Yeah.
656
00:58:17,214 --> 00:58:23,930
Herald: So our time is over. Thank you
again for the brilliant, for the amazing
657
00:58:23,930 --> 00:58:31,640
work, for these attacks against CPUs, for
the great puns you brought, for the nice
658
00:58:31,640 --> 00:58:36,990
interview and have a nice remote Congress
3.
659
00:58:36,990 --> 00:58:51,329
postroll music
660
00:58:51,329 --> 00:59:15,900
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!