1
00:00:04,680 --> 00:00:12,629
<i>rc3 preroll music</i>

2
00:00:12,629 --> 00:00:17,340
Herald: In the world of bad puns, everyone
knows and loves the famous line from the

3
00:00:17,340 --> 00:00:22,810
cinematic masterpiece, where the IT
security specialists ask the CPU architect

4
00:00:22,810 --> 00:00:30,050
"Warum leakt hier Strom?" or in English,
"why is power leaking here?". In this talk

5
00:00:30,050 --> 00:00:35,660
our four speakers demonstrate how they can
attack modern processors purely in

6
00:00:35,660 --> 00:00:43,079
software, relying on techniques
from classical power side channel attacks.

7
00:00:43,079 --> 00:00:47,470
They'll explain how to use
unprivileged access to energy monitoring

8
00:00:47,470 --> 00:00:53,960
features on modern Intel and AMD CPUs.
Please welcome with a round of digital

9
00:00:53,960 --> 00:00:58,450
applause: Moritz Lipp, Michael Schwarz,
Daniel Gruss and Andreas Kogler.

10
00:01:07,580 --> 00:01:11,456
Moritz: Warum leakt hier Strom?
<i>laugh track</i>

11
00:01:11,456 --> 00:01:13,707
Andreas: And why aren't you
applying masking?

12
00:01:13,707 --> 00:01:16,774
<i>laugh track</i>

13
00:01:16,774 --> 00:01:20,760
Daniel: But to understand how we got here,
we have to go back to San Diego in May

14
00:01:20,760 --> 00:01:23,340
2017.
A: This is a great, Moritz, this is

15
00:01:23,340 --> 00:01:26,029
a great talk title. We have to use this.
<i>laugh track</i>

16
00:01:26,029 --> 00:01:29,739
M: Yeah, but actually, before we can
do a talk, we should do some interesting

17
00:01:29,739 --> 00:01:32,010
research that we can present, right?
<i>laugh track</i>

18
00:01:32,010 --> 00:01:35,629
A: Of course. Of course. But we have
to remember this talk title, it's great.

19
00:01:35,628 --> 00:01:36,599
<i>laugh track</i>
M: Yes.

20
00:01:36,599 --> 00:01:47,990
<i>music</i>

21
00:01:47,990 --> 00:01:51,258
Michael: Hey Moritz. Today I have found
something really cool.

22
00:01:51,258 --> 00:01:54,650
Moritz: OK, what is it?
Michael: Our computers, they give

23
00:01:54,650 --> 00:01:59,404
us the current energy consumption in
microjoules and you can access that

24
00:01:59,404 --> 00:02:00,650
from userspace.
<i>laugh track</i>

25
00:02:00,650 --> 00:02:05,200
Moritz: What? Are you for real?
Michael: That, that basically means we

26
00:02:05,200 --> 00:02:08,545
could mount something like software based
power side channels.

27
00:02:08,545 --> 00:02:13,400
Moritz: Nice. We should try that out.
Michael: Yes, I already did, because I

28
00:02:13,400 --> 00:02:15,700
thought you might not believe me.
Moritz: OK.

29
00:02:15,700 --> 00:02:20,584
Michael: So this is one of the experiments
I did. Here you can already see that. I

30
00:02:20,584 --> 00:02:23,719
measured the power consumption using that
interface.

31
00:02:23,719 --> 00:02:26,323
Moritz: yeah
Michael: First while doing nothing, idling

32
00:02:26,323 --> 00:02:28,052
around sleeping
Moritz: like always

33
00:02:28,052 --> 00:02:34,594
Michael: and then I increased the CPU
load, I just did an endless loop which

34
00:02:34,594 --> 00:02:38,253
accessed a bit of memory. It's nothing
interesting but you can already see the

35
00:02:38,253 --> 00:02:42,123
difference for that. So you can see that
there's a difference in doing nothing and

36
00:02:42,123 --> 00:02:47,283
doing a lot. That's pretty nice.
Moritz: We should take a closer look

37
00:02:47,283 --> 00:02:49,823
at that, I think.
Michael: Definitely.

38
00:02:49,823 --> 00:02:53,904
<i>music</i>

39
00:02:53,904 --> 00:02:57,194
Moritz: <i>sings</i> You can create 
my power trace

40
00:02:57,194 --> 00:02:59,009
Andreas: Oh, this is great. We already

41
00:02:59,009 --> 00:03:05,480
have a song for this paper now. Okay.
Well, this is a great song that we can use

42
00:03:05,480 --> 00:03:06,530
for the paper...

43
00:03:06,530 --> 00:03:13,071
<i>music</i>

44
00:03:13,071 --> 00:03:16,541
Michael: Powertrace, 
like power analysis attacks?

45
00:03:16,751 --> 00:03:20,840
Moritz: Yeah, but that would be 
an attack with physical access.

46
00:03:21,050 --> 00:03:23,184
Daniel: Software-only would be great

47
00:03:23,303 --> 00:03:26,361
Michael: Yes, I told you already,
I found one can measure energy

48
00:03:26,361 --> 00:03:27,957
consumption in microjoules

49
00:03:27,957 --> 00:03:32,745
Moritz: Like attacking all server, 
desktop and laptop CPUs

50
00:03:32,745 --> 00:03:35,755
Daniel: Ideally with unprivileged access

51
00:03:35,755 --> 00:03:38,899
Michael: Imagine if you could
distinguish different instructions

52
00:03:38,899 --> 00:03:42,399
or even observe the Hamming weights of 
operands and memory loads

53
00:03:42,399 --> 00:03:44,024
Daniel: Control flow monitoring

54
00:03:44,024 --> 00:03:47,919
Moritz: In physical attacks they often go
for cryptographic keys.

55
00:03:47,919 --> 00:03:52,804
That would be great.
Attacking AES-NI and RSA

56
00:03:52,804 --> 00:03:56,249
Daniel: There's just one problem:
there is no such channel

57
00:03:56,249 --> 00:03:59,676
Michael: As I said,
don't you listen, Daniel?

58
00:03:59,676 --> 00:04:04,659
It's like always, there is this RAPL 
register. This interface is already there

59
00:04:04,659 --> 00:04:07,083
and you can measure power consumption

60
00:04:07,083 --> 00:04:11,901
Daniel: Yes, but only on a 
very coarse granularity

61
00:04:14,777 --> 00:04:16,750
Moritz: But first, we need to get a bit

62
00:04:16,750 --> 00:04:21,013
more understanding of the CPU power
management. The thermal design power, the

63
00:04:21,013 --> 00:04:26,810
TDP, is the power consumption under the
maximum theoretical load of the processor.

64
00:04:26,810 --> 00:04:32,085
And you probably know that number from the
CPU specification. And this gives

65
00:04:32,085 --> 00:04:38,430
integrators a target to find the proper
thermal solution when you integrate a CPU in

66
00:04:38,430 --> 00:04:46,220
a computer so that it doesn't run too hot.
But for short periods of time, the CPU can

67
00:04:46,220 --> 00:04:52,919
consume more power than that. And this we
can see in this graphic. So here for this

68
00:04:52,919 --> 00:04:58,879
Tau moment, the power consumption is much
higher than for the rest of the time.

69
00:04:58,879 --> 00:05:05,520
Because usually a CPU is not instantly hot
and thermal properties propagate over a

70
00:05:05,520 --> 00:05:12,119
bit of time. So on the other hand, you
should also be able to save power. And you

71
00:05:12,119 --> 00:05:16,240
can do this in different ways. For
instance, you could just shut down

72
00:05:16,240 --> 00:05:21,870
resources completely that you do not need
at the moment, or you can reduce the

73
00:05:21,870 --> 00:05:27,169
voltage of the processor or those
components and then it also consumes less

74
00:05:27,169 --> 00:05:32,870
power. And on top of that, you could also
reduce the frequency of the processor and

75
00:05:32,870 --> 00:05:39,699
then it also consumes less power. And you
need this for different scenarios. For

76
00:05:39,699 --> 00:05:44,810
instance, with your laptop, you need to
budget the power consumption because you

77
00:05:44,810 --> 00:05:49,789
want to have a long run time. And you also
know these options that you can change,

78
00:05:49,789 --> 00:05:54,449
like the performance level if it should
run on high performance or to save

79
00:05:54,449 --> 00:05:57,219
battery. And you need this in different
scenarios.

80
00:05:57,219 --> 00:06:01,930
Michael: Yes, Moritz, that's exactly what
I showed you before. Do you remember? I

81
00:06:01,930 --> 00:06:07,269
showed you this Intel Running Average
Power Limit, RAPL for short, that provides

82
00:06:07,269 --> 00:06:13,180
exactly that functionality. So with this
Intel RAPL, you have the power limiting

83
00:06:13,180 --> 00:06:19,610
features so you can do exactly what you
just described, reduce the power usage for

84
00:06:19,610 --> 00:06:25,999
your system or for parts of your system.
And additionally, you also have the energy

85
00:06:25,999 --> 00:06:30,720
readings. So you know exactly how much
power is currently used on a system which

86
00:06:30,720 --> 00:06:36,419
helps you do exactly the things you just
mentioned before, like getting a better

87
00:06:36,419 --> 00:06:40,490
power performance balance. So this is
already there.

88
00:06:40,490 --> 00:06:44,409
Moritz: Because the CPU needs to know in a
way how much power it consumes, right?

89
00:06:44,409 --> 00:06:49,550
Michael: Exactly and the scheduler also
uses that feature to ensure that you get a

90
00:06:49,550 --> 00:06:54,820
better battery runtime on your laptop, for
example. And because this is an important

91
00:06:54,820 --> 00:07:00,370
feature you can directly get that from the
operating system as well. On Linux, you

92
00:07:00,370 --> 00:07:04,379
can even get that as an unprivileged
application. There's the powercap

93
00:07:04,379 --> 00:07:10,509
framework that you can directly access in
this pseudo file system where you get the

94
00:07:10,509 --> 00:07:15,729
current power readings, you can directly
see how much power your CPU currently

95
00:07:15,729 --> 00:07:17,729
consumes.
Moritz: How convenient!
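
A minimal sketch of the kind of reading Michael describes, assuming the powercap framework exposes a cumulative counter in microjoules at `/sys/class/powercap/intel-rapl:0/energy_uj`. The zone name differs between machines, and on many current kernels this file is readable only by root, so treat the path and the helper names (`power_watts`, `sample_power`) as illustrative assumptions.

```python
import time
from pathlib import Path

# Assumed location of the package energy counter; zone names vary per system.
ENERGY_FILE = Path("/sys/class/powercap/intel-rapl:0/energy_uj")


def power_watts(e0_uj, e1_uj, dt_s, max_range_uj=None):
    """Average power in watts between two cumulative microjoule readings."""
    delta = e1_uj - e0_uj
    if delta < 0 and max_range_uj is not None:
        # The counter wrapped around between the two readings.
        delta += max_range_uj
    return delta / 1e6 / dt_s


def sample_power(interval_s=0.1):
    """Read the counter twice and return the average power in between."""
    e0 = int(ENERGY_FILE.read_text())
    t0 = time.monotonic()
    time.sleep(interval_s)
    e1 = int(ENERGY_FILE.read_text())
    t1 = time.monotonic()
    return power_watts(e0, e1, t1 - t0)


if __name__ == "__main__":
    try:
        print(f"{sample_power():.2f} W")
    except (OSError, ValueError):
        # File missing or not readable without privileges on this system.
        print("energy counter not accessible here")
```

The counter is cumulative, so a single read says little. The difference between two reads divided by the elapsed time gives the average power over that interval.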

96
00:07:17,729 --> 00:07:22,879
Michael: On macOS and on Windows you have
a similar thing, but for that you first

97
00:07:22,879 --> 00:07:26,590
need to install a driver because usually
you don't need that as a userspace

98
00:07:26,590 --> 00:07:32,250
application. But some drivers might want
to have that and some drivers even expose

99
00:07:32,250 --> 00:07:36,819
that to you and you can use that. So there
are some drivers that are even

100
00:07:36,819 --> 00:07:41,300
preinstalled on some of the motherboards
that expose that information to

101
00:07:41,300 --> 00:07:47,229
applications as well on Windows.
Moritz: Interesting, but what can we do

102
00:07:47,229 --> 00:07:52,979
with this? So I ran some experiments
because I wanted to know how well this

103
00:07:52,979 --> 00:07:58,580
energy consumption monitoring works. And
in a first run we tried to distinguish

104
00:07:58,580 --> 00:08:04,090
instructions from each other. So we
implemented a small program just running

105
00:08:04,090 --> 00:08:08,049
the same instructions all the time, and we
measured its power consumption. And as we

106
00:08:08,049 --> 00:08:12,799
can see easily in this plot, different
instructions need a different amount of

107
00:08:12,799 --> 00:08:19,419
power. So we can distinguish instructions
from each other. In addition, what I

108
00:08:19,419 --> 00:08:23,559
tried, I changed the operands that
different instructions used. For instance,

109
00:08:23,559 --> 00:08:28,749
for a multiplication, you can multiply
different numbers with each other. And

110
00:08:28,749 --> 00:08:33,779
also here we see, depending on the bits
that are set in the operand a different

111
00:08:33,779 --> 00:08:39,130
power consumption of the same instruction,
but just depending on the operand so we

112
00:08:39,130 --> 00:08:44,607
can also distinguish them from each other.
This could also come in handy later on.

113
00:08:44,607 --> 00:08:51,180
But I also tried to load data with an
instruction and I wanted to know if I

114
00:08:51,180 --> 00:08:55,089
could see differences in the power
consumption, depending on the data that

115
00:08:55,089 --> 00:09:00,860
has been loaded by the processor. And as
you can see in this plot, the more bits

116
00:09:00,860 --> 00:09:07,970
that are set in the data that is loaded,
the more power the CPU consumes. But let's

117
00:09:07,970 --> 00:09:14,209
be honest here, to record these
measurements, it took more than 23 days,

118
00:09:14,209 --> 00:09:19,949
so it took quite some time to get to this
granularity to see those differences, but

119
00:09:19,949 --> 00:09:23,190
in other cases, if you just...
Michael: still a fascinating result.

120
00:09:23,190 --> 00:09:27,461
Moritz: Yes, it's a very interesting
result. And in other cases, Michael, you

121
00:09:27,461 --> 00:09:33,930
only want to know if one operand or one
value is a zero or if it's not a zero. And

122
00:09:33,930 --> 00:09:40,310
to come to this result, you don't need
that many measurements. And in the last

123
00:09:40,310 --> 00:09:45,540
experiment that we did, we wanted to
know if we would see a difference in the

124
00:09:45,540 --> 00:09:51,000
energy consumption, depending on where data
has been loaded from. For instance, as

125
00:09:51,000 --> 00:09:55,540
we've seen also at CCC in many different
talks over the past years, there are

126
00:09:55,540 --> 00:09:59,920
cache attacks. And here in this
experiment, we also were able to see a

127
00:09:59,920 --> 00:10:04,320
difference in the power consumption if
your value has been loaded from the cache

128
00:10:04,320 --> 00:10:09,550
or if it has to be loaded from the main
memory, because, of course, then DRAM is

129
00:10:09,550 --> 00:10:16,290
activated and it consumes more power. But
these results are very nice.

130
00:10:16,290 --> 00:10:20,779
Michael: Yes, these are really fascinating
results. So we should actually exploit

131
00:10:20,779 --> 00:10:25,959
them and build attacks from that. I mean,
it's fascinating to see that all these

132
00:10:25,959 --> 00:10:29,860
measurements are possible, but we also
want to do something security related.

133
00:10:29,860 --> 00:10:32,089
Moritz: Do you have any idea what we
could do?

134
00:10:32,089 --> 00:10:36,969
Michael: Yes, I have an idea. I already
showed you something before. If you

135
00:10:36,969 --> 00:10:41,240
remember from the office, this one
measurement. And I extended that

136
00:10:41,240 --> 00:10:42,400
measurement.
Moritz: Yes.

137
00:10:42,400 --> 00:10:47,560
Michael: Into a covert channel. So a
covert channel is a communication channel

138
00:10:47,560 --> 00:10:52,290
between two parties that are usually not
allowed to communicate with each other. So

139
00:10:52,290 --> 00:10:56,310
there might be different reasons for that.
Maybe there's no interface, maybe there's a

140
00:10:56,310 --> 00:11:01,892
policy or a firewall or something that
prevents them from communicating. And

141
00:11:01,892 --> 00:11:06,740
still, in this scenario, I want to
communicate. So for that, I'm using

142
00:11:06,740 --> 00:11:11,590
exactly these power side channels and all
this analysis you have done to actually

143
00:11:11,590 --> 00:11:17,940
communicate. And that is very simple to
do, actually. I have two processes, a

144
00:11:17,940 --> 00:11:24,380
sender and a receiver, and the sender
tries to send single bits, zeros and ones.

145
00:11:24,380 --> 00:11:31,120
And to send a one bit, I do something that
uses a lot of energy, like accessing main

146
00:11:31,120 --> 00:11:37,379
memory. And if I want to send a zero bit,
then I don't do anything. And now as a

147
00:11:37,379 --> 00:11:42,410
receiver, I just have to measure the power
consumption and I see if the power

148
00:11:42,410 --> 00:11:47,961
consumption has a spike. Then I know the
sender is sending a one. If there's

149
00:11:47,961 --> 00:11:53,870
nothing the sender is apparently sending a
zero and from that I can get this

150
00:11:53,870 --> 00:11:57,975
information the sender wants to send me.
Moritz: But did you try that out?
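
The scheme Michael outlines can be sketched with a simulated trace. This is a sketch under assumptions, not the code from the talk: `encode` stands in for a sender that either burns energy or idles, `decode` is a receiver that thresholds averaged energy samples, and the energy levels and `samples_per_bit` are made-up values.

```python
def encode(bits, high=5.0, low=1.0, samples_per_bit=4):
    """Simulated sender: high energy for a one bit, low energy for a zero bit."""
    trace = []
    for bit in bits:
        trace.extend([high if bit else low] * samples_per_bit)
    return trace


def decode(trace, samples_per_bit=4):
    """Receiver: average each bit window and compare it to a midpoint threshold."""
    # The midpoint threshold only works if the trace contains both levels.
    threshold = (max(trace) + min(trace)) / 2
    bits = []
    for start in range(0, len(trace), samples_per_bit):
        window = trace[start:start + samples_per_bit]
        average = sum(window) / len(window)
        bits.append(1 if average > threshold else 0)
    return bits


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1]
    print(decode(encode(message)))
```

A real receiver would also need to synchronize with the sender and cope with noise from other processes, which this sketch leaves out.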

151
00:11:57,975 --> 00:12:02,070
<i>laugh track</i>
Michael: Yes, I also tried that and we can

152
00:12:02,070 --> 00:12:07,385
see that here in this graph. So this is
the energy measurement.

153
00:12:07,385 --> 00:12:11,010
Moritz: That's a very clean signal.
Michael: Yes, it's the energy measurement

154
00:12:11,010 --> 00:12:16,080
on the receiver side. And we see exactly
what I told you before. If there are one

155
00:12:16,080 --> 00:12:20,499
bits, then the energy consumption is
higher. If there are zero bits, it's

156
00:12:20,499 --> 00:12:26,220
lower. And from that we can deduce the
information that I wanted to send on the

157
00:12:26,220 --> 00:12:30,850
sender side. Pretty neat, huh?
Moritz: Yeah, but this is just from one

158
00:12:30,850 --> 00:12:37,190
process to another process. Actually, I
took your idea and used this in a

159
00:12:37,190 --> 00:12:43,463
hypervisor scenario where we attack the
Xen hypervisor. So it's not limited to two

160
00:12:43,463 --> 00:12:49,781
processes. I installed the Xen hypervisor
with two virtual machines. And what Xen

161
00:12:49,781 --> 00:12:56,018
does is it also exposes those RAPL
registers to the virtual machine. So now

162
00:12:56,018 --> 00:13:01,079
as a virtual machine, I can have direct
access to that and then I can establish a

163
00:13:01,079 --> 00:13:04,220
covert channel between two virtual
machines in the cloud.

164
00:13:04,220 --> 00:13:08,110
Michael: That's even better.
Moritz: And this is really working, as you

165
00:13:08,110 --> 00:13:13,410
can see here. I mean, here I'm just
sending ones and zeros, but the signal is

166
00:13:13,410 --> 00:13:15,589
pretty clear.
Michael: That's nice.

167
00:13:15,589 --> 00:13:20,959
Moritz: But is there more that we can do?
Michael: Yes. I mean, covert channels are

168
00:13:20,959 --> 00:13:26,048
great to demonstrate something, that it
actually works, across VM, really great. I

169
00:13:26,048 --> 00:13:32,410
like that. That gives you a different
threat model here, but still they are a

170
00:13:32,410 --> 00:13:37,579
bit boring. So I decided to have something
more interesting as another example of

171
00:13:37,579 --> 00:13:43,320
what we can do. I always like to break
kernel address space layout randomization,

172
00:13:43,320 --> 00:13:48,899
KASLR. With this kernel address space
layout randomization, the kernel is mapped

173
00:13:48,899 --> 00:13:54,180
to different virtual locations every time
I boot my computer to make it difficult to

174
00:13:54,180 --> 00:13:58,050
actually exploit something in the kernel
because it's not predictable where the

175
00:13:58,050 --> 00:14:05,670
kernel is located. And I again use the
energy consumption to figure out where

176
00:14:05,670 --> 00:14:12,589
this kernel is located. So how does that
work? In this address space I have the

177
00:14:12,589 --> 00:14:17,980
kernel which is actually mapped using
physical pages and I have a lot of nothing

178
00:14:17,980 --> 00:14:24,350
where no physical page is mapped. And if I
try to access these addresses, I can't, of

179
00:14:24,350 --> 00:14:29,170
course, because I don't have the
privileges for that. But I will still see

180
00:14:29,170 --> 00:14:33,600
differences when doing that because the
CPU has to do different things depending

181
00:14:33,600 --> 00:14:38,340
on whether there's actually a page or not,
whether this page can be cached, this

182
00:14:38,340 --> 00:14:42,649
translation, or whether this translation
is always invalid because there's nothing

183
00:14:42,649 --> 00:14:47,780
there and it can't be cached. We can see
that here in an illustration, if you're

184
00:14:47,780 --> 00:14:53,569
wondering how that really works. So it
turns out the kernel can only be mapped to

185
00:14:53,569 --> 00:14:59,691
a limited number of places because it has
to be aligned by two megabytes, so I only

186
00:14:59,691 --> 00:15:06,009
need to check the spots there where the
kernel could be located. And for all these

187
00:15:06,009 --> 00:15:11,440
places in the address space, I just try to
access it and measure how much energy that

188
00:15:11,440 --> 00:15:17,670
consumes. And if there's nothing mapped,
it consumes quite a lot of energy because

189
00:15:17,670 --> 00:15:21,940
the CPU has to figure out that there's
nothing mapped. It goes through the page

190
00:15:21,940 --> 00:15:26,899
tables, the page table walk, and at the
end figures out, oh, there's nothing here,

191
00:15:26,899 --> 00:15:32,180
so I can't do anything, and aborts that.
And that uses quite some energy. But if

192
00:15:32,180 --> 00:15:39,200
there's actually the kernel here, then
this translation is valid. It works. There

193
00:15:39,200 --> 00:15:43,939
is something there. It will likely be
already in the translation caches in the

194
00:15:43,939 --> 00:15:49,709
TLB, so the CPU has less work. It just
needs to check the cache, sees: "Oh it's

195
00:15:49,709 --> 00:15:54,939
there. I know that. But wait a moment, you
can't access it" and can immediately abort

196
00:15:54,939 --> 00:16:01,939
and that uses less energy. So just from
the energy consumption, I can see if

197
00:16:01,939 --> 00:16:06,250
there's something mapped and with that see
where the kernel is actually mapped.
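
The search Michael describes can be sketched as a scan over 2 MiB-aligned candidates that picks the address with the lowest measured energy. Everything here is simulated: `measure` is a hypothetical stand-in for "access the address and read the energy counter", and the address range is an assumption about where an x86-64 Linux kernel may be placed.

```python
ALIGN = 2 * 1024 * 1024  # the kernel is aligned to 2 MiB

# Assumed search range for the kernel image (illustrative values).
RANGE_START = 0xFFFFFFFF80000000
RANGE_END = 0xFFFFFFFFC0000000


def candidates(start=RANGE_START, end=RANGE_END, align=ALIGN):
    """All aligned addresses where the kernel could start."""
    return range(start, end, align)


def find_kernel(cands, measure, repeats=5):
    """Return the candidate whose accesses consume the least energy on average."""
    def average_energy(addr):
        return sum(measure(addr) for _ in range(repeats)) / repeats
    # On a tie, min() keeps the first (lowest) address, i.e. the start of the mapping.
    return min(cands, key=average_energy)


def make_simulated_measure(base, size=16 * ALIGN):
    """Fake measurement: mapped addresses cost less than unmapped ones."""
    def measure(addr):
        return 1.0 if base <= addr < base + size else 2.0
    return measure


if __name__ == "__main__":
    secret_base = RANGE_START + 37 * ALIGN
    print(hex(find_kernel(candidates(), make_simulated_measure(secret_base))))
```

With a 1 GiB range and 2 MiB alignment there are only 512 candidates, which is why the scan in the demo finishes so quickly.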

198
00:16:06,250 --> 00:16:10,586
Moritz: And this is really working? Did
you try it out or is this just some

199
00:16:10,586 --> 00:16:13,329
theoretical thing?
Michael: You're always so skeptical. Of

200
00:16:13,329 --> 00:16:19,009
course I tried that and I brought the demo
with me. So here you can see the demo

201
00:16:19,009 --> 00:16:24,149
running. This is on a real system. And you
see it's super fast measuring the energy

202
00:16:24,149 --> 00:16:28,290
consumption going over the address space
and finding the kernel.

203
00:16:28,290 --> 00:16:32,279
<i>applause</i>
Moritz: But these attacks are boring,

204
00:16:32,279 --> 00:16:36,681
Michael. We want to attack something real,
we want to be like real attackers, we want

205
00:16:36,681 --> 00:16:40,800
to attack crypto, we want to get keys.
Michael: Crypto is complicated. That's …

206
00:16:40,800 --> 00:16:43,329
<i>laugh track</i>
Moritz: No, no, no, just listen. So, for

207
00:16:43,329 --> 00:16:47,861
instance, with RSA, this is a widely used
public-key cryptosystem. This is really

208
00:16:47,861 --> 00:16:53,710
easy because to encrypt some data, you
have a public key. To decrypt the data you

209
00:16:53,710 --> 00:16:59,750
have a private key. And if we get the
private key: profit, easy as that. What do

210
00:16:59,750 --> 00:17:03,189
you say?
Michael: Yeah, I know how that works. So

211
00:17:03,189 --> 00:17:08,910
the theory is easy, that I have the two
keys, a public and a private key. But then

212
00:17:08,910 --> 00:17:12,540
the complicated part starts where you
really have to understand the crypto to

213
00:17:12,540 --> 00:17:17,540
actually attack it. And that's really
complicated. And I don't really want to do

214
00:17:17,540 --> 00:17:22,586
that. Maybe we can find a student who tries
that but I'm out of here. <i>laughter</i>

215
00:17:22,586 --> 00:17:25,584
Andreas: Hi guys, I'm a student and I want
a master's thesis.

216
00:17:25,584 --> 00:17:29,370
Moritz: This is perfect. Your name is
Andreas, right?

217
00:17:29,370 --> 00:17:32,880
Andreas: Yeah, sure, I'm Andreas.
<i>laughter</i>

218
00:17:32,880 --> 00:17:36,891
M: OK, I don't know if you have heard
the last bits, but we want to attack some

219
00:17:36,891 --> 00:17:39,680
crypto with power side channel attacks.
A: OK

220
00:17:39,680 --> 00:17:44,181
Moritz: And for instance, with RSA, we
have the private key and the public key.

221
00:17:44,181 --> 00:17:50,970
Here we have M the message and C the
ciphertext and d the private exponent. And

222
00:17:50,970 --> 00:17:56,160
of course, it's a computer. It consists of
ones and zeros. And depending on the key

223
00:17:56,160 --> 00:18:01,970
bit if it's a one, for the computation of
the algorithm, we do a square and a

224
00:18:01,970 --> 00:18:08,510
multiply operation. And if it's zero, we
just do the square operation and we do

225
00:18:08,510 --> 00:18:14,110
this for the entire private key.
A: Now OK, sounds easy enough.
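
The algorithm Moritz describes is left-to-right square-and-multiply. The sketch below uses textbook RSA with tiny, insecure example numbers chosen for illustration, and it also records the operation sequence to show why observing squares and multiplies reveals the private exponent. The function names are hypothetical.

```python
def square_and_multiply(c, d, n):
    """Compute c**d mod n, logging 'S' per square and 'M' per multiply."""
    result = 1
    ops = []
    for bit in bin(d)[2:]:  # key bits, most significant first
        result = (result * result) % n
        ops.append("S")
        if bit == "1":
            result = (result * c) % n
            ops.append("M")
    return result, "".join(ops)


def key_from_ops(ops):
    """Recover the exponent from an observed square/multiply sequence."""
    bits = []
    i = 0
    while i < len(ops):
        # Every key bit starts with a square; a following multiply means the bit is 1.
        if i + 1 < len(ops) and ops[i + 1] == "M":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return int("".join(bits), 2)


if __name__ == "__main__":
    n, e, d = 3233, 17, 2753  # toy parameters (p=61, q=53), not secure
    m = 65
    c = pow(m, e, n)
    decrypted, ops = square_and_multiply(c, d, n)
    print(decrypted == m, key_from_ops(ops) == d)
```

An attacker who can tell "S" from "SM" for each key bit reads the exponent directly, which is what the power measurements are meant to provide.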

226
00:18:14,110 --> 00:18:21,640
M: Yes. And if we can observe that we
can extract the key. Sounds good. But I

227
00:18:21,640 --> 00:18:28,000
did some experiments and it didn't work
out as well as I had expected. So

228
00:18:28,000 --> 00:18:31,860
we need to get a bit more control and
maybe a better threat model how to do

229
00:18:31,860 --> 00:18:40,100
that. And that's where Intel SGX comes into play.
And this is an instruction set extension

230
00:18:40,100 --> 00:18:47,340
and it provides you with integrity and
confidentiality of code and data even in

231
00:18:47,340 --> 00:18:55,600
untrusted environments. So with Intel SGX,
you can run programs using protected areas

232
00:18:55,600 --> 00:19:02,950
of memory, even in the case where the
operating system is compromised and cannot

233
00:19:02,950 --> 00:19:07,300
be trusted at all.
A: So basically we have full

234
00:19:07,300 --> 00:19:11,500
access to all operating system features to
attack the enclave.

235
00:19:11,500 --> 00:19:14,900
M: Yes, exactly
A: OK, that sounds quite powerful

236
00:19:14,900 --> 00:19:21,130
M: But there's still one issue. It's
still just executing a program. So we have

237
00:19:21,130 --> 00:19:26,630
more power, but we need to make use of
that. And there is this paper called

238
00:19:26,630 --> 00:19:34,892
SGX-Step, which gives you more control of
enclaves, and Jo Van Bulck, the author, maybe

239
00:19:34,892 --> 00:19:40,623
has time to explain this a bit to us so
maybe we can give him a call.

240
00:19:40,623 --> 00:19:42,160
A: Sounds great.
<i>ringing sound</i>

241
00:19:42,160 --> 00:19:48,760
M: Hi Jo, this is Moritz. I've seen
the paper of yours, this SGX-Step paper.

242
00:19:48,760 --> 00:19:52,990
It might be the thing that we need, but
can you explain a bit what it is about?

243
00:19:52,990 --> 00:19:59,910
Jo: Yes, surely Moritz, so SGX-Step I
think in one sentence it's an enclave

244
00:19:59,910 --> 00:20:04,920
execution control framework. What I mean
with that is that it allows you to

245
00:20:04,920 --> 00:20:09,308
precisely control the execution of the
enclave so that you can interleave it with

246
00:20:09,308 --> 00:20:13,750
attacker code, as the name implies, you
would do one step of the enclave, one step

247
00:20:13,750 --> 00:20:17,430
of the attacker again one step of the
enclave, one step of the attacker, etc.

248
00:20:17,430 --> 00:20:19,890
M: That's perfect.
J: That's the high level.

249
00:20:19,890 --> 00:20:23,580
Moritz: Can you expand a bit on the
technical point of view? How do you do

250
00:20:23,580 --> 00:20:26,000
that?
J: Yes, I'm very excited about the

251
00:20:26,000 --> 00:20:32,100
technical details, Moritz. So let me walk
you through. The first thing you should

252
00:20:32,100 --> 00:20:36,330
know about SGX-Step: it's completely open
source and we build it on top of stock

253
00:20:36,330 --> 00:20:37,730
Linux environments.
M: Nice

254
00:20:37,730 --> 00:20:43,240
J: So what you should start with always
is to load a malicious kernel driver. And

255
00:20:43,240 --> 00:20:48,471
this is called the /dev/sgx-step driver.
And from that moment on we kind of export

256
00:20:48,471 --> 00:20:54,540
all of the powers of the Linux kernel into
the userspace. And the second component of

257
00:20:54,540 --> 00:20:58,830
SGX-Step that's important is this small
library operating system that we wrote.

258
00:20:58,830 --> 00:21:04,310
It's called libsgxstep and it sits just
alongside, as a library, in the

259
00:21:04,310 --> 00:21:09,382
userspace application. And libsgxstep
allows you to do a number of cool things.

260
00:21:09,382 --> 00:21:14,490
I think the most important thing being
that you have direct access to the APIC

261
00:21:14,490 --> 00:21:19,660
x86 high resolution timing device. So that
sounds interesting for you, right, Moritz?

262
00:21:19,660 --> 00:21:21,938
M: Yeah, but what do you
do with the timer?

263
00:21:21,938 --> 00:21:26,348
J: Well, what you can do with the timer
is essentially you can arm it just before

264
00:21:26,348 --> 00:21:30,170
you enter the enclave. And what would
happen then is, let's have a look. You arm

265
00:21:30,170 --> 00:21:34,260
the timer, you start executing the
enclave, then after a while an interrupt

266
00:21:34,260 --> 00:21:39,800
fires and you exit the enclave again.
M: Hmm, so it's like a debugger like

267
00:21:39,800 --> 00:21:44,800
GDB, but for enclaves?
J: Yes, it's a... it's exactly that

268
00:21:44,800 --> 00:21:49,000
Moritz. It's like an attacker controlled
debugger without using any of the debug

269
00:21:49,000 --> 00:21:54,350
features, just using the raw x86
primitives and operating system files. And

270
00:21:54,350 --> 00:21:59,040
just as in a debugger, it allows you to do
single stepping. So every instruction will

271
00:21:59,040 --> 00:22:03,420
be executed one at a time. At most one at
a time I should say.

272
00:22:03,420 --> 00:22:09,440
M: But what happens if I, like,
configure the timer a bit lower? Does it

273
00:22:09,440 --> 00:22:13,370
then like start executing an instruction?
J: That's a very good question. And

274
00:22:13,370 --> 00:22:18,250
configuring the timer is the tricky thing
about SGX-Step. So it will indeed happen

275
00:22:18,250 --> 00:22:23,780
sometimes what we call a zero step event.
So you will fire the timer before the

276
00:22:23,780 --> 00:22:28,290
enclave even had time to execute an
instruction. And those are a kind of event

277
00:22:28,290 --> 00:22:32,920
that you can also detect with SGX-Step.
There is a trick to detect whether you had

278
00:22:32,920 --> 00:22:36,560
a single step or a zero step.
M: Jo, this is perfect. This is

279
00:22:36,560 --> 00:22:40,060
exactly what we are looking for. Thank you
so much for explaining that.

280
00:22:40,060 --> 00:22:43,250
J: I'm very happy to hear that.
M: I'm looking forward to trying it out

281
00:22:43,250 --> 00:22:44,850
now.
J: Go.

282
00:22:44,850 --> 00:22:47,470
M: See you hopefully soon.
J: Bye bye.

283
00:22:47,470 --> 00:22:48,850
M: Bye!

284
00:22:49,460 --> 00:22:54,950
M: So SGX-Step, to sum it up,
it's an open source Linux kernel

285
00:22:54,950 --> 00:22:59,990
framework, and it allows us to configure
the APIC timer interrupts so that we can

286
00:22:59,990 --> 00:23:06,400
interrupt the enclave execution to single
and zero step it. And this is perfect

287
00:23:06,400 --> 00:23:11,760
because now we can combine it with the
power measurements of Intel RAPL, and this

288
00:23:11,760 --> 00:23:17,080
gives us the possibility to measure the
energy consumption of single instructions.

289
00:23:17,080 --> 00:23:21,710
Can you try it out Andi?
A: OK, let me dig deeper into that.

290
00:23:21,710 --> 00:23:25,700
We have this really slow RAPL interface
here and if you want to visualize it, we

291
00:23:25,700 --> 00:23:30,360
could imagine that it's like we have slots
where we can fill the slots with

292
00:23:30,360 --> 00:23:35,390
instructions and the RAPL interface gives
us the average power consumption over the

293
00:23:35,390 --> 00:23:40,050
slots. So in the default case, when we
execute our target instruction, we have

294
00:23:40,050 --> 00:23:44,100
basically one slot filled with the target
instruction and the remaining slots filled

295
00:23:44,100 --> 00:23:50,130
with other instructions we don't know. So
basically noise. The best case for us

296
00:23:50,130 --> 00:23:54,210
would be if we repeat the target
instruction indefinitely and fill every

297
00:23:54,210 --> 00:23:58,028
slot with the target instruction.
M: This is exactly what I did

298
00:23:58,028 --> 00:24:02,060
in the experiments in the beginning.
A: Yeah, exactly. That's the reason

299
00:24:02,060 --> 00:24:07,760
why we got such good measurements there.
Another trick would be if we only used the

300
00:24:07,760 --> 00:24:11,890
target instruction in one slot and fill
the remaining slots with instructions

301
00:24:11,890 --> 00:24:15,920
where we know the energy consumption of or
we know the instruction of. Then we could

302
00:24:15,920 --> 00:24:20,840
do tricks to calculate the energy
consumption of the target instruction.

303
00:24:20,840 --> 00:24:26,830
With SGX-step now we can use a hybrid
solution here, where we use SGX-step's

304
00:24:26,830 --> 00:24:32,380
zero stepping mechanism to reissue this
instruction and we can fill multiple slots

305
00:24:32,380 --> 00:24:37,260
with the same target instruction. Only
drawback here is that we have a noise

306
00:24:37,260 --> 00:24:43,130
overhead of SGX-step itself, but this is
probably the best solution we can go with.
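
The slot picture described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the slot counts and the energy values are all made up for the example and are not taken from the talk or from RAPL itself. It models one reading as the plain average over all slots and shows why filling more slots with the target instruction makes a data-dependent difference easier to see.

```python
def rapl_reading(target_energy, filler_energy, total_slots, target_slots):
    """Average energy per slot for one (hypothetical) RAPL update interval.

    `target_slots` of the `total_slots` hold the target instruction, the
    rest hold other instructions with energy `filler_energy`.
    """
    if not 0 <= target_slots <= total_slots:
        raise ValueError("target_slots must be between 0 and total_slots")
    filler_slots = total_slots - target_slots
    total = target_slots * target_energy + filler_slots * filler_energy
    return total / total_slots


def visible_difference(energy_a, energy_b, filler_energy, total_slots, target_slots):
    """Difference between two readings whose target instructions cost
    `energy_a` and `energy_b` (for example, two different operands)."""
    reading_a = rapl_reading(energy_a, filler_energy, total_slots, target_slots)
    reading_b = rapl_reading(energy_b, filler_energy, total_slots, target_slots)
    return reading_a - reading_b


if __name__ == "__main__":
    # Illustrative numbers only: 1000 slots, target costs 1.2 vs 1.0 units.
    for filled in (1, 100, 1000):
        print(filled, visible_difference(1.2, 1.0, 1.0, 1000, filled))
```

In this simplified model the visible difference grows in proportion to the number of slots that hold the target instruction, which is the motivation for re-issuing it via zero stepping.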

307
00:24:43,860 --> 00:24:48,100
M: This sounds pretty good, so we
should actually try that out. So we

308
00:24:48,100 --> 00:24:53,180
implement a toy cipher, which imitates
square and multiply basically. So we can

309
00:24:53,180 --> 00:24:58,110
leave out all the rest, the overhead of a
library that would be used otherwise. And

310
00:24:58,110 --> 00:25:02,700
we then just single step every instruction
and measure its energy consumption and

311
00:25:02,700 --> 00:25:08,200
then we could plot this. Can you do that?
A: I already got some results here

312
00:25:08,200 --> 00:25:13,156
for us. Basically here we use, as you
explained, a toy example for square and

313
00:25:13,156 --> 00:25:18,580
multiply. And in both cases the square and
the multiply, they execute exactly six

314
00:25:18,580 --> 00:25:23,860
instructions. And so basically we have a
period of six here. And if you look at the

315
00:25:23,860 --> 00:25:29,550
results of the measurement here, we can
see that we have patterns that repeat with

316
00:25:29,550 --> 00:25:34,460
a period of six and we can see that these
different patterns correspond to either a

317
00:25:34,460 --> 00:25:40,400
square or a multiply instruction here.
M: Nice, perfect, but this is just a

318
00:25:40,400 --> 00:25:42,400
toy cipher, right? <i>laughter</i>
A: Yeah.

319
00:25:42,400 --> 00:25:44,370
M: Can we do like real crypto?
<i>laughter</i>

320
00:25:44,370 --> 00:25:49,529
A: We can try. So the plan now is
that we want to attack a real RSA

321
00:25:49,529 --> 00:25:54,310
implementation and the real implementation
is not like a toy square and multiply

322
00:25:54,310 --> 00:25:59,320
algorithm. The real implementation needs
to handle these huge numbers. So basically

323
00:25:59,320 --> 00:26:03,492
there's much more code involved and it's
not feasible to single step every

324
00:26:03,492 --> 00:26:10,340
instruction there. So we must do a more
clever approach here. If we observe the

325
00:26:10,340 --> 00:26:17,478
square multiply part here, we see that the
square and the multiply function uses the

326
00:26:17,478 --> 00:26:25,420
AVX optimized memset function. So the
energy consumption should also be more if

327
00:26:25,420 --> 00:26:30,910
we execute an AVX instruction because AVX
instructions use much larger registers. So

328
00:26:30,910 --> 00:26:33,031
basically we should be able to observe
that.

329
00:26:33,031 --> 00:26:36,040
M: Interesting.
A: The only drawback here is that we

330
00:26:36,040 --> 00:26:43,470
cannot use the same approach as with the
toy cipher because the square has a

331
00:26:43,470 --> 00:26:48,659
different number of instructions than the
square and multiply function. So we need

332
00:26:48,659 --> 00:26:54,950
to do a trick here. So to understand what
we did here, our target is that we

333
00:26:54,950 --> 00:27:00,280
reconstruct a key bit. And if the key bit
is one we execute a square and multiply.

334
00:27:00,280 --> 00:27:09,260
If the key bit is zero, we execute a
square. So to visualize how we execute

335
00:27:09,260 --> 00:27:14,470
zero and single stepping, we have to dig
into the assembler a bit. So to test for

336
00:27:14,470 --> 00:27:18,690
the key bit, we execute like a test
instruction and then we execute a

337
00:27:18,690 --> 00:27:24,730
conditional jump. And if we execute the
square and multiply we have for instance,

338
00:27:24,730 --> 00:27:29,435
K instructions. And if we execute the
square we have for instance L

339
00:27:29,435 --> 00:27:34,260
instructions. So we can see that these two
numbers do not add up. They are different.

340
00:27:34,260 --> 00:27:40,050
So we cannot simply measure each Kth
instruction and get the key out. So we

341
00:27:40,050 --> 00:27:45,030
need to do something different here. We
can number the instructions after the jump

342
00:27:45,030 --> 00:27:52,980
instruction and then use single stepping
to single step to the Nth instruction

343
00:27:52,980 --> 00:27:59,272
after the jump instruction. And on the
left side, if you observe one, we hit then

344
00:27:59,272 --> 00:28:05,414
exactly the AVX instruction there, used in
the AVX memset. And if you then use our

345
00:28:05,414 --> 00:28:10,044
measurement framework to measure exactly
the Nth instruction after the jump, we

346
00:28:10,044 --> 00:28:14,690
observe on the one hand a high energy
consumption and on the other hand, we

347
00:28:14,690 --> 00:28:20,140
observe low energy consumption if the
branch was not taken or a zero.

348
00:28:20,140 --> 00:28:22,910
M: It's very clever.
A: So if you measured both

349
00:28:22,910 --> 00:28:28,490
instructions here, we can then combine
these energy measurements and then use a

350
00:28:28,490 --> 00:28:35,490
simple threshold to reconstruct the key
bit in the beginning. And then we do this

351
00:28:35,490 --> 00:28:39,270
iteratively for each key bit.
M: This sounds pretty promising, but

352
00:28:39,270 --> 00:28:40,760
did you try it out?
<i>laughter</i>
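
The threshold step can be sketched as follows. This is a minimal sketch, not the speakers' code: the function name `recover_key_bits`, the threshold and the sample values are invented for illustration. It assumes a few energy samples per key bit, taken at the chosen offset after the conditional jump, and that the AVX instruction (key bit one) shows up as a higher reading.

```python
from statistics import mean


def recover_key_bits(samples_per_bit, threshold):
    """Turn per-bit energy samples into key bits with a simple threshold.

    `samples_per_bit` holds one list of samples for each key bit. A mean
    above `threshold` is read as "AVX instruction executed", i.e. bit 1.
    """
    bits = []
    for samples in samples_per_bit:
        bits.append(1 if mean(samples) > threshold else 0)
    return bits


if __name__ == "__main__":
    # Made-up readings, three samples per key bit.
    measurements = [
        [1.31, 1.29, 1.33],
        [1.02, 0.98, 1.01],
        [1.28, 1.35, 1.30],
        [0.99, 1.03, 1.00],
    ]
    print(recover_key_bits(measurements, threshold=1.15))
```

Averaging several samples per bit is one simple way to suppress noise before comparing against the threshold; other estimators would work as well.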

353
00:28:40,760 --> 00:28:45,149
A: Sure. Here are the results of that.
And we can clearly see that we have

354
00:28:45,149 --> 00:28:48,735
different energy consumption or in this
case voltage

355
00:28:48,735 --> 00:28:51,094
<i>applause</i>
based on if the

356
00:28:51,094 --> 00:28:56,160
AVX instruction is executed or if the
instruction at the same offset in the

357
00:28:56,160 --> 00:28:59,410
other branch is executed.
M: How fast does this work, does this

358
00:28:59,410 --> 00:29:03,025
take like 5 days?
A: Not quite that long. We have one

359
00:29:03,025 --> 00:29:08,445
problem here that the time per key bit
increases the further or later the key bit

360
00:29:08,445 --> 00:29:14,040
is in the key. So basically the first key
bit we can reconstruct very fast, but for

361
00:29:14,040 --> 00:29:18,230
the last key bit, we need to single step
much further in the code to actually reach

362
00:29:18,230 --> 00:29:23,460
it. And this adds up. So basically the
time increases linearly between the key

363
00:29:23,460 --> 00:29:29,090
bits. But for our key here, our test key
with 512 bits, it takes us about 3.5

364
00:29:29,090 --> 00:29:35,280
hours to reconstruct a complete key. Note
here that we spent like 52 minutes

365
00:29:35,280 --> 00:29:39,790
only to find the target instruction. So
basically, if we could optimize that, the

366
00:29:39,790 --> 00:29:45,688
attack would be much faster. In addition,
we had to record like 3 samples per key

367
00:29:45,688 --> 00:29:50,199
bit. But with the implementation, it
should be possible to actually do that

368
00:29:50,199 --> 00:29:54,600
with 1 sample. And since we then only need
one sample per key bit, we actually can do

369
00:29:54,600 --> 00:29:58,569
it with a single trace attack. But we did
not try that out, unfortunately.

370
00:29:58,569 --> 00:30:03,375
Moritz: Quite fast.
Michael: So while all this sounded quite

371
00:30:03,375 --> 00:30:08,183
easy and straightforward in hindsight,
this was actually a really long process.

372
00:30:08,183 --> 00:30:14,100
Starting at the beginning of 2017 when we
discovered this interface, the RAPL

373
00:30:14,100 --> 00:30:18,713
interface. Then we had to come up with a
title for this talk, of course, <i>laughter</i>

374
00:30:18,713 --> 00:30:25,677
and some lyrics for a song. We had the
first toy attack on RSA at the end of

375
00:30:25,677 --> 00:30:34,463
2017. It took us until 2018 to finally get
a KASLR break that was working and only in

376
00:30:34,463 --> 00:30:41,280
2019, by the end of 2019, after Andreas
did his master's thesis on that, we were

377
00:30:41,280 --> 00:30:48,030
able to produce a full attack on RSA. And
this is also the time when we submitted

378
00:30:48,030 --> 00:30:53,910
that as a paper to a conference and
disclosed that to the CPU vendors so that

379
00:30:53,910 --> 00:30:59,552
they can fix that. And this is also the
start of the embargo. This embargo for

380
00:30:59,552 --> 00:31:10,640
this vulnerability lasted almost one year.
So from November 2019 to November 2020. It

381
00:31:10,640 --> 00:31:15,790
was just a few weeks ago that this embargo
ended here.

382
00:31:15,790 --> 00:31:21,040
Moritz: But there's one thing missing. We
really wanted to do crypto attacks, but

383
00:31:21,040 --> 00:31:28,067
not only with SGX-step as a compromised
operating system, but also from userspace.

384
00:31:28,067 --> 00:31:33,650
But as we've seen, it's so difficult to
measure parts of the code without having

385
00:31:33,650 --> 00:31:39,653
SGX-step. But what we can do is we can
measure the power consumption of the

386
00:31:39,653 --> 00:31:46,280
overall execution of an algorithm and
there correlation power analysis comes in

387
00:31:46,280 --> 00:31:53,121
handy. And there what we do is we build a
power consumption model of our device. As

388
00:31:53,121 --> 00:31:58,540
we've heard earlier, the Hamming Weight is
the number of bits that is set in an

389
00:31:58,540 --> 00:32:05,580
operand or in the data. And we assume that
if a bit is set, the computer takes more

390
00:32:05,580 --> 00:32:10,850
power to process it. In addition, what you
can use as a different model is the

391
00:32:10,850 --> 00:32:17,768
Hamming distance. So from one operation to
the other, how many bits change? And then

392
00:32:17,768 --> 00:32:24,690
we assume the more bits change, the more
power is consumed. And we really want to

393
00:32:24,690 --> 00:32:30,700
try that out. So what we are targeting now
is AES-NI, a side channel resistant

394
00:32:30,700 --> 00:32:37,320
instruction set of Intel. And we target it
in a scenario where we can trigger the

395
00:32:37,320 --> 00:32:43,728
encryption and decryption of many, many
blocks over a long time so that the

396
00:32:43,728 --> 00:32:50,770
execution time is longer than the RAPL
update rate, so that we can really see the

397
00:32:50,770 --> 00:32:55,640
power consumption in our measurement. And
this is used, for instance, in disk

398
00:32:55,640 --> 00:33:05,340
encryption or decryption or if you seal or
unseal the SGX enclave state. And we can

399
00:33:05,340 --> 00:33:10,840
now do that and record power measurements
in different scenarios, right?

400
00:33:10,840 --> 00:33:17,390
Andreas: Sure, we can try that. So in our
experiment, we recorded two million traces

401
00:33:17,390 --> 00:33:25,860
over 26 hours for the SGX environment. But we
also tried to reconstruct it without SGX

402
00:33:25,860 --> 00:33:29,700
where we used the encryption inside a
kernel module. And there we recorded

403
00:33:29,700 --> 00:33:36,951
4 million traces in 50 hours. And to
understand the attack here, we have to

404
00:33:36,951 --> 00:33:42,030
look at this animation. So basically we
have our computer where a secret key is

405
00:33:42,030 --> 00:33:49,500
stored somewhere internally. Then we use this
key to encrypt some messages and we also

406
00:33:49,500 --> 00:33:54,240
have the power consumption here. And what
we now did is we recorded the encrypted

407
00:33:54,240 --> 00:34:00,854
message and the power consumption it took
to encrypt this message for many messages.

408
00:34:00,854 --> 00:34:07,540
And then we use a model of the CPU here to
predict the energy consumption, to

409
00:34:07,540 --> 00:34:12,940
reconstruct the key. The key is usually
split up into parts, where each of the

410
00:34:12,940 --> 00:34:20,887
parts can have a value between 0 and 255.
So to reconstruct the key here, we simply

411
00:34:20,887 --> 00:34:28,819
use our measurements in the model and we
try out one of the key parts and estimate

412
00:34:28,819 --> 00:34:35,809
the energy consumption for the key part.
And then we store the correlation between

413
00:34:35,809 --> 00:34:42,619
the recorded measurements and the prediction.
And we do this for each of the possible

414
00:34:42,619 --> 00:34:50,379
key values. And once we have found the key
value with the highest correlation, we know

415
00:34:50,379 --> 00:34:56,909
that this key value corresponds to the key
part of the key. And we then simply repeat

416
00:34:56,909 --> 00:35:02,279
the process for each of the parts of the
key until we get the final key.
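
The guess-and-correlate loop for one key part can be sketched like this. It is a toy simulation under loudly stated assumptions: the leakage is modelled as the Hamming weight of a made-up substitution table (`TOY_SBOX`, a random permutation, not the AES S-box) plus Gaussian noise, and all function names and parameters are hypothetical. It is not the model or code used in the talk.

```python
import random

rng = random.Random(1234)  # fixed seed so the toy run is repeatable

# Hypothetical non-linear table standing in for a cipher's substitution step.
TOY_SBOX = list(range(256))
rng.shuffle(TOY_SBOX)


def hamming_weight(value):
    """Number of bits set in `value`."""
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def simulate_traces(key_byte, n_traces, noise_sigma):
    """Simulated (known byte, noisy power value) pairs for one key byte."""
    inputs = [rng.randrange(256) for _ in range(n_traces)]
    power = [hamming_weight(TOY_SBOX[p ^ key_byte]) + rng.gauss(0, noise_sigma)
             for p in inputs]
    return inputs, power


def recover_key_byte(inputs, power):
    """Try all 256 values and keep the one whose prediction correlates best."""
    best_guess, best_corr = 0, -2.0
    for guess in range(256):
        predicted = [hamming_weight(TOY_SBOX[p ^ guess]) for p in inputs]
        corr = pearson(predicted, power)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess


if __name__ == "__main__":
    inputs, power = simulate_traces(key_byte=0x3C, n_traces=2000, noise_sigma=1.0)
    print(hex(recover_key_byte(inputs, power)))
```

Repeating `recover_key_byte` for each byte position would yield the full key in this toy setting. Real measurements are far noisier, which is why the talk mentions millions of traces.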

417
00:35:02,279 --> 00:35:07,450
M: And we actually tried that out. So
here in our demo video, you see on the

418
00:35:07,450 --> 00:35:13,391
left where we test all the combinations
and see what is the most likely key

419
00:35:13,391 --> 00:35:18,349
candidate at the moment, while for a
single key byte on the right, you see

420
00:35:18,349 --> 00:35:23,730
every possible value and the correlation.
So in the beginning, with not that many

421
00:35:23,730 --> 00:35:29,747
traces processed, it's not very clear
which key candidate is the right one,

422
00:35:29,747 --> 00:35:34,849
because there's so much measurement noise
introduced by measuring over the overall

423
00:35:34,849 --> 00:35:41,292
execution time. But over time, this signal
gets more stable and we see on the right

424
00:35:41,292 --> 00:35:45,890
with the peak getting more and more
distance from the other candidates that

425
00:35:45,890 --> 00:35:52,380
this is our correct key byte. And we do
this, as Andreas said, for every possible

426
00:35:52,380 --> 00:35:57,230
key byte with every possible value. So in
the end, we end up with the correct key.

427
00:35:57,230 --> 00:36:00,729
<i>applause</i>
A: OK, but this seems like it's only

428
00:36:00,729 --> 00:36:05,930
Intel CPUs. Does this also affect others?
M: Yes. So actually, we also tried

429
00:36:05,930 --> 00:36:10,858
out other CPU vendors, if they have
similar interfaces. And for instance, AMD

430
00:36:10,858 --> 00:36:17,532
is affected as well. But we never really
heard back from them after our disclosure.

431
00:36:17,532 --> 00:36:23,510
And the patch that tries to solve the
problem with the driver is similar to the

432
00:36:23,510 --> 00:36:27,400
one that Intel has.
A: You're right, Moritz, it actually

433
00:36:27,400 --> 00:36:31,839
works. So I tried the same code on AMD.
The one you showed before was

434
00:36:31,839 --> 00:36:37,080
distinguishing operands, and that also
works on AMD. That's pretty nice. It's not

435
00:36:37,080 --> 00:36:41,440
an Intel only issue. It also affects at
least AMD as well.

436
00:36:41,440 --> 00:36:45,230
M: Yes, but actually there are many
other vendors as well that provide

437
00:36:45,230 --> 00:36:50,410
interfaces, even some of them unprivileged
to user space where you could probably

438
00:36:50,410 --> 00:36:55,660
mount similar attacks. For instance,
Nvidia, IBM, or Marvell and Ampere.

439
00:36:55,660 --> 00:37:00,906
A: So this is really an industry
wide problem here. And we've also seen

440
00:37:00,906 --> 00:37:08,432
that from the media coverage. So not only
German news reported on that, like Heise

441
00:37:08,432 --> 00:37:13,788
or Golem, but it also went more
international with ZDNET, Ars Technica,

442
00:37:13,788 --> 00:37:20,970
CSO, Tech Radar, Computer Weekly and many,
many others that wrote about this new type

443
00:37:20,970 --> 00:37:28,599
of vulnerability that affects many
computers out there. And I guess if it

444
00:37:28,599 --> 00:37:31,480
affects many computers, we should do
something against that.

445
00:37:31,480 --> 00:37:35,779
M: Yes, you're right. We cannot only
have an attack and no mitigation against

446
00:37:35,779 --> 00:37:41,470
it. This would not be right. And indeed,
it's quite easy to fix that because we

447
00:37:41,470 --> 00:37:46,040
said in the beginning, you have
unprivileged access to those registers. So

448
00:37:46,040 --> 00:37:51,930
we just restrict the access. And we are
done, and this is exactly a one line patch

449
00:37:51,930 --> 00:37:59,480
for the Linux kernel. But as we've seen
with the threat model of Intel SGX, which

450
00:37:59,480 --> 00:38:05,049
allows a compromised operating system,
this one line patch does not help there

451
00:38:05,049 --> 00:38:11,340
because I'm the operating system, I can do
whatever I want to. We need more and more

452
00:38:11,340 --> 00:38:18,445
complex mitigations. And in this case,
microcode updates are necessary. And what

453
00:38:18,445 --> 00:38:23,991
Intel does is to fall back to the model of
the energy consumption. So they have an

454
00:38:23,991 --> 00:38:28,930
internal model of how much energy is
consumed by an executed instruction and

455
00:38:28,930 --> 00:38:33,968
use that instead of the real measurement.
And this does not allow to distinguish

456
00:38:33,968 --> 00:38:40,895
data and operands from each other again.
So if your implementation is implemented

457
00:38:40,895 --> 00:38:47,220
correctly, if you use constant time, then
you are mitigated and protected against

458
00:38:47,220 --> 00:38:53,444
these attacks. And as we see here in the
plot, we tried the mitigation out. So on

459
00:38:53,444 --> 00:38:58,020
the left, we were able to see differences
depending on the Hamming weight of the

460
00:38:58,020 --> 00:39:03,700
operands. And on the right with the
mitigation in place, it just does not work

461
00:39:03,700 --> 00:39:07,311
anymore and you cannot see any
differences. <i>applause</i>

462
00:39:07,311 --> 00:39:11,142
Andreas: Nice. So you really
can't read her power trace any more.

463
00:39:11,142 --> 00:39:35,547
<i>Music: Pokerface by Lady Gaga</i>

464
00:39:35,547 --> 00:39:39,641
<i>sings</i>
I wanna probe 'em like in 1943

465
00:39:39,641 --> 00:39:43,116
touch 'em, measure wattage
correlate and get the key

466
00:39:43,116 --> 00:39:44,005
I probe it

467
00:39:44,005 --> 00:39:47,368
Oscilloscopes are not the same
without a probe

468
00:39:47,368 --> 00:39:52,219
And babe, if it's remote if it's not code,
it cannot run

469
00:39:56,239 --> 00:39:59,731
I'll let him plot, let's see what he's got

470
00:40:04,251 --> 00:40:08,145
I'll let him plot, let's see what he's got

471
00:40:08,145 --> 00:40:10,389
Can't read my, can't read my

472
00:40:10,389 --> 00:40:14,091
No he can't read my power trace

473
00:40:14,091 --> 00:40:16,368
She's got the countermeasure

474
00:40:16,368 --> 00:40:18,283
Can't read my, can't read my

475
00:40:18,283 --> 00:40:21,907
No he can't read my power trace

476
00:40:21,907 --> 00:40:24,572
She's got the countermeasure

477
00:40:24,572 --> 00:40:27,649
P-p-p-power trace, p-p-power trace

478
00:40:28,530 --> 00:40:31,688
P-p-p-power trace, p-p-power trace

479
00:40:32,533 --> 00:40:35,658
P-p-p-power trace, p-p-power trace

480
00:40:36,691 --> 00:40:39,555
P-p-p-power trace, p-p-power trace

481
00:40:41,404 --> 00:40:43,728
<i>applause</i>

482
00:40:43,728 --> 00:40:45,910
Moritz: With all those nasty songs, we

483
00:40:45,910 --> 00:40:50,910
wrote them down in a scientific paper and
the PLATYPUS paper has been accepted

484
00:40:50,910 --> 00:40:57,240
recently at a conference. And we also want
to thank all the other coauthors who

485
00:40:57,240 --> 00:41:04,520
are not in this talk, like David Oswald,
Catherine Easdon and Claudio Canella. To

486
00:41:04,520 --> 00:41:09,900
sum it up, what we have seen is that with
power sidechannel attacks, you can even

487
00:41:09,900 --> 00:41:16,630
exploit them from software. So there is no
need to attach an oscilloscope on modern

488
00:41:16,630 --> 00:41:19,514
Intel CPUs.

489
00:41:19,514 --> 00:41:23,239
Michael: And what we've also seen is
that since the SGX threat model allows for

490
00:41:23,239 --> 00:41:27,809
much more capable attackers, mitigating
power sidechannel attacks on the SGX

491
00:41:27,809 --> 00:41:32,369
enclaves is much more work than simple
software patches.

492
00:41:32,369 --> 00:41:34,604
Andreas: Yes, and that concludes

493
00:41:34,604 --> 00:41:39,696
our talk on PLATYPUS. Thank you all for
listening.

494
00:41:39,696 --> 00:41:56,859
<i>Applause and Music</i>

495
00:41:59,077 --> 00:42:05,580
Herald: Thank you very much for your
excuse me, nerdy talk and thank Moritz,

496
00:42:05,580 --> 00:42:13,140
Michael, Daniel and Andreas. We head over
to our Q&A session and the first question

497
00:42:13,140 --> 00:42:21,059
would be, how does it come that you have
so, let's say through the back door

498
00:42:21,059 --> 00:42:26,680
against CPU attack against the CPU idea,
you mentioned you attack RSA through a

499
00:42:26,680 --> 00:42:31,910
power driver. Could you tell me a
little bit more about that?

500
00:42:31,910 --> 00:42:36,640
Moritz: Yes. So the basic idea of
attacking cryptographic algorithms with

501
00:42:36,640 --> 00:42:41,339
power side channel attacks is not very new.
This was like one of the first things

502
00:42:41,339 --> 00:42:46,400
researchers have shown, but most of the
time for like smaller devices, like smart

503
00:42:46,400 --> 00:42:52,740
cards, like your bank card, for instance.
And for those attacks, you usually had

504
00:42:52,740 --> 00:42:57,472
like an oscilloscope that you needed to
attach to the device to do the attack. But

505
00:42:57,472 --> 00:43:02,012
with modern processors, they have
basically an oscilloscope built into the

506
00:43:02,012 --> 00:43:07,309
processor, which you can read out as the
operating system. And in our case, there

507
00:43:07,309 --> 00:43:12,454
are like drivers that expose this
interface, also to userspace. So from

508
00:43:12,454 --> 00:43:18,050
there as an unprivileged attacker, you can
then try to exploit that. And yeah

509
00:43:18,050 --> 00:43:23,450
basically the best thing that we wanted to
achieve with those attacks is to attack

510
00:43:23,450 --> 00:43:29,434
cryptographic algorithms and not to
transmit some data between two processes.

511
00:43:29,434 --> 00:43:35,700
Herald: Cool, thank you. Our next
question, you mentioned a little bit about

512
00:43:35,700 --> 00:43:44,259
ARM sorry, AMD, how about ARM? So not x86
architecture?

513
00:43:44,259 --> 00:43:49,350
Moritz: So there are many other vendors
that have similar interfaces, some of them

514
00:43:49,350 --> 00:43:55,519
also provide drivers that expose them
directly to userspace, but we hardly had

515
00:43:55,519 --> 00:44:01,390
any access to those devices, so we could
not really fully evaluate if these attacks

516
00:44:01,390 --> 00:44:06,072
are also possible on them. But in the
paper, we have an appendix where we

517
00:44:06,072 --> 00:44:10,440
describe them in a bit more detail so you
can try it out on your own and let us know

518
00:44:10,440 --> 00:44:15,120
if it works.
Herald: Cool. Thank you. So please, fellow

519
00:44:15,120 --> 00:44:20,470
hackers, try it out on your system, at
home. Now, our next question is related to

520
00:44:20,470 --> 00:44:26,374
that. Is there a survey which hardware has
the RAPL or similar weaknesses? Intel,

521
00:44:26,374 --> 00:44:33,045
AMD, ARM even.
Moritz: I don't know if anyone else wants

522
00:44:33,045 --> 00:44:38,940
to answer that, I can also take the
question. So the RAPL interface itself

523
00:44:38,940 --> 00:44:44,130
comes from Intel, but a similar interface
is also implemented for AMD, and they also

524
00:44:44,130 --> 00:44:49,710
use basically the same name. They have
a... For now, it's implemented in two ways

525
00:44:49,710 --> 00:44:54,420
for the Linux kernel, also in the RAPL
driver, but also in a separate one called AMD

526
00:44:54,420 --> 00:44:59,609
Energy Driver, which has been included for a
few months in the Linux kernel, in the

527
00:44:59,609 --> 00:45:05,074
upstream Kernel. And for other vendors it
works a bit differently. So some of them

528
00:45:05,074 --> 00:45:12,087
just give you similar measurements, but
not in a tightly related way to the RAPL

529
00:45:12,087 --> 00:45:16,220
interface, where they measure over a period of
time and give you the average.

530
00:45:16,611 --> 00:45:21,560
Herald: OK, and..
Michael: Maybe to add one point here: On

531
00:45:21,560 --> 00:45:26,534
Intel, basically the high resolution
sensors are included since the Skylake

532
00:45:26,534 --> 00:45:31,308
micro architecture. So something around
2015.

533
00:45:33,383 --> 00:45:40,180
Herald: I see. We have another related
question to AMD. So did AMD issue any

534
00:45:40,180 --> 00:45:45,160
Microcode update for the secure encrypted
virtual machines case apart from

535
00:45:45,160 --> 00:45:53,469
restricting access to MSR?
Moritz: Not as far as we know. But from

536
00:45:53,469 --> 00:45:58,271
our knowledge to attack AMD CPUs, we need
to wait for a new generation so that we

537
00:45:58,271 --> 00:46:02,931
can do similar attacks from a similar
threat model as we can on Intel.

538
00:46:03,450 --> 00:46:09,390
Herald: Cool, thank you. So another I
think this is also related to it, you

539
00:46:09,390 --> 00:46:14,390
mentioned your Xen example where you
attack through a hypervisor. Does it work

540
00:46:14,390 --> 00:46:18,440
on other hypervisors like KVM or Hyper-V as
well?

541
00:46:18,440 --> 00:46:24,470
Moritz: So for KVM, I don't think so. For
Windows, I also don't know. I don't think

542
00:46:24,470 --> 00:46:29,509
they expose those MSRs directly to the
virtual machines. So the issue is really

543
00:46:29,509 --> 00:46:34,270
here that we can have access to those MSRs
in the virtual machine where we should not

544
00:46:34,270 --> 00:46:40,859
have access to.
Herald: OK, we have another question from,

545
00:46:40,859 --> 00:46:47,297
I think, the hardware section of our
remote Congress. Someone wonders if the

546
00:46:47,297 --> 00:46:51,833
same could be achieved with external power
measurement.

547
00:46:52,990 --> 00:46:57,640
Moritz: You mean if you could attach
actually an oscilloscope or a different

548
00:46:57,640 --> 00:47:03,510
probe to the CPU? Yes, you can do that.
And it has already been demonstrated in

549
00:47:03,510 --> 00:47:07,279
the past.
Michael: But it turned out with external

550
00:47:07,279 --> 00:47:12,510
tools, it takes even longer than with
software. You have more issues finding the

551
00:47:12,510 --> 00:47:20,630
right spot in measuring. And there is one
paper, it took 14 days of collecting

552
00:47:20,630 --> 00:47:26,909
traces which are harder to probe, which is
much longer than in software. But it can

553
00:47:26,909 --> 00:47:30,981
be done.
Herald: And there's another follow up

554
00:47:30,981 --> 00:47:38,677
question, how external is external? Where
do you measure the power consumption of an

555
00:47:38,677 --> 00:47:46,650
x86 server?
Moritz: OK, you would need to get physical

556
00:47:46,650 --> 00:47:51,400
access to the data center, I guess. And if
this is in your threat model, you probably

557
00:47:51,400 --> 00:47:57,740
have different things to worry about.
Michael: Yeah, you still need to find the

558
00:47:57,740 --> 00:48:04,609
right spot on your mainboard.
Herald: OK, so is there, let's say,

559
00:48:04,609 --> 00:48:08,680
documentation on where to find that right
spot?

560
00:48:09,612 --> 00:48:14,700
Moritz: I think one can take a look at
other research papers where they attached

561
00:48:14,700 --> 00:48:19,180
a probe, I think there are experts out
there, but I don't know.

562
00:48:19,180 --> 00:48:26,690
Herald: OK, thank you. The next question,
why is the power information exported in

563
00:48:26,690 --> 00:48:32,809
such detail to the kernel or userspace
software? Why isn't it only available to

564
00:48:32,809 --> 00:48:37,700
the firmware or filtered to return an
average, for example, one second power

565
00:48:37,700 --> 00:48:43,279
trace?
Moritz: Good question. We did not

566
00:48:43,279 --> 00:48:48,140
implement that. I think the reason is...
Andi?

567
00:48:48,140 --> 00:48:53,540
Andreas: The one second power trace would
make the attack only slower because you

568
00:48:53,540 --> 00:48:58,345
can still do exactly what we did with
single stepping here, because RAPL is

569
00:48:58,345 --> 00:49:04,477
already very slow and we need a mechanism
to replay instructions to get a good

570
00:49:04,477 --> 00:49:08,779
reading of the energy consumption of the
instructions. So if you only increase the

571
00:49:08,779 --> 00:49:14,170
update interval there, the attacks would still
be possible, but only take longer to

572
00:49:14,170 --> 00:49:22,819
record the traces there. So you have to...
Yeah. So you have to find a tradeoff

573
00:49:22,819 --> 00:49:28,049
between your countermeasures there.
Herald: Okay, so let's say with an

574
00:49:28,049 --> 00:49:33,180
average, your resolution is lower, but
still it just takes more time to record

575
00:49:33,180 --> 00:49:38,420
it. And still it does work, right?
Moritz: Yes. And the other thing is that

576
00:49:38,420 --> 00:49:43,450
one needs to keep in mind those drivers
are not written with security in mind, but

577
00:49:43,450 --> 00:49:48,779
for performance so that this can be used
by other tools that like give you the best

578
00:49:48,779 --> 00:49:55,059
performance of your CPU. And in that case,
it just has not been masked and you get

579
00:49:55,059 --> 00:49:58,710
the value directly as the operating system
sees it.

580
00:49:59,106 --> 00:50:06,380
Herald: Crazy. Our second to last
question, how long is the update interval

581
00:50:06,380 --> 00:50:13,046
for this measurement? I heard something
about...

582
00:50:13,046 --> 00:50:17,224
Andreas: For the fastest register we
observed, it's like 10 microseconds, for

583
00:50:17,224 --> 00:50:21,079
the slowest one... So there are different
domains where you measure only parts of

584
00:50:21,079 --> 00:50:25,290
the CPU and for the whole package, this
includes all the cores and the memory

585
00:50:25,290 --> 00:50:30,099
controller, it takes around one
millisecond there. So this is already very

586
00:50:30,099 --> 00:50:35,311
slow, if you compare it to the frequency
that CPUs are currently running at.

587
00:50:36,690 --> 00:50:43,539
Herald: Crazy. In this case, are there any
other questions from the interwebs, from

588
00:50:43,539 --> 00:50:50,455
Twitter, from our IRC channel? Because
otherwise we would head over to a more,

589
00:50:50,455 --> 00:50:56,178
let's say, personal interview. Let's give
them a try.

590
00:51:07,727 --> 00:51:09,880
In this case, no more

591
00:51:09,880 --> 00:51:16,851
questions. So, again, thank
you, Moritz, Michael, Daniel and Andreas,

592
00:51:16,851 --> 00:51:27,230
for this really interesting talk and
for this Q&A session. The Internet tells

593
00:51:27,230 --> 00:51:35,622
me no questions. We head over to our
personal interview. I asked you earlier

594
00:51:35,622 --> 00:51:43,670
before our talk. So with all these, let's
say, research things going on in the

595
00:51:43,670 --> 00:51:49,420
Corona time. So what's your personal
experience? What changed in your work life

596
00:51:49,420 --> 00:51:56,001
balance in the last year?
Moritz: I think the biggest change is that

597
00:51:56,001 --> 00:52:02,105
most of the coffee breaks you do alone
instead of with the colleagues.

598
00:52:04,211 --> 00:52:08,710
Herald: So how do you meet in
your, let's say, lunch break? Do you have

599
00:52:08,710 --> 00:52:16,069
as well a lunch break breakout session in
Jitsi? Yeah, we started with Jitsi, but

600
00:52:16,069 --> 00:52:20,320
used different systems along the way.
And now it's like a fixed coffee meeting

601
00:52:20,320 --> 00:52:25,637
at 2:00 p.m. every day and try to meet
everyone or have individual meetings, of

602
00:52:25,637 --> 00:52:28,758
course.
Herald: And does this work? But so is

603
00:52:28,758 --> 00:52:35,323
everyone on time? So sharp 12?
Moritz: No, but I think no one really

604
00:52:35,323 --> 00:52:40,500
cares.
Herald: So it's just for socializing?

605
00:52:40,500 --> 00:52:47,168
Moritz: Yes. But we also discuss
work-related issues in separate meetings.

606
00:52:47,168 --> 00:52:54,849
And yeah, I think time is different, but
you get used to it. But let's hope it's

607
00:52:54,849 --> 00:53:02,108
over soon.
Herald: What about the others, Michael?

608
00:53:02,108 --> 00:53:08,910
Michael: Yes, I'm in the same coffee
breaks as Moritz. Sometimes every day,

609
00:53:08,910 --> 00:53:17,200
depends on the workload, so I feel quite
lucky that we can still work full time and

610
00:53:17,200 --> 00:53:21,890
get our work done. And I don't have to
fear that we lose our jobs in the

611
00:53:21,890 --> 00:53:30,609
short term. So I think that takes a lot of
pressure off. But, yeah, I mean, it's

612
00:53:30,609 --> 00:53:35,859
different. I'm also missing the
conferences, so I used to travel around a

613
00:53:35,859 --> 00:53:43,990
lot before Corona times and this year is
basically nothing. So you really miss the

614
00:53:43,990 --> 00:53:49,910
social interactions and conferences,
meeting other researchers, exchanging

615
00:53:49,910 --> 00:54:00,060
ideas, having that online is different and
just not the same, but still it works. So

616
00:54:00,060 --> 00:54:05,289
I can still do a lot of research. The
positive thing: you have fewer

617
00:54:05,289 --> 00:54:12,019
interruptions than when you're in the
office. So that's a positive thing. But

618
00:54:12,019 --> 00:54:17,269
yeah, I also hope that it's over soon.
Daniel: But then again, on the other side,

619
00:54:17,269 --> 00:54:22,476
you have way more conference calls because
instead of writing emails, people ask for

620
00:54:22,476 --> 00:54:26,808
conference calls all the time.
Michael: Yes, you are in meetings all the

621
00:54:26,808 --> 00:54:29,980
time.
Herald: Yeah, Daniel you mentioned earlier

622
00:54:29,980 --> 00:54:37,299
your, let's say, flight plan of the last
year. And as far as I understood it, you

623
00:54:37,299 --> 00:54:43,049
like to be in personal contact with your
colleagues, also from others or from

624
00:54:43,049 --> 00:54:49,109
foreign countries. How does this work? So
let's say topic exchange between different

625
00:54:49,109 --> 00:54:51,890
organizations, between different
countries?

626
00:54:51,890 --> 00:54:59,930
Daniel: Yeah, it's more difficult. So in
2018, I had these 54 talks outside of Graz

627
00:54:59,930 --> 00:55:11,529
in 52 weeks and this year I had a single
talk outside of Graz where I

628
00:55:11,529 --> 00:55:17,630
was in person. Of course, more
online. Yeah. So it's difficult

629
00:55:17,630 --> 00:55:24,210
to engage with people from other places,
but it works of course in teams that you,

630
00:55:24,210 --> 00:55:29,869
that you already have established in the
past, for instance. So you can continue in

631
00:55:29,869 --> 00:55:36,720
teams that you've already built there. But
also in some cases it works to start new

632
00:55:36,720 --> 00:55:40,900
collaborations. But it's of course more
difficult than if you can just meet people

633
00:55:40,900 --> 00:55:46,643
in person like we did for this paper
actually, David Oswald, one of the

634
00:55:46,643 --> 00:55:52,613
coauthors, we met with him in person and
talked with him about the paper in person.

635
00:55:56,148 --> 00:56:02,210
Herald: Andreas, what's your, let's say,
Corona year?

636
00:56:02,210 --> 00:56:06,569
Andreas: Yeah, since I'm one of the
persons who was interrupting Michael all

637
00:56:06,569 --> 00:56:14,259
the time I am missing the office because
it looks like the unscheduled flow,

638
00:56:14,259 --> 00:56:18,390
because if you're sitting in an office and
suddenly you have like a question or idea,

639
00:56:18,390 --> 00:56:24,110
you can not or you don't have to write it.
You can just ask it on the fly. So I'm a

640
00:56:24,110 --> 00:56:28,898
bit missing that side. On the other side,
I gained a lot of time since I don't have

641
00:56:28,898 --> 00:56:36,544
to travel to work there. And often I got a
bit better in writing stuff I want to

642
00:56:36,544 --> 00:56:40,290
know, asking questions much
faster, like losing the clover and that

643
00:56:40,290 --> 00:56:48,660
stuff. And so I think it's both positive
and negative. And I only joined in, I

644
00:56:48,660 --> 00:56:55,539
think August, when I finished my master's
thesis and in the first half of the year,

645
00:56:55,539 --> 00:57:00,220
I worked at a software company where the
first lockdown was also handled very well.

646
00:57:00,220 --> 00:57:05,089
So we had like a smooth transition. So I'm
kind of used to home office, but I miss

647
00:57:05,089 --> 00:57:17,470
interacting with people.
Herald: I think that's the main thing 2020

648
00:57:17,470 --> 00:57:23,789
brings us: more remote work. Which is
basically a good thing to work more from

649
00:57:23,789 --> 00:57:32,460
home, but we have some minutes left. And
please excuse me. Did your mate

650
00:57:32,460 --> 00:57:41,030
consumption increase or decrease?
Moritz: I think it's hard to say for

651
00:57:41,030 --> 00:57:45,950
coffee because I used to drink more coffee
in the office than at home. Yeah, but

652
00:57:45,950 --> 00:57:56,785
now I see it when we go grocery shopping.
<i>laughs</i> It's hard to say.

653
00:57:56,785 --> 00:58:02,150
Michael: I think it decreased for me
because now if I'm tired, I can simply

654
00:58:02,150 --> 00:58:11,180
take a nap, that's easier.
Herald: And just turn your instant

655
00:58:11,180 --> 00:58:15,890
messaging off.
Michael: Yeah.

656
00:58:17,214 --> 00:58:23,930
Herald: So our time is over. Thank you
again for the brilliant, amazing

657
00:58:23,930 --> 00:58:31,640
work, for these attacks against CPUs, for
the great puns you brought, for the nice

658
00:58:31,640 --> 00:58:36,990
interview and have a nice remote Congress
3.

659
00:58:36,990 --> 00:58:51,329
<i>postroll music</i>

660
00:58:51,329 --> 00:59:15,900
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!