-
rc3 preroll music
-
Herald: In the world of bad puns, everyone
knows and loves the famous line from the
-
cinematic masterpiece, where the IT
security specialists ask the CPU architect
-
"Warum leakt hier Strom?" or in English,
"why is power leaking here?". In this talk
-
our four speakers demonstrate how they can
attack modern processors purely in
-
software, relying on techniques
from classical power side channel attacks.
-
They'll explain how to use the
unprivileged access to energy monitoring
-
features in modern Intel and AMD CPUs.
Please welcome with a round of digital
-
applause: Moritz Lipp, Michael Schwarz,
Daniel Gruss and Andreas Kogler.
-
Moritz: Warum leakt hier Strom? (Why is power leaking here?)
laugh track
-
Andreas: Und warum wendest du kein Masking an?
(And why aren't you applying masking?)
-
laugh track
-
Daniel: But to understand how we got here,
we have to go back to San Diego in May
-
2017.
A: This is great, Moritz, this is
-
a great talk title. We have to use this.
laugh track
-
M: Yeah, but actually, before we can
do a talk, we should do some interesting
-
research that we can present, right?
laugh track
-
A: Of course. Of course. But we have
to remember this talk title, it's great.
-
laugh track
M: Yes.
-
music
-
Michael: Hey Moritz. Today I have found
something really cool.
-
Moritz: OK, what is it?
Michael: Our computers, they give
-
us the current energy consumption in
microjoules, and you can access that
-
from userspace.
laugh track
-
Moritz: What? Are you for real?
Michael: That, that basically means we
-
could mount something like software based
power side channels.
-
Moritz: Nice. We should try that out.
Michael: Yes, I already did, because I
-
thought you might not believe me.
Moritz: OK.
-
Michael: So this is one of the experiments
I did. Here you can already see that. I
-
measured the power consumption using that
interface.
-
Moritz: yeah
Michael: First while doing nothing, idling
-
around sleeping
Moritz: like always
-
Michael: and then I increased the CPU
load, I just did an endless loop which
-
accessed a bit of memory. It's nothing
interesting but you can already see the
-
difference for that. So you can see that
there's a difference in doing nothing and
-
doing a lot. That's pretty nice.
Moritz: We should take a closer look
-
at that, I think.
Michael: Definitely.
-
music
-
Moritz: sings You can create
my power trace
-
Andreas: Oh, this is great. We already
-
have a song for this paper now. Okay.
Well, this is a great song that we can use
-
for the paper...
-
music
-
Michael: Powertrace,
like power analysis attacks?
-
Moritz: Yeah, but that would be
an attack with physical access.
-
Daniel: Software-only would be great
-
Michael: Yes, I told you already,
I found one can measure energy
-
consumption in microjoules
-
Moritz: Like attacking all server,
desktop and laptop CPUs
-
Daniel: Ideally with unprivileged access
-
Michael: Imagine if you could
distinguish different instructions
-
or even observe the Hamming weights of
operands and memory loads
-
Daniel: Control flow monitoring
-
Moritz: In physical attacks they often go
for cryptographic keys.
-
That would be great.
Attacking AES-NI and RSA
-
Daniel: There's just one problem:
there is no such channel
-
Michael: As I said,
don't you listen, Daniel?
-
It's like always, there is this RAPL
register. This interface is already there
-
and you can measure power consumption
-
Daniel: Yes, but only on a
very coarse granularity
-
Moritz: But first, we need to get a bit
-
more understanding of the CPU power
management. The thermal design power, the
-
TDP, is the power consumption under the
maximum theoretical load of the processor.
-
And you probably know that number from the
CPU specification. And this gives
-
integrators a target to find the proper
thermal solution when you integrate a CPU in
-
a computer so that it doesn't run too hot.
But for short periods of time, the CPU can
-
consume more power than that. And this we
can see in this graphic. So here for this
-
time window tau, the power consumption is much
higher than for the rest of the time.
-
Because usually a CPU is not instantly hot
and thermal properties propagate over a
-
bit of time. So on the other hand, you
should also be able to save power. And you
-
can do this in different ways. For
instance, you could just shut down
-
resources completely that you do not need
at the moment, or you can reduce the
-
voltage of the processor or those
components and then it also consumes less
-
power. And on top of that, you could also
reduce the frequency of the processor and
-
then it also consumes less power. And you
need this for different scenarios. For
-
instance, with your laptop, you need to
budget the power consumption because you
-
want to have a long run time. And you also
know these options that you can change,
-
like the performance level if it should
run on high performance or to save
-
battery. And you need this in different
scenarios.
-
Michael: Yes, Moritz, that's exactly what
I showed you before. Do you remember? I
-
showed you this Intel Running Average
Power Limit, RAPL for short, that provides
-
exactly that functionality. So with this
Intel RAPL, you have the power limiting
-
features so you can do exactly what you
just described, reduce the power usage for
-
your system or for parts of your system.
And additionally, you also have the energy
-
readings. So you know exactly how much
power is currently used on a system which
-
helps you do exactly the things you just
mentioned before, like getting a better
-
power performance balance. So this is
already there.
-
Moritz: Because the CPU needs to know in a
way how much power it consumes, right?
-
Michael: Exactly and the scheduler also
uses that feature to ensure that you get a
-
better battery runtime on your laptop, for
example. And because this is an important
-
feature you can directly get that from the
operating system as well. On Linux, you
-
can even get that as an unprivileged
application. There's the powercap
-
framework that you can directly access in
this pseudo file system where you get the
-
current power readings, you can directly
see how much power your CPU currently
-
consumes.
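To make the powercap reading concrete, here is a minimal sketch in Python. It assumes the package-0 RAPL domain is exposed under `/sys/class/powercap/intel-rapl:0`; the exact path differs between machines and kernels, and on patched systems reading it needs root. The helper names are hypothetical and chosen only for illustration.

```python
import time
from pathlib import Path

# Assumed sysfs location of the package-0 RAPL domain (may differ per system).
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def energy_delta_uj(before: int, after: int, max_range_uj: int) -> int:
    """Energy in microjoules between two counter readings.

    The counter wraps around at roughly max_range_uj, so a smaller
    second reading is treated as a single wraparound.
    """
    if after >= before:
        return after - before
    return after + (max_range_uj - before)


def average_power_watts(delta_uj: int, seconds: float) -> float:
    """Convert an energy difference in microjoules into average watts."""
    return delta_uj / 1e6 / seconds


def read_counter(name: str) -> int:
    """Read one integer attribute of the RAPL domain from sysfs."""
    return int((RAPL_DIR / name).read_text())


def sample_power(interval: float = 0.5) -> float:
    """Average package power over `interval` seconds."""
    max_range = read_counter("max_energy_range_uj")
    before = read_counter("energy_uj")
    time.sleep(interval)
    after = read_counter("energy_uj")
    return average_power_watts(energy_delta_uj(before, after, max_range), interval)


if __name__ == "__main__":
    try:
        print(f"package power: {sample_power():.2f} W")
    except (OSError, ValueError) as err:
        print(f"RAPL counter not readable here: {err}")
```

Running this once while the machine idles and once under load should show the kind of difference Michael describes in his first experiment.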
Moritz: How convenient!
-
Michael: On macOS and on Windows you have
a similar thing, but for that you first
-
need to install a driver because usually
you don't need that as a userspace
-
application. But some drivers might want
to have that and some drivers even expose
-
that to you and you can use that. So there
are some drivers that are even
-
preinstalled on some of the motherboards
that expose that information to
-
applications as well on Windows.
Moritz: Interesting, but what can we do
-
with this? So I ran some experiments
because I wanted to know how good this
-
energy consumption monitoring works. And
in a first run we tried to distinguish
-
instructions from each other. So we
implemented a small program just running
-
the same instructions all the time, and we
measured its power consumption. And as we
-
can see easily in this plot, different
instructions need a different amount of
-
power. So we can distinguish instructions
from each other. In addition, what I
-
tried, I changed the operands that
different instructions used. For instance,
-
for a multiplication, you can multiply
different numbers with each other. And
-
also here we see, depending on the bits
that are set in the operand a different
-
power consumption of the same instruction,
but just depending on the operand so we
-
can also distinguish them from each other.
This could also come in handy later on.
-
But I also tried to load data with an
instruction and I wanted to know if I
-
could see differences in the power
consumption, depending on the data that
-
has been loaded by the processor. And as
you can see in this plot, the more bits
-
that are set in the data that is loaded,
the more power the CPU consumes. But let's
-
be honest here, to record these
measurements, it took more than 23 days,
-
so it took quite some time to get to this
granularity to see those differences, but
-
in other cases, if you just...
Michael: still a fascinating result.
-
Moritz: Yes, it's a very interesting
result. And in other cases, Michael, you
-
only want to know if one operand or one
value is a zero or if it's not a zero. And
-
to come to this result, you don't need
that many measurements. In the last
-
experiment that we did, we wanted to
know if we would see a difference in the
-
energy consumption, depending on where data
has been loaded from. For instance, as
-
we've seen also at CCC in many different
talks over the past years, there are
-
cache attacks. And here in this
experiment, we also were able to see a
-
difference in the power consumption if
a value has been loaded from the cache
-
or if it has to be loaded from the main
memory, because, of course, then DRAM is
-
activated and it consumes more power. But
these results are very nice.
-
Michael: Yes, these are really fascinating
results. So we should actually exploit
-
them and build attacks from that. I mean,
it's fascinating to see that all these
-
measurements are possible, but we also
want to do something security related.
-
Moritz: Do you have any idea what we
could do?
-
Michael: Yes, I have an idea. I already
showed you something before. If you
-
remember from the office, this one
measurement. And I extended that
-
measurement.
Moritz: Yes.
-
Michael: Into a covert channel. So a
covert channel is a communication channel
-
between two parties that are usually not
allowed to communicate with each other. So
-
there might be different reasons for that.
Maybe there's no interface, maybe there's a
-
policy or a firewall or something that
prevents them from communicating. And
-
still, in this scenario, I want to
communicate. So for that, I'm using
-
exactly these power side channels and all
this analysis you have done to actually
-
communicate. And that is very simple to
do, actually. I have two processes, a
-
sender and a receiver, and the sender
tries to send single bits, zeros and ones.
-
And to send a one bit, I do something that
uses a lot of energy, like accessing main
-
memory. And if I want to send a zero bit,
then I don't do anything. And now as a
-
receiver, I just have to measure the power
consumption and I see if the power
-
consumption has a spike. Then I know the
sender is sending a one. If there's
-
nothing the sender is apparently sending a
zero and from that I can get this
-
information the sender wants to send me.
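The receiver side of this covert channel can be sketched in a few lines of Python. This is a simulation under stated assumptions, not the code from the talk: the energy levels, noise, and samples per bit are made-up values, and the threshold only works if the message contains both zeros and ones.

```python
import random
from statistics import mean


def simulate_receiver_trace(bits, samples_per_bit=20, idle=5.0, busy=12.0,
                            noise=1.0, seed=0):
    """Simulated energy readings: high while a 1 is sent, low for a 0.

    idle, busy and noise are hypothetical values in arbitrary units.
    """
    rng = random.Random(seed)
    trace = []
    for bit in bits:
        level = busy if bit else idle
        trace.extend(rng.gauss(level, noise) for _ in range(samples_per_bit))
    return trace


def decode_bits(trace, samples_per_bit=20):
    """Average each bit slot and compare it against a midpoint threshold."""
    slots = [mean(trace[i:i + samples_per_bit])
             for i in range(0, len(trace), samples_per_bit)]
    threshold = (min(slots) + max(slots)) / 2
    return [1 if slot > threshold else 0 for slot in slots]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0]
    received = decode_bits(simulate_receiver_trace(message))
    print("sent:    ", message)
    print("received:", received)
```

A real sender would replace the simulation with a busy loop over main memory for a one and a sleep for a zero, and both sides would need to agree on the slot length.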
Moritz: But did you try that out?
-
laugh track
Michael: Yes, I also tried that and we can
-
see that here in this graph. So this is
the energy measurement.
-
Moritz: That's a very clean signal.
Michael: Yes, it's the energy measurement
-
on the receiver side. And we see exactly
what I told you before. If there are one
-
bits, then the energy consumption is
higher. If there are zero bits, it's
-
lower. And from that we can deduce the
information that I wanted to send on the
-
sender side. Pretty neat, huh?
Moritz: Yeah, but this is just from one
-
process to another process. Actually, I
took your idea and used this in a
-
hypervisor scenario where we attack the
Xen hypervisor. So it's not limited to two
-
processes. I installed the Xen hypervisor
with two virtual machines. And what Xen
-
does is it also exposes those RAPL
registers to the virtual machine. So now
-
as a virtual machine, I can have direct
access to that and then I can establish a
-
covert channel between two virtual
machines in the cloud.
-
Michael: That's even better.
Moritz: And this is really working, as you
-
can see here. I mean, here I'm just
sending ones and zeros, but the signal is
-
pretty clear.
Michael: That's nice.
-
Moritz: But is there more that we can do?
Michael: Yes. I mean, covert channels are
-
great to demonstrate that something
actually works. Across VMs, really great. I
-
like that. That gives you a different
threat model here, but still they are a
-
bit boring. So I decided to have something
more interesting as another example of
-
what we can do. I always like to break
kernel address space layout randomization,
-
KASLR. With this kernel address space
layout randomization, the kernel is mapped
-
to different virtual locations every time
I boot my computer to make it difficult to
-
actually exploit something in the kernel
because it's not predictable where the
-
kernel is located. And I again use the
energy consumption to figure out where
-
this kernel is located. So how does that
work? In this address space I have the
-
kernel which is actually mapped using
physical pages and I have a lot of nothing
-
where no physical page is mapped. And if I
try to access these addresses, I can't, of
-
course, because I don't have the
privileges for that. But I will still see
-
differences when doing that because the
CPU has to do different things depending
-
on whether there's actually a page or not,
whether this page can be cached, this
-
translation, or whether this translation
is always invalid because there's nothing
-
there and it can't be cached. We can see
that here in an illustration, if you're
-
wondering how that really works. So it
turns out the kernel can only be mapped to
-
a limited number of places because it has
to be aligned by two megabytes, so I only
-
need to check the spots there where the
kernel could be located. And for all these
-
places in the address space, I just try to
access it and measure how much energy that
-
consumes. And if there's nothing mapped,
it consumes quite a lot of energy because
-
the CPU has to figure out that there's
nothing mapped. It goes through the page
-
tables, the page table walk, and at the
end figures out, oh, there's nothing here,
-
so I can't do anything, and aborts that.
And that uses quite some energy. But if
-
there's actually the kernel here, then
this translation is valid. It works. There
-
is something there. It will likely be
already in the translation caches in the
-
TLB, so the CPU has less work. It just
needs to check the cache, sees: "Oh it's
-
there. I know that. But wait a moment, you
can't access it" and can immediately abort
-
and that uses less energy. So just from
the energy consumption, I can see if
-
there's something mapped and with that see
where the kernel is actually mapped.
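The search Michael describes boils down to "probe every 2 MiB-aligned candidate and keep the cheapest one". The Python sketch below simulates that. The address range is an assumption (the usual x86-64 Linux kernel text region), the energy values are invented, and the simulation maps only a single slot, whereas a real kernel spans several.

```python
import random
from statistics import median

# Assumed kernel text region on x86-64 Linux; 2 MiB alignment as in the talk.
KERNEL_START = 0xFFFFFFFF80000000
KERNEL_END = 0xFFFFFFFFC0000000
ALIGNMENT = 2 * 1024 * 1024


def candidate_addresses():
    """All 2 MiB-aligned addresses where the kernel could start."""
    return range(KERNEL_START, KERNEL_END, ALIGNMENT)


def find_kernel(measure_energy, repetitions=15):
    """Return the candidate whose access consumes the least energy.

    Mapped addresses hit the TLB and abort early; unmapped ones trigger a
    full page table walk and cost more.
    """
    best_addr, best_cost = None, float("inf")
    for addr in candidate_addresses():
        cost = median(measure_energy(addr) for _ in range(repetitions))
        if cost < best_cost:
            best_addr, best_cost = addr, cost
    return best_addr


def make_simulated_probe(kernel_base, seed=1):
    """Hypothetical stand-in for 'access the address and read RAPL'."""
    rng = random.Random(seed)

    def measure_energy(addr):
        base = 60.0 if addr == kernel_base else 100.0  # made-up energy units
        return rng.gauss(base, 8.0)

    return measure_energy


if __name__ == "__main__":
    secret = KERNEL_START + 137 * ALIGNMENT
    found = find_kernel(make_simulated_probe(secret))
    print(hex(found), found == secret)
```

Taking the median over several repetitions per candidate is a design choice to keep single noisy readings from dominating the result.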
-
Moritz: And this is really working? Did
you try it out or is this just some
-
theoretical thing?
Michael: You're always so skeptical. Of
-
course I tried that and I brought the demo
with me. So here you can see the demo
-
running. This is on a real system. And you
see it's super fast measuring the energy
-
consumption going over the address space
and finding the kernel.
-
applause
Moritz: But these attacks are boring,
-
Michael. We want to attack something real,
we want to be like real attackers, we want
-
to attack crypto, we want to get keys.
Michael: Crypto is complicated. That's …
-
laugh track
Moritz: No, no, no, just listen. So, for
-
instance, with RSA, this is a widely used
public-key cryptosystem. This is really
-
easy because to encrypt some data, you
have a public key. To decrypt the data you
-
have a private key. And if we get the
private key: profit, easy as that. What do
-
you say?
Michael: Yeah, I know how that works. So
-
the theory is easy: I have the two
keys, a public and a private key. But then
-
the complicated part starts where you
really have to understand the crypto to
-
actually attack it. And that's really
complicated. And I don't really want to do
-
that. Maybe we can find a student who tries
that but I'm out of here. laughter
-
Andreas: Hi guys, I'm a student and I want
a master's thesis.
-
Moritz: This is perfect. Your name is
Andreas, right?
-
Andreas: Yeah, sure, I'm Andreas.
laughter
-
M: OK, I don't know if you have heard
the last bits, but we want to attack some
-
crypto with power side channel attacks.
A: OK
-
Moritz: And for instance, with RSA, we
have the private key and the public key.
-
Here we have M the message and C the
ciphertext and d the private exponent. And
-
of course, it's a computer. It consists of
ones and zeros. And depending on the key
-
bit if it's a one, for the computation of
the algorithm, we do a square and a
-
multiply operation. And if it's zero, we
just do the square operation and we do
-
this for the entire private key.
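As a sketch of why this leaks, here is textbook left-to-right square-and-multiply in Python that also logs which operations it performs. The operation log stands in for what an attacker would observe through the side channel; the function names are hypothetical.

```python
def square_and_multiply(base, exponent, modulus):
    """Compute base**exponent % modulus and record the operation sequence."""
    result = 1
    ops = []
    for bit in bin(exponent)[2:]:  # key bits, most significant first
        result = (result * result) % modulus  # always square
        ops.append("S")
        if bit == "1":
            result = (result * base) % modulus  # multiply only for a 1 bit
            ops.append("M")
    return result, ops


def bits_from_ops(ops):
    """Recover the exponent bits from the observed operations."""
    bits = []
    for op in ops:
        if op == "S":
            bits.append(0)   # every square starts a new key bit
        else:
            bits[-1] = 1     # a following multiply marks that bit as 1
    return bits


if __name__ == "__main__":
    c, d, n = 2790, 2753, 3233  # small textbook-sized RSA numbers
    m, ops = square_and_multiply(c, d, n)
    recovered = int("".join(map(str, bits_from_ops(ops))), 2)
    print("".join(ops))
    print(m == pow(c, d, n), recovered == d)
```

Anyone who can tell a square from a square followed by a multiply can read the private exponent straight off the trace.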
A: Now OK, sounds easy enough.
-
M: Yes. And if we can observe that we
can extract the key. Sounds good. But I
-
did some experiments and it didn't work
out as well as I had expected. So
-
we need to get a bit more control and
maybe a better threat model how to do
-
that. And that is where Intel SGX comes into play.
And this is an instruction set extension
-
and it provides you with integrity and
confidentiality of code and data even in
-
untrusted environments. So with Intel SGX,
you can run programs using protected areas
-
of memory. And even in the case where the
operating system is compromised and cannot
-
be trusted at all.
A: So basically we have full
-
access to all operating system features to
attack the enclave.
-
M: Yes, exactly
A: OK, that sounds quite powerful
-
M: But there's still one issue. It's
still just executing a program. So we have
-
more power, but we need to make use of
that. And there is this paper called
-
SGX-Step, which gives you more control of
enclaves, and Jo Van Bulck, the author, maybe
-
has time to explain this a bit to us so
maybe we can give him a call.
-
A: Sounds great.
ringing sound
-
M: Hi Jo, this is Moritz. I've seen
the paper of yours, this SGX-Step paper.
-
It might be the thing that we need, but
can you explain a bit what it is about?
-
Jo: Yes, surely Moritz, so SGX-Step I
think in one sentence it's an enclave
-
execution control framework. What I mean
with that is that it allows you to
-
precisely control the execution of the
enclave so that you can interleave it with
-
attacker code, as the name implies, you
would do one step of the enclave, one step
-
of the attacker again one step of the
enclave, one step of the attacker, etc.
-
M: That's perfect.
J: That's the high level.
-
Moritz: Can you expand a bit on the
technical point of view? How do you do
-
that?
J: Yes, I'm very excited about the
-
technical details, Moritz. So let me walk
you through. The first thing you should
-
know about SGX-Step: it's completely open
source and we build it on top of stock
-
Linux environments.
M: Nice
-
J: So what you should start with always
is to load a malicious kernel driver. And
-
this is called the /dev/sgx-step driver.
And from that moment on we kind of export
-
all of the powers of the Linux kernel into
the userspace. And the second component of
-
SGX-step that's important is this small
library operating system that we wrote.
-
It's called libsgxstep and it sits just
alongside of the library alongside in the
-
userspace application. And libsgxstep
allows you to do a number of cool things.
-
I think the most important thing being
that you have direct access to the APIC
-
x86 high resolution timing device. So that
sounds interesting for you, right, Moritz?
-
M: Yeah, but what do you
do with the timer?
-
J: Well, what you can do with the timer
is essentially you can arm it just before
-
you enter the enclave. And what would
happen then is, let's have a look. You arm
-
the timer, you start executing the
enclave, then after a while an interrupt
-
fires and you exit the enclave again.
M: Hmm, so it's like a debugger like
-
GDB, but for enclaves?
J: Yes, it's a... it's exactly that
-
Moritz. It's like an attacker controlled
debugger without using any of the debug
-
features, just using the raw x86
primitives and operating system files. And
-
just as in a debugger, it allows you to do
single stepping. So every instruction will
-
be executed one at a time. At most one at
a time I should say.
-
M: But what happens if I, like,
configure the timer a bit lower? Does it
-
then like start executing an instruction?
J: That's a very good question. And
-
configuring the timer is the tricky thing
about SGX-step. So sometimes we will indeed
-
get what we call a zero step event.
So you will fire the timer before the
-
enclave even had time to execute an
instruction. And those are a kind of event
-
that you can also detect with SGX-step.
There is a trick to detect whether you had
-
a single step or a zero step.
M: Jo, this is perfect. This is
-
exactly what we are looking for. Thank you
so much for explaining that.
-
J: I'm very happy to hear that.
M: I'm looking forward to trying it out
-
now.
J: Go.
-
M: See you hopefully soon.
J: Bye bye.
-
M: Bye!
-
M: So SGX-step to sum it up,
it's an open source Linux kernel
-
framework, and it allows us to configure
the APIC timer interrupts so that we can
-
interrupt the enclave execution to single
and zero step it. And this is perfect
-
because now we can combine it with the
power measurements of Intel RAPL, and this
-
gives us the possibility to measure the
energy consumption of single instructions.
-
Can you try it out Andi?
A: OK, let me dig deeper into that.
-
We have this really slow RAPL interface
here and if you want to visualize it, we
-
could imagine that it's like we have slots
where we can fill the slots with
-
instructions and the RAPL interface gives
us the average power consumption over the
-
slots. So in the default case, when we
execute our target instruction, we have
-
basically one slot filled with the target
instruction and the remaining slots filled
-
with other instructions we don't know. So
basically noise. The best case for us
-
would be if we repeat the target
instruction indefinitely and fill every
-
slot with the target instruction.
M: This is exactly what I did
-
in the experiments in the beginning.
A: Yeah, exactly. That's the reason
-
why we got such good measurements there.
Another trick would be if we only used the
-
target instruction in one slot and fill
the remaining slots with instructions
-
whose energy consumption is known to
us. Then we could
-
do tricks to calculate the energy
consumption of the target instruction.
-
With SGX-step now we can use a hybrid
solution here, where we use SGX-step's
-
zero stepping mechanism to reissue this
instruction and we can fill multiple slots
-
with the same target instruction. The only
drawback here is that we have a noise
-
overhead of SGX-step itself, but this is
probably the best solution we can go with.
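The slot picture can be put into a small arithmetic sketch. It assumes, as a simplification, that one RAPL reading is the plain average over a fixed number of equally weighted slots, and that the filler instructions all cost the same known energy. All numbers are invented.

```python
def window_average(slot_energies):
    """What a single RAPL reading reports in this simplified model."""
    return sum(slot_energies) / len(slot_energies)


def recover_target(avg, n_slots, filler_energy, target_slots=1):
    """Solve avg = (k * target + (n - k) * filler) / n for target."""
    known_part = (n_slots - target_slots) * filler_energy
    return (avg * n_slots - known_part) / target_slots


if __name__ == "__main__":
    # One target instruction among nine known filler instructions.
    single = window_average([7.0] + [2.0] * 9)
    print(single, recover_target(single, 10, 2.0))

    # Zero stepping re-issues the target, so it fills four of ten slots.
    repeated = window_average([7.0] * 4 + [2.0] * 6)
    print(repeated, recover_target(repeated, 10, 2.0, target_slots=4))
```

The more slots the target instruction occupies, the further the reading moves away from the filler baseline, which is why re-issuing it through zero stepping improves the signal.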
-
M: This sounds pretty good, so we
should actually try that out. So we
-
implement a toy cipher, which imitates
square and multiply basically. So we can
-
leave out all the rest, the overhead of a
library that would be used otherwise. And
-
we then just single step every instruction
and measure its energy consumption and
-
then we could plot this. Can you do that?
A: I already got some results here
-
for us. Basically here we use, as you
explained, a toy example for square and
-
multiply. And in both cases the square and
the multiply, they execute exactly six
-
instructions. And so basically we have a
period of six here. And if you look at the
-
results of the measurement here, we can
see that we have patterns that repeat with
-
a period of six and we can see that these
different patterns correspond to either a
-
square or a multiply instruction here.
M: Nice, perfect, but this is just a
-
toy cipher, right? laughter
A: Yeah.
-
M: Can we do like real crypto?
laughter
-
A: We can try. So the plan now is
that we want to attack a real RSA
-
implementation and the real implementation
is not like a toy square and multiply
-
algorithm. The real implementation needs
to handle these huge numbers. So basically
-
there's much more code involved and it's
not feasible to single step every
-
instruction there. So we need a more
clever approach here. If we observe the
-
square multiply part here, we see that the
square and the multiply function uses the
-
AVX optimized memset function. So the
energy consumption should also be more if
-
we execute an AVX instruction because AVX
instructions use much larger registers. So
-
basically we should be able to observe
that.
-
M: Interesting.
A: The only drawback here is that we
-
cannot use the same approach as with the
toy cipher because the square has a
-
different number of instructions than the
square and multiply function. So we need
-
to do a trick here. So to understand what
we did here, our target is that we
-
reconstruct a key bit. And if the key bit
is one we execute a square and multiply.
-
If the key bit is zero, we execute a
square. So to visualize how we execute
-
zero and single stepping, we have to dig
into the assembler a bit. So to test for
-
the key bit, we execute like a test
instruction and then we execute a
-
conditional jump. And if we execute the
square and multiply we have for instance,
-
K instructions. And if we execute the
square we have for instance L
-
instructions. So we can see that these two
numbers do not add up. They are different.
-
So we cannot simply measure each Kth
instruction and get the key out. So we
-
need to do something different here. We
can number the instructions after the jump
-
instruction and then use single stepping
to single step to the Nth instruction
-
after the jump instruction. And on the
left side, if we observe a one, we then hit
-
exactly the AVX instruction there, used in
the AVX memset. And if you then use our
-
measurement framework to measure exactly
the nth instruction after the jump, we
-
observe on the one hand a high energy
consumption and on the other hand, we
-
observe low energy consumption if the
branch was not taken or a zero.
-
M: It's very clever.
A: So if you measured both
-
instructions here, we can then combine
these energy measurements and then use a
-
simple threshold to reconstruct the key
bit in the beginning. And then we do this
-
iteratively for each key bit.
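The thresholding step Andreas describes can be sketched as follows. The measurements are simulated: the energy levels for "AVX instruction hit" and "other branch", the noise, and the threshold are hypothetical values, and three samples per key bit are used because that is the number mentioned in the talk.

```python
import random
from statistics import mean


def simulate_measurements(key_bits, samples=3, high=1.30, low=1.00,
                          noise=0.05, seed=7):
    """Simulated energy of the Nth instruction after the jump, per key bit.

    A 1 bit lands on the AVX instruction (high energy); a 0 bit lands on an
    ordinary instruction in the other branch (low energy).
    """
    rng = random.Random(seed)
    return [[rng.gauss(high if bit else low, noise) for _ in range(samples)]
            for bit in key_bits]


def reconstruct_key_bits(measurements, threshold):
    """Combine the samples of each key bit and apply a simple threshold."""
    return [1 if mean(samples) > threshold else 0 for samples in measurements]


if __name__ == "__main__":
    key = [1, 0, 0, 1, 1, 0, 1, 0] * 8  # 64 example key bits
    measured = simulate_measurements(key)
    print(reconstruct_key_bits(measured, threshold=1.15) == key)
```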
M: This sounds pretty promising, but
-
did you try it out?
laughter
-
A: Sure. Here are the results of that.
And we can clearly see that we have
-
different energy consumption or in this
case voltage
-
applause
based on if the
-
AVX instruction is executed or if the
instruction at the same offset in the
-
other branch is executed.
M: How fast does this work, does this
-
take like 5 days?
A: Not quite that long. We have one
-
problem here that the time per key bit
increases the further or later the key bit
-
is in the key. So basically the first key
bit we can reconstruct very fast, but for
-
the last key bit, we need to single step
much further in the code to actually reach
-
it. And this adds up. So basically the
time increases linearly across the key
-
bits. But for our test key here,
with 512 bits, it takes us about 3.5
-
hours to reconstruct a complete key. Note
here that we spent like 52 minutes
-
only to find the target instruction. So
basically, if we could optimize that, the
-
attack would be much faster. In addition,
we had to record like 3 samples per key
-
bit. But with the implementation, it
should be possible to actually do that
-
with 1 sample. And since we then only need
one sample per key bit, we actually can do
-
it with a single trace attack. But we did
not try that out, unfortunately.
-
Moritz: Quite fast.
Michael: So while all this sounded quite
-
easy and straightforward in hindsight,
this was actually a really long process.
-
Starting at the beginning of 2017 when we
discovered this interface, the RAPL
-
interface. Then we had to come up with a
title for this talk, of course, laughter
-
and some lyrics for a song. We had the
first toy attack on RSA at the end of
-
2017. It took us until 2018 to finally get
a KASLR break that was working, and only
-
by the end of 2019, after Andreas
did his master's thesis on that, were we
-
able to produce a full attack on RSA. And
this is also the time when we submitted
-
that as a paper to a conference and
disclosed that to the CPU vendors so that
-
they can fix that. And this is also the
start of the embargo. This embargo for
-
this vulnerability lasted almost one year.
So from November 2019 to November 2020. It
-
was just a few weeks ago that this embargo
ended here.
-
Moritz: But there's one thing missing. We
really wanted to do crypto attacks, but
-
not only with SGX-step as a compromised
operating system, but also from userspace.
-
But as we've seen, it's so difficult to
measure parts of the code without having
-
SGX-step. But what we can do is we can
measure the power consumption of the
-
overall execution of an algorithm and
that is where correlation power analysis comes in
-
handy. There, what we do is build a
power consumption model of our device. As
-
we've heard earlier, the Hamming Weight is
the number of bits that is set in an
-
operand or in the data. And we assume that
if a bit is set, the computer takes more
-
power to process it. In addition, what you
can use as a different model is the
-
Hamming distance. So from one operation to
the other, how many bits change? And then
-
we assume the more bits change, the more
power is consumed. And we really want to
-
try that out. So what we are targeting now
is AES-NI, a side channel resistant
-
instruction set extension from Intel. And we target it
in a scenario where we can trigger the
-
encryption and decryption of many, many
blocks over a long time so that the
-
execution time is longer than the RAPL
update rate, so that we can really see the
-
power consumption in our measurement. And
this is used, for instance, in disk
-
encryption or decryption or if you seal or
unseal the SGX enclave state. And we can
-
now do that and record power measurements
in different scenarios, right?
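The two power models Moritz mentions, Hamming weight and Hamming distance, are short enough to write down directly. This Python sketch assumes non-negative integers as operands.

```python
def hamming_weight(value: int) -> int:
    """Number of bits set in a non-negative integer."""
    return bin(value).count("1")


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions that change between two values."""
    return hamming_weight(a ^ b)


if __name__ == "__main__":
    print(hamming_weight(0b1011))            # three bits set
    print(hamming_distance(0b1100, 0b1010))  # two bits differ
```

Under the Hamming weight model, a value with more set bits is predicted to cost more power; under the Hamming distance model, the prediction depends on how many bits flip from one operation to the next.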
-
Andreas: Sure, we can try that. So in our
experiment, we recorded two million traces
-
over 26 hours for the SGX environment. But we
also tried to reconstruct it without SGX
-
where we used the encryption inside a
kernel module. And there we recorded
-
4 million traces in 50 hours. And to
understand the attack here, we have to
-
look at this animation. So basically we
have our computer, where a secret key is
-
stored somewhere internally. Then we use this
key to encrypt some messages and we also
-
have the power consumption here. And what
we now did is we recorded the encrypted
-
message and the power consumption it took
to encrypt this message for many messages.
-
And then we use a model of the CPU here to
predict the energy consumption, to
-
reconstruct the key. The key is usually
split up into parts, where each of the
-
parts can have a value between 0 and 255.
So to reconstruct the key here, we simply
-
use our measurements and the model and we
try out one of the key parts and estimate
-
the energy consumption for the key part.
And then we store the correlation between
-
the recorded power measurements and the prediction.
And we do this for each of the possible
-
key values. And once we have found the key
value with the highest correlation, we know
-
that this key value corresponds to the key
part of the key. And we then simply repeat
-
the process for each of the parts of the
key until we get the final key.
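The guess-and-correlate loop for a single key byte can be sketched in Python. This is a toy, not the AES-NI attack from the talk: the leaking intermediate is assumed to be just a known data byte XORed with the key byte (no S-box, no real cipher), the leakage follows the Hamming weight model, and the traces are simulated with invented noise.

```python
import math
import random


def hamming_weight(value):
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def recover_key_byte(data_bytes, energies):
    """Try all 256 values and keep the one whose prediction correlates best."""
    best_guess, best_corr = 0, -2.0
    for guess in range(256):
        predicted = [hamming_weight(d ^ guess) for d in data_bytes]
        corr = pearson(predicted, energies)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess, best_corr


def simulate_traces(key_byte, count=1000, noise=1.0, seed=3):
    """Hypothetical measurements: Hamming weight leakage plus Gaussian noise."""
    rng = random.Random(seed)
    data = [rng.randrange(256) for _ in range(count)]
    energies = [hamming_weight(d ^ key_byte) + rng.gauss(0.0, noise)
                for d in data]
    return data, energies


if __name__ == "__main__":
    data, energies = simulate_traces(0x3C)
    guess, corr = recover_key_byte(data, energies)
    print(hex(guess), round(corr, 3))
```

Repeating `recover_key_byte` for each byte position gives the full key. With real RAPL measurements the noise is far larger, which is why the talk speaks of millions of traces rather than a thousand.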
-
M: And we actually tried that out. So
here in our demo video, you see on the
-
left where we test all the combinations
and see what is the most likely key
-
candidate at the moment, while for a
single key byte on the right, you see
-
every possible value and the correlation.
So in the beginning, with not that many
-
traces processed, it's not very clear
which key candidate is the right one,
-
because there's so much measurement noise
introduced by measuring over the overall
-
execution time. But over time, this signal
gets more stable and we see on the right
-
with the peak getting more and more
distance from the other candidates that
-
this is our correct key byte. And we do
this, as Andreas said, for every possible
-
key byte with every possible value. So in
the end, we end up with the correct key.
-
applause
A: OK, but this seems like it's only
-
Intel CPUs. Does this also affect others?
M: Yes. So actually, we also looked
-
at other CPU vendors to see if they have
similar interfaces. And for instance, AMD
-
is affected as well. But we never really
heard back from them after our disclosure.
-
And the patch that tries to solve the
problem with the driver is similar to the
-
one that Intel has.
A: You're right, Moritz, it actually
-
works. So I tried the same code on AMD.
The one you showed before was
-
distinguishing operands, and that also
works on AMD. That's pretty nice. It's not
-
an Intel only issue. It also affects at
least AMD as well.
-
M: Yes, but actually there are many
other vendors as well that provide
-
interfaces, even some of them unprivileged
to user space where you could probably
-
mount similar attacks. For instance,
Nvidia, IBM, or Marvell and Ampere.
-
A: So this is really an industry
wide problem here. And we've also seen
-
that from the media coverage. So not only
German news wrote about that, like Heise
-
or Golem, but it also went more
international with ZDNET, Ars Technica,
-
CSO, Tech Radar, Computer Weekly and many,
many others that wrote about this new type
-
of vulnerability that affects many
computers out there. And I guess if it
-
affects many computers, we should do
something against that.
-
M: Yes, you're right. We cannot only
have an attack and no mitigation against
-
it. This would not be right. And indeed,
it's quite easy to fix that because we
-
said in the beginning, you have
unprivileged access to those registers. So
-
we just restrict the access. And we are
done, and this is exactly a one line patch
-
for the Linux kernel. But as we've seen
with the threat model of Intel SGX, which
-
allows a compromised operating system,
this one-line patch does not help there
-
because I'm the operating system, I can do
whatever I want to. We need more and more
-
complex mitigations. And in this case,
microcode updates are necessary. And what
-
Intel does is to fall back to the model of
the energy consumption. So they have an
-
internal model. How much energy is
consumed by an executed instruction and
-
use that instead of the real measurement.
And this no longer allows distinguishing
-
data and operands from each other.
So if your implementation is implemented
-
correctly, if you use constant time, then
you are mitigated and protected against
-
these attacks. And as we see here in the
plot, we tried the mitigation out. So on
-
the left, we were able to see differences
depending on the Hamming weight of the
-
operands. And on the right with the
mitigation in place, it just does not work
-
anymore and you cannot see any
differences. applause
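-
The idea behind this mitigation can be illustrated with a toy simulation. Everything here is assumed for illustration: the per-instruction costs in `MODEL_COST`, the data-dependent factor `DATA_FACTOR`, and the function names are hypothetical and do not reflect Intel's actual model or microcode. The sketch only shows why a reading derived from a per-instruction model no longer depends on the Hamming weight of the operand.

```python
import random
from statistics import mean


def hamming_weight(value):
    return bin(value).count("1")


# Hypothetical per-instruction baseline costs in arbitrary units.
MODEL_COST = {"imul": 3.0, "xor": 1.0}
# Assumed extra energy per set operand bit in a real measurement.
DATA_FACTOR = 0.05


def measured_energy(instr, operand, rng):
    """Unmitigated reading: depends on the operand's Hamming weight."""
    return MODEL_COST[instr] + DATA_FACTOR * hamming_weight(operand) + rng.gauss(0.0, 0.01)


def modeled_energy(instr, operand, rng):
    """Mitigated reading: taken from the model, the operand is ignored."""
    return MODEL_COST[instr]


def hamming_gap(report, rng, n=2000):
    """Mean reading for operand 0xFF minus mean reading for operand 0x00."""
    high = mean(report("imul", 0xFF, rng) for _ in range(n))
    low = mean(report("imul", 0x00, rng) for _ in range(n))
    return high - low


rng = random.Random(7)
measured_gap = hamming_gap(measured_energy, rng)
modeled_gap = hamming_gap(modeled_energy, rng)
print(f"gap without mitigation: {measured_gap:.3f}")
print(f"gap with model-based reporting: {modeled_gap:.3f}")
```

In this toy setup, the gap between all-ones and all-zeros operands is visible in the measured readings and disappears in the modeled ones. Instruction-dependent differences remain, which is why constant-time implementations are still required.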
-
Andreas: Nice. So you really
can't read her power trace any more.
-
Music: Pokerface by Lady Gaga
-
sings
I wanna probe 'em like in 1943
-
touch 'em, measure wattage
correlate and get the key
-
I probe it
-
Oscilloscopes are not the same
without a probe
-
And babe, if it's remote if it's not code,
it cannot run
-
I'll let him plot, let's see what he's got
-
I'll let him plot, let's see what he's got
-
Can't read my, can't read my
-
No he can't read my power trace
-
She's got the countermeasure
-
Can't read my, can't read my
-
No he can't read my power trace
-
She's got the countermeasure
-
P-p-p-power trace, p-p-power trace
-
P-p-p-power trace, p-p-power trace
-
P-p-p-power trace, p-p-power trace
-
P-p-p-power trace, p-p-power trace
-
applause
-
Moritz: With all those nasty songs, we
-
wrote them down in a scientific paper and
the PLATYPUS paper has been accepted
-
recently at a conference. And we also want
to thank all the other coauthors who
-
are not in this talk, like David Oswald,
Catherine Easdon and Claudio Canella. To
-
sum it up, what we have seen is that with
power side-channel attacks, you can even
-
exploit them from software. So there is no
need to attach an oscilloscope on modern
-
Intel CPUs.
-
Michael: And what we've also seen is
that since the SGX threat model allows for
-
much more capable attackers, mitigating
power side-channel attacks on the SGX
-
enclaves is much more work than simple
software patches.
-
Andreas: Yes, and that concludes
-
our talk on PLATYPUS. Thank you all for
listening.
-
Applause and Music
-
Herald: Thank you very much for your
excuse me, nerdy talk and thank Moritz,
-
Michael, Daniel and Andreas. We head over
to our Q&A session and the first question
-
would be, how did you come up with this,
let's say, through-the-back-door idea of
-
attacking the CPU? You mentioned that
you attacked RSA through a
-
power driver. Could you tell me a
little bit more about that?
-
Moritz: Yes. So the basic idea of
attacking cryptographic algorithms with
-
power side channel attacks is not very new.
This was like one of the first things
-
researchers have shown, but most of the
time for like smaller devices, like smart
-
cards, like your bank card, for instance.
And for those attacks, you usually had
-
like an oscilloscope that you needed to
attach to the device to do the attack. But
-
with modern processors, they have
basically an oscilloscope built into the
-
processor, which you can read out as the
operating system. And in our case, there
-
are like drivers that expose this
interface, also to userspace. So from
-
there as an unprivileged attacker, you can
then try to exploit that. And yeah
-
basically the best thing that we wanted to
achieve with those attacks is to attack
-
cryptographic algorithms and not to
transmit some data between two processes.
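-
Reading the interface from user space can be sketched as follows. This assumes a Linux system that exposes the powercap sysfs file `/sys/class/powercap/intel-rapl:0/energy_uj`; on patched kernels the file is readable only by root, and on other machines it may not exist at all, so the helper returns `None` in those cases. The helper names `read_energy_uj` and `energy_of` are hypothetical.

```python
from pathlib import Path

# Package-domain energy counter in microjoules (Linux powercap interface).
# Assumption: this path exists on the machine; it is absent on many systems.
RAPL_ENERGY = Path("/sys/class/powercap/intel-rapl:0/energy_uj")


def read_energy_uj(path=RAPL_ENERGY):
    """Return the current counter value, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def energy_of(workload, path=RAPL_ENERGY):
    """Return the counter difference across workload(), or None if unavailable.

    The counter wraps around eventually; a negative difference is treated
    as an invalid sample in this sketch rather than being corrected.
    """
    before = read_energy_uj(path)
    workload()
    after = read_energy_uj(path)
    if before is None or after is None or after < before:
        return None
    return after - before


used = energy_of(lambda: sum(range(1_000_000)))
if used is None:
    print("energy counter not readable on this system")
else:
    print(f"workload consumed about {used} microjoules")
```

A single reading like this is very noisy; the attacks in the talk rely on collecting many such samples and processing them statistically.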
-
Herald: Cool, thank you. Our next
question, you mentioned a little bit about
-
ARM sorry, AMD, how about ARM? So not x86
architecture?
-
Moritz: So there are many other vendors
that have similar interfaces, some of them
-
also provide drivers that expose them
directly to userspace, but we hardly had
-
any access to those devices, so we could
not really fully evaluate if these attacks
-
are also possible on them. But in the
paper, we have an appendix where we
-
describe them in a bit more detail so you
can try it out on your own and let us know
-
if it works.
Herald: Cool. Thank you. So please, fellow
-
hackers, try it out at your system, at
home. Now, our next question is related to
-
that. Is there a survey which hardware has
the RAPL or similar weaknesses? Intel,
-
AMD, ARM even.
Moritz: I don't know if anyone else wants
-
to answer that, I can also take the
question. So the RAPL interface itself
-
comes from Intel, but a similar interface
is also implemented for AMD, and they also
-
use basically the same name. They have
a... For now, it's implemented in two ways
-
for the Linux kernel, also in the RAPL
driver, but also in a separate driver called the AMD
-
Energy Driver, which has been included for a
few months in the Linux kernel, in the
-
upstream Kernel. And for other vendors it
works a bit differently. So some of them
-
just give you similar measurements, but
not in a way closely related to the RAPL
-
interface: they measure over a period of
time and give you the average.
-
Herald: OK, and..
Michael: Maybe to add one point here: On
-
Intel, basically the high resolution
sensors are included since the Skylake
-
micro architecture. So something around
2015.
-
Herald: I see. We have another related
question to AMD. So did AMD issue any
-
Microcode update for the secure encrypted
virtual machines case apart from
-
restricting access to MSR?
Moritz: Not as far as we know. But from
-
our knowledge, to attack AMD CPUs, we need
to wait for a new generation so that we
-
can do similar attacks under a similar
threat model as we can on Intel.
-
Herald: Cool, thank you. So another I
think this is also related to it, you
-
mentioned your Xen example where you
attack through a hypervisor. Does it work
-
on other hypervisors like KVM or Hyper-V as
well?
-
Moritz: So for KVM, I don't think so. For
Windows I also don't know; I don't think
-
they expose those MSRs directly to the
virtual machines. So the issue is really
-
here that we can have access to those MSRs
in the virtual machine, where we should not
-
have access.
Herald: OK, we have another question from,
-
I think, the hardware section of our
remote Congress. Someone wonders if the
-
same could be achieved with external power
measurement.
-
Moritz: You mean if you could attach
actually an oscilloscope or a different
-
probe to the CPU? Yes, you can do that.
And it has already been demonstrated in
-
the past.
Michael: But it turned out with external
-
tools, it takes even longer than with
software. You have more issues finding the
-
right spot for measuring. And there is one
paper where it took 14 days of collecting
-
traces which are harder to probe, which is
much longer than in software. But it can
-
be done.
Herald: And there's another follow up
-
question, how external is external? Where
do you measure power consumptions of an
-
x86 server?
Moritz: OK, you would need to get physical
-
access to the data center, I guess. And if
this is in your threat model, you probably
-
have different things to worry about.
Michael: Yeah, you still need to find the
-
right spot on your mainboard.
Herald: OK, so are there, let's say
-
documentation on where to find that right
spot.
-
Moritz: I think one can take a look at
other research papers where they attached
-
a probe, I think there are experts out
there, but I don't know.
-
Herald: OK, thank you. The next question,
why is the power information exported in
-
such detail to the kernel or userspace
software? Why isn't it only available to
-
the firmware or filtered to return an
average, for example, one second power
-
trace?
Moritz: Good question. We did not
-
implement that. I think the reason is...
Andi?
-
Andreas: The one-second power trace would
make the attack only slower because you
-
can still do exactly what we did with
single stepping here, because RAPL is
-
already very slow and we need a mechanism
to replay instructions to get a good
-
reading of the energy consumption of the
instructions. So if you only increase the
-
update interval there, the attacks would still
be possible, but only take longer to
-
record the traces there. So you have to...
Yeah. So you have to find a tradeoff
-
between your countermeasures there.
Herald: Okay, so let's say with an
-
average, your resolution is lower, but
still it just takes more time to record
-
it. And still it does work, right?
Moritz: Yes. And the other thing is that
-
one needs to keep in mind those drivers
are not written with security in mind, but
-
for performance so that this can be used
by other tools that like give you the best
-
performance of your CPU. And in that case,
it just has not been masked and you get
-
the value directly as the operating system
sees it.
-
Herald: Crazy. Our second to last
question, how long is the update interval
-
for this measurement? I heard something
about...
-
Andreas: For the fastest register we
observed, it's like 10 microseconds, for
-
the slowest one... So there are different
domains where you measure only parts of
-
the CPU and for the whole package, this
includes all the cores and the memory
-
controller, it takes around one
millisecond there. So this is already very
-
slow, if you compare it to the frequency
where CPUs are currently running at.
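-
To put these update intervals in perspective, the following arithmetic sketch converts them into clock cycles. The 3 GHz clock is an assumed, typical value and not a figure from the talk; the intervals are the approximate ones mentioned above.

```python
def cycles_per_update(clock_hz, interval_ns):
    """Number of clock cycles that elapse during one counter update interval."""
    return clock_hz * interval_ns // 1_000_000_000


CLOCK_HZ = 3_000_000_000   # assumed 3 GHz core clock
FASTEST_NS = 10_000        # roughly 10 microseconds (fastest register observed)
PACKAGE_NS = 1_000_000     # roughly 1 millisecond (whole-package domain)

fast_cycles = cycles_per_update(CLOCK_HZ, FASTEST_NS)
package_cycles = cycles_per_update(CLOCK_HZ, PACKAGE_NS)
print(f"fastest register: one update per {fast_cycles} cycles")
print(f"package domain:   one update per {package_cycles} cycles")
```

Under these assumptions, tens of thousands to millions of cycles pass between two readings, which is why the attack has to replay or single-step instructions to attribute energy to them.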
-
Herald: Crazy. In this case, are there any
other questions from the interwebs, from
-
Twitter, from our IRC channel? Because
otherwise we would head over to more,
-
let's say, personal interview. Let's give
them a try.
-
In this case, no more
-
questions. So, again, thank
you, Moritz, Michael, Daniel and Andreas,
-
for this really interesting talk and
for this Q&A session. The Internet tells
-
me no questions. We head over to our
personal interview. I asked you earlier
-
before our talk. So with all these, let's
say, research things going on in the
-
Corona time. So what's your personal
experience? What changed in your work life
-
balance in the last year?
Moritz: I think the biggest change is that
-
most of the coffee breaks you do alone
instead of with the colleagues.
-
Herald: So how do you meet in your,
let's say, lunch break? Do you have
-
a lunch break breakout session in Jitsi as well?
Moritz: Yeah, we started with Jitsi, but
-
used different systems along the way.
And now it's like a fixed coffee meeting
-
at 2:00 p.m. every day and try to meet
everyone or have individual meetings, of
-
course.
Herald: And does this work? So is
-
everyone on time? So sharp 12?
Moritz: No, but I think no one really
-
cares.
Herald: So it's just for socializing?
-
Moritz: Yes. But we also discuss work
related issues also in separate meetings.
-
And yeah, I think time is different, but
you get used to it. But let's hope it's
-
over soon.
Herald: What about the others, Michael?
-
Michael: Yes, I'm in the same coffee
breaks as Moritz. Sometimes every day,
-
depends on the workload, so I feel quite
lucky that we can still work full time and
-
get our work done. And I don't have to
fear that we lose our jobs in the
-
short term. So I think that takes a lot of
pressure off. But, yeah, I mean, it's
-
different. I'm also missing the
conferences, so I used to travel around a
-
lot before Corona times and this year is
basically nothing. So you really miss the
-
social interactions and conferences,
meeting other researchers, exchanging
-
ideas, having that online is different and
just not the same, but still it works. So
-
I can still do a lot of research. The
positive thing, you have fewer
-
interruptions than when you're in the
office. So that's a positive thing. But
-
yeah, I also hope that it's over soon.
Daniel: But then again, on the other side,
-
you have way more conference calls because
instead of writing emails, people ask for
-
conference calls all the time.
Michael: Yes, you are in meetings all the
-
time.
Herald: Yeah, Daniel you mentioned earlier
-
your, let's say, flight plan of the last
year. And as far as I understood it, you
-
like to be in personal contact with your
colleagues, also from others or from
-
foreign countries. How does this work? So
let's say topic exchange between different
-
organizations, between different
countries?
-
Daniel: Yeah, it's more difficult. So in
2018, I had these 54 talks outside of Graz
-
in 52 weeks and this year I had a single
talk outside of, outside of Graz where I
-
was in person of course. Of course more
Online. Um yeah. So it's, it's difficult
-
to engage with people from other places,
but it works of course in teams that you,
-
that you already have established in the
past, for instance. So you can continue in
-
teams that you've already built there. But
also in some cases it works to start new
-
collaborations. But it's of course more
difficult than if you can just meet people
-
in person like we did for this paper
actually, David Oswald, one of the
-
coauthors, we met with him in person and
talked with him about the paper in person.
-
Herald: Andreas, what's your, let's say,
Corona year?
-
Andreas: Yeah, since I'm one of the
persons who was interrupting Michael all
-
the time I am missing the office because
it looks like the unscheduled flow,
-
because it's sitting in an office and
suddenly you have like a question or idea,
-
you can not or you don't have to write it.
You can just ask it on the fly. So I'm a
-
bit missing that side. On the other side,
I gained a lot of time since I don't have
-
to travel to work there. And often I got a
bit better in writing stuff I want to
-
know, asking questions much
faster, like losing the clover and that
-
stuff. And so I think it's both positive
and negative. And I only joined since I
-
think August, when I finished my master's
thesis and in the first half of the year,
-
I worked at a software company where the
first lockdown was also handled very well.
-
So we had like a smooth transition. So I'm
kind of used to home office, but I miss
-
interacting with people.
Herald: I think that's the main thing 2020
-
brings us: more remote work. Which is
basically a good thing to work more from
-
home, but we have some minutes left. And
please excuse me. Did your mate
-
consumption increase or decrease?
Moritz: I think it's hard to say for
-
coffee because I used to drink more coffee
in the office than at home. Yeah, but but
-
now I see it when we go grocery shopping.
laughs It's hard to say.
-
Michael: I think it decreased for me
because now if I'm tired, I can simply
-
take a nap, that's easier.
Herald: And just turn your instant
-
messaging off.
Michael: Yeah.
-
Herald: So our time is over. Thank you
again for the brilliant, amazing
-
work, for this attack against CPUs, for
the great puns you brought, for the nice
-
interview and have a nice remote Congress
3.
-
postroll music
-
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!