36C3 - Practical Cache Attacks from the Network and Bad Cat Puns

  • 0:00 - 0:20
    36C3 preroll music
  • 0:20 - 0:26
    Herald: So, our next talk is practical
    cache attacks from the network. And the
  • 0:26 - 0:34
    speaker, Michael Kurth, is the person who
    discovered the attack; it’s the first
  • 0:34 - 0:43
    attack of its type. So he’s the first
    author of the paper. And this talk is
  • 0:43 - 0:47
    going to be amazing! We’ve also been
    promised a lot of bad cat puns, so I’m
  • 0:47 - 0:53
    going to hold you to that. A round of
    applause for Michael Kurth!
  • 0:53 - 0:59
    applause
  • 0:59 - 1:04
    Michael: Hey everyone and thank you so
    much for making it to my talk tonight. My
  • 1:04 - 1:09
    name is Michael and I want to share with
    you the research that I was able to
  • 1:09 - 1:16
    conduct at the amazing VUSec group during
    my master’s thesis. Briefly about myself: So I
  • 1:16 - 1:20
    pursued my master’s degree in Computer
    Science at ETH Zürich and could do my
  • 1:20 - 1:28
    Master’s thesis in Amsterdam. Nowadays, I
    work as a security analyst at InfoGuard.
  • 1:28 - 1:33
    So what you see here are the people that
    actually made this research possible.
  • 1:33 - 1:38
    These are my supervisors and research
    colleagues which supported me all the way
  • 1:38 - 1:44
    along and put so much time and effort in
    the research. So these are the true
  • 1:44 - 1:51
    rockstars behind this research. So, but
    let’s start with cache attacks. So, cache
  • 1:51 - 1:57
    attacks are previously known to be local
    code execution attacks. So, for example,
  • 1:57 - 2:04
    in a cloud setting here on the left-hand
    side, we have two VMs that basically share
  • 2:04 - 2:10
    the hardware. So they’re time-sharing the
    CPU and the cache and therefore an
  • 2:10 - 2:18
    attacker that controls VM2 can actually
    attack VM1 via cache attack. Similarly,
  • 2:18 - 2:23
    JavaScript. So, a malicious JavaScript
    gets served to your browser which then
  • 2:23 - 2:28
    executes it and because you share the
    resource on your computer, it can also
  • 2:28 - 2:33
    attack other processes. Well, this
    JavaScript thing gives you the feeling of
  • 2:33 - 2:39
    a remoteness, right? But still, it
    requires this JavaScript to be executed on
  • 2:39 - 2:46
    your machine to be actually effective. So
    we wanted to really push this further and
  • 2:46 - 2:54
    have a true network cache attack. We have
    this basic setting where a client does SSH
  • 2:54 - 3:01
    to a server and we have a third machine
    that is controlled by the attacker. And as I
  • 3:01 - 3:08
    will show you today, we can break the
    confidentiality of this SSH session from
  • 3:08 - 3:13
    the third machine without any malicious
    software running either on the client or
  • 3:13 - 3:21
    the server. Furthermore, the CPU on the
    server is not even involved in any of
  • 3:21 - 3:25
    these cache attacks. So it’s just there
    and not even noticing that we actually
  • 3:25 - 3:35
    leak secrets. So, let’s look a bit more
    closely. So, we have this nice cat doing
  • 3:35 - 3:41
    an SSH session to the server and every time
    the cat presses a key, one packet gets
  • 3:41 - 3:50
    sent to the server. So this is always true
    for interactive SSH sessions. Because, as
  • 3:50 - 3:57
    it’s said in the name, it gives you this
    feeling of interactiveness. When we look a
  • 3:57 - 4:01
    bit more under the hood what’s happening
    on the server, we see that these packets
  • 4:01 - 4:07
    are actually activating the Last Level
    Cache. More on that later in the
  • 4:07 - 4:13
    talk. Now, the attacker at the same time
    launches a remote cache attack on the Last
  • 4:13 - 4:19
    Level Cache by just sending network
    packets. And by this, we can actually leak
  • 4:19 - 4:28
    arrival times of individual SSH packets.
    Now, you might ask yourself: “How would
  • 4:28 - 4:37
    arrival times of SSH packets break the
    confidentiality of my SSH session?” Well,
  • 4:37 - 4:43
    humans have distinct typing patterns. And
    here we see an example of a user typing
  • 4:43 - 4:50
    the word “because”. And you see that
    typing e right after b is faster than for
  • 4:50 - 4:57
    example c after e. And this can be
    generalised. And we can use this to launch
  • 4:57 - 5:04
    a statistical analysis. So here on the
    orange dots, if we’re able to reconstruct
  • 5:04 - 5:11
    these arrival times correctly—and what
    correctly means: we can reconstruct the
  • 5:11 - 5:16
    exact times of when the user was typing—,
    we can then launch this statistical
  • 5:16 - 5:23
    analysis on the inter-arrival timings. And
    therefore, we can leak what you were
  • 5:23 - 5:30
    typing in your private SSH session. Sounds
    very scary and futuristic, but I will
  • 5:30 - 5:37
    demystify this during my talk. So,
    alright! There is something I want to
  • 5:37 - 5:43
    bring up right here at the beginning: As
    per tradition and the ease of writing, you
  • 5:43 - 5:48
    give a name to your paper. And if you’re
    following InfoSec twitter closely, you
  • 5:48 - 5:54
    probably already know what I’m talking
    about. Because in our case, we named our
  • 5:54 - 6:01
    paper NetCAT. Well, of course, it was a
    pun. In our case, NetCAT stands for
  • 6:01 - 6:09
    “Network Cache Attack,” and as it is with
    humour, it can backfire sometimes. And in
  • 6:09 - 6:18
    our case, it backfired massively. And with
    that we caused like a small twitter drama
  • 6:18 - 6:24
    this September. One of the most-liked
    tweets about this research was the one
  • 6:24 - 6:33
    from Jake. These talks are great, because
    you can put the face to such tweets and
  • 6:33 - 6:43
    yes: I’m this idiot. So let’s fix this!
    Intel acknowledged us with a bounty and
  • 6:43 - 6:49
    also a CVE number, so from now on, we
    can just refer to it by the CVE number. Or
  • 6:49 - 6:54
    if that is inconvenient to you, during
    that twitter drama, somebody sent us like
  • 6:54 - 7:00
    a nice little alternative name and also
    including a logo which actually I quite
  • 7:00 - 7:09
    like. It’s called NeoCAT. Anyway, lessons
    learned on that whole naming thing. And
  • 7:09 - 7:15
    so, let’s move on. Let’s get back to the
    actual interesting bits and pieces of our
  • 7:15 - 7:22
    research! So, a quick outline: I’m firstly
    going to talk about the background, so
  • 7:22 - 7:28
    general cache attacks. Then DDIO and RDMA
    which are the key technologies that we
  • 7:28 - 7:34
    were abusing for our remote cache attack.
    Then about the attack itself, how we
  • 7:34 - 7:42
    reverse-engineered DDIO, the End-to-End
    attack, and, of course, a small demo. So,
  • 7:42 - 7:47
    cache attacks are all about observing a
    microarchitectural state which should be
  • 7:47 - 7:53
    hidden from software. And we do this by
    leveraging shared resources to leak
  • 7:53 - 8:00
    information. An analogy here is: Safe
    cracking with a stethoscope, where the
  • 8:00 - 8:06
    shared resource is actually air that just
    transmits the sound noises from the lock
  • 8:06 - 8:12
    on different inputs that you’re doing. And
    actually works quite similarly in
  • 8:12 - 8:22
    computers. But here, it’s just the cache.
    So, caches solve the problem that latency
  • 8:22 - 8:28
    of loads from memory is really bad,
    right? Which make up roughly a quarter of
  • 8:28 - 8:34
    all instructions. And with caches, we can
    reuse specific data and also use spatial
  • 8:34 - 8:42
    locality in programs. Modern CPUs have
    usually this 3-layer cache hierarchy: L1,
  • 8:42 - 8:47
    which is split between data and
    instruction cache. L2, and then L3, which
  • 8:47 - 8:54
    is shared amongst the cores. If data that
    you access is already in the cache, that
  • 8:54 - 8:59
    results in a cache hit. And if it has to
    be fetched from main memory, that’s
  • 8:59 - 9:06
    considered a cache miss. So, how do we
    actually know now if a cache hits or
  • 9:06 - 9:12
    misses? Because we cannot actually read
    data directly from the caches. We can do
  • 9:12 - 9:16
    this, for example, with prime and probe.
    It’s a well-known technique that we
  • 9:16 - 9:21
    actually also used in the network setting.
    So I want to quickly go through what’s
  • 9:21 - 9:26
    actually happening. So the first step of
    prime+probe is that the attacker brings the
  • 9:26 - 9:34
    cache to a known state. Basically priming
    the cache. So it fills it with its own
  • 9:34 - 9:42
    data and then the attacker waits until the
    victim accesses it. The last step is then
  • 9:42 - 9:49
    probing which is basically doing priming
    again, but this time just timing the
  • 9:49 - 9:56
    access times. So, fast accesses, that is cache hits,
    mean that the cache was not touched
  • 9:49 - 9:56
    in-between. And cache misses mean
    that we now know that the victim
  • 10:03 - 10:10
    actually accessed one of the cache lines
    in the time between prime and probe. So
  • 10:10 - 10:16
    what can we do with these cache hits and
    misses now? Well: We can analyse them! And
  • 10:16 - 10:21
    this timing information tells us a lot
    about the behaviour of programs and users.
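
A minimal sketch of the prime+probe steps described above, written as a toy simulation in Python. It assumes a single cache set with LRU replacement and observes hits and misses directly, whereas a real attacker can only infer them from access times. The class and function names (`CacheSet`, `prime_probe`) are hypothetical and are not taken from the talk or the paper.

```python
from collections import OrderedDict


class CacheSet:
    """One set of a set-associative cache with LRU replacement (toy model)."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> None, least recently used first

    def access(self, tag):
        """Return True on a cache hit, False on a miss (the line is then filled)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[tag] = None
        return False


def prime_probe(victim_accesses, ways=4):
    """Run one prime / wait / probe round and return the number of probe misses."""
    cache_set = CacheSet(ways)
    attacker_lines = [f"attacker-{i}" for i in range(ways)]

    # Prime: fill the whole set with the attacker's own lines.
    for tag in attacker_lines:
        cache_set.access(tag)

    # Wait: the victim may or may not touch the same set.
    if victim_accesses:
        cache_set.access("victim")

    # Probe: re-access the attacker's lines and count the misses.
    return sum(1 for tag in attacker_lines if not cache_set.access(tag))


print("misses without victim activity:", prime_probe(victim_accesses=False))
print("misses with victim activity:   ", prime_probe(victim_accesses=True))
```

Without victim activity the probe sees no misses; with victim activity at least one attacker line has been evicted, so the probe sees misses. In a real attack that difference shows up only as slower accesses.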
  • 10:21 - 10:29
    And based on cache hits and misses alone,
    we can—or researchers were able to—leak
  • 10:29 - 10:36
    crypto keys, guess visited websites, or
    leak memory content. That’s with SPECTRE
  • 10:36 - 10:42
    and MELTDOWN. So let’s see how we can
    actually launch such an attack over the
  • 10:42 - 10:51
    network! So, one of the key technologies
    is DDIO. But first, I want to talk about DMA,
  • 10:51 - 10:55
    because it’s like the predecessor to it.
    So DMA is basically a technology that
  • 10:55 - 11:02
    allows your PCIe device, for example the
    network card, to interact directly by
  • 11:02 - 11:09
    itself with main memory without interrupting
    the CPU. So for example if a packet is
  • 11:09 - 11:14
    received, the PCIe device then just puts
    it in main memory and then, when the
  • 11:14 - 11:19
    program or the application wants to work
    on that data, then it can fetch from main
  • 11:19 - 11:27
    memory. Now with DDIO, this is a bit
    different. With DDIO, the PCIe device can
  • 11:27 - 11:33
    directly put data into the Last Level
    Cache. And that’s great, because now the
  • 11:33 - 11:39
    application, when working on the data,
    just doesn’t have to go through the costly
  • 11:39 - 11:44
    main-memory walk and can just directly
    work on the data from—or fetch it from—the
  • 11:44 - 11:52
    Last Level Cache. So DDIO stands for “Data
    Direct I/O Technology,” and it’s enabled
  • 11:52 - 11:59
    on all Intel server-grade processors since
    2012. It’s enabled by default and
  • 11:59 - 12:04
    transparent to drivers and operating
    systems. So I guess, most people didn’t
  • 12:04 - 12:09
    even notice that something changed under
    the hood. And it changed some things quite
  • 12:09 - 12:17
    drastically. But why is DDIO actually
    needed? Well: It’s for performance
  • 12:17 - 12:23
    reasons. So here we have a nice study from
    Intel, which shows on the bottom,
  • 12:23 - 12:29
    different numbers of NICs. So we have a
    setting with 2 NICs, 4 NICs, 6, and 8
  • 12:29 - 12:36
    NICs. And you have the throughput for it.
    And as you can see with the dark blue,
  • 12:36 - 12:43
    that without DDIO, it basically stops
    scaling after having 4 NICs. With the
  • 12:43 - 12:48
    light-blue you then see that it still
    scales up when you add more network cards
  • 12:48 - 12:57
    to it. So DDIO is specifically built to
    scale network applications. The other
  • 12:57 - 13:02
    technology that we were abusing is RDMA.
    It stands for “Remote Direct Memory
  • 13:02 - 13:09
    Access,” and it basically offloads
    transport-layer tasks to silicon. It’s
  • 13:09 - 13:15
    basically a kernel bypass. And it’s also
    no CPU involvement, so an application can
  • 13:15 - 13:24
    access remote memory without consuming any
    CPU time on the remote server. So I
  • 13:24 - 13:28
    brought here a little illustration to
    show you how RDMA works. So on the left we
  • 13:28 - 13:34
    have the initiator and on the right we
    have the target server. A memory region
  • 13:34 - 13:40
    gets allocated on startup of the server
    and from now on, applications can perform
  • 13:40 - 13:44
    data transfer without the involvement of
    the network software stack. So you bypass
  • 13:44 - 13:53
    the TCP/IP stack completely. With one-
    sided RDMA operations you even allow the
  • 13:53 - 14:00
    initiator to read and write to arbitrary
    offsets within that allocated space on the
  • 14:00 - 14:07
    target. I quote here a statement of the
    market leader of one of these high
  • 14:07 - 14:13
    performance NICs: “Moreover, the caches
    of the remote CPU will not be filled with
  • 14:13 - 14:21
    the accessed memory content.” Well, that’s
    not true anymore with DDIO and that’s
  • 14:21 - 14:29
    exactly what we attacked. So you might
    ask yourself, “where is this RDMA used,”
  • 14:29 - 14:34
    right? And I can tell you that RDMA is one
    of these technologies that you don’t hear
  • 14:34 - 14:39
    often but are actually extensively used in
    the backends of the big data centres and
  • 14:39 - 14:46
    cloud infrastructures. So you can get your
    own RDMA-enabled infrastructures from
  • 14:46 - 14:53
    public clouds like Azure, Oracle Cloud,
    Huawei, or Alibaba. Also file protocols
  • 14:53 - 14:59
    like SMB and NFS can support
    RDMA. And other applications are High
  • 14:59 - 15:07
    Performance Computing, Big Data, Machine
    Learning, Data Centres, Clouds, and so on.
  • 15:07 - 15:13
    But let’s get a bit into detail about the
    research and how we abused the 2
  • 15:13 - 15:19
    technologies. So we know now that we have
    a Shared Resource exposed to the network
  • 15:19 - 15:26
    via DDIO and RDMA gives us the necessary
    Read and Write primitives to launch such a
  • 15:26 - 15:34
    cache attack over the network. But first,
    we needed to clarify some things. Of
  • 15:34 - 15:39
    course, we did many experiments and
    extensively tested the DDIO port to
  • 15:39 - 15:45
    understand the inner workings. But here, I
    brought with me like 2 major questions
  • 15:45 - 15:50
    which we had to answer. So first of all
    is, of course, can we distinguish a cache
  • 15:50 - 15:58
    hit or miss over the network? But we still
    have network latency and packet queueing
  • 15:58 - 16:04
    and so on. So would it be possible to
    actually get the timing right? Which is an
  • 16:04 - 16:09
    absolute must for launching a side-
    channel. Well, the second question is
  • 16:09 - 16:14
    then: Can we actually access the full Last
    Level Cache? This would correspond more to
  • 16:14 - 16:21
    the attack surface that we actually have
    for attack. So the first question, we can
  • 16:21 - 16:27
    answer with this very simple experiment:
    So we have on the left, a very small code
  • 16:27 - 16:33
    snippet. We have a timed RDMA read to a
    certain offset. Then we write to that
  • 16:33 - 16:42
    offset and we read again from the offset.
    So what you can see is that, when doing
  • 16:42 - 16:46
    this like 50 000 times over multiple
    different offsets, you can clearly
  • 16:46 - 16:52
    distinguish the two distributions. So the
    blue one corresponds to data that was
  • 16:52 - 16:58
    fetched from main memory and the orange one
    to the data that was fetched from the Last
  • 16:58 - 17:03
    Level Cache over the network. You can also
    see the effects of the network. For
  • 17:03 - 17:10
    example, you can see the long tails which
    correspond to some packets that were
  • 17:10 - 17:16
    slowed down in the network or were queued.
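
A minimal sketch of how two latency distributions like these can be separated with a simple threshold. All numbers here are invented for illustration (a 1700 ns mean for cache hits, 1900 ns for memory reads, 30 ns of jitter, and a 2% chance of an extra queueing delay); they are assumptions, not measurements from the talk. The function names are hypothetical as well.

```python
import random
import statistics


def simulate_latencies(n, mean_ns, rng):
    """Draw n simulated read latencies around mean_ns, with an occasional long tail."""
    samples = []
    for _ in range(n):
        t = rng.gauss(mean_ns, 30.0)
        if rng.random() < 0.02:
            # Assumed queueing delay in the network, producing the long tail.
            t += rng.expovariate(1 / 500.0)
        samples.append(t)
    return samples


def midpoint_threshold(hit_samples, miss_samples):
    """Place the decision threshold halfway between the two medians."""
    return (statistics.median(hit_samples) + statistics.median(miss_samples)) / 2


rng = random.Random(0)
hits = simulate_latencies(5000, 1700.0, rng)    # served from the Last Level Cache
misses = simulate_latencies(5000, 1900.0, rng)  # served from main memory

threshold = midpoint_threshold(hits, misses)
correct = sum(t < threshold for t in hits) + sum(t >= threshold for t in misses)
accuracy = correct / (len(hits) + len(misses))

print(f"threshold: {threshold:.0f} ns, classification accuracy: {accuracy:.3f}")
```

Medians are used instead of means so that the long tails caused by delayed packets do not pull the threshold upwards.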
    So on a sidenote here for all the side-
  • 17:16 - 17:23
    channel experts: We really need that write,
    because actually with DDIO reads do not
  • 17:23 - 17:30
    allocate anything in the Last Level Cache.
    So basically, this is the building block
  • 17:30 - 17:36
    to launch a prime and probe attack over
    the network. However, we still need to
  • 17:36 - 17:40
    have a target that we can actually
    profile. So let’s see what kind of an
  • 17:40 - 17:46
    attack surface we actually have. Which
    brings us to the question: Can we access
  • 17:46 - 17:51
    the full Last Level Cache? And
    unfortunately, this is not the case. So
  • 17:51 - 17:59
    DDIO has this allocation limitation of two
    ways. Here in the example out of 20 ways.
  • 17:59 - 18:08
    So roughly 10%. It’s not a dedicated way,
    so still the CPU uses this. But we would
  • 18:08 - 18:17
    only have like access to 10% of the cache
    activity of the CPU in the Last Level Cache.
  • 18:17 - 18:23
    So that would not work so well for a
    first attack. But the good news is that
  • 18:23 - 18:32
    other PCIe devices—let’s say a second
    network card—will also use the same two
  • 18:32 - 18:39
    cache ways. And with that, we have 100%
    visibility of what other PCIe devices are
  • 18:39 - 18:49
    doing in the cache. So let’s look at the
    end-to-end attack! So as I told you
  • 18:49 - 18:54
    before, we have this basic setup of a
    client and a server. And we have the
  • 18:54 - 19:01
    machine that is controlled by us, the
    attackers. So the client just sends this
  • 19:01 - 19:07
    packet over a normal Ethernet NIC and
    there is a second NIC attached to the
  • 19:07 - 19:15
    server which allows the attacker to launch
    RDMA operations. So we also know now that
  • 19:15 - 19:20
    all the packets that… or all the
    keystrokes that the user is typing are
  • 19:20 - 19:26
    sent in individual packets which are
    activated in the Last Level Cache through
  • 19:26 - 19:34
    DDIO. But how can we actually now get
    these arrival times of packets? Because
  • 19:34 - 19:39
    that’s what we are interested in! So now
    we have to look a bit more closely at how
  • 19:34 - 19:39
    the arrival of network packets actually
    works. So the IP stack has a ring buffer
  • 19:47 - 19:53
    which is basically there to have an
    asynchronous operation between the
  • 19:53 - 20:02
    hardware—so the NIC—and the CPU. So if a
    packet arrives, it will allocate this in
  • 20:02 - 20:08
    the first ring buffer position. On the
    right-hand side you see the view of the
  • 20:08 - 20:14
    attacker which can just profile the cache
    activity. And we see that the cache line
  • 20:14 - 20:19
    at position 1 lights up. So we see an
    activity there. Could also be on cache
  • 20:19 - 20:25
    line 2, that’s … we don’t know on which
    cache line this will actually pop up. But
  • 20:25 - 20:29
    what is important is: What happens with
    the second packet? Because the second
  • 20:29 - 20:35
    packet will also light up a cache line,
    but this time different. And it’s actually
  • 20:35 - 20:42
    the next cache line after the one from the previous
    packet. And if we do this for 3 and 4
  • 20:42 - 20:51
    packets, we can see that we suddenly have
    this nice staircase pattern. So now we
  • 20:51 - 20:57
    have a predictable pattern that we can
    exploit to get information when packets
  • 20:57 - 21:04
    were received. And this is just because
    the ring buffer is allocated in a way that
  • 21:04 - 21:10
    it doesn’t evict itself, right? It doesn’t
    evict if packet 2 arrives. It doesn’t
  • 21:10 - 21:17
    evict the cache content of the packet 1.
    Which is great for us as an attacker,
  • 21:17 - 21:22
    because we can profile it well. Well,
    let’s look at the real-life example. So
  • 21:22 - 21:28
    this is the cache activity when the server
    receives constant pings. You can see this
  • 21:28 - 21:35
    nice staircase pattern and you can also
    see that the ring buffer reuses locations
  • 21:35 - 21:41
    as it is a circular buffer. Here, it is
    important to know that the ring buffer
  • 21:41 - 21:49
    doesn’t hold the data content, just the
    descriptor to the data. So this is reused.
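
A minimal sketch of why a circular descriptor ring produces the staircase pattern described above. It assumes that each ring buffer slot maps to its own cache line and that consecutive packets use consecutive slots; the ring size of 4 and the helper names are hypothetical and chosen only for illustration.

```python
def staircase(num_packets, ring_size, first_slot=0):
    """Return the ring buffer slot touched by each consecutive packet."""
    return [(first_slot + i) % ring_size for i in range(num_packets)]


def render(slots, ring_size):
    """Draw one text row per packet; '#' marks the slot that lights up."""
    return [
        "".join("#" if position == slot else "." for position in range(ring_size))
        for slot in slots
    ]


slots = staircase(num_packets=6, ring_size=4, first_slot=2)
for packet_number, row in enumerate(render(slots, ring_size=4), start=1):
    print(f"packet {packet_number}: {row}")
```

Each packet lights up the slot after the previous one, and the pattern wraps around once the end of the ring is reached, which is the reuse visible in the ping trace.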
  • 21:49 - 21:56
    Unfortunately when the user types over
    SSH, the pattern is not as nice as this
  • 21:56 - 22:00
    one here. Because then we would already
    have a done deal and just could work on
  • 22:00 - 22:06
    this. Because when a user types, you will
    have more delays between packets.
  • 22:06 - 22:11
    Generally also you don’t know when the
    user is typing, so you have to profile all
  • 22:11 - 22:16
    the time to get the timings right.
    Therefore, we needed to build a bit more
  • 22:16 - 22:24
    of a sophisticated pipeline. So it
    basically is a 2-stage pipeline which
  • 22:24 - 22:32
    consists of an online tracker that is just
    looking at a bunch of cache lines that
  • 22:32 - 22:38
    it is observing all the time. And when it
    sees that certain cache lines were
  • 22:38 - 22:44
    activated, it moves that window forward to
    the next position where it believes an
  • 22:38 - 22:44
    activation will happen. The reason why is
    that we have a speed advantage. So we need
  • 22:50 - 22:57
    to profile much faster than the network
    packets of the SSH session are arriving.
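
A minimal sketch of the windowed tracking idea described above. It assumes the probe results arrive as a list of `(time, active_slots)` rounds and that the tracker only trusts activations inside a small window starting at the next expected ring buffer slot. The function name `track`, the window size of 3 and the example trace are hypothetical and are not the implementation from the paper.

```python
def track(observations, ring_size, window=3, start=0):
    """Follow the ring buffer position through noisy probe rounds.

    observations: list of (time, set_of_active_slots), one entry per probe round.
    Returns a list of (time, slot) for activations seen inside the window.
    """
    expected = start
    detections = []
    for time, active in observations:
        # Only look at the next few slots where a packet is expected to land.
        window_slots = [(expected + k) % ring_size for k in range(window)]
        for slot in window_slots:
            if slot in active:
                detections.append((time, slot))
                expected = (slot + 1) % ring_size  # move the window forward
                break
    return detections


trace = [
    (0, set()),
    (1, {0}),   # first packet
    (2, set()),
    (3, {5}),   # activity outside the window, treated as noise
    (4, {1}),   # second packet
    (5, {2}),   # third packet
    (6, set()),
]
print(track(trace, ring_size=8))
```

A window wider than one slot lets the tracker recover when a single activation is missed, at the cost of more probes per round.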
  • 22:57 - 23:01
    And what you can see here on the left-
    hand side is a visual output of what the
  • 23:01 - 23:07
    online tracker does. So it just profiles
    this window which you can see in red. And
  • 23:07 - 23:15
    if you look very closely, you can see also
    more lit-up in the middle which
  • 23:15 - 23:20
    corresponds to arrived network packets.
    You can also see that there is plenty of
  • 23:20 - 23:27
    noise involved, so therefore we’re not
    able just to directly get the packet
  • 23:27 - 23:35
    arrival times from it. That’s why we need
    a second stage. The Offline Extractor. And
  • 23:35 - 23:41
    the offline extractor is in charge of
    computing the most likely occurrence of
  • 23:35 - 23:41
    client SSH network packets. It uses the
    information from the online tracker and
  • 23:46 - 23:52
    the predictable pattern of the ring buffer
    to do so. And then, it outputs the inter-
  • 23:52 - 23:59
    packet arrival times for different words
    as shown here on the right. Great. So, now
  • 23:59 - 24:05
    we’re again at the point where we have
    just packet arrival times but no words,
  • 24:05 - 24:10
    which we need for breaking the
    confidentiality of your private SSH
  • 24:10 - 24:19
    session. So, as I told you before, users
    or generally humans have distinctive
  • 24:19 - 24:27
    typing patterns. And with that, we were
    able to launch a statistical attack. More
  • 24:27 - 24:33
    closely, we just do like a machine
    learning of mapping between user typing
  • 24:33 - 24:39
    behaviour and actual words. So that in the
    end, we can output the two words that you
  • 24:39 - 24:48
    were typing in your SSH session. So we
    used 20 subjects that were typing free and
  • 24:48 - 24:56
    transcribed text which resulted in a total
    of 4 574 unique words. And each
  • 24:56 - 25:01
    represented as a point in a multi-
    dimensional space. And we used really
  • 25:01 - 25:06
    simple machine learning techniques like
    the k-nearest neighbours algorithm which
  • 25:06 - 25:12
    is basically categorising the measurements
    in terms of Euclidean distance to other
  • 25:12 - 25:18
    words. The reason why we just used like a
    very basic machine learning algorithm is
  • 25:18 - 25:21
    that we just wanted to prove that the
    signal that we were extracting from the
  • 25:21 - 25:27
    remote cache is actually strong enough to
    launch such an attack. So we didn’t want
  • 25:27 - 25:33
    to improve in general, like, this kind of
    mapping between users and their typing
  • 25:33 - 25:40
    behaviour. So let’s look how this worked
    out! So, firstly, on the left-hand side,
  • 25:40 - 25:47
    you see we used our classifier on raw
    keyboard data. This means that we just used
  • 25:47 - 25:53
    the signal that was emitted during the
    typing. So when they were typing on their
  • 25:53 - 25:59
    local keyboard. Which gives us perfect and
    precise timing data. And we can see that
  • 25:59 - 26:02
    this is already quite challenging to
    mount. So we have an accuracy of
  • 26:02 - 26:10
    roughly 35%. But looking at the top 10
    accuracy which is basically: the attacker
  • 26:10 - 26:16
    can guess 10 words, and if the correct
    word was among these 10 words, then that’s
  • 26:16 - 26:23
    considered to be accurate. And with the
    top 10 guesses, we have an accuracy of
  • 26:23 - 26:31
    58%. That’s just on the raw keyboard data.
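
A minimal sketch of a nearest-neighbour word guesser and of the top-k accuracy metric described above. Each word is represented as a vector of inter-keystroke intervals in seconds, and candidates are ranked by Euclidean distance. The training and test timings below are invented for illustration, and ranking each word by its single closest training sample is a simplification of the k-nearest neighbours classifier mentioned in the talk.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long interval vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_words(sample, training, k):
    """Rank distinct words by the distance of their closest training sample."""
    best = {}
    for word, intervals in training:
        if len(intervals) != len(sample):
            continue  # only words with the same number of keystrokes are comparable
        distance = euclidean(sample, intervals)
        if word not in best or distance < best[word]:
            best[word] = distance
    return sorted(best, key=best.get)[:k]


def top_k_accuracy(tests, training, k):
    """Fraction of test samples whose true word is among the k best guesses."""
    hits = sum(1 for word, sample in tests if word in top_k_words(sample, training, k))
    return hits / len(tests)


# Invented inter-keystroke intervals (seconds) for three-letter words.
training = [
    ("the", [0.10, 0.12]),
    ("and", [0.20, 0.11]),
    ("cat", [0.15, 0.25]),
    ("ssh", [0.30, 0.30]),
]
tests = [
    ("the", [0.11, 0.13]),
    ("cat", [0.19, 0.13]),  # a noisy sample that lands closer to "and"
]

print("top-1 accuracy:", top_k_accuracy(tests, training, k=1))
print("top-3 accuracy:", top_k_accuracy(tests, training, k=3))
```

In this toy data the noisy "cat" sample is misclassified as the single best guess but is still recovered within the top three, which is the effect the top 10 metric is meant to capture.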
    And then we used the same data and also
  • 26:31 - 26:36
    the same classifier on the remote signal.
    And of course, this is less precise
  • 26:36 - 26:44
    because we have noise factors and we could
    even add or miss out on keystrokes. And
  • 26:44 - 26:55
    the accuracy is roughly 11% less and the
    top 10 accuracy is roughly 60%. So as we
  • 26:55 - 27:01
    used a very basic machine learning
    algorithm, many subjects, and a relatively
  • 27:01 - 27:08
    large word corpus, we believe that we can
    showcase that the signal is strong enough
  • 27:08 - 27:15
    to launch such attacks. So of course, now
    we want to see this whole thing working,
  • 27:15 - 27:21
    right? As I’m a bit nervous here on stage,
    I’m not going to do a live demo because it
  • 27:21 - 27:28
    would involve me doing some typing which
    probably would confuse myself and of
  • 27:28 - 27:34
    course also the machine-learning model.
    Therefore, I brought a video with me. So
  • 27:34 - 27:40
    here on the right-hand side, you see the
    victim. So it will shortly begin with
  • 27:40 - 27:45
    doing an SSH session. And then on the
    left-hand side, you see the attacker. So
  • 27:45 - 27:51
    mainly on the bottom you see this online
    tracker and on top you see the extractor
  • 27:51 - 27:58
    and hopefully the predicted words. So now
    the victim starts this SSH session to
  • 27:58 - 28:05
    the server called “father.” And the
    attacker, which is on the machine “son,”
  • 28:05 - 28:11
    launches now this attack. So you saw we
    profiled the ring buffer location and now
  • 28:11 - 28:20
    the victim starts to type. And as this
    pipeline takes a bit to process these words
  • 28:20 - 28:24
    and to predict the right thing, you will
    shortly see, like slowly, the words
  • 28:24 - 28:42
    popping up in the correct—hopefully the
    correct—order. And as you can see, we can
  • 28:42 - 28:48
    correctly guess the right words over the
    network by just sending network packets to
  • 28:48 - 28:54
    the same server. And with that, getting
    out the crucial information of when such
  • 28:54 - 29:05
    SSH packets arrived.
    applause
  • 29:05 - 29:10
    So now you might ask yourself: How do you
    mitigate against these things? Well,
  • 29:10 - 29:17
    luckily it’s just server-grade processors,
    so no clients and so on. But then, from
  • 29:17 - 29:23
    our viewpoint, the only true mitigation at
    the moment is to either disable DDIO or
  • 29:23 - 29:30
    don’t use RDMA. Both come with quite a
    performance impact. So with DDIO, you are talking
  • 29:30 - 29:37
    roughly about 10-18% less performance,
    depending, of course, on your application.
  • 29:37 - 29:43
    And if you decide not to use RDMA,
    you probably have to rewrite your whole
  • 29:43 - 29:50
    application. So, Intel on their publication
    on Disclosure Day sounded a bit different
  • 29:50 - 30:00
    therefore. But read it for yourself! I
    mean, the meaning “untrusted network” can,
  • 30:00 - 30:10
    I guess, be quite debatable. And yeah. But
    it is what it is. So I’m very proud that
  • 30:10 - 30:17
    we got accepted at Security and Privacy
    2020. Also, Intel acknowledged our
  • 30:17 - 30:23
    findings, public disclosure was in
    September, and we also got a bug bounty
  • 30:23 - 30:27
    payment.
    someone cheering in crowd
  • 30:27 - 30:30
    laughs
    Increased peripheral performance has
  • 30:30 - 30:37
    forced Intel to place the Last Level Cache
    on the fast I/O path in its processors.
  • 30:37 - 30:43
    And by this, it exposed even more shared
    microarchitectural components which we
  • 30:43 - 30:52
    know by now have a direct security impact.
    Our research is the first DDIO side-
  • 30:52 - 30:56
    channel vulnerability but we still believe
    that we just scratched the surface with
  • 30:56 - 31:03
    it. Remember: There’s more PCIe devices
    attached to them! So there could be
  • 31:03 - 31:11
    storage devices—so you could profile cache
    activity of storage devices and so on!
  • 31:11 - 31:20
    There are even such things as GPUDirect
    which gives you access to the GPU’s cache.
  • 31:20 - 31:26
    But that’s a whole other story. So, yeah.
    I think there’s much more to discover on
  • 31:26 - 31:33
    that side and stay tuned with that! All that is
    left to say is a massive “thank you” to
  • 31:33 - 31:38
    you and, of course, to all the volunteers
    here at the conference. Thank you!
  • 31:38 - 31:47
    applause
  • 31:47 - 31:53
    Herald: Thank you, Michael! We have time
    for questions. So you can line up behind
  • 31:53 - 31:58
    the microphones. And I can see someone at
    microphone 7!
  • 31:58 - 32:03
    Question: So, thank you for your talk! I
    had a question about—when I’m working on a
  • 32:03 - 32:09
    remote machine using SSH, I’m usually not
    typing nice words like you’ve shown, but
  • 32:09 - 32:14
    usually it’s weird bash things like dollar
    signs, and dashes, and I don’t know. Have
  • 32:14 - 32:18
    you looked into that as well?
    Michael: Well, I think … I mean, of
  • 32:18 - 32:22
    course: What we would’ve wanted to
    showcase is that we could leak passwords,
  • 32:22 - 32:28
    right? If you would do “sudo” or
    whatsoever. The thing with passwords is
  • 32:28 - 32:36
    that it’s kind of its own dynamic. So you
    type key… passwords differently than you
  • 32:36 - 32:40
    type normal keywords. And then it gets a
    bit difficult because when you want to do
  • 32:40 - 32:46
    a large study of how users would type
    passwords, you either ask them for their
  • 32:46 - 32:51
    real password—which is not so ethical
    anymore—or you train them on different
  • 32:51 - 32:58
    passwords. And that’s also difficult
    because they might adopt a different style
  • 32:58 - 33:03
    of how they type these passwords than if
    it were the real password. And of course,
  • 33:03 - 33:10
    the same would go for command line in
    general and we just didn’t have, like, the
  • 33:10 - 33:13
    word corpus for it to launch such an
    attack.
  • 33:13 - 33:19
    Herald: Thank you! Microphone 1!
    Q: Hi. Thanks for your talk! I’d like to
  • 33:19 - 33:27
    ask: the original SSH timing attack paper
    is from, like, 2001?
  • 33:27 - 33:31
    Michael: Yeah, exactly. Exactly!
    Q: And do you have some idea why there are
  • 33:31 - 33:38
    no circumventions on the side of SSH
    clients to add some padding or some random
  • 33:38 - 33:42
    delays or something like that? Do you have
    some idea why there’s nothing happening
  • 33:42 - 33:46
    there? Is it some technical reason or
    what’s the deal?
  • 33:46 - 33:53
    Michael: So, we also were afraid that
    between 2001 and nowadays, that they added
  • 33:53 - 33:59
    some kind of a delay or batching or
    whatsoever. I’m not sure if it’s just a
  • 33:59 - 34:05
    tradeoff between the interactiveness of
    your SSH session or if there’s, like, a
  • 34:05 - 34:09
    true reason behind it. But what I do know
    is that it’s oftentimes quite difficult to
  • 34:09 - 34:16
    add, like, these artificial packets in-
    between. Because if it’s, like, not random
  • 34:16 - 34:21
    at all, you could even filter out, like,
    additional packets that just get inserted
  • 34:21 - 34:27
    by the SSH. But other than that, I’m not
    familiar with anything, why they didn’t
  • 34:27 - 34:35
    adapt, or why this wasn’t on their radar.
    Herald: Thank you! Microphone 4.
  • 34:35 - 34:42
    Q: How much do you rely on the skill of
    the typers? So I think of a user that has
  • 34:42 - 34:49
    to search each letter on the keyboard or
    someone that is distracted while typing,
  • 34:49 - 34:57
    so not having a real pattern
    behind the typing.
  • 34:57 - 35:02
    Michael: Oh, we’re actually absolutely
    relying on the pattern being reproducible. As
  • 35:02 - 35:07
    I said: We’re just using this very simple
    machine learning algorithm that just looks
  • 35:07 - 35:12
    at the Euclidean distance of previous
    words that you were typing and a new word
  • 35:12 - 35:17
    or the new arrival times that we were
    observing. And so if that is completely
  • 35:17 - 35:24
    different, then the accuracy would drop.
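The distance-based matching described in this answer could look roughly like the Python sketch below. The function names, the sample timings, and the rule of only comparing sequences of equal length are assumptions made for illustration; the talk does not give these details.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length timing vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(observed, training):
    """Return the training word whose inter-keystroke intervals are
    closest to the observed ones, or None if nothing is comparable.

    training is a list of (word, intervals) pairs (hypothetical format).
    """
    # Only words with the same number of intervals can be compared directly.
    candidates = [(w, iv) for w, iv in training if len(iv) == len(observed)]
    if not candidates:
        return None
    return min(candidates, key=lambda c: euclidean(observed, c[1]))[0]

# Made-up intervals in seconds, one value per gap between two keystrokes.
training = [
    ("cat", [0.12, 0.20]),
    ("dog", [0.25, 0.10]),
    ("bird", [0.15, 0.18, 0.22]),
]
guess = classify([0.13, 0.19], training)
no_match = classify([0.10], training)
```

If a user types with no stable rhythm, the observed vector lands far from every training vector and the nearest match is close to arbitrary, which is the accuracy drop mentioned above.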
    Herald: Thank you! Microphone 8!
  • 35:24 - 35:29
    Q: As a follow-up to what was said before.
    Wouldn’t this make it a targeted attack
  • 35:29 - 35:33
    since you would need to train the machine-
    learning algorithm exactly for the person
  • 35:33 - 35:40
    that you want to extract the data from?
    Michael: So, yeah. Our goal of the
  • 35:40 - 35:47
    research was not, like, to do next-level,
    let’s say machine-learning type of
  • 35:47 - 35:54
    recognition of your typing behaviours. So
    we actually used the information which
  • 35:54 - 36:01
    user was typing so to profile that
    correctly. But still I think you could
  • 36:01 - 36:07
    maybe generalize. So there is other
    research showing that you can categorize
  • 36:07 - 36:13
    users into different types of typers and if I
    remember correctly, they came up that you
  • 36:13 - 36:20
    can categorize each person into, like, 7
    different typing, let’s say, categories.
  • 36:20 - 36:27
    And I also know that some kind of online
    trackers are using your typing behaviour
  • 36:27 - 36:35
    to re-identify you. So just to, like,
    serve you personalized ads, and so on. But
  • 36:35 - 36:41
    still, I mean—we didn’t, like, want to go
    into that depth of improving the state of
  • 36:41 - 36:46
    this whole thing.
    Herald: Thank you! And we’ll take a
  • 36:46 - 36:49
    question from the Internet next!
    Signal angel: Did you ever try this with a
  • 36:49 - 36:56
    high-latency network like the Internet?
    Michael: So of course, we rely on a—let’s
  • 36:56 - 37:03
    say—a constant latency. Because otherwise
    it would basically screw up our timing
  • 37:03 - 37:09
    attack. So as we’re talking with RDMA,
    which is usually in datacenters, we also
  • 37:09 - 37:16
    tested it in datacenter kind of
    topologies. It would make it, I guess,
  • 37:16 - 37:21
    quite hard, which means that you would
    have to do a lot of repetition which is
  • 37:21 - 37:26
    actually bad because you cannot tell the
    users “please retype what you just did
  • 37:26 - 37:33
    because I have to profile it again,”
    right? So yeah, the answer is: No.
  • 37:33 - 37:40
    Herald: Thank you! Mic 1, please.
    Q: If the victim pastes something into the
  • 37:40 - 37:45
    SSH session. Would you be able to carry
    out the attacks successfully?
  • 37:45 - 37:51
    Michael: No. This is … so if you paste
    stuff, this is just sent out as a batch
  • 37:51 - 37:54
    when you enter.
    Q: OK, thanks!
  • 37:54 - 38:00
    Herald: Thank you! The angels tell me
    there is a person behind mic 6 whom I’m
  • 38:00 - 38:03
    completely unable to see
    because of all the lights.
  • 38:03 - 38:08
    Q: So as far as I understood, the attacker
    can only see that some packet arrived on
  • 38:08 - 38:13
    their NIC. So if there’s a second SSH
    session running simultaneously on the
  • 38:13 - 38:18
    machine under attack, would this
    already interfere with this attack?
  • 38:18 - 38:24
    Michael: Yeah, absolutely! So even
    distinguishing SSH packets from normal
  • 38:24 - 38:32
    network packets is challenging. So we use
    kind of a heuristic here because the thing
  • 38:32 - 38:38
    with SSH is that it always sends two
    packets right after. So not only 1, just
  • 38:38 - 38:44
    2. But I omitted this part because of
    simplicity of this talk. But we also rely
  • 38:44 - 38:49
    on these kind of heuristics to even filter
    out SSH packets. And if you would have a
  • 38:49 - 38:55
    second SSH session, I can imagine that
    this would completely… so we cannot
  • 38:55 - 39:05
    distinguish which SSH session it was.
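A minimal sketch of the kind of heuristic mentioned here, assuming that a keystroke shows up as two packet arrivals in quick succession. The 1 ms threshold, the function names, and the sample timestamps are hypothetical; the talk does not specify how the filter was implemented.

```python
def keystroke_times(arrivals, max_gap=0.001):
    """Keep the first arrival of every pair of packets that arrive
    within max_gap seconds of each other; drop isolated arrivals.

    arrivals is a sorted list of observed packet arrival times in seconds.
    """
    events = []
    i = 0
    while i + 1 < len(arrivals):
        if arrivals[i + 1] - arrivals[i] <= max_gap:
            events.append(arrivals[i])  # treat the pair as one keystroke
            i += 2
        else:
            i += 1  # lone packet, assumed to be unrelated traffic
    return events

def inter_keystroke_intervals(events):
    """Time gaps between consecutive keystroke events."""
    return [b - a for a, b in zip(events, events[1:])]

# Made-up arrival times: three pairs plus one stray packet at 0.2 s.
arrivals = [0.0, 0.0005, 0.15, 0.1504, 0.2, 0.33, 0.3303]
events = keystroke_times(arrivals)
intervals = inter_keystroke_intervals(events)
```

A filter like this cannot tell two concurrent SSH sessions apart, since pairs from both sessions look identical to it.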
    Herald: Thank you. Mic 7 again!
  • 39:05 - 39:12
    Q: You always said you were using two
    connectors, like—what was it called? NICs?
  • 39:12 - 39:16
    Michael: Yes, exactly.
    Q: Does it have to be two different ones? Can
  • 39:16 - 39:21
    it be the same? Or how does it work?
    Michael: So in our setting we used one NIC
  • 39:21 - 39:27
    that has the capability of doing RDMA. So
    in our case, this was Fabric, so
  • 39:27 - 39:32
    InfiniBand. And the other was just like a
    normal Ethernet connection.
  • 39:32 - 39:37
    Q: But could it be the same or could it be
    both over InfiniBand, for example?
  • 39:37 - 39:43
    Michael: Yes, I mean … the thing with
    InfiniBand: It doesn’t use the ring buffer
  • 39:43 - 39:50
    so we would have to come up with a
    different kind of tracking ability to get
  • 39:50 - 39:54
    this. Which could even get a bit more
    complicated because it does this kernel
  • 39:54 - 39:59
    bypass. But if there’s a predictable
    pattern, we could potentially also do
  • 39:59 - 40:04
    this.
    Herald: Thank you. Mic 1?
  • 40:04 - 40:09
    Q: Hello again! I would like to ask, I
    know it was not the main focus of your
  • 40:09 - 40:14
    study, but do you have some estimation how
    practical this can be, this timing attack?
  • 40:14 - 40:20
    Like, if you do, like, real-world
    simulation, not the, like, prepared one?
  • 40:20 - 40:23
    How big a problem can it really be?
    What would you think, like, what’s
  • 40:23 - 40:27
    the state-of-the-art in this field? How
    do you feel the risk?
  • 40:27 - 40:30
    Michael: You’re just referring to the
    typing attack, right?
  • 40:30 - 40:34
    Q: Timing attack. SSH timing. Not
    necessarily the cache version.
  • 40:34 - 40:40
    Michael: So, the original research that
    was conducted is out there since 2001. And
  • 40:40 - 40:46
    since then, many researchers have shown
    that it’s possible to launch such typing
  • 40:46 - 40:52
    attacks over different scenarios, for
    example JavaScript is another one. It’s
  • 40:52 - 40:57
    always a bit difficult to judge because
    most of the researchers are using different
  • 40:57 - 41:03
    datasets so it’s difficult to compare. But
    I think in general, I mean, we have used,
  • 41:03 - 41:09
    like, quite a large word corpus and it
    still worked. Not super-precisely, but it
  • 41:09 - 41:16
    still worked. So yeah, I do believe it’s
    possible. But to even make it a real-world
  • 41:16 - 41:21
    attack where an attacker wants to have
    high accuracy, he probably would need a
  • 41:21 - 41:26
    lot of data and even, like, more
    sophisticated techniques. Which there are.
  • 41:26 - 41:30
    So there are a couple of other machine-
    learning techniques that you could use
  • 41:30 - 41:34
    which have their pros and cons.
    Q: Thanks.
  • 41:34 - 41:40
    Herald: Thank you! Ladies and
    Gentlemen—the man who named an attack
  • 41:40 - 41:45
    netCAT: Michael Kurth! Give him
    a round of applause, please!
  • 41:45 - 41:58
    applause
    Michael: Thanks a lot!
  • 41:57 - 42:01
    36C3 postscroll music
  • 42:01 - 42:16
    Subtitles created by c3subtitles.de
    in the year 2020. Join, and help us!
Video Language:
English
Duration:
42:16
