
#rC3 - Ramming Enclave Gates: A Systematic Vulnerability Assessment of TEE Shielding Runtimes

  • 0:00 - 0:11
    rc3 preroll music
  • 0:12 - 0:18
    Herald: So for the next talk, I have Jo
    Van Bulck, and Fritz Alder from the
  • 0:18 - 0:25
    University of Leuven in Belgium, and David
    Oswald, professor for cyber security in
  • 0:25 - 0:30
    Birmingham. They are here to talk about
    the trusted execution environment. You
  • 0:30 - 0:36
    probably know from Intel and so on, and
    you should probably not trust it all the
  • 0:36 - 0:42
    way because it's software and it has its
    flaws. And so they're talking about
  • 0:42 - 0:48
    ramming enclave gates, which is always
    good, a systematic vulnerability
  • 0:48 - 0:52
    assessment of TEE shielding runtimes.
    Please go on with your talk.
  • 0:52 - 0:59
    Jo van Bulck: Hi, everyone. Welcome to our
    talk. So I'm Jo, from the imec-DistriNet
  • 0:59 - 1:03
    research group at KU Leuven. And
    today joining me are Fritz, also from
  • 1:03 - 1:07
    Leuven and David from the University of
    Birmingham. And we have this very exciting
  • 1:07 - 1:11
    topic to talk about, ramming enclave
    gates. But before we dive into that, I
  • 1:11 - 1:16
    think most of you will not know what
    enclaves are, let alone what these TEEs are.
  • 1:16 - 1:24
    So let me first start with an analogy.
    So enclaves are essentially a sort of a
  • 1:24 - 1:30
    secure fortress in the processor, in the
    CPU. And so it's an encrypted memory
  • 1:30 - 1:37
    region that is exclusively accessible from
    the inside. And what we know from the long
  • 1:37 - 1:42
    history of fortress attacks and defenses,
    of course, is that when you cannot take a
  • 1:42 - 1:47
    fortress because the walls are high and
    strong, you typically aim for the gates,
  • 1:47 - 1:51
    right? That's the weakest point in
    any fortress defense. And that's exactly
  • 1:51 - 1:57
    the idea of this research. So it turns out
    to apply to enclaves as well. And we have
  • 1:57 - 2:02
    been ramming the enclave gates. We have
    been attacking the input/output interface
  • 2:02 - 2:08
    of the enclave. So a very simple idea, but
    very drastic consequences I dare to say.
  • 2:08 - 2:15
    So this is sort of the summary of our
    research. With over 40 interface
  • 2:15 - 2:20
    sanitization vulnerabilities that we found
    in over 8 widely used open source enclave
  • 2:20 - 2:27
    projects. So we will go a bit into detail
    over that in the rest of the slides. Also,
  • 2:27 - 2:32
    a nice thing to say here is that this
    resulted in two academic papers to date,
  • 2:32 - 2:39
    over 7 CVEs and altogether quite some
    responsible disclosure, lengthy embargo
  • 2:39 - 2:46
    periods.
    David Oswald: OK, so, uh, I guess we
  • 2:46 - 2:55
    should talk about why we need such enclave
    fortresses anyway. So if you look at a
  • 2:55 - 3:00
    traditional kind of like operating system
    or computer architecture, you have a very
  • 3:00 - 3:06
    large trusted computing base. So you, for
    instance, on the laptop that you most
  • 3:06 - 3:12
    likely use to watch this talk, you
    trust the kernel, you trust maybe a
  • 3:12 - 3:17
    hypervisor if you have one, and the whole
    hardware underneath the system: the CPU,
  • 3:17 - 3:23
    memory, maybe hard drive, a trusted
    platform module and the like. So actually
  • 3:23 - 3:29
    the problem is here with such a large TCB,
    trusted computing base, you can also have
  • 3:29 - 3:36
    vulnerabilities basically everywhere. And
    also malware hiding in all these parts. So
  • 3:36 - 3:42
    the idea of this enclaved execution is as
    we find, for instance, in Intel SGX, which
  • 3:42 - 3:48
    is built into most recent Intel
    processors, is that you take most of the
  • 3:48 - 3:54
    software stack between an actual
    application, here the enclave app and the
  • 3:54 - 4:01
    actual CPU out of the TCB. So now you only
    trust really the CPU and of course, you
  • 4:01 - 4:05
    trust your own code, but you don't have to
    trust the OS anymore. And SGX, for
  • 4:05 - 4:10
    instance, promises to protect against an
    attacker who has achieved root in the
  • 4:10 - 4:15
    operating system. And even depending on
    who you ask against, for instance, a
  • 4:15 - 4:21
    malicious cloud provider. So imagine you
    run your application on the cloud and then
  • 4:21 - 4:27
    you can still run your code in a trusted
    way with hardware level isolation. And you
  • 4:27 - 4:31
    have attestation and so on. And you
    no longer really have to trust even the
  • 4:31 - 4:41
    administrator. So the problem is, of
    course, that attack surface remains, so
  • 4:41 - 4:47
    previous attacks and some of them, I think
    will also be presented at this remote
  • 4:47 - 4:52
    Congress this year, have targeted
    vulnerabilities in the microarchitecture
  • 4:52 - 4:59
    of the CPU. So you are hacking basically
    the hardware level. So you had foreshadow,
  • 4:59 - 5:06
    you had microarchitectural data sampling,
    spectre and LVI and the like. But what
  • 5:06 - 5:10
    less attention has been paid to and what
    we'll talk about more in this presentation
  • 5:10 - 5:17
    is the software level inside the enclave,
    which I hinted at, that there is some
  • 5:17 - 5:22
    software that you trust. But now we'll
    look in more detail into what actually is
  • 5:22 - 5:30
    in such an enclave. Now from the
    software side. So can an attacker exploit
  • 5:30 - 5:34
    any classical software vulnerabilities in
    the enclave?
  • 5:36 - 5:41
    Jo: Yes David, that's quite an interesting
    approach, right? Let's aim for the
  • 5:41 - 5:45
    software. So we have to understand what is
    the software landscape out there for these
  • 5:45 - 5:50
    SGX enclaves and TEEs in general. So
    that's what we did. We started with an
  • 5:50 - 5:54
    analysis and you see some screenshots
    here. This is actually a growing open
  • 5:54 - 5:59
    source ecosystem. Many, many of these
    runtimes, library operating systems, SDKs.
  • 5:59 - 6:04
    And before we dive into the details, I
    want to pause on what is the
  • 6:04 - 6:10
    common factor that all of them share,
    right? What is kind of the idea of these
  • 6:10 - 6:17
    enclave development environments? So here,
    what any TEE, trusted execution
  • 6:17 - 6:22
    environment gives you is this notion of a
    secure enclave oasis in a hostile
  • 6:22 - 6:27
    environment. And you can do secure
    computations in the green box while the
  • 6:27 - 6:33
    outside world is burning. As with any
    defense mechanism, as I said earlier, the
  • 6:33 - 6:38
    devil is in the details and typically at
    the gate, right? So how do you mediate
  • 6:38 - 6:43
    between that untrusted world where the
    desert is on fire, and the secure oasis in
  • 6:43 - 6:48
    the enclave? And the intuition here is
    that you need some sort of intermediary
  • 6:48 - 6:53
    software layer, what we call a shielding
    runtime. So it kind of makes a secure
  • 6:53 - 6:58
    bridge to go from the untrusted world to
    the enclave and back. And that's what we
  • 6:58 - 7:04
    are interested in. To see, what kind of
    security checks you need to do there. So
  • 7:04 - 7:08
    it's quite a beautiful picture you have on
    the right, the fertile enclave and on the
  • 7:08 - 7:14
    left the hostile desert. And we make this
    secure bridge in between. And what we are
  • 7:14 - 7:20
    interested in is what if it goes wrong?
    What if your bridge itself is flawed? So
  • 7:20 - 7:26
    to answer that question, we look at that
    yellow box and we ask what kind of
  • 7:26 - 7:30
    sanitization, what kind of security checks
    do you need to apply when you go from the
  • 7:30 - 7:35
    outside to the inside and back from the
    inside to the outside. And one of the key
  • 7:35 - 7:39
    contributions that we have built up in the
    past two years of this research, I think,
  • 7:39 - 7:46
    is that that yellow box can be subdivided
    into 2 smaller subsequent layers. And the
  • 7:46 - 7:51
    first one is this ABI, application binary
    interface, very low level CPU state. And
  • 7:51 - 7:55
    the second one is what we call API,
    application programming interface. So
  • 7:55 - 7:58
    that's the kind of state that is already
    visible at the programming language. In the
  • 7:58 - 8:02
    remainder of the presentation, we will
    kind of guide you through some relevant
  • 8:02 - 8:06
    vulnerabilities on both these layers to
    give you an understanding of what this
  • 8:06 - 8:12
    means. So first, Fritz will guide you to
    the exciting low level landscape of the
  • 8:12 - 8:15
    ABI.
    Fritz: Yeah, exactly. And Jo, you just
  • 8:15 - 8:22
    said it's the CPU state and it's the
    application binary interface. But let's
  • 8:22 - 8:27
    take a look at what this means, actually.
    So it means basically that the attacker
  • 8:27 - 8:39
    controls the CPU register contents and
    that... On every enclave entry and every
  • 8:39 - 8:46
    enclave exit, we need to perform some
    tasks, so that the enclave and the
  • 8:46 - 8:57
    trusted runtime have some like, well
    initialized CPU state and the compiler can
  • 8:57 - 9:03
    work with the calling conventions that it
    expects. So these are basically the key
  • 9:03 - 9:09
    parts. We need to initialize the CPU
    registers when entering the enclave and
  • 9:09 - 9:16
    scrub them when exiting the
    enclave. So we can't just take anything
  • 9:16 - 9:21
    that the attacker gives us as a given. We
    have to initialize it to something proper.
  • 9:21 - 9:30
    And we looked at multiple TEE runtimes and
    multiple TEEs and we found a lot of
  • 9:30 - 9:38
    vulnerabilities in this ABI layer. And one
    key insight of this analysis is basically
  • 9:38 - 9:45
    that a lot of these vulnerabilities happen
    on complex instruction set processors, so
  • 9:45 - 9:52
    on CISC processors and basically on the
    Intel SGX TEE. We also looked at some RISC
  • 9:52 - 9:58
    processors and of course, it's not
    representative, but it's like immediately
  • 9:58 - 10:06
    visible that the complex x86 ABI seems
    to have a much larger attack
  • 10:06 - 10:14
    surface than the simpler RISC designs. So
    let's take a look at one example of this
  • 10:14 - 10:20
    more complex design. So, for example,
    there are the x86 string instructions that
  • 10:20 - 10:27
    are controlled by the direction flag. So
    there's a special x86 rep instruction that
  • 10:27 - 10:33
    basically allows you to perform streamed
    memory operations. So if you do a memset
  • 10:33 - 10:41
    on a buffer, this will be compiled to the
    rep string operation instruction. And the
  • 10:41 - 10:51
    idea here is basically that the buffer is
    read from left to right and overwritten
  • 10:51 - 10:57
    by memset. But this direction flag also
    allows you to go through it from right to
  • 10:57 - 11:03
    left. So backwards. Let's not think about
    why this was a good idea or why this is
  • 11:03 - 11:09
    needed. But definitely it is possible to
    just set the direction flag to one and run
  • 11:09 - 11:16
    this buffer backwards. And what we found
    out is that the System-V ABI actually says
  • 11:16 - 11:21
    that this flag must be clear, i.e. set to
    forward, on function entry and return.
  • 11:21 - 11:27
    And that compilers expect this to happen.
    So let's take a look at this when we do
  • 11:27 - 11:34
    this in our enclave. So in our enclave,
    when we, in our trusted application,
  • 11:34 - 11:40
    perform this memset on our buffer, on
    normal entry with the normal direction
  • 11:40 - 11:45
    flag this just means that we walk this
    buffer from front to back. So you can see
  • 11:45 - 11:52
    here it just runs correctly from front to
    back. But now, if the attacker enters the
  • 11:52 - 11:59
    enclave with the direction flag set to 1
    so set to run backwards, this now means
  • 11:59 - 12:06
    that from the start of our buffer. So from
    where the pointer points right now, you
  • 12:06 - 12:11
    can now see it actually runs backwards. So
    that's a problem. And that's definitely
  • 12:11 - 12:16
    something that we don't want in our
    trusted applications because, well, as you
  • 12:16 - 12:23
    can think, it allows you to overwrite keys
    that are in memory locations that you
  • 12:23 - 12:27
    can reach going backwards. It allows you to read
    out things, that's definitely not
  • 12:27 - 12:33
    something that is useful. And when we
    reported this, this actually got a nice
  • 12:33 - 12:39
    CVE assigned with the base score High, as
    you can see here on the next slide. And
  • 12:39 - 12:47
    while you may say, OK, well, that's one
    instance. And you just have to think of
  • 12:47 - 12:54
    all the flags to sanitize and all the
    flags to check. But wait, of course,
  • 12:54 - 13:03
    there's always more, right? So as we found
    out, there's actually the floating point
  • 13:03 - 13:07
    unit, which comes with a like, whole lot
    of other registers and a whole lot of
  • 13:07 - 13:17
    other things to exploit. And I will spare
    you all the details. But just for this
  • 13:17 - 13:26
    presentation, just know that there is an
    older x87 FPU and a newer SSE unit that does
  • 13:26 - 13:32
    vector floating point operations. So
    there's the FPU control word and the MXCSR
  • 13:32 - 13:40
    register for these newer instructions. And
    this x87 FPU is older, but it's still used
  • 13:40 - 13:46
    for example, for extended precision, like
    long double variables. So old and new
  • 13:46 - 13:49
    doesn't really apply here because both are
    still relevant. And that's kind of the
  • 13:49 - 13:58
    thing with x86 and x87 here. That old
    archaic things that you could say are
  • 13:58 - 14:03
    outdated, are still relevant or are still
    used nowadays. And again, if you look at
  • 14:03 - 14:09
    the System-V ABI now, we saw that these
    control bits are callee-saved. So they are
  • 14:09 - 14:14
    preserved across function calls. And the
    idea here, which to some degree holds
  • 14:14 - 14:22
    merit, is that this is some global
    state that you can set and that is
  • 14:22 - 14:28
    preserved within one application. So one
    application can set some global state and
  • 14:28 - 14:35
    keep the state across all its usage. But
    the problem here, as you can see, is
  • 14:35 - 14:40
    our application or enclave is basically
    one application, and we don't want our
  • 14:40 - 14:44
    attacker to have control over the global
    state within our trusted application,
  • 14:44 - 14:53
    right? So what happens if FPU settings are
    preserved across calls? Well, on a normal,
  • 14:53 - 14:58
    for a normal user, let's say we just do
    some calculation inside the enclave. Like
  • 14:58 - 15:03
    2.1 times 3.4, which just nicely
    calculates to a 7.14, a long double.
  • 15:03 - 15:10
    That's nice, right? But what happens if
    the attacker now enters the enclave with
  • 15:10 - 15:16
    some corrupt precision and rounding modes
    for the FPU? Well, then we actually get
  • 15:16 - 15:22
    another result. So we get distorted
    results with a lower precision and a
  • 15:22 - 15:26
    different rounding mode. So actually it's
    rounding down here, whenever it exceeds
  • 15:26 - 15:31
    the precision. And this is something we
    don't want, right? So this is something
  • 15:31 - 15:38
    where the developer expects a certain
    precision or long double precision, but
  • 15:38 - 15:44
    the attacker could actually just reduce it
    to a very low precision. And we reported
  • 15:44 - 15:50
    this and we actually found this issue also
    in Microsoft OpenEnclave. That's why it's
  • 15:50 - 15:56
    marked as not exploitable here. But what
    we found interesting is that the Intel SGX
  • 15:56 - 16:01
    SDK, which was vulnerable, patched this
    with an xrstor instruction, which
  • 16:01 - 16:10
    completely restores the extended state to
    a known value, while OpenEnclave only
  • 16:10 - 16:16
    restored the specific register that was
    affected, the ldmxcsr instruction. And
  • 16:16 - 16:20
    so let's just skip over the next few
    slides here, because I just want to give
  • 16:20 - 16:27
    you the idea that this was not enough. So
    we found out that even if you restored
  • 16:27 - 16:33
    this specific register, there's still
    another data register that you can just
  • 16:33 - 16:40
    mark as in use before entering the enclave
    and with which the attacker can make
  • 16:40 - 16:46
    any floating point calculation result in
    a NaN, not-a-number. And this is silent, so
  • 16:46 - 16:50
    this is not programming language specific,
    this is not developer specific. This is a
  • 16:50 - 16:56
    silent ABI issue that the calculations are
    just not a number. So we also reported
  • 16:56 - 17:04
    this. And now, thankfully, all enclave
    runtimes use this full xrstor instruction
  • 17:04 - 17:10
    to fully restore this extended state. So
    it took two CVEs, but now luckily, they
  • 17:10 - 17:16
    all perform this nice full restore. So I
    don't want to go to the full details of
  • 17:16 - 17:21
    our use cases now or of our case studies
    that we did now. So let me just give you
  • 17:21 - 17:29
    the ideas of these case studies. So we
    looked at these issues and wanted to look
  • 17:29 - 17:37
    into whether they are just theoretical or
    whether they are really bad. And we found that we
  • 17:37 - 17:42
    can use overflows as a side channel to
    deduce secrets. So, for example, the
  • 17:42 - 17:49
    attacker could use this register to unmask
    exceptions that inside the
  • 17:49 - 17:58
    enclave are then triggered by some input
    dependent multiplication. And we found out
  • 17:58 - 18:03
    that these side channels if you have some
    input dependent multiplication can
  • 18:03 - 18:12
    actually be used in the enclave to perform
    a binary search on this input space. And
  • 18:12 - 18:17
    we can actually retrieve this
    multiplication secret with a deterministic
  • 18:17 - 18:24
    number of steps. So even though we just
    have a single mask bit we flip, we can
  • 18:24 - 18:32
    actually retrieve a secret with
    deterministic steps. And just
  • 18:32 - 18:37
    so that you know, there's more you can do.
    We can also do machine learning in the
  • 18:37 - 18:44
    enclave. So Jo said it nicely, you can run
    it inside the TEE, inside the cloud. And
  • 18:44 - 18:48
    that's great for machine learning, right?
    So let's do a handwritten digit
  • 18:48 - 18:55
    recognition. And if you look at just the
    model that we look at, we just have two
  • 18:55 - 19:01
    users where one user pushes some
    machine learning model and the other user
  • 19:01 - 19:06
    pushes some input and everything is
    protected with enclaves, right?
  • 19:06 - 19:11
    Everything is secure. But we actually
    found out that we can poison these FPU
  • 19:11 - 19:18
    registers and degrade the performance of
    this machine learning down from all digits
  • 19:18 - 19:24
    were predicted correctly to just eight
    percent predicted correctly. And
  • 19:24 - 19:32
    actually it was just predicting
    the same number for all digits. And this basically made
  • 19:32 - 19:38
    this machine learning model useless,
    right? There's more we did: we can also
  • 19:38 - 19:42
    attack Blender, causing slight
    image differences between rendered
  • 19:42 - 19:49
    images. But this is just for you to see
    that these are small but tricky things,
  • 19:49 - 19:56
    and to indicate that things can go wrong very
    fast on the ABI level once you play around
  • 19:56 - 20:03
    with it. So this is about the CPU state.
    And now we will talk more about the
  • 20:03 - 20:06
    application programming interface that I
    think more of you will be comfortable
  • 20:06 - 20:09
    with.
    David: Yeah, uh, thank you,
  • 20:09 - 20:14
    Fritz. We take a quite simple example. So
    let's assume that we actually load a
  • 20:14 - 20:19
    standard Unix binary into such an enclave,
    and there are frameworks that can do that,
  • 20:19 - 20:25
    such as Graphene or so. And what I want to
    illustrate with that example is that it's
  • 20:25 - 20:30
    actually very important to check where
    pointers come from. Because the enclave
  • 20:30 - 20:35
    kind of partitions memory into untrusted
    memory and enclave memory and they live in
  • 20:35 - 20:41
    a shared address space. So the problem
    here is as follows. Let's assume we have
  • 20:41 - 20:47
    an echo binary that just prints an input.
    And we give it as an argument a string and
  • 20:47 - 20:53
    that normally, when everything is fine,
    points to some string, let's say hello
  • 20:53 - 20:58
    world, which is located in the untrusted
    memory. So if everything runs as it
  • 20:58 - 21:03
    should, this enclave will run, will get
    the pointer to untrusted memory and will
  • 21:03 - 21:09
    just print that string. But the problem is
    now actually the enclave has access also
  • 21:09 - 21:16
    to its own trusted memory. So if you don't
    check this pointer and the attacker passes
  • 21:16 - 21:21
    a pointer to a secret that might live in
    enclave memory, what will happen? Well the
  • 21:21 - 21:25
    enclave will fetch it from there and will
    just print it. So suddenly you have turned
  • 21:25 - 21:32
    this, kind of like, into a memory
    disclosure vulnerability. And we can see
  • 21:32 - 21:36
    that in action here for the framework
    named Graphene that I mentioned. So we
  • 21:36 - 21:41
    have a very simple hello world binary and
    we run it with a couple of command line
  • 21:41 - 21:45
    arguments. And now on the untrusted side,
    we actually change a memory address to
  • 21:45 - 21:50
    point into enclave memory. And as you can
    see, normally, it should print here test,
  • 21:50 - 21:55
    but actually it prints a super secret
    enclave string that lived inside
  • 21:55 - 22:01
    the memory space of the enclave. So
    these kind of vulnerabilities are quite
  • 22:01 - 22:06
    well known from user-to-kernel research
    and from other instances. And they're
  • 22:06 - 22:12
    called confused deputy. So the deputy kind
    of like has a gun now and can read enclave
  • 22:12 - 22:17
    memory and suddenly does something
    which it is not supposed to do, because it
  • 22:17 - 22:22
    didn't really check where
    the memory belongs. So I
  • 22:22 - 22:28
    think this vulnerability, uh, seems
    to be quite trivial to solve. You simply
  • 22:28 - 22:32
    check all the time where, uh, where
    pointers come from. But as you will tell,
  • 22:32 - 22:38
    you know, it's often not quite that
    easy.
    Jo: Yes, David, that's quite insightful
  • 22:38 - 22:42
    that we should check all of the pointers.
    So that's what we did. We checked all of
  • 22:42 - 22:46
    the pointer checks and we noticed that
    Endo has a very interesting
  • 22:46 - 22:50
    way to check these things. Of course,
    the code is high quality. They checked all
  • 22:50 - 22:53
    of the pointers, but you have to do
    something special for strings. We're
  • 22:53 - 22:58
    talking here about the C programming language.
    So strings are null-terminated.
  • 22:58 - 23:03
    They end with a zero byte and you can use a
    function such as strlen to compute
  • 23:03 - 23:06
    the length of this string. And let's see
    how they check whether a string lies
  • 23:06 - 23:11
    completely outside of enclave memory. So the first
    step is you compute the length of the
  • 23:11 - 23:16
    string, here it's ten, and then you check
    whether the string from start to end lives
  • 23:16 - 23:19
    completely outside of the enclave. That
    sounds all legitimate. Then you accept the
    string. So this works beautifully. Let's
    steam. So so this works beautifully. Let's
    see, however, how it behaves when
    we are attacked. So we are not going to
    we partnered. And so we are not going to
    pass a string that lies in the world outside of
    the enclave; we pass a pointer to a secret,
    the enclave that we pass on string secret,
    one that lies within the enclave. So the first
  • 23:34 - 23:38
    step will be that the enclave starts
    computing the length of that string that
  • 23:38 - 23:43
    lies within the enclave. That sounds
    already fishy, but then luckily everything
  • 23:43 - 23:47
    comes OK because then it will detect that
    this actually should never have been done
  • 23:47 - 23:51
    and that this string lies inside the
    enclave. So it will reject the call,
  • 23:51 - 23:56
    the call into the enclave. So that's
    fine. But some of you who know side
  • 23:56 - 24:00
    channels know that this is exciting
    because the enclave did some computation
  • 24:00 - 24:04
    it was never supposed to do. And the
    length of that computation depends on the
  • 24:04 - 24:10
    amount of non-zero bytes within the
    enclave. So what we have here is a side
  • 24:10 - 24:16
    channel where the enclave will always
    return false. But the time it takes to
  • 24:16 - 24:22
    return false depends on the amount of
    non-zero bytes inside that secret enclave
  • 24:22 - 24:27
    memory block. So that's what we found. We
    are excited and we said, OK, it's a simple
  • 24:27 - 24:32
    timing channel. Let's go with that. So we
    did that and you can see a graph here and
  • 24:32 - 24:36
    it turns out it's not as easy as it seems.
    So I can tell you that the blue one is for
  • 24:36 - 24:40
    a string of length one, and the other one is
    for a string of length two. But there is no
  • 24:40 - 24:44
    way you can see that from that graph
    because these SGX processors are
  • 24:44 - 24:48
    lightning fast so that one single
    increment instruction completely
  • 24:48 - 24:53
    dissolves into the pipeline. You will not
    see that by measuring execution time.
  • 24:53 - 24:59
    So we need something different. And
    we have smart papers in the literature;
  • 24:59 - 25:04
    one of the very common attacks on SGX is
    also something that Intel describes here.
  • 25:04 - 25:10
    You can see which memory pages, 4 KB memory
    blocks, are being accessed while the
  • 25:10 - 25:14
    enclave executes, because you control the
    operating system and the paging machinery.
  • 25:15 - 25:20
    So that's what we tried to do. We thought
    this is a nice channel and we were there
  • 25:20 - 25:24
    scratching our heads, looking at that code
    of a very simple for loop that fits entirely
  • 25:24 - 25:29
    within one page and a very short string
    that fits entirely within one page. So
  • 25:29 - 25:34
    just having access to 4 KB memory pages is
    not going to help us here, because
  • 25:35 - 25:39
    both the code and the data fit on a
    single page. So this is essentially what
  • 25:39 - 25:44
    we call the temporal resolution of the
    side channel. This is not accurate enough. So
  • 25:44 - 25:51
    we need another trick. And well, here we
    have been working on quite an exciting
  • 25:51 - 25:55
    framework. It uses interrupts and it's
    called SGX-Step. So it's a completely
  • 25:55 - 26:01
    open source framework on GitHub. And what
    it allows you to do essentially is to
  • 26:01 - 26:05
    execute an enclave one step at a time,
    hence the name. So it allows you to
  • 26:05 - 26:09
    interleave the execution of the enclave
    with attacker code after every single
  • 26:09 - 26:13
    instruction. And the way we pull it off is
    highly technical. We have this Linux
  • 26:13 - 26:18
    kernel driver and a little library
    operating system in userspace, but that's
  • 26:18 - 26:23
    a bit out of scope. What matters is that we
    can interrupt an enclave after every
  • 26:23 - 26:28
    single instruction. And then let's see what
    we can do with that. So what we
  • 26:28 - 26:34
    essentially can do here is to execute the
    enclave with all of these increment
  • 26:34 - 26:39
    instructions one at a time, and after
    every interrupt, we can simply check
  • 26:39 - 26:45
    whether the enclave accessed the string
    residing at our target. That's another way
  • 26:45 - 26:51
    to think about it, is that we have that
    execution of the enclave and we can break
  • 26:51 - 26:57
    that up into individual steps and then
    just count the steps and hence, and hence
  • 26:57 - 27:03
    get a deterministic timing. So in other words,
    we have an oracle that tells you where all
  • 27:03 - 27:09
    zero bytes are in the enclave. I don't know
    if that's useful, actually, David?
    David: It turns
  • 27:09 - 27:13
    out that, I mean, some people who
    might be more into exploitation already
  • 27:13 - 27:18
    know that it's good to know whether zero
    is somewhere in memory or not. And we do
  • 27:18 - 27:24
    now do one example where we break AES-NI,
    which is the hardware acceleration
  • 27:24 - 27:29
    of Intel processors for AES. Notably,
    that actually operates only on registers.
  • 27:29 - 27:34
    And you just said you can kind of like do
    that only on memory, but there's
  • 27:34 - 27:39
    another trick that comes into play here.
    So whenever the enclave is interrupted, it
  • 27:39 - 27:44
    will store its current register state
    somewhere to memory Quazi as a frame so we
  • 27:44 - 27:50
    can actually interrupt it and clarify make
    it right. It's memory to to it's it's
  • 27:50 - 27:57
    register sorry to to say memory. And then
    we can run the zero byte oracle on this
  • 27:57 - 28:03
    SSA memory. And what we figure out is
    where zero is or if there's any zero in
  • 28:03 - 28:09
    the state. So I don't want to go into the
    gory details of a yes. But what we
  • 28:09 - 28:16
    basically do is we find whenever there's a
    zero in the state before the
  • 28:16 - 28:22
    last round of AES, and then that zero will
    go through the S-box, will be XORed to a key
  • 28:22 - 28:28
    byte, and then that will give us a cipher
    text. But we actually know the ciphertext
  • 28:28 - 28:34
    byte so we can go backwards. So we can
    kind of compute, uh, we can compute from
  • 28:34 - 28:40
    zero up to here and from here to this XOR.
    And that way we can compute directly one
  • 28:40 - 28:46
    key byte. So we repeat that whole thing 16
    times until we have found a zero in every
  • 28:46 - 28:51
    byte of the state before the last round.
    And that way we get the whole final round
  • 28:51 - 28:56
    key. And for those that know AES: if you
    have one round key, you have the whole key
  • 28:56 - 29:01
    in it. So you get like the original key,
    you can go backwards. So it sounds
  • 29:01 - 29:06
    complicated, but it's actually a very fast
    attack when you see it running. So here is
  • 29:06 - 29:11
    a screencast of this attack, and as you can
    see, within a couple of seconds and maybe
  • 29:11 - 29:16
    five hundred twenty invocations of
    AES-NI, we get the full key. That's
  • 29:16 - 29:21
    actually quite impressive, especially
    because one of the selling
  • 29:21 - 29:26
    points of AES-NI is that you don't put
    anything in memory, but this
  • 29:26 - 29:33
    interaction with SGX allows you to force
    things into
  • 29:33 - 29:41
    memory after all. So I want to wrap up here. We
    have found various other attacks. Yeah.
  • 29:41 - 29:48
    So, both in research code and in
    production code, such as the Intel SDK and
  • 29:48 - 29:53
    the Microsoft SDK. And they basically cover
    the whole range of
  • 29:53 - 29:58
    vulnerabilities that we have often seen
    already in user-to-kernel research. But there
  • 29:58 - 30:03
    are also some interesting new
    kinds of vulnerabilities due to
  • 30:03 - 30:08
    some of the aspects we explained. There
    was also a problem with ocalls,
  • 30:08 - 30:14
    when the enclave calls into untrusted
    code, which is used when you want to, for
  • 30:14 - 30:19
    instance, emulate system calls and so on.
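    A toy model of that ocall pitfall, in Python with hypothetical names (real
    enclave runtimes are written in C; nothing here is taken from any actual
    SDK): the enclave trusts a length returned by untrusted code and uses it
    to index its own buffer.

    ```python
    # Toy model of an unchecked ocall return value. The "enclave" asks
    # untrusted code to read data and trusts the returned length when
    # indexing its own 16-byte buffer.

    ENCLAVE_BUF_SIZE = 16

    def malicious_ocall_read():
        # Untrusted code may return anything, e.g. a length that is far
        # too large for the enclave's buffer.
        return 1024

    def enclave_read_unchecked():
        n = malicious_ocall_read()
        # Using n as a length without any check: an out-of-bounds access
        # waiting to happen once n indexes into enclave memory.
        return n

    def enclave_read_sanitized():
        n = malicious_ocall_read()
        # Sanitize the untrusted return value before using it.
        if not (0 <= n <= ENCLAVE_BUF_SIZE):
            raise ValueError("untrusted ocall returned an invalid length")
        return n

    assert enclave_read_unchecked() > ENCLAVE_BUF_SIZE  # bug: trusted blindly
    try:
        enclave_read_sanitized()
        ok = False
    except ValueError:
        ok = True  # the sanitized variant rejects the bogus length
    assert ok
    ```

    The next sentences of the talk describe exactly this failure mode: a
    wrong ocall return value steering the enclave out of bounds.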
    And if you return some kind of
  • 30:19 - 30:25
    wrong result here, you could again go out
    of bounds. And these issues were actually
  • 30:25 - 30:31
    quite widespread. And then finally,
    we also found some issues with padding,
  • 30:31 - 30:36
    with leakage in the padding; I don't want
    to go into details. I think we have
  • 30:36 - 30:41
    learned a lesson here that we also
    know from the real world, and that is:
  • 30:41 - 30:47
    it's important to wash your hands. So it's
    also important to sanitize enclave state,
  • 30:47 - 30:54
    to check pointers and so on. So that is
    kind of one of the take-away messages
  • 30:54 - 30:59
    really: to build an enclave
    securely, yes, you need to fix all the
  • 30:59 - 31:03
    hardware issues, but you also need to
    write safe code. And for enclaves, that
  • 31:03 - 31:10
    means you have to do proper ABI and API
    sanitization. And that's quite a difficult
  • 31:10 - 31:16
    task actually, as we've seen
    in this presentation: there's quite a
  • 31:16 - 31:21
    large attack surface due to the attack
    model, especially of Intel SGX, where
  • 31:21 - 31:26
    you can interrupt after every instruction
    and so on. And I think, from a research
  • 31:26 - 31:32
    perspective, there's really a need for a
    more systematic approach. To continue, maybe
  • 31:32 - 31:38
    we can learn something from
    the user-to-kernel analogy which I
  • 31:38 - 31:44
    invoked a couple of times: we can learn
    what an enclave
  • 31:44 - 31:49
    should do from what we know about
    what a kernel should do. But there
  • 31:49 - 31:54
    are quite important differences that also
    need to be taken into account. So I think, as
  • 31:54 - 32:00
    we said, all our code is open
    source. So you can find it at the
  • 32:00 - 32:07
    GitHub links below, and you can of course
    also ask questions after you have watched this
  • 32:07 - 32:15
    talk. So thank you very much.
    Herald: Hello, we're back again. Here
  • 32:15 - 32:22
    come the questions. Hello, nice to see you
    live. Um, we have no questions
  • 32:22 - 32:28
    yet, so you can put questions in the
    chat below if you have any. Otherwise, let
  • 32:28 - 32:37
    me close this up and ask you some questions:
    how did you come upon this topic, and how did you
  • 32:37 - 32:43
    meet? Jo van Bulck: Uh, well, that's actually
    interesting. I think this research has been
  • 32:43 - 32:50
    building up over the years.
    So some of the
  • 32:50 - 32:57
    vulnerabilities from our initial paper I
    actually started to collect in my master's
  • 32:57 - 33:02
    thesis, and we didn't
    really see the big picture until I
  • 33:02 - 33:07
    met David and his colleagues from
    Birmingham at an event in London, a nice
  • 33:07 - 33:11
    conference. And then we started to
    collaborate on this, and we looked at
  • 33:11 - 33:15
    it a bit more systematically. So I started
    with this whole list of vulnerabilities,
  • 33:15 - 33:20
    and then with David we turned
    it into a more systematic analysis.
  • 33:20 - 33:26
    And that was sort of a Pandora's box, I
    dare to say: from that moment on, we saw the
  • 33:26 - 33:32
    same kinds of errors being repeated. And
    then also Fritz, who recently joined
  • 33:32 - 33:36
    our team in Leuven, started working
    together with us on one or more of these
  • 33:36 - 33:41
    low-level CPU state issues. And that's a
    Pandora's box in itself, I would say.
  • 33:41 - 33:47
    Especially, one of the lessons, as we said,
    is that Intel SGX is extremely complex.
  • 33:47 - 33:51
    And it turns out that almost all of that
    complexity, I would say, can potentially
  • 33:51 - 33:56
    be abused by adversaries. So it's more
    like a fractal within a fraction of a
  • 33:56 - 34:02
    fractal, where you're opening a box and
    you're getting more and more questions out
  • 34:02 - 34:09
    of it, in a way, I think. Yes, I think it's
    fair to say this research is not the
  • 34:09 - 34:14
    final answer to this, but it's an
    attempt to give a systematic way of
  • 34:14 - 34:19
    looking at what is probably a never-ending
    stream of findings. Herald: So there is a
  • 34:19 - 34:26
    question from the Internet. So: are there
    any other circumstances where the CPU
  • 34:26 - 34:33
    is writing its registers
    into memory, or is this exclusive
  • 34:33 - 34:44
    to SGX? Jo: So, I do not fully understand
    the question either, but I think
  • 34:44 - 34:49
    the question is whether this attack
    is SGX-specific. It depends, of course,
  • 34:50 - 34:55
    on having a memory disclosure about the
    register content, and on the ability to
  • 34:55 - 34:59
    interrupt the enclave to kind of forcibly
    write the register content into memory.
  • 35:00 - 35:05
    So that is definitely SGX-specific, um.
    However, I would say one of the
  • 35:05 - 35:09
    lessons from the past five years of
    research is that often these things
  • 35:09 - 35:13
    generalize beyond SGX, at least as a
    general concept: let's say the
  • 35:13 - 35:19
    insight that CPU registers end up in
    memory one way or another, sooner or later.
  • 35:19 - 35:23
    I think that also applies to operating
    systems: if you somehow can force an
  • 35:23 - 35:26
    operating system to context-switch between
    applications, it also has to save
  • 35:27 - 35:32
    registers temporarily in memory. So if you
    have something similar in an
  • 35:32 - 35:37
    operating system kernel, you
    could potentially mount a similar attack.
  • 35:38 - 35:44
    But maybe David wants to say something
    about operating systems there as well. David: No,
  • 35:44 - 35:48
    no, not really. I think, like, one
    thing that helps with SGX is that you have
  • 35:48 - 35:53
    very precise control, as Jo explained,
    with the interrupts and so on, because
  • 35:53 - 35:58
    you are root outside
    the enclave. So you can essentially
  • 35:58 - 36:03
    single-step the whole enclave, whereas
    interrupting the operating
  • 36:03 - 36:08
    system repeatedly at exactly the
    point you want, or some other process,
  • 36:09 - 36:14
    tends to be probably harder just
    by design. But of course, on a context
  • 36:14 - 36:19
    switch the CPU has to save its register
    set somewhere, and then it will end up
  • 36:19 - 36:26
    in memory in some situations; probably not
    as controlled as it is for
  • 36:26 - 36:34
    SGX. Herald: So there is the question: what
    about other CPU architectures other than
  • 36:34 - 36:42
    Intel, did you test those? So maybe I
    can go into this. Well, Intel SGX,
  • 36:42 - 36:48
    that's the largest one, with the
    largest software base and the most
  • 36:48 - 36:53
    runtimes that we could look at, right?
    But there are of course others: we
  • 36:53 - 37:01
    have a TEE that we developed
    some years ago, it's called Sancus. And
  • 37:01 - 37:05
    of course, for this there are similar
    issues, right? So you always need the
  • 37:05 - 37:15
    software layer to interact, to enter
    the enclave. And I think you,
  • 37:15 - 37:21
    David, in earlier work also found
    issues in other TEEs. So it's not just Intel
  • 37:21 - 37:27
    and related projects that
    mess up there, of course. But what we
  • 37:27 - 37:34
    definitely found is that it's easier to
    think of all corner cases for simpler
  • 37:34 - 37:38
    designs, like RISC-V or simpler RISC designs,
    than for this complex Intel SGX
  • 37:39 - 37:44
    architecture, right? So right now there
    are not that many alternatives to Intel
  • 37:44 - 37:49
    SGX. So they have the advantage and
    disadvantage of being the first widely
  • 37:49 - 37:56
    deployed, let's say. And, um, I think
    as soon as others start to grow
  • 37:56 - 38:01
    and simpler designs start to be more
    common, I think we will see that
  • 38:01 - 38:06
    it's easier to fix all edge cases for
    simpler designs. Herald: OK, so what is a
  • 38:06 - 38:19
    reasonable alternative to TEEs? Whichever
    of you wants to take that. Jo: Do you want,
  • 38:19 - 38:27
    or should I? Uh, well, we can
    probably both give our perspectives. So I
  • 38:27 - 38:32
    think, well, the question to start
    with, of course, is: do we need an
  • 38:32 - 38:35
    alternative, or do we need to find more
    systematic ways to sanitize
  • 38:35 - 38:39
    these runtimes? That's, I think, one part of
    the answer here: we don't have to
  • 38:39 - 38:43
    necessarily throw TEEs away because we
    have problems with them. We can also look
  • 38:43 - 38:47
    at how to solve those problems. But apart
    from that, there is some exciting
  • 38:47 - 38:52
    research. OK, maybe David also wants to
    say a bit more about, for instance,
  • 38:52 - 38:57
    capabilities; in a way that's not
    so different from TEEs necessarily. But
  • 38:57 - 39:01
    when you have hardware support for
    capabilities, like the CHERI
  • 39:01 - 39:05
    research computer, which essentially
    associates metadata with a pointer,
  • 39:05 - 39:10
    like permission checks, then you
    could, at least for some classes of the
  • 39:10 - 39:15
    issues we talked about, pointer
    poisoning attacks, natively catch those
  • 39:15 - 39:21
    with such hardware support. But it's
    a very high-level idea. Maybe David wants
  • 39:21 - 39:26
    to say something. David: Yeah, so I think,
    like, an alternative to a TEE... whenever you
  • 39:26 - 39:32
    want to partition your system into
    parts, which is, I think, a good idea, and
  • 39:32 - 39:38
    everybody is now doing that, also in
    how we build online services and such,
  • 39:38 - 39:44
    then TEEs are systems that we have
    become quite used to, from mobile
  • 39:44 - 39:49
    phones, or maybe even from
    something like a banking card,
  • 39:49 - 39:53
    which is sort of a protected
    environment for a very simple job. But the
  • 39:53 - 39:58
    problem starts when you throw a lot
    of functionality into the TEE. As we saw,
  • 39:58 - 40:03
    the trusted code base becomes more and
    more complex and you get traditional bugs.
  • 40:03 - 40:08
    So I'm saying, like, yeah, it's really a
    question whether you need an alternative or
  • 40:08 - 40:12
    a better way of approaching how you
    partition software. And as Jo mentioned,
  • 40:12 - 40:16
    there are some other things you can do
    architecturally: you can change, or
  • 40:16 - 40:21
    extend, the way we build
    architectures, with capabilities, and
  • 40:21 - 40:26
    then start to isolate components, for
    instance within one software project;
  • 40:26 - 40:30
    say, in your web server, you isolate the
    stack or something like this. And also,
  • 40:30 - 40:38
    thanks to the people noticing the secret
    password here. It is obviously only for
  • 40:38 - 40:46
    decoration purposes, to give the people
    something to watch. Herald: So, but it's not
  • 40:46 - 40:55
    fundamentally broken, is it? Yeah, not TEEs
    in general. I mean, there are so many of them, I
  • 40:55 - 41:02
    think, like, you cannot say "fundamentally
    broken" for... Herald: The question I had was
  • 41:02 - 41:08
    specifically about SGX at that point,
    because Signal uses it, the MobileCoin
  • 41:08 - 41:16
    cryptocurrency uses it, and so on and so
    forth. Is that fundamentally broken, or
  • 41:16 - 41:24
    would you rather say...? Jo: So, I guess it
    depends on what you call fundamental, right?
  • 41:24 - 41:30
    So, in the past, we have
    also worked on what I would call
  • 41:30 - 41:35
    full breaches of the attestation, but they
    have been fixed. And it's actually quite a
  • 41:35 - 41:41
    beautiful instance of how research can
    have short-term industry impact. So you find a
  • 41:41 - 41:46
    vulnerability; then the vendor has to
    devise a fix, which is often not immediately
  • 41:46 - 41:50
    available, and there are often workarounds
    for the problem first. And then later,
  • 41:50 - 41:54
    because we are, of course, talking
    about hardware, you need new
  • 41:54 - 41:59
    processors to really get a fundamental fix
    for the problem; until then, you have
  • 41:59 - 42:05
    temporary workarounds. So I would say, for
    instance, for a company like Signal using it:
  • 42:05 - 42:10
    it does not give you security
    by default. You need to think about
  • 42:10 - 42:14
    the software, which is what we focused on
    in this talk, and you also need to think about
  • 42:14 - 42:20
    all of the hardware and microcode patches
    on the processors, to take care of all the
  • 42:20 - 42:26
    known vulnerabilities. And then, of
    course, the question always remains whether
  • 42:26 - 42:31
    there are vulnerabilities that we don't know
    of yet, as with any secure system, I guess.
  • 42:31 - 42:37
    But maybe David also wants to say something
    about some of his latest work there.
  • 42:37 - 42:43
    That's a bit interesting. David: Yeah, so I
    think what Jo said... or, my answer to this
  • 42:43 - 42:48
    question would be: it depends on your
    threat model, really. So some people
  • 42:48 - 42:54
    use SGX as a way to kind of remove
    the trust in the cloud provider. So you
  • 42:54 - 43:00
    say, as for instance Signal does: I move all
    this functionality that is hosted
  • 43:00 - 43:05
    maybe at some cloud provider into an
    enclave, and then I don't have to
  • 43:05 - 43:11
    trust the cloud provider anymore, because
    there is also some form of protection
  • 43:11 - 43:16
    against physical access. But recently we
    actually published another attack,
  • 43:16 - 43:22
    which shows that if you have hardware
    access to an SGX processor, you can inject
  • 43:22 - 43:28
    faults into the processor by playing
    with the voltage interface, with
  • 43:28 - 43:33
    hardware. So you really solder to
    the mainboard, to a couple
  • 43:33 - 43:38
    of wires on the bus to the voltage
    regulator. And then you can do voltage
  • 43:38 - 43:44
    glitching, as some people might know
    from other embedded contexts. And that way
  • 43:44 - 43:49
    you can then essentially flip bits in the
    enclave and, of course, inject
  • 43:49 - 43:55
    all kinds of evil effects that
    can then be used further
  • 43:55 - 44:00
    to get keys out, or maybe hijack control
    flow or something. So it depends on your
  • 44:00 - 44:05
    threat model. I wouldn't say that SGX
    is completely pointless. It's, I think,
  • 44:05 - 44:10
    better than not having it at all. But
    you definitely cannot have
  • 44:10 - 44:15
    complete protection against somebody who
    has physical access to your server. Herald: So I
  • 44:15 - 44:21
    have to close this talk, it's a bummer,
    and I would have asked all the questions that
  • 44:21 - 44:26
    flew in. But one very, very fast answer,
    please: what is that password in
  • 44:26 - 44:31
    your background? David: I explained it;
    it's, of course, just a joke. So I'll
  • 44:31 - 44:36
    say it again, because some people seem to
    have taken it seriously: it was such an
  • 44:36 - 44:40
    empty whiteboard, so I put a password
    there. Unfortunately, it's not fully
  • 44:40 - 44:46
    visible on the screen. Herald: OK, so I
    thank you, Jo Van Bulck and David
  • 44:46 - 45:00
    Oswald, for that nice
    talk. And now we make the transition to
  • 45:00 - 45:04
    the next show.
  • 45:04 - 45:34
    Subtitles created by c3subtitles.de
    in the year 2021. Join, and help us!
Title:
#rC3 - Ramming Enclave Gates: A Systematic Vulnerability Assessment of TEE Shielding Runtimes
Description:

Video Language:
English
Duration:
45:34
