36C3 - Intel Management Engine deep dive

  • 0:00 - 0:19
    36C3 preroll music
  • 0:19 - 0:23
    Herald: The next talk is "Intel
    Management Engine deep dive:
  • 0:23 - 0:27
    Understanding the ME at the OS and
    hardware level", and it is by Peter Bosch.
  • 0:27 - 0:31
    Please welcome him with a great round of
    applause!
  • 0:31 - 0:39
    Applause
  • 0:39 - 0:49
    Peter Bosch: Right. So everybody. Harry.
    Nice. OK. So welcome. Well, this is me.
  • 0:49 - 1:00
    I'm a student at Leiden University. Yeah,
    I've always been really interested in how
  • 1:00 - 1:05
    stuff works. And when I got a new laptop,
    I was like, you know, how does this thing
  • 1:05 - 1:08
    really boot? I knew everything from reset
    vector onwards. I wanted to know what
  • 1:08 - 1:15
    happened before it. So first I started
    looking at the Boot Guard ACM. While
  • 1:15 - 1:21
    looking through it, I realized that not
    everything was as it was supposed to be.
  • 1:21 - 1:26
    That led to a later part in the boot
    process being vulnerable, which ended up
  • 1:26 - 1:34
    being discovered by me. And I found out
    here last year that I wasn't the only one
  • 1:34 - 1:38
    to find it. Trammell Hudson also found it,
    and we reported it together, presented it
  • 1:38 - 1:43
    at Hack in the Box. And then at the same
    time, I was already also looking at the
  • 1:43 - 1:49
    management engine. Well, there had been a
    lot of research done on that before. The
  • 1:49 - 1:58
    public info was mostly on the file system
    and on specific vulnerabilities, which
  • 1:58 - 2:04
    still made it pretty hard to get started
    on reverse-engineering it. So that's why I
  • 2:04 - 2:10
    thought it might be useful for me to
    present this work here. It's basically
  • 2:10 - 2:17
    broken up into three parts. The first bit
    is just a quick introduction into the
  • 2:17 - 2:22
    operating system it runs. So if you want
    to work on this yourself, you're more
  • 2:22 - 2:29
    easily able to understand what's in front
    of you in your disassembler. So, and then
  • 2:29 - 2:38
    after that, I'll go over its role in the
    boot process and then also how this
  • 2:38 - 2:46
    information can be used to start
    developing a new firmware for it or do
  • 2:46 - 2:50
    more security research on it. So first of
    all, what exactly is the management
  • 2:50 - 2:57
    engine? There's been a lot of fuss about
    it being a backdoor and everything. In
  • 2:57 - 3:05
    reality, whether it is or not depends on the
    software that it runs. It's basically a
  • 3:05 - 3:09
    processor with its own RAM and its own I/O
    and MMUs and everything, sitting inside
  • 3:09 - 3:16
    your southbridge. It's not in the CPU,
    it's in the southbridge. So when I say this
  • 3:16 - 3:24
    is gonna be about the sixth and seventh
    generation of Intel chips, I mean, mostly
  • 3:24 - 3:28
    motherboards from those generations. If
    you run a newer CPU on it, it will also
  • 3:28 - 3:40
    work for that. So yeah. Bit more detail.
    The CPU it runs on is based on the 80486, which,
  • 3:40 - 3:44
    you know, is funny. It's quite an old CPU
    and it's still being used in almost
  • 3:44 - 3:51
    every computer nowadays. So it has a
    little bit of its own RAM. It has quite a
  • 3:51 - 3:58
    bit of built in ROM, has a hardware
    accelerated cryptographic unit and it has
  • 3:58 - 4:05
    fuses, which are write-once memory used
    to store security settings and keys and
  • 4:05 - 4:11
    everything. Some of the more scary
    features it has: Bus bridges to all of the
  • 4:11 - 4:16
    buses inside the southbridge, it can
    access the RAM on the CPU and it can
  • 4:16 - 4:21
    access the network, which makes it really
    quite dangerous if there is a
  • 4:21 - 4:28
    vulnerability or if it runs anything
    nefarious. And its tasks nowadays include
  • 4:28 - 4:36
    starting the computer as well as adding
    management features. This is mostly used
  • 4:36 - 4:41
    in servers, where it can serve as a baseboard
    management controller and do things like remote
  • 4:41 - 4:49
    keyboard and video. It also does security:
    Boot Guard, which is the signing of
  • 4:49 - 4:55
    firmware and verification of signatures.
    It implements a firmware TPM and there is
  • 4:55 - 5:03
    also an SDK to use it as a general-purpose
    secure enclave. So on the software side of
  • 5:03 - 5:13
    it, it runs a custom operating system,
    parts of which are taken from MINIX, the
  • 5:13 - 5:17
    teaching operating system by Andrew
    Tanenbaum. It's a microkernel operating
  • 5:17 - 5:33
    system. It runs binaries that are in a
    completely custom format. It's really
  • 5:33 - 5:36
    quite a high-level system, actually. If you
    look at it in terms of the operating
  • 5:36 - 5:41
    system it runs, it's mostly like Unix,
    which makes it kind of familiar, but it
  • 5:41 - 5:47
    also has large custom parts. Like I said
    before, in this talk I'm going to be
  • 5:47 - 5:53
    speaking about sixth and seventh
    generation Intel Core chipsets, so that's
  • 5:53 - 5:59
    Sunrise Point, Lewisburg, which is the
    server version of this, and also the laptop
  • 5:59 - 6:04
    systems on a chip, which are just called Intel
    Core low power. They also include the
  • 6:04 - 6:08
    chipset as a separate die. So it also
    applies to them. In fact, I've been
  • 6:08 - 6:12
    testing most of the stuff I'm going to
    tell you about on the laptop that's
  • 6:12 - 6:19
    sitting right here, which is a Lenovo
    T460. The version of the firmware I've been
  • 6:19 - 6:31
    looking at is 11.0.0.1205. Right. So I do
    need to put this up there. I'm not a part
  • 6:31 - 6:39
    of Intel, nor have I signed any contracts
    with them. I've found everything in ways
  • 6:39 - 6:44
    that you could also do. I didn't have any
    leaked NDA stuff or anything that you
  • 6:44 - 6:53
    couldn't get your hands on. It's also a
    very wide subject area, so there might be
  • 6:53 - 7:01
    some mistakes here or there, but generally
    it should be right. Well, if you want to
  • 7:01 - 7:04
    get started working on an ME firmware,
    want to reverse-engineer it or modify it
  • 7:04 - 7:09
    in some way, first you've got to deal with
    the image file. You've got your SPI flash.
  • 7:09 - 7:12
    It's where most of its firmware lives, in
    the same flash chip as your BIOS. So
  • 7:12 - 7:17
    you've got that image. And then how do you
    get the code out? Well, there's tools for
  • 7:17 - 7:23
    that. It's already been extensively
    documented by other people.
  • 7:23 - 7:29
    And you can basically just download a tool
    and run it against it. Which makes this
  • 7:29 - 7:32
    really easy. This is also the reason why
    there hasn't been a lot of research done
  • 7:32 - 7:36
    yet: before these tools were around, you
    couldn't get to all of the code. The
  • 7:36 - 7:41
    kernel was compressed using Huffman
    tables, which were stored in ROM. You
  • 7:41 - 7:45
    couldn't get to the ROM without getting
    code execution on the thing. So there was
  • 7:45 - 7:53
    basically no way of getting access to the
    kernel code. And I think also the syslib
  • 7:53 - 7:56
    library. But that's not a problem anymore.
    You can just download a tool and unpack
  • 7:56 - 8:03
    it. Also, the Intel tool to generate
    firmware images, which you can find in
  • 8:03 - 8:12
    some open directories on the internet, has
    Qt resources, XML-files which basically have the
  • 8:12 - 8:18
    description for all of the file formats
    used by these ME versions, including names
  • 8:18 - 8:26
    and comments to go with those structured
    definitions. So that's really useful. So
  • 8:26 - 8:30
    we look at one of these images. It has a
    couple of partitions, some of them overlap
  • 8:30 - 8:38
    and some of them are storage, some are
    code. So there are the main partitions,
  • 8:38 - 8:46
    FTPR and NFTP, which contain the programs
    it runs. There's MFS, which is the read-write
  • 8:46 - 8:52
    file system it uses for persistent
    storage. And then there is a log-to-flash
  • 8:52 - 8:57
    option, and the possibility to embed a token
    that will tell the system to unlock all
  • 8:57 - 9:03
    debug access, which has to be signed by
    Intel, so it's not really of any use to us.
  • 9:03 - 9:07
    And then there is something interesting,
    ROM bypass. Like I said, you can't get
  • 9:07 - 9:13
    access to the ROM without running code on
    it. And ROM is mask ROM. So it's internal
  • 9:13 - 9:18
    to the chip, but Intel has to develop new
    ROM code and has to test it without
  • 9:18 - 9:23
    respinning the die every time. So they
    have a possibility on an unlocked
  • 9:23 - 9:28
    preproduction chipset to completely bypass
    the internal ROM and load even the early
  • 9:28 - 9:34
    boot code from the flash chip. Some of
    these images have leaked and you can use
  • 9:34 - 9:39
    them to get a look at the ROM code, even
    without being able to dump it. That's
  • 9:39 - 9:46
    going to be really useful later on. So
    then you've got these code partitions and
  • 9:46 - 9:51
    they contain a whole lot of files. So
    there are the binaries themselves, which
  • 9:51 - 9:58
    don't have any extension. There are the
    metadata files. So the binary format they
  • 9:58 - 10:05
    use has no headers, nothing included. And
    all of that data is in the metadata file.
  • 10:05 - 10:12
    And when you use the unME11 tool, it'll
    actually convert those to text
  • 10:12 - 10:16
    files for you so you can just get started
    without really understanding how they
  • 10:16 - 10:27
    work. Yes. So the metadata. It's a type-
    length-value structure, which contains a
  • 10:27 - 10:31
    whole lot of information the operating
    system needs. It has the info on the
  • 10:31 - 10:36
    module, whether it's data or code, where
    it should be loaded, what the privileges
  • 10:36 - 10:43
    of the process should be, a SHA
    checksum for validating it and also some
  • 10:43 - 10:49
    higher level stuff such as device file
    definitions if it's a device driver or any
  • 10:49 - 10:55
    other kind of server. I've actually
    written some code that uses this, that's
  • 10:55 - 11:01
    on GitHub, so if you want a closer look at
    it, some of the slides have a link to
  • 11:01 - 11:10
    get a file in there which contains the
    full definitions. Right. So all the code
  • 11:10 - 11:17
    on the ME is signed and verified by Intel.
    So you can't just go and put in a new
  • 11:17 - 11:25
    binary and say, hey, let's run this. The
    way they do this is in Intel's
  • 11:25 - 11:30
    manufacture-time fuses, they have a hash
    of the public key that they use to sign
  • 11:30 - 11:36
    it. And then on each flash partition,
    there is a manifest which is signed by the
  • 11:36 - 11:41
    key and it contains the SHA hashes for all
    the metadata files, which then contain a
  • 11:41 - 11:47
    SHA hash for the code files. There don't
    seem to be any major problems in verifying
  • 11:47 - 11:53
    this, so it's useful to know, but
    you're not really gonna use this. And then
  • 11:53 - 12:00
    the modules themselves, as I've said,
    they're flat binaries. Mostly. The
  • 12:00 - 12:06
    metadata contains all the info the kernel
    uses to reconstruct the actual program
  • 12:06 - 12:14
    image in memory. And a curious thing here
    is that the actual base address for all
  • 12:14 - 12:17
    the modules, for all programs, is the same
    across an image. So if you have a
  • 12:17 - 12:20
    different version, it's going to be
    different. But if you have two programs
  • 12:20 - 12:26
    from the same firmware it's gonna be
    loaded at the same virtual address. Right.
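    The verification chain described a moment ago (fuses pin the key, the
    manifest pins the metadata, the metadata pins the code) can be sketched
    roughly as below. This is only an illustration: SHA-256, the function
    names and the toy layout "code hash stored in the first 32 bytes of the
    metadata" are assumptions, and the RSA signature check is left out.

```python
import hashlib

HASH_LEN = 32  # SHA-256 digest size (assumption: the talk only says "SHA")


def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def verify_partition(fuse_key_hash, public_key, manifest_hashes, metadata, code):
    """Walk fuse -> key -> metadata -> code, stopping at the first mismatch.

    manifest_hashes: {module name: expected hash of its metadata file}
    metadata:        {module name: raw metadata bytes}
    code:            {module name: raw code bytes}
    """
    # Step 1: the key that signed the manifest must match the hash in the fuses.
    if sha(public_key) != fuse_key_hash:
        return False
    # Step 2 (omitted in this sketch): verify the manifest signature.
    for name, expected in manifest_hashes.items():
        meta = metadata.get(name)
        # Step 3: the manifest pins every metadata file.
        if meta is None or sha(meta) != expected:
            return False
        # Step 4: the metadata pins the code (hypothetical layout: hash first).
        blob = code.get(name)
        if blob is None or sha(blob) != meta[:HASH_LEN]:
            return False
    return True
```

    Changing a single byte of a code file makes step 4 fail, which is why
    you cannot simply drop in a new binary.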
  • 12:26 - 12:33
    So when you want to look at it, you're
    gonna load it in some disassembler, like
  • 12:33 - 12:40
    for example IDA, and you'll see this, it
    disassembles fine, but it's gonna
  • 12:40 - 12:44
    reference all kinds of memory that you
    don't have access to. So usually you'd
  • 12:44 - 12:49
    think maybe I've loaded it at a wrong address
    or am I missing some library? Well,
  • 12:49 - 12:55
    here you've loaded it correctly if you use
    the address from the metadata file.
  • 12:55 - 13:02
    But you are in fact missing a lot of
    memory segments. And let's just take a
  • 13:02 - 13:10
    look at each of these. It's calling in
    there, which is code. It's pushing a pointer
  • 13:10 - 13:16
    there, which is data. And what's that? So
    it has shared libraries, even though it's
  • 13:16 - 13:20
    flat binaries. It actually does use shared
    libraries because you only have 1.5
  • 13:20 - 13:24
    megabytes of RAM. You don't want to
    link your C library into everything and
  • 13:24 - 13:33
    waste what little memory you have. So
    there is the main system library which is
  • 13:33 - 13:39
    like libc on a Linux system. It's in a
    flash partition, so you can actually just
  • 13:39 - 13:46
    load it and take a look at it easily and
    it starts out with a jump table. So
  • 13:46 - 13:49
    there's no symbols in the metadata file or
    anything. It doesn't do dynamic linking.
  • 13:49 - 13:57
    It loads the pages for the shared library
    at a fixed address, which is also in the
  • 13:57 - 14:02
    shared library's metadata. And then it's
    just there in the processor's memory and
  • 14:02 - 14:06
    it's gonna jump there if it needs a
    function. And the functions themselves are
  • 14:06 - 14:13
    just using the normal System V x86
    calling convention. So it's pretty easy
  • 14:13 - 14:18
    to look at that using your normal tools.
    There's no weird register argument passing
  • 14:18 - 14:25
    going on here. So, right. Now, shared
    libraries. There's two of them. And this
  • 14:25 - 14:28
    is where it gets annoying. The system
    library, you've got access to that so you
  • 14:28 - 14:33
    can just take your time and go through it
    and try to figure out, you know, oh, hey,
  • 14:33 - 14:40
    is this open or is this read or what's
    this function doing? But then there's also
  • 14:40 - 14:49
    a second, really large library, which
    is in ROM. All the C library
  • 14:49 - 14:54
    functions and some of their custom helper
    routines that don't interact with the
  • 14:54 - 15:01
    kernel directly, such as string
    functions, live in ROM. So when
  • 15:01 - 15:05
    you've got your code and this is basically
    where I was when I was here last year,
  • 15:05 - 15:07
    you're looking through it and you're
    seeing calls to a function you don't have
  • 15:07 - 15:11
    the code for all over the place. And you
    have to figure out by its signature what
  • 15:11 - 15:15
    is it doing. And that works for some of
    the functions and it's really difficult
  • 15:15 - 15:21
    for other ones. That really had me stopped
    for a while. Then I managed to find one of
  • 15:21 - 15:25
    these ROM bypass images and I had the code
    for a very early development build of the
  • 15:25 - 15:29
    ROM. This is where I got lucky. So the
    actual entry point addresses are fixed
  • 15:29 - 15:34
    across an entire chipset family. So if you
    have an image for the server version of
  • 15:34 - 15:39
    like 100 series chipset or for client
    version or for a desktop or laptop
  • 15:39 - 15:48
    version, it's all gonna be the same ROM
    addresses. So even though the code might
  • 15:48 - 15:52
    be different, you'll have the jump table,
    which means the addresses can stay fixed.
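    A minimal sketch of why a jump table keeps entry points stable: callers
    only ever target a slot, so a call address can be turned back into a
    function name with simple arithmetic. The base address, the slot order
    and the 5-byte slot size (one `jmp rel32` per slot) are assumptions for
    illustration, not values from the real ROM.

```python
ROM_LIB_BASE = 0x1000                      # hypothetical base of the jump table
SLOT_SIZE = 5                              # assumed: one 5-byte `jmp rel32` per slot
ROM_NAMES = ["memcpy", "memset", "strlen"]  # hypothetical slot order


def entry_address(index: int) -> int:
    """Address a caller uses for slot `index`, whatever the code behind it is."""
    return ROM_LIB_BASE + index * SLOT_SIZE


def resolve_call(target: int):
    """Map a call target seen in a disassembly back to a function name."""
    offset = target - ROM_LIB_BASE
    if offset < 0 or offset % SLOT_SIZE != 0:
        return None  # not the start of a jump-table slot
    index = offset // SLOT_SIZE
    return ROM_NAMES[index] if index < len(ROM_NAMES) else None
```

    Once such a table has been recovered from one image, the same lookup
    can be reused to label calls in any other module of that chipset family.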
  • 15:52 - 15:57
    So this only needs to be done once. And in
    fact when I upload my slides later, there
  • 15:57 - 16:03
    is a slide in there at the end that has
    the addresses for the most used functions.
  • 16:03 - 16:07
    So you're not going to have to repeat that
    work, at least not for this chipset. So if
  • 16:07 - 16:15
    you want to look at a simple module,
    you've loaded it, now you've applied the
  • 16:15 - 16:22
    things I just said, and you still don't
    have the data sections. I don't know
  • 16:22 - 16:27
    what that function there is doing, but
    it's not very important. It actually
  • 16:27 - 16:33
    returns a value, I think, that's not used
    anywhere, but it must have a purpose
  • 16:33 - 16:40
    because it's there. Right. So then you
    look at the entry point and this is a lot
  • 16:40 - 16:45
    of stuff. And the main thing that matters
    here is on the right half of the screen,
  • 16:45 - 16:50
    there is a listing from a MINIX repository
    and on the left half there is a
  • 16:50 - 16:55
    disassembly from an ME module. So it's
    mostly the same. There is one key
  • 16:55 - 16:58
    difference, though. The ME module actually
    has a little bit of code that runs before
  • 16:58 - 17:06
    its C library startup function. And that
    function actually does all the ME specific
  • 17:06 - 17:14
    initialization, does a lot of stuff
    related to how C library data is kept
  • 17:14 - 17:22
    because there are also no data segments for
    the C library being allocated by the
  • 17:22 - 17:26
    kernel. So each process actually reserves
    a part of its own memory and tells the C
  • 17:26 - 17:31
    library, like, any global variables you
    can store in there. But when you look at
  • 17:31 - 17:38
    that function, one of the most important
    things that it calls is this function.
  • 17:38 - 17:42
    It's very simple, it just copies a bunch
    of RAM. So they don't have support for
  • 17:42 - 17:47
    initialized data sections. It's a flat
    binary. What they do is they actually
  • 17:47 - 17:52
    use the .bss segment, the zeroed segment
    at the end of the address space, and copy
  • 17:52 - 17:57
    over a bunch of data in the program. The
    program itself is not aware of this. It's
  • 17:57 - 18:04
    really in the initialization code and in
    the linker script. So this is also something
  • 18:04 - 18:09
    that's very important, because
    at that address in the
  • 18:09 - 18:13
    data section you're also going to need to load
    the last bit of the binary.
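    When preparing a module for a disassembler, the copy step just described
    can be mimicked like this. It is a sketch under assumptions: the
    parameter names are made up, and the initial values are assumed to land
    at the very start of the zeroed data segment.

```python
def load_flat_module(binary: bytes, code_base: int, data_base: int,
                     data_size: int, init_size: int) -> dict:
    """Return {start address: bytes} for the code and the data segment.

    The last `init_size` bytes of the flat binary hold the initial values
    that the startup code copies into the otherwise zeroed data segment.
    """
    if init_size > len(binary) or init_size > data_size:
        raise ValueError("initial data does not fit")
    data = bytearray(data_size)                 # zeroed, like .bss
    init = binary[len(binary) - init_size:]     # tail of the flat binary
    data[:init_size] = init                     # what the startup copy does
    return {code_base: bytes(binary), data_base: bytes(data)}
```

    Loading both returned segments at their addresses gives the disassembler
    the constants and initial values that would otherwise appear as zeros.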
  • 18:13 - 18:21
    Otherwise you're missing constants or at
    least initialization values. Right. Then
  • 18:21 - 18:26
    there is the full memory map of the
    processes themselves. It's a flat 32 bit
  • 18:26 - 18:32
    address space. It's got everything you
    expect in there. It's got a stack and a
  • 18:32 - 18:40
    heap and everything. There's a little bit
    of heap allocated right on initialization.
  • 18:40 - 18:45
    This is basically how you derive
    the address space layout from the
  • 18:45 - 18:51
    metadata. Especially for the data
    segment and the stack itself,
  • 18:51 - 18:56
    the address location varies a lot
    because of the number of threads that are
  • 18:56 - 19:03
    in use or the size of data sections. And
    also those stack guards, they're not
  • 19:03 - 19:08
    really stack guards. There is also
    metadata for each thread in there. But
  • 19:08 - 19:14
    that's nothing that's relevant to the
    process itself, only to the kernel. And
  • 19:14 - 19:22
    well, if you then skip forward a bit and
    you've done all these - you look at your
  • 19:22 - 19:29
    simple driver like this. This is taken
    from a driver used to talk to the CPU,
  • 19:29 - 19:35
    like, OK. So when I say CPU or host, by
    the way, I mean the CPU, like your big
  • 19:35 - 19:39
    SkyLake, or KabyLake, or CoffeeLake,
    whatever your big CPU that runs your own
  • 19:39 - 19:46
    operating system. Right. So this is used
    to send messages there. But if you look
  • 19:46 - 19:52
    at what's going on here, OK - think I had
    a problem with the animation here - it
  • 19:52 - 19:57
    sets up some stuff and then it calls a
    library function that's in the main syslib
  • 19:57 - 20:01
    library, which actually has a main loop
    for the program. That's because Intel was
  • 20:01 - 20:06
    smart and they added a nice framework for
    device driver implementing programs,
  • 20:06 - 20:10
    because it's a microkernel, so device
    drivers are just usual programs, calling
  • 20:10 - 20:20
    specific APIs. Then there's normal POSIX
    file I/O. No standard I/O, but it has all
  • 20:20 - 20:27
    the normal open, and read, and ioctl and
    everything functions. And then there's
  • 20:27 - 20:30
    more initialization for the srv library.
    And this is basically what all the simple
  • 20:30 - 20:39
    drivers look like in it. And then there's
    this. Because they're so low on memory,
  • 20:39 - 20:50
    they don't actually use standard I/O, or
    even printf itself to do most of the
  • 20:50 - 20:55
    debugging. It uses a thing that's called
    "sven", I'll touch on that later. So there
  • 20:55 - 20:59
    is the familiar APIs that I talked about.
    It even has POSIX threads, or at least a
  • 20:59 - 21:05
    subset of them, and there are all the
    functions that you'd expect to find on
  • 21:05 - 21:09
    some generic Unix machine. So that
    shouldn't be too much of a problem to deal
  • 21:09 - 21:15
    with, but then there's also their own
    tracing solution, sven. That's what Intel
  • 21:15 - 21:17
    calls it. The name is in all the development
    tools that you can download
  • 21:17 - 21:23
    from their site, and basically, they don't
    include format strings for a lot of the
  • 21:23 - 21:28
    stuff. They just have a 32-bit identifier
    that is sent over the debug port, and it
  • 21:28 - 21:34
    refers to a format string in a dictionary
    that you don't have. There is one of the
  • 21:34 - 21:39
    dictionaries for a server chip that's
    floating around the internet, but even
  • 21:39 - 21:46
    that is incomplete. And the normal non-NDA
    version of the Intel developer tools has
  • 21:46 - 21:54
    some 50 format strings for really common
    status messages it might output, but yeah,
  • 21:54 - 21:57
    like, if you see these functions, just
    realize it's doing some debug print. It
  • 21:57 - 22:01
    might be dumping some state or just
    telling you it's gonna do something else.
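    The idea behind this kind of tracing can be sketched as a dictionary
    lookup on the receiving side. The identifiers and format strings below
    are invented for illustration; the real dictionaries are not public.

```python
# Hypothetical dictionary: 32-bit message ID -> format string.
SVEN_DICTIONARY = {
    0x00010001: "driver started, version %d",
    0x00010002: "state = %08x",
}


def render_sven(message_id: int, args=()) -> str:
    """Turn a traced (ID, integer arguments) pair back into readable text."""
    fmt = SVEN_DICTIONARY.get(message_id)
    if fmt is None:
        # Without the dictionary only the raw ID and arguments are visible.
        shown = ", ".join("%#x" % a for a in args)
        return "sven<%08x>(%s)" % (message_id, shown)
    return fmt % tuple(args)
```

    The firmware only ever emits the ID and the arguments, which saves
    memory but leaves a reverse engineer with the fallback form.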
  • 22:01 - 22:12
    No important logic actually happens
    in here. Right. So then for device files.
  • 22:12 - 22:16
    They're actually defined in a manifest.
    When the kernel loads a program, and that
  • 22:16 - 22:21
    program wants to expose some kind of
    interface to other programs, its manifest
  • 22:21 - 22:28
    will contain, or rather its metadata file will
    contain a special file producer entry, and
  • 22:28 - 22:33
    that says, you know, you have these device
    files, with a name, and an access mode and
  • 22:33 - 22:39
    the user, and group ID, and everything,
    and the minor numbers, and the kernel
  • 22:39 - 22:43
    sends this to the- or not kernel- the
    program loader sends this to the virtual
  • 22:43 - 22:48
    file system server and it automatically
    gets a device file, pointing to the right
  • 22:48 - 22:52
    major and minor number. And then there's
    also a library, as I said, to provide a
  • 22:52 - 23:04
    framework for a driver. And that looks
    like this. It's really easy to use. If you
  • 23:04 - 23:08
    were an ME developer you just write some
    callbacks for open, and close, and
  • 23:08 - 23:11
    everything, and it automatically calls
    them for you, when a message comes in,
  • 23:11 - 23:15
    telling you that that happened, which also
    makes it really easy to reverse engineer,
  • 23:15 - 23:21
    'cause if you look at a driver, it just
    loads some callbacks, and you can know, by
  • 23:21 - 23:28
    their offset in a structure, what actual
    call they're implementing. Right, so then
  • 23:28 - 23:32
    there is one of the more weird things
    that's going on here: How the actual
  • 23:32 - 23:37
    userland programs get access to memory-mapped
    registers. There's a lot of this going on.
  • 23:37 - 23:43
    Calls to a couple of functions that have
    some magic arguments. The second one you
  • 23:43 - 23:51
    can easily tell is the offset, because it
    has- it increases in very nice power-of-
  • 23:51 - 23:55
    two steps, so it's probably the register
    offsets, and then what comes after it
  • 23:55 - 24:00
    looks like a value. And then the first bit
    seems to be a magic number. Well, it's
  • 24:00 - 24:05
    not. There is also an extension in the
    metadata, saying these are the memory
  • 24:05 - 24:12
    mapped I/O ranges, and those ranges,
    they each list a physical base address,
  • 24:12 - 24:19
    and a size, and permissions for them. Then
    the index in that list does not directly
  • 24:19 - 24:23
    correspond to the magic value. For the magic
    value you actually need to do a little
  • 24:23 - 24:28
    computation on the offset, and then you can
    access it through those functions. The
  • 24:28 - 24:39
    computation itself might be familiar.
    Yeah, so these are the functions. The
  • 24:39 - 24:45
    value is a segment selector. So they
    actually don't use paging for
  • 24:45 - 24:52
    inter-process isolation, they use segments, like
    x86 Protected Mode segments. And for each
  • 24:52 - 24:57
    memory-mapped I/O range there is a
    separate segment, and you manually specify
  • 24:57 - 25:04
    that, which is just weird to me, like, why
    would you use x86 segmenting on a modern
  • 25:04 - 25:11
    system? MINIX does it, but, yeah, to
    extend that even to this? Luckily, the normal
  • 25:11 - 25:16
    address space is flat, like, to the
    process, not to the kernel. Right, so now
  • 25:16 - 25:25
    we can access memory mapped I/O. That's
    all the, like the really high level stuff.
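    For reference, this is how an x86 protected-mode segment selector is
    laid out: descriptor index in bits 15..3, table indicator in bit 2,
    requested privilege level in bits 1..0. The helper `mmio_selector` and
    its `first_descriptor` parameter are hypothetical; which descriptor the
    ME assigns to which MMIO range is not stated here.

```python
def make_selector(index: int, ldt: bool = False, rpl: int = 3) -> int:
    """Build a selector: index in bits 15..3, TI in bit 2, RPL in bits 1..0."""
    return (index << 3) | (int(ldt) << 2) | (rpl & 3)


def decode_selector(selector: int) -> dict:
    """Split a selector back into its three fields."""
    return {"index": selector >> 3, "ldt": bool(selector & 4), "rpl": selector & 3}


def mmio_selector(range_index: int, first_descriptor: int, ldt: bool = True) -> int:
    """Hypothetical mapping: MMIO range i uses descriptor first_descriptor + i."""
    return make_selector(first_descriptor + range_index, ldt=ldt, rpl=3)
```

    Decoding the "magic" first argument of a register access with
    `decode_selector` shows which descriptor, and therefore which MMIO
    range, the call is aimed at.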
  • 25:25 - 25:29
    So what's going on under there? It's got
    all the basic microkernel stuff, so
  • 25:29 - 25:33
    message passing, and then some
    optimizations to actually make it perform
  • 25:33 - 25:40
    well on a really slow CPU. The basics are,
    you can send a message, you can receive a
  • 25:40 - 25:46
    message, and you can send and receive a
    message, where you basically say "Send a
  • 25:46 - 25:51
    message, wait till a response comes in,
    then continue", which is used to wrap
  • 25:51 - 25:58
    function calls. This is mostly the same as
    in Minix. There's some subtle changes,
  • 25:58 - 26:08
    which I'll get to later. And then memory
    grants are something that only appeared in
  • 26:08 - 26:13
    Minix really recently. It's a way for a
    process to basically create a new name for
  • 26:13 - 26:17
    a piece of memory it has, and give a
    different process access to it, just by
  • 26:17 - 26:22
    sharing the number. These are referred to
    by the process ID and a number of that
  • 26:22 - 26:28
    range. So the numbers are actually
    local per process, so to uniquely identify
  • 26:28 - 26:35
    one you need to say process ID plus that
    number, and they're only granted to a
  • 26:35 - 26:38
    single process. So when a process creates
    one of these, it can't even access it
  • 26:38 - 26:42
    itself, unless it creates a grant for
    itself, which is not really that useful,
  • 26:42 - 26:52
    usually. These grants are used to prevent
    having to copy over all the data inside
  • 26:52 - 26:58
    the IPC message used to implement a system
    call. Yeah, these are the basic operations
  • 26:58 - 27:03
    on it. You can create one, you can copy
    into and from it. So, you can't actually
  • 27:03 - 27:07
    map it. A process that receives one of
    these has to say to the kernel, using a
  • 27:07 - 27:13
    system call, "please write this data into
    that area of memory that belongs to a
  • 27:13 - 27:18
    different process." And then there's also
    indirect grants, because, you know, in
  • 27:18 - 27:25
    Minix they do have this, but also only
    recently, and usually if you have a
  • 27:25 - 27:30
    microkernel system, you would have to copy
    your buffer for a read call first to the
  • 27:30 - 27:37
    file system server and then back to, like,
    either the hard disk driver, or the device
  • 27:37 - 27:41
    driver that's implementing a device file.
    So the ME actually allows you to create a
  • 27:41 - 27:46
    grant, pointing to a grant, that was given
    to you by someone else. And then that
  • 27:46 - 27:53
    grant will inherit the privileges of the
    process that creates it, combined with
  • 27:53 - 27:58
    those that it assigns to it. So if the
    process has a read/write grant it can
  • 27:58 - 28:01
    create a read-only or write-only grant,
    but it cannot, if it only has a read
  • 28:01 - 28:09
    grant, it cannot add write rights to it
    for a different process, obviously. So
  • 28:09 - 28:13
    then there are also some big differences
    from MINIX. In MINIX you address a process
  • 28:13 - 28:18
    by its process ID or thread ID with a
    generation number attached to it. In the
  • 28:18 - 28:25
    ME you can actually address IPC to a file
    descriptor. The kernel doesn't actually know a
  • 28:25 - 28:29
    lot about file descriptors, it just
    implements the basic thing where you have
  • 28:29 - 28:32
    a list of files and each process has a
    list of file descriptors assigning integer
  • 28:32 - 28:39
    numbers to those files to refer to them
    by. And this is used so that, as a
  • 28:39 - 28:43
    process, you can actually directly talk to
    a device driver without knowing what its
  • 28:43 - 28:47
    process ID is. So you don't send it to the
    file system server, you send it to the
  • 28:47 - 28:52
    file descriptor and the kernel just
    magically corrects it for you. And they
  • 28:52 - 28:56
    moved select into the kernel so you can
    tell the kernel: "Hey, I want to wait till
  • 28:56 - 29:00
    the file system server tells me that it
    has data available or till a message comes
  • 29:00 - 29:05
    in." This is one of the most complicated
    system calls the ME offers that's used in
  • 29:05 - 29:12
    a normal program. You can mostly ignore it
    and just look like: "Hey, those arguments
  • 29:12 - 29:17
    sort of define a file descriptor set as a
    bit field." And then there's the message
  • 29:17 - 29:21
    that might have been received and there's
    DMA locks because you don't just want to
  • 29:21 - 29:25
    write to registers. You actually might
    want to do the direct memory access from
  • 29:25 - 29:31
    hardware, so you can actually tell the
    kernel to lock one of these memory grants
  • 29:31 - 29:38
    in RAM for you, so it won't be swapped out
    anymore. And yeah, it will even tell you
  • 29:38 - 29:42
    the physical address so you can just load
    that into a register and it's not really
  • 29:42 - 29:47
    that complicated. Just lock it, get a
    physical address, write it into the register
  • 29:47 - 29:54
    and continue. Well, that's the most
    important stuff about the operating
  • 29:54 - 29:59
    system. The hardware itself is a lot more
    complicated because the operating system,
  • 29:59 - 30:03
    once you have the code, you can just
    reverse engineer it and get to know it.
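A minimal Python sketch of the "file descriptor set as a bit field" idea from the select-style call described a moment ago. The helper names and the example descriptor numbers are assumptions made for illustration; the actual argument layout of the ME system call is not reproduced here.

```python
def fds_to_mask(fds):
    """Pack small non-negative descriptor numbers into one integer bit field."""
    mask = 0
    for fd in fds:
        if fd < 0:
            raise ValueError("file descriptors are non-negative")
        mask |= 1 << fd  # bit N set means "descriptor N is in the set"
    return mask


def mask_to_fds(mask):
    """Unpack a bit field back into a sorted list of descriptor numbers."""
    fds = []
    fd = 0
    while mask:
        if mask & 1:
            fds.append(fd)
        mask >>= 1
        fd += 1
    return fds


# Hypothetical usage: wait on descriptors 0, 3 and 5, then pretend the
# kernel reported that only 3 and 5 became ready.
wanted = fds_to_mask([0, 3, 5])
ready = wanted & 0b101000
print(mask_to_fds(ready))
```

Reading a decompiled call this way is usually enough: each set bit in the argument names one descriptor the process is waiting on.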
  • 30:03 - 30:11
    The hardware. Well, let's just say it's a
    real pain to have to reverse engineer a
  • 30:11 - 30:16
    piece of hardware together with its
    driver. Like if you've got the driver
  • 30:16 - 30:18
    code, but you don't know what the
    registers do. So you don't know what a lot
  • 30:18 - 30:24
    of logic does. And you're trying to both
    figure out what the logic is and what the
  • 30:24 - 30:30
    actual registers do. Right. So first you
    want to know which physical address goes
  • 30:30 - 30:40
    where? The metadata listings I showed you
    actually have names in there. Those are
  • 30:40 - 30:48
    not in the metadata files themselves, I
    annotated those. So you just see the
  • 30:48 - 30:57
    physical address and size. But there is
    one module, the bus driver module and the
  • 30:57 - 31:04
    bus driver is a normal user process, but it
    implements stuff like PCI configuration
  • 31:04 - 31:10
    space accesses and those things. And it
    has a nice table in it with names for
  • 31:10 - 31:17
    devices. So if you just run strings on it,
    you'll see these things. When I saw this,
  • 31:17 - 31:21
    I was pretty glad because at least I
    could make sense of what device was being
  • 31:21 - 31:27
    talked to in a certain program. So
    the bus driver does all these things. It
  • 31:27 - 31:31
    manages power getting to devices, it
    manages configuration space access, it
  • 31:31 - 31:36
    manages the different kinds of buses and
    IOMMUs that are on the system. And it makes
  • 31:36 - 31:40
    sure that the normal driver never has to
    know any of these details. It just asks
  • 31:40 - 31:46
    it for a device by a number assigned to it
    at build time. And then the bus driver
  • 31:46 - 31:50
    says, OK, here's a range of physical
    address space you can now write to. So
  • 31:50 - 31:57
    that's a really nice abstraction and also
    gives us a lot of information because the
  • 31:57 - 32:02
    really old builds for Sunrise Point
    actually have a hell of a lot of debug
  • 32:02 - 32:07
    strings in there as printf format strings,
    not as catalogue ID. It's
  • 32:07 - 32:12
    one of the only pieces of code for the ME
    that does this, so that already tells you
  • 32:12 - 32:15
    a lot. And then there's also the table
    that I just talked about that has the
  • 32:15 - 32:24
    actual info on the devices and names. So I
    generated some DokuWiki content from this
  • 32:24 - 32:29
    that I use myself and this is what's in
    the table, part of it. So it tells you
  • 32:29 - 32:33
    what address PCI configuration space lives
    at. That tells you the bus, device and
  • 32:29 - 32:33
    function for it. It tells you
    on what chipset SKU they're present using
  • 32:38 - 32:45
    a bitfield. And it tells you their names
    in different fields. It also contains the
  • 32:45 - 32:49
    values that are used to write the base
    address registers for PCI. So also their
  • 32:49 - 32:54
    normal memory ranges. And there's even
    more devices. So the ME has access to a
  • 32:54 - 32:59
    lot of stuff. A lot of it is private to
    it. A lot of it is components that also
  • 32:59 - 33:06
    exist in the rest of the computer. And
    there's not a lot of information. A lot of
  • 33:06 - 33:11
    these are basically all the things that
    are out there together with conference
  • 33:11 - 33:15
    slides published by other people who have
    done research on the ME. I didn't have
  • 33:15 - 33:22
    time to add links to those, but they're
    easy to find on Google. I'll get later to
  • 33:22 - 33:28
    this, I actually wrote an emulator for the
    ME, a partial emulator to be able to run
  • 33:28 - 33:34
    ME code and analyze it, which obviously
    needs to know a bit about the hardware so
  • 33:34 - 33:41
    you can look at that. There are some
    files in Intel's debugger package,
  • 33:41 - 33:46
    specific versions of that that have really
    detailed info on some of the devices, also
  • 33:46 - 33:51
    not all of it. And I wrote a tool to
    parse some of the files. It's really rough
  • 33:51 - 33:57
    code. I published it because people wanted
    to see what I was doing. It doesn't work
  • 33:57 - 34:04
    out of the box. And there is a nice talk
    on this by Mark Ermolov and Maxim
  • 34:04 - 34:07
    Goryachy. Actually I don't know if I'm
    pronouncing that correctly, but they've
  • 34:07 - 34:12
    done a lot of work on the ME and this
    particular talk by them is really useful.
  • 34:12 - 34:16
    And then there's also something else.
    There is a second ME on server chipsets,
  • 34:16 - 34:21
    the innovation engine. It's basically a
    copy paste of the ME to provide an ME that
  • 34:21 - 34:25
    the vendor can write code for. Don't think
    it's used a lot. I've only been able to
  • 34:25 - 34:32
    find HP software that actually targets it
    and that has some more debug strings, but
  • 34:32 - 34:37
    also not a lot, it mostly has a table
    containing register names, but they're
  • 34:37 - 34:42
    really abbreviated and for a really small
    subset of the devices, there is
  • 34:42 - 34:48
    documentation out there in a Pentium N and
    J series datasheet. It seems like they
  • 34:48 - 34:52
    compile their a lot of code or whatever
    with the wrong defines because it doesn't
  • 34:52 - 35:00
    actually fit into the manual that well,
    it's just a section that has like some 20
  • 35:00 - 35:09
    tables that shouldn't be in there. So this
    is from that talk I just referenced and
  • 35:09 - 35:13
    it's an overview of the innovation engine
    and the bus bridges and everything in
  • 35:13 - 35:20
    there. This isn't very precise. So based
    on some of those files from System Studio,
  • 35:20 - 35:24
    I tried to get a better understanding of
    this, which is this. This is the entire
  • 35:24 - 35:30
    chipset. The little DMA block in the top
    left corner is what connects to your CPU.
  • 35:30 - 35:37
    And all of the big blocks with a lot of
    ports are bus bridges or switches for a
  • 35:37 - 35:45
    PCI Express-like fabric. So there's a lot
    going on. The highlighted area is the
  • 35:45 - 35:59
    management engine memory space and the
    rest of it is like the global chipset. The
  • 35:59 - 36:03
    things I've highlighted in green here are
    on the primary PCI bus. So there's this
  • 36:03 - 36:08
    weird thing going on where there seems to
    be two PCI hierarchies, at least
  • 36:08 - 36:14
    logically. So in reality it's not even
    PCI, but on Intel systems, there's a lot
  • 36:14 - 36:20
    of stuff that behaves as if it is PCI. So
    it has like bus, device and function
  • 36:20 - 36:29
    numbers, PCI configuration space registers
    and they have two different roots for the
  • 36:29 - 36:32
    configuration space. So even though the
    configuration space address includes a bus
  • 36:32 - 36:36
    number, they have two completely different
    things, each of which has its
  • 36:36 - 36:41
    own bus zero. So that's weird, also
    because they don't make sense when you
  • 36:41 - 36:46
    look at how the hardware is laid out. So
    this is stuff that's on the primary PCI
  • 36:46 - 36:51
    configuration space that's directly
    accessed by the ME, by the Northbridge on
  • 36:51 - 36:55
    the ME CPU. So that's the Minute IA
    system agent. System agent is what Intel
  • 36:55 - 37:01
    calls a Northbridge nowadays, now that
    it's not a separate chip anymore. It's
  • 37:01 - 37:08
    basically just a Northbridge and a crypto
    unit that's on there and the stuff that's
  • 37:08 - 37:13
    directly attached to the Northbridge, being the
    ROM and the RAM. So the processor itself
  • 37:13 - 37:17
    is, as I said, derived from a 486, but it
    does actually have some more modern
  • 37:17 - 37:22
    features: it does CPUID, at least on
    my systems. Some other researchers said
  • 37:22 - 37:29
    theirs didn't. It's basically the core
    that's in the Quark MCU, which is really
  • 37:29 - 37:33
    great because it's one of the only cores
    made by Intel that has public
  • 37:33 - 37:40
    documentation on how to do run control. So
    breakpoints and accessing registers and
  • 37:40 - 37:44
    everything over JTAG. Intel doesn't
    publish this stuff except for the Quark
  • 37:44 - 37:51
    MCU, because they were targeting makers.
    But they reused that in here, which is
  • 37:51 - 37:58
    really useful. It even has an official
    port to the OpenOCD debugger, which I have
  • 37:58 - 38:03
    not gotten to test because I don't have a
    JTAG probe, which is compatible with Intel
  • 38:03 - 38:11
    voltage levels and supported by OpenOCD.
    And it also has, like I said, CPUID and MSRs.
  • 38:11 - 38:21
    It has some really fancy features like
    branch tracing and some more strict paging
  • 38:21 - 38:30
    permission enforcement stuff. They don't
    use the interrupt pins on this. So it's an
  • 38:30 - 38:35
    IP block, but there are some files out
    there, that's where this screenshot
  • 38:35 - 38:41
    is from, that actually are used by a
    built in logic analyzer Intel has on the
  • 38:41 - 38:47
    chipset and you can select different
    signals on the chip to watch, which is
  • 38:47 - 38:51
    a really great source of information on
    how the IP blocks are laid out and what
  • 38:51 - 38:54
    signals are in there, because you
    basically get a tree view of the IP blocks
  • 38:54 - 39:01
    on the chip and some of their signals. They
    don't use the legacy interrupt system,
  • 39:01 - 39:08
    they only use message-based interrupts,
    where a device writes a value into a
  • 39:08 - 39:13
    register on the interrupt controller
    instead of asserting a pin. And then there
  • 39:13 - 39:22
    is the Northbridge. It's partially
    documented in that data sheet I mentioned,
  • 39:22 - 39:29
    it does support x86 IO address space, but
    it's never used. Everything in the ME is
  • 39:29 - 39:37
    in memory space or exposed as memory space
    through bridges. The Northbridge
  • 39:37 - 39:43
    implements access to the ROM and RAM, it has an
    IOMMU which is only used for transactions
  • 39:43 - 39:49
    coming from the rest of the system and
    it's always initialized to, at least in
  • 39:49 - 39:52
    the firmware I looked at, it's always
    initialized to the inverse of the page
  • 39:52 - 40:00
    table, so linear addresses can be used for
    memory maps, sorry, for DMA. It also does
  • 40:00 - 40:06
    PCI configuration space access to the
    primary PCI bus. And it has a firewall
  • 40:06 - 40:15
    that allows the operating system to deny
    any IP block in the chipset from sending a
  • 40:15 - 40:19
    completion on the bus request. So it can
    actually say: "Hey, I want to read some
  • 40:19 - 40:25
    register and only these devices are
    allowed to send me a value for it." So
  • 40:25 - 40:30
    they've actually thought about security
    here, which is great. Then there is one of
  • 40:30 - 40:38
    the most important blocks in the ME, which
    is the crypto engine. It does some sort of
  • 40:38 - 40:47
    more well-known crypto algorithms. AES,
    SHA hashes, RSA and it has a secure key
  • 40:47 - 40:56
    store, which I'm not gonna [audio dropped]
    ... all about it in their ME talk at
  • 40:56 - 41:04
    Blackhat. And a lot of these things have
    DMA engines, which all seem to be the
  • 41:04 - 41:10
    same. And there are no other DMA
    engines in the ME, so this is also used for
  • 41:10 - 41:23
    memory to memory copy or DMA into other
    devices. So that's used in a lot of
  • 41:23 - 41:27
    things. This is actually a diagram which I
    don't have the vector for anymore. So
  • 41:27 - 41:35
    that's why the LibreOffice background is
    in there. I'm sorry. So this is basically
  • 41:35 - 41:39
    what that crypto engine looks like when
    you look at that signal tree that I was
  • 41:39 - 41:45
    talking about earlier. The DMA engines are
    both able to do memory to memory copies
  • 41:45 - 41:53
    and to directly target the crypto unit
    they're part of. Basically, when you, I
  • 41:53 - 41:57
    don't know about the control bits that go
    with this, but when you set the target
  • 41:57 - 42:02
    address to zero and the right control
    bits, it will copy into the buffer that's
  • 42:02 - 42:12
    used for the encryption. So that is how it
    accelerates memory access for crypto. And
  • 42:12 - 42:16
    these are the actual register offsets.
    They're the same for all of the DMA
  • 42:16 - 42:22
    engines in there relative to the base
    address of the subunit they're in. And
  • 42:22 - 42:27
    then there's the second PCI bus or bus
    hierarchy, which is like in some places
  • 42:27 - 42:34
    called the PCI fixed bus. I'm actually not
    entirely sure whether this is actually
  • 42:34 - 42:39
    implemented as a PCI bus as I've drawn it
    here, but this is what it behaves like. So
  • 42:39 - 42:44
    it has all the ME private stuff, that's
    not a part of the normal chipset. So it's
  • 42:44 - 42:51
    timers for the ME, it has the
    implementation of the secure enclave
  • 42:51 - 42:58
    stuff, that's the firmware TPM registers.
    And it has the gen device which I've
  • 42:58 - 43:02
    mostly ignored because it's only used at
    boot time. It's only used by the actual
  • 43:02 - 43:11
    boot ROM for the ME mostly. It is what the
    ME uses to get the fuses Intel burns. So
  • 43:11 - 43:15
    that's the Intel public key, whether it's
    a production or pre-production part, but
  • 43:15 - 43:20
    it's pretty much a black box. It's not
    used that much, fortunately. There is the
  • 43:20 - 43:24
    IPC block which allows the ME to talk to
    the sensor hub, which is a different CPU
  • 43:24 - 43:28
    in the chipset. It allows it to talk to
    power management controller and all kinds
  • 43:28 - 43:34
    of other embedded CPUs. So it's
    inter-processor communication, not inter-process.
  • 43:34 - 43:39
    Confused me for a bit. And here's the host
    embedded controller interface, which is
  • 43:39 - 43:44
    how the ME talks to the rest of the
    computer when it wants the computer to
  • 43:44 - 43:48
    know that it's talking so it can directly
    access a lot of stuff. But when it wants
  • 43:48 - 43:54
    to send a message to the EFI or to Windows
    or Linux, it'll use this. And it also has
  • 43:54 - 43:59
    status registers, which are really simple
    things where the ME writes in a value. And
  • 43:59 - 44:05
    even if the ME crashes, the host can still
    read the value, which is how you can see
  • 44:05 - 44:11
    whether the ME is running, whether it's
    disabled, whether it fully booted, or
  • 44:11 - 44:15
    whether it crashed halfway through. But at
    a point where it could still get the rest
  • 44:15 - 44:21
    of the computer running and there is some
    coreboot code to read it. I've also
  • 44:21 - 44:27
    implemented some decoding for it on the
    emulator because it's useful to see what
  • 44:27 - 44:33
    those values mean. So then there's
    something really interesting, the primary
  • 44:33 - 44:37
    address translation table, which is the
    bus bridge that allows the ME to actually
  • 44:37 - 44:44
    access the PCI Express fabric of the
    computer. For a lot of what I call ME
  • 44:44 - 44:50
    peripherals in this table, that are
    actually outside the ME domain in the
  • 44:50 - 45:00
    chipset, it uses this to access it. It
    also uses it to access the UMA, which is
  • 45:00 - 45:05
    an area of host RAM that's used as a swap
    device for the ME and the Trace Hub, which is
  • 45:05 - 45:11
    the debug port, but also has a couple of
    windows which allow the ME to access any
  • 45:11 - 45:19
    random area of host RAM, which is the most
    scary bit because UMA is specified by
  • 45:19 - 45:25
    host, but the host DRAM area is where you
    can just point it anywhere. You can read
  • 45:25 - 45:29
    or write any value that Windows or
    Linux or whatever you're running has
  • 45:29 - 45:37
    sitting there. So that's scary to me. So
    and then there's the rest of it, the rest
  • 45:37 - 45:46
    of the devices which are behind the
    primary ATT. And that's a lot of stuff,
  • 45:46 - 45:53
    that's debug, that's also all the normal
    peripherals that your PC has, but it
  • 45:53 - 45:56
    also includes things like the power
    management controller, which actually
  • 45:56 - 46:00
    turns on and off all the different parts
    of your computer. It controls clocks and
  • 46:00 - 46:08
    resets. So this is really important. There
    is a concept that you'll come across when
  • 46:08 - 46:14
    you're reading Intel manuals or ME related
    stuff, that's root spaces. Besides your
  • 46:14 - 46:20
    normal addressing information for a PCI
    device, it also has a root space number,
  • 46:20 - 46:25
    which is basically how you have a single
    PCI device exposing two completely
  • 46:25 - 46:31
    different address spaces. And it's 0 for
    the host, it's one for the ME. Some
  • 46:31 - 46:35
    devices expose the same information on
    there. Other ones behave completely
  • 46:35 - 46:43
    differently. That's something you don't
    usually see. And then there's the side
  • 46:43 - 46:49
    band fabric. So besides all this stuff
    I just covered, which is PCI-like at
  • 46:49 - 46:53
    least. There is also something completely
    different, side band fabric, which is a
  • 46:53 - 47:01
    completely packet switched network, where
    you don't use any memory mapping by
  • 47:01 - 47:06
    default. You just have a one byte address
    for a device and some other addressing
  • 47:06 - 47:10
    fields and you're just sending a message
    saying: "Hey, I want to read configuration
  • 47:10 - 47:14
    or data or memory." And there is actually
    a lot of information out there on this,
  • 47:14 - 47:18
    because it seems like Intel just copy
    pasted their internal specification into a
  • 47:18 - 47:27
    patent. This is how you address it. This
    is all the devices on there, which is quite a
  • 47:27 - 47:33
    lot. It's also what you, if any of you are
    kernel developers, and you've had to deal
  • 47:33 - 47:40
    with GPIO on Intel SoCs. There's this P2SB
    device that you have to use. That's what
  • 47:40 - 47:48
    the host uses to access this. Their
    documentation on it is really, really bad.
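A small Python sketch of how the host side typically reaches a sideband device through the P2SB window. It assumes the layout used by open source firmware for recent Intel chipsets, where the one-byte port ID sits above a 16-bit register offset inside the window. The function name is hypothetical, and the base address, port ID and offset in the usage line are example values, not taken from the talk.

```python
PORT_ID_SHIFT = 16  # assumption: 8-bit port ID placed above a 16-bit offset


def sideband_mmio_address(sbreg_base, port_id, offset):
    """Compute the memory-mapped address of a sideband register.

    sbreg_base: base of the P2SB memory window (platform specific)
    port_id:    one-byte sideband address of the target device
    offset:     register offset inside that device
    """
    if not 0 <= port_id <= 0xFF:
        raise ValueError("port ID must fit in one byte")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits")
    return sbreg_base + (port_id << PORT_ID_SHIFT) + offset


# Example values only: a window at 0xFD000000, port 0xAF, register 0x400.
print(hex(sideband_mmio_address(0xFD000000, 0xAF, 0x400)))
```

The point of the sketch is only that the memory-mapped view is a thin wrapper: the address bits are turned back into the port ID and offset fields of a sideband message.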
  • 47:48 - 47:52
    This was all done using static analysis.
    But then I wanted to figure out how some
  • 47:52 - 47:57
    of the logic actually works and it was
    really complicated to play around with the
  • 47:57 - 48:07
    ME. There was this nice talk by Ermolov
    and Goryachy, where they said: "You know,
  • 48:07 - 48:12
    we found an exploit that gives you code
    execution and you can get JTAG
  • 48:12 - 48:19
    access too." It sounds really nice. It's
    actually not that easy. So arbitrary code
  • 48:19 - 48:23
    execution in the BUP module, they actually
    describe their exploit and how you should
  • 48:23 - 48:30
    use it. But they didn't describe anything
    that's needed to actually implement that.
  • 48:30 - 48:36
    So if you want to do that, what you need
    to do is figure out where the stack lives,
  • 48:36 - 48:40
    you need to write a
    payload that will actually get you from a
  • 48:40 - 48:45
    buffer overflow on a stack that, by the
    way, uses stack cookies. So you can't just
  • 48:45 - 48:51
    overwrite the return address to turn that
    into an arbitrary write. And you need to
  • 48:51 - 48:56
    find out what the return pointer address
    is so you can overwrite it and find ROP
  • 48:56 - 49:03
    gadgets because the stack is not
    executable. And then when you've done
  • 49:03 - 49:10
    that, you can just turn on debug access or
    change to custom firmware or whatever. So
  • 49:10 - 49:14
    what I did is I had a bit of trouble
    getting that running and in order to test
  • 49:14 - 49:18
    your payload, you have to flash it into
    the system and it takes a while and then
  • 49:18 - 49:21
    the system just doesn't power on if the
    ME's not working, if you're crashing it
  • 49:21 - 49:25
    instead of getting code execution. So it's
    not really viable to develop it that
  • 49:25 - 49:33
    way, I think. Some people did. I respect
    that because it's really, really hard. And
  • 49:33 - 49:39
    then I wrote this ME Loader, it's called
    Loader because at first I started out like
  • 49:39 - 49:43
    writing it as a sort of Wine thing where
    you would just mmap the right
  • 49:43 - 49:47
    ranges at the right place and jump into
    it, execute it, patch some system calls.
  • 49:47 - 49:52
    But because the ME is a micro kernel
    system and almost every user space program
  • 49:52 - 49:57
    accesses hardware directly, it ended up
    implementing like a good part of the
  • 49:57 - 50:08
    chipset, at least as stubs or enough logic
    to get the code running. And I later on
  • 50:08 - 50:15
    added some features that actually allowed it
    to talk to the hardware. I can use it as a
  • 50:15 - 50:19
    debugger, but just because it's actually
    running the ME firmware or parts of it
  • 50:19 - 50:26
    inside a normal Linux process, I can just
    use gdb to debug it. And back in April
  • 50:26 - 50:30
    last year, I got that working to the point
    where I could run the bootstrap process,
  • 50:30 - 50:39
    which is where the vulnerability is. And
    then you just develop the exploit against
  • 50:39 - 50:44
    it, which I did. And then I made a mistake
    cleaning up some old chroot
  • 50:44 - 50:52
    environments for closed source software.
    And I nuked my home dir. Yeah. I hadn't
  • 50:52 - 50:57
    yet pushed everything to GitHub. So I
    was stuck with an old version and I decided,
  • 50:57 - 51:00
    you know, let's refactor this and turn it
    into something that might actually at some
  • 51:00 - 51:04
    point be published, which by the way I
    did last summer. This is all public code. The
  • 51:04 - 51:10
    ME Loader thing. It's on GitHub. And
    someone else beat me to it and replicated
  • 51:10 - 51:15
    that exploit by the Russian guys, who up to
    then had produced a proof of concept
  • 51:15 - 51:23
    thing for Apollo Lake chipsets, which was
    completely different from what you had
  • 51:23 - 51:34
    to do for normal ME. I was a bit
    disappointed by that one, not being the
  • 51:34 - 51:39
    first one to actually replicate this. But
    then, about a week later, I
  • 51:39 - 51:44
    got my loader back to the point where I
    could actually get to the vulnerable code
  • 51:44 - 51:51
    and develop that exploit and got it
    working not too long after. And here's the
  • 51:51 - 51:55
    great thing. Then I went to the hacker
    space. I flashed it into my laptop. The
  • 51:55 - 51:59
    image that I had just been using only on
    the emulator. I didn't change it. I flashed it.
  • 51:59 - 52:05
    I was like, this is never gonna work on
    it. It works. some laughter And I've still got an image
  • 52:05 - 52:08
    on a flash chip with me because that's
    what I used to actually turn on the
  • 52:08 - 52:14
    debugger. And then you need a debug probe
    because that USB based debugging stuff
  • 52:14 - 52:19
    that's mentioned here only works pretty
    late in boot. Which is also why I only
  • 52:19 - 52:22
    really see Apollo Lake stuff because on
    those chipsets you can actually use this
  • 52:22 - 52:33
    for the ME. And then you need this thing
    because there's a second channel, that is
  • 52:33 - 52:36
    using the USB plug, but it's a completely
    different physical layer and you need an
  • 52:36 - 52:41
    adapter for it, which I don't think was
    intended to be publicly available. Because
  • 52:41 - 52:45
    if you go to Intel's site and say, I want to
    buy this, they say, here's the CNDA,
  • 52:45 - 52:54
    please sign it. But it appeared on Mouser.
    And luckily I knew some people, who had
  • 52:54 - 52:59
    done some other stuff, got a nice bounty
    for it and bought it and let me use it.
  • 52:59 - 53:05
    Thanks to them. It's expensive, but you
    can buy it if it's still up there. Haven't
  • 53:05 - 53:12
    checked. That's the link. So I'm a bit
    late, so I'm gonna use the time for
  • 53:12 - 53:16
    questions as well. So the main thing the
    ME does that you cannot replace is the
  • 53:16 - 53:21
    boot process. It's not just breaking the
    system if you don't turn it on, it
  • 53:21 - 53:25
    actually does stuff that has to be done.
    So you're gonna have to use the ME anyway if
  • 53:25 - 53:31
    you want to boot a computer. You don't
    necessarily have to use Intel's firmware.
  • 53:31 - 53:36
    The ME itself boots like a microkernel
    system, so it has a process which
  • 53:36 - 53:40
    implements a lot of the servers that will
    allow it to get to a point where it can
  • 53:40 - 53:45
    start those servers. This process has very
    high privileges in older versions, which
  • 53:45 - 53:49
    is what is being used on these chipsets.
    And if you exploit that, you're still ring
  • 53:49 - 53:56
    3, but you can turn on the debugger and you
    can use the debugger to become ring 0. So
  • 53:56 - 53:59
    this is what the normal boot process for a
    computer looks like. And this is what
  • 53:59 - 54:02
    happens when you use Boot Guard. There's a
    bit of code that runs even before the
  • 54:02 - 54:07
    reset vector, and that's started by micro
    code initialization, of course. And this
  • 54:07 - 54:12
    is what actually happens. The ME loads a
    new firmware into a power management
  • 54:12 - 54:16
    controller, it then readies some stuff in the
    chipset and it tells the power management
  • 54:16 - 54:24
    controller like please stop pulling that
    CPU reset pin low and the CPU will start.
  • 54:24 - 54:28
    The power management controller is a completely
    independent thing, an 8051 derived
  • 54:28 - 54:33
    microcontroller that runs a real time
    operating system from the 90s. This is the
  • 54:33 - 54:39
    only string in the firmware by the way,
    that's quoted there. And depending on the
  • 54:39 - 54:42
    chipset that you have, it's either loaded
    with a patch or with a complete binary
  • 54:42 - 54:47
    from the ME, and it does a lot of
    important stuff. No documentation on it
  • 54:47 - 54:52
    besides the ACPI interface, which is not
    really useful. The ME has to do these
  • 54:52 - 54:59
    things. It needs to load the keys for the
    Boot Guard process, it needs to set up clock
  • 54:59 - 55:07
    controllers and then tell the PMC to turn
    on the power to the CPU. It needs to
  • 55:07 - 55:15
    configure the PCI Express fabric and reset -
    like get the CPU to come out of reset.
  • 55:15 - 55:18
    There's a lot of code involved in this, so
    I really didn't want to do this all
  • 55:18 - 55:22
    statically. What I did is I added hardware
    support, hardware passthrough support to
  • 55:22 - 55:28
    the emulator and booted my laptop that
    way. Actually had a video of this, but I
  • 55:28 - 55:34
    don't have the time to show it, which is a
    pity. But this is what I - the bring up
  • 55:34 - 55:38
    process from the ME running in a Linux
    process, sending whatever hardware accesses
  • 55:38 - 55:43
    it was trying to do that are important
    for boot to the debugger. And then that
  • 55:43 - 55:50
    was using an ME in real hardware that was
    halted to actually do the register accesses
  • 55:50 - 55:57
    and it works. I'm not going to show this.
    It actually booted the computer reliably.
  • 55:57 - 56:02
    Then Boot Guard configuration is fun
    because you know where they say they fuse
  • 56:02 - 56:11
    in the keys. Well yeah. But the ME loads
    them from fuses and then manually loads
  • 56:11 - 56:15
    them into registers. So if you have code
    execution on the ME before it does this,
  • 56:15 - 56:18
    you can just load your own values and you
    can run coreboot even on a machine that
  • 56:18 - 56:24
    has Boot Guard. Yeah. So I'm gonna go
    through this really quickly. This is, by
  • 56:24 - 56:30
    the way, these are the registers that
    configure what security model the CPU is
  • 56:30 - 56:35
    gonna enforce for the firmware. I'm going
    to release this code after my talk. It's
  • 56:35 - 56:40
    part of a Python script that I wrote that
    uses the debugger to start the CPU without
  • 56:40 - 56:46
    ME firmware. I traced everything the ME
    firmware did. And I now have a Python
  • 56:46 - 56:51
    script that can just start a computer
    without Intel's code. If you translate
  • 56:51 - 56:56
    this into a ROP sequence or even into
    binary for the ME, you can start a
  • 56:56 - 57:03
    computer without the ME itself or at least
    without it running the operating system.
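A rough Python sketch of the "trace it once, then replay it" idea described above: a recorded list of register accesses is played back through whatever transport can reach the hardware. Everything here is hypothetical (the `Access` record, the `replay` function, the addresses and values); it is not the speaker's script and contains no real register sequence.

```python
from typing import NamedTuple


class Access(NamedTuple):
    kind: str                # "write" or "poll"
    address: int             # register address (placeholder values below)
    value: int               # value to write, or value to wait for
    mask: int = 0xFFFFFFFF   # bits compared when polling


def replay(trace, read32, write32, max_polls=1000):
    """Replay recorded accesses through caller-supplied read/write functions."""
    for step in trace:
        if step.kind == "write":
            write32(step.address, step.value)
        elif step.kind == "poll":
            # Wait until the masked register value matches what was traced.
            for _ in range(max_polls):
                if (read32(step.address) & step.mask) == step.value:
                    break
            else:
                raise TimeoutError(f"register {step.address:#x} never matched")
        else:
            raise ValueError(f"unknown access kind: {step.kind}")


# Stand-in for real hardware: a dictionary acting as register memory.
fake_regs = {}
demo_trace = [
    Access("write", 0x1000, 0x1),
    Access("poll", 0x1000, 0x1, 0x1),
]
replay(demo_trace, lambda addr: fake_regs.get(addr, 0), fake_regs.__setitem__)
print(fake_regs)
```

Passing the read and write functions in as arguments keeps the trace independent of the transport, so the same list could in principle be driven through a debugger, an emulator, or translated into another form.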
  • 57:03 - 57:13
    applause
    So, yeah, future goals. I really do want
  • 57:13 - 57:20
    to share this because if there is a way to
    escalate to ring 0 through a ROP chain,
  • 57:20 - 57:24
    then you could just start your own kernel
    in the ME and have custom firmware, at
  • 57:24 - 57:30
    least from the vulnerability on. But you
    could also build a mod chip that uses the
  • 57:30 - 57:35
    debugger interface to load a new firmware.
    There's lots of stuff that still needs to be
  • 57:35 - 57:41
    discovered, but I'm gonna hang out at the
    open source firmware village later, at
  • 57:41 - 57:47
    least part of the week here. So because I
    really want to get started on open source
  • 57:47 - 57:55
    ME firmware using this. Right. And there's
    a lot of people that have played a role in
  • 57:55 - 58:01
    getting me to this point. Also would like
    to thank the guy from Hague hacker space,
  • 58:01 - 58:08
    BinoAlpha, who basically allowed me to use
    his laptop to prepare the demo, which I
  • 58:08 - 58:15
    ended up not being able to show, but.
    Right. I was gonna ask whether there are
  • 58:15 - 58:17
    any questions. But I don't think
    there's really any time for any more.
  • 58:17 - 58:23
    Herald: Peter, thank you so much. Applause
    Unfortunately, we don't have any more time
  • 58:23 - 58:31
    left.
    Peter: I'll be around. I'll be around.
  • 58:31 - 58:36
    Herald: I think it's very, very
    interesting because I hope that your talk
  • 58:36 - 58:41
    will inspire many people to keep looking
    into how the management engine works and
  • 58:41 - 58:47
    hopefully uncover even more stuff. I think
    we have time for just one single question.
  • 58:47 - 58:51
    I don't know, do we? How one from the
    Internet. Thank you so much.
  • 58:51 - 58:57
    Signal Angel: OK. First off, I have to
    tell you. Your shirt is nice. Chat wanted
  • 58:57 - 59:05
    me to say this. And they asked how
    reliable this exploit is and does it work
  • 59:05 - 59:09
    on every boot?
    Peter: Right, Yeah. That's actually
  • 59:09 - 59:15
    something really important that I forgot
    to mention. So they patched the vulnerability,
  • 59:15 - 59:17
    but they didn't provide downgrade
    protection. If you can flash a
  • 59:17 - 59:24
    vulnerable image with an exploit in it,
    it'll just boot every time on these chips
  • 59:24 - 59:28
    so sixth or seventh generation chips.
    Just put in that image and it will
  • 59:28 - 59:31
    reliably turn on the debugger every time
    you turn on the computer. applause
  • 59:31 - 59:37
    Herald: Thank you so much for the
    question. And Peter Bosch thank you so
  • 59:37 - 59:39
    much. Please give him a great round of
    applause.
  • 59:39 - 59:44
    applause
  • 59:44 - 60:08
    subtitles created by c3subtitles.de
    in the year 20??. Join, and help us!
Title:
36C3 - Intel Management Engine deep dive
Video Language:
English
Duration:
01:00:08