0:00:00.000,0:00:19.152
36C3 preroll music
0:00:19.152,0:00:22.520
Herald: The next talk is Intel[br]Management Engine deep dive:
0:00:22.520,0:00:27.230
Understanding the ME at the OS and[br]hardware level, and it is by Peter Bosch.
0:00:27.230,0:00:31.089
Please welcome him with a great round of[br]applause!
0:00:31.089,0:00:38.780
Applause
0:00:38.780,0:00:49.409
Peter Bosch: Right. So, everybody hear me?[br]Nice. OK. So welcome. Well, this is me.
0:00:49.409,0:00:59.510
I'm a student at Leiden University. Yeah,[br]I've always been really interested in how
0:00:59.510,0:01:04.610
stuff works. And when I got a new laptop,[br]I was like, you know, how does this thing
0:01:04.610,0:01:08.410
really boot? I knew everything from reset[br]vector onwards. I wanted to know what
0:01:08.410,0:01:15.221
happened before it. So first I started[br]looking at the Boot Guard ACM. While
0:01:15.221,0:01:21.420
looking through it, I realized that not[br]everything was as it was supposed to be.
0:01:21.420,0:01:26.280
That led to a later part in the boot[br]process being vulnerable, which ended up
0:01:26.280,0:01:34.249
being discovered by me. And I found out[br]here last year that I wasn't the only one
0:01:34.249,0:01:38.310
to find it. Trammell Hudson also found it,[br]and we reported it together, presented it
0:01:38.310,0:01:43.399
at Hack in the Box. And then at the same[br]time, I was already also looking at the
0:01:43.399,0:01:49.350
management engine. Well, there had been a[br]lot of research done on that before. The
0:01:49.350,0:01:58.140
public info was mostly on the file system[br]and on specific vulnerabilities, which
0:01:58.140,0:02:04.400
still made it pretty hard to get started[br]on reverse-engineering it. So that's why I
0:02:04.400,0:02:10.340
thought it might be useful for me to[br]present this work here. It's basically
0:02:10.340,0:02:16.910
broken up into three parts. The first bit[br]is just a quick introduction into the
0:02:16.910,0:02:22.250
operating system it runs. So if you want[br]to work on this yourself, you're more
0:02:22.250,0:02:28.690
easily able to understand what's in your[br]face in your disassembler. So and then
0:02:28.690,0:02:37.950
after that, I'll go over its role in the[br]boot process and then also how this
0:02:37.950,0:02:45.780
information can be used to start[br]developing a new firmware for it or do
0:02:45.780,0:02:49.730
more security research on it. So first of[br]all, what exactly is the management
0:02:49.730,0:02:57.280
engine? There's been a lot of fuss about[br]it being a backdoor and everything, in
0:02:57.280,0:03:05.000
reality, if it is or not depends on the[br]software that it runs. It's basically a
0:03:05.000,0:03:09.110
processor with its own RAM and its own I/O[br]and MMUs and everything, sitting inside
0:03:09.110,0:03:16.049
your southbridge. It's not in the CPU,[br]it's in the southbridge. So when I say this
0:03:16.049,0:03:24.010
is gonna be about the sixth and seventh[br]generation of Intel chips, I mean, mostly
0:03:24.010,0:03:28.489
motherboards from those generations. If[br]you run a newer CPU on it, it will also
0:03:28.489,0:03:39.584
work for that. So yeah. Bit more detail.[br]CPU it runs is based on the 80486, which,
0:03:39.584,0:03:43.510
you know, is funny. It's quite an old CPU[br]and it's still being used in almost
0:03:43.510,0:03:51.079
every computer nowadays. So it has a[br]little bit of its own RAM. It has quite a
0:03:51.079,0:03:58.150
bit of built in ROM, has a hardware[br]accelerated cryptographic unit and it has
0:03:58.150,0:04:05.450
fuses, which are write-once memory used[br]to store security settings and keys and
0:04:05.450,0:04:11.079
everything. Some of the more scary[br]features it has: Bus bridges to all of the
0:04:11.079,0:04:16.419
buses inside the southbridge, it can[br]access the RAM on the CPU and it can
0:04:16.419,0:04:21.359
access the network, which makes it really[br]quite dangerous if there is a
0:04:21.359,0:04:28.409
vulnerability or if it runs anything[br]nefarious. And its tasks nowadays include
0:04:28.409,0:04:35.860
starting the computer as well as adding[br]management features. This is mostly used
0:04:35.860,0:04:41.190
in servers where it can serve as a board[br]management controller, do like a remote
0:04:41.190,0:04:49.001
keyboard and video, and it does security:[br]Boot Guard, which is the signing of a
0:04:49.001,0:04:54.830
firmware and verification of signatures.[br]It implements a firmware TPM and there is
0:04:54.830,0:05:02.590
also an SDK to use it as a general purpose[br]secure enclave. So on the software side of
0:05:02.630,0:05:12.650
it, it runs a custom operating system,[br]parts of which are taken from MINIX, the
0:05:12.650,0:05:17.250
teaching operating system by Andrew[br]Tanenbaum. It's a microkernel operating
0:05:17.250,0:05:32.930
system. It runs binaries that are in a[br]completely custom format. It's really
0:05:32.930,0:05:36.030
quite a high-level system actually. If you[br]look at it in terms of the operating
0:05:36.030,0:05:40.681
system it runs, it's mostly like Unix,[br]which makes it kind of familiar, but it
0:05:40.681,0:05:46.819
also has large custom parts. Like I said[br]before, in this talk I'm going to be
0:05:46.819,0:05:52.740
speaking about sixth and seventh[br]generation Intel Core chipsets, so that's
0:05:52.740,0:05:58.949
Sunrise Point, Lewisburg, which is the[br]server version of this, and also the laptop
0:05:58.949,0:06:04.410
system-on-a-chip, they're just called Intel[br]Core low power. They also include the
0:06:04.410,0:06:08.360
chipset as a separate die. So it also[br]applies to them. In fact, I've been
0:06:08.360,0:06:11.979
testing most of the stuff I'm going to[br]tell you about on the laptop that's
0:06:11.979,0:06:19.430
sitting right here, which is a Lenovo[br]T460. The version of the firmware I've been
0:06:19.430,0:06:30.820
looking at is 11.0.0.1205. Right. So I do[br]need to put this up there. I'm not a part
0:06:30.820,0:06:38.520
of Intel, nor have I signed any contracts[br]with them. I've found everything in ways
0:06:38.520,0:06:43.500
that you could also do. I didn't have any[br]leaked NDA stuff or anything that you
0:06:43.500,0:06:53.099
couldn't get your hands on. It's also a[br]very wide subject area, so there might be
0:06:53.099,0:07:00.580
some mistakes here or there, but generally[br]it should be right. Well, if you want to
0:07:00.580,0:07:04.220
get started working on an ME firmware,[br]want to reverse-engineer it or modify it
0:07:04.220,0:07:08.580
in some way, first you've got to deal with[br]the image file. You've got your SPI flash.
0:07:08.580,0:07:12.009
It's where most of its firmware lives, in[br]the same flash chip as your BIOS. So
0:07:12.009,0:07:17.410
you've got that image. And then how do you[br]get the code out? Well, there's tools for
0:07:17.410,0:07:22.949
that. It's already been extensively[br]documented by other people.
0:07:22.949,0:07:28.681
And you can basically just download a tool[br]and run it against it. Which makes this
0:07:28.681,0:07:31.690
really easy. This is also the reason why[br]there hasn't been a lot of research done
0:07:31.690,0:07:35.940
yet: before these tools were around, you[br]couldn't get to all of the code. The
0:07:35.940,0:07:41.349
kernel was compressed using Huffman[br]tables, which were stored in ROM. You
0:07:41.349,0:07:45.360
couldn't get to the ROM without getting[br]code execution on the thing. So there was
0:07:45.360,0:07:52.639
basically no way of getting access to the[br]kernel code, and I think also the syslib
0:07:52.639,0:07:55.800
library. But that's not a problem anymore.[br]You can just download a tool and unpack
0:07:55.800,0:08:02.520
it. Also, the Intel tool to generate[br]firmware images, which you can find in
0:08:02.520,0:08:11.979
some open directories on the internet, has[br]Qt resources, XML-files which basically have the
0:08:11.979,0:08:18.330
description for all of the file formats[br]used by these ME versions, including names
0:08:18.330,0:08:26.050
and comments to go with those structure[br]definitions. So that's really useful. So
0:08:26.050,0:08:30.430
we look at one of these images. It has a[br]couple of partitions, some of them overlap
0:08:30.430,0:08:38.150
and some of them are storage, some are[br]code. So there are the main partitions,
0:08:38.150,0:08:45.709
FTPR and NFTP, which contain the programs[br]it runs. There's MFS, which is the read-write
0:08:45.709,0:08:51.980
file system it uses for persistent[br]storage. And then there is a log to flash
0:08:51.980,0:08:57.320
option, the possibility to embed a token[br]that will tell the system to unlock all
0:08:57.320,0:09:02.850
debug access which has to be signed by[br]Intel so it's not really of any use to us.
0:09:02.850,0:09:07.439
And then there is something interesting,[br]ROM bypass. Like I said, you can't get
0:09:07.439,0:09:13.160
access to the ROM without running code on[br]it. And ROM is mask ROM. So it's internal
0:09:13.160,0:09:17.540
to the chip, but Intel has to develop new[br]ROM code and has to test it without
0:09:17.540,0:09:23.270
respinning the die every time. So they[br]have a possibility on an unlocked
0:09:23.270,0:09:28.170
preproduction chipset to completely bypass[br]the internal ROM and load even the early
0:09:28.170,0:09:33.670
boot code from the flash chip. Some of[br]these images have leaked and you can use
0:09:33.670,0:09:39.250
them to get a look at the ROM code, even[br]without being able to dump it. That's
0:09:39.250,0:09:45.610
going to be really useful later on. So[br]then you've got these code partitions and
0:09:45.610,0:09:51.230
they contain a whole lot of files. So[br]there are the binaries themselves, which
0:09:51.230,0:09:57.569
don't have any extension. There are the[br]metadata files. So the binary format they
0:09:57.569,0:10:05.350
use has no headers, nothing included. And[br]all of that data is in the metadata file.
0:10:05.350,0:10:12.000
And when you use the unME11 tool, it'll[br]actually convert those to text
0:10:12.000,0:10:16.069
files for you so you can just get started[br]without really understanding how they
0:10:16.069,0:10:26.640
work. Yes. So the metadata. It's a type-[br]length-value structure, which contains a
0:10:26.640,0:10:31.180
whole lot of information the operating[br]system needs. It has the info on the
0:10:31.180,0:10:35.820
module, whether it's data or code, where[br]it should be loaded, what the privileges
0:10:35.820,0:10:43.390
of the process should be, a SHA[br]checksum for validating it and also some
0:10:43.390,0:10:49.000
higher level stuff such as device file[br]definitions if it's a device driver or any
0:10:49.000,0:10:55.430
other kind of server. I've actually[br]written some code that uses this, that's
0:10:55.430,0:11:01.460
on GitHub, so if you want a closer look at[br]it, some of the slides have a link to
0:11:01.460,0:11:09.780
get a file in there which contains the[br]full definitions. Right. So all the code
0:11:09.780,0:11:16.801
on the ME is signed and verified by Intel.[br]So you can't just go and put in a new
0:11:16.801,0:11:24.689
binary and say, hey, let's run this. The[br]way they do this is in Intel's
0:11:24.689,0:11:30.300
manufacture-time fuses, they have a hash[br]of the public key that they use to sign
0:11:30.300,0:11:36.070
it. And then on each flash partition,[br]there is a manifest which is signed by the
0:11:36.070,0:11:40.820
key and it contains the SHA hashes for all[br]the metadata files, which then contain a
0:11:40.820,0:11:47.150
SHA hash for the code files. There don't[br]seem to be any major problems in verifying
0:11:47.150,0:11:52.530
this, so it's useful to know, but[br]you're not really gonna use this. And then
0:11:52.530,0:12:00.300
the modules themselves, as I've said,[br]they're flat binaries. Mostly. The
0:12:00.300,0:12:05.560
metadata contains all the info the kernel[br]uses to reconstruct the actual program
0:12:05.560,0:12:13.530
image in memory. And a curious thing here[br]is that the actual base address for all
0:12:13.530,0:12:17.459
the modules, for all programs, is the same[br]across an image. So if you have a
0:12:17.459,0:12:19.930
different version, it's going to be[br]different. But if you have two programs
0:12:19.930,0:12:25.949
from the same firmware it's gonna be[br]loaded at the same virtual address. Right.
0:12:25.949,0:12:32.820
So when you want to look at it, you're[br]gonna load it in some disassembler, like
0:12:32.820,0:12:39.540
for example IDA, and you'll see this, it[br]disassembles fine, but it's gonna
0:12:39.540,0:12:44.270
reference all kinds of memory that you[br]don't have access to. So usually you'd
0:12:44.270,0:12:49.459
think maybe I've loaded it at the wrong address[br]or am I missing some library? Well,
0:12:49.459,0:12:55.150
here you've loaded it correctly if you use[br]the address from the metadata file.
0:12:55.150,0:13:02.310
But you are in fact missing a lot of[br]memory segments. And let's just take a
0:13:02.310,0:13:09.829
look at each of these. It's calling and[br]switching code. It's pushing a pointer
0:13:09.829,0:13:15.890
there, which is data. And what's that? So[br]it has shared libraries, even though it's
0:13:15.890,0:13:19.920
flat binaries. It actually does use shared[br]libraries because you only have 1.5
0:13:19.920,0:13:24.319
megabytes of RAM. You don't want to[br]link your C library into everything and
0:13:24.319,0:13:32.800
waste what little memory you have. So[br]there is the main system library which is
0:13:32.800,0:13:39.270
like libc on a Linux system. It's in a[br]flash partition, so you can actually just
0:13:39.270,0:13:45.689
load it and take a look at it easily and[br]it starts out with a jump table. So
0:13:45.689,0:13:48.770
there's no symbols in the metadata file or[br]anything. It doesn't do dynamic linking.
0:13:48.770,0:13:56.549
It loads the pages for the shared library[br]at a fixed address, which is also in the
0:13:56.549,0:14:01.620
shared library's metadata. And then it's[br]just there in the process's memory and
0:14:01.620,0:14:06.130
it's gonna jump there if it needs a[br]function. And the functions themselves are
0:14:06.130,0:14:12.890
just using the normal System V x86[br]calling conventions. So it's pretty easy
0:14:12.890,0:14:17.980
to look at that using your normal tools.[br]There's no weird register argument passing
0:14:17.980,0:14:24.559
going on here. So, right. Now, shared[br]libraries. There's two of them. And this
0:14:24.559,0:14:28.160
is where it gets annoying. The system[br]library, you've got access to that so you
0:14:28.160,0:14:32.850
can just take your time and go through it[br]and try to figure out, you know, oh, hey,
0:14:32.850,0:14:39.880
is this open or is this read or what's[br]this function doing? But then there's also
0:14:39.880,0:14:49.150
another second really large library, which[br]is in ROM. They have all the C library
0:14:49.150,0:14:54.300
functions and some of their custom helper[br]routines that don't interact with the
0:14:54.300,0:15:00.920
kernel directly, such as string[br]functions. They live in ROM. So when
0:15:00.920,0:15:04.700
you've got your code and this is basically[br]where I was when I was here last year,
0:15:04.700,0:15:07.040
you're looking through it and you're[br]seeing calls to a function you don't have
0:15:07.040,0:15:11.010
the code for all over the place. And you[br]have to figure out by its signature what
0:15:11.010,0:15:14.870
is it doing. And that works for some of[br]the functions and it's really difficult
0:15:14.870,0:15:20.610
for other ones. That really had me stopped[br]for a while. Then I managed to find one of
0:15:20.610,0:15:25.070
these ROM bypass images and I had the code[br]for a very early development build of the
0:15:25.070,0:15:29.370
ROM. This is where I got lucky. So the[br]actual entry point addresses are fixed
0:15:29.370,0:15:33.939
across an entire chipset family. So if you[br]have an image for the server version of
0:15:33.939,0:15:39.310
like 100 series chipset or for client[br]version or for a desktop or laptop
0:15:39.310,0:15:47.540
version, it's all gonna be the same ROM[br]addresses. So even though the code might
0:15:47.540,0:15:51.930
be different, you'll have the jump table,[br]which means the addresses can stay fixed.
0:15:51.930,0:15:56.760
So this only needs to be done once. And in[br]fact when I upload my slides later, there
0:15:56.760,0:16:02.919
is a slide in there at the end that has[br]the addresses for the most used functions.
0:16:02.919,0:16:07.350
So you're not going to have to repeat that[br]work, at least not for this chipset. So if
0:16:07.350,0:16:15.160
you want to look at a simple module,[br]you've loaded it, now you've applied the
0:16:15.160,0:16:21.860
things I just said, and you still don't[br]have the data sections. I don't know
0:16:21.860,0:16:26.669
what that function there is doing, but[br]it's not very important. It actually
0:16:26.669,0:16:33.230
returns a value, I think, that's not used[br]anywhere, but it must have a purpose
0:16:33.230,0:16:40.220
because it's there. Right. So then you[br]look at the entry point and this is a lot
0:16:40.220,0:16:44.660
of stuff. And the main thing that matters[br]here is on the right half of the screen,
0:16:44.660,0:16:50.189
there is a listing from a MINIX repository[br]and on the left half there is a
0:16:50.189,0:16:54.809
disassembly from an ME module. So it's[br]mostly the same. There is one key
0:16:54.809,0:16:58.419
difference, though. The ME module actually[br]has a little bit of code that runs before
0:16:58.419,0:17:06.230
its C library startup function. And that[br]function actually does all the ME specific
0:17:06.230,0:17:13.980
initialization, does a lot of stuff[br]related to how C library data is kept
0:17:13.980,0:17:21.520
because there are also no data segments for[br]the C library being allocated by the
0:17:21.520,0:17:25.820
kernel. So each process actually reserves[br]a part of its own memory and tells the C
0:17:25.820,0:17:31.290
library, like, any global variables you[br]can store in there. But when you look at
0:17:31.290,0:17:37.610
that function, one of the most important[br]things that it calls is this function.
0:17:37.610,0:17:41.510
It's very simple, it just copies a bunch[br]of RAM. So they don't have support for
0:17:41.510,0:17:46.650
initialized data sections. It's a flat[br]binary. What they do is they actually
0:17:46.650,0:17:51.520
use the .bss segment, the zeroed segment[br]at the end of the address space, and copy
0:17:51.520,0:17:57.070
over a bunch of data in the program. The[br]program itself is not aware of this. It's
0:17:57.070,0:18:04.180
really in the initialization code and in[br]the linker script. So this is also something
0:18:04.180,0:18:09.170
that's very important because you're going[br]to need to also at that address in the
0:18:09.170,0:18:13.310
data section, you're going to need to load[br]the last bit of the binary.
0:18:13.310,0:18:20.520
Otherwise you're missing constants or at[br]least initialization values. Right. Then
0:18:20.520,0:18:26.150
there is the full memory map to the[br]processes themselves. It's a flat 32 bit
0:18:26.150,0:18:31.970
address space. It's got everything you[br]expect in there. It's got a stack and a
0:18:31.970,0:18:39.500
heap and everything. There's a little bit[br]of heap allocated right on initialization.
0:18:39.500,0:18:44.690
This is basically how you derive[br]the address space layout from the
0:18:44.690,0:18:51.100
metadata, especially like the data[br]segment, then, and the stack itself is
0:18:51.100,0:18:56.180
like the address location varies a lot[br]because of the number of threads that are
0:18:56.180,0:19:03.380
in use or the size of data sections. And[br]also those stack guards, they're not
0:19:03.380,0:19:07.960
really stack guards. There is also[br]metadata for each thread in there. But
0:19:07.960,0:19:13.640
that's nothing that's relevant to the[br]process itself, only to the kernel. And
0:19:13.640,0:19:21.890
well, if you then skip forward a bit and[br]you've done all these - you look at your
0:19:21.890,0:19:28.790
simple driver like this. This is taken[br]from a driver used to talk to the CPU,
0:19:28.790,0:19:34.630
like, OK. So when I say CPU or host, by[br]the way, I mean the CPU, like your big
0:19:34.630,0:19:39.370
Skylake, or Kaby Lake, or Coffee Lake,[br]whatever your big CPU that runs your own
0:19:39.370,0:19:46.070
operating system. Right. So this is used[br]to send messages there. But if you look
0:19:46.070,0:19:51.680
at what's going on here, OK - think I had[br]a problem with the animation here - it
0:19:51.680,0:19:57.000
sets up some stuff and then it calls a[br]library function that's in the main syslib
0:19:57.000,0:20:01.270
library, which actually has a main loop[br]for the program. That's because Intel was
0:20:01.270,0:20:06.440
smart and they added a nice framework for[br]device driver implementing programs,
0:20:06.440,0:20:10.130
because it's a microkernel, so device[br]drivers are just usual programs, calling
0:20:10.130,0:20:20.060
specific APIs. Then there's normal POSIX[br]file I/O. No standard I/O, but it has all
0:20:20.060,0:20:26.530
the normal open, and read, and ioctl and[br]everything functions. And then there's
0:20:26.530,0:20:30.170
more initialization for the srv library.[br]And this is basically what all the simple
0:20:30.170,0:20:38.890
drivers look like in it. And then there's[br]this. Because they're so low on memory,
0:20:38.890,0:20:50.040
they don't actually use standard I/O, or[br]even printf itself to do most of the
0:20:50.040,0:20:54.820
debugging. It uses a thing that's called[br]"sven", I'll touch on that later. So there
0:20:54.820,0:20:59.150
is the familiar APIs that I talked about.[br]It even has POSIX threads, or at least a
0:20:59.150,0:21:04.510
subset of it, and there is all the[br]functions that you'd expect to find on
0:21:04.510,0:21:08.700
some generic Unix machine. So that[br]shouldn't be too much of a problem to deal
0:21:08.700,0:21:14.570
with, but then there's also their own[br]tracing solution, sven. That's what Intel
0:21:14.570,0:21:17.350
calls it. The name is in all the development[br]tools that you can download
0:21:17.350,0:21:23.370
from their site, and basically, they don't[br]include format strings for a lot of the
0:21:23.370,0:21:28.390
stuff. They just have a 32-bit identifier[br]that is sent over the debug port, and it
0:21:28.390,0:21:34.270
refers to a format string in a dictionary[br]that you don't have. There is one of the
0:21:34.270,0:21:38.820
dictionaries for a server chip that's[br]floating around the internet, but even
0:21:38.820,0:21:45.940
that is incomplete. And the normal non-NDA[br]version of the Intel developer tools has
0:21:45.940,0:21:53.810
some 50 format strings for really common[br]status messages it might output, but yeah,
0:21:53.810,0:21:57.391
like, if you see these functions, just[br]realize it's doing some debug print. It
0:21:57.391,0:22:00.550
might be dumping some states or just[br]telling it it's gonna do something else.
0:22:00.550,0:22:12.020
No important logic actually happens[br]in here. Right. So then for device files.
0:22:12.020,0:22:16.190
They're actually defined in a manifest.[br]When the kernel loads a program, and that
0:22:16.190,0:22:20.830
program wants to expose some kind of[br]interface to other programs its manifest
0:22:20.830,0:22:27.780
will contain, or its metadata file will[br]contain a special file producer entry, and
0:22:27.780,0:22:33.120
that says, you know, you have these device[br]files, with a name, and an access mode and
0:22:33.120,0:22:39.210
the user, and group ID, and everything,[br]and the minor numbers, and the kernel
0:22:39.210,0:22:42.830
sends this to the- or not kernel- the[br]program loader sends this to the virtual
0:22:42.830,0:22:47.720
file system server and it automatically[br]gets a device file, pointing to the right
0:22:47.720,0:22:51.800
major or minor number. And then there's[br]also a library, as I said, to provide a
0:22:51.800,0:23:03.680
framework for a driver. And that looks[br]like this. It's really easy to use. If you
0:23:03.680,0:23:08.070
were an ME developer you just write some[br]callbacks for open, and close, and
0:23:08.070,0:23:11.000
everything, and it automatically calls[br]them for you, when a message comes in,
0:23:11.000,0:23:15.400
telling you that that happened, which also[br]makes it really easy to reverse engineer,
0:23:15.400,0:23:21.100
'cause if you look at a driver, it just[br]loads some callbacks, and you can know, by
0:23:21.100,0:23:27.510
their offset in a structure, what actual[br]call they're implementing. Right, so then
0:23:27.510,0:23:31.950
there is one of the more weird things[br]that's going on here: How the actual
0:23:31.950,0:23:37.470
userland programs get access to memory-mapped[br]registers. There's a lot of this going on.
0:23:37.470,0:23:42.830
Calls to a couple of functions that have[br]some magic arguments. The second one you
0:23:42.830,0:23:50.640
can easily tell is the offset, because it[br]has- it increases in very nice power-of-
0:23:50.640,0:23:54.670
two steps, so it's probably the register[br]offsets, and then what comes after it
0:23:54.670,0:24:00.160
looks like a value. And then the first bit[br]seems to be a magic number. Well, it's
0:24:00.160,0:24:05.479
not. There is also an extension in the[br]metadata, saying these are the memory
0:24:05.479,0:24:12.170
mapped I/O ranges, and those ranges[br]each list a physical base address,
0:24:12.170,0:24:19.360
and a size, and permissions for them. Then[br]the index in that list does not directly
0:24:19.360,0:24:23.150
correspond to the magic value. The magic[br]value actually you need to do a little
0:24:23.150,0:24:27.680
computation on the offset, and you can[br]access it through those functions. The
0:24:27.680,0:24:38.600
computation itself might be familiar.[br]Yeah, so these are the functions. The
0:24:38.600,0:24:44.610
value is a segment selector. So they[br]actually don't use paging for inter-
0:24:44.610,0:24:51.820
process isolation, they use segments, like[br]x86 Protected Mode segments. And for each
0:24:51.820,0:24:56.610
memory mapped I/O range there is a[br]separate segment, and you manually specify
0:24:56.610,0:25:04.280
that, which is just weird to me, like, why[br]would you use x86 segmenting on a modern
0:25:04.280,0:25:10.610
system? Minix does it, but, yeah, to[br]extend that even to this? Luckily, normal
0:25:10.610,0:25:16.130
address space is flat, like, to the[br]process, not to the kernel. Right, so now
0:25:16.130,0:25:24.870
we can access memory mapped I/O. That's[br]all the, like the really high level stuff.
0:25:24.870,0:25:28.700
So what's going on under there? It's got[br]all the basic microkernel stuff, so
0:25:28.700,0:25:33.020
message passing, and then some[br]optimizations to actually make it perform
0:25:33.020,0:25:40.140
well on a really slow CPU. The basics are,[br]you can send a message, you can receive a
0:25:40.140,0:25:46.160
message, and you can send and receive a[br]message, where you basically say "Send a
0:25:46.160,0:25:50.930
message, wait till a response comes in,[br]then continue", which is used to wrap
0:25:50.930,0:25:58.400
function calls. This is mostly the same as[br]in Minix. There's some subtle changes,
0:25:58.400,0:26:08.230
which I'll get to later. And then memory[br]grants are something that only appeared in
0:26:08.230,0:26:13.080
Minix really recently. It's a way for a[br]process to basically create a new name for
0:26:13.080,0:26:16.690
a piece of memory it has, and give a[br]different process access to it, just by
0:26:16.690,0:26:21.630
sharing the number. These are referred to[br]by the process ID and a number of that
0:26:21.630,0:26:28.470
grant. So the grant numbers are actually[br]local per process, so to uniquely identify
0:26:28.470,0:26:35.461
one you need to say process ID plus that[br]number, and they're only granted to a
0:26:35.461,0:26:38.300
single process. So when a process creates[br]one of these, it can't even access it
0:26:38.300,0:26:42.490
itself, unless it creates a grant for[br]itself, which is not really that useful,
0:26:42.490,0:26:51.880
usually. These grants are used to prevent[br]having to copy over all the data inside
0:26:51.880,0:26:57.500
the IPC message used to implement a system[br]call. Yeah, these are the basic operations
0:26:57.500,0:27:03.190
on it. You can create one, you can copy[br]into and from it. So, you can't actually
0:27:03.190,0:27:07.010
map it. A process that receives one of[br]these has to say to the kernel, using a
0:27:07.010,0:27:12.721
system call, "please write this data into[br]that area of memory that belongs to a
0:27:12.721,0:27:17.930
different process." And then there's also[br]indirect grants, because, you know, in
0:27:17.930,0:27:25.309
Minix they do have this, but also only[br]recently, and usually if you have a
0:27:25.309,0:27:30.360
microkernel system, you would have to copy[br]your buffer for a read call first to the
0:27:30.360,0:27:36.540
file system server and then back to, like,[br]either the hard disk driver, or the device
0:27:36.540,0:27:40.620
driver that's implementing a device file.[br]So the ME actually allows you to create a
0:27:40.620,0:27:45.860
grant, pointing to a grant, that was given[br]to you by someone else. And then that
0:27:45.860,0:27:52.820
grant will inherit the privileges of the[br]process that creates it, combined with
0:27:52.820,0:27:57.530
those that it assigns to it. So if the[br]process has a read/write grant it can
0:27:57.530,0:28:01.340
create a read-only or write-only grant,[br]but it cannot, if it only has a read
0:28:01.340,0:28:08.860
grant, it cannot add write rights to it[br]for a different process, obviously. So
0:28:08.860,0:28:12.880
then there is also some big differences[br]from MINIX. In MINIX you address a process
0:28:12.880,0:28:18.080
by its process ID or thread ID with a[br]generation number attached to it. In the
0:28:18.080,0:28:25.440
ME you can actually address IPC to a file[br]descriptor. The kernel doesn't actually know a
0:28:25.440,0:28:28.610
lot about file descriptors, it just[br]implements the basic thing where you have
0:28:28.610,0:28:32.350
a list of files and each process has a[br]list of file descriptors assigning integer
0:28:32.350,0:28:39.320
numbers to those files to refer to them[br]by. And this is used so you can as a
0:28:39.320,0:28:43.040
process, you can actually directly talk to[br]a device driver without knowing what its
0:28:43.040,0:28:47.110
process ID is. So you don't send it to the[br]file system server, you send it to the
0:28:47.110,0:28:51.740
file descriptor and the kernel just[br]magically corrects it for you. And they
0:28:51.740,0:28:55.550
moved select into the kernel so you can[br]tell the kernel: "Hey, I want to wait till
0:28:55.550,0:28:59.720
the file system server tells me that it[br]has data available or till a message comes
0:28:59.720,0:29:05.440
in." This is one of the most complicated[br]system calls the ME offers that's used in
0:29:05.440,0:29:12.010
a normal program. You can mostly ignore it[br]and just look like: "Hey, those arguments
0:29:12.010,0:29:16.760
sort of define a file descriptor set as a[br]bit field." And then there's the message
0:29:16.760,0:29:21.040
that might have been received and there's[br]DMA locks because you don't just want to
0:29:21.040,0:29:24.790
write to registers. You actually might[br]want to do the direct memory access from
0:29:24.790,0:29:30.720
hardware so you can actually tell the[br]kernel to lock one of these memory grants
0:29:30.720,0:29:38.260
in RAM for you, it won't be swapped out[br]anymore. And yeah, it will even tell you
0:29:38.260,0:29:42.020
the physical address so you can just load[br]that into a register and it's not really
0:29:42.020,0:29:46.760
that complicated. Just lock it, get a[br]physical address, write it into the register
0:29:46.760,0:29:53.580
and continue. Well, that's the most[br]important stuff about the operating
0:29:53.580,0:29:58.929
system. The hardware itself is a lot more[br]complicated because the operating system,
0:29:58.929,0:30:03.300
once you have the code, you can just[br]reverse engineer it and get to know it.
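A minimal sketch of the "file descriptor set as a bit field" idea mentioned for the select-style call. The function names are hypothetical, and the real argument layout of the ME system call is not given in the talk; this only shows the general encoding, where bit n stands for file descriptor n.

```python
def fds_to_mask(fds):
    """Pack an iterable of small non-negative fd numbers into one integer."""
    mask = 0
    for fd in fds:
        if fd < 0:
            raise ValueError("file descriptors are non-negative")
        mask |= 1 << fd  # bit n set means "fd n is in the set"
    return mask


def mask_to_fds(mask):
    """Return the sorted list of fd numbers whose bit is set in mask."""
    fds = []
    fd = 0
    while mask:
        if mask & 1:
            fds.append(fd)
        mask >>= 1
        fd += 1
    return fds
```

When reading disassembly, decoding such a mask back into fd numbers is usually enough to see which descriptors a program is waiting on.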
0:30:03.300,0:30:11.010
The hardware. Well, let's just say it's a[br]real pain to have to reverse engineer a
0:30:11.010,0:30:16.179
piece of hardware together with its[br]driver. Like if you've got the driver
0:30:16.179,0:30:18.450
code, but you don't know what the[br]registers do. So you don't know what a lot
0:30:18.450,0:30:24.440
of logic does. And you're trying to both[br]figure out what the logic is and what the
0:30:24.440,0:30:30.050
actual registers do. Right. So first you[br]want to know which physical address goes
0:30:30.050,0:30:39.881
where? The metadata listings I showed you[br]actually have names in there. Those are
0:30:39.881,0:30:47.940
not in the metadata files themselves, I[br]annotated those. So you just see the
0:30:47.940,0:30:56.680
physical address and size. But there is[br]one module, the bus driver module and the
0:30:56.680,0:31:04.230
bus driver is a normal user process, but it[br]implements stuff like PCI configuration
0:31:04.230,0:31:09.550
space accesses and those things. And it[br]has a nice table in it with names for
0:31:09.550,0:31:17.049
devices. So if you just run strings on it,[br]you'll see these things. When I saw this,
0:31:17.049,0:31:20.960
I was pretty glad because at least I[br]could make sense of what device was being
0:31:20.960,0:31:26.680
talked to in a certain program. So[br]the bus driver does all these things. It
0:31:26.680,0:31:30.990
manages power getting to devices, it[br]manages configuration space access, it
0:31:30.990,0:31:35.960
manages the different kinds of buses and[br]IOMMUs that are on the system. And it makes
0:31:35.960,0:31:39.500
sure that the normal driver never has to[br]know any of these details. It just asks
0:31:39.500,0:31:45.520
it for a device by a number assigned to it[br]at build time. And then the bus driver
0:31:45.520,0:31:50.360
says, OK, here's a range of physical[br]address space you can now write to. So
0:31:50.360,0:31:56.640
that's a really nice abstraction and also[br]gives us a lot of information because the
0:31:56.640,0:32:01.640
really old builds for Sunrise Point[br]actually have a hell of a lot of debug
0:32:01.640,0:32:07.021
strings in there as printf format strings,[br]not as catalog IDs. It's
0:32:07.021,0:32:11.910
one of the only pieces of code for the ME[br]that does this, so that already tells you
0:32:11.910,0:32:15.480
a lot. And then there's also the table[br]that I just talked about that has the
0:32:15.480,0:32:23.760
actual info on the devices and names. So I[br]generated some DokuWiki content from this
0:32:23.760,0:32:28.570
that I use myself and this is what's in[br]the table, part of it. So it tells you
0:32:28.570,0:32:33.070
what address PCI configuration space lives[br]at. You can derive the bus device
0:32:33.070,0:32:38.130
function for it through that. It tells you[br]on what chipset SKU they're present using
0:32:38.130,0:32:44.640
a bitfield. And it tells you their names[br]in different fields. It also contains the
0:32:44.640,0:32:48.540
values that are used to write the base[br]address registers for PCI. So also their
0:32:48.540,0:32:54.190
normal memory ranges. And there's even[br]more devices. So the ME has access to a
0:32:54.190,0:32:58.860
lot of stuff. A lot of it is private to[br]it. A lot of it is components that also
0:32:58.860,0:33:06.110
exist in the rest of the computer. And[br]there's not a lot of information. A lot of
0:33:06.110,0:33:11.410
these are basically all the things that[br]are out there together with conference
0:33:11.410,0:33:15.140
slides published by other people who have[br]done research on the ME. I didn't have
0:33:15.140,0:33:21.980
time to add links to those, but they're[br]easy to find on Google. I'll get later to
0:33:21.980,0:33:28.230
this, I actually wrote an emulator for the[br]ME, a partial emulator to be able to run
0:33:28.230,0:33:34.230
ME code and analyze it, which obviously[br]needs to know a bit about the hardware so
0:33:34.230,0:33:41.030
you can look at that. There are some[br]files in Intel's debugger package,
0:33:41.030,0:33:46.150
specific versions of that that have really[br]detailed info on some of the devices, also
0:33:46.150,0:33:51.460
not all of it. And I wrote a tool to[br]parse some of the files. It's really rough
0:33:51.460,0:33:57.040
code. I published it because people wanted[br]to see what I was doing. It doesn't work
0:33:57.040,0:34:04.080
out of the box. And there is a nice talk[br]on this by Mark Ermolov and Maxim
0:34:04.080,0:34:06.870
Goryachy. Actually I don't know if I'm[br]pronouncing that correctly, but they've
0:34:06.870,0:34:12.049
done a lot of work on the ME and this[br]particular talk by them is really useful.
0:34:12.049,0:34:16.339
And then there's also something else.[br]There is a second ME on server chipsets,
0:34:16.339,0:34:21.299
the innovation engine. It's basically a[br]copy paste of the ME to provide an ME that
0:34:21.299,0:34:24.760
the vendor can write code for. I don't think[br]it's used a lot. I've only been able to
0:34:24.760,0:34:31.639
find HP software that actually targets it[br]and that has some more debug strings, but
0:34:31.639,0:34:36.639
also not a lot, it mostly has a table[br]containing register names, but they're
0:34:36.639,0:34:41.869
really abbreviated and for a really small[br]subset of the devices, there is
0:34:41.869,0:34:48.280
documentation out there in a Pentium N and[br]J series datasheet. It seems like they
0:34:48.280,0:34:52.409
compiled their LaTeX code or whatever[br]with the wrong defines because it doesn't
0:34:52.409,0:35:00.350
actually fit into the manual that well,[br]it's just a section that has like some 20
0:35:00.350,0:35:08.640
tables that shouldn't be in there. So this[br]is from that talk I just referenced and
0:35:08.640,0:35:12.609
it's an overview of the innovation engine[br]and the bus bridges and everything in
0:35:12.609,0:35:20.070
there. This isn't very precise. So based[br]on some of those files from System Studio,
0:35:20.070,0:35:24.500
I tried to get a better understanding of[br]this, which is this. This is the entire
0:35:24.500,0:35:29.760
chipset. The little DMA block in the top[br]left corner is what connects to your CPU.
0:35:29.760,0:35:36.570
And all of the big blocks with a lot of[br]ports are bus bridges or switches for
0:35:36.570,0:35:45.470
PCI Express-like fabric. So there's a lot[br]going on. The highlighted area is the
0:35:45.470,0:35:59.081
management engine memory space and the[br]rest of it is like the global chipset. The
0:35:59.081,0:36:02.840
things I've highlighted in green here are[br]on the primary PCI bus. So there's this
0:36:02.840,0:36:08.210
weird thing going on where there seems to[br]be two PCI hierarchies, at least
0:36:08.210,0:36:13.741
logically. So in reality it's not even[br]PCI, but on Intel systems, there's a lot
0:36:13.741,0:36:19.600
of stuff that behaves as if it is PCI. So[br]it has like bus, device and function
0:36:19.600,0:36:28.650
numbers, PCI configuration space registers[br]and they have two different roots for the
0:36:28.650,0:36:32.310
configuration space. So even though the[br]configuration space address includes a bus
0:36:32.310,0:36:36.480
number, they have two completely different[br]hierarchies, each of which has its
0:36:36.480,0:36:41.290
own bus zero. So that's weird, also[br]because it doesn't make sense when you
0:36:41.290,0:36:45.680
look at how the hardware is laid out. So[br]this is stuff that's on the primary PCI
0:36:45.680,0:36:50.780
configuration space that's directly[br]accessed by the ME, by the Northbridge on
0:36:50.780,0:36:55.260
the ME CPU. So that's the Minute IA[br]system agent. System agent is what Intel
0:36:55.260,0:37:00.619
calls a Northbridge nowadays, now that[br]it's not a separate chip anymore. It's
0:37:00.619,0:37:07.530
basically just a Northbridge and a crypto[br]unit that's on there and the stuff that's
0:37:07.530,0:37:12.530
directly attached to Northbridge being the[br]ROM and the RAM. So the processor itself
0:37:12.530,0:37:16.960
is, as I said, derived from a 486, but it[br]does actually have some more modern
0:37:16.960,0:37:21.830
features: it does CPUID, at least on[br]my systems. Some other researchers said
0:37:21.830,0:37:29.369
theirs didn't. It's basically the core[br]that's in the Quark MCU, which is really
0:37:29.369,0:37:33.260
great because it's one of the only cores[br]made by Intel that has public
0:37:33.260,0:37:39.800
documentation on how to do run control. So[br]breakpoints and accessing registers and
0:37:39.800,0:37:44.420
everything over JTAG. Intel doesn't[br]publish this stuff except for the Quark
0:37:44.420,0:37:50.920
MCU, because it was targeted at makers.[br]But they reused that in here, which is
0:37:50.920,0:37:58.200
really useful. It even has an official[br]port to the OpenOCD debugger, which I have
0:37:58.200,0:38:03.100
not gotten to test because I don't have a[br]JTAG probe, which is compatible with Intel
0:38:03.100,0:38:11.000
voltage levels and supported by OpenOCD.[br]The core also has CPUID and MSRs.
0:38:11.000,0:38:21.170
It has some really fancy features like[br]branch tracing and some more strict paging
0:38:21.170,0:38:30.480
permission enforcement stuff. They don't[br]use the interrupt pins on this. So it's an
0:38:30.480,0:38:34.710
IP block. There are some files out[br]there, which is where this screenshot
0:38:34.710,0:38:40.601
is from, that actually are used by a[br]built in logic analyzer Intel has on the
0:38:40.601,0:38:46.680
chipset and you can select different[br]signals on the chip to watch, which is
0:38:46.680,0:38:50.900
a really great source of information on[br]how the IP blocks are laid out and what
0:38:50.900,0:38:54.200
signals are in there, because you[br]basically get a tree view of the IP blocks
0:38:54.200,0:39:00.800
in the chip and some of their signals. They[br]don't use the legacy interrupt system,
0:39:00.800,0:39:07.920
they only use message-based interrupts,[br]where a device writes a value into a
0:39:07.920,0:39:13.050
register on the interrupt controller[br]instead of asserting a pin. And then there
0:39:13.050,0:39:21.700
is the Northbridge. It's partially[br]documented in that data sheet I mentioned,
0:39:21.700,0:39:29.020
it does support x86 IO address space, but[br]it's never used. Everything in the ME is
0:39:29.020,0:39:36.600
in memory space or exposed as memory space[br]through bridges. The Northbridge
0:39:36.600,0:39:43.070
implements access to the ROM and RAM. It has an[br]IOMMU which is only used for transactions
0:39:43.070,0:39:48.750
coming from the rest of the system and[br]it's always initialized to, at least in
0:39:48.750,0:39:51.660
the firmware I looked at, it's always[br]initialized to the inverse of the page
0:39:51.660,0:40:00.200
table, so linear addresses can be used for[br]memory maps, sorry, for DMA. It also does
0:40:00.200,0:40:06.270
PCI configuration space access to the[br]primary PCI bus. And it has a firewall
0:40:06.270,0:40:15.080
that allows the operating system to deny[br]any IP block in the chipset from sending a
0:40:15.080,0:40:18.890
completion on the bus request. So it can[br]actually say: "Hey, I want to read some
0:40:18.890,0:40:25.040
register and only these devices are[br]allowed to send me a value for it." So
0:40:25.040,0:40:29.570
they've actually thought about security[br]here, which is great. Then there is one of
0:40:29.570,0:40:38.190
the most important blocks in the ME, which[br]is the crypto engine. It does some of the
0:40:38.190,0:40:47.100
more well-known crypto algorithms. AES,[br]SHA hashes, RSA and it has a secure key
0:40:47.100,0:40:56.330
store, which I'm not gonna [audio dropped][br]... all about it in their ME talk at
0:40:56.330,0:41:04.250
Blackhat. And a lot of these things have[br]DMA engines, which all seem to be the
0:41:04.250,0:41:09.500
same. And there are no other DMA[br]engines in the ME, so this is also used for
0:41:09.500,0:41:23.170
memory to memory copy or DMA into other[br]devices. So that's used in a lot of
0:41:23.170,0:41:27.400
things. This is actually a diagram which I[br]don't have the vector for anymore. So
0:41:27.400,0:41:35.260
that's why the LibreOffice background is[br]in there. I'm sorry. So this is basically
0:41:35.260,0:41:39.020
what that crypto engine looks like when[br]you look at that signal tree that I was
0:41:39.020,0:41:44.910
talking about earlier. The DMA engines are[br]both able to do memory to memory copies
0:41:44.910,0:41:52.570
and to directly target the crypto unit[br]they're part of. Basically, I
0:41:52.570,0:41:57.490
don't know about the control bits that go[br]with this, but when you set the target
0:41:57.490,0:42:02.150
address to zero and the right control[br]bits, it will copy into the buffer that's
0:42:02.150,0:42:11.960
used for the encryption. So that is how it[br]accelerates memory access for crypto. And
0:42:11.960,0:42:15.590
these are the actual register offsets.[br]They're the same for all of the DMA
0:42:15.590,0:42:21.580
engines in there relative to the base[br]address of the subunit they're in. And
0:42:21.580,0:42:27.290
then there's the second PCI bus or bus[br]hierarchy, which is like in some places
0:42:27.290,0:42:33.540
called the PCI fixed bus. I'm actually not[br]entirely sure whether this is actually
0:42:33.540,0:42:38.840
implemented as a PCI bus as I've drawn it[br]here, but this is what it behaves like. So
0:42:38.840,0:42:43.920
it has all the ME private stuff, that's[br]not a part of the normal chipset. So it's
0:42:43.920,0:42:51.310
timers for the ME, it has the[br]implementation of the secure enclave
0:42:51.310,0:42:58.010
stuff, the firmware TPM registers.[br]And it has the gen device which I've
0:42:58.010,0:43:01.780
mostly ignored because it's only used at[br]boot time. It's only used by the actual
0:43:01.780,0:43:10.869
boot ROM for the ME mostly. It is what the[br]ME uses to get the fuses Intel burns. So
0:43:10.869,0:43:15.420
that's the Intel public key, whether it's[br]a production or pre-production part, but
0:43:15.420,0:43:20.260
it's pretty much a black box. It's not[br]used that much, fortunately. There is the
0:43:20.260,0:43:24.340
IPC block which allows the ME to talk to[br]the sensor hub, which is a different CPU
0:43:24.340,0:43:28.190
in the chipset. It allows it to talk to[br]power management controller and all kinds
0:43:28.190,0:43:34.180
of other embedded CPUs. So it's inter[br]processor communication not interprocess.
0:43:34.180,0:43:39.090
Confused me for a bit. And here's the host[br]embedded controller interface, which is
0:43:39.090,0:43:44.320
how the ME talks to the rest of the[br]computer when it wants the computer to
0:43:44.320,0:43:47.960
know that it's talking. It can directly[br]access a lot of stuff. But when it wants
0:43:47.960,0:43:54.250
to send a message to the EFI or to Windows[br]or Linux, it'll use this. And it also has
0:43:54.250,0:43:59.080
status registers, which are really simple[br]things where the ME writes in a value. And
0:43:59.080,0:44:05.290
even if the ME crashes, the host can still[br]read the value, which is how you can see
0:44:05.290,0:44:11.160
whether the ME is running, whether it's[br]disabled, whether it fully booted, or
0:44:11.160,0:44:15.400
whether it crashed halfway through. But at[br]a point where it could still get the rest
0:44:15.400,0:44:21.230
of the computer running and there is some[br]coreboot code to read it. I've also
0:44:21.230,0:44:27.080
implemented some decoding for it on the[br]emulator because it's useful to see what
0:44:27.080,0:44:33.210
those values mean. So then there's[br]something really interesting, the primary
0:44:33.210,0:44:37.240
address translation table, which is the[br]bus bridge that allows the ME to actually
0:44:37.240,0:44:44.200
access the PCI Express fabric of the[br]computer. For a lot of what this
0:44:44.200,0:44:50.010
table calls ME peripherals, that are[br]actually outside the ME domain in the
0:44:50.010,0:45:00.320
chipset, it uses this to access it. It[br]also uses it to access the UMA, which is
0:45:00.320,0:45:04.960
an area of host RAM that's used as a swap[br]device for the ME and the Trace Hub, which is
0:45:04.960,0:45:11.190
the debug port, but also has a couple of[br]windows which allow the ME to access any
0:45:11.190,0:45:19.060
random area of host RAM, which is the most[br]scary bit because UMA is specified by
0:45:19.060,0:45:24.650
the host, but the host DRAM area is where you[br]can just point it anywhere. You can read
0:45:24.650,0:45:28.750
or write any value that Windows or[br]Linux or whatever you're running has
0:45:28.750,0:45:37.460
sitting there. So that's scary to me. So[br]and then there's the rest of it, the rest
0:45:37.460,0:45:46.490
of the devices which are behind the[br]primary ATT. And that's a lot of stuff,
0:45:46.490,0:45:53.450
that's debug, that's also the older normal[br]peripherals that your PC has, but it
0:45:53.450,0:45:56.200
also includes things like the power[br]management controller, which actually
0:45:56.200,0:45:59.789
turns on and off all the different parts[br]of your computer. It controls clocks and
0:45:59.789,0:46:07.680
resets. So this is really important. There[br]is a concept that you'll come across when
0:46:07.680,0:46:14.261
you're reading Intel manuals or ME related[br]stuff: root spaces. Besides your
0:46:14.261,0:46:20.320
normal addressing information for a PCI[br]device, it also has a root space number,
0:46:20.320,0:46:24.980
which is basically how you have a single[br]PCI device exposing two completely
0:46:24.980,0:46:31.151
different address spaces. And it's 0 for[br]the host, it's one for the ME. Some
0:46:31.151,0:46:34.940
devices expose the same information on[br]there. Other ones behave completely
0:46:34.940,0:46:43.370
differently. That's something you don't[br]usually see. And then there's the side
0:46:43.370,0:46:48.560
band fabric. So besides all this stuff[br]I just covered, which is PCI-like at
0:46:48.560,0:46:52.880
least. There is also something completely[br]different, side band fabric, which is a
0:46:52.880,0:47:00.990
completely packet switched network, where[br]you don't use any memory mapping by
0:47:00.990,0:47:06.370
default. You just have a one byte address[br]for a device and some other addressing
0:47:06.370,0:47:09.590
fields and you're just sending a message[br]saying: "Hey, I want to read configuration
0:47:09.590,0:47:14.320
or data or memory." And there is actually[br]a lot of information out there on this,
0:47:14.320,0:47:18.480
because Intel, it seems like, just copy[br]pasted their internal specification into a
0:47:18.480,0:47:26.860
patent. This is how you address it. These[br]are all the devices on there, which is quite a
0:47:26.860,0:47:32.590
lot. It's also what you, if any of you are[br]kernel developers, and you've had to deal
0:47:32.590,0:47:40.110
with GPIO on Intel SoCs. There's this P2SB[br]device that you have to use. That's what
0:47:40.110,0:47:48.240
the host uses to access this. Their[br]documentation on it is really, really bad.
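A rough sketch of what addressing on a packet-switched fabric like this could look like, based only on what is said here: a one-byte device address plus some other addressing fields, and a request type of configuration, data or memory. The class names, field names and the toy router are all hypothetical and are not the real sideband wire format, which the talk does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SidebandRead:
    dest: int    # one-byte endpoint address, as described in the talk
    space: str   # "config", "data" or "memory" (labels are illustrative)
    offset: int  # register offset inside that space (hypothetical field)

    def __post_init__(self):
        if not 0 <= self.dest <= 0xFF:
            raise ValueError("destination must fit in one byte")
        if self.space not in ("config", "data", "memory"):
            raise ValueError("unknown space")


class ToyFabric:
    """Routes read requests to endpoints by their one-byte address."""

    def __init__(self):
        self.endpoints = {}

    def attach(self, dest, registers):
        # registers maps (space, offset) -> value for one endpoint
        self.endpoints[dest] = registers

    def read(self, msg):
        regs = self.endpoints.get(msg.dest)
        if regs is None:
            return None  # nobody answers at that address
        return regs.get((msg.space, msg.offset), 0)
```

The point of the sketch is only that there is no memory map involved: the request carries the destination and the kind of access, and the fabric routes it.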
0:47:48.240,0:47:52.420
This was all done using static analysis.[br]But then I wanted to figure out how some
0:47:52.420,0:47:57.410
of the logic actually works and it was[br]really complicated to play around with the
0:47:57.410,0:48:07.310
ME. There was this nice talk by Ermolov[br]and Goryachy, where they said: "You know,
0:48:07.310,0:48:11.790
we found an exploit that gives you code[br]execution and you can get JTAG
0:48:11.790,0:48:18.813
access, too." It sounds really nice. It's[br]actually not that easy. So arbitrary code
0:48:18.813,0:48:23.359
execution in the BUP module, they actually[br]describe their exploit and how you should
0:48:23.359,0:48:30.270
use it. But they didn't describe anything[br]that's needed to actually implement that.
0:48:30.270,0:48:35.690
So if you want to do that, what you need[br]to do is figure out where the stack lives,
0:48:35.690,0:48:40.230
you need to write a[br]payload that will actually get you from a
0:48:40.230,0:48:44.640
buffer overflow on a stack that, by the[br]way, uses stack cookies. So you can't just
0:48:44.640,0:48:51.369
overwrite the return address to turn that[br]into an arbitrary write. And you need to
0:48:51.369,0:48:56.369
find out what the return pointer address[br]is so you can overwrite it and find ROP
0:48:56.369,0:49:03.320
gadgets because the stack is not[br]executable. And then when you've done
0:49:03.320,0:49:09.920
that, you can just turn on debug access or[br]change to custom firmware or whatever. So
0:49:09.920,0:49:13.660
what I did is I had a bit of trouble[br]getting that running and in order to test
0:49:13.660,0:49:17.720
your payload, you have to flash it into[br]the system and it takes a while and then
0:49:17.720,0:49:20.880
the system just doesn't power on if the[br]ME's not working, if you're crashing it
0:49:20.880,0:49:24.580
instead of getting code execution. So it's[br]not really viable to develop it that
0:49:24.580,0:49:32.910
way, I think. Some people did. I respect[br]that because it's really, really hard. And
0:49:32.910,0:49:38.790
then I wrote this ME Loader, it's called[br]Loader because at first I started out like
0:49:38.790,0:49:42.849
writing it as a sort of a Wine thing[br]where you would just mmap the right
0:49:42.849,0:49:47.380
ranges at the right place and jump into[br]it, execute it, patch some system calls.
0:49:47.380,0:49:51.849
But because the ME is a microkernel[br]system and almost every user space program
0:49:51.849,0:49:57.480
accesses hardware directly, I ended up[br]implementing like a good part of the
0:49:57.480,0:50:08.080
chipset, at least as stubs or enough logic[br]to get the code running. And I later on
0:50:08.080,0:50:14.510
added some features that actually allowed it[br]to talk to the hardware. I can use it as a
0:50:14.510,0:50:18.530
debugger, but just because it's actually[br]running the ME firmware or parts of it
0:50:18.530,0:50:26.200
inside a normal Linux process, I can just[br]use gdb to debug it. And back in April
0:50:26.200,0:50:30.320
last year, I got that working to the point[br]where I could run the bootstrap process,
0:50:30.320,0:50:38.580
which is where the vulnerability is. And[br]then you just develop the exploit against
0:50:38.580,0:50:43.960
it, which I did. And then I made a mistake[br]cleaning up some old chroot
0:50:43.960,0:50:52.010
environments for closed source software.[br]And I nuked my home dir. Yeah. I hadn't
0:50:52.010,0:50:56.599
yet pushed everything to GitHub. So I was[br]stuck with an old version and I decided,
0:50:56.599,0:51:00.160
you know, let's refactor this and turn it[br]into something that might actually at some
0:51:00.160,0:51:03.930
point be published, which by the way I [br]did last summer. This is all public code. The
0:51:03.930,0:51:09.790
ME Loader thing. It's on GitHub. And[br]someone else beat me to it and replicated
0:51:09.790,0:51:15.250
that exploit by the Russian guys. Up to[br]then they had produced a proof of concept
0:51:15.250,0:51:22.760
thing for Apollo Lake chipsets, which was[br]completely different from what you had
0:51:22.760,0:51:33.690
to do for normal ME. I was a bit[br]disappointed by that one, not being the
0:51:33.690,0:51:38.580
first one to actually replicate this. But[br]then, about a week later, I
0:51:38.580,0:51:44.270
got my loader back to the point where I[br]could actually get to the vulnerable code
0:51:44.270,0:51:51.120
and develop that exploit and got it[br]working not too long after. And here's the
0:51:51.120,0:51:54.720
great thing. Then I went to the hacker[br]space. I flashed it into my laptop. The
0:51:54.720,0:51:59.040
image that I had just been using only on[br]the emulator. I didn't change it. I flashed it.
0:51:59.040,0:52:05.280
I was like, this is never gonna work on[br]it. It works. some laughter And I've still got an image
0:52:05.280,0:52:08.480
on a flash chip with me because that's[br]what I used to actually turn on the
0:52:08.480,0:52:14.490
debugger. And then you need a debug probe[br]because that USB based debugging stuff
0:52:14.490,0:52:18.810
that's mentioned here only works pretty[br]late in boot. Which is also why I only
0:52:18.810,0:52:21.880
really see Apollo Lake stuff because on[br]those chipsets you can actually use this
0:52:21.880,0:52:33.010
for the ME. And then you need this thing[br]because there's a second channel, that is
0:52:33.010,0:52:36.360
using the USB plug, but it's a completely[br]different physical layer and you need an
0:52:36.360,0:52:40.911
adapter for it, which I don't think was[br]intended to be publicly available. Because
0:52:40.911,0:52:44.859
if you go to Intel's site and say, I want to[br]buy this, they say, here's the CNDA,
0:52:44.859,0:52:54.460
please sign it. But it appeared on Mouser.[br]And luckily I knew some people, who had
0:52:54.460,0:52:59.120
done some other stuff, got a nice bounty[br]for it and bought it and let me use it.
0:52:59.120,0:53:05.430
Thanks to them. It's expensive, but you[br]can buy it if it's still up there. Haven't
0:53:05.430,0:53:11.520
checked. That's the Link. So I'm a bit[br]late, so I'm gonna use the time for
0:53:11.520,0:53:15.760
questions as well. So the main thing the[br]ME does that you cannot replace is the
0:53:15.760,0:53:21.250
boot process. It's not just breaking the[br]system if you don't turn it on, it
0:53:21.250,0:53:25.240
actually does stuff that has to be done.[br]So you're gonna have to use the ME anyway if
0:53:25.240,0:53:30.730
you want to boot a computer. You don't[br]necessarily have to use Intel's firmware.
0:53:30.730,0:53:35.810
The ME itself boots like a microkernel[br]system, so it has a process which
0:53:35.810,0:53:39.859
implements a lot of the servers that will[br]allow it to get to a point where it can
0:53:39.859,0:53:44.710
start those servers. This process has very[br]high privileges in older versions, which
0:53:44.710,0:53:49.160
is what is being used on these chipsets.[br]And if you exploit that, you're still ring
0:53:49.160,0:53:55.680
3, but you can turn on the debugger and you[br]can use the debugger to become ring 0. So
0:53:55.680,0:53:59.171
this is what the normal boot process for a[br]computer looks like. And this is what
0:53:59.171,0:54:02.050
happens when you use Boot Guard. There's a[br]bit of code that runs even before the
0:54:02.050,0:54:07.170
reset vector, and that's started by microcode[br]initialization, of course. And this
0:54:07.170,0:54:12.120
is what actually happens. The ME loads a[br]new firmware into a power management
0:54:12.120,0:54:16.390
controller, it then readies some stuff in the[br]chipset and it tells the power management
0:54:16.390,0:54:23.660
controller like please stop pulling that[br]CPU reset pin low and the CPU will start.
0:54:23.660,0:54:28.160
The power management controller is a completely[br]independent thing, an 8051-derived
0:54:28.160,0:54:32.690
microcontroller that runs a real time[br]operating system from the 90s. This is the
0:54:32.690,0:54:38.690
only string in the firmware by the way,[br]that's quoted there. And depending on the
0:54:38.690,0:54:42.410
chipset that you have, it's either loaded[br]with a patch or with a complete binary
0:54:42.410,0:54:46.690
from the ME, and it does a lot of[br]important stuff. No documentation on it
0:54:46.690,0:54:52.120
besides the ACPI interface, which is not[br]really useful. The ME has to do these
0:54:52.120,0:54:58.710
things. It needs to load the keys for the[br]Boot Guard process, it needs to set up clock
0:54:58.710,0:55:06.550
controllers and then tell the PMC to turn[br]on the power to the CPU. It needs to
0:55:06.550,0:55:15.240
configure PCI express fabric and reset -[br]like get the CPU to come out of reset.
0:55:15.240,0:55:18.290
There's a lot of code involved in this, so[br]I really didn't want to do this all
0:55:18.290,0:55:22.150
statically. What I did is I added hardware[br]support, hardware passthrough support to
0:55:22.150,0:55:28.500
the emulator and booted my laptop that[br]way. Actually had a video of this, but I
0:55:28.500,0:55:33.970
don't have the time to show it, which is a[br]pity. But this is what I - the bring up
0:55:33.970,0:55:38.030
process from the ME running in a Linux[br]process, sending whatever hardware accesses
0:55:38.030,0:55:43.340
it was trying to do that are important[br]for boot to the debugger. And then that
0:55:43.340,0:55:49.880
was using an ME in real hardware that was[br]halted to actually do the register accesses
0:55:49.880,0:55:56.520
and it works. I'm not going to show this.[br]It actually booted the computer reliably.
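A minimal sketch of the passthrough idea described here: the emulated firmware's memory accesses are dispatched by address, most ranges hit simple stubs, and selected ranges are forwarded to a backend while every access is logged. All class names are hypothetical and this is not the actual ME Loader code; in particular, the "backend" below is just another in-memory object standing in for the halted ME on real hardware behind a debugger.

```python
class StubDevice:
    """Register file that just remembers the last value written."""

    def __init__(self):
        self.regs = {}

    def read(self, offset):
        return self.regs.get(offset, 0)

    def write(self, offset, value):
        self.regs[offset] = value


class Passthrough:
    """Forwards every access to a backend (stand-in for real hardware)."""

    def __init__(self, backend):
        self.backend = backend

    def read(self, offset):
        return self.backend.read(offset)

    def write(self, offset, value):
        self.backend.write(offset, value)


class Bus:
    """Dispatches physical addresses to mapped devices and keeps a trace."""

    def __init__(self):
        self.ranges = []  # list of (base, size, device)
        self.trace = []   # list of ("R" or "W", address, value)

    def map(self, base, size, device):
        self.ranges.append((base, size, device))

    def _lookup(self, addr):
        for base, size, device in self.ranges:
            if base <= addr < base + size:
                return device, addr - base
        raise LookupError(f"unmapped address {addr:#x}")

    def read(self, addr):
        device, offset = self._lookup(addr)
        value = device.read(offset)
        self.trace.append(("R", addr, value))
        return value

    def write(self, addr, value):
        device, offset = self._lookup(addr)
        device.write(offset, value)
        self.trace.append(("W", addr, value))
```

Keeping the trace in one place is what makes it possible to look afterwards at exactly which register accesses the boot process needed.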
0:55:56.520,0:56:02.410
Then Boot Guard configuration is fun[br]because you know where they say they fuse
0:56:02.410,0:56:10.990
in the keys. Well yeah. But the ME loads[br]them from fuses and then manually loads
0:56:10.990,0:56:14.530
them into registers. So if you have code[br]execution on the ME before it does this,
0:56:14.530,0:56:18.000
you can just load your own values and you[br]can run coreboot even on a machine that
0:56:18.000,0:56:24.190
has Boot Guard. Yeah. So I'm gonna go[br]through this really quickly. This is, by
0:56:24.190,0:56:29.570
the way, these are the registers that[br]configure what security model the CPU is
0:56:29.570,0:56:34.579
gonna enforce for the firmware. I'm going[br]to release this code after my talk. It's
0:56:34.579,0:56:39.810
part of a Python script that I wrote that[br]uses the debugger to start the CPU without
0:56:39.810,0:56:45.670
ME firmware. I traced everything the ME[br]firmware did. And I now have a Python
0:56:45.670,0:56:51.470
script that can just start a computer[br]without Intel's code. If you translate
0:56:51.470,0:56:55.920
this into a rough sequence or even into[br]binary for the ME, you can start a
0:56:55.920,0:57:02.850
computer without the ME itself or at least[br]without it running the operating system.
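A sketch of the "translate this into a rough sequence" step, assuming the traced firmware behaviour has been reduced to an ordered list of register accesses. The trace format, the function name and the `write` callback are assumptions for illustration; the real script drives a hardware debugger and its interface is not shown in the talk.

```python
def replay(trace, write):
    """Apply the recorded register writes in their original order.

    trace: iterable of (op, address, value) tuples, op being "R" or "W"
    write: callable taking (address, value), e.g. a debugger memory write
    Returns the number of writes that were replayed.
    """
    count = 0
    for op, addr, value in trace:
        if op != "W":
            continue  # reads are kept in the trace for reference only
        write(addr, value)
        count += 1
    return count
```

Passing the write function in as a parameter means the same recorded sequence can be aimed at an emulator, a debugger connection or a dry-run logger without changing the replay logic.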
0:57:02.850,0:57:12.710
applause[br]So, yeah, future goals. I really do want
0:57:12.710,0:57:20.420
to share this because if there is a way to[br]escalate to ring 0 through a ROP chain,
0:57:20.420,0:57:24.359
then you could just start your own kernel[br]in the ME and have custom firmware, at
0:57:24.359,0:57:29.600
least from the vulnerability on. But you[br]could also build a mod chip that uses the
0:57:29.600,0:57:34.829
debugger interface to load a new firmware.[br]There's lots of stuff that still needs to be
0:57:34.829,0:57:41.210
discovered, but I'm gonna hang out at the[br]open source firmware village later, at
0:57:41.210,0:57:46.690
least part of the week here. So because I[br]really want to get started on open source
0:57:46.690,0:57:55.250
ME firmware using this. Right. And there's[br]a lot of people that have played a role in
0:57:55.250,0:58:00.700
getting me to this point. I'd also like[br]to thank the guy from the Hague hackerspace,
0:58:00.700,0:58:07.680
BinoAlpha, who basically allowed me to use[br]his laptop to prepare the demo, which I
0:58:07.680,0:58:14.660
ended up not being able to show, but.[br]Right. I was gonna ask: are there
0:58:14.660,0:58:17.380
any questions? But I don't think[br]there's really any time for any more.
0:58:17.380,0:58:22.570
Herald: Peter, thank you so much. Applause[br]Unfortunately, we don't have any more time
0:58:22.570,0:58:30.720
left.[br]Peter: I'll be around. I'll be around.
0:58:30.720,0:58:35.660
Herald: I think it's very, very[br]interesting because I hope that your talk
0:58:35.660,0:58:41.119
will inspire many people to keep looking[br]into how the management engine works and
0:58:41.119,0:58:46.930
hopefully uncover even more stuff. I think[br]we have time for just one single question.
0:58:46.930,0:58:51.040
I don't know, do we? How one from the[br]Internet. Thank you so much.
0:58:51.040,0:58:56.790
Signal Angel: OK. First off, I have to[br]tell you. Your shirt is nice. Chat wanted
0:58:56.790,0:59:05.000
me to say this. And they asked how[br]reliable this exploit is and does it work
0:59:05.000,0:59:09.160
on every boot?[br]Peter: Right, Yeah. That's actually
0:59:09.160,0:59:14.960
something really important that I forgot[br]to mention. So they patched the vulnerability,
0:59:14.960,0:59:17.339
but they didn't provide downgrade[br]protection. If you could flash a
0:59:17.339,0:59:24.170
vulnerable image with an exploit in it,[br]it'll just boot every time on these chips
0:59:24.170,0:59:27.850
that's sixth or seventh generation chips.[br]You put in that image and it will
0:59:27.850,0:59:31.230
reliably turn on the debugger every time[br]you turn on the computer. applause
0:59:31.230,0:59:36.650
Herald: Thank you so much for the[br]question. And Peter Bosch thank you so
0:59:36.650,0:59:39.160
much. Please give him a great round of[br]applause.
0:59:39.160,0:59:43.625
applause
0:59:43.625,1:00:08.000
subtitles created by c3subtitles.de[br]in the year 20??. Join, and help us!