0:00:00.000,0:00:19.152 36C3 preroll music 0:00:19.152,0:00:22.520 Herald: The next talk is an Intel[br]Management Engine deep dive: 0:00:22.520,0:00:27.230 Understanding the ME at the OS and[br]hardware level, and it is by Peter Bosch. 0:00:27.230,0:00:31.089 Please welcome him with a great round of[br]applause! 0:00:31.089,0:00:38.780 Applause 0:00:38.780,0:00:49.409 Peter Bosch: Right. So everybody. Harry.[br]Nice. OK. So welcome. Well, this is me. 0:00:49.409,0:00:59.510 I'm a student at Leiden University. Yeah,[br]I've always been really interested in how 0:00:59.510,0:01:04.610 stuff works. And when I got a new laptop,[br]I was like, you know, how does this thing 0:01:04.610,0:01:08.410 really boot? I knew everything from reset[br]vector onwards. I wanted to know what 0:01:08.410,0:01:15.221 happened before it. So first I started[br]looking at the Boot Guard ACM. While 0:01:15.221,0:01:21.420 looking through it, I realized that not[br]everything was as it was supposed to be. 0:01:21.420,0:01:26.280 That led to a later part in the boot[br]process being vulnerable, which ended up 0:01:26.280,0:01:34.249 being discovered by me. And I found out[br]here last year that I wasn't the only one 0:01:34.249,0:01:38.310 to find it. Trammell Hudson also found it,[br]and we reported it together, presented it 0:01:38.310,0:01:43.399 at Hack in the Box. And then at the same[br]time, I was already also looking at the 0:01:43.399,0:01:49.350 Management Engine. Well, there had been a[br]lot of research done on that before. The 0:01:49.350,0:01:58.140 public info was mostly on the file system[br]and on specific vulnerabilities, which 0:01:58.140,0:02:04.400 still made it pretty hard to get started[br]on reverse-engineering it. So that's why I 0:02:04.400,0:02:10.340 thought it might be useful for me to[br]present this work here. It's basically 0:02:10.340,0:02:16.910 broken up into three parts. 
The first bit[br]is just a quick introduction into the 0:02:16.910,0:02:22.250 operating system it runs. So if you want[br]to work on this yourself, you're more 0:02:22.250,0:02:28.690 easily able to understand what you're[br]facing in your disassembler. So and then 0:02:28.690,0:02:37.950 after that, I'll go over its role in the[br]boot process and then also how this 0:02:37.950,0:02:45.780 information can be used to start[br]developing a new firmware for it or do 0:02:45.780,0:02:49.730 more security research on it. So first of[br]all, what exactly is the Management 0:02:49.730,0:02:57.280 Engine? There's been a lot of fuss about[br]it being a backdoor and everything. In 0:02:57.280,0:03:05.000 reality, whether it is or not depends on the[br]software that it runs. It's basically a 0:03:05.000,0:03:09.110 processor with its own RAM and its own IO[br]and MMUs and everything, sitting inside 0:03:09.110,0:03:16.049 your southbridge. It's not in the CPU,[br]it's in the southbridge. So when I say this 0:03:16.049,0:03:24.010 is gonna be about the sixth and seventh[br]generation of Intel chips, I mean, mostly 0:03:24.010,0:03:28.489 motherboards from those generations. If[br]you run a newer CPU on it, it will also 0:03:28.489,0:03:39.584 work for that. So yeah. Bit more detail.[br]The CPU it runs on is based on the 80486, which, 0:03:39.584,0:03:43.510 you know, is funny. It's quite an old CPU[br]and it's still being used in almost 0:03:43.510,0:03:51.079 every computer nowadays. So it has a[br]little bit of its own RAM. It has quite a 0:03:51.079,0:03:58.150 bit of built-in ROM, has a hardware-[br]accelerated cryptographic unit and it has 0:03:58.150,0:04:05.450 fuses, which are write-once memory used[br]to store security settings and keys and 0:04:05.450,0:04:11.079 everything. 
Some of the more scary[br]features it has: Bus bridges to all of the 0:04:11.079,0:04:16.419 buses inside the southbridge. It can[br]access the RAM on the CPU and it can 0:04:16.419,0:04:21.359 access the network, which makes it really[br]quite dangerous if there is a 0:04:21.359,0:04:28.409 vulnerability or if it runs anything[br]nefarious. And its tasks nowadays include 0:04:28.409,0:04:35.860 starting the computer as well as adding[br]management features. This is mostly used 0:04:35.860,0:04:41.190 in servers where it can serve as a baseboard[br]management controller, do like a remote 0:04:41.190,0:04:49.001 keyboard and video, and it does security:[br]Boot Guard, which is the signing of 0:04:49.001,0:04:54.830 firmware and verification of signatures.[br]It implements a firmware TPM and there is 0:04:54.830,0:05:02.590 also an SDK to use it as a general purpose[br]secure enclave. So on the software side of 0:05:02.630,0:05:12.650 it, it runs a custom operating system,[br]parts of which are taken from MINIX, the 0:05:12.650,0:05:17.250 teaching operating system by Andrew[br]Tanenbaum. It's a microkernel operating 0:05:17.250,0:05:32.930 system. It runs binaries that are in a[br]completely custom format. It's really 0:05:32.930,0:05:36.030 quite a high-level system actually. If you[br]look at it in terms of the operating 0:05:36.030,0:05:40.681 system it runs, it's mostly like Unix,[br]which makes it kind of familiar, but it 0:05:40.681,0:05:46.819 also has large custom parts. Like I said[br]before, in this talk I'm going to be 0:05:46.819,0:05:52.740 speaking about sixth and seventh[br]generation Intel Core chipsets, so that's 0:05:52.740,0:05:58.949 Sunrise Point, Lewisburg, which is the[br]server version of this, and also the laptop 0:05:58.949,0:06:04.410 systems-on-a-chip; they're just called Intel[br]Core low power. They also include the 0:06:04.410,0:06:08.360 chipset as a separate die. So it also[br]applies to them. 
In fact, I've been 0:06:08.360,0:06:11.979 testing most of the stuff I'm going to[br]tell you about on the laptop that's 0:06:11.979,0:06:19.430 sitting right here, which is a Lenovo[br]T460. The version of the firmware I've been 0:06:19.430,0:06:30.820 looking at is 11.0.0.1205. Right. So I do[br]need to put this up there. I'm not a part 0:06:30.820,0:06:38.520 of Intel, nor have I signed any contracts[br]with them. I've found everything in ways 0:06:38.520,0:06:43.500 that you could also do. I didn't have any[br]leaked NDA stuff or anything that you 0:06:43.500,0:06:53.099 couldn't get your hands on. It's also a[br]very wide subject area, so there might be 0:06:53.099,0:07:00.580 some mistakes here or there, but generally[br]it should be right. Well, if you want to 0:07:00.580,0:07:04.220 get started working on an ME firmware,[br]want to reverse-engineer it or modify it 0:07:04.220,0:07:08.580 in some way, first you've got to deal with[br]the image file. You've got your SPI flash. 0:07:08.580,0:07:12.009 It's where most of its firmware lives, in[br]the same flash chip as your BIOS. So 0:07:12.009,0:07:17.410 you've got that image. And then how do you[br]get the code out? Well, there's tools for 0:07:17.410,0:07:22.949 that. It's already been extensively[br]documented by other people. 0:07:22.949,0:07:28.681 And you can basically just download a tool[br]and run it against it. Which makes this 0:07:28.681,0:07:31.690 really easy. This is also the reason why[br]there hasn't been a lot of research done 0:07:31.690,0:07:35.940 yet: before these tools were around, you[br]couldn't get to all of the code. The 0:07:35.940,0:07:41.349 kernel was compressed using Huffman[br]tables, which were stored in ROM. You 0:07:41.349,0:07:45.360 couldn't get to the ROM without getting[br]code execution on the thing. So there was 0:07:45.360,0:07:52.639 basically no way of getting access to the[br]kernel code. And I think also the syslib 0:07:52.639,0:07:55.800 library. 
But that's not a problem anymore.[br]You can just download a tool and unpack 0:07:55.800,0:08:02.520 it. Also, the Intel tool to generate[br]firmware images, which you can find in 0:08:02.520,0:08:11.979 some open directories on the internet, has[br]Qt resources, XML files which basically have the 0:08:11.979,0:08:18.330 description for all of the file formats[br]used by these ME versions, including names 0:08:18.330,0:08:26.050 and comments to go with those structure[br]definitions. So that's really useful. So 0:08:26.050,0:08:30.430 we look at one of these images. It has a[br]couple of partitions, some of them overlap 0:08:30.430,0:08:38.150 and some of them are storage, some are[br]code. So there are the main partitions, 0:08:38.150,0:08:45.709 FTPR and NFTP, which contain the programs[br]it runs. There's MFS, which is the read-write 0:08:45.709,0:08:51.980 file system it uses for persistent[br]storage. And then there is a log-to-flash 0:08:51.980,0:08:57.320 option, the possibility to embed a token[br]that will tell the system to unlock all 0:08:57.320,0:09:02.850 debug access, which has to be signed by[br]Intel so it's not really of any use to us. 0:09:02.850,0:09:07.439 And then there is something interesting,[br]ROM bypass. Like I said, you can't get 0:09:07.439,0:09:13.160 access to the ROM without running code on[br]it. And ROM is mask ROM. So it's internal 0:09:13.160,0:09:17.540 to the chip, but Intel has to develop new[br]ROM code and has to test it without 0:09:17.540,0:09:23.270 respinning the die every time. So they[br]have a possibility on an unlocked 0:09:23.270,0:09:28.170 preproduction chipset to completely bypass[br]the internal ROM and load even the early 0:09:28.170,0:09:33.670 boot code from the flash chip. Some of[br]these images have leaked and you can use 0:09:33.670,0:09:39.250 them to get a look at the ROM code, even[br]without being able to dump it. That's 0:09:39.250,0:09:45.610 going to be really useful later on. 
So[br]then you've got these code partitions and 0:09:45.610,0:09:51.230 they contain a whole lot of files. So[br]there are the binaries themselves, which 0:09:51.230,0:09:57.569 don't have any extension. There are the[br]metadata files. So the binary format they 0:09:57.569,0:10:05.350 use has no headers, nothing included. And[br]all of that data is in the metadata file. 0:10:05.350,0:10:12.000 And when you use the unME11 tool, it'll[br]actually convert those to text 0:10:12.000,0:10:16.069 files for you so you can just get started[br]without really understanding how they 0:10:16.069,0:10:26.640 work. Yes. So the metadata. It's a type-[br]length-value structure, which contains a 0:10:26.640,0:10:31.180 whole lot of information the operating[br]system needs. It has the info on the 0:10:31.180,0:10:35.820 module, whether it's data or code, where[br]it should be loaded, what the privileges 0:10:35.820,0:10:43.390 of the process should be, a SHA[br]checksum for validating it and also some 0:10:43.390,0:10:49.000 higher level stuff such as device file[br]definitions if it's a device driver or any 0:10:49.000,0:10:55.430 other kind of server. I've actually[br]written some code that uses this, that's 0:10:55.430,0:11:01.460 on GitHub, so if you want a closer look at[br]it, some of the slides have a link to a 0:11:01.460,0:11:09.780 GitHub file in there which contains the[br]full definitions. Right. So all the code 0:11:09.780,0:11:16.801 on the ME is signed and verified by Intel.[br]So you can't just go and put in a new 0:11:16.801,0:11:24.689 binary and say, hey, let's run this. The[br]way they do this is in Intel's 0:11:24.689,0:11:30.300 manufacture-time fuses, they have a hash[br]of the public key that they use to sign 0:11:30.300,0:11:36.070 it. 
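As an illustration of the type-length-value layout of the metadata files mentioned above, here is a minimal Python sketch of a TLV walker. The field widths (32-bit little-endian tag and length) and the assumption that the length counts the 8-byte header are guesses made for this example, not details stated in the talk; `parse_tlv` is a hypothetical helper name.

```python
import struct

HEADER_SIZE = 8  # assumed: 32-bit tag followed by 32-bit length


def parse_tlv(blob):
    """Split a metadata blob into a list of (tag, payload) entries.

    Assumption for this sketch: each entry starts with a little-endian
    u32 tag and a u32 length, and the length includes the header itself.
    """
    entries = []
    offset = 0
    while offset < len(blob):
        if offset + HEADER_SIZE > len(blob):
            raise ValueError("truncated entry header at offset %d" % offset)
        tag, length = struct.unpack_from("<II", blob, offset)
        if length < HEADER_SIZE or offset + length > len(blob):
            raise ValueError("bad entry length %d at offset %d" % (length, offset))
        payload = blob[offset + HEADER_SIZE:offset + length]
        entries.append((tag, payload))
        offset += length
    return entries
```

A tool would then dispatch on the tag to decode each payload (module info, privileges, device file definitions and so on); the tag numbers themselves are not given in the talk.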
And then on each flash partition,[br]there is a manifest which is signed by the 0:11:36.070,0:11:40.820 key and it contains the SHA hashes for all[br]the metadata files, which then contain a 0:11:40.820,0:11:47.150 SHA hash for the code files. There don't[br]seem to be any major problems in verifying 0:11:47.150,0:11:52.530 this, so it's useful to know, but[br]you're not really gonna use this. And then 0:11:52.530,0:12:00.300 the modules themselves, as I've said,[br]they're flat binaries. Mostly. The 0:12:00.300,0:12:05.560 metadata contains all the info the kernel[br]uses to reconstruct the actual program 0:12:05.560,0:12:13.530 image in memory. And a curious thing here[br]is that the actual base address for all 0:12:13.530,0:12:17.459 the modules, for all programs, is the same[br]across an image. So if you have a 0:12:17.459,0:12:19.930 different version, it's going to be[br]different. But if you have two programs 0:12:19.930,0:12:25.949 from the same firmware it's gonna be[br]loaded at the same virtual address. Right. 0:12:25.949,0:12:32.820 So when you want to look at it, you're[br]gonna load it in some disassembler, like 0:12:32.820,0:12:39.540 for example IDA, and you'll see this, it[br]disassembles fine, but it's gonna 0:12:39.540,0:12:44.270 reference all kinds of memory that you[br]don't have access to. So usually you'd 0:12:44.270,0:12:49.459 think maybe I've loaded it at a wrong address,[br]or am I missing some library? Well, 0:12:49.459,0:12:55.150 here you've loaded it correctly if you use[br]the address from the metadata file. 0:12:55.150,0:13:02.310 But you are in fact missing a lot of[br]memory segments. And let's just take a 0:13:02.310,0:13:09.829 look at each of these. It's calling and[br]switching code. It's pushing a pointer 0:13:09.829,0:13:15.890 there, which is data. And what's that? So[br]it has shared libraries, even though it's 0:13:15.890,0:13:19.920 flat binaries. 
It actually does use shared[br]libraries because you only have 1.5 0:13:19.920,0:13:24.319 megabyte of RAM. You don't want to[br]link your C library into everything and 0:13:24.319,0:13:32.800 waste what little memory you have. So[br]there is the main system library which is 0:13:32.800,0:13:39.270 like libc on a Linux system. It's in a[br]flash partition, so you can actually just 0:13:39.270,0:13:45.689 load it and take a look at it easily and[br]it starts out with a jump table. So 0:13:45.689,0:13:48.770 there's no symbols in the metadata file or[br]anything. It doesn't do dynamic linking. 0:13:48.770,0:13:56.549 It loads the pages for the shared library[br]at a fixed address, which is also in the 0:13:56.549,0:14:01.620 shared library's metadata. And then it's[br]just there in the processor's memory and 0:14:01.620,0:14:06.130 it's gonna jump there if it needs a[br]function. And the functions themself are 0:14:06.130,0:14:12.890 just using the normal System V, x86[br]calling conventions. So it's pretty easy 0:14:12.890,0:14:17.980 to look at that using your normal tools.[br]There's no weird register argument passing 0:14:17.980,0:14:24.559 going on here. So, right. Now, shared[br]libraries. There's two of them. And this 0:14:24.559,0:14:28.160 is where it gets annoying. The system[br]library, you've got access to that so you 0:14:28.160,0:14:32.850 can just take your time and go through it[br]and try to figure out, you know, oh, hey, 0:14:32.850,0:14:39.880 is this open or is this read or what's[br]this function doing? But then there's also 0:14:39.880,0:14:49.150 another second really large library, which[br]is in ROM. They have all the C library 0:14:49.150,0:14:54.300 functions and some of their custom helper[br]routines that don't interact with the 0:14:54.300,0:15:00.920 kernel directly, such as strings[br]functions. They live in ROM. 
So when 0:15:00.920,0:15:04.700 you've got your code and this is basically[br]where I was when I was here last year, 0:15:04.700,0:15:07.040 you're looking through it and you're[br]seeing calls to a function you don't have 0:15:07.040,0:15:11.010 the code for all over the place. And you[br]have to figure out by its signature what 0:15:11.010,0:15:14.870 it is doing. And that works for some of[br]the functions and it's really difficult 0:15:14.870,0:15:20.610 for other ones. That really had me stopped[br]for a while. Then I managed to find one of 0:15:20.610,0:15:25.070 these ROM bypass images and I had the code[br]for a very early development build of the 0:15:25.070,0:15:29.370 ROM. This is where I got lucky. So the[br]actual entry point addresses are fixed 0:15:29.370,0:15:33.939 across an entire chipset family. So if you[br]have an image for the server version of 0:15:33.939,0:15:39.310 like the 100 series chipset or for the client[br]version or for a desktop or laptop 0:15:39.310,0:15:47.540 version, it's all gonna be the same ROM[br]addresses. So even though the code might 0:15:47.540,0:15:51.930 be different, you'll have the jump table,[br]which means the addresses can stay fixed. 0:15:51.930,0:15:56.760 So this only needs to be done once. And in[br]fact when I upload my slides later, there 0:15:56.760,0:16:02.919 is a slide in there at the end that has[br]the addresses for the most used functions. 0:16:02.919,0:16:07.350 So you're not going to have to repeat that[br]work, at least not for this chipset. So if 0:16:07.350,0:16:15.160 you want to look at a simple module,[br]you've loaded it, now you've applied the 0:16:15.160,0:16:21.860 things I just said, and you still don't[br]have the data sections. I don't know 0:16:21.860,0:16:26.669 what that function there is doing, but[br]it's not very important. It actually 0:16:26.669,0:16:33.230 returns a value, I think, that's not used[br]anywhere, but it must have a purpose 0:16:33.230,0:16:40.220 because it's there. 
Right. So then you[br]look at the entry point and this is a lot 0:16:40.220,0:16:44.660 of stuff. And the main thing that matters[br]here is on the right half of the screen, 0:16:44.660,0:16:50.189 there is a listing from a MINIX repository[br]and on the left half there is a 0:16:50.189,0:16:54.809 disassembly from an ME module. So it's[br]mostly the same. There is one key 0:16:54.809,0:16:58.419 difference, though. The ME module actually[br]has a little bit of code that runs before 0:16:58.419,0:17:06.230 its C library startup function. And that[br]function actually does all the ME specific 0:17:06.230,0:17:13.980 initialization, does a lot of stuff[br]related to how C library data is kept 0:17:13.980,0:17:21.520 because there is also no data segments for[br]the C library being allocated by the 0:17:21.520,0:17:25.820 kernel. So each process actually reserves[br]a part of its own memory and tells the C 0:17:25.820,0:17:31.290 library, like, any global variables you[br]can store in there. But when you look at 0:17:31.290,0:17:37.610 that function, one of the most important[br]things that it calls is this function. 0:17:37.610,0:17:41.510 It's very simple, it just copies a bunch[br]of RAM. So they don't have support for 0:17:41.510,0:17:46.650 initialized data sections. It's a flat[br]binary. What they do is they they actually 0:17:46.650,0:17:51.520 use the .bss segment, the zeroed segment[br]at the end of the address space, and copy 0:17:51.520,0:17:57.070 over a bunch of data in the program. The[br]program itself is not aware of this. It's 0:17:57.070,0:18:04.180 really in the initialization code and in[br]linker script. So this is also something 0:18:04.180,0:18:09.170 that's very important because you're going[br]to need to also at that address in the 0:18:09.170,0:18:13.310 data section, you're going to need to load[br]the last bit of the of the binary. 0:18:13.310,0:18:20.520 Otherwise you're missing constants or at[br]least initialization values. Right. 
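The data copy described above matters when loading a module into a disassembler by hand: the tail of the flat binary also has to appear at the data address. Below is a minimal Python sketch of that step. The function name and parameters (`code_base`, `data_base`, `init_size`) are hypothetical; in practice the addresses would come from the module's metadata, and the size of the initialized-data tail would have to be read out of the startup code.

```python
def build_memory_image(binary, code_base, data_base, init_size):
    """Return {address: bytes} segments for loading a flat ME module.

    Sketch under assumptions: the initialized data is exactly the last
    `init_size` bytes of the flat binary, and the startup code copies it
    to `data_base` (the start of the zero-initialized area).
    """
    if init_size < 0 or init_size > len(binary):
        raise ValueError("init_size does not fit inside the binary")
    segments = {code_base: bytes(binary)}
    if init_size:
        # Mirror the copy loop run by the module's startup code.
        tail_start = len(binary) - init_size
        segments[data_base] = bytes(binary[tail_start:])
    return segments
```

Each returned segment can then be created as a separate memory block in the disassembler, so references to constants and initial values resolve.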
Then 0:18:20.520,0:18:26.150 there is the full memory map to the[br]processes themselves. It's a flat 32 bit 0:18:26.150,0:18:31.970 address space. It's got everything you[br]expect in there. It's got a stack and a 0:18:31.970,0:18:39.500 heap and everything. There's a little bit[br]of heap allocated right on initialization. 0:18:39.500,0:18:44.690 This is this is basically how you derive[br]the address space layout from the 0:18:44.690,0:18:51.100 metadata, especially like the data[br]segment, then, and the stack itself is 0:18:51.100,0:18:56.180 like the address location varies a lot[br]because of the number of threads that are 0:18:56.180,0:19:03.380 in use or the size of data sections. And[br]also those stack guards, they're not 0:19:03.380,0:19:07.960 really stack guards. There is also[br]metadata for each thread in there. But 0:19:07.960,0:19:13.640 that's nothing that's relevant to the[br]process itself, only to the kernel. And 0:19:13.640,0:19:21.890 well, if you then skip forward a bit and[br]you've done all these - you look at your 0:19:21.890,0:19:28.790 simple driver like this. This is taken[br]from a driver used to talk to the CPU, 0:19:28.790,0:19:34.630 like, OK. So when I say CPU or host, by[br]the way, I mean the CPU, like your big 0:19:34.630,0:19:39.370 SkyLake, or KabyLake, or CoffeeLake,[br]whatever your big CPU that runs your own 0:19:39.370,0:19:46.070 operating system. Right. So this is used[br]to to send messages there. But if you look 0:19:46.070,0:19:51.680 at what's going on here, OK - think I had[br]a problem with the animation here - it 0:19:51.680,0:19:57.000 sets up some stuff and then it calls a[br]library function that's in the main syslib 0:19:57.000,0:20:01.270 library, which actually has a main loop[br]for the program. 
That's because Intel was 0:20:01.270,0:20:06.440 smart and they added a nice framework for[br]device driver implementing programs, 0:20:06.440,0:20:10.130 because it's a microkernel, so device[br]drivers are just usual programs, calling 0:20:10.130,0:20:20.060 specific APIs. Then there's normal POSIX[br]file I/O. No standard I/O, but it has all 0:20:20.060,0:20:26.530 the normal open, and read, and ioctl and[br]everything functions. And then there's 0:20:26.530,0:20:30.170 more initialization for the srv library.[br]And this is basically what all the simple 0:20:30.170,0:20:38.890 drivers look like in it. And then there's[br]this. Because they're so low on memory, 0:20:38.890,0:20:50.040 they don't actually use standard I/O, or[br]even printf itself to do most of the 0:20:50.040,0:20:54.820 debugging. It uses a thing that's called[br]"sven", I'll touch on that later. So there 0:20:54.820,0:20:59.150 are the familiar APIs that I talked about.[br]It even has POSIX threads, or at least a 0:20:59.150,0:21:04.510 subset of it, and there are all the[br]functions that you'd expect to find on 0:21:04.510,0:21:08.700 some generic Unix machine. So that[br]shouldn't be too much of a problem to deal 0:21:08.700,0:21:14.570 with, but then there's also their own[br]tracing solution, sven. That's what Intel 0:21:14.570,0:21:17.350 calls it. The name is in all the development[br]tools that you can download 0:21:17.350,0:21:23.370 from their site, and basically, they don't[br]include format strings for a lot of the 0:21:23.370,0:21:28.390 stuff. They just have a 32-bit identifier[br]that is sent over the debug port, and it 0:21:28.390,0:21:34.270 refers to a format string in a dictionary[br]that you don't have. There is one of the 0:21:34.270,0:21:38.820 dictionaries for a server chip that's[br]floating around the internet, but even 0:21:38.820,0:21:45.940 that is incomplete. 
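To make the sven idea concrete, here is a small Python sketch of how a trace record that carries only a 32-bit identifier could be turned back into text once a (partial) dictionary is available. The record shape, the catalog contents and the function name are invented for this illustration; the real wire format is not described in the talk.

```python
def render_trace(msg_id, args, catalog):
    """Render one trace record using a catalog of format strings.

    `catalog` maps 32-bit message IDs to printf-style format strings.
    IDs missing from the catalog are rendered as a hex placeholder,
    which is the common case with an incomplete dictionary.
    """
    fmt = catalog.get(msg_id)
    if fmt is None:
        shown = ", ".join("0x%x" % a for a in args)
        return "<unknown sven message 0x%08x: %s>" % (msg_id, shown)
    return fmt % tuple(args)


# Hypothetical catalog entry and records, for illustration only.
example_catalog = {0x00010203: "power state changed to %d"}
print(render_trace(0x00010203, [3], example_catalog))
print(render_trace(0xDEADBEEF, [1, 2], example_catalog))
```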
And the normal non-NDA[br]version of the Intel developer tools has 0:21:45.940,0:21:53.810 some 50 format strings for really common[br]status messages it might output, but yeah, 0:21:53.810,0:21:57.391 like, if you see these functions, just[br]realize it's doing some debug print. It 0:21:57.391,0:22:00.550 might be dumping some state or just[br]telling it it's gonna do something else. 0:22:00.550,0:22:12.020 No important logic actually happens[br]in here. Right. So then for device files. 0:22:12.020,0:22:16.190 They're actually defined in a manifest.[br]When the kernel loads a program, and that 0:22:16.190,0:22:20.830 program wants to expose some kind of[br]interface to other programs, its manifest 0:22:20.830,0:22:27.780 will contain, or its metadata file will[br]contain a special file producer entry, and 0:22:27.780,0:22:33.120 that says, you know, you have these device[br]files, with a name, and an access mode and 0:22:33.120,0:22:39.210 the user, and group ID, and everything,[br]and the minor numbers, and the kernel 0:22:39.210,0:22:42.830 sends this to the- or not kernel- the[br]program loader sends this to the virtual 0:22:42.830,0:22:47.720 file system server and it automatically[br]gets a device file, pointing to the right 0:22:47.720,0:22:51.800 major or minor number. And then there's[br]also a library, as I said, to provide a 0:22:51.800,0:23:03.680 framework for a driver. And that looks[br]like this. It's really easy to use. If you 0:23:03.680,0:23:08.070 were an ME developer you just write some[br]callbacks for open, and close, and 0:23:08.070,0:23:11.000 everything, and it automatically calls[br]them for you, when a message comes in, 0:23:11.000,0:23:15.400 telling you that that happened, which also[br]makes it really easy to reverse engineer, 0:23:15.400,0:23:21.100 'cause if you look at a driver, it just[br]loads some callbacks, and you can know, by 0:23:21.100,0:23:27.510 their offset in a structure, what actual[br]call they're implementing. 
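The callback-table pattern described above can be modelled in a few lines of Python. This is only a sketch of the idea: the slot order (open, close, read, write), the message shape and all names are assumptions for illustration, not the actual layout used by the ME's driver library.

```python
# Assumed slot order; in a real binary the slot is identified by the
# callback's offset inside the structure the driver registers.
OP_OPEN, OP_CLOSE, OP_READ, OP_WRITE = range(4)


class EchoDriver:
    """Toy driver that returns whatever was written to it."""

    def __init__(self):
        self.buffer = b""

    def on_open(self, msg):
        return 0

    def on_close(self, msg):
        return 0

    def on_read(self, msg):
        data, self.buffer = self.buffer, b""
        return data

    def on_write(self, msg):
        self.buffer += msg["data"]
        return len(msg["data"])


def make_callback_table(driver):
    """Build the table a driver would hand to the framework."""
    return [driver.on_open, driver.on_close, driver.on_read, driver.on_write]


def dispatch(table, msg):
    """What the framework's main loop does for each incoming message."""
    return table[msg["op"]](msg)
```

For reverse engineering, the useful consequence is the one the talk points out: once the slot order is known for one driver, the position of a function pointer tells you which call it implements in every other driver.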
Right, so then 0:23:27.510,0:23:31.950 there is one of the more weird things[br]that's going on here: How the actual 0:23:31.950,0:23:37.470 userland programs get access to memory-mapped[br]registers. There's a lot of this going on. 0:23:37.470,0:23:42.830 Calls to a couple of functions that have[br]some magic arguments. The second one you 0:23:42.830,0:23:50.640 can easily tell is the offset, because it[br]has- it increases in very nice power-of- 0:23:50.640,0:23:54.670 two steps, so it's probably the register[br]offsets, and then what comes after it 0:23:54.670,0:24:00.160 looks like a value. And then the first bit[br]seems to be a magic number. Well, it's 0:24:00.160,0:24:05.479 not. There is also an extension in the[br]metadata, saying these are the memory 0:24:05.479,0:24:12.170 mapped I/O ranges, and those ranges,[br]they each list a physical base address, 0:24:12.170,0:24:19.360 and a size, and permissions for them. Then[br]the index in that list does not directly 0:24:19.360,0:24:23.150 correspond to the magic value. The magic[br]value actually you need to do a little 0:24:23.150,0:24:27.680 computation on the offset, and you can[br]access it through those functions. The 0:24:27.680,0:24:38.600 computation itself might be familiar.[br]Yeah, so these are the functions. The 0:24:38.600,0:24:44.610 value is a segment selector. So they[br]actually don't use paging for inter- 0:24:44.610,0:24:51.820 process isolation, they use segments, like[br]x86 protected mode segments. And for each 0:24:51.820,0:24:56.610 memory mapped I/O range there is a[br]separate segment, and you manually specify 0:24:56.610,0:25:04.280 that, which is just weird to me, like, why[br]would you use x86 segmenting on a modern 0:25:04.280,0:25:10.610 system? MINIX does it, but, yeah, to[br]extend that even to this? Luckily, normal 0:25:10.610,0:25:16.130 address space is flat, like, to the[br]process, not to the kernel. 
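Since the "magic" first argument is an x86 segment selector, it can be taken apart with the standard protected-mode layout: bits 0 and 1 are the requested privilege level, bit 2 selects GDT or LDT, and the remaining bits are the descriptor index. The Python sketch below shows only that generic x86 encoding. How the index of an MMIO range in the metadata maps onto a descriptor index is not spelled out in the talk, so the helper names and the example value are purely illustrative.

```python
def decode_selector(selector):
    """Split a 16-bit x86 segment selector into its three fields."""
    return {
        "index": (selector >> 3) & 0x1FFF,   # descriptor table index
        "table": "LDT" if selector & 0x4 else "GDT",
        "rpl": selector & 0x3,               # requested privilege level
    }


def make_selector(index, use_ldt=False, rpl=0):
    """Inverse of decode_selector: build a selector from its fields."""
    return ((index & 0x1FFF) << 3) | (0x4 if use_ldt else 0) | (rpl & 0x3)


# Example value chosen for illustration, not taken from ME firmware.
print(decode_selector(0x2F))
```

When a disassembly shows a constant passed as the first argument of the MMIO accessors, decoding it this way gives a descriptor index that can then be matched against the MMIO ranges listed in the module's metadata.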
Right, so now 0:25:16.130,0:25:24.870 we can access memory mapped I/O. That's[br]all the, like the really high level stuff. 0:25:24.870,0:25:28.700 So what's going on under there? It's got[br]all the basic microkernel stuff, so 0:25:28.700,0:25:33.020 message passing, and then some[br]optimizations to actually make it perform 0:25:33.020,0:25:40.140 well on a really slow CPU. The basics are,[br]you can send a message, you can receive a 0:25:40.140,0:25:46.160 message, and you can send and receive a[br]message, where you basically say "Send a 0:25:46.160,0:25:50.930 message, wait till a response comes in,[br]then continue", which is used to wrap 0:25:50.930,0:25:58.400 function calls. This is mostly the same as[br]in Minix. There's some subtle changes, 0:25:58.400,0:26:08.230 which I'll get to later. And then memory[br]grants are something that only appeared in 0:26:08.230,0:26:13.080 Minix really recently. It's a way for a[br]process to basically create a new name for 0:26:13.080,0:26:16.690 a piece of memory it has, and give a[br]different process access to it, just by 0:26:16.690,0:26:21.630 sharing the number. These are referred to[br]by the process ID and a number of that 0:26:21.630,0:26:28.470 range. So the process IDs are actually[br]local per process, so to uniquely identify 0:26:28.470,0:26:35.461 one you need to say process ID plus that[br]number, and they're only granted to a 0:26:35.461,0:26:38.300 single process. So when a process creates[br]one of these, it can't even access it 0:26:38.300,0:26:42.490 itself, unless it creates a grant for[br]itself, which is not really that useful, 0:26:42.490,0:26:51.880 usually. These grants are used to prevent[br]having to copy over all the data inside 0:26:51.880,0:26:57.500 the IPC message used to implement a system[br]call. Yeah, these are the basic operations 0:26:57.500,0:27:03.190 on it. You can create one, you can copy[br]into and from it. So, you can't actually 0:27:03.190,0:27:07.010 map it. 
A process that receives one of[br]these has to say to the kernel, using a 0:27:07.010,0:27:12.721 system call, "please write this data into[br]that area of memory that belongs to a 0:27:12.721,0:27:17.930 different process." And then there's also[br]indirect grants, because, you know, in 0:27:17.930,0:27:25.309 MINIX they do have this, but also only[br]recently, and usually if you have a 0:27:25.309,0:27:30.360 microkernel system, you would have to copy[br]your buffer for a read call first to the 0:27:30.360,0:27:36.540 file system server and then back to, like,[br]either the hard disk driver, or the device 0:27:36.540,0:27:40.620 driver that's implementing a device file.[br]So the ME actually allows you to create a 0:27:40.620,0:27:45.860 grant, pointing to a grant, that was given[br]to you by someone else. And then that 0:27:45.860,0:27:52.820 grant will inherit the privileges of the[br]process that creates it, combined with 0:27:52.820,0:27:57.530 those that it assigns to it. So if the[br]process has a read/write grant it can 0:27:57.530,0:28:01.340 create a read-only or write-only grant,[br]but it cannot, if it only has a read 0:28:01.340,0:28:08.860 grant, it cannot add write rights to it[br]for a different process, obviously. So 0:28:08.860,0:28:12.880 then there are also some big differences[br]from MINIX. In MINIX you address a process 0:28:12.880,0:28:18.080 by its process ID or thread ID with a[br]generation number attached to it. In the 0:28:18.080,0:28:25.440 ME you can actually address IPC to a file[br]descriptor. The kernel doesn't actually know a 0:28:25.440,0:28:28.610 lot about file descriptors, it just[br]implements the basic thing where you have 0:28:28.610,0:28:32.350 a list of files and each process has a[br]list of file descriptors assigning integer 0:28:32.350,0:28:39.320 numbers to those files to refer to them[br]by. 
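The grant rules described above (a grant is usable only by the process it was issued to, data is copied through the kernel instead of being mapped, and an indirect grant can never carry more rights than the grant it points to) can be modelled with a short Python sketch. This is a toy model for reasoning about the semantics, not the ME kernel's data structures; every name in it is made up for the example.

```python
READ, WRITE = 1, 2


class Grant:
    def __init__(self, owner, grantee, rights, buffer=None, parent=None):
        self.owner = owner        # process that created the grant
        self.grantee = grantee    # the only process allowed to use it
        self.rights = rights
        self.buffer = buffer      # backing memory (direct grants only)
        self.parent = parent      # grant this one points to (indirect only)


def create_direct(owner, grantee, buffer, rights):
    return Grant(owner, grantee, rights, buffer=buffer)


def create_indirect(parent, creator, grantee, rights):
    """Re-grant a grant that was given to `creator` by someone else."""
    if parent.grantee != creator:
        raise PermissionError("creator does not hold the parent grant")
    # Rights can only be narrowed, never widened.
    return Grant(creator, grantee, rights & parent.rights, parent=parent)


def copy_into(grant, caller, offset, data):
    """Kernel-side copy: write `data` into the memory behind a grant."""
    if caller != grant.grantee:
        raise PermissionError("grant was not issued to this process")
    if not grant.rights & WRITE:
        raise PermissionError("grant is not writable")
    root = grant
    while root.parent is not None:   # follow indirect grants to the memory
        root = root.parent
    if offset < 0 or offset + len(data) > len(root.buffer):
        raise ValueError("copy outside the granted range")
    root.buffer[offset:offset + len(data)] = data
```

In the read-call scenario from the talk, an application would create a direct grant for the file system server, which would create an indirect grant for the device driver, so the driver's data lands in the application's buffer with a single copy.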
And this is used so you can, as a 0:28:39.320,0:28:43.040 process, you can actually directly talk to[br]a device driver without knowing what its 0:28:43.040,0:28:47.110 process ID is. So you don't send it to the[br]file system server, you send it to the 0:28:47.110,0:28:51.740 file descriptor and the kernel just[br]magically corrects it for you. And they 0:28:51.740,0:28:55.550 moved select into the kernel so you can[br]tell the kernel: "Hey, I want to wait till 0:28:55.550,0:28:59.720 the file system server tells me that it[br]has data available or till a message comes 0:28:59.720,0:29:05.440 in." This is one of the most complicated[br]system calls the ME offers that's used in 0:29:05.440,0:29:12.010 a normal program. You can mostly ignore it[br]and just look like: "Hey, those arguments 0:29:12.010,0:29:16.760 sort of define a file descriptor set as a[br]bit field." And then there's the message 0:29:16.760,0:29:21.040 that might have been received and there's[br]DMA locks because you don't just want to 0:29:21.040,0:29:24.790 write to registers. You actually might[br]want to do direct memory access from 0:29:24.790,0:29:30.720 hardware, so you can actually tell the[br]kernel to lock one of these memory grants 0:29:30.720,0:29:38.260 in RAM for you, it won't be swapped out[br]anymore. And yeah, it will even tell you 0:29:38.260,0:29:42.020 the physical address so you can just load[br]that into a register and it's not really 0:29:42.020,0:29:46.760 that complicated. Just lock it, get a[br]physical address, write into the register 0:29:46.760,0:29:53.580 and continue. Well, that's the most[br]important stuff about the operating 0:29:53.580,0:29:58.929 system. The hardware itself is a lot more[br]complicated because the operating system, 0:29:58.929,0:30:03.300 once you have the code, you can just[br]reverse engineer it and get to know it. 0:30:03.300,0:30:11.010 The hardware. 
Well, let's just say it's a[br]real pain to have to reverse engineer a 0:30:11.010,0:30:16.179 piece of hardware together with its[br]driver. Like, you've got the driver 0:30:16.179,0:30:18.450 code, but you don't know what the[br]registers do, so you don't know what a lot 0:30:18.450,0:30:24.440 of the logic does. And you're trying to both[br]figure out what the logic is and what the 0:30:24.440,0:30:30.050 actual registers do. Right. So first you[br]want to know which physical address goes 0:30:30.050,0:30:39.881 where. The metadata listings I showed you[br]actually have names in there. Those are 0:30:39.881,0:30:47.940 not in the metadata files themselves, I[br]annotated those. So you just see the 0:30:47.940,0:30:56.680 physical address and size. But there is[br]one module, the bus driver module, and the 0:30:56.680,0:31:04.230 bus driver is a normal user process, but it[br]implements stuff like PCI configuration 0:31:04.230,0:31:09.550 space accesses and those things. And it[br]has a nice table in it with names for 0:31:09.550,0:31:17.049 devices. So if you just run strings on it,[br]you'll see these things. When I saw this, 0:31:17.049,0:31:20.960 I was pretty glad because at least I[br]could make sense of what device was being 0:31:20.960,0:31:26.680 talked to in a certain program. So[br]the bus driver does all these things. It 0:31:26.680,0:31:30.990 manages power gating to devices, it[br]manages configuration space access, it 0:31:30.990,0:31:35.960 manages the different kinds of buses and[br]IOMMUs that are on the system. And it makes 0:31:35.960,0:31:39.500 sure that the normal driver never has to[br]know any of these details. It just asks 0:31:39.500,0:31:45.520 it for a device by a number assigned to it[br]at build time. And then the bus driver 0:31:45.520,0:31:50.360 says, OK, here's a range of physical[br]address space you can now write to. 
So 0:31:50.360,0:31:56.640 that's a really nice abstraction and also[br]gives us a lot of information because the 0:31:56.640,0:32:01.640 really old builds for Sunrise Point[br]actually have a hell of a lot of debug 0:32:01.640,0:32:07.021 strings in there as printf format strings,[br]not as catalogue IDs. It's 0:32:07.021,0:32:11.910 one of the only pieces of code for the ME[br]that does this, so that already tells you 0:32:11.910,0:32:15.480 a lot. And then there's also the table[br]that I just talked about that has the 0:32:15.480,0:32:23.760 actual info on the devices and names. So I[br]generated some DokuWiki content from this 0:32:23.760,0:32:28.570 that I use myself and this is what's in[br]the table, part of it. So it tells you 0:32:28.570,0:32:33.070 what address their PCI configuration space lives[br]at. That also tells you the bus, device and 0:32:33.070,0:32:38.130 function for it. It tells you[br]on what chipset SKUs they're present using 0:32:38.130,0:32:44.640 a bitfield. And it tells you their names[br]in different fields. It also contains the 0:32:44.640,0:32:48.540 values that are used to write the base[br]address registers for PCI. So also their 0:32:48.540,0:32:54.190 normal memory ranges. And there's even[br]more devices. So the ME has access to a 0:32:54.190,0:32:58.860 lot of stuff. A lot of it is private to[br]it. A lot of it is components that also 0:32:58.860,0:33:06.110 exist in the rest of the computer. And[br]there's not a lot of information. 0:33:06.110,0:33:11.410 These are basically all the things that[br]are out there, together with conference 0:33:11.410,0:33:15.140 slides published by other people who have[br]done research on the ME. I didn't have 0:33:15.140,0:33:21.980 time to add links to those, but they're[br]easy to find on Google. 
I'll get to 0:33:21.980,0:33:28.230 this later: I actually wrote an emulator for the[br]ME, a partial emulator to be able to run 0:33:28.230,0:33:34.230 ME code and analyze it, which obviously[br]needs to know a bit about the hardware so 0:33:34.230,0:33:41.030 you can look at that. There are some[br]files in Intel's debugger package, in 0:33:41.030,0:33:46.150 specific versions of it, that have really[br]detailed info on some of the devices, but 0:33:46.150,0:33:51.460 not on all of them. And I wrote a tool to[br]parse some of the files. It's really rough 0:33:51.460,0:33:57.040 code. I published it because people wanted[br]to see what I was doing. It doesn't work 0:33:57.040,0:34:04.080 out of the box. And there is a nice talk[br]on this by Mark Ermolov and Maxim 0:34:04.080,0:34:06.870 Goryachy. Actually, I don't know if I'm[br]pronouncing that correctly, but they've 0:34:06.870,0:34:12.049 done a lot of work on the ME and this[br]particular talk by them is really useful. 0:34:12.049,0:34:16.339 And then there's also something else.[br]There is a second ME on server chipsets, 0:34:16.339,0:34:21.299 the innovation engine. It's basically a[br]copy-paste of the ME to provide an ME that 0:34:21.299,0:34:24.760 the vendor can write code for. I don't think[br]it's used a lot. I've only been able to 0:34:24.760,0:34:31.639 find HP software that actually targets it[br]and that has some more debug strings, but 0:34:31.639,0:34:36.639 also not a lot. It mostly has a table[br]containing register names, but they're 0:34:36.639,0:34:41.869 really abbreviated. And for a really small[br]subset of the devices, there is 0:34:41.869,0:34:48.280 documentation out there in a Pentium N and[br]J series datasheet. 
It seems like they 0:34:48.280,0:34:52.409 compiled their documentation or whatever[br]with the wrong defines because it doesn't 0:34:52.409,0:35:00.350 actually fit into the manual that well,[br]it's just a section that has like some 20 0:35:00.350,0:35:08.640 tables that shouldn't be in there. So this[br]is from that talk I just referenced and 0:35:08.640,0:35:12.609 it's an overview of the innovation engine[br]and the bus bridges and everything in 0:35:12.609,0:35:20.070 there. This isn't very precise. So based[br]on some of those files from System Studio, 0:35:20.070,0:35:24.500 I tried to get a better understanding of[br]this, which is this. This is the entire 0:35:24.500,0:35:29.760 chipset. The little DMA block in the top[br]left corner is what connects to your CPU. 0:35:29.760,0:35:36.570 And all of the big blocks with a lot of[br]ports are bus bridges or switches for 0:35:36.570,0:35:45.470 PCI Express-like fabric. So there's a lot[br]going on. The highlighted area is the 0:35:45.470,0:35:59.081 management engine memory space and the[br]rest of it is like the global chipset. The 0:35:59.081,0:36:02.840 things I've highlighted in green here are[br]on the primary PCI bus. So there's this 0:36:02.840,0:36:08.210 weird thing going on where there seem to[br]be two PCI hierarchies, at least 0:36:08.210,0:36:13.741 logically. So in reality it's not even[br]PCI, but on Intel systems, there's a lot 0:36:13.741,0:36:19.600 of stuff that behaves as if it is PCI. So[br]it has, like, bus, device and function 0:36:19.600,0:36:28.650 numbers and PCI configuration space registers,[br]and they have two different roots for the 0:36:28.650,0:36:32.310 configuration space. So even though the[br]configuration space address includes a bus 0:36:32.310,0:36:36.480 number, they have two completely different[br]hierarchies, each of which has its 0:36:36.480,0:36:41.290 own bus zero. 
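To make the "bus, device, function" addressing above concrete, here is a small Python sketch using the standard PCI Express ECAM layout as seen from a host: an 8-bit bus, 5-bit device, 3-bit function and a 4 KiB register window per function. Treat it as an assumption-laden illustration: the talk does not say that the ME's two configuration-space roots use exactly this encoding, and the two base addresses below are invented placeholders.

```python
# Hypothetical base addresses for the two configuration-space roots.
# These values are placeholders, not taken from real ME firmware.
PRIMARY_ROOT_BASE = 0xE000_0000
FIXED_ROOT_BASE = 0xF000_0000


def ecam_offset(bus: int, device: int, function: int, register: int = 0) -> int:
    """Offset of a configuration register in the standard PCIe ECAM layout."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= register < 4096:
        raise ValueError("register offset out of range")
    return (bus << 20) | (device << 15) | (function << 12) | register


def config_address(root_base: int, bus: int, device: int, function: int,
                   register: int = 0) -> int:
    """Address of a configuration register under a given root."""
    return root_base + ecam_offset(bus, device, function, register)


# The same bus 0, device 0, function 0 lands at a different address per root,
# which is how two hierarchies can each have their own bus zero.
primary_b0 = config_address(PRIMARY_ROOT_BASE, 0, 0, 0)
fixed_b0 = config_address(FIXED_ROOT_BASE, 0, 0, 0)
```

The point of the sketch is only that the bus number lives inside the address, so two separate roots give two independent sets of bus numbers.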
So that's weird, also[br]because it doesn't make sense when you 0:36:41.290,0:36:45.680 look at how the hardware is laid out. So[br]this is stuff that's on the primary PCI 0:36:45.680,0:36:50.780 configuration space that's directly[br]accessed by the ME, by the Northbridge on 0:36:50.780,0:36:55.260 the ME CPU. So that's the Minute IA[br]system agent. System agent is what Intel 0:36:55.260,0:37:00.619 calls a Northbridge nowadays, now that[br]it's not a separate chip anymore. It's 0:37:00.619,0:37:07.530 basically just a Northbridge and a crypto[br]unit that's on there and the stuff that's 0:37:07.530,0:37:12.530 directly attached to the Northbridge, being the[br]ROM and the RAM. So the processor itself 0:37:12.530,0:37:16.960 is, as I said, derived from a 486, but it[br]does actually have some more modern 0:37:16.960,0:37:21.830 features: it does CPUID, at least on[br]my systems. Some other researchers said 0:37:21.830,0:37:29.369 theirs didn't. It's basically the core[br]that's in the Quark MCU, which is really 0:37:29.369,0:37:33.260 great because it's one of the only cores[br]made by Intel that has public 0:37:33.260,0:37:39.800 documentation on how to do run control. So[br]breakpoints and accessing registers and 0:37:39.800,0:37:44.420 everything over JTAG. Intel doesn't[br]publish this stuff except for the Quark 0:37:44.420,0:37:50.920 MCU, because it was targeted at makers.[br]But they reused that in here, which is 0:37:50.920,0:37:58.200 really useful. It even has an official[br]port to the OpenOCD debugger, which I have 0:37:58.200,0:38:03.100 not gotten to test because I don't have a[br]JTAG probe that is compatible with Intel 0:38:03.100,0:38:11.000 voltage levels and supported by OpenOCD.[br]And it also has, like I said, CPUID and MSRs. 0:38:11.000,0:38:21.170 It has some really fancy features like[br]branch tracing and some more strict paging 0:38:21.170,0:38:30.480 permission enforcement stuff. They don't[br]use the interrupt pins on this. 
So it's an 0:38:30.480,0:38:34.710 IP block, and there are some files out[br]there, which is where this screenshot 0:38:34.710,0:38:40.601 is from, that are actually used by a[br]built-in logic analyzer Intel has on the 0:38:40.601,0:38:46.680 chipset, where you can select different[br]signals on the chip to watch, which is 0:38:46.680,0:38:50.900 a really great source of information on[br]how the IP blocks are laid out and what 0:38:50.900,0:38:54.200 signals are in there, because you[br]basically get a tree view of the IP blocks 0:38:54.200,0:39:00.800 on the chip and some of their signals. They[br]don't use the legacy interrupt system, 0:39:00.800,0:39:07.920 they only use message-based interrupts,[br]where a device writes a value into a 0:39:07.920,0:39:13.050 register on the interrupt controller[br]instead of asserting a pin. And then there 0:39:13.050,0:39:21.700 is the Northbridge. It's partially[br]documented in that datasheet I mentioned. 0:39:21.700,0:39:29.020 It does support x86 IO address space, but[br]it's never used. Everything in the ME is 0:39:29.020,0:39:36.600 in memory space or exposed as memory space[br]through bridges. The Northbridge 0:39:36.600,0:39:43.070 implements access to the ROM and RAM. It has an[br]IOMMU which is only used for transactions 0:39:43.070,0:39:48.750 coming from the rest of the system and,[br]at least in 0:39:48.750,0:39:51.660 the firmware I looked at, it's always[br]initialized to the inverse of the page 0:39:51.660,0:40:00.200 table, so linear addresses can be used for[br]memory maps, sorry, for DMA. It also does 0:40:00.200,0:40:06.270 PCI configuration space access to the[br]primary PCI bus. And it has a firewall 0:40:06.270,0:40:15.080 that allows the operating system to deny[br]any IP block in the chipset from sending a 0:40:15.080,0:40:18.890 completion for a bus request. 
So it can[br]actually say: "Hey, I want to read some 0:40:18.890,0:40:25.040 register and only these devices are[br]allowed to send me a value for it." So 0:40:25.040,0:40:29.570 they've actually thought about security[br]here, which is great. Then there is one of 0:40:29.570,0:40:38.190 the most important blocks in the ME, which[br]is the crypto engine. It does some of the 0:40:38.190,0:40:47.100 more well-known crypto algorithms: AES,[br]SHA hashes, RSA. And it has a secure key 0:40:47.100,0:40:56.330 store, which I'm not gonna [audio dropped][br]... all about it in their ME talk at 0:40:56.330,0:41:04.250 Blackhat. And a lot of these things have[br]DMA engines, which all seem to be the 0:41:04.250,0:41:09.500 same. And there are no other DMA[br]engines in the ME, so this is also used for 0:41:09.500,0:41:23.170 memory-to-memory copies or DMA into other[br]devices. So that's used in a lot of 0:41:23.170,0:41:27.400 things. This is actually a diagram which I[br]don't have the vector for anymore. So 0:41:27.400,0:41:35.260 that's why the LibreOffice background is[br]in there. I'm sorry. So this is basically 0:41:35.260,0:41:39.020 what that crypto engine looks like when[br]you look at that signal tree that I was 0:41:39.020,0:41:44.910 talking about earlier. The DMA engines are[br]able both to do memory-to-memory copies 0:41:44.910,0:41:52.570 and to directly target the crypto unit[br]they're part of. Basically, I 0:41:52.570,0:41:57.490 don't know about the control bits that go[br]with this, but when you set the target 0:41:57.490,0:42:02.150 address to zero and the right control[br]bits, it will copy into the buffer that's 0:42:02.150,0:42:11.960 used for the encryption. So that is how it[br]accelerates memory access for crypto. And 0:42:11.960,0:42:15.590 these are the actual register offsets.[br]They're the same for all of the DMA 0:42:15.590,0:42:21.580 engines in there relative to the base[br]address of the subunit they're in. 
And 0:42:21.580,0:42:27.290 then there's the second PCI bus or bus[br]hierarchy, which in some places is 0:42:27.290,0:42:33.540 called the PCI fixed bus. I'm actually not[br]entirely sure whether this is actually 0:42:33.540,0:42:38.840 implemented as a PCI bus as I've drawn it[br]here, but this is what it behaves like. So 0:42:38.840,0:42:43.920 it has all the ME private stuff that's[br]not a part of the normal chipset. So it has 0:42:43.920,0:42:51.310 timers for the ME, it has the[br]implementation of the secure enclave 0:42:51.310,0:42:58.010 stuff, that's the firmware TPM registers.[br]And it has the gen device which I've 0:42:58.010,0:43:01.780 mostly ignored because it's only used at[br]boot time. It's mostly only used by the actual 0:43:01.780,0:43:10.869 boot ROM for the ME. It is what the[br]ME uses to get the fuses Intel burns. So 0:43:10.869,0:43:15.420 that's the Intel public key, whether it's[br]a production or pre-production part, but 0:43:15.420,0:43:20.260 it's pretty much a black box. It's not[br]used that much, fortunately. There is the 0:43:20.260,0:43:24.340 IPC block which allows the ME to talk to[br]the sensor hub, which is a different CPU 0:43:24.340,0:43:28.190 in the chipset. It allows it to talk to the[br]power management controller and all kinds 0:43:28.190,0:43:34.180 of other embedded CPUs. So it's inter-[br]processor communication, not inter-process. 0:43:34.180,0:43:39.090 That confused me for a bit. And here's the host[br]embedded controller interface, which is 0:43:39.090,0:43:44.320 how the ME talks to the rest of the[br]computer when it wants the computer to 0:43:44.320,0:43:47.960 know that it's talking. It can directly[br]access a lot of stuff, but when it wants 0:43:47.960,0:43:54.250 to send a message to the EFI or to Windows[br]or Linux, it'll use this. And it also has 0:43:54.250,0:43:59.080 status registers, which are really simple[br]things where the ME writes in a value. 
And 0:43:59.080,0:44:05.290 even if the ME crashes, the host can still[br]read the value, which is how you can see 0:44:05.290,0:44:11.160 whether the ME is running, whether it's[br]disabled, whether it fully booted, or 0:44:11.160,0:44:15.400 whether it crashed halfway through, but at[br]a point where it could still get the rest 0:44:15.400,0:44:21.230 of the computer running. And there is some[br]coreboot code to read it. I've also 0:44:21.230,0:44:27.080 implemented some decoding for it in the[br]emulator because it's useful to see what 0:44:27.080,0:44:33.210 those values mean. So then there's[br]something really interesting, the primary 0:44:33.210,0:44:37.240 address translation table, which is the[br]bus bridge that allows the ME to actually 0:44:37.240,0:44:44.200 access the PCI Express fabric of the[br]computer. For a lot of what I call 0:44:44.200,0:44:50.010 ME peripherals in this table, which are[br]actually outside the ME domain in the 0:44:50.010,0:45:00.320 chipset, it uses this to access them. It[br]also uses it to access the UMA, which is 0:45:00.320,0:45:04.960 an area of host RAM that's used as a swap[br]device for the ME, and the Trace Hub, which is 0:45:04.960,0:45:11.190 the debug port. But it also has a couple of[br]windows which allow the ME to access any 0:45:11.190,0:45:19.060 random area of host RAM, which is the most[br]scary bit because UMA is specified by the 0:45:19.060,0:45:24.650 host, but the host DRAM area is where you[br]can just point it anywhere. You can read 0:45:24.650,0:45:28.750 or write any value that Windows or[br]Linux or whatever you're running has 0:45:28.750,0:45:37.460 sitting there. So that's scary to me. And[br]then there's the rest of it, the rest 0:45:37.460,0:45:46.490 of the devices which are behind the[br]primary ATT. And that's a lot of stuff, 0:45:46.490,0:45:53.450 that's debug, that's also all the normal[br]peripherals that your PC 
has, but it 0:45:53.450,0:45:56.200 also includes things like the power[br]management controller, which actually 0:45:56.200,0:45:59.789 turns on and off all the different parts[br]of your computer. It controls clocks and 0:45:59.789,0:46:07.680 resets. So this is really important. There[br]is a concept that you'll come across when 0:46:07.680,0:46:14.261 you're reading Intel manuals or ME related[br]stuff, and that's root spaces. Besides the 0:46:14.261,0:46:20.320 normal addressing information for a PCI[br]device, there is also a root space number, 0:46:20.320,0:46:24.980 which is basically how you have a single[br]PCI device exposing two completely 0:46:24.980,0:46:31.151 different address spaces. It's 0 for[br]the host and 1 for the ME. Some 0:46:31.151,0:46:34.940 devices expose the same information on[br]there. Other ones behave completely 0:46:34.940,0:46:43.370 differently. That's something you don't[br]usually see. And then there's the side 0:46:43.370,0:46:48.560 band fabric. So besides all this stuff[br]I just covered, which is PCI-like at 0:46:48.560,0:46:52.880 least, there is also something completely[br]different, the side band fabric, which is a 0:46:52.880,0:47:00.990 completely packet switched network, where[br]you don't use any memory mapping by 0:47:00.990,0:47:06.370 default. You just have a one byte address[br]for a device and some other addressing 0:47:06.370,0:47:09.590 fields and you're just sending a message[br]saying: "Hey, I want to read configuration 0:47:09.590,0:47:14.320 or data or memory." And there is actually[br]a lot of information out there on this, 0:47:14.320,0:47:18.480 because Intel, it seems, just copy-[br]pasted their internal specification into a 0:47:18.480,0:47:26.860 patent. This is how you address it. These[br]are all the devices on there, which is quite a 0:47:26.860,0:47:32.590 lot. Also, if any of you are[br]kernel developers, you may have had to deal 0:47:32.590,0:47:40.110 with GPIO on Intel SoCs. 
There's this P2SB[br]device that you have to use. That's what 0:47:40.110,0:47:48.240 the host uses to access this. Their[br]documentation on it is really, really bad. 0:47:48.240,0:47:52.420 This was all done using static analysis.[br]But then I wanted to figure out how some 0:47:52.420,0:47:57.410 of the logic actually works and it was[br]really complicated to play around with the 0:47:57.410,0:48:07.310 ME. There was this nice talk by Ermolov[br]and Goryachy, where they said: "You know, 0:48:07.310,0:48:11.790 we found an exploit that gives you code[br]execution and you can get JTAG 0:48:11.790,0:48:18.813 access too." It sounds really nice. It's[br]actually not that easy. It's arbitrary code 0:48:18.813,0:48:23.359 execution in the BUP module. They actually[br]describe their exploit and how you should 0:48:23.359,0:48:30.270 use it. But they didn't describe anything[br]that's needed to actually implement that. 0:48:30.270,0:48:35.690 So if you want to do that, you need[br]to figure out where the stack lives, and 0:48:35.690,0:48:40.230 you need to write a[br]payload that will actually get you from a 0:48:40.230,0:48:44.640 buffer overflow on a stack that, by the[br]way, uses stack cookies, so you can't just 0:48:44.640,0:48:51.369 overwrite the return address, to[br]an arbitrary write. And you need to 0:48:51.369,0:48:56.369 find out what the return pointer address[br]is so you can overwrite it and find ROP 0:48:56.369,0:49:03.320 gadgets because the stack is not[br]executable. And then when you've done 0:49:03.320,0:49:09.920 that, you can just turn on debug access or[br]change to custom firmware or whatever. 
So 0:49:09.920,0:49:13.660 I had a bit of trouble[br]getting that running, because in order to test 0:49:13.660,0:49:17.720 your payload, you have to flash it into[br]the system and it takes a while and then 0:49:17.720,0:49:20.880 the system just doesn't power on if the[br]ME's not working, if you're crashing it 0:49:20.880,0:49:24.580 instead of getting code execution. So it's[br]not really viable to develop it that 0:49:24.580,0:49:32.910 way, I think. Some people did. I respect[br]that because it's really, really hard. And 0:49:32.910,0:49:38.790 then I wrote this ME Loader. It's called[br]Loader because at first I started out 0:49:38.790,0:49:42.849 writing it as a sort of a Wine thing[br]where you would just mmap the right 0:49:42.849,0:49:47.380 ranges at the right place and jump into[br]it, execute it, patch some system calls. 0:49:47.380,0:49:51.849 But because the ME is a microkernel[br]system and almost every user space program 0:49:51.849,0:49:57.480 accesses hardware directly, I ended up[br]implementing like a good part of the 0:49:57.480,0:50:08.080 chipset, at least as stubs or enough logic[br]to get the code running. And I later on 0:50:08.080,0:50:14.510 added some features that actually allow it[br]to talk to the hardware. I can also use it with a 0:50:14.510,0:50:18.530 debugger: because it's actually[br]running the ME firmware, or parts of it, 0:50:18.530,0:50:26.200 inside a normal Linux process, I can just[br]use gdb to debug it. And back in April 0:50:26.200,0:50:30.320 last year, I got that working to the point[br]where I could run the bootstrap process, 0:50:30.320,0:50:38.580 which is where the vulnerability is. And[br]then you just develop the exploit against 0:50:38.580,0:50:43.960 it, which I did. And then I made a mistake[br]cleaning up some old chroot 0:50:43.960,0:50:52.010 environments for closed source software.[br]And I nuked my home dir. Yeah. 
I hadn't 0:50:52.010,0:50:56.599 yet pushed everything to GitHub. So I[br]was stuck with an old version and I decided, 0:50:56.599,0:51:00.160 you know, let's refactor this and turn it[br]into something that might actually at some 0:51:00.160,0:51:03.930 point be published, which by the way I [br]did last summer. This is all public code. The 0:51:03.930,0:51:09.790 ME Loader thing. It's on GitHub. And[br]someone else beat me to it and replicated 0:51:09.790,0:51:15.250 that exploit by the Russian guys, who up to[br]then had only produced a proof of concept 0:51:15.250,0:51:22.760 for Apollo Lake chipsets, which was[br]completely different from what you had 0:51:22.760,0:51:33.690 to do for a normal ME. I was a bit[br]disappointed by that, not being the 0:51:33.690,0:51:38.580 first one to actually replicate this. But[br]then, about a week later, I 0:51:38.580,0:51:44.270 got my loader back to the point where I[br]could actually get to the vulnerable code 0:51:44.270,0:51:51.120 and develop that exploit and got it[br]working not too long after. And here's the 0:51:51.120,0:51:54.720 great thing. Then I went to the hacker[br]space. I flashed it into my laptop. The 0:51:54.720,0:51:59.040 image that I had just been using only on[br]the emulator. I didn't change it. I flashed it. 0:51:59.040,0:52:05.280 I was like, this is never gonna work.[br]It worked. some laughter And I've still got an image 0:52:05.280,0:52:08.480 on a flash chip with me because that's[br]what I used to actually turn on the 0:52:08.480,0:52:14.490 debugger. And then you need a debug probe[br]because that USB based debugging stuff 0:52:14.490,0:52:18.810 that's mentioned here only works pretty[br]late in boot. Which is also why you only 0:52:18.810,0:52:21.880 really see Apollo Lake stuff, because on[br]those chipsets you can actually use this 0:52:21.880,0:52:33.010 for the ME. 
And then you need this thing[br]because there's a second channel that is 0:52:33.010,0:52:36.360 using the USB plug, but it's a completely[br]different physical layer and you need an 0:52:36.360,0:52:40.911 adapter for it, which I don't think was[br]intended to be publicly available. Because 0:52:40.911,0:52:44.859 if you go to Intel's site and say, I want to[br]buy this, they say, here's the CNDA, 0:52:44.859,0:52:54.460 please sign it. But it appeared on Mouser.[br]And luckily I knew some people who had 0:52:54.460,0:52:59.120 done some other stuff, got a nice bounty[br]for it and bought it and let me use it. 0:52:59.120,0:53:05.430 Thanks to them. It's expensive, but you[br]can buy it if it's still up there. Haven't 0:53:05.430,0:53:11.520 checked. That's the link. So I'm a bit[br]late, so I'm gonna use the time for 0:53:11.520,0:53:15.760 questions as well. So the main thing the[br]ME does that you cannot replace is the 0:53:15.760,0:53:21.250 boot process. It's not just breaking the[br]system if you don't turn it on; it 0:53:21.250,0:53:25.240 actually does stuff that has to be done.[br]So you're gonna have to use the ME anyway if 0:53:25.240,0:53:30.730 you want to boot a computer. You don't[br]necessarily have to use Intel's firmware. 0:53:30.730,0:53:35.810 The ME itself boots like a microkernel[br]system, so it has a process which 0:53:35.810,0:53:39.859 implements a lot of the servers that will[br]allow it to get to a point where it can 0:53:39.859,0:53:44.710 start those servers. This process has very[br]high privileges in older versions, which 0:53:44.710,0:53:49.160 is what is being used on these chipsets.[br]And if you exploit that, you're still ring 0:53:49.160,0:53:55.680 3, but you can turn on the debugger and you[br]can use the debugger to become ring 0. So 0:53:55.680,0:53:59.171 this is what the normal boot process for a[br]computer looks like. And this is what 0:53:59.171,0:54:02.050 happens when you use Boot Guard. 
There's a[br]bit of code that runs even before the 0:54:02.050,0:54:07.170 reset vector, and that's started by microcode[br]initialization, of course. And this 0:54:07.170,0:54:12.120 is what actually happens. The ME loads a[br]new firmware into the power management 0:54:12.120,0:54:16.390 controller, it then readies some stuff in the[br]chipset and it tells the power management 0:54:16.390,0:54:23.660 controller: please stop pulling that[br]CPU reset pin low, and the CPU will start. 0:54:23.660,0:54:28.160 The power management controller is a completely[br]independent thing, an 8051-derived 0:54:28.160,0:54:32.690 microcontroller that runs a real time[br]operating system from the 90s. This is the 0:54:32.690,0:54:38.690 only string in the firmware by the way,[br]that's quoted there. And depending on the 0:54:38.690,0:54:42.410 chipset that you have, it's either loaded[br]with a patch or with a complete binary 0:54:42.410,0:54:46.690 from the ME, and it does a lot of[br]important stuff. There's no documentation on it 0:54:46.690,0:54:52.120 besides the ACPI interface, which is not[br]really useful. The ME has to do these 0:54:52.120,0:54:58.710 things. It needs to load the keys for the[br]Boot Guard process, it needs to set up clock 0:54:58.710,0:55:06.550 controllers and then tell the PMC to turn[br]on the power to the CPU. It needs to 0:55:06.550,0:55:15.240 configure the PCI Express fabric and reset -[br]like get the CPU to come out of reset. 0:55:15.240,0:55:18.290 There's a lot of code involved in this, so[br]I really didn't want to do this all 0:55:18.290,0:55:22.150 statically. What I did is I added hardware[br]support, hardware passthrough support to 0:55:22.150,0:55:28.500 the emulator and booted my laptop that[br]way. I actually had a video of this, but I 0:55:28.500,0:55:33.970 don't have the time to show it, which is a[br]pity. 
But this is the bring-up 0:55:33.970,0:55:38.030 process from the ME running in a Linux[br]process, sending whatever hardware accesses 0:55:38.030,0:55:43.340 it was trying to do that are important[br]for boot to the debugger. And then that 0:55:43.340,0:55:49.880 was using an ME in real hardware that was[br]halted to actually do the register accesses 0:55:49.880,0:55:56.520 and it works. I'm not going to show this.[br]It actually booted the computer reliably. 0:55:56.520,0:56:02.410 Then Boot Guard configuration is fun[br]because you know how they say they fuse 0:56:02.410,0:56:10.990 in the keys. Well yeah. But the ME loads[br]them from fuses and then manually loads 0:56:10.990,0:56:14.530 them into registers. So if you have code[br]execution on the ME before it does this, 0:56:14.530,0:56:18.000 you can just load your own values and you[br]can run coreboot even on a machine that 0:56:18.000,0:56:24.190 has Boot Guard. Yeah. So I'm gonna go[br]through this really quickly. These are, by 0:56:24.190,0:56:29.570 the way, the registers that[br]configure what security model the CPU is 0:56:29.570,0:56:34.579 gonna enforce for the firmware. I'm going[br]to release this code after my talk. It's 0:56:34.579,0:56:39.810 part of a Python script that I wrote that[br]uses the debugger to start the CPU without 0:56:39.810,0:56:45.670 ME firmware. I traced all of what the ME[br]firmware did. And I now have a Python 0:56:45.670,0:56:51.470 script that can just start a computer[br]without Intel's code. If you translate 0:56:51.470,0:56:55.920 this into a ROP sequence or even into[br]binary for the ME, you can start a 0:56:55.920,0:57:02.850 computer without the ME itself or at least[br]without it running the operating system. 0:57:02.850,0:57:12.710 applause[br]So, yeah, future goals. 
I really do want 0:57:12.710,0:57:20.420 to share this because if there is a way to[br]escalate to ring 0 through a ROP chain, 0:57:20.420,0:57:24.359 then you could just start your own kernel[br]in the ME and have custom firmware, at 0:57:24.359,0:57:29.600 least from the vulnerability on. But you[br]could also build a mod chip that uses the 0:57:29.600,0:57:34.829 debugger interface to load a new firmware.[br]There's lots of stuff that still needs to be 0:57:34.829,0:57:41.210 discovered, but I'm gonna hang out at the[br]open source firmware village later, at 0:57:41.210,0:57:46.690 least part of the week here, because I[br]really want to get started on open source 0:57:46.690,0:57:55.250 ME firmware using this. Right. And there's[br]a lot of people that have played a role in 0:57:55.250,0:58:00.700 getting me to this point. I'd also like[br]to thank the guy from the Hague hackerspace, 0:58:00.700,0:58:07.680 BinoAlpha, who basically allowed me to use[br]his laptop to prepare the demo, which I 0:58:07.680,0:58:14.660 ended up not being able to show, but.[br]Right. I was gonna ask whether there are 0:58:14.660,0:58:17.380 any questions. But I don't think[br]there's really any time for any more. 0:58:17.380,0:58:22.570 Herald: Peter, thank you so much. Applause[br]Unfortunately, we don't have any more time 0:58:22.570,0:58:30.720 left.[br]Peter: I'll be around. I'll be around. 0:58:30.720,0:58:35.660 Herald: I think it's very, very[br]interesting because I hope that your talk 0:58:35.660,0:58:41.119 will inspire many people to keep looking[br]into how the management engine works and 0:58:41.119,0:58:46.930 hopefully uncover even more stuff. I think[br]we have time for just one single question. 0:58:46.930,0:58:51.040 I don't know, do we? Ah, one from the[br]Internet. Thank you so much. 0:58:51.040,0:58:56.790 Signal Angel: OK. First off, I have to[br]tell you. Your shirt is nice. Chat wanted 0:58:56.790,0:59:05.000 me to say this. 
And they asked how[br]reliable this exploit is and does it work 0:59:05.000,0:59:09.160 on every boot?[br]Peter: Right, Yeah. That's actually 0:59:09.160,0:59:14.960 something really important that I forgot[br]to mention. So they patched the vulnerability, 0:59:14.960,0:59:17.339 but they didn't provide downgrade[br]protection. If you can flash a 0:59:17.339,0:59:24.170 vulnerable image with an exploit in it,[br]it'll just boot every time. On these chips, 0:59:24.170,0:59:27.850 so sixth or seventh generation chips,[br]you just put in that image and it will 0:59:27.850,0:59:31.230 reliably turn on the debugger every time[br]you turn on the computer. applause 0:59:31.230,0:59:36.650 Herald: Thank you so much for the[br]question. And Peter Bosch thank you so 0:59:36.650,0:59:39.160 much. Please give him a great round of[br]applause. 0:59:39.160,0:59:43.625 applause 0:59:43.625,1:00:08.000 subtitles created by c3subtitles.de[br]in the year 20??. Join, and help us!