1 00:00:00,000 --> 00:00:19,152 36C3 preroll music 2 00:00:19,152 --> 00:00:22,520 Herald: The next talk is an Intel Management Engine deep dive: 3 00:00:22,520 --> 00:00:27,230 understanding the ME at the OS and hardware level, and it is by Peter Bosch. 4 00:00:27,230 --> 00:00:31,089 Please welcome him with a great round of applause! 5 00:00:31,089 --> 00:00:38,780 Applause 6 00:00:38,780 --> 00:00:49,409 Peter Bosch: Right. So everybody. Harry. Nice. OK. So welcome. Well, this is me. 7 00:00:49,409 --> 00:00:59,510 I'm a student at Leiden University. Yeah, I've always been really interested in how 8 00:00:59,510 --> 00:01:04,610 stuff works. And when I got a new laptop, I was like, you know, how does this thing 9 00:01:04,610 --> 00:01:08,410 really boot? I knew everything from the reset vector onwards. I wanted to know what 10 00:01:08,410 --> 00:01:15,221 happened before it. So first I started looking at the Boot Guard ACM. While 11 00:01:15,221 --> 00:01:21,420 looking through it, I realized that not everything was as it was supposed to be. 12 00:01:21,420 --> 00:01:26,280 That led to a later part in the boot process being vulnerable, which ended up 13 00:01:26,280 --> 00:01:34,249 being discovered by me. And I found out here last year that I wasn't the only one 14 00:01:34,249 --> 00:01:38,310 to find it. Trammell Hudson also found it, and we reported it together and presented it 15 00:01:38,310 --> 00:01:43,399 at Hack in the Box. And at the same time, I was already also looking at the 16 00:01:43,399 --> 00:01:49,350 management engine. Well, there had been a lot of research done on that before. The 17 00:01:49,350 --> 00:01:58,140 public info was mostly on the file system and on specific vulnerabilities, which 18 00:01:58,140 --> 00:02:04,400 still made it pretty hard to get started on reverse-engineering it. So that's why I 19 00:02:04,400 --> 00:02:10,340 thought it might be useful for me to present this work here. 
It's basically 20 00:02:10,340 --> 00:02:16,910 broken up into three parts. The first bit is just a quick introduction to the 21 00:02:16,910 --> 00:02:22,250 operating system it runs. So if you want to work on this yourself, you're more 22 00:02:22,250 --> 00:02:28,690 easily able to understand what's in front of you in your disassembler. And then 23 00:02:28,690 --> 00:02:37,950 after that, I'll go over its role in the boot process and then also how this 24 00:02:37,950 --> 00:02:45,780 information can be used to start developing a new firmware for it or do 25 00:02:45,780 --> 00:02:49,730 more security research on it. So first of all, what exactly is the management 26 00:02:49,730 --> 00:02:57,280 engine? There's been a lot of fuss about it being a backdoor and everything. In 27 00:02:57,280 --> 00:03:05,000 reality, whether it is or not depends on the software that it runs. It's basically a 28 00:03:05,000 --> 00:03:09,110 processor with its own RAM and its own I/O and MMUs and everything, sitting inside 29 00:03:09,110 --> 00:03:16,049 your southbridge. It's not in the CPU, it's in the southbridge. So when I say this 30 00:03:16,049 --> 00:03:24,010 is gonna be about the sixth and seventh generation of Intel chips, I mean mostly 31 00:03:24,010 --> 00:03:28,489 motherboards from those generations. If you run a newer CPU on it, it will also 32 00:03:28,489 --> 00:03:39,584 work for that. So yeah. Bit more detail. The CPU it runs on is based on the 80486, which, 33 00:03:39,584 --> 00:03:43,510 you know, is funny. It's quite an old CPU and it's still being used in almost 34 00:03:43,510 --> 00:03:51,079 every computer nowadays. So it has a little bit of its own RAM. It has quite a 35 00:03:51,079 --> 00:03:58,150 bit of built-in ROM, has a hardware-accelerated cryptographic unit and it has 36 00:03:58,150 --> 00:04:05,450 fuses, which are write-once memory used to store security settings and keys and 37 00:04:05,450 --> 00:04:11,079 everything. 
Some of the more scary features it has: bus bridges to all of the 38 00:04:11,079 --> 00:04:16,419 buses inside the southbridge, it can access the RAM on the CPU and it can 39 00:04:16,419 --> 00:04:21,359 access the network, which makes it really quite dangerous if there is a 40 00:04:21,359 --> 00:04:28,409 vulnerability or if it runs anything nefarious. And its tasks nowadays include 41 00:04:28,409 --> 00:04:35,860 starting the computer as well as adding management features. This is mostly used 42 00:04:35,860 --> 00:04:41,190 in servers, where it can serve as a baseboard management controller, do like a remote 43 00:04:41,190 --> 00:04:49,001 keyboard and video. And it does security: Boot Guard, which is the signing of 44 00:04:49,001 --> 00:04:54,830 firmware and verification of signatures. It implements a firmware TPM and there is 45 00:04:54,830 --> 00:05:02,590 also an SDK to use it as a general-purpose secure enclave. So on the software side of 46 00:05:02,630 --> 00:05:12,650 it, it runs a custom operating system, parts of which are taken from MINIX, the 47 00:05:12,650 --> 00:05:17,250 teaching operating system by Andrew Tanenbaum. It's a microkernel operating 48 00:05:17,250 --> 00:05:32,930 system. It runs binaries that are in a completely custom format. It's really 49 00:05:32,930 --> 00:05:36,030 quite a high-level system actually. If you look at it in terms of the operating 50 00:05:36,030 --> 00:05:40,681 system it runs, it's mostly like Unix, which makes it kind of familiar, but it 51 00:05:40,681 --> 00:05:46,819 also has large custom parts. Like I said before, in this talk I'm going to be 52 00:05:46,819 --> 00:05:52,740 speaking about sixth and seventh generation Intel Core chipsets, so that's 53 00:05:52,740 --> 00:05:58,949 Sunrise Point, Lewisburg, which is the server version of this, and also the laptop 54 00:05:58,949 --> 00:06:04,410 systems on a chip; they're just called Intel Core low power. 
They also include the 55 00:06:04,410 --> 00:06:08,360 chipset as a separate die, so it also applies to them. In fact, I've been 56 00:06:08,360 --> 00:06:11,979 testing most of the stuff I'm going to tell you about on the laptop that's 57 00:06:11,979 --> 00:06:19,430 sitting right here, which is a Lenovo T460. The version of the firmware I've been 58 00:06:19,430 --> 00:06:30,820 looking at is 11.0.0.1205. Right. So I do need to put this up there. I'm not a part 59 00:06:30,820 --> 00:06:38,520 of Intel, nor have I signed any contracts with them. I've found everything in ways 60 00:06:38,520 --> 00:06:43,500 that you could also do. I didn't have any leaked NDA stuff or anything that you 61 00:06:43,500 --> 00:06:53,099 couldn't get your hands on. It's also a very wide subject area, so there might be 62 00:06:53,099 --> 00:07:00,580 some mistakes here or there, but generally it should be right. Well, if you want to 63 00:07:00,580 --> 00:07:04,220 get started working on an ME firmware, want to reverse-engineer it or modify it 64 00:07:04,220 --> 00:07:08,580 in some way, first you've got to deal with the image file. You've got your SPI flash. 65 00:07:08,580 --> 00:07:12,009 That's where most of its firmware lives, in the same flash chip as your BIOS. So 66 00:07:12,009 --> 00:07:17,410 you've got that image. And then how do you get the code out? Well, there's tools for 67 00:07:17,410 --> 00:07:22,949 that. It's already been extensively documented by other people. 68 00:07:22,949 --> 00:07:28,681 And you can basically just download a tool and run it against it. Which makes this 69 00:07:28,681 --> 00:07:31,690 really easy. This is also the reason why there hasn't been a lot of research done 70 00:07:31,690 --> 00:07:35,940 yet: before these tools were around, you couldn't get to all of the code. The 71 00:07:35,940 --> 00:07:41,349 kernel was compressed using Huffman tables, which were stored in ROM. 
You 72 00:07:41,349 --> 00:07:45,360 couldn't get to the ROM without getting code execution on the thing. So there was 73 00:07:45,360 --> 00:07:52,639 basically no way of getting access to the kernel code, and I think also the syslib 74 00:07:52,639 --> 00:07:55,800 library. But that's not a problem anymore. You can just download a tool and unpack 75 00:07:55,800 --> 00:08:02,520 it. Also, the Intel tool to generate firmware images, which you can find in 76 00:08:02,520 --> 00:08:11,979 some open directories on the internet, has Qt resources, XML files which basically have the 77 00:08:11,979 --> 00:08:18,330 description for all of the file formats used by these ME versions, including names 78 00:08:18,330 --> 00:08:26,050 and comments to go with those structure definitions. So that's really useful. So 79 00:08:26,050 --> 00:08:30,430 we look at one of these images. It has a couple of partitions, some of them overlap, 80 00:08:30,430 --> 00:08:38,150 and some of them are storage, some are code. So there are the main partitions, 81 00:08:38,150 --> 00:08:45,709 FTPR and NFTP, which contain the programs it runs. There's MFS, which is the read-write 82 00:08:45,709 --> 00:08:51,980 file system it uses for persistent storage. And then there is a log-to-flash 83 00:08:51,980 --> 00:08:57,320 option, and the possibility to embed a token that will tell the system to unlock all 84 00:08:57,320 --> 00:09:02,850 debug access, which has to be signed by Intel, so it's not really of any use to us. 85 00:09:02,850 --> 00:09:07,439 And then there is something interesting: ROM bypass. Like I said, you can't get 86 00:09:07,439 --> 00:09:13,160 access to the ROM without running code on it. And the ROM is mask ROM, so it's internal 87 00:09:13,160 --> 00:09:17,540 to the chip. But Intel has to develop new ROM code and has to test it without 88 00:09:17,540 --> 00:09:23,270 respinning the die every time. 
So they have a possibility on an unlocked 89 00:09:23,270 --> 00:09:28,170 preproduction chipset to completely bypass the internal ROM and load even the early 90 00:09:28,170 --> 00:09:33,670 boot code from the flash chip. Some of these images have leaked and you can use 91 00:09:33,670 --> 00:09:39,250 them to get a look at the ROM code, even without being able to dump it. That's 92 00:09:39,250 --> 00:09:45,610 going to be really useful later on. So then you've got these code partitions and 93 00:09:45,610 --> 00:09:51,230 they contain a whole lot of files. So there are the binaries themselves, which 94 00:09:51,230 --> 00:09:57,569 don't have any extension. There are the metadata files. So the binary format they 95 00:09:57,569 --> 00:10:05,350 use has no headers, nothing included. And all of that data is in the metadata file. 96 00:10:05,350 --> 00:10:12,000 And when you use the unME11 tool, it'll actually convert those to text 97 00:10:12,000 --> 00:10:16,069 files for you, so you can just get started without really understanding how they 98 00:10:16,069 --> 00:10:26,640 work. Yes. So the metadata. It's a type-length-value 99 00:10:26,640 --> 00:10:31,180 structure, which contains a whole lot of information the operating system needs. It has the info on the 100 00:10:31,180 --> 00:10:35,820 module, whether it's data or code, where it should be loaded, what the privileges 101 00:10:35,820 --> 00:10:43,390 of the process should be, a SHA checksum for validating it and also some 102 00:10:43,390 --> 00:10:49,000 higher-level stuff such as device file definitions if it's a device driver or any 103 00:10:49,000 --> 00:10:55,430 other kind of server. I've actually written some code that uses this, that's 104 00:10:55,430 --> 00:11:01,460 on GitHub, so if you want a closer look at it, some of the slides have a link to 105 00:11:01,460 --> 00:11:09,780 a file in there which contains the full definitions. Right. 
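A minimal sketch of walking a type-length-value file like the metadata described above. The field widths are an assumption, not something stated in the talk: each record is taken to be a 32-bit little-endian type, a 32-bit length that covers the whole record including its 8-byte header, and then the payload. The type numbers in the sample are hypothetical.

```python
import struct

HEADER_SIZE = 8  # assumed: u32 type + u32 length


def parse_tlv(blob):
    """Split a byte string into (type, payload) records.

    Assumed layout (hypothetical): u32 type, u32 length, payload,
    where length counts the 8-byte header as well.
    """
    records = []
    offset = 0
    while offset < len(blob):
        if offset + HEADER_SIZE > len(blob):
            raise ValueError("truncated record header")
        rec_type, rec_len = struct.unpack_from("<II", blob, offset)
        if rec_len < HEADER_SIZE or offset + rec_len > len(blob):
            raise ValueError("bad record length")
        records.append((rec_type, blob[offset + HEADER_SIZE:offset + rec_len]))
        offset += rec_len
    return records


def build_tlv(rec_type, payload):
    """Build one record in the same assumed layout (used for the demo)."""
    return struct.pack("<II", rec_type, HEADER_SIZE + len(payload)) + payload


# Two made-up records: type 1 with four payload bytes, type 2 with none.
sample = build_tlv(1, b"\x01\x02\x03\x04") + build_tlv(2, b"")
records = parse_tlv(sample)
```

Unknown record types can simply be kept as raw payloads, which is usually enough to start annotating a file before every extension is understood.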
So all the code 106 00:11:09,780 --> 00:11:16,801 on the ME is signed and verified by Intel. So you can't just go and put in a new 107 00:11:16,801 --> 00:11:24,689 binary and say, hey, let's run this. The way they do this is, in Intel's 108 00:11:24,689 --> 00:11:30,300 manufacture-time fuses, they have a hash of the public key that they use to sign 109 00:11:30,300 --> 00:11:36,070 it. And then on each flash partition, there is a manifest which is signed by the 110 00:11:36,070 --> 00:11:40,820 key and it contains the SHA hashes for all the metadata files, which then contain a 111 00:11:40,820 --> 00:11:47,150 SHA hash for the code files. There don't seem to be any major problems in verifying 112 00:11:47,150 --> 00:11:52,530 this, so it's useful to know, but you're not really gonna use this. And then 113 00:11:52,530 --> 00:12:00,300 the modules themselves, as I've said, they're flat binaries. Mostly. The 114 00:12:00,300 --> 00:12:05,560 metadata contains all the info the kernel uses to reconstruct the actual program 115 00:12:05,560 --> 00:12:13,530 image in memory. And a curious thing here is that the actual base address for all 116 00:12:13,530 --> 00:12:17,459 the modules, for all programs, is the same across an image. So if you have a 117 00:12:17,459 --> 00:12:19,930 different version, it's going to be different. But if you have two programs 118 00:12:19,930 --> 00:12:25,949 from the same firmware, they're gonna be loaded at the same virtual address. Right. 119 00:12:25,949 --> 00:12:32,820 So when you want to look at it, you're gonna load it in some disassembler, like 120 00:12:32,820 --> 00:12:39,540 for example IDA, and you'll see this. It disassembles fine, but it's gonna 121 00:12:39,540 --> 00:12:44,270 reference all kinds of memory that you don't have access to. So usually you'd 122 00:12:44,270 --> 00:12:49,459 think, maybe I've loaded it at a wrong address, or am I missing some library? 
Well, 123 00:12:49,459 --> 00:12:55,150 here you've loaded it correctly if you used the address from the metadata file. 124 00:12:55,150 --> 00:13:02,310 But you are in fact missing a lot of memory segments. And let's just take a 125 00:13:02,310 --> 00:13:09,829 look at each of these. It's calling into something, which is code. It's pushing a pointer 126 00:13:09,829 --> 00:13:15,890 there, which is data. And what's that? So it has shared libraries. Even though it's 127 00:13:15,890 --> 00:13:19,920 flat binaries, it actually does use shared libraries, because you only have 1.5 128 00:13:19,920 --> 00:13:24,319 megabytes of RAM. You don't want to link your C library into everything and 129 00:13:24,319 --> 00:13:32,800 waste what little memory you have. So there is the main system library, which is 130 00:13:32,800 --> 00:13:39,270 like libc on a Linux system. It's in a flash partition, so you can actually just 131 00:13:39,270 --> 00:13:45,689 load it and take a look at it easily, and it starts out with a jump table. So 132 00:13:45,689 --> 00:13:48,770 there's no symbols in the metadata file or anything. It doesn't do dynamic linking. 133 00:13:48,770 --> 00:13:56,549 It loads the pages for the shared library at a fixed address, which is also in the 134 00:13:56,549 --> 00:14:01,620 shared library's metadata. And then it's just there in the process's memory and 135 00:14:01,620 --> 00:14:06,130 it's gonna jump there if it needs a function. And the functions themselves are 136 00:14:06,130 --> 00:14:12,890 just using the normal System V x86 calling conventions. So it's pretty easy 137 00:14:12,890 --> 00:14:17,980 to look at that using your normal tools. There's no weird register argument passing 138 00:14:17,980 --> 00:14:24,559 going on here. So, right. Now, shared libraries. There's two of them. And this 139 00:14:24,559 --> 00:14:28,160 is where it gets annoying. 
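A minimal sketch of resolving the jump table at the start of a shared library that is mapped at a fixed address, as described above. The slot format is an assumption, not something stated in the talk: each slot is taken to be a 5-byte x86 `jmp rel32` instruction (opcode 0xE9). The base address and the targets in the demo are hypothetical.

```python
import struct

JMP_REL32 = 0xE9   # x86 opcode for a near jump with a 32-bit displacement
SLOT_SIZE = 5      # one opcode byte plus four displacement bytes (assumed slot format)


def resolve_jump_table(image, base, count):
    """Return the destination address of the first `count` jump-table slots.

    `image` holds the library bytes and `base` is the address it is mapped at.
    """
    targets = []
    for index in range(count):
        offset = index * SLOT_SIZE
        if image[offset] != JMP_REL32:
            raise ValueError("slot %d is not a jmp rel32" % index)
        (displacement,) = struct.unpack_from("<i", image, offset + 1)
        # The displacement is relative to the address of the next instruction.
        targets.append((base + offset + SLOT_SIZE + displacement) & 0xFFFFFFFF)
    return targets


# Demo with a made-up library mapped at 0x1000 whose two slots jump to
# 0x1100 and 0x1200.
base = 0x1000
image = (bytes([JMP_REL32]) + struct.pack("<i", 0x1100 - (base + 5)) +
         bytes([JMP_REL32]) + struct.pack("<i", 0x1200 - (base + 10)))
targets = resolve_jump_table(image, base, 2)
```

With the slot addresses and their targets in hand, a call into the table can be renamed in the disassembler once and reused across every module from the same image.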
The system library, you've got access to that, so you 140 00:14:28,160 --> 00:14:32,850 can just take your time and go through it and try to figure out, you know, oh, hey, 141 00:14:32,850 --> 00:14:39,880 is this open or is this read or what's this function doing? But then there's also 142 00:14:39,880 --> 00:14:49,150 a second, really large library, which is in ROM. It has all the C library 143 00:14:49,150 --> 00:14:54,300 functions and some of their custom helper routines that don't interact with the 144 00:14:54,300 --> 00:15:00,920 kernel directly, such as string functions. They live in ROM. So when 145 00:15:00,920 --> 00:15:04,700 you've got your code, and this is basically where I was when I was here last year, 146 00:15:04,700 --> 00:15:07,040 you're looking through it and you're seeing calls to a function you don't have 147 00:15:07,040 --> 00:15:11,010 the code for all over the place. And you have to figure out by its signature what 148 00:15:11,010 --> 00:15:14,870 it is doing. And that works for some of the functions and it's really difficult 149 00:15:14,870 --> 00:15:20,610 for other ones. That really had me stopped for a while. Then I managed to find one of 150 00:15:20,610 --> 00:15:25,070 these ROM bypass images and I had the code for a very early development build of the 151 00:15:25,070 --> 00:15:29,370 ROM. This is where I got lucky. So the actual entry point addresses are fixed 152 00:15:29,370 --> 00:15:33,939 across an entire chipset family. So if you have an image for the server version of 153 00:15:33,939 --> 00:15:39,310 like the 100 series chipset, or for the client version, for a desktop or laptop 154 00:15:39,310 --> 00:15:47,540 version, it's all gonna be the same ROM addresses. So even though the code might 155 00:15:47,540 --> 00:15:51,930 be different, you'll have the jump table, which means the addresses can stay fixed. 156 00:15:51,930 --> 00:15:56,760 So this only needs to be done once. 
And in fact, when I upload my slides later, there 157 00:15:56,760 --> 00:16:02,919 is a slide in there at the end that has the addresses for the most used functions. 158 00:16:02,919 --> 00:16:07,350 So you're not going to have to repeat that work, at least not for this chipset. So if 159 00:16:07,350 --> 00:16:15,160 you want to look at a simple module, you've loaded it, now you've applied the 160 00:16:15,160 --> 00:16:21,860 things I just said, and you still don't have the data sections. I don't know 161 00:16:21,860 --> 00:16:26,669 what that function there is doing, but it's not very important. It actually 162 00:16:26,669 --> 00:16:33,230 returns a value, I think, that's not used anywhere, but it must have a purpose 163 00:16:33,230 --> 00:16:40,220 because it's there. Right. So then you look at the entry point and this is a lot 164 00:16:40,220 --> 00:16:44,660 of stuff. And the main thing that matters here is, on the right half of the screen 165 00:16:44,660 --> 00:16:50,189 there is a listing from a MINIX repository and on the left half there is a 166 00:16:50,189 --> 00:16:54,809 disassembly from an ME module. So it's mostly the same. There is one key 167 00:16:54,809 --> 00:16:58,419 difference, though. The ME module actually has a little bit of code that runs before 168 00:16:58,419 --> 00:17:06,230 its C library startup function. And that function actually does all the ME-specific 169 00:17:06,230 --> 00:17:13,980 initialization. It does a lot of stuff related to how C library data is kept, 170 00:17:13,980 --> 00:17:21,520 because there are also no data segments for the C library being allocated by the 171 00:17:21,520 --> 00:17:25,820 kernel. So each process actually reserves a part of its own memory and tells the C 172 00:17:25,820 --> 00:17:31,290 library, like, any global variables you can store in there. But when you look at 173 00:17:31,290 --> 00:17:37,610 that function, one of the most important things that it calls is this function. 
174 00:17:37,610 --> 00:17:41,510 It's very simple, it just copies a bunch of RAM. So they don't have support for 175 00:17:41,510 --> 00:17:46,650 initialized data sections. It's a flat binary. What they do is they actually 176 00:17:46,650 --> 00:17:51,520 use the .bss segment, the zeroed segment at the end of the address space, and copy 177 00:17:51,520 --> 00:17:57,070 over a bunch of data from the program. The program itself is not aware of this. It's 178 00:17:57,070 --> 00:18:04,180 really in the initialization code and in the linker script. So this is also something 179 00:18:04,180 --> 00:18:09,170 that's very important, because at that address in the 180 00:18:09,170 --> 00:18:13,310 data section, you're also going to need to load the last bit of the binary. 181 00:18:13,310 --> 00:18:20,520 Otherwise you're missing constants, or at least initialization values. Right. Then 182 00:18:20,520 --> 00:18:26,150 there is the full memory map of the processes themselves. It's a flat 32-bit 183 00:18:26,150 --> 00:18:31,970 address space. It's got everything you expect in there. It's got a stack and a 184 00:18:31,970 --> 00:18:39,500 heap and everything. There's a little bit of heap allocated right on initialization. 185 00:18:39,500 --> 00:18:44,690 This is basically how you derive the address space layout from the 186 00:18:44,690 --> 00:18:51,100 metadata, especially the data segment. And for the stack itself, 187 00:18:51,100 --> 00:18:56,180 the address varies a lot because of the number of threads that are 188 00:18:56,180 --> 00:19:03,380 in use or the size of data sections. And also those stack guards, they're not 189 00:19:03,380 --> 00:19:07,960 really stack guards. There is also metadata for each thread in there. But 190 00:19:07,960 --> 00:19:13,640 that's nothing that's relevant to the process itself, only to the kernel. 
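A minimal sketch of the loading step described above: the flat binary goes at its code base, and its tail is additionally loaded at the start of the data area so that initialized values are present when you disassemble. All names, addresses and sizes here are hypothetical; in practice they would come from the module's metadata and from the copy call in the startup code.

```python
def build_segments(binary, code_base, data_addr, init_size):
    """Return (address, bytes) pairs to load into a disassembler.

    The whole flat binary is mapped at `code_base`. Its last `init_size`
    bytes are also mapped at `data_addr`, mimicking the copy that the
    startup code performs at run time.
    """
    if not 0 < init_size <= len(binary):
        raise ValueError("init_size must be within the binary")
    return [
        (code_base, binary),
        (data_addr, binary[-init_size:]),
    ]


# Demo with a made-up 12-byte binary whose last 4 bytes are initial data.
binary = b"CODECODE" + b"\x2a\x00\x00\x00"
segments = build_segments(binary, 0x10000, 0x20000, 4)
```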
And 191 00:19:13,640 --> 00:19:21,890 well, if you then skip forward a bit and you've done all of this, you look at a 192 00:19:21,890 --> 00:19:28,790 simple driver like this. This is taken from a driver used to talk to the CPU, 193 00:19:28,790 --> 00:19:34,630 like, OK. So when I say CPU or host, by the way, I mean the CPU, like your big 194 00:19:34,630 --> 00:19:39,370 Skylake, or Kaby Lake, or Coffee Lake, whatever, your big CPU that runs your own 195 00:19:39,370 --> 00:19:46,070 operating system. Right. So this is used to send messages there. But if you look 196 00:19:46,070 --> 00:19:51,680 at what's going on here, OK - think I had a problem with the animation here - it 197 00:19:51,680 --> 00:19:57,000 sets up some stuff and then it calls a library function that's in the main syslib 198 00:19:57,000 --> 00:20:01,270 library, which actually has a main loop for the program. That's because Intel was 199 00:20:01,270 --> 00:20:06,440 smart and they added a nice framework for programs implementing device drivers, 200 00:20:06,440 --> 00:20:10,130 because it's a microkernel, so device drivers are just usual programs, calling 201 00:20:10,130 --> 00:20:20,060 specific APIs. Then there's normal POSIX file I/O. No standard I/O, but it has all 202 00:20:20,060 --> 00:20:26,530 the normal open, and read, and ioctl and so on functions. And then there's 203 00:20:26,530 --> 00:20:30,170 more initialization for the srv library. And this is basically what all the simple 204 00:20:30,170 --> 00:20:38,890 drivers look like in it. And then there's this. Because they're so low on memory, 205 00:20:38,890 --> 00:20:50,040 they don't actually use standard I/O, or even printf itself, to do most of the 206 00:20:50,040 --> 00:20:54,820 debugging. It uses a thing that's called "sven", I'll touch on that later. So there 207 00:20:54,820 --> 00:20:59,150 are the familiar APIs that I talked about. 
It even has POSIX threads, or at least a 208 00:20:59,150 --> 00:21:04,510 subset of them, and there are all the functions that you'd expect to find on 209 00:21:04,510 --> 00:21:08,700 some generic Unix machine. So that shouldn't be too much of a problem to deal 210 00:21:08,700 --> 00:21:14,570 with, but then there's also their own tracing solution, sven. That's what Intel 211 00:21:14,570 --> 00:21:17,350 calls it. The name is in all the development tools that you can download 212 00:21:17,350 --> 00:21:23,370 from their site, and basically, they don't include format strings for a lot of the 213 00:21:23,370 --> 00:21:28,390 stuff. They just have a 32-bit identifier that is sent over the debug port, and it 214 00:21:28,390 --> 00:21:34,270 refers to a format string in a dictionary that you don't have. There is one of the 215 00:21:34,270 --> 00:21:38,820 dictionaries for a server chip that's floating around the internet, but even 216 00:21:38,820 --> 00:21:45,940 that is incomplete. And the normal non-NDA version of the Intel developer tools has 217 00:21:45,940 --> 00:21:53,810 some 50 format strings for really common status messages it might output, but yeah, 218 00:21:53,810 --> 00:21:57,391 like, if you see these functions, just realize it's doing some debug print. It 219 00:21:57,391 --> 00:22:00,550 might be dumping some state or just telling you it's gonna do something else. 220 00:22:00,550 --> 00:22:12,020 No important logic actually happens in here. Right. So then for device files. 221 00:22:12,020 --> 00:22:16,190 They're actually defined in a manifest. 
When the kernel loads a program, and that 222 00:22:16,190 --> 00:22:20,830 program wants to expose some kind of interface to other programs, its manifest 223 00:22:20,830 --> 00:22:27,780 will contain, or its metadata file will contain, a special file producer entry, and 224 00:22:27,780 --> 00:22:33,120 that says, you know, you have these device files, with a name, and an access mode and 225 00:22:33,120 --> 00:22:39,210 the user and group ID, and everything, and the minor numbers, and the kernel 226 00:22:39,210 --> 00:22:42,830 sends this to the- or not the kernel- the program loader sends this to the virtual 227 00:22:42,830 --> 00:22:47,720 file system server and it automatically gets a device file, pointing to the right 228 00:22:47,720 --> 00:22:51,800 major and minor number. And then there's also a library, as I said, to provide a 229 00:22:51,800 --> 00:23:03,680 framework for a driver. And that looks like this. It's really easy to use. If you 230 00:23:03,680 --> 00:23:08,070 were an ME developer, you'd just write some callbacks for open, and close, and 231 00:23:08,070 --> 00:23:11,000 everything, and it automatically calls them for you when a message comes in 232 00:23:11,000 --> 00:23:15,400 telling you that that happened, which also makes it really easy to reverse engineer, 233 00:23:15,400 --> 00:23:21,100 'cause if you look at a driver, it just loads some callbacks, and you can know, by 234 00:23:21,100 --> 00:23:27,510 their offset in a structure, what actual call they're implementing. Right, so then 235 00:23:27,510 --> 00:23:31,950 there is one of the more weird things that's going on here: how the actual 236 00:23:31,950 --> 00:23:37,470 userland programs get access to memory-mapped registers. There's a lot of this going on. 237 00:23:37,470 --> 00:23:42,830 Calls to a couple of functions that have some magic arguments. 
The second one you 238 00:23:42,830 --> 00:23:50,640 can easily tell is the offset, because it increases in very nice power-of-two 239 00:23:50,640 --> 00:23:54,670 steps, so it's probably the register offset, and then what comes after it 240 00:23:54,670 --> 00:24:00,160 looks like a value. And then the first bit seems to be a magic number. Well, it's 241 00:24:00,160 --> 00:24:05,479 not. There is also an extension in the metadata, saying these are the memory 242 00:24:05,479 --> 00:24:12,170 mapped I/O ranges, and those ranges, they each list a physical base address, 243 00:24:12,170 --> 00:24:19,360 and a size, and permissions for them. Then the index in that list does not directly 244 00:24:19,360 --> 00:24:23,150 correspond to the magic value. For the magic value you actually need to do a little 245 00:24:23,150 --> 00:24:27,680 computation on the index, and then you can access it through those functions. The 246 00:24:27,680 --> 00:24:38,600 computation itself might be familiar. Yeah, so these are the functions. The 247 00:24:38,600 --> 00:24:44,610 value is a segment selector. So they actually don't use paging for inter-process 248 00:24:44,610 --> 00:24:51,820 isolation, they use segments, like x86 protected mode segments. And for each 249 00:24:51,820 --> 00:24:56,610 memory mapped I/O range there is a separate segment, and you manually specify 250 00:24:56,610 --> 00:25:04,280 that, which is just weird to me, like, why would you use x86 segmenting on a modern 251 00:25:04,280 --> 00:25:10,610 system? MINIX does it, but, yeah, to extend that even to this? Luckily, the normal 252 00:25:10,610 --> 00:25:16,130 address space is flat, like, to the process, not to the kernel. Right, so now 253 00:25:16,130 --> 00:25:24,870 we can access memory mapped I/O. That's all the, like, really high-level stuff. 254 00:25:24,870 --> 00:25:28,700 So what's going on under there? 
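A minimal sketch of the "little computation" behind an x86 segment selector, since the magic first argument of those register access functions is one. The bit layout is the standard protected mode one: bits 15..3 hold the descriptor index, bit 2 selects GDT or LDT, and bits 1..0 are the requested privilege level. How a descriptor index maps to a position in the metadata's MMIO range list is not stated here, so the sketch only decodes and encodes selectors; the example values are arbitrary.

```python
def decode_selector(selector):
    """Split a 16-bit x86 segment selector into its three fields."""
    return {
        "index": selector >> 3,                      # descriptor table index
        "table": "LDT" if selector & 0x4 else "GDT",  # table indicator bit
        "rpl": selector & 0x3,                       # requested privilege level
    }


def make_selector(index, use_ldt, rpl):
    """Build a selector from a descriptor index, table choice and privilege level."""
    return (index << 3) | (0x4 if use_ldt else 0) | (rpl & 0x3)


# Arbitrary example: 0x1F decodes to index 3 in the LDT at privilege level 3.
fields = decode_selector(0x1F)
```

Decoding the constant in each register access call this way gives a small integer that can then be matched against the MMIO entries of that module.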
It's got all the basic microkernel stuff, so 255 00:25:28,700 --> 00:25:33,020 message passing, and then some optimizations to actually make it perform 256 00:25:33,020 --> 00:25:40,140 well on a really slow CPU. The basics are, you can send a message, you can receive a 257 00:25:40,140 --> 00:25:46,160 message, and you can send and receive a message, where you basically say "Send a 258 00:25:46,160 --> 00:25:50,930 message, wait till a response comes in, then continue", which is used to wrap 259 00:25:50,930 --> 00:25:58,400 function calls. This is mostly the same as in MINIX. There's some subtle changes, 260 00:25:58,400 --> 00:26:08,230 which I'll get to later. And then memory grants are something that only appeared in 261 00:26:08,230 --> 00:26:13,080 MINIX really recently. It's a way for a process to basically create a new name for 262 00:26:13,080 --> 00:26:16,690 a piece of memory it has, and give a different process access to it, just by 263 00:26:16,690 --> 00:26:21,630 sharing the number. These are referred to by the process ID and a number for that 264 00:26:21,630 --> 00:26:28,470 range. The numbers are actually local per process, so to uniquely identify 265 00:26:28,470 --> 00:26:35,461 one you need to say process ID plus that number, and they're only granted to a 266 00:26:35,461 --> 00:26:38,300 single process. So when a process creates one of these, it can't even access it 267 00:26:38,300 --> 00:26:42,490 itself, unless it creates a grant for itself, which is not really that useful, 268 00:26:42,490 --> 00:26:51,880 usually. These grants are used to prevent having to copy over all the data inside 269 00:26:51,880 --> 00:26:57,500 the IPC message used to implement a system call. Yeah, these are the basic operations 270 00:26:57,500 --> 00:27:03,190 on it. You can create one, you can copy into and from it. So, you can't actually 271 00:27:03,190 --> 00:27:07,010 map it. 
A process that receives one of these has to say to the kernel, using a 272 00:27:07,010 --> 00:27:12,721 system call, "please write this data into that area of memory that belongs to a 273 00:27:12,721 --> 00:27:17,930 different process." And then there's also indirect grants, because, you know, in 274 00:27:17,930 --> 00:27:25,309 MINIX they do have this, but also only recently, and usually if you have a 275 00:27:25,309 --> 00:27:30,360 microkernel system, you would have to copy your buffer for a read call first to the 276 00:27:30,360 --> 00:27:36,540 file system server and then on to, like, either the hard disk driver, or the device 277 00:27:36,540 --> 00:27:40,620 driver that's implementing a device file. So the ME actually allows you to create a 278 00:27:40,620 --> 00:27:45,860 grant pointing to a grant that was given to you by someone else. And then that 279 00:27:45,860 --> 00:27:52,820 grant will inherit the privileges of the process that creates it, combined with 280 00:27:52,820 --> 00:27:57,530 those that it assigns to it. So if the process has a read/write grant it can 281 00:27:57,530 --> 00:28:01,340 create a read-only or write-only grant, but if it only has a read 282 00:28:01,340 --> 00:28:08,860 grant, it cannot add write rights to it for a different process, obviously. So 283 00:28:08,860 --> 00:28:12,880 then there are also some big differences from MINIX. In MINIX you address a process 284 00:28:12,880 --> 00:28:18,080 by its process ID or thread ID with a generation number attached to it. In the 285 00:28:18,080 --> 00:28:25,440 ME you can actually address IPC to a file descriptor. The kernel doesn't actually know a 286 00:28:25,440 --> 00:28:28,610 lot about file descriptors, it just implements the basic thing where you have 287 00:28:28,610 --> 00:28:32,350 a list of files and each process has a list of file descriptors assigning integer 288 00:28:32,350 --> 00:28:39,320 numbers to those files to refer to them by. 
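A minimal sketch of the grant rules described above, as a toy model rather than the ME kernel's actual data structures: a grant is named by (owner process ID, per-process number), it is usable only by the single process it was granted to, data is copied through the kernel instead of being mapped, and an indirect grant can narrow but never widen the rights of its parent. The class, method names and rights constants are all hypothetical.

```python
READ, WRITE = 1, 2  # hypothetical rights bits


class GrantTable:
    """Toy kernel-side table of memory grants."""

    def __init__(self):
        self.grants = {}       # (owner, number) -> grant record
        self.next_number = {}  # owner -> next free per-process number

    def create(self, owner, grantee, buffer, rights):
        """Create a grant on `buffer`, usable only by `grantee`."""
        number = self.next_number.get(owner, 0)
        self.next_number[owner] = number + 1
        self.grants[(owner, number)] = {
            "grantee": grantee, "buffer": buffer, "rights": rights}
        return number

    def create_indirect(self, owner, grantee, parent, rights):
        """Pass on a grant that `owner` itself received from someone else."""
        parent_grant = self.grants[parent]
        if parent_grant["grantee"] != owner:
            raise PermissionError("parent grant was not given to this process")
        # Effective rights are the parent's rights combined with the requested ones.
        return self.create(owner, grantee, parent_grant["buffer"],
                           parent_grant["rights"] & rights)

    def copy_to(self, caller, owner, number, data, offset=0):
        """Kernel-mediated write into the granted memory."""
        grant = self.grants[(owner, number)]
        if grant["grantee"] != caller or not grant["rights"] & WRITE:
            raise PermissionError("caller may not write through this grant")
        if offset + len(data) > len(grant["buffer"]):
            raise ValueError("write exceeds granted range")
        grant["buffer"][offset:offset + len(data)] = data


table = GrantTable()
buf = bytearray(4)
g = table.create(owner=1, grantee=2, buffer=buf, rights=READ | WRITE)
ro = table.create_indirect(owner=2, grantee=3, parent=(1, g), rights=READ)
table.copy_to(caller=2, owner=1, number=g, data=b"ab")
```

In the demo, process 2 can write through the grant it received, while process 3 only ever holds a read-only view of the same buffer.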
And this is used so that, as a 289 00:28:39,320 --> 00:28:43,040 process, you can actually directly talk to a device driver without knowing what its 290 00:28:43,040 --> 00:28:47,110 process ID is. So you don't send it to the file system server, you send it to the 291 00:28:47,110 --> 00:28:51,740 file descriptor and the kernel just magically redirects it for you. And they 292 00:28:51,740 --> 00:28:55,550 moved select into the kernel so you can tell the kernel: "Hey, I want to wait till 293 00:28:55,550 --> 00:28:59,720 the file system server tells me that it has data available or till a message comes 294 00:28:59,720 --> 00:29:05,440 in." This is one of the most complicated system calls the ME offers that's used in 295 00:29:05,440 --> 00:29:12,010 a normal program. You can mostly ignore it and just look at it like: "Hey, those arguments 296 00:29:12,010 --> 00:29:16,760 sort of define a file descriptor set as a bit field." And then there's the message 297 00:29:16,760 --> 00:29:21,040 that might have been received and there's DMA locks because you don't just want to 298 00:29:21,040 --> 00:29:24,790 write to registers. You actually might want to do direct memory access from 299 00:29:24,790 --> 00:29:30,720 hardware, so you can actually tell the kernel to lock one of these memory grants 300 00:29:30,720 --> 00:29:38,260 in RAM for you, so it won't be swapped out anymore. And yeah, it will even tell you 301 00:29:38,260 --> 00:29:42,020 the physical address so you can just load that into a register and it's not really 302 00:29:42,020 --> 00:29:46,760 that complicated. Just lock it, get a physical address, write it into the register 303 00:29:46,760 --> 00:29:53,580 and continue. Well, that's the most important stuff about the operating 304 00:29:53,580 --> 00:29:58,929 system. The hardware itself is a lot more complicated because the operating system, 305 00:29:58,929 --> 00:30:03,300 once you have the code, you can just reverse engineer it and get to know it.
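The remark that the select arguments "sort of define a file descriptor set as a bit field" can be made concrete with a small Python sketch. It assumes the common convention that bit N stands for file descriptor N; the talk does not give the actual layout, and the function names are made up for illustration.

```python
def fds_to_bitfield(fds):
    """Pack file descriptor numbers into an integer, one bit per descriptor."""
    bits = 0
    for fd in fds:
        if fd < 0:
            raise ValueError("file descriptors are non-negative")
        bits |= 1 << fd
    return bits


def bitfield_to_fds(bits):
    """Unpack a bit field back into a sorted list of descriptor numbers."""
    return [fd for fd in range(bits.bit_length()) if (bits >> fd) & 1]
```

When reading a disassembly, `bitfield_to_fds` is the direction that matters: a constant such as `0b101001` passed to the call would, under this assumption, mean descriptors 0, 3 and 5.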
306 00:30:03,300 --> 00:30:11,010 The hardware. Well, let's just say it's a real pain to have to reverse engineer a 307 00:30:11,010 --> 00:30:16,179 piece of hardware together with its driver. Like if you've got the driver 308 00:30:16,179 --> 00:30:18,450 code, but you don't know what the registers do. So you don't know what a lot 309 00:30:18,450 --> 00:30:24,440 of logic does. And you're trying to both figure out what the logic is and what the 310 00:30:24,440 --> 00:30:30,050 actual registers do. Right. So first you want to know which physical address goes 311 00:30:30,050 --> 00:30:39,881 where. The metadata listings I showed you actually have names in there. Those are 312 00:30:39,881 --> 00:30:47,940 not in the metadata files themselves, I annotated those. So you just see the 313 00:30:47,940 --> 00:30:56,680 physical address and size. But there is one module, the bus driver module, and the 314 00:30:56,680 --> 00:31:04,230 bus driver is a normal user process, but it implements stuff like PCI configuration 315 00:31:04,230 --> 00:31:09,550 space accesses and those things. And it has a nice table in it with names for 316 00:31:09,550 --> 00:31:17,049 devices. So if you just run strings on it, you'll see these things. When I saw this, 317 00:31:17,049 --> 00:31:20,960 I was pretty glad because at least I could make sense of what device was being 318 00:31:20,960 --> 00:31:26,680 talked to in a certain program. So the bus driver does all these things. It 319 00:31:26,680 --> 00:31:30,990 manages power getting to devices, it manages configuration space access, it 320 00:31:30,990 --> 00:31:35,960 manages the different kinds of buses and IOMMUs that are on the system. And it makes 321 00:31:35,960 --> 00:31:39,500 sure that the normal driver never has to know any of these details. It just asks 322 00:31:39,500 --> 00:31:45,520 it for a device by a number assigned to it at build time.
And then the bus driver 323 00:31:45,520 --> 00:31:50,360 says, OK, here's a range of physical address space you can now write to. So 324 00:31:50,360 --> 00:31:56,640 that's a really nice abstraction and also gives us a lot of information because the 325 00:31:56,640 --> 00:32:01,640 really old builds for Sunrise Point actually have a hell of a lot of debug 326 00:32:01,640 --> 00:32:07,021 strings in there as printf format strings, not as catalogue IDs. It's 327 00:32:07,021 --> 00:32:11,910 one of the only pieces of code for the ME that does this, so that already tells you 328 00:32:11,910 --> 00:32:15,480 a lot. And then there's also the table that I just talked about that has the 329 00:32:15,480 --> 00:32:23,760 actual info on the devices and names. So I generated some DokuWiki content from this 330 00:32:23,760 --> 00:32:28,570 that I use myself and this is what's in the table, part of it. So it tells you 331 00:32:28,570 --> 00:32:33,070 what address PCI configuration space lives at. That also tells you the bus, device and 332 00:32:33,070 --> 00:32:38,130 function for it. It tells you on what chipset SKU they're present using 333 00:32:38,130 --> 00:32:44,640 a bitfield. And it tells you their names in different fields. It also contains the 334 00:32:44,640 --> 00:32:48,540 values that are used to write the base address registers for PCI. So also their 335 00:32:48,540 --> 00:32:54,190 normal memory ranges. And there's even more devices. So the ME has access to a 336 00:32:54,190 --> 00:32:58,860 lot of stuff. A lot of it is private to it. A lot of it is components that also 337 00:32:58,860 --> 00:33:06,110 exist in the rest of the computer. And there's not a lot of information. A lot of 338 00:33:06,110 --> 00:33:11,410 these are basically all the things that are out there together with conference 339 00:33:11,410 --> 00:33:15,140 slides published by other people who have done research on the ME.
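The point that a device's configuration space address also gives you its bus, device and function can be illustrated with a short Python sketch. It assumes the window is laid out like standard PCI Express ECAM (bus in bits 27:20, device in bits 19:15, function in bits 14:12 of the offset); whether the ME's table uses exactly this layout is not stated in the talk, and the window base used in the example is made up.

```python
from typing import NamedTuple


class BDF(NamedTuple):
    bus: int
    device: int
    function: int


def bdf_from_config_address(address: int, window_base: int) -> BDF:
    """Split an ECAM-style configuration space address into bus/device/function."""
    offset = address - window_base
    if offset < 0:
        raise ValueError("address lies below the configuration window")
    return BDF(
        bus=(offset >> 20) & 0xFF,      # 8 bits of bus number
        device=(offset >> 15) & 0x1F,   # 5 bits of device number
        function=(offset >> 12) & 0x7,  # 3 bits of function number
    )
```

With a hypothetical window base of `0xE0000000`, the address `0xE00B0000` would decode to bus 0, device 22, function 0.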
I didn't have 340 00:33:15,140 --> 00:33:21,980 time to add links to those, but they're easy to find on Google. I'll get to this 341 00:33:21,980 --> 00:33:28,230 later, I actually wrote an emulator for the ME, a partial emulator to be able to run 342 00:33:28,230 --> 00:33:34,230 ME code and analyze it, which obviously needs to know a bit about the hardware so 343 00:33:34,230 --> 00:33:41,030 you can look at that. There are some files in Intel's debugger package, 344 00:33:41,030 --> 00:33:46,150 specific versions of that, that have really detailed info on some of the devices, also 345 00:33:46,150 --> 00:33:51,460 not all of it. And I wrote some tool to parse some of the files. It's really rough 346 00:33:51,460 --> 00:33:57,040 code. I published it because people wanted to see what I was doing. It doesn't work 347 00:33:57,040 --> 00:34:04,080 out of the box. And there is a nice talk on this by Mark Ermolov and Maxim 348 00:34:04,080 --> 00:34:06,870 Goryachy. Actually I don't know if I'm pronouncing that correctly, but they've 349 00:34:06,870 --> 00:34:12,049 done a lot of work on the ME and this particular talk by them is really useful. 350 00:34:12,049 --> 00:34:16,339 And then there's also something else. There is a second ME on server chipsets, 351 00:34:16,339 --> 00:34:21,299 the Innovation Engine. It's basically a copy paste of the ME to provide an ME that 352 00:34:21,299 --> 00:34:24,760 the vendor can write code for. Don't think it's used a lot. I've only been able to 353 00:34:24,760 --> 00:34:31,639 find HP software that actually targets it and that has some more debug strings, but 354 00:34:31,639 --> 00:34:36,639 also not a lot, it mostly has a table containing register names, but they're 355 00:34:36,639 --> 00:34:41,869 really abbreviated. And for a really small subset of the devices, there is 356 00:34:41,869 --> 00:34:48,280 documentation out there in a Pentium N and J series datasheet.
It seems like they 357 00:34:48,280 --> 00:34:52,409 compile their, a lot of code or whatever, with the wrong defines because it doesn't 358 00:34:52,409 --> 00:35:00,350 actually fit into the manual that well, it's just a section that has like some 20 359 00:35:00,350 --> 00:35:08,640 tables that shouldn't be in there. So this is from that talk I just referenced and 360 00:35:08,640 --> 00:35:12,609 it's an overview of the Innovation Engine and the bus bridges and everything in 361 00:35:12,609 --> 00:35:20,070 there. This isn't very precise. So based on some of those files from System Studio, 362 00:35:20,070 --> 00:35:24,500 I tried to get a better understanding of this, which is this. This is the entire 363 00:35:24,500 --> 00:35:29,760 chipset. The little DMI block in the top left corner is what connects to your CPU. 364 00:35:29,760 --> 00:35:36,570 And all of the big blocks with a lot of ports are bus bridges or switches for 365 00:35:36,570 --> 00:35:45,470 PCI Express-like fabric. So there's a lot going on. The highlighted area is the 366 00:35:45,470 --> 00:35:59,081 management engine memory space and the rest of it is like the global chipset. The 367 00:35:59,081 --> 00:36:02,840 things I've highlighted in green here are on the primary PCI bus. So there's this 368 00:36:02,840 --> 00:36:08,210 weird thing going on where there seem to be two PCI hierarchies, at least 369 00:36:08,210 --> 00:36:13,741 logically. So in reality it's not even PCI, but on Intel systems, there's a lot 370 00:36:13,741 --> 00:36:19,600 of stuff that behaves as if it is PCI. So it has like bus, device and function 371 00:36:19,600 --> 00:36:28,650 numbers, PCI configuration space registers, and they have two different roots for the 372 00:36:28,650 --> 00:36:32,310 configuration space. So even though the configuration space address includes a bus 373 00:36:32,310 --> 00:36:36,480 number, they have two completely different things with each.
Each of which has its 374 00:36:36,480 --> 00:36:41,290 own bus zero. So that's weird, also because they don't make sense when you 375 00:36:41,290 --> 00:36:45,680 look at how the hardware is laid out. So this is stuff that's on the primary PCI 376 00:36:45,680 --> 00:36:50,780 configuration space that's directly accessed by the ME, by the northbridge on 377 00:36:50,780 --> 00:36:55,260 the ME CPU. So that's the Minute IA system agent. System agent is what Intel 378 00:36:55,260 --> 00:37:00,619 calls a Northbridge nowadays, now that it's not a separate chip anymore. It's 379 00:37:00,619 --> 00:37:07,530 basically just a Northbridge and a crypto unit that's on there and the stuff that's 380 00:37:07,530 --> 00:37:12,530 directly attached to the Northbridge, being the ROM and the RAM. So the processor itself 381 00:37:12,530 --> 00:37:16,960 is, as I said, derived from a 486, but it does actually have some more modern 382 00:37:16,960 --> 00:37:21,830 features, in that it does CPUID, at least on my systems. Some other researchers said 383 00:37:21,830 --> 00:37:29,369 theirs didn't. It's basically the core that's in the Quark MCU, which is really 384 00:37:29,369 --> 00:37:33,260 great because it's one of the only cores made by Intel that has public 385 00:37:33,260 --> 00:37:39,800 documentation on how to do run control. So breakpoints and accessing registers and 386 00:37:39,800 --> 00:37:44,420 everything over JTAG. Intel doesn't publish this stuff except for the Quark 387 00:37:44,420 --> 00:37:50,920 MCU, because they were targeting makers. But they reused that in here, which is 388 00:37:50,920 --> 00:37:58,200 really useful. It even has an official port to the OpenOCD debugger, which I have 389 00:37:58,200 --> 00:38:03,100 not gotten to test because I don't have a JTAG probe which is compatible with Intel 390 00:38:03,100 --> 00:38:11,000 voltage levels and supported by OpenOCD. And it also has, like, CPUID and MSRs.
391 00:38:11,000 --> 00:38:21,170 It has some really fancy features like branch tracing and some more strict paging 392 00:38:21,170 --> 00:38:30,480 permission enforcement stuff. They don't use the interrupt pins on this. So it's an 393 00:38:30,480 --> 00:38:34,710 IP block, but there are some files out there, that's where this screenshot 394 00:38:34,710 --> 00:38:40,601 is from, that actually are used by a built-in logic analyzer Intel has on the 395 00:38:40,601 --> 00:38:46,680 chipset and you can select different signals on the chip to watch, which is 396 00:38:46,680 --> 00:38:50,900 a really great source of information on how the IP blocks are laid out and what 397 00:38:50,900 --> 00:38:54,200 signals are in there, because you basically get a tree view of the IP blocks 398 00:38:54,200 --> 00:39:00,800 in the chip and some of their signals. They don't use the legacy interrupt system, 399 00:39:00,800 --> 00:39:07,920 they only use message-based interrupts, where a device writes a value into a 400 00:39:07,920 --> 00:39:13,050 register on the interrupt controller instead of asserting a pin. And then there 401 00:39:13,050 --> 00:39:21,700 is the Northbridge. It's partially documented in that datasheet I mentioned, 402 00:39:21,700 --> 00:39:29,020 it does support x86 IO address space, but it's never used. Everything in the ME is 403 00:39:29,020 --> 00:39:36,600 in memory space or exposed as memory space through bridges. And the Northbridge 404 00:39:36,600 --> 00:39:43,070 implements access to the ROM and RAM, it has an IOMMU which is only used for transactions 405 00:39:43,070 --> 00:39:48,750 coming from the rest of the system and it's always initialized to, at least in 406 00:39:48,750 --> 00:39:51,660 the firmware I looked at, it's always initialized to the inverse of the page 407 00:39:51,660 --> 00:40:00,200 table, so linear addresses can be used for memory maps, sorry, for DMA.
It also does 408 00:40:00,200 --> 00:40:06,270 PCI configuration space access to the primary PCI bus. And it has a firewall 409 00:40:06,270 --> 00:40:15,080 that allows the operating system to deny any IP block in the chipset from sending a 410 00:40:15,080 --> 00:40:18,890 completion on a bus request. So it can actually say: "Hey, I want to read some 411 00:40:18,890 --> 00:40:25,040 register and only these devices are allowed to send me a value for it." So 412 00:40:25,040 --> 00:40:29,570 they've actually thought about security here, which is great. Then there is one of 413 00:40:29,570 --> 00:40:38,190 the most important blocks in the ME, which is the crypto engine. It does some of 414 00:40:38,190 --> 00:40:47,100 the more well-known crypto algorithms: AES, SHA hashes, RSA, and it has a secure key 415 00:40:47,100 --> 00:40:56,330 store, which I'm not gonna [audio dropped] ... all about it in their ME talk at 416 00:40:56,330 --> 00:41:04,250 Black Hat. And a lot of these things have DMA engines, which all seem to be the 417 00:41:04,250 --> 00:41:09,500 same. And there are no other DMA engines in the ME, so this is also used for 418 00:41:09,500 --> 00:41:23,170 memory to memory copy or DMA into other devices. So that's used in a lot of 419 00:41:23,170 --> 00:41:27,400 things. This is actually a diagram which I don't have the vector for anymore. So 420 00:41:27,400 --> 00:41:35,260 that's why the LibreOffice background is in there. I'm sorry. So this is basically 421 00:41:35,260 --> 00:41:39,020 what that crypto engine looks like when you look at that signal tree that I was 422 00:41:39,020 --> 00:41:44,910 talking about earlier. The DMA engines are able both to do memory to memory copies 423 00:41:44,910 --> 00:41:52,570 and to directly target the crypto unit they're part of.
Basically, when you, I 424 00:41:52,570 --> 00:41:57,490 don't know about the control bits that go with this, but when you set the target 425 00:41:57,490 --> 00:42:02,150 address to zero and set the right control bits, it will copy into the buffer that's 426 00:42:02,150 --> 00:42:11,960 used for the encryption. So that is how it accelerates memory access for crypto. And 427 00:42:11,960 --> 00:42:15,590 these are the actual register offsets. They're the same for all of the DMA 428 00:42:15,590 --> 00:42:21,580 engines in there, relative to the base address of the subunit they're in. And 429 00:42:21,580 --> 00:42:27,290 then there's the second PCI bus or bus hierarchy, which is in some places 430 00:42:27,290 --> 00:42:33,540 called the PCI fixed bus. I'm actually not entirely sure whether this is actually 431 00:42:33,540 --> 00:42:38,840 implemented as a PCI bus as I've drawn it here, but this is what it behaves like. So 432 00:42:38,840 --> 00:42:43,920 it has all the ME private stuff that's not a part of the normal chipset. So it has 433 00:42:43,920 --> 00:42:51,310 timers for the ME, it has the implementation of the secure enclave 434 00:42:51,310 --> 00:42:58,010 stuff, that's the firmware TPM registers. And it has the gen device which I've 435 00:42:58,010 --> 00:43:01,780 mostly ignored because it's only used at boot time. It's only used by the actual 436 00:43:01,780 --> 00:43:10,869 boot ROM for the ME mostly. It is what the ME uses to get the fuses Intel burns. So 437 00:43:10,869 --> 00:43:15,420 that's the Intel public key, whether it's a production or pre-production part, but 438 00:43:15,420 --> 00:43:20,260 it's pretty much a black box. It's not used that much, fortunately. There is the 439 00:43:20,260 --> 00:43:24,340 IPC block which allows the ME to talk to the sensor hub, which is a different CPU 440 00:43:24,340 --> 00:43:28,190 in the chipset.
It allows it to talk to the power management controller and all kinds 441 00:43:28,190 --> 00:43:34,180 of other embedded CPUs. So it's inter-processor communication, not inter-process. 442 00:43:34,180 --> 00:43:39,090 Confused me for a bit. And here's the Host Embedded Controller Interface, which is 443 00:43:39,090 --> 00:43:44,320 how the ME talks to the rest of the computer when it wants the computer to 444 00:43:44,320 --> 00:43:47,960 know that it's talking. So it can directly access a lot of stuff, but when it wants 445 00:43:47,960 --> 00:43:54,250 to send a message to the EFI or to Windows or Linux, it'll use this. And it also has 446 00:43:54,250 --> 00:43:59,080 status registers, which are really simple things where the ME writes in a value. And 447 00:43:59,080 --> 00:44:05,290 even if the ME crashes, the host can still read the value, which is how you can see 448 00:44:05,290 --> 00:44:11,160 whether the ME is running, whether it's disabled, whether it fully booted, or 449 00:44:11,160 --> 00:44:15,400 whether it crashed halfway through, but at a point where it could still get the rest 450 00:44:15,400 --> 00:44:21,230 of the computer running. And there is some coreboot code to read it. I've also 451 00:44:21,230 --> 00:44:27,080 implemented some decoding for it on the emulator because it's useful to see what 452 00:44:27,080 --> 00:44:33,210 those values mean. So then there's something really interesting, the primary 453 00:44:33,210 --> 00:44:37,240 address translation table, which is the bus bridge that allows the ME to actually 454 00:44:37,240 --> 00:44:44,200 access the PCI Express fabric of the computer. For a lot of what this 455 00:44:44,200 --> 00:44:50,010 table calls ME peripherals, that are actually outside the ME domain in the 456 00:44:50,010 --> 00:45:00,320 chipset, it uses this to access them.
It also uses it to access the UMA, which is 457 00:45:00,320 --> 00:45:04,960 an area of host RAM that's used as a swap device for the ME, and the Trace Hub, which is 458 00:45:04,960 --> 00:45:11,190 the debug port, but it also has a couple of windows which allow the ME to access any 459 00:45:11,190 --> 00:45:19,060 random area of host RAM, which is the most scary bit because UMA is specified by 460 00:45:19,060 --> 00:45:24,650 the host, but the host DRAM area is where you can just point it anywhere. You can read 461 00:45:24,650 --> 00:45:28,750 or write any value that Windows or Linux or whatever you're running has 462 00:45:28,750 --> 00:45:37,460 sitting there. So that's scary to me. So and then there's the rest of it, the rest 463 00:45:37,460 --> 00:45:46,490 of the devices which are behind the primary ATT. And that's a lot of stuff, 464 00:45:46,490 --> 00:45:53,450 that's debug, that's also all the normal peripherals that your PC has, but it 465 00:45:53,450 --> 00:45:56,200 also includes things like the power management controller, which actually 466 00:45:56,200 --> 00:45:59,789 turns on and off all the different parts of your computer. It controls clocks and 467 00:45:59,789 --> 00:46:07,680 resets. So this is really important. There is a concept that you'll come across when 468 00:46:07,680 --> 00:46:14,261 you're reading Intel manuals or ME related stuff, that's root spaces. Besides your 469 00:46:14,261 --> 00:46:20,320 normal addressing information for a PCI device, it also has a root space number, 470 00:46:20,320 --> 00:46:24,980 which is basically how you have a single PCI device exposing two completely 471 00:46:24,980 --> 00:46:31,151 different address spaces. And it's 0 for the host, it's 1 for the ME. Some 472 00:46:31,151 --> 00:46:34,940 devices expose the same information on there. Other ones behave completely 473 00:46:34,940 --> 00:46:43,370 differently. That's something you don't usually see.
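One way to picture root spaces is as an extra field in front of the usual bus/device/function address, so that the same device can answer differently depending on who is asking. The Python sketch below only illustrates that idea: the root space numbers 0 (host) and 1 (ME) come from the talk, while `PciAddress`, `describe`, the example device and the register descriptions are made up.

```python
from typing import NamedTuple

HOST_ROOT_SPACE = 0  # root space number for the host, per the talk
ME_ROOT_SPACE = 1    # root space number for the ME, per the talk


class PciAddress(NamedTuple):
    root_space: int
    bus: int
    device: int
    function: int


def describe(views, address):
    """Return what a device exposes at `address`, or None if nothing is there."""
    return views.get(address)


# Hypothetical device at bus 0, device 22, function 0 that exposes a different
# register set depending on which root space the access comes from.
views = {
    PciAddress(HOST_ROOT_SPACE, 0, 22, 0): "host-visible registers",
    PciAddress(ME_ROOT_SPACE, 0, 22, 0): "ME-visible registers",
}
```

Looking up the same bus/device/function with `HOST_ROOT_SPACE` and with `ME_ROOT_SPACE` returns two different entries, which is the behaviour described above for devices that present a separate address space to the ME.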
And then there's the side 474 00:46:43,370 --> 00:46:48,560 band fabric. So besides all this stuff I just covered, which is PCI-like at 475 00:46:48,560 --> 00:46:52,880 least, there is also something completely different, side band fabric, which is a 476 00:46:52,880 --> 00:47:00,990 completely packet switched network, where you don't use any memory mapping by 477 00:47:00,990 --> 00:47:06,370 default. You just have a one byte address for a device and some other addressing 478 00:47:06,370 --> 00:47:09,590 fields and you're just sending a message saying: "Hey, I want to read configuration 479 00:47:09,590 --> 00:47:14,320 or data or memory." And there is actually a lot of information out there on this, 480 00:47:14,320 --> 00:47:18,480 because Intel, it seems like, just copy-pasted their internal specification into a 481 00:47:18,480 --> 00:47:26,860 patent. This is how you address it. These are all the devices on there, which is quite a 482 00:47:26,860 --> 00:47:32,590 lot. It's also what, if any of you are kernel developers and you've had to deal 483 00:47:32,590 --> 00:47:40,110 with GPIO on Intel SoCs, there's this P2SB device that you have to use. That's what 484 00:47:40,110 --> 00:47:48,240 the host uses to access this. Their documentation on it is really, really bad. 485 00:47:48,240 --> 00:47:52,420 This was all done using static analysis. But then I wanted to figure out how some 486 00:47:52,420 --> 00:47:57,410 of the logic actually works and it was really complicated to play around with the 487 00:47:57,410 --> 00:48:07,310 ME. There was this nice talk by Ermolov and Goryachy, where they said: "You know, 488 00:48:07,310 --> 00:48:11,790 we found an exploit that gives you code execution and you can get JTAG 489 00:48:11,790 --> 00:48:18,813 access too." It sounds really nice. It's actually not that easy.
So arbitrary code 490 00:48:18,813 --> 00:48:23,359 execution in the BUP module, they actually describe their exploit and how you should 491 00:48:23,359 --> 00:48:30,270 use it. But they didn't describe anything that's needed to actually implement that. 492 00:48:30,270 --> 00:48:35,690 So if you want to do that, what you need to do is figure out where the stack lives, 493 00:48:35,690 --> 00:48:40,230 you need to write a payload that will actually get you from a 494 00:48:40,230 --> 00:48:44,640 buffer overflow on a stack that, by the way, uses stack cookies, so you can't just 495 00:48:44,640 --> 00:48:51,369 overwrite the return address, and turn that into an arbitrary write. And you need to 496 00:48:51,369 --> 00:48:56,369 find out what the return pointer address is so you can overwrite it and find ROP 497 00:48:56,369 --> 00:49:03,320 gadgets because the stack is not executable. And then when you've done 498 00:49:03,320 --> 00:49:09,920 that, you can just turn on debug access or change to custom firmware or whatever. So 499 00:49:09,920 --> 00:49:13,660 what I did is, I had a bit of trouble getting that running, and in order to test 500 00:49:13,660 --> 00:49:17,720 your payload, you have to flash it into the system and it takes a while and then 501 00:49:17,720 --> 00:49:20,880 the system just doesn't power on if the ME's not working, if you're crashing it 502 00:49:20,880 --> 00:49:24,580 instead of getting code execution. So it's not really viable to develop it that 503 00:49:24,580 --> 00:49:32,910 way, I think. Some people did. I respect that because it's really, really hard. And 504 00:49:32,910 --> 00:49:38,790 then I wrote this ME Loader, it's called Loader because at first I started out 505 00:49:38,790 --> 00:49:42,849 writing it as a sort of a Wine thing where you would just mmap the right 506 00:49:42,849 --> 00:49:47,380 ranges at the right place and jump into it, execute it, patch some system calls.
507 00:49:47,380 --> 00:49:51,849 But because the ME is a microkernel system and almost every user space program 508 00:49:51,849 --> 00:49:57,480 accesses hardware directly, I ended up implementing like a good part of the 509 00:49:57,480 --> 00:50:08,080 chipset, at least as stubs or enough logic to get the code running. And I later on 510 00:50:08,080 --> 00:50:14,510 added some features that actually allow it to talk to the hardware. I can use it as a 511 00:50:14,510 --> 00:50:18,530 debugger, but just because it's actually running the ME firmware or parts of it 512 00:50:18,530 --> 00:50:26,200 inside a normal Linux process, I can just use gdb to debug it. And back in April 513 00:50:26,200 --> 00:50:30,320 last year, I got that working to the point where I could run the bootstrap process, 514 00:50:30,320 --> 00:50:38,580 which is where the vulnerability is. And then you just develop the exploit against 515 00:50:38,580 --> 00:50:43,960 it, which I did. And then I made a mistake cleaning up some old chroot 516 00:50:43,960 --> 00:50:52,010 environments for closed source software. And I nuked my home dir. Yeah. I hadn't 517 00:50:52,010 --> 00:50:56,599 yet pushed everything to GitHub. So I was stuck with an old version and I decided, 518 00:50:56,599 --> 00:51:00,160 you know, let's refactor this and turn it into something that might actually at some 519 00:51:00,160 --> 00:51:03,930 point be published, which by the way I did last summer. This is all public code. The 520 00:51:03,930 --> 00:51:09,790 ME Loader thing. It's on GitHub. And someone else beat me to it and replicated 521 00:51:09,790 --> 00:51:15,250 that exploit by the Russian guys, who up to then had only produced a proof of concept 522 00:51:15,250 --> 00:51:22,760 thing for Apollo Lake chipsets, which was completely different from what you had 523 00:51:22,760 --> 00:51:33,690 to do for normal ME.
I was a bit disappointed by that one, not being the 524 00:51:33,690 --> 00:51:38,580 first one to actually replicate this. But then, about a week later, I 525 00:51:38,580 --> 00:51:44,270 got my loader back to the point where I could actually get to the vulnerable code 526 00:51:44,270 --> 00:51:51,120 and develop that exploit and got it working not too long after. And here's the 527 00:51:51,120 --> 00:51:54,720 great thing. Then I went to the hackerspace. I flashed it into my laptop. The 528 00:51:54,720 --> 00:51:59,040 image that I had just been using only on the emulator. I didn't change it. I flashed it. 529 00:51:59,040 --> 00:52:05,280 I was like, this is never gonna work on it. It works. some laughter And I've still got an image 530 00:52:05,280 --> 00:52:08,480 on a flash chip with me because that's what I used to actually turn on the 531 00:52:08,480 --> 00:52:14,490 debugger. And then you need a debug probe because that USB based debugging stuff 532 00:52:14,490 --> 00:52:18,810 that's mentioned here only works pretty late in boot. Which is also why I only 533 00:52:18,810 --> 00:52:21,880 really see Apollo Lake stuff because on those chipsets you can actually use this 534 00:52:21,880 --> 00:52:33,010 for the ME. And then you need this thing because there's a second channel, that is 535 00:52:33,010 --> 00:52:36,360 using the USB plug, but it's a completely different physical layer and you need an 536 00:52:36,360 --> 00:52:40,911 adapter for it, which I don't think was intended to be publicly available. Because 537 00:52:40,911 --> 00:52:44,859 if you go to Intel's site and say, I want to buy this, they say, here's the CNDA, 538 00:52:44,859 --> 00:52:54,460 please sign it. But it appeared on Mouser. And luckily I knew some people who had 539 00:52:54,460 --> 00:52:59,120 done some other stuff, got a nice bounty for it and bought it and let me use it. 540 00:52:59,120 --> 00:53:05,430 Thanks to them.
It's expensive, but you can buy it if it's still up there. Haven't 541 00:53:05,430 --> 00:53:11,520 checked. That's the link. So I'm a bit late, so I'm gonna use the time for 542 00:53:11,520 --> 00:53:15,760 questions as well. So the main thing the ME does that you cannot replace is the 543 00:53:15,760 --> 00:53:21,250 boot process. It's not just breaking the system if you don't turn it on, it 544 00:53:21,250 --> 00:53:25,240 actually does stuff that has to be done. So you're gonna have to use the ME anyway if 545 00:53:25,240 --> 00:53:30,730 you want to boot a computer. You don't necessarily have to use Intel's firmware. 546 00:53:30,730 --> 00:53:35,810 The ME itself boots like a microkernel system, so it has a process which 547 00:53:35,810 --> 00:53:39,859 implements a lot of the servers that will allow it to get to a point where it can 548 00:53:39,859 --> 00:53:44,710 start those servers. This process has very high privileges in older versions, which 549 00:53:44,710 --> 00:53:49,160 is what is being used on these chipsets. And if you exploit that, you're still ring 550 00:53:49,160 --> 00:53:55,680 3, but you can turn on the debugger and you can use the debugger to become ring 0. So 551 00:53:55,680 --> 00:53:59,171 this is what the normal boot process for a computer looks like. And this is what 552 00:53:59,171 --> 00:54:02,050 happens when you use Boot Guard. There's a bit of code that runs even before the 553 00:54:02,050 --> 00:54:07,170 reset vector, and that's started by microcode initialization, of course. And this 554 00:54:07,170 --> 00:54:12,120 is what actually happens. The ME loads a new firmware into the power management 555 00:54:12,120 --> 00:54:16,390 controller, it then readies some stuff in the chipset and it tells the power management 556 00:54:16,390 --> 00:54:23,660 controller, like, please stop pulling that CPU reset pin low, and the CPU will start.
557 00:54:23,660 --> 00:54:28,160 The power management controller is a completely independent thing, I'd say an 8051-derived 558 00:54:28,160 --> 00:54:32,690 microcontroller that runs a real time operating system from the 90s. This is the 559 00:54:32,690 --> 00:54:38,690 only string in the firmware by the way, that's quoted there. And depending on the 560 00:54:38,690 --> 00:54:42,410 chipset that you have, it's either loaded with a patch or with a complete binary 561 00:54:42,410 --> 00:54:46,690 from the ME, and it does a lot of important stuff. No documentation on it 562 00:54:46,690 --> 00:54:52,120 besides the ACPI interface, which is not really that useful. The ME has to do these 563 00:54:52,120 --> 00:54:58,710 things. It needs to load the keys for the Boot Guard process, it needs to set up clock 564 00:54:58,710 --> 00:55:06,550 controllers and then tell the PMC to turn on the power to the CPU. It needs to 565 00:55:06,550 --> 00:55:15,240 configure the PCI Express fabric and reset - like get the CPU to come out of reset. 566 00:55:15,240 --> 00:55:18,290 There's a lot of code involved in this, so I really didn't want to do this all 567 00:55:18,290 --> 00:55:22,150 statically. What I did is I added hardware support, hardware passthrough support to 568 00:55:22,150 --> 00:55:28,500 the emulator and booted my laptop that way. Actually had a video of this, but I 569 00:55:28,500 --> 00:55:33,970 don't have the time to show it, which is a pity. But this is what I - the bring-up 570 00:55:33,970 --> 00:55:38,030 process from the ME running in a Linux process, sending whatever hardware accesses 571 00:55:38,030 --> 00:55:43,340 it was trying to do that are important for boot to the debugger. And then that 572 00:55:43,340 --> 00:55:49,880 was using an ME in real hardware that was halted to actually do the register accesses 573 00:55:49,880 --> 00:55:56,520 and it works. I'm not going to show this. It actually booted the computer reliably.
574 00:55:56,520 --> 00:56:02,410 Then Boot Guard configuration is fun because, you know how they say they fuse 575 00:56:02,410 --> 00:56:10,990 in the keys? Well, yeah. But the ME loads them from fuses and then manually loads 576 00:56:10,990 --> 00:56:14,530 them into registers. So if you have code execution on the ME before it does this, 577 00:56:14,530 --> 00:56:18,000 you can just load your own values and you can run coreboot even on a machine that 578 00:56:18,000 --> 00:56:24,190 has Boot Guard. Yeah. So I'm gonna go through this really quickly. This is, by 579 00:56:24,190 --> 00:56:29,570 the way, these are the registers that configure what security model the CPU is 580 00:56:29,570 --> 00:56:34,579 gonna enforce for the firmware. I'm going to release this code after my talk. It's 581 00:56:34,579 --> 00:56:39,810 part of a Python script that I wrote that uses the debugger to start the CPU without 582 00:56:39,810 --> 00:56:45,670 ME firmware. I traced all of what the ME firmware did. And I now have a Python 583 00:56:45,670 --> 00:56:51,470 script that can just start a computer without Intel's code. If you translate 584 00:56:51,470 --> 00:56:55,920 this into a ROP sequence or even into a binary for the ME, you can start a 585 00:56:55,920 --> 00:57:02,850 computer without the ME itself or at least without it running the operating system. 586 00:57:02,850 --> 00:57:12,710 applause So, yeah, future goals. I really do want 587 00:57:12,710 --> 00:57:20,420 to share this because if there is a way to escalate to ring 0 through a ROP chain, 588 00:57:20,420 --> 00:57:24,359 then you could just start your own kernel in the ME and have custom firmware, at 589 00:57:24,359 --> 00:57:29,600 least from the vulnerability on. But you could also build a mod chip that uses the 590 00:57:29,600 --> 00:57:34,829 debugger interface to load a new firmware.
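The trace-and-replay approach mentioned above can be sketched roughly as follows. This is not the script from the talk: the trace format, the register addresses, and the `RecordingTarget` class are all assumptions made for illustration. The `overrides` argument models the point about Boot Guard, where a value that the ME would normally copy from fuses into a register is replaced by a value of your own choosing.

```python
class RecordingTarget:
    """Hypothetical stand-in for the debugger connection; it only stores writes."""

    def __init__(self):
        self.regs = {}
        self.writes = []

    def write32(self, addr, value):
        self.regs[addr] = value
        self.writes.append((addr, value))

    def read32(self, addr):
        return self.regs.get(addr, 0)


def replay(trace, target, overrides=None, max_polls=1000):
    """Replay a recorded list of (op, addr, value) steps against a target."""
    overrides = overrides or {}
    for op, addr, value in trace:
        if op == "write":
            # Substitute our own value if this register is overridden.
            target.write32(addr, overrides.get(addr, value))
        elif op == "poll":
            # Wait until the register reads back the expected value.
            for _ in range(max_polls):
                if target.read32(addr) == value:
                    break
            else:
                raise TimeoutError(f"register {addr:#x} never reached {value:#x}")
        else:
            raise ValueError(f"unknown trace op: {op}")


# Made-up addresses: 0x100 plays the role of a policy register normally filled
# from fuses, 0x200 the role of a "release CPU reset" control register.
TRACE = [
    ("write", 0x100, 0x11111111),
    ("write", 0x200, 0x1),
    ("poll", 0x200, 0x1),
]

target = RecordingTarget()
replay(TRACE, target, overrides={0x100: 0x0})
```

Keeping the trace as plain data is a deliberate choice in this sketch: the same list could in principle be translated into another form, such as a sequence of writes performed by code running on the ME itself.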
There's lots of stuff that still needs to be 591 00:57:34,829 --> 00:57:41,210 discovered, but I'm gonna hang out at the open source firmware village later, at 592 00:57:41,210 --> 00:57:46,690 least part of the week here, because I really want to get started on open source 593 00:57:46,690 --> 00:57:55,250 ME firmware using this. Right. And there's a lot of people that played a role in 594 00:57:55,250 --> 00:58:00,700 getting me to this point. I also would like to thank the guy from Hague hacker space, 595 00:58:00,700 --> 00:58:07,680 BinoAlpha, who basically allowed me to use his laptop to prepare the demo, which I 596 00:58:07,680 --> 00:58:14,660 ended up not being able to show, but. Right. I was gonna ask whether there are any 597 00:58:14,660 --> 00:58:17,380 questions, but I don't think there's really any time for any more. 598 00:58:17,380 --> 00:58:22,570 Herald: Peter, thank you so much. Applause Unfortunately, we don't have any more time 599 00:58:22,570 --> 00:58:30,720 left. Peter: I'll be around. I'll be around. 600 00:58:30,720 --> 00:58:35,660 Herald: I think it's very, very interesting because I hope that your talk 601 00:58:35,660 --> 00:58:41,119 will inspire many people to keep looking into how the management engine works and 602 00:58:41,119 --> 00:58:46,930 hopefully uncover even more stuff. I think we have time for just one single question. 603 00:58:46,930 --> 00:58:51,040 I don't know, do we? Now one from the Internet. Thank you so much. 604 00:58:51,040 --> 00:58:56,790 Signal Angel: OK. First off, I have to tell you: your shirt is nice. Chat wanted 605 00:58:56,790 --> 00:59:05,000 me to say this. And they asked how reliable this exploit is and does it work 606 00:59:05,000 --> 00:59:09,160 on every boot? Peter: Right, yeah. That's actually 607 00:59:09,160 --> 00:59:14,960 something really important that I forgot to mention. So they patched the vulnerability, 608 00:59:14,960 --> 00:59:17,339 but they didn't provide downgrade protection.
If you can flash a 609 00:59:17,339 --> 00:59:24,170 vulnerable image with an exploit in it, it'll just boot every time on these chips, 610 00:59:24,170 --> 00:59:27,850 so sixth or seventh generation chips. Just put in that image and it will 611 00:59:27,850 --> 00:59:31,230 reliably turn on the debugger every time you turn on the computer. applause 612 00:59:31,230 --> 00:59:36,650 Herald: Thank you so much for the question. And Peter Bosch, thank you so 613 00:59:36,650 --> 00:59:39,160 much. Please give him a great round of applause. 614 00:59:39,160 --> 00:59:43,625 applause 615 00:59:43,625 --> 01:00:08,000 subtitles created by c3subtitles.de in the year 20??. Join, and help us!