WEBVTT 00:00:05.075 --> 00:00:10.098 Such a weird processor - messing with x86 opcodes... and a little bit of PE [Portable Executable] 00:00:10.513 --> 00:00:19.048 So welcome. ...And especially let me know if I speak too quickly. Um, so -- who I am -- oh, yes so 00:00:19.048 --> 00:00:28.025 I will talk about opcodes and a little bit about the PE [portable executable] file format and their oddities. So, I've been 00:00:28.025 --> 00:00:35.008 a reverse engineer for some years, for some time. I created a project called Corkami. 00:00:35.008 --> 00:00:42.039 Also in the past I worked on the MAME arcade emulator, and professionally I am a malware analyst, but 00:00:42.039 --> 00:00:48.541 this is only on behalf of my hobbies, this is my own experiments and research at home. 00:00:48.541 --> 00:00:57.010 So, I introduced Corkami. Corkami is just the name of the project I created for RCE project. 00:00:57.010 --> 00:01:04.073 I tried to keep it just to the technical stuff, no ads, no login required. 00:01:04.073 --> 00:01:06.057 Really direct to the good stuff. 00:01:06.057 --> 00:01:12.040 I try to update it and make it useful, so I also created cheat sheets and the kind of easy documents 00:01:12.040 --> 00:01:15.081 that I would use for work on a daily basis, 00:01:15.081 --> 00:01:18.326 but it's only a hobby; I do that once the kids are asleep 00:01:18.326 --> 00:01:23.020 and late at night so it probably doesn't look as professional 00:01:23.020 --> 00:01:24.782 and as good as I would like it to be. 00:01:24.782 --> 00:01:30.966 So right now, Corkami, the form of Corkami, is wiki pages and cheat sheets 00:01:30.966 --> 00:01:37.982 and I focus on creating as many as possible relevant proof of concepts [Hi Bob!]
00:01:37.982 --> 00:01:43.088 so the binaries are hand-written, usually I don't use a compiler, I create the PE (structure) myself 00:01:43.088 --> 00:01:46.040 so that it's only focusing on the exact interesting point 00:01:46.040 --> 00:01:49.004 and you don't have a lot of noise even -- you don't probably 00:01:49.004 --> 00:01:51.040 need IDA to actually understand what's going on 00:01:51.040 --> 00:01:54.790 because I try to focus only on what's important. 00:01:54.790 --> 00:01:58.290 The binaries are all directly available to download so you can 00:01:58.459 --> 00:02:01.030 really test your debugger, your tools, your knowledge 00:02:01.030 --> 00:02:03.899 and just get them directly from that. 00:02:03.899 --> 00:02:07.068 So far, I've focused on the PDF, assembly and the PE.. 00:02:07.068 --> 00:02:11.024 ...file format. A few other stuff, but that's mainly the most 00:02:11.024 --> 00:02:15.008 covered subject of my website. And I share that with a 00:02:15.008 --> 00:02:19.084 very permissive license so BSD you can reuse them commercially 00:02:19.084 --> 00:02:24.784 whatever. Even the images are done in open-source format. 00:02:25.399 --> 00:02:29.513 So the story behind this presentation is that some time ago 00:02:29.513 --> 00:02:32.332 I was young and innocent and I thought that CPUs, being 00:02:32.332 --> 00:02:38.045 electronic - whatever - they had to be perfectly logical and no problems 00:02:38.045 --> 00:02:41.830 and then I was tricked by malware. And basically 00:02:41.830 --> 00:02:46.059 IDA wasn't able to work on it, so I decided to go back 00:02:46.059 --> 00:02:49.770 to the basics and study assembly and PE files from scratch. 00:02:49.770 --> 00:02:52.737 I created in the meantime documents on Corkami 00:02:53.737 --> 00:02:57.066 and now I'm presenting you more or less the final results. 00:02:57.066 --> 00:03:01.063 or the good programs results. 
If I wasn't -- if I was just a 00:03:01.063 --> 00:03:05.679 guy who learned assembly I probably wouldn't be at Hashdays 00:03:05.679 --> 00:03:10.058 to talk about it, if I didn't get a few achievements from 00:03:10.058 --> 00:03:14.072 various tools. So basically I failed all the disassemblers that I tried 00:03:14.072 --> 00:03:21.010 and I also created a few crashes - in IDA. I insist that all 00:03:21.010 --> 00:03:26.009 the authors were notified and most of the bugs are already fixed, but 00:03:26.009 --> 00:03:30.691 basically it was like this in 6.1 -- you get a direct crash -- but 00:03:30.691 --> 00:03:33.034 now it's fixed in 6.2, and everything. 00:03:33.034 --> 00:03:37.047 And Hiew [Hacker's view] - that's the latest version - but the newest and released, 00:03:37.047 --> 00:03:40.366 - well, the newest beta - fixed that and so on. 00:03:40.397 --> 00:03:45.087 So the agenda for the presentation is that I first try with 00:03:45.087 --> 00:03:50.639 an easy introduction, but I assume that most of you already know or are familiar with disassembly, right? 00:03:52.085 --> 00:03:57.035 Yes. And another question: are you all familiar with 00:03:57.558 --> 00:04:02.584 or you already had an event of undocumented disassembly in your ... or never? 00:04:02.584 --> 00:04:05.802 Like, you trust IDA and that's all. 00:04:06.602 --> 00:04:10.548 Like, is it a common thing to have an undocumented disassembly in IDA? 00:04:11.302 --> 00:04:14.036 Raise your arms -- okay, not so much. 00:04:14.036 --> 00:04:19.625 Okay. So then after the introduction (that will go quickly), 00:04:19.625 --> 00:04:25.036 I will mention a few tricks, then introduce CoST, the program that I created. 00:04:25.036 --> 00:04:29.033 And I will also talk a little bit more about the PE file format. 00:04:29.633 --> 00:04:34.008 So as you all have assembly knowledge I will go quickly on that.
00:04:34.008 --> 00:04:37.492 So basically, you compile a binary, there is assembly, there is 00:04:37.554 --> 00:04:44.025 some relevance, some common points between the [source] code and the assembled [generated] code. 00:04:44.025 --> 00:04:48.491 Then of course there is a relation between the opcode and the [assembly] code, you all know that. 00:04:49.061 --> 00:04:53.060 What is important is that the assembly is generated by the compiler, but actually what is 00:04:53.991 --> 00:04:59.691 then from the assembly what is -- what's only kept in the binary are the opcodes itself which are understood 00:04:59.691 --> 00:05:03.280 directly by the CPU, which means the CPU just knows 00:05:03.280 --> 00:05:07.039 what to do with the bytes, it doesn't care if you or the 00:05:07.039 --> 00:05:10.579 tool you're using know what it will do, because it just does it. 00:05:10.579 --> 00:05:16.339 And the problem is that what we read is not usually the opcodes for most people but actually the disassembly 00:05:16.339 --> 00:05:20.645 and if the disassembler doesn't give you any result, well, 00:05:20.645 --> 00:05:25.412 we're stuck, we're blind, we don't know what execution will do. 00:05:25.412 --> 00:05:28.009 And the other problem is because of the opcode length you 00:05:28.009 --> 00:05:30.309 don't know what the next instruction will be because you 00:05:30.309 --> 00:05:32.032 don't know how to disassemble it. 00:05:32.032 --> 00:05:40.017 So, here I just create one undocumented opcode in a simple program. 00:05:40.017 --> 00:05:48.217 So basically we just '_emit' -- [it's] a keyword in -- that's Visual Studio 2010 ultimate -- 00:05:48.217 --> 00:05:52.024 you will get a byte that is unidentified at disassembly 00:05:52.024 --> 00:05:58.997 so you get question marks, so basically this program 00:05:58.997 --> 00:06:01.076 even though it costs several thousand dollars is not able 00:06:01.076 --> 00:06:05.001 to -- it doesn't know what will happen. 
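[Illustration, not from the talk] The point about being "blind" once a byte is unknown can be sketched with a toy linear-sweep disassembler. This is only a model under stated assumptions: the opcode tables are tiny hand-picked subsets, the function and table names are hypothetical, and the `salc` mnemonic for byte D6 is supplied here for illustration rather than taken from the speaker's slide.

```python
# Toy linear-sweep disassembler. The opcode tables are tiny illustrative
# subsets chosen for this sketch; this is not a real x86 decoder.
DOCUMENTED = {
    0x90: ("nop", 1),
    0xC3: ("retn", 1),
    0xB8: ("mov eax, imm32", 5),  # opcode byte + 4-byte immediate
}

# Same table plus the D6 byte (mnemonic assumed for illustration).
EXTENDED = dict(DOCUMENTED)
EXTENDED[0xD6] = ("salc", 1)


def disassemble(code, table):
    """Return (offset, text) pairs for the byte string `code`."""
    out = []
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        if opcode not in table:
            # Unknown byte: its length is unknown too, so the start of the
            # next instruction cannot be determined. Stop here.
            out.append((offset, "???"))
            break
        mnemonic, length = table[opcode]
        out.append((offset, mnemonic))
        offset += length
    return out


# nop / D6 / mov eax, 0x12345678 / retn
CODE = bytes([0x90, 0xD6, 0xB8, 0x78, 0x56, 0x34, 0x12, 0xC3])

blind = disassemble(CODE, DOCUMENTED)
aware = disassemble(CODE, EXTENDED)
```

With the documentation-only table the listing stops at the question marks, while the extended table recovers every instruction boundary. The CPU itself never has this problem, since it simply executes the bytes.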
00:06:05.001 --> 00:06:09.010 So usually if you do that... Oh, yeah, if you check the Intel documentation 00:06:09.010 --> 00:06:14.471 there is nothing to see at the D6 opcode, there is nothing to see there. 00:06:14.471 --> 00:06:17.645 Microsoft doesn't say anything, Intel doesn't say anything, 00:06:17.645 --> 00:06:21.013 so usually if you try that you could expect bad results. 00:06:21.013 --> 00:06:26.502 So, not documented, directly: usually it is a crash or not the expected result. 00:06:26.502 --> 00:06:29.503 But here, in this case, this specific case, no problem. 00:06:29.903 --> 00:06:35.275 We don't know what it was, if we follow Intel or Microsoft documentation, we don't know what happened. 00:06:35.275 --> 00:06:41.490 But if we -- the CPU just does its stuff. So what happened is that actually 00:06:41.490 --> 00:06:49.013 D6 is a very simple opcode, that doesn't do much, but somehow it's not documented by Intel 00:06:49.013 --> 00:06:53.573 [but] it's documented by AMD, and most of the opcodes are actually documented by AMD 00:06:53.573 --> 00:06:58.019 but not Intel. I don't know why, if anyone has any idea why... 00:06:58.019 --> 00:07:04.013 It's quite a trivial opcode, but it's not -- Intel still says there's nothing there. Okay. 00:07:04.013 --> 00:07:08.031 So it's commonly used, the common use for those undocumented opcodes are malware 00:07:08.031 --> 00:07:13.385 and packers, just to prevent automated analysis or easy reverse-engineering. 00:07:14.323 --> 00:07:22.367 What's funny is that, Intel, if you follow the documentation you will have many holes, but Intel's own disassembler, 00:07:22.367 --> 00:07:26.375 Xed, which is free of use, it is not open source, but just handles 00:07:26.375 --> 00:07:35.688 all these opcodes correctly, while Microsoft, and Visual Studio, and WinDBG, they follow blindly the documentation. 00:07:35.688 --> 00:07:42.653 So you will get question marks even though Intel knows perfectly what it does.
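[Illustration, not from the talk] The talk only says that D6 is "a very simple opcode". The byte is commonly known as SALC (set AL on carry), and its effect can be modelled in a few lines. The function name and the modelling of EAX as a plain integer are assumptions of this sketch.

```python
def salc(eax, carry_flag):
    """Model of byte D6 (SALC): AL becomes 0xFF if CF is set, else 0x00.

    The upper 24 bits of EAX are left unchanged; flags are not modified.
    """
    al = 0xFF if carry_flag else 0x00
    return (eax & 0xFFFFFF00) | al


with_carry = salc(0x12345678, True)
without_carry = salc(0x12345678, False)
```

A disassembler that lacks this one-byte entry prints question marks, even though the behaviour fits in a single line.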
00:07:43.007 --> 00:07:51.525 So it's like "[...] do as I disassemble and don't read my documentation." 00:07:52.033 --> 00:08:00.693 So - of course - you could argue that WinDBG is only made to debug what the compiler, 00:08:00.693 --> 00:08:07.439 Microsoft compiler created, but then it kind of rules out WinDBG as a malware debugging tool, 00:08:08.008 --> 00:08:17.460 because you just inserted D6, it's trivial, and WinDBG is just not able to tell you what the instructions 00:08:17.460 --> 00:08:24.813 are. So it's not very useful for malware analysis -- for a malware analysis debugger 00:08:25.059 --> 00:08:32.967 So, another problem that happens is that of course each of the 00:08:32.967 --> 00:08:37.309 undocumented things, facts, are available, maybe one 00:08:37.309 --> 00:08:42.348 you will have in a trojan, one in a packer, and everything, but it's not so easy 00:08:42.348 --> 00:08:46.581 to find a good, exhaustive, clean test set to actually 00:08:46.581 --> 00:08:48.979 gather all these undocumented facts, so for example if you 00:08:49.240 --> 00:08:53.235 so, for example, someone says - a colleague - mentions an undocumented 00:08:53.235 --> 00:08:55.718 opcode or behaviour, and then you say "oh yeah, it's 00:08:55.718 --> 00:08:58.949 in MebRoot [MBR infector], or you skip this part of the file or whatever", 00:08:58.949 --> 00:09:03.466 and then you are actually, you know first it's a malware so you have -- you cannot 00:09:03.466 --> 00:09:08.063 really spread that, and then there is a lot of noise -- the malware payload or something before and 00:09:08.063 --> 00:09:15.012 after -- so it's not so easy to analyse. So that's why I focused on creating a small and clean test 00:09:15.012 --> 00:09:21.017 set that would actually provide --- insists just on one particular instruction or fact. 00:09:22.202 --> 00:09:27.556 So, now let's start, at last, the real stuff, and a few of the undocumented opcodes. 
00:09:28.033 --> 00:09:36.856 But before I actually started [studying], [I was] wondering what the actual possibilities of the CPUs, I didn't even know 00:09:36.856 --> 00:09:44.455 what are the possibilities, what are the opcodes that are still supported or not by the -- by the CPU. 00:09:44.455 --> 00:09:52.024 And I think it's a bit like English, everybody, or most people in the world, would be able to read and 00:09:52.024 --> 00:09:57.410 understand these words, and if you['ve] see[n] some disassembly [before] then well you are used to seeing these opcodes, 00:09:57.410 --> 00:10:03.570 they are made by all the compilers and they are so common that if they are not here then we are a bit 00:10:03.570 --> 00:10:08.068 ill-at-ease, and if it's something different then we probably would be surprised. 00:10:08.622 --> 00:10:19.536 So this is standard English, but the Intel CPUs were made in the 70s, so it'd be the same as if you take 00:10:19.536 --> 00:10:27.072 Shakespearean English, so you could say that it's still English, but mmm... You know, I don't know what that means actually... 00:10:27.072 --> 00:10:30.051 or maybe I forgot, I quickly forgot at least, and it's a bit the same 00:10:30.051 --> 00:10:36.032 for those opcodes which are still supported by all the CPUs that we have -- all the Intel CPUs -- but 00:10:36.032 --> 00:10:41.098 we probably don't know what they actually do, and that's a problem. 00:10:41.098 --> 00:10:46.084 I actually made, one of the proof of concepts that I made was only using these old opcodes, and these 00:10:46.084 --> 00:10:53.015 old opcodes are actually doing something, so if someone is familiar with reading that, maybe I should 00:10:53.015 --> 00:10:59.072 ask "how old are you?", because myself I am used to the PUSH/JUMP/CALLs, but when it's about this, 00:10:59.072 --> 00:11:05.992 mmm... what is exactly being done. 
And it's still working on an i7, and it's still usable by malware, 00:11:05.992 --> 00:11:13.806 packers or anything, and yet some of them are -- totally unused now and they are still fully working on 00:11:13.806 --> 00:11:15.882 modern CPUs. 00:11:15.882 --> 00:11:21.429 And of course, it's a bit like English, it's an evolving language, and a bit like maybe the oldest generations 00:11:21.429 --> 00:11:27.487 of people -- of humans wouldn't be used to the buzzwords - the latest buzzwords. 00:11:27.487 --> 00:11:35.020 These opcodes are sometimes present in the most recent CPUs, so, and you have direct opcodes for 00:11:35.020 --> 00:11:41.267 CRC32 or AES decryption, string matching, and then some complex operation, in just one opcode. 00:11:41.267 --> 00:11:47.652 So this, this is possible, this exists in modern CPUs. Not all of them, of course. 00:11:47.652 --> 00:11:54.401 One thing that I like is the MOVBE -- move big endian -- opcode, because move big endian is the rejected 00:11:54.401 --> 00:12:01.918 offspring, it's only implemented in the Atom CPU, which means this netbook has -- supports this opcode 00:12:01.918 --> 00:12:09.038 and the i7 64-bit doesn't have this opcode, even though it will have CRC32 or maybe AES [op]code, so... 00:12:09.038 --> 00:12:12.054 so much for complete backward compatibility. 00:12:12.054 --> 00:12:20.029 There is no physical CPU as far as I know that can emulate -- execute CRC32 and MOVBE. 00:12:20.029 --> 00:12:24.084 And of course, MOVBE is quite meaningless itself because you already have an opcode for the big -- 00:12:24.084 --> 00:12:32.004 endian-ness swapping. So I don't know, this small computer has an opcode that most PC's don't. 00:12:32.004 --> 00:12:35.025 Okay. Why? I don't know. If you know... 00:12:35.025 --> 00:12:37.593 [Audience member:] "Is this opcode documented in the CPU feature set?" 00:12:37.593 --> 00:12:38.093 Yeah. 
00:12:38.093 --> 00:12:42.424 Yeah, it's totally -- this MOVBE -- it's totally documented, it's official. 00:12:42.424 --> 00:12:47.390 [Audience member:] "But, no; is it like a CPU flag just for this instruction or is it implicit by 'this 00:12:47.390 --> 00:12:50.236 is an Atom CPU'?" 00:12:51.036 --> 00:12:58.039 Uh... Yeah, I don't know. I check the value by CPUID but I don't know if it's relevant to the... but 00:12:58.039 --> 00:13:07.055 I think it's by itself. ...but the CPUID result is so big that I don't remember it all. 00:13:07.947 --> 00:13:13.061 Uh, another thing, a bit specific to Windows in my case, because I focus on malware, is that before you do 00:13:13.061 --> 00:13:22.052 actually any opcode, I was focusing on what are the register values when you start a program, and I found 00:13:22.052 --> 00:13:28.072 out that the register values by default when you start a program and you haven't executed, theoretically, any opcode, 00:13:28.072 --> 00:13:33.063 - theoretically- actually gives you some information that are actively used in malwares. 00:13:33.063 --> 00:13:40.054 So for example, at the start point, EAX gives you either gives you if it's older generation (XP or before), 00:13:40.054 --> 00:13:41.899 or Vista or later. 00:13:42.084 --> 00:13:50.617 This is not so used by malwares, I don't recall seeing it, but GS, if GS is null, then it's a 32-bit 00:13:50.617 --> 00:13:54.029 system, and if it's not it's a 64-bit system. 00:13:54.029 --> 00:13:56.094 I will actually use that later in one of the tricks. 00:13:56.094 --> 00:14:04.032 And also, the relations between the registers -- there are many registers on the Intel CPUs -- is not 00:14:04.032 --> 00:14:10.077 sometimes very clear. 
I was surprised that when you do a FPU operation, it changes the FPU status, the 00:14:10.077 --> 00:14:18.051 FPU registers themselves, but also the MMX registers, and somehow all the documentations I saw on the 00:14:18.051 --> 00:14:24.679 internet are always mapping ST0 and MM0 in front of each other which makes sense, but actually if you 00:14:24.679 --> 00:14:30.470 modify -- if you just do a single FPU operation, it will actually modify not MM0, but MM7. 00:14:31.039 --> 00:14:36.075 So if you do an FPU operation like "load PI" [FLDPI] and then you check the value of MM7, that could be used 00:14:36.075 --> 00:14:38.781 as a trick or it's just like the way it is. 00:14:38.781 --> 00:14:45.093 And like, all the documentations, wikipedia and so on, that I could find about the overlapping of the registers. 00:14:45.093 --> 00:14:53.032 Another thing is that this was used as an anti-emulation trick in XP, that FPU also changes CR0 00:14:53.032 --> 00:14:59.077 so you have quite an unexpected anti-emulation trick by just using FPU operation. 00:14:59.077 --> 00:15:08.647 So here is it; basically 'store machine status word' [SMSW] is an older 286 CPU opcode -- mnemonic, that was 00:15:08.647 --> 00:15:18.048 created at the 286 era, so before the protected mode was fully created, and so it allows you to access 00:15:18.048 --> 00:15:26.096 to read the value of CR0, even from user mode, while the 'MOV CR0' is actually a privileged opcode. 00:15:26.096 --> 00:15:33.643 For some reason, the higher word of the register is undefined officially by the documentation, so Intel 00:15:33.643 --> 00:15:40.044 just says "this is the value -- the lowest value is correct but you cannot expect the real value". So for 00:15:40.044 --> 00:15:45.072 some reason, I don't know why they say that, because it's actually the value - the higher bits - of CR0. 
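[Illustration, not from the talk] The earlier remark that a single FPU load shows up in MM7 rather than MM0 follows from how the x87 stack is addressed: ST(i) is relative to the TOP field, while MMi names a fixed physical register. Below is a minimal model, assuming a freshly initialised FPU (TOP = 0) and tracking only the 64-bit significands; the class and method names are hypothetical.

```python
# 64-bit significand of pi in x87 extended precision (the value FLDPI loads).
PI_SIGNIFICAND = 0xC90FDAA22168C235


class X87Model:
    """Tracks only TOP and the significands of physical registers R0..R7."""

    def __init__(self):
        self.top = 0             # TOP field after FINIT
        self.physical = [0] * 8  # R0..R7

    def push(self, significand):
        # A load decrements TOP (modulo 8), then writes the new ST(0).
        self.top = (self.top - 1) % 8
        self.physical[self.top] = significand

    def st(self, i):
        # ST(i) is addressed relative to TOP.
        return self.physical[(self.top + i) % 8]

    def mm(self, i):
        # MMi aliases physical register Ri, regardless of TOP.
        return self.physical[i]


fpu = X87Model()
fpu.push(PI_SIGNIFICAND)  # models FLDPI on a freshly initialised FPU
```

After the first push TOP wraps from 0 to 7, so ST0 and MM7 refer to the same physical register while MM0 is untouched. Diagrams that draw ST0 next to MM0 are only accurate when TOP is 0.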
00:15:45.072 --> 00:15:52.722 And under XP, when you do FPU operations, the value of CR0 will be modified, and eventually reverts 00:15:52.722 --> 00:16:00.033 by itself. So you can have, just by doing -- SMSW, and then you expect the result, then 00:16:00.033 --> 00:16:05.091 you do a FPU operation, then the result should be different, and then eventually the result will revert 00:16:05.091 --> 00:16:10.263 to the original value. So it's quite a tricky and unexpected anti-emulator. 00:16:11.001 --> 00:16:18.918 You have a similar trick on 32-bit Windows, where GS is not stored in the context, so it means that on 00:16:18.918 --> 00:16:25.314 thread-switch the value of GS is lost, which means if you just wait for something, GS will eventually 00:16:25.314 --> 00:16:32.343 reset to 0. So if you set GS and you are stepping manually, this is slow and this creates a thread-switch, 00:16:32.420 --> 00:16:39.559 so instantly GS is lost. And also, like the previous trick, if you just wait for GS not to be... 00:16:40.051 --> 00:16:45.067 if you just loop until GS is not 0, this on a real system, will eventually exit from the loop. 00:16:45.067 --> 00:16:52.843 But the first time, it blew me, I was really wondering what can happen there, there's no other thread 00:16:52.843 --> 00:16:58.334 and of course in my proof of concept, it directly starts like this. What happens? What should happen now , 00:16:58.334 --> 00:17:02.091 but on a real system? Eventually, it's reset to 0. 00:17:02.091 --> 00:17:10.308 Another thing is that of course it's reset to 0, but not in 0 time, so if you do wait for GS's reset 00:17:10.308 --> 00:17:17.051 and then another loop, this can only happen between two resets... thread switch, which means it should 00:17:17.051 --> 00:17:23.008 take a minimum of time, so you can use that for timing -- anti-emulation timing tricks. 00:17:25.054 --> 00:17:32.405 Of course, I was also thinking that NOP is perfect, because NOP is NOP, it does nothing. 
00:17:33.020 --> 00:17:44.013 But originally NOP is 'exchange eax with eax' [xchg eax, eax], or 'ax with ax', but the problem is that NOP [encoded as] 0x90 is always doing nothing, 00:17:44.013 --> 00:17:51.020 but on 64-bit you always have, you have another encoding [87 c0] to do an 'exchange EAX AX' which this time again 00:17:51.020 --> 00:17:54.064 doesn't do anything on 32b, but like all the other opcodes 00:17:54.064 --> 00:17:58.075 in 64b mode, it actually resets the higher DWORD 00:17:58.075 --> 00:18:02.075 so you have an XCHG EAX [,EAX] that does something, 00:18:02.075 --> 00:18:05.075 even though at first it looks like it would do nothing 00:18:05.075 --> 00:18:09.251 but hopefully in this case the 90 NOP is still doing nothing 00:18:10.020 --> 00:18:13.632 and this is probably now common in malwares and stuff 00:18:14.017 --> 00:18:18.049 HINT NOP was the multi-byte nop 00:18:18.049 --> 00:18:22.518 that actually gives a hint about what will be executed next, by the CPU 00:18:23.041 --> 00:18:24.051 whatever the address here [in memory referenced HINT NOP] 00:18:24.051 --> 00:18:25.595 it wouldn't trigger an exception 00:18:25.595 --> 00:18:29.631 but as you can see, it's really a multi-byte opcode -- it can be a very long nop 00:18:30.755 --> 00:18:31.666 that's weird to say 00:18:32.143 --> 00:18:35.455 another thing is, once again it's partially undocumented by Intel 00:18:37.024 --> 00:18:44.055 the full range of HINT NOP encoding is bigger on AMD documentation 00:18:44.055 --> 00:18:47.702 and another thing is that, because it's a multi-byte opcode 00:18:48.040 --> 00:18:51.076 if you - at the end of a page - insert those bytes 00:18:51.076 --> 00:18:54.478 then it will look for the operands 00:18:54.739 --> 00:18:56.064 then it could trigger an exception, 00:18:56.064 --> 00:18:59.702 so it's a nop that could trigger an exception if at the end of the page 00:19:01.040 --> 00:19:04.059 so, thank you Intel -- or whatever, I don't know, I'm not 
sure 00:19:04.059 --> 00:19:06.270 MOV, once again, I thought... 00:19:06.270 --> 00:19:09.963 MOV being MOV, should be perfectly logical 00:19:10.994 --> 00:19:15.378 sadly not... first... all this is documented, but it's tricky 00:19:15.440 --> 00:19:19.076 because -- there were even bugs for that in all the disassemblers I tried, I think 00:19:19.076 --> 00:19:20.861 well, except Xed, maybe 00:19:22.569 --> 00:19:29.055 you cannot do MOV on or from CR0 on memory 00:19:29.055 --> 00:19:32.046 so the documentation says that the Mod/RM is ignored 00:19:32.754 --> 00:19:34.585 it doesn't mean it's illegal, it's just ignored 00:19:34.701 --> 00:19:36.602 so if you do this, which could lead to a crash 00:19:36.602 --> 00:19:39.051 it's actually interpreted as that 00:19:39.051 --> 00:19:42.033 and as far as I can remember, you'd fail all the disassemblers with that 00:19:42.033 --> 00:19:43.657 until recently [ ;) ] 00:19:44.042 --> 00:19:50.455 MOVSXD is a 64b opcode, is sign-extending, so theoretically 00:19:50.455 --> 00:19:55.040 it should work from a smaller register to a bigger register 00:19:55.040 --> 00:19:57.811 but if you use no REX prefix, which is discouraged 00:19:57.811 --> 00:20:00.233 you can actually make it work like a standard MOV, 00:20:01.402 --> 00:20:04.035 and the other way around, 00:20:04.035 --> 00:20:09.092 MOV from a selector to a 32b register actually works 00:20:09.092 --> 00:20:12.486 so many disassemblers were disassembling that as MOV AX, CS 00:20:12.486 --> 00:20:15.671 because that would make both operands the same size, 00:20:15.671 --> 00:20:19.306 but actually the upper word of the target register 00:20:19.306 --> 00:20:22.635 is 'undefined' but actually there is no funny thing here, 00:20:22.635 --> 00:20:24.824 there's no random value, it's zeroes 00:20:24.824 --> 00:20:29.271 so basically, it makes it equivalent to MOV EAX, CS 00:20:30.363 --> 00:20:32.058 BSWAP is one of my favorite 00:20:32.058 --> 00:20:34.693 because I 
think it's like an administration 00:20:35.016 --> 00:20:37.687 it's supposed to just swap the endianness of the registers 00:20:37.687 --> 00:20:42.406 but because of -- external reasons 00:20:42.406 --> 00:20:44.558 it's never really doing the work you expect 00:20:44.558 --> 00:20:50.041 so, only in 64b, it's actually correctly swapping the endianness 00:20:50.041 --> 00:20:51.096 as you would expect 00:20:51.096 --> 00:20:55.096 on EAX [32b], in 64b [mode], like all the 32b opcodes, 00:20:55.096 --> 00:20:58.341 it will actually register [clear] the higher dword -- ok ! 00:20:58.341 --> 00:21:02.072 and, on word, it's actually 'undefined' again 00:21:02.072 --> 00:21:04.020 but it's commonly used in malwares and packers 00:21:04.072 --> 00:21:07.008 because it just resets [the register] 00:21:07.008 --> 00:21:09.055 so it's like a XOR AX, AX 00:21:09.055 --> 00:21:14.051 so, with this unexplainable result, I understand 00:21:14.051 --> 00:21:18.072 that Intel probably doesn't want to explain -- just say it's 'undefined' 00:21:18.072 --> 00:21:20.092 because they would be too ashamed to explain 00:21:20.092 --> 00:21:22.395 why we get this funny result 00:21:24.072 --> 00:21:31.134 BSWAP AX is also wrongly disassembled by WinDbg and so on 00:21:33.042 --> 00:21:35.068 it will be disassembled as BSWAP EAX 00:21:35.068 --> 00:21:36.776 and actually, you clear the register 00:21:42.007 --> 00:21:44.317 can everybody understand this code? 00:21:47.040 --> 00:21:49.509 anybody sees the potential trap? 00:21:53.001 --> 00:21:56.064 so, it pushes the address of on the stack, 00:21:56.064 --> 00:21:59.502 then RETN takes the address from the stack, 00:21:59.502 --> 00:22:02.699 and, basically, you just jump to an immediate value, 00:22:10.114 --> 00:22:10.948 execution ordering ? 00:22:10.974 --> 00:22:12.846 yeah, the execution starts here 00:22:14.031 --> 00:22:17.125 ??? 
00:22:17.125 --> 00:22:20.096 no -- ok, it's not the point here 00:22:20.096 --> 00:22:25.546 and of course, if you -- this is OllyDbg 1, it's fixed in OllyDbg 2 00:22:25.546 --> 00:22:28.026 but OllyDbg1 is even trying to be nice, 00:22:28.026 --> 00:22:30.055 telling you -- this is an automatic comment -- that RET 00:22:30.055 --> 00:22:32.382 is used as a jump to 00:22:33.059 --> 00:22:36.034 and, as you can see, not exactly the same [happens] 00:22:36.034 --> 00:22:37.048 so, what happened ? 00:22:37.048 --> 00:22:38.239 no one sees ? 00:22:40.024 --> 00:22:42.457 so, basically, here, you have a 66 prefix on RETN 00:22:42.826 --> 00:22:46.081 which actually makes RETN to IP, and not EIP 00:22:47.035 --> 00:22:55.017 so, actually, you don't jump to 401008, but to 00001008 00:22:55.663 --> 00:22:58.564 and in this proof of concept, I mapped the NULL page 00:22:58.564 --> 00:23:01.009 and I created -- added some code at this address 00:23:01.009 --> 00:23:05.602 so, this is actually not a return to this [] 00:23:05.602 --> 00:23:10.085 but the problem is that, officially, this is also called a 'return' 00:23:10.085 --> 00:23:15.079 it's not [different from the standard one] -- the disassemblers added their own, now, way of disassembling it 00:23:15.079 --> 00:23:19.055 like 'small retn', ret.16, or something like this 00:23:19.055 --> 00:23:22.075 but actually officially, it's the same mnemonic 00:23:22.075 --> 00:23:26.743 so, the latest Hiew, I think, and that's OllyDbg 1 00:23:28.328 --> 00:23:31.017 maybe the latest OllyDbg 2 fixed that 00:23:31.017 --> 00:23:33.016 but you can still be tricked just by that 00:23:33.016 --> 00:23:41.024 the 66 prefix - the jump to IP - also works on CALLs, RETs, LOOPs, [and JMPs] 00:23:41.024 --> 00:23:44.145 so all the flow control opcodes 00:23:45.098 --> 00:23:47.489 so, I won't enumerate all the tricks, 00:23:47.489 --> 00:23:51.074 because otherwise you'll die of boredom probably 00:23:51.074 --> 00:23:55.041 if you want 
more, then I created a page on Corkami [x86.corkami.com], 00:23:55.041 --> 00:24:00.075 and I already made some graphs and cheat sheets 00:24:00.075 --> 00:24:03.521 to have an easy [table] -- list of opcodes 00:24:04.413 --> 00:24:06.877 and, that's quite too much theory for now... 00:24:06.877 --> 00:24:11.780 So, I don't like just -- reading stuff and not having something to feed my debugger 00:24:11.780 --> 00:24:12.789 so I created CoST 00:24:12.789 --> 00:24:16.000 which stands for Corkami Standard Test 00:24:16.000 --> 00:24:20.680 CoST is a single binary, there is no option, 00:24:20.680 --> 00:24:25.050 you just run it, and it will just execute a lot of different tests 00:24:25.050 --> 00:24:28.066 and then, I also made it a hardened PE, 00:24:28.066 --> 00:24:35.016 so it may also help you to test the PE side of your tools 00:24:35.016 --> 00:24:36.044 or your knowledge 00:24:36.044 --> 00:24:40.020 but, because in hardened PE, it's actually quite difficult to debug, 00:24:40.020 --> 00:24:42.073 I also made an easy PE mode so that 00:24:42.073 --> 00:24:47.042 you can study only the assembly, and not have too much troubles 00:24:47.042 --> 00:24:48.167 debugging it 00:24:49.044 --> 00:24:50.980 so, CoST contains a lot of tests 00:24:57.088 --> 00:24:59.075 classic stuff -- very trivial stuff 00:24:59.075 --> 00:25:03.096 then, a few more complex stuff, like JMP to IP, IRET... 00:25:03.096 --> 00:25:05.031 undocumented opcodes 00:25:05.031 --> 00:25:10.041 CPU specific, like MOVBE, POPCNT, CRC32 00:25:10.041 --> 00:25:17.083 also some detections of OS and VM by using common opcodes 00:25:17.083 --> 00:25:25.048 like, the 'red pill trick'... yeah, just SLDT execution, and you get a value, and you compare... 00:25:25.048 --> 00:25:27.509 but it's 'the blue pill', or whatever... 
00:25:29.017 --> 00:25:32.538 and also some OS bugs because sometimes, Windows XP 00:25:32.538 --> 00:25:35.053 was doing the wrong job trying to tell you which was 00:25:35.053 --> 00:25:38.064 the exception that just happened, and it would be a way 00:25:38.064 --> 00:25:44.079 to make the difference between an actual OS and an emulator that would try to be logical 00:25:45.187 --> 00:25:49.030 CoST is written in assembly, so, there's no extra 00:25:50.030 --> 00:25:52.075 it's not compiled, it's not generated, but 00:25:52.075 --> 00:25:56.075 to make it self-documented, I created internal exports 00:25:56.075 --> 00:25:59.615 so that each section of the file is easy to browse [to], 00:25:59.615 --> 00:26:05.088 so that you will know -- if you quickly want to jump to the 64b part 00:26:06.350 --> 00:26:08.032 then it's easier via the exports 00:26:08.032 --> 00:26:13.079 and also I wanted it to print messages in the most convenient way 00:26:13.079 --> 00:26:18.058 so, if you keep printing messages, then it will make the assembly 00:26:18.058 --> 00:26:21.092 wider, I mean longer to scroll, so I used 00:26:21.092 --> 00:26:25.073 Vectored Exception Handling, and a fake opcode 00:26:25.073 --> 00:26:28.052 so that you have the comments of what's gonna happen, 00:26:28.052 --> 00:26:30.035 appearing directly in the code 00:26:30.035 --> 00:26:34.088 so it's a kind of self-documented, without a debug symbols file 00:26:34.088 --> 00:26:38.092 and, you saw, it doesn't have much of output 00:26:38.092 --> 00:26:41.092 but actually it has a lot of debug output 00:26:41.092 --> 00:26:46.996 like 100 -- I forgot -- messages. it's even saying '[trick] I'm gonna do this' 00:26:46.996 --> 00:26:48.793 and then, 'i'm gonna do that...', so 00:26:49.070 --> 00:26:54.571 trying to make it helpful yet a bit hard to disassemble 00:26:57.079 --> 00:26:59.520 can anyone understand what this code is doing ? 
00:26:59.520 --> 00:27:00.806 this is one of my favourite 00:27:02.083 --> 00:27:04.949 we can't see the opcodes 00:27:06.011 --> 00:27:07.378 no, there's no [opcode] trick this time 00:27:17.070 --> 00:27:19.070 so, basically you push some arguments on the stack 00:27:19.070 --> 00:27:21.003 you jump to here 00:27:21.003 --> 00:27:25.589 basically, with the return far [RETF]... I pushed 'push_eip' on the stack 00:27:25.635 --> 00:27:28.052 with a 33 word 00:27:28.052 --> 00:27:30.446 so basically I will RETurn Far to this 00:27:30.446 --> 00:27:35.062 basically I will return back to this EIP in selector 33 00:27:35.062 --> 00:27:38.743 if this is in a 64b OS, and this is a 32b process 00:27:38.774 --> 00:27:42.078 you will return back to execution here, in 64b mode 00:27:42.078 --> 00:27:47.079 because selector 33 is the selector for 64b mode 00:27:47.079 --> 00:27:49.083 which you can access from a 32b process 00:27:49.083 --> 00:27:53.569 so basically this code will be executed first in the current selector 00:27:56.031 --> 00:28:01.096 as you see, and then it's executed back on selector 33, 00:28:01.096 --> 00:28:03.529 which means in 64b mode 00:28:03.529 --> 00:28:08.036 so you have the same EIP, you have the same opcodes 00:28:08.036 --> 00:28:10.016 but the disassembly will be different, 00:28:10.016 --> 00:28:14.024 and I chose some opcodes will make mnemonics 00:28:14.024 --> 00:28:17.374 specific to each side, 32b or 64b sides 00:28:17.374 --> 00:28:22.092 so, it's already quite a b*tch to disassemble 00:28:22.092 --> 00:28:27.092 because, same EIP, so unless you're careful about the selector, 00:28:27.092 --> 00:28:29.011 well, it's a problem 00:28:29.826 --> 00:28:36.468 [Errata: you can debug this kind of code, check my berlinsides presentation (screencast on slide 58)] 00:28:38.422 --> 00:28:44.542 http://bsx2.corkami.com , slide 58 [screencast] 00:28:46.619 --> 00:28:50.063 if you run over it, you return to the original selector, 00:28:50.063 --> 
00:28:52.055 which is why there is the PUSH CS here 00:28:52.055 --> 00:28:56.029 and you go back with the original selector 00:28:56.029 --> 00:28:58.071 execution will go through quickly 00:28:58.071 --> 00:29:00.079 but you cannot step through that code [WRONG, you can with WinDbg+wow64exts] 00:29:00.079 --> 00:29:03.077 so, killing the disassemblers, and the debuggers 00:29:03.077 --> 00:29:04.092 and yet, simple 00:29:04.092 --> 00:29:07.035 so, here is the result that you get when you run CoST 00:29:07.035 --> 00:29:10.069 with the latest -- well the latest public version of Hiew 00:29:10.069 --> 00:29:13.031 I think it's gonna be fixed 00:29:13.031 --> 00:29:16.077 so, this is a HINT NOP that's not documented by Intel 00:29:16.077 --> 00:29:20.249 and it's a bit forgotten by most disassemblers 00:29:20.249 --> 00:29:24.053 so, WinDbg and Hiew are giving you 00:29:24.053 --> 00:29:28.539 undocumented, well -- question marks, or the Hiew style of question marks 00:29:28.539 --> 00:29:34.393 then, since -- that was originally what I planned to present at Hashdays 00:29:34.429 --> 00:29:39.064 but then, I decided to bring a few tricks into CoST itself, on the PE side of things 00:29:39.064 --> 00:29:42.398 so, this is the header, so it has MZ, and then some text 00:29:42.398 --> 00:29:44.045 so you can 'type cost.exe' 00:29:44.045 --> 00:29:46.088 and it has some text - I made it type-able 00:29:46.088 --> 00:29:51.079 and the NT headers - the 'PE' header, the one starting with PE 00:29:51.079 --> 00:29:54.075 is actually starting at the bottom of the file -- the bottom of the file is here 00:29:54.075 --> 00:29:55.154 so it's a footer 00:29:55.154 --> 00:29:57.559 and I made it so the values are quite critical 00:29:57.636 --> 00:30:01.035 so, they are not the ones you would expect 00:30:01.035 --> 00:30:03.044 so this is the result that you would get when 00:30:03.044 --> 00:30:05.457 loading CoST under IDA 6.1 00:30:07.011 --> 00:30:10.270 so, well, 
some values were random and everything 00:30:11.024 --> 00:30:15.329 but, if you have -- with CoST, you can test and set the value of a register 00:30:15.329 --> 00:30:16.620 then compare it 00:30:16.620 --> 00:30:19.065 but you cannot test all the possibilities of PE files 00:30:19.065 --> 00:30:21.068 with a single file, because you have to choose 00:30:21.068 --> 00:30:25.073 so, for example, CoST has no section, weird alignments and everything 00:30:25.073 --> 00:30:27.079 but you cannot make all the possible cases [in a single file] 00:30:27.079 --> 00:30:31.011 so, I went on and I created another page on Corkami 00:30:31.011 --> 00:30:37.024 with, as usual, the proof of concepts, some graphs about the PE files and everything 00:30:37.024 --> 00:30:40.073 I don't consider it finished but I consider it good enough to break 00:30:40.073 --> 00:30:41.496 a bit everything 00:30:42.096 --> 00:30:46.040 now, I already created more than 100 PoCs, which try 00:30:46.040 --> 00:30:50.680 0 section, big alignments, huge alignments, and I have some funny results... 00:30:50.680 --> 00:30:55.020 so, here is the 'virtual section table vs Hiew' 00:30:55.020 --> 00:31:00.007 so, when you're in low alignments, you can have no section, 00:31:00.007 --> 00:31:03.027 or the section table can be empty 00:31:03.027 --> 00:31:08.040 so basically, I made the SizeOfOptionalHeader point in virtual memory space 00:31:08.040 --> 00:31:11.097 which means the section table is out of the PE file [full of 00, in virtual space] 00:31:11.097 --> 00:31:16.028 and Hiew doesn't like this. 
A consequence of that is that it doesn't even think it's a PE file 00:31:16.028 --> 00:31:18.088 while it's fully working, but this trick only works under XP 00:31:18.088 --> 00:31:25.009 because Windows 7 is a bit more picky on the unused section table values 00:31:29.051 --> 00:31:34.027 so when you see some ASCII art in the Data Directories 00:31:34.027 --> 00:31:37.020 you can probably guess that there is something going on 00:31:37.020 --> 00:31:40.003 if you have a better ASCII art suggestion, I'm all ears 00:31:40.003 --> 00:31:43.031 so, basically, this is the 'Dual PE header' that was presented by 00:31:43.031 --> 00:31:45.108 Reversing Labs at BlackHat 00:31:45.108 --> 00:31:47.816 so, are you familiar with that ? 00:31:50.031 --> 00:31:52.053 so, basically, you extend the SizeOfHeaders so that 00:31:52.053 --> 00:31:59.068 the NT headers will be actually mapped at the bottom of the file 00:31:59.068 --> 00:32:03.269 so that it's far enough to reach section [not file] alignment 00:32:03.684 --> 00:32:05.053 and when you load that, in memory 00:32:05.053 --> 00:32:07.299 the first section will actually be mapped over it 00:32:09.530 --> 00:32:12.683 the first part of the OPTIONAL_HEADER is the one used on disk 00:32:13.052 --> 00:32:16.054 so, this is what is used to check if the file will load 00:32:16.054 --> 00:32:20.096 but the Data Directories are read from the values in memory 00:32:20.096 --> 00:32:25.003 so, first, the OPTIONAL_HEADER is parsed, mapped in memory 00:32:25.003 --> 00:32:29.036 then the section is folding itself over the bottom part of the header 00:32:29.036 --> 00:32:31.088 and then the true Data directories that were originally 00:32:31.088 --> 00:32:34.028 in the start of the section will be taken into account 00:32:34.028 --> 00:32:39.068 so all this is garbage and visible on disk, it follows the SizeOfOptionalHeader 00:32:39.068 --> 00:32:43.515 but actually in memory, this is not what gets parsed 00:32:45.007 --> 00:32:47.040 
another weird thing is that the export names can just be 00:32:47.040 --> 00:32:51.007 absolutely anything, until a null character 00:32:51.007 --> 00:32:53.072 which means, non ASCII, whatever 00:32:53.072 --> 00:32:56.013 and another funny thing is that 00:32:56.013 --> 00:32:57.052 Hiew displays them in line 00:32:57.052 --> 00:32:59.038 so you can just add your own ads, 00:32:59.038 --> 00:33:02.055 because those are just export names, and one of the export 00:33:02.055 --> 00:33:05.088 [name] is actually more than 16 Kb 00:33:05.088 --> 00:33:08.030 so that it's good enough to create a buffer overflow 00:33:08.030 --> 00:33:10.053 if your tool is not careful about that 00:33:10.053 --> 00:33:14.029 and it's also possible to have a NULL export [name], just a character NULL 00:33:14.029 --> 00:33:15.484 and you can import a NULL API 00:33:15.484 --> 00:33:16.652 no problem 00:33:19.021 --> 00:33:23.000 I also just tried to see the different possibilities 00:33:23.000 --> 00:33:26.048 created a few files that had the maximum number of sections 00:33:26.048 --> 00:33:31.068 the limit is 96 under XP, and 64K under Vista and [Windows] 7 00:33:31.068 --> 00:33:33.007 which means, well 00:33:33.007 --> 00:33:36.072 OllyDbg 2 - the latest OllyDbg - gives you a funny message 00:33:36.072 --> 00:33:38.025 but it still loads the file. 00:33:38.025 --> 00:33:40.240 OllyDbg 1 crashes directly on this file 00:33:42.102 --> 00:33:42.964 err...still some time ? 
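[Editor's sketch] The point above about export names (any bytes up to a null character, possibly longer than 16 Kb, possibly empty) suggests that a tool should bound the read instead of copying into a fixed buffer. A minimal Python sketch of that idea follows; the function name `read_export_name` and the 64 KiB default cap are assumptions made for illustration, not something taken from the talk or from any existing tool.

```python
def read_export_name(image: bytes, offset: int, max_len: int = 64 * 1024) -> bytes:
    """Return the raw bytes of a NUL-terminated export name.

    Hypothetical helper: the name is returned as bytes because export
    names are not guaranteed to be ASCII. max_len is an arbitrary cap
    chosen for this sketch.
    """
    if offset < 0 or offset >= len(image):
        raise ValueError("name offset outside the image")
    # Never scan past the end of the image or past the cap.
    limit = min(len(image), offset + max_len)
    end = image.find(b"\x00", offset, limit)
    if end == -1:
        raise ValueError("no NUL terminator within the allowed window")
    # An empty result is legal: it corresponds to the NULL export name.
    return image[offset:end]
```

Returning bytes and raising on a missing terminator leaves the display decision (escape, truncate, or refuse) to the caller, which is where the inline-display issue mentioned for Hiew would have to be handled.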
00:33:45.057 --> 00:33:48.059 and the one last, not very visual, but I noticed 00:33:48.059 --> 00:33:52.068 that the AddressOfIndex of the TLS is overwritten on loading 00:33:52.068 --> 00:33:59.027 and imports - the terminator of imports doesn't need to be five null dwords 00:33:59.027 --> 00:34:03.010 but only if the name [of the DLL] is 0, then the import descriptor 00:34:03.010 --> 00:34:05.093 is considered a terminator 00:34:05.093 --> 00:34:09.290 so, basically, if you make AddressOfIndex point to the name of an import descriptor 00:34:10.028 --> 00:34:15.026 you could get that overwritten, and then the imports will be truncated 00:34:15.026 --> 00:34:16.067 will be considered truncated 00:34:16.067 --> 00:34:20.060 and actually, the behavior is different under XP or Windows 7 00:34:20.060 --> 00:34:25.085 so, under XP, it's overwritten after imports loading, 00:34:25.085 --> 00:34:28.027 so the whole imports table is not truncated, 00:34:28.027 --> 00:34:32.036 while under Windows 7, it's happening before the imports are loaded, 00:34:32.036 --> 00:34:35.032 which means you have the same PE, but different loading behaviour 00:34:35.032 --> 00:34:37.018 under different versions of windows 00:34:37.018 --> 00:34:40.617 and the file works on both versions of windows 00:34:43.033 --> 00:34:45.706 oh wait, before that... maybe I still have some time ? 00:34:55.059 --> 00:34:56.051 15 minutes left ? ok 00:34:56.051 --> 00:34:58.493 I'll do the demo 00:35:00.586 --> 00:35:01.457 This is just to prove... 00:35:02.400 --> 00:35:03.379 sorry? 
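[Editor's sketch] The import terminator rule described above (a descriptor whose DLL name field is 0 ends the table, whatever the other four dwords contain) can be sketched in Python as below. The 20-byte layout of five little-endian dwords is the standard import descriptor layout; the function name `walk_import_descriptors` is hypothetical, and the stop condition follows the rule as stated in the talk, not a verified description of any particular Windows loader.

```python
import struct

# OriginalFirstThunk, TimeDateStamp, ForwarderChain, Name, FirstThunk
DESCRIPTOR = struct.Struct("<5I")


def walk_import_descriptors(table: bytes) -> list:
    """Collect (name_rva, original_first_thunk, first_thunk) tuples.

    Stops at the first descriptor whose Name field is 0, even if the
    other fields of that descriptor are non-zero.
    """
    found = []
    last_start = len(table) - DESCRIPTOR.size
    for off in range(0, last_start + 1, DESCRIPTOR.size):
        oft, _stamp, _chain, name_rva, first_thunk = DESCRIPTOR.unpack_from(table, off)
        if name_rva == 0:
            break  # terminator, according to the rule above
        found.append((name_rva, oft, first_thunk))
    return found
```

If a write such as the TLS AddressOfIndex update zeroes the Name field of the second descriptor before the table is walked, this function returns a single entry; if the write happens after the walk, all entries are seen. That ordering is the XP versus Windows 7 difference the talk describes.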
00:35:23.117 --> 00:35:25.423 This is the kind of PE file that I typically create 00:35:25.454 --> 00:35:28.702 I only defined [required] elements that just need to work 00:35:28.702 --> 00:35:30.318 and this is actually a driver 00:35:30.318 --> 00:35:33.734 so, even though I used some undocumented opcodes 00:35:36.673 --> 00:35:39.008 It's a working driver and it doesn't have the usual 00:35:40.362 --> 00:35:41.612 [compiler] stuff you have in a driver 00:35:43.596 --> 00:35:47.092 just to say that this is the kind of PoC, clear to see 00:35:47.092 --> 00:35:51.046 you don't have external stuff that bother, that bugs your view 00:35:51.046 --> 00:35:52.088 or your debugging 00:35:52.088 --> 00:36:02.061 so, this one is just to see the possible values of CR0 00:36:02.061 --> 00:36:07.053 via the SMSW, theoretically undefined on DWORD 00:36:07.053 --> 00:36:08.677 but it actually gives you the same value 00:36:08.677 --> 00:36:11.071 [like] the standard MOV EAX, CR0 00:36:11.071 --> 00:36:16.059 and here is MOV EAX, CR0 with the wrong Mod/RM 00:36:16.059 --> 00:36:22.197 which, in the latest Hiew, is actually not disassembled at all 00:36:37.813 --> 00:36:38.840 let's hope it doesn't crash... 
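[Editor's sketch] The 'MOV EAX, CR0 with the wrong Mod/RM' case can be illustrated with a tiny decoder. For the `0F 20` opcode (move from a control register), the Intel manual states that the two mod bits of the ModR/M byte are ignored, so a decoder that insists on mod = 11 will reject encodings the CPU accepts. The function name and the example bytes `0F 20 00` are assumptions for illustration; the exact bytes used in the PoC are not given in the talk.

```python
GPR32 = ("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi")


def decode_mov_from_cr(code: bytes) -> str:
    """Decode 0F 20 /r (MOV r32, CRn) without checking the mod bits."""
    if len(code) < 3 or code[0] != 0x0F or code[1] != 0x20:
        raise ValueError("not a MOV r32, CRn encoding")
    modrm = code[2]
    cr = (modrm >> 3) & 7   # reg field selects the control register
    rm = modrm & 7          # r/m field selects the general-purpose register
    # The mod field (modrm >> 6) is deliberately not inspected.
    return f"mov {GPR32[rm]}, cr{cr}"
```

With this sketch, both `0F 20 C0` (the usual encoding) and `0F 20 00` decode to the same instruction, which is the behaviour a strict disassembler misses.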
00:36:43.071 --> 00:36:47.032 so, as you can see, you get exactly the same value 00:36:47.032 --> 00:36:53.377 whether you're using the normal CR0, the 'invalid' one, or the 'undefined' one 00:36:55.069 --> 00:36:57.059 the upper part is supposed to be undefined 00:36:57.059 --> 00:37:00.071 usually when it's undefined, it's zeroes, in Intel language 00:37:00.071 --> 00:37:02.028 but here it just works fine 00:37:02.028 --> 00:37:03.063 and my machine didn't even crash 00:37:03.063 --> 00:37:05.092 which means the driver is fine 00:37:05.092 --> 00:37:07.061 so you can study small drivers 00:37:08.046 --> 00:37:11.245 the first PoC that I presented here 00:37:11.537 --> 00:37:15.076 was the one with old disassembly 00:37:15.076 --> 00:37:17.669 does anyone still know what the value is? 00:37:19.669 --> 00:37:22.526 so basically, some opcodes are here for garbage 00:37:22.526 --> 00:37:28.029 just to prove that they are actually [supported], they are just used as junk 00:37:28.029 --> 00:37:30.088 but registers are actually modified [in the others] 00:37:30.088 --> 00:37:37.563 and these opcodes from the 70's, or something -- the early 80's 00:37:37.563 --> 00:37:40.903 are still perfectly working on a modern CPU or even an i7 00:37:43.088 --> 00:37:47.557 one of the PoCs I created is the one that actually tests the values 00:37:47.557 --> 00:37:50.429 -- the initial values [of each register] -- so that you can see 00:37:50.506 --> 00:37:54.615 what would be the possible values whether it's on XP or Windows 7 00:37:56.015 --> 00:38:01.084 each time [TLS, EntryPoint, DllMain], I just save all the values of the registers 00:38:01.084 --> 00:38:03.665 and then I compare them to possible values 00:38:03.665 --> 00:38:06.028 so I test them one after another 00:38:06.028 --> 00:38:10.071 actually, on TLS, you have much more control of the values 00:38:10.071 --> 00:38:16.077 because the values you will get in the TLS -- on loading the TLS 00:38:16.077 --> 00:38:20.021 are the 
RVA [of the TLS data directory], the callbacks, the size of the TLS 00:38:20.021 --> 00:38:23.476 you get that in -- I forgot exactly, but it's in the source... 00:38:26.076 --> 00:38:33.063 running this will help you to mimic an OS better in your emulator 00:38:33.063 --> 00:38:35.055 if that's what you're interested [in] 00:38:35.055 --> 00:38:41.062 SMSW is actually the one comparing -- so, using SMSW, 00:38:41.062 --> 00:38:46.045 then comparing the value, then checking whether the register changed 00:38:46.045 --> 00:38:48.080 [after an FPU operation] and then when it reverts normally 00:38:48.080 --> 00:38:52.053 a funny fact that I would like an explanation [for], 00:38:52.053 --> 00:38:53.538 if you know it 00:38:54.076 --> 00:39:01.015 is that actually, this behaviour is different if you run the file normally 00:39:01.015 --> 00:39:04.061 or if you run it with a redirection 00:39:04.061 --> 00:39:08.046 if you pipe the output, you get a 'fail' result 00:39:08.046 --> 00:39:11.053 if you run the file normally, it just works 00:39:11.053 --> 00:39:17.436 so, I would like -- here, I will just run it, and then I will run it to a file, and just TYPE the result 00:39:22.021 --> 00:39:24.071 normal execution: OK 00:39:24.071 --> 00:39:26.057 redirection: FAIL 00:39:26.057 --> 00:39:28.743 if you guys have any explanation for that, I'm all ears 00:39:30.066 --> 00:39:37.011 did you try redirecting to something else ? like, a COM 00:39:37.088 --> 00:39:38.670 oh, I didn't try 00:39:42.008 --> 00:39:44.362 so, you would pipe to another device, and ... 00:39:44.686 --> 00:39:46.071 but then, how do you get it back ? 00:39:46.071 --> 00:39:48.017 printer, or ... 00:39:48.017 --> 00:39:51.087 yeah, I don't have a COM device or... 00:39:54.025 --> 00:39:56.076 yeah, I don't know 00:39:56.076 --> 00:39:59.092 but it was a big surprise, because I had a test bench 00:39:59.092 --> 00:40:01.248 and then, 'FAIL'. .. uh ? 00:40:02.048 --> 00:40:05.948 run, OK... 
so, I have no idea why... 00:40:07.025 --> 00:40:07.979 the GS trick... 00:40:09.071 --> 00:40:10.379 quite simple 00:40:10.394 --> 00:40:15.327 and I also have some output 00:40:19.065 --> 00:40:21.084 I modified GS then it's reset 00:40:21.084 --> 00:40:23.065 then it's waited for result 00:40:23.065 --> 00:40:26.057 then I'm doing 2 resets and checking the time in between 00:40:26.057 --> 00:40:28.711 so that, it shouldn't happen too quickly 00:40:30.003 --> 00:40:31.317 NOPs, so... 00:40:37.025 --> 00:40:39.059 I'm testing the undocumented NOPs 00:40:39.059 --> 00:40:44.004 testing the NOP that are on invalid page 00:40:53.851 --> 00:40:55.239 so, standard NOP 00:41:00.423 --> 00:41:01.935 32b nop 00:41:07.058 --> 00:41:15.071 so, all my 64b tests are still done in 32b process so that you can run them on normal OS 00:41:15.071 --> 00:41:19.033 then it detects via GS if 64b [mode] is available 00:41:19.033 --> 00:41:21.067 and in this case, you would get a different result 00:41:21.067 --> 00:41:26.059 so, if you run it on 64b, which I don't have here, you would get 00:41:26.059 --> 00:41:28.066 the actual tests on 64b 00:41:28.066 --> 00:41:30.162 and the results printed out. 00:41:31.023 --> 00:41:35.017 but still, it's not possible to debug that easily [wrong] 00:41:35.017 --> 00:41:39.048 but at least, there's no trick over there, so it's easy to bring back to a 64b process 00:41:39.048 --> 00:41:43.311 [to step over 64b code and return to the 32b process] 00:41:45.003 --> 00:41:45.768 PUSH/RET 00:41:48.045 --> 00:41:50.577 you print the output, and then... 
00:41:52.023 --> 00:41:56.877 Olly nicely tells you that you will jump to 401008 00:41:58.000 --> 00:42:03.077 but actually -- here the display is actually correct 00:42:03.077 --> 00:42:05.096 and the TLS already created a null page 00:42:05.096 --> 00:42:06.705 which prints 'FAIL' 00:42:09.059 --> 00:42:13.667 so, as expected, but there is no standard way to disassemble that correctly 00:42:15.036 --> 00:42:23.069 I can't execute the working 64k sections. 00:42:23.069 --> 00:42:27.076 and actually I'm executing all the code [the complete virtual space of all 64k sections] 00:42:27.076 --> 00:42:29.028 the sections are quite big 00:42:29.028 --> 00:42:32.443 and I'm modifying EAX so that all the 00 00 are executed 00:42:32.443 --> 00:42:35.070 and just to do a printf in the end. 00:42:35.070 --> 00:42:39.007 it actually takes a few seconds to execute on an i7 00:42:39.007 --> 00:42:43.008 so it's actually quite funny to see... you launch it... even when the cache is loaded, 00:42:43.008 --> 00:42:48.236 and the OS is ready to be fast... you launch it... and printf comes a few seconds later 00:42:50.098 --> 00:42:58.486 virtual sections is the one that Hiew doesn't think it's a PE at all -- this is the latest Hiew 00:43:00.055 --> 00:43:02.040 well, it's been patched anyway 00:43:02.040 --> 00:43:08.067 well, I can't browse PE now that it doesn't think it's a PE file... 00:43:08.067 --> 00:43:13.015 but basically, it thinks that the OPTIONAL_HEADER points to the end of the file -- beyond the end of 00:43:13.015 --> 00:43:14.048 the file 00:43:14.048 --> 00:43:15.085 the folded header... 00:43:16.562 --> 00:43:17.509 a few error messages... 00:43:18.063 --> 00:43:20.099 because of the wrong data directories 00:43:20.099 --> 00:43:22.900 and the actual DD are at the start of... 
00:43:30.023 --> 00:43:31.365 ...the section 00:43:33.011 --> 00:43:40.910 this would be the imports and the actual real DD 00:43:42.048 --> 00:43:48.672 and last, the one with the TLS AddressOfIndex that is pointing... 00:43:57.518 --> 00:44:01.425 ...inside the imports, at the AddressOfName 00:44:02.040 --> 00:44:04.040 so it will overwrite the loading [overwrite the pointer during loading] 00:44:04.040 --> 00:44:11.092 and when you just load it, it just says 'it's XP' because 00:44:11.092 --> 00:44:14.071 my imports were loaded this way, and not the other way. 00:44:14.071 --> 00:44:17.017 and if you run that file [under W7], it will give you another result 00:44:17.017 --> 00:44:18.167 and then, the exports... 00:44:19.059 --> 00:44:24.059 where some of the exports are actually very long 00:44:24.059 --> 00:44:30.050 you can see that actually, here I'm taking over the disassembly 00:44:30.050 --> 00:44:33.096 so I'm repeating the same fake opcodes and address 00:44:33.096 --> 00:44:36.151 so you fool the disassembler that way 00:44:37.059 --> 00:44:40.051 I think it's just a visual effect, there are no big problems 00:44:40.051 --> 00:44:43.018 but it's a known problem that was fixed recently in IDA 00:44:43.018 --> 00:44:46.546 that if you put an export in the middle of the instruction 00:44:46.546 --> 00:44:49.076 the fake export will actually take over the disassembly, 00:44:49.076 --> 00:44:52.011 and that would ruin the disassembly 00:44:52.011 --> 00:44:56.127 there's actually a PoC for that in Corkami, of course 00:44:57.204 --> 00:44:59.502 so, that's all for the demos 00:45:04.564 --> 00:45:09.124 so, I wanted to know more about x86 and PE 00:45:09.616 --> 00:45:12.067 which are far from perfectly documented 00:45:12.067 --> 00:45:14.080 and are still not perfectly documented, 00:45:14.080 --> 00:45:18.048 but at least, I've been covering some parts of it, 00:45:18.048 --> 00:45:20.042 there are still some gray areas, 00:45:20.042 --> 00:45:23.486 but 
at least, every day, I'm just learning a bit more, 00:45:23.486 --> 00:45:25.842 and publishing my results and sharing them openly, 00:45:27.303 --> 00:45:31.080 like WinDbg, if you follow only the official documentation, 00:45:31.695 --> 00:45:36.017 you will only get bad results, with malware and packers out there, 00:45:36.017 --> 00:45:40.063 if you - yourself - are interested, or you develop a tool, an emulator, an engine, whatever... 00:45:40.063 --> 00:45:44.057 well you know you can just visit Corkami, read the pages, 00:45:44.057 --> 00:45:48.024 download the PoCs, which are [freely] available, 00:45:48.024 --> 00:45:50.076 and if you find any bugs - which might happen, 00:45:50.076 --> 00:45:54.269 then send me a postcard, or a red-cross T-shirt 00:45:57.007 --> 00:46:01.076 Thanks to Peter Ferrie, and all my reviewers, and people who contributed... 00:46:01.076 --> 00:46:02.251 do you have any questions ? 00:46:03.066 --> 00:46:10.081 did you run them through AVs - antivirus scanners? you would find a sh*tload of 0days 00:46:10.081 --> 00:46:21.727 no, then, I wouldn't be good to actually turn them into exploits or anything, so... 00:46:23.096 --> 00:46:29.000 already breaking all the disassemblers and stuff was good enough for me 00:46:29.000 --> 00:46:32.794 I found a crash in Intel XED, which was good enough 00:46:40.025 --> 00:46:43.585 any other questions? everybody survived the presentation? 00:46:45.000 --> 00:46:46.547 it's a great talk, man 00:46:46.644 --> 00:46:47.608 thank you! 00:46:48.023 --> 00:46:50.070 THANK YOU! [for watching]