Music

Herald: And give a great round of applause to Sebastian Eschweiler!

Applause

Sebastian: So hi everyone. I want to talk about how I defeated (Not)Petya's cryptography. Some say that (Not)Petya was Ukraine's scourge, and for those of you who don't know what the word scourge means: this guy right here doesn't know either. A quick trivia quiz - does anyone know what this movie is? What's the name of the movie? In the next scene Johnny Depp enters. Does it ring a bell? Jim - a movie by Jim Jarmusch. Soundtrack by Neil Young? Dead Man, right! Great! It's a great movie. So if you want to know what a scourge is, you could watch the movie.

So let's begin with my talk. This is what the official Ukraine Twitter account actually tweeted at the end of June 2017. There was an outbreak of a ransomware attack which was noticed mostly in Ukraine, but also all over the world. Millions of users and also large companies were affected, and the damage went into the billions of dollars. The thing is, this is not your everyday ransomware outbreak. I want to give you a short glimpse into the (Not)Petya universe, and also show how I could decrypt all the stuff that was actually encrypted in this ransomware outbreak.

First I want to begin with a differentiation. I want to draw a demarcation line, because a lot is subsumed under this whole (Not)Petya label, and I'm only talking about a small fraction of this universe. So I will first distinguish between what my talk will be about and what it will not be about. Next I will describe (Not)Petya's cryptography, and especially (Not)Petya's cryptographic failures, which I will then exploit in the remainder of the talk to see how users can get their vacation photos back.

So what was this whole thing? The outbreak started, as I said, in June 2017, and it started as a fake - or rather malicious - update of a software called M.E.Doc. This is a tax software, one of the two official tax softwares in Ukraine, so almost every company doing tax accounting has it installed on their computers, and many private persons have it installed as well. A malicious update was pushed which side-loaded this file, perfc, onto the computers. It comprises several parts, and some parts are more interesting than others. One component would, after about half an hour, start encrypting files, depending on the access level.
If there wasn't sufficient access to infect the computer with the MBR infector, it would simply encrypt files at the privilege level of the current user. For lack of a better name I call this the "Mischa" component - I know it's usually called something different, but this was the best name I could find. It's basically just a file crypter using AES. My talk will not be about this file-infector part.

The next very interesting component was the spreader part. It's basically based on the EternalBlue/EternalRomance exploits that had been leaked by the Shadow Brokers. My talk will not be about this either - that is a whole different universe as well. What my talk will be about is the actual (Not)Petya component, an MBR encrypter, and I will show you on the next slide what it's actually about.

The user will see something like this upon reboot. If the access rights are granted - if there is a local admin on the computer, or the correct password could be guessed by some attack - then the dropper, perfc, would infect the system by overwriting the Master Boot Record with a custom boot loader. It would reboot the system after a predefined time, usually about 10 minutes, and then the actual (Not)Petya component would kick into action. This infected MBR, the bootloader, shows a decoy CheckDisk screen, and in the background it would find and iterate over all the files on the file system and encrypt them.

The main takeaways of this slide are that we are dealing with 16-bit code here. We're in 16-bit real mode, which means no proper file system, no 32-bit or 64-bit code, and no Windows APIs - so debugging and analyzing all this is tedious work. On the plus side, however, we have the BIOS, the basic input/output system, and with that comes a range of interrupts that are very well described. Another really nice thing is having Bochs and being able to debug all this in IDA - a really neat plugin that had been developed by the authors.

So let's analyze a bit, check the cryptography, and see why implementing crypto is really hard. I will start this part by describing in short words the theory of Salsa20, and then compare that against the (Not)Petya implementation of this cipher. Salsa20 is a stream cipher.
Basically, you have a plaintext here, then you have some kind of pseudo-random number generator, you apply an operation to the plaintext, and out comes the ciphertext. What you put in there are four different inputs - I say inputs rather than variables, because one of them is the constant part, which is obviously not variable. But we'll see about that in a bit. So you have the key and the nonce, and then there's this really nifty thing called the counter. What's so nifty about it is that if you were streaming Salsa20-encrypted data and lost some frames, you could adjust this counter variable - which demarcates the offset within the current stream - and then continue seamlessly with the decryption of the stream. A very nice feature. The size of the counter is 64 bits, and as for the hash size: what Salsa20 does is create a 64-byte hash for each combination of these inputs and apply it to the input.

If you want any more details about the Salsa cipher: the inventor of Salsa should be in this room - he is at the convention, I just rode the train with him - so I guess you can ask him for the gory details of Salsa20. It's a really nice crypto cipher and you should hit him up and ask him. I shouldn't have said that, right? Sorry.

A very important thing to note: for every instance of this (Not)Petya encryption, three of these inputs are constant. During the infection process, (Not)Petya patches the key, the nonce, and the constants into a configuration sector. (The constants are somewhat different from the original - more on that in a moment.) Only the counter changes throughout the iterations. So the first interesting question was: what is the length of the key stream, the number of different outputs that come out of this hashing function? For the theoretical implementation this is quite clear: a 64-bit counter times the 64 bytes of output gives a periodicity of about 2^70. One difference between the actual (Not)Petya implementation and the theory is that the constants had been changed to a string reading "invalid sect id", which breaks compatibility with the official implementation.
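To make those four inputs concrete, here is a minimal sketch of the Salsa20 state setup. It follows D. J. Bernstein's public specification, not the (Not)Petya binary; the word layout and the "expand 32-byte k" constants are from the spec, everything else is illustrative:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the Salsa20 input block per the public spec -- not the
 * (Not)Petya code. The state is 16 little-endian 32-bit words. */
typedef struct { uint32_t state[16]; } salsa20_ctx;

static void salsa20_init(salsa20_ctx *ctx, const uint8_t key[32],
                         const uint8_t nonce[8], uint64_t counter)
{
    static const char sigma[17] = "expand 32-byte k"; /* genuine constants */
    memcpy(&ctx->state[0],  sigma + 0,  4);  /* words 0,5,10,15: constants -- */
    memcpy(&ctx->state[5],  sigma + 4,  4);  /* (Not)Petya patches a custom   */
    memcpy(&ctx->state[10], sigma + 8,  4);  /* string in here                */
    memcpy(&ctx->state[15], sigma + 12, 4);
    memcpy(&ctx->state[1],  key,      16);   /* words 1..4:   first key half  */
    memcpy(&ctx->state[11], key + 16, 16);   /* words 11..14: second key half */
    memcpy(&ctx->state[6],  nonce,     8);   /* words 6..7:   nonce           */
    ctx->state[8] = (uint32_t)counter;       /* words 8..9:   64-bit block    */
    ctx->state[9] = (uint32_t)(counter >> 32); /*             counter         */
}

/* Each counter value hashes to one 64-byte keystream block, so byte N of
 * the stream lives in block N/64: to seek, re-init with counter = N/64. */
```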
So, the very first failure I saw in (Not)Petya was something like this. I think I can skip this slide, because it's obvious to everyone that this is a fail, right? Who sees the fail? Not many hands? Okay, then I'll explain it. I was just kidding - I'm not really expecting you to grasp it at first sight. Remember, we're in 16-bit code. We have here this shift-left operation, which shifts a register by N bits. The register here is 16 bits wide - it only has 16 digits, so to speak - and it is shifted by 10 hex, that is, sixteen. This effectively nulls the register. And it's even worse here: an 8-bit register is shifted by sixteen. This is something you wouldn't expect from a proper cryptographic implementation, and I was really intrigued as to why that is, because it makes no sense in source code. Did the (Not)Petya - or rather the Petya - authors really implement that on purpose, or what was the gist of it? So I looked up the Salsa20 implementation - I just googled it and found a nice website that had an implementation of Salsa20. And there you see this code. You can see this is the endianness conversion, you see this shifting of registers, and you see this uint_fast16_t or uint_fast32_t type. So it becomes quite clear that this is a broken implementation, right? Everyone can see that? No, not right away, because you need to know a few more things.

There are two important facts that make this implementation broken: you need to compile this code for 16 bits, and you need to look up what Visual Studio makes of these type definitions. When you look that up - this is from Visual Studio, in the standard stdint.h header file - you see they are translated to unsigned int as the base type. And unsigned int, in 16-bit code, is a 16-bit variable, a 16-bit register. Then everything makes sense. This was the failure here: the authors didn't check whether their code actually worked against the test vectors. And the guy who wrote this Salsa20 implementation made the same mistake.
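The chain of events can be reproduced in a few lines. This is a sketch modeled on that kind of portable load helper; uint_fastN_t is my stand-in name for what uint_fast16_t/uint_fast32_t boil down to once stdint.h has mapped them to unsigned int and a 16-bit compiler has made unsigned int 16 bits wide:

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for uint_fast16_t/uint_fast32_t after stdint.h maps them to
 * unsigned int. Under a 16-bit compiler this type is 16 bits wide. */
typedef unsigned int uint_fastN_t;

static uint32_t load_le32(const uint8_t *p)
{
    /* In a 16-bit build, "<< 16" and "<< 24" shift past the register
     * width (formally undefined in C; the compiler in question emitted
     * shl reg16, 10h, which simply zeroes the register), so the two
     * high bytes are silently lost. In a 32-bit build the same source
     * is perfectly correct -- which is why running the official test
     * vectors would have caught this immediately. */
    return  (uint_fastN_t)p[0]
         | ((uint_fastN_t)p[1] << 8)
         | ((uint_fastN_t)p[2] << 16)
         | ((uint_fastN_t)p[3] << 24);
}

int main(void)
{
    const uint8_t bytes[4] = { 0x78, 0x56, 0x34, 0x12 };
    /* 32-bit compiler: 12345678. 16-bit compiler: 00005678. */
    printf("%08lx\n", (unsigned long)load_le32(bytes));
    return 0;
}
```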
On this slide you see two more bugs in the (Not)Petya implementation of Salsa20, and I quickly want to explain both, because they are of some importance throughout the remainder of the talk. Both revolve around the counter variable. Remember - this counter is the only dynamic input, the only variable input, across all these Salsa20 invocations. The first error: a sector number is read into memory and used directly. A bit about the low-level aspects of a hard drive: from the view of the BIOS, a hard drive looks like a bunch of consecutive sectors - 512-byte chunks of data, one after another. So if you read a sector, you actually read a 512-byte portion of data. A sector number is obviously not the byte offset in the key stream, and that is the problem here: you can see that the same variable is used as the stream position to encrypt or decrypt the data of every sector. It isn't really apparent to the implementer of this cipher, but if you analyze it, it looks something like this: here is the key stream of two consecutive sectors. The first starts with FF, continues with D7, and so on - and the next sector has almost all of its bytes identical, just shifted by one. This is a really big failure, because it converts this really nice Salsa20 algorithm from a one-time pad into a many-time pad. That's the first find I wanted to show you in these very few lines of code.

The second bug is this "large" keyword here. Remember, we are in 16-bit code, so the "large" keyword does not push a 64-bit variable, as you would expect, but a 32-bit one. So only 32 bits of this nice counter variable are even passed in. These two failures are quite a problem for this Salsa20 implementation.

On this slide I took two hex dumps from within the expand-key function - basically two snapshots, one right before these bugs become apparent, so before the endianness conversion, and one right after, on the lower half. You can see very nicely the different key inputs being placed into this memory block. Here it spells out "invalid sect id" - the constants (Not)Petya uses. Here you see the key, broken into two halves, and additionally the nonce here. What really sticks out is this bunch of zeros: the higher part of the 64-bit counter variable isn't even filled. That's the first problem. And after the endianness conversion you see that it's not really an endianness conversion at all - it's more a nulling of bytes. The result is that this initially 64-bit counter variable ends up being just 16 bits long.
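The effect of feeding the (truncated) sector number in as the stream position can be demonstrated with a toy keystream function. To be clear, this is not Salsa20 - just a deterministic per-position byte generator standing in for it, with the uint16_t cast mimicking the nulled counter:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a keystream byte at a given stream position --
 * NOT the real cipher, just enough to visualise the offset bug. */
static uint8_t toy_keystream_byte(uint64_t pos)
{
    pos ^= pos >> 33; pos *= 0xff51afd7ed558ccdULL; pos ^= pos >> 29;
    return (uint8_t)pos;
}

static void crypt_sector(uint64_t position, uint8_t buf[512])
{
    for (int i = 0; i < 512; i++)
        buf[i] ^= toy_keystream_byte(position + i);
}

int main(void)
{
    uint8_t a[512] = {0}, b[512] = {0};
    uint64_t sector = 1000;

    /* Correct use would be position = sector * 512. The buggy code
     * instead passes the *sector number*, truncated to 16 bits: */
    crypt_sector((uint16_t)sector,       a);  /* sector 1000 */
    crypt_sector((uint16_t)(sector + 1), b);  /* sector 1001 */

    /* a[1..511] == b[0..510]: consecutive sectors reuse 511 of 512
     * keystream bytes, shifted by one byte -- a many-time pad. */
    printf("overlap: %s\n",
           memcmp(a + 1, b, 511) == 0 ? "511/512 bytes" : "none");
    return 0;
}
```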
And, as I said earlier, the original Salsa implementation gives a key stream about 2^70 bytes long. What we have now is 16 bits of counter times 64 bytes of output per counter value, which results in a 2^22-byte key stream - a measly 4 megabytes. This was a very interesting observation, because together with the many-time-pad property of this cipher it makes the whole thing really easy to break.

To quickly summarize this part of the talk: we have a very short key stream of just 4 megabytes, and it is highly repetitive - for each sector of progress on disk, the key stream progresses only one byte at a time. Only 22 bits remain of the whole stream. And, as I said, the many-time-pad property is a very nice thing for analysis. I also came around to implementing a small joke here: I will only call this Salsa implementation "LOLsa" from now on. Sorry, it's a bad joke, sorry!

So the main goal is to derive the key stream, and as I'm not really a crypto expert, basically the only attack I know is a known-plaintext attack. That was my goal, because it's so straightforward to do, and in the remainder of the talk I will tell you how I did it. So, without further ado, let's exploit these failures and see how much of the plaintext we can actually get back from a (Not)Petya-infected drive.

The modus operandi of (Not)Petya looks somewhat like this. Let's skip the left-hand side of this slide and concentrate on the right-hand side. For those of you who are not intimately familiar with NTFS - I wasn't either before analyzing Petya and (Not)Petya, so no worries - it's quite simple. Every NTFS partition has something called a master file table, abbreviated MFT. It contains metadata about every file - for example the file name and the file size - and if a file is small enough, its actual content even fits into the record; the MFT is just a list of such records. If the file is larger, the record instead holds a pointer, a so-called data run, which points to the clusters on the partition that contain the actual payload data. One of these MFT records is one kilobyte in size.
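For orientation, this is roughly what the fixed part of such a 1 KiB record looks like. The layout follows public NTFS documentation; the field names are descriptive, not Microsoft's official ones, and the struct is simplified:

```c
#include <stdint.h>

/* Simplified NTFS FILE record header (public NTFS layout docs). Every
 * record is 1024 bytes, so it spans exactly two 512-byte sectors. */
#pragma pack(push, 1)
typedef struct {
    char     magic[4];        /* "FILE" -- prime known plaintext       */
    uint16_t usa_offset;      /* update sequence array offset           */
    uint16_t usa_count;
    uint64_t logfile_seq_no;
    uint16_t sequence_no;
    uint16_t hard_link_count;
    uint16_t attrs_offset;    /* first attribute, e.g. standard info    */
    uint16_t flags;           /* 0x01 = in use, 0x02 = directory        */
    uint32_t bytes_in_use;
    uint32_t bytes_allocated; /* 1024                                   */
    /* ... attributes follow: content of small files sits resident right
     * here; larger files get "data runs" pointing at clusters instead. */
} mft_record_header;
#pragma pack(pop)
```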
Now to the actual implementation - let's zoom out a bit and see how this LOLsa implementation is used in (Not)Petya. It basically just iterates over all of these MFT records and checks whether a record points to a file. If that is the case, it encrypts the first kilobyte of the file, and then encrypts the record itself. This implementation is good for a bunch of reasons. It's very fast: only the first kilobyte needs to be encrypted, and that first kilobyte contains really important information - headers, for example; compressed files in particular have really important header structures there. Additionally, file recovery tools won't work anymore, because most of them rely on this very header. And the second thing: the MFT can be considered the table of contents. With no metadata, with no pointers to the files, you have nothing to work with, nothing to recover from. From an implementation standpoint this is very neat, because it's fast and reasonably thorough.

As the MFT is so important, my idea was to recover it first, then check what comes out of it and see how to progress further with the decryption of the files; the metadata would be of most importance there. I'm a visual person, so here I took two disk dumps from one of my test disks: I infected a clean system with (Not)Petya and let it encrypt the hard drive. On the left-hand side you see the plaintext, on the right-hand side the encrypted data, just to give a better picture of the encryption process. On the far left - fancy PowerPoint animation - there is an indicator showing how much of the data at each offset actually changed, and you can see the whole disk is more or less encrypted. However, the far bottom part is a darker red: this is actually the MFT - the MFT sometimes sits towards the end of the disk, though that may be a misconception.

My idea was something like this. We have two input types: the encrypted MFT and the encrypted files. First I analyze the MFT and derive the key stream from it. Once that analysis stage is finished, I feed the key stream back into this little box here and actually decrypt the MFT. With the decrypted MFT and the key stream I can then find the encrypted files on the disk, decrypt them, and be done with it, right? That was my initial idea. So let's start with the decryption of the MFT - a known-plaintext attack, right?
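Once the analysis stage has produced keystream bytes, the "decrypt" boxes in this pipeline are plain XOR. A sketch, with the position handling mimicking the truncated-counter behavior described earlier (the exact wrap-around here is my assumption, and ks/have_ks stand for whatever the analysis stage fills in):

```c
#include <stdint.h>
#include <stddef.h>

#define KS_LEN (1u << 22)           /* 2^16 counter values * 64 bytes = 4 MiB */
static uint8_t ks[KS_LEN];          /* recovered keystream bytes, filled */
static uint8_t have_ks[KS_LEN];     /* by the analysis stage; 1 = known  */

/* Decrypt one 512-byte sector in place; returns how many bytes we
 * actually knew keystream for (0..512). */
static size_t decrypt_sector(uint32_t sector, uint8_t buf[512])
{
    size_t done = 0;
    for (size_t i = 0; i < 512; i++) {
        /* The bug: the *sector number* (truncated) is the stream
         * position, so sector s uses keystream bytes s..s+511. */
        uint32_t pos = ((uint32_t)(uint16_t)sector + (uint32_t)i) % KS_LEN;
        if (have_ks[pos]) { buf[i] ^= ks[pos]; done++; }
    }
    return done;
}
```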
So you would have here 0:28:40.320,0:28:46.800 the first, the second and so on MFT[br]records. And on the on the column you will 0:28:46.800,0:28:54.510 have the actual byte that is used as key[br]to encrypt the key stream. Remember, the 0:28:54.510,0:29:03.829 operation that that encrypts the you know[br]the the key stream and the the plaintext 0:29:03.829,0:29:07.350 is just a mirror XOR operation.[br]So you have the the key 0:29:07.350,0:29:15.769 stream and the plaintext, and it's plainly[br]- so you can switch forth and back between 0:29:15.769,0:29:21.170 plaintext and ciphertext and even the key[br]stream with a with just applying an XOR 0:29:21.170,0:29:29.850 operation. So what you see here is for the[br]very first records you only have very few 0:29:29.850,0:29:36.680 key stream bytes or very few sample bytes.[br]However, as the progress as you make 0:29:36.680,0:29:41.610 progress with the analysis and then you[br]will have more and more of these sample 0:29:41.610,0:29:49.560 bytes to collect from and this this will[br]give you more confidence in the in the 0:29:49.560,0:29:56.930 result, and the may be known key stream[br]then. The question is, does the MFT hold 0:29:56.930,0:30:02.889 enough plaintext to do some some kind of[br]known plaintext attack? So let's look into 0:30:02.889,0:30:13.399 the specification. The MFT record has[br]basically two fields. So there is this the 0:30:13.399,0:30:19.570 standard information, which is a well-[br]defined structure and there's something 0:30:19.570,0:30:25.840 else called attribute list, which is a[br]quite dynamic structure and this would be 0:30:25.840,0:30:32.799 a somewhat more different, difficult to[br]clean plan text from. So I concentrate on 0:30:32.799,0:30:38.200 this first part. And the first part[br]quickly turned out to be quite constant 0:30:38.200,0:30:48.590 for many or most of the MFT records. And[br]you see it starts with FILE and then as 0:30:48.590,0:30:55.210 some some hex digits after that. And[br]although on the far bottom part of the 0:30:55.210,0:31:02.210 slide, I added my yeah certainty level. So[br]the the certainty level would be the 0:31:02.210,0:31:08.749 number of different sample bytes I would[br]have multiplied by the confidence I would 0:31:08.749,0:31:16.559 have in that sample byte being actually[br]this this plain text. So you see for the 0:31:16.559,0:31:22.759 very first record, we would have a quite[br]low, of a low certainty. I mean, it's just 0:31:22.759,0:31:30.580 one byte, right? Oh, the two byte skipping[br]is I mean quite a forward considering you 0:31:30.580,0:31:37.370 would have usually 512 bytes per sector on[br]a disk, and the MFT record is one kilobyte 0:31:37.370,0:31:47.919 in size, so the stream would progress two[br]bytes. And for record 100, so for if for 0:31:47.919,0:31:54.820 the 100th record I would have a certainty[br]of four. You know, I would just assume 0:31:54.820,0:32:02.730 these eight plaintext bytes here, divided[br]by 2, with an resulting 4. This wasn't 0:32:02.730,0:32:06.320 really satisfactory.[br]The problem there was towards 0:32:06.320,0:32:12.720 the end I would have many many unknown[br]records because I would was concentrating 0:32:12.720,0:32:19.799 on on the very first parts of the header.[br]So the remainder of the key stream, the 0:32:19.799,0:32:25.799 very end of the key stream wasn't be able[br]to being analyzed and eventually 0:32:25.799,0:32:30.159 decrypted. So I thought of something[br]different. 
And there was something like a 0:32:30.159,0:32:39.799 I would call a byte histogram. So for[br]every offset the MFT record, i would i 0:32:39.799,0:32:48.860 will then calculate creatin and a[br]histogram and check how many different 0:32:48.860,0:32:53.549 bytes are there actually for plaintext,[br]you know, it's a plaintext known plaintext 0:32:53.549,0:32:58.560 attack so I would need some kind of[br]plaintext and would do that for every 0:32:58.560,0:33:07.509 offset for every record. And so these the[br]questions there how to get many MFT 0:33:07.509,0:33:12.039 records - it's quite easy if you have some[br]nice colleagues you just need to hang them 0:33:12.039,0:33:16.999 all of the balcony and shake them a bit[br]then more or less voluntarily they will 0:33:16.999,0:33:25.080 give you some MFT keys to work with. And[br]the result of that is quite nice: you, I 0:33:25.080,0:33:30.640 mean for for the very first, there's[br]nothing much you can do the very first 0:33:30.640,0:33:35.929 record will always have very few sample[br]bytes. But as the stream progresses and 0:33:35.929,0:33:41.970 you will have a dramatic change there so[br]from this relatively low certainty of 0:33:41.970,0:33:51.499 four, I could increase that to more than[br]thirty. So this is somewhat nice, and ever 0:33:51.499,0:34:00.480 bit of doing science, this table drops out[br]from nowhere. So I compared these two 0:34:00.480,0:34:08.230 attack types. So, let's read that from[br]from right to left. On the on the far 0:34:08.230,0:34:16.270 right, I have for the first approach about[br]98% of the MFT records being successfully 0:34:16.270,0:34:22.070 recovered. You know, with the good thing[br]with science and with all this academic 0:34:22.070,0:34:29.659 approach is you have a ground truth. So I[br]have a plaintext an unencrypted hard drive 0:34:29.659,0:34:36.920 which were virtually unordered from[br]something right after infection, and then 0:34:36.920,0:34:43.370 you let execute the the whole infection[br]and encryption process. And then you can 0:34:43.370,0:34:49.120 differentiate and take you knows several[br]snapshots throughout the infection, change 0:34:49.120,0:34:54.539 different key values and all the stuff. So[br]this is a very nice thing about this 0:34:54.539,0:35:02.830 academic approach I was taking the other[br]So I could I could exactly pinpoint how many 0:35:02.830,0:35:14.280 of these records were perfectly recovered.[br]So for the byte histogram approach I could 0:35:14.280,0:35:17.940 decrypt almost all of the records, which[br]is quite nice because then we have a high 0:35:17.940,0:35:28.250 quality MFT to work with. What's also[br]quite nice is that we have zero wrong key 0:35:28.250,0:35:34.210 stream bytes. What's not so nice, however,[br]is that I was only able to recover about 0:35:34.210,0:35:43.130 1.3 percent of the overall key stream. And[br]remember - this this key stream is about 4 0:35:43.130,0:35:47.840 megabytes in length and I was able to[br]recover only 50 kilobytes of that, so we 0:35:47.840,0:35:55.540 can't recover all of the the files. Or is[br]that so? This is this was my next question 0:35:55.540,0:36:05.370 there. And so I drew another nice diagram.[br]This is the key stream; in the this is the 0:36:05.370,0:36:15.880 MFT, sorry, in the key stream. So the the[br]key stream is only filled and in sample 0:36:15.880,0:36:24.310 bytes at this 2 megabytes mark. 
So this is quite nice, and with a bit of science, this table drops out of nowhere. I compared the two attack types - let's read it from right to left. On the far right: with the first approach, about 98% of the MFT records were successfully recovered. The good thing about science, about this academic approach, is that you have a ground truth: I had the plaintext, unencrypted hard drive, snapshotted right before infection, and then let the whole infection and encryption process execute. You can then diff, take several snapshots throughout the infection, change key values and all that - so I could pinpoint exactly how many of the records were perfectly recovered. With the byte-histogram approach I could decrypt almost all of the records, which is quite nice, because then we have a high-quality MFT to work with. Also quite nice: zero wrong key-stream bytes. What's not so nice, however, is that this recovered only about 1.3 percent of the overall key stream. Remember - the key stream is about 4 megabytes long, and I recovered only about 50 kilobytes of it. So we can't recover all of the files. Or can we? That was my next question.

So I drew another nice diagram: the MFT within the key stream. The key stream only gets sample bytes around this 2-megabyte mark, and the question is whether many files are encrypted in that area, or whether there is some other bug I could exploit. So I checked where in the key stream the files land - how many files are encrypted at which key-stream location. And you see the key stream is used all over the place - sometimes more, sometimes less, but basically all over the place. So in a perfect scenario - a perfect known-plaintext oracle - this much of the key stream could theoretically be recovered. Getting there is the problem, and in the next part of the talk I'll show you how I solved that as well.

Remember, when the file system has just been encrypted by (Not)Petya, it looks somewhat like this: the MFT is totally garbled, so there are no nice file pointers, no data runs pointing to the files, no nice file names, none of that. But with the first stage finished, the MFT looks really nice - almost a hundred percent of the records recovered - so now there is a bunch of metadata to use for analyzing the remainder of the files and the remainder of the encryption. The MFT is almost completely decrypted, and of each file only the very first kilobyte is encrypted. Most files are larger than a kilobyte, right? So all the rest is not encrypted, and all this data and metadata can be collected and exploited as indicators for known plaintext.

I thought of three different approaches to attack this problem. Basically I asked: what different files are there? That quickly led me to file types - the file type can easily be gleaned from the metadata, because the file extension is right there, and the extension basically is the file type. And there are two kinds of files: structured and unstructured. For unstructured files - think of something like source code - I calculate a byte histogram for the whole file type, which gives me a prevalence of certain byte values in the plaintext; it's somewhat of a guessed-plaintext attack (see the sketch below).
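As a sketch, such guessed-plaintext scoring might look like this: type_freq is the byte-frequency table computed for one file type, and the keystream candidate producing the most plausible plaintext wins. All names here are mine, not the talk's actual tooling:

```c
#include <stdint.h>

/* Unstructured types such as source code have a strong global byte
 * bias (printable ASCII, newlines, spaces). Score one keystream
 * candidate k for a given stream position by how plausible the
 * resulting plaintext looks across all encrypted bytes at that
 * position; pick the k with the highest score. */
double score_candidate(uint8_t k, const uint8_t *cipher_bytes, int n,
                       const double type_freq[256])
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += type_freq[cipher_bytes[i] ^ k]; /* likelihood of plaintext */
    return s;
}
```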
The next thing, for structured files, is to apply the very same approach as with the MFT records: calculate a histogram for every offset of the first kilobyte and see how many different byte values occur per offset. And the last approach uses somewhat more data - some of the metadata and also some of the file data. I'll go into that right now. As I said, only this little portion of each file is encrypted; the whole remainder of the file is not, which is quite nice, and the file name and file size are not encrypted either. So what I do here is create a database of known files - known Windows system files, for example. You may remember those nice background images, fonts, all that stuff: plaintext is lying around everywhere if you just look for it. And I have basically three discriminators to decide which is the correct plaintext: the file name, the file size, and a hash over the whole remainder of the file, the whole tail. If all three parts of this 3-tuple match, I consider it proper plaintext. There are still some collisions, though, so it's not entirely straightforward.
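A sketch of that matching step; the structure and names are illustrative, and the 32-byte digest is simply my assumption for whatever hash the tooling used:

```c
#include <stdint.h>
#include <string.h>

/* Only the first KiB of each file is encrypted, so (name, size, hash
 * of everything after the first KiB) identifies a candidate in a
 * database of clean files; a match yields 1024 known-plaintext bytes
 * at once. */
typedef struct {
    char     name[256];
    uint64_t size;
    uint8_t  tail_hash[32];   /* hash over bytes [1024, size)         */
    uint8_t  first_kib[1024]; /* the plaintext we want back           */
} known_file;

const known_file *lookup(const known_file *db, size_t db_len,
                         const char *name, uint64_t size,
                         const uint8_t tail_hash[32])
{
    for (size_t i = 0; i < db_len; i++)
        if (db[i].size == size &&
            strcmp(db[i].name, name) == 0 &&
            memcmp(db[i].tail_hash, tail_hash, 32) == 0)
            return &db[i];    /* all three match -> trusted plaintext */
    return NULL;              /* collisions are why all three must match */
}
```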
So the initial idea - analyze only the MFT and then straightforwardly decrypt the files - had to be altered a bit. I added this database of known files, added another analysis stage in this nice box here, and then I would eventually be able to decrypt the files. Let's do a bit of science and check whether this approach is worth pursuing in a real-world scenario - let the science cat do its science, let me have a drink.

What I did was build the database of known files: I collected a bunch of default Windows installations, which resulted in about 340,000 unique files. Then I calculated the different file-type histograms I talked about and prepared my test setup, adding 1,000 files to it. Note that these files were deliberately not part of the known-files database - otherwise it would have been too easy. Then I infected the machine, let (Not)Petya encrypt it, and attempted recovery.

These are the results. I did four different runs: each approach separately, then all three combined. The outcome was something like this: only two files could be decrypted by the general histogram approach; at least 8% were decrypted by the location histogram approach; with the known-files approach over 90% of the files could be successfully recovered; and, even better, the combined run could decrypt almost all files.

So much for academia. The thing is, if you apply this to the real world, you run into more trouble, and there was lots and lots more to think about. There were, for example, non-default Windows installations with their whole history of updates - which can be really interesting from a forensics perspective. Moreover, there are lots of installed programs to derive known plaintext from: .NET, for example, is a vast source of known plaintext, and so are JDK installations - the JDK alone brings along tens of thousands of different source code and HTML files. That was really quite nice. The drawback: there was so much data to collect that the first attempts failed miserably because of the sheer amount of RAM I was using. This resulted in the admins constantly giving me more RAM for my VM; I think I eventually ended up with 128 gigabytes of RAM in my test VM. You also have a much larger master file table, because all these files have to be stored. To put it in perspective: my nice working prototype, the test setup, had an MFT of about 26 megabytes; in real-world scenarios you have something like at least 500 megabytes, and an MFT can even reach a couple of gigabytes in size. And that means much more plaintext - for these really large MFTs you can quickly recover the whole key stream, the whole four megabytes, just by looking at the MFT.

In summary, this means that decryption of (Not)Petya-encrypted drives is possible. For the file systems and drives I have looked at, once all these initial bugs were out of the way, I was able to decrypt and recover all the files. So there's a good chance the vacation photos can be recovered as well.

That concludes my talk, so a quick summary: (Not)Petya has some severe cryptographic failures and flaws - I can only call this LOLsa, not Salsa, anymore. It might be worthwhile to look further into this expand-key thing; it really has a bunch more cryptographic failures in it.
I didn't look into that more deeply, but I know some of you are professors, and this might be nice homework for your students: take this (Not)Petya implementation and try some more advanced cryptanalysis methods on it. And note that the cryptographic flaws I described here are not just in (Not)Petya - you see the brackets there - they are in each and every version of Petya. So all of the drives you have potentially saved can be decrypted. My last point: if any of you, or any of us, falls victim to ransomware, you really should keep the drives. Keep them untouched in a locker and wait for a talk like this, basically - and then hope that someone comes around who is able to decrypt the drives, and you will have your vacation photos back. That's all, folks - thank you for your attention.

Applause

Herald: Thank you very much, Sebastian. Now it's time for questions; please queue up at the microphones.

Mic 1: Hi, well, thank you very much for sharing your findings with us. I'm from Rotterdam, the largest harbour in Europe, and as you might know, we were struck by Petya.

Sebastian: Yes.

Mic 1: Two terminals went down for a couple of days - a couple of hundred million euros of damage. Your approach is quite theoretical, so now to practice: if you had been there this summer with these findings, would you have been able to help us decrypt the files and get all the companies up and running again? Or is this too academic?

Sebastian: No, it's a practical approach. I mean, I work for CrowdStrike, and we had some occasions where we were able to help customers. So it's a very practical thing to do, and I was trying to insinuate that with this slide here, where I talked about the real-world scenario. I looked at about 50 different hard drives encrypted with (Not)Petya, and I was able to decrypt all of them - or most of them; the ones that couldn't be decrypted mostly came down to, well, let's say, level-8 mistakes.

Mic 1: Awesome. Have you shared your findings with nomoreransom.org?

Sebastian: No, I don't know that site.

Mic 1: They provide decryption tools for ransomware. Please do - thank you.

Herald: Microphone six, please.

Mic 6: Thank you. In the beginning you mentioned that the key was basically shortened to, what was it, 24 bits, something like that?

Sebastian: 22, yes.

Mic 6: 22, yeah. From that point on, wouldn't a brute-force attack have been way faster and way more reliable?
Sebastian: Oh no, no, don't get me wrong there. The key stream length is 2^22 bytes - the key stream is four megabytes long. You can't brute-force that. Do you see? The number of bytes is four megabytes, and that is not brute-forceable.

Mic 6: Yeah, but you already mentioned at the beginning that you basically shortened it down to at most two to the power of 22 possibilities, and that is calculable.

Sebastian: Yes, yes, I totally get the question, but you're missing the point there. It's not the key space that is 2^22 - it's the key length, 2^22 bytes. Which would be - I'm not good at converting this to decimal - something like... let me check... I mean, here, math guys, computer guys: how many bits is that?

[Inaudible audience answer]

Sebastian: Say again?

Crowd: A lot?

Sebastian: A lot! It's four megabytes of key length, and you can't just brute-force that, because you would have to brute-force each of those four megabytes. Got that? The key itself is not 2^22 - the key stream is that long. That key stream is about as long as the text of the Bible, and you can't just brute-force the text of the Bible. Given enough time, yes - we all know the stories with the monkeys and typewriters. We can talk about that offline, but you're mixing up two different numbers there.

Mic 6: Yeah, I guess. Thank you.

Herald: Questions from the Internet, please.

Signal Angel: Does the MFT encryption work the same for Petya and (Not)Petya?

Sebastian: Yes, the underlying mechanism is the same. The cryptography differs in that the constants string is different - this little guy here. Usually it would be something like "expand 32-byte k"; here it's "invalid sect id". So that part is somewhat different, but the MFT encryption - the bytecode and the algorithm - is the very same.

Signal Angel: Thanks.

Herald: Any more questions? Then please give a great round of applause to Sebastian!

Sebastian: Thank you.

Applause

Subtitles created by c3subtitles.de in the year 2018. Join, and help us!