0:00:00.099,0:00:14.800
Music
0:00:14.800,0:00:18.760
Herald: And give a great round of applause[br]to Sebastian Eschweiler!
0:00:18.760,0:00:24.290
Applause
0:00:24.290,0:00:32.360
Sebastian: So hi everyone. So I want to[br]talk about how I defeated (Not)Petya's
0:00:32.360,0:00:42.800
cryptography. Some would say that (Not)Petya[br]would be Ukraine's scourge, and for those
0:00:42.800,0:00:48.190
of you who don't know what the word[br]scourge means: this guy right here doesn't
0:00:48.190,0:00:57.249
know either. And quick trivia quiz - does[br]anyone know what this movie is? What's the
0:00:57.249,0:01:08.940
name of the movie? So, in the next scene[br]Johnny Depp enters. Does it ring a bell? Jim,
0:01:08.940,0:01:17.140
a movie by Jim Jarmusch. Soundtrack by[br]Neil Young? Dead Man, right! Great! It's a
0:01:17.140,0:01:24.310
great movie. So if you want to know what a[br]scourge is then you could watch the movie.
0:01:24.310,0:01:31.640
So let's begin with my talk. So[br]this is what the official Ukraine
0:01:31.640,0:01:39.930
Twitter account actually tweeted some time ago,[br]at the end of June 2017. And there was an
0:01:39.930,0:01:47.210
outbreak of a ransomware attack which was[br]noticed mostly in Ukraine but also all
0:01:47.210,0:01:51.999
over the world. So millions of users and[br]also companies, large companies, were
0:01:51.999,0:02:02.969
affected, and the damage went into the[br]billions of dollars. So, the problem there
0:02:02.969,0:02:09.889
is, I mean, this is not the[br]everyday ransomware outbreak you have
0:02:09.889,0:02:17.220
there. And I want to give you a short[br]glimpse into the (Not)Petya universe.
0:02:17.220,0:02:24.130
And also how I could decrypt all the[br]stuff that, yeah, actually was encrypted
0:02:24.130,0:02:31.330
by this ransomware outbreak. So first[br]I want to begin my talk with a
0:02:31.330,0:02:35.880
differentiation. I want to draw a[br]demarcation line because all this
0:02:35.880,0:02:41.560
(Not)Petya universe - there's much[br]subsumed under this whole label.
0:02:41.560,0:02:49.660
And I really, I'm just talking about a[br]small fraction of this whole universe. So
0:02:49.660,0:02:55.430
I will first distinguish between what my[br]talk will be and what my talk will not be
0:02:55.430,0:03:02.630
about. Next I will describe (Not)Petya's[br]cryptography, and especially (Not)Petya's
0:03:02.630,0:03:08.310
cryptographic failures. Which I then will[br]be exploiting in the remainder of the talk
0:03:08.310,0:03:22.360
and see how the users can get their, yeah,[br]vacation photos back. So what was this
0:03:22.360,0:03:34.700
whole thing? The outbreak started, as I[br]said, in June 2017, and it started as a fake
0:03:34.700,0:03:40.990
update, or as a malicious update, of a[br]software called M.E.Doc
0:03:40.990,0:03:48.400
- this is a tax software, one of the two[br]official tax softwares in Ukraine. So
0:03:48.400,0:03:55.730
almost every company has it installed on[br]tax accountants' computers, and many
0:03:55.730,0:04:02.960
private persons have it installed. It was[br]pushed and then sideloaded this file
0:04:02.960,0:04:14.380
perfc that was then downloaded to the[br]computers, and it comprises several parts.
0:04:14.380,0:04:22.870
And some parts are more interesting[br]than others, so one component, after like
0:04:22.870,0:04:28.880
half an hour's time, would start[br]encrypting the files depending on the
0:04:28.880,0:04:37.250
access level. So, if there wasn't any[br]access to infect the computer with, well,
0:04:37.250,0:04:44.080
this MBR infector, then it would just,[br]based on the current user level, encrypt
0:04:44.080,0:04:53.480
the files of that current user. For[br]lack of a better name I would call this the
0:04:53.480,0:05:00.810
"Mischa" component, I know it's usually[br]something different.
0:05:00.810,0:05:06.260
However, this was the best name I could[br]find there. So it's basically just a
0:05:06.260,0:05:13.280
file crypter with AES. My talk will not[br]be about this part - this file
0:05:13.280,0:05:22.190
infector. The next very interesting[br]component was the spreader part - it's
0:05:22.190,0:05:27.389
basically based on the[br]EternalBlue/EternalRomance exploits that
0:05:27.389,0:05:34.371
had been leaked by the Shadow Brokers. My[br]talk will not be about this either. This
0:05:34.371,0:05:41.260
is a whole different universe as well.[br]What my talk will be about is the actual
0:05:41.260,0:05:49.260
(Not)Petya component and this is an MBR[br]encrypter and I will show you in the next
0:05:49.260,0:05:58.049
slide what it's actually about. So, the[br]user will see something like this upon
0:05:58.049,0:06:06.240
reboot. So if the access rights are[br]granted so if there is some local admin
0:06:06.240,0:06:12.230
installed on the computer or the correct[br]password could be guessed by some attack
0:06:12.230,0:06:18.610
and then this dropper, the perfc,[br]would infect the system by overwriting the
0:06:18.610,0:06:26.110
Master Boot Record with a custom boot[br]loader. And it would reboot the system
0:06:26.110,0:06:32.760
after a predefined time - usually being[br]about 10 minutes. And then the actual
0:06:32.760,0:06:39.500
(Not)Petya component would kick into[br]action. This infected MBR, the bootloader,
0:06:39.500,0:06:46.760
shows this decoy CheckDisk screen, and in[br]the background would find and iterate over
0:06:46.760,0:06:49.500
all the files on the file system and
0:06:49.500,0:06:56.060
encrypt all these files.[br]So the main takeaways of this slide are
0:06:56.060,0:07:04.530
that we are dealing with 16-bit code here.[br]So we're in 16-bit real mode - so
0:07:04.530,0:07:10.740
this means no proper file system it means[br]no 32-bit or 64-bit code. There are no
0:07:10.740,0:07:20.449
Windows APIs - so debugging all this and[br]analyzing this is tedious work. However,
0:07:20.449,0:07:26.290
we have something on the plus side which[br]is a BIOS, you know, the basic
0:07:26.290,0:07:33.910
input/output system and with that comes a[br]range of interrupts that are very well
0:07:33.910,0:07:43.020
described and a really nice thing is[br]having Bochs and being able to debug all
0:07:43.020,0:07:50.130
this in IDA. So this was a really neat[br]plugin that had been developed by the
0:07:50.130,0:08:00.680
authors. So let's analyze a bit and check[br]the cryptography and why implementing
0:08:00.680,0:08:09.510
crypto is really hard. So I will start[br]this part with describing in short words
0:08:09.510,0:08:16.510
the theory of Salsa20, and then[br]compare that against the (Not)Petya
0:08:16.510,0:08:28.000
implementation of this cipher. So Salsa20[br]is a stream cipher. So basically you will
0:08:28.000,0:08:35.568
have a plaintext, it's about here, and[br]then you will have some kind of random
0:08:35.568,0:08:41.089
number generator or pseudo-random number[br]generator and then apply some operations
0:08:41.089,0:08:49.240
on the plaintext and out comes the[br]ciphertext. And what you put in there are
0:08:49.240,0:08:54.649
four different variables, or four[br]different inputs, because there's the
0:08:54.649,0:08:59.570
constant part which is obviously not[br]variable. But we'll see about that in a
0:08:59.570,0:09:09.069
bit. So you have these key and the nonce,[br]and then there's this really nifty thing
0:09:09.069,0:09:17.610
called counter. What's so nifty about that[br]is: if you were to stream a Salsa20
0:09:24.540,0:09:33.709
encrypted stream and you would lose some[br]frames, then you could adjust this counter
0:09:33.709,0:09:39.190
variable, which would demarcate the offset[br]in the current stream,
0:09:39.190,0:09:48.230
and then could continue seamlessly with[br]the decryption of the stream. So this
0:09:48.230,0:09:57.730
is a very nice feature. The size of the[br]counter is 64 bits, and the hash size here
0:09:57.730,0:10:06.429
is 64 bytes: what salsa does is create a[br]64-byte hash for every combination of these inputs, which is then applied to the input.
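A minimal sketch of that input layout, following the public Salsa20 specification (x86 little-endian assumed), not (Not)Petya's patched variant:

```c
#include <stdint.h>
#include <string.h>

/* Standard Salsa20 initial state: sixteen 32-bit words holding the four
   constant words ("expand 32-byte k"), eight key words, two nonce words
   and the 64-bit block counter. (Not)Petya keeps this layout but patches
   the constants. Little-endian loads via memcpy, as on x86. */
static void salsa20_init_state(uint32_t s[16], const uint8_t key[32],
                               const uint8_t nonce[8], uint64_t counter)
{
    static const char sigma[16] = "expand 32-byte k";
    memcpy(&s[0],  sigma + 0,  4);         /* constant word 0    */
    memcpy(&s[1],  key + 0,   16);         /* key words 1..4     */
    memcpy(&s[5],  sigma + 4,  4);         /* constant word 1    */
    memcpy(&s[6],  nonce,      8);         /* nonce words 6..7   */
    s[8] = (uint32_t)counter;              /* counter, low half  */
    s[9] = (uint32_t)(counter >> 32);      /* counter, high half */
    memcpy(&s[10], sigma + 8,  4);         /* constant word 2    */
    memcpy(&s[11], key + 16,  16);         /* key words 11..14   */
    memcpy(&s[15], sigma + 12, 4);         /* constant word 3    */
}
```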
0:10:06.429,0:10:13.330
If you want any more details about this[br]salsa cipher: the inventor of salsa should be in this room, and should be at the
0:10:13.330,0:10:17.429
convention. He is at the convention: I[br]just rode the train with him, so I guess
0:10:17.429,0:10:24.399
you can ask him about the gory details of[br]Salsa20. So it's a really nice crypto
0:10:24.399,0:10:31.070
cipher and you should hit him up and ask[br]him. I shouldn't have said that, right?
0:10:31.070,0:10:37.940
Sorry. So basically, a very important[br]thing to note is: for every invocation,
0:10:37.940,0:10:46.499
or for every instance, of this[br](Not)Petya encryption you would have
0:10:46.499,0:10:54.220
these three variables, or these three[br]inputs, be constant. So (Not)Petya
0:10:54.220,0:11:00.139
would patch during the infection process[br]the key, the nonce, and the constants into
0:11:00.139,0:11:08.089
a configuration sector. The constants,[br]well, these are somewhat different. And then the
0:11:08.089,0:11:15.079
counter would only change throughout every[br]iteration. So the interesting thing or the
0:11:15.079,0:11:22.540
interesting question was first - what is[br]the length of the key stream? So, the key
0:11:22.540,0:11:27.839
stream, you know, the number of[br]different outputs that would come out
0:11:27.839,0:11:34.579
of this hashing function. And I mean, for[br]the theory this
0:11:34.579,0:11:42.629
is quite clear: it's 64 bits here, eight[br]bytes, times the 64 bytes of output, so it
0:11:42.629,0:11:53.769
would be about two to the seventy of[br]periodicity. One difference of the
0:11:53.769,0:12:00.619
actual implementation in (Not)Petya from the[br]theory would be that this constants thing
0:12:00.619,0:12:07.160
here had been changed to a string[br]reading out "invalid sect id". So this would
0:12:07.160,0:12:23.069
break the official implementation. So the[br]very first failure I saw in (Not)Petya was
0:12:23.069,0:12:27.439
something like this. So I think I can skip[br]this slide because it's obvious for
0:12:27.439,0:12:35.240
everyone that this is a fail right? So who[br]sees the fail? So, not many hands? Okay,
0:12:35.240,0:12:43.189
then I'll explain it. So I was just[br]kidding, I was not really expecting
0:12:43.189,0:12:49.149
you to grasp that at first. So[br]remember, we're in 16-bit code. We have
0:12:49.149,0:12:57.380
here this shift left operation, which would[br]shift a register by N bits. The register
0:12:57.380,0:13:03.579
width here is 16 bits, so it only has 16[br]digits, so to speak, and you would shift it
0:13:03.579,0:13:11.879
by 10 hex - sixteen. And this[br]would effectively null the register, and
0:13:11.879,0:13:14.910
even worse is here. So
0:13:14.910,0:13:22.009
you would shift an 8-bit register by[br]sixteen. So this is something you
0:13:22.009,0:13:28.919
wouldn't expect from a proper[br]cryptographic implementation. And I was
0:13:28.919,0:13:32.680
really intrigued why that is, because[br]it wouldn't make any sense in source code,
0:13:32.680,0:13:39.829
and did the (Not)Petya or the Petya[br]authors really implement that on purpose,
0:13:39.829,0:13:47.669
or what was the gist of it. And I looked[br]up the (Not)Petya, the salsa 20
0:13:47.669,0:13:54.760
implementation, um, I just googled it and[br]found a nice website that had an
0:13:54.760,0:14:04.779
implementation of salsa 20. And there you[br]would see this code. So you see here it
0:14:04.779,0:14:12.090
sits in the endianness conversion. And[br]you see here this shifting of bits
0:14:12.090,0:14:19.810
of registers, and you see here this[br]uint_fast16/32 type. So it becomes
0:14:19.810,0:14:23.480
quite clear that this is a broken[br]implementation right? So everyone can see
0:14:23.480,0:14:32.730
that, right? No, not right now, because you[br]need to know some more things about this.
0:14:32.730,0:14:38.829
There are two important facts that make[br]this implementation broken and the two
0:14:38.829,0:14:46.210
facts are: you need to compile this code[br]for 16 bits, and you need to look up what
0:14:46.210,0:14:52.259
Visual Studio makes of these type[br]definitions here. And when you look that
0:14:52.259,0:14:58.619
up - this is from Visual Studio and it's in[br]the stdint.h header file - there
0:14:58.619,0:15:04.209
you see it's interpreted or translated as[br]unsigned int, so this is the base type.
0:15:04.209,0:15:08.629
And this base type, unsigned int, in[br]16-bit code is a 16-bit variable or a
0:15:08.629,0:15:14.459
16-bit register. And then everything makes[br]sense.
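A hedged reconstruction of the effect on a modern compiler - in a genuine 16-bit build the shifts themselves already null out, which the 16-bit truncations emulate here:

```c
#include <stdint.h>
#include <stdio.h>

typedef uint16_t fast_t;  /* what uint_fast16_t/uint_fast32_t become when
                             "unsigned int" is only 16 bits wide          */

/* Little-endian load of a 32-bit word, modelled after the kind of code
   shown on the slide. Each intermediate is truncated to 16 bits, as a
   16-bit compiler would do, so the shifts by 16 and 24 contribute
   nothing. */
static uint32_t load32_le_broken(const uint8_t p[4])
{
    fast_t b0 = (fast_t)p[0];
    fast_t b1 = (fast_t)((unsigned)p[1] << 8);
    fast_t b2 = (fast_t)((unsigned)p[2] << 16);  /* truncates to 0 */
    fast_t b3 = (fast_t)((unsigned)p[3] << 24);  /* truncates to 0 */
    return (uint32_t)(b0 | b1 | b2 | b3);        /* upper half lost */
}

int main(void)
{
    const uint8_t bytes[4] = { 0x44, 0x33, 0x22, 0x11 };
    /* expected 0x11223344, but only 0x00003344 survives */
    printf("0x%08lx\n", (unsigned long)load32_le_broken(bytes));
    return 0;
}
```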
0:15:14.459,0:15:22.000
And this was somewhat of a failure here:[br]the authors didn't really check if their code was actually working
0:15:22.000,0:15:27.230
against the test vectors. And this guy who[br]wrote the code here on this salsa 20
0:15:27.230,0:15:40.179
implementation made this mistake also. On[br]this slide you see two bugs of the
0:15:40.179,0:15:47.240
(Not)Petya implementation of salsa 20. And[br]I quickly want to explain both to you
0:15:47.240,0:15:52.059
because they are of some[br]importance throughout the remainder of the
0:15:52.059,0:16:02.380
talk. So, both revolve around the[br]counter variable. Just remember - this
0:16:02.380,0:16:06.950
counter variable is the only dynamic input[br]the only variable input throughout all
0:16:06.950,0:16:09.920
these salsa 20 invocations.
0:16:09.920,0:16:21.689
And the first error is: so you read[br]a sector number into the memory.
0:16:21.689,0:16:28.670
So a bit about the the low-level aspects[br]of a hard drive. A hard drive from the
0:16:28.670,0:16:35.019
view of the BIOS would look somewhat like[br]a bunch of different sectors. So these are
0:16:35.019,0:16:43.079
512 byte chunks of data, and they they[br]come one after another. So if you were to
0:16:43.079,0:16:48.759
read a sector, you would actually read a[br]512-byte-long portion of
0:16:48.759,0:16:56.009
data. And this is obviously not the offset[br]in the stream, and this is somewhat of a
0:16:56.009,0:17:03.790
problem there. So, you see here the same[br]variable is used to decrypt or encrypt the
0:17:03.790,0:17:13.049
data. And, I mean, this[br]isn't really apparent to the
0:17:13.049,0:17:21.449
implementer of this cipher. However, if[br]you were to analyze it, it would look
0:17:21.449,0:17:28.199
something like this. So you would have the[br]key stream of two different sectors, two
0:17:21.449,0:17:28.199
consecutive sectors here. So it[br]would start with FF and then
0:17:28.199,0:17:33.080
continues with D7 and so on. And the next[br]sector would have almost all of the bytes
0:17:33.080,0:17:39.570
identical. And this is, yeah, a really big[br]failure, because this really nice salsa 20
0:17:39.570,0:17:49.639
implementation, this really nice salsa[br]algorithm, would then be
0:17:49.639,0:17:57.770
converted from a one-time pad to a many-[br]time pad. And this is the first find I
0:17:57.770,0:18:03.140
wanted to show you in these very few lines[br]of code.
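The following runnable sketch models that overlap; the per-block counter increment is my illustrative assumption - the talk itself only shows that consecutive sectors' keystreams are almost entirely identical:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Dummy stand-in for the Salsa20 core - any deterministic map from a
   counter to a 64-byte block shows the effect; this is NOT Salsa20. */
static void keystream_block(uint64_t counter, uint8_t out[64])
{
    uint64_t x = counter * 0x9e3779b97f4a7c15ull + 1;
    for (int i = 0; i < 64; i++) {
        x ^= x >> 29; x *= 0xbf58476d1ce4e5b9ull;
        out[i] = (uint8_t)(x >> 56);
    }
}

/* The misuse: the 512-byte sector number is fed in where a 64-byte block
   offset belongs. Correct would be counter = sector * 8 + i. */
static void sector_keystream_broken(uint64_t sector, uint8_t ks[512])
{
    for (int i = 0; i < 8; i++)
        keystream_block(sector + i, ks + 64 * i);
}

int main(void)
{
    uint8_t a[512], b[512];
    sector_keystream_broken(1000, a);   /* uses counters 1000..1007 */
    sector_keystream_broken(1001, b);   /* uses counters 1001..1008 */
    printf("%s\n", memcmp(a + 64, b, 448) == 0
                       ? "448 of 512 keystream bytes shared"
                       : "no overlap");
    return 0;
}
```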
0:18:10.090,0:18:17.740
The second bug is here, this 'large'[br]keyword. Remember, we are in 16-bit code, so this 'large'
0:18:17.740,0:18:24.580
keyword here does not push a 64-bit[br]variable, as we would expect it to, but
0:18:24.580,0:18:31.450
a 32-bit variable. So only 32 bits of[br]this nice counter variable are actually
0:18:31.450,0:18:40.970
used in this case. So, these two failures[br]are somewhat of a problem for this salsa
0:18:40.970,0:18:59.990
20 implementation. So, in this slide I[br]took two different hex dumps, and these
0:18:59.990,0:19:09.399
hex dumps are from within this expand key[br]function. And they are, well, basically
0:19:09.399,0:19:16.970
two snapshots - one right before these[br]bugs become apparent, so before this
0:19:16.970,0:19:25.480
endianness conversion, and one right after,[br]on the lower half. So you very nicely see
0:19:25.480,0:19:32.140
all the different key inputs[br]being put into
0:19:32.140,0:19:34.140
this memory block. So here, it would
0:19:34.140,0:19:39.331
spell out invalid sect ID, you know, the[br]constants (Not)Petya uses. Here you would
0:19:39.331,0:19:46.309
see the key and here, so it's broken into[br]two halves. Additionally you would see the
0:19:46.309,0:19:53.710
nonce here. And what really sticks out is[br]this bunch of zeros here. So this
0:19:53.710,0:20:01.380
higher part of this 64-bit variable[br]isn't even used - it's not even filled.
0:20:01.380,0:20:07.170
So this is well the first problem here,[br]and after the endianness conversion you
0:20:07.170,0:20:13.820
see that it's not really an endianness[br]conversion but it's more of a nulling of
0:20:13.820,0:20:23.060
bytes. So the result would be that this[br]initially 64-bit variable would be just
0:20:23.060,0:20:32.600
16-bit in length. And, as I said earlier,[br]the original salsa implementation would
0:20:23.060,0:20:32.600
have 2^70 bytes of key stream length. And[br]right now we have 16 bits times 64 bytes
0:20:32.600,0:20:44.610
of key stream, which would result in 26[br]bits of key stream length. Which would be
0:20:44.610,0:20:51.299
a measly 4 megabytes of key stream.
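A quick sanity check of the 4-megabytes figure, as a sketch assuming the 16-bit effective counter described above:

```c
#include <stdio.h>

/* Effective keystream size once the counter is truncated to 16 bits:
   one 64-byte Salsa20 block per distinct counter value. */
int main(void)
{
    unsigned long counter_values = 1UL << 16;        /* 65536 */
    unsigned long keystream = counter_values * 64UL; /* bytes */
    printf("%lu bytes = %lu MiB of keystream\n",
           keystream, keystream >> 20);              /* 4194304 = 4 MiB */
    return 0;
}
```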
0:20:57.289,0:21:08.730
So this was a very interesting observation[br]I made there, and decryption would then be
0:21:08.730,0:21:17.720
possible together with the many-time-pad[br]properties of this cipher, which make it really easy to break. So to quickly
0:21:17.720,0:21:24.980
summarize this part of the talk - so we[br]have a very very short key stream of just
0:21:24.980,0:21:32.009
4 megabytes, and it's highly[br]repetitive. So for each sector of progress
0:21:32.009,0:21:41.620
you would only have a progress of one byte[br]at a time. So only 26 bits remain of the
0:21:41.620,0:21:50.230
whole stream. And as I said, the many-[br]time-pad property is a very nice thing
0:21:50.230,0:22:00.190
to analyze. I could come around to[br]implementing a small joke here, so this salsa
0:22:00.190,0:22:08.690
implementation I would only call "LOLsa"[br]from now on. Sorry, it's a bad joke, sorry!
0:22:08.690,0:22:16.630
So, the main goal is to derive the[br]keystream, and as I'm not really a crypto
0:22:16.630,0:22:22.029
expert, basically the only attack I[br]know would be a known plaintext attack,
0:22:22.029,0:22:28.470
so this was my goal there, because it's so[br]straightforward to do. And in the
0:22:28.470,0:22:34.909
remainder of the talk I will tell you how[br]I did that. So without further ado, let's
0:22:34.909,0:22:41.710
exploit these failures and see how much of[br]the plaintext we can actually get from a
0:22:41.710,0:22:47.050
(Not)Petya infected drive.[br]So the modus operandi
0:22:47.050,0:22:53.299
of (Not)Petya would look[br]somewhat like that. So let's
0:22:53.299,0:22:58.480
skip the left-hand side of[br]this slide, and concentrate on the
0:22:58.480,0:23:03.860
right-hand side. For those of you who[br]are not really intimately familiar with
0:23:03.860,0:23:10.890
NTFS - I wasn't before analyzing Petya or[br](Not)Petya either, so no worries. It's
0:23:10.890,0:23:17.659
quite simple - so every NTFS partition has[br]something called a master file table.
0:23:17.659,0:23:24.940
MFT, abbreviated. And it would contain[br]some metadata about the file. For example,
0:23:24.940,0:23:32.399
the file name, the file size, and if the[br]file is small enough, it would even fit
0:23:32.399,0:23:42.320
the actual content of the file into this[br]record. So, as I said, the MFT is just a list of
0:23:42.320,0:23:50.091
records. And if the file is larger it[br]would have a pointer, a so called data
0:23:50.091,0:23:57.860
run, which would point to a cluster or[br]sector on the disk or the partition, which
0:23:57.860,0:24:06.020
would then actually be the payload[br]data. One of these MFT records is one
0:24:06.020,0:24:12.330
kilobyte in size. So now to the actual[br]implementation. So let's zoom out a bit
0:24:12.330,0:24:19.370
and see how this LOLsa implementation is[br]used in (Not)Petya. So it would basically
0:24:19.370,0:24:26.399
just iterate over all of these MFT[br]records, and then check if this record
0:24:19.370,0:24:26.399
would point to a file. If[br]that is the case, it would encrypt the
0:24:26.399,0:24:33.230
first kilobyte of the file, and then would[br]encrypt the record itself.
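In rough C, the loop described here looks something like this sketch; every helper is a hypothetical stand-in for the bootloader's BIOS-level routines, not (Not)Petya's actual code:

```c
#include <stdint.h>

/* Hypothetical helpers standing in for the 16-bit bootloader's routines. */
typedef struct MftRecord MftRecord;
extern uint64_t   mft_record_count(void);
extern MftRecord *read_mft_record(uint64_t index);        /* 1024 bytes */
extern uint64_t   mft_record_sector(uint64_t index);
extern int        record_points_to_file(const MftRecord *r,
                                        uint64_t *first_sector);
extern void       encrypt_sectors(uint64_t first_sector, int count);

void encrypt_filesystem(void)
{
    for (uint64_t i = 0; i < mft_record_count(); i++) {
        MftRecord *rec = read_mft_record(i);
        uint64_t first_sector;
        if (record_points_to_file(rec, &first_sector)) /* has a data run  */
            encrypt_sectors(first_sector, 2);          /* file's first KB */
        encrypt_sectors(mft_record_sector(i), 2);      /* the 1 KB record */
    }
}
```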
0:24:41.879,0:24:47.769
This implementation is good for a bunch of[br]reasons - it's very fast: you would only
0:24:47.769,0:24:52.259
need to encrypt the first kilobyte, and[br]this first kilobyte contains
0:24:52.259,0:25:03.440
really really important information. For[br]example, headers, or especially compressed
0:25:03.440,0:25:09.191
files have these really important header[br]structures there. Additionally, your file
0:25:09.191,0:25:14.429
recovery tools wouldn't be able to work[br]anymore because most of them rely on this
0:25:14.429,0:25:23.160
very header. And the second thing is: this[br]MFT can be considered as a table of
0:25:23.160,0:25:31.000
contents. So with no metadata, with[br]no pointers to the files, you
0:25:31.000,0:25:37.870
won't have anything there to work with, to[br]recover from. And this is, I mean, from
0:25:37.870,0:25:44.320
the implementation standpoint it's[br]very neat, because it's fast and it's
0:25:44.320,0:25:49.729
somewhat thorough. As the MFT is
0:25:49.729,0:25:57.450
really important, my idea was to[br]recover that first and then check
0:25:57.450,0:26:07.090
what comes out from there, and see how I[br]can further progress with the
0:26:07.090,0:26:14.180
decryption of the files. So the metadata[br]would be of most importance there.
0:26:14.180,0:26:27.419
I'm a visual person, and here I took two[br]disk dumps from one of my test
0:26:27.419,0:26:33.339
disks. So I infected a clean system with[br](Not)Petya and let it encrypt the hard
0:26:33.339,0:26:38.270
drive. And on the left-hand side you see[br]the plaintext, and on the right-hand
0:26:38.270,0:26:45.190
side you see the encrypted data. So to[br]just get a better picture about the
0:26:45.190,0:26:54.220
encryption process. On the far left-hand[br]side, in a fancy PowerPoint animation,
0:27:01.059,0:27:08.320
you see some kind of indicator, which[br]would tell you at which offset how
0:27:08.320,0:27:19.600
much of the data has actually been[br]different. And you see the whole disk is
0:27:08.320,0:27:19.600
more or less being encrypted. However, you[br]see at the far bottom part here it's more
0:27:19.600,0:27:27.199
dark red. And this is actually the MFT, so[br]the MFT is towards the end of the disk
0:27:27.199,0:27:33.630
sometimes. But this might be a[br]misconception. So my idea was something
0:27:27.199,0:27:33.630
like this: we have two inputs, right?[br]We have the encrypted MFT, and we have
0:27:38.779,0:27:47.330
encrypted files. And first I would analyze[br]the MFT, and then derive the key stream
0:27:47.330,0:27:55.850
from that. After that analysis stage had[br]been finished I would put the key stream
0:27:55.850,0:28:03.970
back into this little box here,[br]and actually decrypt that. And then out
0:28:03.970,0:28:09.659
comes the decrypted MFT. And with that and[br]the key stream I would be able to find the
0:28:09.659,0:28:16.899
encrypted files on the disk, and then be[br]able to decrypt them, and then be done with
0:28:16.899,0:28:24.160
it, right? So this was my initial idea[br]there.
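As a sketch, the plan on the slide chains four stages; all functions here are hypothetical stand-ins for the stages described:

```c
/* Sketch of the recovery pipeline from the slide; hypothetical names. */
extern void analyze_mft(void);    /* known plaintext -> keystream bytes  */
extern void decrypt_mft(void);    /* XOR the keystream back over the MFT */
extern void analyze_files(void);  /* plaintext from files (added later)  */
extern void decrypt_files(void);  /* locate files via the MFT, decrypt   */

void recover_drive(void)
{
    analyze_mft();
    decrypt_mft();
    analyze_files();  /* this extra stage is introduced later in the talk */
    decrypt_files();
}
```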
0:28:24.160,0:28:35.059
So let's start with the decryption of the[br]MFT. Known plaintext attack, right? So an MFT looks,
0:28:35.059,0:28:40.320
from the viewpoint of the keystream,[br]somewhat like this. So you would have here
0:28:40.320,0:28:46.800
the first, the second and so on MFT[br]records. And in the column you will
0:28:46.800,0:28:54.510
have the actual byte that is used as key[br]stream to encrypt the record. Remember, the
0:28:54.510,0:29:03.829
operation that encrypts, you know,[br]the key stream and the plaintext
0:29:03.829,0:29:07.350
is just a mere XOR operation.[br]So you have the key
0:29:07.350,0:29:15.769
stream and the plaintext, and, plainly,[br]you can switch forth and back between
0:29:15.769,0:29:21.170
plaintext and ciphertext, and even the key[br]stream, by just applying an XOR
0:29:21.170,0:29:29.850
operation.
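A minimal sketch of that XOR relationship (hypothetical helper names):

```c
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

/* A known plaintext byte XORed with its ciphertext byte yields the
   keystream byte at that stream offset; that byte then decrypts anything
   else encrypted at the same offset. */
static void recover_keystream(const uint8_t *ct, const uint8_t *known_pt,
                              size_t len, size_t off, uint8_t *ks)
{
    for (size_t i = 0; i < len; i++)
        ks[off + i] = ct[i] ^ known_pt[i];   /* ks = ct XOR pt */
}

static void apply_keystream(uint8_t *buf, size_t len, size_t off,
                            const uint8_t *ks)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= ks[off + i];               /* pt = ct XOR ks */
}

int main(void)
{
    uint8_t ks[4] = {0};
    uint8_t ct[4] = { 'F' ^ 0xAA, 'I' ^ 0xBB, 'L' ^ 0xCC, 'E' ^ 0xDD };
    const uint8_t known[4] = { 'F', 'I', 'L', 'E' };  /* MFT record magic */
    recover_keystream(ct, known, 4, 0, ks);
    apply_keystream(ct, 4, 0, ks);                    /* back to "FILE"   */
    printf("%.4s\n", (const char *)ct);
    return 0;
}
```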
0:29:29.850,0:29:36.680
So what you see here is: for the very[br]first records you only have very few key stream bytes, or very few sample bytes.
0:29:36.680,0:29:41.610
However, as you make progress with the[br]analysis, you will have more and more of these
0:29:41.610,0:29:49.560
sample bytes to collect from, and this[br]will give you more confidence in the
0:29:49.560,0:29:56.930
result, and in the maybe known key stream[br]then. The question is, does the MFT hold
0:29:56.930,0:30:02.889
enough plaintext to do some kind of[br]known plaintext attack? So let's look into
0:30:02.889,0:30:13.399
the specification. The MFT record has[br]basically two fields. So there is this
0:30:13.399,0:30:19.570
standard information, which is a well-[br]defined structure and there's something
0:30:19.570,0:30:25.840
else called attribute list, which is a[br]quite dynamic structure and this would be
0:30:25.840,0:30:32.799
somewhat more difficult to[br]glean plaintext from. So I concentrated on
0:30:32.799,0:30:38.200
this first part. And the first part[br]quickly turned out to be quite constant
0:30:38.200,0:30:48.590
for many or most of the MFT records. And[br]you see it starts with FILE and then has
0:30:48.590,0:30:55.210
some hex digits after that.
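For reference, a sketch of that near-constant record header, with field layout per the public NTFS documentation (only the first fields are shown; the full header continues past these):

```c
#include <stdint.h>

/* Fixed header of an NTFS "FILE" record - the part that is near-constant
   across records and therefore usable as known plaintext. */
#pragma pack(push, 1)
typedef struct {
    uint8_t  magic[4];        /* "FILE" - 4 known plaintext bytes        */
    uint16_t usa_offset;      /* update sequence array offset            */
    uint16_t usa_count;       /* update sequence array size              */
    uint64_t lsn;             /* $LogFile sequence number (varies)       */
    uint16_t sequence;        /* sequence value                          */
    uint16_t link_count;      /* hard-link count                         */
    uint16_t attrs_offset;    /* offset of the first attribute           */
    uint16_t flags;           /* 0x01 = in use, 0x02 = directory         */
    uint32_t bytes_in_use;    /* real size of the record                 */
    uint32_t bytes_allocated; /* allocated size, 1024 for these records  */
} MftRecordHeader;
#pragma pack(pop)
```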
0:30:55.210,0:31:02.210
And on the far bottom part of the slide,[br]I added my, yeah, certainty level. So the certainty level would be the
0:31:02.210,0:31:08.749
number of different sample bytes I would[br]have multiplied by the confidence I would
0:31:08.749,0:31:16.559
have in that sample byte actually being[br]this plaintext. So you see, for the
0:31:16.559,0:31:22.759
very first record we would have quite[br]a low certainty. I mean, it's just
0:31:22.759,0:31:30.580
one byte, right? Oh, the two-byte skipping[br]is, I mean, quite straightforward considering you
0:31:30.580,0:31:37.370
would usually have 512 bytes per sector on[br]a disk, and the MFT record is one kilobyte
0:31:37.370,0:31:47.919
in size, so the stream would progress by two[br]bytes. And for record 100, so for
0:31:47.919,0:31:54.820
the 100th record, I would have a certainty[br]of four. You know, I would just assume
0:31:54.820,0:32:02.730
these eight plaintext bytes here, divided[br]by 2, resulting in 4. This wasn't
0:32:02.730,0:32:06.320
really satisfactory.[br]The problem there was towards
0:32:06.320,0:32:12.720
the end I would have many, many unknown[br]records, because I was concentrating
0:32:12.720,0:32:19.799
on the very first parts of the header.[br]So the remainder of the key stream, the
0:32:19.799,0:32:25.799
very end of the key stream, wasn't able[br]to be analyzed and eventually
0:32:25.799,0:32:30.159
decrypted. So I thought of something[br]different. And this was something
0:32:30.159,0:32:39.799
I would call a byte histogram. So for[br]every offset of the MFT record, I would
0:32:39.799,0:32:48.860
then create a[br]histogram and check how many different
0:32:48.860,0:32:53.549
bytes are actually there for plaintext -[br]you know, it's a known plaintext
0:32:53.549,0:32:58.560
attack, so I would need some kind of[br]plaintext - and would do that for every
0:32:58.560,0:33:07.509
offset of every record.
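A small sketch of such a byte histogram (my reconstruction, not the original tooling):

```c
#include <stdint.h>
#include <stddef.h>

#define RECORD_SIZE 1024   /* one MFT record */

/* For every offset within an MFT record, count byte values across many
   plaintext records; where one value dominates, it is a good plaintext
   guess for the keystream position that offset maps to. */
static uint32_t hist[RECORD_SIZE][256];

void add_record(const uint8_t rec[RECORD_SIZE])
{
    for (size_t off = 0; off < RECORD_SIZE; off++)
        hist[off][rec[off]]++;
}

/* most frequent byte at this offset = the plaintext guess for it */
uint8_t best_guess(size_t off, uint32_t *votes)
{
    uint8_t best = 0;
    for (int b = 1; b < 256; b++)
        if (hist[off][b] > hist[off][best])
            best = (uint8_t)b;
    if (votes)
        *votes = hist[off][best];
    return best;
}
```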
0:33:07.509,0:33:12.039
And so the question there was how to get[br]many MFT records. It's quite easy if you have some
0:33:12.039,0:33:16.999
nice colleagues: you just need to hang them[br]off the balcony and shake them a bit,
0:33:16.999,0:33:25.080
then more or less voluntarily they will[br]give you some MFTs to work with. And
0:33:25.080,0:33:30.640
the result of that is quite nice. I mean,[br]for the very first record there's nothing
0:33:30.640,0:33:35.929
much you can do - the very first record[br]will always have very few sample bytes.
0:33:35.929,0:33:41.970
But as the stream progresses, you will[br]have a dramatic change there, so from this relatively low certainty of
0:33:41.970,0:33:51.499
four, I could increase that to more than[br]thirty. So this is somewhat nice, and after a
0:33:51.499,0:34:00.480
bit of doing science, this table drops out[br]from nowhere. So I compared these two
0:34:00.480,0:34:08.230
attack types. So, let's read that[br]from right to left. On the far
0:34:08.230,0:34:16.270
right, I have for the first approach about[br]98% of the MFT records being successfully
0:34:16.270,0:34:22.070
recovered. You know, the good thing[br]with science and with all this academic
0:34:22.070,0:34:29.659
approach is you have a ground truth. So I[br]have a plaintext, an unencrypted hard drive,
0:34:29.659,0:34:36.920
and something from right after infection,[br]and then
0:34:36.920,0:34:43.370
you let the whole infection[br]and encryption process execute. And then you can
0:34:43.370,0:34:49.120
differentiate and take, you know, several[br]snapshots throughout the infection, change
0:34:49.120,0:34:54.539
different key values and all that stuff. So[br]this is a very nice thing about this
0:34:54.539,0:35:02.830
academic approach I was taking there.[br]So I could exactly pinpoint how many
0:35:02.830,0:35:14.280
of these records were perfectly recovered.[br]So for the byte histogram approach I could
0:35:14.280,0:35:17.940
decrypt almost all of the records, which[br]is quite nice because then we have a high
0:35:17.940,0:35:28.250
quality MFT to work with. What's also[br]quite nice is that we have zero wrong key
0:35:28.250,0:35:34.210
stream bytes. What's not so nice, however,[br]is that I was only able to recover about
0:35:34.210,0:35:43.130
1.3 percent of the overall key stream. And[br]remember - this this key stream is about 4
0:35:43.130,0:35:47.840
megabytes in length and I was able to[br]recover only 50 kilobytes of that, so we
0:35:47.840,0:35:55.540
can't recover all of the files. Or is[br]that so? This was my next question
0:35:55.540,0:36:05.370
there. And so I drew another nice diagram.[br]This is the
0:36:05.370,0:36:15.880
MFT in the key stream. So the[br]key stream is only filled in with sample
0:36:15.880,0:36:24.310
bytes at this 2 megabytes mark. And the[br]question is, are there many files in this
0:36:24.310,0:36:29.430
area being encrypted or is there some[br]other bug I could exploit. And I checked
0:36:29.430,0:36:34.680
where the files would lie in the[br]key stream. So I would check how many
0:36:34.680,0:36:41.470
files are at which location in the key[br]stream being encrypted. And you see the
0:36:41.470,0:36:47.550
key stream is used all over the place - so[br]I mean sometimes it's used more, sometimes
0:36:47.550,0:36:56.110
it's used less, however, it's[br]basically used all over the place. And so
0:37:00.611,0:37:09.340
this much of the key stream could, in a[br]perfect scenario - a perfect scenario being a perfect known plaintext oracle - theoretically be recovered.
0:37:09.340,0:37:17.570
However, this is somewhat of a problem[br]here and in the next part of the talk I
0:37:17.570,0:37:26.190
will present you how I was able to solve[br]this problem as well. So remember when the
0:37:26.190,0:37:32.290
file system is actually encrypted by[br](Not)Petya, the file system looks somewhat
0:37:32.290,0:37:37.051
like this. So you would have the MFT, which[br]is totally garbled, so you won't have
0:37:37.051,0:37:42.250
any nice file pointers or data runs[br]pointing to those files. You won't
0:37:42.250,0:37:47.520
have any nice file names and all the[br]stuff. But with the first stage being
0:37:47.520,0:37:53.320
finished, the MFT looks really nice - I[br]mean like almost a hundred percent
0:37:53.320,0:37:57.530
of the records could be[br]recovered. So it looks
0:37:57.530,0:38:04.430
somewhat like this. So you have a bunch of[br]metadata you can now use to analyze the
0:38:04.430,0:38:08.960
the remainder of the files and the[br]remainder of the encryption. So all of this
0:38:04.430,0:38:08.960
MFT, or almost all, is actually decrypted.[br]And for files you would have the very
0:38:17.090,0:38:24.830
first kilobyte being encrypted. The[br]remainder of the file, the, I mean, most
0:38:24.830,0:38:30.540
files are more than a kilobyte in size,[br]right? So all the rest is not encrypted.
0:38:30.540,0:38:37.280
So you would have all this data and[br]metadata to collect information and to
0:38:37.280,0:38:45.520
use, to exploit as known plaintext, as[br]indicators for known plaintext. So I
0:38:45.520,0:38:56.060
thought of three different approaches to,[br]you know, attack this problem. Basically
0:38:56.060,0:39:01.620
I was thinking about: okay what different[br]files are there, and I was quickly
0:39:01.620,0:39:08.100
thinking about different file types. And I[br]mean, the file type can
0:39:08.100,0:39:12.890
be easily gleaned from this, right?[br]Because you would have the file extension
0:39:12.890,0:39:18.540
and it would be basically the file type.[br]And you would have two different types of
0:39:18.540,0:39:22.990
files - you would have structured files[br]and unstructured files. So I thought of
0:39:22.990,0:39:28.730
these and you would have something like,[br]yeah, source code, which I would consider
0:39:28.730,0:39:34.610
as more or less unstructured. And I was[br]calculating the histogram, so this would
0:39:34.610,0:39:39.980
give me some kind of prevalence towards[br]different bytes in the key stream. So it
0:39:39.980,0:39:46.600
would be somewhat like a guessed plaintext[br]attack or something like that. The next
0:39:46.600,0:39:54.670
thing for structured files would be to do[br]the very same approach as with the MFT
0:39:54.670,0:40:00.930
records. I would calculate the histogram[br]for every offset of the first kilobyte,
0:40:00.930,0:40:07.690
and then quickly see how many[br]different bytes there are per offset. And
0:40:07.690,0:40:14.583
the last approach uses somewhat more data.[br]It uses some of the metadata, and also
0:40:14.583,0:40:23.870
some of the file data. I will go into this[br]right now. So, what I basically have here
0:40:23.870,0:40:29.410
as I said, it's only this little[br]portion of the file encrypted. The whole
0:40:29.410,0:40:33.830
remainder of the file is not encrypted,[br]which is quite nice. And also the file
0:40:33.830,0:40:40.490
name and the file size are not encrypted.[br]So what I would do here
0:40:40.490,0:40:44.970
is create a database of[br]known files, of known
0:40:44.970,0:40:51.860
Windows system files, for example. Y'all[br]might remember these nice background
0:40:51.860,0:40:59.310
images, fonts, all this stuff - plaintext[br]is flying around everywhere if you just
0:40:59.310,0:41:07.220
look for it. And I would have basically,[br]yeah, three
0:41:07.220,0:41:17.300
different distinctors between those to[br]know which is the correct plaintext.
0:41:17.300,0:41:24.520
So there is this key: the file name, the[br]file size, and then the hash of this whole
0:41:24.520,0:41:32.180
remainder, of this whole tail. So if all[br]of this 3-tuple would match, then I would
0:41:32.180,0:41:38.640
consider this as a proper plaintext.
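A sketch of that 3-tuple match; the type and the choice of hash are hypothetical, not the original tooling:

```c
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

typedef struct {
    char     name[256];     /* file name, from the recovered MFT         */
    uint64_t size;          /* file size, from the recovered MFT         */
    uint8_t  tail_hash[32]; /* e.g. SHA-256 over bytes 1024..size-1,     */
} FileKey;                  /* i.e. the unencrypted remainder of the file */

bool is_known_plaintext(const FileKey *candidate, const FileKey *db_entry)
{
    /* only if all three components match is the database copy's first
       kilobyte used as known plaintext for the encrypted kilobyte */
    return candidate->size == db_entry->size
        && strcmp(candidate->name, db_entry->name) == 0
        && memcmp(candidate->tail_hash, db_entry->tail_hash, 32) == 0;
}
```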
0:41:38.640,0:41:48.090
However, there are some collisions, so this[br]is not really something that is straightforward. So the initial idea of
0:41:48.090,0:41:55.760
having to only analyze the MFT[br]and then just being able to
0:41:55.760,0:42:03.920
straightforwardly decrypt files needed to be[br]altered a bit. So I added this database of
0:42:03.920,0:42:11.920
known files there, I added another[br]analysis stage in this nice box here, and
0:42:11.920,0:42:19.970
then I would be able to decrypt the files,[br]eventually. So let's do a bit of science
0:42:19.970,0:42:26.430
and check if this approach would be[br]worthwhile pursuing on a real-world
0:42:26.430,0:42:37.690
scenario. So let the science cat do its[br]science, let me have a drink. So what I
0:42:37.690,0:42:46.920
did do here is to create a database of[br]known files - I collected a bunch of
0:42:46.920,0:42:55.290
default Windows installations, which[br]resulted in about 340,000 unique files in
0:42:55.290,0:43:00.090
it. Then I calculated, you know, the[br]different file type histograms I talked
0:43:00.090,0:43:08.910
about, I prepared my test setup, I added[br]there 1000 different files. And you should
0:43:08.910,0:43:15.760
note that these files were not part of[br]this known files database - they were
0:43:15.760,0:43:20.520
different from that because otherwise it[br]would have been easy to do. Then I would
0:43:20.520,0:43:28.010
infect this machine, then let it be[br]encrypted by (Not)Petya, and then attempt
0:43:28.010,0:43:40.170
recovery. And these are the results there.[br]So, I did this with 4 different runs, so I
0:43:40.170,0:43:46.430
tried every approach separately, and then[br]combined the three approaches. And the
0:43:46.430,0:43:52.240
outcome was something like this: so I[br]would have only two files by the general
0:43:52.240,0:43:59.950
histogram approach being able to[br]be decrypted. At least 8%
0:43:59.950,0:44:06.510
were decrypted by the[br]location histogram approach. And by the
0:44:06.510,0:44:12.480
known files approach over 90% of the files[br]could be successfully recovered. And even
0:44:12.480,0:44:22.280
better, the combined outcome would be[br]almost all files being able to be decrypted.
0:44:22.280,0:44:30.530
So, so much about academia. So the problem[br]there is if you apply this to the real
0:44:30.530,0:44:37.350
world, you could get into more trouble[br]there. And there was lots and lots more
0:44:37.350,0:44:42.910
to think about. There were, for[br]example, non-default Windows installations
0:44:42.910,0:44:48.500
with the whole history of updates so this[br]might be really interesting from a
0:44:48.500,0:44:56.010
forensics perspective. Moreover, there's[br]lots of installed programs to derive known
0:44:56.010,0:45:04.610
plaintext from - for example, dotnet uses[br]a vast source of known plaintext or JDK
0:45:04.610,0:45:11.560
installations. So, especially the JDK[br]would would result in about tens of
0:45:11.560,0:45:17.051
thousands of different source code and[br]HTML files. So this is this really was
0:45:17.051,0:45:27.940
really quite nice. The drawback there was,[br]so there was so much data to collect, the
0:45:27.940,0:45:35.170
first attempts failed miserably because of[br]the sheer amount of RAM I was using. And
0:45:35.170,0:45:43.960
this would result in the admins[br]constantly giving me more RAM in my
0:45:43.960,0:45:52.900
VM, so I would eventually end up with, I[br]think, 128 gigabytes of RAM in my test
0:45:52.900,0:46:00.080
VM. Also you have a much larger master[br]file table you know because of all these
0:46:00.080,0:46:06.230
files that would need to be stored, so to[br]put that in comparison. So this nice
0:46:06.230,0:46:13.530
working prototype, so this nice test[br]setup, I mean, would have an MFT of about
0:46:13.530,0:46:20.310
26 megabytes. And for real-world scenarios[br]you would have something like at least 500
0:46:20.310,0:46:26.980
megabytes, and the MFT could even be like[br]a couple of gigabytes in size. So this
0:46:26.980,0:46:33.060
means much more plaintext. So with[br]these really large MFTs you could quickly
0:46:33.060,0:46:40.180
recover the whole key stream, the whole[br]four megabytes just by looking at the MFT.
0:46:40.180,0:46:46.600
And, in summary, this means that the[br]decryption of (Not)Petya encrypted drives
0:46:46.600,0:46:57.301
is possible. So, for the file systems,[br]the drives I have looked at, it was
0:46:57.301,0:47:03.240
really, I mean, after having all[br]these first bugs out of the way,
0:47:06.891,0:47:10.410
I was able to decrypt and[br]recover all the files there. So
0:47:10.410,0:47:17.550
there's a good chance the vacation photos[br]can be recovered as well. And this will
0:47:17.550,0:47:24.010
conclude my talk, so quick summary -[br](Not)Petya has some severe cryptographic
0:47:24.010,0:47:33.020
failures and flaws, so I would only be[br]able to call this LOLsa and not salsa
0:47:33.020,0:47:41.210
anymore. It might be possible to further[br]look into this, you know, this expand
0:47:41.210,0:47:48.790
key thing. It really has a[br]bunch more cryptographic failures in
0:47:48.790,0:47:56.310
it. I didn't look into that deeper but I[br]know some of you guys are professors and
0:47:56.310,0:48:01.140
this might be a nice homework for your[br]students to look at this (Not)Petya
0:48:01.140,0:48:07.510
implementation and check out some, you[br]know, more advanced crypto analysis
0:48:07.510,0:48:15.690
methods. And you should note that this[br]whole (Not)Petya thing here,
0:48:15.690,0:48:22.050
the whole cryptographic flaws I described,[br]they are not just in (Not)Petya. They are -
0:48:22.050,0:48:28.080
you see the brackets there - they are[br]in each and every version of Petya.
0:48:28.080,0:48:38.251
So all of the drives that you potentially[br]have saved can be decrypted. And so my
0:48:38.251,0:48:45.210
last point is if any of you or any of us[br]falls victim to ransomware you really
0:48:45.210,0:48:51.850
should keep the drives. You keep them[br]untouched in a locker and wait for a talk
0:48:51.850,0:48:59.090
like this, basically. And then hope that[br]someone comes around and is able to
0:48:59.090,0:49:06.540
decrypt the drives, and then you will have[br]your vacation photos back. So that's all
0:49:06.540,0:49:09.010
folks, thank you for your attention.[br]
0:49:09.010,0:49:13.210
Applause
0:49:13.210,0:49:16.090
Herald: Thank you very much Sebastian. Now
0:49:16.090,0:49:19.330
it's time for questions, please queue up[br]by the microphones.
0:49:26.270,0:49:27.910
Mic 1: Hi, well thank you very much for
0:49:27.910,0:49:33.550
sharing your findings with us. I'm from[br]Rotterdam, the largest harbour in Europe,
0:49:33.550,0:49:36.230
and as you might know, we were struck by[br]Petya
0:49:36.230,0:49:39.420
Sebastian Yes[br]Mic 1: Two terminals went down for a
0:49:39.420,0:49:44.260
couple of days, a couple of hundred[br]billion euros of damage. Your approach is
0:49:44.260,0:49:48.930
quite theoretical, so now to practice -[br]if you had been there this summer with
0:49:44.260,0:49:48.930
these findings, would you have been able to[br]help us and decrypt the files and get all
0:49:53.840,0:49:58.000
the companies up and running again? Or is[br]this too academic?
0:49:58.000,0:50:04.920
Sebastian: No, it's a practical approach I[br]mean, I work for CrowdStrike and we had
0:50:04.920,0:50:13.640
some occasions where we were able to help[br]the customers. So it's a very practical
0:50:13.640,0:50:21.030
thing to do. And I was trying to[br]insinuate this with, you know, this slide
0:50:21.030,0:50:26.350
here. So I was talking about the real[br]world in this scenario. So I looked at
0:50:26.350,0:50:32.100
about 50 different encrypted hard drives[br]with (Not)Petya, and I was able to decrypt
0:50:32.100,0:50:38.730
all of them, or most of them. Mostly[br]the ones not being able to be decrypted were due to
0:50:38.730,0:50:45.010
some, well, let's say, layer-8 mistakes.[br]Mic 1: Awesome. Have you shared your
0:50:45.010,0:50:48.930
findings with nomoreransoms.org?[br]Sebastian: No, I don't know about this
0:50:48.930,0:50:50.910
site[br]Mic 1: They provide decryption tools for
0:50:50.910,0:50:56.860
ransomware. Please do, thank you[br]Herald: Microphone six, please
0:50:56.860,0:51:01.930
Mic 6: Thank you, in the beginning you[br]mentioned that basically the key was
0:51:01.930,0:51:05.490
shortened to what was a 24-bit, something[br]like that?
0:51:05.490,0:51:09.560
Sebastian: 26, yes[br]Mic 6: 26, yeah. From that point, wasn't a
0:51:09.560,0:51:13.760
brute-force attack way faster and way more[br]reliable?
0:51:13.760,0:51:20.820
Sebastian: Oh no no, don't get me wrong[br]there. So the keystream length was 2^26,
0:51:20.820,0:51:27.660
so the keystream length was four[br]megabytes. So you wouldn't be able to
0:51:27.660,0:51:34.650
brute-force that. So sorry, do you get[br]that? The number of bytes was four
0:51:34.650,0:51:39.120
megabytes, and you couldn't be able to[br]brute-force that.
0:51:39.120,0:51:43.740
Mic 6: Yeah, but you already mentioned at the[br]beginning that you basically shortened it
0:51:43.740,0:51:49.940
down to the possibility of at most two to[br]the power of 26, and it is calculable.
0:51:49.940,0:51:56.870
Sebastian: Yes yes, I totally get the[br]question. But you're missing the
0:51:56.870,0:52:03.230
point there. So it's not the key space,[br]2^26, but the key length is 2^26, which
0:52:03.230,0:52:10.660
would be something - I'm not good at[br]converting this to decimal. Would be
0:52:10.660,0:52:16.930
something like, let me check, two to[br]the... I mean, here, math guys, computer
0:52:16.930,0:52:21.610
guys: how many bits is that?[br][Inaudible audience answer]
0:52:21.610,0:52:25.510
Sebastian: Say again?[br]Crowd: A lot?
0:52:25.510,0:52:31.370
Sebastian: A lot! So it's, I mean,[br]it's four megabytes of key length and you
0:52:25.510,0:52:31.370
couldn't just brute force that, because[br]you would have each of these
0:52:31.370,0:52:35.220
four megabytes to brute[br]force. Got that? So the key is not 2^26,
0:52:35.220,0:52:43.660
but the key length, the keystream length,[br]is that long. Got that? So yeah, this
0:52:43.660,0:52:52.290
key stream would be longer than the[br]Bible. You know, and you couldn't just
0:52:52.290,0:52:58.170
brute force the Bible, the text of the[br]Bible. I mean, given enough time, yes, but
0:52:58.170,0:53:03.050
you know, we all know the stories with[br]monkeys and typewriters. I mean, we can
0:53:03.050,0:53:08.990
talk about that offline, but you're[br]mixing up two different numbers
0:53:12.500,0:53:18.370
there.[br]Mic 6: Yeah I guess, thank you.
0:53:18.370,0:53:20.350
Herald: Questions from the Internet,[br]please
0:53:20.350,0:53:25.710
Signal Angel: Does the MFT encryption work[br]the same for Petya and (Not)Petya?
0:53:25.710,0:53:34.400
Sebastian: Yes, the underlying mechanism[br]is the same. The cryptography differs in
0:53:34.400,0:53:43.160
such a way that the constants are[br]different. So, this little guy here - this
0:53:43.160,0:53:50.110
is different. It would usually be like[br]expand-key-something-something. And here
0:53:50.110,0:53:56.980
it's "invalid sect id". So it's somewhat[br]different, but the MFT encryption, I
0:53:56.980,0:54:03.170
mean the bytecode and the[br]algorithm, is the very same.
0:54:03.170,0:54:12.410
Signal Angel: Thanks[br]Herald: Any more questions? Then please
0:54:12.410,0:54:14.820
give a great round of applause to[br]Sebastian
0:54:14.820,0:54:18.320
Sebastian: Thank you
0:54:18.320,0:54:24.130
Applause
0:54:24.130,0:54:44.000
subtitles created by c3subtitles.de[br]in the year 2018. Join, and help us!