OK everyone, please now join me in welcoming Eric, who is a PhD student at the VU Amsterdam, and he will talk about ASLR. Please give him a warm round of applause.

Hello. Like Herold said, I'm Eric, a PhD student at the VU Amsterdam, in the VUSec group. I will be presenting work that we have done in the group. Most of the work I'm presenting was done by Ben and Kaveh, and by Stephen, who showed that the attack I'm presenting is applicable to all 22 CPU microarchitectures that he tested.

I try to sneak this slide into all my talks, but this time it is especially apt, because this talk is about finding them.

So this talk is about attacking ASLR, which is short for Address Space Layout Randomization. It's an exploit mitigation technique which, as far as deployment is concerned, is one of the success stories since its introduction: it has been widely adopted, and it makes exploitation somewhat more difficult. The way ASLR makes exploitation more difficult is that it changes the location of code and data, usually every time the process runs, so that an attacker cannot rely on certain addresses being the same all the time.

On modern 64-bit architectures the address space is usually 48 bits, which means you can address about 256 terabytes of memory. Of course you cannot read or write everywhere, because your computer probably doesn't have that much memory, so in reality only a very small portion of the address space is allocated to a process. That makes it quite easy to change the location of that memory.

This makes life for an exploit writer a tiny bit more difficult, because it's very useful to know the location of data. For example, if you want to overwrite a return address on the stack, it's nice to know where you can jump to; if you don't know, you may jump into nowhere and the program crashes.

However, not much is needed to bypass this mitigation: you just need to leak the location of the memory. So I really like this backronym. You can try to reuse the bug that you want to exploit to first leak information and then exploit. Or, if that is not possible, you can find another bug which allows you to leak this location. Or maybe you don't have to. This presentation is about an attack which uses a side channel, from JavaScript, on processes in the hardware itself, to discover information about the locations of data or code in memory.

The modern CPU architecture is a wondrous abstraction layer. Even if you, as a programmer, write machine code, there is a lot of stuff you don't have to worry about, especially the things that make your programs fast. Memory access is very slow compared to the CPU on modern computers; that's why there is a caching mechanism built in. Other things are also abstracted away.
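To make concrete why one leak is usually enough: ASLR typically randomizes only the base address of each module or region, so the offsets within a module stay fixed, and a single leaked pointer reveals everything else in it. The sketch below illustrates that arithmetic; the addresses, offsets and names are hypothetical, not taken from the talk.

```typescript
// Hypothetical illustration: deriving everything from one leaked pointer.
// All concrete values here are invented for the example.

// Offset of the leaked function inside its library, known from the binary on disk.
const LEAKED_FUNC_OFFSET = 0x21b40n;
// Offset of some other useful function or gadget inside the same library.
const GADGET_OFFSET = 0x8f3c2n;

// Suppose an info-leak bug handed us this run-time pointer.
const leakedPointer = 0x7f3ad521bb40n;

// ASLR shifted the whole library by one random amount, so subtracting the
// known offset recovers the randomized base...
const libraryBase = leakedPointer - LEAKED_FUNC_OFFSET;

// ...and every other address in the library follows from that base.
const gadgetAddress = libraryBase + GADGET_OFFSET;

console.log(`library base:   0x${libraryBase.toString(16)}`);
console.log(`gadget address: 0x${gadgetAddress.toString(16)}`);
```

The hard part is obtaining that first address, which is exactly what the attack described in this talk provides.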
For example, if your program does a memory access, the data is written to the cache, but where is it written? Your program gives a virtual address to the CPU, and the CPU needs to translate that to a physical address, which is done by a component called the memory management unit. The memory management unit has a small cache of mappings from virtual memory to physical memory, but if an address is not in that cache, it has to do a page table walk. The page table walk is what we are going to attack: we'll measure the effect that the page table walk has on the L3 cache, the last and biggest cache on the CPU, to find out what happens during the page table walk.

We're talking about doing a timing attack from JavaScript to measure whether memory gets accessed, which means we need a pretty good timer to be able to do this. Luckily for us, the browser standards committees have come up with an API to do just that: you can take a timestamp, do an operation, take another timestamp, and you get a very crisp time measurement. That lasted until someone published a paper which showed, basically, that you can do a last-level cache attack on the CPU from JavaScript and discover something. So the browser makers made the time measurements much coarser: every microsecond or so you get a little bump, and then for one microsecond nothing changes.

But all is not lost for the attacker, because you can turn the coarse-grained timer into a fine-grained timer. What you can do, for example, is wait for this bump to happen, then quickly do an operation and start a counter. The longer the operation takes, the smaller the counter is when the next jump happens. In Chrome they chose to vary the length of the interval at which this happens, but you can still do multiple measurements, take an average, and get a good measurement.

However, we can do better. The browser makers decided to make this a bit more difficult, but where the browser standards committees take, they also give. They decided to implement an object called the SharedArrayBuffer, which allows multiple threads, which are called web workers in JavaScript, to work on a single piece of memory. They decided to enable this by default, which was actually after we published the attack, so they have basically given up on preventing nanosecond-scale time measurements in JavaScript. The SharedArrayBuffer can be used for other things, but I will not talk about that today.

So how can we measure time using shared memory? Well, it's quite simple. One thread is used for doing the time measurement and the other thread does the operation. The timer thread waits until the thread which does the operation sets a variable and starts the operation; the timer thread sees that the shared buffer has changed and starts counting, and when the operation is done, the second thread changes the buffer again and the counter thread stops.
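To make the clock-edge trick described above a bit more concrete: wait for the coarse timestamp to tick over, run the operation, then count how many times a tight loop can spin before the next tick; slower operations leave fewer spins. The sketch below is mine, not the speaker's implementation, and it assumes the operation finishes within one tick interval.

```typescript
// Estimate the duration of `op` using only a coarse timer such as performance.now().
// Assumes `op` takes less than one tick interval of the coarse clock.
function clockEdgeMeasure(op: () => void): number {
  // Spin until the coarse timestamp changes, so we start right on a tick edge.
  const before = performance.now();
  while (performance.now() === before) { /* wait for the next tick */ }
  const edge = performance.now();   // timestamp of the tick we just crossed

  op();                             // the operation we want to time

  // Count how often we can spin before the clock ticks again: the longer `op`
  // took, the less of the tick interval is left, so the smaller this count.
  let spins = 0;
  while (performance.now() === edge) {
    spins++;
  }
  return spins;                     // smaller count => slower operation
}
```

Because the tick length may itself be jittered, as the speaker mentions for Chrome, a single call is noisy; averaging over many calls recovers a usable estimate.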
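The shared-memory timer the speaker describes can be sketched roughly as follows: a dedicated web worker increments a counter in a SharedArrayBuffer while a flag is set, and the measuring thread raises the flag, runs the operation, clears the flag, and reads the count. This is a simplified illustration under those assumptions; the worker file name and the index layout are invented for the example.

```typescript
// --- main thread -----------------------------------------------------------
// shared[0] = start/stop flag, shared[1] = tick counter
const sab = new SharedArrayBuffer(8);
const shared = new Int32Array(sab);

const timerWorker = new Worker("timer-worker.js");   // hypothetical file name
timerWorker.postMessage(sab);

function timeOperation(op: () => void): number {
  Atomics.store(shared, 1, 0);    // reset the counter
  Atomics.store(shared, 0, 1);    // tell the worker to start counting
  op();                           // e.g. the memory access we want to time
  Atomics.store(shared, 0, 0);    // tell the worker to stop
  return Atomics.load(shared, 1); // elapsed time, measured in worker "ticks"
}

// --- timer-worker.js -------------------------------------------------------
onmessage = (event: MessageEvent) => {
  const counter = new Int32Array(event.data as SharedArrayBuffer);
  for (;;) {
    // Wait until the main thread raises the start flag.
    while (Atomics.load(counter, 0) === 0) { /* spin */ }
    // Count as fast as possible until the flag is cleared again.
    while (Atomics.load(counter, 0) === 1) {
      Atomics.add(counter, 1, 1);
    }
  }
};
```

The counter does not measure nanoseconds directly: it counts loop iterations of the worker, which at a steady CPU frequency is proportional to elapsed time, and that is all a cache-timing attack needs.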
So this gives a very crisp measurement. Now we have a nanosecond-scale timer and we can do side-channel attacks from JavaScript.

We'll be doing a timing attack on the last-level cache. When the CPU accesses memory, everything happens at the granularity of a cache line, which is 64 bytes. Within, for example, the level 3 cache, a certain physical address maps onto a certain cache set, and such a cache set can, on a four-core desktop Intel machine for example, contain 16 different cache lines. I'll talk about a modern Intel machine, but the concepts also translate to other microarchitectures.
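The mapping from an address to a cache set is simple arithmetic: drop the 6 bits that address a byte within the 64-byte line, and the next bits select the set. The sketch below uses example parameters (an 8 MiB, 16-way cache with 8192 sets) and ignores the slice hashing that real Intel last-level caches add on top; it is only meant to show the congruence that cache attacks rely on.

```typescript
// Simplified model of how an address maps into a set-associative cache.
// Example parameters; real caches differ per CPU, and Intel last-level caches
// additionally hash the address to select a slice, which is ignored here.
const LINE_SIZE = 64;   // bytes per cache line
const NUM_SETS = 8192;  // e.g. 8 MiB / (16 ways * 64 bytes)
const WAYS = 16;        // cache lines per set

function cacheSetIndex(physicalAddress: number): number {
  const lineNumber = Math.floor(physicalAddress / LINE_SIZE); // drop the 6 offset bits
  return lineNumber % NUM_SETS;                               // low line bits pick the set
}

// Addresses exactly NUM_SETS * LINE_SIZE bytes apart land in the same set, so
// collecting more than WAYS of them is enough to evict everything in that set.
const stride = NUM_SETS * LINE_SIZE;           // 512 KiB with these numbers
console.log(cacheSetIndex(0x100000));          // 0
console.log(cacheSetIndex(0x100000 + stride)); // 0 again: same set
```

Roughly speaking, this congruence is what makes the attack possible: the page table walk brings particular cache lines into particular sets, and with a fine-grained timer an attacker can detect whether those sets were touched.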