OK everyone, please now join me in welcoming Eric, who is a PhD student at the VU Amsterdam, and he will talk about ASLR. Please give him a warm round of applause.

Hello. Like Herold said, I'm Eric, a PhD student at the VU Amsterdam, in the VUSec group. I will be presenting work that we have done in the group. Most of the work I'm presenting was done by Ben and Kaveh, and by Stephen, who showed that the attack I'm presenting is applicable to all 22 CPU microarchitectures that he tested.

I try to sneak this slide into all my talks, but this time it is especially apt, because this talk is about finding them.

So this talk is about attacking ASLR, which is short for Address Space Layout Randomization. It's an exploit mitigation technique which, as far as deployment is concerned, is one of the success stories since its introduction: it has been widely adopted, and it makes exploitation somewhat more difficult. The way ASLR makes exploitation more difficult is that it changes the location of code and data, usually every time the process runs, so that an attacker cannot rely on certain addresses being the same all the time.

On modern 64-bit architectures the address space is usually 48 bits, which means you can address about 256 terabytes of memory. Of course you cannot read or write everywhere, because your computer probably doesn't have that much memory, so in reality only a very small portion of the address space is allocated to a process. That makes it quite easy to change the location of that memory.

This makes life for an exploit writer a tiny bit more difficult, because it's very useful to know the location of data. For example, if you want to overwrite a return address on the stack, it's nice to know where you can jump to; if you don't know, you may jump into nowhere and the program crashes.

However, not much is needed to bypass this mitigation: you just need to leak the location of the memory. So I really like this backronym. You can try to reuse the bug that you want to exploit to first leak information and then exploit. Or, if that is not possible, you can find another bug which allows you to leak this location. Or maybe you don't have to. This presentation is about an attack which uses a side channel, from JavaScript, on processes in the hardware itself, to discover information about the locations of data or code in memory.

The modern CPU architecture is a wondrous abstraction layer. Even if you, as a programmer, write machine code, there is a lot of stuff you don't have to worry about, especially the things that make your programs fast. Memory access is very slow compared to the CPU on modern computers; that's why there is a caching mechanism built in. Other things are also abstracted away.
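To make concrete why one leak is usually enough: ASLR typically randomizes only the base address of each module or region, so the offsets within a module stay fixed, and a single leaked pointer reveals everything else in it. The sketch below illustrates that arithmetic; the addresses, offsets and names are hypothetical, not taken from the talk.

```typescript
// Hypothetical illustration: deriving everything from one leaked pointer.
// All concrete values here are invented for the example.

// Offset of the leaked function inside its library, known from the binary on disk.
const LEAKED_FUNC_OFFSET = 0x21b40n;
// Offset of some other useful function or gadget inside the same library.
const GADGET_OFFSET = 0x8f3c2n;

// Suppose an info-leak bug handed us this run-time pointer.
const leakedPointer = 0x7f3ad521bb40n;

// ASLR shifted the whole library by one random amount, so subtracting the
// known offset recovers the randomized base...
const libraryBase = leakedPointer - LEAKED_FUNC_OFFSET;

// ...and every other address in the library follows from that base.
const gadgetAddress = libraryBase + GADGET_OFFSET;

console.log(`library base:   0x${libraryBase.toString(16)}`);
console.log(`gadget address: 0x${gadgetAddress.toString(16)}`);
```

The hard part is obtaining that first address, which is exactly what the attack described in this talk provides.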
For example, if your program does a memory access, the data is written to the cache, but where is it written? Your program gives a virtual address to the CPU, and the CPU needs to translate that to a physical address, which is done by a component called the memory management unit. The memory management unit has a small cache of mappings from virtual memory to physical memory, but if an address is not in that cache, it has to do a page table walk. The page table walk is what we are going to attack: we'll measure the effect that the page table walk has on the L3 cache, the last and biggest cache on the CPU, to find out what happens during the page table walk.

We're talking about doing a timing attack from JavaScript to measure whether memory gets accessed, which means we need a pretty good timer to be able to do this. Luckily for us, the browser standards committees have come up with an API to do just that: you can take a timestamp, do an operation, take another timestamp, and you get a very crisp time measurement. That lasted until someone published a paper which showed, basically, that you can do a last-level cache attack on the CPU from JavaScript and discover something. So the browser makers made the time measurements much coarser: every microsecond or so you get a little bump, and then for one microsecond nothing changes.

But all is not lost for the attacker, because you can turn the coarse-grained timer into a fine-grained timer. What you can do, for example, is wait for this bump to happen, then quickly do an operation and start a counter. The longer the operation takes, the smaller the counter is when the next jump happens. In Chrome they chose to vary the length of the interval at which this happens, but you can still do multiple measurements, take an average, and get a good measurement.

However, we can do better. The browser makers decided to make this a bit more difficult, but where the browser standards committees take, they also give. They decided to implement an object called the SharedArrayBuffer, which allows multiple threads, which are called web workers in JavaScript, to work on a single piece of memory. They decided to enable this by default, which was actually after we published the attack, so they have basically given up on preventing nanosecond-scale time measurements in JavaScript. The SharedArrayBuffer can be used for other things, but I will not talk about that today.

So how can we measure time using shared memory? Well, it's quite simple. One thread is used for doing the time measurement and the other thread does the operation. The timer thread waits until the thread which does the operation sets a variable and starts the operation; the timer thread sees that the shared buffer has changed and starts counting, and when the operation is done, the second thread changes the buffer again and the counter thread stops.
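To make the clock-edge trick described above a bit more concrete: wait for the coarse timestamp to tick over, run the operation, then count how many times a tight loop can spin before the next tick; slower operations leave fewer spins. The sketch below is mine, not the speaker's implementation, and it assumes the operation finishes within one tick interval.

```typescript
// Estimate the duration of `op` using only a coarse timer such as performance.now().
// Assumes `op` takes less than one tick interval of the coarse clock.
function clockEdgeMeasure(op: () => void): number {
  // Spin until the coarse timestamp changes, so we start right on a tick edge.
  const before = performance.now();
  while (performance.now() === before) { /* wait for the next tick */ }
  const edge = performance.now();   // timestamp of the tick we just crossed

  op();                             // the operation we want to time

  // Count how often we can spin before the clock ticks again: the longer `op`
  // took, the less of the tick interval is left, so the smaller this count.
  let spins = 0;
  while (performance.now() === edge) {
    spins++;
  }
  return spins;                     // smaller count => slower operation
}
```

Because the tick length may itself be jittered, as the speaker mentions for Chrome, a single call is noisy; averaging over many calls recovers a usable estimate.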
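The shared-memory timer the speaker describes can be sketched roughly as follows: a dedicated web worker increments a counter in a SharedArrayBuffer while a flag is set, and the measuring thread raises the flag, runs the operation, clears the flag, and reads the count. This is a simplified illustration under those assumptions; the worker file name and the index layout are invented for the example.

```typescript
// --- main thread -----------------------------------------------------------
// shared[0] = start/stop flag, shared[1] = tick counter
const sab = new SharedArrayBuffer(8);
const shared = new Int32Array(sab);

const timerWorker = new Worker("timer-worker.js");   // hypothetical file name
timerWorker.postMessage(sab);

function timeOperation(op: () => void): number {
  Atomics.store(shared, 1, 0);    // reset the counter
  Atomics.store(shared, 0, 1);    // tell the worker to start counting
  op();                           // e.g. the memory access we want to time
  Atomics.store(shared, 0, 0);    // tell the worker to stop
  return Atomics.load(shared, 1); // elapsed time, measured in worker "ticks"
}

// --- timer-worker.js -------------------------------------------------------
onmessage = (event: MessageEvent) => {
  const counter = new Int32Array(event.data as SharedArrayBuffer);
  for (;;) {
    // Wait until the main thread raises the start flag.
    while (Atomics.load(counter, 0) === 0) { /* spin */ }
    // Count as fast as possible until the flag is cleared again.
    while (Atomics.load(counter, 0) === 1) {
      Atomics.add(counter, 1, 1);
    }
  }
};
```

The counter does not measure nanoseconds directly: it counts loop iterations of the worker, which at a steady CPU frequency is proportional to elapsed time, and that is all a cache-timing attack needs.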
So this gives a very crisp measurement. Now we have a nanosecond-scale timer and we can do side-channel attacks from JavaScript.

We'll be doing a timing attack on the last-level cache. When the CPU accesses memory, everything happens at the granularity of a cache line, which is 64 bytes. Within, for example, the level 3 cache, a certain physical address maps onto a certain cache set, and such a cache set can, on a four-core desktop Intel machine for example, contain 16 different cache lines. I'll talk about a modern Intel machine, but the concepts also translate to other microarchitectures.
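The mapping from an address to a cache set is simple arithmetic: drop the 6 bits that address a byte within the 64-byte line, and the next bits select the set. The sketch below uses example parameters (an 8 MiB, 16-way cache with 8192 sets) and ignores the slice hashing that real Intel last-level caches add on top; it is only meant to show the congruence that cache attacks rely on.

```typescript
// Simplified model of how an address maps into a set-associative cache.
// Example parameters; real caches differ per CPU, and Intel last-level caches
// additionally hash the address to select a slice, which is ignored here.
const LINE_SIZE = 64;   // bytes per cache line
const NUM_SETS = 8192;  // e.g. 8 MiB / (16 ways * 64 bytes)
const WAYS = 16;        // cache lines per set

function cacheSetIndex(physicalAddress: number): number {
  const lineNumber = Math.floor(physicalAddress / LINE_SIZE); // drop the 6 offset bits
  return lineNumber % NUM_SETS;                               // low line bits pick the set
}

// Addresses exactly NUM_SETS * LINE_SIZE bytes apart land in the same set, so
// collecting more than WAYS of them is enough to evict everything in that set.
const stride = NUM_SETS * LINE_SIZE;           // 512 KiB with these numbers
console.log(cacheSetIndex(0x100000));          // 0
console.log(cacheSetIndex(0x100000 + stride)); // 0 again: same set
```

Roughly speaking, this congruence is what makes the attack possible: the page table walk brings particular cache lines into particular sets, and with a fine-grained timer an attacker can detect whether those sets were touched.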