[Script Info] Title: [Events] Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text Dialogue: 0,0:00:00.00,0:00:09.52,Default,,0000,0000,0000,,{\i1}36c3 preroll music{\i0} Dialogue: 0,0:00:18.41,0:00:23.25,Default,,0000,0000,0000,,Herald: So, the next talk for this\Nafternoon is about high-speed binary Dialogue: 0,0:00:23.25,0:00:28.11,Default,,0000,0000,0000,,fuzzing. We have two researchers that will\Nbe presenting the product of their latest Dialogue: 0,0:00:28.11,0:00:33.64,Default,,0000,0000,0000,,work, which is a framework for static\Nbinary rewriting. Our speakers are: the Dialogue: 0,0:00:33.64,0:00:38.58,Default,,0000,0000,0000,,first one is a computer science master's\Nstudent at EPFL and the second one is a Dialogue: 0,0:00:38.58,0:00:42.73,Default,,0000,0000,0000,,security researcher and assistant\Nprofessor at EPFL. Please give a big round Dialogue: 0,0:00:42.73,0:00:45.05,Default,,0000,0000,0000,,of applause to Nspace and gannimo. Dialogue: 0,0:00:45.05,0:00:50.28,Default,,0000,0000,0000,,{\i1}Applause{\i0} Dialogue: 0,0:00:50.28,0:00:52.61,Default,,0000,0000,0000,,gannimo (Mathias Payer): Thanks for the\Nintroduction. It's a pleasure to be here, Dialogue: 0,0:00:52.61,0:00:57.85,Default,,0000,0000,0000,,as always. We're going to talk about\Ndifferent ways to speed up your fuzzing Dialogue: 0,0:00:57.85,0:01:02.05,Default,,0000,0000,0000,,and to find different kinds of\Nvulnerabilities or to tweak your binaries Dialogue: 0,0:01:02.05,0:01:08.07,Default,,0000,0000,0000,,in somewhat unintended ways. I'm Mathias\NPayer or I go by gannimo on Twitter and I Dialogue: 0,0:01:08.07,0:01:14.44,Default,,0000,0000,0000,,am an assistant professor at EPFL working\Non different forms of software security: Dialogue: 0,0:01:14.44,0:01:18.70,Default,,0000,0000,0000,,fuzzing, sanitization, but also different\Nkinds of mitigations. 
And Matteo over Dialogue: 0,0:01:18.70,0:01:24.16,Default,,0000,0000,0000,,there is working on his master's thesis on\Ndifferent forms of binary rewriting for Dialogue: 0,0:01:24.16,0:01:27.82,Default,,0000,0000,0000,,the kernel. And today we're going to take\Nyou on a journey on how to actually Dialogue: 0,0:01:27.82,0:01:32.18,Default,,0000,0000,0000,,develop very fast and very efficient\Nbinary rewriting mechanisms that allow you Dialogue: 0,0:01:32.18,0:01:37.71,Default,,0000,0000,0000,,to do unintended modifications to the\Nbinaries and allow you to explore Dialogue: 0,0:01:37.71,0:01:45.70,Default,,0000,0000,0000,,different kinds of unintended features in\Nbinaries. So about this talk. What we Dialogue: 0,0:01:45.70,0:01:49.73,Default,,0000,0000,0000,,discovered or the reason why we set out on\Nthis journey was that fuzzing binaries is Dialogue: 0,0:01:49.73,0:01:56.46,Default,,0000,0000,0000,,really, really hard. There's very few\Ntools in user space. There's—it's Dialogue: 0,0:01:56.46,0:01:59.68,Default,,0000,0000,0000,,extremely hard to set it up and it's\Nextremely hard to set it up in a Dialogue: 0,0:01:59.68,0:02:04.48,Default,,0000,0000,0000,,performant way. The setup is complex. You\Nhave to compile different tools. You have Dialogue: 0,0:02:04.48,0:02:08.52,Default,,0000,0000,0000,,to modify it. And the results are not\Nreally that satisfactory. As soon as you Dialogue: 0,0:02:08.52,0:02:13.32,Default,,0000,0000,0000,,move to the kernel, fuzzing binaries in a\Nkernel is even harder. There's no tooling Dialogue: 0,0:02:13.32,0:02:16.88,Default,,0000,0000,0000,,whatsoever, there's very few users\Nactually working with binary code in the Dialogue: 0,0:02:16.88,0:02:22.63,Default,,0000,0000,0000,,kernel or modifying binary code, and it's\Njust a nightmare to work with. 
So what we Dialogue: 0,0:02:22.63,0:02:26.85,Default,,0000,0000,0000,,are presenting today is a new approach\Nthat allows you to instrument any form of Dialogue: 0,0:02:26.85,0:02:31.92,Default,,0000,0000,0000,,binary code or modern binary code based on\Nstatic rewriting, which gives you full Dialogue: 0,0:02:31.92,0:02:36.82,Default,,0000,0000,0000,,native performance. You only pay for the\Ninstrumentation that you add, and you can Dialogue: 0,0:02:36.82,0:02:41.69,Default,,0000,0000,0000,,do very heavyweight transformations on top\Nof it. The picture, if you look at the Dialogue: 0,0:02:41.69,0:02:47.47,Default,,0000,0000,0000,,modern system, let's say we are looking at\Na modern setup. Let's say you're looking Dialogue: 0,0:02:47.47,0:02:52.70,Default,,0000,0000,0000,,at cat pictures in your browser: Chrome\Nplus the kernel plus the libc plus the Dialogue: 0,0:02:52.70,0:02:57.92,Default,,0000,0000,0000,,graphical user interface together clog up\Nat about 100 million lines of code. Dialogue: 0,0:02:57.92,0:03:02.67,Default,,0000,0000,0000,,Instrumenting all of this for some form of\Nsecurity analysis is a nightmare, Dialogue: 0,0:03:02.67,0:03:06.69,Default,,0000,0000,0000,,especially along this large stack of\Nsoftware. There's quite a bit of different Dialogue: 0,0:03:06.69,0:03:11.26,Default,,0000,0000,0000,,compilers involved. There's different\Nlinkers. It may be compiled on a different Dialogue: 0,0:03:11.26,0:03:14.62,Default,,0000,0000,0000,,system, with different settings and so on.\NAnd then getting your instrumentation Dialogue: 0,0:03:14.62,0:03:18.57,Default,,0000,0000,0000,,across all of this is pretty much\Nimpossible and extremely hard to work Dialogue: 0,0:03:18.57,0:03:24.27,Default,,0000,0000,0000,,with. And we want to enable you to select\Nthose different parts that you're actually Dialogue: 0,0:03:24.27,0:03:29.63,Default,,0000,0000,0000,,interested in. 
Modify those and then focus\Nyour fuzzing or analysis approaches on Dialogue: 0,0:03:29.63,0:03:35.04,Default,,0000,0000,0000,,those small subsets of the code, giving\Nyou a much better and stronger capability Dialogue: 0,0:03:35.04,0:03:38.69,Default,,0000,0000,0000,,to test the systems that you're, or those\Nparts of the system that you're really, Dialogue: 0,0:03:38.69,0:03:45.66,Default,,0000,0000,0000,,really interested in. Who's worked on\Nfuzzing before? Quick show of hands. Wow, Dialogue: 0,0:03:45.66,0:03:54.38,Default,,0000,0000,0000,,that's a bunch of you. Do you use AFL?\NYeah, most of you, AFL. Libfuzzer? Cool, Dialogue: 0,0:03:54.38,0:03:59.76,Default,,0000,0000,0000,,about 10, 15 percent libfuzzer, 30 percent\Nfuzzing, and AFL. There's a quite good Dialogue: 0,0:03:59.76,0:04:03.98,Default,,0000,0000,0000,,knowledge of fuzzing, so I'm not going to\Nspend too much time on fuzzing, but for Dialogue: 0,0:04:03.98,0:04:07.50,Default,,0000,0000,0000,,those that haven't really run their\Nfuzzing campaigns yet, it's a very simple Dialogue: 0,0:04:07.50,0:04:12.06,Default,,0000,0000,0000,,software testing technique. You're\Neffectively taking a binary, let's say Dialogue: 0,0:04:12.06,0:04:16.48,Default,,0000,0000,0000,,Chrome, as a target and you're running\Nthis in some form of execution Dialogue: 0,0:04:16.48,0:04:20.96,Default,,0000,0000,0000,,environment. And fuzzing then consists of\Nsome form of input generation that creates Dialogue: 0,0:04:20.96,0:04:26.62,Default,,0000,0000,0000,,new test cases, throws them at your\Nprogram and sees—and checks what is Dialogue: 0,0:04:26.62,0:04:31.31,Default,,0000,0000,0000,,happening with your program. And either\Neverything is OK, and your code is being Dialogue: 0,0:04:31.31,0:04:35.64,Default,,0000,0000,0000,,executed, and your input—the program\Nterminates, everything is fine, or you Dialogue: 0,0:04:35.64,0:04:39.77,Default,,0000,0000,0000,,have a bug report. If you have a bug\Nreport, you can use this. 
Find the Dialogue: 0,0:04:39.77,0:04:44.52,Default,,0000,0000,0000,,vulnerability, maybe develop a PoC and\Nthen come up with some form of either Dialogue: 0,0:04:44.52,0:04:49.24,Default,,0000,0000,0000,,exploit or patch or anything else. Right.\NSo this is pretty much fuzzing in a Dialogue: 0,0:04:49.24,0:04:55.56,Default,,0000,0000,0000,,nutshell. How do you get fuzzing to be\Neffective? How can you cover large source Dialogue: 0,0:04:55.56,0:05:00.42,Default,,0000,0000,0000,,bases, complex code, and complex\Nenvironments? Well, there's a couple of Dialogue: 0,0:05:00.42,0:05:04.98,Default,,0000,0000,0000,,simple steps that you can take. And let's\Nwalk quickly through effective fuzzing Dialogue: 0,0:05:04.98,0:05:12.63,Default,,0000,0000,0000,,101. Well, first, you want to be able to\Ncreate test cases that actually trigger Dialogue: 0,0:05:12.63,0:05:18.10,Default,,0000,0000,0000,,bugs. And this is a very, very\Ncomplicated part. And we need Dialogue: 0,0:05:18.10,0:05:22.80,Default,,0000,0000,0000,,to have some notion of the inputs that a\Nprogram accepts. And we need to have some Dialogue: 0,0:05:22.80,0:05:27.78,Default,,0000,0000,0000,,notion of how we can explore different\Nparts of the program, right? Different Dialogue: 0,0:05:27.78,0:05:30.87,Default,,0000,0000,0000,,parts of functionality. Well, on one hand,\Nwe could have a developer write all the Dialogue: 0,0:05:30.87,0:05:34.37,Default,,0000,0000,0000,,test cases by hand, but this would be kind\Nof boring. It would also require a lot of Dialogue: 0,0:05:34.37,0:05:40.22,Default,,0000,0000,0000,,human effort in creating these different\Ninputs and so on. 
So coverage-guided Dialogue: 0,0:05:40.22,0:05:46.99,Default,,0000,0000,0000,,fuzzing has evolved as a very simple way\Nto guide the fuzzing process, leveraging Dialogue: 0,0:05:46.99,0:05:51.22,Default,,0000,0000,0000,,the information on which parts of the code\Nhave been executed by simply tracing the Dialogue: 0,0:05:51.22,0:05:58.50,Default,,0000,0000,0000,,individual path through the program based\Non the execution flow. So the Dialogue: 0,0:05:58.50,0:06:03.46,Default,,0000,0000,0000,,fuzzer can use this feedback to then\Nmodify the inputs that are being thrown at Dialogue: 0,0:06:03.46,0:06:09.83,Default,,0000,0000,0000,,the fuzzing process. The second step is\Nthe fuzzer must be able to detect bugs. If Dialogue: 0,0:06:09.83,0:06:13.08,Default,,0000,0000,0000,,you've ever looked at a memory corruption,\Nif you're just writing one byte after the Dialogue: 0,0:06:13.08,0:06:18.49,Default,,0000,0000,0000,,end of a buffer, it's highly likely that\Nyour software is not going to crash. But Dialogue: 0,0:06:18.49,0:06:21.18,Default,,0000,0000,0000,,it's still a bug, and it may still be\Nexploitable based on the underlying Dialogue: 0,0:06:21.18,0:06:26.69,Default,,0000,0000,0000,,conditions. So we want to be able to\Ndetect violations as soon as they happen, Dialogue: 0,0:06:26.69,0:06:31.60,Default,,0000,0000,0000,,for example, based on some form of\Nsanitization that we add, some form of Dialogue: 0,0:06:31.60,0:06:35.40,Default,,0000,0000,0000,,instrumentation that we add to the\Nbinary, that then tells us, hey, there's a Dialogue: 0,0:06:35.40,0:06:39.73,Default,,0000,0000,0000,,violation of the memory safety property,\Nand we terminate the application right Dialogue: 0,0:06:39.73,0:06:45.30,Default,,0000,0000,0000,,away as a feedback to the fuzzer. Third,\Nand last but not least: speed is Dialogue: 0,0:06:45.30,0:06:49.57,Default,,0000,0000,0000,,key, right? 
For if you're running a\Nfuzzing campaign, you have a fixed Dialogue: 0,0:06:49.57,0:06:54.64,Default,,0000,0000,0000,,resource budget. You have a couple of\Ncores, and you want to run for 24 hours, Dialogue: 0,0:06:54.64,0:06:59.47,Default,,0000,0000,0000,,48 hours, a couple of days. But in any\Nway, whatever your constraints are, you Dialogue: 0,0:06:59.47,0:07:04.21,Default,,0000,0000,0000,,have a fixed amount of instructions that\Nyou can actually execute. And you have to Dialogue: 0,0:07:04.21,0:07:08.70,Default,,0000,0000,0000,,decide, am I spending my instructions on\Ngenerating new inputs, tracking Dialogue: 0,0:07:08.70,0:07:14.14,Default,,0000,0000,0000,,constraints, finding bugs, running\Nsanitization or executing the program? And Dialogue: 0,0:07:14.14,0:07:17.79,Default,,0000,0000,0000,,you need to find a balance between all of\Nthem, as it is a zero sum game. You have a Dialogue: 0,0:07:17.79,0:07:20.87,Default,,0000,0000,0000,,fixed amount of resources and you're\Ntrying to make the best with these Dialogue: 0,0:07:20.87,0:07:26.89,Default,,0000,0000,0000,,resources. So any overhead is slowing you\Ndown. And again, this becomes an Dialogue: 0,0:07:26.89,0:07:30.82,Default,,0000,0000,0000,,optimization problem. How can you most\Neffectively use the resources that you Dialogue: 0,0:07:30.82,0:07:37.58,Default,,0000,0000,0000,,have available? As we are fuzzing with\Nsource code, it's quite easy to actually Dialogue: 0,0:07:37.58,0:07:41.77,Default,,0000,0000,0000,,leverage existing mechanisms, and we add\Nall that instrumentation at compile time. 
Dialogue: 0,0:07:41.77,0:07:45.63,Default,,0000,0000,0000,,We take source code, we pipe it through\Nthe compiler and modern compiler Dialogue: 0,0:07:45.63,0:07:51.17,Default,,0000,0000,0000,,platforms, allow you to instrument and add\Nlittle code snippets during the Dialogue: 0,0:07:51.17,0:07:55.42,Default,,0000,0000,0000,,compilation process that then carry out\Nall these tasks that are useful for Dialogue: 0,0:07:55.42,0:08:00.27,Default,,0000,0000,0000,,fuzzing. For example, modern compilers can\Nadd short snippets of code for coverage Dialogue: 0,0:08:00.27,0:08:03.99,Default,,0000,0000,0000,,tracking that will record which parts of\Nthe code that you have executed, or for Dialogue: 0,0:08:03.99,0:08:08.77,Default,,0000,0000,0000,,sanitization which record and check every\Nsingle memory access if it is safe or not. Dialogue: 0,0:08:08.77,0:08:12.36,Default,,0000,0000,0000,,And then when you're running the\Ninstrumented binary, everything is fine Dialogue: 0,0:08:12.36,0:08:17.38,Default,,0000,0000,0000,,and you can detect the policy violations\Nas you go along. Now if you would have Dialogue: 0,0:08:17.38,0:08:21.33,Default,,0000,0000,0000,,source code for everything, this would be\Namazing. But it's often not the case, Dialogue: 0,0:08:21.33,0:08:28.13,Default,,0000,0000,0000,,right? We may be able on Linux to cover a\Nlarge part of the protocol stack by Dialogue: 0,0:08:28.13,0:08:33.94,Default,,0000,0000,0000,,focusing only on source-code-based\Napproaches. But there may be applications Dialogue: 0,0:08:33.94,0:08:39.30,Default,,0000,0000,0000,,where no source code is available. 
If we\Nmove to Android or other mobile systems, Dialogue: 0,0:08:39.30,0:08:43.20,Default,,0000,0000,0000,,there are many drivers that are not\Navailable as open source or are just available Dialogue: 0,0:08:43.20,0:08:48.63,Default,,0000,0000,0000,,as binary blobs, or the full software\Nstack may be closed-source and we only get Dialogue: 0,0:08:48.63,0:08:52.33,Default,,0000,0000,0000,,the binaries. And we still want to find\Nvulnerabilities in these complex software Dialogue: 0,0:08:52.33,0:08:59.53,Default,,0000,0000,0000,,stacks that span hundreds of millions of\Nlines of code in a very efficient way. The Dialogue: 0,0:08:59.53,0:09:04.62,Default,,0000,0000,0000,,only solution to cover this part of the\Nmassive code base is to actually rewrite Dialogue: 0,0:09:04.62,0:09:08.99,Default,,0000,0000,0000,,and focus on binaries. A very simple\Napproach could be black-box fuzzing, but Dialogue: 0,0:09:08.99,0:09:11.62,Default,,0000,0000,0000,,this doesn't really get you\Nanywhere because you don't get any Dialogue: 0,0:09:11.62,0:09:16.10,Default,,0000,0000,0000,,feedback; you don't get any information on whether\Nyou're triggering bugs. So one simple Dialogue: 0,0:09:16.10,0:09:20.29,Default,,0000,0000,0000,,approach, and this is the approach that is\Nmost dominantly used today, is to rewrite Dialogue: 0,0:09:20.29,0:09:26.04,Default,,0000,0000,0000,,the program or the binary dynamically. So\Nyou're taking the binary and during Dialogue: 0,0:09:26.04,0:09:32.01,Default,,0000,0000,0000,,execution you use some form of dynamic\Nbinary instrumentation based on Pin, angr, Dialogue: 0,0:09:32.01,0:09:37.14,Default,,0000,0000,0000,,or some other binary rewriting tool and\Ntranslate the target at runtime, adding Dialogue: 0,0:09:37.14,0:09:43.33,Default,,0000,0000,0000,,this binary instrumentation on top of it\Nas you're executing it. 
It's simple, it's Dialogue: 0,0:09:43.33,0:09:46.93,Default,,0000,0000,0000,,straightforward, but it comes at a\Nterrible performance cost of ten to a Dialogue: 0,0:09:46.93,0:09:51.60,Default,,0000,0000,0000,,hundred x slow down, which is not really\Neffective. And you're spending all your Dialogue: 0,0:09:51.60,0:09:57.60,Default,,0000,0000,0000,,cores and your cycles on just executing\Nthe binary instrumentation. So we don't Dialogue: 0,0:09:57.60,0:10:01.79,Default,,0000,0000,0000,,really want to do this and we want to have\Nsomething that's more effective than that. Dialogue: 0,0:10:01.79,0:10:07.36,Default,,0000,0000,0000,,So what we are focusing on is to do static\Nrewriting. It involves a much more complex Dialogue: 0,0:10:07.36,0:10:12.38,Default,,0000,0000,0000,,analysis as we are rewriting the binary\Nbefore it is being executed, and we have Dialogue: 0,0:10:12.38,0:10:17.88,Default,,0000,0000,0000,,to recover all of the control flow, all of\Nthe different mechanisms, but it results Dialogue: 0,0:10:17.88,0:10:24.69,Default,,0000,0000,0000,,in a much better performance. And we can\Nget more bang for our buck. So why is Dialogue: 0,0:10:24.69,0:10:30.83,Default,,0000,0000,0000,,static rewriting so challenging? Well,\Nfirst, simply adding code will break the Dialogue: 0,0:10:30.83,0:10:35.32,Default,,0000,0000,0000,,target. So if you are disassembling this\Npiece of code here, which is a simple loop Dialogue: 0,0:10:35.32,0:10:40.62,Default,,0000,0000,0000,,that loads data, decrements the registers,\Nand then jumps if you're not at the end of Dialogue: 0,0:10:40.62,0:10:46.47,Default,,0000,0000,0000,,the array and keeps iterating through this\Narray. Now, as you look at the jump-not- Dialogue: 0,0:10:46.47,0:10:52.10,Default,,0000,0000,0000,,zero instruction, the last instruction of\Nthe snippet, it is a relative offset. So Dialogue: 0,0:10:52.10,0:10:57.99,Default,,0000,0000,0000,,it jumps backward seven bytes. 
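To make the relative-offset problem concrete, here is a minimal sketch in Python. The addresses, instruction lengths, and the 5-byte instrumentation size are invented for illustration; only the "jumps backward seven bytes" figure comes from the talk. It models a 2-byte `jnz` whose displacement is relative to the address of the next instruction, which is how x86 relative branches are encoded.

```python
def jump_target(insn_addr: int, insn_len: int, rel: int) -> int:
    """Relative branches are resolved against the address of the NEXT instruction."""
    return insn_addr + insn_len + rel

# Hypothetical layout: the loop starts at 0x10 and ends with a 2-byte jnz at 0x15.
LOOP_HEAD = 0x10
JNZ_LEN = 2
jnz_addr = 0x15

# Displacement stored in the original binary: back to the loop head.
rel = LOOP_HEAD - (jnz_addr + JNZ_LEN)          # -7, "jumps backward seven bytes"
original_target = jump_target(jnz_addr, JNZ_LEN, rel)

# Naively insert 5 bytes of instrumentation inside the loop body:
# the jnz moves forward, but its encoded displacement stays the same.
INSERTED = 5
moved_jnz_addr = jnz_addr + INSERTED
stale_target = jump_target(moved_jnz_addr, JNZ_LEN, rel)

# A rewriter has to recompute the displacement for the new layout.
fixed_rel = LOOP_HEAD - (moved_jnz_addr + JNZ_LEN)
fixed_target = jump_target(moved_jnz_addr, JNZ_LEN, fixed_rel)

print(hex(original_target), hex(stale_target), fixed_rel, hex(fixed_target))
```

With the stale displacement the branch lands five bytes past the loop head, in the middle of other code, which is why every reference has to be found and adjusted.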
Which is\Nnice if you just execute the code as is. Dialogue: 0,0:10:57.99,0:11:02.04,Default,,0000,0000,0000,,But as soon as you want to insert new\Ncode, you change the offsets in the Dialogue: 0,0:11:02.04,0:11:07.11,Default,,0000,0000,0000,,program, and you're modifying all these\Ndifferent offsets. And simply adding new Dialogue: 0,0:11:07.11,0:11:12.77,Default,,0000,0000,0000,,code somewhere in between will break the\Ntarget. So a core feature that we need to Dialogue: 0,0:11:12.77,0:11:18.17,Default,,0000,0000,0000,,enforce, or core property that we need to\Nenforce, is that we must find all the Dialogue: 0,0:11:18.17,0:11:24.05,Default,,0000,0000,0000,,references and properly adjust them, both\Nrelative offsets and absolute offsets as Dialogue: 0,0:11:24.05,0:11:29.80,Default,,0000,0000,0000,,well. Getting a single one wrong will\Nbreak everything. What makes this problem Dialogue: 0,0:11:29.80,0:11:34.52,Default,,0000,0000,0000,,really, really hard is that if you're\Nlooking at the binary, a byte is a byte, Dialogue: 0,0:11:34.52,0:11:38.32,Default,,0000,0000,0000,,right? There's no way for us to\Ndistinguish between scalars and Dialogue: 0,0:11:38.32,0:11:43.65,Default,,0000,0000,0000,,references, and in fact they are\Nindistinguishable. Getting a single Dialogue: 0,0:11:43.65,0:11:50.40,Default,,0000,0000,0000,,reference wrong breaks the target and\Nwould introduce arbitrary crashes. So we Dialogue: 0,0:11:50.40,0:11:54.46,Default,,0000,0000,0000,,have to come up with ways that allow us to\Ndistinguish between the two. So for Dialogue: 0,0:11:54.46,0:11:59.90,Default,,0000,0000,0000,,example, if you have this code here, it\Ntakes a value and stores it somewhere on Dialogue: 0,0:11:59.90,0:12:07.06,Default,,0000,0000,0000,,the stack. This could come from two\Ndifferent kind of high-level constructs. 
Dialogue: 0,0:12:07.06,0:12:12.17,Default,,0000,0000,0000,,On one hand, it could be taking the\Naddress of a function and storing this Dialogue: 0,0:12:12.17,0:12:16.54,Default,,0000,0000,0000,,function address somewhere in a stack\Nvariable. Or it could be just storing a Dialogue: 0,0:12:16.54,0:12:21.58,Default,,0000,0000,0000,,scalar in a stack variable. And these two\Nare indistinguishable. When rewriting them, Dialogue: 0,0:12:21.58,0:12:25.22,Default,,0000,0000,0000,,as soon as we add new code, the offsets\Nwill change. If it is a function, we would Dialogue: 0,0:12:25.22,0:12:31.80,Default,,0000,0000,0000,,have to modify the value; if it is a\Nscalar, we have to keep the same value. So Dialogue: 0,0:12:31.80,0:12:35.51,Default,,0000,0000,0000,,how can we come up with a way that allows\Nus to distinguish between the two and Dialogue: 0,0:12:35.51,0:12:44.61,Default,,0000,0000,0000,,rewrite binaries by recovering this\Nmissing information? So let us Dialogue: 0,0:12:44.61,0:12:48.12,Default,,0000,0000,0000,,take you on a journey\Ntowards instrumenting binaries in the Dialogue: 0,0:12:48.12,0:12:53.07,Default,,0000,0000,0000,,kernel. This is what we aim for. We'll\Nstart with the simple case of Dialogue: 0,0:12:53.07,0:12:57.41,Default,,0000,0000,0000,,instrumenting binaries in user land, talk\Nabout different kinds of coverage-guided Dialogue: 0,0:12:57.41,0:13:01.75,Default,,0000,0000,0000,,fuzzing and what kind of instrumentation\Nwe can add, what kind of sanitization we Dialogue: 0,0:13:01.75,0:13:06.39,Default,,0000,0000,0000,,can add, and then focus on putting it\Nall together and applying it to kernel Dialogue: 0,0:13:06.39,0:13:11.48,Default,,0000,0000,0000,,binaries to see what will fall out of\Nit. Let's start with instrumenting Dialogue: 0,0:13:11.48,0:13:17.02,Default,,0000,0000,0000,,binaries first. 
I will now talk a little\Nbit about RetroWrite, our mechanism and Dialogue: 0,0:13:17.02,0:13:24.56,Default,,0000,0000,0000,,our tool that enables static binary\Ninstrumentation by symbolizing existing Dialogue: 0,0:13:24.56,0:13:30.80,Default,,0000,0000,0000,,binaries. So we recover the information\Nand we translate relative offsets and Dialogue: 0,0:13:30.80,0:13:39.71,Default,,0000,0000,0000,,absolute offsets into actual labels that\Nare added to the assembly file. The Dialogue: 0,0:13:39.71,0:13:42.76,Default,,0000,0000,0000,,instrumentation can then work on the\Nrecovered assembly file, which can then be Dialogue: 0,0:13:42.76,0:13:48.11,Default,,0000,0000,0000,,reassembled into a binary that can then be\Nexecuted for fuzzing. We implement Dialogue: 0,0:13:48.11,0:13:52.46,Default,,0000,0000,0000,,coverage tracking and binary address\Nsanitizer on top of this, leveraging Dialogue: 0,0:13:52.46,0:13:57.97,Default,,0000,0000,0000,,abstraction as we go forward. The key to\Nenabling this kind of binary rewriting is Dialogue: 0,0:13:57.97,0:14:02.17,Default,,0000,0000,0000,,position-independent code. And position-\Nindependent code has become the de-facto Dialogue: 0,0:14:02.17,0:14:07.42,Default,,0000,0000,0000,,standard for any code that is being\Nexecuted on a modern system. And it Dialogue: 0,0:14:07.42,0:14:12.02,Default,,0000,0000,0000,,effectively says that it is code that can\Nbe loaded at any arbitrary address in your Dialogue: 0,0:14:12.02,0:14:15.60,Default,,0000,0000,0000,,address space as you are executing\Nbinaries. It is essential and a Dialogue: 0,0:14:15.60,0:14:19.01,Default,,0000,0000,0000,,requirement if you want to have address\Nspace layout randomization or if you want Dialogue: 0,0:14:19.01,0:14:22.27,Default,,0000,0000,0000,,to use shared libraries, which de facto\Nyou want to use in all these different Dialogue: 0,0:14:22.27,0:14:26.09,Default,,0000,0000,0000,,systems. 
So for a couple of years now, all\Nthe code that you're executing on your Dialogue: 0,0:14:26.09,0:14:33.08,Default,,0000,0000,0000,,phones, on your desktops, on your laptops\Nis position-independent code. And the idea Dialogue: 0,0:14:33.08,0:14:36.68,Default,,0000,0000,0000,,behind position-independent code is\Nthat you can load it anywhere in your Dialogue: 0,0:14:36.68,0:14:41.04,Default,,0000,0000,0000,,address space and you can therefore not\Nuse any hard-coded static addresses and Dialogue: 0,0:14:41.04,0:14:44.42,Default,,0000,0000,0000,,you have to inform the system of\Nrelocations or use PC-relative Dialogue: 0,0:14:44.42,0:14:52.92,Default,,0000,0000,0000,,addresses, so that the system can\Nrelocate these different mechanisms. On Dialogue: 0,0:14:52.92,0:14:58.54,Default,,0000,0000,0000,,x86_64, position-independent code\Nleverages addressing that is relative to Dialogue: 0,0:14:58.54,0:15:03.44,Default,,0000,0000,0000,,the instruction pointer. So for example,\Nit uses the current instruction pointer Dialogue: 0,0:15:03.44,0:15:07.52,Default,,0000,0000,0000,,and then a relative offset to that\Ninstruction pointer to reference global Dialogue: 0,0:15:07.52,0:15:12.03,Default,,0000,0000,0000,,variables, other functions and so on. And\Nthis is a very easy way for us to Dialogue: 0,0:15:12.03,0:15:17.71,Default,,0000,0000,0000,,distinguish references from constants,\Nespecially in PIE binaries. If it is RIP- Dialogue: 0,0:15:17.71,0:15:21.36,Default,,0000,0000,0000,,relative, it is a reference; everything\Nelse is a constant. And we can build our Dialogue: 0,0:15:21.36,0:15:25.69,Default,,0000,0000,0000,,translation algorithm and our translation\Nmechanism based on this fundamental Dialogue: 0,0:15:25.69,0:15:30.13,Default,,0000,0000,0000,,finding to remove any form of heuristic\Nthat is needed by focusing especially on Dialogue: 0,0:15:30.13,0:15:35.03,Default,,0000,0000,0000,,position-independent code. 
So we're\Nsupporting position-independent code; we Dialogue: 0,0:15:35.03,0:15:38.92,Default,,0000,0000,0000,,are—we don't support non-position-\Nindependent code, but we give you the Dialogue: 0,0:15:38.92,0:15:43.20,Default,,0000,0000,0000,,guarantee that we can rewrite all the\Ndifferent code that is out there. So Dialogue: 0,0:15:43.20,0:15:48.45,Default,,0000,0000,0000,,symbolization works as follows: If you\Nhave the little bit of code on the lower Dialogue: 0,0:15:48.45,0:15:54.03,Default,,0000,0000,0000,,right, symbolization replaces first all\Nthe references with assembler labels. So Dialogue: 0,0:15:54.03,0:15:57.70,Default,,0000,0000,0000,,look at the call instruction and the jump-\Nnot-zero instruction; the call instruction Dialogue: 0,0:15:57.70,0:16:02.40,Default,,0000,0000,0000,,references an absolute address and the\Njump-not-zero instruction jumps backward Dialogue: 0,0:16:02.40,0:16:08.26,Default,,0000,0000,0000,,relative 15 bytes. So by focusing on these\Nrelative jumps and calls, we can replace Dialogue: 0,0:16:08.26,0:16:12.02,Default,,0000,0000,0000,,them with actual labels and rewrite the\Nbinary as follows: so we're calling a Dialogue: 0,0:16:12.02,0:16:15.84,Default,,0000,0000,0000,,function, replacing it with the actual\Nlabel, and for the jump-not-zero we are Dialogue: 0,0:16:15.84,0:16:21.02,Default,,0000,0000,0000,,inserting an actual label in the assembly\Ncode and adding a backward reference. For Dialogue: 0,0:16:21.02,0:16:26.09,Default,,0000,0000,0000,,PC-relative addresses, for example the\Ndata load, we can then replace it with the Dialogue: 0,0:16:26.09,0:16:30.33,Default,,0000,0000,0000,,name of the actual data that we have\Nrecovered, and we can then add all the Dialogue: 0,0:16:30.33,0:16:35.63,Default,,0000,0000,0000,,different relocations and use that as\Nauxiliary information on top of it. 
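The symbolization steps described above can be sketched as a toy pass in Python. This is not RetroWrite's implementation: the `Insn` record, the `label` and `symbolize` helpers, the `{ref}` placeholder, and all addresses are hypothetical, and the sketch assumes a disassembler has already resolved each reference to an absolute address.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Insn:
    addr: int                     # original address in the binary
    text: str                     # assembly text; "{ref}" marks the reference operand
    target: Optional[int] = None  # resolved address of the reference, if any

def label(addr: int) -> str:
    return f".L{addr:x}"

def symbolize(insns: List[Insn]) -> List[str]:
    """Replace resolved references with labels and emit label definitions."""
    targets = {i.target for i in insns if i.target is not None}
    lines = []
    for insn in insns:
        if insn.addr in targets:              # something refers to this address
            lines.append(f"{label(insn.addr)}:")
        if insn.target is None:
            lines.append(f"    {insn.text}")
        else:
            lines.append("    " + insn.text.format(ref=label(insn.target)))
    return lines

# Hypothetical snippet: a RIP-relative data load, a backward jump, and a call.
snippet = [
    Insn(0x100, "mov {ref}(%rip), %rax", target=0x300),
    Insn(0x107, "dec %rax"),
    Insn(0x10a, "jnz {ref}", target=0x107),
    Insn(0x10c, "call {ref}", target=0x200),
    Insn(0x111, "ret"),
]
listing = symbolize(snippet)

# Once references are labels, new code can be inserted anywhere; the
# assembler recomputes every offset when the listing is reassembled.
instrumented = listing[:2] + ["    nop  # stand-in for inserted instrumentation"] + listing[2:]
print("\n".join(instrumented))
```

Labels for addresses outside the snippet (`.L200`, `.L300`) would be defined where those functions and data objects are emitted in a full listing.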
After Dialogue: 0,0:16:35.63,0:16:43.48,Default,,0000,0000,0000,,these three steps, we can insert any new\Ncode in between, and can therefore add Dialogue: 0,0:16:43.48,0:16:47.42,Default,,0000,0000,0000,,different forms of instrumentation or run\Nsome higher-level analysis on top of Dialogue: 0,0:16:47.42,0:16:53.94,Default,,0000,0000,0000,,it, and then reassemble the file for\Nfuzzing or coverage-guided tracking, Dialogue: 0,0:16:53.94,0:16:59.10,Default,,0000,0000,0000,,address sanitization or whatever else you\Nwant to do. I will now hand over to Dialogue: 0,0:16:59.10,0:17:04.49,Default,,0000,0000,0000,,Matteo, who will cover coverage-guided\Nfuzzing and sanitization and then Dialogue: 0,0:17:04.49,0:17:07.26,Default,,0000,0000,0000,,instrumenting the binaries in the kernel.\NGo ahead. Dialogue: 0,0:17:07.26,0:17:11.30,Default,,0000,0000,0000,,Nspace (Matteo Rizzo): So, now that we\Nhave this really nice framework to rewrite Dialogue: 0,0:17:11.30,0:17:16.50,Default,,0000,0000,0000,,binaries, one of the things that we want\Nto add to actually get the fuzzing is this Dialogue: 0,0:17:16.50,0:17:22.96,Default,,0000,0000,0000,,coverage-tracking instrumentation. So\Ncoverage-guided fuzzing is a way, a Dialogue: 0,0:17:22.96,0:17:27.55,Default,,0000,0000,0000,,method, to let the fuzzer discover\Ninteresting inputs and interesting paths through Dialogue: 0,0:17:27.55,0:17:35.52,Default,,0000,0000,0000,,the target by itself. So the basic idea is\Nthat the fuzzer will track coverage, that is, the Dialogue: 0,0:17:35.52,0:17:39.19,Default,,0000,0000,0000,,parts of the program that are covered by\Ndifferent inputs, by inserting some kind of Dialogue: 0,0:17:39.19,0:17:43.42,Default,,0000,0000,0000,,instrumentation. 
So, for example, here we\Nhave this target program that checks if Dialogue: 0,0:17:43.42,0:17:48.65,Default,,0000,0000,0000,,the input contains the string "PNG" at the\Nbeginning, and if it does, then it does Dialogue: 0,0:17:48.65,0:17:53.56,Default,,0000,0000,0000,,something interesting, otherwise it just\Nbails out and fails. So if we track the Dialogue: 0,0:17:53.56,0:17:58.24,Default,,0000,0000,0000,,parts of the program that each input\Nexecutes, the fuzzer can figure out that Dialogue: 0,0:17:58.24,0:18:03.10,Default,,0000,0000,0000,,an input that contains "P" will have\Ndiscovered a different path through the Dialogue: 0,0:18:03.10,0:18:08.08,Default,,0000,0000,0000,,program than an input that doesn't contain\Nit. And then so on it can, one byte at a Dialogue: 0,0:18:08.08,0:18:13.36,Default,,0000,0000,0000,,time, discover that this program expects\Nthis magic sequence "PNG" at the start of Dialogue: 0,0:18:13.36,0:18:19.28,Default,,0000,0000,0000,,the input. So the way that the fuzzer does\Nthis is that every time a new input Dialogue: 0,0:18:19.28,0:18:23.73,Default,,0000,0000,0000,,discovers a new path through the target, it\Nis considered interesting and added to a Dialogue: 0,0:18:23.73,0:18:28.89,Default,,0000,0000,0000,,corpus of interesting inputs. And every\Ntime the fuzzer needs to generate a new Dialogue: 0,0:18:28.89,0:18:35.61,Default,,0000,0000,0000,,input, it will select something from the\Ncorpus, mutate it randomly, and then use Dialogue: 0,0:18:35.61,0:18:39.83,Default,,0000,0000,0000,,it as the new input. So this is\Nconceptually pretty Dialogue: 0,0:18:39.83,0:18:43.15,Default,,0000,0000,0000,,simple, but in practice it works really\Nwell and it really lets the fuzzer Dialogue: 0,0:18:43.15,0:18:47.74,Default,,0000,0000,0000,,discover the format that the target\Nexpects in an unsupervised way. 
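The corpus loop described above can be sketched in a few lines of Python. This is a toy model and not AFL's actual algorithm: the `target`, `mutate`, and `fuzz` functions, the string coverage IDs, the single-byte mutation, and the iteration cap are all assumptions made for illustration. The target mirrors the "PNG" magic check from the talk.

```python
import random

def target(data: bytes) -> set:
    """Toy target: returns the set of code locations this input reached."""
    cov = {"entry"}
    if data[0:1] == b"P":
        cov.add("P")
        if data[1:2] == b"N":
            cov.add("PN")
            if data[2:3] == b"G":
                cov.add("PNG")   # the "interesting" code behind the magic check
    return cov

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seed_input: bytes, max_iters: int = 200_000) -> list:
    rng = random.Random(0)
    corpus = [seed_input]
    seen = set(target(seed_input))
    for _ in range(max_iters):
        candidate = mutate(rng.choice(corpus), rng)
        cov = target(candidate)
        if not cov <= seen:          # reached something new: keep this input
            seen |= cov
            corpus.append(candidate)
        if "PNG" in seen:
            break
    return corpus

corpus = fuzz(b"AAA")
print(corpus)
```

Without coverage feedback, a single random 3-byte guess hits the magic value with probability 1 in 2^24; with feedback, each byte is found separately because every partial match is kept in the corpus.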
So as an Dialogue: 0,0:18:47.74,0:18:53.01,Default,,0000,0000,0000,,example, this is an experiment that was\Nrun by the author of AFL—AFL is the fuzzer Dialogue: 0,0:18:53.01,0:18:58.05,Default,,0000,0000,0000,,that sort of popularized this\Ntechnique—where he was fuzzing a JPEG- Dialogue: 0,0:18:58.05,0:19:02.16,Default,,0000,0000,0000,,parsing library, starting from a corpus\Nthat only contained the string "hello". So Dialogue: 0,0:19:02.16,0:19:07.65,Default,,0000,0000,0000,,now clearly "hello" is not a valid JPEG\Nimage and so—but still, like, the fuzzer Dialogue: 0,0:19:07.65,0:19:12.07,Default,,0000,0000,0000,,was still able to find—to discover the\Ncorrect format. So after a while it Dialogue: 0,0:19:12.07,0:19:17.58,Default,,0000,0000,0000,,started generating some grayscale images,\Non the top left, and as it generated more Dialogue: 0,0:19:17.58,0:19:20.72,Default,,0000,0000,0000,,and more inputs, it started generating\Nmore interesting images, such as some Dialogue: 0,0:19:20.72,0:19:25.12,Default,,0000,0000,0000,,grayscale gradients, and later on even\Nsome color images. So as you can see, this Dialogue: 0,0:19:25.12,0:19:30.63,Default,,0000,0000,0000,,really works, and it allows us to fuzz a\Nprogram without really teaching the fuzzer Dialogue: 0,0:19:30.63,0:19:34.60,Default,,0000,0000,0000,,how the input should look like. So that's\Nit for coverage-guided fuzzing. Now we'll Dialogue: 0,0:19:34.60,0:19:38.19,Default,,0000,0000,0000,,talk a bit about sanitizations. As a\Nreminder, the core idea behind Dialogue: 0,0:19:38.19,0:19:42.33,Default,,0000,0000,0000,,sanitization is that just looking for\Ncrashes is likely to miss some of the Dialogue: 0,0:19:42.33,0:19:45.92,Default,,0000,0000,0000,,bugs. 
So, for example, if you have this\Nout-of-bounds one-byte read, that will Dialogue: 0,0:19:45.92,0:19:49.59,Default,,0000,0000,0000,,probably not crash the target, but you\Nwould still like to catch it because it Dialogue: 0,0:19:49.59,0:19:53.08,Default,,0000,0000,0000,,could be used for an info leak, for\Nexample. So one of the most popular Dialogue: 0,0:19:53.08,0:19:59.03,Default,,0000,0000,0000,,sanitizers is Address Sanitizer. So\NAddress Sanitizer will instrument all the Dialogue: 0,0:19:59.03,0:20:04.63,Default,,0000,0000,0000,,memory accesses in your program and check\Nfor memory corruption, which—so, memory Dialogue: 0,0:20:04.63,0:20:08.81,Default,,0000,0000,0000,,corruption is a pretty dangerous class of\Nbugs that unfortunately still plagues C Dialogue: 0,0:20:08.81,0:20:16.77,Default,,0000,0000,0000,,and C++ programs and unsafe languages in\Ngeneral. And ASan tries to catch it by Dialogue: 0,0:20:16.77,0:20:21.22,Default,,0000,0000,0000,,instrumenting the target. It is very\Npopular; it has been used to find Dialogue: 0,0:20:21.22,0:20:26.90,Default,,0000,0000,0000,,thousands of bugs in complex software like\NChrome and Linux, and even though it has, Dialogue: 0,0:20:26.90,0:20:31.50,Default,,0000,0000,0000,,like, a bit of a slowdown—like about 2x—it\Nis still really popular because it lets Dialogue: 0,0:20:31.50,0:20:37.12,Default,,0000,0000,0000,,you find many, many more bugs. So how does\Nit work? The basic idea is that ASan will Dialogue: 0,0:20:37.12,0:20:41.79,Default,,0000,0000,0000,,insert some special regions of memory\Ncalled 'red zones' around every object in Dialogue: 0,0:20:41.79,0:20:47.27,Default,,0000,0000,0000,,memory. So we have a small example here\Nwhere we declare a 4-byte array on the Dialogue: 0,0:20:47.27,0:20:53.70,Default,,0000,0000,0000,,stack. So ASan will allocate the array\N"buf" and then add a red zone before it Dialogue: 0,0:20:53.70,0:20:59.06,Default,,0000,0000,0000,,and a red zone after it. 
Whenever the\Nprogram accesses the red zones, it is Dialogue: 0,0:20:59.06,0:21:02.66,Default,,0000,0000,0000,,terminated with a security violation. So\Nthe instrumentation just prints a bug Dialogue: 0,0:21:02.66,0:21:07.42,Default,,0000,0000,0000,,report and then crashes the target. This\Nis very useful for detecting, for example, Dialogue: 0,0:21:07.42,0:21:11.40,Default,,0000,0000,0000,,buffer overflows or underflows and many\Nother kinds of bugs such as use-after-free Dialogue: 0,0:21:11.40,0:21:16.23,Default,,0000,0000,0000,,and so on. So, as an example here, we are\Ntrying to copy 5 bytes into a 4-byte Dialogue: 0,0:21:16.23,0:21:22.58,Default,,0000,0000,0000,,buffer, and ASan will check each of the\Naccesses one by one. And when it sees that Dialogue: 0,0:21:22.58,0:21:26.81,Default,,0000,0000,0000,,the last byte writes to a red zone, it\Ndetects the violation and crashes the Dialogue: 0,0:21:26.81,0:21:32.37,Default,,0000,0000,0000,,program. So this is good for us because\Nthis bug might have not been found by Dialogue: 0,0:21:32.37,0:21:36.12,Default,,0000,0000,0000,,simply looking for crashes, but it's\Ndefinitely found if we use ASan. So this Dialogue: 0,0:21:36.12,0:21:40.75,Default,,0000,0000,0000,,is something we want for fuzzing. So now\Nthat we've covered—briefly covered ASan we Dialogue: 0,0:21:40.75,0:21:45.97,Default,,0000,0000,0000,,can talk about instrumenting binaries in\Nthe kernel. So Mathias left us with Dialogue: 0,0:21:45.97,0:21:52.58,Default,,0000,0000,0000,,RetroWrite, and with RetroWrite we can add\Nboth coverage tracking and ASan to Dialogue: 0,0:21:52.58,0:21:57.41,Default,,0000,0000,0000,,binaries. So the simple—it's a really\Nsimple idea: now that we can rewrite this Dialogue: 0,0:21:57.41,0:22:02.76,Default,,0000,0000,0000,,binary and add instructions wherever we\Nwant, we can implement both coverage Dialogue: 0,0:22:02.76,0:22:07.39,Default,,0000,0000,0000,,tracking and ASan. 
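[Editor's sketch] The red-zone check from the 5-bytes-into-4 example can be modelled as follows. This is a minimal sketch under stated assumptions: `ToyMemory`, `checked_copy`, and the 8-byte red-zone size are hypothetical names and values chosen for illustration. Real ASan tracks validity in compact shadow memory rather than one flag per byte, and reports the bug before terminating the process.

```python
REDZONE = 8  # bytes of poisoned padding on each side (assumed size)


class RedZoneViolation(Exception):
    """Raised when an access touches memory that was never made valid."""


class ToyMemory:
    def __init__(self, size=128):
        self.data = bytearray(size)
        self.poisoned = [True] * size   # nothing is addressable until allocated
        self.cursor = 0

    def alloc(self, n):
        """Return the address of n valid bytes with red zones on both sides."""
        start = self.cursor + REDZONE
        end = start + n
        if end + REDZONE > len(self.data):
            raise MemoryError("toy memory exhausted")
        for addr in range(start, end):
            self.poisoned[addr] = False
        self.cursor = end               # the next allocation skips another red zone
        return start

    def store(self, addr, value):
        """Instrumented 1-byte write: check first, then perform the store."""
        if not 0 <= addr < len(self.data) or self.poisoned[addr]:
            raise RedZoneViolation(f"invalid 1-byte write at address {addr}")
        self.data[addr] = value


def checked_copy(mem, dst, src):
    """Copy byte by byte, so every single write is checked."""
    for offset, byte in enumerate(src):
        mem.store(dst + offset, byte)


if __name__ == "__main__":
    mem = ToyMemory()
    buf = mem.alloc(4)
    checked_copy(mem, buf, b"abcd")        # four in-bounds writes
    try:
        checked_copy(mem, buf, b"abcde")   # the fifth write hits the red zone
    except RedZoneViolation as err:
        print("violation:", err)
```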
In order to implement\Ncoverage tracking, we simply have to Dialogue: 0,0:22:07.39,0:22:11.71,Default,,0000,0000,0000,,identify the start of every basic block\Nand add a little piece of instrumentation Dialogue: 0,0:22:11.71,0:22:15.79,Default,,0000,0000,0000,,at the start of the basic block that tells\Nthe fuzzer 'hey, we've reached this part Dialogue: 0,0:22:15.79,0:22:19.40,Default,,0000,0000,0000,,of the program'—'hey, we've reached this\Nother part of the program'. Then the Dialogue: 0,0:22:19.40,0:22:25.04,Default,,0000,0000,0000,,fuzzer can figure out whether that's a new\Npart or not. ASan is also, like, you know, Dialogue: 0,0:22:25.04,0:22:29.24,Default,,0000,0000,0000,,it's also somewhat—it can also be\Nimplemented in this way by finding all Dialogue: 0,0:22:29.24,0:22:33.93,Default,,0000,0000,0000,,memory accesses, and then linking with\NlibASan. libASan is a sort of runtime for Dialogue: 0,0:22:33.93,0:22:38.82,Default,,0000,0000,0000,,ASan that takes care of inserting the red\Nzones and instrument—and adding, you know, Dialogue: 0,0:22:38.82,0:22:43.34,Default,,0000,0000,0000,,like, keeping around all the metadata that\NASan needs to know where the red zones Dialogue: 0,0:22:43.34,0:22:48.42,Default,,0000,0000,0000,,are, and detecting whether a memory access\Nis invalid. So, how can we apply all of Dialogue: 0,0:22:48.42,0:22:52.31,Default,,0000,0000,0000,,this in the kernel? Well, first of all,\Nfuzzing the kernel is not as easy as Dialogue: 0,0:22:52.31,0:22:57.92,Default,,0000,0000,0000,,fuzzing some userspace program. There's\Nsome issues here. So first of all, there's Dialogue: 0,0:22:57.92,0:23:01.95,Default,,0000,0000,0000,,crash handling. So whenever you're fuzzing\Na userspace program, you expect crashes, Dialogue: 0,0:23:01.95,0:23:06.29,Default,,0000,0000,0000,,well, because that's what we're after. 
And\Nif a userspace program crashes, then the Dialogue: 0,0:23:06.29,0:23:11.41,Default,,0000,0000,0000,,OS simply terminates the crash gracefully.\NAnd so the fuzzer can detect this, and Dialogue: 0,0:23:11.41,0:23:16.27,Default,,0000,0000,0000,,save the input as a crashing input, and so\Non. And this is all fine. But when you're Dialogue: 0,0:23:16.27,0:23:19.47,Default,,0000,0000,0000,,fuzzing the kernel, so—if you were fuzzing\Nthe kernel of the machine that you were Dialogue: 0,0:23:19.47,0:23:23.04,Default,,0000,0000,0000,,using for fuzzing, after a while, the\Nmachine would just go down. Because, after Dialogue: 0,0:23:23.04,0:23:27.18,Default,,0000,0000,0000,,all, the kernel runs the machine, and if\Nit starts misbehaving, then all of it can Dialogue: 0,0:23:27.18,0:23:31.72,Default,,0000,0000,0000,,go wrong. And more importantly, you can\Nlose your crashes, because if the Dialogue: 0,0:23:31.72,0:23:35.45,Default,,0000,0000,0000,,machine crashes, then the state of the\Nfuzzer is lost and you have no idea what Dialogue: 0,0:23:35.45,0:23:39.59,Default,,0000,0000,0000,,your crashing input was. So what most\Nkernel fuzzers have to do is that they Dialogue: 0,0:23:39.59,0:23:43.42,Default,,0000,0000,0000,,resort to some kind of VM to keep the\Nsystem stable. So they fuzz the kernel in Dialogue: 0,0:23:43.42,0:23:48.50,Default,,0000,0000,0000,,a VM and then run the fuzzing agent\Noutside the VM. On top of that is tooling. Dialogue: 0,0:23:48.50,0:23:52.71,Default,,0000,0000,0000,,So, if you want to fuzz a user space\Nprogram, you can just download AFL or use Dialogue: 0,0:23:52.71,0:23:57.54,Default,,0000,0000,0000,,libfuzzer; there's plenty of tutorials\Nonline, it's really easy to set up and Dialogue: 0,0:23:57.54,0:24:01.20,Default,,0000,0000,0000,,just, like—compile your program, you start\Nfuzzing and you're good to go. If you want Dialogue: 0,0:24:01.20,0:24:05.24,Default,,0000,0000,0000,,to fuzz the kernel, it's already much more\Ncomplicated. 
So, for example, if you want Dialogue: 0,0:24:05.24,0:24:09.39,Default,,0000,0000,0000,,to fuzz Linux with, say, syzkaller, which\Nis a popular kernel fuzzer, you have to Dialogue: 0,0:24:09.39,0:24:14.03,Default,,0000,0000,0000,,compile the kernel, you have to use a\Nspecial config that supports syzkaller, Dialogue: 0,0:24:14.03,0:24:20.10,Default,,0000,0000,0000,,you have way less guides available than\Nfor userspace fuzzing, and in general it's Dialogue: 0,0:24:20.10,0:24:24.94,Default,,0000,0000,0000,,just much more complex and less intuitive\Nthan just fuzzing userspace. And lastly, Dialogue: 0,0:24:24.94,0:24:29.33,Default,,0000,0000,0000,,we have the issue of determinism. So in\Ngeneral, if you have a single threaded Dialogue: 0,0:24:29.33,0:24:32.77,Default,,0000,0000,0000,,userspace program, unless it uses some\Nkind of random number generator, it is Dialogue: 0,0:24:32.77,0:24:38.21,Default,,0000,0000,0000,,more or less deterministic. There's\Nnothing that affects the execution of the Dialogue: 0,0:24:38.21,0:24:42.30,Default,,0000,0000,0000,,program. 
But—and this is really nice if\Nyou want to try to reproduce a test case, Dialogue: 0,0:24:42.30,0:24:46.34,Default,,0000,0000,0000,,because if you have a non-deterministic\Ntest case, then it's really hard to know Dialogue: 0,0:24:46.34,0:24:50.68,Default,,0000,0000,0000,,whether this is really a crash or if it's\Njust something that you should ignore, and Dialogue: 0,0:24:50.68,0:24:56.28,Default,,0000,0000,0000,,in the kernel this is even harder, because\Nyou don't only have concurrency, like Dialogue: 0,0:24:56.28,0:25:01.20,Default,,0000,0000,0000,,multi-processing, you also have interrupts.\NSo interrupts can happen at any time, and Dialogue: 0,0:25:01.20,0:25:05.85,Default,,0000,0000,0000,,if one time you got an interrupt while\Nexecuting your test case and the second Dialogue: 0,0:25:05.85,0:25:09.95,Default,,0000,0000,0000,,time you didn't, then maybe it only\Ncrashes one time - you don't really know, Dialogue: 0,0:25:09.95,0:25:15.91,Default,,0000,0000,0000,,it's not pretty. And so again, we\Nhave several approaches to fuzzing Dialogue: 0,0:25:15.91,0:25:20.55,Default,,0000,0000,0000,,binaries in the kernel. First one is to do\Nblack box fuzzing. We don't really Dialogue: 0,0:25:20.55,0:25:23.68,Default,,0000,0000,0000,,like this because it doesn't find much,\Nespecially in something complex Dialogue: 0,0:25:23.68,0:25:27.38,Default,,0000,0000,0000,,like a kernel. Approach 1 is to\Nuse dynamic translation, Dialogue: 0,0:25:27.38,0:25:32.62,Default,,0000,0000,0000,,so, use something\Nlike QEMU or—you name it. This works, and Dialogue: 0,0:25:32.62,0:25:35.12,Default,,0000,0000,0000,,people have used it successfully; the\Nproblem is that it is really, really, Dialogue: 0,0:25:35.12,0:25:41.50,Default,,0000,0000,0000,,really slow. Like, we're talking about\N10x-plus overhead. 
And as we said before, Dialogue: 0,0:25:41.50,0:25:45.57,Default,,0000,0000,0000,,the more iterations, the more test cases\Nyou can execute in the same amount of Dialogue: 0,0:25:45.57,0:25:50.70,Default,,0000,0000,0000,,time, the better, because you find more\Nbugs. And on top of that, there's no Dialogue: 0,0:25:50.70,0:25:57.52,Default,,0000,0000,0000,,currently available sanitizer for\Nkernel binaries that works—is based on Dialogue: 0,0:25:57.52,0:26:01.31,Default,,0000,0000,0000,,this approach. So in userspace you have\Nsomething like valgrind; in the kernel, Dialogue: 0,0:26:01.31,0:26:05.07,Default,,0000,0000,0000,,you don't have anything, at least that we\Nknow of. There is another approach, which Dialogue: 0,0:26:05.07,0:26:09.95,Default,,0000,0000,0000,,is to use Intel Processor Trace. This has\Nbeen, like—there's been some research Dialogue: 0,0:26:09.95,0:26:14.24,Default,,0000,0000,0000,,papers on this recently, and this is nice\Nbecause it allows you to collect coverage Dialogue: 0,0:26:14.24,0:26:18.04,Default,,0000,0000,0000,,at nearly zero overhead. It's, like,\Nreally fast, but the problem is that it Dialogue: 0,0:26:18.04,0:26:23.02,Default,,0000,0000,0000,,requires hardware support, so it requires\Na fairly new x86 CPU, and if you want to Dialogue: 0,0:26:23.02,0:26:27.16,Default,,0000,0000,0000,,fuzz something on ARM, say, like, your\NAndroid driver, or if you want to use an Dialogue: 0,0:26:27.16,0:26:32.12,Default,,0000,0000,0000,,older CPU, then you're out of luck. And\Nwhat's worse, you cannot really use it for Dialogue: 0,0:26:32.12,0:26:36.49,Default,,0000,0000,0000,,sanitization, or at least not the kind of\Nsanitization that ASan does, because it Dialogue: 0,0:26:36.49,0:26:41.77,Default,,0000,0000,0000,,just traces the execution; it doesn't\Nallow you to do checks on memory accesses. Dialogue: 0,0:26:41.77,0:26:47.35,Default,,0000,0000,0000,,So Approach 3, which is what we will use,\Nis static rewriting. 
So, we had this very Dialogue: 0,0:26:47.35,0:26:50.75,Default,,0000,0000,0000,,nice framework for rewriting userspace\Nbinaries, and then we asked ourselves, can Dialogue: 0,0:26:50.75,0:26:56.66,Default,,0000,0000,0000,,we make this work in the kernel? So we\Ntook the system, the original RetroWrite, Dialogue: 0,0:26:56.66,0:27:02.65,Default,,0000,0000,0000,,we modified it, we implemented support for\NLinux modules, and... it works! So we have Dialogue: 0,0:27:02.65,0:27:08.11,Default,,0000,0000,0000,,implemented it—we have used it to fuzz\Nsome kernel modules, and it really shows Dialogue: 0,0:27:08.11,0:27:11.64,Default,,0000,0000,0000,,that this approach doesn't only work for\Nuserspace; it can also be applied to the Dialogue: 0,0:27:11.64,0:27:18.51,Default,,0000,0000,0000,,kernel. So as for some implementation, the\Nnice thing about kernel modules is that Dialogue: 0,0:27:18.51,0:27:22.17,Default,,0000,0000,0000,,they're always position independent. So\Nyou cannot have position—like, fixed- Dialogue: 0,0:27:22.17,0:27:26.37,Default,,0000,0000,0000,,position kernel modules because Linux just\Ndoesn't allow it. So we sort of get that Dialogue: 0,0:27:26.37,0:27:32.22,Default,,0000,0000,0000,,for free, which is nice. And Linux modules\Nare also a special class of ELF files, Dialogue: 0,0:27:32.22,0:27:35.89,Default,,0000,0000,0000,,which means that the format is—even though\Nit's not the same as userspace binaries, Dialogue: 0,0:27:35.89,0:27:40.31,Default,,0000,0000,0000,,it's still somewhat similar, so we didn't\Nhave to change the symbolizer that much, Dialogue: 0,0:27:40.31,0:27:46.54,Default,,0000,0000,0000,,which is also nice. And we implemented\Nsymbolization with this, and we used it to Dialogue: 0,0:27:46.54,0:27:54.49,Default,,0000,0000,0000,,implement both code coverage and binary\NASan for kernel binary modules. 
So for Dialogue: 0,0:27:54.49,0:27:59.04,Default,,0000,0000,0000,,coverage: The idea behind the whole\NRetroWrite project was that we wanted to Dialogue: 0,0:27:59.04,0:28:03.50,Default,,0000,0000,0000,,integrate with existing tools. So existing\Nfuzzing tools. We didn't want to force our Dialogue: 0,0:28:03.50,0:28:08.77,Default,,0000,0000,0000,,users to write their own fuzzer that is\Ncompatible with RetroWrite. So for—in Dialogue: 0,0:28:08.77,0:28:13.47,Default,,0000,0000,0000,,userspace we had AFL-style coverage\Ntracking, and binary ASan which is Dialogue: 0,0:28:13.47,0:28:16.49,Default,,0000,0000,0000,,compatible with source-based ASan, and we\Nwanted to follow the same principle in the Dialogue: 0,0:28:16.49,0:28:21.90,Default,,0000,0000,0000,,kernel. So it turns out that Linux has\Nthis built-in coverage-tracking framework Dialogue: 0,0:28:21.90,0:28:26.53,Default,,0000,0000,0000,,called kCov that is used by several\Npopular kernel fuzzers like syzkaller, and Dialogue: 0,0:28:26.53,0:28:31.05,Default,,0000,0000,0000,,we wanted to use it ourselves. So we\Ndesigned our coverage instrumentation so Dialogue: 0,0:28:31.05,0:28:36.59,Default,,0000,0000,0000,,that it integrates with kCov. The downside\Nis that you need to compile the kernel Dialogue: 0,0:28:36.59,0:28:40.69,Default,,0000,0000,0000,,with kCov, but then again, Linux is open\Nsource, so you can sort of always do that; Dialogue: 0,0:28:40.69,0:28:44.28,Default,,0000,0000,0000,,the kernel usually—it's usually not the\Nkernel that is a binary blob, but it's Dialogue: 0,0:28:44.28,0:28:48.93,Default,,0000,0000,0000,,usually only the modules. So that's just\Nstill fine. 
And the way you do this is—the Dialogue: 0,0:28:48.93,0:28:53.37,Default,,0000,0000,0000,,way you implement kCov for binary modules\Nis that you just have to find the start of Dialogue: 0,0:28:53.37,0:28:58.54,Default,,0000,0000,0000,,every basic block, and add a call to some\Nfunction that then stores the collected Dialogue: 0,0:28:58.54,0:29:02.53,Default,,0000,0000,0000,,coverage. So here's an example: we have a\Nshort snippet of code with three basic Dialogue: 0,0:29:02.53,0:29:07.62,Default,,0000,0000,0000,,blocks, and all we have to do is add a\Ncall to "trace_pc" to the start of the Dialogue: 0,0:29:07.62,0:29:11.94,Default,,0000,0000,0000,,basic block. "trace_pc" is a function that\Nis part of the main kernel image that then Dialogue: 0,0:29:11.94,0:29:17.23,Default,,0000,0000,0000,,collects this coverage and makes it\Navailable to a userspace fuzzing agent. So Dialogue: 0,0:29:17.23,0:29:21.21,Default,,0000,0000,0000,,this is all really easy and it works. And\Nlet's now see how we implemented binary Dialogue: 0,0:29:21.21,0:29:25.60,Default,,0000,0000,0000,,ASan. So as I mentioned before, when we\Ninstrument the program with binary ASan in Dialogue: 0,0:29:25.60,0:29:29.69,Default,,0000,0000,0000,,userspace we link with libASan, which\Ntakes care of setting up the metadata, Dialogue: 0,0:29:29.69,0:29:33.88,Default,,0000,0000,0000,,takes care of putting the red zones around\Nour allocations, and so on. So we had to Dialogue: 0,0:29:33.88,0:29:37.33,Default,,0000,0000,0000,,do something similar in the kernel; of\Ncourse, you cannot link with libASan in Dialogue: 0,0:29:37.33,0:29:42.63,Default,,0000,0000,0000,,the kernel, because that doesn't work, but\Nwhat we can do instead is, again, compile Dialogue: 0,0:29:42.63,0:29:47.24,Default,,0000,0000,0000,,the kernel with kASan support. 
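[Editor's sketch] The basic-block instrumentation described above (a call to "trace_pc" at the start of each block) can be illustrated with a toy rewriter over textual assembly. Everything here is an assumption for illustration: `instrument`, the `BRANCHES` list, and the example snippet are hypothetical, and the hook is spelled `__sanitizer_cov_trace_pc` on the assumption that this is the kcov callback the talk abbreviates as "trace_pc". A real rewriter such as kRetroWrite works on symbolized disassembly and must also preserve registers and flags around the inserted call, which this sketch ignores.

```python
# Mnemonics that end a basic block in this toy model (far from complete).
BRANCHES = ("jmp", "je", "jne", "jz", "jnz", "ja", "jb", "jg", "jl", "ret")

HOOK = "call __sanitizer_cov_trace_pc"


def instrument(asm_lines, hook=HOOK):
    """Insert a coverage hook before the first instruction of every block."""
    out = []
    at_block_start = True            # the first instruction starts a block
    for line in asm_lines:
        stripped = line.strip()
        if not stripped:
            out.append(line)
            continue
        if stripped.endswith(":"):   # a label: the next instruction is a leader
            out.append(line)
            at_block_start = True
            continue
        if at_block_start:
            out.append("    " + hook)
            at_block_start = False
        out.append(line)
        if stripped.split()[0] in BRANCHES:
            at_block_start = True    # whatever follows a branch starts a new block
    return out


# A snippet with three basic blocks.
EXAMPLE = [
    "demo:",
    "    cmp edi, 1",
    "    jne .Lelse",
    "    mov eax, 10",
    "    ret",
    ".Lelse:",
    "    mov eax, 20",
    "    ret",
]

if __name__ == "__main__":
    print("\n".join(instrument(EXAMPLE)))
```

Running it on `EXAMPLE` adds one hook per block: before `cmp`, before `mov eax, 10`, and before `mov eax, 20`.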
So this\Ninstruments the allocator, kmalloc, to add Dialogue: 0,0:29:47.24,0:29:52.11,Default,,0000,0000,0000,,the red zones; it allocates space for the\Nmetadata, it keeps this metadata around, Dialogue: 0,0:29:52.11,0:29:56.28,Default,,0000,0000,0000,,does this all for us, which is really\Nnice. And again, the big advantage of Dialogue: 0,0:29:56.28,0:30:00.58,Default,,0000,0000,0000,,using this approach is that we can\Nintegrate seamlessly with a kASan- Dialogue: 0,0:30:00.58,0:30:05.80,Default,,0000,0000,0000,,instrumented kernel and with fuzzers that\Nrely on kASan such as syzkaller. So we see Dialogue: 0,0:30:05.80,0:30:11.50,Default,,0000,0000,0000,,this as more of a plus than, like, a\Nlimitation. And how do you implement ASan? Dialogue: 0,0:30:11.50,0:30:16.56,Default,,0000,0000,0000,,Well, you have to find every memory access\Nand instrument it to check the—to check Dialogue: 0,0:30:16.56,0:30:22.37,Default,,0000,0000,0000,,whether this is accessing a red zone. And\Nif it does then you just call this bug Dialogue: 0,0:30:22.37,0:30:26.01,Default,,0000,0000,0000,,report function that produces a stack\Ntrace, a bug report, and crashes the Dialogue: 0,0:30:26.01,0:30:29.65,Default,,0000,0000,0000,,kernel, so that the fuzzer can detect it.\NAgain, this is compatible with source- Dialogue: 0,0:30:29.65,0:30:36.99,Default,,0000,0000,0000,,based kASan, so we're happy. We can simply\Nload the rewritten module with added Dialogue: 0,0:30:36.99,0:30:40.22,Default,,0000,0000,0000,,instrumentation into a kernel, as long as\Nyou have compiled the kernel with the Dialogue: 0,0:30:40.22,0:30:44.34,Default,,0000,0000,0000,,right flags, and we can use a standard\Nkernel fuzzer. Here for the—our Dialogue: 0,0:30:44.34,0:30:49.91,Default,,0000,0000,0000,,evaluation, we used syzkaller, a popular\Nkernel fuzzer by some folks at Google, and Dialogue: 0,0:30:49.91,0:30:55.46,Default,,0000,0000,0000,,it worked really well. 
So we've finally\Nreached the end of our journey, and now we Dialogue: 0,0:30:55.46,0:31:00.47,Default,,0000,0000,0000,,wanted to present some experiments we did\Nto see if this really works. So for Dialogue: 0,0:31:00.47,0:31:05.29,Default,,0000,0000,0000,,userspace, we wanted to compare the\Nperformance of our binary ASan with Dialogue: 0,0:31:05.29,0:31:10.36,Default,,0000,0000,0000,,source-based ASan and with existing\Nsolutions that also work on binaries. So Dialogue: 0,0:31:10.36,0:31:15.86,Default,,0000,0000,0000,,for userspace, you can use valgrind\Nmemcheck. It's a memory sanitizer that is Dialogue: 0,0:31:15.86,0:31:20.85,Default,,0000,0000,0000,,based on binary translation and dynamic\Nbinary translation and works on binaries. Dialogue: 0,0:31:20.85,0:31:25.46,Default,,0000,0000,0000,,We compared it with source ASan and\NRetroWrite ASan on the SPEC CPU benchmark Dialogue: 0,0:31:25.46,0:31:31.10,Default,,0000,0000,0000,,and saw how fast it was. And for the\Nkernel we decided to fuzz some file Dialogue: 0,0:31:31.10,0:31:37.52,Default,,0000,0000,0000,,systems and some drivers with syzkaller\Nusing both source-based KASan and kCov and Dialogue: 0,0:31:37.52,0:31:44.67,Default,,0000,0000,0000,,kRetroWrite-based KASan and kCov. So these\Nare our results for userspace. So the red Dialogue: 0,0:31:44.67,0:31:48.99,Default,,0000,0000,0000,,bar is valgrind. We can see that the\Nexecution time of valgrind is the highest. Dialogue: 0,0:31:48.99,0:31:55.89,Default,,0000,0000,0000,,It is really, really slow—like, 3, 10, 30x\Noverhead, way too slow for fuzzing. Then Dialogue: 0,0:31:55.89,0:32:02.58,Default,,0000,0000,0000,,in green, we have our binary ASan, which\Nis, like, already a large improvement. In Dialogue: 0,0:32:02.58,0:32:07.06,Default,,0000,0000,0000,,orange we have source-based ASan. And then\Nfinally in blue we have the original code Dialogue: 0,0:32:07.06,0:32:11.09,Default,,0000,0000,0000,,without any instrumentation whatsoever. 
So\Nwe can see that source-based ASan has, Dialogue: 0,0:32:11.09,0:32:16.66,Default,,0000,0000,0000,,like, 2x or 3x overhead, and binary ASan\Nis a bit higher, like, a bit less Dialogue: 0,0:32:16.66,0:32:21.31,Default,,0000,0000,0000,,efficient, but still somewhat close. So\Nthat's for userspace, and for the kernel, Dialogue: 0,0:32:21.31,0:32:25.44,Default,,0000,0000,0000,,we—these are some preliminary results, so,\Nthis is, like—I'm doing this work as part Dialogue: 0,0:32:25.44,0:32:29.90,Default,,0000,0000,0000,,of my master's thesis, and so I'm still,\Nlike, running the evaluation. Here we can Dialogue: 0,0:32:29.90,0:32:33.42,Default,,0000,0000,0000,,see that the overhead is already, like, a\Nbit lower. So the reason for this is that Dialogue: 0,0:32:33.42,0:32:39.69,Default,,0000,0000,0000,,SPEC is a pure CPU benchmark; it doesn't\Ninteract with the system that much. And so Dialogue: 0,0:32:39.69,0:32:44.42,Default,,0000,0000,0000,,any instrumentation that you add is going\Nto massively slow down, or, like, Dialogue: 0,0:32:44.42,0:32:49.32,Default,,0000,0000,0000,,considerably slow down the execution. By\Ncontrast, when you fuzz a file system with Dialogue: 0,0:32:49.32,0:32:56.46,Default,,0000,0000,0000,,syzkaller, not only every test case has to\Ngo from the high—the host to the guest and Dialogue: 0,0:32:56.46,0:33:01.77,Default,,0000,0000,0000,,then do multiple syscalls and so on, but\Nalso every system call has to go through Dialogue: 0,0:33:01.77,0:33:05.37,Default,,0000,0000,0000,,several layers of abstraction before it\Ngets to the actual file system. And all Dialogue: 0,0:33:05.37,0:33:09.61,Default,,0000,0000,0000,,these—like, all of this takes a lot of\Ntime, and so in practice the overhead of Dialogue: 0,0:33:09.61,0:33:15.58,Default,,0000,0000,0000,,our instrumentation seems to be pretty\Nreasonable. So, since we know that you Dialogue: 0,0:33:15.58,0:33:32.84,Default,,0000,0000,0000,,like demos, we've prepared a small demo of\NkRetroWrite. 
So. Let's see. Yep. Okay. All Dialogue: 0,0:33:32.84,0:33:40.47,Default,,0000,0000,0000,,right, so we've prepared a small kernel\Nmodule. And this module is just, like, Dialogue: 0,0:33:40.47,0:33:45.67,Default,,0000,0000,0000,,really simple; it contains a\Nvulnerability, and what it does is that it Dialogue: 0,0:33:45.67,0:33:49.93,Default,,0000,0000,0000,,creates a character device. So if you're\Nnot familiar with this, a character device Dialogue: 0,0:33:49.93,0:33:55.13,Default,,0000,0000,0000,,is like a fake file that is exposed by a\Nkernel driver and that you can read from and Dialogue: 0,0:33:55.13,0:34:01.63,Default,,0000,0000,0000,,write to. And instead of going to a\Nfile, the data that you read—that you, in Dialogue: 0,0:34:01.63,0:34:05.59,Default,,0000,0000,0000,,this case, write to the fake file—goes to\Nthe driver and is handled by this demo Dialogue: 0,0:34:05.59,0:34:10.48,Default,,0000,0000,0000,,write function. So as we can see, this\Nfunction allocates a buffer, a 16-byte Dialogue: 0,0:34:10.48,0:34:14.85,Default,,0000,0000,0000,,buffer on the heap, and then copies some\Ndata into it, and then it checks if the Dialogue: 0,0:34:14.85,0:34:19.97,Default,,0000,0000,0000,,data contains the string "1337". If it\Ndoes, then it accesses the buffer out of Dialogue: 0,0:34:19.97,0:34:23.45,Default,,0000,0000,0000,,bounds; you can see "alloc[16]" and the\Nbuffer is sixteen bytes; this is an out- Dialogue: 0,0:34:23.45,0:34:27.55,Default,,0000,0000,0000,,of-bounds read by one byte. And if it\Ndoesn't then it just accesses the buffer Dialogue: 0,0:34:27.55,0:34:33.05,Default,,0000,0000,0000,,in bounds, which is fine, and it's not a\Nvulnerability. So we can compile this Dialogue: 0,0:34:33.05,0:34:47.45,Default,,0000,0000,0000,,driver. OK, um... OK, and then so we have\Nour module, and then we will instrument it Dialogue: 0,0:34:47.45,0:35:01.50,Default,,0000,0000,0000,,using kRetroWrite. So, instrument... Yes,\Nplease. OK. Right. 
So kRetroWrite did some Dialogue: 0,0:35:01.50,0:35:07.33,Default,,0000,0000,0000,,processing, and it produced an\Ninstrumented module with ASan or kASan and Dialogue: 0,0:35:07.33,0:35:09.77,Default,,0000,0000,0000,,a symbolized assembly file. We can\Nactually have a look at the symbolized Dialogue: 0,0:35:09.77,0:35:17.74,Default,,0000,0000,0000,,assembly file to see what it looks like.\NYes. Yes. OK. So, is this big enough? Dialogue: 0,0:35:17.74,0:35:22.90,Default,,0000,0000,0000,,Yeah... As you can see, so—we can actually\Nsee here the ASan instrumentation. Ah, Dialogue: 0,0:35:22.90,0:35:29.33,Default,,0000,0000,0000,,shouldn't—yeah. So, we—this is the ASan\Ninstrumentation. The original code loads Dialogue: 0,0:35:29.33,0:35:33.29,Default,,0000,0000,0000,,some data from this address. And as you\Ncan see, the ASan instrumentation first Dialogue: 0,0:35:33.29,0:35:38.24,Default,,0000,0000,0000,,computes the actual address, and then does\Nsome checking—basically, this is checking Dialogue: 0,0:35:38.24,0:35:44.43,Default,,0000,0000,0000,,some metadata that ASan stores to check if\Nthe address is in a red zone or not, and Dialogue: 0,0:35:44.43,0:35:49.43,Default,,0000,0000,0000,,then if the check fails, then it\Ncalls this ASan report which produces a Dialogue: 0,0:35:49.43,0:35:54.83,Default,,0000,0000,0000,,stack trace and crashes the kernel. So\Nthis is fine. We can actually even look at Dialogue: 0,0:35:54.83,0:36:17.82,Default,,0000,0000,0000,,the disassembly of both modules, so...\Nobject dump and then demo... Ah, nope. OK, Dialogue: 0,0:36:17.82,0:36:21.83,Default,,0000,0000,0000,,so on the left, we have the original\Nmodule without any instrumentation; on the Dialogue: 0,0:36:21.83,0:36:27.07,Default,,0000,0000,0000,,right, we have the module instrumented\Nwith ASan. 
So as you can see, the original Dialogue: 0,0:36:27.07,0:36:33.16,Default,,0000,0000,0000,,module has "push r13" and then has this\Nmemory load here; on the right in the Dialogue: 0,0:36:33.16,0:36:38.56,Default,,0000,0000,0000,,instrumented module, kRetroWrite inserted\Nthe ASan instrumentation. So the original Dialogue: 0,0:36:38.56,0:36:43.94,Default,,0000,0000,0000,,load is still down here, but between that,\Nbetween the first instruction and this Dialogue: 0,0:36:43.94,0:36:47.85,Default,,0000,0000,0000,,instruction, we have—now have the kASan\Ninstrumentation that does our check. So Dialogue: 0,0:36:47.85,0:36:56.70,Default,,0000,0000,0000,,this is all fine. Now we can actually test\Nit and see what it does. So we can—we will Dialogue: 0,0:36:56.70,0:37:02.21,Default,,0000,0000,0000,,boot a very simple, a very minimal Linux\Nsystem, and try to target the Dialogue: 0,0:37:02.21,0:37:05.79,Default,,0000,0000,0000,,vulnerability first with the non-\Ninstrumented module and then with the Dialogue: 0,0:37:05.79,0:37:10.41,Default,,0000,0000,0000,,instrumented module. And we can—we will\Nsee that in the—with the non-instrumented Dialogue: 0,0:37:10.41,0:37:14.55,Default,,0000,0000,0000,,module, the kernel will not crash, but\Nwith the instrumented module it will crash Dialogue: 0,0:37:14.55,0:37:22.43,Default,,0000,0000,0000,,and produce a bug report. So. Let's see.\NYeah, this is a QEMU VM, I have no idea Dialogue: 0,0:37:22.43,0:37:27.48,Default,,0000,0000,0000,,why it's taking so long to boot. I'll\Nblame the demo gods not being kind to Dialogue: 0,0:37:27.48,0:37:39.73,Default,,0000,0000,0000,,us. Yeah, I guess we just have to wait.\NOK. So. All right, so we loaded the Dialogue: 0,0:37:39.73,0:37:47.33,Default,,0000,0000,0000,,module. We will see that it has created a\Nfake file character device in /dev/demo. Dialogue: 0,0:37:47.33,0:37:59.02,Default,,0000,0000,0000,,Yep. We can write this file. Yep. 
So this\Nwill—this accesses the array in bounds, Dialogue: 0,0:37:59.02,0:38:04.41,Default,,0000,0000,0000,,and so this is fine. Then what we can also\Ndo is write "1337" to it so it will access Dialogue: 0,0:38:04.41,0:38:08.97,Default,,0000,0000,0000,,the array out of bounds. So this is the\Nnon instrumented module, so this will not Dialogue: 0,0:38:08.97,0:38:14.05,Default,,0000,0000,0000,,crash. It will just print some garbage\Nvalue. Okay, that's it. Now we can load Dialogue: 0,0:38:14.05,0:38:25.89,Default,,0000,0000,0000,,the instrumented module instead... and do\Nthe same experiment again. All right. We Dialogue: 0,0:38:25.89,0:38:31.64,Default,,0000,0000,0000,,can see that /dev/demo is still here. So\Nthe module still works. Let's try to write Dialogue: 0,0:38:31.64,0:38:38.54,Default,,0000,0000,0000,,"1234" into it. This, again, doesn't\Ncrash. But when we try to write "1337", Dialogue: 0,0:38:38.54,0:38:47.94,Default,,0000,0000,0000,,this will produce a bug report.\N{\i1}applause{\i0} Dialogue: 0,0:38:47.94,0:38:51.13,Default,,0000,0000,0000,,So this has quite a lot of information. We Dialogue: 0,0:38:51.13,0:38:55.70,Default,,0000,0000,0000,,can see, like, the—where the memory was\Nallocated, there's a stack trace for that; Dialogue: 0,0:38:55.70,0:39:02.15,Default,,0000,0000,0000,,it wasn't freed, so there's no stack trace\Nfor the free. And we see that the cache Dialogue: 0,0:39:02.15,0:39:06.76,Default,,0000,0000,0000,,size of the memory, like, it was a 16-byte\Nallocation. We can see the shape of the Dialogue: 0,0:39:06.76,0:39:10.90,Default,,0000,0000,0000,,memory. We see that these two zeros means\Nthat there's two 8-byte chunks of valid Dialogue: 0,0:39:10.90,0:39:15.55,Default,,0000,0000,0000,,memory. And then these "fc fc fc" is\Nthe—are the red zones that I was talking Dialogue: 0,0:39:15.55,0:39:19.98,Default,,0000,0000,0000,,about before. All right, so that's it for\Nthe demo. 
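[Editor's sketch] The "00 00 fc fc fc" shape in the bug report can be reproduced with a small model of the shadow encoding. This is a sketch under stated assumptions: one shadow byte per 8 bytes of memory, 0 meaning all 8 bytes are valid, a value k between 1 and 7 meaning only the first k bytes are valid, and 0xFC marking a red zone as in the report above. The function names and the number of red-zone bytes are hypothetical, and accesses that straddle two 8-byte chunks are not handled.

```python
SHADOW_SCALE = 8        # one shadow byte describes 8 bytes of memory
REDZONE_MARK = 0xFC     # shown as "fc" in the report


def shadow_for_allocation(size, redzone_granules=3):
    """Shadow bytes for one allocation followed by a red zone."""
    full, rest = divmod(size, SHADOW_SCALE)
    shadow = [0x00] * full
    if rest:
        shadow.append(rest)         # only the first `rest` bytes of this chunk are valid
    shadow += [REDZONE_MARK] * redzone_granules
    return shadow


def access_is_valid(shadow, offset, size=1):
    """Check an access of `size` bytes at `offset` from the allocation start."""
    value = shadow[offset // SHADOW_SCALE]
    if value == 0:
        return True                 # the whole 8-byte chunk is addressable
    # Interpret the shadow byte as signed, so red-zone marks become negative.
    signed = value - 256 if value >= 0x80 else value
    last_byte = (offset % SHADOW_SCALE) + size - 1
    return last_byte < signed


if __name__ == "__main__":
    shadow = shadow_for_allocation(16)
    print(" ".join(f"{b:02x}" for b in shadow))
    print("alloc[15] valid:", access_is_valid(shadow, 15))
    print("alloc[16] valid:", access_is_valid(shadow, 16))
```

With a 16-byte allocation, the read of `alloc[16]` from the demo lands in the first red-zone chunk, so the check fails, while `alloc[15]` passes.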
We will switch back to our Dialogue: 0,0:39:19.98,0:39:24.63,Default,,0000,0000,0000,,presentation now. So... hope you enjoyed\Nit. Dialogue: 0,0:39:24.63,0:39:30.53,Default,,0000,0000,0000,,gannimo: Cool. So after applying this to a\Ndemo module, we also wanted to see what Dialogue: 0,0:39:30.53,0:39:35.36,Default,,0000,0000,0000,,happens if we apply this to a real file\Nsystem. After a couple of hours we Dialogue: 0,0:39:35.36,0:39:41.39,Default,,0000,0000,0000,,were—when we came back and checked on the\Nresults, we saw a couple of issues popping Dialogue: 0,0:39:41.39,0:39:48.72,Default,,0000,0000,0000,,up, including a nice set of use-after-free\Nreads, a set of use-after-free writes, and Dialogue: 0,0:39:48.72,0:39:56.22,Default,,0000,0000,0000,,we checked the bug reports and we saw a\Nwhole bunch of Linux kernel issues popping Dialogue: 0,0:39:56.22,0:40:02.64,Default,,0000,0000,0000,,up one after the other in this nondescript\Nmodule that we fuzzed. We're in the Dialogue: 0,0:40:02.64,0:40:06.93,Default,,0000,0000,0000,,process of reporting it. This will take\Nsome time until it is fixed; that's why Dialogue: 0,0:40:06.93,0:40:13.47,Default,,0000,0000,0000,,you see the blurry lines. But as you see,\Nthere's still quite a bit of opportunity Dialogue: 0,0:40:13.47,0:40:19.19,Default,,0000,0000,0000,,in the Linux kernel where you can apply\Ndifferent forms of targeted fuzzing into Dialogue: 0,0:40:19.19,0:40:26.35,Default,,0000,0000,0000,,different modules, leverage these modules\Non top of a kASan instrumented kernel and Dialogue: 0,0:40:26.35,0:40:31.72,Default,,0000,0000,0000,,then leveraging this as part of your\Nfuzzing toolchain to find interesting Dialogue: 0,0:40:31.72,0:40:39.08,Default,,0000,0000,0000,,kernel 0days that... yeah. You can then\Ndevelop further, or report, or do whatever Dialogue: 0,0:40:39.08,0:40:44.77,Default,,0000,0000,0000,,you want with them. 
Now, we've shown you\Nhow you can take existing binary-only Dialogue: 0,0:40:44.77,0:40:51.25,Default,,0000,0000,0000,,modules, think different binary-only\Ndrivers, or even existing modules where Dialogue: 0,0:40:51.25,0:40:55.80,Default,,0000,0000,0000,,you don't want to instrument a full set of\Nthe Linux kernel, but only focus fuzzing Dialogue: 0,0:40:55.80,0:41:02.13,Default,,0000,0000,0000,,and exploration on a small different—small\Nlimited piece of code and then do security Dialogue: 0,0:41:02.13,0:41:09.25,Default,,0000,0000,0000,,tests on those. We've shown you how we can\Ndo coverage-based tracking and address Dialogue: 0,0:41:09.25,0:41:13.50,Default,,0000,0000,0000,,sanitization. But this is also up to you\Non what kind of other instrumentation you Dialogue: 0,0:41:13.50,0:41:17.89,Default,,0000,0000,0000,,want. Like this is just a tool, a\Nframework that allows you to do arbitrary Dialogue: 0,0:41:17.89,0:41:23.78,Default,,0000,0000,0000,,forms of instrumentation. So we've taken\Nyou on a journey from instrumenting Dialogue: 0,0:41:23.78,0:41:29.38,Default,,0000,0000,0000,,binaries over coverage-guided fuzzing and\Nsanitization to instrumenting modules in Dialogue: 0,0:41:29.38,0:41:36.69,Default,,0000,0000,0000,,the kernel and then finding crashes in the\Nkernel. Let me wrap up the talk. So, this Dialogue: 0,0:41:36.69,0:41:41.58,Default,,0000,0000,0000,,is one of the fun pieces of work that\Nwe do in the hexhive lab at EPFL. So if Dialogue: 0,0:41:41.58,0:41:45.74,Default,,0000,0000,0000,,you're looking for postdoc opportunities\Nor if you're thinking about a PhD, come Dialogue: 0,0:41:45.74,0:41:51.81,Default,,0000,0000,0000,,talk to us. We're always hiring. The tools\Nwill be released as open source. A large Dialogue: 0,0:41:51.81,0:41:57.32,Default,,0000,0000,0000,,chunk of the userspace work is already\Nopen source. 
We're working on a set of Dialogue: 0,0:41:57.32,0:42:02.35,Default,,0000,0000,0000,,additional demos and so on so that you can\Nget started faster, leveraging the Dialogue: 0,0:42:02.35,0:42:07.81,Default,,0000,0000,0000,,different existing instrumentation that is\Nalready out there. The userspace work is Dialogue: 0,0:42:07.81,0:42:12.14,Default,,0000,0000,0000,,already available. The kernel work will be\Navailable in a couple of weeks. This Dialogue: 0,0:42:12.14,0:42:16.77,Default,,0000,0000,0000,,allows you to instrument real-world\Nbinaries for fuzzing, leveraging existing Dialogue: 0,0:42:16.77,0:42:21.20,Default,,0000,0000,0000,,transformations for coverage tracking to\Nenable fast and effective fuzzing and Dialogue: 0,0:42:21.20,0:42:26.49,Default,,0000,0000,0000,,memory checking to detect the actual bugs\Nthat exist there. The key takeaway from Dialogue: 0,0:42:26.49,0:42:32.43,Default,,0000,0000,0000,,this talk is that RetroWrite and\NkRetroWrite enable static binary Dialogue: 0,0:42:32.43,0:42:38.30,Default,,0000,0000,0000,,rewriting at zero instrumentation cost. We\Ntake the limitation of focusing only on Dialogue: 0,0:42:38.30,0:42:43.24,Default,,0000,0000,0000,,position-independent code, which is not a\Nreal limitation, but we get the Dialogue: 0,0:42:43.24,0:42:47.80,Default,,0000,0000,0000,,advantage of being able to symbolize\Nwithout actually relying on heuristics, so Dialogue: 0,0:42:47.80,0:42:55.38,Default,,0000,0000,0000,,we can even symbolize large, complex\Nsource—large, complex applications and Dialogue: 0,0:42:55.38,0:43:01.09,Default,,0000,0000,0000,,effectively rewrite those aspects and then\Nyou can focus fuzzing on these parts. 
Dialogue: 0,0:43:01.09,0:43:06.33,Default,,0000,0000,0000,,Another point I want to mention is that\Nthis enables you to reuse existing tooling Dialogue: 0,0:43:06.33,0:43:10.98,Default,,0000,0000,0000,,so you can take a binary blob, instrument\Nit, and then reuse, for example, Address Dialogue: 0,0:43:10.98,0:43:15.97,Default,,0000,0000,0000,,Sanitizer or existing fuzzing tools, as it\Nintegrates really, really nice. As I said, Dialogue: 0,0:43:15.97,0:43:22.70,Default,,0000,0000,0000,,all the code is open source. Check it out.\NTry it. Let us know if it breaks. We're Dialogue: 0,0:43:22.70,0:43:27.52,Default,,0000,0000,0000,,happy to fix. We are committed to open\Nsource. And let us know if there are any Dialogue: 0,0:43:27.52,0:43:36.75,Default,,0000,0000,0000,,questions. Thank you.\N{\i1}applause{\i0} Dialogue: 0,0:43:36.75,0:43:42.25,Default,,0000,0000,0000,,Herald: So, thanks, guys, for an\Ninteresting talk. We have some time for Dialogue: 0,0:43:42.25,0:43:47.18,Default,,0000,0000,0000,,questions, so we have microphones along\Nthe aisles. We'll start from question from Dialogue: 0,0:43:47.18,0:43:51.08,Default,,0000,0000,0000,,microphone number two.\NQ: Hi. Thanks for your talk and for the Dialogue: 0,0:43:51.08,0:43:59.40,Default,,0000,0000,0000,,demo. I'm not sure about the use-case you\Nshowed for the kernel RetroWrite. 'Cause Dialogue: 0,0:43:59.40,0:44:05.58,Default,,0000,0000,0000,,you're usually interested in fuzzing\Nbinary in kernelspace when you don't have Dialogue: 0,0:44:05.58,0:44:13.98,Default,,0000,0000,0000,,source code for the kernel. For example,\Nfor IoT or Android and so on. But you just Dialogue: 0,0:44:13.98,0:44:22.26,Default,,0000,0000,0000,,reuse the kCov and kASan in the kernel,\Nand you never have the kernel in IoT or Dialogue: 0,0:44:22.26,0:44:28.60,Default,,0000,0000,0000,,Android which is compiled with that. 
So\Nare you—do you have any plans to binary Dialogue: 0,0:44:28.60,0:44:31.67,Default,,0000,0000,0000,,instrument the kernel itself, not the\Nmodules? Dialogue: 0,0:44:31.67,0:44:39.39,Default,,0000,0000,0000,,Nspace: So we thought about that. I think\Nthat there's some additional problems that Dialogue: 0,0:44:39.39,0:44:43.91,Default,,0000,0000,0000,,we would have to solve in order to be able\Nto instrument the full kernel. So other Dialogue: 0,0:44:43.91,0:44:47.82,Default,,0000,0000,0000,,than the fact that it gives us\Ncompatibility with, like, existing tools, Dialogue: 0,0:44:47.82,0:44:51.72,Default,,0000,0000,0000,,the reason why we decided to go with\Ncompiling the kernel with kASan and kCov Dialogue: 0,0:44:51.72,0:44:56.76,Default,,0000,0000,0000,,is that building the, like—you would\Nhave to, like, think about it. You Dialogue: 0,0:44:56.76,0:45:01.54,Default,,0000,0000,0000,,have to instrument the memory allocator to\Nadd red zones, which is, like, already Dialogue: 0,0:45:01.54,0:45:07.07,Default,,0000,0000,0000,,somewhat complex. You have to instrument\Nthe exception handlers to catch, like, any Dialogue: 0,0:45:07.07,0:45:12.24,Default,,0000,0000,0000,,faults that the instrumentation detects.\NYou would have to, like, set up some Dialogue: 0,0:45:12.24,0:45:17.48,Default,,0000,0000,0000,,memory for the ASan shadow. So this is,\Nlike—I think you should be able to do it, Dialogue: 0,0:45:17.48,0:45:21.69,Default,,0000,0000,0000,,but it would require a lot of additional\Nwork. So this is, like—this was like four Dialogue: 0,0:45:21.69,0:45:25.51,Default,,0000,0000,0000,,months' thesis. So we decided to start\Nsmall and prove that it works in Dialogue: 0,0:45:25.51,0:45:30.47,Default,,0000,0000,0000,,the kernel for modules, and then leave it\Nto future work to actually extend it to Dialogue: 0,0:45:30.47,0:45:37.56,Default,,0000,0000,0000,,the full kernel. 
Also, like, I think for\NAndroid—so in the case of Linux, the Dialogue: 0,0:45:37.56,0:45:42.07,Default,,0000,0000,0000,,kernel is GPL, right, so if the\Nmanufacturers ships a custom kernel, they Dialogue: 0,0:45:42.07,0:45:44.61,Default,,0000,0000,0000,,have to release the source code, right?\NQ: They never do. Dialogue: 0,0:45:44.61,0:45:47.22,Default,,0000,0000,0000,,Nspace: They never—well, that's a\Ndifferent issue. Right? Dialogue: 0,0:45:47.22,0:45:49.01,Default,,0000,0000,0000,,gannimo: Right.\NQ: So that's why I ask, because I don't Dialogue: 0,0:45:49.01,0:45:51.84,Default,,0000,0000,0000,,see how it just can be used in the real\Nworld. Dialogue: 0,0:45:51.84,0:45:57.12,Default,,0000,0000,0000,,gannimo: Well, let me try to put this into\Nperspective a little bit as well. Right. Dialogue: 0,0:45:57.12,0:46:02.03,Default,,0000,0000,0000,,So there's the—what we did so far is we\Nleveraged existing tools, like kASan or Dialogue: 0,0:46:02.03,0:46:09.44,Default,,0000,0000,0000,,kCov, and integrated into these existing\Ntools. Now, doing heap-based allocation is Dialogue: 0,0:46:09.44,0:46:13.57,Default,,0000,0000,0000,,fairly simple and replacing those with\Nadditional red zones—that instrumentation Dialogue: 0,0:46:13.57,0:46:20.20,Default,,0000,0000,0000,,you can carry out fairly well by focusing\Non the different allocators. Second to Dialogue: 0,0:46:20.20,0:46:24.97,Default,,0000,0000,0000,,that, simply oopsing the kernel and\Nprinting the stack trace is also fairly Dialogue: 0,0:46:24.97,0:46:29.25,Default,,0000,0000,0000,,straightforward. So it's not a lot of\Nadditional effort. So it is—it involves Dialogue: 0,0:46:29.25,0:46:38.47,Default,,0000,0000,0000,,some engineering effort to port this to\Nnon-kASan-compiled kernels. But we think Dialogue: 0,0:46:38.47,0:46:44.74,Default,,0000,0000,0000,,it is very feasible. 
In the interest of\Ntime, we focused on kASan-enabled kernels, Dialogue: 0,0:46:44.74,0:46:50.96,Default,,0000,0000,0000,,so that some form of ASan is already\Nenabled. But yeah, this is additional Dialogue: 0,0:46:50.96,0:46:55.66,Default,,0000,0000,0000,,engineering effort. But there is also a\Ncommunity out there that can help us with Dialogue: 0,0:46:55.66,0:47:00.96,Default,,0000,0000,0000,,these kind of changes. So kRetroWrite and\NRetroWrite themselves are the binary Dialogue: 0,0:47:00.96,0:47:07.06,Default,,0000,0000,0000,,rewriting platform that allow you to turn\Na binary into an assembly file that you Dialogue: 0,0:47:07.06,0:47:11.62,Default,,0000,0000,0000,,can then instrument and run different\Npasses on top of it. So another pass would Dialogue: 0,0:47:11.62,0:47:16.40,Default,,0000,0000,0000,,be a full ASan pass or kASan pass that\Nsomebody could add and then contribute Dialogue: 0,0:47:16.40,0:47:19.10,Default,,0000,0000,0000,,back to the community.\NQ: Yeah, it would be really useful. Dialogue: 0,0:47:19.10,0:47:20.19,Default,,0000,0000,0000,,Thanks.\Ngannimo: Cool. Dialogue: 0,0:47:20.19,0:47:24.26,Default,,0000,0000,0000,,Angel: Next question from the Internet.\NQ: Yes, there is a question regarding the Dialogue: 0,0:47:24.26,0:47:30.89,Default,,0000,0000,0000,,slide on the SPEC CPU benchmark. The\Nsecond or third graph from the right had Dialogue: 0,0:47:30.89,0:47:36.70,Default,,0000,0000,0000,,an instrumented version that was faster\Nthan the original program. Why is that? Dialogue: 0,0:47:36.70,0:47:42.30,Default,,0000,0000,0000,,gannimo: Cache effect. Thank you.\NAngel: Microphone number one. Dialogue: 0,0:47:42.30,0:47:47.03,Default,,0000,0000,0000,,Q: Thank you. Thank you for presentation.\NI have question: how many architecture do Dialogue: 0,0:47:47.03,0:47:51.21,Default,,0000,0000,0000,,you support, and if you have support more,\Nwhat then? Dialogue: 0,0:47:51.21,0:47:56.40,Default,,0000,0000,0000,,gannimo: x86_64.\NQ: Okay. 
So no plans for ARM or MIPS, Dialogue: 0,0:47:56.40,0:47:58.13,Default,,0000,0000,0000,,or...?\Ngannimo: Oh, there are plans. Dialogue: 0,0:47:58.13,0:48:01.39,Default,,0000,0000,0000,,Q: Okay.\NNspace: Right, so— Dialogue: 0,0:48:01.39,0:48:05.98,Default,,0000,0000,0000,,gannimo: Right. Again, there's a finite\Namount of time. We focused on the Dialogue: 0,0:48:05.98,0:48:11.78,Default,,0000,0000,0000,,technology. ARM is high up on the list. If\Nsomebody is interested in working on it Dialogue: 0,0:48:11.78,0:48:17.67,Default,,0000,0000,0000,,and contributing, we're happy to hear from\Nit. Our list of targets is ARM first and Dialogue: 0,0:48:17.67,0:48:22.92,Default,,0000,0000,0000,,then maybe something else. But I think\Nwith x86_64 and ARM we've covered a Dialogue: 0,0:48:22.92,0:48:33.42,Default,,0000,0000,0000,,majority of the interesting platforms.\NQ: And second question, did you try to Dialogue: 0,0:48:33.42,0:48:37.97,Default,,0000,0000,0000,,fuzz any real closed-source program?\NBecause as I understand from presentation, Dialogue: 0,0:48:37.97,0:48:44.71,Default,,0000,0000,0000,,you fuzz, like, just file system, what we\Ncan compile and fuzz with syzkaller like Dialogue: 0,0:48:44.71,0:48:48.57,Default,,0000,0000,0000,,in the past.\NNspace: So for the evaluation, we wanted Dialogue: 0,0:48:48.57,0:48:52.13,Default,,0000,0000,0000,,to be able to compare between the source-\Nbased instrumentation and the binary-based Dialogue: 0,0:48:52.13,0:48:57.46,Default,,0000,0000,0000,,instrumentation, so we focused mostly on\Nopen-source filesystem and drivers because Dialogue: 0,0:48:57.46,0:49:02.06,Default,,0000,0000,0000,,then we could instrument them with a\Ncompiler. 
We haven't yet tried, but this Dialogue: 0,0:49:02.06,0:49:05.74,Default,,0000,0000,0000,,is, like, also pretty high up on the list.\NWe wanted to try to find some closed- Dialogue: 0,0:49:05.74,0:49:10.61,Default,,0000,0000,0000,,source drivers—there's lots of them, like\Nfor GPUs or anything—and we'll give it a Dialogue: 0,0:49:10.61,0:49:15.46,Default,,0000,0000,0000,,try and find some 0days, perhaps.\NQ: Yes, but with syzkaller, you still have Dialogue: 0,0:49:15.46,0:49:22.58,Default,,0000,0000,0000,,a problem. You have to write rules, like,\Ndictionaries. I mean, you have to Dialogue: 0,0:49:22.58,0:49:24.60,Default,,0000,0000,0000,,understand the format, have to communicate\Nwith the driver. Dialogue: 0,0:49:24.60,0:49:28.55,Default,,0000,0000,0000,,Nspace: Yeah, right. But there's, for\Nexample, closed-source file systems that Dialogue: 0,0:49:28.55,0:49:33.27,Default,,0000,0000,0000,,we are looking at.\NQ: Okay. Thinking. Dialogue: 0,0:49:33.27,0:49:38.66,Default,,0000,0000,0000,,Herald: Number two.\NQ: Hi. Thank you for your talk. So I don't Dialogue: 0,0:49:38.66,0:49:45.07,Default,,0000,0000,0000,,know if there are any kCov- or kASan-\Nequivalent solution to Windows, but I was Dialogue: 0,0:49:45.07,0:49:49.93,Default,,0000,0000,0000,,wondering if you tried, or are you\Nplanning to do it on Windows, the Dialogue: 0,0:49:49.93,0:49:52.54,Default,,0000,0000,0000,,framework? Because I know it might be\Nchallenging because of the driver Dialogue: 0,0:49:52.54,0:49:56.85,Default,,0000,0000,0000,,signature enforcement and PatchGuard, but\NI wondered if you tried or thought about Dialogue: 0,0:49:56.85,0:49:59.29,Default,,0000,0000,0000,,it.\Ngannimo: Yes, we thought about it and we Dialogue: 0,0:49:59.29,0:50:06.38,Default,,0000,0000,0000,,decided against it. Windows is incredibly\Nhard and we are academics. 
The research I Dialogue: 0,0:50:06.38,0:50:11.80,Default,,0000,0000,0000,,do in my lab, or we do in my research lab,\Nfocuses on predominantly open-source Dialogue: 0,0:50:11.80,0:50:17.06,Default,,0000,0000,0000,,software and empowers open-source\Nsoftware. Doing full support for Microsoft Dialogue: 0,0:50:17.06,0:50:20.78,Default,,0000,0000,0000,,Windows is somewhat out of scope. If\Nsomebody wants to port these tools, we are Dialogue: 0,0:50:20.78,0:50:24.19,Default,,0000,0000,0000,,happy to hear it and work with these\Npeople. But it's a lot of additional Dialogue: 0,0:50:24.19,0:50:28.53,Default,,0000,0000,0000,,engineering effort, versus very\Nadditional—very low additional research Dialogue: 0,0:50:28.53,0:50:33.06,Default,,0000,0000,0000,,value, so we'll have to find some form of\Ncompromise. And, like, if you would be Dialogue: 0,0:50:33.06,0:50:38.65,Default,,0000,0000,0000,,willing to fund us, we would go ahead. But\Nit's—yeah, it's a cost question. Dialogue: 0,0:50:38.65,0:50:42.09,Default,,0000,0000,0000,,Q: And you're referring both to kernel and\Nuser space, right? Dialogue: 0,0:50:42.09,0:50:45.09,Default,,0000,0000,0000,,gannimo: Yeah.\NQ: Okay. Thank you. Dialogue: 0,0:50:45.09,0:50:48.10,Default,,0000,0000,0000,,Herald: Number five.\NQ: Hi, thanks for the talk. This seems Dialogue: 0,0:50:48.10,0:50:52.40,Default,,0000,0000,0000,,most interesting if you're looking for\Nvulnerabilities in closed source kernel Dialogue: 0,0:50:52.40,0:50:58.36,Default,,0000,0000,0000,,modules, but not giving it too much\Nthought, it seems it's really trivial to Dialogue: 0,0:50:58.36,0:51:01.92,Default,,0000,0000,0000,,prevent this if you're writing a closed\Nsource module. Dialogue: 0,0:51:01.92,0:51:07.13,Default,,0000,0000,0000,,gannimo: Well, how would you prevent this?\NQ: Well, for starters, you would just take Dialogue: 0,0:51:07.13,0:51:11.49,Default,,0000,0000,0000,,a difference between the address of two\Nfunctions. 
That's not gonna be IP Dialogue: 0,0:51:11.49,0:51:15.86,Default,,0000,0000,0000,,relative, so...\NNspace: Right. So we explicitly—like, even Dialogue: 0,0:51:15.86,0:51:21.59,Default,,0000,0000,0000,,in the original RetroWrite paper—we\Nexplicitly decided to not try to deal with Dialogue: 0,0:51:21.59,0:51:25.78,Default,,0000,0000,0000,,obfuscated code, or code that is\Npurposefully trying to defeat this kind of Dialogue: 0,0:51:25.78,0:51:30.51,Default,,0000,0000,0000,,rewriting. Because, like, the assumption\Nis that first of all, there are techniques Dialogue: 0,0:51:30.51,0:51:34.10,Default,,0000,0000,0000,,to, like, deobfuscate code or remove\Nthese, like, checks in some way, but this Dialogue: 0,0:51:34.10,0:51:39.51,Default,,0000,0000,0000,,is, like, sort of orthogonal work. And at\Nthe same time, I guess most drivers are Dialogue: 0,0:51:39.51,0:51:43.98,Default,,0000,0000,0000,,not really compiled with the sort of\Nobfuscation; they're just, like, you know, Dialogue: 0,0:51:43.98,0:51:47.66,Default,,0000,0000,0000,,they're compiled with a regular compiler.\NBut yeah, of course, this is, like, a Dialogue: 0,0:51:47.66,0:51:50.07,Default,,0000,0000,0000,,limitation.\Ngannimo: They're likely stripped, but not Dialogue: 0,0:51:50.07,0:51:54.28,Default,,0000,0000,0000,,necessarily obfuscated. At least from what\Nwe've seen when we looked at binary-only Dialogue: 0,0:51:54.28,0:51:58.98,Default,,0000,0000,0000,,drivers.\NHerald: Microphone number two. Dialogue: 0,0:51:58.98,0:52:04.35,Default,,0000,0000,0000,,Q: How do you decide where to place the\Nred zones? From what I heard, you talked Dialogue: 0,0:52:04.35,0:52:10.03,Default,,0000,0000,0000,,about instrumenting the allocators, but,\Nwell, there are a lot of variables on the Dialogue: 0,0:52:10.03,0:52:13.27,Default,,0000,0000,0000,,stack, so how do you deal with those?\Ngannimo: Oh, yeah, that's actually super Dialogue: 0,0:52:13.27,0:52:20.16,Default,,0000,0000,0000,,cool. 
I refer to some extent to the paper\Nthat is on the GitHub repo as well. If you Dialogue: 0,0:52:20.16,0:52:26.78,Default,,0000,0000,0000,,think about it, modern compilers use\Ncanaries for buffers. Are you aware of Dialogue: 0,0:52:26.78,0:52:31.15,Default,,0000,0000,0000,,stack canaries—how stack canaries work?\NSo, stack canaries—like, if the compiler Dialogue: 0,0:52:31.15,0:52:34.44,Default,,0000,0000,0000,,sees there's a buffer that may be\Noverflown, it places a stack canary Dialogue: 0,0:52:34.44,0:52:39.74,Default,,0000,0000,0000,,between the buffer and any other data.\NWhat we use is we—as part of our analysis Dialogue: 0,0:52:39.74,0:52:44.75,Default,,0000,0000,0000,,tool, we find these stack canaries, remove\Nthe code that does the stack canary, and Dialogue: 0,0:52:44.75,0:52:49.42,Default,,0000,0000,0000,,use this space to place our red zones. So\Nwe actually hack the stack canaries, Dialogue: 0,0:52:49.42,0:52:54.57,Default,,0000,0000,0000,,remove that code, and add ASan red zones\Ninto the empty stack canaries that are now Dialogue: 0,0:52:54.57,0:52:58.60,Default,,0000,0000,0000,,there. It's actually a super cool\Noptimization because we piggyback on what Dialogue: 0,0:52:58.60,0:53:02.63,Default,,0000,0000,0000,,kind of work the compiler already did for\Nus before, and we can then leverage that Dialogue: 0,0:53:02.63,0:53:06.78,Default,,0000,0000,0000,,to gain additional benefits and protect\Nthe stack as well. Dialogue: 0,0:53:06.78,0:53:11.12,Default,,0000,0000,0000,,Q: Thanks.\NAngel: Another question from the Internet. Dialogue: 0,0:53:16.04,0:53:20.92,Default,,0000,0000,0000,,Q: Yes. Did you consider lifting the\Nbinary code to LLVM IR instead of Dialogue: 0,0:53:20.92,0:53:28.37,Default,,0000,0000,0000,,generating assembler source?\Ngannimo: Yes. {\i1}laughter{\i0} But, so—a little Dialogue: 0,0:53:28.37,0:53:32.06,Default,,0000,0000,0000,,bit longer answer. Yes, we did consider\Nthat. 
Yes, it would be super nice to lift Dialogue: 0,0:53:32.06,0:53:38.71,Default,,0000,0000,0000,,to LLVM IR. We've actually looked into\Nthis. It's incredibly hard. It's Dialogue: 0,0:53:38.71,0:53:42.27,Default,,0000,0000,0000,,incredibly complex. There's no direct\Nmapping between the machine code Dialogue: 0,0:53:42.27,0:53:48.49,Default,,0000,0000,0000,,equivalent and the LLVM IR. You would\Nstill need to recover all the types. So Dialogue: 0,0:53:48.49,0:53:51.80,Default,,0000,0000,0000,,it's like this magic dream that you\Nrecover full LLVM IR, then do heavyweight Dialogue: 0,0:53:51.80,0:53:57.47,Default,,0000,0000,0000,,transformations on top of it. But this is\Nincredibly hard because if you compile Dialogue: 0,0:53:57.47,0:54:03.57,Default,,0000,0000,0000,,down from LLVM IR to machine code, you\Nlose a massive amount of information. You Dialogue: 0,0:54:03.57,0:54:07.15,Default,,0000,0000,0000,,would have to find a way to recover all of\Nthat information, which is pretty much Dialogue: 0,0:54:07.15,0:54:14.99,Default,,0000,0000,0000,,impossible and undecidable for many cases.\NSo for example, just as a note, we only Dialogue: 0,0:54:14.99,0:54:19.42,Default,,0000,0000,0000,,recover control flow and we only\Ndesymbolize control flow. For data Dialogue: 0,0:54:19.42,0:54:23.03,Default,,0000,0000,0000,,references—we don't support\Ninstrumentation of data references yet Dialogue: 0,0:54:23.03,0:54:28.84,Default,,0000,0000,0000,,because there's still an undecidable\Nproblem that we are facing with. I can Dialogue: 0,0:54:28.84,0:54:32.86,Default,,0000,0000,0000,,talk more about this offline, or there is\Na note in the paper as well. So this is Dialogue: 0,0:54:32.86,0:54:37.27,Default,,0000,0000,0000,,just a small problem. Only if you're\Nlifting to assembly files. 
If you're Dialogue: 0,0:54:37.27,0:54:41.70,Default,,0000,0000,0000,,lifting to LLVM IR, you would have to do\Nfull end-to-end type recovery, which is Dialogue: 0,0:54:41.70,0:54:46.40,Default,,0000,0000,0000,,massively more complicated. Yes, it would\Nbe super nice. Unfortunately, it is Dialogue: 0,0:54:46.40,0:54:50.53,Default,,0000,0000,0000,,undecidable and really, really hard. So\Nyou can come up with some heuristics, but Dialogue: 0,0:54:50.53,0:54:55.27,Default,,0000,0000,0000,,there is no solution that will do this\Nin—that will be correct 100 percent of the Dialogue: 0,0:54:55.27,0:54:57.49,Default,,0000,0000,0000,,time.\NAngel: We'll take one more question from Dialogue: 0,0:54:57.49,0:55:02.61,Default,,0000,0000,0000,,microphone number six.\NQ: Thank you for your talk. What kind of Dialogue: 0,0:55:02.61,0:55:07.30,Default,,0000,0000,0000,,disassemblers did you use for RetroWrite,\Nand did you have problems with the wrong Dialogue: 0,0:55:07.30,0:55:12.88,Default,,0000,0000,0000,,disassembly? And if so, how did you handle\Nit? Dialogue: 0,0:55:12.88,0:55:18.79,Default,,0000,0000,0000,,Nspace: So, RetroWrite—so we used\NCapstone for the disassembly. Dialogue: 0,0:55:18.79,0:55:24.15,Default,,0000,0000,0000,,gannimo: An amazing tool, by the way.\NNspace: Yeah. So the idea is that, like, Dialogue: 0,0:55:24.15,0:55:30.24,Default,,0000,0000,0000,,we need some kind of—some information\Nabout where the functions are. So for the Dialogue: 0,0:55:30.24,0:55:33.55,Default,,0000,0000,0000,,kernel modules, this is actually fine\Nbecause kernel modules come with this sort Dialogue: 0,0:55:33.55,0:55:37.73,Default,,0000,0000,0000,,of information because the kernel needs\Nit, to build stack traces, for example. Dialogue: 0,0:55:37.73,0:55:41.87,Default,,0000,0000,0000,,For userspace binaries, this is somewhat\Nless common, but you can use another tool Dialogue: 0,0:55:41.87,0:55:46.17,Default,,0000,0000,0000,,to try to do function identification. 
And\Nwe do, like—sort of, like, disassemble the Dialogue: 0,0:55:46.17,0:55:54.50,Default,,0000,0000,0000,,entire function. So we have run into some\Nissues with, like—in AT&T syntax, because Dialogue: 0,0:55:54.50,0:55:59.65,Default,,0000,0000,0000,,like we wanted to use gas, GNU's\Nassembler, for, for... Dialogue: 0,0:55:59.65,0:56:04.24,Default,,0000,0000,0000,,gannimo: Reassembling.\NNspace: Reassembly, yeah. And some Dialogue: 0,0:56:04.24,0:56:09.82,Default,,0000,0000,0000,,instructions are a lot—you can express the\Nsame, like, two different instructions, Dialogue: 0,0:56:09.82,0:56:15.67,Default,,0000,0000,0000,,like five-byte NOP and six-byte NOP, using\Nthe same string of, like, text—a mnemonic, Dialogue: 0,0:56:15.67,0:56:19.97,Default,,0000,0000,0000,,an operand string. But the problem is\Nthat, like, the kernel doesn't like it and Dialogue: 0,0:56:19.97,0:56:21.97,Default,,0000,0000,0000,,crashes. This took me like two days to\Ndebug. Dialogue: 0,0:56:21.97,0:56:27.64,Default,,0000,0000,0000,,gannimo: So the kernel uses dynamic binary\Npatching when it runs, at runtime, and it Dialogue: 0,0:56:27.64,0:56:32.98,Default,,0000,0000,0000,,uses fixed offsets, so if you replace a\Nfive-byte NOP with a six-byte NOP or vice Dialogue: 0,0:56:32.98,0:56:37.83,Default,,0000,0000,0000,,versa, your offsets change and your kernel\Njust blows up in your face. 
Dialogue: 0,0:56:37.83,0:56:43.10,Default,,0000,0000,0000,,Q: So it was kind of a case-by-case basis\Nwhere you saw the errors coming out of the Dialogue: 0,0:56:43.10,0:56:47.92,Default,,0000,0000,0000,,disassembly and you had to fix it?\NNspace: So sorry, can you repeat the Dialogue: 0,0:56:47.92,0:56:51.03,Default,,0000,0000,0000,,question?\NQ: Like, for example, if you—if some Dialogue: 0,0:56:51.03,0:56:54.91,Default,,0000,0000,0000,,instruction is not supported by the\Ndisassembler, so you saw that it crashed, Dialogue: 0,0:56:54.91,0:56:58.00,Default,,0000,0000,0000,,that there's something wrong, and then you\Nfix it by hand? Dialogue: 0,0:56:58.00,0:57:02.94,Default,,0000,0000,0000,,Nspace: Yeah, well, if we saw that there\Nwas a problem with it, this—like, I don't Dialogue: 0,0:57:02.94,0:57:06.96,Default,,0000,0000,0000,,recall having any unknown instructions in\Nthe disassembler. I don't think I've ever Dialogue: 0,0:57:06.96,0:57:11.29,Default,,0000,0000,0000,,had a problem with that. But yeah, this\Nwas a lot of, like, you know, engineering Dialogue: 0,0:57:11.29,0:57:14.29,Default,,0000,0000,0000,,work.\Ngannimo: So let me repeat. The problem was Dialogue: 0,0:57:14.29,0:57:19.22,Default,,0000,0000,0000,,not a bug in the disassembler, but an\Nissue with the instruction format—that the Dialogue: 0,0:57:19.22,0:57:24.53,Default,,0000,0000,0000,,same mnemonic can be translated into two\Ndifferent instructions, one of which was Dialogue: 0,0:57:24.53,0:57:29.09,Default,,0000,0000,0000,,five bytes long, the other one was six\Nbytes long. Both used the exact same Dialogue: 0,0:57:29.09,0:57:32.88,Default,,0000,0000,0000,,mnemonic. Right, so this was an issue with\Nassembly encoding. Dialogue: 0,0:57:32.88,0:57:38.29,Default,,0000,0000,0000,,Q: But you had no problems with\Nunsupported instructions which couldn't be Dialogue: 0,0:57:38.29,0:57:41.34,Default,,0000,0000,0000,,disassembled?\NNspace: No, no. 
Not as far as I know, at Dialogue: 0,0:57:41.34,0:57:43.34,Default,,0000,0000,0000,,least.\NAngel: We have one more minute, so a very Dialogue: 0,0:57:43.34,0:57:52.07,Default,,0000,0000,0000,,short question from microphone number two.\NQ: Does it work? Ah. Is your binary Dialogue: 0,0:57:52.07,0:58:02.02,Default,,0000,0000,0000,,instrumentation equally powerful as kernel\Naddress space... I mean, kASan? So, does Dialogue: 0,0:58:02.02,0:58:06.35,Default,,0000,0000,0000,,it detect all the memory corruptions on\Nstack, heap and globals? Dialogue: 0,0:58:06.35,0:58:13.05,Default,,0000,0000,0000,,gannimo: No globals. But heap—it does all\Nof them on the heap. There's some slight Dialogue: 0,0:58:13.05,0:58:20.15,Default,,0000,0000,0000,,variation on the stack because we have to\Npiggyback on the canary stuff. As I Dialogue: 0,0:58:20.15,0:58:23.88,Default,,0000,0000,0000,,mentioned quickly before, there is no\Nreflowing and full recovery of data Dialogue: 0,0:58:23.88,0:58:28.99,Default,,0000,0000,0000,,layouts. So to get anything on the stack,\Nwe have to piggyback on existing compiler Dialogue: 0,0:58:28.99,0:58:36.65,Default,,0000,0000,0000,,extensions like stack canaries. But—so we\Ndon't support intra-object overflows on Dialogue: 0,0:58:36.65,0:58:40.63,Default,,0000,0000,0000,,the stack. But we do leverage the stack\Ncanaries to get some stack benefits, which Dialogue: 0,0:58:40.63,0:58:45.49,Default,,0000,0000,0000,,is, I don't know, 90, 95 percent there\Nbecause the stack canaries are pretty Dialogue: 0,0:58:45.49,0:58:51.32,Default,,0000,0000,0000,,good. For heap, we get the same precision.\NFor globals, we have very limited support. Dialogue: 0,0:58:51.32,0:58:54.29,Default,,0000,0000,0000,,Q: Thanks.\NAngel: So that's all the time we have for Dialogue: 0,0:58:54.29,0:58:57.60,Default,,0000,0000,0000,,this talk. You can find the speakers, I\Nthink, afterwards offline. 
Please give Dialogue: 0,0:58:57.60,0:58:59.82,Default,,0000,0000,0000,,them a big round of applause for an\Ninteresting talk. Dialogue: 0,0:58:59.82,0:59:03.05,Default,,0000,0000,0000,,{\i1}applause{\i0} Dialogue: 0,0:59:03.05,0:59:07.36,Default,,0000,0000,0000,,{\i1}36c3 postrol music{\i0} Dialogue: 0,0:59:07.36,0:59:29.00,Default,,0000,0000,0000,,Subtitles created by c3subtitles.de\Nin the year 2021. Join, and help us!