[Script Info] Title: [Events] Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text Dialogue: 0,0:00:00.00,0:00:09.80,Default,,0000,0000,0000,,*preroll music* Dialogue: 0,0:00:09.80,0:00:24.74,Default,,0000,0000,0000,,Herald: Our next speaker for today is a\Ncomputer science PhD student at UC Santa Dialogue: 0,0:00:24.74,0:00:30.80,Default,,0000,0000,0000,,Barbara. He is a member of the Shellphish\NHacking Team and he's also the organizer Dialogue: 0,0:00:30.80,0:00:35.82,Default,,0000,0000,0000,,of the iCTF Hacking Competition. Please\Ngive a big round of applause to Nilo Dialogue: 0,0:00:35.82,0:00:36.23,Default,,0000,0000,0000,,Redini. Dialogue: 0,0:00:36.23,0:00:39.51,Default,,0000,0000,0000,,*applause* Dialogue: 0,0:00:39.51,0:00:46.67,Default,,0000,0000,0000,,Nilo: Thanks for the introduction, hello\Nto everyone. My name is Nilo, and today Dialogue: 0,0:00:46.67,0:00:52.33,Default,,0000,0000,0000,,I'm going to present my work, Karonte:\Nidentifying multi-binary vulnerabilities Dialogue: 0,0:00:52.33,0:00:56.49,Default,,0000,0000,0000,,in embedded firmware at scale. This work\Nis a joint effort between me and Dialogue: 0,0:00:56.49,0:01:02.10,Default,,0000,0000,0000,,several of my colleagues at UC\NSanta Barbara and ASU. This talk is going Dialogue: 0,0:01:02.10,0:01:08.25,Default,,0000,0000,0000,,to be about IoT devices. So before\Nstarting, let's see an overview of IoT Dialogue: 0,0:01:08.25,0:01:13.90,Default,,0000,0000,0000,,devices. IoT devices are everywhere. As\Nthe research suggests, they will reach Dialogue: 0,0:01:13.90,0:01:19.76,Default,,0000,0000,0000,,20 billion units by the end of next\Nyear. And a recent study conducted this Dialogue: 0,0:01:19.76,0:01:25.77,Default,,0000,0000,0000,,year, in 2019, on 16 million households\Nshowed that more than 70 percent of homes Dialogue: 0,0:01:25.77,0:01:31.84,Default,,0000,0000,0000,,in North America already have an IoT\Nnetwork-connected device.
IoT devices make Dialogue: 0,0:01:31.84,0:01:37.66,Default,,0000,0000,0000,,everyday life smarter. You can literally\Nsay "Alexa, I'm cold" and Alexa will Dialogue: 0,0:01:37.66,0:01:43.57,Default,,0000,0000,0000,,interact with the thermostat and increase\Nthe temperature of your room. Usually the Dialogue: 0,0:01:43.57,0:01:49.61,Default,,0000,0000,0000,,way we interact with IoT devices is\Nthrough our smartphone. We send a request Dialogue: 0,0:01:49.61,0:01:55.16,Default,,0000,0000,0000,,over the local network to some device, a\Nrouter or door lock, or we might send the Dialogue: 0,0:01:55.16,0:02:01.14,Default,,0000,0000,0000,,same request through a cloud endpoint,\Nwhich is usually managed by the vendor of Dialogue: 0,0:02:01.14,0:02:07.29,Default,,0000,0000,0000,,the IoT device. Another way is through\NIoT hubs: the smartphone will send the request Dialogue: 0,0:02:07.29,0:02:13.66,Default,,0000,0000,0000,,to some IoT hub, which in turn will send\Nthe request to some other IoT devices. As Dialogue: 0,0:02:13.66,0:02:18.88,Default,,0000,0000,0000,,you can imagine, IoT devices use and\Ncollect our data and some data is more Dialogue: 0,0:02:18.88,0:02:23.38,Default,,0000,0000,0000,,sensitive than others. For instance, think\Nof all the data that is collected by my Dialogue: 0,0:02:23.38,0:02:29.73,Default,,0000,0000,0000,,lightbulb or data that is collected by our\Nsecurity camera. As such, IoT devices can Dialogue: 0,0:02:29.73,0:02:37.08,Default,,0000,0000,0000,,compromise people's safety and privacy.\NThink, for example, about the security Dialogue: 0,0:02:37.08,0:02:44.33,Default,,0000,0000,0000,,implications of a faulty smart lock or the\Nbrakes of your smart car. So the question Dialogue: 0,0:02:44.33,0:02:53.13,Default,,0000,0000,0000,,that we asked is: Are IoT devices secure?\NWell, like everything else, they are not.
Dialogue: 0,0:02:53.13,0:03:00.95,Default,,0000,0000,0000,,OK, in 2016 the Mirai botnet compromised\Nand leveraged millions of IoT devices to Dialogue: 0,0:03:00.95,0:03:06.96,Default,,0000,0000,0000,,disrupt core Internet services such as\NTwitter, GitHub and Netflix. And in 2018, Dialogue: 0,0:03:06.96,0:03:13.29,Default,,0000,0000,0000,,154 vulnerabilities affecting IoT devices\Nwere published, which represented an Dialogue: 0,0:03:13.29,0:03:20.92,Default,,0000,0000,0000,,increase of 15% compared to 2017 and an\Nincrease of 115% compared to 2016. So then Dialogue: 0,0:03:20.92,0:03:27.71,Default,,0000,0000,0000,,we wonder: why is it hard to secure IoT\Ndevices? To answer this question we have Dialogue: 0,0:03:27.71,0:03:33.64,Default,,0000,0000,0000,,to look at how IoT devices work and how they\Nare made. Usually when you remove all the Dialogue: 0,0:03:33.64,0:03:40.42,Default,,0000,0000,0000,,plastic and peripherals, IoT devices look\Nlike this: a board with some chips lying Dialogue: 0,0:03:40.42,0:03:45.60,Default,,0000,0000,0000,,on it. Usually you can find the big chip,\Nthe microcontroller, which runs the Dialogue: 0,0:03:45.60,0:03:50.54,Default,,0000,0000,0000,,firmware, and one or more peripheral\Ncontrollers which interact with external Dialogue: 0,0:03:50.54,0:03:57.19,Default,,0000,0000,0000,,peripherals such as the motor of your\Nsmart lock or cameras. Though the design Dialogue: 0,0:03:57.19,0:04:03.44,Default,,0000,0000,0000,,is generic, implementations are very\Ndiverse. For instance, firmware may run on Dialogue: 0,0:04:03.44,0:04:08.78,Default,,0000,0000,0000,,several different architectures such as\NARM, MIPS, x86, PowerPC and so forth.
And Dialogue: 0,0:04:08.78,0:04:14.35,Default,,0000,0000,0000,,sometimes they are even proprietary, which\Nmeans that if a security analyst wants to Dialogue: 0,0:04:14.35,0:04:20.04,Default,,0000,0000,0000,,understand what's going on in the\Nfirmware, he'll have a hard time if he Dialogue: 0,0:04:20.04,0:04:26.06,Default,,0000,0000,0000,,doesn't have the vendor specifics. Also,\Nthey're operating in environments with Dialogue: 0,0:04:26.06,0:04:30.56,Default,,0000,0000,0000,,limited resources, which means that they\Nrun small and optimized code. For Dialogue: 0,0:04:30.56,0:04:38.04,Default,,0000,0000,0000,,instance, vendors might implement their\Nown version of some known algorithm in an Dialogue: 0,0:04:38.04,0:04:45.26,Default,,0000,0000,0000,,optimized way. Also, IoT devices manage\Nexternal peripherals that often use custom Dialogue: 0,0:04:45.26,0:04:51.24,Default,,0000,0000,0000,,code. Again, by peripherals we mean things like\Ncameras, sensors and so forth. The Dialogue: 0,0:04:51.24,0:04:57.48,Default,,0000,0000,0000,,firmware of IoT devices can be either\NLinux based or blob firmware. Linux Dialogue: 0,0:04:57.48,0:05:03.13,Default,,0000,0000,0000,,based firmware is by far the most common. A study\Nshowed that 86% of firmware is based on Dialogue: 0,0:05:03.13,0:05:07.90,Default,,0000,0000,0000,,Linux. On the other hand, blob\Nfirmware is usually an operating system and Dialogue: 0,0:05:07.90,0:05:15.01,Default,,0000,0000,0000,,user applications packaged in a single\Nbinary. In any case, firmware samples are Dialogue: 0,0:05:15.01,0:05:20.02,Default,,0000,0000,0000,,usually made of multiple components. For\Ninstance, let's say that you have your Dialogue: 0,0:05:20.02,0:05:26.41,Default,,0000,0000,0000,,smartphone and you send a request to your\NIoT device. This request will be received Dialogue: 0,0:05:26.41,0:05:33.19,Default,,0000,0000,0000,,by a binary which we term a border binary,\Nwhich in this example is a webserver.
The Dialogue: 0,0:05:33.19,0:05:37.99,Default,,0000,0000,0000,,request will be received, parsed, and then\Nit might be sent to another binary called Dialogue: 0,0:05:37.99,0:05:43.15,Default,,0000,0000,0000,,the handler binary, which will take the\Nrequest, work on it, produce an answer, Dialogue: 0,0:05:43.15,0:05:48.13,Default,,0000,0000,0000,,send it back to the webserver, which in\Nturn will produce a response to send to Dialogue: 0,0:05:48.13,0:05:54.10,Default,,0000,0000,0000,,the smartphone. So to come back to the\Nquestion: why is it hard to secure IoT Dialogue: 0,0:05:54.10,0:06:01.06,Default,,0000,0000,0000,,devices? Well, the answer is because IoT\Ndevices are in practice very diverse. Of Dialogue: 0,0:06:01.06,0:06:05.89,Default,,0000,0000,0000,,course, various works have\Nbeen proposed to analyze and secure Dialogue: 0,0:06:05.89,0:06:11.50,Default,,0000,0000,0000,,firmware for IoT devices. Some of them\Nuse static analysis, others use Dialogue: 0,0:06:11.50,0:06:15.91,Default,,0000,0000,0000,,dynamic analysis and several others use\Na combination of both. Here I wrote Dialogue: 0,0:06:15.91,0:06:19.69,Default,,0000,0000,0000,,several of them. Again at the end of the\Npresentation there is a bibliography with Dialogue: 0,0:06:19.69,0:06:28.99,Default,,0000,0000,0000,,the titles of these works. Of course, all\Nthese approaches have some problems. For Dialogue: 0,0:06:28.99,0:06:33.85,Default,,0000,0000,0000,,instance, current dynamic analyses are\Nhard to apply at scale because of the Dialogue: 0,0:06:33.85,0:06:39.43,Default,,0000,0000,0000,,customized environments that IoT devices\Nwork in. Usually when you try to Dialogue: 0,0:06:39.43,0:06:45.40,Default,,0000,0000,0000,,dynamically execute a firmware, it's gonna\Ncheck if the peripherals are connected and Dialogue: 0,0:06:45.40,0:06:49.78,Default,,0000,0000,0000,,are working properly.
In a case where you\Ncan't have the peripherals, it's gonna be Dialogue: 0,0:06:49.78,0:06:55.39,Default,,0000,0000,0000,,hard to actually run the firmware. Also,\Ncurrent static analysis approaches are Dialogue: 0,0:06:55.39,0:07:00.58,Default,,0000,0000,0000,,based on what we call the single binary\Napproach, which means that binaries from a Dialogue: 0,0:07:00.58,0:07:05.62,Default,,0000,0000,0000,,firmware are taken individually and\Nanalysed. This approach might produce many Dialogue: 0,0:07:05.62,0:07:11.53,Default,,0000,0000,0000,,false positives. For instance, let's\Nsay again that we have our two binaries. Dialogue: 0,0:07:11.53,0:07:17.32,Default,,0000,0000,0000,,This is actually an example that we found\Nin one firmware. The web server will Dialogue: 0,0:07:17.32,0:07:22.99,Default,,0000,0000,0000,,take the user request, will parse the\Nrequest and produce some data, will set Dialogue: 0,0:07:22.99,0:07:27.43,Default,,0000,0000,0000,,this data to an environment variable and\Neventually will execute the handler binary. Dialogue: 0,0:07:27.43,0:07:33.67,Default,,0000,0000,0000,,Now, as you can see, the parsing function\Ncontains a string comparison which checks if Dialogue: 0,0:07:33.67,0:07:37.93,Default,,0000,0000,0000,,some keyword is present in the request.\NAnd if so, it just returns the whole Dialogue: 0,0:07:37.93,0:07:43.78,Default,,0000,0000,0000,,request. Otherwise, it will constrain the\Nsize of the request to 128 bytes and Dialogue: 0,0:07:43.78,0:07:51.79,Default,,0000,0000,0000,,return it. The handler binary in turn, when\Nspawned, will receive the data by doing a Dialogue: 0,0:07:51.79,0:07:59.38,Default,,0000,0000,0000,,getenv on the query string, but also will\Ngetenv on another environment variable Dialogue: 0,0:07:59.38,0:08:04.06,Default,,0000,0000,0000,,which in this case is not user controlled,\Nand the user cannot influence the content Dialogue: 0,0:08:04.06,0:08:10.48,Default,,0000,0000,0000,,of this variable.
Then it's gonna call the\Nfunction process_request. This function Dialogue: 0,0:08:10.48,0:08:16.69,Default,,0000,0000,0000,,eventually will do two string copies, one\Nfrom the user data, the other one from the Dialogue: 0,0:08:16.69,0:08:22.93,Default,,0000,0000,0000,,log path, into two different local variables\Nthat are 128 bytes long. Now in the first Dialogue: 0,0:08:22.93,0:08:28.36,Default,,0000,0000,0000,,case, as we have seen before, the data can\Nbe greater than 128 bytes and this string Dialogue: 0,0:08:28.36,0:08:33.46,Default,,0000,0000,0000,,copy may result in a bug, while in the\Nsecond case it will not, because here we Dialogue: 0,0:08:33.46,0:08:40.81,Default,,0000,0000,0000,,assume that the system handles its own\Ndata in a good manner. So throughout this Dialogue: 0,0:08:40.81,0:08:45.55,Default,,0000,0000,0000,,work, we're gonna call the first type of\Nbinary the setter binary, which means Dialogue: 0,0:08:45.55,0:08:50.53,Default,,0000,0000,0000,,that it is the binary that takes the data\Nand sets the data for another binary to Dialogue: 0,0:08:50.53,0:08:57.70,Default,,0000,0000,0000,,consume. And the second type of binary we\Ncall the getter binary. So the Dialogue: 0,0:08:57.70,0:09:01.57,Default,,0000,0000,0000,,current bug finding tools are inadequate\Nbecause either bugs are left undiscovered Dialogue: 0,0:09:01.57,0:09:08.08,Default,,0000,0000,0000,,if the analysis only considers those\Nbinaries that receive network requests, or Dialogue: 0,0:09:08.08,0:09:12.75,Default,,0000,0000,0000,,they're likely to produce many false\Npositives if the analysis considers all of Dialogue: 0,0:09:12.75,0:09:19.41,Default,,0000,0000,0000,,them individually. So then we wonder how\Nthese different components actually Dialogue: 0,0:09:19.41,0:09:23.43,Default,,0000,0000,0000,,communicate.
They communicate through what\Nis called interprocess communication (IPC), Dialogue: 0,0:09:23.43,0:09:28.89,Default,,0000,0000,0000,,which is basically a finite set of\Nparadigms used by binaries to communicate, Dialogue: 0,0:09:28.89,0:09:36.66,Default,,0000,0000,0000,,such as files, environment variables, MMIO\Nand so forth. All these IPCs are Dialogue: 0,0:09:36.66,0:09:42.15,Default,,0000,0000,0000,,represented by data keys, which are file\Nnames, or in the case of the example Dialogue: 0,0:09:42.15,0:09:49.44,Default,,0000,0000,0000,,before, here on the right, it's the query\Nstring environment variable. Each binary Dialogue: 0,0:09:49.44,0:09:53.28,Default,,0000,0000,0000,,that relies on some shared data must know\Nthe endpoint where such data will be Dialogue: 0,0:09:53.28,0:09:57.54,Default,,0000,0000,0000,,available, for instance, again, a\Nfile name or even a socket endpoint Dialogue: 0,0:09:58.08,0:10:02.91,Default,,0000,0000,0000,,or the environment variable. This means\Nthat usually, data keys are hard-coded in the Dialogue: 0,0:10:02.91,0:10:10.77,Default,,0000,0000,0000,,program itself, as we saw before. To find\Nbugs in firmware in a precise manner, we Dialogue: 0,0:10:10.77,0:10:14.10,Default,,0000,0000,0000,,need to track how user data is introduced\Nand propagated across the different Dialogue: 0,0:10:14.10,0:10:22.68,Default,,0000,0000,0000,,binaries. Okay, let's talk about our work.\NBefore we start talking about Karonte, we Dialogue: 0,0:10:22.68,0:10:27.93,Default,,0000,0000,0000,,define our threat model. We hypothesize\Nthat an attacker sends arbitrary requests Dialogue: 0,0:10:27.93,0:10:33.36,Default,,0000,0000,0000,,over the network, both LAN and WAN,\Ndirectly to the IoT device.
Though we said Dialogue: 0,0:10:33.36,0:10:38.64,Default,,0000,0000,0000,,before that sometimes IoT devices can\Ncommunicate through the cloud, research Dialogue: 0,0:10:38.64,0:10:42.69,Default,,0000,0000,0000,,showed that some form of local\Ncommunication is usually available, for Dialogue: 0,0:10:42.69,0:10:50.04,Default,,0000,0000,0000,,instance, during the setup phase of the\Ndevice. Karonte is defined as a static Dialogue: 0,0:10:50.04,0:10:54.27,Default,,0000,0000,0000,,analysis tool that tracks data flow across\Nmultiple binaries to find Dialogue: 0,0:10:54.27,0:11:00.69,Default,,0000,0000,0000,,vulnerabilities. Let's see how it works.\NSo as the first step, Karonte finds those Dialogue: 0,0:11:00.69,0:11:04.59,Default,,0000,0000,0000,,binaries that introduce the user input\Ninto the firmware. We call these border Dialogue: 0,0:11:04.59,0:11:09.18,Default,,0000,0000,0000,,binaries, which are the binaries that\Nbasically interface the device to the Dialogue: 0,0:11:09.18,0:11:15.57,Default,,0000,0000,0000,,outside world, which in the example is our\Nweb server. Then it tracks how data is Dialogue: 0,0:11:15.57,0:11:20.76,Default,,0000,0000,0000,,shared with other binaries within the\Nfirmware sample. It will understand that, in Dialogue: 0,0:11:20.76,0:11:25.17,Default,,0000,0000,0000,,this example, the web server communicates\Nwith the handler binary, and it builds what we Dialogue: 0,0:11:25.17,0:11:30.63,Default,,0000,0000,0000,,call the BDG. BDG stands for binary\Ndependency graph. It's basically a graph Dialogue: 0,0:11:30.63,0:11:39.72,Default,,0000,0000,0000,,representation of the data dependencies\Namong different binaries. Then we detect Dialogue: 0,0:11:39.72,0:11:45.36,Default,,0000,0000,0000,,vulnerabilities that arise from the misuse\Nof the data using the BDG. This is an Dialogue: 0,0:11:45.36,0:11:52.65,Default,,0000,0000,0000,,overview of our system. We start by taking\Na packed firmware, we unpack it.
We find Dialogue: 0,0:11:52.65,0:11:58.74,Default,,0000,0000,0000,,the border binaries. Then we build the\Nbinary dependency graph, which relies on a Dialogue: 0,0:11:58.74,0:12:04.80,Default,,0000,0000,0000,,set of CPFs, as we will see soon. CPF\Nstands for Communication Paradigm Finder. Dialogue: 0,0:12:04.80,0:12:10.32,Default,,0000,0000,0000,,Then we find the specifics of the\Ncommunication, for instance, the Dialogue: 0,0:12:10.32,0:12:16.14,Default,,0000,0000,0000,,constraints applied to the data that is\Nshared, through our multi-binary Dialogue: 0,0:12:16.14,0:12:20.55,Default,,0000,0000,0000,,data-flow analysis module. Eventually we run our\Ninsecure interaction detection module, Dialogue: 0,0:12:20.55,0:12:26.04,Default,,0000,0000,0000,,which basically takes all the information\Nand produces alerts. Our system is Dialogue: 0,0:12:26.04,0:12:32.43,Default,,0000,0000,0000,,completely static and relies on our static\Ntaint engine. So let's see each one of Dialogue: 0,0:12:32.43,0:12:37.32,Default,,0000,0000,0000,,these steps in more detail. The\Nunpacking procedure is pretty easy, we use Dialogue: 0,0:12:37.32,0:12:42.60,Default,,0000,0000,0000,,the off-the-shelf firmware unpacking tool\Nbinwalk. And then we have to find the Dialogue: 0,0:12:42.60,0:12:47.73,Default,,0000,0000,0000,,border binaries. Now we see that border\Nbinaries basically are binaries that Dialogue: 0,0:12:47.73,0:12:54.15,Default,,0000,0000,0000,,receive data from the network. And we\Nhypothesize that they will contain parsers to Dialogue: 0,0:12:54.15,0:12:57.93,Default,,0000,0000,0000,,validate the data that they receive. So\Nin order to find them, we have to find Dialogue: 0,0:12:57.93,0:13:04.17,Default,,0000,0000,0000,,parsers which accept data from the network and\Nparse this data.
To find parsers we rely Dialogue: 0,0:13:04.17,0:13:12.90,Default,,0000,0000,0000,,on related work, which basically uses a\Nfew metrics to define, through a number, Dialogue: 0,0:13:12.90,0:13:18.00,Default,,0000,0000,0000,,the likelihood for a function to contain\Nparsing capabilities. The metrics that Dialogue: 0,0:13:18.00,0:13:22.47,Default,,0000,0000,0000,,we used are number of basic blocks, number\Nof memory comparison operations and number Dialogue: 0,0:13:22.47,0:13:29.07,Default,,0000,0000,0000,,of branches. Now while these define\Nparsers, we also have to find if a binary Dialogue: 0,0:13:29.07,0:13:34.11,Default,,0000,0000,0000,,takes data from the network. As such, we\Ndefine two more metrics. For the first one, we Dialogue: 0,0:13:34.11,0:13:39.48,Default,,0000,0000,0000,,check if the binary contains any network\Nrelated keywords such as SOAP, HTTP and so Dialogue: 0,0:13:39.48,0:13:45.24,Default,,0000,0000,0000,,forth. And then we check if there exists a\Ndata flow between a read from a socket and a Dialogue: 0,0:13:45.24,0:13:51.66,Default,,0000,0000,0000,,memory comparison operation. Once we have\Nall these metrics for each function, we Dialogue: 0,0:13:51.66,0:13:56.07,Default,,0000,0000,0000,,compute what is called a parsing score,\Nwhich basically is just a sum of products. Dialogue: 0,0:13:56.07,0:14:01.71,Default,,0000,0000,0000,,Once we have a parsing score for each\Nfunction in a binary, we represent the Dialogue: 0,0:14:01.71,0:14:07.68,Default,,0000,0000,0000,,binary with its highest parsing score.\NOnce we have that for each binary in the Dialogue: 0,0:14:07.68,0:14:14.37,Default,,0000,0000,0000,,firmware, we cluster them using the DBSCAN\Ndensity based algorithm and consider the Dialogue: 0,0:14:14.37,0:14:18.24,Default,,0000,0000,0000,,cluster with the highest parsing score as\Ncontaining the set of border binaries. Dialogue: 0,0:14:18.24,0:14:25.62,Default,,0000,0000,0000,,After this, we build the binary dependency\Ngraph.
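The parsing score described above can be illustrated with a small sketch. This is not Karonte's actual formula: the talk only says the score is "a sum of products" over the listed metrics, so the weights below, the `FunctionFeatures` type and the function names are all assumptions made for illustration. The DBSCAN clustering step is left out.

```python
from dataclasses import dataclass


@dataclass
class FunctionFeatures:
    """Per-function metrics named in the talk (field names are hypothetical)."""
    basic_blocks: int
    mem_comparisons: int
    branches: int
    has_network_keyword: bool      # e.g. references to "SOAP" or "HTTP"
    socket_to_compare_flow: bool   # data flow from a socket read to a memory comparison


# Hypothetical weights; the real coefficients are not given in the talk.
WEIGHTS = {
    "basic_blocks": 1,
    "mem_comparisons": 5,
    "branches": 2,
    "network_keyword": 20,
    "socket_flow": 20,
}


def parsing_score(f: FunctionFeatures) -> int:
    """Sum of products: each metric multiplied by its (assumed) weight."""
    return (
        WEIGHTS["basic_blocks"] * f.basic_blocks
        + WEIGHTS["mem_comparisons"] * f.mem_comparisons
        + WEIGHTS["branches"] * f.branches
        + WEIGHTS["network_keyword"] * int(f.has_network_keyword)
        + WEIGHTS["socket_flow"] * int(f.socket_to_compare_flow)
    )


def binary_score(functions: list[FunctionFeatures]) -> int:
    """A binary is represented by the highest parsing score among its functions."""
    return max((parsing_score(f) for f in functions), default=0)
```

In the approach described in the talk, the per-binary scores would then be clustered, and the cluster with the highest score taken as the set of border binaries.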
Again the binary dependency graph Dialogue: 0,0:14:25.62,0:14:29.79,Default,,0000,0000,0000,,represents the data dependency among the\Nbinaries in a firmware sample. For Dialogue: 0,0:14:29.79,0:14:35.43,Default,,0000,0000,0000,,instance, this simple graph will tell us\Nthat a binary A communicates with binary C Dialogue: 0,0:14:35.43,0:14:40.77,Default,,0000,0000,0000,,using files and the same binary A\Ncommunicates with another binary B using Dialogue: 0,0:14:40.77,0:14:47.31,Default,,0000,0000,0000,,environment variables. Let's see how this\Nworks. So we start from the identified Dialogue: 0,0:14:47.31,0:14:53.01,Default,,0000,0000,0000,,border binaries and then we taint the data\Ncompared against network related keywords Dialogue: 0,0:14:53.01,0:14:58.32,Default,,0000,0000,0000,,that we found and run a static analysis,\Nstatic taint analysis to detect whether Dialogue: 0,0:14:58.32,0:15:04.68,Default,,0000,0000,0000,,the binary relies on any IPC paradigm to\Nshare the data. If we find that it does, Dialogue: 0,0:15:04.68,0:15:09.36,Default,,0000,0000,0000,,we establish if the binary is a setter or\Na getter, which again means that if the Dialogue: 0,0:15:09.36,0:15:13.32,Default,,0000,0000,0000,,binary is setting the data to be consumed\Nby another binary, or if the binary Dialogue: 0,0:15:13.32,0:15:20.52,Default,,0000,0000,0000,,actually gets the data and consumes it.\NThen we retrieve the employed data key Dialogue: 0,0:15:20.52,0:15:25.86,Default,,0000,0000,0000,,which in the example before was the\Nkeyword QUERY_STRING. And finally we scan Dialogue: 0,0:15:25.86,0:15:30.45,Default,,0000,0000,0000,,the firmware sample to find other binaries\Nthat may rely on the same data keys and Dialogue: 0,0:15:30.45,0:15:35.82,Default,,0000,0000,0000,,schedule them for further analysis. To\Nunderstand whether a binary relies on any Dialogue: 0,0:15:35.82,0:15:42.51,Default,,0000,0000,0000,,IPC, we use what we call CPFs, which again\Nmeans communication paradigm finder. 
We Dialogue: 0,0:15:42.51,0:15:52.29,Default,,0000,0000,0000,,design a CPF for each IPC. And the CPFs\Nare also used to find the same data keys Dialogue: 0,0:15:52.29,0:15:56.28,Default,,0000,0000,0000,,within the firmware sample. We also\Nprovide Karonte with a generic CPF to Dialogue: 0,0:15:56.28,0:16:00.39,Default,,0000,0000,0000,,cover those cases where the IPC is\Nunknown, or those cases where the vendor Dialogue: 0,0:16:00.39,0:16:06.09,Default,,0000,0000,0000,,implemented their own version of some\NIPC. So for example they don't use the Dialogue: 0,0:16:06.09,0:16:13.35,Default,,0000,0000,0000,,standard setenv, but they implemented their own\Nsetenv. The idea behind this generic CPF, Dialogue: 0,0:16:13.35,0:16:19.74,Default,,0000,0000,0000,,which we call the semantic CPF, is that data\Nkeys have to be used as an index to set or to Dialogue: 0,0:16:19.74,0:16:27.87,Default,,0000,0000,0000,,get some data, as in this simple example. So\Nlet's see how the BDG algorithm works. We Dialogue: 0,0:16:27.87,0:16:31.89,Default,,0000,0000,0000,,start from the border binary, which again\Nwill start from the server request and Dialogue: 0,0:16:31.89,0:16:38.25,Default,,0000,0000,0000,,will parse the URI, and we see that here it\Nruns a string comparison against some Dialogue: 0,0:16:38.25,0:16:44.85,Default,,0000,0000,0000,,network related keyword. As such, we taint\Nthe variable P. And we see that the Dialogue: 0,0:16:44.85,0:16:52.80,Default,,0000,0000,0000,,variable P is returned from the function\Nat these two different points. As such, we Dialogue: 0,0:16:52.80,0:16:57.18,Default,,0000,0000,0000,,continue. And now we see that data gets\Ntainted and the variable data is passed Dialogue: 0,0:16:57.18,0:17:02.31,Default,,0000,0000,0000,,to the function setenv.
At this point, the\Nenvironment CPF will understand that Dialogue: 0,0:17:02.31,0:17:08.46,Default,,0000,0000,0000,,tainted data is set to an\Nenvironment variable and will understand Dialogue: 0,0:17:08.46,0:17:13.68,Default,,0000,0000,0000,,that this binary is indeed the setter\Nbinary that uses the environment. Then we Dialogue: 0,0:17:13.68,0:17:18.54,Default,,0000,0000,0000,,retrieve the data key QUERY_STRING and\Nwe'll search within the firmware sample Dialogue: 0,0:17:18.54,0:17:28.07,Default,,0000,0000,0000,,for all the other binaries that rely on the\Nsame data key. And it will find that this Dialogue: 0,0:17:28.07,0:17:29.88,Default,,0000,0000,0000,,binary relies on the same data key and\Nwill schedule it for further analysis. Dialogue: 0,0:17:29.88,0:17:37.02,Default,,0000,0000,0000,,After this algorithm we build the BDG by\Ncreating edges between setters and getters Dialogue: 0,0:17:37.02,0:17:45.15,Default,,0000,0000,0000,,for each data key. The multi-binary data\Nflow analysis uses the BDG to find and Dialogue: 0,0:17:45.15,0:17:51.27,Default,,0000,0000,0000,,propagate the data constraints from a\Nsetter to a getter. Now, through this we Dialogue: 0,0:17:51.27,0:17:56.61,Default,,0000,0000,0000,,apply only the least strict constraints,\Nwhich means that ideally between two Dialogue: 0,0:17:56.61,0:18:02.76,Default,,0000,0000,0000,,program points, there might be an infinite\Nnumber of paths and, in theory, an Dialogue: 0,0:18:02.76,0:18:06.69,Default,,0000,0000,0000,,infinite number of constraints that we can\Npropagate from the setter binary to the Dialogue: 0,0:18:06.69,0:18:11.79,Default,,0000,0000,0000,,getter binary. But since our goal here is\Nto find bugs, we only propagate the least Dialogue: 0,0:18:11.79,0:18:17.04,Default,,0000,0000,0000,,strict set of constraints. Let's see an\Nexample.
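As a rough sketch of the two ideas just described, the snippet below builds BDG edges by pairing setters and getters that share a data key, and picks the least strict length bound among the paths that reach a setter. It is only an illustration under simplifying assumptions: constraints are reduced to a single "maximum length" per path, and the binary names `httpd`, `handler` and `logd` and the key `LOG_PATH` are hypothetical (only `QUERY_STRING` comes from the talk).

```python
from collections import defaultdict


def build_bdg(roles):
    """Build BDG edges from (binary, role, data_key) triples.

    role is "setter" or "getter". An edge (setter, getter, key) is created
    for every setter/getter pair that shares the same data key.
    """
    setters = defaultdict(set)
    getters = defaultdict(set)
    for binary, role, key in roles:
        (setters if role == "setter" else getters)[key].add(binary)

    edges = set()
    for key, sources in setters.items():
        for src in sources:
            for dst in getters.get(key, ()):
                edges.add((src, dst, key))
    return edges


def least_strict_max_len(path_bounds):
    """Pick the loosest length bound among all paths reaching the setter.

    Each entry is the maximum length enforced on one path, or None when the
    path leaves the data unconstrained. Unconstrained wins over any bound.
    """
    if any(bound is None for bound in path_bounds):
        return None
    return max(path_bounds, default=None)
```

With one unconstrained path and one path bounded to 128 bytes, `least_strict_max_len([None, 128])` yields `None`, so the getter would see the data as unconstrained.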
So again, we have our two Dialogue: 0,0:18:17.04,0:18:24.06,Default,,0000,0000,0000,,binaries and we see that the variable that\Nis passed to the setenv function is data, Dialogue: 0,0:18:24.06,0:18:29.49,Default,,0000,0000,0000,,which comes from two different paths from\Nthe parse URI function. In the first case, Dialogue: 0,0:18:29.49,0:18:35.04,Default,,0000,0000,0000,,the data that is passed is unconstrained,\Nwhile in the second case, at line 8, it is Dialogue: 0,0:18:35.04,0:18:40.47,Default,,0000,0000,0000,,constrained to be at most 128 bytes. As\Nsuch, we only propagate the constraints of Dialogue: 0,0:18:40.47,0:18:49.98,Default,,0000,0000,0000,,the first one. In turn, the getter binary\Nwill retrieve this variable from the Dialogue: 0,0:18:49.98,0:18:55.83,Default,,0000,0000,0000,,environment and set the variable query.\NOh, sorry. Which in this case will be Dialogue: 0,0:18:55.83,0:19:03.39,Default,,0000,0000,0000,,unconstrained. Insecure interaction\Ndetection runs a static taint analysis and Dialogue: 0,0:19:03.39,0:19:07.65,Default,,0000,0000,0000,,checks whether tainted data can reach a\Nsink in an unsafe way. We consider as Dialogue: 0,0:19:07.65,0:19:12.66,Default,,0000,0000,0000,,sinks memcpy-like functions, which are\Nfunctions that are semantically Dialogue: 0,0:19:12.66,0:19:19.05,Default,,0000,0000,0000,,equivalent to memcpy, strcpy and so forth. We\Nalso raise an alert if we see that there is a Dialogue: 0,0:19:19.05,0:19:23.10,Default,,0000,0000,0000,,dereference of a tainted variable and if\Nwe see there are comparisons of tainted Dialogue: 0,0:19:23.10,0:19:31.62,Default,,0000,0000,0000,,variables in loop conditions, to detect\Npossible DoS vulnerabilities. Let's see an Dialogue: 0,0:19:31.62,0:19:37.26,Default,,0000,0000,0000,,example again. So we got here. We know\Nthat our query variable is tainted and Dialogue: 0,0:19:37.26,0:19:43.77,Default,,0000,0000,0000,,it's unconstrained.
And then we follow the\Ntaint in the function process_request, Dialogue: 0,0:19:43.77,0:19:52.74,Default,,0000,0000,0000,,which we see will eventually copy the data\Nfrom q to arg. Now we see that arg is 128 Dialogue: 0,0:19:52.74,0:20:01.05,Default,,0000,0000,0000,,bytes long while q is unconstrained and\Ntherefore we generate an alert here. Our Dialogue: 0,0:20:01.05,0:20:04.98,Default,,0000,0000,0000,,static taint engine is based on BootStomp\Nand is completely based on symbolic Dialogue: 0,0:20:04.98,0:20:09.75,Default,,0000,0000,0000,,execution, which means that the taint is\Npropagated following the program data Dialogue: 0,0:20:09.75,0:20:14.43,Default,,0000,0000,0000,,flow. Let's see an example. So assuming\Nthat we have this code, the first Dialogue: 0,0:20:14.43,0:20:19.62,Default,,0000,0000,0000,,instruction takes the result from some\Nseed function that might return, for Dialogue: 0,0:20:19.62,0:20:25.76,Default,,0000,0000,0000,,instance, some user input. And in the\Nsymbolic world, what we do is we create a Dialogue: 0,0:20:25.76,0:20:33.63,Default,,0000,0000,0000,,symbolic variable ty and assign to it a\Ntainted variable that we call TAINT_ty, Dialogue: 0,0:20:33.63,0:20:40.29,Default,,0000,0000,0000,,which is the taint tag. In the next\Ninstruction, x takes the value ty plus 5, Dialogue: 0,0:20:40.29,0:20:46.89,Default,,0000,0000,0000,,and in the symbolic world we just follow the\Ndata flow and x gets assigned TAINT_ty Dialogue: 0,0:20:46.89,0:20:54.30,Default,,0000,0000,0000,,plus 5, which effectively taints x as well. If\Nat some point x is overwritten with some Dialogue: 0,0:20:54.30,0:21:00.90,Default,,0000,0000,0000,,constant data, the taint is automatically\Nremoved. In the original design of Dialogue: 0,0:21:00.90,0:21:07.86,Default,,0000,0000,0000,,BootStomp, the taint is also removed when\Ndata is constrained.
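The propagation rule in this example (a seed taints `ty`, `x = ty + 5` inherits the taint, a constant overwrite removes it) can be mimicked with a toy tracker. This is a deliberately simplified stand-in that tracks sets of taint tags per variable name. It is not symbolic execution and not the BootStomp or Karonte engine, and the `TaintState` class and its methods are hypothetical.

```python
class TaintState:
    """Toy taint tracker: maps each variable name to a set of taint tags."""

    def __init__(self):
        self.tags = {}

    def seed(self, var):
        """Mark var as coming from a taint source, e.g. ty = seed()."""
        self.tags[var] = frozenset({f"TAINT_{var}"})

    def assign(self, dst, *srcs):
        """Model dst = f(srcs). Constants contribute no tags, so an
        assignment with no variable sources clears the taint on dst."""
        merged = frozenset().union(
            *(self.tags.get(src, frozenset()) for src in srcs)
        )
        self.tags[dst] = merged

    def is_tainted(self, var):
        return bool(self.tags.get(var))
```

For the example in the talk, `seed("ty")` followed by `assign("x", "ty")` leaves `x` carrying `TAINT_ty`, and a later `assign("x")` (a constant overwrite) clears it.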
For instance, here we Dialogue: 0,0:21:07.86,0:21:11.88,Default,,0000,0000,0000,,can see that the variable n is tainted but\Nthen is constrained between two values, 0 Dialogue: 0,0:21:11.88,0:21:19.77,Default,,0000,0000,0000,,and 255. And therefore, the taint is\Nremoved. In our taint engine we have two Dialogue: 0,0:21:19.77,0:21:26.61,Default,,0000,0000,0000,,additions. We added a path prioritization\Nstrategy and we added taint dependencies. Dialogue: 0,0:21:26.61,0:21:32.43,Default,,0000,0000,0000,,The path prioritization strategy prioritizes\Npaths that propagate the taint and Dialogue: 0,0:21:33.03,0:21:39.03,Default,,0000,0000,0000,,deprioritizes those that remove it. For\Ninstance, say again that some user input Dialogue: 0,0:21:39.03,0:21:46.11,Default,,0000,0000,0000,,comes from some function and the variable\Nuser input gets tainted and Dialogue: 0,0:21:46.11,0:21:51.18,Default,,0000,0000,0000,,then is passed to another function called\Nparse. Here, as you can see, there is possibly Dialogue: 0,0:21:51.18,0:21:57.93,Default,,0000,0000,0000,,an infinite number of symbolic paths in\Nthis while loop, but only one will return tainted Dialogue: 0,0:21:57.93,0:22:05.49,Default,,0000,0000,0000,,data, while the others won't. So the path\Nprioritization strategy prioritizes this Dialogue: 0,0:22:05.49,0:22:09.99,Default,,0000,0000,0000,,path over the others. This has been\Nimplemented by finding basic blocks within Dialogue: 0,0:22:09.99,0:22:16.14,Default,,0000,0000,0000,,a function that return nonconstant data.\NAnd if one is found, we follow its return Dialogue: 0,0:22:16.14,0:22:21.87,Default,,0000,0000,0000,,before considering the others. Taint\Ndependencies allow smart untaint Dialogue: 0,0:22:21.87,0:22:26.31,Default,,0000,0000,0000,,strategies.
Let's look at the example again.\NSo we know that user input here is Dialogue: 0,0:22:26.31,0:22:33.90,Default,,0000,0000,0000,,tainted, is then parsed, and then we see\Nthat its length is computed and stored in Dialogue: 0,0:22:33.90,0:22:40.76,Default,,0000,0000,0000,,a variable n. Its size is checked and if\Nit's greater than 512 bytes, the function Dialogue: 0,0:22:40.76,0:22:48.21,Default,,0000,0000,0000,,will return. Otherwise it copies the data.\NNow in this case, it might happen that if Dialogue: 0,0:22:48.21,0:22:53.54,Default,,0000,0000,0000,,this strlen function is not analyzed\Nbecause of some static analysis input Dialogue: 0,0:22:53.54,0:23:00.78,Default,,0000,0000,0000,,decisions, the taint tag of cmd might be\Ndifferent from the taint tag of n, and in Dialogue: 0,0:23:00.78,0:23:07.38,Default,,0000,0000,0000,,this case, though n gets untainted, cmd\Nis not untainted and the strcpy can raise, Dialogue: 0,0:23:07.38,0:23:15.54,Default,,0000,0000,0000,,sorry, can cause a false positive. So to fix\Nthis problem, we basically create a Dialogue: 0,0:23:15.54,0:23:21.36,Default,,0000,0000,0000,,dependency between the taint tag of n and\Nthe taint tag of cmd. And when n gets Dialogue: 0,0:23:21.36,0:23:28.41,Default,,0000,0000,0000,,untainted, cmd gets untainted as well. So\Nwe don't have this false positive anymore. This Dialogue: 0,0:23:28.41,0:23:33.33,Default,,0000,0000,0000,,procedure is automatic: we find\Nfunctions that implement strlen Dialogue: 0,0:23:33.33,0:23:40.14,Default,,0000,0000,0000,,semantically equivalent code and create\Ntaint tag dependencies. OK. Let's see our Dialogue: 0,0:23:40.14,0:23:48.24,Default,,0000,0000,0000,,evaluation. We ran 3 different evaluations\Non 2 different data sets. The first one is Dialogue: 0,0:23:48.24,0:23:55.14,Default,,0000,0000,0000,,composed of 53 latest firmware samples\Nfrom seven vendors, and the second one of 899 Dialogue: 0,0:23:55.14,0:24:02.34,Default,,0000,0000,0000,,firmware samples gathered from related work.
In\Nthe first case, we can see that the total Dialogue: 0,0:24:02.34,0:24:09.72,Default,,0000,0000,0000,,number of binaries considered is 8.5k,\Na few more than that. And our system Dialogue: 0,0:24:09.72,0:24:15.90,Default,,0000,0000,0000,,generated 87 alerts of which 51 were found\Nto be true positives and 34 of them were Dialogue: 0,0:24:15.90,0:24:21.96,Default,,0000,0000,0000,,multibinary vulnerabilities, which means\Nthat the vulnerability was found by Dialogue: 0,0:24:21.96,0:24:27.99,Default,,0000,0000,0000,,tracking the data flow from the setter to\Nthe getter binary. We also ran a Dialogue: 0,0:24:27.99,0:24:32.01,Default,,0000,0000,0000,,comparative evaluation, in which we basically\Ntried to measure the effort that an Dialogue: 0,0:24:32.01,0:24:37.26,Default,,0000,0000,0000,,analyst would go through in analyzing\Nfirmware using different strategies. In Dialogue: 0,0:24:37.26,0:24:41.28,Default,,0000,0000,0000,,the first one, we considered each and every\Nbinary in the firmware sample Dialogue: 0,0:24:41.28,0:24:49.05,Default,,0000,0000,0000,,independently and ran the analysis for up\Nto seven days for each firmware. The Dialogue: 0,0:24:49.05,0:24:57.39,Default,,0000,0000,0000,,system generated almost 21000 alerts,\Nconsidering only almost 2.5k binaries. In Dialogue: 0,0:24:57.39,0:25:04.02,Default,,0000,0000,0000,,the second case we found the border\Nbinaries, the parsers, and we statically Dialogue: 0,0:25:04.02,0:25:11.07,Default,,0000,0000,0000,,analyzed only them, and the system\Ngenerated 9.3k alerts. Notice that in this Dialogue: 0,0:25:11.07,0:25:15.63,Default,,0000,0000,0000,,case, since we don't know how the user\Ninput is introduced, like in this Dialogue: 0,0:25:15.63,0:25:21.12,Default,,0000,0000,0000,,experiment, we consider every IPC that we\Nfind in the binary as a possible source of Dialogue: 0,0:25:21.12,0:25:28.47,Default,,0000,0000,0000,,user input. And this is true for all of\Nthem. 
In the third case we ran the BDG but Dialogue: 0,0:25:28.47,0:25:33.06,Default,,0000,0000,0000,,we considered each binary independently.\NWhich means that we don't propagate Dialogue: 0,0:25:33.06,0:25:37.80,Default,,0000,0000,0000,,constraints and we run a static\Nsingle-binary analysis on each one of them. And Dialogue: 0,0:25:37.80,0:25:45.75,Default,,0000,0000,0000,,the system generated almost 15000 alerts.\NFinally, we ran Karonte and the generated Dialogue: 0,0:25:45.75,0:25:55.23,Default,,0000,0000,0000,,alerts were only 74. We also ran a larger\Nscale analysis on 899 firmware samples. Dialogue: 0,0:25:55.23,0:26:01.38,Default,,0000,0000,0000,,And we found that almost 40% of them were\Nmulti binary, which means that the network Dialogue: 0,0:26:01.38,0:26:08.22,Default,,0000,0000,0000,,functionalities were carried out by more\Nthan one binary. And the system generated Dialogue: 0,0:26:08.22,0:26:16.62,Default,,0000,0000,0000,,1000 alerts. Now, there is a lot going on\Nin this table, and the details are in the Dialogue: 0,0:26:16.62,0:26:21.66,Default,,0000,0000,0000,,paper. Here in this presentation I'll just go\Nthrough some of them. So we found Dialogue: 0,0:26:21.66,0:26:27.36,Default,,0000,0000,0000,,that on average, a firmware contains 4\Nborder binaries. A BDG contains 5 binaries Dialogue: 0,0:26:27.36,0:26:34.05,Default,,0000,0000,0000,,and some BDGs have more than 10 binaries.\NAlso, we plotted some statistics and we found Dialogue: 0,0:26:34.05,0:26:39.03,Default,,0000,0000,0000,,that 80% of the firmware were analysed\Nwithin a day, as you can see from the top Dialogue: 0,0:26:39.03,0:26:46.35,Default,,0000,0000,0000,,left figure. However, experiments\Npresented a great variance which we found Dialogue: 0,0:26:46.35,0:26:51.30,Default,,0000,0000,0000,,was due to implementation details. 
For\Ninstance we found that angr would take Dialogue: 0,0:26:51.30,0:26:56.22,Default,,0000,0000,0000,,more than seven hours to build some CFGs.\NAnd sometimes they were due to a high Dialogue: 0,0:26:56.22,0:27:01.65,Default,,0000,0000,0000,,number of data keys. Also, we found that\Nthe number of paths, as you can see from Dialogue: 0,0:27:01.65,0:27:09.48,Default,,0000,0000,0000,,this second picture from the top, the\Nnumber of paths does not have an impact on Dialogue: 0,0:27:09.48,0:27:15.03,Default,,0000,0000,0000,,the total time. And as you can see from\Nthe bottom two pictures, performance is not Dialogue: 0,0:27:15.87,0:27:23.61,Default,,0000,0000,0000,,heavily affected by firmware size.\NBy firmware size here we mean the number of Dialogue: 0,0:27:23.61,0:27:29.61,Default,,0000,0000,0000,,binaries in a firmware sample and the\Ntotal number of basic blocks. So let's see Dialogue: 0,0:27:29.61,0:27:35.19,Default,,0000,0000,0000,,how to run Karonte. The procedure is\Npretty straightforward. So first you get a Dialogue: 0,0:27:35.19,0:27:38.79,Default,,0000,0000,0000,,firmware sample. You create a\Nconfiguration file containing information Dialogue: 0,0:27:38.79,0:27:45.15,Default,,0000,0000,0000,,about the firmware sample and then you run\Nit. So let's see how. So this is an Dialogue: 0,0:27:45.15,0:27:51.45,Default,,0000,0000,0000,,example of a configuration file. It\Ncontains several fields, but most of them Dialogue: 0,0:27:51.45,0:27:55.29,Default,,0000,0000,0000,,are optional. The only ones that are not\Nare this one: Firmware path, that is the Dialogue: 0,0:27:55.29,0:28:00.30,Default,,0000,0000,0000,,path to your firmware. And these two: the\Narchitecture of the firmware and the base Dialogue: 0,0:28:00.30,0:28:07.17,Default,,0000,0000,0000,,address if the firmware is a blob, is a\Nfirmware blob. All the other fields are Dialogue: 0,0:28:07.17,0:28:12.38,Default,,0000,0000,0000,,optional. And you can set them if you have\Nsome information about the firmware. 
A Dialogue: 0,0:28:12.38,0:28:18.33,Default,,0000,0000,0000,,detailed explanation of all of these\Nfields is on our GitHub repo. Once you Dialogue: 0,0:28:18.33,0:28:23.98,Default,,0000,0000,0000,,set the configuration file, you can run\NKaronte. Now we provide a Docker Dialogue: 0,0:28:23.98,0:28:28.67,Default,,0000,0000,0000,,container, you can find the link on our\NGitHub repo. And I'm gonna run it, but Dialogue: 0,0:28:28.67,0:28:41.40,Default,,0000,0000,0000,,it's not gonna finish because it's gonna\Ntake several hours. But all you have to do Dialogue: 0,0:28:41.40,0:28:53.22,Default,,0000,0000,0000,,is merely... *typing noises* just run it\Non the configuration file and it's gonna Dialogue: 0,0:28:53.22,0:28:57.63,Default,,0000,0000,0000,,do each step that we saw. Eventually I'm\Ngoing to stop it because it's going to Dialogue: 0,0:28:57.63,0:29:02.54,Default,,0000,0000,0000,,take several hours anyway. Eventually it\Nwill produce a result file that... I ran Dialogue: 0,0:29:02.54,0:29:07.86,Default,,0000,0000,0000,,this yesterday so you can see it here.\NThere is a lot going on here. I'm just Dialogue: 0,0:29:07.86,0:29:14.78,Default,,0000,0000,0000,,gonna go through some important\Ninformation. So one thing that you can see Dialogue: 0,0:29:14.78,0:29:21.92,Default,,0000,0000,0000,,is that these are the border binaries that\NKaronte found. Now, there might be some Dialogue: 0,0:29:21.92,0:29:26.36,Default,,0000,0000,0000,,false positives. I'm not sure how many\Nthere are here. But as long as there are Dialogue: 0,0:29:26.36,0:29:32.13,Default,,0000,0000,0000,,no false negatives or the number is very\Nlow, it's fine. It's good. In this case, Dialogue: 0,0:29:32.13,0:29:38.88,Default,,0000,0000,0000,,wait. Oh, I might have removed something.\NAll right, here, perfect. 
In this case, Dialogue: 0,0:29:38.88,0:29:45.44,Default,,0000,0000,0000,,this guy httpd is a true positive, which\Nis the web server that we were talking Dialogue: 0,0:29:45.44,0:29:52.18,Default,,0000,0000,0000,,about before. Then we have the BDG. In this\Ncase, we can see that Karonte found that Dialogue: 0,0:29:52.18,0:30:00.25,Default,,0000,0000,0000,,httpd communicates with two different\Nbinaries, fileaccess.cgi and cgibin. Then Dialogue: 0,0:30:00.25,0:30:10.80,Default,,0000,0000,0000,,we have information about the CPFs. For\Ninstance, here we can see that. Sorry. So Dialogue: 0,0:30:10.80,0:30:19.78,Default,,0000,0000,0000,,we can see here that httpd has 28 data\Nkeys. And that the semantic CPF found 27 Dialogue: 0,0:30:19.78,0:30:26.82,Default,,0000,0000,0000,,of them and then there might be one other\Nhere or somewhere that I don't see. Dialogue: 0,0:30:26.82,0:30:35.84,Default,,0000,0000,0000,,Anyway. And then we have a list of alerts.\NNow, thanks. Now, some of those may be Dialogue: 0,0:30:35.84,0:30:44.14,Default,,0000,0000,0000,,duplicates because of loops, so you can go\Nahead and inspect all of them manually. Dialogue: 0,0:30:44.14,0:30:50.98,Default,,0000,0000,0000,,But I wrote a utility that you can use,\Nwhich basically is gonna filter out Dialogue: 0,0:30:50.98,0:31:02.10,Default,,0000,0000,0000,,all the loops for you. Now to remember how\NI called it. This guy? Yeah. And you can Dialogue: 0,0:31:02.10,0:31:13.37,Default,,0000,0000,0000,,see that in total it generated, the system\Ngenerated 6... 7... 8 alerts. So let's see Dialogue: 0,0:31:13.37,0:31:20.58,Default,,0000,0000,0000,,one of them. Oh, and I recently realized\Nthat the path that I'm reporting in the Dialogue: 0,0:31:20.58,0:31:25.97,Default,,0000,0000,0000,,log is not the path from the setter\Nbinary to the getter binary, to the sink. Dialogue: 0,0:31:25.97,0:31:31.43,Default,,0000,0000,0000,,But it's only related to the getter binary\Nup to the sink. 
I'm gonna fix this in the Dialogue: 0,0:31:31.43,0:31:37.55,Default,,0000,0000,0000,,next days and report the whole paths.\NAnyway. So here we can see that the key Dialogue: 0,0:31:37.55,0:31:43.40,Default,,0000,0000,0000,,content type contains user input and it's\Npassed in an unsafe way to the sink Dialogue: 0,0:31:43.40,0:31:49.69,Default,,0000,0000,0000,,at this address. Now. And the\Nbinary in question is called Dialogue: 0,0:31:49.69,0:32:02.42,Default,,0000,0000,0000,,fileaccess.cgi. So we can see what happens\Nthere. *keyboard noises* If you see here, Dialogue: 0,0:32:02.42,0:32:12.48,Default,,0000,0000,0000,,we have a string copy that copies the\Ncontent of haystack to destination, Dialogue: 0,0:32:12.48,0:32:20.75,Default,,0000,0000,0000,,haystack comes basically from this getenv.\NAnd if you see, destination comes as a Dialogue: 0,0:32:20.75,0:32:30.00,Default,,0000,0000,0000,,parameter from this function,\Nand this buffer is as big as Dialogue: 0,0:32:30.00,0:32:38.90,Default,,0000,0000,0000,,0x68 bytes. And this turned out to be\Nactually a true positive. OK. So in summary, we Dialogue: 0,0:32:38.90,0:32:46.53,Default,,0000,0000,0000,,presented a strategy to track data flow\Nacross different binaries. We evaluated Dialogue: 0,0:32:46.53,0:32:52.97,Default,,0000,0000,0000,,our system on 952 firmware samples. And\Nsome takeaways: analyzing firmware is not Dialogue: 0,0:32:52.97,0:32:58.16,Default,,0000,0000,0000,,easy and vulnerabilities persist. We found\Nout that firmware is made of Dialogue: 0,0:32:58.16,0:33:02.66,Default,,0000,0000,0000,,interconnected components and static\Nanalysis can still be used to efficiently Dialogue: 0,0:33:02.66,0:33:07.73,Default,,0000,0000,0000,,find vulnerabilities at scale and finding\Nthat communication is key for precision. 
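The taint tag dependency described earlier in the talk (n holds the length of cmd, and untainting n should also untaint cmd) can be illustrated with a minimal sketch. This is a toy model only, not Karonte's actual taint engine: the class `TaintTracker`, its methods, and the helper `analyze` are hypothetical names introduced for illustration.

```python
class TaintTracker:
    """Toy taint-tag store with untaint dependencies (hypothetical, for illustration)."""

    def __init__(self):
        self.tainted = set()      # tags currently considered tainted
        self.untaint_with = {}    # tag -> tags that must be untainted together with it

    def taint(self, tag):
        self.tainted.add(tag)

    def add_dependency(self, derived, source):
        # When `derived` (e.g. n = strlen(cmd)) is untainted, untaint `source` (cmd) too.
        self.untaint_with.setdefault(derived, set()).add(source)

    def untaint(self, tag):
        # Iterative walk so that dependency chains (and cycles) are handled safely.
        stack, seen = [tag], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            self.tainted.discard(current)
            stack.extend(self.untaint_with.get(current, ()))

    def is_tainted(self, tag):
        return tag in self.tainted


def analyze(with_dependency):
    """Return True if a strcpy(dst, cmd) sink would still raise an alert."""
    tracker = TaintTracker()
    tracker.taint("cmd")   # user input
    tracker.taint("n")     # n = strlen(cmd); strlen was not analyzed, so n has its own tag
    if with_dependency:
        tracker.add_dependency("n", "cmd")
    tracker.untaint("n")   # the path where the bound check on n succeeds
    return tracker.is_tainted("cmd")


print(analyze(with_dependency=False))  # True: cmd stays tainted, false positive
print(analyze(with_dependency=True))   # False: cmd is untainted together with n
```

Without the dependency, the bound check on n does not clear the tag of cmd, so the copy is still reported. With it, the alert disappears on the checked path.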
Dialogue: 0,0:33:07.73,0:33:12.23,Default,,0000,0000,0000,,Here's the bibliography that I used\Nthroughout the presentation and I'm gonna Dialogue: 0,0:33:12.23,0:33:12.96,Default,,0000,0000,0000,,take questions. Dialogue: 0,0:33:12.96,0:33:18.43,Default,,0000,0000,0000,,*applause* Dialogue: 0,0:33:18.43,0:33:27.37,Default,,0000,0000,0000,,Herald: So thank you, Nilo, for a very\Ninteresting talk. If you have questions, Dialogue: 0,0:33:27.37,0:33:32.47,Default,,0000,0000,0000,,we have three microphones: one, two and\Nthree. If you have a question, please go Dialogue: 0,0:33:32.47,0:33:37.68,Default,,0000,0000,0000,,ahead to the microphone and we'll take your\Nquestion. Yes. Microphone number two. Dialogue: 0,0:33:37.68,0:33:42.00,Default,,0000,0000,0000,,Q: Do you rely on imports from libc or\Nsomething like that or do you have some Dialogue: 0,0:33:42.00,0:33:46.73,Default,,0000,0000,0000,,issues with like statically linked\Nbinaries, stripped binaries or is it all Dialogue: 0,0:33:46.73,0:33:51.90,Default,,0000,0000,0000,,semantic analysis of a function?\NNilo: So. Okay. We use angr. So for Dialogue: 0,0:33:51.90,0:33:57.28,Default,,0000,0000,0000,,example, if you have an indirect call, we\Nuse angr to figure out what the target is. Dialogue: 0,0:33:57.28,0:34:02.63,Default,,0000,0000,0000,,And to answer your question, like if we\Nuse libc: some CPFs do. For instance, the Dialogue: 0,0:34:02.63,0:34:08.31,Default,,0000,0000,0000,,environment CPF checks if the\Nsetenv or getenv functions are called. But Dialogue: 0,0:34:08.31,0:34:12.87,Default,,0000,0000,0000,,also we use the semantic CPF, which is\Nbasically for cases where information is Dialogue: 0,0:34:12.87,0:34:17.69,Default,,0000,0000,0000,,missing, like there is no such thing as\Nlibc or some vendors reimplemented their Dialogue: 0,0:34:17.69,0:34:21.98,Default,,0000,0000,0000,,own functions. 
We use the CPF to actually\Ntry to understand the semantics of the Dialogue: 0,0:34:21.98,0:34:25.89,Default,,0000,0000,0000,,function and understand if it's, for\Nexample, a custom setenv. Dialogue: 0,0:34:25.89,0:34:29.90,Default,,0000,0000,0000,,Q: Yeah, thanks.\NHerald: Microphone number three. Dialogue: 0,0:34:29.90,0:34:36.90,Default,,0000,0000,0000,,Q: In embedded environments you often have\Nalso that the getter might work on a DMA, Dialogue: 0,0:34:36.90,0:34:43.23,Default,,0000,0000,0000,,some kind of vendor driver on a DMA. Are\Nyou considering this? And second part of Dialogue: 0,0:34:43.23,0:34:47.79,Default,,0000,0000,0000,,the question, how would you then\Ndistinguish this from your generic IPC? Dialogue: 0,0:34:47.79,0:34:52.50,Default,,0000,0000,0000,,Because I can imagine that they look very\Nsimilar in the actual code. Dialogue: 0,0:34:52.50,0:34:58.75,Default,,0000,0000,0000,,Nilo: So if I understand your question\Ncorrectly, you mention a case of MMIO where Dialogue: 0,0:34:58.75,0:35:03.96,Default,,0000,0000,0000,,some data is retrieved directly from some\Naddress in memory. So what we found is Dialogue: 0,0:35:03.96,0:35:08.43,Default,,0000,0000,0000,,that these addresses are usually hardcoded\Nsomewhere. So the vendor knows that, for Dialogue: 0,0:35:08.43,0:35:13.28,Default,,0000,0000,0000,,example, from this address A to this\Naddress B there is some data from Dialogue: 0,0:35:13.28,0:35:18.86,Default,,0000,0000,0000,,this peripheral. So when we find\Nsome hardcoded address, we think that this Dialogue: 0,0:35:18.86,0:35:21.69,Default,,0000,0000,0000,,is some read of some interesting\Ndata. Dialogue: 0,0:35:21.69,0:35:28.07,Default,,0000,0000,0000,,Q: Okay. And this would be also\Ndistinguishable from your sort of CPF, the Dialogue: 0,0:35:28.07,0:35:32.18,Default,,0000,0000,0000,,generic CPF would be distinguishable...\NNilo: Yeah. Yeah, yeah. 
Dialogue: 0,0:35:32.18,0:35:35.78,Default,,0000,0000,0000,,Q: ...from a DMA driver by using this\Nfixed address assuming. Dialogue: 0,0:35:35.78,0:35:39.83,Default,,0000,0000,0000,,Nilo: Yeah. That's what the semantic CPF\Ndoes, among the other things. Dialogue: 0,0:35:39.83,0:35:41.34,Default,,0000,0000,0000,,Q: Okay. Thank you.\NNilo: Sure. Dialogue: 0,0:35:41.34,0:35:43.86,Default,,0000,0000,0000,,Herald: Another question for microphone\Nnumber 3. Dialogue: 0,0:35:43.86,0:35:46.12,Default,,0000,0000,0000,,Q: What's the license for Karonte?\NNilo: Sorry? Dialogue: 0,0:35:46.12,0:35:51.13,Default,,0000,0000,0000,,Q: I checked the software license, I\Nchecked the git repository and there is no Dialogue: 0,0:35:51.13,0:35:53.44,Default,,0000,0000,0000,,license like at all.\NNilo: That is a very good question. I Dialogue: 0,0:35:53.44,0:36:00.61,Default,,0000,0000,0000,,haven't thought about it yet. I will.\NHerald: Any more questions from here or Dialogue: 0,0:36:00.61,0:36:04.41,Default,,0000,0000,0000,,from the Internet? Okay. Then a big round\Nof applause to Nilo again for your talk. Dialogue: 0,0:36:04.41,0:36:24.82,Default,,0000,0000,0000,,*postroll music* Dialogue: 0,0:36:24.82,0:36:31.63,Default,,0000,0000,0000,,Subtitles created by many many volunteers and\Nthe c3subtitles.de team. Join us, and help us!