Music

Herald: The next talk is about how risky the software you use is. So you may have heard about Trump versus a Russian security company. We won't judge this, we won't comment on this, but we dislike the prejudgments of this case. Tim Carstens and Parker Thompson will tell you a little bit more about how risky the software you use is. Tim Carstens is CITL's acting director and Parker Thompson is CITL's lead engineer. Please welcome with a very, very warm applause: Tim and Parker! Thanks.

Applause

Tim Carstens: Howdy, howdy. So my name is Tim Carstens. I'm the acting director of the Cyber Independent Testing Lab. That's four words there; we'll talk about all four today, especially "cyber". With me today is our lead engineer Parker Thompson. Not on stage are our other collaborators: Patrick Stach, Sarah Zatko, and, present in the room but not on stage, Mudge. So today we're going to be talking about our work, the lead-in. The introduction that was given is phrased in terms of Kaspersky and all of that; I'm not gonna be speaking about Kaspersky, and I guarantee you I'm not gonna be speaking about my president. Right, yeah? Okay. Thank you.

Applause

All right, so why don't we go ahead and kick off: I'll mention now, parts of this presentation are going to be quite technical.
Not most of it, and I will always include analogies and all these other things if you are here in security but not a bit-twiddler. But if you do want to be able to review some of the technical material, if I go through it too fast, if you like to read, if you're a mathematician or a computer scientist: our slides are already available for download at this site here. We thank our partners at power door for getting that set up for us. Let's get started on the real material here. All right, so we are CITL: a nonprofit organization based in the United States, founded by our chief scientist Sarah Zatko and our board chair Mudge. And our mission is a public-good mission. We are hackers, but our mission here is actually to look out for people who do not know very much about machines, or as much as the other hackers do. Specifically, we seek to improve the state of software security by providing the public with accurate reporting on the security of popular software, right? And so there was a mouthful for you. But no doubt, no doubt, every single one of you has received questions of the form: what do I run on my phone, what do I do with this, what do I do with that, how do I protect myself, all these other things. Lots of people in the general public are looking for agency in computing.
No one's offering it to them, and so we're trying to go ahead and provide a forcing function on the software field in order to, you know, again be able to enable consumers and users and all these things. Our social-good work is funded largely by charitable monies from the Ford Foundation, whom we thank a great deal, but we also have major partnerships with Consumer Reports, which is a major organization in the United States that generally, broadly, looks at consumer goods for safety and performance. We also partner with The Digital Standard, which would probably be of great interest to many people here at Congress, as it is a holistic standard for protecting user rights. We'll talk about some of the work that goes into those things here in a bit, but first I want to give the big picture of what it is we're really trying to do, in one short little sentence. Something like this, but for security, right? What are the important facts, how does it rate, you know, is it easy to consume, is it easy to go ahead and look and say this thing is good, this thing is not good. Something like this, but for software security. Sounds hard, doesn't it? So I want to talk a little bit about what I mean by "something like this". There are lots of consumer outlook and watchdog and protection groups, some private, some government, which are looking to do this for various things that are not software security.
And you can see some examples here that are big in the United States. I happen to not like these as much as some of the newer consumer labels coming out of the EU, but nonetheless they are examples of the kinds of things people have done in other fields, fields that are not security, to try to achieve that same end. And when these things work well, it is for three reasons. One: it has to contain the relevant information. Two: it has to be based in fact; we're not talking opinions, this is not a book club or something like that. And three: it has to be actionable. You have to be able to know how to make a decision based on it. How do you do that for software security? So the rest of the talk is going to go in three parts. First, we're going to give a bit of an overview of the more consumer-facing side of what we do: look at some data that we have reported on earlier and all these other kinds of good things. We're then going to go ahead and get terrifyingly, terrifyingly technical. And then after that we'll talk about tools to actually implement all this stuff. The technical part comes before the tools, so that tells you how terrifyingly technical we're gonna get. It's gonna be fun, right? So how do you do this for software security: a consumer version.
So, if you set forth to the task of trying to measure software security: many people here probably do work in the security field, perhaps as consultants doing reviews; certainly I used to. Then probably what you're thinking to yourself right now is that there are lots and lots and lots of things that affect the security of a piece of software. Some of which you're only gonna see if you go reversing, and some of which are just, you know, kicking around on the ground waiting for you to notice, right? So we're going to talk about both of those kinds of things that you might measure. But here you see these giant charts. On the left we have Microsoft Excel on OS X; on the right, Google Chrome for OS X. This is a couple of years old at this point, maybe one and a half years old. I'm not expecting you to be able to read these; the real point is to say: look at all of the different things you can measure very easily. How do you distill it, how do you boil it down, right? So this is the opposite of a good consumer safety label. If you've ever done any consulting, this is the kind of report you hand a client to tell them how good their software is, right? It's the opposite of consumer grade.
But the reason I'm showing it here is because, you know, I'm gonna call out some things, and maybe you can't process all of this because it's too much material. But I'm gonna call out some things, and once I call them out, just like NP, you're gonna recognize them instantly. So for example, Excel, at the time of this review: look at this column of dots. What are these dots telling you? They're telling you: look at all these libraries; all of them are 32-bit only. Not 64 bits, not 64 bits. Take a look at Chrome: exact opposite, exact opposite, a 64-bit binary, right? What are some other things? Excel, again, on OS X: maybe you can see these danger warning signs that go straight up the whole thing. That's the absence of major heap-protection flags in the binary headers. We'll talk about what that means exactly in a bit. But also, if you hop over here you'll see: yeah, Chrome has all the different heap protections that a binary might enable, on OS X that is, but it also has more dots in this column here off to the right. And what do those dots represent? Those dots represent functions, functions that historically have been a source of trouble; functions that are very hard to call correctly. If you're a C programmer, the "gets" function is a good example. But there are lots of them. And you can see here that Chrome doesn't mind; it uses them all a bunch. And Excel, not so much.
And if you know the history of Microsoft and the Trustworthy Computing initiative and the SDL and all of that, you will know that a very long time ago Microsoft made a decision. They said: we're gonna start purging some of these risky functions from our code bases, because we think it's easier to ban them than to teach our devs to use them correctly. And you see that reverberating out in their software. Google, on the other hand, says: yeah, those functions can be dangerous to use, but if you know how to use them they can be very good, and so they're permitted. The point all of this is building to is that if you start by just measuring every little thing that your static analyzers can detect in a piece of software, two things happen. One: you wind up with way more data than you can show in a slide. And two: the engineering process, the software development life cycle that went into the software, will leave behind artifacts that tell you something about the decisions that went into designing that engineering process. And so, you know, Google for example: quite rigorous as far as hitting, you know, "gcc -" and then enabling all of the compiler protections. Microsoft, maybe less good at that, but much more rigid in things that were very popular ideas when they introduced Trustworthy Computing, all right. So the big takeaway from this material is that, again, the software engineering process results in artifacts in the software that people can find.
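To make that concrete: the banned-function artifact just described can be checked mechanically. Here is a minimal sketch; the ban list is illustrative (classic hard-to-use libc functions), not CITL's actual list, and in practice you would pull the import list out of the binary itself rather than passing it in by hand.

```python
# Sketch: flag imports of historically risky libc functions in a binary's
# dynamic symbol table. The ban list below is illustrative only.
RISKY_FUNCTIONS = {"gets", "strcpy", "strcat", "sprintf", "vsprintf", "scanf"}

def find_risky_imports(imported_symbols):
    """Return the risky functions present among a binary's imports, sorted."""
    return sorted(set(imported_symbols) & RISKY_FUNCTIONS)

# In practice the import list would come from the binary, e.g. by parsing
# the output of `nm -D --undefined-only /path/to/binary`.
print(find_risky_imports(["printf", "gets", "memcpy", "strcpy"]))
```

Run against a whole filesystem of binaries, a check like this is exactly the kind of low-cost, repeatable measurement that exposes the SDL-style "purge the risky functions" decision in the shipped artifacts.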
All right. Okay, so that's a whole bunch of data; certainly it's not a consumer-friendly label. So how do you start to get in towards the consumer zone? Well, the main defect of the big reports that we just saw is that it's too much information. It's very dense on data, but it's very hard to distill it to the "so what" of it, right? And so this here is one of our earlier attempts to go ahead and do that distillation. What are these charts, how did we come up with these? Well, on the previous slide we saw all these different factors that you can analyze in software; basically, here's how we arrive at this. For each of those things: pick a weight. Go ahead and compute a score, average against the weights: ta-da, now you have some number. You can do that for each of the libraries in the piece of software. And if you do that for each of the libraries in the software, you can then go ahead and produce these histograms to show, you know, this percentage of the DLLs had a score in this range. Boom, there's a bar, right? How do you pick those weights? We'll talk about that in a sec; it's very technical. But the takeaway, though, is that you wind up with these charts. Now, I've obscured the labels, and the reason I've done that is because I don't really care that much about the actual counts. I want to talk about the shapes, the shapes of these charts: it's a qualitative thing.
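The weight-and-average scheme just described can be sketched in a few lines. Everything here is invented for illustration: the feature names, the weights, and the bucket width are stand-ins, not the values CITL actually uses.

```python
# Sketch of per-library scoring: weighted average of feature flags,
# scaled to 0-100, then bucketed into a histogram. Weights are made up.
FEATURE_WEIGHTS = {"aslr": 3.0, "relro": 2.0, "no_risky_funcs": 1.0, "is_64bit": 1.0}

def score_library(features):
    """Weighted average of boolean feature flags, scaled to 0-100."""
    total = sum(FEATURE_WEIGHTS.values())
    got = sum(w for name, w in FEATURE_WEIGHTS.items() if features.get(name))
    return round(100.0 * got / total)

def histogram(scores, bucket_width=25):
    """Count scores per bucket: 0-24, 25-49, 50-74, 75-100."""
    buckets = {}
    for s in scores:
        lo = min(s // bucket_width, 100 // bucket_width - 1) * bucket_width
        buckets[lo] = buckets.get(lo, 0) + 1
    return buckets

libs = [
    {"aslr": True, "relro": True, "no_risky_funcs": True, "is_64bit": True},
    {"aslr": True, "relro": False, "no_risky_funcs": False, "is_64bit": True},
    {"aslr": False, "relro": False, "no_risky_funcs": False, "is_64bit": False},
]
scores = [score_library(lib) for lib in libs]
print(scores, histogram(scores))  # [100, 57, 0] {75: 1, 50: 1, 0: 1}
```

The histogram's shape is what the charts on the slide are showing: a well-hardened system piles its libraries into the rightmost bucket.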
So here: good scores appear on the right, bad scores appear on the left. The histogram measures all the libraries and components, and so a very secure piece of software in this model manifests as a tall bar far to the right. And you can see a clear example in our custom Gentoo build. Anyone here who is a Gentoo fan knows: hey, I'm going to install this thing, I think I'm going to go ahead and turn on every single one of those flags; and lo and behold, if you do that, yeah, you wind up with a tall bar far to the right. Here's Ubuntu 16; I bet it's 16.04 but I don't recall exactly, 16 LTS. Here you see a lot of tall bars to the right. Not quite as consolidated as a custom Gentoo build, but that makes sense, doesn't it? Because you don't do your whole Ubuntu build yourself. Now I want to contrast. I want to contrast. So over here on the right we see, in the same model, an analysis of the firmware obtained from two smart televisions: last year's models from Samsung and LG, and here are the model numbers. We did this work in concert with Consumer Reports. And what do you notice about these histograms, right? Are the bars tall and to the right? No, they look almost normal; not quite, but that doesn't really matter. The main thing that matters is that this is the shape you would expect to get if you were basically playing a random game to decide what security features to enable in your software.
This is the shape of not having a security program, is my bet. That's my bet. And so what do you see? You see heavy concentration here in the middle, right, that seems fair, and it tails off. On the Samsung nothing scored all that great; same on the LG. Both of them are, you know, running their respective operating systems, and they're basically just inheriting whatever security came from whatever open-source thing they forked, right? So this is the kind of message, this right here is the kind of thing that we exist for. This is us producing charts showing that the current practices in the not-so-consumer-friendly space of running your own Linux distros far exceed the products being delivered, certainly in this case in the smart-TV market. But I think you might agree with me: it's much worse than this. So let's dig into that a little bit more; I have a different point that I want to make about that same data set. So this table here is again looking at the LG, Samsung, and Gentoo Linux installations. And in this table we're just pulling out some of the easy-to-identify security features you might enable in a binary, right? So: percentage of binaries with address space layout randomization. Let's talk about that. On our Gentoo build it's over 99%. That also holds for the Amazon Linux AMI; it holds in Ubuntu. ASLR is incredibly common in modern Linux.
And despite that, fewer than 70 percent of the binaries on the LG television had it enabled. Fewer than 70 percent. And the Samsung was doing, you know, better than that, I guess, but 80 percent is pretty disappointing when a default install of a mainstream Linux distro is going to get you 99, right? And it only gets worse, it only gets worse, right? RELRO support: if you don't know what that is, that's okay, but if you do, look at this abysmal coverage coming out of these IoT devices. Very sad. And you see it over and over and over again. I'm showing this because some people in this room, or watching this video, ship software. And I have a message, I have a message to those people who ship software who aren't working on, say, Chrome or any of the other big-name Pwn2Own kinds of targets. Look at this: you can be leading the pack by mastering the fundamentals. You can be leading the pack by mastering the fundamentals. This is a point that, as a security field, we really need to be driving home. You know, one of the things that we're seeing here in our data is that if you're the vendor who is shipping the product everyone has heard of in the security field, then maybe your game is pretty decent, right? If you're shipping, say, Windows, or if you're shipping Firefox or whatever.
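The per-binary checks behind that table, PIE (the prerequisite for effective ASLR of the executable itself) and RELRO, can be approximated with a short script. This is a simplified sketch that infers the flags from readelf text output; real tooling, CITL's included, parses the binary formats directly, and the string matching here is deliberately crude.

```python
# Sketch: infer basic ELF hardening flags from `readelf -h -l -d` output.
# The parsing is illustrative; production tools inspect the ELF structures.
def check_hardening(readelf_output):
    """Return a dict of hardening flags inferred from readelf text output."""
    pie = "Type:" in readelf_output and "DYN" in readelf_output
    relro = "GNU_RELRO" in readelf_output
    full_relro = relro and "BIND_NOW" in readelf_output
    return {"pie": pie, "relro": relro, "full_relro": full_relro}

# Typical use (left commented so the sketch stays self-contained):
# import subprocess
# out = subprocess.run(["readelf", "-h", "-l", "-d", "/bin/ls"],
#                      capture_output=True, text=True).stdout
# print(check_hardening(out))

sample = "Type: DYN (Shared object file)\n  GNU_RELRO\n  (BIND_NOW)\n"
print(check_hardening(sample))  # {'pie': True, 'relro': True, 'full_relro': True}
```

Loop that over every binary in a firmware image and you get exactly the "percentage of binaries with feature X" columns from the table.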
But if you're doing one of these things where people are just kind of beating you up for default passwords, then your problems go way further than just default passwords, right? Like, the house is messy; it needs to be cleaned, needs to be cleaned. So for the rest of the talk, like I said, we're going to be discussing a lot of other things that amount to getting a peek behind the curtain at where some of these things come from, and getting very specific about how this business works. But if you're interested in more of the high-level material, especially if you're interested in interesting results and insights, some of which I'm going to have here later, I really encourage you to take a look at the talk from this past summer by our chief scientist Sarah Zatko, which is predominantly on the topic of surprising results in the data. Today, though, this being our first time presenting here in Europe, we figured we would take more of an overarching kind of view: what we're doing and why we're excited about it and where it's headed. So we're about to move into a little bit of the underlying theory, you know: why do I think it's reasonable to even try to measure the security of software from a technical perspective. But before we can get into that, I need to talk a little bit about our goals, so that the decisions and the theory, the motivation, are clear, right?
Our goals are really simple; it's a very easy organization to run because of that. Goal number one: remain independent of vendor influence. We are not the first organization to purport to be looking out for the consumer. But unlike many of our predecessors, we are not taking money from the people we review, right? Seems like some basic stuff. Seems like some basic stuff, right? Thank you, okay. Two: automated, comparable, quantitative analysis. Why automated? Well, we need our test results to be reproducible. If Tim goes in, opens up your software in IDA, and finds a bunch of stuff, that's not a very repeatable kind of standard for things. And so we're interested in things which are automated. We'll talk about that; maybe a few hackers in here know how hard that is. But then last, we're also acting as a watchdog: we're protecting the interests of the user, the consumer, however you would like to look at it. But we also have three non-goals, three non-goals that are equally important. One: we have a non-goal of finding and disclosing vulnerabilities. I reserve the right to find and disclose vulnerabilities. But that's not my goal, it's not my goal. Another non-goal is to tell software vendors what to do. If a vendor asks me how to remediate their terrible score, I will tell them what we are measuring, but I'm not there to help them remediate.
It's on them to be able to ship a secure product without me holding their hand. We'll see. And then non-goal three: perform free security testing for vendors. Our testing happens after you release, because when you release your software you are telling people it is ready to be used. Is it really, though? Is it really, though, right?

Applause

Yeah, thank you. Yeah, so we are not there to give you a preview of what your score will be. There is no sum of money you can hand me that will get you an early preview of what your score is. You can try me, you can try me: there's a fee for trying me. There's a fee for trying me. But I'm not gonna look at your stuff until I'm ready to drop it, right? Yeah, bitte, yeah. All right. So, moving into this theory territory. Three big questions, three big questions that need to be addressed if you want to do our work efficiently. One: what works? What works for improving security; what are the things that you really want to see in software? Two: how do you recognize when it's being done? It's no good if someone hands you a piece of software and says, "I've done all the latest things", and it's a complete black box. If you can't check the claim, the claim is as good as false, in practical terms, period, right? Software has to be reviewable, or, a priori, I'll think you're full of it. And then three: who's doing it? Of all the things that work, that you can recognize, who's actually doing it?
You know, let's go ahead: our field is famous for ruining people's holidays and weekends over Friday bug disclosures, you know, New Year's Eve bug disclosures. I would like us to also be famous for calling out those teams and those software organizations which are being as good as the bad guys are being bad, yeah? So, provide someone an incentive to be maybe happy to see us for a change, right? Okay, so thank you. Yeah, all right. So how do we actually pull these things off; the basic idea. So, I'm going to get into some deeper theory; if you're not a theorist, I want you to focus on this slide. And I'm gonna bring it back; it's not all theory from here on out after this. But if you're not a theorist, I really want you to focus on this slide. The basic motivation, the basic motivation behind what we're doing, the technical motivation, why we think that it's possible to measure and report on security: it all boils down to this, right? So we start with a thought experiment, a Gedankenexperiment, right? Given a piece of software, we can ask, one: overall, how secure is it? Kind of a vague question, but you can imagine, you know, there are versions of that question. And two: what are its vulnerabilities? Maybe you want to nitpick with me about what the word vulnerability means, but broadly, you know, this is a much more specific question, right? And here's the enticing thing: the first question appears to ask for less information than the second question.
And maybe, if we were taking bets, I would put my money on: yes, it actually does ask for less information. What do I mean by that? What do I mean by that? Well, let's say that someone told you all of the vulnerabilities in a system, right? They said, "Hey, I got them all," right? You're like: all right, that's cool, that's cool. And if someone asks you, hey, how secure is this system, you can give them a very precise answer. You can say it has N vulnerabilities, and they're of this kind, and all this stuff, right? So certainly the second question is enough to answer the first. But is the reverse true? Namely, if someone were to tell you, for example, "Hey, this piece of software has exactly 32 vulnerabilities in it," does that make it easier to find any of them? Right? There's room to maybe do that using some algorithms that are not yet in existence. Certainly the computer scientists in here are saying, "Well, you know, yeah, maybe counting the number of SAT solutions doesn't help you practically find solutions. But it might, and we just don't know." Okay, fine, fine, fine. Maybe these things are the same. But my experience in security, and the experience of many others, perhaps, is that they probably aren't the same question. And this motivates what I'm calling here Zatko's question, which is basically asking for an algorithm that demonstrates that the first question is easier than the second question, right?
So, Zatko's question: develop a heuristic which can efficiently answer one, but not necessarily two. If you're looking for a metaphor, if you want to know why I care about this distinction, I want you to think about certain controversial technologies; maybe think about, say, nuclear technology, right? An algorithm that answers one, but not two, is a very safe algorithm to publish. A very safe algorithm to publish indeed. Okay, Claude Shannon would like more information; happy to oblige. Let's take a look at this question from a different perspective, maybe a more hands-on perspective: the hacker perspective, right? If you're a hacker and you're watching me up here, and I'm waving my hands around and I'm showing you charts, maybe you're thinking to yourself: yeah, boy, what do you got? Right? How does this actually go? And maybe what you're thinking to yourself is that, you know, finding good vulns is an artisan craft, right? You're in IDA, you know, you're reversing away, you're doing all these things, I don't know, all that stuff. And, you know, this kind of clever game, cleverness, is not a thing that feels very automatable. But, you know, on the other hand, there are a lot of tools that do automate things, and so it's not completely not automatable.
And if you're into fuzzing then perhaps 00:24:57.110 --> 00:25:01.500 you are aware of this very simple observation, which is that if your harness 00:25:01.500 --> 00:25:04.940 is perfect, if you really know what you're doing, if you have a decent fuzzer, then in 00:25:04.940 --> 00:25:08.840 principle fuzzing can find every single problem. You have to be able to look for 00:25:08.840 --> 00:25:13.870 it, you have to be able to harness for it, but in principle it will, right. So the hacker 00:25:13.870 --> 00:25:19.210 perspective on Zatko's question is maybe of two minds: on the one hand, assessing 00:25:19.210 --> 00:25:22.399 security is a game of cleverness, but on the other hand, we're kind of right now at 00:25:22.399 --> 00:25:25.880 the cusp of having some game-changing tech ready to go - maybe you're saying, like, 00:25:25.880 --> 00:25:29.580 fuzzing is not at the cusp, I promise it's just at the cusp. We haven't seen all that 00:25:29.580 --> 00:25:33.690 fuzzing has to offer, right, and so maybe there's room, maybe there's room for some 00:25:33.690 --> 00:25:41.200 automation to be possible in pursuit of Zatko's question. Of course, there are 00:25:41.200 --> 00:25:45.920 many challenges still in, you know, using existing hacker technology. Mostly of the 00:25:45.920 --> 00:25:49.570 form of various open questions. For example, if you're into fuzzing, you know, 00:25:49.570 --> 00:25:53.039 hey: identifying unique crashes. There's an open question. We'll talk about some of 00:25:53.039 --> 00:25:57.060 those, we'll talk about some of those. But I'm going to offer another perspective 00:25:57.060 --> 00:26:01.490 here: so maybe you're not in the business of doing software reviews but you know a 00:26:01.490 --> 00:26:05.929 little computer science. And maybe that computer science has you wondering what's 00:26:05.929 --> 00:26:12.679 this guy talking about, right. I'm here to acknowledge that. 
So whatever you think 00:26:12.679 --> 00:26:16.610 the word security means: I've got a list of questions up here. Whatever you think 00:26:16.610 --> 00:26:19.502 the word security means, probably some of these questions are relevant to your 00:26:19.502 --> 00:26:23.299 definition. Right. Does the software have a hidden backdoor 00:26:23.299 --> 00:26:26.600 or any kind of hidden functionality, does it handle crypto material correctly, etc., 00:26:26.600 --> 00:26:30.429 so forth. Anyone in here who knows some computability theory knows that 00:26:30.429 --> 00:26:34.240 every single one of these questions, and many others like them, are undecidable due 00:26:34.240 --> 00:26:37.960 to reasons essentially no different than the reason the halting problem is 00:26:37.960 --> 00:26:41.330 undecidable, which is to say due to reasons essentially first identified and 00:26:41.330 --> 00:26:46.019 studied by Alan Turing a long time before we had microarchitectures and all these 00:26:46.019 --> 00:26:50.350 other things. And so, the computability perspective says that, you know, whatever 00:26:50.350 --> 00:26:54.640 your definition of security is, ultimately you have this recognizability problem: a 00:26:54.640 --> 00:26:57.900 fancy way of saying that algorithms won't be able to recognize secure software 00:26:57.900 --> 00:27:02.690 because of the undecidability of these issues. The takeaway, the takeaway is that 00:27:02.690 --> 00:27:07.090 the computability angle on all of this says: anyone who's in the business that 00:27:07.090 --> 00:27:12.394 we're in has to use heuristics. You have to, you have to. 00:27:15.006 --> 00:27:24.850 All right, this guy gets it. All right, so on the tech side, our last technical 00:27:24.850 --> 00:27:28.380 perspective that we're going to take now is certainly the most abstract, which is 00:27:28.380 --> 00:27:32.220 the Bayesian perspective, right. 
So if you're a frequentist, you need to get with 00:27:32.220 --> 00:27:37.490 the times, you know, it's all Bayesian now. So, let's talk about this 00:27:37.490 --> 00:27:43.899 for a bit. Only two slides of math, I promise, only two! So, let's say that I 00:27:43.899 --> 00:27:47.120 have some corpus of software. Perhaps it's a collection of all modern browsers, 00:27:47.120 --> 00:27:50.510 perhaps it's the collection of all the packages in the Debian repository, perhaps 00:27:50.510 --> 00:27:53.990 it's everything on github that builds on this system, perhaps it's a hard drive 00:27:53.990 --> 00:27:58.159 full of warez that some guy mailed you, right? You have some corpus of software, 00:27:58.159 --> 00:28:02.980 and for a random program in that corpus we can consider this probability: the 00:28:02.980 --> 00:28:07.180 probability distribution of which software is secure versus which is not. For reasons 00:28:07.180 --> 00:28:11.080 described in the computability perspective, this number is not a 00:28:11.080 --> 00:28:17.130 computable number for any reasonable definition of security. So that's neat, 00:28:17.130 --> 00:28:21.220 and so, in practical terms, if you want to do some probabilistic reasoning, you 00:28:21.220 --> 00:28:27.509 need some surrogate for that, and so we consider this here. So, instead of 00:28:27.509 --> 00:28:31.000 considering the probability that a piece of software is secure, a non-computable, 00:28:31.000 --> 00:28:35.960 non-verifiable claim, we take a look here at this indexed collection of 00:28:35.960 --> 00:28:38.840 probabilities. This is a countably infinite family of probability 00:28:38.840 --> 00:28:44.330 distributions: basically, P sub h,k is just the probability that, for a random piece of 00:28:44.330 --> 00:28:50.330 software in the corpus, h work units of fuzzing will find no more than k unique 00:28:50.330 --> 00:28:56.130 crashes, right. And why is this relevant? 
Well, at the bottom we have this analytic 00:28:56.130 --> 00:28:59.389 observation, which is that in the limit as h goes to infinity you're basically 00:28:59.389 --> 00:29:03.679 saying: "Hey, you know, if I fuzz this thing for infinity times, you know, what's 00:29:03.679 --> 00:29:07.549 that look like?" And, essentially, here we have analytically that this should 00:29:07.549 --> 00:29:12.970 converge. The P sub h,1 should converge to the probability that a piece of software 00:29:12.970 --> 00:29:16.331 just simply cannot be made to crash. Not the same thing as being secure, but 00:29:16.331 --> 00:29:23.730 certainly not a small concern relevant to security. So, none of that stuff actually 00:29:23.730 --> 00:29:30.620 was Bayesian yet, so we need to get there. And so here we go, right: so, the previous 00:29:30.620 --> 00:29:35.080 slide described a probability distribution measured based on fuzzing. But fuzzing is 00:29:35.080 --> 00:29:39.130 expensive, and it is also not an answer to Zatko's question, because it finds 00:29:39.130 --> 00:29:43.759 vulnerabilities, it doesn't measure security in the general sense, and so 00:29:43.759 --> 00:29:47.110 here's where we make the jump to conditional probabilities: Let M be some 00:29:47.110 --> 00:29:51.929 observable property of software: has ASLR, has RELRO, calls these functions, doesn't 00:29:51.929 --> 00:29:56.770 call those functions... take your pick. For random s in S we now consider these 00:29:56.770 --> 00:30:02.070 conditional probability distributions, and this is the same kind of probability as we 00:30:02.070 --> 00:30:08.379 had on the previous slide but conditioned on this observable being true, and this 00:30:08.379 --> 00:30:11.480 leads to the refined, the CITL variant of Zatko's question: 00:30:11.480 --> 00:30:17.120 Which observable properties of software satisfy that, when the software has 00:30:17.120 --> 00:30:22.590 property M, the probability of fuzzing being hard is very high? 
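As a concrete aside, the P sub h,k family described above can be estimated empirically from fuzzing records. A minimal sketch in Python; the ledger of crash counts below is invented purely for illustration and stands in for real corpus data:

```python
# Hypothetical fuzzing ledger: for each program in the corpus, the number
# of unique crashes found after h work units (here h in {1, 10, 100}).
# All numbers are invented for illustration.
crashes_after = {
    "prog_a": {1: 0, 10: 1, 100: 3},
    "prog_b": {1: 0, 10: 0, 100: 0},
    "prog_c": {1: 2, 10: 5, 100: 9},
    "prog_d": {1: 0, 10: 0, 100: 1},
}

def p_hk(h: int, k: int) -> float:
    """Empirical P sub h,k: the fraction of the corpus where h work units
    of fuzzing found no more than k unique crashes."""
    hits = sum(1 for prog in crashes_after.values() if prog[h] <= k)
    return hits / len(crashes_after)

# For example, p_hk(100, 0) is the fraction of programs that produced no
# crashes at all after 100 work units - here only prog_b, so 0.25.
```

The estimator is just a counting exercise; the interesting part is watching how the distribution shifts as h grows, which is what the limit observation above is about.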
That's what this 00:30:22.590 --> 00:30:27.121 version of the question asks, and here we mean log(h)/k is large; in 00:30:27.121 --> 00:30:31.590 other words: exponentially more fuzzing than you'd expect to find bugs. So this is 00:30:31.590 --> 00:30:36.350 the technical version of what we're after. All of this can be explored, you can 00:30:36.350 --> 00:30:40.340 brute-force your way to finding all of this stuff, and that's exactly what we're 00:30:40.340 --> 00:30:48.050 doing. So we're looking for all kinds of things, we're looking for all kinds of 00:30:48.050 --> 00:30:53.840 things that correlate with fuzzing having low yield on a piece of software, and 00:30:53.840 --> 00:30:57.360 there's a lot of ways in which that can happen. It could be that you are looking 00:30:57.360 --> 00:31:01.409 at a feature of software that literally prevents crashes. Maybe it's the never 00:31:01.409 --> 00:31:08.210 crash flag, I don't know. But most of the things I've talked about, ASLR, RELRO, etc. 00:31:08.210 --> 00:31:12.169 don't prevent crashes. In fact, ASLR can take non-crashing programs and make them 00:31:12.169 --> 00:31:16.849 crash. It's the number one reason vendors don't enable it, right? So why am 00:31:16.849 --> 00:31:20.079 I talking about ASLR? Why am I talking about RELRO? Why am I talking about all 00:31:20.079 --> 00:31:22.899 these things that have nothing to do with stopping crashes, when I'm claiming I'm 00:31:22.899 --> 00:31:27.399 measuring crashes? This is because, in the Bayesian perspective, correlation is not 00:31:27.399 --> 00:31:31.730 the same thing as causation, right? Correlation is not the same thing as 00:31:31.730 --> 00:31:35.340 causation. It could be that M's presence literally prevents crashes, but it could 00:31:35.340 --> 00:31:39.749 also be that, by some underlying coincidence, the things we're looking for 00:31:39.749 --> 00:31:43.600 are mostly only found in software that's robust against crashing. 
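The conditional version of that search can be sketched the same way: estimate the distribution separately for programs where an observable M holds and where it doesn't, and flag observables with a large gap. The corpus data and the `aslr` observable below are invented for illustration; they are not CITL's data:

```python
# Toy corpus: each program carries one observable (M = "has ASLR") and a
# crash count after h = 100 work units of fuzzing. Data invented.
corpus = {
    "prog_a": {"aslr": True,  "crashes": {100: 0}},
    "prog_b": {"aslr": True,  "crashes": {100: 1}},
    "prog_c": {"aslr": False, "crashes": {100: 7}},
    "prog_d": {"aslr": False, "crashes": {100: 2}},
}

def p_hk_given(h: int, k: int, predicate) -> float:
    """Empirical P sub h,k conditioned on an observable: the fraction of
    the matching programs where h work units found no more than k crashes."""
    subset = [p for p in corpus.values() if predicate(p)]
    return sum(1 for p in subset if p["crashes"][h] <= k) / len(subset)

# Observables worth reporting are those where the gap between the two
# conditionals is large: fuzzing tends to be low-yield when M holds.
gap = p_hk_given(100, 1, lambda p: p["aslr"]) \
    - p_hk_given(100, 1, lambda p: not p["aslr"])
```

In a real study the gap would be computed over many observables at once and across many (h, k) pairs; this is only the shape of the brute-force search mentioned above.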
00:31:43.600 --> 00:31:48.799 If you're looking for security, I submit to you that the difference doesn't matter. 00:31:48.799 --> 00:31:54.929 Okay, end of my math, danke. I will now go ahead and do this like really nice analogy 00:31:54.929 --> 00:31:59.279 for all those things that I just described, right. So we're looking for indicators of 00:31:59.279 --> 00:32:03.640 a piece of software being secure enough to be good for consumers, right. So here's an 00:32:03.640 --> 00:32:08.131 analogy. Let's say you're a geologist, you study minerals and all of that, and you're 00:32:08.131 --> 00:32:13.560 looking for diamonds. Who isn't, right? Want those diamonds! And like, how do you 00:32:13.560 --> 00:32:18.270 find diamonds? Even in places that are rich in diamonds, diamonds are not common. 00:32:18.270 --> 00:32:21.279 You don't just go walking around in your boots, kicking until your toe stubs on a 00:32:21.279 --> 00:32:27.049 diamond. You don't do that. Instead, you look for other minerals that are mostly 00:32:27.049 --> 00:32:31.860 only found near diamonds but are much more abundant in those locations than the 00:32:31.860 --> 00:32:37.960 diamonds. So, this is mineral science 101, I guess, I don't know. So, for example, 00:32:37.960 --> 00:32:41.389 you want to go find diamonds: put on your boots and go kicking until you find some 00:32:41.389 --> 00:32:46.110 chromite, look for some diopside, you know, look for some garnet. None of these 00:32:46.110 --> 00:32:50.340 things turn into diamonds, none of these things cause diamonds, but if you're 00:32:50.340 --> 00:32:54.020 finding good concentrations of these things, then, statistically, there's 00:32:54.020 --> 00:32:58.249 probably diamonds nearby. That's what we're doing. We're not looking for the 00:32:58.249 --> 00:33:02.570 things that cause good security per se. Rather, we're looking for the indicators 00:33:02.570 --> 00:33:08.349 that you have put the effort into your software, right? 
How's that working out 00:33:08.349 --> 00:33:15.070 for us? How's that working out for us? Well, we're still doing studies. It's, you 00:33:15.070 --> 00:33:18.490 know, early to say exactly, but we do have the following interesting coincidence: and 00:33:18.490 --> 00:33:24.789 so, here I have a collection of prices that somebody gave me for so- 00:33:24.789 --> 00:33:30.369 called underground exploits. And I can tell you these prices are maybe a little 00:33:30.369 --> 00:33:34.450 low these days, but if you work in that business, if you go to SyScan, if you do 00:33:34.450 --> 00:33:39.009 that kind of stuff, maybe you know that this is ballpark, it's ballpark. 00:33:39.009 --> 00:33:44.080 Alright, and, just a coincidence, maybe it means we're on the right track, I don't 00:33:44.080 --> 00:33:48.740 know, but it's an encouraging sign: When we run these programs through our 00:33:48.740 --> 00:33:53.060 analysis, our rankings more or less correspond to the actual prices that you 00:33:53.060 --> 00:33:58.279 encounter in the wild for access via these applications. Up above, I have one of our 00:33:58.279 --> 00:34:02.059 histogram charts. You can see here that Chrome and Edge in this particular model 00:34:02.059 --> 00:34:06.149 scored very close to the same, and it's a test model, so let's say they're 00:34:06.149 --> 00:34:11.480 basically the same. Firefox, you know, behind there a little 00:34:11.480 --> 00:34:15.040 bit. I don't have Safari on this chart because - these are all Windows applications 00:34:15.040 --> 00:34:21.091 - but the Safari score falls in between. So, lots of theory, lots of theory, lots 00:34:21.091 --> 00:34:27.920 of theory, and then we have this. So, we're going to go ahead now and hand off to our 00:34:27.920 --> 00:34:31.679 lead engineer, Parker, who is going to talk about some of the concrete stuff, the 00:34:31.679 --> 00:34:35.210 non-chalkboard stuff, the software stuff that actually makes this work. 
00:34:35.956 --> 00:34:40.980 Thompson: Yeah, so I want to talk about the process of actually doing it. Building 00:34:40.980 --> 00:34:45.050 the tooling that's required to collect these observables. Effectively, how do you 00:34:45.050 --> 00:34:50.560 go mining for indicator minerals? But first, the progression of 00:34:50.560 --> 00:34:55.810 where we are and where we're going. We initially broke this out into three major 00:34:55.810 --> 00:35:00.360 tracks of our technology. We have our static analysis engine, which started as a 00:35:00.360 --> 00:35:05.790 prototype, and we have now recently completed a much more mature and solid 00:35:05.790 --> 00:35:09.930 engine that's allowing us to be much more extensible, dig deeper into 00:35:09.930 --> 00:35:16.320 programs, and provide much deeper observables. Then, we have the data 00:35:16.320 --> 00:35:21.510 collection and data reporting. Tim showed some of our early stabs at this. We're 00:35:21.510 --> 00:35:25.450 right now in the process of building new engines to make the data more accessible 00:35:25.450 --> 00:35:30.150 and easy to work with, and hopefully more of that will be available soon. Finally, 00:35:30.150 --> 00:35:35.910 we have our fuzzer track. We needed to get some early data, so we played with some 00:35:35.910 --> 00:35:40.680 existing off-the-shelf fuzzers, including AFL, and, while that was fun, 00:35:40.680 --> 00:35:44.190 unfortunately it's a lot of work to manually instrument a lot of fuzzers for 00:35:44.190 --> 00:35:48.830 hundreds of binaries. So, we then built an automated solution 00:35:48.830 --> 00:35:52.930 that started to get us closer to having a fuzzing harness that could autogenerate 00:35:52.930 --> 00:35:57.840 itself, depending on the software and the software's behavior. But, right now, 00:35:57.840 --> 00:36:01.650 unfortunately, that technology showed us more deficiencies than it showed 00:36:01.650 --> 00:36:07.360 successes. 
So, we are now working on a much more mature fuzzer that will allow us 00:36:07.360 --> 00:36:12.780 to dig deeper into programs as we're running and collect very specific things 00:36:12.780 --> 00:36:21.260 that we need for our model and our analysis. But on to our analytic pipeline 00:36:21.260 --> 00:36:25.831 today. This is one of the most concrete components of our engine and one of the 00:36:25.831 --> 00:36:29.000 most fun! We effectively wanted some type of 00:36:29.000 --> 00:36:34.550 software hopper, where you could just pour programs in, installers, and then, on the 00:36:34.550 --> 00:36:39.560 other end, out come reports: fully annotated and actionable information that we can 00:36:39.560 --> 00:36:45.320 present to people. So, we went about the process of building a large-scale engine. 00:36:45.320 --> 00:36:50.500 It starts off with a simple REST API, where we can push software in, which then 00:36:50.500 --> 00:36:55.600 gets moved over to our computation cluster that effectively provides us a fabric to 00:36:55.600 --> 00:37:00.310 work with. It is made up of a lot of different software suites, starting off 00:37:00.310 --> 00:37:06.730 with our data processing, which is done by Apache Spark, and then moves over into 00:37:06.730 --> 00:37:12.910 data handling and data analysis in Spark, and then we have a common HDFS layer to 00:37:12.910 --> 00:37:17.530 provide a place for the data to be stored, and then a resource manager, YARN. All 00:37:17.530 --> 00:37:22.500 of that is backed by our compute and data nodes, which scale out linearly. That then 00:37:22.500 --> 00:37:27.590 moves into our data science engine, which is effectively Spark with Apache Zeppelin, 00:37:27.590 --> 00:37:30.480 which provides us a really fun interface where we can work with the data in an 00:37:30.480 --> 00:37:35.830 interactive manner while kicking off large-scale jobs into the cluster. 
And 00:37:35.830 --> 00:37:40.110 finally, this goes into our report generation engine. What this bought us 00:37:40.110 --> 00:37:46.030 was the ability to linearly scale and make that hopper bigger and bigger as we need, 00:37:46.030 --> 00:37:50.740 but also provide us a way to process data that doesn't fit in a single machine's 00:37:50.740 --> 00:37:54.110 RAM. You can push the instance sizes as large as you want, but we have 00:37:54.110 --> 00:38:00.300 datasets that blow past any single host's RAM. So this allows us to work with 00:38:00.300 --> 00:38:08.690 really large collections of observables. I want to dive down now into our actual 00:38:08.690 --> 00:38:13.160 static analysis. But first we have to explore the problem space, because it's a 00:38:13.160 --> 00:38:19.490 nasty one. Effectively, CITL's mission is to process as much software as 00:38:19.490 --> 00:38:25.790 possible. Hopefully all of it, but it's hard to get your hands on all the binaries 00:38:25.790 --> 00:38:29.260 that are out there. When you start to look at that problem you understand there's a 00:38:29.260 --> 00:38:34.830 lot of combinations: there's a lot of CPU architectures, there's a lot of operating 00:38:34.830 --> 00:38:38.610 systems, there's a lot of file formats, there's a lot of environments the software 00:38:38.610 --> 00:38:43.160 gets deployed into, and every single one of them has its own app 00:38:43.160 --> 00:38:47.320 armory features. And a feature can be specifically set for one combination 00:38:47.320 --> 00:38:51.671 but not on another, and you don't want to penalize a developer for not turning on a 00:38:51.671 --> 00:38:56.290 feature they had no access to ever turn on. So effectively, we need to solve this 00:38:56.290 --> 00:39:01.050 in a much more generic way. 
And so what we did is our static analysis engine 00:39:01.050 --> 00:39:04.630 effectively looks like a gigantic collection of abstraction libraries to 00:39:04.630 --> 00:39:12.390 handle binary programs. You take in some type of input file, be it ELF, PE, MachO, 00:39:12.390 --> 00:39:17.730 and then the pipeline splits. It goes off into two major analyzer classes: our 00:39:17.730 --> 00:39:22.360 format analyzers, which look at the software much like how a linker or loader 00:39:22.360 --> 00:39:26.600 would look at it. We want to understand how it's going to be loaded up, what types of 00:39:26.600 --> 00:39:30.680 armory features are going to be applied, and then we can run analyzers over that. In 00:39:30.680 --> 00:39:34.520 order to achieve that we need abstraction libraries that can provide us an abstract 00:39:34.520 --> 00:39:40.900 memory map, a symbol resolver, generic section properties. So all that feeds in, 00:39:40.900 --> 00:39:46.060 and then we run over a collection of analyzers to collect data and observables. 00:39:46.060 --> 00:39:49.650 Next we have our code analyzers; these are the analyzers that run over the code 00:39:49.650 --> 00:39:57.600 itself. We need to be able to look at every possible executable path. In order to do 00:39:57.600 --> 00:40:02.400 that we need to do function discovery, feed that into a control flow recovery 00:40:02.400 --> 00:40:07.880 engine, and then, as a post-processing step, dig through all of the possible metadata 00:40:07.880 --> 00:40:12.820 in the software, such as a switch table, or something like that, to get even 00:40:12.820 --> 00:40:20.770 deeper into the software. This then provides us a list of basic blocks, 00:40:20.770 --> 00:40:24.470 functions, instruction ranges. And it does so in an efficient manner, so we can process a 00:40:24.470 --> 00:40:30.550 lot of software as it goes. Then all that gets fed over into the main modular 00:40:30.550 --> 00:40:36.570 analyzers. 
Finally, all of this comes together and gets put into a gigantic blob 00:40:36.570 --> 00:40:41.850 of observables and fed up to the pipeline. We really want to thank the Ford 00:40:41.850 --> 00:40:46.920 Foundation for supporting our work in this, because the pipeline and the static 00:40:46.920 --> 00:40:51.840 analysis have been a massive boon for our project, and we're only beginning now to 00:40:51.840 --> 00:40:58.920 really get our engine running, and we're having a great time with it. So, digging 00:40:58.920 --> 00:41:03.760 into the observables themselves: what are we looking at? Let's break them apart. 00:41:03.760 --> 00:41:08.980 So the format structure components, things like ASLR, DEP, RELRO: 00:41:08.980 --> 00:41:13.370 basic app armory that's going to be enabled at the OS 00:41:13.370 --> 00:41:17.830 layer when it gets loaded up or linked. And we also collect other metadata about 00:41:17.830 --> 00:41:22.000 the program, such as: "What libraries are linked in?", "What's its dependency 00:41:22.000 --> 00:41:26.400 tree look like – completely?", "How did those libraries 00:41:26.400 --> 00:41:32.040 score?", because that can affect your main software. Interesting example on Linux: if 00:41:32.040 --> 00:41:35.840 you link a library that requires an executable stack, guess what, your software 00:41:35.840 --> 00:41:39.990 now has an executable stack, even if you didn't mark it that way. So we need to be able 00:41:39.990 --> 00:41:44.700 to understand what ecosystem the software is gonna live in. And the code structure 00:41:44.700 --> 00:41:47.590 analyzers look at things like functionality: "What's the software 00:41:47.590 --> 00:41:52.600 doing?", "What type of app armory is getting injected into the code?". A great 00:41:52.600 --> 00:41:55.850 example of that is something like stack guards or fortify source. 
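Format-structure observables of this kind can be read straight out of a binary's headers. Below is a minimal, illustrative sketch for little-endian ELF64 files - not CITL's actual analyzer, just the general idea - checking the PT_GNU_STACK executable-stack flag and the presence of a PT_GNU_RELRO segment:

```python
import struct

PT_GNU_STACK = 0x6474E551
PT_GNU_RELRO = 0x6474E552
PF_X = 0x1

def elf64_stack_and_relro(data: bytes):
    """Return (exec_stack, has_relro) for a little-endian ELF64 image.

    exec_stack is True when PT_GNU_STACK carries PF_X, or when the
    segment is absent entirely (historically the permissive default).
    """
    assert data[:4] == b"\x7fELF", "not an ELF file"
    # e_phoff, e_phentsize, e_phnum from the ELF64 header
    e_phoff = struct.unpack_from("<Q", data, 0x20)[0]
    e_phentsize = struct.unpack_from("<H", data, 0x36)[0]
    e_phnum = struct.unpack_from("<H", data, 0x38)[0]
    exec_stack, has_relro = True, False  # permissive defaults
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # p_type and p_flags are the first two fields of an Elf64_Phdr
        p_type, p_flags = struct.unpack_from("<II", data, off)
        if p_type == PT_GNU_STACK:
            exec_stack = bool(p_flags & PF_X)
        elif p_type == PT_GNU_RELRO:
            has_relro = True
    return exec_stack, has_relro
```

A production analyzer would also have to walk the dependency tree, since, as noted above, one linked library demanding an executable stack flips the property for the whole process.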
These are the 00:41:55.850 --> 00:42:01.550 armory features that only really apply, and can only be observed, inside of the control flow 00:42:01.550 --> 00:42:08.240 or inside of the actual instructions themselves. This is why control 00:42:08.240 --> 00:42:10.880 flow graphs are key. We played around with a number of 00:42:10.880 --> 00:42:15.980 different ways of analyzing software that we could scale out, and ultimately we had 00:42:15.980 --> 00:42:20.171 to come down to working with control flow graphs. Provided here is a basic 00:42:20.171 --> 00:42:23.400 visualization of what I'm talking about with a control flow graph, provided by 00:42:23.400 --> 00:42:28.690 Binary Ninja, which has wonderful visualization tools, hence this photo, and not our 00:42:28.690 --> 00:42:33.170 engine, because we don't build very many visualization engines ourselves. But you 00:42:33.170 --> 00:42:38.470 basically have a function that's broken up into basic blocks, which are broken up into 00:42:38.470 --> 00:42:42.910 instructions, and then you have basic flow between them. Having this as an iterable 00:42:42.910 --> 00:42:47.650 structure that we can work with allows us to walk over that and walk every single 00:42:47.650 --> 00:42:50.790 instruction, understand the references, understand where code and data is being 00:42:50.790 --> 00:42:54.500 referenced, and how it is being referenced. 00:42:54.500 --> 00:42:57.640 And then what type of functionality is being used. So this is a great way to find 00:42:57.640 --> 00:43:02.530 something like whether or not your stack guards are being applied on every function 00:43:02.530 --> 00:43:08.340 that needs them, how deep they are being applied, and whether the compiler is possibly 00:43:08.340 --> 00:43:11.850 introducing errors into your armory features, which are interesting side 00:43:11.850 --> 00:43:19.590 studies. Another reason we did this is because we want to push the concept of what type 00:43:19.590 --> 00:43:28.339 of observables even farther. 
Let's take this example: you want to be able to 00:43:28.339 --> 00:43:34.340 build instruction abstractions. Let's say for all major architectures you can break 00:43:34.340 --> 00:43:38.690 them up into major categories. Be it arithmetic instructions, data manipulation 00:43:38.690 --> 00:43:45.850 instructions, like loads and stores, and then control flow instructions. Then with these 00:43:45.850 --> 00:43:52.830 basic fundamental building blocks you can make artifacts. Think of them like a unit 00:43:52.830 --> 00:43:56.400 of functionality: it has some type of input, some type of output, it performs some type 00:43:56.400 --> 00:44:01.280 of operation on it. And then with these little units of functionality, you can 00:44:01.280 --> 00:44:05.210 link them together, and think of these artifacts as maybe sub-basic-block or 00:44:05.210 --> 00:44:09.440 crossing a few basic blocks, but a different way to break up the software. 00:44:09.440 --> 00:44:13.130 Because a basic block is just a branch break, but we want to look at 00:44:13.130 --> 00:44:18.680 functionality breaks, because these artifacts can provide the basic 00:44:18.680 --> 00:44:24.891 fundamental building blocks of the software itself. This matters more when 00:44:24.891 --> 00:44:28.840 we want to start doing symbolic lifting, so that we can lift the entire software up 00:44:28.840 --> 00:44:35.250 into a generic representation that we can slice and dice as needed. 00:44:38.642 --> 00:44:42.760 Moving on from there, I want to talk about fuzzing a little bit more. Fuzzing is 00:44:42.760 --> 00:44:47.370 effectively at the heart of our project. It provides us the rich dataset that we 00:44:47.370 --> 00:44:52.040 can use to derive a model. It also provides us awesome other metadata on the 00:44:52.040 --> 00:44:58.060 side. But why? Why do we care about fuzzing? 
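The instruction-abstraction idea from a moment ago can be sketched very roughly: classify mnemonics into the arithmetic / data-manipulation / control-flow categories, then collapse a trace into runs of categories as a crude stand-in for artifacts. The mnemonic table below is a toy x86-64 subset, invented for illustration:

```python
# Toy classifier over a small, assumed x86-64 mnemonic subset; the three
# category names mirror the abstraction described above.
CATEGORIES = {
    "arith":   {"add", "sub", "mul", "imul", "xor", "and", "or", "shl", "shr"},
    "data":    {"mov", "lea", "push", "pop"},
    "control": {"jmp", "je", "jne", "jg", "jl", "call", "ret"},
}

def classify(mnemonic: str) -> str:
    """Map one mnemonic to its abstract category, or 'other'."""
    for cat, members in CATEGORIES.items():
        if mnemonic in members:
            return cat
    return "other"

def abstract_trace(mnemonics):
    """Collapse an instruction trace into runs of categories - a crude
    stand-in for the 'artifact' units of functionality."""
    out = []
    for m in mnemonics:
        c = classify(m)
        if not out or out[-1][0] != c:
            out.append([c, 0])
        out[-1][1] += 1
    return [(c, n) for c, n in out]
```

A real system would also track the data flowing between these runs; this only shows how architecture-specific instructions can be folded into an architecture-agnostic alphabet.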
Why is fuzzing the metric that 00:44:58.060 --> 00:45:04.680 you build an engine around, that you build a model around, that you derive some type of reasoning 00:45:04.680 --> 00:45:11.560 from? So think of the sets of bugs, vulnerabilities, and exploitable 00:45:11.560 --> 00:45:16.930 vulnerabilities. In an ideal world you'd want to just have a machine that pulls out 00:45:16.930 --> 00:45:20.250 exploitable vulnerabilities. Unfortunately, this is exceedingly costly 00:45:20.250 --> 00:45:25.690 because of a series of decision problems that sit between these sets. So now consider the 00:45:25.690 --> 00:45:31.900 superset of bugs, or faults. A fuzzer can easily recognize, or other software can 00:45:31.900 --> 00:45:37.400 easily recognize, faults, but if you want to move down the sets you unfortunately 00:45:37.400 --> 00:45:42.770 need to jump through a lot of decision hoops. For example, if you want to move to 00:45:42.770 --> 00:45:45.760 a vulnerability you have to understand: Does the attacker have some type of 00:45:45.760 --> 00:45:51.150 control? Is there a trust boundary being crossed? Is this software configured in 00:45:51.150 --> 00:45:55.000 the right way for this to be vulnerable right now? So there are human factors that 00:45:55.000 --> 00:45:59.280 are not deducible from the outside. This decision problem gets amplified even 00:45:59.280 --> 00:46:05.320 worse going to exploitable vulnerabilities. So if we collect the 00:46:05.320 --> 00:46:11.360 superset of bugs, we know that there is some proportion of the subsets in there. 00:46:11.360 --> 00:46:15.830 And this provides us a dataset that is easily recognizable and that we can collect in a cost- 00:46:15.830 --> 00:46:22.170 efficient manner. Finally, fuzzing is key, and we're investing a lot of our time 00:46:22.170 --> 00:46:26.570 right now in working on a new fuzzing engine, because there are some key things 00:46:26.570 --> 00:46:32.290 we want to do. 
We want to be able to understand all of 00:46:32.290 --> 00:46:35.340 the different paths the software could be taking, and as you're fuzzing you're 00:46:35.340 --> 00:46:40.010 effectively driving the software down as many unique paths while referencing as 00:46:40.010 --> 00:46:47.760 many unique data manipulations as possible. So if we save off every path, 00:46:47.760 --> 00:46:51.840 annotate the ones that are faulting, we now have this beautiful rich data set of 00:46:51.840 --> 00:46:57.060 exactly where the software went as we were driving it in specific ways. Then we feed 00:46:57.060 --> 00:47:02.010 that back into our static analysis engine and begin to generate those instruction 00:47:02.010 --> 00:47:07.680 abstractions, those artifacts, out of the traces. And with that, imagine we 00:47:07.680 --> 00:47:14.560 have these gigantic traces of instruction abstractions. From there we can then begin 00:47:14.560 --> 00:47:20.990 to train the model to explore around the fault location and begin to understand and 00:47:20.990 --> 00:47:27.300 try and study the fundamental building blocks of what a bug looks like in an 00:47:27.300 --> 00:47:32.990 abstract, instruction-agnostic way. This is why we're spending a lot of time on our 00:47:32.990 --> 00:47:36.980 fuzzing engine right now. But hopefully soon we'll be able to talk about that more, 00:47:36.980 --> 00:47:40.381 and maybe in a tech track and not the policy track. 00:47:44.748 --> 00:47:49.170 C: Yeah, so from then on, when anything went wrong with the computer, we said it 00:47:49.170 --> 00:47:55.700 had bugs in it. laughs All right, I promised you a technical journey, I 00:47:55.700 --> 00:47:59.461 promised you a technical journey into the dark abyss, as deep as you want to get 00:47:59.461 --> 00:48:03.460 with it. So let's go ahead and bring it up. Let's wrap it up and bring it up a 00:48:03.460 --> 00:48:07.340 little bit here. We've talked a great deal today about some theory. 
We've talked 00:48:07.340 --> 00:48:09.970 about development in our tooling and everything else, and so I figured I should 00:48:09.970 --> 00:48:14.010 end with some things that are not in progress, but in fact which are done and 00:48:14.010 --> 00:48:20.630 yesterday's news. Just to go ahead and share that here with Europe. So in 00:48:20.630 --> 00:48:24.140 the midst of all of our development we have been discovering and reporting bugs; 00:48:24.140 --> 00:48:28.680 again, this is not our primary purpose really. But you know, you can't help but do it. You 00:48:28.680 --> 00:48:32.170 know how computers are these days. You find bugs just for turning them on, right? 00:48:32.170 --> 00:48:38.610 So we've been disclosing all of that. A little while ago, at DEFCON and Black Hat, 00:48:38.610 --> 00:48:43.030 our chief scientist Sarah, together with Mudge, went ahead and dropped this 00:48:43.030 --> 00:48:47.840 bombshell on the Firefox team, which is that for some period of time they had ASLR 00:48:47.840 --> 00:48:54.310 disabled on OS X. When we first found it we assumed it was a bug in our tools. When 00:48:54.310 --> 00:48:57.720 we first mentioned it in a talk they came to us and said it's definitely a bug in 00:48:57.720 --> 00:49:03.140 your tools - or it might be - with some level of surprise, and then people started looking 00:49:03.140 --> 00:49:08.840 into it, and in fact at one point it had been enabled and then temporarily 00:49:08.840 --> 00:49:12.960 disabled. No one knew, everyone thought it was on. It takes someone looking to notice 00:49:12.960 --> 00:49:18.010 that kind of stuff, right. Major shout out though: they fixed it immediately despite 00:49:18.010 --> 00:49:23.950 our full disclosure on stage and everything. 
So very impressed, but in 00:49:23.950 --> 00:49:27.870 addition to popping surprises on people we've also been doing the usual process of 00:49:27.870 --> 00:49:32.890 submitting patches and bugs, particularly to LLVM and QEMU, and if you work in 00:49:32.890 --> 00:49:35.810 software analysis you could probably guess why. 00:49:36.510 --> 00:49:39.280 Incidentally, if you're looking for a target to fuzz, if you want to go home from 00:49:39.280 --> 00:49:45.870 CCC and you want to find a ton of findings: LLVM comes with a bunch of parsers. You 00:49:45.870 --> 00:49:50.060 should fuzz them, you should fuzz them, and I say that because I know for a fact you 00:49:50.060 --> 00:49:53.170 are gonna get a bunch of findings, and it'd be really nice. I would appreciate it if I 00:49:53.170 --> 00:49:56.360 didn't have to pay the people to fix them. So if you wouldn't mind disclosing them, 00:49:56.360 --> 00:50:00.240 that would help. But besides these bug reports and all these other things, we've 00:50:00.240 --> 00:50:04.210 also been working with lots of others. You know, we gave a talk earlier this summer, 00:50:04.210 --> 00:50:06.910 Sarah gave a talk earlier this summer, about these things, and she presented 00:50:06.910 --> 00:50:11.830 findings comparing some of these base scores of different Linux distributions. 00:50:11.830 --> 00:50:16.320 And based on those findings there was a person on the Fedora red team, Jason 00:50:16.320 --> 00:50:20.470 Calloway, who sat there and, well, I can't read his mind, but I'm sure that he was 00:50:20.470 --> 00:50:24.700 thinking to himself: golly, it would be nice to not, you know, be surprised at the 00:50:24.700 --> 00:50:28.560 next one of these talks. They score very well, by the way. They were leading in 00:50:28.560 --> 00:50:33.660 many, many of our metrics.
Well, in any case, he left Vegas and he went back home, 00:50:33.660 --> 00:50:36.850 and he and his colleagues have been working on essentially re-implementing 00:50:36.850 --> 00:50:41.570 much of our tooling so that they can check the stuff that we check before they 00:50:41.570 --> 00:50:47.530 release. Before they release. Looking for security before you release. So that would 00:50:47.530 --> 00:50:51.520 be a good thing for others to do, and I'm hoping that that idea really catches on. 00:50:51.520 --> 00:50:58.990 laughs Yeah, yeah right, that would be nice. That would be nice. 00:50:58.990 --> 00:51:04.310 But in addition to that, in addition to that, our mission really is to get results 00:51:04.310 --> 00:51:08.220 out to the public, and so in order to achieve that, we have broad partnerships 00:51:08.220 --> 00:51:12.340 with Consumer Reports and the Digital Standard. Especially if you're into cyber 00:51:12.340 --> 00:51:16.410 policy, I really encourage you to take a look at the proposed Digital Standard, 00:51:16.410 --> 00:51:21.220 which is encompassing of the things we look for and so much more: URLs, 00:51:21.220 --> 00:51:25.720 data traffic in motion, cryptography, update mechanisms, and all that good stuff. 00:51:25.720 --> 00:51:31.951 So, where we are and where we're going, the big takeaways here, if you're 00:51:31.951 --> 00:51:36.310 looking for that "so what", three points for you: one, we are building the tooling 00:51:36.310 --> 00:51:39.750 necessary to do larger and larger and larger studies regarding these surrogate 00:51:39.750 --> 00:51:44.980 security scores. My hope is that in some period of the not-too-distant future, I 00:51:44.980 --> 00:51:48.600 would like to be able to, with my colleagues, publish some really nice 00:51:48.600 --> 00:51:51.640 findings about what are the things that you can observe in software which have a 00:51:51.640 --> 00:51:57.390 suspiciously high correlation with the software being good.
Right, nobody really 00:51:57.390 --> 00:52:00.390 knows right now. It's an empirical question. As far as I know, the study 00:52:00.390 --> 00:52:03.080 hasn't been done. We've been running it on the small scale. We're building the 00:52:03.080 --> 00:52:06.620 tooling to do it on a much larger scale. We are hoping that this winds up being a 00:52:06.620 --> 00:52:11.480 useful field in security as that technology develops. In the meantime our 00:52:11.480 --> 00:52:15.560 static analyzers are already making surprising discoveries: hit YouTube and 00:52:15.560 --> 00:52:21.300 take a look for Sarah Zatko's recent talks at DEFCON/Black Hat. Lots of fun findings 00:52:21.300 --> 00:52:25.910 in there. Lots of things that anyone who looks would have found. Lots of that. 00:52:25.910 --> 00:52:29.080 And then lastly, if you are in the business of shipping software and you are 00:52:29.080 --> 00:52:32.620 thinking to yourself: okay, so these guys, someone gave them some money to mess up my 00:52:32.620 --> 00:52:36.840 day, and you're wondering: what can I do to not have my day messed up? One simple 00:52:36.840 --> 00:52:40.870 piece of advice, one simple piece of advice: make sure your software employs 00:52:40.870 --> 00:52:45.920 every exploit mitigation technique Mudge has ever or will ever hear of. And he's 00:52:45.920 --> 00:52:49.500 heard of a lot of them. So, you know, all that: turn all those things 00:52:49.500 --> 00:52:52.280 on, and if you don't know anything about that stuff, if nobody on your team knows 00:52:52.280 --> 00:52:57.370 anything about that stuff... I don't even know why I'm saying this; if you're here, you 00:52:57.370 --> 00:53:00.972 know about that stuff, so do that. If you're not here, then you should be here. 00:53:04.428 --> 00:53:16.330 Danke, Danke. Herald Angel: Thank you, Tim and Parker. 00:53:17.501 --> 00:53:23.630 Do we have any questions from the audience?
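[Editor's note: the "check your mitigations before you release" advice above can be partially automated; even a single mitigation, PIE (the prerequisite for ASLR of the main executable image, the very setting Firefox had lost on OS X), can be read straight out of an ELF header. The sketch below is illustrative only, not CITL's tooling, and nothing like a complete checker such as checksec, which also inspects RELRO, stack canaries, NX and more.]

```python
import struct

# A few constants from the ELF specification
ELF_MAGIC = b"\x7fELF"
ET_EXEC, ET_DYN = 2, 3   # fixed-load executable vs. shared object / PIE

def check_pie(header: bytes) -> str:
    """Report whether an ELF binary was built as a position-independent
    executable (e_type == ET_DYN), which is what allows ASLR to
    randomize the image base. `header` is the start of the file."""
    if header[:4] != ELF_MAGIC:
        return "not an ELF file"
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    (e_type,) = struct.unpack_from(endian + "H", header, 16)  # e_type at offset 16
    if e_type == ET_DYN:
        return "PIE (ASLR can relocate it)"
    if e_type == ET_EXEC:
        return "fixed-load executable (no base randomization)"
    return "other ELF type"

# Example: check one of your own build artifacts before you release, e.g.
# print(check_pie(open("a.out", "rb").read(64)))
```

Running something like this in CI is essentially the idea the Fedora folks picked up: measure the mitigations yourself before someone else measures them for you on stage.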
It's really hard to see you with 00:53:23.630 --> 00:53:30.120 that bright light in my face. I think the signal angel has a question. Signal Angel: 00:53:30.120 --> 00:53:34.550 So the IRC channel was impressed by your tools and the models that you wrote. And 00:53:34.550 --> 00:53:38.050 they are wondering what's going to happen to that, because you do have funding from 00:53:38.050 --> 00:53:42.040 the Ford Foundation now, and so what are your plans with this? Do you plan on 00:53:42.040 --> 00:53:46.080 commercializing this, or is it going to be open source, or how do we get our hands on 00:53:46.080 --> 00:53:49.150 this? C: It's an excellent question. So for the 00:53:49.150 --> 00:53:53.550 time being the money that we are receiving is to develop the tooling, pay for the AWS 00:53:53.550 --> 00:53:57.790 instances, pay for the engineers and all that stuff. As for the direction as an 00:53:57.790 --> 00:54:01.410 organization that we would like to take things: I have no interest in running a 00:54:01.410 --> 00:54:05.410 monopoly. That sounds like a fantastic amount of work and I really don't want to 00:54:05.410 --> 00:54:09.430 do it. However, I have a great deal of interest in taking the gains that we are 00:54:09.430 --> 00:54:13.860 making in the technology and releasing the data so that other competent researchers 00:54:13.860 --> 00:54:19.020 can go through and find useful things that we may not have noticed ourselves. So 00:54:19.020 --> 00:54:22.150 we're not at a point where we are releasing data in bulk just yet, but that 00:54:22.150 --> 00:54:26.432 is simply a matter of engineering; our tools are still in flux, as we, you know. 00:54:26.432 --> 00:54:29.230 When we do that, we want to make sure the data is correct, and so our software has to 00:54:29.230 --> 00:54:33.640 have its own low bug counts and all these other things. But ultimately that's for the 00:54:33.640 --> 00:54:37.950 scientific aspect of our mission. Though the science is not our primary mission.
00:54:37.950 --> 00:54:41.920 Our primary mission is to apply it to help consumers. At the same time, it is our 00:54:41.920 --> 00:54:47.590 belief that an opaque model is as good as crap; no one should trust an opaque model. 00:54:47.590 --> 00:54:50.940 If somebody is telling you that they have some statistics and they do not provide 00:54:50.940 --> 00:54:54.540 you with any underlying data and it is not reproducible, you should ignore them. 00:54:54.540 --> 00:54:58.360 Consequently, what we are working towards right now is getting to a point where we 00:54:58.360 --> 00:55:02.730 will be able to share all of those findings: the surrogate scores, the 00:55:02.730 --> 00:55:06.000 interesting correlations between observables and fuzzing. All that will be 00:55:06.000 --> 00:55:09.200 public as the material comes online. Signal Angel: Thank you. 00:55:09.200 --> 00:55:11.870 C: Thank you. Herald Angel: Thank you. And microphone 00:55:11.870 --> 00:55:14.860 number three please. Mic3: Hi, thanks, some really 00:55:14.860 --> 00:55:18.450 interesting work you presented here. So there's something I'm not sure I 00:55:18.450 --> 00:55:22.910 understand about the approach that you're taking. If you are evaluating the security 00:55:22.910 --> 00:55:26.320 of, say, a library function or the implementation of a network protocol, for 00:55:26.320 --> 00:55:29.780 example, you know, there'd be a precise specification you could check that 00:55:29.780 --> 00:55:35.190 against. And the techniques you're using would make sense to me. But it's not so 00:55:35.190 --> 00:55:37.970 clear, since the goal that you've set for yourself is to evaluate the 00:55:37.970 --> 00:55:43.580 security of consumer software. It's not clear to me whether it's fair to call 00:55:43.580 --> 00:55:47.430 these results security scores in the absence of a threat model.
So my 00:55:47.430 --> 00:55:50.350 question is, you know, how is it meaningful to make a claim that a piece of 00:55:50.350 --> 00:55:52.240 software is secure if you don't have a threat model for it? 00:55:52.240 --> 00:55:56.090 C: This is an excellent question, and anyone who disagrees is 00:55:56.090 --> 00:56:01.330 wrong. Security without a threat model is not security at all. It's absolutely a 00:56:01.330 --> 00:56:05.560 true point. So the things that we are looking for, most of them are things that 00:56:05.560 --> 00:56:08.800 you will already find present in your threat model. And so for example we were 00:56:08.800 --> 00:56:12.390 reporting on the presence of things like ASLR and lots of other things that get to 00:56:12.390 --> 00:56:17.030 the heart of exploitability of a piece of software. So for example, if we are 00:56:17.030 --> 00:56:19.870 reviewing a piece of software that has no attack surface, 00:56:19.870 --> 00:56:24.160 then it is canonically not in the threat model, and in that sense it makes no sense 00:56:24.160 --> 00:56:29.270 to report on its overall security. On the other hand, if we're talking about 00:56:29.270 --> 00:56:33.470 software like, say, a word processor, a browser, anything on your phone, anything 00:56:33.470 --> 00:56:36.120 that talks on the network, if we're talking about those kinds of applications, then I 00:56:36.120 --> 00:56:39.280 would argue that exploit mitigations and the other things that we are measuring are 00:56:39.280 --> 00:56:44.330 almost certainly very relevant. So there's a sense in which what we are measuring is 00:56:44.330 --> 00:56:48.411 the lowest common denominator among what we imagine are the dominant threat models 00:56:48.411 --> 00:56:53.180 for the applications. A hand-wavy answer, but I promised heuristics, so there 00:56:53.180 --> 00:56:55.180 you go. Mic3: Thanks. 00:56:55.180 --> 00:57:01.620 C: Thank you. Herald Angel: Any questions?
No raising 00:57:01.620 --> 00:57:07.060 hands, okay. And then the herald can ask a question, because I never can. So the 00:57:07.060 --> 00:57:11.920 question is: you mentioned earlier these security labels, and for example what 00:57:11.920 --> 00:57:15.880 institution could give out the security labels? Because obviously the vendor 00:57:15.880 --> 00:57:21.740 has no interest in IT security? C: Yes, it's a very good question. So, our 00:57:21.740 --> 00:57:25.580 partnership with Consumer Reports. I don't know if you're familiar with them, but in 00:57:25.580 --> 00:57:31.340 the United States Consumer Reports is a major, huge consumer watchdog organization. 00:57:31.340 --> 00:57:36.550 They test the safety of automobiles, they test, you know, lots of consumer appliances. 00:57:36.550 --> 00:57:40.070 All kinds of things, both to see if they function more or less as advertised, but 00:57:40.070 --> 00:57:45.210 most importantly they're checking for quality, reliability and safety. So our 00:57:45.210 --> 00:57:49.840 partnership with Consumer Reports is all about us doing our work and then 00:57:49.840 --> 00:57:54.060 publishing that. And so for example the televisions that we presented the data on, 00:57:54.060 --> 00:57:58.290 all of that was collected and published in partnership with Consumer Reports. 00:57:58.290 --> 00:58:00.970 Herald: Thank you. C: Thank you. 00:58:02.630 --> 00:58:12.430 Herald: Any other questions from the stream? I hear a no. Well, in this case, people, thank 00:58:12.430 --> 00:58:16.440 you. Thank Tim and Parker for their nice talk 00:58:16.440 --> 00:58:19.964 and please give them a very, very warm round of applause. 00:58:19.964 --> 00:58:24.694 applause C: Thank you. T: Thank you. 00:58:24.694 --> 00:58:51.000 subtitles created by c3subtitles.de in the year 2017. Join, and help us!