34C3 - How risky is the software you use?

  • 0:00 - 0:17
    Music
    Herald: The next talk is about how risky
  • 0:17 - 0:23
    is the software you use. So you may have heard
    about Trump versus a Russian security
  • 0:23 - 0:31
    company. We won't judge this, we won't
    comment on this, but we dislike the
  • 0:31 - 0:37
    prejudgments of this case. Tim Carstens
    and Parker Thompson will tell you a little
  • 0:37 - 0:43
    bit more about how risky the software is
    you use. Tim Carstens is CITL's Acting
  • 0:43 - 0:48
    Director and Parker Thompson is CITL's
    lead engineer. Please welcome with a very,
  • 0:48 - 0:54
    very warm applause: Tim and Parker!
    Thanks.
  • 0:54 - 1:05
    Applause
    Tim Carstens: Howdy, howdy. So my name is
  • 1:05 - 1:13
    Tim Carstens. I'm the acting director of
    the Cyber Independent Testing Lab. It's
  • 1:13 - 1:19
    four words there, we'll talk about all four
    today, especially cyber. With me today is
  • 1:19 - 1:26
    our lead engineer Parker Thompson. Not on
    stage are our other collaborators: Patrick
  • 1:26 - 1:33
    Stach, Sarah Zatko, and present in the
    room but not on stage - Mudge. So today
  • 1:33 - 1:37
    we're going to be talking about our work,
    the lead in. The introduction that was
  • 1:37 - 1:40
    given is phrased in terms of Kaspersky and
    all of that, I'm not gonna be speaking
  • 1:40 - 1:45
    about Kaspersky and I guarantee you I'm
    not gonna be speaking about my president.
  • 1:45 - 1:50
    Right, yeah? Okay. Thank you.
    Applause
  • 1:50 - 1:55
    All right, so why don't we go ahead and
    kick off: I'll mention now parts of this
  • 1:55 - 2:01
    presentation are going to be quite
    technical. Not most of it, and I will
  • 2:01 - 2:04
    always include analogies and all these
    other things if you are here in security
  • 2:04 - 2:11
    but not a bit-twiddler. But if you do want
    to be able to review some of the technical
  • 2:11 - 2:15
    material, if I go through it too fast, if you
    like to read, if you're a mathematician or
  • 2:15 - 2:21
    if you are a computer scientist, our slides
    are already available for download at this
  • 2:21 - 2:25
    site here. We thank our partners
    at power door for getting that set up for
  • 2:25 - 2:32
    us. Let's get started on the real
    material here. Alright, so we are CITL: a
  • 2:32 - 2:36
    nonprofit organization based in the United
    States founded by our chief scientist
  • 2:36 - 2:43
    Sarah Zatko and our board chair Mudge. And
    our mission is a public good mission - we
  • 2:43 - 2:47
    are hackers but our mission here is
    actually to look out for people who do not
  • 2:47 - 2:50
    know very much about machines
    or as much as the other hackers do.
  • 2:50 - 2:56
    Specifically, we seek to improve the state
    of software security by providing the
  • 2:56 - 3:01
    public with accurate reporting on the
    security of popular software, right? And
  • 3:01 - 3:06
    so that was a mouthful for you. But no
    doubt, no doubt, every single one of you
  • 3:06 - 3:11
    has received questions of the form: what
    do I run on my phone, what do I do with
  • 3:11 - 3:14
    this, what do I do with that, how do I
    protect myself - all these other things
  • 3:14 - 3:20
    lots of people in the general public
    looking for agency in computing. No one's
  • 3:20 - 3:25
    offering it to them, and so we're trying
    to go ahead and provide a forcing function
  • 3:25 - 3:30
    on the software field in order to, you
    know, again be able to enable consumers
  • 3:30 - 3:36
    and users and all these things. Our social
    good work is funded largely by charitable
  • 3:36 - 3:41
    monies from the Ford Foundation whom we
    thank a great deal, but we also have major
  • 3:41 - 3:45
    partnerships with Consumer Reports, which
    is a major organization in the United
  • 3:45 - 3:52
    States that generally, broadly, looks at
    consumer goods for safety and performance.
  • 3:52 - 3:56
    But we also partner with The Digital
    Standard, which probably would be of great
  • 3:56 - 3:59
    interest to many people here at Congress
    as it is a holistic standard for
  • 3:59 - 4:04
    protecting user rights. We'll talk about
    some of the work that goes into those
  • 4:04 - 4:10
    things here in a bit, but first I want to
    give the big picture of what it is we're
  • 4:10 - 4:18
    really trying to do in one short
    little sentence. Something like this but
  • 4:18 - 4:24
    for security, right? What are the
    important facts, how does it rate, you
  • 4:24 - 4:27
    know, is it easy to consume, is it easy to
    go ahead and look and say this thing is
  • 4:27 - 4:31
    good this thing is not good. Something
    like this, but for software security.
  • 4:33 - 4:39
    Sounds hard doesn't it? So I want to talk
    a little bit about what I mean by
  • 4:39 - 4:45
    something like this.
    There are lots of consumer outlook and
  • 4:45 - 4:50
    watchdog and protection groups - some
    private, some government, which are
  • 4:50 - 4:55
    looking to do this for various things that
    are not software security. And you can
  • 4:55 - 4:58
    see some examples here that are big in the
    United States - I happen to not like these
  • 4:58 - 5:02
    as much as some of the newer consumer
    labels coming out from the EU. But
  • 5:02 - 5:05
    nonetheless they are examples of the kinds
    of things people have done in other
  • 5:05 - 5:11
    fields, fields that are not security to
    try to achieve that same end. And when
  • 5:11 - 5:17
    these things work well, it is for three
    reasons: One, it has to contain the
  • 5:17 - 5:23
    relevant information. Two: it has to be
    based in fact, we're not talking opinions,
  • 5:23 - 5:29
    this is not a book club or something like
    that. And three: it has to be actionable,
  • 5:29 - 5:33
    it has to be actionable - you have to be
    able to know how to make a decision based
  • 5:33 - 5:36
    on it. How do you do that for software
    security? How do you do that for
  • 5:36 - 5:44
    software security? So the rest of the talk
    is going to go in three parts.
  • 5:44 - 5:49
    First, we're going to give a bit of an
    overview for more of the consumer facing
  • 5:49 - 5:53
    side of things that we do: look at
    some data that we have reported on early
  • 5:53 - 5:57
    and all these other kinds of good things.
    We're then going to go ahead and get
  • 5:58 - 6:06
    terrifyingly, terrifyingly technical. And
    then after that we'll talk about tools to
  • 6:06 - 6:10
    actually implement all this stuff. The
    technical part comes before the tools. So
  • 6:10 - 6:12
    it just tells you how terrifyingly
    technical we're gonna get. It's gonna be
  • 6:12 - 6:20
    fun right. So how do you do this for
    software security: a consumer version. So,
  • 6:20 - 6:25
    if you set forth to the task of trying to
    measure software security, right, many
  • 6:25 - 6:28
    people here probably do work in the
    security field perhaps as consultants
  • 6:28 - 6:32
    doing reviews; certainly I used to. Then
    probably what you're thinking to yourself
  • 6:32 - 6:39
    right now is that there are lots and lots
    and lots and lots of things that affect
  • 6:39 - 6:44
    the security of a piece of software. Some
    of which are, mmm, you're only gonna see
  • 6:44 - 6:48
    them if you go reversing. And some of
    which are just you know kicking around on
  • 6:48 - 6:52
    the ground waiting for you to notice,
    right. So we're going to talk about both
  • 6:52 - 6:56
    of those kinds of things that you might
    measure. But here you see these giant
  • 6:56 - 7:03
    charts that basically go through on the
    left - on the left we have Microsoft Excel
  • 7:03 - 7:08
    on OS X on the right Google Chrome for OS
    X this is a couple years old at this point
  • 7:08 - 7:13
    maybe one and a half years old but over
    here I'm not expecting you to be able to
  • 7:13 - 7:16
    read these - the real point is to say look
    at all of the different things you can
  • 7:16 - 7:20
    measure very easily.
    How do you distill it, how do you boil it
  • 7:20 - 7:27
    down, right. So this is the opposite of
    a good consumer safety label. This is just,
  • 7:27 - 7:30
    if you've ever done any consulting, this is
    the kind of report you hand a client to
  • 7:30 - 7:33
    tell them how good their software is,
    right? It's the opposite of consumer
  • 7:33 - 7:40
    grade. But the reason I'm showing it here
    is because, you know, I'm gonna call out
  • 7:40 - 7:43
    some things and maybe you can't process
    all of this because it's too much
  • 7:43 - 7:47
    material, you know. But I'm gonna call out
    some things and once I call them out just
  • 7:47 - 7:53
    like NP you're gonna recognize them
    instantly. So for example, Excel, at the
  • 7:53 - 7:57
    time of this review - look at this column
    of dots. What are these dots telling you?
  • 7:57 - 8:00
    It's telling you look at all these
    libraries - all of them are 32-bit only.
  • 8:00 - 8:07
    Not 64 bits, not 64 bits. Take a look at
    Chrome - exact opposite, exact opposite
  • 8:07 - 8:14
    64-bit binary, right? What are some other
    things? Excel, again, on OSX maybe you can
  • 8:14 - 8:20
    see these danger warning signs that go
    straight straight up the whole thing.
  • 8:20 - 8:28
    That's the absence of major heap
    protection flags in the binary headers.
  • 8:28 - 8:32
    We'll talk about some what that means
    exactly in a bit. But also if you hop over
  • 8:32 - 8:36
    here you'll see like yeah yeah yeah like
    Chrome has all the different heap
  • 8:36 - 8:42
    protections that a binary might enable, on
    OSX that is, but it also has more dots in
  • 8:42 - 8:45
    this column here off to the right. And
    what do those dots represent?
  • 8:45 - 8:52
    Those dots represent functions, functions
    that historically have been the source of
  • 8:52 - 8:54
    trouble; you know, these functions are
    very hard to call correctly - if you're a
  • 8:54 - 8:59
    C programmer the "gets" function is a good
    example. But there are lots of them. And
  • 8:59 - 9:03
    you can see here that Chrome doesn't mind,
    it uses them all a bunch. And Excel not so
  • 9:03 - 9:08
    much. And if you know the history of
    Microsoft and the trusted computing
  • 9:08 - 9:12
    initiative and the SDL and all of that you
    will know that a very long time ago
  • 9:12 - 9:17
    Microsoft made the decision and they said
    we're gonna start purging some of these
  • 9:17 - 9:22
    risky functions from our code bases
    because we think it's easier to ban them
  • 9:22 - 9:25
    than teach our devs to use them correctly.
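
A rough sketch of how that column of dots could be produced: compare the symbols a binary imports against a list of functions that are hard to call correctly. Only "gets" comes from the talk; the other names in the list, and the helper name `risky_imports`, are illustrative assumptions rather than CITL's actual list or tooling.

```python
# Hypothetical ban list. "gets" is the example named in the talk; the other
# entries are assumptions added for illustration.
RISKY_FUNCTIONS = {"gets", "strcpy", "strcat", "sprintf", "vsprintf"}


def risky_imports(imported_symbols):
    """Return the risky functions found among a binary's imported symbols, sorted."""
    return sorted(RISKY_FUNCTIONS.intersection(imported_symbols))


if __name__ == "__main__":
    # The symbol list would normally come from the binary's import table.
    print(risky_imports(["malloc", "gets", "printf", "strcpy"]))
```

A binary built under a ban policy would tend to return an empty list here, while one built under a "permitted if used carefully" policy would not.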
    And you see that reverberating out in
  • 9:25 - 9:29
    their software. Google on the other hand
    says yeah yeah yeah those functions can be
  • 9:29 - 9:32
    dangerous to use but if you know how to
    use them they can be very good and so
  • 9:32 - 9:39
    they're permitted. The point all of this
    is building to is that if you start by
  • 9:39 - 9:43
    just measuring every little thing that
    like your static analyzers can detect in a
  • 9:43 - 9:48
    piece of software. Two things: one, you
    wind up with way more data than you can
  • 9:48 - 9:55
    show in a slide. And two: the engineering
    process, the software development life
  • 9:55 - 10:00
    cycle that went into the software will
    leave behind artifacts that tell you
  • 10:00 - 10:05
    something about the decisions that went
    into designing that engineering process.
  • 10:05 - 10:10
    And so you know, Google for example:
    quite rigorous as far as hitting you know
  • 10:10 - 10:14
    GCC dash, and then enable all of the
    compiler protections. Microsoft may be
  • 10:14 - 10:20
    less good at that, but much more rigid in
    things that were very popular ideas when
  • 10:20 - 10:24
    they introduced trusted computing,
    alright. So the big takeaway from this
  • 10:24 - 10:29
    material is that again the software
    engineering process results in artifacts
  • 10:29 - 10:36
    in the software that people can find.
    Alright. Ok, so that's that's a whole
  • 10:36 - 10:41
    bunch of data, certainly it's not a
    consumer-friendly label. So how do you
  • 10:41 - 10:46
    start to get in towards the consumer zone?
    Well, the main defect of the big reports
  • 10:46 - 10:51
    that we just saw is that it's too much
    information. It's very dense on data but
  • 10:51 - 10:56
    it's very hard to distill it to the "so
    what" of it, right?
  • 10:56 - 11:00
    And so this here is one of our earlier
    attempts to go ahead and do that
  • 11:00 - 11:05
    distillation. What are these charts, how
    did we come up with these? Well on the
  • 11:05 - 11:08
    previous slide when we saw all these
    different factors that you can analyze in
  • 11:08 - 11:14
    software, basically here's how
    that we arrive at this. For each of those
  • 11:14 - 11:19
    things: pick a weight. Go ahead and
    compute a score, average against the
  • 11:19 - 11:22
    weight: tada, now you have some number.
    You can do that for each of the libraries
  • 11:22 - 11:26
    in the piece of software. And if you do
    that for each of the libraries in the
  • 11:26 - 11:29
    software you can then go ahead and produce
    these histograms to show, you know, like
  • 11:29 - 11:36
    this percentage of the DLLs had a score in
    this range. Boom, there's a bar, right.
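
The recipe just described (pick a weight per factor, take a weighted average per library, then bin the library scores) can be sketched as follows. The factor names, the weights, and the function names are all hypothetical; the talk does not give the real weights at this point.

```python
def weighted_score(features, weights):
    """Weighted average of per-factor scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights.values())
    return sum(w * features.get(name, 0.0) for name, w in weights.items()) / total_weight


def histogram(scores, bins=4):
    """Percentage of libraries whose score falls in each equal-width bin over [0, 1]."""
    if not scores:
        return [0.0] * bins
    counts = [0] * bins
    for score in scores:
        index = min(int(score * bins), bins - 1)  # a score of exactly 1.0 lands in the last bin
        counts[index] += 1
    return [100.0 * count / len(scores) for count in counts]


if __name__ == "__main__":
    # Hypothetical weights and per-library measurements, for illustration only.
    weights = {"aslr": 2.0, "relro": 1.0, "no_risky_calls": 1.0}
    libraries = [
        {"aslr": 1.0, "relro": 1.0, "no_risky_calls": 1.0},
        {"aslr": 1.0, "relro": 0.0, "no_risky_calls": 0.0},
        {"aslr": 0.0, "relro": 0.0, "no_risky_calls": 1.0},
        {"aslr": 0.0, "relro": 0.0, "no_risky_calls": 0.0},
    ]
    scores = [weighted_score(lib, weights) for lib in libraries]
    print(scores)
    print(histogram(scores))
```

Each entry of the histogram is one bar: the share of libraries whose score landed in that range, with better scores further to the right.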
  • 11:36 - 11:39
    How do you pick those weights? We'll talk
    about that in a sec - it's very technical.
  • 11:39 - 11:45
    But the takeaway though, is you know
    that you wind up with these charts. Now
  • 11:45 - 11:48
    I've obscured the labels, I've obscured
    the labels and the reason I've done that
  • 11:48 - 11:52
    is because I don't really care that much
    about the actual counts. I want to talk
  • 11:52 - 11:57
    about the shapes, the shapes of these
    charts: it's a qualitative thing.
  • 11:57 - 12:03
    So here: good scores appear on the right,
    bad scores appear on the left. The
  • 12:03 - 12:06
    histogram measuring all the libraries and
    components and so a very secure piece of
  • 12:06 - 12:13
    software in this model manifests as a tall
    bar far to the right. And you can see a
  • 12:13 - 12:18
    clear example in our custom Gentoo
    build. Anyone here who is a Gentoo fan knows -
  • 12:18 - 12:21
    hey I'm going to install this thing, I
    think I'm going to go ahead and turn on
  • 12:21 - 12:25
    every single one of those flags, and lo
    and behold if you do that yeah you wind up
  • 12:25 - 12:31
    with a tall bar far to the right. Here's
    Ubuntu 16, I bet it's 16.04 but I don't
  • 12:31 - 12:36
    recall exactly, 16 LTS. Here you see a lot
    of tall bars to the right - not quite as
  • 12:36 - 12:40
    consolidated as a custom Gentoo build, but
    that makes sense doesn't it right? Because
  • 12:40 - 12:45
    then you know you don't do your whole
    Ubuntu build. Now I want to contrast. I
  • 12:45 - 12:50
    want to contrast. So over here on the
    right we see in the same model, an
  • 12:50 - 12:56
    analysis of the firmware obtained from two
    smart televisions. Last year's models from
  • 12:56 - 13:00
    Samsung and LG. And here are the model
    numbers. We did this work in concert with
  • 13:00 - 13:05
    Consumer Reports. And what do you notice
    about these histograms, right. Are the
  • 13:05 - 13:12
    bars tall and to the right? No, they look
    almost normal, not quite, but that doesn't
  • 13:12 - 13:17
    really matter. The main thing that matters
    is that this is the shape you would expect
  • 13:17 - 13:24
    to get if you were playing a random game
    basically to decide what security features
  • 13:24 - 13:28
    to enable in your software. This is the
    shape of not having a security program, is
  • 13:28 - 13:34
    my bet. That's my bet. And so what do you
    see? You see heavy concentration here in
  • 13:34 - 13:39
    the middle, right, that seems fair, and
    like it tails off. On the Samsung nothing
  • 13:39 - 13:44
    scored all that great, same on the LG.
    Both of them are you know running their
  • 13:44 - 13:47
    respective operating systems and they're
    basically just inheriting whatever
  • 13:47 - 13:51
    security came from whatever open source
    thing they forked, right.
  • 13:51 - 13:55
    So this is this is the kind of message,
    this right here is the kind of thing that
  • 13:55 - 14:02
    we exist for. This is us
    producing charts showing that the current
  • 14:02 - 14:08
    practices in the not-so consumer-friendly
    space of running your own Linux distros
  • 14:08 - 14:13
    far exceed the products being delivered,
    certainly in this case in the smart TV
  • 14:13 - 14:25
    market. But I think you might agree with
    me, it's much worse than this. So let's
  • 14:25 - 14:28
    dig into that a little bit more, I have a
    different point that I want to make about
  • 14:28 - 14:34
    that same data set - so this table here
    this table is again looking at the LG
  • 14:34 - 14:40
    Samsung and Gentoo Linux installations.
    And on this table we're just pulling out
  • 14:40 - 14:44
    some of the easy to identify security
    features you might enable in a binary
  • 14:44 - 14:50
    right. So percentage of binaries with
    address space layout randomization, right?
  • 14:50 - 14:56
    Let's talk about that on our Gentoo build
    it's over 99%. That also holds for the
  • 14:56 - 15:03
    Amazon Linux AMI - it holds in Ubuntu.
    ASLR is incredibly common in modern Linux.
  • 15:03 - 15:09
    And despite that, fewer than 70 percent of
    the binaries on the LG television had it
  • 15:09 - 15:14
    enabled. Fewer than 70 percent. And the
    Samsung was doing, you know, better than
  • 15:14 - 15:20
    that I guess, but 80 percent is pretty
    disappointing when a default Linux
  • 15:20 - 15:25
    install, you know, mainstream Linux distro
    is going to get you 99, right? And it only
  • 15:25 - 15:28
    gets worse, it only gets worse right you
    know?
  • 15:28 - 15:32
    RELRO support, if you don't know what that
    is that's ok but if you do, look abysmal
  • 15:32 - 15:38
    coverage look at this abysmal coverage
    coming out of these IOT devices very sad.
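
As a hedged illustration of how cheap these checks are, the sketch below reads two facts straight from an ELF header: the word size and whether the file is of type ET_DYN, which is what position-independent executables and shared objects use and which a loader needs in order to randomize the image's base address. It assumes the binaries are ELF files, as on Linux-based firmware; the helper names are made up here, and RELRO is not covered because it requires walking the program headers as well.

```python
import struct

ET_DYN = 3  # ELF e_type value for shared objects and position-independent executables


def elf_summary(header: bytes):
    """Summarize the first 18 bytes of an ELF file: word size and ET_DYN status."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    bits = {1: 32, 2: 64}.get(header[4])  # EI_CLASS byte
    if bits is None:
        raise ValueError("unknown ELF class")
    endian = "<" if header[5] == 1 else ">"  # EI_DATA byte: 1 means little-endian
    (e_type,) = struct.unpack_from(endian + "H", header, 16)  # e_type follows e_ident
    return {"bits": bits, "position_independent": e_type == ET_DYN}


def percent_with(summaries, key):
    """Percentage of summaries in which the given boolean field is true."""
    if not summaries:
        return 0.0
    return 100.0 * sum(1 for s in summaries if s[key]) / len(summaries)


if __name__ == "__main__":
    # Synthetic 64-bit little-endian headers, built by hand for the example.
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    pie = ident + struct.pack("<H", 3)     # ET_DYN
    fixed = ident + struct.pack("<H", 2)   # ET_EXEC, loaded at a fixed address
    summaries = [elf_summary(pie), elf_summary(fixed)]
    print(percent_with(summaries, "position_independent"))
```

Running `elf_summary` over the first bytes of every binary in a firmware image and then calling `percent_with` gives a coverage percentage of the kind shown in the table.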
  • 15:38 - 15:41
    And you see it over and over and over
    again. I'm showing this because some
  • 15:41 - 15:46
    people in this room or watching this video
    ship software - and I have a message, I
  • 15:46 - 15:50
    have a message to those people who ship
    software who aren't working on say Chrome
  • 15:50 - 15:59
    or any of the other big-name Pwn2Own kinds
    of targets. Look at this: you can be
  • 15:59 - 16:02
    leading the pack by mastering the
    fundamentals. You can be leading the pack
  • 16:02 - 16:07
    by mastering the fundamentals. This is a
    point that really as a security field we
  • 16:07 - 16:11
    really need to be driving home. You know,
    one of the things that we're seeing here
  • 16:11 - 16:16
    in our data is that if you're the vendor
    who is shipping the product everyone has
  • 16:16 - 16:19
    heard of in the security field then maybe
    your game is pretty decent right? If
  • 16:19 - 16:24
    you're shipping say Windows or if you're
    shipping Firefox or whatever. But if
  • 16:24 - 16:26
    you're if you're doing one of these things
    where people are just kind of beating you
  • 16:26 - 16:31
    up for default passwords, then your
    problems are way further than just default
  • 16:31 - 16:35
    passwords, right? Like the house, the
    house is messy it needs to be cleaned,
  • 16:35 - 16:43
    needs to be cleaned. So the rest of the
    talk like I said we're going to be
  • 16:43 - 16:47
    discussing a lot of other things that
    amount to getting you know a peek behind
  • 16:47 - 16:51
    the curtain and where some of these things
    come from and getting very specific about
  • 16:51 - 16:54
    how this business works, but if you're
    interested in more of the high level
  • 16:54 - 16:59
    material - especially if you're interested
    in interesting results and insights, some
  • 16:59 - 17:02
    of which I'm going to have here later. But
    I really encourage you though to take a
  • 17:02 - 17:07
    look at the talk from this past summer by
    our chief scientist Sarah Zatko, which is
  • 17:07 - 17:11
    predominantly on the topic of surprising
    results in the data.
  • 17:15 - 17:19
    Today, though, this being our first time
    presenting here in Europe, we figured we
  • 17:19 - 17:23
    would take more of an overarching kind of
    view. What we're doing and why we're
  • 17:23 - 17:27
    excited about it and where it's headed. So
    we're about to move into a little bit of
  • 17:27 - 17:32
    the underlying theory, you know. Why do I
    think it's reasonable to even try to
  • 17:32 - 17:35
    measure the security of software from a
    technical perspective. But before we can
  • 17:35 - 17:39
    get into that I need to talk a little bit
    about our goals, so that the decisions and
  • 17:39 - 17:45
    the theory; the motivation is clear,
    right. Our goals are really simple: it's a
  • 17:45 - 17:51
    very easy organization to run because of
    that. Goal number one: remain independent
  • 17:51 - 17:56
    of vendor influence. We are not the first
    organization to purport to be looking out
  • 17:56 - 18:02
    for the consumer. But unlike many of our
    predecessors, we are not taking money from
  • 18:02 - 18:10
    the people we review, right? Seems like
    some basic stuff. Seems like some basic
  • 18:10 - 18:18
    stuff right? Thank you, okay.
    Two: automated, comparable, quantitative
  • 18:18 - 18:24
    analysis. Why automated? Well, we need our
    test results to be reproducible. If Tim
  • 18:24 - 18:28
    goes in, opens up your software in IDA and
    finds a bunch of stuff that makes him all
  • 18:28 - 18:33
    stoked - that's not a very repeatable kind
    of a standard for things. And so we're
  • 18:33 - 18:36
    interested in things which are automated.
    We'll talk about, maybe a few hackers in
  • 18:36 - 18:40
    here know how hard that is. We'll talk
    about that, but then last, we're
  • 18:40 - 18:44
    acting as a watchdog - we're
    protecting the interests of the user, the
  • 18:44 - 18:48
    consumer, however you would like to look
    at it. But we also have three non goals,
  • 18:48 - 18:53
    three non goals that are equally
    important. One: we have a non goal of
  • 18:53 - 18:57
    finding and disclosing vulnerabilities. I
    reserve the right to find and disclose
  • 18:57 - 19:01
    vulnerabilities. But that's not my goal,
    it's not my goal. Another non goal is to
  • 19:01 - 19:05
    tell software vendors what to do. If a
    vendor asks me how to remediate their
  • 19:05 - 19:08
    terrible score, I will tell them what we
    are measuring but I'm not there to help
  • 19:08 - 19:12
    them remediate. It's on them to be able to
    ship a secure product without me holding
  • 19:12 - 19:19
    their hand. We'll see. And then three:
    non-goal, perform free security testing
  • 19:19 - 19:24
    for vendors. Our testing happens after you
    release. Because when you release your
  • 19:24 - 19:29
    software you are telling people it is
    ready to be used. Is it really though, is
  • 19:29 - 19:32
    it really though, right?
    Applause
  • 19:32 - 19:37
    Yeah, thank you. Yeah, so we are not there
    to give you a preview of what your score
  • 19:37 - 19:42
    will be. There is no sum of money you can
    hand me that will get you an early preview
  • 19:42 - 19:46
    of what your score is - you can try me,
    you can try me: there's a fee for trying
  • 19:46 - 19:50
    me. There's a fee for trying me. But I'm
    not gonna look at your stuff until I'm
  • 19:50 - 19:59
    ready to drop it, right. Yeah bitte, yeah.
    All right. So moving into this theory
  • 19:59 - 20:03
    territory. Three big questions, three big
    questions that need to be addressed if you
  • 20:03 - 20:07
    want to do our work efficiently: what
    works, what works for improving security,
  • 20:07 - 20:13
    what are the things that need or that you
    really want to see in software. Two: how
  • 20:13 - 20:17
    do you recognize when it's being done?
    It's no good if someone hands you a piece
  • 20:17 - 20:20
    of software and says, "I've done all the
    latest things" and it's a complete black
  • 20:20 - 20:25
    box. If you can't check the claim, the
    claim is as good as false, in practical
  • 20:25 - 20:30
    terms, period, right. Software has to be
    reviewable or a priori, I'll think you're
  • 20:30 - 20:36
    full of it. And then three: who's doing it
    - of all the things that work, that you
  • 20:36 - 20:40
    can recognize, who's actually doing it.
    You know, let's go ahead - our field is
  • 20:40 - 20:47
    famous for ruining people's holidays and
    weekends over Friday bug disclosures, you
  • 20:47 - 20:52
    know New Year's Eve bug disclosures. I
    would like us to also be famous for
  • 20:52 - 20:59
    calling out those teams and those software
    organizations which are being as good as
  • 20:59 - 21:04
    the bad guys are being bad, yeah? So
    provide someone an incentive to be maybe
  • 21:04 - 21:19
    happy to see us for a change, right. Okay,
    so thank you. Yeah, all right. So how do
  • 21:19 - 21:26
    we actually pull these things off; the
    basic idea. So, I'm going to get into some
  • 21:26 - 21:29
    deeper theory: if you're not a theorist I
    want you to focus on this slide.
  • 21:29 - 21:33
    And I'm gonna bring it back, it's not all
    theory from here on out after this but if
  • 21:33 - 21:39
    you're not a theorist I really want you to
    focus on this slide. The basic motivation,
  • 21:39 - 21:43
    the basic motivation behind what we're
    doing; the technical motivation - why we
  • 21:43 - 21:47
    think that it's possible to measure and
    report on security. It all boils down to
  • 21:47 - 21:53
    this right. So we start with a thought
    experiment, a Gedankenexperiment, right? Given a
  • 21:53 - 21:59
    piece of software we can ask: overall, how
    secure is it? Kind of a vague question but
  • 21:59 - 22:03
    you could imagine you know there's
    versions of that question. And two: what
  • 22:03 - 22:08
    are its vulnerabilities. Maybe you want to
    nitpick with me about what the word
  • 22:08 - 22:11
    vulnerability means but broadly you know
    this is a much more specific question
  • 22:11 - 22:19
    right. And here's here's the enticing
    thing: the first question appears to ask
  • 22:19 - 22:25
    for less information than the second
    question. And maybe if we were taking bets
  • 22:25 - 22:29
    I would put my money on, yes, it actually
    does ask for less information. What do I
  • 22:29 - 22:33
    mean by that what do I mean by that? Well,
    let's say that someone told you all of the
  • 22:33 - 22:38
    vulnerabilities in a system right? They
    said, "Hey, I got them all", right? You're
  • 22:38 - 22:42
    like all right that's cool, that's cool.
    And if someone asks you hey how secure is
  • 22:42 - 22:45
    this system you can give them a very
    precise answer. You can say it has N
  • 22:45 - 22:49
    vulnerabilities, and they're of this kind
    and like all this stuff right so certainly
  • 22:49 - 22:55
    the second question is enough to answer
    the first. But, is the reverse true?
  • 22:55 - 22:58
    Namely, if someone were to tell you, for
    example, "hey, this piece of software has
  • 22:58 - 23:06
    exactly 32 vulnerabilities in it." Does
    that make it easier to find any of them?
  • 23:06 - 23:12
    Right, there's room to maybe do that
    using some algorithms that are not yet in
  • 23:12 - 23:16
    existence.
    Certainly the computer scientists in here
  • 23:16 - 23:19
    are saying, "well, you know, yeah maybe
    counting the number of SAT solutions
  • 23:19 - 23:23
    doesn't help you practically find
    solutions. But it might and we just don't
  • 23:23 - 23:27
    know." Okay fine fine fine. Maybe these
    things are the same, but my experience
  • 23:27 - 23:30
    in security, and the experience of many
    others perhaps is that they probably
  • 23:30 - 23:37
    aren't the same question. And this
    motivates what I'm calling here Zatko's
  • 23:37 - 23:41
    question, which is basically asking for an
    algorithm that demonstrates that the first
  • 23:41 - 23:46
    question is easier than the second
    question, right. So Zatko's question:
  • 23:46 - 23:49
    develop a heuristic which can
    efficiently answer one, but not
  • 23:49 - 23:54
    necessarily two. If you're looking for a
    metaphor, if you want to know why I care
  • 23:54 - 23:57
    about this distinction, I want you to
    think about some certain controversial
  • 23:57 - 24:01
    technologies: maybe think about say
    nuclear technology, right. An algorithm
  • 24:01 - 24:05
    that answers one, but not two, it's a very
    safe algorithm to publish. Very safe
  • 24:05 - 24:11
    algorithm to publish indeed. Okay, Claude
    Shannon would like more information. Happy
  • 24:11 - 24:16
    to oblige. Let's take a look at this
    question from a different perspective
  • 24:16 - 24:19
    maybe a more hands-on perspective: the
    hacker perspective, right? If you're a
  • 24:19 - 24:22
    hacker and you're watching me up here and
    I'm waving my hands around and I'm showing
  • 24:22 - 24:26
    you charts maybe you're thinking to
    yourself yeah boy, what do you got? Right,
  • 24:26 - 24:30
    how does this actually go. And maybe what
    you're thinking to yourself is that, you
  • 24:30 - 24:34
    know, finding good vulns: that's an
    artisan craft right? You're in IDA, you
  • 24:34 - 24:37
    know you're reversing all day, you're doing
    all these things or hit and Comm, I don't
  • 24:37 - 24:41
    know all that stuff. And like, you know,
    this kind of clever game; cleverness is
  • 24:41 - 24:47
    not like this thing that feels very
    automatable. But you know on the other
  • 24:47 - 24:51
    hand there are a lot of tools that do
    automate things and so it's not completely
  • 24:51 - 24:57
    not automatable.
    And if you're into fuzzing then perhaps
  • 24:57 - 25:02
    you are aware of this very simple
    observation, which is that if your harness
  • 25:02 - 25:05
    is perfect if you really know what you're
    doing if you have a decent fuzzer then in
  • 25:05 - 25:09
    principle fuzzing can find every single
    problem. You have to be able to look for
  • 25:09 - 25:14
    it, you have to be able to harness for it, but
    in principle it will, right. So the hacker
  • 25:14 - 25:19
    perspective on Zatko's question is maybe
    of two minds: on the one hand assessing
  • 25:19 - 25:22
    security is a game of cleverness, but on
    the other hand we're kind of right now at
  • 25:22 - 25:26
    the cusp of having some game-changing tech
    really go - maybe you're saying like
  • 25:26 - 25:30
    fuzzing is not at the cusp, I promise it's
    just at the cusp. We haven't seen all the
  • 25:30 - 25:34
    fuzzing has to offer right and so maybe
    there's room maybe there's room for some
  • 25:34 - 25:41
    automation to be possible in pursuit of
    Zatko's question. Of course, there are
  • 25:41 - 25:46
    many challenges still in, you know, using
    existing hacker technology. Mostly of the
  • 25:46 - 25:50
    form of various open questions. For
    example if you're into fuzzing, you know,
  • 25:50 - 25:53
    hey: identifying unique crashes. There's
    an open question. We'll talk about some of
  • 25:53 - 25:57
    those, we'll talk about some of those. But
    I'm going to offer another perspective
  • 25:57 - 26:01
    here: so maybe you're not in the business
    of doing software reviews but you know a
  • 26:01 - 26:06
    little computer science. And maybe that
    computer science has you wondering what's
  • 26:06 - 26:13
    this guy talking about, right. I'm here to
    acknowledge that. So whatever you think
  • 26:13 - 26:17
    the word security means: I've got a list
    of questions up here. Whatever you think
  • 26:17 - 26:20
    the word security means, probably, some of
    these questions are relevant to your
  • 26:20 - 26:23
    definition. Right.
    Does the software have a hidden backdoor
  • 26:23 - 26:27
    or any kind of hidden functionality, does
    it handle crypto material correctly, etc,
  • 26:27 - 26:30
    so forth. Anyone in here who knows some
    computability theory knows that
  • 26:30 - 26:34
    every single one of these questions and
    many others like them are undecidable due
  • 26:34 - 26:38
    to reasons essentially no different than
    the reason the halting problem is
  • 26:38 - 26:41
    undecidable, which is to say due to
    reasons essentially first identified and
  • 26:41 - 26:46
    studied by Alan Turing a long time before
    we had microarchitectures and all these
  • 26:46 - 26:50
    other things. And so, the computability
    perspective says that, you know, whatever
  • 26:50 - 26:55
    your definition of security is ultimately
    you have this recognizability problem:
  • 26:55 - 26:58
    fancy way of saying that algorithms won't
    be able to recognize secure software
  • 26:58 - 27:03
    because of the undecidability of these
    issues. The takeaway, the takeaway is that
  • 27:03 - 27:07
    the computability angle on all of this
    says: anyone who's in the business that
  • 27:07 - 27:12
    we're in has to use heuristics. You have
    to, you have to.
  • 27:15 - 27:25
    All right, this guy gets it. All right, so
    on the tech side our last technical
  • 27:25 - 27:28
    perspective that we're going to take now
    is certainly the most abstract which is
  • 27:28 - 27:32
    the Bayesian perspective, right. So if
    you're a frequentist, you need to get with
  • 27:32 - 27:37
    the times you know it's everything
    Bayesian now. So, let's talk about this
  • 27:37 - 27:44
    for a bit. Only two slides of math, I
    promise, only two! So, let's say that I
  • 27:44 - 27:47
    have some corpus of software. Perhaps it's
    a collection of all modern browsers,
  • 27:47 - 27:51
    perhaps it's the collection of all the
    packages in the Debian repository, perhaps
  • 27:51 - 27:54
    it's everything on github that builds on
    this system, perhaps it's a hard drive
  • 27:54 - 27:58
    full of warez that some guy mailed you,
    right? You have some corpus of software
  • 27:58 - 28:03
    and for a random program in that corpus we
    can consider this probability: the
  • 28:03 - 28:07
    probability distribution of which software
    is secure versus which is not. For reasons
  • 28:07 - 28:11
    described on the computability
    perspective, this number is not a
  • 28:11 - 28:17
    computable number for any reasonable
    definition of security. So that's neat,
  • 28:17 - 28:21
    and so, for practical terms, if you want
    to do some probabilistic reasoning, you
  • 28:21 - 28:28
    need some surrogate for that and so we
    consider this here. So, instead of
  • 28:28 - 28:31
    considering the probability that a piece
    of software is secure, a non computable
  • 28:31 - 28:36
    non verifiable claim, we take a look here
    at this indexed collection of
  • 28:36 - 28:39
    probabilities. This is an infinite
    countable family of probability
  • 28:39 - 28:44
    distributions, basically P sub h,k is just
    the probability that for a random piece of
  • 28:44 - 28:50
    software in the corpus, h work units of
    fuzzing will find no more than k unique
  • 28:50 - 28:56
    crashes, right. And why is this relevant?
    Well, at the bottom we have this analytic
  • 28:56 - 28:59
    observation, which is that in the limit as
    h goes to infinity you're basically
  • 28:59 - 29:04
    saying: "Hey, you know, if I fuzz this
    thing for infinity times, you know, what's
  • 29:04 - 29:08
    that look like?" And, essentially, here we
    have analytically that this should
  • 29:08 - 29:13
    converge. The P sub h,1 should converge to
    the probability that a piece of software
  • 29:13 - 29:16
    just simply cannot be made to crash. Not
    the same thing as being secure, but
  • 29:16 - 29:24
    certainly not a small concern relevant to
    security. So, none of that stuff actually
  • 29:24 - 29:31
    was Bayesian yet, so we need to get there.
    And so here we go, right: so, the previous
  • 29:31 - 29:35
    slide described a probability distribution
    measured based on fuzzing. But fuzzing is
  • 29:35 - 29:39
    expensive and it is also not an answer to
    Zatko's question because it finds
  • 29:39 - 29:44
    vulnerabilities, it doesn't measure
    security in the general sense and so
  • 29:44 - 29:47
    here's where we make the jump to
    conditional probabilities: Let M be some
  • 29:47 - 29:52
    observable property of software: has ASLR,
    has RELRO, calls these functions, doesn't
  • 29:52 - 29:57
    call those functions... take your pick.
    For random s in S we now consider these
  • 29:57 - 30:02
    conditional probability distributions and
    this is the same kind of probability as we
  • 30:02 - 30:08
    had on the previous slide but conditioned
    on this observable being true, and this
  • 30:08 - 30:11
    leads to the refined CITL
    variant of Zatko's question:
  • 30:11 - 30:17
    Which observable properties of software
    satisfy that, when the software has
  • 30:17 - 30:23
    property m, the probability of fuzzing
    being hard is very high? That's what this
  • 30:23 - 30:27
    version of this question phrases, and here
    we say, you know, for large log(h)/k, in
  • 30:27 - 30:32
    other words: exponentially more fuzzing
    than you expect to find bugs. So this is
  • 30:32 - 30:36
    the technical version of what we're after.
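Written out, the quantities described on these two slides look roughly like the following. This is a sketch reconstructed from the spoken description, not a copy of the slides; the name crashes_h(s), meaning the number of unique crashes found in s after h work units of fuzzing, is shorthand assumed here.

```latex
% S: the corpus of software; s: a random program drawn from S
% crashes_h(s): unique crashes found in s after h work units of fuzzing (assumed shorthand)
P_{h,k} = \Pr_{s \in S}\left[\, \mathrm{crashes}_h(s) \le k \,\right]

% M: an observable property of software (has ASLR, has RELRO, calls these functions, ...)
P_{h,k}(M) = \Pr_{s \in S}\left[\, \mathrm{crashes}_h(s) \le k \;\middle|\; M(s) \,\right]
```

In this notation, the refined question asks for observables M such that P_{h,k}(M) stays close to 1 when log(h)/k is large.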
    All of this can be explored, you can
  • 30:36 - 30:40
    brute-force your way to finding all of
    this stuff, and that's exactly what we're
  • 30:40 - 30:48
    doing. So we're looking for all kinds of
    things, we're looking for all kinds of
  • 30:48 - 30:54
    things that correlate with fuzzing having
    low yield on a piece of software, and
  • 30:54 - 30:57
    there's a lot of ways in which that can
    happen. It could be that you are looking
  • 30:57 - 31:01
    at a feature of software that literally
    prevents crashes. Maybe it's the never
  • 31:01 - 31:08
    crash flag, I don't know. But most of the
    things I've talked about, ASLR, RELRO, etc.
  • 31:08 - 31:12
    don't prevent crashes. In fact, ASLR can
    take non-crashing programs and make them
  • 31:12 - 31:17
    crashing. It's the number one reason
    vendors don't enable it, right? So why am
  • 31:17 - 31:20
    I talking about ASLR? Why am I talking
    about RELRO? Why am I talking about all
  • 31:20 - 31:23
    these things that have nothing to do with
    stopping crashes and I'm claiming I'm
  • 31:23 - 31:27
    measuring crashes? This is because, in the
    Bayesian perspective, correlation is not
  • 31:27 - 31:32
    the same thing as causation, right?
    Correlation is not the same thing as
  • 31:32 - 31:35
    causation. It could be that M's presence
    literally prevents crashes, but it could
  • 31:35 - 31:40
    also be that, by some underlying
    coincidence, the things we're looking for
  • 31:40 - 31:44
    are mostly only found in software that's
    robust against crashing.
  • 31:44 - 31:49
    If you're looking for security, I submit
    to you that the difference doesn't matter.
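To make the conditional probabilities concrete, here is a minimal sketch of how they could be estimated empirically from a corpus that has already been fuzzed. Everything in it is an assumption for illustration: the names FuzzResult, p_at_most_k and conditional_p are hypothetical, and the five-program corpus is made-up toy data, not a CITL result.

```python
from dataclasses import dataclass


@dataclass
class FuzzResult:
    name: str
    has_property: bool   # the observable M, e.g. "was built with ASLR"
    unique_crashes: int  # unique crashes found after h work units of fuzzing


def p_at_most_k(results, k):
    """Empirical P_{h,k}: fraction of programs with no more than k unique crashes."""
    if not results:
        return None  # nothing to estimate from
    return sum(r.unique_crashes <= k for r in results) / len(results)


def conditional_p(results, k):
    """Empirical P_{h,k}(M): the same fraction, restricted to programs where M holds."""
    return p_at_most_k([r for r in results if r.has_property], k)


# Toy corpus (hypothetical data, all fuzzed for the same h work units).
corpus = [
    FuzzResult("prog_a", True, 0),
    FuzzResult("prog_b", True, 1),
    FuzzResult("prog_c", False, 7),
    FuzzResult("prog_d", False, 0),
    FuzzResult("prog_e", True, 0),
]

print("P(no crashes):", p_at_most_k(corpus, 0))
print("P(no crashes | M):", conditional_p(corpus, 0))
```

An observable is interesting when the conditional estimate is clearly higher than the unconditional one across a large corpus, whether or not the observable itself prevents any crash.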
  • 31:49 - 31:55
    Okay, end of my math, danke. I will now go
    ahead and do this like really nice analogy
  • 31:55 - 31:59
    of all those things that I just described,
    right. So we're looking for indicators of
  • 31:59 - 32:04
    a piece of software being secure enough to
    be good for consumers, right. So here's an
  • 32:04 - 32:08
    analogy. Let's say you're a geologist, you
    study minerals and all of that and you're
  • 32:08 - 32:14
    looking for diamonds. Who isn't, right?
    Want those diamonds! And like how do you
  • 32:14 - 32:18
    find diamonds? Even in places that are
    rich in diamonds, diamonds are not common.
  • 32:18 - 32:21
    You don't just go walking around in your
    boots, kicking until your toe stubs on a
  • 32:21 - 32:27
    diamond. You don't do that. Instead you
    look for other minerals that are mostly
  • 32:27 - 32:32
    only found near diamonds but are much more
    abundant in those locations than the
  • 32:32 - 32:38
    diamonds. So, this is mineral science 101,
    I guess, I don't know. So, for example,
  • 32:38 - 32:41
    you want to go find diamond: put on your
    boots and go kicking until you find some
  • 32:41 - 32:46
    chromite, look for some diopside, you
    know, look for some garnet. None of these
  • 32:46 - 32:50
    things turn into diamonds, none of these
    things cause diamonds but if you're
  • 32:50 - 32:54
    finding good concentrations of these
    things, then, statistically, there's
  • 32:54 - 32:58
    probably diamonds nearby. That's what
    we're doing. We're not looking for the
  • 32:58 - 33:03
    things that cause good security per se.
    Rather, we're looking for the indicators
  • 33:03 - 33:08
    that you have put the effort into your
    software, right? How's that working out
  • 33:08 - 33:15
    for us? How's that working out for us?
    Well, we're still doing studies. It's, you
  • 33:15 - 33:18
    know, early to say exactly but we do have
    the following interesting coincidence: and
  • 33:18 - 33:25
    so, here presented I have a collection of
    prices that somebody gave Mudge for so-called
  • 33:25 - 33:30
    underground exploits. And I can
    tell you these prices are maybe a little
  • 33:30 - 33:34
    low these days but if you work in that
    business, if you go to SyScan, if you do
  • 33:34 - 33:39
    that kind of stuff, maybe you know that
    this is ballpark, it's ballpark.
  • 33:39 - 33:44
    Alright, and, just a coincidence, maybe it
    means we're on the right track, I don't
  • 33:44 - 33:49
    know, but it's an encouraging sign: When
    we run these programs through our
  • 33:49 - 33:53
    analysis, our rankings more or less
    correspond to the actual prices that you
  • 33:53 - 33:58
    encounter in the wild for access via these
    applications. Up above, I have one of our
  • 33:58 - 34:02
    histogram charts. You can see here that
    Chrome and Edge in this particular model
  • 34:02 - 34:06
    scored very close to the same and it's a
    test model, so, let's say they're
  • 34:06 - 34:11
    basically the same.
    Firefox, you know, behind there a little
  • 34:11 - 34:15
    bit. I don't have Safari on this chart
    because - these are all Windows applications
  • 34:15 - 34:21
    - but the Safari score falls in between.
    So, lots of theory, lots of theory, lots
  • 34:21 - 34:28
    of theory and then we have this. So, we're
    going to go ahead now and hand off to our
  • 34:28 - 34:32
    lead engineer, Parker, who is going to
    talk about some of the concrete stuff, the
  • 34:32 - 34:35
    non-chalkboard stuff, the software stuff
    that actually makes this work.
  • 34:36 - 34:41
    Thompson: Yeah, so I want to talk about
    the process of actually doing it. Building
  • 34:41 - 34:45
    the tooling that's required to collect
    these observables. Effectively, how do you
  • 34:45 - 34:51
    go mining for indicator
    minerals? But first the progression of
  • 34:51 - 34:56
    where we are and where we're going. We
    initially broke this out into three major
  • 34:56 - 35:00
    tracks of our technology. We have our
    static analysis engine, which started as a
  • 35:00 - 35:06
    prototype, and we have now recently
    completed a much more mature and solid
  • 35:06 - 35:10
    engine that's allowing us to be much more
    extensible and digging deeper into
  • 35:10 - 35:16
    programs, and provide much deeper
    observables. Then, we have the data
  • 35:16 - 35:22
    collection and data reporting. Tim showed
    some of our early stabs at this. We're
  • 35:22 - 35:25
    right now in the process of building new
    engines to make the data more accessible
  • 35:25 - 35:30
    and easy to work with and hopefully more
    of that will be available soon. Finally,
  • 35:30 - 35:36
    we have our fuzzer track. We needed to get
    some early data, so we played with some
  • 35:36 - 35:41
    existing off-the-shelf fuzzers, including
    AFL, and, while that was fun,
  • 35:41 - 35:44
    unfortunately it's a lot of work to
    manually instrument a lot of fuzzers for
  • 35:44 - 35:49
    hundreds of binaries.
    So, we then built an automated solution
  • 35:49 - 35:53
    that started to get us closer to having a
    fuzzing harness that could autogenerate
  • 35:53 - 35:58
    itself, depending on the software, the
    software's behavior. But, right now,
  • 35:58 - 36:02
    unfortunately that technology showed us
    more deficiencies than it showed
  • 36:02 - 36:07
    successes. So, we are now working on a
    much more mature fuzzer that will allow us
  • 36:07 - 36:13
    to dig deeper into programs as we're
    running and collect very specific things
  • 36:13 - 36:21
    that we need for our model and our
    analysis. But on to our analytic pipeline
  • 36:21 - 36:26
    today. This is one of the most concrete
    components of our engine and one of the
  • 36:26 - 36:29
    most fun!
    We effectively wanted some type of
  • 36:29 - 36:35
    software hopper, where you could just pour
    programs in, installers and then, on the
  • 36:35 - 36:40
    other end, come reports: Fully annotated
    and actionable information that we can
  • 36:40 - 36:45
    present to people. So, we went about the
    process of building a large-scale engine.
  • 36:45 - 36:50
    It starts off with a simple REST API,
    where we can push software in, which then
  • 36:50 - 36:56
    gets moved over to our computation cluster
    that effectively provides us a fabric to
  • 36:56 - 37:00
    work with. It is made up of a lot of
    different software suites, starting off
  • 37:00 - 37:07
    with our data processing, which is done by
    Apache Spark, and then moves over into
  • 37:07 - 37:13
    data handling and data analysis in Spark,
    and then we have a common HDFS layer to
  • 37:13 - 37:18
    provide a place for the data to be stored
    and then a resource manager in YARN. All
  • 37:18 - 37:22
    of that is backed by our compute and data
    nodes, which scale out linearly. That then
  • 37:22 - 37:28
    moves into our data science engine, which
    is effectively Spark with Apache Zeppelin,
  • 37:28 - 37:30
    which provides us a really fun interface
    where we can work with the data in an
  • 37:30 - 37:36
    interactive manner but be kicking off
    large-scale jobs into the cluster. And
  • 37:36 - 37:40
    finally, this goes into our report
    generation engine. What this bought us,
  • 37:40 - 37:46
    was the ability to linearly scale and make
    that hopper bigger and bigger as we need,
  • 37:46 - 37:51
    but also provide us a way to process data
    that doesn't fit in a single machine's
  • 37:51 - 37:54
    RAM. You can push the instance sizes as
    large as you want, but we have
  • 37:54 - 38:00
    datasets that blow away any single host's
    RAM. So this allows us to work with
  • 38:00 - 38:09
    really large collections of observables.
    I want to dive down now into our actual
  • 38:09 - 38:13
    static analysis. But first we have to
    explore the problem space, because it's a
  • 38:13 - 38:19
    nasty one. Effectively, CITL's mission
    is to process as much software as
  • 38:19 - 38:26
    possible. Hopefully all of it, but it's
    hard to get your hands on all the binaries
  • 38:26 - 38:29
    that are out there. When you start to look
    at that problem you understand there's a
  • 38:29 - 38:35
    lot of combinations: there's a lot of CPU
    architectures, there's a lot of operating
  • 38:35 - 38:39
    systems, there's a lot of file formats,
    there's a lot of environments the software
  • 38:39 - 38:43
    gets deployed into, and every single one
    of them has their own app
  • 38:43 - 38:47
    armoring features. And they can be
    specifically set for one combination
  • 38:47 - 38:52
    but not another, and you don't want to
    penalize a developer for not turning on a
  • 38:52 - 38:56
    feature they had no access to ever turn
    on. So effectively we need to solve this
  • 38:56 - 39:01
    in a much more generic way. And so what we
    did is our static analysis engine
  • 39:01 - 39:05
    effectively looks like a gigantic
    collection of abstraction libraries to
  • 39:05 - 39:12
    handle binary programs. You take in some
    type of input file be it ELF, PE, MachO
  • 39:12 - 39:18
    and then the pipeline splits. It goes off
    into two major analyzer classes, our
  • 39:18 - 39:22
    format analyzers, which look at the
    software much like how a linker or loader
  • 39:22 - 39:27
    would look at it. I want to understand how
    it's going to be loaded up, what type of
  • 39:27 - 39:31
    armoring features are going to be applied and
    then we can run analyzers over that. In
  • 39:31 - 39:35
    order to achieve that we need abstraction
    libraries that can provide us an abstract
  • 39:35 - 39:41
    memory map, a symbol resolver, generic
    section properties. So all that feeds in
  • 39:41 - 39:46
    and then we run over a collection of
    analyzers to collect data and observables.
  • 39:46 - 39:50
    Next we have our code analyzers, these are
    the analyzers that run over the code
  • 39:50 - 39:58
    itself. I need to be able to look at every
    possible executable path. In order to do
  • 39:58 - 40:02
    that we need to do function discovery,
    feed that into a control flow recovery
  • 40:02 - 40:08
    engine, and then as a post-processing step
    dig through all of the possible metadata
  • 40:08 - 40:13
    in the software, such as like a switch
    table, or something like that to get even
  • 40:13 - 40:21
    deeper into the software. Then this
    provides us a basic list of basic blocks,
  • 40:21 - 40:24
    functions, instruction ranges. And does so
    in an efficient manner so we can process a
  • 40:24 - 40:31
    lot of software as it goes. Then all that
    gets fed over into the main modular
  • 40:31 - 40:37
    analyzers. Finally, all of this comes
    together and gets put into a gigantic blob
  • 40:37 - 40:42
    of observables and fed up to the pipeline.
    We really want to thank the Ford
  • 40:42 - 40:47
    Foundation for supporting our work in
    this, because the pipeline and the static
  • 40:47 - 40:52
    analysis has been a massive boon for our
    project and we're only beginning now to
  • 40:52 - 40:59
    really get our engine running and we're
    having a great time with it. So digging
  • 40:59 - 41:04
    into the observables themselves, what are
    we looking at and let's break them apart.
  • 41:04 - 41:09
    So the format structure components, things
    like ASLR, DEP, RELRO:
  • 41:09 - 41:13
    basic app armoring that's going to go into
    the feature and gonna be enabled at the OS
  • 41:13 - 41:18
    layer when it gets loaded up or linked.
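As an illustration of what a format analyzer can read straight out of a binary, here is a minimal sketch that pulls three loader-level observables from an ELF file using only the Python standard library. It is not CITL's engine: the function name elf64_observables and the choice of observables are assumptions, and it only handles little-endian ELF64. The header offsets and the PT_GNU_STACK and PT_GNU_RELRO constants come from the ELF format itself.

```python
import struct

ET_DYN = 3                 # shared object or position-independent executable
PT_GNU_STACK = 0x6474E551  # program header describing stack permissions
PT_GNU_RELRO = 0x6474E552  # region remapped read-only after relocation
PF_X = 0x1                 # "executable" bit in p_flags


def elf64_observables(data: bytes) -> dict:
    """Collect a few loader-level observables from a little-endian ELF64 image."""
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")

    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    obs = {
        "pie": e_type == ET_DYN,  # prerequisite for randomizing the main image
        "nx_stack": False,        # assumed executable unless PT_GNU_STACK says otherwise
        "relro": False,
    }
    for i in range(e_phnum):
        # p_type and p_flags are the first two 32-bit fields of an ELF64 program header.
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            obs["nx_stack"] = not (p_flags & PF_X)
        elif p_type == PT_GNU_RELRO:
            obs["relro"] = True
    return obs
```

Calling elf64_observables(open(path, "rb").read()) on a binary returns a small dictionary of flags. A real analyzer would also need the dynamic section (for example, to tell partial from full RELRO) and equivalent logic for PE and Mach-O.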
    And we also collect other metadata about
  • 41:18 - 41:22
    the program such as like: "What libraries
    are linked in?", "What's its dependency
  • 41:22 - 41:26
    tree look like – completely?", "How did
    those software, how did those libraries
  • 41:26 - 41:32
    score?", because that can affect your main
    software. Interesting example on Linux, if
  • 41:32 - 41:36
    you link a library that requires an
    executable stack, guess what your software
  • 41:36 - 41:40
    now has an executable stack, even if you
    didn't mark that. So we need to be able
  • 41:40 - 41:45
    to understand what ecosystem the software
    is gonna live in. And the code structure
  • 41:45 - 41:48
    analyzers look at things like
    functionality: "What's the software
  • 41:48 - 41:53
    doing?", "What type of app armory is
    getting injected into the code?". A great
  • 41:53 - 41:56
    example of that is something like stack
    guards or fortify source. These are our
  • 41:56 - 42:02
    main features that only really apply and
    can be observed inside of the control flow
  • 42:02 - 42:08
    or inside of the actual instructions
    themselves. This is why control flow
  • 42:08 - 42:11
    graphs are key.
    We played around with a number of
  • 42:11 - 42:16
    different ways of analyzing software that
    we could scale out and ultimately we had
  • 42:16 - 42:20
    to come down to working with control flow
    graphs. Provided here is a basic
  • 42:20 - 42:23
    visualization of what I'm talking about
    with a control flow graph, provided by
  • 42:23 - 42:29
    Binja, which has wonderful visualization
    tools, hence this photo, and not our
  • 42:29 - 42:33
    engine because we don't build very
    many visualization engines. But you
  • 42:33 - 42:38
    basically have a function that's broken up
    into basic blocks, which is broken up into
  • 42:38 - 42:43
    instructions, and then you have basic flow
    between them. Having this as an iterable
  • 42:43 - 42:48
    structure that we can work with, allows us
    to walk over that and walk every single
  • 42:48 - 42:51
    instruction, understand the references,
    understand where code and data is being
  • 42:51 - 42:54
    referenced, and how is it being
    referenced.
  • 42:54 - 42:58
    And then what type of functionality is
    being used, so this is a great way to find
  • 42:58 - 43:03
    something, like whether or not your stack
    guards are being applied on every function
  • 43:03 - 43:08
    that needs them, how deep are they being
    applied, and is the compiler possibly
  • 43:08 - 43:12
    introducing errors into your armoring
    features, which are interesting side
  • 43:12 - 43:20
    studies. Also why we did this is because
    we want to push the concept of what type
  • 43:20 - 43:28
    of observables even farther. Let's
    take this example: you want to be able to
  • 43:28 - 43:34
    take instruction abstractions. Let's say
    for all major architectures you can break
  • 43:34 - 43:39
    them up into major categories. Be it
    arithmetic instructions, data manipulation
  • 43:39 - 43:46
    instructions, like loads and stores, and then
    control flow instructions. Then with these
  • 43:46 - 43:53
    basic fundamental building blocks you can
    make artifacts. Think of them like a unit
  • 43:53 - 43:56
    of functionality: has some type of input,
    some type of output, it provides some type
  • 43:56 - 44:01
    of operation on it. And then with these
    little units of functionality, you can
  • 44:01 - 44:05
    link them together and think of these
    artifacts as maybe sub-basic-block or
  • 44:05 - 44:09
    crossing a few basic blocks, but a
    different way to break up the software.
  • 44:09 - 44:13
    Because a basic block is just a branch
    break, but we want to look at
  • 44:13 - 44:19
    functionality breaks, because these
    artifacts can provide the basic
  • 44:19 - 44:25
    fundamental building blocks of the
    software itself. It's more important, when
  • 44:25 - 44:29
    we want to start doing symbolic lifting.
    So that we can lift the entire software up
  • 44:29 - 44:35
    into a generic representation, that we can
    slice and dice as needed.
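The idea of walking a control flow graph and bucketing instructions into architecture-neutral categories can be sketched as follows. This is a toy, not the engine described in the talk: the BasicBlock type, the walk and category_histogram helpers, and the small x86-flavoured mnemonic table are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical mapping from mnemonics to the three broad categories
# mentioned in the talk; a real table would be per-architecture and much larger.
CATEGORY = {
    "add": "arithmetic", "sub": "arithmetic", "xor": "arithmetic",
    "mov": "data", "push": "data", "pop": "data",
    "jmp": "control", "jne": "control", "call": "control", "ret": "control",
}


@dataclass
class BasicBlock:
    address: int
    mnemonics: list
    successors: list = field(default_factory=list)  # addresses of following blocks


def walk(blocks, entry):
    """Yield every block reachable from `entry` exactly once."""
    seen, stack = set(), [entry]
    while stack:
        addr = stack.pop()
        if addr in seen or addr not in blocks:
            continue
        seen.add(addr)
        yield blocks[addr]
        stack.extend(blocks[addr].successors)


def category_histogram(blocks, entry):
    """Count instructions per abstract category over the reachable blocks."""
    counts = Counter()
    for block in walk(blocks, entry):
        for mnemonic in block.mnemonics:
            counts[CATEGORY.get(mnemonic, "other")] += 1
    return counts


# A made-up function with three reachable blocks and one unreachable block.
function = {
    0x1000: BasicBlock(0x1000, ["push", "mov", "xor", "jne"], [0x1010, 0x1020]),
    0x1010: BasicBlock(0x1010, ["add", "jmp"], [0x1020]),
    0x1020: BasicBlock(0x1020, ["pop", "ret"]),
    0x2000: BasicBlock(0x2000, ["nop"]),
}

print(category_histogram(function, 0x1000))
```

The same traversal is where checks such as "does every function that needs a stack guard actually have one" would hook in, since they need to see each instruction in its control flow context.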
  • 44:39 - 44:43
    Moving from there, I want to talk about
    fuzzing a little bit more. Fuzzing is
  • 44:43 - 44:47
    effectively at the heart of our project.
    It provides us the rich dataset that we
  • 44:47 - 44:52
    can use to derive a model. It also
    provides us awesome other metadata on the
  • 44:52 - 44:58
    side. But why? Why do we care about
    fuzzing? Why is fuzzing the metric, that
  • 44:58 - 45:05
    you build an engine, that you build a
    model that you derive some type of reasoning
  • 45:05 - 45:12
    from? So think of the set of bugs,
    vulnerabilities, and exploitable
  • 45:12 - 45:17
    vulnerabilities. In an ideal world you'd
    want to just have a machine that pulls out
  • 45:17 - 45:20
    exploitable vulnerabilities.
    Unfortunately, this is exceedingly costly
  • 45:20 - 45:26
    for a series of decision problems, that go
    between these sets. So now consider the
  • 45:26 - 45:32
    superset of bugs or faults. A fuzzer can
    easily recognize, or other software can
  • 45:32 - 45:37
    easily recognize faults, but if you want
    to move down the sets you unfortunately
  • 45:37 - 45:43
    need to jump through a lot of decision
    hoops. For example, if you want to move to
  • 45:43 - 45:46
    a vulnerability you have to understand:
    Does the attacker have some type of
  • 45:46 - 45:51
    control? Is there a trust boundary being
    crossed? Is this software configured in
  • 45:51 - 45:55
    the right way for this to be vulnerable
    right now? So there are human factors that
  • 45:55 - 45:59
    are not deducible from the outside. You
    then amplify this decision problem even
  • 45:59 - 46:05
    worse going to exploitable
    vulnerabilities. So if we collect the
  • 46:05 - 46:11
    superset of bugs, we will know that there
    is some proportion of subsets in there.
  • 46:11 - 46:16
    And this provides us a dataset that is easily
    recognizable and we can collect in a cost-
  • 46:16 - 46:22
    efficient manner. Finally, fuzzing is key
    and we're investing a lot of our time
  • 46:22 - 46:27
    right now and working on a new fuzzing
    engine, because there are some key things
  • 46:27 - 46:32
    we want to do.
    We want to be able to understand all of
  • 46:32 - 46:35
    the different paths the software could be
    taking, and as you're fuzzing you're
  • 46:35 - 46:40
    effectively driving the software down as
    many unique paths while referencing as
  • 46:40 - 46:48
    many unique data manipulations as
    possible. So if we save off every path,
  • 46:48 - 46:52
    annotate the ones that are faulting, we
    now have this beautiful rich data set of
  • 46:52 - 46:57
    exactly where the software went as we were
    driving it in specific ways. Then we feed
  • 46:57 - 47:02
    that back into our static analysis engine
    and begin to generate those
  • 47:02 - 47:08
    instruction abstractions,
    those artifacts. And with that, imagine we
  • 47:08 - 47:15
    have these gigantic traces of instruction
    abstractions. From there we can then begin
  • 47:15 - 47:21
    to train the model to explore around the
    fault location and begin to understand and
  • 47:21 - 47:27
    try and study the fundamental building
    blocks of what a bug looks like in an
  • 47:27 - 47:33
    abstract instruction agnostic way. This is
    why we're spending a lot of time on our
  • 47:33 - 47:37
    Fuzzing engine right now. But hopefully
    soon we'll be able to talk about that more
  • 47:37 - 47:40
    and maybe in a tech track and not the policy
    track.
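The "save off every path, annotate the ones that are faulting" step could look something like the sketch below. It is an illustration under stated assumptions, not the fuzzing engine the talk refers to: Trace, TraceStore, path_id and fault_windows are hypothetical names, a path is modelled as a plain list of basic-block addresses, and the target is assumed to be deterministic so that one path always has the same outcome.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    path: tuple    # basic-block addresses in the order they were executed
    faulted: bool  # True if the run ended in a crash


def path_id(path):
    """Stable identifier for a path, so a repeated path is stored only once."""
    return hashlib.sha256(repr(tuple(path)).encode()).hexdigest()[:16]


class TraceStore:
    def __init__(self):
        self.traces = {}

    def record(self, path, faulted):
        # The first observation of a path wins (deterministic target assumed).
        self.traces.setdefault(path_id(path), Trace(tuple(path), faulted))

    def fault_windows(self, width):
        """Last `width` blocks before each fault, for a positive `width`."""
        return [t.path[-width:] for t in self.traces.values() if t.faulted]


store = TraceStore()
store.record([0x1000, 0x1010, 0x1020], faulted=False)
store.record([0x1000, 0x1020, 0x1030, 0x1044], faulted=True)
store.record([0x1000, 0x1010, 0x1020], faulted=False)  # duplicate path, ignored

print(len(store.traces), "unique paths")
print(store.fault_windows(2))
```

The windows around each fault are the pieces that would be handed back to static analysis and turned into instruction abstractions for a model to study.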
  • 47:45 - 47:49
    Carstens: Yeah, so from then on when anything
    went wrong with the computer we said it
  • 47:49 - 47:56
    had bugs in it. laughs All right, I
    promised you a technical journey, I
  • 47:56 - 47:59
    promised you a technical journey into the
    dark abyss of as deep as you want to get
  • 47:59 - 48:03
    with it. So let's go ahead and bring it
    up. Let's wrap it up and bring it up a
  • 48:03 - 48:07
    little bit here. We've talked a great deal
    today about some theory. We've talked
  • 48:07 - 48:10
    about development in our tooling and
    everything else and so I figured I should
  • 48:10 - 48:14
    end with some things that are not in
    progress, but in fact which are done in
  • 48:14 - 48:21
    yesterday's news. Just to go ahead and
    make that shared here with Europe. So in
  • 48:21 - 48:24
    the midst of all of our development we
    have been discovering and reporting bugs,
  • 48:24 - 48:29
    again, this is not our primary purpose really.
    But you know you can't help but do it. You
  • 48:29 - 48:32
    know how computers are these days. You
    find bugs just for turning them on, right?
  • 48:32 - 48:39
    So we've been disclosing all of that. A
    little while ago, at DEFCON and Black Hat,
  • 48:39 - 48:43
    our chief scientist Sarah together with
    Mudge went ahead and dropped this
  • 48:43 - 48:48
    bombshell on the Firefox team which is
    that for some period of time they had ASLR
  • 48:48 - 48:54
    disabled on OS X. When we first found it
    we assumed it was a bug in our tools. When
  • 48:54 - 48:58
    we first mentioned it in a talk they came
    to us and said it's definitely a bug on
  • 48:58 - 49:03
    our tools or might be or some level of
    surprise and then people started looking
  • 49:03 - 49:09
    into it and in fact at one point it had
    been enabled and then temporarily
  • 49:09 - 49:13
    disabled. No one knew, everyone thought it
    was on. It takes someone looking to notice
  • 49:13 - 49:18
    that kind of stuff, right. Major shout out
    though, they fixed it immediately despite
  • 49:18 - 49:24
    our full disclosure on stage and
    everything. So very impressed, but in
  • 49:24 - 49:28
    addition to popping surprises on people
    we've also been doing the usual process of
  • 49:28 - 49:33
    submitting patches and bugs, particularly
    to LLVM and QEMU and if you work in
  • 49:33 - 49:36
    software analysis you could probably guess
    why.
  • 49:37 - 49:39
    Incidentally, if you're looking for a
    target to fuzz if you want to go home from
  • 49:39 - 49:46
    CCC and you want to find a ton of findings
    LLVM comes with a bunch of parsers. You
  • 49:46 - 49:50
    should fuzz them, you should fuzz them and
    I say that because I know for a fact you
  • 49:50 - 49:53
    are gonna get a bunch of findings and it'd
    be really nice. I would appreciate it if I
  • 49:53 - 49:56
    didn't have to pay the people to fix them.
    So if you wouldn't mind disclosing them
  • 49:56 - 50:00
    that would help. But besides these bug
    reports and all these other things we've
  • 50:00 - 50:04
    also been working with lots of others. You
    know we gave a talk earlier this summer,
  • 50:04 - 50:07
    Sarah gave a talk earlier this summer,
    about these things and she presented
  • 50:07 - 50:12
    findings on comparing some of these base
    scores of different Linux distributions.
  • 50:12 - 50:16
    And based on those findings there was a
    person on the Fedora Red Team, Jason
  • 50:16 - 50:20
    Calloway, who sat there and well I can't
    read his mind but I'm sure that he was
  • 50:20 - 50:25
    thinking to himself: golly it would be
    nice to not, you know, be surprised at the
  • 50:25 - 50:29
    next one of these talks. They score very
    well by the way. They were leading in
  • 50:29 - 50:34
    many, many of our metrics. Well, in any
    case, he left Vegas and he went back home
  • 50:34 - 50:37
    and he and his colleagues have been
    working on essentially re-implementing
  • 50:37 - 50:42
    much of our tooling so that they can check
    the stuff that we check before they
  • 50:42 - 50:48
    release. Before they release. Looking for
    security before you release. So that would
  • 50:48 - 50:52
    be a good thing for others to do and I'm
    hoping that that idea really catches on.
  • 50:52 - 50:59
    laughs Yeah, yeah right, that would be
    nice. That would be nice.
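
A minimal sketch of what such a pre-release check might look like for Linux binaries, assuming 64-bit little-endian ELF files only. It looks at two observables: whether the file is of type ET_DYN (which is what a position-independent executable looks like, though shared libraries match too) and whether the PT_GNU_STACK program header marks the stack non-executable. The function names `elf64_mitigations` and `release_blockers` are hypothetical, and this is not a description of the Fedora team's or CITL's actual tooling.

```python
import struct

ET_DYN = 3                 # e_type for PIE executables and shared objects
PT_GNU_STACK = 0x6474E551  # program header describing stack permissions
PF_X = 0x1                 # "executable" permission bit in p_flags


def elf64_mitigations(data: bytes) -> dict:
    """Report two mitigation observables for a 64-bit little-endian ELF."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    # Treat a missing PT_GNU_STACK header as "not known to be safe".
    nx_stack = False
    for i in range(e_phnum):
        offset = e_phoff + i * e_phentsize
        p_type, p_flags = struct.unpack_from("<II", data, offset)
        if p_type == PT_GNU_STACK:
            nx_stack = not (p_flags & PF_X)
    return {"pie": e_type == ET_DYN, "nx_stack": nx_stack}


def release_blockers(report: dict) -> list:
    """Names of the checks that failed, sorted for stable output."""
    return sorted(name for name, ok in report.items() if not ok)
```

A release script could run `release_blockers(elf64_mitigations(data))` over every shipped binary and refuse to publish while the list is non-empty.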
  • 50:59 - 51:04
    But in addition to that, in addition to
    that our mission really is to get results
  • 51:04 - 51:08
    out to the public and so in order to
    achieve that, we have broad partnerships
  • 51:08 - 51:12
    with Consumer Reports and the digital
    standard. Especially if you're into cyber
  • 51:12 - 51:16
    policy, I really encourage you to take a
    look at the proposed digital standard,
  • 51:16 - 51:21
    which is encompassing of the things we
    look for and so much more. URLs,
  • 51:21 - 51:26
    data, traffic, motion and cryptography and
    update mechanisms and all that good stuff.
  • 51:26 - 51:32
    So, where we are and where we're going,
    the big takeaways here, if you're
  • 51:32 - 51:36
    looking for the "so what": three points
    for you: one, we are building the tooling
  • 51:36 - 51:40
    necessary to do larger and larger and
    larger studies regarding these surrogate
  • 51:40 - 51:45
    security scores. My hope is that in some
    period of the not-too-distant future, I
  • 51:45 - 51:49
    would like to be able to, with my
    colleagues, publish some really nice
  • 51:49 - 51:52
    findings about what are the things that
    you can observe in software, which have a
  • 51:52 - 51:57
    suspiciously high correlation with the
    software being good. Right, nobody really
  • 51:57 - 52:00
    knows right now. It's an empirical
    question. As far as I know, the study
  • 52:00 - 52:03
    hasn't been done. We've been running it on
    the small scale. We're building the
  • 52:03 - 52:07
    tooling to do it on a much larger scale.
    We are hoping that this winds up being a
  • 52:07 - 52:11
    useful field in security as that
    technology develops. In the meantime our
  • 52:11 - 52:16
    static analyzers are already making
    surprising discoveries: hit YouTube and
  • 52:16 - 52:21
    take a look for Sarah Zatko's recent talks
    at DEF CON/Black Hat. Lots of fun findings
  • 52:21 - 52:26
    in there. Lots of things that anyone who
    looked would have found. Lots of that.
  • 52:26 - 52:29
    And then lastly, if you were in the
    business of shipping software and you are
  • 52:29 - 52:33
    thinking to yourself.. okay so these guys,
    someone gave them some money to mess up my
  • 52:33 - 52:37
    day and you're wondering: what can I do to
    not have my day messed up? One simple
  • 52:37 - 52:41
    piece of advice, one simple piece of
    advice: make sure your software employs
  • 52:41 - 52:46
    every exploit mitigation technique Mudge
    has ever or will ever hear of. And he's
  • 52:46 - 52:50
    heard of a lot of them. He's only gonna...
    you know, all that. Turn all those things
  • 52:50 - 52:52
    on and if you don't know anything about
    that stuff, if nobody on your team knows
  • 52:52 - 52:57
    anything about that stuff, then... I don't
    even know why I'm saying this. If you're here, you
  • 52:57 - 53:01
    know about that stuff, so do that. If
    you're not here, then you should be here.
  • 53:04 - 53:16
    Danke, Danke.
    Herald Angel: Thank you, Tim and Parker.
  • 53:18 - 53:24
    Do we have any questions from the
    audience? It's really hard to see you with
  • 53:24 - 53:30
    that bright light in my face. I think the
    signal angel has a question. Signal Angel:
  • 53:30 - 53:35
    So the IRC channel was impressed by your
    tools and your models that you wrote. And
  • 53:35 - 53:38
    they are wondering what's going to happen
    to that, because you do have funding from
  • 53:38 - 53:42
    the Ford Foundation now and so what are
    your plans with this? Do you plan on
  • 53:42 - 53:46
    commercializing this or is it going to be
    open source or how do we get our hands on
  • 53:46 - 53:49
    this?
    C: It's an excellent question. So for the
  • 53:49 - 53:54
    time being the money that we are receiving
    is to develop the tooling, pay for the AWS
  • 53:54 - 53:58
    instances, pay for the engineers and all
    that stuff. As for the direction we would
  • 53:58 - 54:01
    like to take things as an organization:
    I have no interest in running a
  • 54:01 - 54:05
    monopoly. That sounds like a fantastic
    amount of work and I really don't want to
  • 54:05 - 54:09
    do it. However, I have a great deal of
    interest in taking the gains that we are
  • 54:09 - 54:14
    making in the technology and releasing the
    data so that other competent researchers
  • 54:14 - 54:19
    can go through and find useful things that
    we may not have noticed ourselves. So
  • 54:19 - 54:22
    we're not at a point where we are
    releasing data in bulk just yet, but that
  • 54:22 - 54:26
    is simply a matter of engineering. Our
    tools are still in flux, you know.
  • 54:26 - 54:29
    When we do that, we want to make sure the
    data is correct and so our software has to
  • 54:29 - 54:34
    have its own low bug counts and all these
    other things. But ultimately that is for the
  • 54:34 - 54:38
    scientific aspect of our mission, though
    the science is not our primary mission.
  • 54:38 - 54:42
    Our primary mission is to apply it to help
    consumers. At the same time, it is our
  • 54:42 - 54:48
    belief that an opaque model is as good as
    crap, no one should trust an opaque model,
  • 54:48 - 54:51
    if somebody is telling you that they have
    some statistics and they do not provide
  • 54:51 - 54:55
    you with any underlying data and it is not
    reproducible you should ignore them.
  • 54:55 - 54:58
    Consequently what we are working towards
    right now is getting to a point where we
  • 54:58 - 55:03
    will be able to share all of those
    findings. The surrogate scores, the
  • 55:03 - 55:06
    interesting correlations between
    observables and fuzzing. All that will be
  • 55:06 - 55:09
    public as the material comes online.
    Signal Angel: Thank you.
  • 55:09 - 55:12
    C: Thank you.
    Herald Angel: Thank you. And microphone
  • 55:12 - 55:15
    number three please.
    Mic3: Hi, thanks. So, some really
  • 55:15 - 55:18
    interesting work you presented here. So
    there's something I'm not sure I
  • 55:18 - 55:23
    understand about the approach that you're
    taking. If you are evaluating the security
  • 55:23 - 55:26
    of say a library function or the
    implementation of a network protocol for
  • 55:26 - 55:30
    example you know there'd be a precise
    specification you could check that
  • 55:30 - 55:35
    against. And the techniques you're using
    would make sense to me. But it's not so
  • 55:35 - 55:38
    clear, since the goal that
    you've set for yourself is to evaluate
  • 55:38 - 55:44
    security of consumer software. It's not
    clear to me whether it's fair to call
  • 55:44 - 55:47
    these results security scores in the
    absence of a threat model. So my
  • 55:47 - 55:50
    question is, you know, how is it
    meaningful to make a claim that a piece of
  • 55:50 - 55:52
    software is secure if you don't have a
    threat model for it?
  • 55:52 - 55:56
    C: This is an excellent question, and
    anyone who disagrees is in the
  • 55:56 - 56:01
    wrong. Security without a threat model is
    not security at all. It's absolutely a
  • 56:01 - 56:06
    true point. So the things that we are
    looking for, most of them are things that
  • 56:06 - 56:09
    you will already find present in your
    threat model. And so for example we were
  • 56:09 - 56:12
    reporting on the presence of things like
    ASLR and lots of other things that get to
  • 56:12 - 56:17
    the heart of exploitability of a piece of
    software. So for example if we are
  • 56:17 - 56:20
    reviewing a piece of software, that has no
    attack surface
  • 56:20 - 56:24
    then it is canonically not in the threat
    model and in that sense it makes no sense
  • 56:24 - 56:29
    to report on its overall security. On the
    other hand, if we're talking about
  • 56:29 - 56:33
    software like say a word processor, a
    browser, anything on your phone, anything
  • 56:33 - 56:36
    that talks on the network, we're talking
    about those kinds of applications then I
  • 56:36 - 56:39
    would argue that exploit mitigations and
    the other things that we are measuring are
  • 56:39 - 56:44
    almost certainly very relevant. So there's
    a sense in which what we are measuring is
  • 56:44 - 56:48
    the lowest common denominator among what
    we imagine are the dominant threat models
  • 56:48 - 56:53
    for the applications. A hand-wavy
    answer, but I promised heuristics, so there
  • 56:53 - 56:55
    you go.
    Mic3: Thanks.
  • 56:55 - 57:02
    C: Thank you.
    Herald Angel: Any questions? No raising
  • 57:02 - 57:07
    hands, okay. And then the herald can ask a
    question, because I never can. So the
  • 57:07 - 57:12
    question is: you mentioned earlier these
    security labels and for example what
  • 57:12 - 57:16
    institution could give out the security
    labels? Because obviously the vendor
  • 57:16 - 57:22
    has no interest in IT security?
    C: Yes it's a very good question. So our
  • 57:22 - 57:26
    partnership with Consumer Reports. I don't
    know if you're familiar with them, but in
  • 57:26 - 57:31
    the United States Consumer Reports is a
    major huge consumer watchdog organization.
  • 57:31 - 57:37
    They test the safety of automobiles, they
    test you know lots of consumer appliances.
  • 57:37 - 57:40
    All kinds of things both to see if they
    function more or less as advertised but
  • 57:40 - 57:45
    most importantly they're checking for
    quality, reliability and safety. So our
  • 57:45 - 57:50
    partnership with Consumer Reports is all
    about us doing our work and then
  • 57:50 - 57:54
    publishing that. And so for example the
    televisions that we presented the data on
  • 57:54 - 57:58
    all of that was collected and published in
    partnership with Consumer Reports.
  • 57:58 - 58:01
    Herald: Thank you.
    C: Thank you.
  • 58:03 - 58:12
    Herald: Any other questions from the stream? I
    hear a no. Well, in this case, people, thank
  • 58:12 - 58:16
    you.
    Thank Tim and Parker for their nice talk
  • 58:16 - 58:20
    and please give them a very, very warm
    round of applause.
  • 58:20 - 58:25
    applause
    C: Thank you. T: Thank you.
  • 58:25 - 58:51
    subtitles created by c3subtitles.de
    in the year 2017. Join, and help us!
Title:
34C3 - How risky is the software you use?
Video Language:
English
Duration:
58:51