Return to Video

Filippo Valsorda: The plain simple reality of entropy

  • 0:00 - 0:18
    Herald: Our next talk is on "the plain simple
    reality of entropy", and
  • 0:18 - 0:22
    we all know that you need randomness and entropy
  • 0:22 - 0:24
    if you want to do something like encryption
  • 0:24 - 0:26
    or generate keys.
  • 0:26 - 0:29
    And if you don't want to do it the xkcd way,
  • 0:29 - 0:32
    using only 4 as the random number,
  • 0:32 - 0:37
    you need a cryptographically secure pseudorandom
    number generator,
  • 0:37 - 0:40
    and what is this, how it works,
  • 0:40 - 0:42
    and where you can find one,
  • 0:42 - 0:44
    will be the topic of this talk.
  • 0:44 - 0:47
    So I present to you Filippo Valsorda,
  • 0:47 - 0:52
    on "How I learned to stop worrying and love
    urandom".
  • 0:52 - 0:59
    applause
  • 0:59 - 1:03
    FV: Hello. Okay, I'm very glad so many people
    showed up,
  • 1:03 - 1:07
    even if I essentially gave away the entire
    content of the talk
  • 1:07 - 1:10
    in the description.
  • 1:10 - 1:11
    Want me to stop, to leave, something?
  • 1:11 - 1:12
    Okay? No.
  • 1:12 - 1:15
    Anyway, hi! I'm Filippo Valsorda,
  • 1:15 - 1:16
    I work for CloudFlare,
  • 1:16 - 1:19
    I do cryptography and systems engineering,
  • 1:19 - 1:23
    I recently implemented the DNSSEC implementation
  • 1:23 - 1:26
    of the CloudFlare DNS server,
  • 1:26 - 1:32
    and maybe in April 2014, you used my Heartbleed
    test.
  • 1:32 - 1:33
    Anyway,
  • 1:33 - 1:35
    applause
  • 1:35 - 1:39
    Well, thank you!
  • 1:39 - 1:44
    Okay. Anyway, I'm here to tell you about random
    bytes.
  • 1:44 - 1:46
    So, here are some random bytes.
  • 1:46 - 1:50
    They're pretty good, you can use them.
  • 1:50 - 1:52
    laughter
  • 1:52 - 1:56
    But, if you need more,
  • 1:56 - 1:59
    Amazon sells this excellent book
  • 1:59 - 2:05
    full of a million random numbers.
  • 2:05 - 2:08
    Anyway. More seriously,
  • 2:08 - 2:11
    random numbers are central to a lot of
  • 2:11 - 2:16
    the security protocols and systems of our
    modern technology.
  • 2:16 - 2:19
    The most obvious is encryption keys.
  • 2:19 - 2:22
    You obviously want your encryption key to
    be random,
  • 2:22 - 2:24
    to be really hard to predict,
  • 2:24 - 2:26
    and you want your encryption key to be different
  • 2:26 - 2:29
    from the person next to you.
  • 2:29 - 2:30
    Unless you're doing key escrow,
  • 2:30 - 2:33
    which, well, we don't point.
  • 2:33 - 2:37
    So, a lot of other different systems
  • 2:37 - 2:41
    use randomness to prevent all kinds of attack.
  • 2:41 - 2:44
    In particular, one amongst many,
  • 2:44 - 2:46
    DNS, using random query IDs
  • 2:46 - 2:49
    to prevent Kaminsky attacks.
  • 2:49 - 2:55
    So, what makes a stream, a source of random
    bytes, good?
  • 2:55 - 2:59
    What are we looking for when we look for good
    random bytes?
  • 2:59 - 3:04
    First of all, we look for uniform random bytes.
  • 3:04 - 3:07
    Every time we draw a random byte from our
    random byte source
  • 3:07 - 3:10
    we want to have the same probability
  • 3:10 - 3:14
    to get all values, from 0 to 255.
  • 3:14 - 3:19
    For example, you don't want your distribution
    to look like this.
  • 3:19 - 3:21
    This is RC4.
  • 3:21 - 3:24
    But that's not enough.
  • 3:24 - 3:28
    You also want your random bytes to be completely
    unpredictable.
  • 3:28 - 3:33
    And here is where the task actually becomes
    difficult.
  • 3:33 - 3:37
    Because if you think about it, we are programming
    computers.
  • 3:37 - 3:39
    Computers are very deterministic machines,
  • 3:39 - 3:44
    even if they don't feel like they are.
  • 3:44 - 3:46
    And they're essentially machines built
  • 3:46 - 3:52
    to sequentially execute always the same set
    of instructions,
  • 3:52 - 3:54
    which we call code.
  • 3:54 - 3:57
    And when we ask them, at some point,
  • 3:57 - 3:59
    to do something that is completely different
  • 3:59 - 4:01
    every time they do it,
  • 4:01 - 4:04
    and two equal computers should do it differently,
  • 4:04 - 4:05
    we get in trouble.
  • 4:05 - 4:11
    So, where can a computer source this randomness?
  • 4:11 - 4:15
    Where can a computer find unpredictability,
  • 4:15 - 4:18
    if it can't have its own?
  • 4:18 - 4:22
    Obviously, in our messy meat world.
  • 4:22 - 4:23
    In our physical world,
  • 4:23 - 4:28
    where everything is not always happening the
    same way.
  • 4:28 - 4:32
    So, user input: every time you type on the
    keyboard,
  • 4:32 - 4:34
    you do that with different timings.
  • 4:34 - 4:35
    When you move your mouse around,
  • 4:35 - 4:37
    you do that differently every time.
  • 4:37 - 4:41
    Or, simply, reading from disk.
  • 4:41 - 4:45
    Every time your computer reads something from
    disk,
  • 4:45 - 4:49
    it takes a slightly different amount of time.
  • 4:49 - 4:53
    Or interrupt times, I/O, you get the idea.
  • 4:53 - 4:57
    So, all these events are visible to the kernel.
  • 4:57 - 4:59
    The kernel is the component of your system
  • 4:59 - 5:03
    which is controlling all these interactions
  • 5:03 - 5:06
    with the outside world, and can measure them
  • 5:06 - 5:10
    and observe them with the right precision.
  • 5:10 - 5:14
    And each of these events can have
  • 5:14 - 5:17
    a wide or narrow range of possible values,
  • 5:17 - 5:19
    for example, when you read from disk,
  • 5:19 - 5:25
    it might take from 0.17 nanoseconds to 1.3
    nanoseconds.
  • 5:25 - 5:29
    I made numbers up.
  • 5:29 - 5:34
    How wide this range is what we call entropy.
  • 5:34 - 5:36
    Essentially it is how many different values,
  • 5:36 - 5:41
    how spread apart the values are,
  • 5:41 - 5:45
    which also means how easy they are to predict.
  • 5:45 - 5:49
    But something they definitely aren't is uniform.
  • 5:49 - 5:51
    Because as I said, for example, reading from
    disk
  • 5:51 - 5:53
    might take in a specific range,
  • 5:53 - 5:57
    definitely not from 0 to 255 nanoseconds.
  • 5:57 - 5:59
    That would be...
  • 5:59 - 6:01
    And usually they're not enough to satisfy
  • 6:01 - 6:04
    all our random bytes needs.
  • 6:04 - 6:08
    So, now we have some unpredictability.
  • 6:08 - 6:12
    We have some events that we can see from our
    system,
  • 6:12 - 6:16
    and we want to turn that into a stream of
    random bytes
  • 6:16 - 6:19
    that we can use to generate SSH keys,
  • 6:19 - 6:22
    and DNS query IDs, etc.
  • 6:22 - 6:25
    Enter a CSPRNG.
  • 6:25 - 6:29
    Cryptographers like their acronyms very long.
  • 6:29 - 6:34
    It's a cryptographically secure pseudorandom
    number generator.
  • 6:34 - 6:37
    applause
  • 6:37 - 6:40
    It's not that hard to pronounce!
  • 6:40 - 6:42
    Okay, it is.
  • 6:42 - 6:46
    Anyway, it's nothing else than a cryptographic
    tool
  • 6:46 - 6:52
    that takes some input and generates an unlimited,
  • 6:52 - 6:53
    reasonably unlimited,
  • 6:53 - 6:57
    stream of random uniform bytes,
  • 6:57 - 7:03
    which depend on all and only the input you
    gave to the CSPRNG.
  • 7:03 - 7:06
    So you can see how we can use this.
  • 7:06 - 7:10
    We have this amount of random events,
  • 7:10 - 7:13
    we feed that into the CSPRNG,
  • 7:13 - 7:17
    and we get out random bytes that we can use
    for everything.
  • 7:17 - 7:22
    So, to understand how a CSPRNG works,
  • 7:22 - 7:27
    I decided to simply present you with a very
    simple one.
  • 7:27 - 7:29
    One based on hash functions.
  • 7:29 - 7:32
    I assume that everyone in the hall
  • 7:32 - 7:35
    knows essentially what hash functions are.
  • 7:35 - 7:39
    But the properties we care about today of
    hash functions are:
  • 7:39 - 7:43
    The fact that the output is uniform.
  • 7:43 - 7:46
    If you take the output of a hash function,
  • 7:46 - 7:48
    all the bits should be indistinguishable from
    random,
  • 7:48 - 7:52
    if you don't know the input.
  • 7:52 - 7:55
    It's impossible to reverse a hash function.
  • 7:55 - 7:57
    If I give you the output of a hash function,
  • 7:57 - 8:00
    you should know nothing more than before
  • 8:00 - 8:03
    about what the input of the hash function
    is,
  • 8:03 - 8:06
    unless you can specifically figure out the
    input
  • 8:06 - 8:10
    and try the hash function.
  • 8:10 - 8:13
    And finally, it takes a limited amount of
    input,
  • 8:13 - 8:17
    and makes a fixed amount of output.
  • 8:17 - 8:19
    These are the properties we are going to use
  • 8:19 - 8:24
    to build a CSPRNG out of hash functions.
  • 8:24 - 8:27
    So. This is how it works.
  • 8:27 - 8:28
    We start with a pool.
  • 8:28 - 8:32
    We call "pool" an array of bytes,
  • 8:32 - 8:35
    and we fill it with zeros to start.
  • 8:35 - 8:38
    And every time a new event comes in,
  • 8:38 - 8:41
    for example, you moved the mouse around,
  • 8:41 - 8:42
    we take that event,
  • 8:42 - 8:45
    we serialize it to some binary format,
  • 8:45 - 8:46
    doesn't really matter.
  • 8:46 - 8:52
    For example, mouse is at position 15 to 835.
  • 8:52 - 8:53
    And we hash together,
  • 8:53 - 8:57
    we hash the concatenation of our pool,
  • 8:57 - 8:59
    which for now is just zeros,
  • 8:59 - 9:03
    and the serialization of this event.
  • 9:03 - 9:06
    We hash them together, we get an output,
  • 9:06 - 9:10
    and that's our new value of the pool.
  • 9:10 - 9:11
    And we repeat.
  • 9:11 - 9:15
    Now, instead of zeros, we have the output
    from before.
  • 9:15 - 9:20
    Now we have this output, and a new event happens.
  • 9:20 - 9:26
    A disk read happens, and it takes exactly
    1.27589 nanoseconds.
  • 9:26 - 9:34
    And we hash together the old contents of the
    pool,
  • 9:34 - 9:38
    this information, disk read happened and it
    took this amount of time,
  • 9:38 - 9:42
    we hash them together and we get a new value
    of the pool.
  • 9:42 - 9:45
    You see where this is going.
  • 9:45 - 9:47
    We keep doing this.
  • 9:47 - 9:49
    Every time a new event comes in,
  • 9:49 - 9:51
    every time the mouse moves,
  • 9:51 - 9:54
    every time a CPU interrupt is raised,
  • 9:54 - 9:56
    every time disk read happens,
  • 9:56 - 9:59
    we call this stirring function
  • 9:59 - 10:03
    to mix this event into this pool.
  • 10:03 - 10:05
    And what do we end up with?
  • 10:05 - 10:08
    We end up with what we call an entropy pool.
  • 10:08 - 10:14
    Now, to figure this value, you need exactly
    all the events
  • 10:14 - 10:16
    that lead to this value.
  • 10:16 - 10:19
    If you're an attacker, and you really want
    to figure out
  • 10:19 - 10:22
    what my entropy pool is,
  • 10:22 - 10:25
    you don't, you're not supposed to have any
    better way
  • 10:25 - 10:29
    to figure it out than to guess all the different
  • 10:29 - 10:32
    hard disk timings and mouse movements that
    happened
  • 10:32 - 10:35
    all the way up to now.
  • 10:35 - 10:41
    Okay? So now we have this essentially unpredictable
    value,
  • 10:41 - 10:44
    but now we want to generate keys out of it,
  • 10:44 - 10:49
    and we can't just use these few bytes here.
  • 10:49 - 10:53
    So we can use again hash functions.
  • 10:53 - 10:54
    Same hash function.
  • 10:54 - 10:58
    We take the entropy pool, and we hash it with
    a counter.
  • 10:58 - 11:03
    You want 5000 random bits? Sure.
  • 11:03 - 11:05
    You hash entropy pool and 0,
  • 11:05 - 11:10
    hash entropy pool and 1, and 2, 3, 4, 5, 6,
    7, 8, 9.
  • 11:10 - 11:13
    You get all these outputs, you concatenate
    them,
  • 11:13 - 11:19
    and now you have 5000 bits, which are as unpredictable
  • 11:19 - 11:23
    as all the events that were stirred into the
    pool.
  • 11:23 - 11:26
    Let's think about it for a second.
  • 11:26 - 11:29
    We said that hash functions are not invertible,
  • 11:29 - 11:31
    so even if you know one of the outputs,
  • 11:31 - 11:34
    you can't get back to the entropy pool.
  • 11:34 - 11:37
    And we said that hash functions have,
  • 11:37 - 11:41
    that with hash functions, all the bits in
    input
  • 11:41 - 11:44
    affect all the bits of the output.
  • 11:44 - 11:47
    So even if just the counter changes
  • 11:47 - 11:50
    between one rand and the other,
  • 11:50 - 11:52
    the output is completely unrelated.
  • 11:52 - 11:56
    So, did we get what we want?
  • 11:56 - 11:58
    It's uniform, because we said before,
  • 11:58 - 12:01
    hash functions' outputs are uniform.
  • 12:01 - 12:04
    It's unpredictable, because the only way an
    attacker has
  • 12:04 - 12:08
    to figure out what the output will be
  • 12:08 - 12:13
    is imagine or brute-force or observe, I guess,
  • 12:13 - 12:17
    all the hard-disk timings and user inputs,
  • 12:17 - 12:20
    which is impossible for a third party.
  • 12:20 - 12:22
    And it's unlimited, because we can keep
  • 12:22 - 12:25
    incrementing that counter forever.
  • 12:25 - 12:29
    Now, really please don't go implement this
    scheme
  • 12:29 - 12:32
    and say "Filippo told me it was okay".
  • 12:32 - 12:33
    No.
  • 12:33 - 12:38
    Also because it's exactly not what this talk
    is about.
  • 12:38 - 12:45
    So, if CSPRNGs, if we have this tool
  • 12:45 - 12:48
    to turn some unpredictable events
  • 12:48 - 12:52
    into an unlimited stream of random bytes,
  • 12:52 - 12:53
    which is what we need,
  • 12:53 - 12:55
    and we have all these unpredictable events
  • 12:55 - 12:58
    observed by the kernel,
  • 12:58 - 13:02
    doesn't it make sense to just put a CSPRNG
    in the kernel
  • 13:02 - 13:05
    and just have the kernel run the CSPRNG
  • 13:05 - 13:09
    when we need random bytes?
  • 13:09 - 13:13
    It's such a good idea that it's exactly what
    Linux did,
  • 13:13 - 13:16
    and all the other operating systems.
  • 13:16 - 13:19
    In Linux, it's called /dev/urandom,
  • 13:19 - 13:23
    and it looks like a file, you read it like
    a file,
  • 13:23 - 13:25
    and it's technically a character device
  • 13:25 - 13:28
    and every time you read 100 bytes from it,
  • 13:28 - 13:31
    it runs a CSPRNG, on an entropy pool
  • 13:31 - 13:35
    not different from the one I've presented,
  • 13:35 - 13:41
    and this entropy pool is stirred with all
    the events
  • 13:41 - 13:47
    that the kernel saw happen from its privileged
    position.
  • 13:47 - 13:51
    Other operating systems have something similar.
  • 13:51 - 13:55
    OS X and BSD have /dev/random
  • 13:55 - 13:59
    which is exactly what /dev/urandom is on Linux,
  • 13:59 - 14:01
    and on Windows you can get something similar
  • 14:01 - 14:07
    with a CryptGenRandom call.
  • 14:07 - 14:08
    One last thing.
  • 14:08 - 14:11
    Putting the CSPRNG in the kernel
  • 14:11 - 14:13
    is not only about convenience,
  • 14:13 - 14:15
    it's also about security.
  • 14:15 - 14:17
    Because, first of all,
  • 14:17 - 14:20
    the kernel is the entity that can observe
  • 14:20 - 14:22
    the unpredictable events.
  • 14:22 - 14:26
    If you take a CSPRNG, which is just code,
  • 14:26 - 14:28
    so you can implement your own,
  • 14:28 - 14:30
    and you implement it in your library,
  • 14:30 - 14:32
    or in your application,
  • 14:32 - 14:33
    now you have the problem of,
  • 14:33 - 14:38
    how do you take the random, the unpredictable
    events
  • 14:38 - 14:42
    from the kernel and take them to the application?
  • 14:42 - 14:46
    This is something that you can forget to do,
  • 14:46 - 14:47
    often,
  • 14:47 - 14:48
    or do wrong.
  • 14:48 - 14:52
    And, moreover, the kernel can protect
  • 14:52 - 14:55
    the memory space of the entropy pool
  • 14:55 - 14:58
    much better than applications.
  • 14:58 - 15:00
    For example, applications can fork,
  • 15:00 - 15:03
    there's a whole lot of different things
  • 15:03 - 15:05
    that applications can get wrong.
  • 15:05 - 15:10
    And finally, you have one single centralized
    implementation
  • 15:10 - 15:14
    that is reasonably easy to audit.
  • 15:14 - 15:19
    I don't know, was anyone managing Debian servers
    in 2008?
  • 15:19 - 15:20
    laughter
  • 15:20 - 15:25
    Just asking. Unrelated. Right.
  • 15:25 - 15:29
    So, yeah. /dev/urandom.
  • 15:29 - 15:34
    So, we have a solution, right?
  • 15:34 - 15:38
    We have a tool to turn unpredictable events
  • 15:38 - 15:42
    into an unlimited uniform stream of random
    bytes,
  • 15:42 - 15:46
    we have a source of unpredictable events,
  • 15:46 - 15:49
    solved!
  • 15:49 - 15:50
    What are everybody talking about?
  • 15:50 - 15:53
    Why is there even a need for a talk?
  • 15:53 - 16:02
    Well. Sadly, there's some common misconceptions
    in the field,
  • 16:02 - 16:06
    which is also why I'm here to give this talk.
  • 16:06 - 16:11
    One of the most common is fueled by the very
    Linux man pages.
  • 16:11 - 16:14
    The recent versions are better but
  • 16:14 - 16:17
    they still give you this impression
  • 16:17 - 16:20
    that if you want real security,
  • 16:20 - 16:22
    you should be using /dev/random,
  • 16:22 - 16:25
    because /dev/urandom is okay, but, hmm, kinda...
  • 16:25 - 16:29
    and, well, we want real security, right?
  • 16:29 - 16:32
    But you might ask yourself, okay,
  • 16:32 - 16:34
    if /dev/urandom is a CSPRNG,
  • 16:34 - 16:38
    and a CSPRNG is all I need,
  • 16:38 - 16:40
    what else can I get,
  • 16:40 - 16:45
    what does /dev/random have more?
  • 16:45 - 16:48
    Well, the idea of this talk is giving you
    the knowledge
  • 16:48 - 16:52
    to figure out by yourself whether you need
    /dev/random or not.
  • 16:52 - 16:55
    So, first I explained how a CSPRNG works,
  • 16:55 - 16:58
    now I'm going to go a bit into the details
  • 16:58 - 17:02
    of how /dev/urandom and /dev/random work.
  • 17:02 - 17:06
    These are taken directly from the kernel source.
  • 17:06 - 17:11
    Both /dev/urandom and /dev/random are...
  • 17:11 - 17:14
    Yeah. Sorry.
  • 17:14 - 17:16
    Essentially, everything I'm going to say now
  • 17:16 - 17:20
    applies to both /dev/urandom and /dev/random.
  • 17:20 - 17:24
    They both are based on a pool of 4000 bits,
  • 17:24 - 17:29
    not dissimilar from the one of the CSPRNG
    we played with before,
  • 17:29 - 17:36
    which is implemented as a series of 32-bits
    words, I think.
  • 17:36 - 17:39
    The pool is mixed with all the unpredictable
    events,
  • 17:39 - 17:41
    using a CRC-like function.
  • 17:41 - 17:45
    This is not a cryptographically secure hash
    function,
  • 17:45 - 17:48
    but this is just about how the unpredictable
    events,
  • 17:48 - 17:53
    the interrupts, the disk timings, are stirred
  • 17:53 - 17:56
    into the internal pool.
  • 17:56 - 17:58
    Every time one of these events happens,
  • 17:58 - 18:00
    this very fast function kicks in
  • 18:00 - 18:07
    and stirs the pool with the unpredictable
    event.
  • 18:07 - 18:13
    Then extraction, so actual random byte generation,
  • 18:13 - 18:15
    happens with SHA-1.
  • 18:15 - 18:17
    So you want some random bytes from the kernel,
  • 18:17 - 18:22
    what the kernel does is just run SHA-1 on
    the pool,
  • 18:22 - 18:23
    give you the output,
  • 18:23 - 18:27
    and also take the output and feed it back
    into the pool
  • 18:27 - 18:29
    using that mixing function.
  • 18:29 - 18:32
    This is a big difference, you might have noticed,
  • 18:32 - 18:35
    from our design, which is a counter,
  • 18:35 - 18:39
    because keeping counters, turns out, is still
    hard.
  • 18:39 - 18:44
    And they can reset, you can lose count, that's
    bad.
  • 18:44 - 18:49
    Also, this has more security properties against
    compromise.
  • 18:49 - 18:51
    So what is does is simply that,
  • 18:51 - 18:54
    when it generates output, it also stirs it
    back,
  • 18:54 - 18:55
    and if you need more output,
  • 18:55 - 18:58
    SHA-1 again on the new pool
  • 18:58 - 19:01
    gives output and stirs it back into the pool,
  • 19:01 - 19:03
    so that the pool keeps changing.
  • 19:03 - 19:07
    Now, both /dev/urandom and /dev/random
  • 19:07 - 19:10
    do the exact same thing.
  • 19:10 - 19:13
    Same code, same sizes, same entropy sources,
  • 19:13 - 19:15
    literally in the source,
  • 19:15 - 19:18
    random_read is a call to extract_entropy_user,
  • 19:18 - 19:23
    urandom_read is a call to extract_entropy_user.
  • 19:23 - 19:27
    The only difference is
  • 19:27 - 19:30
    I finally get to what's special about /dev/random,
  • 19:30 - 19:34
    is that it tries to do a couple of really
    hard and weird things.
  • 19:34 - 19:39
    First, it tries to guess how many bits of
    entropy
  • 19:39 - 19:44
    were mixed into the pool after each unpredictable
    event.
  • 19:44 - 19:46
    This is already very hard, because, think
    about it,
  • 19:46 - 19:53
    a disk read took 1.735 nanoseconds. Great.
  • 19:53 - 19:56
    We don't know how many different values this
    might take.
  • 19:56 - 20:00
    We don't know if this is a spinning hard disk,
  • 20:00 - 20:02
    which has timings all over the place,
  • 20:02 - 20:05
    or if it's a SSD which almost always takes
    the same time.
  • 20:05 - 20:08
    So we don't know how much predictable this
    is.
  • 20:08 - 20:09
    So this is already hard,
  • 20:09 - 20:13
    figuring out how unpredictable the pool is.
  • 20:13 - 20:18
    So it keeps a counter, arbitrary number of
    how many bits
  • 20:18 - 20:21
    of entropy, how much unpredictability
  • 20:21 - 20:25
    there is in this pool.
  • 20:25 - 20:28
    And then, when you run the hash function on
    the pool,
  • 20:28 - 20:31
    it decreases this count,
  • 20:31 - 20:34
    it reduces this number.
  • 20:34 - 20:38
    And if this number gets too low, it blocks
    you.
  • 20:38 - 20:39
    So you're reading from /dev/random,
  • 20:39 - 20:41
    this number dwindles,
  • 20:41 - 20:44
    so now you're still reading from /dev/random
  • 20:44 - 20:45
    but you're blocked
  • 20:45 - 20:49
    until more unpredictable events happen.
  • 20:49 - 20:52
    This is useless in the modern world.
  • 20:52 - 20:54
    Because entropy does not decrease.
  • 20:54 - 20:59
    Entropy does not run out, and everything freezes.
  • 20:59 - 21:01
    Once the pool becomes unpredictable
  • 21:01 - 21:04
    because too many different events contributed
  • 21:04 - 21:07
    to how the entropy pool looks like,
  • 21:07 - 21:09
    it's forever unpredictable,
  • 21:09 - 21:13
    because the attacker doesn't learn anything
    from the output.
  • 21:13 - 21:16
    Obviously, unless the CSPRNG is broken
  • 21:16 - 21:20
    and is leaking information about the entropy
    pool.
  • 21:20 - 21:23
    However, saying that CSPRNGs are broken
  • 21:23 - 21:29
    is equivalent to saying that a lot of cryptography
    constructs are broken.
  • 21:29 - 21:32
    It's saying that stream ciphers are broken,
  • 21:32 - 21:34
    it's saying that CTR mode is broken,
  • 21:34 - 21:37
    it's saying that TLS and PGP are broken,
  • 21:37 - 21:39
    because they're both about reusing the same
    key
  • 21:39 - 21:42
    for multiple packets or messages.
  • 21:42 - 21:45
    So if cryptographers didn't know how to build
  • 21:45 - 21:48
    a secure CSPRNG,
  • 21:48 - 21:50
    it would mean that cryptographers weren't
    able
  • 21:50 - 21:55
    to build most of the things we're relying
    on today.
  • 21:55 - 21:57
    It would mean that cryptography was doomed.
  • 21:57 - 22:02
    Now, I'm not DJB, I can't tell you if cryptography
    is doomed.
  • 22:02 - 22:06
    But I can tell you that if cryptography is
    doomed,
  • 22:06 - 22:08
    your problem is not your CSPRNG.
  • 22:08 - 22:09
    laughter
  • 22:09 - 22:13
    So, cryptography relies on being able
  • 22:13 - 22:16
    to build secure CSPRNGs.
  • 22:16 - 22:17
    And on the other hand,
  • 22:17 - 22:21
    that makes /dev/random blocking useless, obviously.
  • 22:21 - 22:23
    It can be unacceptable, too, because
  • 22:23 - 22:26
    you get a TLS request, and you're like
  • 22:26 - 22:29
    "I have that HTTP page, but wait a second,
  • 22:29 - 22:32
    I need someone to start typing
  • 22:32 - 22:37
    on the keyboard of the rack to serve it to
    you."
  • 22:37 - 22:38
    And it can even be dangerous,
  • 22:38 - 22:40
    because you're essentially giving away information
  • 22:40 - 22:43
    about what other users in the system are doing
  • 22:43 - 22:46
    to other users.
  • 22:46 - 22:48
    On the other hand,
  • 22:48 - 22:50
    /dev/urandom is safe for any cryptography
    use
  • 22:50 - 22:52
    you want to use it for.
  • 22:52 - 22:54
    You want to generate long-term keys...
  • 22:54 - 23:00
    My GPG keys are generated from /dev/urandom.
  • 23:00 - 23:02
    And I'm not the only one saying this,
  • 23:02 - 23:06
    BoringSSL, Python, Go, Ruby, use /dev/urandom
  • 23:06 - 23:10
    as the only source, the only CSPRNG.
  • 23:10 - 23:13
    Sandstorm even replaces /dev/random with it.
  • 23:13 - 23:15
    And here is a long list of people
  • 23:15 - 23:20
    saying exactly what I'm here on stage to tell
    you.
  • 23:20 - 23:26
    So, I hope that at the end of this, you see
  • 23:26 - 23:30
    that you don't actually need /dev/random,
  • 23:30 - 23:31
    as well as you don't need to keep measuring
  • 23:31 - 23:34
    how much entropy you have in the pool,
  • 23:34 - 23:36
    you don't need to refill the pool
  • 23:36 - 23:38
    with things like haveged or
  • 23:38 - 23:40
    I don't know how to pronounce it.
  • 23:40 - 23:46
    Actually I've even seen people take output
    from /dev/urandom
  • 23:46 - 23:49
    and pipe it back as root into /dev/random
  • 23:49 - 23:52
    so that the entropy doesn't run out,
  • 23:52 - 23:58
    which is exactly what the kernel is doing!
  • 23:58 - 24:03
    Which is, obviously, a pretty upvoted answer
    on StackOverflow.
  • 24:03 - 24:05
    laughter
  • 24:05 - 24:09
    Anyway. And finally,
  • 24:09 - 24:10
    random number quality does not decrease,
  • 24:10 - 24:13
    there are not like premium-level random numbers
  • 24:13 - 24:18
    and then they kinda rot after you use them
    for a while.
  • 24:18 - 24:21
    No, that's not a thing.
  • 24:21 - 24:26
    Okay. So, there is only one small case
  • 24:26 - 24:30
    in which /dev/urandom does not do exactly
  • 24:30 - 24:34
    what we would expect it to do, which is early
    at boot.
  • 24:34 - 24:34
    If you think about it,
  • 24:34 - 24:39
    everything we said is about using unpredictable
    events
  • 24:39 - 24:41
    to build up unpredictability.
  • 24:41 - 24:44
    As soon as you boot the machine,
  • 24:44 - 24:48
    you don't have observed enough events yet.
  • 24:48 - 24:51
    So this got embedded devices,
  • 24:51 - 24:56
    this got the Raspberry Pi recently,
  • 24:56 - 24:58
    essentially it's a Linux shortcoming,
  • 24:58 - 25:00
    which by now it's too late to fix,
  • 25:00 - 25:02
    which is the fact that /dev/urandom will not
    block
  • 25:02 - 25:06
    even at boot, before being initialized.
  • 25:06 - 25:10
    The solution in most cases is just that the
    distribution
  • 25:10 - 25:13
    should save the state of the pool at power-off,
  • 25:13 - 25:16
    and reload it at power-on,
  • 25:16 - 25:19
    or block until the pool is initialized.
  • 25:19 - 25:23
    So, your distribution probably solves this
    for you anyway.
  • 25:23 - 25:28
    So, to sum up, CSPRNGs are pretty cool, and
    they work.
  • 25:28 - 25:31
    You don't need /dev/random.
  • 25:31 - 25:34
    You shouldn't use userspace CSPRNGs
  • 25:34 - 25:36
    because they're very easy to get wrong.
  • 25:36 - 25:39
    And if you need 100 random bytes,
  • 25:39 - 25:42
    read 100 bytes from /dev/urandom.
  • 25:42 - 25:44
    That's it!
  • 25:44 - 25:55
    applause
  • 25:55 - 25:58
    I glossed over a lot of different ways to
    do it wrong,
  • 25:58 - 26:01
    so if you have questions about why not this
    other thing,
  • 26:01 - 26:03
    please, come forward.
  • 26:03 - 26:07
    Herald: Okay, and because people on the stream
    can't be here in person,
  • 26:07 - 26:10
    we will start with questions from the Internet.
  • 26:10 - 26:14
    Q: The first question is: How do you explain,
  • 26:14 - 26:18
    regarding what you explained of /dev/random
    versus /dev/urandom,
  • 26:18 - 26:21
    the fact that on the 4.3.3 kernel, /dev/random
    output
  • 26:21 - 26:26
    is identical with something from /dev/input
    something?
  • 26:26 - 26:27
    Someone claimed that.
  • 26:27 - 26:30
    FV: I'm sorry, you have to repeat. On the
    what?
  • 26:30 - 26:34
    Q: On a kernel 4.3.3, someone claims that
    sometimes,
  • 26:34 - 26:37
    the output from /dev/random, or /dev/unrandom,
  • 26:37 - 26:40
    is identical to something that comes from
    /dev/input,
  • 26:40 - 26:42
    like an input device.
  • 26:42 - 26:46
    FV: I'm not sure I got what system, but...
  • 26:46 - 26:47
    oh my god, what system?
  • 26:47 - 26:51
    Q: Linux, Linux 4.3.3, the guy claims.
  • 26:51 - 26:54
    FV: That sounds like a pretty bad bug, but...
  • 26:54 - 26:55
    I don't know.
  • 26:55 - 26:58
    If that's the case, I'm not aware of it,
  • 26:58 - 26:59
    because I read the kernel source,
  • 26:59 - 27:04
    and it's really a call to extract_entropy_user.
  • 27:04 - 27:05
    File a bug report, maybe?
  • 27:05 - 27:06
    No, I mean, I'm joking.
  • 27:06 - 27:09
    I would be happy to talk about this offline.
  • 27:09 - 27:12
    Herald: Is there another question from the
    stream?
  • 27:12 - 27:14
    Q: Yes, I have two more questions.
  • 27:14 - 27:17
    One is: What do you think about hardware entropy
    generators,
  • 27:17 - 27:18
    or hardware random generators?
  • 27:18 - 27:23
    FV: Aha! I have a slide for this!
  • 27:23 - 27:29
    laughter
  • 27:29 - 27:33
    So, hardware random number generators, very
    quickly.
  • 27:33 - 27:37
    Some CPUs on some platforms have real random
    number generators.
  • 27:37 - 27:41
    Essentially, I'm told they use electrical
    noises
  • 27:41 - 27:43
    to give you actual randomness.
  • 27:43 - 27:47
    Linux has support for them, and, if they're
    loaded,
  • 27:47 - 27:50
    they will immediately be used to refuel this
    pool,
  • 27:50 - 27:54
    and they will also be used as the initialization
    vectors
  • 27:54 - 27:56
    for the SHA-1 of this extraction.
  • 27:56 - 27:59
    So, if they're turned on, you don't have to
    worry about them,
  • 27:59 - 28:03
    and they will make /dev/urandom work even
    better.
  • 28:03 - 28:05
    Yep.
  • 28:05 - 28:09
    applause
  • 28:09 - 28:11
    Herald: Okay, quick question from the stream.
  • 28:11 - 28:17
    Q: Yeah, someone wants your opinion about
    entropy-gathering daemons
  • 28:17 - 28:18
    like havege daemons.
  • 28:18 - 28:21
    FV: There was probably a time when they had
  • 28:21 - 28:27
    their reason to exist, maybe because Linux
    implementations
  • 28:27 - 28:30
    of this entropy gathering was not that good,
  • 28:30 - 28:33
    today they don't really have reason to exist.
  • 28:33 - 28:37
    Herald: Okay, thank you. And microphone 4,
    please.
  • 28:37 - 28:40
    Q: Hello. I wanted to ask about the early
    boot problem.
  • 28:40 - 28:46
    You say that we should mix, that we should
    save the state
  • 28:46 - 28:52
    of /dev/urandom, what happens if a machine
    crashes?
  • 28:52 - 28:55
    Wouldn't you restart from an earlier state
    of /dev/urandom?
  • 28:55 - 28:59
    FV: Hm, yeah, I think that the correct way
    to do this is,
  • 28:59 - 29:06
    as soon as, even before the input is used
    to initialize the pool,
  • 29:06 - 29:09
    the one from the last shutdown,
  • 29:09 - 29:12
    it should be deleted from the disk,
  • 29:12 - 29:13
    and the disk flushed.
  • 29:13 - 29:17
    Yes, it's kind of hard, yes.
  • 29:17 - 29:20
    Herald: Okay, and unfortunately we are running
    out of time,
  • 29:20 - 29:24
    because we have to clear this room. And you
    have a short announcement?
  • 29:24 - 29:30
    FV: Oh, yeah! Tomorrow, at 15.30, I am giving
    a quick workshop
  • 29:30 - 29:35
    about how to implement a Vaudenay padding
    oracle attack in Hall 14.
  • 29:35 - 29:39
    I think it doesn't take as many people as
    are here now...
  • 29:39 - 29:41
    so maybe I shouldn't have said that.
  • 29:41 - 29:44
    Herald: Okay, then! Thanks again, Filippo,
    for the talk.
  • 29:44 - 29:48
    applause
  • 29:48 - 30:00
    subtitles created by c3subtitles.de
    Join, and help us!
Title:
Filippo Valsorda: The plain simple reality of entropy
Description:

more » « less
Video Language:
English
Duration:
30:00

English subtitles

Revisions Compare revisions