36C3 - What's left for private messaging?

    36C3 preroll music
    Herald: Please put your hands together and
    give a warm round of applause to Will Scott.
    Will Scott: Thank you.
    Will: All right. Welcome. So. The basic
    structure of this talk is sort of twofold.
    The first thing is to provide an overview
    of the different mechanisms that exist in
    this space of secure communication and try
    to tease apart a bunch of the individual
    choices and tradeoffs that have to be made
    and the implications of them. Because a
    lot of times we talk about security or
    privacy as very broad terms that cover a
    bunch of individual things. And breaking
    that down gives us a better way to
    understand what it is we're giving up or
    whether or why these decisions actually
    get made for the systems that we end up
    using. And the arc that I'll cover is first trying
    to provide a sort of taxonomy or
    classification of a bunch of the different
    systems that we see around us. And from
    there identify the threats that we often
    are trying to protect against and the
    mechanisms that we have to mitigate those
    threats and then go into some of these
    mechanisms and look at what's happening
    right now on different systems. And by the
    end, we'll sort of be closer to the
    research frontier of what is still
    happening, where are places where we have
    new ideas, but there's still quite a high
    tradeoff to usability or for other reasons
    where these haven't gained mass adoption.
    So I'll introduce our actors: Alice and
    Bob. The basic structure for pretty much
  • 2:11 - 2:18
    all of this is one to one messaging. So
    this is primarily systems that are
  • 2:18 - 2:21
    enabling us to have a conversation that
    looks a lot like what we would have in
  • 2:21 - 2:26
    person. That's sort of the thing that
    we're modelling is I want to have a
  • 2:26 - 2:30
    somewhat synchronous real time
    communication over a span of weeks,
  • 2:30 - 2:35
    months, years, resume it, and in the same
    way that in real life I know someone and I
  • 2:35 - 2:38
    recognize them when I come and talk to
    them again I expect the system to give me
  • 2:38 - 2:42
    similar sorts of properties.
    So the way
    we're going to then think about systems is
  • 2:45 - 2:52
    initially, we have systems that look very
    much the same as how we would have a real
  • 2:52 - 3:00
    life communication, where I can - on a
    local network - use AirDrop or use a bunch
  • 3:00 - 3:04
    of things that just work directly between
    my device and a friend's device
  • 3:04 - 3:07
    to communicate.
  • 3:07 - 3:10
    On a computer, this might
    look like using Netcat or a command line
  • 3:10 - 3:15
    tool to just push data directly to the
    other person. And this actually results in
  • 3:15 - 3:18
    a form of communication that looks very
    similar. Right, it's ephemeral, it goes
  • 3:18 - 3:24
    away afterwards unless the other person
    saves it. But there is already a set of
  • 3:24 - 3:27
    adversaries or threats that we can think
    about how do we secure this sort of
  • 3:27 - 3:30
    communication. One of those would be the
    network. So, can someone else see this
  • 3:35 - 3:39
    communication and how do we hide from
    that? And we have mechanisms against that,
  • 3:39 - 3:44
    namely encryption. Right, I can
    disguise my communication and encrypt it
  • 3:44 - 3:50
    so that someone who is not my intended
    recipient cannot see what's happening.
  • 3:50 - 3:54
    And then the other one would be the
    other...these end devices themselves.
  • 3:54 - 3:58
    Right, so there's a couple of things that
    we need to think about when we think about
  • 3:58 - 4:00
    what is it that we're trying to protect
    against on an end device. One is there
  • 4:00 - 4:06
    might be other bad software that gets
    installed and tries
  • 4:06 - 4:10
    to steal or learn about what was said,
    either at the same time or later.
  • 4:12 - 4:15
    And so we have mechanisms
    there. One of them would be message
  • 4:15 - 4:20
    expiry. So we can make the messages go
    away, make sure we delete them from disk
  • 4:20 - 4:25
    at some point. And the other would be
    making sure that we've sort of isolated
  • 4:25 - 4:29
    our chats so that it doesn't overlap and
    other applications can't see what's
  • 4:29 - 4:32
    happening there.
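As a rough illustration of message expiry, here is a minimal Python sketch of a chat log that forgets messages after a fixed lifetime. The class name `ExpiringChatLog` and its methods are hypothetical, and it only covers an in-memory list; actually erasing data from disk or flash storage is a harder problem that this does not address.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ExpiringChatLog:
    """In-memory chat log that drops messages older than ttl_seconds."""
    ttl_seconds: float
    _messages: List[Tuple[float, str]] = field(default_factory=list)

    def add(self, text: str, now: Optional[float] = None) -> None:
        # Store the message together with the time it arrived.
        timestamp = time.time() if now is None else now
        self._messages.append((timestamp, text))

    def purge(self, now: Optional[float] = None) -> int:
        """Remove expired messages and return how many were dropped."""
        current = time.time() if now is None else now
        before = len(self._messages)
        self._messages = [
            (ts, text) for ts, text in self._messages
            if current - ts < self.ttl_seconds
        ]
        return before - len(self._messages)

    def visible(self, now: Optional[float] = None) -> List[str]:
        # Always purge first so expired messages are never shown.
        self.purge(now)
        return [text for _, text in self._messages]
```

The `now` parameter exists only so the behaviour can be exercised without waiting; a real client would run the purge on a timer and on startup.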
    So, we have these direct
    communication patterns but that's a small
  • 4:36 - 4:43
    minority of most of what we think of when
    we chat. Instead, most of the systems that
  • 4:43 - 4:48
    we're using online use a centralized
    server. There's some logically centralized
  • 4:48 - 4:53
    thing in the cloud and I send my messages
    there and it then forwards them to my
  • 4:53 - 4:59
    intended recipient. And so whether it's
    Facebook or WhatsApp or
  • 4:59 - 5:05
    Slack or IRC or Signal or Wire or Threema
    or whatever, you know, cloud chat app
  • 5:05 - 5:13
    we're using today, this same model
    applies. So we can identify additional
  • 5:13 - 5:20
    threats here and then we can think about
    why we do this. So one threat is the
  • 5:20 - 5:24
    network. And I'll tear that apart a little
    bit. You've got the local network that we
  • 5:24 - 5:29
    had before. So someone who's on the
    network near the person who's sending
  • 5:29 - 5:33
    messages or receiving messages, so someone
    else in the coffee shop, your local
  • 5:33 - 5:39
    organization, your school, your work,
    you've got the Internet as a whole that
  • 5:39 - 5:45
    messages are passing over. So the ISPs or
    the countries that you're in may want to
  • 5:45 - 5:49
    look at or prevent you from sending
    messages. You've also got an adversary in
  • 5:49 - 5:54
    the network, sort of local or near the
    server that can see most of the messages
  • 5:54 - 5:58
    going in and out of the server because
    these services have to exist somewhere be
  • 5:58 - 6:04
    that in a data center that they physically
    have computers in or in AWS or Google or
  • 6:04 - 6:09
    one of these other clouds. And now you've
    got a set of actors that you need to think
  • 6:09 - 6:12
    about that are near the server that can
    see most of the traffic going in and out
  • 6:12 - 6:14
    of that server.
    We also have to think
    about the server itself as a potential
  • 6:18 - 6:22
    adversary. There's a few different threats
    that we need to think about. The server
  • 6:22 - 6:27
    could get hacked
    or otherwise compromised. So parts of
  • 6:27 - 6:32
    the communication or bugs in the software
    can potentially be a problem.
  • 6:32 - 6:34
    You've got a
    legal entity typically that is running
    this server. And so the jurisdiction
  • 6:39 - 6:44
    that it's in can send requests to get data
    from users or to compel it to provide
  • 6:44 - 6:49
    information. So there's this whole threat
    of what is the server required to turn
  • 6:49 - 6:55
    over. And then you've got sort of how is
    the server actually or this company making
  • 6:55 - 6:59
    money and sustaining itself. Is it going
    to get acquired by someone that you don't
  • 6:59 - 7:03
    trust, even if you trust it now? So
    there's this future view of how do we
  • 7:03 - 7:09
    ensure that the messages I have now don't
    get misused in the future?
    And we have a
  • 7:10 - 7:14
    set of techniques that mitigate these
    problems as well. So one of them would
  • 7:14 - 7:19
    be we can use traffic obfuscation or
    circumvention techniques to make our
  • 7:19 - 7:26
    traffic look less obvious to the network.
    And that prevents a large amount of these.
  • 7:26 - 7:29
    And then, I'm calling this server hardening
    but it's really a sort of a broad set of
  • 7:29 - 7:34
    techniques around how do we trust the
    server less? And how do we make those
  • 7:34 - 7:39
    potential compromises of the server,
    either code based or it having to reveal
  • 7:39 - 7:43
    information less damaging?
    It's worth
    saying that there are a bunch of reasons
  • 7:47 - 7:51
    why we have primarily used centralized
  • 7:51 - 7:53
    systems. You've got availability. It's
  • 7:53 - 7:58
    very easy to go to a single place and it
    also makes a bunch of problems easier, like
  • 7:58 - 8:04
    handling multiple devices and mobile push
    in particular, because both Google and
  • 8:04 - 8:10
    Apple expect or allocate sort of a single
    authorized provider who can send
  • 8:10 - 8:16
    notifications to the app user's mobile
    devices. And so that sort of requires you
  • 8:16 - 8:20
    to have a centralized place that knows
    when to send those messages if you want to
  • 8:20 - 8:24
    provide real time alerts to your
    application users.
  • 8:24 - 8:26
  • 8:26 - 8:32
    The downside is both cost, in that there's some entity now
    that is responsible for all of this cost
  • 8:32 - 8:36
    and has to have a business model and also
    that there is a single entity that people
  • 8:36 - 8:41
    can come to and that now faces the legal
    and regulatory issues.
  • 8:41 - 8:42
  • 8:42 - 8:47
    But centralized is not the only type of system we have, right? The
    next most common is probably federated.
  • 8:47 - 8:53
    Email is a great example of this. And
    email is nice in that now as a user I can
  • 8:53 - 8:58
    choose an email provider that I trust out
    of many, or if I don't trust any of the
  • 8:58 - 9:03
    ones that I see, I can even spin up my own
    with a small group so we can decentralize
  • 9:03 - 9:10
    cost. We can make this more approachable.
    And so while I can gain more confidence in
  • 9:10 - 9:16
    my individual provider, I don't have as
    much trust in the recipient's side:
  • 9:16 - 9:22
    with Bob in this case, I don't know how
    secure his connection is to his provider.
  • 9:22 - 9:26
    Because we've separated and decentralized
  • 9:26 - 9:28
  • 9:28 - 9:35
    that. There are challenges both in figuring out identity and
    discovery securely and mobile push. But we
  • 9:35 - 9:39
    have a number of successful examples of
    this. So beyond email, the Fediverse and
  • 9:39 - 9:44
    Mastodon, Riot chat and even SMS are
    examples of federated systems where
  • 9:44 - 9:51
    there's a bunch of providers and it's not
    a single central place.
  • 9:51 - 9:53
  • 9:53 - 9:58
    If you continue this sort of metaphor of splitting apart
    and decentralizing and reducing the trust
  • 9:58 - 10:02
    in a single party, you end up with a set
    of decentralized messaging systems as
  • 10:02 - 10:07
    well. And so it's worth mentioning that as
    we sort of get onto this fringe. There's
  • 10:07 - 10:11
    sort of two types: One is using Gossip
    protocols. So things like Secure
  • 10:11 - 10:16
    Scuttlebutt. And in those you connect to
    either the people around you or people
  • 10:16 - 10:20
    that you know. And when you get messages,
    you gossip, you send them on to all of
  • 10:20 - 10:27
    the people around you. And so messages
    spread through the network. That is still
  • 10:27 - 10:33
    an area where we are learning the tradeoff
    of how much metadata gets leaked and
  • 10:33 - 10:41
    things, but is nice in its level of
    decentralization. The other type basically
  • 10:41 - 10:48
    tries to make all of the users have some
    relatively low-trust participation in
  • 10:48 - 10:52
    the serving infrastructure. And so you can
    think of this as evolving out of things
  • 10:52 - 10:58
    like distributed hash tables that are
    used in BitTorrent. You see something very
  • 10:58 - 11:06
    similar in things like Ricochet or
    tox.chat, which will use either Tor-like
  • 11:06 - 11:11
    relays for sending messages or have an
    explicit DHT for routing where all of the
  • 11:11 - 11:15
    members provide some amount of lookup to
    help with discovery
  • 11:15 - 11:18
    and finding other participants.
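To make the DHT idea a little more concrete, here is a small Python sketch of how participants could share lookup duty. It assumes a Kademlia-style XOR distance (the scheme BitTorrent's DHT is based on); the function names and the node names are hypothetical, and this is not the routing code of any system named in the talk.

```python
import hashlib
from typing import List


def node_id(name: str) -> int:
    """Map a node name or lookup key onto a 256-bit identifier."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")


def closest_nodes(key: str, nodes: List[str], count: int = 2) -> List[str]:
    """Return the nodes whose identifiers are nearest to the key.

    "Nearest" means smallest XOR distance between the two identifiers,
    so every participant can compute the same answer independently.
    """
    target = node_id(key)
    return sorted(nodes, key=lambda n: node_id(n) ^ target)[:count]


participants = ["alice", "bob", "carol", "dave"]
# The participants chosen here would hold the record for "bob's contact info".
holders = closest_nodes("bob's contact info", participants)
```

Each participant stores only the records that hash close to its own identifier, which is how the lookup work gets spread across members instead of sitting on one server.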
  • 11:20 - 11:25
    So let's go into some of these mechanisms that we've
  • 11:25 - 11:31
    uncovered and we can start with
    encryption. So when you're sending
  • 11:31 - 11:37
    messages to a server by default, there's
    no encryption. This is things like IRC.
  • 11:37 - 11:43
    Email used to be primarily unencrypted and
    you can think of that like a postcard. So
  • 11:43 - 11:48
    you've got a letter or a postcard in this
    case that you're sending. It has where
  • 11:48 - 11:53
    that message is coming from, where it's
    going to and the contents. In contrast,
  • 11:53 - 11:58
    when you use transport encryption, which
    is now standard for most of the
  • 11:58 - 12:01
    centralized things, what that means is
    you're taking that postcard and you're
  • 12:01 - 12:06
    putting it in an envelope that the network
    can't open. And that's what TLS and other
  • 12:06 - 12:12
    forms of transport encryption are going to
    give you, is the network link just sees
  • 12:12 - 12:16
    the source and destination. It sees there's
    a message coming between Alice and
  • 12:16 - 12:20
    Facebook or whatever cloud provider, but
    can't look into that and see that that's
  • 12:20 - 12:24
    really a message for Bob or what's being
    said. It just sees individuals
  • 12:24 - 12:31
    communicating with that cloud provider.
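The postcard and envelope picture can be sketched in Python. The cipher below is a toy keystream built from HMAC-SHA256, standing in for whatever a real TLS session would negotiate; the names `seal`, `open_sealed`, and `chat-server` are assumptions for illustration, and none of this is safe for real use.

```python
import hashlib
import hmac
import json
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with an HMAC-SHA256 keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Put the postcard in an envelope: random nonce followed by ciphertext."""
    nonce = secrets.token_bytes(16)
    return nonce + keystream_xor(key, nonce, plaintext)


def open_sealed(key: bytes, sealed: bytes) -> bytes:
    """Open the envelope; XOR with the same keystream restores the plaintext."""
    return keystream_xor(key, sealed[:16], sealed[16:])


# The postcard: sender, recipient, and contents are all readable.
postcard = json.dumps(
    {"from": "alice", "to": "bob", "body": "see you at Congress"}
).encode()

# Stand-in for the session key Alice and the server share after a handshake.
session_key = secrets.token_bytes(32)

# What the network link observes: who talks to the server, plus opaque bytes.
on_the_wire = {"src": "alice", "dst": "chat-server",
               "payload": seal(session_key, postcard)}

# The server holds the session key, so it can still read everything.
server_view = json.loads(open_sealed(session_key, on_the_wire["payload"]))
```

The point of the sketch is the last line: transport encryption hides the postcard from the network, but the server opens the envelope and sees the recipient and the body.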
    And so there's SMTPS, there are secure
  • 12:31 - 12:36
    versions of IRC and email, and most other
    protocols are using transport security at
  • 12:36 - 12:42
    this point. The thing that we have now is
    called end-to-end encryption or E2E, and so
  • 12:42 - 12:49
    now the difference here is the message
    that Alice is sending is addressed to Bob.
  • 12:49 - 12:54
    And it's encrypted so that the provider
    Facebook can't open that either and can't
  • 12:54 - 13:00
    look at the contents. OK? So the network
    just sees a message going between Alice
  • 13:00 - 13:04
    and Facebook still, but Facebook can't
    open that and actually see the contents of
  • 13:04 - 13:12
    the message. And so end-to-end encryption
    has gained pretty widespread adoption. We
  • 13:12 - 13:16
    have this in Signal, for the most part in
    iMessage, we have tools like PGP and GPG
  • 13:16 - 13:21
    that are implementing forms of this. For
    messaging there's a few that are worth
  • 13:21 - 13:26
    sort of covering in the space: the Signal
    protocol, which was initially called
  • 13:26 - 13:34
    Axolotl, is adopted in WhatsApp and in
    Facebook private messaging, and
  • 13:34 - 13:43
    I guess it has generalized into
    something called the Noise framework and
  • 13:43 - 13:50
    is gaining a lot of adoption. OMEMO looks
    a lot like that specifically for XMPP, and
  • 13:50 - 13:56
    so it is a specific implementation. The
    other one is called Off-The-Record or OTR
  • 13:56 - 14:04
    and Off-The-Record developed
    independently from this and
  • 14:04 - 14:10
    thinks a lot about deniability. I'm not
    going to go too deep into the specific
  • 14:10 - 14:15
    nits of what these protocols are doing,
    but I guess the intuition is the hard
  • 14:15 - 14:20
    part here is not encrypting a message,
    but rather the hard part is how do you
  • 14:20 - 14:24
    send that first message and establish a
    session, especially if the other person is
  • 14:24 - 14:28
    offline. So I want to start a
    communication. I type in the first message
  • 14:28 - 14:32
    I'm sending to someone. I need to somehow
    get a key and then send a message that
  • 14:32 - 14:38
    only that person can read and also
    establish this sort of shared secret. And
  • 14:38 - 14:42
    doing all of that in one message or with
    the other device not online ends up being
  • 14:42 - 14:48
    tricky. Additionally, figuring out the
    mapping between a user and their devices,
  • 14:48 - 14:53
    especially as that changes and making sure
    you've appropriately revoked devices,
  • 14:53 - 14:59
    added new devices without keys falling
    over or sending too many warnings to the
  • 14:59 - 15:05
    user
    ends up being a lot of the trick in these
  • 15:05 - 15:15
    systems. There's two problems that sort of
    come into play when we start using end-to-end encryption.
  • 15:15 - 15:20
    One is we need to think about connection
    establishment. So this is the problem
  • 15:20 - 15:27
    of saying who is Bob? So I find a
    contact and I know them in some way by an
  • 15:27 - 15:34
    email address, by a phone number. Signal
    uses phone numbers. You know, a lot of
  • 15:34 - 15:38
    systems maybe use an email address.
    There's things like Threema that use a
  • 15:38 - 15:42
    unique identifier that they generate for
    you. But somehow I have to go from that
  • 15:42 - 15:48
    identifier to some actual key or some
    knowledge of a cryptographic secret
  • 15:48 - 15:51
    that identifies the other person. And I
    have to figure out who I trust to do that
  • 15:51 - 15:59
    mapping and give me this thing that I'm
    now using for encryption. And then also
  • 15:59 - 16:04
    there's this "Well, how do we match?" So a
    lot of systems do this by uploading your
  • 16:04 - 16:10
    address book or trying to match with
    existing contacts to solve the user-
  • 16:10 - 16:16
    interface problem of discovery, which is:
    If they can already know the identifiers
  • 16:16 - 16:20
    and have this mapping, then when someone
    new comes in they can suggest and have
  • 16:20 - 16:25
    "prefound" these keys and you just sort of
    trust the server to hold this address book
  • 16:25 - 16:28
    and to do this mapping between what
    they're using as their identifier and
  • 16:29 - 16:34
    the keys themselves that you're getting
    out. Signal is nice here, it says it's not
  • 16:34 - 16:39
    uploading your contacts, which is true.
    They're uploading hashes of the phone
  • 16:39 - 16:43
    numbers rather than the actual phone
    numbers. But it's a similar thing.
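A simplified model of hashed contact discovery, written in Python, shows why hashing is "a similar thing" to uploading the numbers. This is not Signal's actual protocol; the directory contents, the function names, and the phone numbers are made up for illustration.

```python
import hashlib
from typing import Dict, List, Optional


def hash_number(phone: str) -> str:
    """Hash a phone number the way a client might before uploading it."""
    return hashlib.sha256(phone.encode()).hexdigest()


# Server side: a directory from hashed phone number to a public key.
directory: Dict[str, str] = {
    hash_number("+15550100"): "alice-public-key",
    hash_number("+15550123"): "bob-public-key",
}


def discover(address_book: List[str], known: Dict[str, str]) -> Dict[str, str]:
    """Client side: look up each contact by hash and collect the keys."""
    found = {}
    for number in address_book:
        hashed = hash_number(number)
        if hashed in known:
            found[number] = known[hashed]
    return found


def brute_force(target_hash: str, prefix: str, digits: int) -> Optional[str]:
    """Recover a number from its hash by trying every possible suffix."""
    for n in range(10 ** digits):
        candidate = f"{prefix}{n:0{digits}d}"
        if hash_number(candidate) == target_hash:
            return candidate
    return None
```

Because the space of valid phone numbers is small, whoever holds the hashes can enumerate it, as `brute_force` does here for a four-digit suffix. The client also still has to trust that the key returned for a hash really belongs to that contact.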
  • 16:43 - 16:49
    They've got a directory of known phone
    numbers. And then as people search, you'll
  • 16:49 - 16:55
    search for a hash of the phone number and
    get back, you know, the key that you hope
  • 16:55 - 17:01
    Signal has correctly given you. So there's
    sort of a couple of ways that you reduce
  • 17:02 - 17:10
    your trust here. Signal has been going
    down a path using SGX to raise the cost of
  • 17:10 - 17:16
    attacks, oblivious RAM and a bunch of sort
    of systems mechanisms to
  • 17:16 - 17:22
    increase the cost of attack
    against their discovery mechanism. The
  • 17:22 - 17:27
    other way that you do this is you allow
    for people to use pseudonyms or anonymous
  • 17:27 - 17:32
    identifiers. So with Wire you can just register
    with an anonymous email address. And now the
  • 17:32 - 17:38
    cost to you is potentially less if that
    gets compromised. And it's worth noting
  • 17:38 - 17:43
    Moxie will be talking tomorrow at 4:00
    p.m. about the evolution of the space
  • 17:43 - 17:50
    around Signal, so there's probably a bunch
    more depth there that you can expect. So
  • 17:50 - 17:55
    what if we don't want to trust the server
    to do matchmaking? One of the early things
  • 17:55 - 18:00
    that has been around is the web of trust
    around GPG. And this is the notion that,
  • 18:00 - 18:09
    if I have in real life or otherwise
    associated an identifier with a key, I can
  • 18:09 - 18:16
    publicly provide a signed statement saying
    that I trust that mapping and then people
  • 18:16 - 18:22
    who don't know someone but have a link
    socially maybe can find these proofs and
  • 18:22 - 18:27
    use that to trust this mapping. So I know
    an identifier and I know that I trust
  • 18:27 - 18:32
    someone who has said, well, this is the
    key associated with that identifier and I
  • 18:32 - 18:37
    can use that network to eventually find an
    identifier that I'm willing to trust
  • 18:37 - 18:44
    or a key that I'm willing to encrypt to.
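Finding a chain of attestations from yourself to a stranger is essentially a graph search. The Python sketch below assumes the signatures on each attestation have already been verified and models them as a plain dictionary; the function name `trust_path` and the example identities are hypothetical, and real GPG tooling applies more nuanced trust rules than "any path counts".

```python
from collections import deque
from typing import Dict, List, Optional, Set


def trust_path(attestations: Dict[str, Set[str]],
               me: str, target: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest chain of vouching.

    attestations maps a signer to the identities whose keys that signer
    has publicly vouched for (signature checks are assumed done already).
    """
    queue = deque([[me]])
    seen = {me}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for vouched in sorted(attestations.get(path[-1], ())):
            if vouched not in seen:
                seen.add(vouched)
                queue.append(path + [vouched])
    return None  # no chain of people I trust reaches the target


signed = {
    "me": {"carol"},
    "carol": {"dave"},
    "dave": {"bob"},
    "mallory": {"bob"},  # an attestation from someone I have no link to
}
path_to_bob = trust_path(signed, "me", "bob")
```

Mallory's attestation is ignored because nothing connects "me" to Mallory, which is the property that makes unsolicited attestations from strangers worthless for trust, even though they still cost bandwidth to download.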
    There's some user interface tradeoff here.
  • 18:44 - 18:50
    This is a manual process in general. And
    this year we've had a set of denial-of-
  • 18:50 - 18:56
    service attacks on the web-of-trust
    infrastructure. And so the specific
  • 18:56 - 19:04
    attack is that anyone can upload these
    attestations of trust, and so if a bunch
  • 19:04 - 19:08
    of random users or Sybils start uploading
    attestations, when you go to try and download
  • 19:08 - 19:12
    this, you end up overwhelmed by the amount
    of information. And so the system does not
  • 19:12 - 19:17
    scale because it's very hard to filter to
    people you care about without telling the
  • 19:17 - 19:20
    system who you care about and revealing
    your network, which you're trying to
  • 19:20 - 19:29
    avoid. Keybase takes another approach.
    They made the observation that when I go
  • 19:29 - 19:35
    to try and talk to someone, what I
    actually care about is the person that I
  • 19:35 - 19:41
    believe owns a specific GitHub or Twitter
    or other social profile. And so I can
  • 19:41 - 19:45
    provide an attestation where I say: "Well,
    this is a key that's associated with the
  • 19:45 - 19:51
    account that controls this Twitter account
    or this Reddit account or this, you know,
  • 19:51 - 19:56
    Facebook account." And so by having that
    set of proofs, I can connect an
  • 19:56 - 20:00
    individual and a cryptographic identity
    with the person behind who has the
  • 20:00 - 20:08
    passwords to a set of other systems.
    Keybase also this year began to provide a
  • 20:08 - 20:13
    monetary incentive for users and then
    struggled with the number of sign ups. And
  • 20:13 - 20:17
    so there's a lot of work in figuring out:
    "OK, do these identities actually
  • 20:17 - 20:22
    correspond to real people and how do you
    prevent a similar denial-of-service--style
  • 20:22 - 20:31
    attack that the web of trust faced in
    identifying things here?" On our devices,
  • 20:31 - 20:38
    we end up in general resorting to a
    concept called TOFU or Trust-On-First-Use,
  • 20:38 - 20:43
    and what that means is when I first see a
    key that identifies someone, I'll save
  • 20:43 - 20:48
    that. And if I ever get another need to
    communicate with that person again, I've
  • 20:48 - 20:51
    already got a key and I can keep using
    that same key and expect that key to stay
  • 20:51 - 20:56
    the same. And so that continuation
    and the ability to pin keys once you've
  • 20:56 - 21:01
    seen them means that if when you first
    establish a connection with someone, it's
  • 21:01 - 21:05
    the real person, then someone who
    compromises them later can't take over or
  • 21:05 - 21:14
    change that. Finally, one of the sort of
    exciting things that came out - this is
  • 21:14 - 21:21
    circa 2015 and is largely defunct now -
    was a system by Adam Langley called Pond
  • 21:21 - 21:28
    that looked at hardening a modern version
    of email. And one of the things that Pond
  • 21:28 - 21:33
    did was it had something called a password
    authenticated key exchange. And so this is
  • 21:33 - 21:40
    an evolving cryptographic area where
    you're saying if two people can start with
  • 21:40 - 21:48
    some weak shared secret - So I can perhaps
    publicly or in plain text
  • 21:48 - 21:53
    challenge the other person: "Where were
    we on a specific day?" And so now we both
  • 21:53 - 21:57
    know something that maybe has a few bits
    of entropy, at least. If we can write the
  • 21:58 - 22:05
    same textual answer, we can take that, run
    a key derivation function to end up with a
  • 22:05 - 22:10
    larger amount of shared entropy and use
    that as a bootstrapping method to do a key
  • 22:10 - 22:13
    exchange and end up finding a strong
    cryptographic identity for the other
  • 22:13 - 22:22
    person. So Pond has a system that they
    call Panda for linking to individuals
  • 22:22 - 22:26
    based on a challenge response and this is
    also something that you'll find in off-
  • 22:26 - 22:32
    the-record systems around Jabber. The
    other thing that we need to be careful
  • 22:32 - 22:38
    about in end-to-end--encrypted systems is
    deniability. When I'm chatting one on one
  • 22:38 - 22:46
    with someone in person, that conversation is
    essentially fairly deniable. Either
  • 22:46 - 22:50
    person can have their recollection of what
    happened and there's no proof that the
  • 22:50 - 22:55
    other person said something unless you've
    recorded it or otherwise, you know,
  • 22:55 - 22:58
    brought some other technology into play.
    But with an encrypted thing where I've
  • 22:58 - 23:03
    authenticated the other person, I end up
    with a transcript - potentially - that,
  • 23:03 - 23:09
    you know, I can turn over later and say,
    look, this person said this. And, you
  • 23:09 - 23:13
    know, we've seen recently that things like
    emails that come out are authenticated in
  • 23:13 - 23:21
    this way. The DKIM system that
    authenticates email senders showed up in
  • 23:21 - 23:27
    the WikiLeaks releases of Hillary
    Clinton's emails and was able to say:
  • 23:27 - 23:30
    "Look the text in these hasn't been
    changed." And it was signed by the real
  • 23:30 - 23:36
    server that we would expect. So the thing
    that we get from Off-The-Record and the
  • 23:36 - 23:42
    Signal protocol is something called
    deniability or repudiability. And this
  • 23:42 - 23:48
    plays into a concept of forward secrecy,
    which is: We're going to sort of throw
  • 23:48 - 23:55
    away stuff afterwards in a way that our
    chat goes back to being more ephemeral.
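The "throw away stuff afterwards" idea can be sketched in Python with a one-way hash chain. This is a deliberate simplification: the class name `HashRatchet` and the labels `b"message"` and `b"chain"` are assumptions, and the protocols named in the talk additionally mix fresh key-exchange material into each reply, which is not shown here.

```python
import hashlib
import hmac


class HashRatchet:
    """Derive one key per message and forget the state that produced it."""

    def __init__(self, shared_secret: bytes) -> None:
        self._chain_key = shared_secret

    def next_message_key(self) -> bytes:
        # Key used to protect exactly one message.
        message_key = hmac.new(self._chain_key, b"message",
                               hashlib.sha256).digest()
        # Step the chain forward. The old chain key is overwritten, and
        # HMAC is one-way, so earlier keys cannot be recomputed from here.
        self._chain_key = hmac.new(self._chain_key, b"chain",
                                   hashlib.sha256).digest()
        return message_key


# Both sides start from the same secret and stay in step.
alice = HashRatchet(b"secret from the initial key exchange")
bob = HashRatchet(b"secret from the initial key exchange")
first_from_alice = alice.next_message_key()
first_at_bob = bob.next_message_key()
```

If a device is compromised after several messages, the attacker finds only the current chain key, so messages protected by earlier keys stay unreadable as long as those keys were deleted after use.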
  • 23:55 - 23:58
    And so we can think about this in two
    ways. There's actually two properties that
  • 23:58 - 24:03
    interlink in this: We have keys that we're
    using to form our shared session that
  • 24:03 - 24:11
    we're expecting to use to have our secret
    message. And each time I send a message,
  • 24:11 - 24:16
    I'm going to also provide some new key
    material and begin changing that secret
  • 24:16 - 24:22
    key that we're using. So I provide a next
    key. And when Bob replies, he's going to
  • 24:22 - 24:27
    now use my next key as part of that and
    give me his next key. And the other thing
  • 24:27 - 24:31
    that I can then do is when I send a
    message, I can provide the secret bit of
  • 24:31 - 24:35
    my previous key. So I can say: "My last
    private key that I used to send you that
  • 24:35 - 24:41
    previous message was this." And now at the
    end of our conversation, we both know all
  • 24:41 - 24:46
    of the private keys such that we both
    could have created that whole conversation
  • 24:46 - 24:54
    on our own computer. At any given time,
    it's only the most recent message that
  • 24:54 - 24:57
    could only have been sent by the
    other person and the rest of the
  • 24:57 - 25:04
    transcript that you have is something you
    could have generated yourself. There is a
  • 25:04 - 25:08
    talk on day three about Off-The-Record v4,
    the fourth version of that, that will go
  • 25:08 - 25:14
    deeper into that, that's at 9:00 p.m. in
    the about:freedom assembly. So I encourage
  • 25:14 - 25:20
    you to go to that if you're interested in
    this. OK. The next one to talk about is
  • 25:20 - 25:28
    expiry. This is sort of a follow on to
    this concept of forward secrecy. But
  • 25:28 - 25:32
    there's sort of two attacks here to
    consider. One is something that we should
  • 25:32 - 25:37
    maybe, I guess, give credit to Snapchat
    for popularizing, which is this concept of
  • 25:37 - 25:42
    "the message goes away after some amount
    of time". And really, this is protecting
  • 25:42 - 25:46
    against not fully trusting the other
    person not to share it later or
  • 25:46 - 25:51
    share it in a way you didn't
    intend. And this is also like a screenshot
  • 25:51 - 25:57
    adversary. So a bunch of apps will alert
    the other participant if you take a
  • 25:57 - 26:02
    screenshot. This is why some apps will
    blank the screen when they go to the task
  • 26:02 - 26:08
    switcher. So if you're swapping between
    apps, you'll see that some of your
  • 26:08 - 26:12
    applications will just show a blank screen
    or will not show contents. And that's
  • 26:12 - 26:16
    because the mobile operating systems' APIs
    don't tell them when you're in that mode
  • 26:16 - 26:19
    when you take a screenshot and so they
    want to just be able to notify you if the
  • 26:19 - 26:24
    other person does. It's worth noting that
    this is all just raising the cost of these
  • 26:24 - 26:28
    attacks and providing sort of a social
    incentive not to, right. I can still use
  • 26:28 - 26:32
    another camera to take a picture of my
    phone and get evidence of something that
  • 26:32 - 26:39
    has been said. But it's discouraging it
    and setting social norms. The other reason
  • 26:39 - 26:44
    for expiry is: After the fact, a
    compromise of a device, so whether that's
  • 26:44 - 26:49
    - you know, someone gets hold of the
    device and tries to do forensic analysis
  • 26:49 - 26:55
    to pull off previous messages or the chat
    database or whether someone tries to
  • 26:55 - 27:00
    install an application that then scans
    through your phone... So Fengcai is
  • 27:00 - 27:06
    an application that's been installed as a
    surveillance app in China. And this also
  • 27:06 - 27:10
    boils down to a user interface and user
    experience question, which is how long
  • 27:10 - 27:13
    are you going to save logs, how much
    history are you going to save and what
  • 27:13 - 27:19
    norms are you going to have? And there's
    a tradeoff here. It's useful
  • 27:19 - 27:25
    sometimes to scroll back. And especially
    for companies that believe that they have
  • 27:25 - 27:32
    value added services around being able to
    do data analytics on your chat history.
  • 27:32 - 27:40
    They're wary of getting rid of that. The
    next thing that we have is isolation and
  • 27:40 - 27:48
    OS sandboxing. Right. So a lot of
    this is up one layer, which is what is the
  • 27:48 - 27:53
    operating system doing to secure your
    application, your chat system from the
  • 27:53 - 27:59
    other things, the malware or the
    compromises of the broader device that
  • 27:59 - 28:07
    it's running on. We have a bunch of
    projects around us at Congress that are
  • 28:07 - 28:11
    innovating on this. There are chat systems
    that also attempt to do this sort of on
  • 28:11 - 28:16
    their own. One sort of extreme example is
    called Tinfoil Chat, which makes use of
  • 28:16 - 28:21
    three devices and a physical diode which
    is designed to have one device that is
  • 28:21 - 28:26
    sending messages and another device that
    is receiving messages. And the thought is:
  • 28:26 - 28:31
    if you receive a message that somehow
    compromises the device, the malware or the
  • 28:31 - 28:37
    malicious file can never get any
    communication back out and so becomes much
  • 28:37 - 28:42
    less valuable to have compromised. And
    they implement this with like a physical
  • 28:42 - 28:54
    hardware diode. The other side of this is
    recovery and backups. Which is you've got
  • 28:54 - 29:01
    a user experience tradeoff between a lot
    of people losing their devices and wanting
  • 29:01 - 29:05
    to get back their contact list or their
    chat history and the fact that now you're
  • 29:05 - 29:08
    keeping this extra copy and have this
    additional place for things to get
  • 29:08 - 29:15
    compromised. Apple has done a lot of work
    here that we don't look at so much. They
  • 29:15 - 29:20
    gave a Black Hat talk a few years ago where
    they discuss how they use custom hardware
  • 29:20 - 29:25
    security modules in their data centers,
    much like the T2 chip in end devices,
  • 29:25 - 29:31
    that will hold the backup keys that get
    used for their iCloud backups and do
  • 29:31 - 29:37
    similar amounts of rate limiting. And they
    consider a set of - a pretty wide set of
  • 29:37 - 29:41
    adversaries - more than we might expect.
    So including things like what happens when
  • 29:41 - 29:47
    the government comes and asks us to write
    new software to compromise this? And so
  • 29:47 - 29:52
    they set up their HSMs such that they
    cannot provide software updates to them,
  • 29:52 - 29:57
    which is, you know, a sort of a step of
    how do you do this cloud security side
  • 29:57 - 30:04
    that we don't think about as much. So
    there's a set of slides that you can find
  • 30:04 - 30:09
    from this. And these slides will be
    online, too, as a pointer to look at
  • 30:09 - 30:14
    their solution, which considers a large
    number of adversaries that you might not
  • 30:14 - 30:28
    have thought about. So traffic obfuscation
    is primarily about a network-side adversary. The
  • 30:28 - 30:32
    technique that is getting used, sort of
    what people are using if they feel they
  • 30:32 - 30:37
    need to do this, is something called
    domain fronting. Domain fronting had
  • 30:37 - 30:43
    its heyday maybe in 2014-ish and has
    become somewhat less effective, but it's
  • 30:43 - 30:50
    still effective enough for most of the
    chat things. The basic idea behind domain
  • 30:50 - 30:55
    fronting is that there's a separation of
    layers between that envelope and the
  • 30:55 - 31:02
    message inside of it that we get in HTTPS
    on the Web. So when I create a secure
  • 31:02 - 31:09
    connection to a CDN to a content provider
    like Amazon or Google or Microsoft, I can
  • 31:09 - 31:14
    make that connection, perform the
    security layer and provide a fairly
  • 31:14 - 31:19
    generic service that I'm connecting to. I
    just want to establish a secure connection
  • 31:19 - 31:24
    to Cloudflare. And then once I've done
    that, the message that I can send inside
  • 31:24 - 31:27
    can be a chat message to a specific
    customer of that CDN or that cloud
  • 31:27 - 31:35
    provider. And so this is an effective way
    to prevent the network from knowing what
  • 31:35 - 31:42
    specific service you're accessing. It got
    used for a bunch of circumvention things.
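The layering described here can be sketched in a few lines of Python. This is only an illustration of which name travels in which layer, not the code any particular app uses; the hostnames `cdn.example.com` and `chat.example.net` and the function names are hypothetical, and no network connection is made.

```python
def build_fronted_request(front_domain: str, hidden_host: str, path: str = "/") -> dict:
    """Return the two layers of a domain-fronted HTTPS request.

    'sni' is the name sent in the clear during the TLS handshake.
    'http_request' is the message sent inside the encrypted channel.
    """
    http_request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hidden_host}\r\n"   # the real destination, visible only to the CDN
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return {"sni": front_domain, "http_request": http_request}


def visible_to_network(request: dict) -> str:
    """What an on-path observer can read: only the outer envelope."""
    return request["sni"]


# Hypothetical names: a generic CDN front and a chat service hosted behind it.
req = build_fronted_request("cdn.example.com", "chat.example.net", "/send")
print("network sees:", visible_to_network(req))
print("CDN sees:", req["http_request"].decode("ascii").splitlines()[1])
```

The observer only learns the generic front name, while the CDN routes on the inner `Host` header. Whether a given provider still honors a mismatched `Host` header is exactly what changed when providers stopped allowing this.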
  • 31:42 - 31:46
    It then got used for a bunch of malware
    things and this caused a bunch of the
  • 31:46 - 31:52
    cloud providers to stop allowing you to do
    this. But it's still getting used. This is
  • 31:52 - 31:56
    still what's sort of happening when you turn
    on certain censorship circumvention in
  • 31:56 - 32:02
    Signal; it's what Telegram is using for
    the most part. And the same basic
  • 32:02 - 32:08
    technique is getting another revival with
    DNS over HTTPS and encrypted SNI
  • 32:08 - 32:15
    extensions to TLS which allow for a
    standardized approach to establish a
  • 32:15 - 32:20
    connection to a service without providing
    any specific identifiers to the network
  • 32:20 - 32:26
    for which service you want to connect to.
    It's worth sort of mentioning that
  • 32:26 - 32:34
    probably the most active chat service for
    this sort of obfuscation or circumvention
  • 32:34 - 32:39
    is Telegram, which has a bunch of users in
    countries that are not fans of having lots
  • 32:39 - 32:45
    of users of Telegram. And so they have
    both systems where they can bounce between
  • 32:45 - 32:49
    IPs very quickly and change where their
    servers appear to be. And they've also
  • 32:49 - 32:55
    used techniques like sending messages over
    DNS tunnels to mitigate some of these
  • 32:55 - 33:02
    censorship things. From the provider's
    perspective, this is really about accessing
  • 33:02 - 33:06
    their user population. They're not really
    thinking about your local network or
  • 33:06 - 33:09
    caring about that as much as
    they are like, oh, there's millions of
  • 33:09 - 33:17
    users that should probably still have
    access to us. So we can maybe hide the
  • 33:17 - 33:22
    characteristics of traffic in terms of
    what specific service we're connecting to.
  • 33:22 - 33:26
    There's some other things about traffic,
    though, that also are revealing to the
  • 33:26 - 33:29
    network. And this is sort of this
    additional metadata that we need to think
  • 33:29 - 33:36
    about. So one of these is padding or the
    size of messages can be revealing. So one
  • 33:36 - 33:39
    sort of immediate thing is the size of a
    chat or a text message is going to be very
  • 33:39 - 33:46
    different from the size of an image or
    voice or movies. And you see this on
  • 33:46 - 33:49
    airplanes or in other bandwidth limited
    settings: they might allow text messages
  • 33:49 - 33:56
    to go through, but images won't. There's
    been research that shows, for instance, on
  • 33:56 - 34:03
    voice, even if I encrypt my voice, we've
    actually gotten really good at compressing
  • 34:03 - 34:08
    audio of human speech. So much so that
    different phonemes, different sounds that
  • 34:08 - 34:14
    we make take up different sizes. And so I
    can say something, compress it, encrypt it
  • 34:14 - 34:20
    and then recover what was said based on
    the relative sizes of different sounds. So
  • 34:20 - 34:25
    there was a paper in 2011 at
    Oakland S&P that demonstrated this
  • 34:25 - 34:33
    potential for attacks. And so what this is
    telling us perhaps is that there's a
  • 34:33 - 34:40
    tradeoff between how efficiently I want to
    send things and how much metadata or
  • 34:40 - 34:45
    revealing information for distinguishing
    them I'm giving up. So I can use a less
  • 34:45 - 34:50
    efficient compression that's constant bit
    rate or that otherwise is not revealing
  • 34:50 - 34:52
    this information, but it has higher
    overhead and won't work as well in
  • 34:52 - 34:59
    constrained network environments. The
    other place this shows up is just when
  • 34:59 - 35:05
    people are active. So if I can look at
    when someone is tweeting or when messages
  • 35:05 - 35:11
    are sent, I can probably figure out pretty
    quickly what timezone they're in. Right.
  • 35:11 - 35:17
    And so this leads to a whole set of these
    metadata based attacks. And in particular,
  • 35:17 - 35:22
    there's confirmation attacks and
    intersection attacks. And so intersection
  • 35:22 - 35:27
    attacks is looking at the relative
    activity of multiple people and trying to
  • 35:27 - 35:33
    figure out: OK, when Alice sent a message,
    who else was online or active at the same
  • 35:33 - 35:37
    time? And over time, can I narrow down or
    filter to specific people that were likely
  • 35:37 - 35:45
    who Alice was talking to? Pond also is a
    service to look at or a system to look at
  • 35:45 - 35:52
    in this regard. Their approach was that a
    client would hopefully always be online
  • 35:52 - 35:58
    and would at a regular pattern check in
    with the server with the same amount of
  • 35:58 - 36:02
    data, regardless of whether there was a
    real message to send or not. So that from
  • 36:02 - 36:07
    the network's perspective, every user
    looked the same. The downside being that
  • 36:07 - 36:13
    you've now got this message being sent by
    every client every minute or so and that
  • 36:13 - 36:19
    creates a huge amount of overhead of, you
    know, just padded data that doesn't have
  • 36:19 - 36:28
    any meaning. So finally, I'll take a look
    at server hardening and the things that
  • 36:28 - 36:33
    we're doing to reduce trust in the server.
    There's a few examples of why we would
  • 36:33 - 36:38
    want to do this. So one is that you've had
    messaging servers, plenty of times, that
  • 36:38 - 36:47
    have not been as secure as they claim. One
    example being that there was a period
  • 36:47 - 36:53
    where the Skype subsidiary in China was
    using a blacklist of keywords on the
  • 36:53 - 36:58
    server to either prevent or intercept some
    subset of their users' messages without
  • 36:58 - 37:04
    telling anyone that they were doing that.
    And then also just sort of this uncertain
  • 37:04 - 37:08
    future of, OK, I trust the data now, but
    what can we do so that I don't worry about
  • 37:08 - 37:15
    what the corporate future of this service
    entails for my data. One of the sort of
  • 37:15 - 37:21
    elephants in the room is: the software
    development is probably pretty
  • 37:21 - 37:25
    centralized. So even if I don't trust the
    server, there's some pretty small number
  • 37:25 - 37:29
    of developers who are writing the code.
    And how do I trust that the updates that
  • 37:29 - 37:33
    they are making, either to the server
    or to my client when they push to my client,
  • 37:33 - 37:39
    aren't reducing my security? Open source is
    a great start to mitigating that, but it's
  • 37:39 - 37:46
    certainly not solving all of this. So one
    thing, one way we can think about how we
  • 37:46 - 37:50
    reduce trust in the server is by looking
    at what the server knows after end to end
  • 37:50 - 37:54
    encryption. It knows things about the
    size. It knows where the message is coming
  • 37:54 - 37:58
    from. It knows where the message is going
    to. Size: we've talked about some of these
  • 37:58 - 38:03
    padding things that we can use to
    mitigate. So how do we reduce the amount
  • 38:03 - 38:07
    of information about sources and
    destinations in this network graph that
  • 38:07 - 38:13
    the server knows? So this is a concept
    called linkability, which is being able to
  • 38:13 - 38:22
    link the source and destination of a
    message. We start to see some mitigations
  • 38:22 - 38:28
    or approaches to reducing linkability
    entering mainstream systems. So Signal has
  • 38:28 - 38:32
    a system called "Sealed Sender" that you
    can enable, where the source of the
  • 38:32 - 38:37
    message goes within the encrypted
    envelope. So that Signal doesn't see that.
  • 38:37 - 38:42
    The downside being that Signal is still
    seeing your IP address but the thought is
  • 38:42 - 38:47
    that they will throw those out relatively
    quickly and so they will have fewer logs
  • 38:47 - 38:53
    about this source to destination.
    Theoretically, though, there is a bunch of
  • 38:53 - 38:59
    work in this. The first thing I'll point
    to is a set of systems that we classify as
  • 38:59 - 39:08
    mixnets. A mixnet works by having a set of
    providers rather than a single entity
  • 39:08 - 39:13
    that's running the servers. A bunch of
    users will send messages to the first
  • 39:13 - 39:17
    provider, which will shuffle all of them
    and send them to the next provider, which
  • 39:17 - 39:21
    will shuffle them again and send them to a
    final provider that will shuffle them and
  • 39:21 - 39:26
    then be able to send them to destinations.
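A toy version of this shuffle cascade, as a rough sketch only: real mixnets also wrap each message in layered encryption so a provider cannot match inputs to outputs by content, and that part is left out here. The function names and the three-hop default are assumptions for illustration.

```python
import random


def mix_hop(batch, rng):
    """One provider: output the same messages in an unrelated order."""
    shuffled = list(batch)
    rng.shuffle(shuffled)
    return shuffled


def mix_cascade(batch, hops=3, rng=None):
    """Pass a batch through several providers, each shuffling independently."""
    rng = rng or random.SystemRandom()  # OS-backed randomness for the shuffle
    for _ in range(hops):
        batch = mix_hop(batch, rng)
    return batch


# Opaque payloads standing in for encrypted messages from different users.
incoming = [f"ciphertext-{i}" for i in range(8)]
outgoing = mix_cascade(incoming)
print(outgoing)
```

Every message still comes out, but its position in the output says nothing about its position in the input unless all providers share their permutations.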
    And this de-links them: none of the
  • 39:26 - 39:32
    individual providers know both the source
    and destination of these messages. So this
  • 39:32 - 39:40
    looks maybe a bit like Tor's onion routing,
    but differs in sort of a couple of
  • 39:40 - 39:45
    technicalities. One is typically, you will
    wait for some number of messages rather
  • 39:45 - 39:50
    than just going through with bandwidth and
    low latency. And so by doing that, you can
  • 39:50 - 39:54
    get a theoretical guarantee that this
    batch had at least n messages that got
  • 39:54 - 39:58
    shuffled and therefore you can prevent
    there being some time where only one user
  • 39:58 - 40:05
    was using the system. And so you get a
    stronger theoretic guarantee. There's an
  • 40:05 - 40:10
    active project making a messaging system
    using mixnets called Katzenpost. They gave
  • 40:10 - 40:14
    a talk at Camp this summer and I'd
    encourage you to look at their website or
  • 40:14 - 40:23
    go back to that talk to learn more about
    mixnets. The project that I was, I guess,
  • 40:23 - 40:26
    tangentially helping with is in a space
    called private information retrieval,
  • 40:26 - 40:33
    which is another technique for doing this
    delinking. Private information retrieval
  • 40:33 - 40:38
    frames the question a little bit
    differently. The setting is this: I
  • 40:38 - 40:42
    have a server that has a database of
    messages and I want a client to be able to
  • 40:42 - 40:46
    retrieve one of those messages without the
    server knowing which message the client
  • 40:46 - 40:55
    got or asked for. So this sounds maybe
    hard. I can give you a straw man to
  • 40:55 - 40:59
    convince yourself that this is doable and
    the straw man is: I can ask the server for
  • 40:59 - 41:04
    its entire database and then take the
    message that I want and the server hasn't
  • 41:04 - 41:08
    learned anything about which message I
    cared about. But I spent a lot of network
  • 41:08 - 41:14
    bandwidth probably doing that. So there's
    a couple of constructions for this. I'm
  • 41:14 - 41:20
    going to focus on the information
    theoretic private information retrieval.
  • 41:20 - 41:25
    And so we're going to use a similar setup
    to what we had in our threat model for a
  • 41:25 - 41:30
    mixnet, which is we've got a set of
    providers now that have the same database.
  • 41:30 - 41:35
    And I'm going to assume that they're not
    all talking to each other or colluding. So
  • 41:35 - 41:40
    I just need at least one of them to be
    honest. And one of the things that we'll
  • 41:40 - 41:45
    use here is something called the exclusive-or
    operation. To refresh your memory here,
  • 41:45 - 41:51
    exclusive-or is a binary bitwise
    operation. And the nice property that we
  • 41:51 - 41:56
    get is if I xor something with itself, it
    cancels out. So if I have some piece of
  • 41:56 - 42:03
    data and I xor it against itself, it just
    goes away. So if I have my systems that
  • 42:03 - 42:11
    have the database, I can ask each one to
    give me a superposition of some random
  • 42:11 - 42:17
    subset of its database so I can ask the
    first server, give me items 4, 11, 14 and
  • 42:17 - 42:24
    20 XORed together. I'm assuming all of the
    items are the same size so that you can do
  • 42:24 - 42:31
    these xors. And then if I structure that,
    it can appear to each server independently,
  • 42:31 - 42:35
    in the request that it sees, that I
    just asked for some random subset. But I can
  • 42:35 - 42:39
    do that so that when I xor the things I
    get back, everything just cancels out
  • 42:39 - 42:44
    except the item that I care about. Unless
    you saw all of the requests that I made,
  • 42:44 - 42:49
    you wouldn't be able to tell which item I
    cared about. So by doing this, I've
  • 42:49 - 42:54
    reduced the network bandwidth. I'm only
    getting one item's worth of data back from every
  • 42:54 - 43:00
    server. Now, you might have a
    concern that I'm asking the server to do a
  • 43:00 - 43:04
    whole lot of work here. It has to look
    through its entire database and compute
  • 43:04 - 43:10
    this superposition thing. And that seems
    potentially like a lot of work, right. The
  • 43:10 - 43:15
    thing that I think is exciting about this
    space is it turns out this sort of
  • 43:15 - 43:19
    operation of going out to a large database
    and like searching for all of the things
  • 43:19 - 43:24
    and then coming back with a small amount
    of data looks a lot like the hardware that
  • 43:24 - 43:30
    we're building for A.I. and for a bunch of
    these sorts of search like things. And so
  • 43:30 - 43:34
    this runs really quite well on a GPU where
    I can have all of those thousands of cores
  • 43:34 - 43:39
    compute little small parts of the XOR and
    then pull back this relatively small
  • 43:39 - 43:43
    amount of information. And so with GPUs,
    you can actually have databases of
  • 43:43 - 43:51
    gigabytes, tens of gigabytes of data and
    compute these XORs across all of it in
  • 43:51 - 43:59
    order of a millisecond or less. So a
    couple of things in this space. "Talek" is
  • 43:59 - 44:04
    the system that I helped with that
    demonstrates this working. The converse
  • 44:04 - 44:09
    problem is called private information
    storage. And that one is how do I write an
  • 44:09 - 44:14
    item into a database without the database
    knowing which item I wrote? The
  • 44:14 - 44:20
    mathematical construction there is not
    quite as simple to explain. But there's a
  • 44:20 - 44:26
    pretty cool new work in the last month or
    two out of Dan Boneh and Henry Corrigan-Gibbs
  • 44:26 - 44:35
    at Stanford called Express, with Saba Eskandarian
    as first author, that is showing how to
  • 44:35 - 44:44
    fairly practically perform that operation.
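To make the XOR retrieval trick from a few minutes earlier concrete, here is a minimal two-server sketch. It assumes both servers hold identical databases of equal-sized records and do not collude; the function names and the toy 24-record database are made up for the example, and this is not Talek's actual code.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def server_answer(database, selection):
    """Server side: XOR together every record whose selection bit is 1."""
    result = bytes(len(database[0]))  # all-zero record of the right size
    for record, selected in zip(database, selection):
        if selected:
            result = xor_bytes(result, record)
    return result


def make_queries(db_size: int, wanted_index: int):
    """Client side: two selection vectors that differ only at wanted_index."""
    query_a = [secrets.randbelow(2) for _ in range(db_size)]  # uniformly random subset
    query_b = list(query_a)
    query_b[wanted_index] ^= 1  # flip one bit; on its own this vector is also uniform
    return query_a, query_b


def retrieve(database_a, database_b, wanted_index: int) -> bytes:
    """Everything selected by both servers cancels; only the wanted record remains."""
    query_a, query_b = make_queries(len(database_a), wanted_index)
    return xor_bytes(server_answer(database_a, query_a),
                     server_answer(database_b, query_b))


# Toy database: 24 records of identical size, replicated on both servers.
database = [f"message-{i:02d}".encode() for i in range(24)]
print(retrieve(database, database, 14))
```

Each server sees what looks like a random subset request, and the client downloads one record's worth of data from each rather than the whole database. The cost moves to the servers, which must touch every record per query.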
    I'll finish just with a couple minutes on
  • 44:44 - 44:53
    multiparty chat or group chat, so small
    groups. You've sort of got a choice here
  • 44:53 - 44:58
    in terms of how existing chat systems are
    implementing group chat. One is you can
  • 44:58 - 45:02
    not tell the server about the group. And
    as someone who is part of the group, I
  • 45:02 - 45:06
    just send the same message to everyone in
    the group. And maybe I can tag it for them
  • 45:06 - 45:10
    so that they know it's part of the group
    or you do something more efficient where
  • 45:10 - 45:14
    you tell the server about group membership
    and I send the message once to the server
  • 45:14 - 45:23
    and it sends it to everyone in the group.
    Even if you don't tell the server about
  • 45:23 - 45:27
    it, though, you've got a bunch of things
    to worry about, like leaked correlation,
  • 45:27 - 45:32
    which is: if at a single time someone
    sends the same sized message to five other
  • 45:32 - 45:36
    people and then later someone else sends
    the same sized message to five other
  • 45:36 - 45:39
    people, and those basically overlap,
    someone in the network basically knows who
  • 45:39 - 45:43
    the group membership is. So it's actually
    quite difficult to conceal group
  • 45:43 - 45:49
    membership. The other thing that breaks
    down is our concept of deniability once
  • 45:49 - 45:53
    again, which is now if multiple people
    have this log, even if both of them
  • 45:53 - 45:57
    individually could have written it, the
    fact that they have the same cryptographic
  • 45:57 - 46:04
    keys from this other third party probably
    means that third party made that message.
  • 46:04 - 46:13
    So there continues to be work here. Signal
    is working on providing, again, an SGX and
  • 46:13 - 46:17
    centralized construction for group
    management to be able to scale better,
  • 46:17 - 46:22
    given I think the pretty realistic fact
    that the server in these cases is probably
  • 46:22 - 46:26
    going to be able to figure out group
    membership in some cases, so you might as well
  • 46:26 - 46:32
    make it scale. On the other side, one of
    the cool systems that's being prototyped
  • 46:32 - 46:40
    is called "Cwtch" out of Open Privacy.
    And this is an extension to Ricochet that
  • 46:40 - 46:46
    allows for offline messages and small
    group chats. It works for order of 5 to 20
  • 46:46 - 46:51
    people, and it works by having a server
    that obliviously forwards on messages to
  • 46:51 - 46:56
    everyone connected to it. So when I send a
    message to a group, the server sends the
  • 46:56 - 46:59
    message to everyone it knows about, not
    just the people in the group, and
  • 46:59 - 47:04
    therefore the server doesn't actually know
    the subgroups that exist. It just knows
  • 47:04 - 47:11
    who's connected to it. And that's a neat
    way. It doesn't necessarily scale to large
  • 47:11 - 47:16
    groups, but it allows for some concealing
    of group membership. They've got an
  • 47:16 - 47:22
    Android prototype as well that's sort of a
    nice extension to make this usable.
  • 47:22 - 47:34
    Wonderful. I guess the final thought here
    is: there's a lot of systems, I'm sure I
  • 47:34 - 47:40
    haven't mentioned all of them. But this
    community is really closely tied to the
  • 47:40 - 47:46
    innovations that are happening in the
    space of private chat. And this is the
  • 47:46 - 47:50
    infrastructure that supports communities
    and is some of the most meaningful stuff
  • 47:50 - 47:56
    you can possibly work on. And I encourage
    you to find new ones and look at a bunch
  • 47:56 - 48:00
    of them and think about the tradeoffs and
    encourage friends to play with new
  • 48:00 - 48:04
    systems, because that's how they gain
    adoption and how people figure out what
  • 48:04 - 48:10
    mechanisms do and don't work. So with
    that, I will take questions.
  • 48:10 - 48:18
  • 48:18 - 48:21
    Herald: Wasn't necessary to encourage you
    to come with an applause. There are
  • 48:21 - 48:25
    microphones that are numbered in the room,
    so if you start lining up behind the
  • 48:25 - 48:30
    microphones, then we can take your
    questions. We already have a question from
  • 48:30 - 48:36
    the Internet.
    Question: Popularity and independency are
  • 48:36 - 48:43
    a contradiction. How can I be sure that an
    increasingly popular messenger like Signal
  • 48:43 - 48:51
    stays independent?
    Answer: I guess I would question whether
  • 48:51 - 48:58
    independence is a goal in and of itself.
    It's true that the value is increasing.
  • 48:58 - 49:03
    And so one of the things I think about is,
    using systems that have open protocols
  • 49:03 - 49:07
    or that are federated or otherwise not
    centralized. And again, this is reducing
  • 49:07 - 49:13
    that need to have confidence in the future
    business model of a single legal entity.
  • 49:13 - 49:21
    But I don't know if independence of
    the company is the thing that you're
  • 49:21 - 49:25
    trying to trade off with popularity.
    Herald: Well, and we have questions at the
  • 49:25 - 49:28
    microphones. We'll start at microphone
    number one.
  • 49:28 - 49:34
    Question: Thanks for the talk. First of
    all, you talked a lot about
  • 49:34 - 49:41
    content and encryption. What about the
    initial problem? History shows that if I'm
  • 49:41 - 49:47
    an individual already observed in a
    sensitive area, there might be no need to
  • 49:47 - 49:53
    encrypt or decrypt the message on sending.
    It's already identified. I'm sending at a
  • 49:53 - 49:59
    specific location at a specific time. Is
    there any chance to hide that or do
  • 49:59 - 50:03
    something against it?
    Answer: So make things hidden again after
  • 50:03 - 50:13
    the fact? That seems very hard. I mean,
    so. So there's a couple thoughts there,
  • 50:13 - 50:21
    maybe. There's sort of this real world
    intersection attack, which is if
  • 50:21 - 50:25
    there's a real world observable action of
    who actually shows up at the protest,
  • 50:25 - 50:29
    that's a pretty good way to figure out who
    is chatting about the protests beforehand,
  • 50:29 - 50:37
    potentially. And so, I mean, I think what
    we've seen in real world organizing is
  • 50:37 - 50:42
    things like either really decentralizing
    that, where it happens across a lot of
  • 50:42 - 50:46
    platforms, and happens very spontaneously
    close to the event. So there's not enough
  • 50:46 - 50:56
    time to respond in advance, or
    hiding your presence or otherwise trying
  • 50:56 - 51:01
    to stagger your actual actions so that
    they are harder to correlate to a specific
  • 51:01 - 51:07
    group. But it's not something the chat
    systems are talking about. I don't think.
  • 51:07 - 51:11
    Herald: We have time for more questions.
    So please line up at the microphones and
  • 51:11 - 51:16
    if you're leaving, then leave quietly. We
    have a question from microphone number 4.
  • 51:16 - 51:19
    Question: So if network address
  • 51:19 - 51:24
    translation is the original sin to the end
    to end principle, and due to that, we now
  • 51:24 - 51:31
    have to run servers, someone has to pay
    for it. Do you know any solution to that
  • 51:31 - 51:38
    economic problem?
    Answer: I mean, we had to pay for things
  • 51:38 - 51:43
    even without network address translation,
    but we could move more of that cost to end
  • 51:43 - 51:50
    users. And so we have another opportunity
    with IPv6 to potentially keep more of
  • 51:50 - 51:54
    the cost with end users or develop
    protocols that are more decentralized
  • 51:54 - 52:00
    where that cost stays more fairly
    distributed. You know, our phones have a
  • 52:00 - 52:05
    huge amount of computation power and
    figuring out how we make our protocols so
  • 52:05 - 52:13
    that work happens there is, I think, an
    ongoing balance. I think some of the
  • 52:13 - 52:18
    reasons why network address translation or
    centralization is so common is because
  • 52:18 - 52:23
    distributed systems are pretty hard to
    build and pretty hard to gain confidence
  • 52:23 - 52:30
    in. So more tools around how we can test
    and feel like we understand that the
  • 52:30 - 52:35
    system actually is, you know, going to
    work 99.9% of the time for distributed
  • 52:35 - 52:39
    systems is going to make people less wary
    of working with them.
  • 52:39 - 52:43
    So better tools on distributed systems is
    maybe the best answer.
  • 52:43 - 52:48
    Herald: We also have another question from
    the internet, which we'll take now.
  • 52:48 - 52:53
    Question: What do you think of technical
    novices' acceptance of and dealing with OTR
  • 52:53 - 52:59
    keys, for example, Matrix Riot? Most
    people I know just click "I verified this
  • 52:59 - 53:03
    key" even if they didn't.
    Answer: Absolutely. So this, I think
  • 53:03 - 53:08
    goes back to a lot of these problems are
    sort of a user experience tradeoff, which
  • 53:08 - 53:14
    is, you know, we saw initial versions of
    Signal where you would actually try and
  • 53:14 - 53:19
    regularly verify some QR code between each other,
    and then that sort of has gotten pushed
  • 53:19 - 53:24
    back to a harder to access part of the
    user interface because not many people
  • 53:24 - 53:29
    wanted to deal with that. And in early
    Matrix Riot you would get a lot of
  • 53:29 - 53:33
    warnings about: There's a new device. Do
    you want to verify this new device? Do you
  • 53:33 - 53:37
    only want to send to the previous devices
    that you trusted. And now you're getting
  • 53:37 - 53:42
    the ability to sort of more automatically
    just sort of accept these changes and
  • 53:42 - 53:45
    you're weakening some amount of the
    encryption security, but you're getting a
  • 53:45 - 53:49
    better, smoother user interface because
    most users are just going to sort of click
  • 53:49 - 53:53
    "yes" because they want to send the
    message. Right. And so there's this
  • 53:53 - 53:56
    tradeoff: when you have built the
    protocols such that you are standing in
  • 53:56 - 54:00
    the way of the person doing what they want
    to do. That's not really where you want to
  • 54:00 - 54:06
    put that friction. So figuring out other
    ways where you can have this on the side
  • 54:06 - 54:13
    or supporting the communication rather
    than hindering it is probably the types of
  • 54:13 - 54:17
    user interfaces or systems that we should
    be thinking about that can be successful.
  • 54:17 - 54:20
    Herald: We have a couple of more
    questions. We'll start at microphone
  • 54:20 - 54:24
    number 3.
    Question: Thank you for your talk. You
  • 54:24 - 54:29
    talked about deniability by sending the
    private key with the last message.
  • 54:29 - 54:34
    And how do you get the private key for the
  • 54:29 - 54:34
    last message in the whole conversation?
    Answer: In the OTR, XMPP, Jabber systems
    there would be an explicit action to end
  • 54:45 - 54:50
    the conversation that would then make it
    repudiable, and that would send
  • 54:50 - 54:56
    that final message to close it. What
    you have in things like Signal is it's
  • 54:56 - 55:00
    actually happening every message as part
    of the confirmation of the message.
  • 55:00 - 55:03
    Question: OK. Thank you.
    Herald: We still probably have
  • 55:03 - 55:07
    time for more questions. So please
    line up if you have any. Don't hold back.
  • 55:07 - 55:10
    We have a question from
    microphone number 7.
  • 55:10 - 55:14
    Question: So, first of all, a brief
    comment. The Riot thing still doesn't even
  • 55:14 - 55:20
    do TOFU. They haven't figured this
    out. But I think there's a
  • 55:20 - 55:25
    much more subtle conversation that needs
    to happen around deniability, because most
  • 55:25 - 55:31
    of the time, if you have people with
    a power imbalance, the non-repudiable
  • 55:31 - 55:37
    conversation actually benefits the weaker
    person. So we actually don't want
  • 55:37 - 55:43
    deniability in most of our chat
    applications or whatever, except that's
  • 55:43 - 55:47
    still more subtle than that, because when
    you have people with equal power, maybe
  • 55:47 - 55:55
    you do. It's kind of weird.
    Answer: Absolutely. And I guess the other
  • 55:55 - 55:59
    part of that is, is that something that
    should be shown to users and is that a
  • 55:59 - 56:03
    concept? Is there a way that you express
    that notion in a way that users can
  • 56:03 - 56:08
    understand it and make good choices? Or is
    it just something that your system makes a
  • 56:08 - 56:13
    choice on for all of your users?
    Herald: We have one more question.
  • 56:13 - 56:17
    Microphone number seven, please line up if
    you have any more. We still have a couple
  • 56:17 - 56:20
    of more minutes. Microphone number seven,
  • 56:20 - 56:23
    Question: Hi, thanks for the talk. You
    talked about private information
  • 56:23 - 56:31
    retrieval and how that would stop the
    server from knowing who retrieved the
  • 56:31 - 56:36
    message. But for me, the question is, how
    do I find out in the first place which
  • 56:36 - 56:44
    message is for me? Because if we, for
    example, always use message slot 14, then
  • 56:44 - 56:54
    obviously over a conversation, it would
    again be possible to deanonymize the users
  • 56:54 - 56:59
    as in, OK, they're always accessing this
    one in all those queries.
  • 56:59 - 57:07
    Answer: Absolutely. So I didn't explain
    that part. The trick is that between the
  • 57:07 - 57:13
    two people, we will share some secret,
    which is our conversation secret. And what
  • 57:13 - 57:17
    we will use that conversation secret for
    is to seed a pseudorandom number
  • 57:17 - 57:21
    generator. And so we will be able to
    generate the same stream of random
  • 57:21 - 57:28
    numbers. And so each next message will go
    at the place determined by the next item
  • 57:28 - 57:33
    in that random number generator. And so
    now the person writing can just write at
  • 57:33 - 57:36
    random places, as far as the server can tell,
    and when it wants to write the next
  • 57:36 - 57:41
    message in this conversation, it'll
    make sure to write at that next place
  • 57:41 - 57:47
    in its random number generator for that
    conversation. There is a paper that will
  • 57:47 - 57:50
    describe a bunch more of that system.
    But that's the basic sketch.
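
    The slot selection described in this answer can be sketched roughly
    as follows. This is a minimal illustration, not the construction
    from the paper mentioned above: it assumes HMAC-SHA256 keyed with
    the conversation secret stands in for the pseudorandom number
    generator, and the names NUM_SLOTS, slot_for_message and
    slot_sequence are hypothetical.

```python
import hashlib
import hmac

# Hypothetical size of the server's message table (an assumption).
NUM_SLOTS = 1 << 20


def slot_for_message(conversation_secret: bytes, counter: int,
                     num_slots: int = NUM_SLOTS) -> int:
    """Return the slot index for the counter-th message of a conversation.

    HMAC-SHA256 is used here as a stand-in pseudorandom function: both
    parties hold the same secret, so both derive the same slot for the
    same counter, while the sequence looks random to anyone without it.
    """
    digest = hmac.new(conversation_secret,
                      counter.to_bytes(8, "big"),
                      hashlib.sha256).digest()
    # Reduce the 256-bit output to a slot index; the modulo bias is
    # negligible for table sizes this small relative to 2**256.
    return int.from_bytes(digest, "big") % num_slots


def slot_sequence(conversation_secret: bytes, count: int,
                  num_slots: int = NUM_SLOTS) -> list:
    """Return the slots for the first `count` messages, in order."""
    return [slot_for_message(conversation_secret, i, num_slots)
            for i in range(count)]


if __name__ == "__main__":
    secret = b"secret shared by the two people"
    writer_slots = slot_sequence(secret, 3)   # where the sender writes
    reader_slots = slot_sequence(secret, 3)   # where the receiver looks
    print(writer_slots == reader_slots)
```

    The sender writes message i at slot_for_message(secret, i), and the
    receiver asks for that same slot through private information
    retrieval, so no fixed slot such as "14" is reused across the
    conversation.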
  • 57:50 - 57:54
    Question: Thank you.
    Herald: We have a question from the Internet.
  • 57:54 - 57:59
    Question: It seems like identity is the
    weak point of the new breed of messaging
  • 57:59 - 58:03
    apps. How do we solve this part
    of Zooko's triangle, the need for
  • 58:03 - 58:08
    identifiers and to find people?
    Answer: Identity is hard, and I think
  • 58:08 - 58:18
    identity has always been hard and will
    continue to be hard. Having a variety of
  • 58:18 - 58:23
    ways to be identified, I think remains
    important and is why there isn't a single
  • 58:23 - 58:27
    winner-takes-all system that we use for
    chat. But rather you have a lot of
  • 58:27 - 58:31
    different chat protocols that you use for
    different circles and different social
  • 58:31 - 58:35
    circles that you find yourself in. And
    part of that is our desire to not be
  • 58:35 - 58:39
    confined to a single identity, but to be
    able to have different facets to our
  • 58:39 - 58:45
    personalities. There are systems where you
    can identify yourself with a unique
  • 58:45 - 58:48
    identifier to each person you talk to
    rather than having a single identity
  • 58:48 - 58:54
    within the system. So that's something
    else that Pond would do, which was that the
  • 58:54 - 58:58
    identifier that you gave out to each
    separate friend was different. And so
  • 58:58 - 59:04
    you would appear as a totally separate
    user to each of them. It turns out that's
  • 59:04 - 59:10
    at the same time very difficult, because
    if I post an identifier publicly, suddenly
  • 59:10 - 59:15
    that identifier is now linked to me for
    everyone who uses that identifier. So you
  • 59:15 - 59:18
    have to give these out privately in a one
    on one setting, which limits your
  • 59:18 - 59:23
    discoverability. So that concept of
    how we deal with identities I think is
  • 59:23 - 59:27
    inherently messy and inherently something
    that there's not going to be a
  • 59:27 - 59:32
    satisfying solution for.
    Herald: And that was the final question
  • 59:32 - 59:35
    concluding this talk. Please give a big
    round of applause for Will Scott.
  • 59:35 - 59:36
    Will: Thank you.
  • 59:36 - 59:41
    Postroll music
  • 59:41 - 60:04

