33C3 preroll music
Herald: Good morning everyone, thanks for
showing up in such great numbers, that's
always a good thing
for such an early session.
First of all I would like to ask you
a question, I mean... or
let's start like that: Last night I had
a weird encounter with a locked door
out of the fate that we endured during
this week we were out of our apartment
and the hotel owner let us stay in their
office, but the guy who stayed there
put the dead lock on so we tried to reach
him. Hmmm, how do you reach them?
We thought about maybe he has some
messaging, maybe he has some mobile number,
no landline, landline, they have landline.
It turned out that the guy
was not at the landline, out, exit, and
so we looked around in the bar.
So this wouldn't have happened
if he had mobile messaging, so,
to dive into that, if we could
just text him: "Hey, we are at the hotel,
please open the door" we would have had
one hour more sleep tonight.
So let's dive in
with, yeah, the talk of today.
So this morning session starts with
our speakers Roland Schilling
and Frieder Steinmetz.
applause
And they will be talking about...
they will at first give you a gentle
introduction into Mobile Messaging. I have
nine messaging apps on my phone, no, ten!
The organizers forced me to
install another messaging app.
And after that [they] give you a quick
analysis, or not so quick, I don't know,
a deep analysis of the Threema protocol.
So let's give another round of
applause for our speakers!
applause
Roland: Thank you, Thilo. I am Roland,
this is Frieder, and
as Thilo already said when he introduced
us, we are going to talk about
secure messaging. More specifically, we are
trying to give a very broad introduction
to the topic, because we want to make a
field that is somewhat complex available
to a broader audience, so as
to leave our expert bubble
and get knowledge of the technology
that people use every day
to the people who are using it.
To do that we have to start
at a very low level which might mean for
the security and crypto nerds in the room
that you will see a lot of things that you
already know. But bear with us, please,
since we are specifically trying, at least
with the first part of the talk, to convey
a few of these mechanisms
that drive encrypted messaging
to people who are new to the field.
So what we are going
to try today is basically three
things: We will try to
outline the privacy expectations we have when we
communicate. We are going to do that
by sketching a communication scenario
for you and identifying
what expectations we can derive
from it. We are going to
look at an analogy that
helps us map these expectations to mobile
messaging. And then we are going to look
at specific technical solutions
that make it possible to make mobile
messaging secure and give us the same
privacy guarantees that a one-to-one talk
would. After that, in the second part of the
talk, we move on to look at a specific
implementation, and it's no secret anymore
that this is going to be the
implementation of Threema. So let's just
dive right in.
You are at a party, a party in a house
full of people and a friend approaches
you wanting to have a private
conversation. Now what do you do? You
ideally would find a place at this party
that is, well, private, and in our
scenario you find a room, maybe the
bedroom of the host where nobody's in
there, you enter the room, you close the
door behind you. Meaning you are now
private, you have a one-on-one,
one-to-one session in this room in
private. And we are going to look at
what that means.
First of all, the most intuitive
one is what we call ‘confidentiality’, and
that means that since nobody is there in
the room with you, you are absolutely sure
that anything you say and anything your
communication partner says, if you imagine
Frieder and me having this conversation,
can only be heard by the other person.
If that is guaranteed, we call this
confidentiality, because nobody who's
not intended to overhear any of
the conversation will be able to.
The second
claim that we make is: if you guys
know each other, and again,
if I have a talk with Frieder, I
have known him for a long time,
more than five years now, I know what
his face looks like, I know his voice,
I know that if I talk to him I actually
talk to ‘him’, meaning I know exactly
who my communication partner is,
and the same thing goes vice versa.
So if this is achieved, if we can say
I definitely know who I'm talking to,
there's no chance that somebody else
steps in and poses as Frieder,
we call this ‘authenticity’.
Moving on. Integrity.
This is where
the analogy falls short,
well, somewhat. But, basically, integrity means I can
make sure that everything I say
reaches Frieder exactly the way I wanted
to say it and there is no messenger
in between: I'm not telling a third friend
"Please tell Frieder something" who
will then alter the message because
he remembered it wrong or
has malicious intentions. If I can
make sure that everything I say
is received by Frieder exactly the way
I said it, then we have ‘integrity’
on our communication channel.
Okay. The next two are ones
that are a bit hard to grasp at first.
Therefore we are going to take a few
minutes to look at these, and they are
‘forward and future secrecy’. Suppose
somebody entered the room while we had our
talk, and that person would stay a while,
overhear some portion of our talk, and
then leave the room again. Now,
at the
point where they entered the room they
wouldn't learn anything about the
conversation that we had before, which is
intuitive in this scenario, that's
why we chose it. They enter the room, and
everything they can overhear is only the
portion of the talk that takes place while
they are in the room. They don't learn
anything about what we said before,
meaning we have what we call forward
secrecy, we'll get back to that. And
after they left they wouldn't be able to
overhear anything more that we
say. This is what we call future secrecy.
Because those are a bit hard to understand,
we have made a graphic here. And we are
going to get back to this graphic when we
translate this, so I'm going to take a
minute to introduce it. We have a time
line that is blue and goes from left to
right, and on this time line we have a green
bar that denotes our secret
conversation. The first pink bar there is
when the third person enters the room.
Then our secret conversation turns orange
because it's no longer secret, it's now
overheard by the third person, and after
they left they wouldn't know anything that
was said after that. So the left part of
it, meaning the fact that they can't hear
anything from the past, is what we call
forward secrecy, and if they can't learn
anything after they left we call it future
secrecy. Okay, the
last one that we're going to talk about
since we're trying to keep things simple
is deniability. Since we are only two
people in the room and there are no
witnesses, we achieve deniability. Say that
after we have had this talk we return to the
party and people ask us what happened:
I can always point to Frieder, as you
could to your friend, and say he said
something. Frieder could always say, "No, I
didn't", and it would be my word against
his. If our
scenario allows for this, we have
deniability, because each of us can
always deny having said or not having said
something.
And now we are going to look at messaging.
Now, in messaging a third player comes into
the room, and this could be your provider
if we talk about text messaging, like the short
messages that we used to send in the 90s.
It could be your messaging provider if you
use something more sophisticated; it could
be WhatsApp, for example, or it could be Apple,
depending on what your favorite messenger
is. Unless you use
federated systems, and some of
you might think, "But I'm using Jabber",
I know, but we are looking at centralized
systems right now, and in these there will
always be one third party that all
messages go through, whether you want it
or not, and whether you're aware of it or
not. And this brings us to our second
analogy, which is postal services. Messaging
feels like you have a private
conversation with the other person, and I
think everyone can relate to that: you have
your phone, you are
shown the conversation, and it
looks like only you and this other person,
in my case Frieder, are having this
conversation. We feel like we have a
private conversation, while actually our
messages go through a service provider all
the time. Meaning we are now looking
at something more akin to postal
services. We prepare a message and send it
off, our messaging provider takes the
message, takes it to our intended
recipient, and they can then read the
message. And this applies to
all the messages we exchange. And to
underline that, we're going to look at what
I initially called traditional messaging,
meaning text messaging, unencrypted SMS
messaging. As you may or may not be
aware, these messages also go through
our providers, more than one provider
even. Say I'm with Vodafone and Frieder is
with Verizon, I don't know. I would send
my messages to Vodafone, they would
forward them to Verizon, who would then
deliver them to Frieder's phone. So since
both of our providers would know all the
messages, which are unencrypted, we would
have no confidentiality.
They could change the messages, and these
things have actually happened. So we
don't have any integrity: we don't know if
the messages received are actually the
ones that were sent. We also have no
authentication, because phone numbers are
very weak for authenticating people. They
are managed by our providers, and
there's no fixed
mapping to our phones or our SIM cards.
They can be changed, they can be rerouted,
so we never know if the messages
we send are actually received by the
people we intended them for: no authenticity and
no authentication. Now, forward secrecy and
future secrecy don't even apply because we
have no secrecy. We do have some sort of
deniability, but this goes into
philosophical claims: when I say
I haven't sent anything, this must have
been the provider, can they technically,
you know, guarantee they did or did not do
something? So let's not dive too deeply
into that discussion, but we can summarize
that messaging, at least
traditional messaging, translates very
badly to our privacy expectations when we
think of a communication. Okay, moving on.
Looking at our postal analogy, our messages
are actually more like postcards.
Because they are plain, our providers can
look at them, can change them, you know,
all the things we've just described, just
as they could with a postcard. They can see the
intended recipient, they can look at the
sender, they can look at the text and change
it: postcards. And what we want to achieve
now is to find a way to wrap these postcards
and make them more like letters, assuming
that postal services don't open letters.
That's the one point in this
analogy that we have to define. And
to be able to do that we're going to
try to give you the shortest
introduction to
encryption
that you will ever get, starting with
symmetric encryption.
Now, encryption, for those of you who
don't know, is what we call the
translation of plain, readable text into
text that looks like it's random, but it
can be reversed and turned back into plain
text provided we have the right key for
that. So to stick with a very simple
example please imagine this box that we've
just labeled crypto, and we are not
concerned with what's in the box we just
imagine it as a machine. Please imagine it
as a machine that takes two inputs the
plaintext and the key, and it produces
something that we call ciphertext.
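To make the "crypto" machine a bit more concrete, here is a minimal Python sketch. It assumes a toy keystream cipher built from SHA-256; the construction and the names `keystream` and `crypto_machine` are illustrative assumptions, not what any real messenger uses, and it is not secure. It only shows the shape of the idea: plaintext and key go in, ciphertext comes out, and the same key turns the ciphertext back into plaintext.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def crypto_machine(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; running it again with the same key undoes it."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

key = b"a key both sides know"
plaintext = b"Please open the door"
ciphertext = crypto_machine(key, plaintext)   # looks like random bytes
recovered = crypto_machine(key, ciphertext)   # same machine, same key, in reverse
```

Running the machine on the ciphertext with a different key just produces more random-looking bytes, which is the property the talk relies on.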
The ciphertext is indistinguishable from
random text, but it can be reversed on the
recipient's side using the same key and
basically the same machine, just doing the
operation, you know, in reverse: turning
the ciphertext back into plaintext. This
is what we
call symmetric encryption, because if you
imagine a line where the ciphertext is,
you could basically mirror the thing onto
the other side, so it's symmetric along
that line. And when there's
something that is called symmetric, there
is also something that is called
asymmetric, and asymmetric encryption works
in much the same way, only there are
now two keys. We have made them a yellow
one and a blue one. These keys are called
a key pair. They are mathematically
linked. And the way this works now is that
anything encrypted with one of these keys
can only be decrypted with the other one.
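As a small illustration of two mathematically linked keys, here is a sketch using textbook RSA with tiny primes. RSA is our choice for the example, not something the talk names at this point, and numbers this small offer no security at all. The variable names `yellow` and `blue` simply follow the colors on the slide.

```python
# Textbook RSA with tiny primes: a toy, never use numbers this small.
p, q = 61, 53
n = p * q                     # modulus used with both keys
phi = (p - 1) * (q - 1)

yellow = 17                   # one key of the pair
blue = pow(yellow, -1, phi)   # the mathematically linked other key (Python 3.8+)

message = 65                  # in this toy, messages are numbers smaller than n

# Encrypt with yellow, decrypt with blue.
locked_with_yellow = pow(message, yellow, n)
opened_with_blue = pow(locked_with_yellow, blue, n)

# It also works the other way round.
locked_with_blue = pow(message, blue, n)
opened_with_yellow = pow(locked_with_blue, yellow, n)
```

Applying the yellow key a second time to `locked_with_yellow` does not give the message back; only the blue key does.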
You can do it both ways, but the important
thing to remember here is just: anything I
encrypt with the yellow key can only be
decrypted with the blue key. Okay, since
we have that now, let's capitalize on
this scenario. Imagine each of our
communication partners now has one of
these two keys, and we are still talking
about the same key pair that we've
outlined on the previous slide. Now we
call one of them a secret key and one of
them a public key. This is probably known
to most of you: traditional public key
cryptography.
We've added something that is called an
identity in this picture; we will
get back
to that in a minute. But the scenario we
want you to envision right now is
that both parties would publish their
public key to the public, and we are going
to get back to what that means as well,
and keep their secret key, as the name
says, secret. Some of you might know this
as a private key: the same
concept applies. We just chose to call it
secret key because it more clearly
denotes that it's actually secret and
never published. So this would mean any
message that is encrypted
with one party's public key could
then only be decrypted with that party's
secret key, putting us in a position where
I could take Frieder's public key, encrypt
my message, send it to him, and I would
know that he would be the only one able to
decrypt the message, as long as his
secret key remains, well, secret.
And he doesn't publish it. Well,
the problem is: it's a very expensive
scenario. We get something akin to a
postal service, where we can
now encrypt the message; envision it
like putting a plain sheet of paper into
an envelope and sealing it. We would put it on
its way, and nobody along the line would be able
to look into the letter. They would of
course, since there are addresses on
there, see who it is from and
who it is to, but they couldn't look inside
the letter: this is achieved. But as I've
already said, it's a very expensive
mechanism, and by that we mean it is hard
to do for devices, especially since you
are doing mobile messaging on your phones:
it is especially hard to do on small
devices like phones. So it would be nice if we had a
mechanism that would allow us to combine
symmetric and asymmetric encryption. And
it turns out we do. And we are going to
keep this very simple by just looking at
what is called key establishment, and then
again also just one particular way of key
establishment. We have two new boxes here:
they are called key generators. And the
scheme
that we are
looking at right now works the
following way: you take one of the
secret keys and another
public key, the one of the other
party, and put them into the key generator.
And remember, these keys are
mathematically linked: each secret key
belongs to exactly one public key. And the
way this key generator works is that,
through this
mathematical linking, it doesn't matter if
you take, let's call the parties
Alice and Bob, Alice's secret
key and Bob's public key, or Bob's secret key
and Alice's public key: you will always
come up with the same key. And we call
this a shared key, because this key can
be generated independently
on both sides. It can then be used for
symmetric encryption, and as we've already
told you, symmetric encryption is a lot
cheaper than asymmetric encryption. So
this has one advantage and one
disadvantage. The advantage, as I've already
said, is that it's way cheaper, and
the advantage is also that we
come up with the key on both sides. And
the disadvantage is that we come up with
one key on both sides, because, whether or
not you've realized this by now, since this
is a very static scheme we always come up
with the same key. That is going to be a
problem in a minute. So let's recap: we
have looked at asymmetric encryption, which,
as I've said, gives us IDs, and we're going
to look at what that means. But it is very
expensive. We know that symmetric
encryption is cheap, but we have to find a
way to get the key delivered to both
parties before they can even start
encrypting their communication. And we
have looked at key establishment, which
gives us symmetric keys
based on asymmetric key pairs. Meaning we
have now basically achieved
confidentiality: we can use these keys,
put them in the machines with our
plaintext, get ciphertext, and
we are able to transport it to the other
side. Nobody can look inside.
Confidentiality is achieved.
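The key generator boxes can be sketched in a few lines of Python. This is a toy Diffie-Hellman exchange over integers, assuming illustrative parameters `P` and `G` and hypothetical helper names `make_key_pair` and `key_generator`; the talk does not say at this point which scheme is actually used, and real systems use vetted groups or elliptic curves instead.

```python
import hashlib
import secrets

# Toy public parameters, assumed for illustration only.
P = 2**127 - 1   # a Mersenne prime
G = 3

def make_key_pair():
    """Return (secret key, public key); the two are mathematically linked."""
    secret = secrets.randbelow(P - 2) + 1    # stays on the owner's device
    public = pow(G, secret, P)               # may be published
    return secret, public

def key_generator(own_secret: int, other_public: int) -> bytes:
    """Combine one secret key with the other party's public key into a symmetric key."""
    shared_number = pow(other_public, own_secret, P)
    return hashlib.sha256(shared_number.to_bytes(16, "big")).digest()

alice_secret, alice_public = make_key_pair()
bob_secret, bob_public = make_key_pair()

key_at_alice = key_generator(alice_secret, bob_public)   # computed on Alice's side
key_at_bob = key_generator(bob_secret, alice_public)     # computed on Bob's side
```

Both sides end up with the same 32-byte key without ever sending it. Because the key pairs are static, calling `key_generator` again with the same inputs yields the same key every time, which is the disadvantage mentioned above.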
Now deniability. Deniability in this
scenario, if you
think back to our initial sketch, where we
could say "I haven't said that" and the
other guy couldn't prove that we did,
would in this case be a letter that was
sent to both of the participants, and it
could be from either of the participants.
So that when looking at this
cryptographically, we couldn't say this
was sent by me or this was sent by Frieder.
You could just see it was sent by, well,
either of us. And if you think of the
scheme that we've just sketched, since
both parties come up with the same key by
using a different set
of keys to generate it, basically the
same key can be generated on both sides.
And you can never really say, by just
looking at a message, whether it was encrypted
with a shared key generated on one or on
the other side, since they are the same.
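One way to see this in code is a minimal sketch that assumes the shared key is used to compute an authentication tag over a message. HMAC-SHA256 is our stand-in here, not something named in the talk, and the key bytes are made up for the example.

```python
import hashlib
import hmac

# Assumption: both parties hold this same key after key establishment.
shared_key = hashlib.sha256(b"stand-in for the established shared key").digest()

def tag(key: bytes, message: bytes) -> bytes:
    """Compute a tag that only a holder of the shared key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()

message = b"I never said that"
made_by_roland = tag(shared_key, message)    # computed on one side
made_by_frieder = tag(shared_key, message)   # computed on the other side
```

The two tags are identical, so the tag convinces each participant that the message came from someone holding the shared key, but it cannot show a third party which of the two produced it.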
So, very simply and on a very high level,
we have now achieved deniability. What
about forward and future secrecy? You
remember this picture? Our overheard
conversation at the party that we were at
at the beginning of the talk? Well, this
picture now changes to this. And what we
are looking at now is something we call
key compromise and key renegotiation. Key
compromise would be the scenario where one
of our keys was lost, and we are talking
about the shared key that we generated
now. If it fell into the
hands of an attacker, this attacker would
be able to decrypt our messages, because
it's the same key that we used for that.
Now, if at the point where the key
was compromised they weren't able to
decrypt anything prior to that point, we
would have forward secrecy. And if we had
a way to renegotiate keys, and they would
be completely different, not
linked to the ones we had before, and then
use those in the future, we would have
future secrecy. But we don't, since, as
we've already said, the keys that we
generate are always the same. And we want
you to keep this in mind because
we will get
back to this
when we look at Threema in more detail.
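The difference between the static scheme and key renegotiation can be sketched as follows. This reuses a toy Diffie-Hellman exchange with assumed parameters `P` and `G`; `negotiate_session_key` is a hypothetical helper that stands in for both parties generating fresh, throwaway key pairs per session.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime, for illustration only
G = 3

def make_key_pair():
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)

def shared_key(own_secret: int, other_public: int) -> bytes:
    number = pow(other_public, own_secret, P)
    return hashlib.sha256(number.to_bytes(16, "big")).digest()

# Static scheme: long-term key pairs are reused, so every session gets the same key.
alice, bob = make_key_pair(), make_key_pair()
static_session_1 = shared_key(alice[0], bob[1])
static_session_2 = shared_key(alice[0], bob[1])

# Renegotiation: fresh, unlinked key pairs per session, discarded afterwards.
def negotiate_session_key() -> bytes:
    a_secret, _a_public = make_key_pair()
    _b_secret, b_public = make_key_pair()
    return shared_key(a_secret, b_public)

fresh_session_1 = negotiate_session_key()
fresh_session_2 = negotiate_session_key()
```

With static keys, one compromised key opens every session. With fresh keys per session, a key that leaks from one session says nothing about the others. A real protocol would additionally have to authenticate the fresh public keys, which this sketch leaves out.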
Yeah, if we had a way to dump keys after
having used them, we could achieve forward
and future secrecy. Since we don't, we
can't right now. Okay, next recap: our key
establishment protocol gives us
confidentiality, deniability, and
authenticity. We don't have forward and
future secrecy. And if you've stuck with
us you will have realized we are omitting
integrity here. That is because we don't
want to introduce a new concept right now,
but we will get back to that, and you will
see that when we look at Threema it
actually does have integrity. Now,
basically you could think
we fixed everything, but you
heard us talk about things like IDs, and
we haven't really said
many words about them, so
we're going to look at that now. And we
are going to start with a quote by my very
own professor - don't worry you don't have
to read that, I'm going to do it for you.
My professor says, "cryptography is
rarely, if ever, the solution to a
security problem. Cryptography is a
translation mechanism, usually converting
a communications security problem into a
key management problem." And if you think
about it, this is exactly what we have now,
because I know that Frieder has a
secret key and a public
key. He knows that I have a secret key and
a public key. How do I know which one of
those public keys that are out in the open is
actually his? How would I communicate to
him what my public key is? Those of you
who've used PGP, for example,
in the last couple of decades know
what I'm talking about. And we have the
same problem everywhere where public key
cryptography is used, so we also have the
same problem in mobile messaging. To the
rescue comes our messaging server,
because, since we have a central instance
in between us, we can now query this
instance:
I can take my public key and my identity and
tell the messaging server,
"Hey, messaging server, this is my
identity and my public key. Please store them for me." So that
Frieder, who has
some kind of information to
identify me, can then query it and get my
public key back. This of course assumes
that we trust the messaging
server. We may or may not do that. But for
now we have a way to at least communicate
our public keys to other parties. Now,
what can we use as identities here? In
our figure here it's very
simple: Alice just goes to the messaging
server and says, "Hey, what's the public
key for Bob?" And the messaging server
magically knows who Bob is and what his
public key is. And the same thing
works the other way. The
question now is what is a good ID in this
scenario. Remember we are on phones, so we
could think of using phone numbers, we
could think of using email addresses, we
could think of something else. And
something else will be the interesting
part, but let's look at the other options
one by one.
Phone numbers can identify users. You
remember that you rely on your providers
for the mapping between phone numbers and
SIM cards, so you have to trust another
instance in this situation. We're going to
ignore that completely, because we find
that phone numbers are personal
information. I, for one, have had my phone
number, and I mean the same phone number,
for like 18 years now. I
wouldn't want that to get into the wrong
hands. And as for using it to identify me as a
person, or, you know, my cryptographic
identity that is bound to my keys: I
wouldn't necessarily want to use it for that,
because I wouldn't be able to change it, and
I would want to change it if it ever got
compromised. Now something else comes to
mind: e-mail addresses. E-mail addresses
basically are also personal information.
They are a bit shorter-lived, we would
argue, than phone numbers, and you
can use temporary e-mail addresses; you
are way more flexible with
e-mails. But ideally we want to have
something that we call dedicated
IDs, meaning something that identifies me
only within the bounds of the service that
we use.
So that's what we want to have. We are
going to show you how this might work, but
we still have to find a way to verify
ownership, because this is a scenario that
is more or less likely to happen: I am
presented with a number of public keys for
an identity that I know, and
I have to find a
way to verify which one is the right
one, the one that is actually used.
Maybe Frieder has used quite a number of
public keys, he's a lazy guy. He forgets
to, you know, take his keys from one
machine to the other: he just, you know,
buys a new laptop, sets up a new public
key: bam, he has two. Which one am I
supposed to use right now? Now
remember that we are looking at the
messaging server for, you know, key
brokerage, and we are now going to add a
third line here, and that is this one.
Basically we introduce a way to meet in
person, and again PGP veterans will know
what I'm talking about, and verify our
keys independently. We've chosen QR codes
here: Threema uses QR codes, many other
messengers do as well. And we want to
tell you why it is an important
feature to be able to verify our public
keys independently of the messaging
server. Because once we have done that, we no
longer have to trust the messaging server;
we no
longer have to trust its promise that this
is actually the key we are looking for. We
have verified that independently. Okay, we
have basically solved our authenticity
problem. We know that we can identify
users by phone numbers and emails, and you
remember our queries to the server for
Bob: we can still use phone numbers for
that if we want to. We can use emails for
that if we want to. We don't have to. We can
use our IDs anonymously. But we have a way
to verify them independently. The
remaining problem is users changing their
IDs: that is where we have to verify
again. And we will also get back to that later,
but I want to look at something else
first, and that is the handling of
metadata.
Now, we know that an attacker can no
longer look inside our messages. They can,
however, still see the addressee and who the
message is from, they can see how
large the message is, and they
can look at timestamps and stuff like
that. And since we are getting a bit tight
on the clock, I'm going to try to
accelerate this a bit. Metadata handling:
we want to conceal now who a message is
from and who a message is to. And we are
doing this by taking the envelope that
we've just generated, wrapping it into
another envelope, and then sending that to
the messaging server first. And the
messaging server gets a lot of envelopes.
They are all just addressed to the
messaging server, so anyone on the network
would basically see there's one
party sending a lot of messages to the
messaging server; maybe there are a lot of
parties. But they couldn't
look at the end-to-end channel, as we call
it, that is, see what the addresses
on each inner envelope are. The
messaging server, however, can. It would
open the outer envelope, look
at the inside, see, "Okay, this is a
message directed at Alice," and wrap it into
another envelope that would just say,
"This is a message from the messaging
server and it is directed to Alice." She
would then be able to, you know, open the
outer envelope, open the inner envelope,
and see this is actually a message from Bob.
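The envelope-in-envelope idea can be sketched like this. The sketch assumes a toy XOR keystream cipher (not secure) and three made-up keys: one end-to-end key between Bob and Alice, and one key for each client-to-server channel. All names here are hypothetical and only serve the illustration.

```python
import hashlib
import json

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy XOR keystream cipher (not secure); applying it twice with the same key decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

def seal(key: bytes, envelope: dict) -> str:
    return toy_cipher(key, json.dumps(envelope).encode()).hex()

def open_envelope(key: bytes, sealed: str) -> dict:
    return json.loads(toy_cipher(key, bytes.fromhex(sealed)).decode())

# Hypothetical keys: one end-to-end key, plus one key per client-to-server channel.
bob_alice_key = b"end-to-end key Bob<->Alice"
bob_server_key = b"channel key Bob<->server"
alice_server_key = b"channel key Alice<->server"

# Bob: inner envelope for Alice, wrapped in an outer envelope for the server.
inner = {"from": "Bob", "to": "Alice", "body": seal(bob_alice_key, {"text": "Hi Alice"})}
to_server = seal(bob_server_key, inner)

# Server: opens the outer envelope, sees the addresses but not the text, re-wraps for Alice.
seen_by_server = open_envelope(bob_server_key, to_server)
to_alice = seal(alice_server_key, seen_by_server)

# Alice: opens the outer envelope, then the inner one.
received = open_envelope(alice_server_key, to_alice)
text = open_envelope(bob_alice_key, received["body"])["text"]
```

An observer on the network only sees `to_server` and `to_alice`, both addressed to or from the server. The server sees sender and recipient but only the sealed body.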
And what we have thereby achieved is a
two-layer end-to-end communication
tunnel, as we call it, where the purple and
the blue bar are encrypted channels
between each communication partner and
the messaging server, and they carry an
encrypted tunnel between
both communication partners
directly. But, and we've had this caveat
before, the messaging server still knows
both communication partners, and it still
knows the times that the messages were
sent. It also knows the size of the
message. But we can do something against
that. And what we do is introduce
padding, meaning
in the inner envelope we
just stick a bunch of extra
pages so the envelope looks a bit thicker.
And we do that by just appending random
information to the actual message before
we encrypt it. So anyone looking at the
encrypted message would just see a large
message. And, of course, that should be
random information every time, and it
should never have the same length
twice. But if we can achieve that, we can
at least conceal the size of the message.
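A minimal padding sketch, under stated assumptions: the padding length is drawn at random between 1 and 255 bytes, and each padding byte stores that length so the receiver can strip it again after decryption. Whether a given messenger pads exactly this way is not something this part of the talk states; `pad` and `unpad` are illustrative names.

```python
import secrets

def pad(message: bytes) -> bytes:
    """Append 1..255 padding bytes; each padding byte stores the padding length."""
    pad_len = secrets.randbelow(255) + 1
    return message + bytes([pad_len]) * pad_len

def unpad(padded: bytes) -> bytes:
    """Read the padding length from the last byte and strip that many bytes."""
    pad_len = padded[-1]
    return padded[:-pad_len]

message = b"ok"
padded = pad(message)   # this is what would go into the encryption machine
```

Because the padding is added before encryption, an observer sees only the total length. A random length makes repeats unlikely rather than impossible, and it hides the exact size only within the padding range.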
Now, so much for our gentle introduction to
mobile messaging. And for those of
you who stuck around, we are now moving on to
analyze Threema. Now I want to say a few
things before we do that: we are not
affiliated with Threema, and we are
not here to recommend the app
or the service to you. We didn't do any kind
of formal analysis. There will be no
guarantees. We will not be quoted as
saying, "use it or don't use it." What we
want to do is make more people aware of
the mechanisms that are in use, and we have
chosen basically a random messaging provider;
we could have chosen anyone. We chose
Threema for the fact that they do offer
dedicated IDs, that they don't bind keys
to phone numbers, which many messengers
do. Those of you who use WhatsApp know
what I'm talking about. And well, since it
is closed source, we found it interesting
to look at what is actually happening inside
the app and make that publicly known. Now,
we are not the only ones who've done this,
we are also not the first ones who've done
this, and we don't claim we are. But we
are here now and we want to try to make
you aware of the inner workings of the app
as far as we have understood them. And with
that I hand the presenter over to Frieder.
Applause
Frieder: So I'll be presenting to you our
understanding of the Threema protocol and
how the application works, as we deduced
mostly from reverse engineering the
Android app. This won't be a
complete picture, but it will be a
picture presenting to you the most
essential features and how the protocol
works. And I'll start by giving you a
bird's eye view of the overall
architecture. While Roland was giving you
this abstract introduction to mobile
messaging, there was always this
third party, this messaging provider.
And this now actually becomes three
entities, because Threema has three
different servers, doing, well,
very different things to make the app
work. And I'll start with the directory
server, in orange at the bottom, because
that is the server you most likely will be
contacting first if you want to
engage in a conversation with someone
you never talked to before. Because this
is the server that handles all the
identity and public key related stuff that
Roland was talking about so much. This is
the server you'll be querying, for example:
I have this Threema ID,
what's the corresponding public key?
Stuff like that. Above that there
is the messaging server, which is kind of
the central entity in this whole
scenario, because its task is relaying
messages from one communication partner to
another. And above that we have the media
server, and I'll be talking about that
later. In short, its
purpose is storing large media files like
images and videos you send to your
communication partners. But as I said, I
want to start with the directory server,
and in the case of Threema, this directory
server offers a REST API, so
communication with this server happens
via HTTP. It is HTTPS actually, so it's
TLS encrypted. And this encryption also
fulfills all the requirements you would
have for a proper TLS connection.
So, if you want to communicate with
a new person and you have
their phone
number,
their email address, or their Threema ID,
your app will be asking the
directory server, "Hey, I have this phone
number, do you have a corresponding
Threema account and public key?" And the
response will hopefully be, "Yes, I do,
that's the public key, that's the Threema ID:
go ahead."
And as Roland said, we kind of chose Threema
for the arbitrary use of IDs and
especially for the system of verifying
fingerprints in person by scanning QR
codes. Because this is something
Threema has and other messengers do not
have, I want to talk a little bit about
that. If you just ask the
directory server, "Hey, I have a Threema ID,
what is the corresponding public key?" the
Threema application will say, "Okay, I got
an answer from the directory server, I
have a public key, but I have very little
trust that you actually know who the real
person behind this Threema account is,
we're not quite sure about that." So it'll
mark this contact with one red dot. If
you had a phone number or an email address
and asked the directory server, "Hey,
what's the corresponding Threema account
and public key?" the app will say, "Okay, we
still have to trust the directory server,
but we're a little bit more confident that
the person on the other end is actually
who you think they are, because you have a
phone number probably linked to a real
person and you have a better idea who
you're talking to, but still we rely on the
Threema server." So it'll mark a contact
like that with two orange dots. And then
there is the final stage: if you met
someone in person and scanned their
public key and Threema ID in the form of a
QR code, such a contact will be marked with
three green dots, and in that case the app
says, "We're 100% confident we're talking
to the person we want to talk to and we
have the proper keys." So right now,
if we think of starting a conversation,
we are at the point where we have all the
necessary details to start encrypting our
communication. But the question remains: how
do we encrypt our communication
in the case of Threema?
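As an aside before the encryption details, the three verification levels just described can be sketched in a few lines. This is only an illustration, not Threema's code: the `source` labels and the names `Verification` and `verification_level` are hypothetical.

```python
from enum import Enum


class Verification(Enum):
    RED = 1     # one dot: key looked up by Threema ID only
    ORANGE = 2  # two dots: account matched via phone number or email
    GREEN = 3   # three dots: QR code scanned in person


def verification_level(source: str) -> Verification:
    """Map how a contact's key was obtained to a verification level.

    `source` is a hypothetical label: "id", "phone", "email" or "qr".
    """
    if source == "qr":
        return Verification.GREEN
    if source in ("phone", "email"):
        return Verification.ORANGE
    if source == "id":
        return Verification.RED
    raise ValueError(f"unknown source: {source!r}")
```

Only the in-person QR scan avoids having to trust the directory server's answer, which is why it is the only path to the highest level.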
Threema uses a library called salt, which
has been developed by Daniel Bernstein. He
called it salt, but it's spelled NaCl, so
I'm sorry for the play on words, but
if you see NaCl, it's salt. This is a
library specifically designed for the
encryption of
messages, and it's supposed to be very
simple to use and give us all the
necessary features we want. This is
salt's authenticated encryption, giving us
all the features Roland was talking about
in the abstract before. It gives us
integrity, it gives us authenticity, it
gives us confidentiality. Just a quick look
at how this library would be used:
as you can see up there, everything in
the grey box is what the library does. If
we want to encrypt something for someone,
we only need our secret key, the
recipient's public key and our message. So
far very obvious. The library also requires
a nonce, which is something that should be
used only once, that's actually part
of the definition. So we generate
something random and include that in the
process of encrypting the message. This is
just so that if we encrypt the same
message twice, we do not get
the same ciphertext. The nonce is
nothing secret. As you can see at the
output, the library gives us
ciphertext, Roland talked a bit about
what that is, and it also gives us
a MAC. I'll just stick with a very
simple definition of what that is: it is
kind of a checksum, so someone looking
at the ciphertext and the MAC can ensure
that no one tampered with the ciphertext,
so the ciphertext is still in the state it
was in when we sent it. If we want to
transmit our message now in encrypted form
to someone, we have to include the nonce.
The nonce is not secret, we can just send
it along with the ciphertext, but to
decrypt we need the nonce. And, well, this
is what we might use for encryption, but
as you might remember from Roland's
introduction, this scheme
does not offer us any forward or future
secrecy. We can still try to add
some form of forward or future secrecy to
this scheme, and this is usually done,
sorry for skipping, with
something called a handshake.
Handshakes are a system of discarding old
keys and agreeing on new keys,
this is usually what we do with a
handshake in scenarios like this. But
doing a handshake with someone who is not
online at the moment is pretty difficult.
There are protocols to do that, the Signal
messaging app for example does
something like that, but it's kind of
complicated. Threema's protocol spares
the effort and only does this kind of
handshake with the Threema servers. Because
they are always online, we can always do a
handshake with them. So Threema has some
form of forward secrecy on the connection
to the messaging server, and how this is
achieved I'll try to present to you right
now. We'll walk through this handshake
step by step, and I'll try to put some
focus on what every step tries to achieve.
If we initiate a connection, if we start
sending a message, the Threema app will
connect to the messaging server and
start the connection by sending a client
hello. This is a very simple packet. It is
only there to communicate the public key
we intend to use from now on
and a nonce prefix. In this case,
note that it is, I'd say, half a nonce; the
other part is some kind of counter
that will always be increased by one
during the ongoing communication. But it'll
do no harm if you just see it as a nonce
right now. So we start the conversation by
saying, "Hey, we want to use a new key pair
from now on and this is our public key,
please take note." And the server will
react by saying, "Okay, I need a fresh key
pair as well then," generate a fresh key
pair and let us know what its public key
is from now on. The only thing to note is,
I mean, as you can see, there's
not much more than the
things corresponding to what the client
sent, now from the server side,
but there's also the client nonce
included, so we can see this
is actually a response to the client hello
we just sent, not something that got, I
don't know, redirected to us by accident,
whatever. And as you can see, the latter
part of the message, including the server's
public key, is encrypted, that's what
this bracket saying ciphertext means. It
is encrypted with the server's long-term
secret key and our ephemeral, temporary key.
By doing so, the server does something
only someone in possession of the
server's long-term secret key can do, and
proves to us that this public key we just
received in the server
hello has actually been sent by
the proper Threema server. No one can
impersonate the Threema server at that
point. So after that we are at a point
where the client application knows: this
is the public key the Threema server wants
to use, and it's actually the Threema
server, not someone impersonating it. The
server now knows there is someone who wants
to talk to it using this public key, but
knows nothing else; it doesn't know who is
actually talking to it. This is going
to change with the next packet, because
the Threema app is now going to send a
client authentication packet, as we call
it, which includes information about
the client. The first thing is the Threema
ID; Threema IDs are eight-character
strings of just uppercase letters and
numbers. What follows is a user agent
string, which is not technically necessary
for the protocol. It's something the
Threema app sends; it includes the Threema
version, your system, Android or iOS, and,
in the case of Android, the Android
version and stuff like that. So it's very
similar to the user agent in web browsers.
I don't know why they send it, but
they do. And the rest of it is nonces.
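The nonces in these packets follow the layout mentioned for the client hello: a random prefix followed by a counter that goes up by one per packet. A minimal sketch of that idea follows. The 16/8 byte split and the little-endian counter are assumptions for illustration (the talk only says "about half a nonce"); the 24-byte total is the nonce size NaCl's box expects. `NonceSequence` is a hypothetical name.

```python
import secrets

PREFIX_LEN = 16   # assumed split, chosen for illustration
COUNTER_LEN = 8   # prefix + counter = 24 bytes, NaCl's box nonce size


class NonceSequence:
    """Random prefix plus an incrementing counter, so no nonce repeats."""

    def __init__(self, prefix=None):
        self.prefix = secrets.token_bytes(PREFIX_LEN) if prefix is None else prefix
        if len(self.prefix) != PREFIX_LEN:
            raise ValueError("prefix must be PREFIX_LEN bytes")
        self.counter = 0

    def next(self) -> bytes:
        # Increase the counter by one for every packet on this connection.
        self.counter += 1
        return self.prefix + self.counter.to_bytes(COUNTER_LEN, "little")
```

Because each side announces its own prefix during the handshake, both ends can predict the other's next nonce and notice dropped or replayed packets.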
Let's skip over the nonces. The packet
also contains the client's ephemeral
public key, which we already
sent in the client hello, but this time
encrypted
with our long-term secret key. So we just
repeat what the server just did: by
encrypting with our long-term key we
prove that we are who we claim to be
and vouch that we really want to
use this temporary key. After that
happens, each party knows what
new key pair the other party wants to
use from now on and that the other party
is actually who they claim to be. And so
the handshake is just concluded
by the server
sending a bunch of zeros
encrypted with the newly exchanged key
pairs. This is just so the client can
decrypt it, see that it is a bunch of zeros
and know everything worked out: we have a
working connection now. So once we've done
that, if you remember this
picture, we have established forward
secrecy on the path between the app and
the server. We have not established
anything for the inner crypto layer, which,
in the case of Threema, is just taking
messages, encrypting them with the salt
library and sending them over the wire.
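The handshake walked through above can be sketched end to end with a toy stand-in for salt's box. Everything cryptographic here is a substitute chosen so the example runs with the Python standard library only: a small Diffie-Hellman group and a SHA-256 keystream with HMAC replace NaCl's Curve25519, XSalsa20 and Poly1305, and none of it is secure. The nonce layout, the field sizes and the choice to pair the client's long-term key with the server's ephemeral key in step 3 are assumptions, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # toy Diffie-Hellman prime: illustration only, NOT secure
G = 3


def keypair():
    sk = secrets.randbelow(P - 3) + 2
    return sk, pow(G, sk, P)


def shared_key(my_sk, their_pk):
    return hashlib.sha256(pow(their_pk, my_sk, P).to_bytes(16, "big")).digest()


def _keystream(key, nonce, length):
    blocks = (hashlib.sha256(key + nonce + i.to_bytes(4, "big")).digest()
              for i in range(length // 32 + 1))
    return b"".join(blocks)[:length]


def box(my_sk, their_pk, nonce, message):
    """Toy authenticated encryption: returns MAC followed by ciphertext."""
    key = shared_key(my_sk, their_pk)
    ct = bytes(m ^ k for m, k in zip(message, _keystream(key, nonce, len(message))))
    return hmac.new(key, nonce + ct, hashlib.sha256).digest() + ct


def box_open(my_sk, their_pk, nonce, boxed):
    key = shared_key(my_sk, their_pk)
    mac, ct = boxed[:32], boxed[32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("MAC check failed")
    return bytes(c ^ k for c, k in zip(ct, _keystream(key, nonce, len(ct))))


def run_handshake():
    server_lt_sk, server_lt_pk = keypair()  # client knows server_lt_pk in advance
    client_lt_sk, client_lt_pk = keypair()  # server can look this up by Threema ID

    # 1. Client hello: fresh ephemeral public key and nonce prefix, in the clear.
    c_eph_sk, c_eph_pk = keypair()
    c_prefix = secrets.token_bytes(16)

    # 2. Server hello: fresh server key plus the client's prefix, boxed with the
    #    server's long-term secret key and the client's ephemeral public key.
    s_eph_sk, s_eph_pk = keypair()
    s_prefix = secrets.token_bytes(16)
    n1 = s_prefix + (1).to_bytes(8, "little")
    hello = box(server_lt_sk, c_eph_pk, n1, s_eph_pk.to_bytes(16, "big") + c_prefix)
    opened = box_open(c_eph_sk, server_lt_pk, n1, hello)
    assert opened[16:] == c_prefix, "not a response to our client hello"
    s_eph_pk_seen = int.from_bytes(opened[:16], "big")

    # 3. Client authentication: vouch for the ephemeral key with the long-term key.
    n2 = c_prefix + (1).to_bytes(8, "little")
    vouch = box(client_lt_sk, s_eph_pk_seen, n2, c_eph_pk.to_bytes(16, "big"))
    assert box_open(s_eph_sk, client_lt_pk, n2, vouch) == c_eph_pk.to_bytes(16, "big")

    # 4. Server acknowledgement: zeros boxed with the two ephemeral keys.
    n3 = s_prefix + (2).to_bytes(8, "little")
    ack = box(s_eph_sk, c_eph_pk, n3, bytes(16))
    return box_open(c_eph_sk, s_eph_pk_seen, n3, ack)
```

Calling `run_handshake()` returns the decrypted zeros if every step verifies. The long-term keys are only used to authenticate the ephemeral ones, so throwing the ephemeral keys away afterwards is what gives the transport layer its forward secrecy.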
There's nothing more to it; it's just the
scheme I showed you before, used in a
very simple way. So we now have channels
established and we can communicate via
those. As the next step I want to look at
what we are actually sending via these
channels, and so I'm introducing the
Threema packet format. This is the
format of the packets that your
application sends to the Threema servers;
this is what the Threema server
sees. In this case it is the form a packet
has if it's something I want to send to a
communication partner. For example, the
content could be a text message
I want to send to someone.
There are different looking messages for
management purposes, for exchanges
with the server, that will never be
relayed to someone else, but this is
the most basic format we use when sending
images or text to communication partners.
As you can see, there's a packet type,
its purpose is kind of obvious, and what
follows are the fields on the envelope, as
Roland introduced it, saying, "This is a
message from me,"
from Alice, to Bob. As you recall, the
server can see that. What follows is a
message ID; this is just a random ID
generated when sending a message. Then
follows a timestamp, so the server knows
whether this is a recent message or one
that has been stuck in
transit for a long time, whatever.
What follows is something Threema
specific: Threema does have public
nicknames. It's just an alias for
your account; you can set that in the app,
and if you do, it actually gets transmitted
with every message you send. So if you
change it, your name will change on your
communication partner's phone with the
first message you send to them. What
follows is a nonce, and that is the nonce
used to encrypt the ciphertext that
follows. The ciphertext you see down
below is the inner envelope, as in
Roland's earlier pictures, and we're now
going to look at what is in this envelope:
what do the messages we transmit to
our end-to-end communication partners look
like? The simplest thing we could look at
is a text message. You can see, grayed out
above, still all the stuff from the outer
envelope, and down below it's very simple:
we have a message type, it's just one byte
indicating in this case that it is a text
message, and what follows is text.
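The outer envelope fields listed above can be summarized as a small record. This is a sketch of the field list only, not the wire format: the Python types, the field names and the helper `server_view` are assumptions for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class OuterPacket:
    packet_type: int   # what kind of packet this is
    sender: str        # Threema ID of the sender, visible to the server
    recipient: str     # Threema ID of the recipient, visible to the server
    message_id: bytes  # random ID generated when sending
    timestamp: int     # lets the server tell recent from delayed messages
    nickname: str      # public nickname, sent along with every message
    nonce: bytes       # nonce for the inner ciphertext, not secret
    ciphertext: bytes  # inner envelope, opaque to the server


def server_view(packet: OuterPacket) -> dict:
    """What the messaging server can read: every field except the plaintext."""
    view = asdict(packet)
    view["ciphertext"] = f"<{len(packet.ciphertext)} opaque bytes>"
    return view
```

The point of the sketch is the split: everything except `ciphertext` is metadata the server needs for relaying, and only the ciphertext is end-to-end protected.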
It's nothing more, it's just plain
text. After that, noteworthy maybe, is
padding. This padding is, as you can see,
in the innermost encryption layer, so the
Threema server does not know how big
your actual messages are. This is kind of
useful because there's stuff like typing
notifications you send to your
communication partners, which are always
the same size, and to hide
this from the Threema servers we have
this padding in the inner crypto layer.
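A minimal sketch of such an inner text message, built before encryption: one type byte, the text, then random-length padding. The type value `0x01` and the padding scheme (every padding byte stores the padding length, in the style of PKCS#7) are assumptions for illustration, and `pack_text` and `unpack_text` are hypothetical names.

```python
import secrets

TEXT_MESSAGE = 0x01  # assumed type value, for illustration only


def pack_text(text: str) -> bytes:
    """Build the plaintext of the inner envelope: type, text, padding."""
    body = bytes([TEXT_MESSAGE]) + text.encode("utf-8")
    pad_len = secrets.randbelow(255) + 1      # between 1 and 255 padding bytes
    return body + bytes([pad_len]) * pad_len  # each padding byte holds the length


def unpack_text(plaintext: bytes) -> str:
    pad_len = plaintext[-1]                   # last byte says how much to strip
    body = plaintext[:-pad_len]
    if body[0] != TEXT_MESSAGE:
        raise ValueError("not a text message")
    return body[1:].decode("utf-8")
```

Since the padding length is random and added inside the encrypted layer, two messages with identical content usually end up with different ciphertext sizes on the wire.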
Next I want to look at another message
type. I think
one of the basic message types most people
use with instant messaging apps is image
messages. I want to send someone an image;
this is something we do regularly. And this
looks a little bit weird at
first glance. It has a message
type, we know that, we know what its
purpose is. What follows is a blob
ID; what a blob ID is, I'm going to
explain in a minute. Then follows the size,
which is very basic, it's just the size of
the image that should be transmitted. What
follows is a key and the mandatory
padding. So the questions are: what is
this blob ID and what
is this key? This is where the media
server comes into the picture.
I'll show you what happens
if you send an image message. Your app
will take the image you want to send,
generate a random key, encrypt the image
with this key and send it to the media
server, and the media server will say,
"Okay, I'll store this under the following
blob ID." Your app takes note of this blob
ID and then sends this kind of image
message I just showed you
via the messaging server
to your communication partner. Your
communication partner opens up the message,
looks at it, sees the blob ID, sees the key,
goes to the media server and says, "Hey,
do you have something stored
under this blob ID?" And the media server
will respond, "Yes I do, here's the
encrypted stuff." Your communication partner
can take this encrypted stuff, decrypt it
with the key you sent and look at your image.
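The image flow just described can be sketched with an in-memory stand-in for the media server. The cipher here is a deliberately simple XOR keystream built from SHA-256 so the example runs with the standard library only; it is not secure and not what Threema uses. `MediaServer`, `ImageMessage`, `send_image` and `receive_image` are hypothetical names, and the blob ID and key lengths are assumptions.

```python
import hashlib
import secrets
from dataclasses import dataclass


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 keystream; stand-in for real encryption, NOT secure."""
    stream = b"".join(hashlib.sha256(key + i.to_bytes(4, "big")).digest()
                      for i in range(len(data) // 32 + 1))
    return bytes(d ^ s for d, s in zip(data, stream))


class MediaServer:
    """Stores encrypted blobs under random IDs; never sees keys or plaintext."""

    def __init__(self):
        self._blobs = {}

    def upload(self, encrypted: bytes) -> bytes:
        blob_id = secrets.token_bytes(16)
        self._blobs[blob_id] = encrypted
        return blob_id

    def download(self, blob_id: bytes) -> bytes:
        return self._blobs[blob_id]


@dataclass
class ImageMessage:   # travels end-to-end encrypted via the messaging server
    blob_id: bytes
    size: int
    key: bytes


def send_image(server: MediaServer, image: bytes) -> ImageMessage:
    key = secrets.token_bytes(32)                    # fresh random key per image
    blob_id = server.upload(toy_cipher(key, image))  # only ciphertext is uploaded
    return ImageMessage(blob_id, len(image), key)


def receive_image(server: MediaServer, msg: ImageMessage) -> bytes:
    return toy_cipher(msg.key, server.download(msg.blob_id))
```

The design keeps the large file out of the messaging path: the media server holds bytes it cannot read, and the key needed to read them only ever travels inside the inner envelope.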
This is how image sending works. So right
now we do have the basics of
modern instant messaging: we can send
text, we can send images. This is the
simple stuff, and what I want to look at
next is something that most people would
want a modern messenger to have as well,
and that is group conversations.
Group conversations in Threema essentially
do not work very differently from
other messages, because if you send
something to a group, your app will just
encrypt the message several times, once for
every communication partner involved, and
send it to them. But your communication
partners need to know: this is a
group message and it belongs to this and
that group. To do so, Threema has group
packets, and they include exactly that
information. They include a creator ID,
which is the Threema ID of the person who
created the group, and a group ID, which is
something randomly generated when creating
a group. After that follows a regular
packet format, in this case a text message;
if it were an image message, you would see
exactly the same stuff as shown in the
image message before. So this is how group
messages look, but we need a way to
introduce new groups or change names, and
for that there are special packets.
This, for example, is a group "set members"
message, which tells everybody there is
this new group and it has the following
members. As you can see here, there is
only a group ID; there is no longer a
group creator ID included. That is
because Threema group management
is very static: there can only be one person
managing a group, and that is the person
who created the group. So only the person
who created the group can send this kind
of message, saying there is a new member
in the group, for example, and therefore the
group creator is implicit in this case: it
is the sender of the message. This is
kind of annoying because you cannot have
a group where everybody can add members, for
example, and stuff like that. If you
set a name for a group, the message looks
very similar; it just doesn't include a
member list, but a name field. So, what I
want to talk about next is something that
happens above all the stuff I talked
about just now. I have shown you that
there are different kinds of packets doing
all that stuff, and there are lots
more packets, for audio messages for
example. They look very similar to the
image messages, because, I mean,
we just have a blob ID for the audio file
and stuff like that. But what is kind of
interesting, we thought, is that
above this layer of packet formats
there's also some additional stuff
happening. A good example for that is
how Threema handles subtitles for images.
I think a lot of modern messengers
support adding some kind of text to
an image. Threema doesn't have a packet
format or a field in some kind of image
message for that; they just embed the
subtitle of the image in the actual image,
in the Exif data of the image, and send
it along. This has the advantage of being
compatible with Threema versions not aware
of this feature, because they can just
happily ignore this Exif data: you won't
see the subtitle, but it won't break
anything. It is kind of wonky, though,
because it's a feature which is not
reflected in the actual packet format.
Something very similar happens with
quotes. You can quote other people in
Threema: you can mark a message
and say, "I want to quote that," and in the
app it looks like some kind of fixed
feature. You have this message you
quoted included in your new message, and
it looks like it's somehow linked to
the old message. But in reality it's just
a text message including some markdown,
which, if your Threema version supports
this kind of stuff, is rendered
nicely as shown below. If your
version doesn't support it, you'll just
see the plain text.
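The quote mechanism can be illustrated with a few lines. The markup used here, a `> ` prefix on quoted lines followed by a blank line, is a hypothetical convention picked for the sketch; the talk does not spell out the exact syntax. `make_quote_message` and `render` are hypothetical names.

```python
def make_quote_message(quoted_text: str, reply: str) -> str:
    """Build an ordinary text message that carries the quote as markup."""
    quoted = "\n".join("> " + line for line in quoted_text.splitlines())
    return quoted + "\n\n" + reply


def render(text: str, supports_quotes: bool):
    """Old clients show the raw text; newer clients split quote and reply."""
    if not supports_quotes:
        return text
    lines = text.split("\n")
    quoted = [line[2:] for line in lines if line.startswith("> ")]
    reply = [line for line in lines if not line.startswith("> ")]
    return {"quote": "\n".join(quoted), "reply": "\n".join(reply).strip()}
```

For example, `make_quote_message("see you at 9", "ok")` produces a plain string that an older client displays as is, while a newer client can pull the quoted part out for nicer rendering. Nothing in the packet links the reply to the original message.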
So again, being compatible with versions
that don't have it introduces some, yeah,
weird layer. And with that, I'll stop
showing you all the features Threema has.
There's certainly more to talk about, but
I think you should have an idea of how it
works in basic terms and what it does. All
the other stuff is kind of similar to what
I showed you and differs in
particularities which aren't so important,
I think. I'll just hand over to Roland,
who'll be wrapping up our talk and saying
something about the results of our reverse
engineering.
Applause
Roland: Okay, we told you we reversed the
app and we told you we weren't the first
ones and this is all true. But we came
here to tell you guys or to make you guys
aware of things you can expect from
messaging apps, and we hope that by using
Threema as an example we have
shown you how you can relate your own
privacy expectations to different apps, and
we also hope we gave you enough
terminology and explanation so you
can make a more competent decision
next time you look at a messenger and look
at what its promises are. Since we
reversed it anyway and we did a lot of
coding to do that what we did is put it in
a library. Now, I don't know how many of
you guys know the term academic code
Laughter
Of course, I'm
working at a university, so we've been
doing this on and off for quite some
time. We started roughly two years ago,
did it for a couple of days, then left it
lying around. Eventually we had the whole
thing lying in a drawer for about a year
before we decided to finish it, so
we never actually put a lot of
effort into the code. We are not
proficient programmers. But we still
wanted to publish what
we did, in the hope that a small
community might form around this, maybe
extend it, help us, you know, fix the few
things that we didn't do so well, help us
document it - you don't have to take
photographs by the way, we will upload
the slides. So these repositories
exist; we made a GitHub
organization that we pushed them to
yesterday. If you
wanted to start coding right away, say if
you wanted to write a bot, we'd recommend
you wait a few weeks, say two to three,
because we still want to, like, fix a few
of the kinks in there. Everyone else, we
hope will just look at it, maybe this will
help your understanding of what it actually
does. And also the activists in us hope
that this might get the people at Threema
to open-source their code because no
matter what we tell you here, and no
matter what they tell you about how their
app actually works - and this is always
true for non-open-source software - there
will never be true transparency: you will
never be able to prove that what runs on
your phone is actually implemented the
same way we've shown you. With our library
you would have these guarantees, and you
can definitely use it to
write bots if you ever wanted to do that.
Or if you just want to understand how it
works please go ahead and dive right into
there. Well, with that said, we thank you
for your attention.
Applause
Herald: Okay, thank you very much, Roland,
Frieder, we only have time for one
question, so who has a super eager
question - the signal angel is signalling.
Signal Angel: There's a couple of
questions, but I will pick the best one.
The best one was from alien: could you use
captions to inject malicious Exif data
into the images?
Frieder: What is malicious Exif data?
Signal Angel: Well, some data that probably
[breaks] the image parsing library.
Frieder: What we did not do was
look very closely at security
problems in the implementation of Threema.
I would say this falls
into this department. There's also a
library handling the GIF display and
stuff like that. We could have looked at
whether this is broken, maybe. We did not.
We looked at the protocol from a higher
level, so I cannot say anything about it.
Signal Angel: Okay, and another question
was: when a non-group-originating user
sends a group update message, what
happens?
Frieder: The thing is, Threema group IDs
aren't globally unique. A Threema group ID
only refers to a particular group together
with the group creator's ID. So if you
send an update group message from your
account, the app would look for a
different group than you intended. Because
your group ID would say, "I'm trying to
update a group created by me with this and
that ID." So it won't be the group
you want to hijack.
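The answer above can be made concrete with a small sketch of how a client might store groups. It assumes, as described in the talk, that a group is identified by the pair of creator ID and group ID and that the creator of a "set members" message is implicitly its sender. `GroupStore` and its method names are hypothetical.

```python
class GroupStore:
    """Client-side group table keyed by (creator ID, group ID)."""

    def __init__(self):
        self._groups = {}  # (creator_id, group_id) -> set of member IDs

    def handle_set_members(self, sender_id: str, group_id: bytes, members):
        # The packet carries no creator field: the sender is taken as creator,
        # so a sender can only ever touch groups filed under their own ID.
        self._groups[(sender_id, group_id)] = set(members)

    def members(self, creator_id: str, group_id: bytes):
        return self._groups.get((creator_id, group_id))
```

If someone else reuses an existing group ID in their own "set members" message, this store simply files it as a separate group under that sender's ID, and the original group's member list stays unchanged.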
Herald: Okay, very well. Another round
of applause for our speakers!
applause
postroll music
Subtitles created by c3subtitles.de
in the year 2018