WEBVTT
00:00:00.000 --> 00:00:13.139
33C3 preroll music
00:00:13.139 --> 00:00:17.510
Herald: Good morning everyone, thanks for
showing up in such great numbers, that's
00:00:17.510 --> 00:00:23.630
always a good thing
for such an early session.
00:00:23.630 --> 00:00:28.690
First of all I would like to ask you
a question, I mean... or
00:00:28.690 --> 00:00:33.590
let's start like that: Last night I had
00:00:33.590 --> 00:00:38.680
a weird encounter with a locked door
00:00:38.680 --> 00:00:45.629
out of the fate that we endured during
this week we were out of our apartment
00:00:45.629 --> 00:00:51.330
and the hotel owner let us stay in their
office, but the guy who stayed there
00:00:51.330 --> 00:00:57.290
put the dead lock on so we tried to reach
him. Hmmm, how do you reach them?
00:00:57.290 --> 00:01:03.350
We thought about maybe he has some
messaging, maybe he has some mobile number,
00:01:03.350 --> 00:01:08.880
no landline, landline, they have landline.
It turned out that the guy
00:01:08.880 --> 00:01:13.870
was not at the landline, out, exit, and
00:01:13.870 --> 00:01:17.860
so we looked around in the bar.
So this wouldn't have happened
00:01:17.860 --> 00:01:23.870
if he had mobile messaging, so,
to dive into that, if we could
00:01:23.870 --> 00:01:29.369
just text him: "Hey, we are at the hotel,
please open the door" we would have had
00:01:29.369 --> 00:01:34.659
one hour more sleep tonight.
So let's dive in
00:01:34.659 --> 00:01:41.339
with, yeah, the talk of today.
00:01:41.339 --> 00:01:45.580
So this morning session starts with
our speakers Roland Schilling
00:01:45.580 --> 00:01:48.880
and Frieder Steinmetz.
00:01:48.880 --> 00:01:55.580
applause
00:01:55.580 --> 00:02:00.170
And they will be talking about...
they will at first give you a gentle
00:02:00.170 --> 00:02:06.240
introduction into Mobile Messaging. I have
nine messaging apps on my phone, no, ten!
00:02:06.240 --> 00:02:10.990
The organizers forced me to
install another messaging app.
00:02:10.990 --> 00:02:15.080
And after that [they] give you a quick
analysis, or not so quick, I don't know,
00:02:15.080 --> 00:02:18.390
a deep analysis of the Threema protocol.
00:02:18.390 --> 00:02:22.000
So let's give another round of
applause for our speakers!
00:02:22.000 --> 00:02:28.730
applause
00:02:28.730 --> 00:02:33.780
Thank you, Thilo. I am Roland,
this is Frieder, and
00:02:33.780 --> 00:02:37.840
as, well, as Thilo already introduced
us we are going to talk about
00:02:37.840 --> 00:02:43.260
secure messaging. More specifically we are
trying to give a very broad introduction
00:02:43.260 --> 00:02:48.170
into the topic because we want to make the
field that is somewhat complex available
00:02:48.170 --> 00:02:53.580
to a more broad audience, so as
to leave our expert bubble
00:02:53.580 --> 00:02:58.390
and get the knowledge of technology
that people use every day
00:02:58.390 --> 00:03:03.500
to these people who are using it.
To do that we have to start
00:03:03.500 --> 00:03:08.720
at a very low level which might mean for
the security and crypto nerds in the room
00:03:08.720 --> 00:03:14.550
that you will see a lot of things that you
already know. But bear with us, please,
00:03:14.550 --> 00:03:20.090
since we are specifically trying, at least
with the first part of the talk, to convey
00:03:20.090 --> 00:03:24.790
a few of these mechanisms
that drive encrypted messaging
00:03:24.790 --> 00:03:29.910
to people who are new to the field.
So what we are going
00:03:29.910 --> 00:03:35.250
to try today is basically three
things: We are... we will try to
00:03:35.250 --> 00:03:40.410
outline privacy expectations when we
communicate. We are going to do that
00:03:40.410 --> 00:03:46.110
by sketching a communication scenario
to you guys and identifying
00:03:46.110 --> 00:03:49.960
what we can derive from that in
expectations. We are going to find
00:03:49.960 --> 00:03:54.470
an analogy, or look at an analogy that
helps us map these expectations to mobile
00:03:54.470 --> 00:03:59.470
messaging. And then we are going to look
at specific solutions, technical solutions
00:03:59.470 --> 00:04:06.760
that make it possible to make mobile
messaging as secure, and give us the same
00:04:06.760 --> 00:04:12.470
privacy guarantees that a one-to-one talk
would, before, in the second part of the
00:04:12.470 --> 00:04:17.060
talk we move on to look at a specific
implementation, and it's no secret anymore
00:04:17.060 --> 00:04:23.880
that we are going to look at the specific
implementation of Threema. So let's just
00:04:23.880 --> 00:04:29.590
dive right in.
You are at a party, a party in a house
00:04:29.590 --> 00:04:33.150
full of people and a friend approaches
you wanting to have a private
00:04:33.150 --> 00:04:38.430
conversation. Now what do you do? You
ideally would find a place at this party
00:04:38.430 --> 00:04:43.090
that is, well, private, and in our
scenario you find a room, maybe the
00:04:43.090 --> 00:04:47.600
bedroom of the host where nobody's in
there, you enter the room, you close the
00:04:47.600 --> 00:04:52.340
door behind you. Meaning you are now
private, you have a one-on-one,
00:04:52.340 --> 00:04:56.280
one-to-one session in this room in
private. And we are going to look at
00:04:56.280 --> 00:05:00.420
what that means.
00:05:00.420 --> 00:05:06.919
First of all the most, the most intuitive
one is what we call confidentiality and
00:05:06.919 --> 00:05:11.319
that means that since nobody is there in
the room with you you are absolutely sure
00:05:11.319 --> 00:05:15.509
that anything you say and anything your
communication partner says, if you imagine
00:05:15.509 --> 00:05:21.029
Frieder and me having this conversation,
can only be heard by the other person.
00:05:21.029 --> 00:05:26.300
If that is guaranteed we say… we call this
confidentiality because nobody who's
00:05:26.300 --> 00:05:32.270
not intended to overhear any of
the conversation will be able to.
00:05:32.270 --> 00:05:37.770
The second part… no, the second
00:05:37.770 --> 00:05:42.580
claim that we make is: if you guys
know each other, and again,
00:05:42.580 --> 00:05:45.909
if I had a talk with Frieder, I've
known him for a long time,
00:05:45.909 --> 00:05:50.619
more than five years now, I know what
his face looks like, I know his voice,
00:05:50.619 --> 00:05:56.199
I know that if I talk to him I actually
talk to ‘him’, meaning I know exactly
00:05:56.199 --> 00:06:01.009
who my communication partner is
and the same thing goes vice versa,
00:06:01.009 --> 00:06:06.660
so if this is achieved, if we can say
I definitely know who I'm talking to,
00:06:06.660 --> 00:06:11.409
there's no chance that somebody else
switches in and poses as Frieder
00:06:11.409 --> 00:06:17.109
we call this ‘authenticity’.
Moving on. Integrity.
00:06:17.109 --> 00:06:21.840
Integrity is a bit… this is where
the analogy falls short,
00:06:21.840 --> 00:06:27.579
well, somewhat. But, basically, if I can
make sure that everything I say
00:06:27.579 --> 00:06:31.619
reaches Frieder exactly the way I wanted
to say it and there is no messenger
00:06:31.619 --> 00:06:36.860
in between, I'm not telling a third friend
"Please tell Frieder something" and
00:06:36.860 --> 00:06:43.400
he will then alter the message because
he remembered it wrong or
00:06:43.400 --> 00:06:49.070
has malicious intentions. If I can
make sure that everything I say
00:06:49.070 --> 00:06:54.190
is received by Frieder exactly the way
I said it then we have ‘integrity’
00:06:54.190 --> 00:06:59.479
on our communication channel.
Okay. The next ones are two ones
00:06:59.479 --> 00:07:04.849
that are a bit hard to grasp at first.
Therefore we are going to take a few
00:07:04.849 --> 00:07:09.050
minutes to look at these, and they are
‘forward and future secrecy’. Suppose
00:07:09.050 --> 00:07:14.900
somebody entered the room while we had our
talk and that person would stay a while
00:07:14.900 --> 00:07:21.449
overhear some portion of our talk and
then they would leave the room again. Now
00:07:21.449 --> 00:07:25.059
if, at the
point where they entered the room, they
00:07:25.059 --> 00:07:28.630
wouldn't learn anything about the
conversation that we had before, which is
00:07:28.630 --> 00:07:32.349
intuitive in this scenario, that's
why we chose it, they enter the room, and
00:07:32.349 --> 00:07:36.660
everything they can overhear is only the
portion of the talk that takes place while
00:07:36.660 --> 00:07:41.389
they are in the room, they don't learn
anything about what we said before,
00:07:41.389 --> 00:07:46.610
meaning we have what we call forward
security, we'll get back to that, and
00:07:46.610 --> 00:07:51.190
after they left they wouldn't be able to
overhear anything, anything more that we
00:07:51.190 --> 00:07:56.319
say. This is what we call future security.
Because those are a bit hard to understand
00:07:56.319 --> 00:08:00.090
we have made a graphic here. And we are
going to get back to this graphic when we
00:08:00.090 --> 00:08:05.180
translate this so I'm going to take a
minute to introduce it. We have a time
00:08:05.180 --> 00:08:08.509
line that is blue, goes from left to
right, and on this time line we have a green
00:08:08.509 --> 00:08:14.160
bar that denotes our secret
conversation. The first pink bar there is
00:08:14.160 --> 00:08:19.159
when the third person enters the room,
then our secret conversation turns orange
00:08:19.159 --> 00:08:23.279
because it's no longer secret, it's now
overheard by the third person and after
00:08:23.279 --> 00:08:29.979
they left they wouldn't know anything that
was said after that. So the left part of
00:08:29.979 --> 00:08:36.510
it meaning the fact that they can't hear
anything into the past is what we call
00:08:36.510 --> 00:08:40.299
forward security and if they can't learn
anything after they left we call it future
00:08:40.299 --> 00:08:48.440
secure, future secrecy, sorry. Okay, the
last one that we're going to talk about
00:08:48.440 --> 00:08:53.310
since we're trying to keep things simple
is deniability. Since we are only two
00:08:53.310 --> 00:08:57.720
people in the room and there are no
witnesses we achieve deniability because
00:08:57.720 --> 00:09:01.540
after we had this talk we returned to the
party and people asked us what happened,
00:09:01.540 --> 00:09:06.110
um, I can always point to Frieder as you
could to your friend and say he said
00:09:06.110 --> 00:09:10.070
something. Frieder could always say, no I
didn't, and it would be my word against
00:09:10.070 --> 00:09:17.130
his and if this is, you know, if our
scenario allows for this we have
00:09:17.130 --> 00:09:22.190
deniability because every one of us can
always deny having said or not having said
00:09:22.190 --> 00:09:28.209
something.
And now we are going to look at messaging.
00:09:28.209 --> 00:09:33.870
Now in messaging a third player comes into
the room and this could be your provider
00:09:33.870 --> 00:09:40.040
if we talk about text messaging like short
messages that we used to send in the 90s,
00:09:40.040 --> 00:09:44.000
it could be your messaging provider if you
use something more sophisticated, it could
00:09:44.000 --> 00:09:47.600
be WhatsApp for example could be Apple
depending on what your favorite messenger
00:09:47.600 --> 00:09:54.100
is but there is always, unless you use,
like, federated systems, and some of
00:09:54.100 --> 00:09:59.310
you guys might think "but I'm using Jabber",
I know, but we are looking at centralized
00:09:59.310 --> 00:10:03.740
systems right now and in these there will
always be one third party that all
00:10:03.740 --> 00:10:07.820
messages go through, whether you want it
or not. And whether you're aware of it or
00:10:07.820 --> 00:10:16.620
not. And this brings us to our second
analogy which is Postal Services now while
00:10:16.620 --> 00:10:20.431
messaging feels like you have a private
conversation with the other person and I
00:10:20.431 --> 00:10:23.822
think everyone can relate to that you have
your phone you see you are
00:10:23.822 --> 00:10:28.410
presented with the conversation and it
looks like only you and this other person,
00:10:28.410 --> 00:10:32.300
in my case Frieder, are having this
conversation we feel like we have a
00:10:32.300 --> 00:10:36.820
private conversation, while actually our
messages go through a service provider all
00:10:36.820 --> 00:10:42.810
the time. Meaning we are now looking
at something more akin to postal
00:10:42.810 --> 00:10:49.230
services. We prepare a message send it
off, our message provider takes the
00:10:49.230 --> 00:10:53.170
message, takes it to our intended
recipient, and they can then read the
00:10:53.170 --> 00:11:00.740
message. And this applies to
all the messages we exchange. And to
00:11:00.740 --> 00:11:04.700
underline that we're going to look at what
I initially called traditional messaging
00:11:04.700 --> 00:11:12.399
meaning text messaging, unencrypted SMS
messaging, and as you may or may not be
00:11:12.399 --> 00:11:17.029
aware, these messages also go through
our providers: more than one provider
00:11:17.029 --> 00:11:21.760
even. Say I'm at Vodafone and Frieder is
with Verizon, I don't know, I would send
00:11:21.760 --> 00:11:26.100
my messages to Vodafone, they would
forward them to Verizon who would then
00:11:26.100 --> 00:11:32.089
deliver them to Frieder's phone. So since
both of our providers would know all the
00:11:32.089 --> 00:11:36.420
messages (they are unencrypted), we would
have no confidentiality.
00:11:36.420 --> 00:11:41.000
They could change the messages and these
things have happened actually. So we
00:11:41.000 --> 00:11:44.000
don't have any integrity we don't know if
the messages received are actually the
00:11:44.000 --> 00:11:51.410
ones that were sent. We also have no
authentication because phone numbers are
00:11:51.410 --> 00:11:56.690
very weak for authenticating people, they
are managed by our providers,
00:11:56.690 --> 00:12:01.720
they are not fixed, there's no fixed
mapping to our phones or our SIM cards.
00:12:01.720 --> 00:12:06.459
They can be changed they can be rerouted
so we never know if the messages
00:12:06.459 --> 00:12:10.730
we send are actually received by the
people we intended to: no authenticity and
00:12:10.730 --> 00:12:17.279
no authentication. Now forward secrecy and
future secrecy don't even apply because we
00:12:17.279 --> 00:12:24.269
have no secrecy. We do have some sort of
deniability but this goes into like
00:12:24.269 --> 00:12:33.370
philosophically.. Let's do that again:
philosophical claims of whether when I say
00:12:33.370 --> 00:12:36.850
I haven't sent anything this must have
been the provider they can technically,
00:12:36.850 --> 00:12:41.629
you know, guarantee they did or did not do
something. So let's not dive too deeply
00:12:41.629 --> 00:12:46.920
into that discussion, but we can summarize
that messaging translates, at least
00:12:46.920 --> 00:12:51.089
traditional messaging, translates very
badly to our privacy expectations when we
00:12:51.089 --> 00:13:00.430
think of a communication. Okay, moving on.
Looking at our postal analogy, actually
00:13:00.430 --> 00:13:05.390
our messages are more like postcards.
Because they are plain, our providers can
00:13:05.390 --> 00:13:09.600
look at them, can change them, you know
all the things we've just described: just
00:13:09.600 --> 00:13:14.440
as they would a postcard. They can see the
intended recipient, they can look at the
00:13:14.440 --> 00:13:18.690
sender, they can look at the text, change
it: postcards. And what we want to achieve
00:13:18.690 --> 00:13:25.230
now is find a way to wrap these postcards
and make them more like letters, assuming
00:13:25.230 --> 00:13:29.819
that postal services don't open letters.
That's the one point with this
00:13:29.819 --> 00:13:35.960
analogy that we have to like, define. And
to be able to do that we're
00:13:35.960 --> 00:13:41.220
trying to give you the shortest encryption
to – the shortest introduction to
00:13:41.220 --> 00:13:46.790
encryption, see I'm confusing myself here,
that you will ever get. Starting with
00:13:46.790 --> 00:13:50.209
symmetric encryption.
Now, encryption, for those of you who
00:13:50.209 --> 00:13:55.500
don't know, is what we call the
translation of plain, readable text into
00:13:55.500 --> 00:14:00.459
text that looks like it's random, but it
can be reversed and turned back into plain
00:14:00.459 --> 00:14:05.690
text provided we have the right key for
that. So to stick with a very simple
00:14:05.690 --> 00:14:09.400
example please imagine this box that we've
just labeled crypto, and we are not
00:14:09.400 --> 00:14:13.839
concerned with what's in the box we just
imagine it as a machine. Please imagine it
00:14:13.839 --> 00:14:18.069
as a machine that takes two inputs the
plaintext and the key, and it produces
00:14:18.069 --> 00:14:21.959
something that we call ciphertext.
The ciphertext is indistinguishable from
00:14:21.959 --> 00:14:29.949
random text, but it can be reversed at the
recipient side using the same key and
00:14:29.949 --> 00:14:35.019
basically the same machine just doing the
operation, you know, in reverse: turning
00:14:35.019 --> 00:14:42.290
the ciphertext back into plain text. This
is what we call, sorry, this is what we
00:14:42.290 --> 00:14:48.079
call symmetric encryption because if you
imagine a line where the cipher text is
00:14:48.079 --> 00:14:52.959
you could basically mirror the thing on to
the other side so it's symmetric at that
00:14:52.959 --> 00:14:59.560
line. And when there's
something that is called symmetric there
00:14:59.560 --> 00:15:03.350
is also something that is called
asymmetric and asymmetric encryption works
00:15:03.350 --> 00:15:08.630
relatively the same way, only there are
now two keys. We have made them a yellow
00:15:08.630 --> 00:15:13.709
one and a blue one. These keys are called
a key pair. They are mathematically
00:15:13.709 --> 00:15:19.300
linked. And the way this works now is that
anything encrypted with one of these keys
00:15:19.300 --> 00:15:26.649
can only be decrypted with the other one.
You can do it both ways, but the important
00:15:26.649 --> 00:15:31.950
thing to memorize here is just anything I
encrypt with the yellow key can only be
00:15:31.950 --> 00:15:42.370
decrypted with the blue key. Okay, since
we have that now, let's capitalize on this
00:15:42.370 --> 00:15:48.230
scenario. Imagine each of our
communication partners now has one of
00:15:48.230 --> 00:15:51.350
these two keys and we are still talking
about the same key pair that we've
00:15:51.350 --> 00:15:56.160
outlined on the previous slide. Now we
call one of them a secret key and one of
00:15:56.160 --> 00:16:03.950
them a public key. This is probably known
to most of you: traditional public key
00:16:03.950 --> 00:16:06.860
cryptography.
We've added something that is called an
00:16:06.860 --> 00:16:09.180
identity in this picture: we will
get back
00:16:09.180 --> 00:16:13.870
to that in a minute. But the scenario we
want you to envision right now is
00:16:13.870 --> 00:16:19.959
that both parties would publish their
public key to the public. And we are going
00:16:19.959 --> 00:16:24.829
to get back to what that means as well.
And keep their secret key, as the name
00:16:24.829 --> 00:16:29.790
says, secret. Some of you might know this
as a private key: the same
00:16:29.790 --> 00:16:38.670
concept applies. We just chose to call it
secret key. Because it more clearly
00:16:38.670 --> 00:16:44.110
denotes that it's actually secret and
never published. So this would mean any
00:16:44.110 --> 00:16:48.550
message that would be encrypted
with one of the parties' public keys could
00:16:48.550 --> 00:16:54.100
then only be decrypted with that party's
secret key, putting us in a position where
00:16:54.100 --> 00:16:59.060
I could take Frieder's public key, encrypt
my message, send it to him, and I would
00:16:59.060 --> 00:17:04.689
know that he would be the only one able to
decrypt the message - as long as his
00:17:04.689 --> 00:17:13.080
secret key remains his, well, secret.
And he doesn't publish it. Well,
00:17:13.080 --> 00:17:22.050
the problem is: it's a very expensive
scenario. We get something akin to a
00:17:22.050 --> 00:17:27.970
postal service where we can
now encrypt the message and envision it
00:17:27.970 --> 00:17:32.890
like putting a plain sheet of paper into
an envelope, seal it, we would put it on
00:17:32.890 --> 00:17:38.260
the way. Nobody on the line would be able
to look into the letter. They would of
00:17:38.260 --> 00:17:41.350
course, well, since there are addresses on
there, they would see who it is from and
00:17:41.350 --> 00:17:48.220
who it is to - but they couldn't look inside
the letter: this is achieved. But as I've
00:17:48.220 --> 00:17:52.470
already said it's a very expensive
mechanism and by that we mean it is hard
00:17:52.470 --> 00:17:59.370
to do for devices - especially since you
are doing mobile messaging on your phones,
00:17:59.370 --> 00:18:07.610
ideally, especially hard to do on small
devices like phones. So what if we had a
00:18:07.610 --> 00:18:15.320
mechanism that would allow us to combine
symmetric and asymmetric encryption. And
00:18:15.320 --> 00:18:20.990
it turns out we do. And we are going to
keep this very simple by just looking at
00:18:20.990 --> 00:18:26.650
what is called key establishment, and then
again also just one particular way of key
00:18:26.650 --> 00:18:33.160
establishment. We have two new boxes here:
they are called key generators. And the
00:18:33.160 --> 00:18:34.160
scheme
that we are
00:18:34.160 --> 00:18:36.860
looking at right now works the
following way: You can take one of the
00:18:36.860 --> 00:18:41.890
secret keys, and another
public key, like the one of the other
00:18:41.890 --> 00:18:45.580
party, put them into the key generator.
And remember, these keys are
00:18:45.580 --> 00:18:50.710
mathematically linked: each secret key
belongs to exactly one public key. And the
00:18:50.710 --> 00:18:54.150
way this key generator works is that
through this
00:18:54.150 --> 00:18:59.380
mathematical linking it doesn't matter if
you take, in this case, let's call them
00:18:59.380 --> 00:19:04.390
Alice and Bob: if you take Alice's secret
key and Bob's public key, or Bob's secret key
00:19:04.390 --> 00:19:09.780
and Alice's public key, you will always
come up with the same key. And we call
00:19:09.780 --> 00:19:13.830
this a shared key. Because this key can
now be generated independently
00:19:13.830 --> 00:19:18.130
on both sides and it can then be used for
symmetric encryption, and as we've already
00:19:18.130 --> 00:19:26.300
told you symmetric encryption is a lot
cheaper than asymmetric encryption. So
00:19:26.300 --> 00:19:29.900
this has one advantage and one
disadvantage: the advantage, as I've already
00:19:29.900 --> 00:19:36.370
said, is that it's way cheaper, and the
fact, well, the advantage is also that we
00:19:36.370 --> 00:19:39.900
come up with the key on both sides, and
the disadvantage is that we come up with
00:19:39.900 --> 00:19:47.090
one key on both sides - because whether or
not you've realized this by now since this
00:19:47.090 --> 00:19:51.580
is a very static scheme we always come up
with the same key. That is going to be a
00:19:51.580 --> 00:19:57.880
problem in a minute. So let's recap we
have looked at asymmetric encryption which
00:19:57.880 --> 00:20:02.429
as I've said gives us IDs, and we're going
to look at what that means. But it is very
00:20:02.429 --> 00:20:06.400
expensive. We know that symmetric
encryption is cheap, but we have to find a
00:20:06.400 --> 00:20:12.309
way to get this key delivered to both
parties before they can even start
00:20:12.309 --> 00:20:16.839
encrypting their communication. And we
have looked at key establishment, which
00:20:16.839 --> 00:20:23.920
gives us symmetric keys
based on asymmetric key pairs. Meaning we
00:20:23.920 --> 00:20:28.350
have now basically achieved
confidentiality - we can use these keys
00:20:28.350 --> 00:20:32.029
put them in the machines with our
plaintext, get ciphertext, can, you know,
00:20:32.029 --> 00:20:37.030
we are able to transport it to the other
side. Nobody can look inside.
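NOTE The recap above (a "key generator" that yields the same shared key on both sides, which then feeds a symmetric cipher) can be sketched in a few lines. This is only a toy illustration using classic modular Diffie-Hellman and a hash-based XOR keystream. The modulus P, base G and the helper names keypair, shared_key and crypt are assumptions made for this sketch; they are not taken from the talk and are not what any real messenger uses.

```python
import hashlib
import secrets

# Toy parameters (assumption): a 127-bit Mersenne prime, far too small for real use.
P = 2**127 - 1
G = 3

def keypair():
    """Return (secret_key, public_key); the two are mathematically linked."""
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)

def shared_key(my_secret, their_public):
    """The 'key generator' box: own secret key plus the other party's public key."""
    point = pow(their_public, my_secret, P)
    return hashlib.sha256(point.to_bytes(16, "big")).digest()

def crypt(key, data):
    """Toy symmetric cipher: XOR with a hash-derived keystream.
    The same call encrypts and decrypts. Real ciphers also use a nonce."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

alice_secret, alice_public = keypair()
bob_secret, bob_public = keypair()

# Both sides derive the key independently and arrive at the same value.
key_at_alice = shared_key(alice_secret, bob_public)
key_at_bob = shared_key(bob_secret, alice_public)

ciphertext = crypt(key_at_alice, b"please open the door")
print(crypt(key_at_bob, ciphertext))
```

Note that the expensive asymmetric step happens once per pair of users, while every message only needs the cheap symmetric step.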
00:20:37.030 --> 00:20:42.670
Confidentiality is achieved.
Now deniability. Deniability in this
00:20:42.670 --> 00:20:47.510
scenario would basically mean, if you
think back at our initial sketch, where we
00:20:47.510 --> 00:20:51.250
could say I haven't said that, and the
other guy couldn't prove that we did,
00:20:51.250 --> 00:20:56.649
would in this case be a letter that was
sent to both of the participants, and it
00:20:56.649 --> 00:21:01.250
would be from either of the participants.
So that when looking at this
00:21:01.250 --> 00:21:05.200
cryptographically, we couldn't say this
was sent by me or this was sent by Frieder.
00:21:05.200 --> 00:21:10.429
You could just see it was sent by, well,
either of us. And if you think of the
00:21:10.429 --> 00:21:13.980
scheme that we've just sketched, since
both parties come up with the same key by
00:21:13.980 --> 00:21:20.010
using a different set
of keys to generate them, basically the
00:21:20.010 --> 00:21:25.320
same key can be generated on both sides.
And you can never really say, by just
00:21:25.320 --> 00:21:29.649
looking at a message, if it was encrypted
with a shared key generated on one or on
00:21:29.649 --> 00:21:35.940
the other side since they are the same.
So, very simply and on a very high level
00:21:35.940 --> 00:21:41.500
we have now achieved deniability. What
about forward and future secrecy? You
00:21:41.500 --> 00:21:45.330
remember this picture? Our overheard
conversation on the party that we were at
00:21:45.330 --> 00:21:52.740
at the beginning of the talk? Well, this
picture now changes to this. And what we
00:21:52.740 --> 00:21:58.200
are looking at now is something we call
key compromise and key renegotiation. Key
00:21:58.200 --> 00:22:03.669
compromise would be the scenario where one
of our keys was lost. And we are talking
00:22:03.669 --> 00:22:08.470
about the shared key that we generated
now. Which, if it would fall into the
00:22:08.470 --> 00:22:13.059
hands of an attacker, this attacker would
be able to decrypt our messages because
00:22:13.059 --> 00:22:23.140
it's the same key that we used for that.
Now, if at the point where the key
00:22:23.140 --> 00:22:28.000
was compromised they wouldn't be able to
decrypt anything prior to that point - we
00:22:28.000 --> 00:22:33.020
would have forward secrecy. And if we had
a way to renegotiate keys, and they would
00:22:33.020 --> 00:22:38.240
be different, completely different, not
linked to the ones we had before, and then
00:22:38.240 --> 00:22:42.820
use that in the future, we would have
future secrecy. But we don't, since as
00:22:42.820 --> 00:22:48.279
we've already said the keys that we
generate are always the same. And we want
00:22:48.279 --> 00:22:51.210
you to keep this in mind because
we will get
00:22:51.210 --> 00:22:54.120
back to this
when we look at Threema in more detail.
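NOTE The point about static keys can be made concrete with a small sketch. It assumes the same toy Diffie-Hellman setup as a stand-in for the key generator (P, G, keypair, shared_key and session_key are hypothetical names for this illustration). With long-term key pairs the derived key never changes, so one compromise exposes everything; with throwaway key pairs per session, each session gets an unrelated key.

```python
import hashlib
import secrets

# Toy parameters (assumption): illustrative only, not secure.
P = 2**127 - 1
G = 3

def keypair():
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)

def shared_key(my_secret, their_public):
    point = pow(their_public, my_secret, P)
    return hashlib.sha256(point.to_bytes(16, "big")).digest()

# Static scheme: long-term key pairs, as in the talk's key establishment.
alice_static = keypair()
bob_static = keypair()
key_monday = shared_key(alice_static[0], bob_static[1])
key_tuesday = shared_key(bob_static[0], alice_static[1])
print("static keys identical:", key_monday == key_tuesday)

def session_key():
    """Renegotiation sketch: fresh key pairs per session, discarded on return."""
    a_secret, a_public = keypair()
    b_secret, b_public = keypair()
    key = shared_key(a_secret, b_public)
    assert key == shared_key(b_secret, a_public)
    return key

# Two sessions yield unrelated keys, so leaking one says nothing about the other.
print("session keys identical:", session_key() == session_key())
```

This only shows why throwing keys away helps; how the two sides authenticate the throwaway public keys is a separate question that the sketch leaves out.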
00:22:58.573 --> 00:23:07.320
yeah, if we had a way to dump keys after
having used them, we could achieve forward
00:23:07.320 --> 00:23:13.669
and future secrecy. Since we don't, we
can't right now. Okay, next recap: our key
00:23:13.669 --> 00:23:17.180
establishment protocol gives us
confidentiality, deniability, and
00:23:17.180 --> 00:23:23.159
authenticity. We don't have forward and
future secrecy. And if you've stuck with
00:23:23.159 --> 00:23:27.750
us you would realize we are omitting
integrity here - that is because we don't
00:23:27.750 --> 00:23:32.419
want to introduce a new concept right now
but we will get back to that, and you will
00:23:32.419 --> 00:23:39.279
see that when we look at Threema it
actually does have integrity. Now,
00:23:39.279 --> 00:23:43.419
basically you could think we fixed all
the-- well, we fixed everything, but you
00:23:43.419 --> 00:23:47.779
heard us talk about things like IDs, and
we haven't really said many words
00:23:47.779 --> 00:23:53.310
about them, and
we're going to look at that now. And we
00:23:53.310 --> 00:23:56.590
are going to start with a quote by my very
own professor - don't worry you don't have
00:23:56.590 --> 00:24:01.159
to read that, I'm going to do it for you.
My professor says, "cryptography is
00:24:01.159 --> 00:24:04.950
rarely, if ever, the solution to a
security problem. Cryptography is a
00:24:04.950 --> 00:24:09.360
translation mechanism, usually converting
a communications security problem into a
00:24:09.360 --> 00:24:15.770
key management problem." And if you think
of it, this is exactly what we have now,
00:24:15.770 --> 00:24:19.970
because I know that Frieder has a private
key, a secret key I'm sorry, and a public
00:24:19.970 --> 00:24:24.890
key. He knows that I have a secret key and
a public key. How do I know which one of
00:24:24.890 --> 00:24:29.940
those public keys that are in the open is
actually his? How would I communicate to
00:24:29.940 --> 00:24:36.820
him what my public key is? Those of you
who've used PGP, for example, in the
00:24:36.820 --> 00:24:42.040
last couple of decades know
what I'm talking about. And we have the
00:24:42.040 --> 00:24:46.240
same problem everywhere where public key
cryptography is used, so we also have the
00:24:46.240 --> 00:24:52.700
same problem in mobile messaging. To the
rescue comes our messaging server -
00:24:52.700 --> 00:24:57.030
because, since we have a central instance
in between us, we can now query this
00:24:57.030 --> 00:25:02.490
instance: I can now tell my public key; I
can now take my public key and my identity,
00:25:02.490 --> 00:25:04.440
tell the messaging server,
"Hey messaging server - this is my
00:25:04.440 --> 00:25:07.490
identity. Please store it for me." So that
Frieder, who has
00:25:07.490 --> 00:25:13.860
some, well, some kind of information to
identify me, can then query, you know, get my
00:25:13.860 --> 00:25:19.409
public key back. This of course assumes
that we trust the messaging
00:25:19.409 --> 00:25:25.730
server. We may or may not do that. But for
now we have a way to at least communicate
00:25:25.730 --> 00:25:31.870
our public keys to other parties. Now
what can we use as identities here? In
00:25:31.870 --> 00:25:37.029
our, like, figure here it's very
simple: Alice just goes to the messaging
00:25:37.029 --> 00:25:40.809
server and says, "Hey, what's the public
key for Bob?" And the messaging server
00:25:40.809 --> 00:25:45.840
magically knows who Bob is, and what his
public key is. And the same thing
00:25:45.840 --> 00:25:52.289
works the other way. The
question now is what is a good ID in this
00:25:52.289 --> 00:25:57.380
scenario. Remember we are on phones, so we
could think of using phone numbers, we
00:25:57.380 --> 00:26:02.000
could think of using email addresses, we
could think of something else. And
00:26:02.000 --> 00:26:06.830
something else will be the interesting
part, but let's look at the other parts
00:26:06.830 --> 00:26:10.559
one by one.
Phone numbers can identify users - you
00:26:10.559 --> 00:26:14.130
remember that you rely on your providers
for the mapping between phone numbers and
00:26:14.130 --> 00:26:19.210
SIM cards, so you have to trust another
instance in this situation. We're going to
00:26:19.210 --> 00:26:23.049
ignore that completely because we find
that phone numbers are personal
00:26:23.049 --> 00:26:27.711
information, and I, for one, my phone
number, and I mean the same phone number,
00:26:27.711 --> 00:26:32.269
I've had it for like 18 years now. I
wouldn't want that to get into the wrong
00:26:32.269 --> 00:26:39.960
hands. And by using it to identify me as a
person, or, you know, my cryptographic
00:26:39.960 --> 00:26:45.429
identity that is bound to my keys, I
wouldn't necessarily want to use that,
00:26:45.429 --> 00:26:49.499
because I wouldn't be able to change it, and
I would want to change it if it ever got
00:26:49.499 --> 00:26:54.640
compromised. Now something else comes to
mind: e-mail addresses. E-mail addresses
00:26:54.640 --> 00:27:00.740
basically are also personal information.
They are a bit shorter lived, as we would
00:27:00.740 --> 00:27:05.120
argue, than phone numbers. But, and you
can use temporary e-mails, you can do a
00:27:05.120 --> 00:27:09.730
lot more; you are way more flexible with
e-mails. But ideally we want to have
00:27:09.730 --> 00:27:14.830
something that we call dedicated
IDs, meaning something that identifies me
00:27:14.830 --> 00:27:18.229
only within the bounds of the service that
we use.
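The key brokerage described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `KeyDirectory`, the eight-character ID format, and the random placeholder key are all hypothetical, and a real server would also have to authenticate whoever registers a key.

```python
import secrets
import string
from typing import Optional

ID_ALPHABET = string.ascii_uppercase + string.digits  # assumed ID alphabet


class KeyDirectory:
    """Toy stand-in for the messaging server's key store."""

    def __init__(self) -> None:
        self._keys = {}  # dedicated ID -> public key bytes

    def register(self, public_key: bytes) -> str:
        """Store a public key under a fresh random ID and return that ID."""
        while True:
            new_id = "".join(secrets.choice(ID_ALPHABET) for _ in range(8))
            if new_id not in self._keys:
                self._keys[new_id] = public_key
                return new_id

    def lookup(self, user_id: str) -> Optional[bytes]:
        """Return the stored public key, or None if the ID is unknown."""
        return self._keys.get(user_id)


directory = KeyDirectory()
alice_key = secrets.token_bytes(32)  # placeholder, not a real public key
alice_id = directory.register(alice_key)
```

Because the ID is generated by the service and not derived from a phone number or e-mail address, it identifies the user only within that service.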
00:27:18.229 --> 00:27:24.450
So that's what we want to have. We are
going to show you how this might work, but
00:27:24.450 --> 00:27:30.480
we still have to find a way to verify
ownership, because this is a scenario that
00:27:30.480 --> 00:27:36.030
is more or less likely to happen. I am
presented with a number of public keys to
00:27:36.030 --> 00:27:41.760
an identity that I know - and I have to,
well, I have to find a
00:27:41.760 --> 00:27:46.049
way to verify which one is maybe the right
one, maybe the one that is actually used,
00:27:46.049 --> 00:27:51.429
maybe Frieder has used quite a number of
public keys - he's a lazy guy. He forgets
00:27:51.429 --> 00:27:55.100
to, you know, take his keys from one
machine to the other: he just, you know,
00:27:55.100 --> 00:27:59.210
buys a new laptop, sets up a new public
key: bam, he has two - which one am I
00:27:59.210 --> 00:28:04.299
supposed to use right now? Now
remember that we are looking at the
00:28:04.299 --> 00:28:10.159
messenger server for, you know, key
brokerage, and we are now going to add a
00:28:10.159 --> 00:28:18.929
third line here and that is this one.
Basically we introduce a way to meet in
00:28:18.929 --> 00:28:23.860
person, and again PGP veterans will know
what I'm talking about, and verify our
00:28:23.860 --> 00:28:28.580
keys independently. We've chosen QR codes
here - Threema uses QR codes, many other
00:28:28.580 --> 00:28:33.440
messengers do as well, and we want to
like tell you why this is an important
00:28:33.440 --> 00:28:39.520
feature to be able to verify our public
keys independently of the messaging
00:28:39.520 --> 00:28:43.620
server. Because once we did that we no
longer have to trust the messaging server
00:28:43.620 --> 00:28:47.790
to tell us, or rather, we no
longer have to trust its promise that this
00:28:47.790 --> 00:28:54.150
is actually the key we are looking for. We
have verified that independently. Okay, we
00:28:54.150 --> 00:29:00.050
have basically solved our authenticity
problem. We know that we can identify
00:29:00.050 --> 00:29:04.269
users by phone numbers and emails, and you
remember our queries to the server for
00:29:04.269 --> 00:29:08.100
Bob: we can still use phone numbers for
that if we want to. We can use emails for
00:29:08.100 --> 00:29:12.789
that if we want to. We don't have to. We can
use our IDs anonymously. But we have a way
00:29:12.789 --> 00:29:18.190
to verify them independently. The
remaining problem is users changing their
00:29:18.190 --> 00:29:24.179
IDs - that is where we have to verify
again. And we also get back to that later,
00:29:24.179 --> 00:29:27.800
but I want to look at something else
first, and that is the handling of
00:29:27.800 --> 00:29:31.320
metadata.
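The independent verification step can be illustrated as a fingerprint comparison: the key the server handed out is checked against what was scanned from the other person's screen. This is a minimal sketch; the function names, the use of SHA-256, and the truncation to 32 hex characters are assumptions made for the example, not details taken from the talk.

```python
import hashlib
import hmac


def fingerprint(public_key: bytes) -> str:
    """Short hex digest of a public key, small enough for a QR code."""
    return hashlib.sha256(public_key).hexdigest()[:32]


def verified_in_person(key_from_server: bytes, scanned_fingerprint: str) -> bool:
    """True if the server-supplied key matches the fingerprint scanned in person."""
    return hmac.compare_digest(fingerprint(key_from_server), scanned_fingerprint)
```

If the server had substituted a different key, the two fingerprints would differ and the check would fail, which is why the server's promise no longer has to be trusted after a successful scan.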
Now, we know that an attacker can no
00:29:31.320 --> 00:29:36.210
longer look inside our messages. They can,
however, still see the addressee, who the
00:29:36.210 --> 00:29:39.540
message is from, and they can see how
large the message is, they
00:29:39.540 --> 00:29:44.649
can look at timestamps and stuff like
that. And since we are getting a bit tight
00:29:44.649 --> 00:29:50.899
on the clock I'm going to try to
accelerate this a bit. Metadata handling:
00:29:50.899 --> 00:29:56.440
we want to conceal now who a message is
from, who a message is to. And we are
00:29:56.440 --> 00:30:01.050
doing this by taking the envelope that
we've just generated, wrapping it into a
00:30:01.050 --> 00:30:05.480
third envelope, and then sending that to
the messenger server first. And the
00:30:05.480 --> 00:30:12.169
messenger server gets a lot of envelopes.
They are all just addressed to the
00:30:12.169 --> 00:30:15.950
messenger server, so anyone on the network
would basically see there's one
00:30:15.950 --> 00:30:19.699
party sending a lot of messages to the
messenger server; maybe there are a lot of
00:30:19.699 --> 00:30:24.590
parties. But they couldn't look into the
end-to-end channel, as we call it, to see
00:30:24.590 --> 00:30:29.819
what the addresses on each inner
envelope are. The
00:30:29.819 --> 00:30:35.690
messaging server, however, can. They would
open the other-- the outer envelope, look
00:30:35.690 --> 00:30:40.559
at the inside, see, "Okay, this is a
message directed at Alice," wrap it into
00:30:40.559 --> 00:30:43.880
another envelope - that would just say,
"This is the message from the messaging
00:30:43.880 --> 00:30:50.320
server and it is directed to Alice." Who
would then be able to, you know, open the
00:30:50.320 --> 00:30:54.230
outer envelope, open the inner envelope,
see this is actually a message from Bob.
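The envelope picture can be modeled without any cryptography, just to show who sees which addresses. This is a sketch under assumptions: the `Envelope` type and the helper names are invented for the illustration, and the payload bytes stand in for an end-to-end encrypted message.

```python
from dataclasses import dataclass

SERVER = "server"


@dataclass(frozen=True)
class Envelope:
    sender: str
    recipient: str
    payload: object  # bytes for the inner message, or another Envelope


def wrap_for_server(sender: str, recipient: str, ciphertext: bytes) -> Envelope:
    """Client side: put the addressed inner envelope into one addressed to the server."""
    inner = Envelope(sender, recipient, ciphertext)
    return Envelope(sender, SERVER, inner)


def relay(outer: Envelope) -> Envelope:
    """Server side: open the outer envelope and re-wrap the inner one for its recipient."""
    inner = outer.payload
    return Envelope(SERVER, inner.recipient, inner)


def visible_on_network(envelope: Envelope) -> tuple:
    """What an observer on the wire learns: only the outer addresses."""
    return (envelope.sender, envelope.recipient)


to_server = wrap_for_server("Bob", "Alice", b"<encrypted message>")
to_alice = relay(to_server)
```

An observer on either leg sees only traffic to or from the server, while the server itself still reads both names on the inner envelope.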
00:30:54.230 --> 00:30:59.559
And what we have thereby achieved is a
two-layer end-to-end communication
00:30:59.559 --> 00:31:06.230
tunnel as we call it, where the purple and
the blue bar are encrypted channels
00:31:06.230 --> 00:31:13.150
between both communication partners and
the messaging server, and they carry an
00:31:13.150 --> 00:31:18.560
encrypted tunnel between both partners,
you know, both communication partners,
00:31:18.560 --> 00:31:24.860
directly. But, and we've had this caveat
before, the messaging server still knows
00:31:24.860 --> 00:31:29.080
both communication partners, they still
know the times that the messages were
00:31:29.080 --> 00:31:33.940
sent. And they also know the size of the
message. But we can do something against
00:31:33.940 --> 00:31:39.269
that. And what we do is introduce
padding - meaning,
00:31:39.269 --> 00:31:42.169
in the inner envelope we
just stick a bunch of extra
00:31:42.169 --> 00:31:47.270
pages so the envelope looks a bit thicker.
And we do that by just appending random
00:31:47.270 --> 00:31:51.790
information to the actual message before
we encrypt it. So anything looking at the
00:31:51.790 --> 00:31:56.479
encrypted message would just see a large
message. And, of course, that should be
00:31:56.479 --> 00:31:59.940
random information every time - it should
never have the same length
00:31:59.940 --> 00:32:05.730
twice. But if we can achieve that, we can
at least conceal the size of the message.
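Padding of this kind might look like the sketch below. The talk only says that extra data of varying length is appended before encryption; the concrete scheme here, where every padding byte stores the padding length so the receiver can strip it again, is a swapped-in, commonly used technique and an assumption of this example.

```python
import secrets


def pad(message: bytes) -> bytes:
    """Append between 1 and 255 bytes; each padding byte holds the padding length."""
    pad_len = secrets.randbelow(255) + 1
    return message + bytes([pad_len]) * pad_len


def unpad(padded: bytes) -> bytes:
    """Read the padding length from the last byte and strip the padding."""
    if not padded:
        raise ValueError("empty input")
    pad_len = padded[-1]
    if pad_len == 0 or pad_len > len(padded):
        raise ValueError("invalid padding")
    return padded[:-pad_len]
```

The padded message is what gets encrypted, so anyone looking at the ciphertext sees only the padded length.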
00:32:05.730 --> 00:32:12.639
Now so much for our gentle introduction to
mobile messaging. And for those of
00:32:12.639 --> 00:32:19.210
you who stuck around, we are now moving on to
analyze Threema. Now I want to say a few
00:32:19.210 --> 00:32:24.289
things before we do that - we are not
affiliated with Threema, we don't, we are
00:32:24.289 --> 00:32:30.529
not here to recommend the app to
you or the service. We didn't do any kind
00:32:30.529 --> 00:32:35.380
of formal analysis. There will be no
guarantees. We will not be quoted as
00:32:35.380 --> 00:32:41.509
saying, "use it or don't use it." What we
want to do is make more people aware of
00:32:41.509 --> 00:32:48.070
the mechanisms that are in use and we have
chosen basically a random message provider
00:32:48.070 --> 00:32:52.370
- we could have chosen anyone. We chose
Threema for the fact that they do offer
00:32:52.370 --> 00:32:57.190
dedicated IDs. That they don't bind keys
to phone numbers, which many messengers
00:32:57.190 --> 00:33:04.240
do. Those of you who use WhatsApp know
what I'm talking about. And well, since it
00:33:04.240 --> 00:33:08.710
is closed source we found it interesting
to look at what is actually happening inside
00:33:08.710 --> 00:33:13.700
the app and make that publicly known. Now
we are not the only ones who've done this,
00:33:13.700 --> 00:33:18.770
we are also not the first ones who've done
this, and we don't claim we are. But we
00:33:18.770 --> 00:33:24.049
are here now and we want to try to make
you aware of the inner workings of the app
00:33:24.049 --> 00:33:33.490
as far as we have understood it. And with
that I hand the presenter over to Frieder.
00:33:33.490 --> 00:33:42.670
Applause
00:33:42.670 --> 00:33:45.530
Frieder: So I'll be presenting to you our
00:33:45.530 --> 00:33:52.460
understanding of the Threema protocol and
how the application works as we deduced
00:33:52.460 --> 00:33:59.260
from mostly reverse engineering the
Android app. And so this won't be a
00:33:59.260 --> 00:34:03.539
complete picture, but it will be a
picture presenting to you the most
00:34:03.539 --> 00:34:09.380
essential features and how the protocol
works. And I'll start by giving you a
00:34:09.380 --> 00:34:16.909
bird's eye look at the overall
architecture. While Roland was giving you
00:34:16.909 --> 00:34:21.419
this abstract introduction to mobile
messaging, there was also always this
00:34:21.419 --> 00:34:26.719
third party - this messaging provider.
And this now became actually three
00:34:26.719 --> 00:34:34.220
entities because Threema has three
different servers, mostly, doing well,
00:34:34.220 --> 00:34:41.230
very different stuff for the app's
working. And I'll start with the directory
00:34:41.230 --> 00:34:48.840
server in orange at the bottom, because
that is the server you most likely will be
00:34:48.840 --> 00:34:55.149
contacting first if you want to
engage in any conversation with someone
00:34:55.149 --> 00:34:59.770
you never talked to before. Because this
is the server that handles all the
00:34:59.770 --> 00:35:05.850
identity public key related stuff that
Roland was talking about so much. This is
00:35:05.850 --> 00:35:12.089
the server you'll be querying for whose
public key - I have this Threema ID,
00:35:12.089 --> 00:35:17.050
what's the corresponding public key, for
example stuff like that. Above that there
00:35:17.050 --> 00:35:23.410
is the messaging server, which is kind of
the core central entity in this whole
00:35:23.410 --> 00:35:30.140
scenario because its task is relaying
messages from one communication partner to
00:35:30.140 --> 00:35:34.989
another. And above that we have the media
server, and I'll be talking about that
00:35:34.989 --> 00:35:42.670
later. In short, its task, its
purpose, is storing large media files like
00:35:42.670 --> 00:35:49.190
images and videos you send to your
communication partners. But as I said I
00:35:49.190 --> 00:35:54.319
want to start with the directory server,
and in the case of Threema, this directory
00:35:54.319 --> 00:36:01.650
server offers a REST API, so
communication with this server happens
00:36:01.650 --> 00:36:12.260
via HTTP. It is HTTPS actually so it's
TLS encrypted. And this encryption also
00:36:12.260 --> 00:36:18.550
fulfills all the requirements you would
have for a proper TLS connection and,
00:36:18.550 --> 00:36:21.990
so, if you want to communicate with
a new person and you have
00:36:21.990 --> 00:36:22.990
their phone
number or
00:36:22.990 --> 00:36:27.460
their email address or Threema ID,
your app will be asking the
00:36:27.460 --> 00:36:30.609
directory server, "Hey, I have this phone
number, do you have a corresponding
00:36:30.609 --> 00:36:37.380
Threema account and public key?" And the
response will hopefully be, "Yes, I do -
00:36:37.380 --> 00:36:41.090
that's the public key, that's the Threema ID:
go ahead."
00:36:41.090 --> 00:36:51.420
And as Roland said, we kind of chose Threema
for the arbitrary use of IDs and
00:36:51.420 --> 00:36:57.210
especially for the system of verifying
fingerprints in person by scanning QR
00:36:57.210 --> 00:37:06.339
codes and because this is something
Threema has and other messengers do not
00:37:06.339 --> 00:37:12.400
have I want to talk a little bit about
that, because if you just ask the
00:37:12.400 --> 00:37:17.050
directory server "hey I have a Threema ID
what is the corresponding public key?" the
00:37:17.050 --> 00:37:20.980
Threema application will say "ok I got an
answer from the directory server, I
00:37:20.980 --> 00:37:25.660
have a public key but I have very little
trust, that you actually know who the real
00:37:25.660 --> 00:37:29.480
person behind this Threema account is,
we're not quite sure about that", so it'll
00:37:29.480 --> 00:37:37.230
mark this contact with one red dot and if
you had a phone number or an email address
00:37:37.230 --> 00:37:40.840
and asked the directory server, "hey
what's the corresponding Threema account
00:37:40.840 --> 00:37:46.240
and public key?" the app will say, "ok we
still have to trust the directory server,
00:37:46.240 --> 00:37:51.640
but we're a little bit more confident that
the person on the other end is actually
00:37:51.640 --> 00:37:55.109
who you think they are because you have a
phone number probably linked to a real
00:37:55.109 --> 00:37:59.810
person and you have a better idea who
you're talking to but still we rely on the
00:37:59.810 --> 00:38:06.640
Threema server", so it'll mark a contact
like that with two orange dots and then
00:38:06.640 --> 00:38:11.230
there is the final stage if you met
someone in person and scanned their
00:38:11.230 --> 00:38:17.390
public key and Threema ID in the form of a QR
code such a contact will be marked with
00:38:17.390 --> 00:38:23.470
three green dots and in that case the app
says "We're 100% confident we're talking
00:38:23.470 --> 00:38:29.589
to the person we want to talk to and we
have the proper keys." So right now we're
00:38:29.589 --> 00:38:35.600
at, if we think of engaging in a conversation,
we are at the point where we do have all
00:38:35.600 --> 00:38:41.410
necessary details to start encrypting our
communication, but the question remains, how
00:38:41.410 --> 00:38:46.260
do we encrypt our communication
in the case of Threema?
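The three verification levels described above can be written down as a small decision function. This is a sketch, not the app's code: the enum names and the `found_via` strings are hypothetical labels chosen for the example.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    DIRECTORY_ONLY = 1      # one red dot: key came from an ID lookup alone
    LINKED_CONTACT = 2      # two orange dots: found via phone number or e-mail
    VERIFIED_IN_PERSON = 3  # three green dots: QR code scanned in person


def trust_level(found_via: str, qr_scanned: bool) -> TrustLevel:
    """Map how a contact's key was obtained to the dot rating shown in the app."""
    if qr_scanned:
        return TrustLevel.VERIFIED_IN_PERSON
    if found_via in ("phone", "email"):
        return TrustLevel.LINKED_CONTACT
    return TrustLevel.DIRECTORY_ONLY
```

Only the highest level is independent of the directory server; the two lower levels both rely on the server returning the right key.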
00:38:46.260 --> 00:38:52.260
Threema uses a library called Salt, which has
been developed by Daniel Bernstein, and he
00:38:52.260 --> 00:38:59.730
called it salt but it's spelled NaCl so
I'm sorry for the play on words, but
00:38:59.730 --> 00:39:07.240
if you see NaCl, it's salt. So this is a
library specifically designed for the
00:39:07.240 --> 00:39:12.440
encryption of
messages and it's supposed to be very
00:39:12.440 --> 00:39:20.560
simple to use and give us all the
necessary features we wanted and this is
00:39:20.560 --> 00:39:25.020
Salt's authenticated encryption giving us
all the features Roland was talking about
00:39:25.020 --> 00:39:29.930
in abstract before. It gives us integrity,
it gives us authenticity, it gives us
00:39:29.930 --> 00:39:39.800
confidentiality. And just a quick look
at how this library would be used:
00:39:39.800 --> 00:39:44.619
as you can see up there like everything in
the grey box is, what the library does and
00:39:44.619 --> 00:39:49.800
we only need our secret key, if we want to
encrypt something to someone, the
00:39:49.800 --> 00:39:58.520
recipient's public key, and our message. So far
very obvious and the library also requires
00:39:58.520 --> 00:40:04.960
a nonce, which is something that should be
only used once, that's actually yeah part
00:40:04.960 --> 00:40:08.670
of the definition, so we generate
something random and include that in the
00:40:08.670 --> 00:40:13.099
process of encrypting the message this is
just so that if we encrypt the same
00:40:13.099 --> 00:40:19.030
content same message twice, we do not get
the same ciphertext. This is nothing
00:40:19.030 --> 00:40:23.090
secret because as you can see at the
output the library actually gives us
00:40:23.090 --> 00:40:27.770
ciphertext, Roland talked a bit about that
what it is, and it'll also give you
00:40:27.770 --> 00:40:32.690
a MAC and I'll just stick with a very
simple definition of what that is, it is
00:40:32.690 --> 00:40:38.069
something that ensures that there's kind
of a checksum, so someone looking
00:40:38.069 --> 00:40:43.359
at the cipher text and the MAC can ensure
no one tampered with the cipher text so
00:40:43.359 --> 00:40:50.280
the cipher text is still in the state it
was in when we sent it. And if we want to
00:40:50.280 --> 00:40:54.859
transmit our message now in encrypted form
to someone, we have to include the nonce,
00:40:54.859 --> 00:40:58.060
the nonce is not secret, we can just send
it along with the cipher text, but to
00:40:58.060 --> 00:41:04.690
decrypt we need the nonce and well so this
is what we might use for encryption, but
00:41:04.690 --> 00:41:07.420
as you might remember from Roland's
introduction, this scheme
00:41:07.420 --> 00:41:17.460
does not offer us any forward or future
secrecy, and we can still try to add
00:41:17.460 --> 00:41:24.410
some form of forward or future secrecy to
this scheme and this is usually done,
00:41:24.410 --> 00:41:29.790
sorry for skipping, with something
called a handshake, and
00:41:29.790 --> 00:41:36.250
handshakes are a system of discarding old
keys and agreeing on new keys,
00:41:36.250 --> 00:41:42.960
this is usually what we do with the
handshake in scenarios like this. And
00:41:42.960 --> 00:41:48.309
doing a handshake with someone that is not
online at the moment is pretty difficult
00:41:48.309 --> 00:41:52.559
there are protocols to do that; the Signal
messaging app, for example, does
00:41:52.559 --> 00:41:57.340
something like that but it's kind of
complicated and Threema's protocol spares
00:41:57.340 --> 00:42:02.140
the effort and only does this kind of
handshake with the Threema servers because
00:42:02.140 --> 00:42:07.150
they are always online, we can always do a
handshake with them, so Threema has some
00:42:07.150 --> 00:42:13.160
form of forward secrecy on this connection
to the messaging server and how this is
00:42:13.160 --> 00:42:19.589
achieved, I'll try to present to you right
now and we walk through this handshake
00:42:19.589 --> 00:42:27.559
step by step and I try to put some focus
on what every step tries to achieve, so if
00:42:27.559 --> 00:42:31.319
we initiate a connection, if we start
sending a message, the Threema app will
00:42:31.319 --> 00:42:35.270
connect to the messaging server and
start the connection by sending a client
00:42:35.270 --> 00:42:43.109
hello, this is a very simple packet. It is
only there to communicate the public key
00:42:43.109 --> 00:42:48.190
we from now on intend to use
and a nonce prefix. In this case,
00:42:48.190 --> 00:42:54.140
note that it is, I'd say, half a nonce, and the
other part is some kind of a counter
00:42:54.140 --> 00:43:01.819
that will during the ongoing communication
always be increased by one. So but it'll
00:43:01.819 --> 00:43:08.140
do no harm if you just see it as a nonce
right now, so we start the conversation by
00:43:08.140 --> 00:43:12.740
saying "hey, we want to use a new key pair
from now on and this is our public key,
00:43:12.740 --> 00:43:17.410
please take note" and the server will
react by saying "okay, I need a fresh key
00:43:17.410 --> 00:43:23.859
pair as well then", generate a fresh key
pair and let us know what its public key
00:43:23.859 --> 00:43:33.260
from now on is. The only thing to note is,
I mean as you can see there is, there's
00:43:33.260 --> 00:43:38.589
not much more than the things
the client sent, the
00:43:38.589 --> 00:43:42.689
corresponding things from the server side,
but there's also the client nonce
00:43:42.689 --> 00:43:47.920
included, so we can see this
is actually a response to our client hello
00:43:47.920 --> 00:43:54.020
we just sent, not something that got, I
don't know, redirected to us by accident,
00:43:54.020 --> 00:44:00.180
whatever. And as you can see the latter
part of the message including the server's
00:44:00.180 --> 00:44:06.339
public key is encrypted, that's what
this bracket saying ciphertext says and it
00:44:06.339 --> 00:44:13.089
is encrypted with the server's long-term
secret key and our ephemeral temporary key
00:44:13.089 --> 00:44:18.490
and by doing so, the server does something
only the person in possession of the
00:44:18.490 --> 00:44:23.280
server's long-term secret key can do and
proves to us, this public key we just
00:44:23.280 --> 00:44:27.869
received from the server, in this server
"hello", has actually been sent by
00:44:27.869 --> 00:44:31.940
the proper Threema server, no one can
impersonate the Threema server at that
00:44:31.940 --> 00:44:42.430
point, so, after that we are at a point
where the client application knows, this
00:44:42.430 --> 00:44:45.830
is the public key the Threema server wants to
use and it's actually the Threema server,
00:44:45.830 --> 00:44:49.880
not someone impersonating it. The server
knows there is someone who wants to
00:44:49.880 --> 00:44:54.599
talk to me using this public key, but
knows nothing else; it doesn't know who's
00:44:54.599 --> 00:44:59.220
actually talking to it, and this is going
to change with the next packet, because
00:44:59.220 --> 00:45:05.700
the Threema app is now going to send a
client authentication packet, we call it
00:45:05.700 --> 00:45:10.940
that way, which includes information about
the client. The first thing is the Threema
00:45:10.940 --> 00:45:17.730
ID. Threema IDs are eight-character
strings, it's just uppercase letters and
00:45:17.730 --> 00:45:24.520
numbers and what follows is a user agent
string which is not technically necessary
00:45:24.520 --> 00:45:28.820
for the protocol, it's something the
Threema app sends; it includes the Threema
00:45:28.820 --> 00:45:34.640
version, your system, Android or iOS, and,
in the case of Android, the Android
00:45:34.640 --> 00:45:41.960
version and stuff like that so it's very
similar to user agent in web browsers,
00:45:41.960 --> 00:45:49.560
yeah. I don't know why they send it, but
they do and the rest of it is nonces.
00:45:49.560 --> 00:45:54.040
Let's skip over them, but also the
client's ephemeral public key we already
00:45:54.040 --> 00:45:58.160
sent in the client hello but this time
encrypted
00:45:58.160 --> 00:46:01.859
with our long-term secret key, so we just
repeat what the server just did, proving
00:46:01.859 --> 00:46:06.400
by encrypting with our long-term key,
proving that we are, who we claim to be
00:46:06.400 --> 00:46:12.170
and that we vouch that we really want to
use this temporary key. And after that
00:46:12.170 --> 00:46:17.050
happens each party knows, what public key
what new keypair the other party wants to
00:46:17.050 --> 00:46:23.420
use from now on and that the other party
is actually who they claim to be and so
00:46:23.420 --> 00:46:25.910
the handshake is just concluded
by the server
00:46:25.910 --> 00:46:29.880
sending a bunch of zeros,
encrypted with the newly exchanged key
00:46:29.880 --> 00:46:34.800
pairs. This is just so the client can
decrypt it, see it as a bunch of zeros,
00:46:34.800 --> 00:46:40.990
everything worked out, we have a working
connection now so if we've done that we
00:46:40.990 --> 00:46:46.930
have this, we have, if you remember this
picture, we have established forward
00:46:46.930 --> 00:46:51.700
secrecy on the paths between the app and
the server. We have not established
00:46:51.700 --> 00:46:55.839
anything for the inner crypto layer, which
is, in the case of Threema, just taking
00:46:55.839 --> 00:46:59.619
messages, encrypting them with the Salt
library and sending them over the wire.
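The nonce handling from the client hello, a random prefix followed by a counter that increases by one per message, might be sketched like this. NaCl box nonces are 24 bytes long; the 16-byte prefix plus 8-byte little-endian counter split used here is an assumption, since the talk only describes the prefix as roughly half a nonce.

```python
import secrets
from typing import Optional

NONCE_SIZE = 24   # nonce length required by NaCl's box construction
PREFIX_SIZE = 16  # assumed split between random prefix and counter


class NonceSequence:
    """Random per-connection prefix followed by an incrementing counter."""

    def __init__(self, prefix: Optional[bytes] = None) -> None:
        self.prefix = secrets.token_bytes(PREFIX_SIZE) if prefix is None else prefix
        if len(self.prefix) != PREFIX_SIZE:
            raise ValueError("prefix has the wrong length")
        self.counter = 0

    def next(self) -> bytes:
        """Return a nonce that has not been used on this connection before."""
        self.counter += 1
        counter_bytes = self.counter.to_bytes(NONCE_SIZE - PREFIX_SIZE, "little")
        return self.prefix + counter_bytes
```

Each side of the connection would keep its own sequence, so a nonce is never reused with the same key pair as long as the counter does not wrap.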
00:46:59.619 --> 00:47:04.550
There's nothing more to it, it's just as I
showed you the scheme before, used in a
00:47:04.550 --> 00:47:11.650
very simple way so we now have channels
established and we can communicate via
00:47:11.650 --> 00:47:17.579
those, and in the next step I want to look at
what we are actually sending via these
00:47:17.579 --> 00:47:24.069
channels and so I'm introducing the
Threema packet format, and this is the
00:47:24.069 --> 00:47:28.950
format of the packets that your
application sends to the Threema servers,
00:47:28.950 --> 00:47:36.190
this is what the Threema server
sees. In this case, it is the form a packet
00:47:36.190 --> 00:47:42.540
has if it's something I want to send to a
communication partner, for example, the
00:47:42.540 --> 00:47:45.710
content could be a text message
I want to send to someone.
00:47:45.710 --> 00:47:50.250
There are different-looking messages for
management purposes, for exchanges
00:47:50.250 --> 00:47:55.240
with the server, that will never be
relayed to someone else, but this is the
00:47:55.240 --> 00:48:01.250
most basic format we use when sending
images and text to communication partners,
00:48:01.250 --> 00:48:06.780
and as you can see there's a packet type,
its purpose is kind of obvious and what
00:48:06.780 --> 00:48:12.180
follows are the fields on the envelope, as
Roland introduced, it's saying "this is a
00:48:12.180 --> 00:48:17.140
message from me"
from Alice to Bob and so you recall the
00:48:17.140 --> 00:48:21.390
server can see that. What follows is a
message ID; this is just a random ID
00:48:21.390 --> 00:48:27.630
generated when sending a message. What follows is
a timestamp, so the server knows whether this is a
00:48:27.630 --> 00:48:33.340
recent message or one that has been stuck in
transit for a long time, whatever.
00:48:33.340 --> 00:48:38.750
What follows is something
Threema-specific: Threema does have public
00:48:38.750 --> 00:48:45.440
nicknames. It's just an alias for
your account. You can set that in the app,
00:48:45.440 --> 00:48:50.491
and if you do it actually gets transmitted
with every message you send, so if you
00:48:50.491 --> 00:48:56.230
change it, your name will change on your
communication partner's phone with the
00:48:56.230 --> 00:49:04.569
first message you send to them. And what
follows is a nonce and that is the nonce
00:49:04.569 --> 00:49:08.960
used to encrypt the ciphertext that
follows. The ciphertext you see down
00:49:08.960 --> 00:49:14.990
below is the inner envelope, as in
Roland's earlier pictures and we're now
00:49:14.990 --> 00:49:22.340
going to look at what is in this envelope,
how the messages look that we transmit to
00:49:22.340 --> 00:49:28.610
our end-to-end communication partners and
the most simple thing we could look at is
00:49:28.610 --> 00:49:35.670
a text message and you can see grayed out
above, still all the stuff from the outer
00:49:35.670 --> 00:49:41.140
envelope and down below it's very simple,
we have a message type it's just one byte
00:49:41.140 --> 00:49:46.619
indicating in this case that it is a text
message and what follows is text.
00:49:46.619 --> 00:49:55.400
It's nothing more, it's just plain
text and after that, noteworthy maybe is
00:49:55.400 --> 00:50:01.200
padding, and this padding is, as you can see,
in the innermost encryption layer, so the
00:50:01.200 --> 00:50:06.300
Threema server does not know how big
your actual messages are, this is kind of
00:50:06.300 --> 00:50:10.630
useful because there's stuff like typing
notifications you send to your
00:50:10.630 --> 00:50:18.619
communication partners, which are always
the same size, and to hide
00:50:18.619 --> 00:50:24.030
this from the Threema servers, we have
this padding in the inner crypto layer.
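The packet layout walked through above can be summarized as a data structure. This is a sketch under assumptions: the field sizes, the numeric type values, and the `encrypt` hook are placeholders invented for the example, and the actual wire encoding is not shown in the talk.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable

OUTGOING_MESSAGE = 0x01  # assumed outer packet type value
TEXT_MESSAGE = 0x01      # assumed inner message type value


@dataclass
class OuterPacket:
    """Fields the server can read, plus the opaque inner envelope."""
    packet_type: int
    sender_id: str
    recipient_id: str
    message_id: bytes
    timestamp: int
    nickname: str
    nonce: bytes
    ciphertext: bytes


def inner_text_payload(text: str) -> bytes:
    """Type byte, then the text, then padding of random length."""
    pad_len = secrets.randbelow(255) + 1
    return bytes([TEXT_MESSAGE]) + text.encode("utf-8") + bytes([pad_len]) * pad_len


def build_text_packet(sender_id: str, recipient_id: str, nickname: str, text: str,
                      encrypt: Callable[[bytes, bytes], bytes]) -> OuterPacket:
    """Assemble a packet; encrypt(nonce, plaintext) is a caller-supplied hook."""
    nonce = secrets.token_bytes(24)
    return OuterPacket(
        packet_type=OUTGOING_MESSAGE,
        sender_id=sender_id,
        recipient_id=recipient_id,
        message_id=secrets.token_bytes(8),  # assumed ID length
        timestamp=int(time.time()),
        nickname=nickname,
        nonce=nonce,
        ciphertext=encrypt(nonce, inner_text_payload(text)),
    )
```

Everything except `ciphertext` is readable by the server; the message type, the text, and the padding live only inside the encrypted part.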
00:50:24.030 --> 00:50:32.819
Next I want to look at another message
type, like I'd say the most, yeah, I think
00:50:32.819 --> 00:50:37.550
one of the basic message types most people
use with instant messaging apps is image
00:50:37.550 --> 00:50:42.200
messages, I want to send someone an image,
this is something we do regularly and this
00:50:42.200 --> 00:50:48.712
looks a little bit weird at
first look, because it has a message
00:50:48.712 --> 00:50:53.389
type, we know that, we know what its
purpose is. What follows is a blob
00:50:53.389 --> 00:50:59.740
ID, what a blob ID is, I'm going to
explain in a minute. What follows is the size, which is
00:50:59.740 --> 00:51:04.380
very basic, it's just the size of the image
that should be transmitted, and what
00:51:04.380 --> 00:51:10.200
follows is a key and the mandatory
padding, so, the questions are, what is
00:51:10.200 --> 00:51:16.370
this blob ID, what is the key ID, and what
is this key and this is where the media
00:51:16.370 --> 00:51:22.610
server comes into the picture. The media
server is, well I'll show you what happens
00:51:22.610 --> 00:51:28.350
if you send an image message. Your app
will take the image you want to send,
00:51:28.350 --> 00:51:34.950
generate a random key, encrypt this image
with this key and send it to the media
00:51:34.950 --> 00:51:38.760
server and the media server will say "okay
I'll store this under the following blob
00:51:38.760 --> 00:51:44.339
ID" and your app takes note of this blob
ID and then will send this kind of image
00:51:44.339 --> 00:51:48.830
message I just showed to you,
via the messaging server,
00:51:48.830 --> 00:51:53.919
to your communication partner. Your
communication partner opens up the message
00:51:53.919 --> 00:51:59.010
looks at it, sees a blob ID, sees the key,
and goes to the media server and says "hey
00:51:59.010 --> 00:52:04.010
do you have a blob ID, something stored
under this blob ID?" and the media server
00:52:04.010 --> 00:52:09.540
will respond "yes I do, here's the encrypted
stuff" and your communication partner
00:52:09.540 --> 00:52:16.210
can take this encrypted stuff, decrypt it
with the key you sent and look at your image.
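The image flow can be sketched end to end with a stand-in media server. Several things here are assumptions made for the illustration: the `MediaServer` class, the 16-byte blob ID, and above all `toy_stream_cipher`, which is NOT a secure cipher and only takes the place of the real symmetric encryption so that the example runs with the standard library alone.

```python
import hashlib
import secrets


def toy_stream_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 based keystream. Illustration only, not secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, block))
    return bytes(out)


class MediaServer:
    """Stores opaque encrypted blobs under random IDs; it never sees a key."""

    def __init__(self) -> None:
        self._blobs = {}

    def upload(self, encrypted: bytes) -> bytes:
        blob_id = secrets.token_bytes(16)  # assumed blob ID length
        self._blobs[blob_id] = encrypted
        return blob_id

    def download(self, blob_id: bytes) -> bytes:
        return self._blobs[blob_id]


def send_image(server: MediaServer, image: bytes) -> dict:
    """Encrypt under a fresh random key, upload, and build the image message."""
    key = secrets.token_bytes(32)
    blob_id = server.upload(toy_stream_cipher(key, image))
    return {"blob_id": blob_id, "size": len(image), "key": key}


def receive_image(server: MediaServer, message: dict) -> bytes:
    """Fetch the blob named in the message and decrypt it with the included key."""
    return toy_stream_cipher(message["key"], server.download(message["blob_id"]))
```

The dictionary returned by `send_image` corresponds to the image message that travels end-to-end encrypted via the messaging server, so the media server only ever holds ciphertext.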
00:52:16.210 --> 00:52:22.190
This is how image sending works. So right
now we do have the basics of
00:52:22.190 --> 00:52:26.250
modern instant messaging, we can send
text, we can send images, this is the
00:52:26.250 --> 00:52:35.059
simple stuff and what I want to look at
next is something that most people would
00:52:35.059 --> 00:52:41.150
want a modern messenger to have as well
and that is group conversations.
00:52:41.150 --> 00:52:46.440
Group conversations in Threema essentially
do not work very differently from
00:52:46.440 --> 00:52:54.079
other messages, because if you send
something to a group your app will just
00:52:54.079 --> 00:52:57.790
encrypt the message several times for
every communication partner involved and
00:52:57.790 --> 00:53:02.849
send it to them, but your communication
partners need to know, well this is a
00:53:02.849 --> 00:53:08.809
group message and it belongs to this and
that group, and to do so Threema has group
00:53:08.809 --> 00:53:15.780
packets and they include exactly that
information, they include a creator ID
00:53:15.780 --> 00:53:21.670
which is the Threema ID of the person who
created the group and a group ID which is
00:53:21.670 --> 00:53:28.059
something randomly generated when creating
a group, and after that follows a regular
00:53:28.059 --> 00:53:32.790
packet format; in this case a text message,
if it were an image message you would see
00:53:32.790 --> 00:53:37.840
exactly the same stuff as shown in the
image message before. So this is how group
00:53:37.840 --> 00:53:43.070
messages look, but we need a way to
introduce new groups to change names and
00:53:43.070 --> 00:53:51.330
for that there are special packets and
this for example is a group "set members
00:53:51.330 --> 00:53:55.069
message", which tells everybody there is
this new group and it has the following
00:53:55.069 --> 00:54:00.590
members as you can see here there is
only a group ID, there is no longer a
00:54:00.590 --> 00:54:03.880
group creator ID included and that is
because Threema group management
00:54:03.880 --> 00:54:10.500
is very static, there can only be one person
managing a group and that is the person
00:54:10.500 --> 00:54:14.830
who created the group. So only the person
who created the group can send this kind
00:54:14.830 --> 00:54:20.670
of message, saying there is a new member
in the group for example and therefore the
00:54:20.670 --> 00:54:27.810
group creator is implicit in this case, it
is the sender of the message, so this is
00:54:27.810 --> 00:54:33.260
kind of annoying because you cannot have
a group where everybody can add members, for
00:54:33.260 --> 00:54:39.710
example and stuff like that. Just if you
set a name for a group, the message looks
00:54:39.710 --> 00:54:47.710
very similar it just doesn't include a
member list, but a name field. So, what I
00:54:47.710 --> 00:54:52.180
want to talk about next is something that
happens above all the stuff I talked
00:54:52.180 --> 00:54:57.070
about right now, because now I show you
there are different kinds of packets doing
00:54:57.070 --> 00:55:01.069
all that stuff, there are lots
more packets, for audio messages for
00:55:01.069 --> 00:55:07.030
example they look very similar to the
image messages, because they just I mean
00:55:07.030 --> 00:55:11.340
we have a blob ID for the audio file and
stuff like that but what is kind of
00:55:11.340 --> 00:55:17.760
interesting I thought, we thought, is that
above this layer of packet formats,
00:55:17.760 --> 00:55:24.091
there's also some additional stuff
happening and a good example for that is
00:55:24.091 --> 00:55:30.420
how Threema handles subtitles for images,
you can I think a lot of modern messengers
00:55:30.420 --> 00:55:35.810
support that, add some kind of text to
an image and Threema doesn't have a packet
00:55:35.810 --> 00:55:42.429
format or a field in some kind of image
message for that, but they just embed the
00:55:42.429 --> 00:55:47.630
subtitle of the image in the actual image,
in the Exif data of the image, and send
00:55:47.630 --> 00:55:54.920
it along. This has the advantage of being
compatible with Threema versions not aware
00:55:54.920 --> 00:55:58.690
of this feature, because they can just
happily ignore this Exif data, you won't
00:55:58.690 --> 00:56:03.680
see the subtitle but it won't break
anything. It is though kind of wonky because
00:56:03.680 --> 00:56:07.839
it's actually a feature which is not
reflected in the actual packet format and
00:56:07.839 --> 00:56:13.070
something very similar also happens with
quotes, you can quote other people in
00:56:13.070 --> 00:56:17.540
Threema you can like, mark your message
and say I want to quote that and in the
00:56:17.540 --> 00:56:23.339
app it looks like some kind of fixed
feature, yeah, you have this message you
00:56:23.339 --> 00:56:28.559
quoted, included in your new message and
it looks like it's somehow linked to
00:56:28.559 --> 00:56:36.050
the old message, but in reality it's just
a text message, including some markdown,
00:56:36.050 --> 00:56:42.349
which, if your Threema version supports
this kind of stuff, is rendered
00:56:42.349 --> 00:56:46.809
nicely as is shown below, but if your
version doesn't support it, you'll just
00:56:46.809 --> 00:56:50.990
see the plain text.
So again, being compatible with versions
00:56:50.990 --> 00:57:01.039
that don't have it introduces some, yeah,
weird layer. And with that, I'll stop
00:57:01.039 --> 00:57:07.680
showing you all the features Threema has.
There's certainly more to talk about, but
00:57:07.680 --> 00:57:16.550
I think you should have an idea how it
works in basic terms. What it does; all
00:57:16.550 --> 00:57:21.880
the other stuff is kind of similar to what
I showed you and differs in
00:57:21.880 --> 00:57:27.619
particularities which aren't so important
I think and I'll just hand over to Roland
00:57:27.619 --> 00:57:34.089
who'll be wrapping up our talk and say
something about the results of our reverse
00:57:34.089 --> 00:57:36.294
engineering.
00:57:36.294 --> 00:57:45.440
Applause
00:57:45.440 --> 00:57:50.300
Roland: Okay, we told you we reversed the
app and we told you we weren't the first
00:57:50.300 --> 00:57:58.980
ones and this is all true. But we came
here to tell you guys or to make you guys
00:57:58.980 --> 00:58:04.839
aware of things you can expect from
messaging apps, and we hope that by using
00:58:04.839 --> 00:58:10.369
Threema as an example we have
shown you how you can relate your own
00:58:10.369 --> 00:58:14.109
privacy expectations to different apps and
we also hope we gave you enough
00:58:14.109 --> 00:58:20.559
terminology and explanation for that so you
can make a more competent decision
00:58:20.559 --> 00:58:28.480
next time you look at a messenger and look
at what its promises are. Since we
00:58:28.480 --> 00:58:33.880
reversed it anyway and we did a lot of
coding to do that what we did is put it in
00:58:33.880 --> 00:58:39.960
a library. Now, I don't know how many of
you guys know the term academic code
00:58:39.960 --> 00:58:45.970
Laughter
We are, of course... I'm
00:58:45.970 --> 00:58:50.709
working at a university, so we've been
doing this on and off for quite some
00:58:50.709 --> 00:58:55.320
time. We started roughly two years ago,
did it for a couple of days then left it
00:58:55.320 --> 00:59:00.569
lying around. Eventually we had the whole
thing lying in a drawer for about a year
00:59:00.569 --> 00:59:05.859
before we decided to finish it, so we
never actually put a lot of
00:59:05.859 --> 00:59:09.790
effort into the code. We are not
proficient programmers. But we still
00:59:09.790 --> 00:59:15.130
wanted to publish what
we did with the hopes that a small
00:59:15.130 --> 00:59:21.980
community might form around this, maybe
extend it, help us you know fix the few
00:59:21.980 --> 00:59:25.849
things that we didn't do so well, help us
document it - you don't have to take
00:59:25.849 --> 00:59:32.910
photographs, by the way, we will upload
the slides. So these repositories they
00:59:32.910 --> 00:59:38.230
exist, we made a GitHub
organization and we pushed to them
00:59:38.230 --> 00:59:43.289
yesterday. If you
wanted to start coding right away, say if
00:59:43.289 --> 00:59:47.209
you wanted to write a bot, we'd recommend
you wait a few weeks say two to three
00:59:47.209 --> 00:59:51.830
because we still want to, like, fix a few
of the kinks in there. Everyone else we
00:59:51.830 --> 00:59:56.579
hope will just look at it, maybe this will
help your understanding of what it actually
00:59:56.579 --> 01:00:04.260
does. And also the activists in us hope
that this might get the people at Threema
01:00:04.260 --> 01:00:08.329
to open-source their code because no
matter what we tell you here, and no
01:00:08.329 --> 01:00:11.690
matter what they tell you about how their
app actually works - and this is always
01:00:11.690 --> 01:00:15.950
true for non open-source software, there
will never be true transparency: you will
01:00:15.950 --> 01:00:20.010
never be able to prove that what runs on
your phone is actually implemented the
01:00:20.010 --> 01:00:25.531
same way we've shown you. With our library
you would have these guarantees, you can
01:00:25.531 --> 01:00:30.430
definitely use it to
write bots if you ever wanted to do that.
01:00:30.430 --> 01:00:34.549
Or if you just want to understand how it
works please go ahead and dive right into
01:00:34.549 --> 01:00:43.780
there. Well, with that said, we thank you
for your attention.
01:00:43.780 --> 01:00:58.990
Applause
Herald: Okay, thank you very much, Roland,
01:00:58.990 --> 01:01:06.220
Frieder, we only have time for one
question, so who has a super eager
01:01:06.220 --> 01:01:11.540
question - the signal angel is signalling.
Signal Angel: There's a couple of
01:01:11.540 --> 01:01:16.930
questions, but I will pick the best one.
The best one was from alien: could you use
01:01:16.930 --> 01:01:22.690
captions to inject malicious Exif data
into the images?
01:01:22.690 --> 01:01:29.880
Frieder: What is malicious Exif data?
Signal Angel: Well some data that probably
01:01:29.880 --> 01:01:37.420
the image parsing library.
Frieder: What we did not do was
01:01:37.420 --> 01:01:42.750
look very closely at security
problems in the implementation of Threema.
01:01:42.750 --> 01:01:48.099
I could, like, and I would say this falls
into this department: there's also a
01:01:48.099 --> 01:01:52.280
library handling the GIF display and
stuff like that. We could have looked at
01:01:52.280 --> 01:01:57.140
is this broken, maybe. We did not. We
looked at the protocol from a higher level
01:01:57.140 --> 01:02:01.029
and, so I cannot say anything about it.
Signal Angel: Okay and another question
01:02:01.029 --> 01:02:07.570
was: when a non-group-originating user
sends the group update message, what
01:02:07.570 --> 01:02:13.529
happens?
Frieder: The thing is, Threema group IDs
01:02:13.529 --> 01:02:19.839
aren't globally unique. A Threema group ID
only refers to a particular group together
01:02:19.839 --> 01:02:26.569
with the group creator's ID. So if you
send an update group message from your
01:02:26.569 --> 01:02:31.029
account, the app would look for a
different group than you intended. Because
01:02:31.029 --> 01:02:36.930
your group ID would say I'm trying to
update a group created by me with this and
01:02:36.930 --> 01:02:43.930
that ID. So it won't be the group
you want to hijack.
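NOTE
A minimal sketch of the lookup Frieder describes, under stated assumptions: the class GroupStore, its method names, the 8-byte group ID and the IDs ALICE001, BOB00002 and MALLORY1 are all made up for illustration. What the real app does with an update for a group it does not know is not covered in the talk; this sketch simply stores it under its own key.

```python
class GroupStore:
    """Groups as one receiving client might track them (illustrative only)."""

    def __init__(self):
        # Key is the pair (creator ID, group ID); the group ID alone is not unique.
        self._groups = {}

    def handle_set_members(self, sender_id, group_id, members):
        # The packet carries no creator ID: the sender is taken to be the creator.
        self._groups[(sender_id, group_id)] = set(members)

    def members(self, creator_id, group_id):
        return self._groups.get((creator_id, group_id), set())


store = GroupStore()
gid = bytes.fromhex("0102030405060708")  # made-up group ID

# Alice creates a group with Bob.
store.handle_set_members("ALICE001", gid, {"ALICE001", "BOB00002"})

# Mallory reuses Alice's group ID, but the lookup key becomes ("MALLORY1", gid),
# so Alice's group is never touched.
store.handle_set_members("MALLORY1", gid, {"MALLORY1"})

print(sorted(store.members("ALICE001", gid)))  # ['ALICE001', 'BOB00002']
```

Because the creator is derived from the sender rather than read from the packet, a forged update can only ever address groups keyed under the forger's own ID.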
01:02:43.930 --> 01:02:47.163
Herald: Okay, very well. Another round
of applause for our speakers!
01:02:47.163 --> 01:02:56.571
applause
01:02:56.571 --> 01:03:12.741
postroll music
01:03:12.741 --> 01:03:19.741
Subtitles created by c3subtitles.de
in the year 2018