1
00:00:00,160 --> 00:00:13,499
<i>33c3 opening theme music</i>

2
00:00:13,499 --> 00:00:20,460
Herald: I'm excited to be here, I guess
you are too. We will get started with our

3
00:00:20,460 --> 00:00:26,670
first talker for the day. He is a security
researcher at SBA Research, and he's also

4
00:00:26,670 --> 00:00:32,668
a member of CCC Vienna. The talk we'll be
hearing today is "Everything you always

5
00:00:32,668 --> 00:00:37,250
wanted to know about Certificate
Transparency" and with that, I will pass

6
00:00:37,250 --> 00:00:41,848
on the stage. Please give a warm welcome
to Martin Schmiedecker!

7
00:00:41,909 --> 00:00:46,361
<i>applause</i>

8
00:00:48,740 --> 00:00:53,720
Martin: Thank you very much for these kind
words and this very nice introduction.

9
00:00:54,071 --> 00:00:58,820
As Ari said, I'm a member of CCC Vienna,
I'm also on Twitter, so if you have a

10
00:00:58,820 --> 00:01:02,730
comment afterwards, or want to ping me, if
you find a typo in the slides, or

11
00:01:02,730 --> 00:01:05,220
whatever, just ping me on Twitter.

12
00:01:05,220 --> 00:01:08,720
So, what is this talk about? What are we going

13
00:01:08,720 --> 00:01:13,010
to talk about? Certificate Transparency
is kind of a new thing in the TLS

14
00:01:13,010 --> 00:01:19,680
ecosystem so not many people are familiar
that it is here. So I will present the

15
00:01:19,680 --> 00:01:24,910
overview, what is CT and what it does and
will also peek under the hood and see what

16
00:01:24,910 --> 00:01:32,060
it actually does, how it works, and how
you can play with it. So one of the things

17
00:01:32,060 --> 00:01:38,150
I have to say about myself: I'm a keen fan
of Internet memes. So even though these

18
00:01:38,150 --> 00:01:44,690
are hilarious pictures - personally I find
them hilarious - that I put online, keep

19
00:01:44,690 --> 00:01:48,700
in mind that HTTPS is a serious topic.
Whether you do net banking, you're

20
00:01:48,700 --> 00:01:53,670
googling, or whatever you do online, HTTPS
is there to protect your privacy and to

21
00:01:53,670 --> 00:01:59,690
protect your security. And in some states,
this has been shown by history, this is

22
00:01:59,690 --> 00:02:05,350
not the case, so there are nation-wide
introspecting devices which break open the

23
00:02:05,350 --> 00:02:11,400
TLS encryption and look at the content.
And people will get a visit from secret

24
00:02:11,400 --> 00:02:16,010
police or anything and they will knock on
their door and arrest them. Just like this

25
00:02:16,010 --> 00:02:21,650
week happened in Turkey, where people got
arrested for posting things on Facebook.

26
00:02:21,650 --> 00:02:25,720
So even though there are some funny
pictures in there keep in mind that this

27
00:02:25,720 --> 00:02:34,030
is just a means to an end for my
presentation. I personally find HTTPS is a

28
00:02:34,030 --> 00:02:39,270
very important topic. I hope I can
convince you, too. And CT in particular is

29
00:02:39,270 --> 00:02:46,900
fascinating. Why is there something like
Certificate Transparency? The name says it

30
00:02:46,900 --> 00:02:52,650
all: if you are a certification authority,
you want to make public the certificates

31
00:02:52,650 --> 00:02:59,860
you sell or you issue. As with many good
stories and many good tools it all started

32
00:02:59,860 --> 00:03:06,150
with a hack. Back in 2011 there was this
Dutch certification authority called

33
00:03:06,150 --> 00:03:10,850
DigiNotar, and they got pwned. They got
really, really badly fisted.

34
00:03:10,850 --> 00:03:11,850
<i>laughter</i>

35
00:03:11,850 --> 00:03:17,680
They lost everything. They lost all their
crown jewels. And as part of this hack,

36
00:03:17,680 --> 00:03:23,650
there were 500-something fraudulent
certificates issued. And not just any

37
00:03:23,650 --> 00:03:27,370
certificates, not just like Let's Encrypt,
where you can get a free certificate, and

38
00:03:27,370 --> 00:03:32,350
then use it for your internal systems,
or for your web site, or whatever. No,

39
00:03:32,350 --> 00:03:38,870
really, really high value domains and high
value certificates. Like google.com, very

40
00:03:38,870 --> 00:03:43,290
privacy-invasive, if you can read what
people are googling, or what they are

41
00:03:43,290 --> 00:03:48,360
sending in their emails.
windowsupdate.com, which is like the back

42
00:03:48,360 --> 00:03:56,069
door to some of the Windows world.
mozilla.com, the attacker could manipulate

43
00:03:56,069 --> 00:04:03,140
the Firefox download, sign it with the
certificate and ship it over a

44
00:04:03,140 --> 00:04:11,050
secure-seeming website. torproject, and so
forth. This was back in 2011 and this was

45
00:04:11,050 --> 00:04:19,180
not just a small incident; it wasn't a
small CA, but it was a regular CA with regular

46
00:04:19,180 --> 00:04:24,960
business. What's more about this hack is
that these certificates have then been

47
00:04:24,960 --> 00:04:29,690
used to intercept communication of
clients. People browsing the web, reading

48
00:04:29,690 --> 00:04:34,850
their email. The company which
investigated the breach afterwards found

49
00:04:34,850 --> 00:04:42,240
out that at least 300,000 IP addresses
were connecting to google.com and were

50
00:04:42,240 --> 00:04:50,400
seeing this fraudulent cert. 99% of which
were from Iran. So it was kind of a

51
00:04:50,400 --> 00:04:56,570
nation-state attack against clients,
either ISP-based or border-gateway-based,

52
00:04:56,570 --> 00:05:04,070
where people were thinking they were
browsing secured by HTTPS but they were

53
00:05:04,070 --> 00:05:12,220
actually not. This is a wonderful frame
from the video. The guys from Fox-IT who

54
00:05:12,220 --> 00:05:19,949
investigated this breach used the
OCSP requests. Every time you get a

55
00:05:19,949 --> 00:05:23,450
certificate your browser has to somehow
figure out whether or not this certificate

56
00:05:23,450 --> 00:05:30,880
is still valid. If it has been revoked, it
would be nice to not use it anymore. And

57
00:05:30,880 --> 00:05:38,060
one of the approaches which is used is the
so-called OCSP, so the client asks the

58
00:05:38,060 --> 00:05:45,870
certificate authority: "Hey, is this still
valid?" And this has been logged. Each of

59
00:05:45,870 --> 00:05:53,360
these requests is one of the clients
seeing this fraudulent certificate and

60
00:05:53,360 --> 00:05:59,790
asking DigiNotar: "Hey, is this cert still
valid?" And as you can see, most of the

61
00:05:59,790 --> 00:06:03,580
connections - it's actually a movie, so
you can see the lights flickering and

62
00:06:03,580 --> 00:06:08,699
popping up and down as people go to sleep
and wake up again. And most of the

63
00:06:08,699 --> 00:06:15,860
people were from Iran. So how did
DigiNotar get hacked? They got really,

64
00:06:15,860 --> 00:06:21,229
really, badly hacked because they had
vulnerabilities everywhere. They had a

65
00:06:21,229 --> 00:06:27,400
system running which was incomprehensibly
insecure for a certification authority.

66
00:06:27,400 --> 00:06:31,900
People think that if you run a
certification authority you build the

67
00:06:31,900 --> 00:06:37,449
foundation for secure communication
online. You are the one securing Internet

68
00:06:37,449 --> 00:06:42,690
communication. And if you run such an
entity, people think you know security.

69
00:06:42,690 --> 00:06:43,960
Actually,

70
00:06:43,960 --> 00:06:45,600
<i>laughter</i>

71
00:06:45,600 --> 00:06:52,100
actually, DigiNotar did not. They had unpatched
software, which was facing the Internet.

72
00:06:52,100 --> 00:06:55,990
Might happen. They didn't have anti-virus
on the machines that issued the

73
00:06:55,990 --> 00:07:01,860
certificates. They didn't have a strong
password for their admin account. So like

74
00:07:01,860 --> 00:07:05,040
"password" or "admin". Actually, you can
read the report online, and the

75
00:07:05,040 --> 00:07:11,600
recommendations from ENISA, the European
security body, they listed all the things

76
00:07:11,600 --> 00:07:18,700
that have been found and identified. Also,
all the certificate-issuing servers were

77
00:07:18,700 --> 00:07:27,040
in one Windows domain. Also kind of bad
from DigiNotar: they kept the incident

78
00:07:27,040 --> 00:07:31,690
secret. Of course, they did not want to
spread out onto the Internet "hey, we got

79
00:07:31,690 --> 00:07:37,760
hacked, and we have had bad security".
They kept this incident hidden

80
00:07:37,760 --> 00:07:39,900
for more than 2 months.

81
00:07:39,900 --> 00:07:45,380
After 2 months, when it got
public, and when the Internet found out,

82
00:07:45,380 --> 00:07:49,820
that actually something really, really bad
had happened, they found out, and

83
00:07:49,820 --> 00:07:59,640
DigiNotar then went bankrupt. That's the sad
ending of the story. But this is only one

84
00:07:59,640 --> 00:08:05,620
of the problems that certification
authorities face. If you run a

85
00:08:05,620 --> 00:08:10,860
certification authority, you issue
certificates based on the identity of your

86
00:08:10,860 --> 00:08:17,310
customers. You can create sub-root CAs, so
you can say Hey, Martin, he looks like a

87
00:08:17,310 --> 00:08:22,960
nice guy, he looks like he knows security,
let's make him a CA and make him verify

88
00:08:22,960 --> 00:08:31,710
identities. Probably not a good idea, but
this is what the business model of HTTPS

89
00:08:31,710 --> 00:08:36,599
and certification authorities is. They
issue certificates and they grant the

90
00:08:36,599 --> 00:08:45,470
permission to issue certificates as well.
And the entire goal of these companies is

91
00:08:45,470 --> 00:08:50,910
to get into the trust stores. Every
browser, every operating system, every

92
00:08:50,910 --> 00:08:56,879
thing that connects over TLS has something
called a trust store, where it stores

93
00:08:56,879 --> 00:09:02,499
the entities that are entitled to issue
certificates. And the problem is, those

94
00:09:02,499 --> 00:09:07,199
CAs are not strictly audited. They have
their requirements that they have to

95
00:09:07,199 --> 00:09:13,369
fulfil. They have to show that they have
some kind of security. But afterwards,

96
00:09:13,369 --> 00:09:17,709
once they're certified, and once they're
in the trust stores, there is not such a

97
00:09:17,709 --> 00:09:23,130
strong incentive to audit them, because
they are already in the trust stores, and

98
00:09:23,130 --> 00:09:31,269
they've had their audits, and so forth.
This can lead to many problems. Another

99
00:09:31,269 --> 00:09:38,959
CA, Trustwave, in 2011, it issued sub-CA
certificates. Anyone with a sub-CA

100
00:09:38,959 --> 00:09:46,199
certificate can issue a TLS certificate
for any domain. They used it for traffic

101
00:09:46,199 --> 00:09:50,249
introspection. So they were selling, I
don't know, to a company, which was

102
00:09:50,249 --> 00:09:55,670
building appliances which can break open
the network connections for banks,

103
00:09:55,670 --> 00:10:05,170
companies, or entire ISPs. They can look
into the traffic of their users. Also,

104
00:10:05,170 --> 00:10:11,749
there was Lenovo SuperFish, wonderful
idea. SuperFish was a local

105
00:10:11,749 --> 00:10:17,070
man-in-the-middle CA, and the goal of the
SuperFish CA was to break open HTTPS

106
00:10:17,070 --> 00:10:20,510
traffic, so that they can inject ads.

107
00:10:20,510 --> 00:10:22,040
<i>laughter</i>

108
00:10:22,040 --> 00:10:27,239
Even though you're using Gmail and you
have this nice, slick interface without

109
00:10:27,239 --> 00:10:34,160
obvious ads, SuperFish would break open
this connection, would be trusted by the

110
00:10:34,160 --> 00:10:44,199
browser, and would have huge overlay ads.
Lenovo stopped cooperating with SuperFish.

111
00:10:44,199 --> 00:10:51,889
This was preinstalled on Lenovo notebooks.
They had a local CA installed on the

112
00:10:51,889 --> 00:10:57,720
system so they could inspect the traffic
and show ads to users. What's even more

113
00:10:57,720 --> 00:11:03,470
interesting is that all these CAs had the
same key, and the private key was in RAM.

114
00:11:03,470 --> 00:11:12,889
So anybody could extract the private key
of the CA, use it to sign certificates for

115
00:11:12,889 --> 00:11:19,660
anything, and have an additional layer of
HTTPS injection, where you could not only

116
00:11:19,660 --> 00:11:27,160
show ads, but also read the emails or do
whatever you want. Very bad. They're not doing it

117
00:11:27,160 --> 00:11:34,709
allegedly anymore. Then there was, in
China, the CNNIC, they issued a sub-CA for

118
00:11:34,709 --> 00:11:38,649
an introspection company. Again the
company wanted to sell appliances where

119
00:11:38,649 --> 00:11:46,209
they could break open HTTPS connections
and look into the traffic of the users.

120
00:11:46,209 --> 00:11:51,220
And there was another incident just this
year: Symantec was issuing "test"

121
00:11:51,220 --> 00:11:57,399
certificates to a company or whatever,
among them google.com, opera.com, things

122
00:11:57,399 --> 00:12:04,230
that you probably would not like to test,
and got caught. And the nice thing about

123
00:12:04,230 --> 00:12:08,709
this incident is: they already had
Certificate Transparency installed. And we

124
00:12:08,709 --> 00:12:15,490
will come back to this incident in a
minute. Traffic introspection is a valid

125
00:12:15,490 --> 00:12:21,839
thing. If you have a fleet of planes, and
they are connected via expensive satellite

126
00:12:21,839 --> 00:12:26,739
connections and you really pay a lot for
bandwidth you would like to block, for

127
00:12:26,739 --> 00:12:33,259
example, Netflix, or anything which causes
a lot of traffic. One of the approaches

128
00:12:33,259 --> 00:12:40,309
which was taken by Gogo, they had traffic
introspection devices in their planes and

129
00:12:40,309 --> 00:12:48,899
they issued not-trusted certificates to
inspect the traffic. Bad for them:

130
00:12:48,899 --> 00:12:54,829
Adrienne Porter Felt who works for Google
noticed this and Gogo is not doing this

131
00:12:54,829 --> 00:13:02,200
anymore. And even though traffic
introspection sounds like a really bad

132
00:13:02,200 --> 00:13:07,910
thing, I can think of use cases where this
is legit. If you run a company, if you run

133
00:13:07,910 --> 00:13:15,039
a bank, and you want to prevent people
from leaking data, this can be OK. But it

134
00:13:15,039 --> 00:13:18,120
has to be transparent, people have to know
that this is happening, that they're

135
00:13:18,120 --> 00:13:22,660
inspecting everything. And it still won't
prevent people from carrying out the USB

136
00:13:22,660 --> 00:13:29,929
thumb drive with all the data on it. So
this is the big picture why we need

137
00:13:29,929 --> 00:13:34,899
Certificate Transparency. We would like to
see which certificates have been issued by

138
00:13:34,899 --> 00:13:42,889
a specific CA. Some minor issues, not
really minor, that additionally come to

139
00:13:42,889 --> 00:13:49,189
play are that TLS has its issues
nonetheless, whether these certificates are

140
00:13:49,189 --> 00:13:54,790
issued or not. One of them is that certificate
revocation is tricky. It's not as easy as

141
00:13:54,790 --> 00:14:01,109
just saying "this certificate is not valid
anymore". Once a certificate is issued, it

142
00:14:01,109 --> 00:14:08,040
is valid until the date shown in the
certificate, which can be three years.

143
00:14:08,040 --> 00:14:12,230
So if, on the first day of
using this certificate, people notice,

144
00:14:12,230 --> 00:14:17,999
"uh, we should revoke it", clients that
don't get this update will be able to use

145
00:14:17,999 --> 00:14:28,019
this certificate for two or more years.
Also, another limitation is that all CAs

146
00:14:28,019 --> 00:14:35,149
can issue certificates for all websites.
Any of those 1,800 CAs and sub-CAs which

147
00:14:35,149 --> 00:14:41,750
were in trust stores in 2013 they can all
issue a certificate for google.com or

148
00:14:41,750 --> 00:14:46,620
facebook.com. This is not prevented by any
means but social means and contracts,

149
00:14:46,620 --> 00:14:54,640
which state that they have to check the
legitimacy of the request. This was

150
00:14:54,640 --> 00:15:02,869
published in a paper in 2013. There are
more than 1,800 CAs which can sign

151
00:15:02,869 --> 00:15:10,379
certificates for any domain in regular
user devices. Another paper in 2014 found

152
00:15:10,379 --> 00:15:16,089
out that one third of them, one third of
those 1,800 certification authorities,

153
00:15:16,089 --> 00:15:21,100
never issued a single HTTPS certificate.
This makes you wonder: why are they then

154
00:15:21,100 --> 00:15:26,759
in the trust stores and so forth. You can
claim that a certain percentage of them

155
00:15:26,759 --> 00:15:34,499
are used for issuing private certificates
within networks. Still, one third of them

156
00:15:34,499 --> 00:15:44,220
never issued a publicly obtainable HTTPS
certificate. Then of course there are the

157
00:15:44,220 --> 00:15:49,109
implementation issues. TLS has a long
history of implementation flaws. Not just

158
00:15:49,109 --> 00:15:54,109
cryptographic, there's Logjam, FREAK,
POODLE, whatever. They are a completely

159
00:15:54,109 --> 00:16:01,799
separate issue. But the implementation
issues are troubling the device security

160
00:16:01,799 --> 00:16:06,820
at a constant pace. A famous example is
"goto fail;" from iOS, where they had an

161
00:16:06,820 --> 00:16:12,660
additional "goto fail" and missing brackets, and
the certificate validity wasn't checked.

162
00:16:12,660 --> 00:16:19,629
Also, we have a lot of embedded devices.
Once they're powered up, they usually

163
00:16:19,629 --> 00:16:25,369
generate their private key, and they have
no access to good entropy. Entropy on

164
00:16:25,369 --> 00:16:33,010
embedded devices is surprisingly hard. So
a lot of them generate the same keys. And

165
00:16:33,010 --> 00:16:37,399
as already mentioned, we have different
trust stores per browser, per operating

166
00:16:37,399 --> 00:16:41,910
system. Everyone has a different trust
base. Also of course, every CA tries to

167
00:16:41,910 --> 00:16:47,379
get access into all of the trust stores,
get shipped with system updates to be

168
00:16:47,379 --> 00:16:54,670
trusted, and we have a diversity which is
not natural. It could be much easier if

169
00:16:54,670 --> 00:17:01,490
people would have the same trust base on
all their devices. And there are plenty of

170
00:17:01,490 --> 00:17:07,609
deployment issues. SSLv2: everybody thinks
it's dead, but apparently, it's not.

171
00:17:07,609 --> 00:17:12,099
Sebastian Schinzel will give a splendid
presentation two hours from now about the

172
00:17:12,099 --> 00:17:19,129
DROWN attack. The DROWN attack uses SSLv2
weaknesses in email transport. Simply

173
00:17:19,129 --> 00:17:26,720
because it's activated, and it uses the
same key, you can attack top-notch TLS 1.2

174
00:17:26,720 --> 00:17:32,850
encryption, because this is still here.
There's the whole shmafoo of the SHA1

175
00:17:32,850 --> 00:17:37,780
certificates. Certification authorities
are not supposed to issue any SHA1

176
00:17:37,780 --> 00:17:41,760
certificates anymore. Some do, some get
caught, because they back-dated their

177
00:17:41,760 --> 00:17:47,380
certificates, and so forth. It's a mess.
Then there's cipher suites. There are more

178
00:17:47,380 --> 00:17:54,610
than 500 cipher suites available for the
different versions of TLS. Every admin

179
00:17:54,610 --> 00:18:00,060
would like to be [as] secure as possible
but which should they choose? As soon as

180
00:18:00,060 --> 00:18:04,910
there is money involved, like Amazon, they
need to be compatible with Internet

181
00:18:04,910 --> 00:18:16,140
Explorer 6 and so forth. It's really a
mess. And of course, email STARTTLS: Email

182
00:18:16,140 --> 00:18:22,220
never had the design to incorporate
security and authentication, so as always,

183
00:18:22,220 --> 00:18:27,750
they just popped it on top, and this is
STARTTLS. The problem with STARTTLS is it

184
00:18:27,750 --> 00:18:33,080
can be suppressed and people will fall
back to plaintext if they cannot reach the

185
00:18:33,080 --> 00:18:39,530
service with STARTTLS. Perfect forward
secrecy and so forth, deployment is another

186
00:18:39,530 --> 00:18:46,770
topic which one could give a talk about. And there
is this troublesome development that the

187
00:18:46,770 --> 00:18:52,340
CAs, they get bought and they get sold
constantly. Just this year, Symantec

188
00:18:52,340 --> 00:19:00,040
bought the company BlueCoat. Symantec is
one of the larger CAs. They run the entire

189
00:19:00,040 --> 00:19:07,150
- not the entire, but they run large parts
of the certifications that are observable.

190
00:19:07,150 --> 00:19:13,100
BlueCoat got popular in the Arab Spring,
because it was found that BlueCoat proxies, which

191
00:19:13,100 --> 00:19:18,700
are capable of using man-in-the-middle
attacks to conduct traffic introspection,

192
00:19:18,700 --> 00:19:23,320
have been used at an ISP I think in Syria
or Egypt. They found them, and they have

193
00:19:23,320 --> 00:19:28,820
been deployed nationwide. So if you think
about it that Symantec, one of the largest

194
00:19:28,820 --> 00:19:34,690
CAs, is buying BlueCoat, one of the larger
traffic introspection companies, things

195
00:19:34,690 --> 00:19:38,620
can look really fishy or scary.

196
00:19:39,580 --> 00:19:44,180
Of course they promised they
would never use the Symantec

197
00:19:44,180 --> 00:19:46,600
<i>laughter</i>

198
00:19:46,600 --> 00:19:53,140
This is the state we're in. This is fine,
but it's not. But people still think about

199
00:19:53,140 --> 00:19:59,561
it that HTTPS is safe. And actually it
took a decade to teach people that they

200
00:19:59,561 --> 00:20:05,060
have to search for the lock icon. But if
they do not understand - actually they do

201
00:20:05,060 --> 00:20:11,910
not know how the lock icon appears. But
the entire lock icon is a farce if you dig

202
00:20:11,910 --> 00:20:20,860
into the details. We're all sitting in a
room filled with flames, so to say. So,

203
00:20:20,860 --> 00:20:26,520
this is where certificate transparency
comes in. Certificate transparency has the

204
00:20:26,520 --> 00:20:38,050
goal to identify fraudulent certification
authorities. In a perfect world, any

205
00:20:38,050 --> 00:20:43,140
certification authority would publish all
its logs, would publish all the

206
00:20:43,140 --> 00:20:48,700
certificates it issues. So as soon as I
get a certificate for schmiedecker.net,

207
00:20:48,700 --> 00:20:54,160
the certification authority - this is part
of the public/private key, it can be

208
00:20:54,160 --> 00:20:59,840
public - so wouldn't it be nice if the CA
would publish that it just issued a

209
00:20:59,840 --> 00:21:05,740
certificate for schmiedecker.net?
Basically: yes. Of course, certification

210
00:21:05,740 --> 00:21:11,300
authorities do not want this to happen, in
particular if they're selling to funky

211
00:21:11,300 --> 00:21:18,440
states or funky businesses which earn
their money with traffic introspection and

212
00:21:18,440 --> 00:21:23,920
so forth. So the perfect world would be
the public key of each certificate would

213
00:21:23,920 --> 00:21:28,160
be published. The certification authority
could say "Hey, I just issued this

214
00:21:28,160 --> 00:21:30,990
certificate" and everybody could see it,
could verify it

215
00:21:30,990 --> 00:21:35,200
and it would be, well, a better world.

216
00:21:37,740 --> 00:21:43,200
This would help to detect
problems very early. So if a small Dutch

217
00:21:43,200 --> 00:21:47,330
certification authority would issue a
certificate for google.com or

218
00:21:47,330 --> 00:21:52,300
torproject.com, this would be noticeable.
I mean, this is a small CA, they would be

219
00:21:52,300 --> 00:21:57,280
really - they should be really surprised
if google.com decides to issue a

220
00:21:57,280 --> 00:22:04,540
certificate for their service. This would
shorten the window of opportunity for an

221
00:22:04,540 --> 00:22:12,560
attacker. Also, the idea is to have some
form of punishment for misbehaving CAs. So

222
00:22:12,560 --> 00:22:18,020
at the moment, right now, if a
certification authority fucks up, and

223
00:22:18,020 --> 00:22:23,970
Google is affected, they mandate that they
need to have additional steps to be

224
00:22:23,970 --> 00:22:32,800
reintroduced into the trust stores. This
is what Google did. They did the Power

225
00:22:32,800 --> 00:22:41,650
Ranger move, and they decided they want to
make the internet more secure. Why Google?

226
00:22:41,650 --> 00:22:46,610
Well, Google is uniquely positioned in a
way that they control the clients with

227
00:22:46,610 --> 00:22:53,820
their browsers and the Android system,
and they also control a large portion of

228
00:22:53,820 --> 00:22:58,340
the servers. Everyone uses Google, except
for those that use Bing.

229
00:22:58,340 --> 00:23:00,530
<i>laughter</i>

230
00:23:00,530 --> 00:23:08,140
Just kidding. What Google did is, once the
DigiNotar hack got public, they pinned

231
00:23:08,140 --> 00:23:13,620
their certificates. Since Chrome has a
decent update cycle they can ship the

232
00:23:13,620 --> 00:23:19,241
certificates which they expect to see with
a browser update. So as soon as [the]

233
00:23:19,241 --> 00:23:27,510
browser updates in the background, it can
enforce the specific certificate that it

234
00:23:27,510 --> 00:23:34,670
expects to see for google.com,
youtube.com, and whatever. Also, it has a

235
00:23:34,670 --> 00:23:40,330
really huge market share. 50% and more,
depending on how you count. Chrome and

236
00:23:40,330 --> 00:23:46,060
Chromium are rather popular. And lastly,
they are a common target. So if some

237
00:23:46,060 --> 00:23:53,860
dictator decides to introspect client
emails, user emails, usually they target

238
00:23:53,860 --> 00:23:59,640
gmail.com, because they have decent
security, they do not have any other

239
00:23:59,640 --> 00:24:10,180
vulnerabilities or backdoors to allow
access to their content. Which makes the

240
00:24:10,180 --> 00:24:15,700
attack on Gmail a very drastic attack.
With the changes that Google introduced

241
00:24:15,700 --> 00:24:21,190
into Chrome with the certificate pinning,
they can now detect these attacks.

242
00:24:21,190 --> 00:24:29,940
But this was already back in 2011. Since
then, for example, the Porter Felt tweet

243
00:24:29,940 --> 00:24:37,520
I showed you: if Chrome would go to a
website google.com or youtube.com, and

244
00:24:37,520 --> 00:24:44,200
would see a fraudulent certificate, they
would warn the user. And what Google then

245
00:24:44,200 --> 00:24:52,840
did, was to propose a standard, to make an
RFC, how to transparently publish the logs

246
00:24:52,840 --> 00:25:01,350
for certificates that have been issued.
The idea of the RFC is that every

247
00:25:01,350 --> 00:25:11,460
certificate issued is public. This is
implemented in a public, append-only log.

248
00:25:11,460 --> 00:25:16,900
So they have a log, they have open APIs,
and they accept every certificate. Then,

249
00:25:16,900 --> 00:25:22,180
cryptographically assured, the client like
the browser can verify that this is a

250
00:25:22,180 --> 00:25:27,640
publicly logged certificate. And the
entire system is open for all. So you can

251
00:25:27,640 --> 00:25:30,190
go to the website, you can
get the source code,

252
00:25:30,190 --> 00:25:36,490
you can run your own log for RFC 6962.

253
00:25:36,490 --> 00:25:40,610
And everyone is happy.

254
00:25:40,870 --> 00:25:45,960
The goals were to detect misbehaving 
CAs. As I said,

255
00:25:45,960 --> 00:25:51,500
they have their audits, they have their
compliance regulations, and so forth, but

256
00:25:51,500 --> 00:25:55,010
not on the certificate level. With
certificate transparency, they become

257
00:25:55,010 --> 00:26:00,950
auditable by the public, by the browsers.
Everyone can query the logs and see

258
00:26:00,950 --> 00:26:04,730
whether or not this particular
certification authority has issued a

259
00:26:04,730 --> 00:26:07,290
certificate for google.com.

260
00:26:10,200 --> 00:26:15,390
Alright! Upon reading the RFC,
there are three entities

261
00:26:15,390 --> 00:26:20,260
which are part of Certificate
Transparency. There are, for one,

262
00:26:20,260 --> 00:26:27,680
the logs, which are like giant vacuum
cleaners. They ingest all the certificates

263
00:26:27,680 --> 00:26:34,170
which are sent to them, and then
cryptographically sign them and issue the

264
00:26:34,170 --> 00:26:40,620
assurance that this specific certificate
has been logged. And this has been issued

265
00:26:40,620 --> 00:26:45,640
and has not been tampered with, and so
forth. Then there are monitors. They

266
00:26:45,640 --> 00:26:49,860
identify suspicious certificates. Usually,
these are the certification authorities

267
00:26:49,860 --> 00:26:55,930
themselves which run those monitors. And
then there are the auditors. The auditors

268
00:26:55,930 --> 00:27:02,870
usually are implemented in the browser.
And they verify that the issued

269
00:27:02,870 --> 00:27:10,190
certificates are really logged. Looking at
them in detail: the role of the monitor

270
00:27:10,190 --> 00:27:14,080
and the auditor is kind of
interchangeable, so a monitor can be an

271
00:27:14,080 --> 00:27:21,350
auditor, back and forth. What the monitor
does, it fetches all the certificates.

272
00:27:21,350 --> 00:27:27,720
So you have this giant pool of certificates.
They are cryptographically assured which

273
00:27:27,720 --> 00:27:33,220
we will see soon. And the monitor just
fetches them all. And they have some form

274
00:27:33,220 --> 00:27:39,920
of semantic checking. They can see, has
there been a certificate for my domain,

275
00:27:39,920 --> 00:27:47,059
has there been any sub-CA created, which
is able to issue certificates for traffic

276
00:27:47,059 --> 00:27:53,590
introspection, and so forth. Also, what
they can then do with this data: they

277
00:27:53,590 --> 00:28:00,160
can identify misbehaving log operators. As I
said, the logs, they are just gigantic

278
00:28:00,160 --> 00:28:05,150
hoovers, which collect all the
certificates, and they need auditing, too,

279
00:28:05,150 --> 00:28:09,390
of course. They need - they have a
position of power, because they are

280
00:28:09,390 --> 00:28:18,300
managing this huge pool of certificates.
And one needs to challenge the log to

281
00:28:18,300 --> 00:28:24,400
identify misbehaviour. This can be done by
the monitors, can also be done by the

282
00:28:24,400 --> 00:28:32,490
auditors. Every client - right now, it's
implemented in Chrome. Chrome checks for

283
00:28:32,490 --> 00:28:43,110
these certificate transparency
cryptographically signed blobs. And the

284
00:28:43,110 --> 00:28:47,460
browsers and everything, they can verify
the log integrity as well. So in the

285
00:28:47,460 --> 00:28:56,860
backend, the log, it creates a hash tree.
This hash tree is signed. We will come to

286
00:28:56,860 --> 00:29:05,650
that in a second. I got lost here. So both
monitors and auditors, they query that the

287
00:29:05,650 --> 00:29:10,570
log entity is working correctly. It
wouldn't be a good thing if China could go

288
00:29:10,570 --> 00:29:16,530
to Google and tell them "Hey, we would like
to have this certificate removed." Google

289
00:29:16,530 --> 00:29:22,670
could then comply or not comply, but
if they removed the certificate, this

290
00:29:22,670 --> 00:29:28,340
would be auditable and this would be
observable to the public. So the good

291
00:29:28,340 --> 00:29:33,690
thing is anyone can run any software, anyone
of you in this room can run a log entity.

292
00:29:33,690 --> 00:29:38,430
You need some kind of access to some
certificates, so whether or not you are a

293
00:29:38,430 --> 00:29:45,340
certification authority, you can just run
a public log, and everybody can push their

294
00:29:45,340 --> 00:29:53,710
certificates to your service. Right now,
this is not the case. Usually, the CAs run

295
00:29:53,710 --> 00:30:00,230
the monitors and they run the logs, but
this is not by design, anybody can run

296
00:30:00,230 --> 00:30:06,470
anything. One of the problems is
availability. So even though I can set up

297
00:30:06,470 --> 00:30:15,140
a log for certificates, I have the problem
that my log needs to be online 24/7. My

298
00:30:15,140 --> 00:30:22,870
ISP is not happy if I ask him to guarantee
this for me, if I don't pay much much much

299
00:30:22,870 --> 00:30:31,350
more. So, how does it work? Currently, if
you get a certificate, you go to the

300
00:30:31,350 --> 00:30:36,070
certification authority, you say, "Hey,
I'm this wonderful domain, please could I

301
00:30:36,070 --> 00:30:42,860
get a certificate?" And then you get the
certificate. What's additionally happening

302
00:30:42,860 --> 00:30:50,350
with certificate transparency is that the
CA upon issuing the certificate - this can

303
00:30:50,350 --> 00:30:55,610
be any CA, this can be Let's Encrypt, this
can be Thawte, Symantec, you name it -

304
00:30:55,610 --> 00:31:02,090
what they do is they send the certificate
once they issued it, they send the

305
00:31:02,090 --> 00:31:13,500
certificate to one of the logs. The log
then signs the successful reception of the

306
00:31:13,500 --> 00:31:18,000
certificate, and immediately sends
something back. This blob is called the

307
00:31:18,000 --> 00:31:24,309
SCT, the signed certificate timestamp, and
this can then be included in the

308
00:31:24,309 --> 00:31:32,990
certificate or delivered in other ways. Key point
here is that once the server installs the

309
00:31:32,990 --> 00:31:42,860
certificate, it also installs this SCT, so
that browsers can see it and parse it.

310
00:31:42,860 --> 00:31:49,540
Some people I might have lost here.
Nonetheless, everything is easier in

311
00:31:49,540 --> 00:31:53,771
pictures. Right now, currently - and these
are the pictures from the certificate

312
00:31:53,771 --> 00:31:58,570
transparency website, thanks for making
them - my pic skills are really not that

313
00:31:58,570 --> 00:32:03,960
good, so I never would have been able to
make such beautiful graphs. So currently,

314
00:32:03,960 --> 00:32:10,020
there is the certification authority. It
issues a certificate, and the website then

315
00:32:10,020 --> 00:32:17,059
installs it in the correct directory. The
clients check it, and encryption can

316
00:32:17,059 --> 00:32:23,240
happen. The additional step, and this is
the nice thing, it can happen without any

317
00:32:23,240 --> 00:32:28,850
additional steps on the server side and
the client side, it's just the

318
00:32:28,850 --> 00:32:33,650
certification authority needs to do an
additional step. So instead of just

319
00:32:33,650 --> 00:32:39,920
issuing the certificate, they send the
certificate to the logs, the log

320
00:32:39,920 --> 00:32:45,800
immediately sends back the so-called SCT,
the signed certificate timestamp, and this

321
00:32:45,800 --> 00:32:51,830
is then included in the certificate, which
is shipped to the client. And then the

322
00:32:51,830 --> 00:32:57,570
client, if it supports it, can ask the
server whether or not this particular

323
00:32:57,570 --> 00:33:05,680
certificate is included. The things
that come back from the log they are

324
00:33:05,680 --> 00:33:11,010
signed, they have an ID, and they have a
timestamp. These are the important things.

325
00:33:11,010 --> 00:33:18,440
They need to be included in those SCTs.
Also, what will be interesting in the

326
00:33:18,440 --> 00:33:27,160
future is that a certificate can have
multiple log entries. So the SCT is like a

327
00:33:27,160 --> 00:33:36,380
promise. The log operator promises to
include this certificate in its logs. And

328
00:33:36,380 --> 00:33:40,140
everybody can check afterwards then if
this log has really publicly logged it, or if

329
00:33:40,140 --> 00:33:45,260
the authority has omitted to log it. In
the future it will be the case that many

330
00:33:45,260 --> 00:33:52,800
SCTs can be within a certificate. If I'm a
certification authority I can go to any

331
00:33:52,800 --> 00:34:00,000
log operator, send them every certificate
I have and then include many, many SCTs.

332
00:34:00,000 --> 00:34:04,080
And the SCT is not private. This is just
an ID, it's a timestamp, and it's a

333
00:34:04,080 --> 00:34:12,969
signature. This is probably too much.
There are multiple ways for the client to

334
00:34:12,969 --> 00:34:21,289
verify that this certificate has an SCT.
So one of the methods for example is OCSP

335
00:34:21,289 --> 00:34:26,389
stapling. Right now, if you have a
certificate, instead of going to the CA,

336
00:34:26,389 --> 00:34:34,149
the server can staple the OCSP response
signed by the CA. And within this OCSP

337
00:34:34,149 --> 00:34:44,109
stapling there can also be the SCT
included. How does it work on the log

338
00:34:44,109 --> 00:34:48,489
side? Everything there is, is a Merkle
hash tree. A Merkle hash tree is a

339
00:34:48,489 --> 00:34:52,940
wonderful data structure. It's nothing
new, it's nothing fancy, and it's not the

340
00:34:52,940 --> 00:34:54,418
blockchain.

341
00:34:54,418 --> 00:34:55,899
<i>laughter</i>

342
00:34:55,899 --> 00:35:05,400
The Merkle hash tree, it looks, it's a
binary tree. Every node has two children,

343
00:35:05,400 --> 00:35:10,570
and the hash value of an inner node
depends on the two children. So usually

344
00:35:10,570 --> 00:35:14,600
it's the concatenation of the values of
the two children. Gets hashed again, up

345
00:35:14,600 --> 00:35:19,859
to the root. Makes it very space efficient
because if I want to verify the integrity

346
00:35:19,859 --> 00:35:27,799
of one entire tree, all I have to check is
the hash value of the root. Then, of

347
00:35:27,799 --> 00:35:36,260
course, I can get all the relevant hash
values, and then I can reconstruct it. CT

348
00:35:36,260 --> 00:35:45,460
uses a SHA-256 Merkle tree, and as I said,
everything below a certain node is

349
00:35:45,460 --> 00:35:51,509
responsible for the hash value. So if you
remove a node, if you add a node, or if

350
00:35:51,509 --> 00:36:02,490
you relocate a node, the hash values of
all the upper nodes get changed. Each of

351
00:36:02,490 --> 00:36:06,920
the log operators, in addition to the
promise that they will include every

352
00:36:06,920 --> 00:36:12,400
certificate that it receives, it also
gives a promise on the maximum merge

353
00:36:12,400 --> 00:36:18,890
delay. The SCT, the promise to include
this certificate chain into the log, it

354
00:36:18,890 --> 00:36:26,069
can only finish immediately because it's a
promise to include this into the log. And

355
00:36:26,069 --> 00:36:32,400
the maximum merge delay is the time the
log operator promises to include it in the

356
00:36:32,400 --> 00:36:41,150
big, big Merkle hash tree. The good thing
about the Merkle hash tree is despite

357
00:36:41,150 --> 00:36:46,369
being very space efficient, calculation
efficient, not that much data overhead,

358
00:36:46,369 --> 00:36:50,869
and so forth, it's not possible to
backdate elements. This was interesting

359
00:36:50,869 --> 00:36:55,470
for one of the certification authorities
which issued SHA1 signed certificates,

360
00:36:55,470 --> 00:36:59,670
even though the browsers and everyone
agreed that this should not happen

361
00:36:59,670 --> 00:37:05,440
anymore. So it's also not possible to remove
elements that have been once in there. So

362
00:37:05,440 --> 00:37:09,780
if Symantec decided to remove the
google.com certificate, which was a "test"

363
00:37:09,780 --> 00:37:14,359
certificate, this would be noticeable as
well, because if you remove one of the

364
00:37:14,359 --> 00:37:20,739
leaves, the hash values up to the root,
they all change. And it's also not

365
00:37:20,739 --> 00:37:26,690
possible to add elements. If you would
like to add an element unnoticeably, you

366
00:37:26,690 --> 00:37:34,160
cannot do this, because the hash values of
all the upper nodes would change. So how

367
00:37:34,160 --> 00:37:39,989
do the logs operate? What they usually do
is once every hour, they receive the

368
00:37:39,989 --> 00:37:48,319
certificates, and once every hour they
include them into their Merkle hash tree.

369
00:37:48,319 --> 00:37:52,069
Probably already too much detail. They
build a separate tree, and then include it

370
00:37:52,069 --> 00:38:01,480
and recalculate the root hash value, which
is then signed and shipped. And the nice

371
00:38:01,480 --> 00:38:07,829
thing about the Merkle tree is that you
have multiple ways of proving things. One

372
00:38:07,829 --> 00:38:18,359
of the things that can be proved is whether
or not this log operator is honest. If a

373
00:38:18,359 --> 00:38:21,989
log operator removes one of the
certificates, this becomes visible by

374
00:38:21,989 --> 00:38:32,099
changing all the relevant nodes. Also,
it's very efficient. Also a figure from

375
00:38:32,099 --> 00:38:39,279
the project website. On the left side, you
have a Merkle tree with some added

376
00:38:39,279 --> 00:38:47,039
certificates, appended certificates. And
if a monitor or an auditor decides to

377
00:38:47,039 --> 00:38:53,699
challenge the log operator, at a later
point in time, whether or not these

378
00:38:53,699 --> 00:39:00,509
certificates D6 and D7 have been correctly
added, all the log operator has to send

379
00:39:00,509 --> 00:39:07,329
are those highlighted nodes. This is the
root, this is the thing that is signed,

380
00:39:07,329 --> 00:39:13,079
for example, every hour. This is public.
The certificates, they are public because

381
00:39:13,079 --> 00:39:20,539
like, they're certificates. If now someone
wants to verify that not only these have

382
00:39:20,539 --> 00:39:25,599
been included, this is very easy, because
you just have to calculate all the way up,

383
00:39:25,599 --> 00:39:30,279
but also verify that all the other
certificates are still there, so none of

384
00:39:30,279 --> 00:39:36,510
the old certificates have been removed,
there only needs to be three hash values

385
00:39:36,510 --> 00:39:42,190
transmitted. And then the challenger can
re-calculate everything. So as soon as the

386
00:39:42,190 --> 00:39:46,950
challenger knows those hash values they
can concatenate everything back together

387
00:39:46,950 --> 00:39:57,079
and in the end, it should have the same
hash value as the root. Another proof that

388
00:39:57,079 --> 00:40:02,790
is possible is whether a specific
certificate is still in the log. So it's

389
00:40:02,790 --> 00:40:07,359
not only possible to challenge the
consistency of the entire log regarding

390
00:40:07,359 --> 00:40:14,369
old data, but it's also possible to verify that a
specific certificate is still in the logs,

391
00:40:14,369 --> 00:40:21,109
or made it into the logs. Remember, the
SCT, the thing that finished immediately,

392
00:40:21,109 --> 00:40:27,190
is just a promise to include it in the
logs, and at a later point in time,

393
00:40:27,190 --> 00:40:35,619
anyone, any auditor can challenge the log
operator if the certificate is really in

394
00:40:35,619 --> 00:40:45,569
the log. So again, if I want to verify
that a specific certificate is in the log

395
00:40:45,569 --> 00:40:51,300
I have the certificate that I would like
to challenge, then I just need, in this

396
00:40:51,300 --> 00:40:57,259
example, those three nodes, and everything
else, the j node can be calculated because

397
00:40:57,259 --> 00:41:02,330
I have the certificate. Then I have the
hash of the certificate. I need this hash,

398
00:41:02,330 --> 00:41:12,430
then I can calculate this value, and so
forth, until I am at the root. So much for

399
00:41:12,430 --> 00:41:17,470
under the hood. Merkle hash trees are
gone. One of the problems of those logs

400
00:41:17,470 --> 00:41:22,630
is that they are ever growing. You might have
noticed, there is not a single word about

401
00:41:22,630 --> 00:41:31,949
deleting certificates, for valid reasons,
they are ever growing. Of course, nothing

402
00:41:31,949 --> 00:41:39,279
is forever, so what log operators do is
that they rotate the logs. So at a

403
00:41:39,279 --> 00:41:46,119
specific point in time, the log gets
frozen, the tree is then static, and there

404
00:41:46,119 --> 00:41:51,920
is another log entity, which is brought
online and used for including the newer

405
00:41:51,920 --> 00:41:58,069
certificates. Quite recently, Aviator from
Google got frozen.

406
00:41:58,069 --> 00:42:00,719
It contains 46 million certificates.

407
00:42:00,719 --> 00:42:09,060
Small drawback of freezing a
log: as long as one certificate in this

408
00:42:09,060 --> 00:42:16,279
log, in this tree, is still valid, this
log needs to be reachable. As soon as all

409
00:42:16,279 --> 00:42:22,680
the certificates have expired, it can
be dumped. But until then it has to be

410
00:42:22,680 --> 00:42:25,680
available for the proofs.

411
00:42:28,099 --> 00:42:34,529
One of the issues is that right now
there are just a few log operators.

412
00:42:34,529 --> 00:42:39,240
In the future, there should
be many more. Not hundreds of thousands of

413
00:42:39,240 --> 00:42:46,840
them, but maybe hundreds of them. And they
need to exchange information. Some form of

414
00:42:46,840 --> 00:42:53,460
log chatter should appear. The log
operators chatter with the clients to

415
00:42:53,460 --> 00:43:01,349
verify that they all see the same state of
the Merkle trees. And this has been

416
00:43:01,349 --> 00:43:08,940
published in a paper last year. Right now,
the idea is not yet at a level where they

417
00:43:08,940 --> 00:43:14,440
need to chatter, which we will soon see.
This happens when you create memes on the

418
00:43:14,440 --> 00:43:19,790
train. Usually, they are very bad memes.
This is apparently Gossip Girl, I've never

419
00:43:19,790 --> 00:43:24,579
seen it, but if you google gossip and
meme, ta-da!

420
00:43:24,579 --> 00:43:27,190
<i>laughter</i>

421
00:43:28,650 --> 00:43:33,219
Who now runs the logs? Who are the
entities who are actively running logs? Of

422
00:43:33,219 --> 00:43:37,650
course, Google is running the majority of
them. They proposed the entire thing, they

423
00:43:37,650 --> 00:43:43,970
wrote the code to run these things, and
they run the large, open-for-all

424
00:43:43,970 --> 00:43:50,369
certificate logs. Three of them are
currently open-for-all. Another one is for

425
00:43:50,369 --> 00:43:54,559
Let's Encrypt certificates, and another
one is for non Let's Encrypt certificates.

426
00:43:54,559 --> 00:44:00,470
Of course, Let's Encrypt issues a lot of
certificates, thankfully. So they

427
00:44:00,470 --> 00:44:05,119
separated that, apparently. If you read
the mailing list, they promise that these

428
00:44:05,119 --> 00:44:11,700
three open-for-all logs are separated
geographically and administratively. They

429
00:44:11,700 --> 00:44:21,170
are run by different entities, but they
all have the same boss, and it would be

430
00:44:21,170 --> 00:44:30,190
better if there were more open logs.
Symantec has one, Wosign, CNNIC. Every time

431
00:44:30,190 --> 00:44:34,410
Google detects that a fraudulent
certificate for google.com has been

432
00:44:34,410 --> 00:44:44,109
issued, those certification authorities
are mandated to run CT. Which is a good

433
00:44:44,109 --> 00:44:50,050
thing, I mean, public and everything.
Google has tens of millions of

434
00:44:50,050 --> 00:44:54,160
certificates. They really have an
open-for-all log, so everyone can push

435
00:44:54,160 --> 00:45:00,640
certificates in there. DigiCert, Symantec
is kind of big, but all the other nodes

436
00:45:00,640 --> 00:45:05,849
which are listed on the website, they have
a hundred-thousand-ish certificates, which

437
00:45:05,849 --> 00:45:14,320
is not that much compared to 50 million or
60 million. Right now, Google already

438
00:45:14,320 --> 00:45:22,359
mandates certificate transparency for
extended validation certificates, so if you

439
00:45:22,359 --> 00:45:28,160
not only see the green text up in the left
corner of your browser, but also some

440
00:45:28,160 --> 00:45:35,660
fancy name and big, big green whatever,
this is an EV cert. And Google mandates

441
00:45:35,660 --> 00:45:44,190
for EV certs to have two SCTs. Firefox is
in the process of including it, I think.

442
00:45:44,190 --> 00:45:53,450
Also, apparently, certificate transparency
works. Because, when Symantec issued this

443
00:45:53,450 --> 00:45:59,950
certificate for google.com they released a
report stating that they found 23 "test"

444
00:45:59,950 --> 00:46:06,910
certificates. Symantec said that it issued
23 test certificates. But the logs are

445
00:46:06,910 --> 00:46:12,970
public, anybody can query them. And within
seconds, you can see that Symantec issued

446
00:46:12,970 --> 00:46:20,839
another 164 certificates for other
domains, and also 2,500 certificates for

447
00:46:20,839 --> 00:46:29,260
non-existing domains. Just regarding
this one issue. I need to hurry, time is

448
00:46:29,260 --> 00:46:34,960
running out. Some of the downsides of
certificate transparency. Of course:

449
00:46:34,960 --> 00:46:40,799
privacy. People can learn your internal
hosts, so if you have a NAS for example, and

450
00:46:40,799 --> 00:46:46,289
this NAS is only reachable within your
LAN, and you want to get rid of the

451
00:46:46,289 --> 00:46:51,210
browser warning whenever you access the
interface of your NAS, you can get a Let's

452
00:46:51,210 --> 00:46:56,779
Encrypt certificate but since not only the
certificate is published, but also it's

453
00:46:56,779 --> 00:47:04,230
logged, people can see in the public log
file that there is, for your domain, a

454
00:47:04,230 --> 00:47:10,210
NAS. Also, log entries must contain the
entire chain up to a trusted root

455
00:47:10,210 --> 00:47:15,099
certificate, which excludes everything
which is self-signed, and everything which

456
00:47:15,099 --> 00:47:23,660
is DANE. DANE is for verifying TLS
certificates using DNSSEC. And since these

457
00:47:23,660 --> 00:47:30,150
two have no trusted root, they are currently
not working for certificate transparency.

458
00:47:30,150 --> 00:47:35,970
Now, of course you want to see the data.
You're gonna play around with this.

459
00:47:35,970 --> 00:47:42,849
Basically, what you can query, everything
is JSON. So, if you know JSON, you can

460
00:47:42,849 --> 00:47:52,769
work with certificate transparency. The
basic URL is like this. The URL is any log

461
00:47:52,769 --> 00:48:00,719
server, responds with the current root and
its signature, using this URL. Most

462
00:48:00,719 --> 00:48:05,180
interestingly, it gives you also the
number of certificates and the time stamp.

463
00:48:05,180 --> 00:48:11,740
It looks then like this. JSON, so you
have, this is the Aviator log from Google,

464
00:48:11,740 --> 00:48:18,759
which is now frozen. It has 46-something
million certificates, the hash value of

465
00:48:18,759 --> 00:48:28,109
the Merkle tree, and the signature. Also,
you can challenge the certificate logs

466
00:48:28,109 --> 00:48:35,339
with consistency proofs, where you have
two states of their tree, and the log has

467
00:48:35,339 --> 00:48:41,280
to prove that it did not modify anything
in between them. And of course, you can

468
00:48:41,280 --> 00:48:49,900
verify that a specific certificate is in the
tree with the second URL. And you can just

469
00:48:49,900 --> 00:48:54,940
push certificates there with a POST
request. So you push it, they send back

470
00:48:54,940 --> 00:49:00,859
the SCT, if you're the log operator, then
you would include this. Any website which

471
00:49:00,859 --> 00:49:10,799
right now is not using SCTs, all it takes is
a POST request. Nothing more. Some screens

472
00:49:10,799 --> 00:49:18,509
from the internals. This is for google.com
in the net internals view. What you can

473
00:49:18,509 --> 00:49:28,130
see is that the signed certificate timestamp,
the SCT, is received. It is valid. And

474
00:49:28,130 --> 00:49:33,180
compliance is checked. So this was for
google.com. And everything worked out.

475
00:49:33,180 --> 00:49:39,960
Last but not least, just to mention it,
Comodo operates a large search engine,

476
00:49:39,960 --> 00:49:50,229
crt.sh. There you can query public logs.
Also, Facebook recently added a monitor

477
00:49:50,229 --> 00:49:58,180
for certificates. So if you own a domain
name, and you use an entity which - no if

478
00:49:58,180 --> 00:50:04,739
you own a domain, you can get updates if
the certificate changes. They also monitor

479
00:50:04,739 --> 00:50:10,920
the public logs and as soon as, for
example, facebook.com uses a new

480
00:50:10,920 --> 00:50:19,579
certificate that is logged in CT, you can
get a notification for that. This is what

481
00:50:19,579 --> 00:50:23,619
it looks like. Remember, Facebook can also
send PGP-encrypted mails, then nothing

482
00:50:23,619 --> 00:50:31,790
leaks to anyone. This screenshot was
borrowed from Scott Helme. So, what's

483
00:50:31,790 --> 00:50:41,700
next? Just a few - One month ago, Google
announced that it will mandate certificate

484
00:50:41,700 --> 00:50:49,650
transparency from October 2017 on. So if
you run a website which is secured by TLS

485
00:50:49,650 --> 00:50:53,790
you might want to check before that date
whether or not your certification

486
00:50:53,790 --> 00:50:58,680
authority is using certificate
transparency. I would expect to have more

487
00:50:58,680 --> 00:51:07,049
logs and more certificates included in the
logs. In the far future, basically, the

488
00:51:07,049 --> 00:51:12,869
idea of transparency and this Merkle tree
is open for anything. You could put key

489
00:51:12,869 --> 00:51:17,759
management, software releases, anything in
there. The team at Google, they also

490
00:51:17,759 --> 00:51:24,779
built a prototype for that, called
Trillian, and described in the paper

491
00:51:24,779 --> 00:51:26,879
"Verifiable Data Structures".

492
00:51:26,879 --> 00:51:29,279
Before we
come to the end and questions,

493
00:51:30,569 --> 00:51:31,460
<i>laughter</i>

494
00:51:32,270 --> 00:51:33,140
<i>applause</i>

495
00:51:37,660 --> 00:51:41,579
There is a distinction. Of course, you
could solve this problem with blockchain

496
00:51:41,579 --> 00:51:49,930
as well. But a Merkle hash tree is much
more efficient, much more elegant. When I

497
00:51:49,930 --> 00:51:53,599
talked to a colleague on the train here,
he said, of course, you can just push the

498
00:51:53,599 --> 00:51:57,539
log into the blockchain.
Yeah, not the same thing.

499
00:51:58,309 --> 00:51:59,539
Thank you!

500
00:51:59,979 --> 00:52:00,979
<i>applause</i>

501
00:52:10,769 --> 00:52:13,899
Herald: Thank you Martin for a very
interesting talk! We have a few more

502
00:52:13,899 --> 00:52:17,890
minutes left for Q&A, so if you have a
question, please line up next to the

503
00:52:17,890 --> 00:52:24,390
microphones, and ask your question.
Remember: a question has a question mark

504
00:52:24,390 --> 00:52:29,840
at the end. Also, if you're exiting,
please do so silently and from the front

505
00:52:29,840 --> 00:52:34,650
door, thank you. I think we have a
question over there:

506
00:52:43,150 --> 00:52:55,789
Q: Can you recommend some libs or software
where I can accomplish the TLS handshake

507
00:52:55,789 --> 00:53:02,190
from the client side, so I can get the
SCT, via TLS extension, via OCSP

508
00:53:02,190 --> 00:53:07,039
extension, via the inherited
pre-certificate SCT?

509
00:53:07,039 --> 00:53:14,920
M: Not by heart. I mean, if it's part of
the TLS certificate, anything will do, OpenSSL,

510
00:53:14,920 --> 00:53:21,589
whatever, it's just a field. Same as for
OCSP, so anything that does OCSP will

511
00:53:21,589 --> 00:53:25,410
include it, it's just that clients that do
not know the extension will just not -

512
00:53:25,410 --> 00:53:31,989
they will ignore it. But anything that
does OCSP or SSL handshake will work.

513
00:53:35,229 --> 00:53:37,029
H: Thank you. Question from this microphone.

514
00:53:37,029 --> 00:53:42,210
Q: Hello, thank you very much for the nice
talk. Do you know how much space is needed

515
00:53:42,210 --> 00:53:45,070
to store all the logs currently?

516
00:53:45,070 --> 00:53:54,009
M: I had the same question, but
unfortunately not. What they store is the

517
00:53:54,009 --> 00:54:02,009
tree, and they store the entire chain,
excluding the root certificates. So,

518
00:54:02,009 --> 00:54:09,700
probably two, three, four certificates per
entry, which is like - I think you can buy

519
00:54:09,700 --> 00:54:17,969
at a regular electronics store a hard drive
which is able to fit a lot of those entries.

520
00:54:20,199 --> 00:54:21,739
H: Next question from that mic.

521
00:54:21,739 --> 00:54:27,650
Q: Yeah, thank you for the talk. Why do
you need two SCTs for extended validation?

522
00:54:27,650 --> 00:54:36,170
M: Because a single entity might cheat. So
it's like - even though you can detect it,

523
00:54:36,170 --> 00:54:40,940
there's still a timeframe left. And if you
have two SCTs, which are operated

524
00:54:40,940 --> 00:54:45,919
independently, the idea is it's not that
likely that the two will collaborate

525
00:54:45,919 --> 00:54:48,239
to make a certificate disappear.

526
00:54:48,239 --> 00:54:50,019
Q: Thanks!

527
00:54:50,019 --> 00:54:51,499
H: That microphone, yes.

528
00:54:51,499 --> 00:54:55,229
Q: I'm actually a bit surprised, because
Google has been pushing for making the

529
00:54:55,229 --> 00:55:00,209
server HELLO as small as possible, and of
course, this is increasing the server

530
00:55:00,209 --> 00:55:06,839
HELLO with, in this case, an SCT, and of
course, they are also doing OCSP stapling,

531
00:55:06,839 --> 00:55:11,469
so that makes it even bigger. And this is
like a SHA256, so we're talking 256 bits

532
00:55:11,469 --> 00:55:15,690
there, plus another one you said that, you
know, one is not enough. Actually I've

533
00:55:15,690 --> 00:55:19,459
never seen one that has more than one SCT.
Have you?

534
00:55:22,749 --> 00:55:23,580
M: No.

535
00:55:23,580 --> 00:55:24,010
<i>laughter</i>

536
00:55:24,100 --> 00:55:25,390
Not yet.

537
00:55:25,390 --> 00:55:26,589
Q: I've looked around, but nothing.

538
00:55:26,589 --> 00:55:27,710
M: Yeah.

539
00:55:27,710 --> 00:55:31,580
Q: It's actually increasing the size. And
I'm just wondering, where is this going.

540
00:55:31,580 --> 00:55:39,319
Are we just gonna eat the costs of having
all these SCTs and OCSP stapling? Are we

541
00:55:39,319 --> 00:55:40,319
prepared to eat that cost?

542
00:55:40,319 --> 00:55:46,609
M: I think the cost is small compared to
the gain you get by HTTP/2. So if you pipe

543
00:55:46,609 --> 00:55:52,029
everything through one single connection, I
think it's not that bad of a cost anymore. But

544
00:55:52,029 --> 00:55:57,319
of course, this is a policy thing. To
require a certain amount of SCTs, to

545
00:55:57,319 --> 00:56:01,849
prevent fraudulent CAs.

546
00:56:01,849 --> 00:56:07,859
Q: Is the idea that this will replace
something like the SSL observatory, where

547
00:56:07,859 --> 00:56:13,900
browsers send in certs they see, and then
- you nodded, so I assume yes. And then

548
00:56:13,900 --> 00:56:18,589
also, how does this work for people who
can't have their certs be public?

549
00:56:18,589 --> 00:56:21,359
For people who are like issuing
things for internal networks?

550
00:56:21,359 --> 00:56:27,329
M: If you can't have the certificate
public, probably the better way right now

551
00:56:27,329 --> 00:56:33,650
is to have a certification authority which
is not using CT. In the future, it makes

552
00:56:33,650 --> 00:56:39,930
it much more expensive to operate your own
CA, incorporate it in the trust stores.

553
00:56:39,930 --> 00:56:43,969
But of course, this is costly. You have to
sign the certificate and everything.

554
00:56:43,969 --> 00:56:52,180
Q: But if like in October 2017, when
Chrome rejects all certs that don't have

555
00:56:52,180 --> 00:56:54,470
signed timestamps like what do I do?

556
00:56:56,570 --> 00:56:57,579
M: Use Edge.

557
00:56:58,209 --> 00:57:00,369
<i>laughter</i>

558
00:57:01,949 --> 00:57:06,670
I'm sure you can disable it somehow,
but it's <i>blerg</i>.

559
00:57:08,470 --> 00:57:15,949
Q: What about if someone tries SCT with
DHT or another system?

560
00:57:15,949 --> 00:57:18,169
Not blockchain, of course!

561
00:57:18,169 --> 00:57:21,289
Is it possible to do that without
central authorities?

562
00:57:21,289 --> 00:57:24,440
M: Sorry, say again?

563
00:57:24,440 --> 00:57:31,670
Q: My English is very bad, I'm sorry. I
asked, is it possible to do that without

564
00:57:31,670 --> 00:57:36,799
some central authority, like Google
or over SCT, but

565
00:57:36,799 --> 00:57:41,409
with a distributed hash table,
like DHT technologies?

566
00:57:41,409 --> 00:57:42,739
M: Yes, yes, of course.

567
00:57:42,739 --> 00:57:47,290
Q: And are there existing implementations?

568
00:57:47,290 --> 00:57:53,079
M: For the centralized thing, yes. Not for
the distributed thing. But I think it's

569
00:57:53,079 --> 00:58:00,269
just adding a layer of DHT on top of it.
So I'm sure you can think of a browser

570
00:58:00,269 --> 00:58:06,039
extension which uses the DHT to obtain
SCTs. But right now it's just purely

571
00:58:06,039 --> 00:58:08,039
centralized. But the source is open.

572
00:58:08,039 --> 00:58:09,229
Q: OK, thank you.

573
00:58:10,669 --> 00:58:15,369
Q: I was just curious how it works if you
have a certificate which gets revoked, in

574
00:58:15,369 --> 00:58:19,930
context of the tree. Especially if the
tree is frozen. So how does this work?

575
00:58:19,930 --> 00:58:24,859
How do you revoke a certificate with a
tree, and then how does it work if it's

576
00:58:24,859 --> 00:58:26,690
frozen already.

577
00:58:26,690 --> 00:58:37,339
M: Good question! The goal of CT is not
- it's not about revocation. So the

578
00:58:37,339 --> 00:58:43,900
revocation path is taken as usual: you
ask OCSP. It's independent of the

579
00:58:43,900 --> 00:58:48,019
revocation thing. It's just publicly
saying that this certificate has been

580
00:58:48,019 --> 00:58:56,789
issued. So removing a certificate from the
tree, which has been revoked, is

581
00:58:56,789 --> 00:59:01,390
not part of the specification. This is not
the use case. It's just logging the

582
00:59:01,390 --> 00:59:03,089
certificates which have been issued.

583
00:59:03,089 --> 00:59:07,950
Q: But if you audit all the logs, and you
want to know if something is, like going

584
00:59:07,950 --> 00:59:11,380
on that shouldn't be going on, wouldn't
you want to know whether the certificate

585
00:59:11,380 --> 00:59:12,650
has been revoked at some point?

586
00:59:12,650 --> 00:59:20,279
M: Yes, but not in the logs. The logs are
just to prove that the CA has issued this

587
00:59:20,279 --> 00:59:26,640
certificate, and to prove that the log has
correctly logged it. Revocation is

588
00:59:26,640 --> 00:59:32,680
different. Usually it's OCSP stapling with the
CA, but that's a different channel. So

589
00:59:32,680 --> 00:59:34,760
this is not for Certificate Transparency.

590
00:59:34,760 --> 00:59:36,520
Q: Thank you!

591
00:59:36,520 --> 00:59:38,789
H: That's all the time we have for Q&A.

592
00:59:38,789 --> 00:59:41,479
Big round of applause again for
Martin for a great talk!

593
00:59:41,479 --> 00:59:42,859
<i>applause</i>

594
00:59:43,339 --> 00:59:45,599
<i>postroll music</i>

595
00:59:45,599 --> 01:00:08,000
subtitles created by c3subtitles.de
in the year 2017. Join, and help us!