32C3 preroll music
Herald: I welcome you to “Tor Onion Services – more useful than you think!” This talk is presented by George, who is a core developer of Tor and also a developer of the Hidden Services; by David Goulet, who is a developer for the Tor Hidden Services; and by Roger, who is a founder of the Tor Project, an MIT graduate, and someone Foreign Policy magazine has named one of the Top 100 Global Thinkers. I think that speaks for itself.
applause
Today we will hear examples of Hidden Services for really cool use cases, but we will also hear about security fixes that make Hidden Services even safer and stronger for all of us. The stage is free for “Tor Onion Services – more useful than you think!”
applause
Roger: Great!
Hi, I’m going to pause while
we get our slides up, I hope…
Hopefully that will be a quick
and easy event – perfect!
Okay, so. Hi, I’m Roger,
this is George, this is David.
We’re going to tell you about Tor Hidden Services, or Tor Onion Services. They’re basically synonyms: originally they were called Tor Hidden Services, because the original idea was that you hide the location of the service; then a lot of people started using them for other features, other security properties, so we’ve been shifting to the name Onion Services. So we’ll switch back and forth in what we call them.
So, a spoiler before we start.
This is not a talk about the dark web,
there is no dark web.
It’s just a couple of
websites out there that are protected
by other security properties.
So we’ll talk a lot more about that. You can think of it as: HTTPS is a way of getting encryption and security when you’re going to a website, and .onion is another way of getting encryption and security when you’re going to a website. Journalists like writing articles about a huge iceberg where 95% of the web is the dark web. That’s nonsense! So we’re going to try and tell you a little bit more about what Hidden Services and Onion Services actually are, who uses them, and what things they can do with them.
How many people here know quite a bit about how Tor works and what Tor is and so on? I’m hoping everybody raises their hand!
Awesome, okay. We can
skip all of that, perfect!
So Tor is a US non-profit, it’s open source, it’s a network of about 8000 relays. One of the fun things about Tor is the community of researchers, developers, users, activists, and advocates all around the world. Every time I go to a new city, there’s a research group at the university who wants me to teach them what the open research questions are and how they can help improve Tor.
So, the basic idea is,
you have your software,
it pulls down a list of the 8000 relays,
and it builds a path through 3 of them
so that no single relay gets to learn
where you’re coming from
and where you’re going.
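That path-building idea can be sketched as a toy in code. This is not Tor’s real cryptography (real circuits negotiate keys with Diffie-Hellman and encrypt with AES); the XOR cipher and the 16-byte next-hop field here are stand-ins, just to show how each relay peels one layer and learns only the next hop:

```python
import itertools

def xor(data: bytes, key: bytes) -> bytes:
    """Stand-in cipher (NOT Tor's real crypto): XOR with a repeating key."""
    return bytes(b ^ k for b, k in zip(data, itertools.cycle(key)))

def build_onion(message: bytes, path, dest: str) -> bytes:
    """Wrap a message in one encryption layer per relay, innermost last hop.
    path is a list of (relay_name, key); dest is the final destination."""
    next_hops = [name for name, _ in path[1:]] + [dest]
    cell = message
    for (_, key), nxt in zip(reversed(path), reversed(next_hops)):
        # Each layer carries only the next hop (16-byte field) plus payload.
        cell = xor(nxt.encode().ljust(16, b"\0") + cell, key)
    return cell

def relay_peel(cell: bytes, key: bytes):
    """What one relay does: strip its layer, learn the next hop, forward."""
    plain = xor(cell, key)
    return plain[:16].rstrip(b"\0").decode(), plain[16:]

# The client knows all three keys; each relay knows only its own.
path = [("guard", b"key1"), ("middle", b"key2"), ("exit", b"key3")]
cell = build_onion(b"GET /", path, "bob.example")
for _, key in path:
    hop, cell = relay_peel(cell, key)
    print(hop)  # guard sees "middle", middle sees "exit", exit sees "bob.example"
```

The point of the nesting is exactly what the slide shows: the first relay learns only where you came from and the name of the second relay, while the last relay learns only the destination and never who you are.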
And you can see up at the top here, there is a .onion address. So basically, with Hidden Services, or Onion Services, in your Tor Browser you go to an alternate type of domain name, one that ends in .onion, and then you end up at that website. So here’s an example of a riseup.net website, which we are reaching using the onion address for it rather than black.riseup.net.
Okay, so, I talked about the building block before, how you use Tor normally to build a 3-hop circuit through the network. Once you have that building block, you can glue two of them together. So you’ve got Alice over here, connecting into the Tor network, and you’ve got Bob, the website, connecting into the Tor network, and they rendezvous in the middle. So Alice is getting her anonymity, her 3 hops inside Tor, Bob is getting his anonymity, his 3 hops inside Tor, and they meet in the middle.
So Alice doesn’t know where Bob is,
Bob doesn’t know where Alice is,
and the point in the middle
doesn’t know either of them,
yet they can reach each other, and
get some cool security properties.
So, some of these cool
security properties:
One of the really cool ones is that the .onion name that you saw – that big pile of 16 base32 characters – is the hash of the public key of the Onion Service; that hash is the onion address.
So they’re self-authenticating, meaning
if I have the right onion address,
I can be sure that I’m
connecting to the website,
to the service, that’s
associated with that key.
So I don’t need some sort
of Certificate Authority model
where I trust Turkish
Telecom to not lie to me.
It’s all built-in, self-authenticating,
I don’t need any external resources
to convince myself that I’m
going to the right place.
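Concretely, for the 16-character (version 2) addresses being described here, the onion name is the first 80 bits of the SHA-1 hash of the service’s DER-encoded RSA public key, base32-encoded. A minimal sketch, with stand-in bytes where the real DER-encoded key would go:

```python
import base64
import hashlib

def onion_address_v2(der_public_key: bytes) -> str:
    """v2 onion name: base32 of the first 10 bytes (80 bits) of
    SHA-1 over the service's DER-encoded RSA public key."""
    digest = hashlib.sha1(der_public_key).digest()
    return base64.b32encode(digest[:10]).decode().lower() + ".onion"

# Stand-in key bytes; a real service hashes its actual public key.
print(onion_address_v2(b"not-a-real-DER-key"))  # 16 base32 chars + ".onion"
```

Because the name is derived from the key itself, anyone holding the right address can verify they reached the matching key, with no certificate authority involved.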
Along with that, they’re end-to-end encrypted. So I know that nobody between my Tor client and the Tor client on the service side is able to read, or intercept, or man-in-the-middle the traffic.
So there are some other
interesting features also,
one of them is the NAT punching feature.
If you offer an Onion Service,
there’s no reason to allow
incoming connections to it.
So I can run an Onion Service
deep inside the corporate firewall,
or behind Comcast’s firewall,
or wherever I want to,
and people are able to reach it.
So there are a lot of people from
the systems administration side
who say: “I’m going to offer an Onion
address for my home SSH server,
and now the only way that I can connect back into my home box is via the Tor network.
I get end-to-end encryption,
I get self-authentication,
and there’s no other way in.
I just firewall all incoming connections
and so the only surface area
that I expose to the world
is, if you’re using my onion
address, you reach my SSH port.
I don’t allow any other
packets in of any sort.”
So that’s a cool example
of how security people
use Onion Services.
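That SSH-over-onion setup takes only a couple of torrc lines on the server. The directory path below is just the conventional example; Tor generates the onion hostname and keys inside that directory:

```
# torrc on the home box: expose the local SSH daemon as an onion service
HiddenServiceDir /var/lib/tor/ssh_onion/
HiddenServicePort 22 127.0.0.1:22
```

The client then connects through Tor, for example with torsocks ssh user@&lt;generated-address&gt;.onion, while every other inbound port stays firewalled.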
George: So, hello. We have some statistics to show you, to give you an idea of the current maturity of the system.
We got these statistics by asking
relays to send us information
about the Hidden Service
activity they see.
Only a small fraction of relays
is reporting these statistics,
so we extrapolate
from this small fraction.
So that’s why these statistics can
have lots of ups and downs,
and noise, and everything, but anyway,
they can give you a basic idea.
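The extrapolation itself is simple scaling. This sketch is a simplification of what the tech report actually does (the real estimator weights by each reporting relay’s share of network capacity), and the numbers are made up:

```python
def extrapolate(reported: float, reporting_fraction: float) -> float:
    """Scale a value seen by a fraction of relays up to a network-wide
    estimate, assuming the reporting relays are a representative sample."""
    return reported / reporting_fraction

# e.g. if relays carrying 4% of the network report seeing 1,200 onion
# services, the network-wide estimate is 30,000:
print(extrapolate(1200, 0.04))  # → 30000.0
```

The smaller the reporting fraction, the more this amplifies sampling noise, which is why the graphs have the ups and downs mentioned above.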
So, this first statistic is the number of Hidden Services on the network, and you can see that it’s about 30,000 Hidden Services, give or take, which is a pretty small number if you compare it to the whole Internet; it’s basically in the early adoption stages.
And we also have
another statistic, this one,
which is the traffic that the Hidden
Services are generating, basically.
On the top, you can see the total traffic that the whole network is pushing. It’s about, I don’t know, 60,000 megabits per second, and the bottom graph is the Hidden-Service-specific traffic, and you can see that it’s like 1,000 megabits per second – a very small fraction, basically. So, Hidden Services are still a very small part of Tor. And,
if those numbers are hard to interpret: we did some calculations and stuff, and we have this new figure for you, which is that basically 5% of client traffic is Hidden Services. Out of all of Tor, 5% is Hidden Services, basically. Make of that what you will.
So, we did this whole thing about a year ago, and we spent lots of time figuring out how to collect statistics, how to get from the values themselves to those graphs, and how to obfuscate the statistics in such a way that we don’t reveal any information about any clients. We wrote a tech report about it, which you can find at this link if you’re interested in looking more at how the whole thing works, and we even wrote a proposal – if you google for Tor Project proposal 238, you can find more information. And, yeah, that’s it.
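The obfuscation step works roughly like this: each relay bins its count and adds Laplace noise before reporting, so no individual client’s activity is recoverable from a single report. A sketch in the spirit of proposal 238 – the bin size and noise scale below are illustrative, not the deployed parameters:

```python
import math
import random

def obfuscate_count(true_count: int, bin_size: int = 8,
                    noise_scale: float = 11.0) -> int:
    """Round the count up to a bin boundary, then add Laplace noise."""
    binned = math.ceil(true_count / bin_size) * bin_size
    # Sample Laplace(0, noise_scale) by inverse transform.
    u = random.random() - 0.5
    if u <= -0.5:  # guard the log(0) corner case
        u = -0.499999
    noise = -noise_scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return binned + round(noise)
```

Each relay reports only the obfuscated value; the aggregation step then averages the noise out across many reports.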
Roger: Okay, so, how did
this whole thing start?
We’re going to go through a
couple of years at the beginning.
In 2004, I wrote the original
Hidden Service code,
and I basically wrote it as a toy.
It was an example: “We have this thing called Tor; use it as a building block, look what we can do! We can connect two Tor circuits together, and then you can run a service like this.”
Basically nobody used it for a few years.
One of my friends set
up a hidden wiki where,
if you run an Onion Service,
then you can go to the wiki,
and sign up your address
so that people can find it.
And there were some example services.
But for the first couple of years,
it basically wasn’t used,
wasn’t interesting.
The first really interesting use
case was the Zyprexa documents.
So this was in 2005, 2006.
There’s a huge pharmaceutical
company called Eli Lilly
and they have an antipsychotic
drug called Zyprexa
and it turns out that it was
giving people diabetes
and harming them, and killing
them, and they knew about it.
And somebody leaked 11,000 documents onto the Internet
showing that this drug
company knew about the fact
that they were harming their customers.
And of course the drug company sent
a cease and desist to the website,
and it went away, and it
came up somewhere else,
and they sent a cease and desist,
and it was bouncing around,
and suddenly somebody set up a Tor Hidden Service with all of the documents,
and Eli Lilly had no idea how to send
a cease and desist to that address,
and a lot of people were able to read
the corruption and problems
with this drug company.
So that was… on the one hand, yay!
applause
On the one hand, that’s really cool. Here
we are, we have a censorship-resistant
privacy thing, somebody
used it to get information out
about a huge company that
was hurting people, great!
On the other hand, it set us up so that, ever after, people looked at Hidden Services and said:
“Well, how do I find a document
that some large organization’s
going to be angry about? I’m going
to set up a website for leaking things,
I’m gonna set up a website for something
else that the Man wants to shut down.”
So the first example of
Hidden Services pointed us
in a direction where, after
that, a lot of people
thought that that’s what
Hidden Services were about.
So, that leads to the next year,
Wikileaks set up a Hidden Service
for their submission engine,
and it’s not that they wanted to
hide the location of the server.
The server was in Sweden, everybody
knew the server was in Sweden.
But they wanted to give extra security
to users who were trying to get there.
One of the really interesting properties
that they used from Hidden Services
is the fact that if you
go to the .onion site
from your normal browser,
it totally doesn’t work.
And this was a security feature for them.
Because they wanted to make
sure that if you’re a leaker,
and you’re doing it wrong,
you’re configuring things wrong,
then it totally fails from the beginning.
They wanted to completely remove the chance that you accidentally think that you’re using Tor correctly and being safe when actually you screwed up and you’re not using Tor.
So they wanted to use Onion Services
as another layer of security for the user,
to protect the user from screwing up.
Now fast forward a couple of more years,
there’s another organization in Italy
called GlobaLeaks, where they’ve
set up basically a mechanism where,
if you have something you
want to share with the world,
then you can be connected to a journalist
through this GlobaLeaks platform.
And they actually have been
going around to governments,
convincing them to set
up GlobaLeaks platforms.
So they’ve gone to the Italian government,
they’ve gone to the Philippine government,
and basically they say:
“Look, this is a way for you
to report on corruption,
to hear about corruption
inside your country.” Now, if you
go to a government, and you say:
“I hear there is corruption,
here’s a way to report on it.”
not everybody in the government
will be happy with that.
But one of the features is,
you can very easily say:
“Can you help me set up an
anti-corruption whistleblowing site
for the country next door? I would be
happy to… you know they’ve got corruption,
so how about they provide
the corruption site?”
applause
So it’s really cool that GlobaLeaks
is playing the political game,
trying to demonstrate that making
these things public is worthwhile.
And, of course, here’s a picture of a
cute cat, we have to have one of those,
and WildLeaks is a really
good example of a positive,
I mean, this is a way where if you
see somebody killing a rhinoceros
or elephant or something in
Africa, and you know about it,
upload it to WildLeaks, and
then they can learn more
about poaching and
extinction events and so on.
So, it’s hard to argue with anti-poaching,
anti-corruption sites like that.
And that moves us to SecureDrop,
there’s a group in the US
that is working on another
example of how to connect
people with interesting information
to journalists who want to write about it.
And they’ve actually connected
with the New Yorker and a lot of
high-profile newspapers,
to be able to provide a way for people
to securely provide information
to those journalists. And they say that
it has been used in high-profile events,
and they won’t tell us which
events, which is great!
That’s exactly how it’s
supposed to work.
applause
David: Hello. So, continuing
our timeline here,
this very cool thing happened
in 2014, where Aphex Twin,
this electronic experimental guy,
released his album Syro through
an onion address on Twitter,
and he got 4.000 Retweets.
So we encourage you guys
to consider this method
of releasing all your stuff,
and the complementary ways to
release it would be the open web.
So, onion addresses.
Following that, we got Blockchain.info, recently, in 2014, let’s say two years ago. They discovered that some exit nodes – malicious exit nodes – were rewriting Bitcoin addresses when you were using Tor. So, for security reasons, they changed things: if you come to blockchain.info from Tor, they tell you to use the onion address, so you get all the fancy properties of end-to-end encryption, and so on, and so forth.
Still today, we know that malicious exit nodes exist, and they do rewrite Bitcoin addresses. Don’t be alarmed, it’s not like HAL3000; the thing is that we at the Tor Project are actively monitoring the network at the exit nodes for these kinds of craziness. And we need more help from everyone, from the community, to find those, so we can block them and remove them. So fuck those. Fuck those guys.
And Blockchain took action
with Onion Services. So, great.
Roger: And Facebook set up a
Hidden Service recently as well,
an onion address for their website.
So, the first thing many of
you might be thinking is:
“Wait a minute, I don’t understand,
Facebook is a website on the Internet,
why do they need a Hidden Service,
why do they need an onion address?”
So, the first answer is, they worry
about users in interesting countries.
Say you’ve got a Facebook user in
Turkey or Tunisia or something like that,
and they try to go to Facebook,
and the local DNS server lies to them
and sends them somewhere else,
or Turkish Telecom, which is a certificate authority that everybody trusts, ends up pretending to be Facebook and man-in-the-middles them. Now there’s certificate pinning and other challenges like that, and maybe those are good starts. But wouldn’t it be cool just to skip the whole certificate authority infrastructure and say “Here’s an address”, where if you go to this in your Tor Browser, you don’t have to worry about BGP hijacking, you don’t have to worry about certificate authorities, you don’t have to worry about DNS; it’s all inside the Tor network, and it takes care of the security properties I talked about before.
So, that’s a really cool
way that they can switch.
I was talking to one of the
Facebook people earlier.
He doesn’t want me to tell the number of users who are using Facebook over Tor, but it’s many hundreds of thousands. It’s a shockingly high number of users. So, wouldn’t it be cool if we could switch many of those users from connecting to facebook.com over Tor to connecting to Facebook’s onion address,
and then reduce the
load on the exit relays,
so that it’s faster and
easier and scales better
for the people connecting to websites
that aren’t onion addresses?
So, I was thinking about
this at the very beginning
and I was thinking: “Wait
a minute, I don’t get it,
Facebook has an onion address,
but they have a real address,
why do we need the other one?”
And then I was thinking back.
So, you remember 10 years ago,
when people were running websites
and the administrator on the website said:
“I don’t need to offer HTTPS for
my website, because my users…”
and then they had some bullshit excuse
about how their users didn’t need security
or didn’t need encryption,
or something like that.
And now, 10 years later, we all think
that the people saying: “I don’t
need HTTPS for my website”…
we think they’re greedy and
short-sighted, and selfish,
and they’re not thinking
about their users.
I think the Onion Service thing
is exactly the same thing.
Right now, there are
plenty of people saying:
“I already have HTTPS, I don’t need
an onion address for my website
because my users…” and then
they have some lame explanation.
So hopefully in a couple of years,
it will be self-evident to everybody
that users should be the ones to choose
what sort of security
properties they want.
It shouldn’t be about what the
website thinks the user should have.
I should have the choice
when I’m going to Facebook.
Do I go to the HTTP version,
do I go to the HTTPS version,
do I go the onion version?
It should be up to me to
decide what my situation is
and get the security
properties that I want.
The other challenge here: I talked to some researchers a while ago who said: “I found a copy of Facebook on the dark web”,
and I was thinking: “Wait a minute,
you didn’t find a copy of
Facebook on the dark web,
there’s a mechanism for securely
getting to the website called Facebook,
and it’s called Onion Services.
There’s no separate dark web,
it’s about transport encryption,
it’s about a way of reaching
the destination more safely.”
One of the other really cool things,
Facebook didn’t just set
up an onion address,
they got an HTTPS certificate
for their onion address.
They got an EV cert,
the kind that shows you
the green little bar that says:
“This is Facebook”
for their onion address.
They went to DigiCert, and DigiCert gave them an SSL certificate
for their onion address, so now you can
get both of them at once. Which is
an amazing new step that we hadn’t
even been thinking about at the time.
So, what does this give them?
Why is this valuable?
One of them is, on the browser side,
when you’re going to an HTTPS URL,
the browser knows to
treat those cookies better,
and to not leak certain things,
and there’s all sorts of security
and privacy improvements
that browsers do when you’re going there.
And this way we don’t have to teach the browser that “if it’s HTTPS or .onion, then be safe”.
The other nice thing, on the server side,
Facebook didn’t have to change anything.
This is another way of reaching
the Facebook server.
That’s all there is to it.
And then, another cool thing:
It turns out that the only way
to get a wildcard EV certificate
is for an onion domain.
It’s actually written into, like,
the certificate authority world,
that there is a grand exception
for onion addresses.
You can’t get a wildcard EV cert
unless it’s for an onion address.
So this is super duper
endorsement of Onion Services
from the certificate authority people.
applause
But let’s take a step even further.
Wouldn’t it be cool if we took the Let’s Encrypt project and they bundled a Tor client in with each web server that’s using the Let’s Encrypt system?
So every time you sign up for Let’s
Encrypt, you also click the button
saying: “And I want an onion
address for my website”,
and they automatically,
in the same certificate…
you get one for riseup.net,
and as an alternate name,
it’s blahblahblah.onion.
It’s in the same certificate.
So users can go to your website directly
or they can go there
over the onion address
and either way you
provide the SSL certificate
that keeps everybody safe.
Wouldn’t it be cool if every time
somebody signs up for Let’s Encrypt,
they get an onion address
for free for their website,
so that everybody can choose how
they want to reach that website?
applause, cheering
Now, there are a few problems
with that. One of the big ones is,
we want some way of binding
the riseup.net address
to the onion address,
so that when I go to riseup.net
I know that I’m going to
the correct onion address.
So we need some way to vouch for them
and connect them through
signatures or something.
It can be done, but somebody
needs to work out the details.
The other policy barrier is, right now, the certificate authority people say you cannot get a DV cert – the normal kind of cert – for an onion address. You can only get an EV cert.
And Alec over here is leading the charge
to convince them that that makes no sense,
so hopefully in the next couple
of years, with all of your help,
they will realize that onion addresses are
just like all the other
addresses in the world.
Which leads to another really
cool feature from this year.
We got IETF to publicly
specify, in a real RFC,
that the .onion domain is a special case,
and they’re not going to give
it out in any other way. So…
applause
yeah!
applause
So the first effect here is
that we have actual approval
of Onion Services from the IETF
and other standards committees.
But the second effect, which
is a second-order effect, is,
now we can go to the browsers,
and the DNS resolvers,
and say: whenever you see a .onion name being resolved, cut it right there, because you know that it’s not going into Tor, and you know that it shouldn’t go out onto the network.
So now, when you’re in your
normal Internet Explorer,
and you accidentally click
on an onion address,
Internet Explorer knows
that that’s a local address,
that shouldn’t go out onto the network.
So we can keep people
safer, in ordinary browsers
that otherwise wouldn’t
even care that we exist.
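What that RFC (RFC 7686) asks of resolvers is easy to express in code: recognize .onion as special-use and refuse to send it to DNS. A sketch of the check – the resolve function is a hypothetical stand-in, not any browser’s real API:

```python
def is_onion(hostname: str) -> bool:
    """True if the name is under the special-use .onion TLD (RFC 7686)."""
    labels = hostname.lower().rstrip(".").split(".")
    return labels[-1] == "onion"

def resolve(hostname: str) -> str:
    if is_onion(hostname):
        # RFC 7686: never forward .onion queries to public DNS.
        raise LookupError(hostname + ": .onion is special-use, not resolved")
    return "ordinary DNS lookup goes here"  # placeholder
```

Failing closed like this is what keeps an accidental onion click in a non-Tor browser from leaking the name onto the network.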
George: OK, so, so far we’ve been talking
about websites and Hidden Services,
but this is not all that
Hidden Services can do.
Basically, you can do any sort of TCP
thing you want to do over Hidden Services.
We’re going to show you a few
examples of third-party applications
that have been developed
for Hidden Services
and do various interesting things.
First of all, OnionShare is a file transfer application where you basically download this thing, then you feed it a file, and it exposes an HTTP server at an onion address that hosts your file. You can give that URL to your friends, and they can just put it in their Tor Browser and download the file easily. It’s quite convenient, nicely made, and I think various organizations, like The Intercept and others, are using it to transfer files internally. And it works fine so far, as far as we know.
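The core of a tool like that is small: a local HTTP server that hands out one file, which Tor then exposes at an onion address via a HiddenServicePort line pointing at it. A minimal sketch of the local half – this is not OnionShare’s actual code, and the payload bytes are a stand-in:

```python
import http.server
import threading

PAYLOAD = b"pretend this is the file being shared"  # stand-in content

class OneFileHandler(http.server.BaseHTTPRequestHandler):
    """Answer every GET with the same payload, like a one-file share."""
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, *args):
        pass  # keep the sketch quiet

def start_server(port: int = 0) -> http.server.HTTPServer:
    """Bind to localhost only; Tor forwards onion traffic to this port."""
    server = http.server.HTTPServer(("127.0.0.1", port), OneFileHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A torrc line such as HiddenServicePort 80 127.0.0.1:8080 would then map the onion address onto this local server, and the recipient just opens the onion URL in Tor Browser.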
Then, Hidden Services are quite good for doing messaging, because both sides are anonymous, which gives a nice twist when you talk to some person. One way to do so is Ricochet, which is an application that allows you to talk one-to-one to other people.
It’s decentralized, because
it works over Hidden Services.
That’s actually quite useful,
because, for example,
a few months ago, the Jabber CCC
server got shut down for a few days
and you couldn’t talk
to anyone, basically,
but if you used Ricochet, you were fine, because they can shut down the CCC server, but they probably can’t shut down the whole Tor network so easily.
It also has a nice slick UI,
which is quite refreshing
if you’re used to the usual
UIs of the open-source world.
Anyway, you can download it from that website there.
And then there is Pond,
which is a more experimental
avant-garde messaging application,
which is basically a mix between
messaging and mix nets.
It basically uses a server which
delays your messages and stuff,
which makes it much harder
for a network adversary
to know when you’re sending
or receiving messages,
because you also send chaff,
and fake traffic and stuff.
It’s super-experimental,
the author doesn’t even
want us to really endorse it,
but information is free and
you can visit that website
to learn more about it.
David: Hello. So, for many years now, there have also been plenty of services and tools. George just showed us some tools, but there are also services like Jabber, SMTP, and IMAP from the Riseup folks – and not only Riseup, but also Systemli, Autistici, and the Calyx Institute for Jabber. And more and more Jabber servers right now are federating over Tor through the onions, for server-to-server and also client-to-server connections. This has been around for a long time, and it serves many, many, many users. I think Riseup has more than 30,000 users on their Jabber servers.
Another neat thing, very recently: Debian made their package repository available as an onion service, and now you can use an onion address to just update your Debian system. You use this amazing package, apt-transport-tor, and, hop!, you can update everything through an onion address; it will detect it automatically.
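With apt-transport-tor installed, switching a Debian box to the onion mirror is a one-line sources.list change. The onion address below is a placeholder, not the mirror’s real published address:

```
# /etc/apt/sources.list – fetch packages over Tor via the onion mirror
# (placeholder address; substitute the one the mirror actually publishes)
deb tor+http://exampleonionmirror.onion/debian stable main
```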
But then, there’s also much more that happened recently. The GPG key servers also exist at an onion address. So you can update your GPG key, download a signature, and so on, and so forth, which in a way hides your social graph from global observers – because we know they exist – since you’re in an end-to-end encrypted channel. Of course the GPG servers themselves can still see it, that’s true, but at least on the wire it’s hidden. Very nice.
Now, DuckDuckGo, of course: they have Jabber and they also have a Hidden Service. And I talked to the DuckDuckGo people a few months ago maybe, I don’t remember. But the point is, they have many, many, many users coming through their onion addresses. The Pirate Bay also.
So, the point of all this is that, with Facebook and Blockchain and all those we’ve seen – and we actually know that several Alexa top-500 websites are currently deploying onion addresses – between the Tor network, the onion space, and the open Internet: if all sites are on both sides, well, it becomes one side. It’s just different ways of accessing the information. So please, please, please go to your companies, go to your organizations, deploy onion addresses, and make them public. Help us have many more.
Roger: Let me, before I get to
the next one, re-emphasize
the point that George was
making about Ricochet.
So Ricochet is an alternate chat program
where every user is
their own onion address.
Every user is their own Onion Service.
And you talk from one Onion
Service to the other Onion Service.
You don’t have to know where the person is
or necessarily even who the person is.
And there’s no middle,
there’s no central point
to go and learn all the accounts,
and who’s friends with who, and so on.
There’s nothing to break into in the
middle where you can spy on everybody.
Everything is decentralized,
everybody is their own onion address.
So I think that’s a key point
as an alternate chat paradigm
where hopefully we can switch away
from the centralization model.
applause
Okay. So, on to phase 2,
a brief diversion.
We’ve been talking to a bunch of researchers over the past few years who want to do research on Tor – to study how many users there are, or how many people go to Facebook, or all sorts of other research questions – and sometimes they do it in dangerous ways, or inappropriate ways.
So we’ve been working on guidelines to help people who want to do this safely avoid harming people, or at least minimize the harm. So here are some of the guidelines.
First one is, try to attack your
own traffic, try to attack yourself,
so, if you have a question and you
need to do it on the real Tor network,
you should be the one to generate
your traffic and then try to attack that.
You shouldn’t just pick an
arbitrary user and attack them
because who knows if that’s a
person in Syria who needs help
or a person in Germany who’s trying to
get out from oppression, and so on.
Another approach: only collect data that
you’re willing to publish to the world.
So, too many researchers say:
“Well I’m going to learn all
of this interesting stuff,
and I’m going to write it
down on my hard drive,
and I’ll keep it very safe.
Nobody will break in.”
That approach fails every time.
Somebody breaks in, you lose
the data, you forget about it,
so the ethical way to do this is,
only collect stuff that you’re
willing to make public.
Only collect stuff that’s
safe to make public.
And then the other piece, that’s part
of what we’re talking about there,
don’t collect data that you don’t need,
so figure out what your
research question is,
figure out the minimum that you
can collect to answer that question.
For example, if I want to know
how many people connect to Facebook,
I should not collect every destination
that everybody goes to,
and then afterwards count up
how many of them were Facebook.
I should have a counter that
says: Facebook, increment.
And then at the end it outputs a number,
and that’s the only thing that
I need to know in that case.
So limit the granularity of data.
If you’re counting how many users
are connecting from different countries,
and there are very few
users coming from Mauritania,
consider rounding that down to zero,
so that you don’t accidentally harm
the five people in Mauritania
who’re using it today.
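The two rules just described – keep one counter instead of a destination log, and coarsen counts too small to be safe – can be sketched like this (the threshold value is an illustrative choice, not a published standard):

```python
def count_target(destinations, target="facebook.com"):
    """Data minimization: increment a single counter for the target;
    never store the full list of destinations."""
    counter = 0
    for dest in destinations:
        if dest == target:
            counter += 1
    return counter

def coarsen(per_country, threshold=10):
    """Granularity limiting: report small per-country counts as zero
    so a handful of users cannot be singled out."""
    return {c: (n if n >= threshold else 0) for c, n in per_country.items()}
```

The counter answers the research question without ever creating the dangerous dataset in the first place, which is the whole point.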
So, the approach to doing this is: figure out what you’re trying to learn, describe the benefits to the world of learning that, describe the risks to people around the world using Tor of the approach that you’re taking, and then argue that the benefits outweigh the risks.
And one of the key ways
of looking at this is,
if you’re collecting something
interesting, some interesting dataset,
think about whether there
could be somebody else,
somewhere out there in the world,
who has some other dataset,
and when you combine
their dataset with yours,
somebody learns something new,
somebody gets harmed.
If you can imagine any other dataset
that, when it’s combined
with yours, harms people,
then you need to think harder about that.
And then the last point:
use a test network, when possible.
There’s a tool called chutney, and there’s a tool called Shadow, which let you run your own internal Tor network on one computer, and if you can do your research that way, it’s even better.
So, those are great guidelines, sounds good. We need to encourage more people to follow them, and, to be fair, this is not going to stop bad people from doing bad things. But if you want to do research responsibly, and ethically, without harming Tor users, then we need to help everybody learn what the guidelines are to do it more safely.
So here’s an example of a tricky edge case
where we really want to think
harder about these things.
One of them is, there are people out there
who want to build a list of every
onion address that they can find.
So, you can learn about that by going to Google and doing a Google search on .onion, and they give you some addresses.
That’s okay, that seems reasonable.
It’s a public dataset, okay, fine.
There’s a more complicated
one, where you are Verisign,
and you run some of the DNS root servers,
and you spy on the DNS
queries of the whole Internet.
And anybody who accidentally
sends a .onion DNS query
to your root server, you write it down.
So now you learn all the side-channel
accidentally leaked addresses.
Is that, does that follow our guidelines?
Is that okay, is that ethical?
It’s kind of complicated,
but I don’t know a way to stop it,
and they already have
the dataset. So, okay, fine.
Now a more complicated one.
What if you’re Comcast, and
you spy on all of your users,
to find out what their DNS queries are,
to learn about accidental leakage there.
Again, I’m going to say it’s complicated, but it’s probably fine; they’re already seeing it, and there’s nothing we, Tor, can do to change our protocol to keep people from accidentally leaking these things.
But then option four:
what if you want to learn
a bunch of onion addresses,
so you run new relays in the TOR network,
and you sign them up and
they get into a position
where they can learn about onion addresses
that are being published,
and then you make a list internally.
So that’s actually not cool, because
it’s part of the TOR protocol
that people providing onion addresses
don’t expect those to become public,
and we’ll talk later about
ways we can fix this.
So, if it’s a protocol problem
that we know how to fix,
and it’s inside TOR, and
you’re misbehaving as a relay,
then that’s not cool.
So this is an example where,
it’s sort of hard to reason through
where we should draw the line,
and I’d love to chat more
with you all afterwards
about where the line should be.
And that leads to an example
of some research that was done last year,
by, we think, some folks at CMU,
who attacked the TOR network,
and, as far as I can tell,
collected a dataset,
and they collected more than they needed
to answer their research questions.
They didn’t do minimization,
they didn’t attack only their own traffic,
they didn’t use a test network,
they basically violated every one of
the guidelines from two slides ago.
So that’s sort of a sad story,
and that leads to the next question:
Should we have some sort
of TOR ethics review board?
Wouldn’t it be cool if, as a researcher,
you write up what you’re trying to learn
and why it’s safe,
and how you’re going to do it,
and you show that to other professors
who help you decide whether you’re right,
and then we go to the academic
review journals and conferences,
and we get them to expect,
in your research paper,
a little section on why this
is responsible research.
And, at that point, it’s expected
that you have thought through that,
and anybody who writes a
paper without that section,
everybody knows that they haven’t
thought it through as much as they should.
Wouldn’t that be a cool future world
where research is done more responsibly
around TOR, and around
security more generally?
applause
Okay. So, there are a couple of problems
in TOR Onion Service security right
now. I’m gonna zip through them briefly
so we can actually get to talk about
them in more detail. The first one is,
the onion identity keys are RSA 1024 bits.
So, that is too short.
We need to switch to ECC.
Another one is, you, the adversary,
can run relays, and
choose your identity key
so that you end up in the
right place in the network
in order to target – censor,
surveil, whatever –
certain onion addresses.
And we’ll talk more about that also.
Another one, I talked about
that a few slides ago,
you can run relays in order to
learn about new onion addresses,
and we’ve got some fixes for that.
So those are 3 issues that are
onion-address specific,
onion-service specific, which we can solve.
And then there are 3 issues
that are much more broad.
One of them is,
bad guys can run hundreds of relays,
and we need to learn how to
notice that and protect against it.
Another one is,
you can run relays to learn more about
the path selection that
clients are going to do,
and then there’s website fingerprinting.
All of those are separate talks.
I wanted to mention them here,
I’m happy to talk about them later,
but we don’t have time
to get into them in detail.
Okay, phase three, how
do TOR Hidden Services,
TOR Onion Services work right now?
Just to give you some background,
so that when we talk about
the design improvements,
you have a handle on what’s going on.
So, we’ve got Alice over here, she
wants to visit some Hidden Service Bob.
The first step is, Bob generates a key,
and he establishes 3 introduction points,
3 circuits into the TOR network.
And then he publishes, to
the big database in the sky,
“Hi, this is my key,
and these are my 3 introduction points.”
And at that point, Alice somehow
learns his onion address,
and she goes to the database
and pulls down the descriptor
that has his key and the
3 introduction points
and in parallel to that,
she connects to the –
she picks her own rendezvous point,
and she builds a TOR circuit there.
So at this point, Bob has 3 introduction
points open in the TOR network,
and Alice has one rendezvous
point open in the TOR network.
And at that point, Alice connects
to one of the introduction points
and says: “Hey, I want to connect
to you, and I’m waiting over here.
This is the address for
my rendezvous point.
If you want to talk back to me,
I’m waiting right here.”
And then at that point,
Bob, if he wants to,
makes a connection to
the rendezvous point.
So now Alice has a connection
to the rendezvous point,
and Bob has a connection
to the rendezvous point,
and at that point they do
the crypto handshake
so that they get end-to-end encryption,
and then, as the last step,
they’re able to send traffic
over that circuit, where Alice has 3 hops,
and Bob has three hops,
and they’re able to provide
security from that point.
So that’s a very brief summary
of how the handshake works,
in Hidden Service land.
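To make that message flow concrete, here is a toy sketch in Python. Every name in it (the directory dict, the introduction point labels, the cookie) is made up for illustration; this shows the flow of the dance, not real TOR code.

```python
import os

# Toy sketch of the rendezvous dance. All names here are made up
# for illustration; this models the message flow, not real TOR code.
directory = {}  # the "big database in the sky"

def bob_publish(onion_addr):
    # Bob opens 3 introduction circuits and publishes his descriptor:
    # "this is my key, and these are my 3 introduction points"
    directory[onion_addr] = {
        "key": os.urandom(32),
        "intros": ["intro1", "intro2", "intro3"],
    }

def alice_introduce(onion_addr):
    # Alice fetches the descriptor, picks her own rendezvous point,
    # and tells an introduction point: "I'm waiting over here"
    desc = directory[onion_addr]
    return {
        "rendezvous": "rendezvous_point",
        "cookie": os.urandom(20),       # lets Bob find her circuit
        "via": desc["intros"][0],
    }

def bob_answer(introduce_msg):
    # Bob, if he wants to, connects to Alice's rendezvous point;
    # both sides then do the crypto handshake over that circuit
    return ("connected", introduce_msg["rendezvous"], introduce_msg["cookie"])

bob_publish("example.onion")
msg = alice_introduce("example.onion")
status, rendezvous, cookie = bob_answer(msg)
```

Once both sides are joined at the rendezvous point, the end-to-end encrypted circuit has 3 hops from Alice plus 3 from Bob.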
So, in the previous slide, I was talking
about this database in the sky.
Once upon a time, that
was just 3 computers.
I ran 2 of them.
Another directory authority
operator ran the third.
And then, we switched to distributing
that over the entire TOR network,
so there are 8000 relays…
So imagine the hash of each relay’s
identity key on this hash ring.
There are 6 relays at any given point
that are responsible for knowing
where a given Onion Service is.
So, when I’m running my own Onion Service,
I compute which 6 relays they are,
and I publish my descriptor to it,
and when I’m the client
and I want to go there,
I compute which 6 relays they are,
and I go to any one of them,
and I can fetch the descriptor.
So, the way that we actually generate
the predictable set of
which relays they are,
is this hash function up on the top.
So you look at the onion address,
you look at what day
it is, the time period,
and other things that are pretty static.
So that’s how both sides can compute
which relays they should go to
when they’re publishing
or fetching a descriptor.
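A rough sketch of that computation in Python; the exact formula lives in Tor's rend-spec, and the field layout here (a SHA-1 over the address identity, replica, and time period, then a walk around a sorted hash ring) is a simplification for illustration only.

```python
import hashlib

def descriptor_id(onion_identity: bytes, time_period: int, replica: int) -> bytes:
    # Hash the mostly-static inputs so that the service and its
    # clients independently derive the same position on the ring.
    h = hashlib.sha1()
    h.update(onion_identity)
    h.update(replica.to_bytes(1, "big"))
    h.update(time_period.to_bytes(4, "big"))
    return h.digest()

def responsible_relays(desc_id: bytes, relay_digests: list, spread: int = 3) -> list:
    # Walk the hash ring: the `spread` relays whose identity digests
    # come right after the descriptor ID store the descriptor.
    ring = sorted(relay_digests)
    after = [r for r in ring if r > desc_id]
    return (after + ring)[:spread]   # wrap around if we fall off the end
```

With two replicas and a spread of 3, both sides end up computing the same 6 relays for any given day.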
George: Okay, so, back to
the security issues again.
A few years ago, 2 or 3 years ago,
we started looking into Hidden Services
and enumerating the various problems.
Various people wrote papers about
some open security issues.
So in 2013 we wrote the first proposal
called next generation Hidden Services,
which basically details various
ways of improving security,
better crypto, and so on.
That happened 2 years ago,
and we still haven’t started
heavily developing it,
basically for lack of developers:
Hidden Services were a largely
volunteer-driven project
until about a year ago.
So everything was done in
our spare time, basically.
But we’ve been writing proposals,
we’ve been active anyway.
We’re going to start looking over
the various security issues
and the ways to fix them, let’s say.
The first one, we call it
HSDir predictability,
and it touches the subject
that Roger mentioned,
the database in the sky.
Every day, each Hidden Service has
6 Hidden Service directories,
6 relays of the network,
responsible for it,
and it chooses them
using this hash formula there.
As you can see, it’s deterministic:
all the inputs are static,
apart from the time period,
which rotates every day.
But the problem is that,
because it’s deterministic,
you can plug in
the onion address you want
and a time period in the future,
and predict the result
of that function,
say, 2 months from now.
So you can know which relays
will be responsible
for a Hidden Service in 2 months,
and, if you’re a bad guy,
maybe you’ll go and inject yourself
into that position,
and you’ll become the
HSDir of a Hidden Service,
and then you can
monitor its activity
or do DoS attacks.
So this is not something we like, and
we will attempt to fix it.
Our idea for fixing it is
to turn the function
from deterministic
into something probabilistic,
and we do this by adding
a random value to it.
This random value is
a fresh random value
that the network is going to
be generating every day.
The way we do this is we use the 9
directory authorities of the network;
they’re 9 computers that are
hardcoded in the source code
and considered semi-trusted.
And we wrote a protocol where
all these 9 directory
authorities do a little dance,
and at the end of the day they have
a fresh random value.
It’s not something new, it uses
a commit-and-reveal protocol,
which is a way to do a sort of
distributed random number generation.
And then every day they
make this random value,
they put it in the consensus,
and then Hidden Services
and Hidden Service users
take the random value from the consensus,
plug it into that formula,
and they use it.
And since this is a
supposedly secure protocol,
I shouldn’t be able to predict what the
random value is going to be in 2 months.
Hence, I cannot go and inject myself
in that position in the
database, basically.
And this is the way we fix this problem.
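A toy commit-and-reveal round in Python to show the idea; the real protocol, with its timing phases, signatures, and majority rules, is specified in a Tor proposal, and this sketch only illustrates why no single participant can predict or bias the final value once the commits are fixed.

```python
import hashlib, os

def commit(secret: bytes) -> bytes:
    # Phase 1: each authority publishes only a hash of its secret,
    # so it is locked in before seeing anyone else's value.
    return hashlib.sha256(secret).digest()

def shared_random(reveals, commits) -> bytes:
    # Phase 2: secrets are revealed, checked against the commits,
    # and hashed together into the daily shared random value.
    for s, c in zip(reveals, commits):
        if commit(s) != c:
            raise ValueError("reveal does not match commit")
    h = hashlib.sha256()
    for s in sorted(reveals):
        h.update(s)
    return h.digest()

# The 9 directory authorities do the dance:
secrets = [os.urandom(32) for _ in range(9)]
commits = [commit(s) for s in secrets]
srv = shared_random(secrets, commits)
```

The resulting value goes into the consensus, and plugging it into the HSDir formula makes future positions on the hash ring unpredictable.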
David: Right, continuing now on the
next generation Hidden Services.
The key thing is better crypto.
Right now we have RSA-1024 and SHA-1,
which are considered
too weak to keep using,
and we’re going to use, of course,
the work of this fancy man
there, Daniel Bernstein.
So it’s basically Ed25519
for encryption and signing,
and of course, using SHA-256.
Right now, in TOR, we are experimenting
with an upstream implementation of SHA-3.
We’re not sure yet whether
SHA-256 or SHA-3 will be used,
but right now SHA-256 is a contender
because it’s much faster than SHA-3,
while SHA-3 has more features.
Still pending.
But, for now, elliptic curves.
So, what to take of that is:
next generation Hidden Services,
which we are actively
working on right now,
will drop all dead crypto.
One of the big changes that’s coming up
is the onion addresses.
On top you have the current onion address
and it’s going to move to 52 characters,
so, basically, your public key.
This is maybe painful for you guys,
and for us it’s extremely painful
to type in an address like that.
So there’s an open proposal right now,
or I think there’s an email
thread on our mailing list,
on coming up with some fancier way
to remember an onion address that size.
Like remembering words that
mash together to create that address.
So, yeah.
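The 52-character figure follows directly from base32-encoding a 32-byte key, which you can check in a couple of lines of Python; the key here is just random bytes standing in for a real Ed25519 public key.

```python
import base64, os

# A 32-byte public key is 256 bits; base32 carries 5 bits per
# character, and 256 / 5 = 51.2, so 52 characters once you
# drop the '=' padding.
pubkey = os.urandom(32)    # stand-in for a real Ed25519 public key
addr = base64.b32encode(pubkey).decode().lower().rstrip("=")
assert len(addr) == 52
onion = addr + ".onion"
```

The final encoding was still an open proposal at the time, for example whether to spend a few of those characters on a checksum or a version field.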
Now, as the Hidden Service evolves,
one of the things we really
want to do is make them faster.
The big difference between
Hidden Services
and a normal TOR circuit is that
normal TOR circuits usually have 3 hops,
while with Hidden Services you
have 3 hops from the client
and 3 hops from the service.
Thus you have 6 hops,
so of course much more time
to go through all the relays
than a normal circuit. Now we have
this proposal going on which is
Rendezvous Single Onion Services. And
the point is, you’re gonna do the dance,
the introduction, the rendezvous, and
then once you go to the rendezvous,
instead of the service going 3 hops,
you’re gonna go 1 hop to the service.
So in here we have this artist wanting
to update their Debian machine,
let’s say, and the Debian
server doesn’t really care much
about anonymity, because we know
where the Debian servers are,
and that’s fine. Thus, clients still
have anonymity with their 3 hops,
while the service doesn’t care, so
only 1 hop. And it goes way faster.
So that’s something we hopefully
will end up deploying soon.
Now, the second one is the Single
Onion Service. It’s roughly the same,
so we have this chef here wanting to go to
his Fairtrade website, you know, whatever,
and the difference here is that
we’re going to skip
the introduction and rendezvous
dance completely.
The service builds a circuit
to a node, let’s call it
an introduction point,
which is the yellow line,
and then the client builds
a 3-hop circuit
to that introduction point,
and instead of the current dance,
the client just extends
the circuit to the service.
And now we have a 3-hop connection,
with no prior work done
for introduction or rendezvous,
and it goes way faster.
Again, those 2 designs
are optimizations for services
that do not care about anonymity.
And there are plenty of use cases
for that. Facebook, for instance,
or Debian repositories, and so on.
Roger: Am I still on? Great. Facebook
and Debian are really excited about
having one of these options so that
they can have all of their users
reach their service with all the cool
security properties we talked about
but also a lot faster and more scalable
than the current design.
David: Precisely. So, one of the
very cool things we did this summer
was the TOR Summer of Privacy.
We got some people in,
interns or students,
whatever you want to call them,
and one of these projects, by
this cool person Donncha,
created OnionBalance. So it’s a way
of load-balancing Hidden Services.
So you create a Hidden
Service with the key at the top,
then you copy that key
to multiple machines,
so all those servers will
start creating descriptors.
The descriptor, if you
remember, is how to reach me.
And we’re going to cherry-pick
introduction points from each
and create a master descriptor,
which you can see in that picture;
and that master descriptor
is what clients will use.
Thus you load-balance
across introduction
points and instances.
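That cherry-picking step can be sketched like this; the function name and the two-intro-points-per-instance choice are mine for illustration, not OnionBalance’s actual API.

```python
def merge_descriptors(instance_descriptors, per_instance=2, max_intros=10):
    # Take a few introduction points from each backend instance's
    # descriptor and combine them into one master descriptor, so
    # client connections spread across the backends.
    master_intros = []
    for desc in instance_descriptors:
        master_intros.extend(desc["intros"][:per_instance])
    return {"intros": master_intros[:max_intros]}

backends = [
    {"intros": ["a1", "a2", "a3"]},   # instance A's descriptor
    {"intros": ["b1", "b2"]},         # instance B's descriptor
    {"intros": ["c1"]},               # instance C's descriptor
]
master = merge_descriptors(backends)
```

Clients only ever see the master descriptor, so which backend they land on depends on which introduction point they happen to pick.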
And this is great! And we actually know
that Facebook will actively
start beta-testing this thing,
so we can have load-balancing and CDNs,
and make things much easier
for onion addresses.
So, just before I give
these slides to Roger:
next generation Onion Services
have been around
as a proposal for 2 years now,
and in 2016 we’re going to start
working on them actively,
with almost 4 full-time developers.
It’s still not enough; we need more…
we need resources, because
we need to get away
from funding that restricts us
from working on Onion Services,
which we have a bit of now.
Resources are very, very important,
so we can get this extremely
important thing done.
And in the next year, we hope
to get it done.
applause
Roger: Great, so there are a couple
of important takeaways from
what we’ve been describing to you today.
One big one is, there are a lot of
different types of Onion Services out there.
There are a lot more than people think.
Everybody looks at Hidden
Services and says: “Oh,
they’re for websites that the government
hates” or something like that.
But there are examples
like Ricochet, like Facebook,
like GlobaLeaks, like SecureDrop.
All of these different examples are
cool things you can do
with better security properties
for your communication. So it’s not
about hiding where the website is,
it’s about getting more secure ways
of reaching websites and
other services around the world.
So another key point: this is still a
tiny fraction of the overall TOR network.
We have millions of people
using TOR every day,
and something like 5% of the
traffic through the TOR network
is Hidden-Service,
or Onion-Service related.
So it was 3% last year, it’s 5% now,
it’s going up, sounds good,
but it’s still a tiny fraction.
And maybe that’s good, because when
you’re using an Onion Service right now,
you put double load on the TOR network,
because both sides add their own circuit.
Whereas if we switch to some of these
designs that David was talking about,
then it will be much more scalable
and much more efficient
and it would be really cool to
have Amazon, and Facebook,
and Twitter, and Wikipedia, and so on,
all allowing people to get more
security, while using TOR,
while protecting places like Facebook
from learning where they are today.
Another key point, we got all these
cool designs that we touched on briefly,
we’d be happy to tell you
more about them after the talk,
and then the last point: if you
run a cool service out there,
please set up an onion
address for it,
so that the typical average
onion service in the world
becomes a totally normal
website or other service
that totally ordinary people go to. And
that’s how we will mainstream this thing
and take over the world.
applause
And then, as a final point,
we are in the middle of our
first ever donation campaign.
We are actually trying to grow
a base of people who want to support TOR
in the same way that EFF has done.
So it would be wonderful… I don’t want to
throw away all of our Government funders,
at least not right now, but I
would like to get to the point
where we have other options,
more sustainability,
and we don’t have to look at
each new funding proposal
and wonder if we have to
un-fund people if we don’t get it.
So I’d love to have much more diversity
in the type of people who
are helping TOR to exist
and thrive and help save the world.
So, please consider helping.
applause
Herald: Thanks a lot for this awesome
talk, we have 6 minutes left for questions
and please line up at the microphones
1, 2, 3, 4, 5, and 6 down here.
And while you are doing that,
we would like to hear a
question from the Internet.
Signal Angel: Thank you. I have
a bunch of questions regarding
compromised onion addresses.
Herald: Start with one.
Signal: Do we have a
kind of evil twin problem
and what can I do if my onion
address is compromised?
When there are widespread services
on the TOR net, like Amazon,
how do I know which
one is the official service?
Roger: So the first question, of,
if your onion key gets stolen
or something like that,
that’s the same as the SSL problem.
How do you keep your certificate
for your web server safe?
The answer is: you should keep it safe
just like you keep everything else safe.
And if somebody gets the key for
your onion address, sucks to be you.
Don’t let them do that.
For the second question,
how do you know that a given
onion address is Amazon’s?
That ties into the certificate authority,
the https, the EV cert discussion
that we talked about
where we need to somehow bind
in Amazon’s SSL certificate
the fact that it’s Amazon, and
this is their alternate onion address.
We need to put those
in the same certificate,
so that everybody knows
if you’re getting one
then you know it’s really Amazon.
H: Thank you. I would like to hear
the question from microphone 1
and remember to keep it short and concise.
Q: Again, the addressing issue,
switching from 16 to 52 characters is nice
but if we have to change
the algorithm again
wouldn't it be nice
to have, like, a prefix
to determine the
algorithm for the address?
Roger: Yes, we actually have
a couple of extra bits
in those 52 characters,
and we could use them
for versioning or all sorts of things.
And there are some examples
of that in the proposals,
I don’t think we’ve fixed
on any answer yet.
So, we’d love to have your help.
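Those spare bits come straight from the arithmetic; a quick sanity check, assuming a 32-byte key and 5 bits per base32 character:

```python
# 52 base32 characters carry 52 * 5 = 260 bits, while a 32-byte
# Ed25519 public key only needs 256, leaving a few bits to spare
# for versioning or similar.
chars, bits_per_char, key_bits = 52, 5, 32 * 8
spare_bits = chars * bits_per_char - key_bits
```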
H: Thank you. Please, microphone number 3.
Q: Hey, you gave us a couple of examples
from Facebook, and I was just wondering
if there’s any sort of affiliation between
Facebook and TOR, or Facebook
just happens to be really keen
on offering their services
in less democratic jurisdictions?
Roger: There’s a nice
guy named Alec up here,
who on his own thought of making
Facebook more secure using TOR,
and he went and did it, and then
we realized that he was right,
so we’ve been trying
to help him ever since.
applause
H: Thanks a lot. Microphone number 4.
Q: You said that you want
more and more people
to run Hidden Services.
My question for this is,
are there any guidelines
on how to do that?
Examples of how to do
it with specific services?
Because from what I’ve seen,
I tried doing this for some
of the services I run,
and it’s painful,
most of the time.
And with the increase in
address size right now,
it’s going to become
more and more painful. And
one of the things that
I’d love to see is,
for example, a .onion
DNS record type
that you can add to your
normal DNS records,
so people can just look at that and
choose between A, AAAA,
and onion records to connect.
Roger: Yeah. For that last
one, for the DNS side,
if we have DNSSEC, thumbs up.
If we don’t have DNSSEC,
I don’t want to have
that terrible security link
as one of the first steps.
I don’t want to trust
the local DNS resolver
to tell me I can go somewhere else
and then after that, if I get
the right address, I’m safe.
That sounds terrible.
George: So, on how to set up
Hidden Services correctly,
I think Riseup recently published
some sort of guidelines
with various ways you can tweak
it and make it more secure.
I think there are also
some on the TOR Wiki.
But in general you’re right that there are
various ways you can mess up this thing
and it’s not super easy for anyone here
to set up a Hidden Service right now,
and hopefully in the
future we will be able
to have an easier way for people
to set up Hidden Services,
maybe provide a bundle
that you double-click
and it spawns up a
blog, or a Docker image,
I don’t know. It’s still one of the
things we really need to look into.
Roger: It would be really cool to
have a server version of Tails
that has all of this built in
with a, like, Python web server
that’s hard to break into and
automatically configured safely.
That would be something that would
make a lot of people able to do this
more conveniently and not
screw up when they’re doing it.
H: Okay!
applause
So we have a bit of less than 1 minute
left, so I would say “Last question”
and let’s say microphone
number 3 for that.
Q: Hello, I have two small questions.
H: No, one.
Q: One, okay.
laughter
So, I noticed you
have some semi-trusted
assumptions for the random
number generation
for your relays. Did you consider,
or do you think there’s some merit
in using the Bitcoin blockchain
to generate randomness?
George: We considered it. We considered
using the Bitcoin blockchain,
the NIST beacon, all these things,
but there are various
engineering issues:
for example, to use the blockchain
you need to verify Merkle trees,
and this would need to be coded
into the TOR codebase.
You also depend on Bitcoin,
which is quite a powerful
system, to be honest,
but you probably don’t want to
depend on outside systems, so…
We really considered it, though.
Q: Thank you.
H: Thanks a lot for this
Q&A, I think you will…
applause
I think you will stick around and be
available for another question-and-answer,
more personal, after the
next upcoming talk, which is
“State of the Onion”, in 15 minutes.
postroll music
Subtitles created by c3subtitles.de
in 2016. Join and help us do more!