intro music
Herald: This is now
"Towards a more trustworthy Tor network"
by Nusenu
The talk will give examples of malicious
relay groups and current issues, and how to
tackle those to empower Tor users for
self-defense, so they don't necessarily
need to rely on the detection and removal
of those groups.
So without further ado, enjoy!
And we will see each other
for Q&A afterwards.
Nusenu: Thanks for inviting me to give a
talk about something I deeply care about:
The Tor network.
The Tor network
is a crucial privacy infrastructure,
without which,
we could not use Tor Browser.
I like to uncover malicious Tor relays
to help protect Tor users.
But since that does not come without
personal risks, I'm taking steps
to protect myself from those
running those malicious nodes,
so I can continue to fight them.
For this reason, this is a prerecorded
talk without using my own voice.
Thanks to the people behind the scenes
who made it possible to
present this talk in a safe way.
A few words about me.
I have a long-standing interest
in the state of the Tor network.
In 2015, I started OrNetRadar,
which is a public mailing list and
website showing reports about new
relay groups and possible Sybil attacks.
In 2017, I was asked to join the private
bad-relays Tor Project mailing list
to help analyze and confirm reports
about malicious relays.
To get a better understanding of who runs
what fraction of the Tor network over time,
I started OrNetStats. It also shows you
which operators could de-anonymize Tor
users because they are in a position
to perform end-to-end correlation attacks,
something we will describe later.
I'm also the maintainer of
ansible-relayor, which is an Ansible role
used by many large relay operators.
Out of curiosity, I also like
engaging in some limited open-source
intelligence gathering on malicious
Tor network actors, especially when
their motivation for running relays
has not been well understood.
To avoid confusion
with regard to the Tor Project:
I am not employed by the Tor Project
and I do not speak for the Tor Project.
In this presentation, we will go through
some examples of malicious actors on
the Tor network. They basically represent
our problem statement that motivates us to
improve the "status quo". After describing
some issues with current approaches to
fight malicious relays, we present a new,
additional approach aiming at achieving a
safer Tor experience using trusted relays
to some extent.
The primary target audience
of this presentation are:
Tor users, like Tor Browser users,
relay operators,
onion service operators
like, for example, SecureDrop
and anyone else that cares about Tor.
To get everyone on the same page,
a quick refresher on how Tor works
and what type of relays – also
called nodes – there are.
When Alice uses Tor Browser
to visit Bob's website,
her Tor client selects three Tor relays
to construct a circuit that will be used
to route her traffic through the
Tor network before it reaches Bob.
This gives Alice location anonymity.
The first relay in such a circuit
is called an entry guard relay.
This relay is the only relay seeing
Alice's real IP address and is therefore
considered a more sensitive type of relay.
The guard relay does not learn that Alice
is connecting to Bob, though. It
only sees the next relay as destination.
Guard relays are not changed frequently,
and Alice's Tor client waits up to 12
weeks before choosing a new guard
to make some attacks less effective.
The second relay is called a middle
(or middle-only) relay. This relay
is the least sensitive position, since it
only sees other relays, but does not know
anything about Alice or Bob because it
just forwards encrypted traffic.
And,
the final relay is called an exit relay.
The exit relay gets to learn the
destination, Bob, but does not know
who is connecting to Bob.
The exit relay is also considered
a more sensitive relay type, since it
potentially gets to see and manipulate
clear text traffic (if Alice is not using
an encrypted protocol like HTTPS).
Although exit relays see the destination,
they cannot link all sites Alice visits
at a given point in time to the same Tor
client, to profile her, because Alice's
Tor Browser instructs the Tor client to
create and use distinct circuits for
distinct URL bar domains. So, although
this diagram shows a single circuit only,
a Tor client usually has multiple open Tor
circuits at the same time. In networks
where Tor is censored, users make use of a
special node type called a bridge.
Their primary difference is that they are
not included in the public list of relays,
to make it harder to censor them. Alice
has to manually configure Tor Browser if
she wants to use a bridge. For redundancy,
it is good to have more than one bridge in
case a bridge goes down or gets censored.
The bridge also gets to see Alice's
real IP address, but not the destination.
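As a rough sketch, a manual bridge
configuration boils down to torrc lines
like these (the addresses below are
placeholders; Tor Browser exposes the
same thing in its settings dialog):

  # torrc sketch: two bridges for redundancy
  UseBridges 1
  Bridge 192.0.2.10:443
  Bridge 198.51.100.20:9001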
Now that we have a basic
understanding of Tor's design,
we might wonder,
why do we need to trust the network,
when roles are distributed
across multiple relays?
So let's look into some attack scenarios.
If an attacker controls
Alice's guard and exit relay,
they can learn that Alice connected to Bob
by performing
end-to-end correlation attacks.
Such attacks can be passive,
meaning no traffic is manipulated
and therefore cannot be detected by
probing suspect relays with test traffic.
OrNetStats gives you a daily updated list
of potential operators in such a position.
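To make the correlation idea concrete,
here is a crude, purely illustrative
Python sketch with made-up flow records
(real attacks use far more refined
statistics):

  # An attacker seeing traffic at both the guard and the exit
  # can match flows by timing and volume alone.
  guard_flows = [("alice_ip", 1000.0, 52000)]    # (client, start time, bytes)
  exit_flows = [("bob.example", 1000.3, 51800)]  # (destination, start time, bytes)

  for client, t1, b1 in guard_flows:
      for dest, t2, b2 in exit_flows:
          if abs(t1 - t2) < 1.0 and abs(b1 - b2) < 0.05 * b1:
              print(client, "likely talked to", dest)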
There are some restrictions a default
Tor client follows when building circuits
to reduce the likelihood of this occurring.
For example, a Tor client does not use
more than one relay from the same /16 IPv4
network block in a single circuit.
Alice's Tor client would never create
this circuit, because the guard and exit
relays are in the same /16 net block
(192.0.0.0/16). For this reason, the number of
distinct /16 network blocks an attacker
distributed its relays across is relevant
when evaluating this kind of risk.
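As a minimal sketch (not Tor's actual
path selection code), the /16 rule can be
expressed like this:

  import ipaddress

  def same_slash16(addr_a: str, addr_b: str) -> bool:
      # True if two IPv4 addresses fall into the same /16 block.
      return (ipaddress.ip_address(addr_b) in
              ipaddress.ip_network(addr_a + "/16", strict=False))

  # Alice's client would reject a circuit like this one:
  guard_ip, exit_ip = "192.0.1.10", "192.0.2.20"
  if same_slash16(guard_ip, exit_ip):
      print("rejected: guard and exit share a /16 block")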
Honest relay operators declare their group
of relays in the so-called "MyFamily"
setting. This way they are transparent
about their set of relays and Tor clients
automatically avoid using more than a
single relay from any given family in a
single circuit. Malicious actors will
either not declare relay families or
pretend to be in more than one family.
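As an illustration, a two-relay family
would be declared like this in the torrc
of both relays (the fingerprints below
are placeholders):

  # torrc on every family member: list all relays of the group, so
  # clients never use two of them in the same circuit
  MyFamily AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA,BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB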
Another variant of the end-to-end
correlation attack is possible
when Bob is the attacker or
has been compromised by the attacker,
and the attacker also happens to run
Alice's guard relay. In this case,
the attacker can also determine
the actual source IP address used by Alice
when she visits Bob's website.
In cases of large, suspicious, non-exit
relay groups, it is also plausible that
they are after onion services, because
circuits for onion services do not require
exit relays. Onion services provide
location anonymity to the server side.
By running many non-exits,
an attacker could aim at finding the real
IP address / location of an onion service.
Manipulating exit relays is probably
the most common attack type
detected in the wild. It is also
the easiest attack type to perform.
Malicious exits usually do not care who
Alice is or what her actual IP address is.
They are mainly interested in
profiting from traffic manipulation.
This type of attack can be detected
by probing exits with decoy traffic,
but since malicious exits have moved
to more targeted approaches
(specific domains only), detection
is less trivial than one might think.
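For the technically curious, a naive
exitmap-style probe can be sketched like
this; it assumes a local Tor client on
port 9050 whose torrc pins ExitNodes to
the relay under test, and the requests
library with SOCKS support installed:

  import hashlib
  import requests

  DECOY_URL = "http://example.com/"  # plaintext HTTP decoy page
  PROXIES = {"http": "socks5h://127.0.0.1:9050",
             "https": "socks5h://127.0.0.1:9050"}

  # Reference copy fetched over a trusted, direct connection.
  expected = hashlib.sha256(requests.get(DECOY_URL).content).hexdigest()

  # The same page fetched through the exit under test.
  via_exit = hashlib.sha256(
      requests.get(DECOY_URL, proxies=PROXIES).content).hexdigest()

  if via_exit != expected:
      print("content differs: possible manipulation by this exit")

Note that a probe like this only catches
indiscriminate manipulation; an exit that
targets specific domains will look clean.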
The best protection against this
kind of attack is using encryption.
Malicious exit relays cannot harm
connections going to onion services.
Now, let's look into
two real-world examples
of large scale and persistent
malicious actors on the Tor network.
The first example, tracked as BTCMITM20,
is in the malicious exit business and
performs SSL strip attacks on its exit relays
to manipulate plaintext HTTP traffic,
replacing data like Bitcoin addresses
to divert Bitcoin transactions to themselves.
They were first detected
in 2020, and had some pretty large relay
groups. On this graph, you can see how
much of the Tor exit fraction was under
their control in the first half of 2020.
The different colors represent different
contact infos they gave on their relays
to pretend they are distinct groups.
The sharp drops show events when
they were removed from the network,
before adding relays again.
In February 2021, they managed over 27%
of the Tor network's exit capacity,
despite multiple removal attempts
over almost a year.
At some point in the future,
we will hopefully have HTTPS-Only mode
enabled by default in Tor Browser
to kill this entire attack vector for good
and make malicious exits less lucrative.
I encourage you to test
HTTPS-Only mode in Tor Browser
and notify website operators
that do not work in that mode.
If a website does not work
in HTTPS-Only mode,
you also know it is probably
not safe to use in the first place.
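For reference, HTTPS-Only mode can be
enabled in the browser's privacy and
security settings; in Firefox-based
browsers it corresponds to an
about:config preference (preference
names may change across versions):

  # about:config preference behind HTTPS-Only mode
  dom.security.https_only_mode = true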
The second example actor,
tracked as KAX17,
is still somewhat of a mystery. And
that is not the best situation to be in.
They are remarkable for:
their focus on non-exit relays,
their network diversity,
with over 200 distinct /16 subnets,
their size – it is the first actor I know
of that peaked at over 100 Gbit/s
advertised non-exit bandwidth – and
they have been active for a very long time.
Let's have a look at some KAX17
related events in the past two years.
I first detected and reported them
to the Tor Project in September 2019.
In October 2019,
KAX17 relays got removed
by the Tor directory
authorities for the first time.
In December 2019,
I published the first blog post about them.
At that point, they were already
rebuilding their infrastructure
by adding new relays.
In February 2020, I contacted an email
address that was used on some relays that
did not properly declare their relay group
using the "MyFamily" setting. At the time,
they said they would run bridges instead,
so they do not have to set MyFamily.
Side note:
MyFamily is not supported for bridges.
I was not aware that this email address
was linked to KAX17 until October 2021.
In the first half of 2020,
I regularly reported large quantities of
relays to the Tor Project, and they got
removed at high pace until June 2020,
when directory authorities changed their
practices and stopped removing them
because they didn't want to "scare away"
potential new relay operators.
In July 2020, an email address joined
a tor-relays mailing list discussion
I started about a proposal to limit
large-scale attacks on the network.
Now we know
that email address is linked to KAX17.
Since the Tor directory authorities
no longer removed the relay groups
showing up, I sent the information
of over 600 KAX17 relays
to the public tor-talk mailing list.
In October 2021, someone who asked for
anonymity reached out to me and provided a
new way to detect Tor relay groups that
do not run the official Tor software.
Using this methodology,
we were able to confirm KAX17
with a second detection method.
This also apparently convinced
the Tor directory authorities,
and in November this year,
a major removal event took place.
Sadly, the time span during which
KAX17 was running relays without
limitations was rather long.
This motivated us to come up with a
design that avoids this kind of complete
dependency on Tor directory authorities
when it comes to safety issues.
And, as you might guess,
KAX17 is already attempting
to restore their foothold again.
Here are some KAX17 properties.
After the release of my second
KAX17 blog post in November 2021,
the media was quick with using
words like "nation-state" and
"Advanced Persistent Threat".
But I find it hard to believe such
serious entities would be so sloppy.
Since they claim to work for an ISP
in every other email…
I looked into their AS distribution.
I guess they work for more than one ISP.
This chart shows the Autonomous Systems
they used, sorted by the number of unique
IP addresses used at each hoster. So,
for example, they used more than 400 IP
addresses at Microsoft to run relays.
These are not exact numbers,
since the data only includes relays seen since 2019,
and there are likely more.
If we map their IP addresses
to countries, we get this. Do not take
this map too seriously, as the used GEOIP
database was severely outdated and such
databases are never completely accurate,
but it gives us a rough idea. To be clear,
I have no evidence that KAX17 is
performing any kind of attacks against Tor
users, but in our threat model it is
already a considerable risk if even a
benevolent operator is not declaring their
more than 800 relays as a family. Good
protections should protect against
benevolent and malicious Sybil attacks
equally. The strongest input factor for
the risk assessment of this actor is the
fact they do not run the official Tor
software on their relays. There are still
many open questions, and the analysis of
KAX17 is ongoing. If you have any input,
feel free to reach out to me. After
looking at some examples of malicious
actors, I want to briefly summarize some
of the issues in how the malicious relays
problem is currently approached. It is
pretty much like playing Whack-A-Mole. You
hit them and they come back. You hit them
again, and they come back again, over and
over, and while you're at it, you're also
training them to come back stronger next
time. Malicious actors can run relays
until they get caught or are
considered suspicious enough for removal
by the Tor directory authorities. If your
threat model does not match the Tor
directory's threat model, you are out of
luck or have to maintain your own
exclusion lists. Attempts to define a
formal set of "do not do" requirements for
relays that Tor directory authorities
commit to enforce have failed, even with
the involvement of a core Tor developer.
It is time for a paradigm change. The
current processes for detecting and
removing malicious Tor relays are failing
us and are not sustainable in the long
run. In recent years, malicious groups
have become larger, harder to detect,
harder to get removed and more persistent.
Here are some of our design goals. Instead
of continuing the one-sided arms race
with malicious actors, we aim to empower
Tor users for self-defense without
requiring the detection of malicious Tor
relays and without solely depending on
Tor directory authorities for protecting us
from malicious relays. We aim to reduce
the risk of de-anonymization by using at
least a trusted guard or exit or both. We
also acknowledge it is increasingly
impossible to detect all malicious relays
using decoy traffic, therefore, we stop
depending on the detectability of
malicious relays to protect users. In
today's Tor network, we hope not to choose
a malicious guard when we pick one. In the
proposed design, we would pick a trusted
guard instead. In fact, this can be done
with today's Tor Browser, if you set
trusted relays as your bridges. Another
supported configuration would be to use
trusted guards and trusted exits. Such
designs are possible without requiring
code changes in Tor, but are cumbersome to
configure manually, since Tor only
supports relay fingerprints and does not
know about relay operator identifiers.
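For instance, both variants can be
sketched with today's torrc options
(addresses and fingerprints below are
placeholders):

  # Variant 1: use a trusted relay as entry by configuring it as a bridge
  UseBridges 1
  Bridge 192.0.2.10:443 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

  # Variant 2: additionally restrict circuits to trusted exits
  ExitNodes CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC,DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
  StrictNodes 1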
But what do we actually mean by trusted
relays? Trusted relays are operated by
trusted operators. These operators are
believed to run relays without malicious
intent. Trusted operators are specified by
the user. Users assign trust at the
operator level, not the relay level, for
scalability reasons, and to avoid
configuration changes when an operator
changes their relays. Since users should
be able to specify trusted operators, we
need human-readable, authenticated and
globally unique operator identifiers. By
authenticated, we mean they should not be
spoofable arbitrarily like current relay
contact infos. For simplicity, we use DNS
domains as relay operator identifiers, and
we will probably restrict them to 40
characters in length. How do Authenticated
Relay Operator IDs, short AROI, work?
From an operator's point of view,
configuring an AROI is easy.
Step one: The operator specifies the
desired domain under her control using
Tor's ContactInfo option.
Step two: The operator publishes a simple
text file containing all relay
fingerprints under the IANA well-known
URI. If no web server is available, or if
the web server is not considered safe
enough, DNSSEC-signed TXT records are
also an option for authentication.
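As an illustration, the two steps could
look roughly like this for a hypothetical
operator of example.com (the field names
follow the ContactInfo Information
Sharing Specification; fingerprints are
placeholders):

  # torrc (step one): declare the operator domain
  ContactInfo url:example.com proof:uri-rsa ciissversion:2

  # step two: publish one relay fingerprint per line at the
  # well-known URI, e.g.:
  # https://example.com/.well-known/tor-relay/rsa-fingerprint.txt
  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
  BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB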
Using DNS is great for scalability and
availability due to DNS caching, but since
every relay requires its own TXT record,
proof validation will take longer than
with the URI type proof. Operators
that have no domain at all can use free
services like GitHub Pages or similar to
serve the text file. For convenience, Eran
Sandler created this simple-to-use
ContactInfo generator, so relay operators
don't have to read the specification to
generate the required ContactInfo string
for their configuration. For the
Authenticated Relay Operator ID, the "url"
and "proof" fields are the only relevant
ones. There are already over 1000 relays
that have implemented the Authenticated
Relay Operator ID. OrNetStats displays an
icon in case the operator implemented it
correctly. Out of the top 24 largest
families by bandwidth, all but eight
operators have implemented the
Authenticated Relay Operator ID already.
On the right-hand side, you can see a few
logos of organizations running relays with
a properly set up AROI. The most relevant
distinction between lines having that
checkmark icon and those that do not have
it is the fact that the string in lines
that do not include the icon can be
arbitrarily spoofed. This graph shows the
largest exit operators that implemented
the AROI. I want to stress one crucial
point about AROIs though: authenticated
must not be confused with trusted.
Malicious operators can also authenticate
their domain and they do. A given AROI can
be trusted or not. It is up to the user,
but using AROIs instead of ContactInfo for
assigning trust is crucial because
ContactInfo cannot be trusted directly
without further checks. This graph shows
what fraction of the Tor network's exit
capacity implemented the Authenticated
Relay Operator ID over time. Currently, we
are at around 60 percent already, but
guard capacity is a lot lower, around 15
percent. The reason for that is that exits
are operated mostly by large operators and
organizations, while guards are
distributed across a lot more operators.
There are over 1800 guard families, but
only around 400 exit families. How does a
Tor client make use of AROIs? Current Tor
versions do not know what AROIs are and
primarily take relay fingerprints as
configuration inputs. So, we need some
tooling to generate a list of relay
fingerprints starting from a list of
trusted AROIs.
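As a simplified stand-in for that tooling
(not the actual proof of concept), the
fingerprint list could be derived roughly
like this; note that real tooling must
also verify the reverse direction, i.e.
that each relay's ContactInfo actually
names the domain:

  import urllib.request

  TRUSTED_AROIS = ["example.com", "example.org"]  # placeholder domains

  fingerprints = []
  for domain in TRUSTED_AROIS:
      url = f"https://{domain}/.well-known/tor-relay/rsa-fingerprint.txt"
      with urllib.request.urlopen(url) as response:
          for line in response.read().decode().splitlines():
              line = line.strip()
              if len(line) == 40:  # relay fingerprints are 40 hex characters
                  fingerprints.append(line)

  # Emit torrc options restricting exits to the trusted operators.
  print("ExitNodes " + ",".join(fingerprints))
  print("StrictNodes 1")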
We have implemented a quick and dirty
proof of concept that puts
everything together and performs all the
steps shown on this slide, to demonstrate
the concept of using trusted AROIs to
configure a Tor client to use trusted exit
relays. It is not meant to be used by
end-users; it is merely a preview for the
technical audience who would like to see
it in action to achieve a better
understanding of the design. The current
proof of concept performs all proof checks
itself without relying on third parties.
But since there are good reasons for
doing proof checks centrally instead, for
example by directory authorities, I
recently submitted a partial proposal for
it to the Tor development mailing list, to
see whether they would consider it before
proceeding with a more serious
implementation than the current proof of
concept. I find it important to always try
achieving a common goal together with
upstream first before creating solutions
that are maintained outside of upstream
because it will lead to better maintained
improvements and likely a more user-
friendly experience if they are integrated
in upstream. Here is a link to the
mentioned tor-dev email, for those who
would like to follow along. To summarize,
after reviewing some real-world examples
of malicious actors on the Tor network, we
concluded that current approaches to limit
the risks bad relays pose to Tor users might not
live up to Tor users' expectations, are not
sustainable in the long run, and need an
upgrade to avoid depending on the
detectability of malicious relays, which
is becoming increasingly hard. We
presented a design, based on trusted
Authenticated Relay Operator IDs, that
extends current anti-bad-relay approaches
and does not rely on
the detection of malicious relays.
We have shown that most exit capacity has
implemented AROIs already, while guard
capacity is currently significantly lower,
showing a lack of insight into who operates
Tor's guard capacity. When publicly
speaking about modifying Tor's path
selection in front of a wide audience, I
also consider it to be my responsibility
to explicitly state that you should not
change Tor configuration options that
influence path selection behavior without
a clear need according to your threat
model, to avoid potentially standing out.
Using trusted AROIs certainly comes with
some tradeoffs of its own, network load
balancing being just one example.
Thanks to many large, trusted
exit operators, it should be feasible in
the near future to use trusted exits
without standing out in a trivially
detectable way: it is harder, in the
sense that it takes longer, to statistically
detect that a Tor client changed its possible
pool of exits if it only excluded a
smaller fraction of exits. Detecting Tor
clients using only a subset of all guards
takes a lot longer than detecting custom
exit sets, because guards are not changed
over a longer period of time compared
with exits. And finally, Tor clients that
make use of trusted AROIs will need a way
to find trusted AROIs; ideally, they could
learn about them dynamically in a safe
way. There is an early work-in-progress
draft specification linked on this slide.
I want to dedicate this talk to Karsten
Loesing who passed away last year. He was
the kindest person I got to interact with
in the Tor community. Karsten was the Tor
metrics team lead and without his work, my
projects, OrNetStats and OrNetRadar, would
not exist. Every time you use
metrics.torproject.org, for example, the
so-called "Relay Search", you are using
his legacy. Thank you for listening, and
I'm really looking forward to your
questions. I'm not sure I'll be able to
respond to questions after the talk in
real time, but it would be nice to have
them read out. So they are part of the
recording and I'll make an effort to
publish answers to all of them via
Mastodon, should I not be able to respond
in real time. I'm also happy to take tips
about unusual things you observed on the
Tor network. Do not underestimate your
power as a Tor user to contribute to a safer
Tor network by reporting unusual things.
Most major hits against bad relay actors
were the result of Tor user reports.
quietness
Herald: OK. Thank you very much for this
very informative talk and yes so we will
switch over to the Q&A now. Yeah, thanks
again. Very fascinating. So we have
collected several questions from our IRC
chat, so I'm just going to start. If
bridges don't need the MyFamily setting
isn't this a wide open gap for end-to-end
correlation attacks, for example if a
malicious actor can somehow make the relay
popular as a bridge?
Nusenu: Yes, bridges are a concern in the
context of MyFamily, for that reason, it
is not recommended to run bridges and
exits at the same time in current versions
of Tor, but future versions of Tor will
get a new and more relay operator friendly
MyFamily setting. That new MyFamily design
will also support bridges. This will
likely be in Tor 0.4.8.x at some point in
2022.
Herald: OK, thanks. Regardless of the kind
of attack, are there statistics on who or
from which country these attacks are
mostly coming? Background here is there
are rumors about NSA-driven exit nodes.
Nusenu: I don't know about any general
statistics, but I usually include used
autonomous systems by certain groups when
blogging about them. There are some
autonomous systems that are notorious for
being used by malicious groups, but
malicious groups also try to blend in with
the rest by using large ISPs like Hetzner
and OVH.
Herald: Thanks. Is using a bridge that I
host also safer than using a random guard
node?
Nusenu: This is a tricky question, since
it also depends on whether it is a private
bridge, a bridge that is not distributed
to other users by BridgeDB. I would say
it is better to not run the bridges you
use yourself.
Herald: OK. What is worse? KAX17 or a
well-known trusted operator running 20 percent
of Tor's exits?
Nusenu: Currently, I would say KAX17.
Herald: OK. I think that's the last one
for now: Isn't the anonymity decreased
or changed when using a trusted
relay list?
Nusenu: Yes, this is a trade-off that
users will need to make. This heavily
depends on the threat model.
Herald: OK. So I think we have gathered
all the questions and they were all
answered. So thank you again. Yes,
thank you again.
rc3 postroll music
Subtitles created by c3subtitles.de
in the year 2022. Join, and help us!