intro music

Herald: This is now "Towards a more trustworthy Tor network" by Nusenu. The talk will give examples of malicious relay groups and current issues, and how to tackle those to empower Tor users for self-defense, so they don't necessarily need to rely on the detection and removal of those groups. So without further ado, enjoy! And we will see each other for the Q&A afterwards.

Nusenu: Thanks for inviting me to give a talk about something I deeply care about: the Tor network. The Tor network is a crucial privacy infrastructure, without which we could not use Tor Browser. I like to uncover malicious Tor relays to help protect Tor users. But since that does not come without personal risks, I'm taking steps to protect myself from those running these malicious nodes, so I can continue to fight them. For this reason, this is a prerecorded talk that does not use my own voice. Thanks to the people behind the scenes who made it possible to present this talk in a safe way.

A few words about me. I have a long-standing interest in the state of the Tor network. In 2015, I started OrNetRadar, a public mailing list and website showing reports about new relay groups and possible Sybil attacks. In 2017, I was asked to join the private bad-relays Tor Project mailing list to help analyze and confirm reports about malicious relays. To get a better understanding of who runs what fraction of the Tor network over time, I started OrNetStats. It also shows you which operators could de-anonymize Tor users because they are in a position to perform end-to-end correlation attacks, something we will describe later. I'm also the maintainer of ansible-relayor, an Ansible role used by many large relay operators. Out of curiosity, I also like engaging in some limited open-source intelligence gathering on malicious Tor network actors, especially when their motivation for running relays has not been well understood. To avoid confusion with regard to the Tor Project: I am not employed by the Tor Project and I do not speak for the Tor Project.

In this presentation, we will go through some examples of malicious actors on the Tor network. They basically represent our problem statement and motivate us to improve the status quo. After describing some issues with current approaches to fighting malicious relays, we present a new, additional approach that aims at achieving a safer Tor experience by using trusted relays to some extent. The primary target audience of this presentation is Tor users (such as Tor Browser users), relay operators, onion service operators (for example, SecureDrop), and anyone else who cares about Tor.

To get everyone on the same page, a quick refresher on how Tor works and what types of relays (also called nodes) there are. When Alice uses Tor Browser to visit Bob's website, her Tor client selects three Tor relays to construct a circuit that will be used to route her traffic through the Tor network before it reaches Bob. This gives Alice location anonymity. The first relay in such a circuit is called an entry guard relay. This relay is the only relay seeing Alice's real IP address and is therefore considered a more sensitive type of relay. The guard relay does not learn that Alice is connecting to Bob, though; it only sees the next relay as the destination. Guard relays are not changed frequently: Alice's Tor client waits up to 12 weeks before choosing a new guard, to make some attacks less effective.
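To make these circuits tangible, here is a minimal sketch (an illustration added for this transcript, not part of the talk) that lists a running Tor client's circuits using the stem control library. It assumes a local tor with "ControlPort 9051" and "CookieAuthentication 1" configured in its torrc:

```python
# Minimal sketch, assuming a local tor with "ControlPort 9051" and
# "CookieAuthentication 1" in its torrc, and the stem library installed.
from stem.control import Controller

with Controller.from_port(port=9051) as controller:
    controller.authenticate()  # uses the control auth cookie by default
    for circ in controller.get_circuits():
        if circ.status != 'BUILT':
            continue
        # circ.path is a list of (fingerprint, nickname) hops: the first
        # hop is the guard; on exit circuits the last hop is the exit.
        hops = ' -> '.join(nickname or fingerprint
                           for fingerprint, nickname in circ.path)
        print(f'circuit {circ.id}: {hops}')
```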
The second relay is called a middle (or middle-only) relay. This relay is in the least sensitive position, since it only sees other relays and does not know anything about Alice or Bob; it just forwards encrypted traffic. The final relay is called an exit relay. The exit relay gets to learn the destination, Bob, but does not know who is connecting to Bob. The exit relay is also considered a more sensitive relay type, since it potentially gets to see and manipulate cleartext traffic (if Alice is not using an encrypted protocol like HTTPS). Although exit relays see the destination, they cannot link all the sites Alice visits at a given point in time to the same Tor client to profile her, because Alice's Tor Browser instructs the Tor client to create and use distinct circuits for distinct URL bar domains. So, although this diagram shows only a single circuit, a Tor client usually has multiple Tor circuits open at the same time.

In networks where Tor is censored, users make use of a special node type called a bridge. The primary difference is that bridges are not included in the public list of relays, to make it harder to censor them. Alice has to manually configure Tor Browser if she wants to use a bridge. For redundancy, it is good to have more than one bridge, in case a bridge goes down or gets censored. The bridge in use also gets to see Alice's real IP address, but not the destination.

Now that we have a basic understanding of Tor's design, we might wonder: why do we need to trust the network when roles are distributed across multiple relays? So let's look into some attack scenarios. If an attacker controls Alice's guard and exit relay, they can learn that Alice connected to Bob by performing end-to-end correlation attacks. Such attacks can be passive, meaning no traffic is manipulated; they therefore cannot be detected by probing suspect relays with test traffic. OrNetStats gives you a daily updated list of operators potentially in such a position.

There are some restrictions a default Tor client follows when building circuits to reduce the likelihood of this occurring. For example, a Tor client does not use more than one relay from the same /16 IPv4 network block when building circuits. Alice's Tor client would, for instance, never create this circuit, because the guard and exit relay are in the same /16 net block (192.0.0.0/16). For this reason, the number of distinct /16 network blocks an attacker has distributed their relays across is relevant when evaluating this kind of risk. Honest relay operators declare their group of relays in the so-called "MyFamily" setting. This way they are transparent about their set of relays, and Tor clients automatically avoid using more than a single relay from any given family in a single circuit. Malicious actors will either not declare relay families or pretend to be multiple unrelated families.

Another variant of the end-to-end correlation attack is possible when Bob is the attacker, or has been compromised by the attacker, and the attacker also happens to run Alice's guard relay. In this case, the attacker can also determine the actual source IP address used by Alice when she visits Bob's website. In cases of large, suspicious non-exit relay groups, it is also plausible that they are after onion services, because circuits for onion services do not require exit relays. Onion services provide location anonymity to the server side. By running many non-exits, an attacker could aim at finding the real IP address / location of an onion service.
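As an illustration of the /16 restriction described above (again an added sketch with made-up addresses, not material from the talk), the following checks whether two relay IPv4 addresses fall into the same /16 block, the condition under which a default Tor client refuses to use both relays in one circuit:

```python
import ipaddress

def same_slash16(ip_a: str, ip_b: str) -> bool:
    """True if both IPv4 addresses are in the same /16 network block."""
    block = ipaddress.ip_network(f'{ip_a}/16', strict=False)
    return ipaddress.ip_address(ip_b) in block

# Both addresses below fall into 192.0.0.0/16, as in the slide's example
# (the concrete host addresses are made up for illustration):
print(same_slash16('192.0.1.10', '192.0.2.20'))   # True  -> never used together
print(same_slash16('192.0.1.10', '203.0.113.5'))  # False -> allowed in one circuit
```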
Manipulating exit traffic is probably the most common attack type detected in the wild. It is also the easiest attack to perform. Malicious exits usually do not care who Alice is or what her actual IP address is; they are mainly interested in profiting from traffic manipulation. This type of attack can be detected by probing exits with decoy traffic, but since malicious exits have moved to more targeted approaches (specific domains only), detection is less trivial than one might think. The best protection against this kind of attack is using encryption. Malicious exit relays cannot harm connections going to onion services.

Now, let's look into two real-world examples of large-scale and persistent malicious actors on the Tor network. The first example, tracked as BTCMITM20, is in the malicious-exit business and performs SSL stripping on its exit relays to manipulate plaintext HTTP traffic, rewriting Bitcoin addresses to divert Bitcoin transactions to themselves. They were first detected in 2020 and had some pretty large relay groups. On this graph, you can see what fraction of Tor's exit capacity was under their control in the first half of 2020. The different colors represent different contact infos they put on their relays to pretend to be distinct groups. The sharp drops show events when they were removed from the network, before they added relays again. In February 2021, they managed over 27% of the Tor network's exit capacity, despite multiple removal attempts over almost a year. At some point in the future, we will hopefully have HTTPS-Only mode enabled by default in Tor Browser, to kill this entire attack vector for good and make malicious exits less lucrative. I encourage you to test HTTPS-Only mode in Tor Browser and notify website operators whose sites do not work in that mode. If a website does not work in HTTPS-Only mode, you also know it is probably not safe to use in the first place.

The second example actor, tracked as KAX17, is still somewhat of a mystery. And that is not the best situation to be in. They are remarkable for their focus on non-exit relays; their network diversity, with over 200 distinct /16 subnets; their size (it is the first actor I know of that peaked at over 100 Gbit/s of advertised non-exit bandwidth); and the fact that they have been active for a very long time.

Let's have a look at some KAX17-related events of the past two years. I first detected and reported them to the Tor Project in September 2019. In October 2019, KAX17 relays got removed by the Tor directory authorities for the first time. In December 2019, I published the first blog post about them. At that point, they were already rebuilding their infrastructure by adding new relays. In February 2020, I contacted an email address that was used on some relays that did not properly declare their relay group using the "MyFamily" setting. At the time, they said they would run bridges instead, so they do not have to set MyFamily. (Side note: MyFamily is not supported for bridges.) I was not aware that this email address was linked to KAX17 until October 2021. In the first half of 2020, I regularly reported large quantities of relays to the Tor Project, and they got removed at a high pace until June 2020, when the directory authorities changed their practices and stopped removing them, because they didn't want to "scare away" potential new relay operators. In July 2020, an email address joined a tor-relays mailing list discussion I had started about a proposal to limit large-scale attacks on the network.
Now we know that email address is linked to KAX17. Since the Tor directory authorities no longer removed the relay groups showing up, I sent information on over 600 KAX17 relays to the public tor-talk mailing list. In October 2021, someone who asked for anonymity reached out to me and provided a new way to detect Tor relay groups that do not run the official Tor software. Using this methodology, we were able to confirm KAX17 with a second, independent detection method. This apparently also convinced the Tor directory authorities, and in November this year, a major removal event took place. Sadly, the time span during which KAX17 was running relays without limitations was rather long. This motivated us to come up with a design that avoids this kind of complete dependency on the Tor directory authorities when it comes to safety issues. And, as you might guess, KAX17 is already attempting to restore their foothold again.

Here are some KAX17 properties. After the release of my second KAX17 blog post in November 2021, the media was quick to use words like "nation-state" and "Advanced Persistent Threat". But I find it hard to believe such serious entities would be so sloppy. Since they claim to work for an ISP in every other email, I looked into their AS distribution. I guess they work for more than one ISP. This chart shows the autonomous systems they used, sorted by the number of unique IP addresses used at each hoster. So, for example, they used more than 400 IP addresses at Microsoft to run relays. These are not exact numbers, since the data only includes relays since 2019, and there are likely more. If we map their IP addresses to countries, we get this. Do not take this map too seriously, as the GeoIP database used was severely outdated and such databases are never completely accurate, but it gives us a rough idea. To be clear, I have no evidence that KAX17 is performing any kind of attacks against Tor users, but in our threat model it is already a considerable risk if even a benevolent operator does not declare their more than 800 relays as a family. Good protections should guard against benevolent and malicious Sybil attacks equally. The strongest input factor for the risk assessment of this actor is the fact that they do not run the official Tor software on their relays. There are still many open questions, and the analysis of KAX17 is ongoing. If you have any input, feel free to reach out to me.

After looking at some examples of malicious actors, I want to briefly summarize some of the issues in how the malicious-relays problem is currently approached. It is pretty much like playing whack-a-mole. You hit them and they come back. You hit them again, and they come back again, over and over, and while you're at it, you're also training them to come back stronger next time. Malicious actors can run relays until they get caught or are considered suspicious enough for removal by the Tor directory authorities. If your threat model does not match the Tor directory authorities' threat model, you are out of luck or have to maintain your own exclusion lists. Attempts to define a formal set of "do not do" requirements for relays that the Tor directory authorities commit to enforcing have failed, even with the involvement of a core Tor developer. It is time for a paradigm change. The current processes for detecting and removing malicious Tor relays are failing us and are not sustainable in the long run. In recent years, malicious groups have become larger, harder to detect, harder to get removed, and more persistent.
Here are some of our design goals. Instead of continuing the one-sided arms race with malicious actors, we aim to empower Tor users for self-defense without requiring the detection of malicious Tor relays, and without solely depending on the Tor directory authorities to protect us from malicious relays. We aim to reduce the risk of de-anonymization by using a trusted guard, a trusted exit, or both. We also acknowledge that it is increasingly impossible to detect all malicious relays using decoy traffic; therefore, we stop depending on the detectability of malicious relays to protect users. In today's Tor network, we hope not to choose a malicious guard when we pick one. In the proposed design, we would pick a trusted guard instead. In fact, this can be done with today's Tor Browser, if you set trusted relays as your bridges. Another supported configuration would be to use trusted guards and trusted exits. Such designs are possible without requiring code changes in Tor, but they are cumbersome to configure manually, since Tor only supports relay fingerprints and does not know about relay operator identifiers.

But what do we actually mean by trusted relays? Trusted relays are operated by trusted operators. These operators are believed to run relays without malicious intent. Trusted operators are specified by the user. Users assign trust at the operator level, not the relay level, for scalability reasons and to avoid configuration changes when an operator changes their relays. Since users should be able to specify trusted operators, we need human-readable, authenticated, and globally unique operator identifiers. By authenticated, we mean they should not be arbitrarily spoofable like current relay contact infos. For simplicity, we use DNS domains as relay operator identifiers, and we will probably restrict them to 40 characters in length.

How do Authenticated Relay Operator IDs, AROIs for short, work? From an operator's point of view, configuring an AROI is easy. Step one: the operator specifies the desired domain under her control using Tor's ContactInfo option. Step two: the operator publishes a simple text file at the IANA well-known URI, containing all her relay fingerprints. If no web server is available, or if the web server is not considered safe enough, DNSSEC-signed TXT records are also an option for authentication. Using DNS is great for scalability and availability due to DNS caching, but since every relay requires its own TXT record, proof validation will take longer than with the URI type of proof. Operators that have no domain at all can use free services like GitHub Pages or similar to serve the text file. For convenience, Eran Sandler created a simple-to-use ContactInfo generator, so relay operators don't have to read the specification to generate the required ContactInfo string for their configuration. For the Authenticated Relay Operator ID, the "url" and "proof" fields are the only relevant ones.

There are already over 1000 relays that have implemented the Authenticated Relay Operator ID. OrNetStats displays an icon in case the operator implemented it correctly. Out of the top 24 largest families by bandwidth, all but eight operators have implemented the Authenticated Relay Operator ID already. On the right-hand side, you can see a few logos of organizations running relays with a properly set up AROI.
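To make the two steps concrete: the ContactInfo string would contain something like "url:example-operator.org proof:uri-rsa" (example-operator.org being a placeholder domain), and the published text file lists one relay fingerprint per line. The following minimal sketch (again an added illustration, assuming the well-known path defined by the ContactInfo sharing specification) shows a URI-type proof check:

```python
# Minimal sketch of the URI-type proof check. Assumptions: the well-known
# path below is the one from the ContactInfo sharing specification, and
# "example-operator.org" is a placeholder domain.
import urllib.request

def fetch_proven_fingerprints(domain: str) -> set:
    """Fetch the relay fingerprints a domain claims responsibility for."""
    url = f'https://{domain}/.well-known/tor-relay/rsa-fingerprint.txt'
    with urllib.request.urlopen(url, timeout=10) as response:
        body = response.read().decode('ascii', errors='replace')
    # One relay fingerprint per line; skip blank and comment lines.
    return {line.strip() for line in body.splitlines()
            if line.strip() and not line.startswith('#')}

# The proof only holds in both directions: the relay's ContactInfo names
# the domain, AND the domain's file lists the relay's fingerprint.
relay_fingerprint = 'ABCD' * 10  # placeholder 40-character fingerprint
print(relay_fingerprint in fetch_proven_fingerprints('example-operator.org'))
```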
Back on the OrNetStats listing, the most relevant distinction between entries that have that checkmark icon and those that do not is that the operator string in entries without the icon can be arbitrarily spoofed. This graph shows the largest exit operators that implemented the AROI. I want to stress one crucial point about AROIs, though: authenticated must not be confused with trusted. Malicious operators can also authenticate their domains, and they do. A given AROI can be trusted or not; that is up to the user. But using AROIs instead of ContactInfo for assigning trust is crucial, because ContactInfo cannot be trusted directly without further checks. This graph shows what fraction of the Tor network's exit capacity implemented the Authenticated Relay Operator ID over time. Currently, we are already at around 60 percent, but guard capacity is a lot lower, at around 15 percent. The reason for that is that exits are mostly operated by large operators and organizations, while guards are distributed across a lot more operators. There are over 1800 guard families, but only around 400 exit families.

How does a Tor client make use of AROIs? Current Tor versions do not know what AROIs are and primarily take relay fingerprints as configuration input. So, we need some tooling to generate a list of relay fingerprints starting from a list of trusted AROIs. We have implemented a quick-and-dirty proof of concept that puts everything together and performs all the steps shown on this slide, to demonstrate the concept of using trusted AROIs to configure a Tor client to use trusted exit relays. It is not meant to be used by end users; it is merely a preview for the technical audience who would like to see it in action to achieve a better understanding of the design. The current proof of concept performs all proof checks itself, without relying on third parties, but there are a lot of reasons for doing proof checks centrally instead, for example by the directory authorities. I recently submitted a partial proposal for that to the Tor development mailing list, to see whether they would consider it before proceeding with a more serious implementation than the current proof of concept. I find it important to always first try to achieve a common goal together with upstream before creating solutions that are maintained outside of upstream, because that leads to better-maintained improvements and likely a more user-friendly experience if they are integrated upstream. Here is a link to the mentioned tor-dev email, for those who would like to follow along.

To summarize: after reviewing some real-world examples of malicious actors on the Tor network, we concluded that current approaches to limit the risks bad relays pose to Tor users might not live up to Tor users' expectations, are not sustainable in the long run, and need an upgrade to avoid depending on the detectability of malicious relays, which is becoming increasingly hard. We presented a design, based on trusted Authenticated Relay Operator IDs, that extends current anti-bad-relay approaches and does not rely on the detection of malicious relays. We have shown that most exit capacity has implemented AROIs already, while guard capacity is currently significantly lower, showing a lack of insight into who operates Tor's guard capacity.
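For the technical audience, here is a simplified sketch of what such tooling does. This is an added illustration in the spirit of the proof of concept, not its actual code: it resolves a user-chosen list of trusted AROI domains (placeholders here) to relay fingerprints via the URI-type proof file and emits a torrc fragment restricting exits to those relays.

```python
# Simplified sketch (not the talk's proof of concept): turn a list of
# trusted AROI domains into a torrc exit restriction.
import urllib.request

def fetch_proven_fingerprints(domain: str) -> set:
    """Fingerprints listed at the domain's well-known proof URI."""
    url = f'https://{domain}/.well-known/tor-relay/rsa-fingerprint.txt'
    with urllib.request.urlopen(url, timeout=10) as response:
        lines = response.read().decode('ascii', errors='replace').splitlines()
    return {ln.strip() for ln in lines if ln.strip() and not ln.startswith('#')}

trusted_arois = ['example-operator.org', 'another-operator.example']  # placeholders

fingerprints = []
for domain in trusted_arois:
    # Caveat: a complete check also confirms that each relay's ContactInfo
    # points back to this domain (the proof must hold in both directions).
    fingerprints.extend(sorted(fetch_proven_fingerprints(domain)))

# ExitNodes and StrictNodes are standard torrc options; relay
# fingerprints are conventionally prefixed with "$".
print('ExitNodes ' + ','.join('$' + fp for fp in fingerprints))
print('StrictNodes 1')
```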
When publicly speaking about modifying Tor's path selection in front of a wide audience, I also consider it my responsibility to explicitly state that you should not change Tor configuration options that influence path selection behavior without a clear need according to your threat model, to avoid potentially standing out. Using trusted AROIs certainly comes with some trade-offs of its own, network load balancing being only one of them. Thanks to many large, trusted exit operators, it should be feasible in the near future to use trusted exits without standing out in a trivially detectable way, because it is harder (in the sense that it takes longer) to statistically detect that a Tor client changed its possible pool of exits if it only excluded a smaller fraction of exits. Detecting Tor clients that use only a subset of all guards takes a lot longer than detecting custom exit sets, because guards, unlike exits, are not changed over long periods of time. And finally, Tor clients that make use of trusted AROIs will need a way to find trusted AROIs; ideally, they could learn about them dynamically in a safe way. There is an early work-in-progress draft specification linked on this slide.

I want to dedicate this talk to Karsten Loesing, who passed away last year. He was the kindest person I got to interact with in the Tor community. Karsten was the Tor metrics team lead, and without his work, my projects OrNetStats and OrNetRadar would not exist. Every time you use metrics.torproject.org, for example the so-called "Relay Search", you are using his legacy.

Thank you for listening, and I'm really looking forward to your questions. I'm not sure I'll be able to respond to questions after the talk in real time, but it would be nice to have them read out, so they are part of the recording, and I'll make an effort to publish answers to all of them via Mastodon, should I not be able to respond in real time. I'm also happy to take tips about unusual things you observed on the Tor network. Do not underestimate your power as a Tor user to contribute to a safer Tor network by reporting unusual things. Most major hits against bad-relay actors were the result of Tor user reports.

quietness

Herald: OK. Thank you very much for this very informative talk, and yes, we will switch over to the Q&A now. Yeah, thanks again. Very fascinating. So we have collected several questions from our IRC chat, so I'm just going to start. If bridges don't need the MyFamily setting, isn't this a wide-open gap for end-to-end correlation attacks, for example if a malicious actor can somehow make their relay popular as a bridge?

Nusenu: Yes, bridges are a concern in the context of MyFamily. For that reason, it is not recommended to run bridges and exits at the same time in current versions of Tor, but future versions of Tor will get a new and more relay-operator-friendly MyFamily setting. That new MyFamily design will also support bridges. This will likely be in Tor 0.4.8.x at some point in 2022.

Herald: OK, thanks. No matter the kind of attack, are there statistics on who these attacks are coming from, or from which country they mostly come? Background here is that there are rumors about NSA-driven exit nodes.

Nusenu: I don't know about any general statistics, but I usually include the autonomous systems used by certain groups when blogging about them. There are some autonomous systems that are notorious for being used by malicious groups, but malicious groups also try to blend in with the rest by using large ISPs like Hetzner and OVH.
Herald: Thanks. Is using a bridge that I host myself safer than using a random guard node?

Nusenu: This is a tricky question, since it also depends on whether it is a private bridge, that is, a bridge that is not distributed to other users by BridgeDB. I would say it is better not to run the bridges you use yourself.

Herald: OK. What is worse: KAX17, or a well-known trusted operator running 20 percent of Tor's exits?

Nusenu: Currently, I would say KAX17.

Herald: OK. I think this is the last one for now: Isn't anonymity decreased or changed when using a trusted relay list?

Nusenu: Yes, this is a trade-off that users will need to make. It heavily depends on the threat model.

Herald: OK. So I think we have gathered all the questions, and they were all answered. So thank you again. Yes, thank you again.

rc3 postroll music

Subtitles created by c3subtitles.de in the year 2022. Join, and help us!