Tor Traffic Analysis: Data-driven Attacks and Defenses

Title

Tor Traffic Analysis: Data-driven Attacks and Defenses

Published Date

2024-07

Type

Thesis or Dissertation

Abstract

Anonymity networks such as Tor aim to protect the confidentiality of both their users' data and their metadata, that is, who communicates with whom. However, traffic analysis techniques can exploit the frequency and timing of network traffic to expose the metadata of these communications. These techniques include website fingerprinting and end-to-end flow correlation, both thoroughly discussed in previous literature. In a website fingerprinting (WF) attack, an adversary between the user and the first Tor relay records the timing and volume of Tor traffic to determine which website the user is visiting. In a flow correlation (FC) attack, the adversary records traffic metadata at both the entry and exit points of the Tor network and then attempts to correlate these flows, thus breaking Tor's anonymity. Our first objective is to demonstrate that these attacks can be executed effectively despite challenges such as variable network conditions, the large sets of webpages associated with many websites, and "padding" intended to prevent traffic analysis. For instance, our Convolutional vision Transformer (CvT) approach, which merges the relative strengths of convolutional neural networks and transformers, can be used with multi-channel feature representations to significantly enhance attack accuracy against defended traffic. This demonstrates that website fingerprinting attacks can succeed even when traffic is obfuscated. Another potential barrier for traffic analysis attacks against Tor is the arbitrarily large amount of non-targeted traffic that an attacker must distinguish from monitored web pages. In particular, an attacker interested in a small subset of traffic may find that the number of false positives begins to surpass the limited number of true positives. Accordingly, in the website fingerprinting setting, we present "precision optimization" techniques to ensure that an attacker can identify a substantial subset of web pages with high precision.
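
The base-rate effect described above, where false positives outnumber true positives once monitored pages are rare, can be illustrated with a short calculation. This is a minimal sketch and is not taken from the dissertation: the function name `precision` and the classifier rates used below are hypothetical values chosen only for illustration.

```python
def precision(tpr: float, fpr: float, base_rate: float) -> float:
    """Fraction of positive predictions that are actually monitored pages.

    tpr: true positive rate on monitored traffic (assumed value)
    fpr: false positive rate on non-monitored traffic (assumed value)
    base_rate: share of all observed traffic that is monitored
    """
    true_pos = tpr * base_rate
    false_pos = fpr * (1.0 - base_rate)
    if true_pos + false_pos == 0.0:
        return 0.0  # the classifier never predicts "monitored"
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier: 95% true positive rate, 1% false positive rate.
# The same classifier is evaluated as monitored traffic becomes rarer.
for base_rate in (0.5, 0.01, 0.001):
    p = precision(0.95, 0.01, base_rate)
    print(f"base rate {base_rate}: precision {p:.3f}")
```

With these assumed rates, precision falls from nearly 99% when half of the traffic is monitored to under 10% when one trace in a thousand is monitored, which is the situation that motivates optimizing for precision rather than accuracy.
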
Next, we present a series of performance improvements to current state-of-the-art flow correlation techniques, demonstrating that the improved techniques can operate with even higher accuracy and reliability. Furthermore, we apply these techniques to the problem of stepping-stone identification, showing that our approach can be adapted to correlate flow pairs that have been sent through multiple intermediate hosts. Given the success of these traffic analysis techniques, we then develop defenses to minimize the likelihood of successful website fingerprinting or flow correlation attacks. Although various defenses exist, most are ineffective, introduce high latency and bandwidth overhead, or require additional infrastructure. Therefore, we aim to design defenses that are both effective and efficient. The first defense, RegulaTor, leverages common patterns in web browsing traffic to regularize traffic, significantly reducing the performance of website fingerprinting attacks while incurring moderate bandwidth overhead and low latency overhead. For latency-sensitive users, we propose DeTorrent, which uses competing neural networks to create and evaluate traffic analysis defenses that insert fake traffic into real traffic flows. DeTorrent operates with moderate overhead and defends against both website fingerprinting and flow correlation attacks more effectively than comparable padding-only defenses. We also demonstrate DeTorrent's practicality by deploying it alongside the Tor network, showing that it maintains performance when applied to live traffic.
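
To make "padding-only defense" and "bandwidth overhead" concrete, the sketch below pads a packet trace so that every time bin carries at least a minimum number of packets, then reports the share of dummy packets relative to real ones. It is a toy illustration only, not RegulaTor or DeTorrent: the function `pad_bins`, the bin width, the minimum count, and the timestamps are all assumptions made up for this example.

```python
from collections import Counter


def pad_bins(timestamps, bin_width, min_per_bin):
    """Pad each time bin up to min_per_bin packets (toy padding-only defense).

    Returns (per-bin packet counts after padding, number of dummy packets).
    No real packet is delayed, so this toy scheme adds no latency.
    """
    if not timestamps:
        return [], 0
    # Count real packets in each fixed-width time bin.
    counts = Counter(int(t // bin_width) for t in timestamps)
    padded, dummies = [], 0
    for b in range(max(counts) + 1):
        real = counts.get(b, 0)
        total = max(real, min_per_bin)  # add dummies only when below the floor
        dummies += total - real
        padded.append(total)
    return padded, dummies


# Made-up packet send times in seconds (hypothetical trace).
real_times = [0.00, 0.01, 0.02, 0.35, 0.36, 1.10]
padded, dummies = pad_bins(real_times, bin_width=0.5, min_per_bin=2)

# Bandwidth overhead: dummy packets sent per real packet.
overhead = dummies / len(real_times)
print(padded, dummies, overhead)
```

Raising `min_per_bin` hides more of the burst structure an attacker could observe but increases the overhead, which is the effectiveness versus efficiency trade-off the abstract refers to.
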

Description

University of Minnesota Ph.D. dissertation. July 2024. Major: Computer Science. Advisor: Nicholas Hopper. 1 computer file (PDF); xii, 125 pages.

Suggested citation

Holland, James. (2024). Tor Traffic Analysis: Data-driven Attacks and Defenses. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/269208.
