Tor Traffic Analysis: Data-driven Attacks and Defenses
2024-07
Type
Thesis or Dissertation
Abstract
Anonymity networks such as Tor aim to protect the confidentiality of both the data and metadata (who communicates with whom) of their users. However, traffic analysis techniques can exploit the frequency and timing of network traffic to expose the metadata of these communications. These techniques include website fingerprinting and end-to-end flow correlation, both thoroughly discussed in previous literature. In a website fingerprinting (WF) attack, an adversary between the user and the first Tor relay records the timing and volume of Tor traffic to determine which website the user is visiting. In a flow correlation (FC) attack, the adversary records traffic metadata at both the entry and exit points of the Tor network and then attempts to correlate these flows, thus breaking Tor's anonymity. Our first objective is to demonstrate that, despite challenges such as variable network conditions, the large sets of webpages associated with many websites, and potential "padding" to prevent traffic analysis, these attacks can be executed effectively. For instance, our Convolutional vision Transformer (CvT) approach, which merges the relative strengths of convolutional neural networks and transformers, can be used with multi-channel feature representations to significantly enhance attack accuracy against defended traffic. This demonstrates that website fingerprinting attacks can be successful even when traffic is obfuscated. Another potential barrier for traffic analysis attacks against Tor is the arbitrarily large amount of non-targeted traffic that an attacker must differentiate from monitored web pages. In particular, an attacker interested in a small subset of traffic may find that the number of false positives begins to surpass the limited number of true positives. Accordingly, in the website fingerprinting setting, we present "precision optimization" techniques to ensure that an attacker can identify a substantial subset of web pages with high precision.
Next, we present a series of performance improvements to current state-of-the-art flow correlation techniques, demonstrating that the improved techniques can operate with even higher accuracy and reliability. Furthermore, we apply these techniques to the problem of stepping-stone identification, showing that our approach can be adapted to correlate flow pairs that have been sent through multiple intermediate hosts. Given the success of these traffic analysis techniques, we then develop defenses to minimize the likelihood of successful website fingerprinting or flow correlation attacks. Although various defenses exist, most are ineffective, introduce high latency and bandwidth overhead, or require additional infrastructure. Therefore, we aim to design defenses that are both effective and efficient. The first defense, RegulaTor, leverages common patterns in web browsing traffic to regularize traffic, significantly reducing the performance of website fingerprinting attacks while incurring moderate bandwidth and low latency overhead. For latency-sensitive users, we propose DeTorrent, which uses competing neural networks to create and evaluate traffic analysis defenses that insert fake traffic into real traffic flows. DeTorrent operates with moderate overhead and defends against both website fingerprinting and flow correlation attacks more effectively than comparable padding-only defenses. We also demonstrate DeTorrent's practicality by deploying it alongside the Tor network, ensuring it maintains performance when applied to live traffic.
Description
University of Minnesota Ph.D. dissertation. July 2024. Major: Computer Science. Advisor: Nicholas Hopper. 1 computer file (PDF); xii, 125 pages.
Suggested citation
Holland, James. (2024). Tor Traffic Analysis: Data-driven Attacks and Defenses. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/269208.