Skip to content

This project aims to implement the Pi_t ("t" for "tulip" or "threshold") onion routing protocol in a service-model environment.

License

Notifications You must be signed in to change notification settings

HannahMarsh/pi_t-experiment

Repository files navigation

Implementing $\Pi_t$ 🌷

Introduction

This project aims to implement the $\Pi_t$ ("t" for "tulip" or "threshold") protocol in a service-model environment. The focus of this experiment is on evaluating and comparing the performance and privacy guarantees of $\Pi_t$ against other protocols (such as those described in [VDHLZZ15] and [TLG17]) under similar conditions.

(Jump to Installation)

Background

A communication protocol achieves anonymity if no attacker can distinguish who is communicating with whom based on observable data such as network traffic. Onion routing is a well-established technique for achieving anonymity, where messages are encapsulated in layers of encryption and sent through a series of intermediary relays (relays). However, onion routing alone does not protect against adversaries who can observe all network traffic, such as AS-level or ISP-level attackers.

Mix networks (or mixnets) enhance onion routing by mixing messages at each relay, making it harder for adversaries to correlate incoming and outgoing messages. Various mixnet architectures have been proposed, including Vuvuzela and Stadium. Despite their strong anonymity guarantees, these solutions assume synchronous communication, where time progresses in rounds, and message transmissions are lossless and instantaneous. In practice, however, network communication is asynchronous and thus adversaries can exploit timing discrepancies to correlate messages entering and leaving the network.

$\Pi_t$ ("t" for "tulip" or "threshold"), is the first provably anonymous onion routing protocol for the asynchronous communications setting. As described in [ALU24], this protocol introduces several novel concepts such as checkpoint onions and bruising. Theoretical analysis demonstrates that $\Pi_t$ can provide differential privacy (see definition below) even under strong adversarial models. This analysis assumes a peer-to-peer network where relays must discover each other, exchange keys, and manage communication paths independently. Unfortunately this leads to several practical challenges such as increased complexity, lower fault tolerance, and inconsistent security enforcement.

This project aims to transition $\Pi_t$ to a service-model environment by introducing a fault-tolerant bulletin board that maintains a consistent view of all active relays and coordinates the start of a run.

Differential Privacy

Differential privacy is a mathematical framework for ensuring that the results of data analysis do not reveal any specific individual's data.

In the context of $\Pi_t$ and other onion routing protocols, a more nuanced form of differential privacy, called ($\epsilon$, $\delta$)-Differential Privacy , ensures that an adversary observing network traffic cannot (with high confidence) distinguish between two neighboring communication patterns. This means that the inclusion or exclusion of a single individual's data does not significantly affect the outcome of any analysis.

Epsilon (ε) and delta ($\delta$) are the parameters that define our (ε, $\delta$)-differential privacy guarantees:

  • ε: A non-negative parameter that bounds the multiplicative difference in probabilities of any outcome occurring whether an individual's data is included or not. Smaller values of ε indicate stronger privacy guarantees.
  • $\delta$: A non-negative parameter that bounds the additive difference in probabilities, allowing for a small probability of error. Smaller values of $\delta$ also indicate stronger privacy guarantees.

Formally, a randomized algorithm or mechanism is ε, $\delta$)-differentially private if for every pair of neighboring inputs $\sigma_0$ and $\sigma_1$ and for every set $\mathcal{V}$ of adversarial views,

$$ \Pr[\text{View}^{\mathcal{A}}(\sigma_0) \in \mathcal{V}] \leq e^{\epsilon} \cdot \Pr[\text{View}^{\mathcal{A}}(\sigma_1) \in \mathcal{V}] + \delta $$

This means the adversary's view when Alice sends a message to Bob is statistically close to the view when Alice sends a message to Carol instead.

$\Pi_t$ Implementation Overview

 

Figure 1 - Routing Path Visualization, Example "Scenario 0" (with N clients, R Relays, l1 mixers, and l rounds)

Routing Path Visualization

Parameters

(also defined in /config/config.yml)

  • $x$: Server load (i.e. the expected number of onions each relay processes per round).
  • $\ell_1$: The number of mixers in each routing path.
  • $\ell_2$: The number of gatekeepers in each routing path.
  • $L$: The number of rounds (also the length of the routing path, equal to $\ell_1 + \ell_2 + 1$ ).
  • $R$: The number of participating relays.
  • $N$: The number of participating clients.
  • $d$: The number of non-null key-blocks in $S_1$. (thus $d$ is the threshold for number of bruises before an onion is discard by a gatekeeper).
  • $\tau$: ( $\tau \lt (1 − \gamma)(1 − \chi)$ ) The fraction of expected checkpoint onions needed for a relay to progress its local clock.
  • $\epsilon$: The privacy loss in the worst case scenario.
  • $\delta = 10^{-4}$: The fixed probability of differential privacy violation.
  • $\lambda$: The security parameter. We assume every quantity of the system, including $N$, $R$, $L$ are polynomially bounded by $\lambda$.
  • $\theta$: The maximum fraction of bruisable layers that can be bruised before the innermost tulip bulb becomes unrecoverable. Note that $d = \theta \cdot \ell_1$
  • $\chi$: The fraction of $N$ relays that can be corrupted and controlled by the adversary (the subset is chosen prior to execution). Note that $\chi \lt \theta - 0.5$ and $\chi \lt \frac{d}{\ell_1} - 0.5$

No Global Clock:

  • Each relay maintains a local clock ($c_j$) to track the progression of onion layers. A relay does not progress
    its local clock until it receives a sufficient number of checkpoint onions for the current layer (specified by $\tau$).

Keys:

  • Long-term identity keys ($pk_{i}$, $sk_{i}$) are established for each party $P_i$ (both clients and relays). They are persistent and are used across multiple sessions.
  • Ephemeral session keys ($epk_{i,j}$, $esk_{i,j}$) are generated by a client for each $P_i$ in the $j$-th position in an onion's routing path. They are short-term, meaning a processing party $P_i$ can only compute the shared secret during round $j$. (See /internal/pi_t/tools/keys/ephemeral.go for implementation)
    • Onion formation: The client uses $esk_{i,j}$ and $P_i$'s public key $pk_i$ to generate a shared secret $s_{i,j}$.
      • The client includes $epk_{i,j}$ in the $j$-th layer's header.
    • Peeling: The processing party $P_i$ uses the ephemeral public key $epk_{i,j}$ and its own private key $sk_i$ to compute the shared secret $s_{i,j}$.
  • See internal/pi_t/keys/ecdh.go for this project's ECDH usage.

Tulip Bulb Structure:

Header ($H_i$) Content ($C_i$) Stepel ($S_i$)
$E_i$ $B_{i,1}$ $B_{i,2}$ ... $B_{i,l-1}$
$\{ . . . \{ \{ \{m\}_{k_{l}} \}_{k_{l-1}} \}_{k_{l-2}} . . . \}_{k_{1}}$
$\langle K \rangle$-blocks:
$S_{i,1}$ $S_{i,2}$ ... $S_{i,d}$
$\langle 0 \rangle$-blocks:
$S_{i,d+1}$ $S_{i,d+2}$ ... $S_{i,l_{1}+1}$
Header ($H_i$)
$E_i$ $B_i$
Enc(
  $pk$ ( $P_i$ ),
  $t_i$ ,
  ( Role, $i$ , $y_i$ , $\vec{A}_i$ , $k_i$ )
)
$B_{i,1} = $ $B_{i,2} = $ ... $B_{i,l-1} = $
Encrypted with $k_{i}$:
$I_{i+1}$ $E_{i+1}$
Encrypted with $k_{i}$:
Encrypted with $k_{i+1}$:
$I_{i+2}$ $E_{i+2}$
...
Encrypted with $k_{i}$:
Encrypted with $k_{i+1}$:
...
Encrypted with $k_{i+l-2}$:
$I_{i+l-1}$ $E_{i+l-1}$

Header ($H$):

  • Consists of three parts: the ephemeral public key $epk_{i}$ , a ciphertext $E_i$ and the rest of the header $B_i$ .
    • $epk_i$: The ephemeral public key is used along with $P_i$'s private key $sk_i$ to compute the shared secret $s_i$ from the original client.
    • $E_i$: An encryption under the shared secret $s_i$ of the tuple $(i, y_i, k_i)$ where:
      • $i$  is the position in the route.
      • $y_i$ is the metadata.
      • $k_i$ is the layer key (the shared key for that layer).
    • $B_i$: The rest of the header, which includes:
      • The nonce $y_i$.
      • The time window for the onion's arrival.
      • The next hop in the routing path.

Content ($C$):

  • Contains the payload or the next layer of the onion.
  • Encrypted under the layer key, $k$ .
  • For intermediate relays, it contains the encrypted content of the next onion layer.
  • For the final recipient, it contains the actual message.

Sepal ($S$):

  • Protects the inner layers of the onion by absorbing bruises (caused by delays, tampering, or replay attacks) during transit.
  • Consists of key-blocks and null-blocks.
  • The key-blocks are encrypted versions of the bulb master key $K$.
  • The null-blocks are encrypted versions of the value 0.
  • As each mixer processes the onion, it peels a layer from the sepal:
    • If unbruised, the rightmost sepal block is dropped, retaining the same number of key blocks.
    • If bruised, the leftmost sepal block is dropped, reducing the number of key blocks.
    • This ensures that if the number of bruises exceeds a threshold $d$, the final gatekeeper cannot recover the master key $K$, making the onion undecryptable.

1. Relay / Client Registration:

  • Relays publish their existence and public keys to the bulletin board.
    • See internal/model/relay/relay.go
    • Relays send periodic heartbeat messages so that the bulletin board can maintain a list of all active relays in the network.
  • Clients register their intent to send messages with the bulletin board.
  • When a sufficient number of relays are active (given by $N$ ), and a sufficient number of clients have registered their intent-to-send messages (given by $R$ ), the bulletin board broadcasts a start signal along with the following information.
    • Each participating client receives:
      • a list of active Mixers and Gatekeepers (along with their public keys and which checkpoint onions the client needs to create).
    • All participating relays receive:
      • a list of expected nonces it should receive for each round j.
    • See internal/model/bulletin_board/bulletin_board.go

2. Initialization:

  • When a client $k$ is notified of the start of a run, it receives from the bulletin board:
    • A list of participating Mixers and Gatekeepers where each relay relay $P_i$ is associated with a public key $pk_i$ and a list of sets P_$Y_1,...,Y_{l_1}$ , where $Y_j$ represents the subset of nonces $P_i$ expects to receive during round j which the client is responsible for sending.
  • For each message to be sent, the client constructs a routing path by selecting a random subset of $l_1& Mixers and $l_2$ Gatekeepers in the network.
    • routing_path[ $1...l_1$ ] are Mixers.
    • routing_path[ $l_1 + 1...l_1 + l_2$ ] are Gatekeepers.
    • routing_path[ $l_1 + l_2 + 1$ ] is the final destination.

Forming the Onion:

  • The onion is constructed in layers, with the innermost layer containing the message encrypted with the recipient's public key.
    • Each subsequent layer $j$ contains encrypted metadata that includes:
      • A pseudorandom nonce that is unique to each onion (used to detect replay attacks).
      • The window of expected arrival time for the onion (used to detect delayed arrival).
      • The next hop in the routing path.
    • For each participant in the routing path, the client uses its corresponding session key and pseudorandom function F1
      to determine if it should create a checkpoint onion for that layer. It then uses F2 to generate a nonce for each
      checkpoint onion with its own random routing path.
      • The construction of checkpoint onions follows the same layer-by-layer encryption process as the regular onions.
        The only difference is that checkpoint onions (a.k.a. dummy onions) don't carry a payload and instead provide cover for the "real" payload-carrying onions.
      • Each layer of the onion contains the encrypted shared key which is used by the next relay in the path to decrypt the layer. This shared key is encrypted with the public key of the respective relay and included in the header of each layer.
  • All onions are sent to their first hop (a Mixer).

3. Mixing and Bruising:

  • When a Mixer receives an onion and decrypts its outer layer (header), it reveals the following data:
    • Multiple key slots that contain copies of the decryption key. If an onion is bruised, one of these key slots is invalidated.
    • The nonce (decrypted using the session key shared with the original sender).
    • The window of expected arrival time for the onion.
    • The next hop in the path (another Mixer or a Gatekeeper).
  • The Mixer checks for delays or signs of tampering.
    • To detect a delay, the mixer compares the received "time" (see local time) with an expected time window. If an onion arrives outside this window, it is considered delayed.
    • To check for tampering, the mixer verifies the nonce against its expected set $Y_k$ (calculated with session key).
      • If the nonce is valid, the relay removes the nonce from $Y_k$.
      • Otherwise, the onion is considered tampered with.
  • If the onion is delayed or tampered with, the Mixer invalidates one of the key slots in the onion.
  • The onion is then forwarded to the next relay in the path.
  • The number of protection layers is managed in a way that does not reveal any positional information. For instance,
    additional dummy layers might be used to mask the actual number of active layers.

4. Intermediate Relays:

  • The onion continues to travel through the network of Mixers:
    • Each Mixer decrypts its layer, possibly adds bruises (invalidates key slots), and forwards the onion.
    • This process continues until the onion reaches a Gatekeeper.

5. Gatekeeping:

  • The Gatekeeper receives the onion and checks the number of valid key slots.
  • If the number of valid key slots is below a predefined threshold, the Gatekeeper discards the onion.
    • A threshold is determined based on the network's tolerance for delays and replay attacks
  • If the onion is acceptable, the Gatekeeper forwards it to the next relay (which can be another Mixer or a Gatekeeper, depending on the path).

6. Final Destination

  • The recipient (client) always receives the onion from a Gatekeeper, never directly from a Mixer.
  • The recipient receives the onion and decrypts it using their private key.
  • The message is revealed, completing the communication process.

Adversary Simulation Framework

Potential Adversarial Functions:

  • Observe all received onions and their metadata.
  • Bruise or delay onions that pass through their layer (but cannot modify bruise count).
  • Selectively drop onions to cause disruption, such as making onions appear delayed when they reach the next hop.
  • Inject their own onions, replicate onions (replay attack) to create noise or mislead other relays.

Verifying Differential Privacy:

  1. Create neighboring pairs of datasets that differ by one message or communication path.
  2. Run the protocol on both neighboring datasets.
  3. Record the adversary’s view for each dataset.
  4. Measure how the distributions of the adversary’s views differ between the neighboring datasets.
  5. Calculate the empirical probability of the adversary’s view for each dataset.
  6. Verify that the privacy loss conforms to the differential privacy inequality (for ε and δ).

Notes

No Global Clock:

  • In the $\Pi_t$ protocol, each relay maintains a local clock ($c_j$) to track the progression of onion layers.
    • Threshold (τ): A system parameter representing the fraction of checkpoint onions needed for the relay to progress its local clock.
    • Checkpoints ($Y_k$): A set of expected nonces for the k-th layer checkpoint onions.
  1. Receiving Onions:
  • A relay $P_i$ (acting as a mixer) receives an onion $O$ and determines whether it was received "on time"
    or not relative to $P_i$'s local clock.
  • If the onion $O$ arrived late, $P_i$ bruises the onion and forwards the bruised onion O' to the next destination.
  1. Processing Onions:
  • If $P_i$ is the last mixer on the routing path, it sends the peeled onion O' to the first gatekeeper $G_1$.
  • If $P_i$ is either early or on time, it places the peeled onion O' in its message outbox.
  1. Checking Nonces:
  • If processing $O$ reveals a non-empty nonce $y$ ≠ ⊥, $P_i$ checks whether $y$ belongs to the set
    $Y_k$ (the set of $k$-th layer checkpoint nonces Pi expects to see from the onions it receives).
  • If $y$ is expected, $P_i$ increments $c_k$ by one and updates $Y_k$ to exclude $y$.
  1. Advancing the Local Clock:
  • Upon processing a sufficient number of j-th layer onions (i.e., if $c_j$ ≥ τ |$Y_j$|),
    $P_i$ sends out these onions (but not the onions for future hops) in random order and advances its local clock $c_j$ by one.
  • Onions are batch-processed and sent out in random order at honest intermediaries when batch-processed.

Experiment Setup

  • Clients, $[C_1...C_R]$
    • We will choose target senders, $C_1$ and $C_2$
  • Relays, $[R_1...R_N]$
  • Adversary, $\mathcal{A}$
    • The adversary always drops onions from $C_1$
    • $\mathcal{A}$'s observables, $\text{View}(\sigma_i)$, for a scenario, $i$, include the number of onions sent and received by each client and relay.
      • Let $O_{k,i}$ be the distribution (over many executions of scenario $i$) of the number of onions that client $C_k$ receives by the end of the run.

Senarios

  • We consider two neighboring scenarios for our experiment:

    • Scenario 0 ($\sigma_0$):
      • $C_1$ sends a message to $C_R$
      • $C_2$ sends a message to $C_{R-1}$
    • Scenario 1 ($\sigma_1$):
      • $C_1$ sends a message to $C_{R-1}$
      • $C_2$ sends a message to $C_R$
  • In both scenarios, there are also dummy (checkpoint) onions to provide cover traffic.

  • For example, in Scenario 1 where $C_2$ sends a message to $C_N$, the number of onions, $O_N$, received by $C_N$ will be shifted to the right by 1 compared to $O_{R-1}$ since $C_{R-1}$'s onion was dropped by $\mathcal{A}$.

Adversary's Task

The adversary observes the network volume (number of onions each client and relay are sending and receiving), along with routing information (who each relay are sending to/receiving from each round). Each round, the adversary updates the probability distribution of where the message-bearing onion is likely located. The adversary's goal is to determine the most probable client $[C_2...C_N]$ that received a message-bearing onion from $C_1$.

Computing the Adversary's Advantage

  • We aim to compute the ratio that the adversary is correct (i.e., the "advantage").
  • The "advantage" is essentially a measure of how well the adversary can use the observed data to make correct assumptions about which client sent the onion.
  • This is ideally bounded by $e^\epsilon$.

Installation (for development)

Clone the repository:

git clone https://github.com/HannahMarsh/pi_t-experiment.git;
cd pi_t-experiment

Install dependencies:

bash go mod tidy

Build the project:

go build -v ./...

Development

Run tests:

go test -v ./...

Usage

All configurations are set in the config/config.yaml file.

Running the Bulletin Board

go run cmd/bulletin-board/main.go -logLevel=<logLevel>
  • Options:
    • <logLevel>: (optional) The logging level (e.g., "info", "debug", "warn", "error").

Running a Relay

go run cmd/relay/main.go -id=<id> -logLevel=<logLevel>
  • Options:
    • <id>: The unique identifier for the relay.
    • <logLevel>: (optional) The logging level (e.g., "info", "debug", "warn", "error").

Running a Client

go run cmd/client/main.go -id=<id> -logLevel=<logLevel>
  • Options:
    • <id>: The unique identifier for the client.
    • <logLevel>: (optional) The logging level (e.g., "info", "debug", "warn", "error").

Running the Metric Collector (Scrapes Prometheus endpoints for Clients and relays)

go run cmd/metrics/main.go -logLevel=<slogLevel>
  • Options:
    • <logLevel>: (optional) The logging level (e.g., "info", "debug", "warn", "error").

Running the Browser-Based Visualization Server

go run cmd/visualizer/visualizer.go -port <port>
  • Options:
    • <port>: The port number for the visualization server. Access at http://localhost:<port>.

Endpoints

Bulletin Board

  • Register Client: POST /registerClient
  • Register Relay: POST /registerRelay

Relay & Client

  • Receive Onion: POST /receive
  • Get Status: GET /status
  • Start Run: POST /start
  • Prometheus Metrics: GET /metrics - Note that this is served on a different port which is specified in the config.yml file

When implementing the onion routing protocol, it helps to run the visualization server in real time to view the messages and onions processes by each client and relay. For a small number of clients/relays, this makes debugging the protocol easier.

Obviously, it is not recommended to run the visualization program once we deploy the simulation in a distributed environment (with potentially hundreds of relays and rounds).


References

  • [ALU24] - Ando M, Lysyanskaya A, Upfal E. Bruisable Onions: Anonymous Communication in the Asynchronous Model. Cryptology ePrint Archive. 2024.
  • [TGL+17] - Nirvan Tyagi, Yossi Gilad, Derek Leung, Matei Zaharia, and Nickolai Zeldovich. Stadium: A distributed metadata-private messaging system. In Proceedings of the 26th Symposium on Operating Systems Principles, Shanghai, China, October 28-31, 2017, pages 423–440. ACM, 2017.
  • [vdHLZZ15] - Jelle van den Hooff, David Lazar, Matei Zaharia, and Nickolai Zeldovich. Vuvuzela: scalable private messaging resistant to traffic analysis. In Ethan L. Miller and Steven Hand, editors, Proceedings of the 25th Symposium on Operating Systems Principles, SOSP 2015, Monterey, CA, USA, October 4-7, 2015, pages 137–152. ACM, 2015.