This repository contains the code for the paper *Efficient Verifiable Differential Privacy with Input Authenticity in the Local and Shuffle Model* by Tariq Bontekoe, Hassan Jameel Asghar, and Fatih Turkmen. Links: arXiv; IACR ePrint
We make our code available open-source under the MIT license, so that anyone can re-use our code and/or reproduce the results from our paper.
This repository implements the client and server functionality for the three VLDP schemes presented in the paper (Base, Expand, and Shuffle). Moreover, we implement benchmarks on random data and example scripts on real datasets to evaluate our schemes. The benchmarks give an insight into the client/server performance, and the examples show the behaviour on real data. Next to this, we also include the Jupyter notebooks that were used for dataset parsing and for determining the DP parameters used in the examples and the paper. Finally, we include scripts for running all benchmarks and examples at once, parsing the results, and transforming these into the plots presented in the paper.
The simplest way to run the code is by using Docker. To reproduce the experiments from the paper (on your own hardware, so the runtimes may differ, but the trend should be similar) you can run the `run_all` container (or `run_all_fast` for faster, but less precise results). To view and reproduce the datasets we used for the experiments in our paper, one can run the `notebook` container.
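Concretely, that amounts to something like the following sketch (the full Docker instructions follow below):

```sh
# Build the container image, then run all (fast) experiments in the background.
# Result locations are described in the Docker section below.
docker-compose build
docker-compose up -d run_all_fast --build
```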
Note: In our benchmark and example scripts, the trusted environment, communication, and (if present) shuffler are emulated, as these were not needed to measure the client/server performance and communication costs. The code has been written in such a way that messages are easily serialized, and one can use any existing or new library to implement these parts.
This repository contains the following relevant directories and files:
- `benches`: Rust code implementing the benchmarks on random data (either for histogram or real-valued data)
- `examples`: Rust code implementing the use cases on real data (geodata/histogram or smart meter/real-valued)
- `resources/shuffle-model-parameters`: datasets for both use cases and the Jupyter notebook for creating these datasets from the original raw data + determining the DP parameters
- `scripts`: Convenient scripts for automated running of benchmarks, parsing the raw results, and making plots. The `linux` subfolder contains the scripts for running on Linux-based systems, and `windows` for Windows-based systems. The Python scripts parse the raw results or make plots.
  - `run_all`: Runs all benchmarks+examples, parses data, and makes plots. This gives all the results used in the paper. (3 warmup runs, 100 measurement runs)
  - `run_all_fast`: Same as above, but only 1 warmup run and 10 measurement runs
  - `run_benches`: Runs performance benchmarks on the client/server to obtain accurate runtime estimates. (3 warmup runs, 100 measurement runs)
  - `run_benches_fast`: Same as above, but only 1 warmup run and 10 measurement runs
  - `run_additional_benches`: Runs more benchmarks to obtain accurate runtime estimates for different Merkle tree sizes and amounts of randomness. (3 warmup runs, 100 measurement runs)
  - `run_additional_benches_fast`: Same as above, but only 1 warmup run and 10 measurement runs
  - `run_geo_data_examples`: Runs benchmarks on the Geodata use case.
  - `run_smart_meter_examples`: Runs benchmarks on the Smart Meter use case.
- `src`: Actual implementation of the client and server code for our VLDP schemes.
- `PAPER_RESULTS.ZIP`: Raw results as generated for the paper, accompanied by the parsed version and the plots made from them.
The simplest way to run the code, scripts, and notebook is by means of Docker. We have defined a single container which can be used to run all code. For ease of use, we set up the most convenient use cases (including an interactive shell into the container) using Docker Compose.
First make sure you install Docker (in case you do not yet have it):
- Either install Docker Desktop, which provides at least Docker Engine and Docker Compose (make sure to launch it before running).
- Or (on Linux, and a bit more work) install Docker Engine and Docker Compose independently.
First, we have to build the Docker containers; this is done using `docker-compose build`.

Then one can run any of the following containers using `docker-compose up -d <container_name> --build`, where `<container_name>` is replaced by any of the following (a complete example follows the list):
- `notebook`: This runs the Jupyter notebook and enables port forwarding of port 8888 to port 8888 on your local machine. The notebook can be accessed by using a browser to go to `localhost:8888`. The notebook can be found in `resources/shuffle-model-parameters`. Its progress and outputs are mounted to the local folder with the same path in your repository, so any changes there are also made locally (PLEASE BE AWARE OF THIS!).
- `run_all`: This runs all benchmarks (i.e., the `run_all` script). Results are made available locally through a mount at `docker_mounts/run_all`.
- `run_all_fast`: This runs all benchmarks in the fast setting (i.e., the `run_all_fast` script). Results are made available locally through a mount at `docker_mounts/run_all_fast`.
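For instance, a typical session for the `notebook` container might look like the following sketch (run from the repository root):

```sh
# Build the container image(s) once.
docker-compose build

# Start the notebook container in the background and follow its logs.
docker-compose up -d notebook --build
docker-compose logs -f notebook

# When done, stop and remove the container.
docker-compose down
```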
To run any container in interactive mode (i.e., it will not run any command at launch), run `docker-compose run <container_name>`. This can be convenient when you want to run your own commands in one of the provided containers.
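As a sketch, an interactive session could look like this (the script path is taken from the scripts overview above):

```sh
# Open an interactive shell in the run_all container; per the note above,
# the container does not run its default command at launch.
docker-compose run run_all

# Inside the container, run e.g. a single benchmark script manually:
./scripts/linux/run_benches.cmd
```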
Note: In case you wish to change the port forwarding on your local machine, you can open the `compose.yaml` file and change any occurrence of `8888:8888` to `<your port>:8888`.
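For example, the following illustrative one-liner (GNU `sed`, i.e., on Linux) would remap the notebook to local port 9999:

```sh
# Replace the local half of every 8888:8888 port mapping in compose.yaml with 9999.
sed -i 's/8888:8888/9999:8888/g' compose.yaml
```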
Depending on what exactly you wish to run, you can follow either (or both) of the following:
First do the following:

- Clone this repository/Download this code
- Open a terminal inside the git repo.
- All commands below can be appended with `--features print-trace` to show timing information.
- To run an example: `cargo run --example <name>` or `cargo run --release --example <name>` (release mode; this is the most efficient, and what should be used in practice).
  - To see the available examples: `cargo run --example`
- To run the benchmarks, see below.
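Put together, a typical native run might look like this (`<name>` is a placeholder; `cargo` reports the available examples when no name is given):

```sh
# List the available examples.
cargo run --example

# Run one example in release mode with timing information.
cargo run --release --example <name> --features print-trace
```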
First do the following:

- Clone this repository/Download this code
- Open a terminal inside the git repo.
- (Optional:) Make and activate a virtual environment
- Install the requirements: `python -m pip install -r requirements.txt`
- Type the following in your terminal: `jupyter notebook`
- A browser should open automatically; if not, please do so yourself and go to http://localhost:8888
- Go to the folder `resources/shuffle-model-parameters`, open `LDP-Shuffle-Parameters.ipynb`, and run it.
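In shell form, the steps above amount to something like the following (the virtual-environment step is optional):

```sh
# Optional: create and activate a virtual environment.
python -m venv .venv
source .venv/bin/activate

# Install the Python dependencies and launch the notebook server.
python -m pip install -r requirements.txt
jupyter notebook
```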
To run everything, parse the raw results, and make plots, run `./scripts/linux/run_all.cmd` (on Linux) or `.\scripts\windows\run_all.cmd` (on Windows). This will create a `results` folder containing three folders:

- `raw`: The raw logs from the benchmarks.
- `parsed`: CSV files containing the relevant information from the raw logs.
- `plots`: Plots made from the timing data of the benchmarks.
Alternatively, one can also run a subset of the benchmarks/examples/parsing by running separate scripts in the `scripts` folder.
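For example (script names are those listed in the repository overview above; the file extension follows the `run_all` script shown earlier):

```sh
# Run only the core performance benchmarks (Linux variant).
./scripts/linux/run_benches.cmd

# Or only the Geodata use-case benchmarks.
./scripts/linux/run_geo_data_examples.cmd
```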