Code for NIPS'2017 paper
steven7woo/Accuracy-First-Differential-Privacy
Code for paper "Accuracy First: Selecting a Differentially Private Level for Accuracy-Constrained ERM"
by Ligett, Neel, Roth, Waggoner, Wu
https://arxiv.org/abs/1705.10829

---------------------------------------------------------------
-- Disclaimer

This code is used for simulations of the performance of differentially private algorithms, but should not be used in practice to protect actual sensitive data! The theorems are proved for true randomness and real numbers, while the code uses Python's internal random generators and floating-point numbers.

---------------------------------------------------------------
-- Requirements

1. python3 with the numpy, matplotlib, and scikit-learn libraries.

2. (optional) Linux, bash, and the GNU parallel utility, available from most repositories. These are not mandatory; a bash script uses them to run many experiments at once and in parallel.

---------------------------------------------------------------
-- Usage

Navigate into the code/ directory to run the code.

You will need a dataset file in plain text. Each row of the file is a data point. It should contain d+1 space-separated numbers (for some d), where the first d are "x" and the last is "y". It is assumed that the L1-norm of each x is at most 1, and each |y| <= 1. For logistic regression, each y should be plus or minus 1. See the data/ directories for downloading the datasets used in the paper and processing them into this format.

You can run a single experiment at a time and print the output, or run a set of experiments and save the outputs into folders.

---------------------------------------------------------------
-- To run a single experiment for a given data set and parameters:

    $ python3 run_ridge.py [args]
OR
    $ python3 run_logist.py [args]

Run them with no arguments for help on the args.

---------------------------------------------------------------
-- To run a set of experiments:

1. You should have a dataset file and also a file with a list of the alpha parameters to try, called alphalist.txt. E.g. you can edit 'gen_alphalist.py' to your liking and then run
    $ python3 gen_alphalist.py > alphalist.txt

2. Edit the file 'run_sims.sh' to set all the parameters to your liking. Also edit the top of the file 'prep_simulations.py' to set the variable 'run_file_name'. It should be "run_many_logist.py" if you want logistic regression or "run_many_ridge.py" if you want ridge regression.

3. Execute the following (full explanation of what it does below):
    $ ./run_sims.sh

4. Execute the following to read the results, print some output about them, and produce some plots:
    $ python3 collect_results_ridge.py sims-results/

----------------
About run_sims.sh:

This will create a folder sims-results/ and run a set of simulations, writing the results into that folder along with an 'about.py' file that records all the parameters used. It does the following:

a. Runs
    python3 prep_simulations.py [args]
   which creates the folder sims-results/ and writes about.py into it. It also writes a list of commands to a temporary file.

b. Invokes GNU parallel to run the commands in parallel.
   WARNING: for large datasets, this may use up all your RAM and crash your computer! Use --max-procs to limit the number of commands that run simultaneously.
   Each command has the form shown in step c.

c. python3 run_many_ridge.py [args]
   Runs num_trials experiments for the given parameters, writing the outputs into sims-results/param-i/ for the given i.
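
----------------
Checking a dataset file (illustrative sketch):

The dataset format described under Usage (d+1 space-separated numbers per row, L1-norm of x at most 1, |y| <= 1, and y equal to plus or minus 1 for logistic regression) can be checked before running any experiments. The sketch below is not part of this repository; the function name check_dataset, its arguments, and the small tolerance for floating-point rounding are all assumptions made for illustration.

```python
import math


def check_dataset(lines, logistic=False, tol=1e-9):
    """Return a list of problems found in dataset rows (empty if none).

    Hypothetical helper, not part of the repository. `lines` is any
    iterable of strings, e.g. an open file. `tol` is an assumed slack
    for floating-point rounding in the norm and range checks.
    """
    problems = []
    width = None  # number of fields (d + 1), fixed by the first data row
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            values = [float(f) for f in fields]
        except ValueError:
            problems.append(f"line {lineno}: non-numeric field")
            continue
        if not all(math.isfinite(v) for v in values):
            problems.append(f"line {lineno}: non-finite value")
            continue
        if len(values) < 2:
            problems.append(f"line {lineno}: need at least one x and one y")
            continue
        if width is None:
            width = len(values)
        elif len(values) != width:
            problems.append(
                f"line {lineno}: expected {width} numbers, found {len(values)}"
            )
            continue
        *x, y = values  # first d numbers are x, the last one is y
        l1_norm = sum(abs(v) for v in x)
        if l1_norm > 1 + tol:
            problems.append(f"line {lineno}: L1-norm of x is {l1_norm:g} > 1")
        if logistic:
            if y not in (1.0, -1.0):
                problems.append(f"line {lineno}: y must be +1 or -1, got {y:g}")
        elif abs(y) > 1 + tol:
            problems.append(f"line {lineno}: |y| is {abs(y):g} > 1")
    return problems
```

To check a file, open it and pass the file object as `lines`, setting logistic=True for a logistic regression dataset; an empty returned list means every row satisfied the assumptions stated above.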