This repository contains a curated list of resources related to privacy engineering, a field that aims to provide methodologies, tools, and techniques to ensure that systems provide acceptable levels of privacy. Building privacy in throughout the whole engineering process is also known as Privacy by Design (PbD), an approach to systems engineering initially developed by Dr. Ann Cavoukian.
For corrections, suggestions, or missing papers, please either open an issue or submit a pull request.
- Privacy Engineering - Nishant Bhajaria
- AI & Privacy: How To Find Balance - Punit Bhatia | Eline Chivot
- The Privacy Engineer's Manifesto: Getting from Policy to Code to QA to Value - Michelle Dennedy, Jonathan Fox, Tom Finneran, with a contribution by Dr. Ann Cavoukian, Ph.D.
- An Introduction to Privacy for Technology Professionals - Travis Breaux
- Strategic Privacy by Design - R. Jason Cronk
- Software Security: Building Security In - Gary McGraw
- The Architecture of Privacy: On Engineering Technologies that Can Deliver Trustworthy Safeguards - Courtney Bowman, Ari Gesher, John K. Grant
- Privacy Engineering: A data flow and ontological approach - Ian Oliver
- Information Privacy Engineering and Privacy by Design: Understanding Privacy Threats, Technology, and Regulations Based on Standards and Best Practices - William Stallings
- Privacy Design Strategies (The Little Blue Book) - Jaap-Henk Hoepman
- Privacy Cookbook for Business Processes - A collection of Business Process Modelling (BPM) privacy resources
- The NIST Privacy Engineering Program (PEP) - Supports the development of trustworthy information systems by applying measurement science and system engineering principles to the creation of frameworks, risk models, guidance, tools, and standards that protect privacy and, by extension, civil liberties.
- NIST Special Publication 800-53 Revision 5 - Security and Privacy Controls for Information Systems and Organizations
- NISTIR 8062 - An Introduction to Privacy Engineering and Risk Management in Federal Systems
- NIST Privacy Framework - A tool for improving privacy through enterprise risk management, version 1.0
- OWASP Top 10 Privacy Risks - The OWASP Top 10 Privacy Risks Project provides a top 10 list for privacy risks in web applications and related countermeasures.
- LINDDUN privacy threat modeling framework - The LINDDUN threat modeling framework provides support to systematically elicit and mitigate privacy threats in software architectures.
- LINDDUN GO - A Lightweight Approach to Privacy Threat Modeling
- Operationalizing the Legal Principle of Data Minimization for Personalization - Asia J. Biega, Peter Potash, Hal Daumé III, Fernando Diaz, Michèle Finck
- OpenMind - An open-source community of over 10,000 researchers, engineers, mentors, and enthusiasts committed to making a fairer, more prosperous world.
- Harvard University Privacy Tools Project - The Privacy Tools Project is a broad effort to advance a multidisciplinary understanding of data privacy issues and build computational, statistical, legal, and policy tools to help address these issues in a variety of contexts.
- MITRE Systems Engineering Guide - Based on the OECD Privacy Framework privacy principles
- Bias Mitigation in Data Sets - Tackles sample bias, non-response bias, cognitive bias, and disparate impact associated with protected categories in three parts/papers: data, algorithm construction, and output impact. This paper covers the data part.
- Information Commissioner's Office: How to use AI and personal data appropriately and lawfully
- Deidentification 101: A lawyer’s guide to masking, encryption and everything in between - Alfred Rossi, Andrew Burt, Sophie Stalla-Bourdillon
- Deidentification 201: A lawyer’s guide to pseudonymization and anonymization - Alfred Rossi, Andrew Burt, Sophie Stalla-Bourdillon
- Simulating event pipelines for fun and profit (and for testing too) - Andrew Colombi, Tonic.AI
- Generate dummy data in SQL Server - Vinicius Negrisolo
- How to Create PostgreSQL Test Data - Alex Thompson
- Data retention in a distributed system - Lea Kissner
- Aggregated data provides a false sense of security - Luk Arbuckle
- Threat Models for Differential Privacy - A series on differential privacy by NIST
- Introduction to Homomorphic Encryption - Katharina Koerner
- [Legal perspectives on PETs: Homomorphic encryption](https://medium.com/golden-data/legal-perspectives-on-pets-homomorphic-encryption-9ccfb9a334f) - Katharina Koerner
The objective of privacy-enhancing technologies (PETs) is to protect personal data and to assure users of technology that their information is kept confidential and that data protection is a priority for the organizations responsible for sensitive private information.
- HElib - an open-source (Apache License v2.0) software library that implements homomorphic encryption (HE)
- Condenser - database subsetting tool for Postgres and MySQL which takes a representative sample of your data in a manner that preserves the integrity of your database, e.g., give me 5% of my users.
- Masquerade - A Postgres proxy that masks sensitive datasets
- Google Fully Homomorphic Encryption (FHE) libraries - Open-source libraries and tools to perform fully homomorphic encryption (FHE) operations on an encrypted data set.
- Microsoft SEAL - An easy-to-use open-source (MIT licensed) homomorphic encryption library developed by the Cryptography and Privacy Research Group at Microsoft
- Secure Multi-Party Computation (MPC) with Go - This project implements secure two-party computation with garbled circuit protocol.
- FBPCF (Facebook Private Computation Framework) - The Private Computation Framework (PCF) library builds a scalable, secure, and distributed private computation platform to run secure computations on a production level. PCF library supports running the computation on AWS Cloud and is able to integrate various private computation technologies. Specifically, it leverages EMP-toolkit to enable privacy-preserving computations.
- FBPCS (Facebook Private Computation Service) - A secure, privacy-safe, and scalable architecture to deploy MPC (Multi-Party Computation) applications in a distributed way on virtual private clouds.
- Mainzelliste SecureEpiLinker (MainSEL) - Privacy-Preserving Record Linkage using Secure Multi-Party Computation.
- Google's differential privacy libraries - Libraries to generate ε- and (ε, δ)-differentially private statistics over datasets.
- Diffprivlib - The IBM Differential Privacy Library. Diffprivlib is a general-purpose library for experimenting with, investigating and developing applications in, differential privacy.
- Awesome zero knowledge proofs - A curated list of awesome things related to learning zero knowledge proofs
- Easy-data-masking - A JavaScript plugin for data masking
- DataMasker - A free data masking and/or anonymizer library for SQL Server written in .NET
- mangle - This library provides functionality to sanitize text and HTML data. This can be integrated into tools that export or manipulate databases/files so that confidential data is not exposed to staging, development or testing systems.
- FHIR Tools for Anonymization - An open-source project that helps anonymize healthcare data in the Fast Healthcare Interoperability Resources (FHIR, pronounced "fire") format, on-premises or in the cloud, for secondary usage such as research, public health, and more.
Software privacy testing tools aim to make systems more resistant to privacy threats by identifying privacy weaknesses and vulnerabilities in systems that collect, process, and store sensitive private information.
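
Several of the libraries listed above generate ε-differentially private statistics. As a rough illustration of the underlying idea only, here is a minimal sketch of the Laplace mechanism applied to a counting query. It does not use or reproduce the API of any listed library; the function names `laplace_noise` and `private_count` are hypothetical, and the sketch assumes that adding or removing one record changes the count by at most 1 (sensitivity 1).

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample from Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng=None) -> float:
    """Return a noisy count of the records matching `predicate`.

    Assumption: a counting query has sensitivity 1, so Laplace noise with
    scale 1 / epsilon is added to the true count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 67, 29]  # made-up illustrative data
    print(private_count(ages, lambda age: age >= 40, epsilon=1.0))
```

A smaller `epsilon` adds more noise and gives stronger protection. Naive floating-point sampling like this is known to leak information in subtle ways, so production systems should rely on a vetted library such as those listed above rather than on a sketch like this one.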
- Privado - An open source static code analysis tool to discover data flows in code
- PrivacyRaven - A privacy testing library for deep learning systems. You can use it to determine the susceptibility of a model to different privacy attacks; evaluate privacy preserving machine learning techniques; develop novel privacy metrics and attacks; and repurpose attacks for data provenance and other use cases.
- TensorFlow Privacy - A Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.
- Machine Learning Privacy Meter - A Python library that enables quantifying the privacy risks of machine learning models. (NUS Data Privacy and Trustworthy Machine Learning Lab)
- Adversarial Robustness Toolbox (ART) (IBM)
- Format Preserving Lorem Ipsum Text Generation
- Bogus - A simple and sane fake data generator for .NET languages like C#, F# and VB.NET.
- Event Timestamp Generator - Generate real world event sequence data using an exponential distribution.
- Mimesis - A high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
- pydbgen - Generate full data tables with meaningful yet random entries for the most commonly encountered database fields, such as name, age, birthday, credit card number, SSN, email ID, physical address, company name, and job title.
- Synthea - Synthetic Patient Population Simulator. The goal is to output synthetic, realistic (but not real), patient data and associated health records in a variety of formats.
- Gretel Synthetics - Synthetic data generators for structured and unstructured text, featuring differentially private learning.
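
One of the generators above produces event sequence data using an exponential distribution. The following is a minimal sketch of that general technique, not the implementation of any listed tool: it draws exponentially distributed gaps between consecutive events, which models a Poisson arrival process. The function name `generate_event_timestamps` and its parameters are hypothetical.

```python
import random
from datetime import datetime, timedelta


def generate_event_timestamps(start, events_per_hour, count, seed=None):
    """Return `count` synthetic timestamps after `start`.

    Gaps between consecutive events are drawn from an exponential
    distribution whose mean is 3600 / events_per_hour seconds.
    """
    if events_per_hour <= 0:
        raise ValueError("events_per_hour must be positive")
    rng = random.Random(seed)
    rate_per_second = events_per_hour / 3600.0
    timestamps = []
    current = start
    for _ in range(count):
        # Advance the clock by one exponentially distributed gap.
        current += timedelta(seconds=rng.expovariate(rate_per_second))
        timestamps.append(current)
    return timestamps


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0, 0)  # arbitrary illustrative start time
    for ts in generate_event_timestamps(start, events_per_hour=120, count=5, seed=42):
        print(ts.isoformat())
```

Passing a fixed `seed` makes the sequence reproducible, which is convenient for test fixtures that must not contain real user activity.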
Privacy attacks are the techniques that attackers use to exploit the vulnerabilities in systems that contain sensitive private information. Attacks are often confused with vulnerabilities, which can be discovered using Privacy Testing Tools.
- Awesome Attacks on Machine Learning Privacy
- Differential Privacy Defenses and Sampling Attacks for Membership Inference - Shadi Rahimian, Tribhuvanesh Orekondy, Mario Fritz
- Exposed! A Survey of Attacks on Private Data - Cynthia Dwork, Adam Smith, Thomas Steinke, Jonathan Ullman
- IBM AI Ethics - IBM’s multidisciplinary, multidimensional approach to trustworthy AI
- Podcast: The Robot Brains Podcast - Charles Isbell makes the case for more ethical AI
- Podcast: Exploring the privacy, ethical issues with emotion-detection tech
- Reviving Purpose Limitation and Data Minimisation in Personalisation, Profiling and Decision-Making Systems - Asia Biega and Michèle Finck
- Privacy Law and Data Protection - University of Pennsylvania
- Introduction to GDPR: General Data Protection Regulation - University College London
- Understanding the GDPR - University of Groningen
- Protecting Health Data in the Modern Age: Getting to Grips with the GDPR - University of Groningen
- Privacy Engineer Sample Job Description
- OpenMind - An open-source community of over 10,000 researchers, engineers, mentors, and enthusiasts committed to making a fairer, more prosperous world.