
Chung-ju/Data-Minimization


Data Minimization

Update: 2023/2/4

1. Introduction

The principle of "data minimization" means that a data controller should limit the collection of personal information to what is directly relevant and necessary to accomplish a specified purpose. They should also retain the data only for as long as is necessary to fulfil that purpose.
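The two halves of this principle (collect only what the purpose needs, and keep it only as long as the purpose needs it) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `PURPOSES` registry, its field sets, and its retention periods are hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta

# Hypothetical purpose registry (an assumption for this sketch): the fields
# each purpose needs and how long data collected for it may be kept.
PURPOSES = {
    "newsletter": {"fields": {"email"}, "retention": timedelta(days=365)},
    "shipping": {"fields": {"name", "address"}, "retention": timedelta(days=90)},
}


def minimize_record(record, purpose):
    """Keep only the fields that the stated purpose needs."""
    allowed = PURPOSES[purpose]["fields"]
    return {key: value for key, value in record.items() if key in allowed}


def is_expired(collected_at, purpose, now):
    """Return True once the retention period for the purpose has passed."""
    return now - collected_at > PURPOSES[purpose]["retention"]


record = {
    "name": "Alice",
    "email": "alice@example.com",
    "address": "1 Main St",
    "birthday": "1990-01-01",
}
minimized = minimize_record(record, "newsletter")
```

Here `minimized` contains only the `email` field, and `is_expired` can drive a periodic clean-up job that deletes records once their purpose no longer justifies keeping them.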

2. Different stages

Stage 1: data collection

Original paper: The Principle of Least Sensing: A Privacy-Friendly Sensing Paradigm for Urban Big Data Analytics.

What is least sensing? When conducting urban big data analysis involving personal data, a data processing entity must sense and collect only the minimum information necessary for the specified analysis purpose.
How to interpret "minimum"? The most intuitive interpretation concerns data quantity, i.e., sensing the smallest amount of data required for the purpose. Other interpretations also exist, based on data precision, data sensitivity, and data predictability.
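Two of these interpretations, quantity and precision, are easy to illustrate at the sensing step. The sketch below is only an assumption about how they might be applied to location traces; the function names, the subsampling rate, and the rounding level are hypothetical and do not come from the paper.

```python
def reduce_quantity(samples, keep_every=10):
    """Quantity: keep only every k-th sample of a sensor trace."""
    return samples[::keep_every]


def reduce_precision(lat, lon, decimals=2):
    """Precision: coarsen a GPS fix by rounding its coordinates.

    Two decimal places correspond to roughly 1 km in latitude.
    """
    return round(lat, decimals), round(lon, decimals)


trace = list(range(100))            # stand-in for 100 raw sensor readings
sparse_trace = reduce_quantity(trace)
coarse_fix = reduce_precision(25.03396, 121.56447)
```

Whether a given subsampling rate or rounding level is still sufficient depends on the analysis purpose, so in practice both parameters would be tuned against the task rather than fixed in advance.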

Stage 2: data inference

Search for some FL-related papers.

An SDC repository: AI-SDC

Stage 3: data audit

3. Terminology explanation

  • Data quantity minimization
    • Data retention heuristics focusing on performance-related properties of data.
    • Retain the most recent data while discarding old data.
  • Data quality minimization: aims to reduce the quality of data without reducing its overall quantity.
  • Performance-based data minimization
    • Global data minimization: aims to minimize per-user data collection subject to meeting a target mean performance across users.
    • Per-user data minimization: aims to minimize per-user data collection subject to the minimum performance across all users (i.e., the worst-off user) meeting a target.
  • Breadth-based data minimization: aims to minimize the number of features.
  • Depth-based data minimization: aims to minimize the overall amount of data collected for one data modality.
  • Runtime data minimization: aims to minimize newly collected data for analysis or prediction.
  • Personalized data minimization: aims to allow different users to reveal more or less about different aspects of their lives based on their own personal preference.
  • Query-driven data minimization:
  • Language-based data minimization:
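The difference between the global and per-user variants of performance-based data minimization can be made concrete with a toy example. The sketch below assumes a common per-user collection budget `n` and a made-up table `perf[user][n]` giving each user's performance at that budget; the numbers, the target of 0.55, and the helper names are all hypothetical.

```python
def smallest_budget(perf, target, aggregate):
    """Return the smallest per-user budget whose aggregated performance
    meets the target, or None if no budget in the table does.

    perf[user][n] is the (hypothetical) performance for that user when
    n data points are collected from them.
    """
    n_budgets = len(next(iter(perf.values())))
    for n in range(n_budgets):
        if aggregate(curve[n] for curve in perf.values()) >= target:
            return n
    return None


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Made-up performance curves for three users at budgets n = 0, 1, 2, 3.
perf = {
    "u1": [0.2, 0.6, 0.8, 0.9],
    "u2": [0.1, 0.3, 0.5, 0.8],
    "u3": [0.4, 0.9, 0.9, 0.9],
}

global_n = smallest_budget(perf, 0.55, mean)    # constraint on the mean
per_user_n = smallest_budget(perf, 0.55, min)   # constraint on the worst user
```

With these numbers the global variant stops at a smaller budget than the per-user variant, because a good average can hide a user (`u2` here) who is still served poorly.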

4. Related papers

| Title | Stage | Link |
| --- | --- | --- |
| Limiting data collection in application forms: A real-case application of a founding privacy principle. | Data collection | PST2012 |
| Privacy Architectures: Reasoning About Data Minimisation and Integrity. | Data collection | STM2014 |
| Data Minimisation: A Language-Based Approach. | Data collection | SEC2017 |
| Towards Query-Driven Data Minimization. | Data collection | LWDA2018 |
| Student Success Prediction and the Trade-Off between Big Data and Data Minimization. | Data collection | DeLFI2018 |
| Query-Driven Data Minimization with the DataEconomist. | Data collection | EDBT2019 |
| A Data Minimization Model for Embedding Privacy into Software Systems. | Data collection | ComSec2019 |
| Operationalizing the Legal Principle of Data Minimization for Personalization. | Data collection | SIGIR2020 |
| Fair Inputs and Fair Outputs: The Incompatibility of Fairness in Privacy and Accuracy. | Data collection | UMAP2020 |
| Embedding Personal Data Minimization Technologies in Organizations: Needs, Vision and Artifacts. | Data collection | ICEGOV2021 |
| A Lightweight Scheme Exploiting Social Networks for Data Minimization According to the GDPR. | Data collection | TCSS2021 |
| How to address data privacy concerns when using social media data in conservation science. | Data collection | ConBio2021 |
| Towards a Formal Approach for Data Minimization in Programs. | Data collection | DPM/CBT2021 |
| Configurable Per-Query Data Minimization for Privacy-Compliant Web APIs. | Data collection | ICWE2022 |
| Learning to Limit Data Collection via Scaling Laws: A Computational Interpretation for the Legal Principle of Data Minimization. | Data collection | FAccT2022 |
| Practical Data Access Minimization in Trigger-Action Platforms. | Data collection | USENIXSS2022 |
| I Prefer not to Say: Operationalizing Fair and User-guided Data Minimization. | Data collection | CoRR2022 |
| Privacy for Free: How does Dataset Condensation Help Privacy. | Data collection | ICML2022 |
| Censoring Representations with an Adversary. | Data inference | ICLR2016 |
| Supporting the Design of Privacy-Aware Business Processes via Privacy Process Patterns. | Data inference | RCIS2017 |
| Detecting Conflicts Between Data-Minimization and Security Requirements in Business Process Models. | Data inference | ECMFA2018 |
| Mobile Sensor Data Anonymization. | Data inference | IoTDI2019 |
| Generating Optimal Privacy-Protection Mechanisms via Machine Learning. | Data inference | CoRR2019 |
| A semi-automated BPMN-based framework for detecting conflicts between security, data-minimization, and fairness requirements. | Data inference | SSM2020 |
| A Siamese Adversarial Anonymizer for Data Minimization in Biometric Applications. | Data inference | EuroS&P2020 |
| A Hybrid Deep Learning Architecture for Privacy-Preserving Mobile Analytics. | Data inference | ITJ2020 |
| Data Minimization for GDPR Compliance in Machine Learning Models. | Data inference | AIEthics2022 |
| Auditing Black-Box Prediction Models for Data Minimization Compliance. | Data audit | NIPS2021 |
| Auditing Algorithms: On Lessons Learned and the Risks of Data Minimization. | Data audit | AIES2021 |
| AI auditing and impact assessment: according to the UK information commissioner’s office. | Data audit | AIEthics2021 |
| Privacy 3.0 := Data Minimization + User Control + Contextual Integrity. | Other | InfTech2011 |
| From Data Minimization to Data Minimummization. | Other | DPIS2013 |
| Mo'Data, Mo'Problems? Personal Data Mining and the Challenge to the Data Minimization Principle. | Other | FPF2013 |
| Regime Change: Enabling Big Data through Europe's New Data Protection Regulation. | Other | SciTech2016 |
| A Process for Data Protection Impact Assessment Under the European General Data Protection Regulation. | Other | APF2016 |
| The European General Data Protection Regulation: challenges and considerations for iPSC researchers and biobanks. | Other | FutMed2017 |
| The Legal Challenges of Big Data: Putting Secondary Rules First in the Field of EU Data Protection. | Other | EDPL2017 |
| Towards a Roadmap for Privacy Technologies and the General Data Protection Regulation: A Transatlantic Initiative. | Other | APF2018 |
| Reviving Purpose Limitation and Data Minimisation in Data-Driven Systems. | Other | TechReg2021 |
| A survey of data minimisation techniques in blockchain-based healthcare. | Survey | ComNet2022 |
| Data Minimisation Potential for Timestamps in Git: An Empirical Analysis of User Configurations. | ? | Sec2022 |
| A Refactoring for Data Minimisation Using Formal Verification. | ? | ISoLA2022 |
| Data Minimisation as Privacy and Trust Instrument in Business Processes. | ? | BPMW2020 |
| Data Minimisation in Communication Protocols: A Formal Analysis Framework and Application to Identity Management. | ? | IJIS2014 |
| TRIPLEX: verifying data minimisation in communication systems. | ? | CCS2013 |
| Monitoring Data Minimisation. | ? | CoRR2018 |
| Personality Is Revealed During Weekends: Towards Data Minimisation for Smartphone Based Personality Classification. | ? | INTERACT2019 |
| Prevention of Personally Identifiable Information Leakage in E-commerce via Offline Data Minimisation and Pseudonymisation. | ? | IJISRT2021 |
| Engineering privacy by design reloaded. | ? | APC2015 |

5. Useful repositories or URLs

About

A repository of related papers.
