Paper review: "pMSE Mechanism: Differentially Private Synthetic Data with Maximal Distributional Similarity" #33

MaxGhenis · 2018-12-29T16:44:20Z

In our recent call with Benedetto and Stinson from Census, they recommended reading Snoke and Slavković (2018): "pMSE Mechanism: Differentially Private Synthetic Data with Maximal Distributional Similarity." This issue is to review relevant pieces of the paper regarding synthesis and evaluation.

Synthesis

The paper notes that a synthesis is differentially private if it's built from differentially private parameters (in this case regression coefficients, if I understand correctly), and proposes an adaptation of other methods that sample from the distribution, relaxing a boundedness assumption. It cites Bowen and Liu (2018), "Comparative Study of Differentially Private Data Synthesis Methods", which I think would help me follow their approach better.

Their synthesis approach appears to be limited to parametric models; in case that's true and Bowen and Liu are also limited to parametric models, these other papers could be useful for our current nonparametric approaches:

Evaluation

To evaluate the quality of the synthesis, they propose stacking the synthesis and training sets, building a model to predict whether a record is synthesized, and summarizing the predicted probabilities as distances from 0.5.

The idea of distinguishing synthesized data from real data is interesting, and they use a CART model to do so.
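To make the summary step concrete, here is a minimal sketch of how I read it: the mean squared distance of each record's predicted probability of being synthetic from the share of synthetic records in the stacked data. The function name `pmse` and the `synthetic_share` parameter are my own labels, not the paper's, and the 0.5 default assumes the synthesis and training sets have the same number of rows.

```python
def pmse(propensities, synthetic_share=0.5):
    """Mean squared distance of predicted propensities from the synthetic share.

    `propensities` are the classifier's predicted probabilities that each
    stacked record is synthetic. `synthetic_share` is a hypothetical parameter
    name; 0.5 assumes equally sized real and synthetic sets.
    """
    if not propensities:
        raise ValueError("need at least one propensity score")
    squared = [(p - synthetic_share) ** 2 for p in propensities]
    return sum(squared) / len(squared)


# The classifier cannot tell the sets apart: every score sits at 0.5.
indistinguishable = pmse([0.5, 0.5, 0.5, 0.5])

# The classifier separates the sets perfectly: scores are all 0 or 1.
separable = pmse([0.0, 0.0, 1.0, 1.0])
```

Under these assumptions the value runs from 0 (indistinguishable) to 0.25 (perfectly separable), so lower is better.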

I'm not sure how necessary the novel metric is, compared to established classification metrics like log-loss. Because the classifier is scored on the same records it was fit to, this in-sample approach could also overfit. If we wanted to apply this, I'd want to consider log-loss on a holdout set.
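A rough sketch of the holdout alternative, not anything from the paper: split the stacked rows into fit and holdout indices, fit whatever classifier we like on the first group, and score log-loss on the second. The helper names `holdout_split` and `log_loss`, the 25% holdout share, and the seed are all my own assumptions; in practice a library such as scikit-learn would supply both pieces.

```python
import math
import random


def log_loss(labels, probs, eps=1e-15):
    """Average binary log-loss; labels are 1 for synthetic, 0 for real."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def holdout_split(n_rows, holdout_share=0.25, seed=0):
    """Shuffle row indices and return (fit_indices, holdout_indices)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_holdout = int(round(n_rows * holdout_share))
    return indices[n_holdout:], indices[:n_holdout]


# Reference point: a classifier that always predicts 0.5 scores ln(2).
baseline = log_loss([0, 1], [0.5, 0.5])
```

With equally sized sets, a holdout log-loss close to ln(2) would mean the classifier does no better than guessing, which is the outcome we'd want from a good synthesis.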
