GDPR Similarity Comparison

This repo is part of the report "Towards Automatic Comparison of Data Privacy Documents: A Preliminary Experiment on GDPR-like Laws" 🔥

  • We extract information from GDPR-like documents written in natural language by different countries and construct well-structured data.

  • The structured data have four columns: chapter, section, article, and recital. This could benefit future work that explores GDPR-like laws using computational methods. 🚀

  • This project is inspired by the course COSC-824 Data Protection by Design, Department of Computer Science at Georgetown University.

Data

We convert each document from PDF to DOCX to CSV in a well-structured format. Our data currently include GDPR-like documents from:

  • European Union 🇪🇺
  • Brazil 🇧🇷
  • India 🇮🇳
  • What's next? 😉

Load the data into a pandas DataFrame in Python with the following code.

import pandas as pd

file_path = "data/LGPD-ES-Brazil-converted.csv"
df = pd.read_csv(file_path) # columns: ["chapter", "section", "article", "recital"]
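Once a file is loaded, it can help to confirm that the four expected columns are present before doing any analysis. The following is a minimal sketch, not part of the repo's code: the helper `load_law` and the sample rows are hypothetical, and an in-memory CSV stands in for a real file under `data/` so the snippet runs on its own.

```python
import io

import pandas as pd

# Column names as stated in this README.
EXPECTED_COLUMNS = ["chapter", "section", "article", "recital"]

# Hypothetical sample rows for illustration only; the real CSV contents differ.
sample_csv = io.StringIO(
    "chapter,section,article,recital\n"
    "Chapter I,,Article 1,Sample text of the first article.\n"
    "Chapter I,,Article 2,Sample text of the second article.\n"
    "Chapter II,Section 1,Article 3,Sample text of the third article.\n"
)


def load_law(source):
    """Read a converted CSV (path or file-like) and check its columns."""
    df = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


df = load_law(sample_csv)

# Count distinct articles in each chapter.
articles_per_chapter = df.groupby("chapter")["article"].nunique()
print(articles_per_chapter)
```

To use it on the repo's data, pass a path such as "data/LGPD-ES-Brazil-converted.csv" to `load_law` instead of the in-memory sample. Note that empty cells (for example, a chapter without sections) are read by pandas as NaN.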

Materials

Project Members

  • Kornraphop Kawintiranon - Github
  • Yaguang Liu - Github
  • Prof. Benjamin E. Ujcich (Instructor) - Personal

Citation

If you find our paper and resources useful, please consider citing our work! 🙏

@article{kawintiranon2021automatic,
    title={Towards Automatic Comparison of Data Privacy Documents: A Preliminary Experiment on GDPR-like Laws},
    author={Kawintiranon, Kornraphop and Liu, Yaguang},
    journal={arXiv preprint arXiv:2105.10117},
    year={2021},
    url={https://arxiv.org/abs/2105.10117}
}
