GDPR Similarity Comparison

This repository accompanies the report "Towards Automatic Comparison of Data Privacy Documents: A Preliminary Experiment on GDPR-like Laws". 🔥

We extract information from GDPR-like documents, written in natural language and issued by different countries, and turn it into well-structured data.
The structured data have four columns: chapter, section, article, and recital. This could benefit future work that explores GDPR-like laws using computational methods. 🚀
This project is inspired by COSC-824 Data Protection by Design, Department of Computer Science at Georgetown University.

Data

We convert each document from PDF to Docx and then to a well-structured CSV file. Our data currently include GDPR-like documents from:

European Union 🇪🇺
Brazil 🇧🇷
India 🇮🇳
What next? 😉

Load the data into a pandas DataFrame in Python with the following code.

import pandas as pd

file_path = "data/LGPD-ES-Brazil-converted.csv"
df = pd.read_csv(file_path) # columns: ["chapter", "section", "article", "recital"]
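Once a file is loaded, a quick sanity check on the four columns can be useful. The sketch below is only an illustration: `load_law`, `rows_per_chapter`, and the `SAMPLE_CSV` rows are hypothetical names and placeholder values that are not part of this repository, and the real cells likely hold longer text. It reads an in-memory CSV so it runs without the data files; pass a path such as the one above instead to try it on real data.

```python
import io

import pandas as pd

# Hypothetical placeholder rows standing in for a converted CSV file;
# the real files under data/ will have different, longer contents.
SAMPLE_CSV = """chapter,section,article,recital
CHAPTER I,,Article 1,
CHAPTER I,,Article 2,
CHAPTER II,Section 1,Article 3,
"""

EXPECTED_COLUMNS = ["chapter", "section", "article", "recital"]


def load_law(source):
    """Read a converted CSV and return only the four expected columns."""
    # Keep every cell as a string and keep empty cells as "" instead of NaN.
    df = pd.read_csv(source, dtype=str, keep_default_na=False)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df[EXPECTED_COLUMNS]


def rows_per_chapter(df):
    """Count rows with a non-empty article cell, grouped by chapter."""
    has_article = df[df["article"].str.strip() != ""]
    return has_article.groupby("chapter", sort=False).size().to_dict()


df = load_law(io.StringIO(SAMPLE_CSV))
counts = rows_per_chapter(df)
print(counts)
```

Reading every cell as a string avoids pandas guessing numeric types for article numbers, which is a design choice of this sketch rather than something the repository prescribes.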

Materials

Project Members

Kornraphop Kawintiranon - Github
Yaguang Liu - Github
Prof. Benjamin E. Ujcich (Instructor) - Personal

Citation

If you feel our paper and resources are useful and encouraging, please consider citing our work! 🙏

@article{kawintiranon2021automatic,
    title={Towards Automatic Comparison of Data Privacy Documents: A Preliminary Experiment on GDPR-like Laws},
    author={Kawintiranon, Kornraphop and Liu, Yaguang},
    journal={arXiv preprint arXiv:2105.10117},
    year={2021},
    url={https://arxiv.org/abs/2105.10117}
}

References

PDF to Docx: https://smallpdf.com/pdf-to-word

