Enable anonymization tool and integrate with existing data ingestion #189

Open
joshgamache opened this issue Jan 30, 2023 · 0 comments
joshgamache commented Jan 30, 2023

Description

Adds Google Data Loss Prevention (DLP) to the DAGs used to ingest raw data sources.

User Story

As an analyst

I want my data to go through an analysis/anonymization process

So that I can confirm that working data does not contain Personally Identifiable Information (PII)

Acceptance Criteria

| Given | When | Then | Pass/Fail (TBD by reviewer) |
| --- | --- | --- | --- |
| I am in the Imported Datasets page | I see datasets | I should see which datasets are anonymized | |
| I am in the Imported Datasets page | I click into a dataset | I should see an analysis of the PII in the dataset | |

Research + Rationale Summary

See issue #136 for research into DLP.

Developer notes

  • Use the DAG pipelines already built and add a component to analyze/report and anonymize incoming raw data (within the Data Clean Room).
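
A minimal sketch of what the analyze/report and anonymize step could look like as a plain Python callable that an existing DAG task might wrap. This is not the Google DLP API: the regex detectors below are a local stand-in so the shape of the step is visible, and the names `PII_PATTERNS` and `analyze_and_anonymize` are hypothetical. In the real pipeline the detection and masking would presumably be delegated to DLP, as researched in #136.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical stand-in detectors. A real pipeline would rely on Google DLP
# info types instead of these simplified regular expressions.
PII_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def analyze_and_anonymize(
    rows: List[Dict[str, object]],
) -> Tuple[List[Dict[str, str]], Dict[str, int]]:
    """Mask PII-like values and count findings per detector.

    Returns (anonymized_rows, findings). Every value is converted to a
    string before scanning, so the output rows contain strings only.
    """
    findings = {name: 0 for name in PII_PATTERNS}
    cleaned = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            text = str(value)
            for name, pattern in PII_PATTERNS.items():
                # Replace each match with a placeholder and tally the hits.
                text, count = pattern.subn(f"[{name}]", text)
                findings[name] += count
            new_row[column] = text
        cleaned.append(new_row)
    return cleaned, findings
```

The `findings` dictionary is the kind of per-dataset summary the Imported Datasets page could show as the PII analysis, and a dataset could be flagged as anonymized once this step has run on it.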
@joshgamache joshgamache self-assigned this Jan 30, 2023
@joshgamache joshgamache converted this from a draft issue Jan 30, 2023
@joshgamache joshgamache moved this from 🗒️To Do to 🏗 In progress in ClimateTrax - Product Backlog Feb 1, 2023
@joshgamache joshgamache moved this from 🏗 In progress to Ready for PO Review in ClimateTrax - Product Backlog Feb 6, 2023
Status: Ready for PO Review