Allow hashing or dropping PII from source connectors #1758
Comments
Hey @artefactop. Thanks for the ticket. We are actually brainstorming this feature. It's a slippery slope if we want to be a pure EL player, but I see the need for very strong guarantees when it comes to privacy and security. Let us get back to you on it.
@michel-tricot Hey Michel! Do you guys have any update on this?
Not at the moment. @unoexperto are you encountering that need?
@michel-tricot I too am curious about an update on this. I have seen some conversation about handling this at the transform layer (DBT), but that doesn't prevent PII from being migrated to the destination system, nor does it address GDPR concerns about data residency (e.g. certain data can't leave the source system, as you potentially alluded to). I can see a strong argument to put the specification for hashing or nulling out columns on the Do you have a better idea of how the design might work from an architectural standpoint, or even what options you are considering? Is there a documented proposal or RFC to that effect? In the long term I think this would be a fundamental need for us, as the only temporary workaround would be to create views in the source system that hash or null out fields containing sensitive information. We haven't load tested views in this respect, but replicating views could raise operational and load concerns that tables and simple transforms would not.
Issue was linked to Harvestr Discovery: Hashing PII fields
Any updates?
Not currently, @HaithemSouala!
@malikdiarra FYI moved this to the OSS team's backlog
Hey Folks! Any updates regarding this issue?
@Hesperide safe to say it's shipped, no?
As shared in our roadmap, column hashing is now available as of Airbyte v1.1.0!
Tell us about the problem you're trying to solve
I want to export my data from PostgreSQL to Bigtable while keeping some PII out of the destination, specifically by transforming it into a hash or a partial hash.
Describe the solution you’d like
I see that PipelineWise supports this through a YAML configuration: https://transferwise.github.io/pipelinewise/user_guide/transformations.html. I'm currently researching solutions, and Airbyte looks amazing, but I couldn't tell whether it supports this feature.
It would be great if I could configure this transform on the selected source column.
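To make the request concrete, here is a minimal sketch of the kind of per-column transform I have in mind, applied to each record before it leaves the source. Everything in it is an assumption for illustration: the `RULES` mapping, the rule names (`hash`, `hash_skip_first_3`, `drop`) and the `apply_rules` function are hypothetical and are not Airbyte's or PipelineWise's actual configuration or API.

```python
import hashlib

# Hypothetical per-column rules (illustrative names, not a real connector config).
RULES = {
    "email": "hash",               # replace the whole value with a SHA-256 digest
    "phone": "hash_skip_first_3",  # keep the first 3 characters, hash the rest
    "ssn": "drop",                 # null the value out entirely
}


def sha256_hex(value: str) -> str:
    """Return the hex SHA-256 digest of a UTF-8 encoded string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def apply_rules(record: dict, rules: dict) -> dict:
    """Return a copy of `record` with the configured columns hashed or nulled."""
    out = {}
    for column, value in record.items():
        rule = rules.get(column)
        if rule is None or value is None:
            # No rule for this column (or nothing to protect): pass through.
            out[column] = value
        elif rule == "hash":
            out[column] = sha256_hex(str(value))
        elif rule == "hash_skip_first_3":
            text = str(value)
            out[column] = text[:3] + sha256_hex(text[3:])
        elif rule == "drop":
            out[column] = None
        else:
            raise ValueError(f"unknown rule {rule!r} for column {column!r}")
    return out


if __name__ == "__main__":
    row = {"id": 1, "email": "a@example.com", "phone": "5551234", "ssn": "123"}
    print(apply_rules(row, RULES))
```

One caveat with this sketch: a plain unsalted hash of a low-entropy value such as a phone number can be reversed by brute force, so a keyed hash (for example Python's `hmac` module with a secret key) may be a better fit where that matters.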
Describe the alternative you’ve considered or used
My alternative is to choose PipelineWise as my ELT tool because it already supports this.