feat(data-masking): add custom mask functionalities #5837
Hi @leandrodamascena! Can I have your help here with mypy? Thanks!
Hi @leandrodamascena!
Hey @anafalcao! Another round of review. This is nice work; we just need to fix a few things. 🚀
tests/unit/data_masking/_aws_encryption_sdk/test_unit_data_masking.py
APPROVING it @anafalcao! THANK YOU SO MUCH!
* add custom mask functionalities
* change flags name to more intuitive
* fix type check error
* add draft documentation
* change doc examples
* style: format code with black
* fix format base
* add tests for new masks
* sub header for custom mask in docs
* masking rules to handle complex nest
* add test for masking rules
* modifications based on the feedback
* mypy and tests modification
* create more tests
* Refactoring tests
* Refactoring tests
* Refactoring tests
* Adding docstring + arg parameter
* Adding docstring + arg parameter
* Removing unnecessary code
* Removing unnecessary code
* Removing unnecessary code
---------
Co-authored-by: Leandro Damascena <lcdama@amazon.pt>
Issue number: #5826
Summary
This PR enhances the data masking utility by introducing flexible masking options. The new features allow dynamic, pattern-based, and regex-based masking, giving users greater control over how sensitive data is obscured when using the erase() method.
Changes
New flags for erase():
* dynamic_mask (bool): When set to True, enables dynamic masking, which preserves the original length and structure of the text and replaces its characters with *. Example: with dynamic_mask=True, 'Avenue St' becomes '****** **'.
* custom_mask (str): Specifies a simple pattern for masking data. The pattern is applied directly to the input string and replaces all of the original characters. Example: with a custom_mask of "XX-XX" applied to "12345", the result is "XX-XX".
* regex_pattern (str): Defines a regular expression that identifies the parts of the input string to mask. This allows more complex and flexible masking rules. It is used in conjunction with mask_format.
* mask_format (str): Specifies the format used to replace the parts of the string matched by regex_pattern. It can include placeholders (such as \1, \2) that refer to captured groups in the regex pattern, so that some parts of the original string are preserved. Example: 'example@email.com' could become 'e*****@email.com'.
* masking_rules (dict): Applies different rules (formats) to each data field.
User experience
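The flags listed under Changes can be illustrated with a small standalone sketch. This is not the implementation from this PR: the helper names (sketch_dynamic_mask, sketch_regex_mask, sketch_apply_rules), the example regex, the flat-record handling, and the shape assumed for the masking_rules dict are all hypothetical, chosen only to reproduce the example outputs quoted in the flag descriptions.

```python
import re

DEFAULT_MASK = "*****"  # fixed mask mentioned in the PR description


def sketch_dynamic_mask(value: str) -> str:
    """Hypothetical helper: keep length and spacing, replace other characters with '*'."""
    return "".join(ch if ch.isspace() else "*" for ch in value)


def sketch_regex_mask(value: str, regex_pattern: str, mask_format: str) -> str:
    """Hypothetical helper: replace regex matches, keeping captured groups via \\1, \\2."""
    return re.sub(regex_pattern, mask_format, value)


def sketch_apply_rules(record: dict, rules: dict) -> dict:
    """Hypothetical helper: apply one rule per field of a flat record.

    The rule shape ({"dynamic_mask": True}, {"custom_mask": "..."}, or
    {"regex_pattern": "...", "mask_format": "..."}) is an assumption.
    """
    masked = dict(record)  # leave the caller's record untouched
    for field, rule in rules.items():
        if field not in masked:
            continue
        value = str(masked[field])
        if rule.get("dynamic_mask"):
            masked[field] = sketch_dynamic_mask(value)
        elif "custom_mask" in rule:
            # A custom mask replaces the whole value, whatever its length.
            masked[field] = rule["custom_mask"]
        elif "regex_pattern" in rule and "mask_format" in rule:
            masked[field] = sketch_regex_mask(
                value, rule["regex_pattern"], rule["mask_format"]
            )
        else:
            masked[field] = DEFAULT_MASK
    return masked


record = {
    "address": "Avenue St",
    "zip": "12345",
    "email": "example@email.com",
    "plan": "free",
}
rules = {
    "address": {"dynamic_mask": True},
    "zip": {"custom_mask": "XX-XX"},
    # Keep the first character and the domain, mask the rest of the local part.
    "email": {"regex_pattern": r"^(.)[^@]*(@.*)$", "mask_format": r"\1*****\2"},
}

print(sketch_apply_rules(record, rules))
```

Fields without a rule (here, plan) pass through unchanged, and the original record is not modified.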
Previously, users had limited options for masking sensitive data. The erase() function provided basic masking, typically replacing entire fields or values with a fixed mask (e.g., '*****').
With the new masking options, users have much more control over how their sensitive data is obscured. The enhanced erase() function offers a range of masking techniques to suit various use cases, including a different technique for each field.
Checklist
If your change doesn't seem to apply, please leave them unchecked.
Is this a breaking change?
RFC issue number:
Checklist:
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.