Add ChnPiiGenerator and Enhance Models #191

Merged: 8 commits, Jun 24, 2024

MooooCat
Contributor

Description

This pull request adds a ChnPiiGenerator data processor, registers it among the default processors, and hardens column removal and model fitting. Key changes include:

  1. Enhancement in base.py: Added exception handling for column removal in tabular data processing, so that errors raised during removal are caught.
  2. New Feature in chn_pii.py: Introduced a new ChnPiiGenerator class for handling Chinese personally identifiable information (PII), covering the fit, convert, and reverse-convert steps.
  3. Update in manager.py: Updated the list of default processors to include the new ChnPiiGenerator.
  4. Update in base.py: Added a new boolean attribute fit_data_empty to the MLSynthesizerModel class.
  5. Enhancement in ctgan.py: Improved the handling of discrete columns during model fitting and added checks for empty data frames.
  6. New Test in test_chn_pii_generator.py: Added comprehensive tests for the new ChnPiiGenerator class to ensure its functionality and robustness.
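
The fit / convert / reverse-convert cycle from item 2 can be sketched roughly as follows. This is a minimal illustration, not the code in chn_pii.py: the class name `ToyPiiGenerator`, the plain-dict table type, and the assumption that PII columns are dropped before training and refilled with generated values afterwards are all hypothetical. The placeholder value format (11 digits starting with "1") is also an assumption and is not validated against any real numbering plan.

```python
import random
from typing import Dict, List

# Stand-in for a data frame: column name -> list of cell values (assumption).
Table = Dict[str, List[str]]


class ToyPiiGenerator:
    """Hypothetical illustration only; not the sdgx ChnPiiGenerator."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self.pii_columns: List[str] = []
        self.fitted = False

    def fit(self, pii_columns: List[str]) -> None:
        # Remember which columns hold PII.
        self.pii_columns = list(pii_columns)
        self.fitted = True

    def convert(self, table: Table) -> Table:
        # Drop the PII columns before model training; absent columns are ignored.
        return {
            name: list(values)
            for name, values in table.items()
            if name not in self.pii_columns
        }

    def _fake_value(self) -> str:
        # Placeholder value: 11 digits starting with "1" (assumed format).
        return "1" + "".join(str(self._rng.randint(0, 9)) for _ in range(10))

    def reverse_convert(self, table: Table) -> Table:
        # Refill each PII column with generated values, one per row.
        n_rows = len(next(iter(table.values()), []))
        out = {name: list(values) for name, values in table.items()}
        for name in self.pii_columns:
            out[name] = [self._fake_value() for _ in range(n_rows)]
        return out
```

In this sketch the original PII values never reach the model; only the column names survive the round trip, which is one plausible reason to treat PII in a dedicated processor.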

Motivation and Context

These changes are necessary to enhance the robustness and functionality of the data processing and model fitting components of sdgx.

The introduction of the ChnPiiGenerator is particularly important for handling specific regional PII data.

How has this been tested?

The changes have been thoroughly tested in a local development environment.

The new ChnPiiGenerator has been tested with various scenarios, including edge cases, to ensure it handles data correctly.
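
Edge cases of this kind might include an input table with no rows and a column that is already absent when removal is attempted. The sketch below shows how such cases could be guarded and checked. The helper names `remove_columns` and `is_fit_data_empty` and the plain-dict table type are hypothetical stand-ins, not functions from sdgx.

```python
from typing import Dict, List

# Stand-in for a data frame: column name -> list of cell values (assumption).
Table = Dict[str, List[str]]


def remove_columns(table: Table, columns: List[str]) -> Table:
    """Return a copy of `table` without `columns`, skipping absent ones."""
    out = dict(table)
    for name in columns:
        try:
            del out[name]
        except KeyError:
            # The column is already missing, so there is nothing to remove.
            continue
    return out


def is_fit_data_empty(table: Table) -> bool:
    """True when the table has no columns or every column has no rows."""
    return not table or all(len(values) == 0 for values in table.values())


# Edge-case checks in the style of a small unit test.
assert remove_columns({"a": ["1"], "b": ["2"]}, ["b", "missing"]) == {"a": ["1"]}
assert is_fit_data_empty({})
assert is_fit_data_empty({"a": []})
assert not is_fit_data_empty({"a": ["1"]})
```

A flag computed this way could let a model skip fitting when only PII columns were present and all of them were removed, though whether fit_data_empty is used exactly like that is an assumption.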

Types of changes

  • Maintenance (no change in code, maintain the project's CI, docs, etc.)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@MooooCat MooooCat requested review from sjh120, Z712023 and iokk3732 June 21, 2024 15:55
@MooooCat MooooCat enabled auto-merge (squash) June 21, 2024 15:58
@MooooCat MooooCat disabled auto-merge June 24, 2024 02:24
@MooooCat MooooCat merged commit a5936e7 into main Jun 24, 2024
12 checks passed
@MooooCat MooooCat deleted the feature-chn-pii-generator branch June 24, 2024 02:24