Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add create_custom_constraint method #848

Merged
merged 42 commits into from
Jul 1, 2022
Merged

Conversation

fealho
Copy link
Member

@fealho fealho commented Jun 17, 2022

Resolve #836.

@fealho fealho force-pushed the issue-836-custom-constraint branch from 1d6cef1 to 05cd96e Compare June 17, 2022 16:45
Copy link
Contributor

@amontanez24 amontanez24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also add an integration test with a CustomConstraint?


return table_data
TYPE = 'CUSTOM_CONSTRAINT' # this lets us know how to handle it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we either make this a boolean flag like is_custom or create an enum class for this? The reason would be that whenever someone is doing logic around this attribute, they would have to know the string value for custom constraints and any other type. It would be easier if they could see all the possible values. Here is an example of enums: https://docs.python.org/3/library/enum.html

Personally though, I think this can just be a boolean

@codecov-commenter
Copy link

codecov-commenter commented Jun 18, 2022

Codecov Report

Merging #848 (8469200) into master (c2c2c0d) will increase coverage by 2.60%.
The diff coverage is 100.00%.

❗ Current head 8469200 differs from pull request most recent head 6ed7a81. Consider uploading reports for the commit 6ed7a81 to get more accurate results

@@            Coverage Diff             @@
##           master     #848      +/-   ##
==========================================
+ Coverage   68.04%   70.65%   +2.60%     
==========================================
  Files          38       38              
  Lines        2857     3196     +339     
==========================================
+ Hits         1944     2258     +314     
- Misses        913      938      +25     
Impacted Files Coverage Δ
sdv/constraints/__init__.py 100.00% <ø> (ø)
sdv/constraints/errors.py 100.00% <100.00%> (ø)
sdv/constraints/tabular.py 99.62% <100.00%> (-0.11%) ⬇️
sdv/metadata/table.py 80.53% <100.00%> (+3.74%) ⬆️
sdv/constraints/base.py 96.89% <0.00%> (-0.63%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9885db2...6ed7a81. Read the comment docs.

@fealho fealho force-pushed the issue-836-custom-constraint branch from 65741a0 to 86f01bc Compare June 22, 2022 16:30
@fealho fealho force-pushed the issue-836-custom-constraint branch from cd28d99 to fdff66a Compare June 23, 2022 21:11
@CLAassistant
Copy link

CLAassistant commented Jun 23, 2022

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ amontanez24
✅ fealho
❌ Andrew Montanez


Andrew Montanez seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@fealho fealho changed the base branch from master to issue-833-remove-handling-strategy June 23, 2022 21:11
@fealho fealho requested a review from amontanez24 June 23, 2022 21:15
@fealho fealho force-pushed the issue-836-custom-constraint branch from fdff66a to 1cdda5c Compare June 28, 2022 18:55
Base automatically changed from issue-833-remove-handling-strategy to master June 28, 2022 19:02
@@ -245,30 +245,6 @@ def filter_valid(self, table_data):

return table_data[valid]

@classmethod
def from_dict(cls, constraint_dict):
Copy link
Member Author

@fealho fealho Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used anywhere else right now (and it causes problems for the changes introduced by this PR), which is why it's been deleted.

sdv/constraints/errors.py Outdated Show resolved Hide resolved

return np.logical_and.reduce(valid)
def _transform(self, data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we were going to override the transform method and do something to catch errors so that it is possible to fall back on reject sampling. Currently, the logic only falls back if the constraint raises a MissingConstraintColumnError, but that can't happen with a custom constraint. We either need to catch most/all errors and surface it as a MissingConstraintColumnError, or provide some special logic to handle that

fealho and others added 11 commits June 28, 2022 19:50
commit 04d1549
Author: Felipe Alex Hofmann <fealho@gmail.com>
Date:   Tue Jun 28 19:49:25 2022 -0700

    Move `data.copy` to base class of constraints (#849)

    * changing logic for handling constraints

    * fixing a lot of tests

    * removing handling_strategy attribute

    * removing handling_strategy from docs and code

    * fixing tests, docs and tutorials

    * adding unit tests

    * only calling reverse_transform if custom constraint

    * ignoring sre_parse during isort

    * pr comments

    * raising all errors except missing column erros

    * adding integration test and tracking if reverse transform should use reject sampling

    * refactoring how constraint transforms happen

    * adding unit tests

    * Rebase + move copy to base

    * Fix test cases

    * fix rebase

    * Test resulting data is a copy for transform/reverse_transform

    Co-authored-by: Andrew Montanez <amontanez2424@gmail.com>
    Co-authored-by: Andrew Montanez <andrewmontanez@Andrews-MBP.hsd1.ma.comcast.net>
    Co-authored-by: Andrew Montanez <andrew@datacebo.com>
    Co-authored-by: Andrew Montanez <andrew@sdv.dev>
@fealho fealho marked this pull request as ready for review June 30, 2022 17:23
@fealho fealho requested a review from a team as a code owner June 30, 2022 17:23
Copy link
Contributor

@amontanez24 amontanez24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! 2 comments

  1. We may want to rebase off of this branch after it gets merged to make sure nothing weird happens with the loading of the constraints: Update handling constraints doc #856
  2. We should add a custom constraint to the integration/test_constraints.py file

@fealho fealho requested a review from amontanez24 June 30, 2022 18:25
Copy link
Contributor

@amontanez24 amontanez24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small comment but besides that :shipit:

tests/integration/test_constraints.py Outdated Show resolved Hide resolved
Copy link
Member

@pvk-developer pvk-developer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@fealho fealho merged commit 0f88065 into master Jul 1, 2022
@fealho fealho deleted the issue-836-custom-constraint branch July 1, 2022 06:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add create_custom_constraint factory method
5 participants