Fixed combinations Constraint #2253
Comments
Hi @Pavan-Kalyan1432 can you clarify what you mean by "repeating the combinations but it is not considering all the combinations"?
When generating synthetic data, this constraint ensures that the synthesizer only uses combinations of values in these 4 columns that already exist in your real data. For example, if the only combination in your real data is "Jack", "John", "Jay", and "Jack John Jay" for your 4 columns, then that is the only combination that will show up in the synthetic data.
For example, it is repeating the same combination multiple times, and it is also not considering all the combinations that are in the real data.
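As a rough sketch of what such a constraint definition can look like: the dictionary below follows the SDV 1.x constraint format as I understand it, and the four column names are hypothetical placeholders, not taken from the reporter's data. Newer SDV releases may expose constraints through a different API, so treat this as an illustration and check the documentation for your installed version.

```python
# Column names below are hypothetical; substitute the four name columns
# from your own table.
fixed_names = {
    'constraint_class': 'FixedCombinations',
    'constraint_parameters': {
        'column_names': ['first_name', 'middle_name', 'last_name', 'full_name'],
    },
}

# Assumption: with an SDV 1.x single-table synthesizer, the constraint is
# registered before fitting, along the lines of:
#   synthesizer.add_constraints(constraints=[fixed_names])
#   synthesizer.fit(real_data)
```

With a definition like this, the four columns are treated as a unit, so a sampled row can only carry a first/middle/last/full name combination that was seen together in the real data.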
Hi @Pavan-Kalyan1432, if I may jump in here: The purpose of the FixedCombinations constraint is only to fix the combinations that are created. Adding this constraint will prevent new combinations from being synthesized in the columns you specify. If you sample many more rows, then I think, due to random chance, you will eventually end up creating all the combinations that were in the original data. However, preventing repetition is not the purpose of this constraint. May I ask why you want to prevent the repetition in your data? This indicates to me that in your synthetic data, you just want the exact same names to appear in the exact same rows as your real data. Is that correct? If you could provide more information on your usage (what you are trying to accomplish with synthetic data), we can better guide you to a solution. Thanks.
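To illustrate the point about repetition and coverage, here is a minimal, standard-library-only sketch. It does not use SDV: `sample_rows` is a hypothetical toy stand-in that draws uniformly from a made-up list of real combinations, whereas a real synthesizer learns its own frequencies. The `coverage_report` helper is also hypothetical; the same comparison could be run on actual synthetic output.

```python
import random
from collections import Counter

# Hypothetical "real" combinations: (first, middle, last, full name).
real_combinations = [
    ("Jack", "John", "Jay", "Jack John Jay"),
    ("Ann", "Bea", "Cole", "Ann Bea Cole"),
    ("Raj", "Dev", "Patel", "Raj Dev Patel"),
]


def sample_rows(num_rows, seed=0):
    """Toy sampler: only ever returns combinations that exist in the real data."""
    rng = random.Random(seed)
    return [rng.choice(real_combinations) for _ in range(num_rows)]


def coverage_report(synthetic_rows):
    """Count each combination and list what is missing or newly invented."""
    counts = Counter(synthetic_rows)
    missing = set(real_combinations) - set(counts)   # real combos never sampled
    invented = set(counts) - set(real_combinations)  # combos absent from real data
    return counts, missing, invented


# A small sample cannot cover every combination; a large one very likely does.
small_counts, small_missing, small_invented = coverage_report(sample_rows(2))
counts, missing, invented = coverage_report(sample_rows(1000))
print("missing after 2 rows:", len(small_missing))
print("missing after 1000 rows:", len(missing))
print("invented combinations:", len(invented))
```

In this toy setup, repeats are expected (1000 rows drawn from 3 combinations), no new combinations appear, and missing combinations show up only when the sample is small relative to the number of real combinations.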
Hi @Pavan-Kalyan1432 we hope our responses cleared things up! Since we haven't heard from you in a while, I'm going to move forward with closing this issue out. Please don't hesitate to open a new issue or ask in our Slack for new questions!
How to manage inter column dependency...
Hi @Pavan-Kalyan1432, the original issue you filed was for FixedCombinations for first name and last name. Are you still having problems with this? Your most recent question is for a different topic so I have filed a new issue here: #2318. We can continue discussion about your inter-column dependency (birth, date of death, and age) in the new issue.
Environment details
If you are already running SDV, please indicate the following details about the environment in which you are running it:
Problem description
What I already tried
Here, FixedCombinations is repeating the combinations, but it is not considering all the combinations... What can I do to make it consider all the combinations of first name, middle name, last name and full name in the real data?