Support constraints for the PAR Synthesizer #570

Open
schweima opened this issue Aug 25, 2021 · 8 comments
Labels
data:sequential (Related to timeseries datasets), feature:constraints (Related to inputting rules or business logic), feature request (Request for a new feature)

Comments

@schweima

Environment:
Python 3.7.10
sdv 0.11

Error description:
There is no documentation detailing whether PAR supports constraints.

Is there a way similar to the HMA1 approach mentioned in #296?

In initial tests following the workaround referenced by @npatki in #566, the Between constraint is either ignored or yields NaN values when sampling.

Additional information:
Any hints appreciated. Thank you! Use case: measurement time series with a lot of columns that have to be >= 0.

@katxiao katxiao added the feature:constraints Related to inputting rules or business logic label Nov 18, 2021
@sokol11

sokol11 commented Dec 28, 2021

Hi. I am also trying to use constraints with PAR. I need to have the UniqueCombinations constraint in my case. Would you mind sharing the code you used to try the HMA1 workaround? It is giving me a KeyError when I naively try:

from sdv.timeseries import PAR
from sdv.constraints import UniqueCombinations
from sdv import Metadata

constraints = [UniqueCombinations(columns=['date', 'col1'])]
metadata = Metadata()
metadata.add_table(name='data')
metadata._metadata['tables']['data']['constraints'] = constraints
model = PAR(metadata, entity_columns=['col1'], sequence_index='date', epochs=128, cuda=True)

I get:

model.fit(data)

...

KeyError: Metadata
  root_path: .
  tables: ['data']
  relationships:

Thank you!

@npatki npatki added the data:sequential Related to timeseries datasets label Jun 10, 2022
@npatki
Contributor

npatki commented Jun 10, 2022

The PARModel does not currently support constraints, so any workaround/code that was used for HMA1 will not work here.

"Use case: measurement time series with a lot of columns that have to be >= 0."

Issue #745 is specifically tracking the feature request for supporting min/max bounds on the PARModel.

However, let's keep this issue open to track the feature request for adding general constraints on sequential data.

@tosador

tosador commented May 4, 2023

Hi, I am also trying to use the PAR model with inequality constraints on sdv==1.0.

Inequality constraints are not supported yet (#1320).

I noticed that if I try to generate financial OHLC bars data, sometimes the constraints:

  • H >= C >= L
  • H >= O >= L

are not satisfied.
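
To measure how often this happens in a sampled dataset, a post-sampling check along these lines might help. This is a minimal sketch in plain Python, not an SDV feature; the field names `open`, `high`, `low`, and `close` are assumptions about how the bars are stored.

```python
def ohlc_violations(bars):
    """Return the indices of bars that break H >= O >= L or H >= C >= L."""
    bad = []
    for i, bar in enumerate(bars):
        # Field names are hypothetical; adapt them to the real columns.
        o, h, l, c = bar["open"], bar["high"], bar["low"], bar["close"]
        if not (h >= o >= l and h >= c >= l):
            bad.append(i)
    return bad


bars = [
    {"open": 10.0, "high": 12.0, "low": 9.5, "close": 11.0},   # valid bar
    {"open": 10.0, "high": 9.8, "low": 9.0, "close": 9.5},     # high below open
    {"open": 10.0, "high": 11.0, "low": 10.5, "close": 10.8},  # low above open
]
print(ohlc_violations(bars))  # [1, 2]
```

Rows flagged this way could be dropped or resampled before backtesting, at the cost of slightly changing the sampled distribution.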

Use case: training/backtesting machine learning models on synthetic financial bars data.

@npatki npatki changed the title Support passing tabular constraints to the PAR model Support passing tabular constraints to the PAR Synthesizer Nov 3, 2023
@npatki npatki added the feature request Request for a new feature label Nov 3, 2023
@npatki npatki changed the title Support passing tabular constraints to the PAR Synthesizer Support constraints for the PAR Synthesizer Nov 3, 2023
@Ng-ms

Ng-ms commented Feb 8, 2024

Hello
In my dataset, certain columns should be generated together. For example, the beginning of the treatment, the end of the treatment, the code of the treatment, and the name of the treatment should be predicted together. Currently, the PARSynthesizer generates them all separately and does not follow this logic from the real data. The only solution I can think of is to have constraints in the synthesizer that treat these columns as a "package" and generate them together.

@Ng-ms

Ng-ms commented Apr 18, 2024

Are there any updates on this request?

@npatki
Contributor

npatki commented Apr 18, 2024

Hi @Ng-ms, thanks for describing your use case. In theory, this type of logic would be well-served by the FixedCombinations constraint.
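
As an illustration of what FixedCombinations is meant to guarantee (synthetic rows only contain column combinations that also occur in the real data), the following plain-Python sketch lists any combinations in synthetic data that were never seen in the real data. It is a standalone check, not SDV code, and the column names `treatment_code` and `treatment_name` are hypothetical.

```python
def unseen_combinations(real_rows, synthetic_rows, columns):
    """Return combinations of `columns` in synthetic_rows that never occur in real_rows."""
    seen = {tuple(row[c] for c in columns) for row in real_rows}
    combos = {tuple(row[c] for c in columns) for row in synthetic_rows}
    return sorted(combos - seen)


# Hypothetical column names and values, for illustration only.
real = [
    {"treatment_code": "T01", "treatment_name": "physio"},
    {"treatment_code": "T02", "treatment_name": "chemo"},
]
synthetic = [
    {"treatment_code": "T01", "treatment_name": "physio"},  # seen in real data
    {"treatment_code": "T01", "treatment_name": "chemo"},   # mixed-up pair
]
print(unseen_combinations(real, synthetic, ["treatment_code", "treatment_name"]))
# [('T01', 'chemo')]
```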

I am curious: are you marking any of the columns you mention (beginning of treatment, end of treatment, code of treatment) as context_columns in PAR?

There seems to be demand for this feature. I think it will be easier to accommodate constraints in PAR if either all of the columns are context columns or all of the columns are NOT context columns. Would that cover most of the cases that folks have encountered?

@Ng-ms

Ng-ms commented Apr 19, 2024

In this use case, these columns are not context columns.

@srinify
Contributor

srinify commented Sep 25, 2024

Hi @schweima @sokol11 @tosador @Ng-ms we've actually added partial support for constraints in PARSynthesizer in SDV 1.14.0: https://github.com/sdv-dev/SDV/releases/tag/v1.14.0

You can use PARSynthesizer with constraints as long as the columns involved with the constraint are either ALL context columns or ALL non-context columns. We've also added a section in our docs for this under the PAR Synthesizer specific section: https://docs.sdv.dev/sdv/sequential-data/modeling/customizations
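
The all-or-nothing rule described above can be checked before fitting. The helper below is a minimal sketch in plain Python, not part of SDV; the function name and the example column names are hypothetical.

```python
def constraint_columns_are_compatible(constraint_columns, context_columns):
    """True if the constraint's columns are ALL context columns or ALL non-context columns."""
    constraint_columns = set(constraint_columns)
    in_context = constraint_columns & set(context_columns)
    # Compatible when no constraint column is a context column, or every one is.
    return not in_context or in_context == constraint_columns


context = ["patient_age", "patient_sex"]  # hypothetical context columns

# All non-context columns: supported.
print(constraint_columns_are_compatible(["treatment_code", "treatment_name"], context))  # True
# All context columns: supported.
print(constraint_columns_are_compatible(["patient_age", "patient_sex"], context))        # True
# Mix of context and non-context columns: not supported.
print(constraint_columns_are_compatible(["patient_age", "treatment_code"], context))     # False
```

For the exact way to pass constraints to the synthesizer, the linked docs page is the authoritative reference.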

We're hoping this helps a few of you -- let us know!

7 participants