Improve sample() method #690

Closed
npatki opened this issue Jan 26, 2022 · 0 comments
Labels: data:single-table (Related to tabular datasets), feature request (Request for a new feature)

npatki commented Jan 26, 2022

Problem Description

Let's make sampling more user-friendly. We can create separate methods for different user needs.

The sample() method should only be used to sample entire rows (no conditions).

Expected behavior

Parameters:

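  • num_rows is a required param: the number of rows to sample
  • randomize_samples determines whether sampling is randomized. If True, each call creates different rows; if False, a fixed seed is used so each call creates the same rows (default: True)
  • Remove all other existing params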

# works with any tabular model
from sdv.tabular import CTGAN

# `data` is a pandas DataFrame containing the real table to learn from
model = CTGAN()
model.fit(data)

# creates 1000 rows. Different rows created every time
model.sample(num_rows=1000)

# Fix a seed so that this call always creates rows in the same order
model.sample(num_rows=1000, randomize_samples=False)
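
To illustrate the intended semantics of randomize_samples, here is a minimal standalone sketch. It does not use SDV and is not how the library implements sampling; `draw_values` and `FIXED_SEED` are hypothetical names, and plain random floats stand in for synthetic rows. It only shows the seeding behavior described above: a fixed seed when randomize_samples=False, a fresh one otherwise.

```python
import random

# Hypothetical constant: the issue does not say which seed value would be used.
FIXED_SEED = 0


def draw_values(num_rows, randomize_samples=True):
    """Return num_rows pseudo-random floats standing in for sampled rows."""
    # Seeding with None makes Python seed the generator from system entropy,
    # so every call produces different values.
    seed = None if randomize_samples else FIXED_SEED
    rng = random.Random(seed)
    return [rng.random() for _ in range(num_rows)]


# With randomize_samples=False, two calls return identical values in the same order.
first = draw_values(3, randomize_samples=False)
second = draw_values(3, randomize_samples=False)
print(first == second)
```

Using a local random.Random instance rather than seeding the global generator keeps the fixed seed from affecting any other code that draws random numbers.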

Error Handling

Guidance if the user does not specify num_rows.

>>> model.sample()
Error: You must specify the number of rows to sample (e.g. num_rows=100).

Guidance if the user passes in the conditions argument.

>>> model.sample(num_rows=100, conditions={'age': 45})
Error: This method does not support the conditions parameter. Please create Condition objects
and pass them into the sample_conditions method. See User Guide or API for more details.
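
The two checks above could be sketched as a small validation helper. This is only an illustration under assumptions: `check_sample_args` is a hypothetical function, not part of SDV, and the choice of ValueError and TypeError is a guess, since the issue does not say which exception types should be raised. The messages are adapted from the wording proposed above.

```python
def check_sample_args(num_rows=None, **unsupported):
    """Validate arguments for a conditions-free sample() call (hypothetical helper)."""
    # Point users who pass the old `conditions` argument to the dedicated method.
    if "conditions" in unsupported:
        raise TypeError(
            "This method does not support the conditions parameter. "
            "Please create Condition objects and pass them into the "
            "sample_conditions method. See User Guide or API for more details."
        )
    # Any other leftover keyword is rejected as well, since all old params are removed.
    if unsupported:
        names = ", ".join(sorted(unsupported))
        raise TypeError(f"Unexpected parameter(s): {names}")
    # num_rows is required, so a missing value gets an explicit hint.
    if num_rows is None:
        raise ValueError(
            "You must specify the number of rows to sample (e.g. num_rows=100)."
        )
    return num_rows


# A valid call passes through unchanged; invalid calls raise with guidance.
print(check_sample_args(num_rows=100))
try:
    check_sample_args()
except ValueError as error:
    print(f"Error: {error}")
```

Defaulting num_rows to None, instead of leaving it as a required positional argument, is what lets the method raise the friendlier message rather than Python's generic missing-argument TypeError.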