Add method to sample remaining columns
katxiao committed Feb 10, 2022
1 parent bc26439 commit d136a09
Showing 1 changed file with 32 additions and 0 deletions.
32 changes: 32 additions & 0 deletions sdv/tabular/base.py
@@ -541,6 +541,38 @@ def sample_conditions(self, conditions, max_tries=100, batch_size_per_try=None,

        return sampled

    def sample_remaining_columns(self, known_columns, max_tries=100, batch_size_per_try=None,
                                 randomize_samples=True):
        """Sample the remaining columns conditioned on the known ones.

        Args:
            known_columns (pandas.DataFrame):
                A pandas.DataFrame with the columns that are already known. The output
                is a DataFrame such that each row in the output is sampled
                conditionally on the corresponding row in the input.
            max_tries (int):
                Number of times to try sampling discarded rows. Defaults to 100.
            batch_size_per_try (int):
                The batch size to use per attempt at sampling. Defaults to 10 times
                the number of rows.
            randomize_samples (bool):
                Whether or not to use a fixed seed when sampling. Defaults
                to True.

        Returns:
            pandas.DataFrame:
                Sampled data.

        Raises:
            ConstraintsNotMetError:
                If the conditions are not valid for the given constraints.
            ValueError:
                If any of the following happens:
                    * any of the conditions' columns are not valid.
                    * no rows could be generated.
        """
        return self._sample_with_conditions(known_columns, max_tries, batch_size_per_try)

    def _get_parameters(self):
        raise NonParametricError()
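
The docstring above describes row-wise conditional sampling with a retry budget. As a rough illustration of that contract only, here is a toy rejection sampler written against the standard library. It is not SDV's implementation: the standalone function, the `draw_row` callable, the `seed` argument, and the two-column "model" are all hypothetical stand-ins, and plain dicts replace the pandas.DataFrame.

```python
import random


def sample_remaining_columns(known_rows, draw_row, max_tries=100,
                             batch_size_per_try=None, seed=None):
    """Toy rejection sampler: fill in the unknown columns of each known row.

    known_rows: list of dicts holding the already-known column values.
    draw_row:   hypothetical callable(rng) -> dict, one unconditional sample.
    """
    rng = random.Random(seed)
    # Mirror the documented default: 10 times the number of requested rows.
    batch_size = batch_size_per_try or 10 * len(known_rows)
    output = []
    for known in known_rows:
        match = None
        for _ in range(max_tries):
            batch = [draw_row(rng) for _ in range(batch_size)]
            # Keep only candidates that agree with every known value.
            valid = [row for row in batch
                     if all(row.get(col) == val for col, val in known.items())]
            if valid:
                match = valid[0]
                break
        if match is None:
            # Same failure mode the docstring lists: no rows could be generated.
            raise ValueError(f'No rows could be generated for {known!r}')
        output.append(match)
    return output


def draw_row(rng):
    # Hypothetical two-column "model": age depends on the sampled gender.
    gender = rng.choice(['F', 'M'])
    age = rng.randint(20, 40) if gender == 'F' else rng.randint(30, 60)
    return {'gender': gender, 'age': age}


known = [{'gender': 'F'}, {'gender': 'M'}, {'gender': 'F'}]
sampled = sample_remaining_columns(known, draw_row, seed=0)
```

Each entry of `sampled` keeps the `gender` of the corresponding input row and gains an `age` drawn under that condition. A condition the model can never satisfy exhausts `max_tries` and raises `ValueError`.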
