
Add evaluation methods to synthesizer #1190

Closed
amontanez24 opened this issue Jan 25, 2023 · 2 comments
Labels
feature request Request for a new feature

amontanez24 (Contributor) commented Jan 25, 2023

Problem Description

As a user, it would be useful to evaluate the synthetic data generated against the original data.

Acceptance criteria

  • Add an evaluation module and two submodules within it: single_table and multi_table

  • Add the following methods to the evaluation.single_table module

    • run_diagnostic(real_data, synthetic_data, metadata, verbose)
    • evaluate_quality(real_data, synthetic_data, metadata, verbose)
    • get_column_plot(real_data, synthetic_data, metadata, column_name)
    • get_column_pair_plot(real_data, synthetic_data, metadata, column_names)

  • Add the following methods to the evaluation.multi_table module

    • run_diagnostic(real_data, synthetic_data, metadata, verbose) - Wrapper around the initialization and evaluation of the corresponding SDMetrics report class.
    • evaluate_quality(real_data, synthetic_data, metadata, verbose) - Wrapper around the initialization and evaluation of the corresponding SDMetrics report class.
    • get_column_plot(real_data, synthetic_data, metadata, table_name, column_name) - Wraps the same method as the single table case but requires the table name.
    • get_column_pair_plot(real_data, synthetic_data, metadata, table_name, column_names) - Wraps the same method as the single table case but requires the table name.
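The multi-table plotting methods are meant to select the named table and then delegate to their single-table counterparts. A minimal self-contained sketch of that delegation pattern (every name here is an illustrative stand-in, not the actual SDV implementation):

```python
# Illustrative sketch only: stand-ins for the real SDV evaluation functions.

def single_table_column_plot(real_df, synthetic_df, table_metadata, column_name):
    """Stand-in for evaluation.single_table.get_column_plot."""
    # The real function would build a plotly figure; here we return a string.
    return f'plot of {column_name} ({len(real_df)} real vs {len(synthetic_df)} synthetic rows)'

def get_column_plot(real_data, synthetic_data, metadata, table_name, column_name):
    """Multi-table wrapper: pick out one table, then reuse the single-table method."""
    return single_table_column_plot(
        real_data[table_name],       # real_data: dict of table_name -> table data
        synthetic_data[table_name],  # synthetic_data: same structure
        metadata[table_name],        # per-table metadata lookup (assumed shape)
        column_name,
    )
```

Usage, with plain lists standing in for DataFrames:

```python
fig = get_column_plot(
    real_data={'users': [1, 2, 3]},
    synthetic_data={'users': [4, 5]},
    metadata={'users': None},
    table_name='users',
    column_name='age',
)
# returns 'plot of age (3 real vs 2 synthetic rows)'
```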

Expected behavior

# Single table cases

quality_report = evaluate_quality(
  real_data=real_data, # DataFrame
  synthetic_data=synthetic_data, # DataFrame
  metadata=my_metadata, # SingleTableMetadata
  verbose=True
)
diagnostic_report = run_diagnostic(
  real_data=real_data, # DataFrame
  synthetic_data=synthetic_data, # DataFrame
  metadata=my_metadata, # SingleTableMetadata
  verbose=True
)
fig = get_column_plot(
  real_data=real_data, # DataFrame
  synthetic_data=synthetic_data, # DataFrame
  metadata=my_metadata, # SingleTableMetadata
  column_name='age'
)
fig = get_column_pair_plot(
  real_data=real_data, # DataFrame
  synthetic_data=synthetic_data, # DataFrame
  metadata=my_metadata, # SingleTableMetadata
  column_names=['age', 'weight']
)

# Multi-table cases
quality_report = evaluate_quality(
  real_data=real_data, # dictionary
  synthetic_data=synthetic_data, # dictionary
  metadata=my_metadata, # MultiTableMetadata
  verbose=True
)
diagnostic_report = run_diagnostic(
  real_data=real_data, # dictionary
  synthetic_data=synthetic_data, # dictionary
  metadata=my_metadata, # MultiTableMetadata
  verbose=True
)
# Plot the 1D marginal distribution
fig = get_column_plot(
  real_data=real_data, # dictionary
  synthetic_data=synthetic_data, # dictionary
  metadata=my_metadata, # MultiTableMetadata
  table_name='users',
  column_name='age'
)
# Plot the 2D bivariate distribution
fig = get_column_pair_plot(
  real_data=real_data, # dictionary
  synthetic_data=synthetic_data, # dictionary
  metadata=my_metadata, # MultiTableMetadata
  table_name='users',
  column_names=['age', 'weight']
)
npatki (Contributor) commented Feb 9, 2023

@amontanez24 @fealho I'm testing this out. It seems like evaluate_quality is actually returning the score.

The spec is to return the actual QualityReport object from SDMetrics. Shall we re-open this?
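For illustration, the fix npatki is requesting is for the wrapper to return the report object itself rather than its aggregate score. A self-contained sketch of the intended behavior, where QualityReport is a local stand-in for the SDMetrics class of the same name:

```python
class QualityReport:
    """Stand-in for sdmetrics.reports.single_table.QualityReport."""

    def generate(self, real_data, synthetic_data, metadata, verbose=True):
        # The real class computes column-shape and column-pair-trend scores;
        # a fixed placeholder keeps this sketch self-contained.
        self._score = 0.9

    def get_score(self):
        return self._score

def evaluate_quality(real_data, synthetic_data, metadata, verbose=True):
    report = QualityReport()
    report.generate(real_data, synthetic_data, metadata, verbose)
    return report  # per the spec: return the report object itself
    # (the initial implementation effectively returned report.get_score() instead)
```

Returning the report keeps the score accessible via report.get_score() while also exposing the report's richer breakdowns, which a bare float cannot provide.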

@npatki npatki reopened this Feb 9, 2023
fealho (Member) commented Feb 9, 2023

@npatki Oh, I thought it was supposed to be the score. Yes, this should be reopened and patched.
