Problem Description
Currently, we are applying the SDMetrics NewRowSynthesis by default in the benchmark_single_table script. The motivation was to capture whether new synthetic data is being created at all -- or whether the rows are being re-used as in DataIdentity.
But in practice, the NewRowSynthesis metric may not be very robust. It may error out on datasets with a large number of columns, and it generally leads to longer benchmarking runs.
Expected behavior
We should consider the behavior of the default NewRowSynthesis metric that we apply:
We could disable it. That is, set sdmetrics=None by default.
We could fix the underlying issues with it in the SDMetrics library. Perhaps that can be achieved by subsetting or some other means.
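As a rough illustration of the subsetting idea, here is a minimal sketch in plain Python. It is not the SDMetrics implementation: the function name `new_row_fraction` and its parameters are hypothetical, and it only checks exact row matches on a random sample of the synthetic rows, so the cost of the check is bounded by `sample_size` rather than by the full size of the synthetic table.

```python
import random


def new_row_fraction(real_rows, synthetic_rows, sample_size=None, seed=0):
    """Fraction of (sampled) synthetic rows that do not appear in the real data.

    Hypothetical helper for illustration only; rows are sequences of
    hashable values and are compared by exact equality.
    """
    if not synthetic_rows:
        raise ValueError("synthetic_rows must not be empty")

    # Subset the synthetic rows so the work is bounded by sample_size.
    if sample_size is not None and sample_size < len(synthetic_rows):
        rng = random.Random(seed)
        synthetic_rows = rng.sample(synthetic_rows, sample_size)

    # A set of tuples gives constant-time membership checks per row.
    real_set = {tuple(row) for row in real_rows}
    new_count = sum(1 for row in synthetic_rows if tuple(row) not in real_set)
    return new_count / len(synthetic_rows)


real = [(1, "a"), (2, "b"), (3, "c")]
synthetic = [(1, "a"), (4, "d"), (5, "e"), (2, "b")]

# Two of the four synthetic rows are copies of real rows.
score = new_row_fraction(real, synthetic)
print(score)  # 0.5
```

A score near 0 would suggest the synthesizer is mostly re-using real rows, while a score near 1 would suggest the rows are new. Sampling trades some precision for a shorter run, which is the trade-off this issue is about.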
Version: 0.8.0 (in development)