Releases: sdv-dev/SDV
v0.4.4 - 2020-10-06
This release adds a new tabular model based on combining the CTGAN model with the reversible
transformation applied in the GaussianCopula model that converts random variables with
arbitrary distributions to new random variables with a standard normal distribution.
The reversible transformation is handled by the GaussianCopulaTransformer recently added to RDT.
New Features
- Add CopulaGAN Model - Issue #202 by @csala
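The reversible transformation described above can be illustrated with a minimal standard-library sketch. It is not the RDT GaussianCopulaTransformer implementation: it assumes an empirical CDF as a stand-in for a fitted marginal distribution, and the function names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def fit_empirical_cdf(values):
    """Return the sorted sample, which is all the empirical CDF needs."""
    return sorted(values)


def to_standard_normal(x, sorted_sample):
    """Forward step: u = F(x) from the empirical CDF, then z = Phi^-1(u)."""
    n = len(sorted_sample)
    rank = bisect_right(sorted_sample, x)
    # Use mid-rank positions and clamp so u stays strictly inside (0, 1);
    # otherwise the inverse normal CDF would be undefined at 0 or 1.
    u = (rank - 0.5) / n
    u = min(max(u, 0.5 / n), 1 - 0.5 / n)
    return STANDARD_NORMAL.inv_cdf(u)


def from_standard_normal(z, sorted_sample):
    """Reverse step: u = Phi(z), then x = F^-1(u) as an empirical quantile."""
    n = len(sorted_sample)
    u = STANDARD_NORMAL.cdf(z)
    index = min(int(u * n), n - 1)
    return sorted_sample[index]


# Hypothetical, heavily skewed column used only for illustration.
column = [3.0, 1.0, 10.0, 250.0, 7.5]
sample = fit_empirical_cdf(column)
normalized = [to_standard_normal(x, sample) for x in column]
restored = [from_standard_normal(z, sample) for z in normalized]
```

With distinct training values, the reverse step recovers each original value, which is what makes it possible to train a model in the normalized space and map its samples back.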
v0.4.3 - 2020-09-28
This release moves the models and algorithms related to the generation of synthetic
relational data to a new sdv.relational subpackage (Issue #198).
As part of the change, the old sdv.models have also been removed, and relational modeling
is now based on the recently introduced sdv.tabular models.
v0.4.2 - 2020-09-19
In this release the sdv.evaluation module has been reworked to include 4 different
metrics, all of which return a normalized score between 0 and 1.
Included metrics are:
- cstest
- kstest
- logistic_detection
- svc_detection
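To give a feel for what a score normalized between 0 and 1 can look like, here is a minimal sketch built on the two-sample Kolmogorov-Smirnov statistic. It is an illustration only: the function names are hypothetical, the "higher is better" direction is an assumption, and it does not claim to reproduce how the sdv.evaluation kstest metric is computed.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    max_gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap is
    # reached at one of the observed values.
    for value in real_sorted + synth_sorted:
        cdf_real = bisect_right(real_sorted, value) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, value) / len(synth_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_synth))
    return max_gap


def similarity_score(real, synthetic):
    """Hypothetical normalized score: 1.0 means identical empirical CDFs."""
    return 1.0 - ks_statistic(real, synthetic)
```

Because the KS statistic itself lies in the interval from 0 to 1, subtracting it from 1 keeps the score in the same range without any further rescaling.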
v0.4.1 - 2020-09-07
This release fixes a couple of minor issues and introduces an important rework of the
User Guides section of the documentation.
Issues fixed
- Error Message: "make sure the Graphviz executables are on your systems' PATH" - Issue #182 by @csala
- Anonymization mappings leak - Issue #187 by @csala
v0.4.0 - 2020-08-08
In this release SDV gets new documentation, new tutorials, improvements to the Tabular API
and broader Python and dependency support.
Complete list of changes:
- New Documentation site based on the pydata-sphinx-theme.
- New User Guides and Notebook tutorials.
- New Developer Guides section within the docs with details about the SDV architecture,
  the ecosystem libraries and how to extend and contribute to the project.
- Improved API for the Tabular models with focus on ease of use.
- Support for Python 3.8 and the newest versions of pandas, scipy and scikit-learn.
- New Slack Workspace for development discussions and community support.
v0.3.6 - 2020-07-23
This release introduces a new concept of Constraints, which allow the user to define
special relationships between columns that will not be handled via modeling.
This is done via a new sdv.constraints subpackage which defines some well-known pre-defined
constraints, as well as a generic framework that allows the user to customize the constraints
to their needs as much as necessary.
New Features
- Support for Constraints - Issue #169 by @csala
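The idea of a relationship between columns that is enforced outside the model can be sketched as follows. This is a hypothetical, standard-library illustration and not the sdv.constraints API: the class name, its methods and the dict-based rows are all assumptions made for the example.

```python
class GreaterThanSketch:
    """Hypothetical constraint: column `high` must be >= column `low`."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def is_valid(self, row):
        """Check whether a row satisfies the relationship."""
        return row[self.high] >= row[self.low]

    def transform(self, row):
        """Before modeling, replace `high` by its gap over `low`."""
        out = dict(row)
        out[self.high] = row[self.high] - row[self.low]
        return out

    def reverse_transform(self, row):
        """After sampling, rebuild `high`, clipping negative gaps to zero."""
        out = dict(row)
        out[self.high] = row[self.low] + max(row[self.high], 0)
        return out


# Hypothetical column names used only for illustration.
constraint = GreaterThanSketch(low="start", high="end")
original = {"start": 3, "end": 8}
modeled = constraint.transform(original)
restored = constraint.reverse_transform(modeled)
repaired = constraint.reverse_transform({"start": 3, "end": -2})
```

In this design the model only ever sees the gap, so any sampled row can be mapped back to one that satisfies the relationship, rather than relying on the model to learn it.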
v0.3.5 - 2020-07-09
This release introduces a new subpackage sdv.tabular with models designed specifically
for single table modeling, while still providing all the usual conveniences from SDV, such
as:
- Seamless multi-type support
- Missing data handling
- PII anonymization
Currently implemented models are:
- GaussianCopula: Multivariate distributions modeled using copula functions. This is a stronger
  version, with more marginal distributions and options, than the one used to model multi-table
  datasets.
- CTGAN: GAN-based data synthesizer that can generate synthetic tabular data with high fidelity.
v0.3.4 - 2020-07-04
New Features
- Support for Multiple Parents - Issue #162 by @csala
- Sample by default the same number of rows as in the original table - Issue #163 by @csala
General Improvements
- Add benchmark - Issue #165 by @csala
v0.3.3 - 2020-06-26
General Improvements
- Use SDMetrics for evaluation - Issue #159 by @csala
v0.3.2 - 2020-02-03
General Improvements
- Improve metadata visualization - Issue #151 by @csala and @JDTheRipperPC