Releases: gretelai/gretel-synthetics
Validation loss splitting
Aw/core 107 validate (#93): add a `validation_split` parameter so a portion of the training data can be held out for computing validation loss.
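A minimal sketch of how this might be used, assuming `validation_split` is set on the config object; the notes do not spell out whether the parameter takes a flag or a fraction, so the value below is an assumption, and the paths are illustrative.

```python
from gretel_synthetics.config import TensorFlowConfig

# Minimal sketch: hold out part of the training data to compute
# validation loss. Whether validation_split takes a boolean flag or
# a fraction is an assumption here; paths are illustrative.
config = TensorFlowConfig(
    input_data_path="training_data.csv",
    checkpoint_dir="./checkpoints",
    validation_split=True,  # assumed: enables a validation holdout
)
```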
Auto-select Tokenizer
Automatically select character-based tokenization over SentencePiece if `vocab_size` is set to zero.
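As a minimal sketch of the behavior described above (the config class is from the library; the paths are illustrative):

```python
from gretel_synthetics.config import LocalConfig

# Setting vocab_size to zero auto-selects character-based
# tokenization instead of training a SentencePiece vocabulary.
config = LocalConfig(
    input_data_path="training_data.csv",  # illustrative path
    checkpoint_dir="./checkpoints",
    vocab_size=0,  # 0 -> char-by-char tokenizer is selected
)
```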
Misc updates
- Added a new Record Generator object to DataFrame mode that generates entire records with custom validation
- Added a custom `RuntimeError` that is raised when not enough training data is ingested
- Added the ability for custom callbacks to capture epoch training details
Batch DF Updates
Data-generation routines now return summary objects that expose more detail about the generated data.
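A minimal sketch of inspecting these summaries, assuming batch mode returns a mapping of batch index to summary object; the exact attributes on each summary are not listed in these notes, so nothing beyond printing is shown.

```python
from gretel_synthetics.batch import DataFrameBatch

# Sketch: load previously trained batch models and inspect the
# summary objects returned by the generation routine. The shape of
# the return value (batch index -> summary) is an assumption.
batcher = DataFrameBatch(mode="read", checkpoint_dir="./checkpoints")
summaries = batcher.generate_all_batch_lines()
for batch_idx, summary in summaries.items():
    print(batch_idx, summary)  # per-batch generation details
```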
Smart seeding bugfix
Bugfix to ensure model weights are reset when a list of seed values is provided to the generator
Seeding and DP updates
⚙️ Smart seeding now supports a list of seeds, yielding a 1:1 mapping of seeds to generated lines. This is useful for synthesizing partial data tables (see the sketch after this list).
⚙️ When using DataFrame Batch mode, we now will write out the original Training DF header order to the model directory. When a model is loaded from disk, the resulting generated DataFrame will have the columns ordered the way they were in the training data.
🐛 When using DP mode, we (temporarily) patch TensorFlow 2.4.x to use the new Keras LSTM code paths. The patch applies globally to Keras within the running Python interpreter and provides a drastic speedup when training a DP model.
📖 Doc updates for new seeding features.
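A minimal sketch tying the seed-list and column-ordering changes together, assuming batch mode accepts a list of per-record seed dicts via `seed_fields`; the field names and values below are illustrative.

```python
from gretel_synthetics.batch import DataFrameBatch

# Sketch: generate one record per seed (the 1:1 mapping described
# above). The seed_fields list-of-dicts shape and the field names
# are illustrative assumptions.
batcher = DataFrameBatch(mode="read", checkpoint_dir="./checkpoints")
seeds = [
    {"age": 34, "state": "CA"},
    {"age": 52, "state": "NY"},
]
batcher.generate_all_batch_lines(seed_fields=seeds)
# Columns come back in the original training-DF header order
df = batcher.batches_to_df()
```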
Modular refactor, tokenizers, and differential privacy, oh my!
Major changes:
- Totally refactored modules and package structure. This will enable future contributions to use other underlying ML libraries as the core engine. Configurations are now specific to the underlying engine: `LocalConfig` can be replaced with `TensorFlowConfig`, although the former is still supported for backwards compatibility.
- With TensorFlow 2.4.x, TensorFlow Privacy can be used to provide differentially private training with modified Keras DP optimizers.
- Added a new tokenizer module that can be used independently of the underlying model training. By default, we continue to use SentencePiece as the tokenizer; we have also added a char-by-char tokenizer that can be useful when using differential privacy.
- Misc bug fixes and optimizations.
- Changes in this release are backwards compatible with previous versions.
Please see our updated README and examples directory.
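As a rough sketch of how the refactored pieces fit together, assuming `train` accepts an optional tokenizer trainer and that the DP knobs shown are exposed on `TensorFlowConfig`; the values are illustrative, not tuned for real privacy guarantees.

```python
from gretel_synthetics.config import TensorFlowConfig
from gretel_synthetics.tokenizers import CharTokenizerTrainer
from gretel_synthetics.train import train

# Sketch of the refactored API. DP parameter names and values are
# assumptions for illustration only.
config = TensorFlowConfig(
    input_data_path="training_data.csv",  # illustrative path
    checkpoint_dir="./checkpoints",
    dp=True,                   # differentially private training
    dp_noise_multiplier=0.01,  # assumed DP knob
    dp_l2_norm_clip=1.5,       # assumed DP knob
)
# Char-by-char tokenization pairs well with DP, per the notes above
train(config, CharTokenizerTrainer(config=config))
```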
RC0 0.15.0
v0.15.0.rc0: Update README.md
Smart seeding
Enable "Smart Seeding" which allows a prefix to be provided during line generation. The generator will complete the line based on the provided seed. When training on structured data (DataFrames) this enables the first N column values to be pre-provided and then remaining columns will be generated based on the initial values.
v0.14.0: Jm/syn 21 (#58)
- Introduce Keras early stopping and save-best-model features. The default number of epochs is now 100, which should allow most training runs to stop automatically without over-fitting.
- Provide better tracking of which epoch's model was used as the best one in the model history table.
- Temporarily disable DP mode.
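A minimal sketch of how these defaults play out, assuming early stopping engages automatically during training; paths are illustrative.

```python
from gretel_synthetics.config import LocalConfig
from gretel_synthetics.train import train_rnn

# Sketch: with the new defaults, training runs for at most 100
# epochs, and early stopping usually halts sooner, keeping the
# best-performing epoch's weights.
config = LocalConfig(
    input_data_path="training_data.csv",  # illustrative path
    checkpoint_dir="./checkpoints",
    epochs=100,  # new default in this release
)
train_rnn(config)
```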