Commit

Merge branch 'master' into wrong-mocks
fealho authored Aug 24, 2023
2 parents 9ad029e + 2bfa2e2 commit 0ef099f
Showing 14 changed files with 1,466 additions and 1,445 deletions.
26 changes: 26 additions & 0 deletions HISTORY.md
@@ -1,13 +1,39 @@
# History

## 1.7.0 - 2023-08-22

This release adds 3 new transformers:

1. `UniformEncoder` - A categorical and boolean transformer that converts the column into a uniform distribution.
2. `OrderedUniformEncoder` - The same as above, but the order for the categories can be specified, changing which range in the uniform distribution each category belongs to.
3. `IDGenerator` - A text transformer that drops the input column during transform and returns IDs during reverse transform. The IDs all take the form \<prefix>\<number>\<suffix> and can be configured with a custom prefix, suffix, and starting point.
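
As a rough illustration of the ID format described for `IDGenerator`, the sketch below builds strings of the form \<prefix>\<number>\<suffix> from a starting point. It is plain Python, not the RDT implementation; the helper `make_ids` and its parameter names are hypothetical.

```python
from itertools import count, islice


def make_ids(n, prefix="", suffix="", starting_value=0):
    """Return n IDs shaped like <prefix><number><suffix> (hypothetical helper)."""
    # Count upward from the configured starting point, one number per ID.
    numbers = islice(count(starting_value), n)
    return [f"{prefix}{number}{suffix}" for number in numbers]


print(make_ids(3, prefix="ID_", suffix="_X", starting_value=100))
# ['ID_100_X', 'ID_101_X', 'ID_102_X']
```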

Additionally, the `AnonymizedFaker` is enhanced to support the text sdtype.
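
One way to picture the two uniform encoders listed above: each category gets a sub-interval of [0, 1), and a value is encoded by drawing a number from its category's interval. The sketch below is an assumption about the general idea, not RDT's code. In particular, sizing each interval by category frequency is assumed here, and `fit_intervals`, `encode`, and `decode` are hypothetical names. Passing `order` mimics the ordered variant by changing which range each category occupies.

```python
import random
from collections import Counter


def fit_intervals(values, order=None):
    """Map each category to a sub-interval of [0, 1) sized by its frequency."""
    counts = Counter(values)
    # Without an explicit order, fall back to sorted category names.
    categories = list(order) if order is not None else sorted(counts)
    total = sum(counts[category] for category in categories)
    intervals, start = {}, 0.0
    for category in categories:
        end = start + counts[category] / total
        intervals[category] = (start, end)
        start = end
    return intervals


def encode(values, intervals, seed=0):
    """Replace each category with a random number from its interval."""
    rng = random.Random(seed)
    return [rng.uniform(*intervals[value]) for value in values]


def decode(numbers, intervals):
    """Map each number back to the category whose interval contains it."""
    def lookup(x):
        for category, (start, end) in intervals.items():
            if start <= x < end:
                return category
        # A number on the upper boundary belongs to the last interval.
        return list(intervals)[-1]

    return [lookup(x) for x in numbers]


values = ["a", "b", "a", "c"]
intervals = fit_intervals(values, order=["c", "a", "b"])
print(intervals)
print(decode(encode(values, intervals), intervals))
```

Because frequent categories receive wider intervals, the encoded column is approximately uniform over [0, 1) under this assumption.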

### Deprecations

* The `get_input_sdtype` method is being deprecated in favor of `get_supported_sdtypes`.
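
For callers migrating off `get_input_sdtype`, the sketch below shows the deprecation pattern on stand-in classes: the old method emits a `FutureWarning` and returns the first supported sdtype. `BaseSketch` and `TextSketch` are hypothetical, and the body of `get_supported_sdtypes` is an assumption made only so the example runs on its own.

```python
import warnings


class BaseSketch:
    """Hypothetical stand-in for a transformer base class."""

    INPUT_SDTYPE = None

    @classmethod
    def get_supported_sdtypes(cls):
        # Assumption: a transformer supports at least its INPUT_SDTYPE.
        return [cls.INPUT_SDTYPE]

    @classmethod
    def get_input_sdtype(cls):
        # Deprecated path: warn, then delegate to the new method.
        warnings.warn(
            '`get_input_sdtype` is deprecated. Please use `get_supported_sdtypes` instead.',
            FutureWarning,
        )
        return cls.get_supported_sdtypes()[0]


class TextSketch(BaseSketch):
    INPUT_SDTYPE = 'text'


# Record warnings so the deprecation can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    old_value = TextSketch.get_input_sdtype()

print(old_value, [w.category.__name__ for w in caught])
# text ['FutureWarning']
print(TextSketch.get_supported_sdtypes()[0])
# text
```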

### New Features

* Create IDGenerator transformer - Issue [#675](https://github.com/sdv-dev/RDT/issues/675) by @R-Palazzo
* Add UniformEncoder (and its ordered version) - Issue [#678](https://github.com/sdv-dev/RDT/issues/678) by @R-Palazzo
* Allow me to use AnonymizedFaker with sdtype text columns - Issue [#688](https://github.com/sdv-dev/RDT/issues/688) by @amontanez24

### Maintenance

* Deprecate get_input_sdtype - Issue [#682](https://github.com/sdv-dev/RDT/issues/682) by @R-Palazzo

## 1.6.1 - 2023-08-02

This release updates the default transformers used for certain sdtypes. It also enables the `AnonymizedFaker` and `PseudoAnonymizedFaker` to work with any sdtype besides boolean, categorical, datetime, numerical or text.

### Bugs

* [Enterprise Usage] Unable to assign generic PII transformers (eg. AnonymizedFaker) - Issue [#674](https://github.com/sdv-dev/RDT/issues/674) by @amontanez24

### New Features

* Update the default transformers that HyperTransformer assigns to each sdtype - Issue [#664](https://github.com/sdv-dev/RDT/issues/664) by @amontanez24

## 1.6.0 - 2023-07-12
2 changes: 1 addition & 1 deletion rdt/__init__.py
@@ -5,7 +5,7 @@

__author__ = 'DataCebo, Inc.'
__email__ = 'info@sdv.dev'
-__version__ = '1.7.0.dev1'
+__version__ = '1.7.1.dev0'


import sys
4 changes: 2 additions & 2 deletions rdt/transformers/__init__.py
@@ -145,8 +145,8 @@ def get_transformers_by_type():
     sdtype_transformers = defaultdict(list)
     transformer_classes = BaseTransformer.get_subclasses()
     for transformer in transformer_classes:
-        input_sdtype = transformer.get_input_sdtype()
-        sdtype_transformers[input_sdtype].append(transformer)
+        for sdtype in transformer.get_supported_sdtypes():
+            sdtype_transformers[sdtype].append(transformer)

     return sdtype_transformers
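
The effect of this hunk is that a transformer supporting several sdtypes is now registered under each of them, rather than under a single input sdtype. A self-contained sketch of that grouping follows; `FakeText`, `FakePII`, and `group_by_sdtype` are hypothetical stand-ins, not RDT classes.

```python
from collections import defaultdict


class FakeText:
    """Hypothetical transformer supporting a single sdtype."""

    @classmethod
    def get_supported_sdtypes(cls):
        return ['text']


class FakePII:
    """Hypothetical transformer supporting two sdtypes."""

    @classmethod
    def get_supported_sdtypes(cls):
        return ['pii', 'text']


def group_by_sdtype(transformer_classes):
    """Index transformer classes under every sdtype they support."""
    sdtype_transformers = defaultdict(list)
    for transformer in transformer_classes:
        for sdtype in transformer.get_supported_sdtypes():
            sdtype_transformers[sdtype].append(transformer)
    return sdtype_transformers


grouped = group_by_sdtype([FakeText, FakePII])
print({sdtype: [cls.__name__ for cls in classes] for sdtype, classes in grouped.items()})
# {'text': ['FakeText', 'FakePII'], 'pii': ['FakePII']}
```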

6 changes: 5 additions & 1 deletion rdt/transformers/base.py
@@ -184,7 +184,11 @@ def get_input_sdtype(cls):
             string:
                 Accepted input sdtype of the transformer.
         """
-        return cls.INPUT_SDTYPE
+        warnings.warn(
+            '`get_input_sdtype` is deprecated. Please use `get_supported_sdtypes` instead.',
+            FutureWarning
+        )
+        return cls.get_supported_sdtypes()[0]

     @classmethod
     def get_supported_sdtypes(cls):
2 changes: 1 addition & 1 deletion rdt/transformers/pii/anonymizer.py
@@ -133,7 +133,7 @@ def get_supported_sdtypes(cls):
             Accepted input sdtypes of the transformer.
         """
         unsupported_sdtypes = {
-            'numerical', 'datetime', 'categorical', 'boolean', 'text', None
+            'numerical', 'datetime', 'categorical', 'boolean', None
         }
         all_sdtypes = {cls.INPUT_SDTYPE}
         for transformer in BaseTransformer.get_subclasses():
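
The rest of this method is not shown in the diff. Assuming the unsupported set is applied as a set difference against the collected sdtypes, dropping `'text'` from it is what lets `AnonymizedFaker` report `text` as supported. The sketch below illustrates that assumption only; `supported` and the sample `all_sdtypes` values are hypothetical.

```python
OLD_UNSUPPORTED = {'numerical', 'datetime', 'categorical', 'boolean', 'text', None}
NEW_UNSUPPORTED = {'numerical', 'datetime', 'categorical', 'boolean', None}


def supported(all_sdtypes, unsupported):
    """Assumed filtering step: drop every sdtype in the unsupported set."""
    return sorted(all_sdtypes - unsupported)


# Hypothetical sdtypes collected from the transformer subclasses.
all_sdtypes = {'pii', 'text', 'numerical', 'email'}

print(supported(all_sdtypes, OLD_UNSUPPORTED))
# ['email', 'pii']
print(supported(all_sdtypes, NEW_UNSUPPORTED))
# ['email', 'pii', 'text']
```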
2 changes: 1 addition & 1 deletion setup.cfg
@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 1.7.0.dev1
+current_version = 1.7.1.dev0
commit = True
tag = True
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\.(?P<release>[a-z]+)(?P<candidate>\d+))?
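
To see how the `parse` pattern splits a version such as `1.7.1.dev0`, it can be tried with Python's `re` module. This is a sketch: `parse_version` is a hypothetical helper, and using `re.fullmatch` is a choice made here, not necessarily how the bump tool applies the pattern.

```python
import re

# The pattern from setup.cfg, copied verbatim.
PARSE = (
    r'(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)'
    r'(\.(?P<release>[a-z]+)(?P<candidate>\d+))?'
)


def parse_version(version):
    """Split a version string into its named parts (hypothetical helper)."""
    match = re.fullmatch(PARSE, version)
    if match is None:
        raise ValueError(f'Unrecognized version: {version!r}')
    return match.groupdict()


print(parse_version('1.7.1.dev0'))
# {'major': '1', 'minor': '7', 'patch': '1', 'release': 'dev', 'candidate': '0'}
print(parse_version('1.7.0'))
# {'major': '1', 'minor': '7', 'patch': '0', 'release': None, 'candidate': None}
```

The optional trailing group is what allows both development versions (`.dev0`) and final releases to match.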
2 changes: 1 addition & 1 deletion setup.py
@@ -139,6 +139,6 @@
     test_suite='tests',
     tests_require=tests_require,
     url='https://github.com/sdv-dev/RDT',
-    version='1.7.0.dev1',
+    version='1.7.1.dev0',
     zip_safe=False,
 )
2 changes: 1 addition & 1 deletion tests/contributing.py
@@ -365,7 +365,7 @@ def validate_transformer_performance(transformer):

     print(f'Validating Performance for transformer {transformer.get_name()}\n')

-    sdtype = transformer.get_input_sdtype()
+    sdtype = transformer.get_supported_sdtypes()[0]
     transformers = get_transformers_by_type().get(sdtype, [])
     dataset_generators = get_dataset_generators_by_type().get(sdtype, [])
