User Guide code fixes #989

Merged (2 commits) on Sep 1, 2022
9 changes: 3 additions & 6 deletions docs/api_reference/metrics/relational.rst
@@ -35,12 +35,9 @@ Multi Table Statistical Metrics
CSTest
CSTest.get_subclasses
CSTest.compute
-KSTest
-KSTest.get_subclasses
-KSTest.compute
-KSTestExtended
-KSTestExtended.get_subclasses
-KSTestExtended.compute
+KSComplement
+KSComplement.get_subclasses
+KSComplement.compute

Multi Table Detection Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
9 changes: 3 additions & 6 deletions docs/api_reference/metrics/tabular.rst
@@ -37,12 +37,9 @@ Single Table Statistical Metrics
CSTest
CSTest.get_subclasses
CSTest.compute
-KSTest
-KSTest.get_subclasses
-KSTest.compute
-KSTestExtended
-KSTestExtended.get_subclasses
-KSTestExtended.compute
+KSComplement
+KSComplement.get_subclasses
+KSComplement.compute
ContinuousKLDivergence
ContinuousKLDivergence.get_subclasses
ContinuousKLDivergence.compute
8 changes: 4 additions & 4 deletions docs/user_guides/evaluation/evaluation_framework.rst
@@ -98,21 +98,21 @@ are included within the SDV Evaluation framework. However, the list of
metrics that are applied can be controlled by passing a list with the
names of the metrics that you want to apply.

-For example, if you were interested on obtaining only the ``CSTest`` and
-``KSTest`` metrics you can call the ``evaluate`` function as follows:
+For example, if you were interested in obtaining only the ``CSTest``
+metric, you can call the ``evaluate`` function as follows:

.. ipython:: python
:okwarning:

-evaluate(synthetic_data, real_data, metrics=['CSTest', 'KSTest'])
+evaluate(synthetic_data, real_data, metrics=['CSTest'])


Or, if we want to see the scores separately:

.. ipython:: python
:okwarning:

-evaluate(synthetic_data, real_data, metrics=['CSTest', 'KSTest'], aggregate=False)
+evaluate(synthetic_data, real_data, metrics=['CSTest'], aggregate=False)
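As a rough mental model of the ``aggregate`` flag (a sketch, not the SDV implementation): with ``aggregate=False`` you get one normalized score per metric, and with the default ``aggregate=True`` those normalized scores are averaged into a single number.

```python
# Hedged sketch of how per-metric scores relate to the single aggregate
# score returned by evaluate(). The scores below are illustrative values,
# not output from SDV.

def aggregate_scores(normalized_scores):
    """Average the normalized per-metric scores into one number."""
    return sum(normalized_scores.values()) / len(normalized_scores)

scores = {'CSTest': 0.80, 'KSComplement': 0.64}
print(aggregate_scores(scores))  # 0.72
```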


For more details about all the metrics that exist for the different data modalities
9 changes: 4 additions & 5 deletions docs/user_guides/evaluation/multi_table_metrics.rst
@@ -153,21 +153,20 @@ report back the average score obtained.
The list of such metrics is:

* ``CSTest``: Multi Single Table metric based on the Single Table CSTest metric.
-* ``KSTest``: Multi Single Table metric based on the Single Table KSTest metric.
-* ``KSTestExtended``: Multi Single Table metric based on the Single Table KSTestExtended metric.
+* ``KSComplement``: Multi Single Table metric based on the Single Table KSComplement metric.
* ``LogisticDetection``: Multi Single Table metric based on the Single Table LogisticDetection metric.
* ``SVCDetection``: Multi Single Table metric based on the Single Table SVCDetection metric.
* ``BNLikelihood``: Multi Single Table metric based on the Single Table BNLikelihood metric.
* ``BNLogLikelihood``: Multi Single Table metric based on the Single Table BNLogLikelihood metric.

-Let's try to use the ``KSTestExtended`` metric:
+Let's try to use the ``KSComplement`` metric:

.. ipython::
:verbatim:

-In [6]: from sdv.metrics.relational import KSTestExtended
+In [6]: from sdv.metrics.relational import KSComplement

-In [7]: KSTestExtended.compute(real_data, synthetic_data)
+In [7]: KSComplement.compute(real_data, synthetic_data)
Out[7]: 0.8194444444444443
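As the list above describes, each Multi Single Table metric applies its Single Table counterpart to every table and reports the average. A minimal sketch of that aggregation (the function and table names are illustrative, not the SDV internals):

```python
# Sketch: apply a single-table metric to each real/synthetic table pair
# and average the results, as the Multi Single Table metrics do.

def multi_single_table_score(real_tables, synthetic_tables, metric):
    scores = [
        metric(real_tables[name], synthetic_tables[name])
        for name in real_tables
    ]
    return sum(scores) / len(scores)

# Toy stand-in "metric": fraction of positions where the values match.
def toy_metric(real, synthetic):
    matches = sum(r == s for r, s in zip(real, synthetic))
    return matches / len(real)

real = {'users': [1, 2, 3], 'sessions': [4, 5]}
synthetic = {'users': [1, 2, 0], 'sessions': [4, 5]}
print(multi_single_table_score(real, synthetic, toy_metric))  # (2/3 + 1.0) / 2
```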

Parent Child Detection Metrics
12 changes: 6 additions & 6 deletions docs/user_guides/evaluation/single_table_metrics.rst
@@ -136,7 +136,7 @@ outcome from the test.

Such metrics are:

-* ``sdv.metrics.tabular.KSTest``: This metric uses the two-sample Kolmogorov–Smirnov test
+* ``sdv.metrics.tabular.KSComplement``: This metric uses the two-sample Kolmogorov–Smirnov test
to compare the distributions of continuous columns using the empirical CDF.
The output for each column is 1 minus the KS Test D statistic, which indicates the maximum
distance between the expected CDF and the observed CDF values.
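The score described above can be illustrated with a small self-contained sketch: 1 minus the two-sample Kolmogorov-Smirnov D statistic, computed with plain empirical CDFs (illustrative only; the actual SDV/SDMetrics implementation differs, e.g. in how it handles missing values):

```python
# Sketch of the KS complement score: 1 minus the maximum distance between
# the two empirical CDFs. Function names are illustrative, not SDV API.

def empirical_cdf(sample, x):
    """Fraction of values in `sample` that are less than or equal to x."""
    return sum(v <= x for v in sample) / len(sample)

def ks_complement(real_column, synthetic_column):
    # The D statistic is the maximum distance between the two empirical
    # CDFs; evaluating at every observed value is enough to find it.
    points = sorted(set(real_column) | set(synthetic_column))
    d_statistic = max(
        abs(empirical_cdf(real_column, x) - empirical_cdf(synthetic_column, x))
        for x in points
    )
    return 1 - d_statistic

print(ks_complement([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))  # 1.0 for identical samples
print(ks_complement([1, 2, 3], [10, 20, 30]))           # 0.0 for disjoint ranges
```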
@@ -150,16 +150,16 @@ Let us execute these two metrics on the loaded data:
.. ipython::
:verbatim:

-In [6]: from sdv.metrics.tabular import CSTest, KSTest
+In [6]: from sdv.metrics.tabular import CSTest, KSComplement

In [7]: CSTest.compute(real_data, synthetic_data)
Out[7]: 0.8078084931103922

-In [8]: KSTest.compute(real_data, synthetic_data)
+In [8]: KSComplement.compute(real_data, synthetic_data)
Out[8]: 0.6372093023255814

In each case, the statistical test will be executed on all the compatible columns (so, categorical
-or boolean columns for ``CSTest`` and numerical columns for ``KSTest``), and report the average
+or boolean columns for ``CSTest`` and numerical columns for ``KSComplement``), and report the average
score obtained.

.. note:: If your table does not contain any column of the compatible type, the output of
@@ -173,11 +173,11 @@ metric classes or their names:

In [9]: from sdv.evaluation import evaluate

-In [10]: evaluate(synthetic_data, real_data, metrics=['CSTest', 'KSTest'], aggregate=False)
+In [10]: evaluate(synthetic_data, real_data, metrics=['CSTest', 'KSComplement'], aggregate=False)
Out[10]:
metric name raw_score normalized_score min_value max_value goal
0 CSTest Chi-Squared 0.807808 0.807808 0.0 1.0 MAXIMIZE
-1 KSTest Inverted Kolmogorov-Smirnov D statistic 0.637209 0.637209 0.0 1.0 MAXIMIZE
+1 KSComplement Inverted Kolmogorov-Smirnov D statistic 0.637209 0.637209 0.0 1.0 MAXIMIZE


Likelihood Metrics
44 changes: 2 additions & 42 deletions docs/user_guides/single_table/copulagan.rst
@@ -346,44 +346,6 @@ Now that we have discovered the basics, let's go over a few more
advanced usage examples and see the different arguments that we can pass
to our ``CopulaGAN`` Model in order to customize it to our needs.

-Setting Bounds and Specifying Rounding for Numerical Columns
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-By default, the model will learn the upper and lower bounds of the
-input data, and use that for sampling. This means that all sampled data
-will be between the maximum and minimum values found in the original
-dataset for each numeric column. This option can be overwritten using the
-``min_value`` and ``max_value`` model arguments. These values can either
-be set to a numeric value, set to ``'auto'`` which is the default setting,
-or set to ``None`` which will mean the column is boundless.
-
-The model will also learn the number of decimal places to round to by default.
-This option can be overwritten using the ``rounding`` parameter. The value can
-be an int specifying how many decimal places to round to, ``'auto'`` which is
-the default setting, or ``None`` which means the data will not be rounded.
-
-Since we may want to sample values outside of the ranges in the original data,
-let's pass the ``min_value`` and ``max_value`` arguments as `None` to the model.
-To keep the number of decimals consistent across columns, we can set ``rounding``
-to be 2.
-
-.. ipython:: python
-:okwarning:
-
-model = CopulaGAN(
-primary_key='student_id',
-min_value=None,
-max_value=None,
-rounding=2
-)
-model.fit(data)
-
-unbounded_data = model.sample(10)
-unbounded_data
-
-As you may notice, the sampled data may have values outside the range of
-the original data.
-
Exploring the Probability Distributions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -423,8 +385,7 @@ in our table. We can explore the distributions which the

model = CopulaGAN(
primary_key='student_id',
-min_value=None,
-max_value=None
+enforce_min_max_values=False
)
model.fit(data)
distributions = model.get_distributions()
@@ -520,8 +481,7 @@ Let's see what happens if we make the ``CopulaGAN`` use the
field_distributions={
'experience_years': 'gamma'
},
-min_value=None,
-max_value=None
+enforce_min_max_values=False
)
model.fit(data)

38 changes: 0 additions & 38 deletions docs/user_guides/single_table/ctgan.rst
@@ -345,44 +345,6 @@ Now that we have discovered the basics, let's go over a few more
advanced usage examples and see the different arguments that we can pass
to our ``CTGAN`` Model in order to customize it to our needs.

-Setting Bounds and Specifying Rounding for Numerical Columns
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-By default, the model will learn the upper and lower bounds of the
-input data, and use that for sampling. This means that all sampled data
-will be between the maximum and minimum values found in the original
-dataset for each numeric column. This option can be overwritten using the
-``min_value`` and ``max_value`` model arguments. These values can either
-be set to a numeric value, set to ``'auto'`` which is the default setting,
-or set to ``None`` which will mean the column is boundless.
-
-The model will also learn the number of decimal places to round to by default.
-This option can be overwritten using the ``rounding`` parameter. The value can
-be an int specifying how many decimal places to round to, ``'auto'`` which is
-the default setting, or ``None`` which means the data will not be rounded.
-
-Since we may want to sample values outside of the ranges in the original data,
-let's pass the ``min_value`` and ``max_value`` arguments as `None` to the model.
-To keep the number of decimals consistent across columns, we can set ``rounding``
-to be 2.
-
-.. ipython:: python
-:okwarning:
-
-model = CTGAN(
-primary_key='student_id',
-min_value=None,
-max_value=None,
-rounding=2
-)
-model.fit(data)
-
-unbounded_data = model.sample(10)
-unbounded_data
-
-As you may notice, the sampled data may have values outside the range of
-the original data.
-
How to modify the CTGAN Hyperparameters?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

2 changes: 1 addition & 1 deletion docs/user_guides/single_table/custom_constraints.rst
@@ -174,7 +174,7 @@ would for predefined constraints.
bonus_divis_500
]

-model = GaussianCopula(constraints=constraints, min_value=None, max_value=None)
+model = GaussianCopula(constraints=constraints, enforce_min_max_values=False)

model.fit(employees)

44 changes: 2 additions & 42 deletions docs/user_guides/single_table/gaussian_copula.rst
@@ -350,44 +350,6 @@ Now that we have discovered the basics, let's go over a few more
advanced usage examples and see the different arguments that we can pass
to our ``GaussianCopula`` Model in order to customize it to our needs.

-Setting Bounds and Specifying Rounding for Numerical Columns
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-By default, the model will learn the upper and lower bounds of the
-input data, and use that for sampling. This means that all sampled data
-will be between the maximum and minimum values found in the original
-dataset for each numeric column. This option can be overwritten using the
-``min_value`` and ``max_value`` model arguments. These values can either
-be set to a numeric value, set to ``'auto'`` which is the default setting,
-or set to ``None`` which will mean the column is boundless.
-
-The model will also learn the number of decimal places to round to by default.
-This option can be overwritten using the ``rounding`` parameter. The value can
-be an int specifying how many decimal places to round to, ``'auto'`` which is
-the default setting, or ``None`` which means the data will not be rounded.
-
-Since we may want to sample values outside of the ranges in the original data,
-let's pass the ``min_value`` and ``max_value`` arguments as `None` to the model.
-To keep the number of decimals consistent across columns, we can set ``rounding``
-to be 2.
-
-.. ipython:: python
-:okwarning:
-
-model = GaussianCopula(
-primary_key='student_id',
-min_value=None,
-max_value=None,
-rounding=2
-)
-model.fit(data)
-
-unbounded_data = model.sample(10)
-unbounded_data
-
-As you may notice, the sampled data may have values outside the range of
-the original data.
-
Exploring the Probability Distributions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -427,8 +389,7 @@ in our table. We can explore the distributions which the

model = GaussianCopula(
primary_key='student_id',
-min_value=None,
-max_value=None
+enforce_min_max_values=False
)
model.fit(data)
distributions = model.get_distributions()
@@ -526,8 +487,7 @@ Let's see what happens if we make the ``GaussianCopula`` use the
field_distributions={
'experience_years': 'gamma'
},
-min_value=None,
-max_value=None
+enforce_min_max_values=False
)
model.fit(data)

14 changes: 7 additions & 7 deletions docs/user_guides/single_table/handling_constraints.rst
@@ -129,8 +129,8 @@ datetime column name and value. It also expects an inequality relation that must
)

.. note::
-All SDV tabular models have min_value and max_value parameters that you set to enforce bounds
-on all columns. This constraint is redundant if you set these model parameters.
+All SDV tabular models have an enforce_min_max_values parameter that you set to enforce bounds
+on all columns. This constraint is redundant if you set this model parameter.

Positive and Negative
~~~~~~~~~~~~~~~~~~~~~
@@ -150,8 +150,8 @@ Enforce this by creating a Positive constraint. This object accepts a numerical
age_positive = Positive(column_name='age')

.. note::
-All SDV tabular models have min_value and max_value parameters that you set to enforce bounds
-on all columns. This constraint is redundant if you set these model parameters.
+All SDV tabular models have an enforce_min_max_values parameter that you set to enforce bounds
+on all columns. This constraint is redundant if you set this model parameter.

OneHotEncoding
~~~~~~~~~~~~~~
@@ -250,8 +250,8 @@ ranges are strict (exclusive) or not (inclusive).
)

.. note::
-All SDV tabular models have min_value and max_value parameters that you set to enforce bounds
-on all columns. This constraint is redundant if you set these model parameters.
+All SDV tabular models have an enforce_min_max_values parameter that you set to enforce bounds
+on all columns. This constraint is redundant if you set this model parameter.

Applying the Constraints
------------------------
@@ -272,7 +272,7 @@ to pass in the objects a list.
age_btwn_18_100
]

-model = GaussianCopula(constraints=constraints, min_value=None, max_value=None)
+model = GaussianCopula(constraints=constraints, enforce_min_max_values=False)

Then you can fit the model using the real data. During this process, the SDV ensures that the
model learns the constraints.
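The notes in this file say that ``enforce_min_max_values`` makes models keep sampled values inside the bounds observed during fitting. A rough sketch of that behavior (not SDV's actual implementation; function names are illustrative):

```python
# Sketch of min/max bound enforcement: learn the observed bounds during
# fitting, then clamp sampled values back into that range.

def learn_bounds(column):
    """Record the min and max seen in the fitting data."""
    return min(column), max(column)

def enforce_bounds(column, bounds):
    """Clamp each sampled value into the learned [lo, hi] range."""
    lo, hi = bounds
    return [min(max(v, lo), hi) for v in column]

bounds = learn_bounds([18, 25, 40, 67])
print(enforce_bounds([15, 30, 90], bounds))  # [18, 30, 67]
```

Setting ``enforce_min_max_values=False`` skips this clamping step, which is why the constraints above can then govern the valid ranges on their own.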
38 changes: 0 additions & 38 deletions docs/user_guides/single_table/tvae.rst
@@ -345,44 +345,6 @@ Now that we have discovered the basics, let's go over a few more
advanced usage examples and see the different arguments that we can pass
to our ``TVAE`` Model in order to customize it to our needs.

-Setting Bounds and Specifying Rounding for Numerical Columns
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-By default, the model will learn the upper and lower bounds of the
-input data, and use that for sampling. This means that all sampled data
-will be between the maximum and minimum values found in the original
-dataset for each numeric column. This option can be overwritten using the
-``min_value`` and ``max_value`` model arguments. These values can either
-be set to a numeric value, set to ``'auto'`` which is the default setting,
-or set to ``None`` which will mean the column is boundless.
-
-The model will also learn the number of decimal places to round to by default.
-This option can be overwritten using the ``rounding`` parameter. The value can
-be an int specifying how many decimal places to round to, ``'auto'`` which is
-the default setting, or ``None`` which means the data will not be rounded.
-
-Since we may want to sample values outside of the ranges in the original data,
-let's pass the ``min_value`` and ``max_value`` arguments as `None` to the model.
-To keep the number of decimals consistent across columns, we can set ``rounding``
-to be 2.
-
-.. ipython:: python
-:okwarning:
-
-model = TVAE(
-primary_key='student_id',
-min_value=None,
-max_value=None,
-rounding=2
-)
-model.fit(data)
-
-unbounded_data = model.sample(10)
-unbounded_data
-
-As you may notice, the sampled data may have values outside the range of
-the original data.
-
How to modify the TVAE Hyperparameters?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
