diff --git a/doc/spec/model_selection.rst b/doc/spec/model_selection.rst new file mode 100644 index 000000000..17f4d75e1 --- /dev/null +++ b/doc/spec/model_selection.rst @@ -0,0 +1,43 @@ +.. _model_selection: + +================= +Model Selection +================= + +Estimators that derive from :class:`._OrthoLearner` fit first-stage nuisance models on different folds of the data and then fit a final model. +In many cases it will make sense to perform model selection over a number of first-stage models, and the library facilitates this by allowing +a flexible specification of the first-stage models, as any of the following: + + * An sklearn-compatible estimator + + * If the estimator is a known class that performs its own hyperparameter selection via cross-validation (such as :class:`~sklearn.linear_model.LassoCV`), + then this will be done once and then the selected hyperparameters will be used when cross-fitting on each fold + + * If a custom class is used, then it should support a `fit` method and either a `predict` method if the target is continuous or `predict_proba` if the target is discrete. + + * One of the following strings; the exact set of models supported by each of these keywords may vary depending on the version of our package: + + ``"linear"`` + Selects over linear models regularized by L1 or L2 norm + + ``"poly"`` + Selects over regularized linear models with polynomial features of different degrees + + ``"forest"`` + Selects over random forest models + + ``"gbf"`` + Selects over gradient boosting models + + ``"nnet"`` + Selects over neural network models + + ``"automl"`` + Selects over all of the above (note that this can be time-consuming) + + * A list of any of the above + + * An implementation of :class:`.ModelSelector`, which is a class that supports a two-stage model selection and fitting process + (this is used internally by our library and is not generally intended to be used directly by end users). 
+ +Most subclasses also use the string ``"auto"`` as a special default value to automatically select a model from an appropriate smaller subset of models than would be generated by ``"automl"``. diff --git a/doc/spec/spec.rst b/doc/spec/spec.rst index a85af4e08..38917c742 100644 --- a/doc/spec/spec.rst +++ b/doc/spec/spec.rst @@ -12,6 +12,7 @@ EconML User Guide estimation_iv estimation_dynamic inference + model_selection interpretability federated_learning references diff --git a/econml/dml/causal_forest.py b/econml/dml/causal_forest.py index a3affed39..7bac29afb 100644 --- a/econml/dml/causal_forest.py +++ b/econml/dml/causal_forest.py @@ -268,35 +268,23 @@ class CausalForestDML(_BaseDML): Parameters ---------- - model_y: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - Determines how to fit the treatment to the features. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_y: estimator, default ``'auto'`` + Determines how to fit the outcome to the features. - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. 
+ - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - model_t: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Determines how to fit the treatment to the features. str in a sentence - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_t: estimator, default ``'auto'`` + Determines how to fit the treatment to the features. - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise featurizer : :term:`transformer`, optional Must support fit_transform and transform. Used to create composite features in the final CATE regression. diff --git a/econml/dml/dml.py b/econml/dml/dml.py index 01aac0317..d959f39c0 100644 --- a/econml/dml/dml.py +++ b/econml/dml/dml.py @@ -358,35 +358,23 @@ class takes as input the parameter `model_t`, which is an arbitrary scikit-learn Parameters ---------- - model_y: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - Determines how to fit the treatment to the features. - - - If an estimator, will use the model as is for fitting. 
- - If str, will use model associated with the keyword. + model_y: estimator, default ``'auto'`` + Determines how to fit the outcome to the features. - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - model_t: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto + model_t: estimator, default ``'auto'`` Determines how to fit the treatment to the features. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. 
+ - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise model_final: estimator The estimator for fitting the response residuals to the treatment residuals. Must implement @@ -626,35 +614,23 @@ class LinearDML(StatsModelsCateEstimatorMixin, DML): Parameters ---------- - model_y: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - Determines how to fit the treatment to the features. + model_y: estimator, default ``'auto'`` + Determines how to fit the outcome to the features. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. - - model_t: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' + model_t: estimator, default ``'auto'`` Determines how to fit the treatment to the features. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
- - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise featurizer : :term:`transformer`, optional Must support fit_transform and transform. Used to create composite features in the final CATE regression. @@ -873,35 +849,23 @@ class SparseLinearDML(DebiasedLassoCateEstimatorMixin, DML): Parameters ---------- - model_y: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - Determines how to fit the treatment to the features. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_y: estimator, default ``'auto'`` + Determines how to fit the outcome to the features. - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - model_t: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' + model_t: estimator, default ``'auto'`` Determines how to fit the treatment to the features. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise alpha: str or float, default 'auto' CATE L1 regularization applied through the debiased lasso in the final model. @@ -1172,32 +1136,23 @@ class KernelDML(DML): Parameters ---------- - model_y: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - Determines how to fit the treatment to the features. 
+ model_y: estimator, default ``'auto'`` + Determines how to fit the outcome to the features. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. - - model_t: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' + model_t: estimator, default ``'auto'`` Determines how to fit the treatment to the features. - - If an estimator, will use the model as is for fitting. - - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. 
+ - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise fit_cate_intercept : bool, default True Whether the linear CATE model should have a constant term. @@ -1397,32 +1352,23 @@ class NonParamDML(_BaseDML): Parameters ---------- - model_y: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - Determines how to fit the treatment to the features. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_y: estimator, default ``'auto'`` + Determines how to fit the outcome to the features. - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - model_t: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' + model_t: estimator, default ``'auto'`` Determines how to fit the treatment to the features. - - If an estimator, will use the model as is for fitting. - - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise model_final: estimator The estimator for fitting the response residuals to the treatment residuals. Must implement diff --git a/econml/dr/_drlearner.py b/econml/dr/_drlearner.py index 12d06b3a7..2271b15aa 100644 --- a/econml/dr/_drlearner.py +++ b/econml/dr/_drlearner.py @@ -251,35 +251,22 @@ class takes as input the parameter ``model_regressor``, which is an arbitrary sc Parameters ---------- - model_propensity : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Estimator for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. + model_propensity: estimator, default ``'auto'`` + Classifier for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV - - 'forest' - RandomForestClassifier - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options - User-supplied estimators should support 'fit' and 'predict', and 'predict_proba'. 
- - model_regression : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' + model_regression: estimator, default ``'auto'`` Estimator for E[Y | X, W, T]. Trained by regressing Y on (features, controls, one-hot-encoded treatments) concatenated. The one-hot-encoding excludes the baseline treatment. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise model_final : estimator for the final cate model. Trained on regressing the doubly robust potential outcomes @@ -813,35 +800,22 @@ class LinearDRLearner(StatsModelsCateEstimatorDiscreteMixin, DRLearner): Parameters ---------- - model_propensity : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Estimator for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. + model_propensity: estimator, default ``'auto'`` + Classifier for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
+ - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV - - 'forest' - RandomForestClassifier - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options - User-supplied estimators should support 'fit' and 'predict', and 'predict_proba'. - - model_regression : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' + model_regression: estimator, default ``'auto'`` Estimator for E[Y | X, W, T]. Trained by regressing Y on (features, controls, one-hot-encoded treatments) concatenated. The one-hot-encoding excludes the baseline treatment. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise featurizer : :term:`transformer`, optional Must support fit_transform and transform. Used to create composite features in the final CATE regression. 
@@ -1105,35 +1079,22 @@ class SparseLinearDRLearner(DebiasedLassoCateEstimatorDiscreteMixin, DRLearner): Parameters ---------- - model_propensity : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Estimator for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. + model_propensity: estimator, default ``'auto'`` + Classifier for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV - - 'forest' - RandomForestClassifier - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options - User-supplied estimators should support 'fit' and 'predict', and 'predict_proba'. - - model_regression : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' + model_regression: estimator, default ``'auto'`` Estimator for E[Y | X, W, T]. Trained by regressing Y on (features, controls, one-hot-encoded treatments) concatenated. The one-hot-encoding excludes the baseline treatment. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise featurizer : :term:`transformer`, optional Must support fit_transform and transform. Used to create composite features in the final CATE regression. @@ -1407,35 +1368,22 @@ class ForestDRLearner(ForestModelFinalCateEstimatorDiscreteMixin, DRLearner): Parameters ---------- - model_propensity : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Estimator for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. + model_propensity: estimator, default ``'auto'`` + Classifier for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV - - 'forest' - RandomForestClassifier - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options - User-supplied estimators should support 'fit' and 'predict', and 'predict_proba'. - - model_regression : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' + model_regression: estimator, default ``'auto'`` Estimator for E[Y | X, W, T]. 
Trained by regressing Y on (features, controls, one-hot-encoded treatments) concatenated. The one-hot-encoding excludes the baseline treatment. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise discrete_outcome: bool, default False Whether the outcome should be treated as binary diff --git a/econml/iv/dml/_dml.py b/econml/iv/dml/_dml.py index 9b45494c7..d47c1204f 100644 --- a/econml/iv/dml/_dml.py +++ b/econml/iv/dml/_dml.py @@ -210,65 +210,41 @@ class OrthoIV(LinearModelFinalCateEstimatorMixin, _OrthoLearner): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. + model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
+ - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + model_t_xw: estimator, default ``'auto'`` + Determines how to fit the treatment to the features and controls (:math:`\\E[T | X, W]`). - model_t_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W]`. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W, Z]`. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_z_xw: estimator, default ``'auto'`` + Determines how to fit the instrument to the features and controls (:math:`\\E[Z | X, W]`). - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. - - model_z_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Z | X, W]`. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
- - - 'linear' - LogisticRegressionCV if discrete_instrument=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_instrument=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_instrument=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_instrument` is True + and a regressor otherwise projection: bool, default False If True, we fit a slight variant of OrthoIV where we use E[T|X, W, Z] as the instrument as opposed to Z, @@ -1044,50 +1020,32 @@ class DMLIV(_BaseDMLIV): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. 
+ - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - model_t_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Model to estimate :math:`\\E[T | X, W]`. + model_t_xw: estimator, default ``'auto'`` + Determines how to fit the treatment to the features and controls (:math:`\\E[T | X, W]`). - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Model to estimate :math:`\\E[T | X, W, Z]`. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
- - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise model_final : estimator (default is :class:`.StatsModelsLinearRegression`) final model that at fit time takes as input :math:`(Y-\\E[Y|X])`, :math:`(\\E[T|X,Z]-\\E[T|X])` and X @@ -1452,50 +1410,32 @@ class NonParamDMLIV(_BaseDMLIV): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). 
- model_t_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Model to estimate :math:`\\E[T | X, W]`. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + model_t_xw: estimator, default ``'auto'`` + Determines how to fit the treatment to the features and controls (:math:`\\E[T | X, W]`). - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Model to estimate :math:`\\E[T | X, W, Z]`. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). 
- - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise model_final : estimator final model for predicting :math:`\\tilde{Y}` from X with sample weights V(X) diff --git a/econml/iv/dr/_dr.py b/econml/iv/dr/_dr.py index 02f0f6d4e..ccfc6cde5 100644 --- a/econml/iv/dr/_dr.py +++ b/econml/iv/dr/_dr.py @@ -718,83 +718,51 @@ class DRIV(_DRIV): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. + model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + model_t_xw: estimator, default ``'auto'`` + Determines how to fit the treatment to the features and controls (:math:`\\E[T | X, W]`). - model_t_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Model to estimate :math:`\\E[T | X, W]`. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + model_z_xw: estimator, default ``'auto'`` + Determines how to fit the instrument to the features and controls (:math:`\\E[Z | X, W]`). - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_z_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Z | X, W]`. 
+ - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_instrument` is True + and a regressor otherwise - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). - - 'linear' - LogisticRegressionCV if discrete_instrument=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_instrument=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_instrument=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Model to estimate :math:`\\E[T | X, W, Z]`. + model_tz_xw: estimator, default ``'auto'`` + Determines how to fit the covariance to the features and controls (:math:`\\E[T*Z | X, W]` or + :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` depending on `fit_cov_directly`). - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
+ - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. - - model_tz_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T*Z | X, W]` or :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` - depending on `fit_cov_directly`. - Target will be discrete if discrete instrument and discrete treatment with `fit_cov_directly=False`, - else target will be continuous. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete target else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete target else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete target. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` and + `discrete_instrument` are both True and `fit_cov_directly` is False, and a regressor otherwise fit_cov_directly : bool, default True Whether to fit :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` instead of :math:`\\E[T*Z | X, W]`. 
@@ -1253,83 +1221,51 @@ class LinearDRIV(StatsModelsCateEstimatorMixin, DRIV): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_t_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W]`. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_t_xw: estimator, default ``'auto'`` + Determines how to fit the treatment to the features and controls (:math:`\\E[T | X, W]`). 
- - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - model_z_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Z | X, W]`. + model_z_xw: estimator, default ``'auto'`` + Determines how to fit the instrument to the features and controls (:math:`\\E[Z | X, W]`). - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_instrument=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_instrument=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_instrument` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_instrument=True. + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W, Z]`. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + model_tz_xw: estimator, default ``'auto'`` + Determines how to fit the covariance to the features and controls (:math:`\\E[T*Z | X, W]` or + :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` depending on `fit_cov_directly`). - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. 
+ - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_tz_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T*Z | X, W]` or :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` - depending on `fit_cov_directly`. - Target will be discrete if discrete instrument and discrete treatment with `fit_cov_directly=False`, - else target will be continuous. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete target else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete target else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete target. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` and + `discrete_instrument` are both True and `fit_cov_directly` is False, and a regressor otherwise fit_cov_directly : bool, default True Whether to fit :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` instead of :math:`\\E[T*Z | X, W]`. @@ -1623,83 +1559,51 @@ class SparseLinearDRIV(DebiasedLassoCateEstimatorMixin, DRIV): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
- - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). - model_t_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W]`. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + model_t_xw: estimator, default ``'auto'`` + Determines how to fit the treatment to the features and controls (:math:`\\E[T | X, W]`). - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. 
+ - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_z_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Z | X, W]`. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_z_xw: estimator, default ``'auto'`` + Determines how to fit the instrument to the features and controls (:math:`\\E[Z | X, W]`). - - 'linear' - LogisticRegressionCV if discrete_instrument=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_instrument=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_instrument=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_instrument` is True + and a regressor otherwise - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Model to estimate :math:`\\E[T | X, W, Z]`. + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
+ - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + model_tz_xw: estimator, default ``'auto'`` + Determines how to fit the covariance to the features and controls (:math:`\\E[T*Z | X, W]` or + :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` depending on `fit_cov_directly`). - model_tz_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T*Z | X, W]` or :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` - depending on `fit_cov_directly`. - Target will be discrete if discrete instrument and discrete treatment with `fit_cov_directly=False`, - else target will be continuous. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete target else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete target else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete target. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` and + `discrete_instrument` are both True and `fit_cov_directly` is False, and a regressor otherwise fit_cov_directly : bool, default True Whether to fit :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` instead of :math:`\\E[T*Z | X, W]`. @@ -2039,83 +1943,51 @@ class ForestDRIV(ForestModelFinalCateEstimatorMixin, DRIV): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. - - model_t_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W]`. + model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
+ - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + model_t_xw: estimator, default ``'auto'`` + Determines how to fit the treatment to the features and controls (:math:`\\E[T | X, W]`). - model_z_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Z | X, W]`. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - - 'linear' - LogisticRegressionCV if discrete_instrument=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_instrument=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models + model_z_xw: estimator, default ``'auto'`` + Determines how to fit the instrument to the features and controls (:math:`\\E[Z | X, W]`). - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_instrument=True. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W, Z]`. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_instrument` is True + and a regressor otherwise - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. 
+ - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise - model_tz_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T*Z | X, W]` or :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` - depending on `fit_cov_directly`. - Target will be discrete if discrete instrument and discrete treatment with `fit_cov_directly=False`, - else target will be continuous. + model_tz_xw: estimator, default ``'auto'`` + Determines how to fit the covariance to the features and controls (:math:`\\E[T*Z | X, W]` + or :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` depending on `fit_cov_directly`). - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete target else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete target else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete target. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` and + `discrete_instrument` are both True and `fit_cov_directly` is False, and a regressor otherwise fit_cov_directly : bool, default True Whether to fit :math:`\\E[\\tilde{T}*\\tilde{Z} | X, W]` instead of :math:`\\E[T*Z | X, W]`. 
@@ -2707,35 +2579,23 @@ class IntentToTreatDRIV(_IntentToTreatDRIV): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W, Z]`. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). 
- - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise flexible_model_effect : estimator or 'auto' (default is 'auto') a flexible model for a preliminary version of the CATE, must accept sample_weight at fit time. @@ -3026,36 +2886,23 @@ class LinearIntentToTreatDRIV(StatsModelsCateEstimatorMixin, IntentToTreatDRIV): Parameters ---------- - model_y_xw : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. - - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models - - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. 
+ model_y_xw: estimator, default ``'auto'`` + Determines how to fit the outcome to the features and controls (:math:`\\E[Y | X, W]`). + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - model_t_xwz : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - model to estimate :math:`\\E[T | X, W, Z]`. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + model_t_xwz: estimator, default ``'auto'`` + Determines how to fit the treatment to the features, controls, and instrument (:math:`\\E[T | X, W, Z]`). - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise flexible_model_effect : estimator or 'auto' (default is 'auto') a flexible model for a preliminary version of the CATE, must accept sample_weight at fit time. 
diff --git a/econml/panel/dml/_dml.py b/econml/panel/dml/_dml.py index 79614bf71..5124357d9 100644 --- a/econml/panel/dml/_dml.py +++ b/econml/panel/dml/_dml.py @@ -348,35 +348,23 @@ class DynamicDML(LinearModelFinalCateEstimatorMixin, _OrthoLearner): Parameters ---------- - model_y: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' - model to estimate :math:`\\E[Y | X, W]`. + model_y: estimator, default ``'auto'`` + Determines how to fit the outcome to the features. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV if discrete_outcome=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_outcome=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_outcome=True. - - model_t: estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' + model_t: estimator, default ``'auto'`` Determines how to fit the treatment to the features. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
- - - 'linear' - LogisticRegressionCV if discrete_treatment=True else WeightedLassoCVWrapper - - 'forest' - RandomForestClassifier if discrete_treatment=True else RandomForestRegressor - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods, - and additionally 'predict_proba' if discrete_treatment=True. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_treatment` is True + and a regressor otherwise featurizer : :term:`transformer`, optional Must support fit_transform and transform. Used to create composite features in the final CATE regression. diff --git a/econml/policy/_drlearner.py b/econml/policy/_drlearner.py index 2ee38c158..a3018d197 100644 --- a/econml/policy/_drlearner.py +++ b/econml/policy/_drlearner.py @@ -239,34 +239,22 @@ class takes as input the parameter ``model_regressor``, which is an arbitrary sc Parameters ---------- - model_propensity : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Estimator for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. + model_propensity: estimator, default ``'auto'`` + Classifier for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. 
+ - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV - - 'forest' - RandomForestClassifier - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options - User-supplied estimators should support 'fit' and 'predict', and 'predict_proba'. - - model_regression : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' + model_regression: estimator, default ``'auto'`` Estimator for E[Y | X, W, T]. Trained by regressing Y on (features, controls, one-hot-encoded treatments) concatenated. The one-hot-encoding excludes the baseline treatment. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV - - 'forest' - RandomForestClassifier - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise featurizer : :term:`transformer`, optional Must support fit_transform and transform. Used to create composite features in the final CATE regression. 
@@ -651,34 +639,22 @@ class takes as input the parameter ``model_regressor``, which is an arbitrary sc Parameters ---------- - model_propensity : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto', default 'auto' - Estimator for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. + model_propensity: estimator, default ``'auto'`` + Classifier for Pr[T=t | X, W]. Trained by regressing treatments on (features, controls) concatenated. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - - 'linear' - LogisticRegressionCV - - 'forest' - RandomForestClassifier - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. - - If 'auto', model will select over linear and forest models + - Otherwise, see :ref:`model_selection` for the range of supported options - User-supplied estimators should support 'fit' and 'predict', and 'predict_proba'. - - model_regression : estimator, {'linear', 'forest'}, list of str/estimator, or 'auto' + model_regression: estimator, default ``'auto'`` Estimator for E[Y | X, W, T]. Trained by regressing Y on (features, controls, one-hot-encoded treatments) concatenated. The one-hot-encoding excludes the baseline treatment. - - If an estimator, will use the model as is for fitting. - - If str, will use model associated with the keyword. - - - 'linear' - LogisticRegressionCV - - 'forest' - RandomForestClassifier - - If list, will perform model selection on the supplied list, which can be a mix of str and estimators, \ - and then use the best estimator for fitting. 
- - If 'auto', model will select over linear and forest models + - If ``'auto'``, the model will be the best-fitting of a set of linear and forest models - User-supplied estimators should support 'fit' and 'predict' methods. + - Otherwise, see :ref:`model_selection` for the range of supported options; + if a single model is specified it should be a classifier if `discrete_outcome` is True + and a regressor otherwise featurizer : :term:`transformer`, optional Must support fit_transform and transform. Used to create composite features in the final CATE regression.