API: implement full sample simulation as experimental feature #297

Merged
j-ittner merged 4 commits into 1.2.x from feature/experimental_full_sample_simulation
Sep 7, 2021

j-ittner
Member

This PR adds a hidden option to all …Simulator classes that enables an alternative approach to calculating simulations and confidence intervals:

  • instead of using all models in a LearnerCrossfit, fit a single model on the full dataset and run all simulations with that one model on the full dataset
  • calculate each simulation's confidence interval (CI) from the standard error of the mean of that simulation, instead of using bootstrap sampling

To activate the alternative approach, set the hidden full_sample parameter to True after creating a simulator:

sim = UnivariateUpliftSimulator(crossfit=…, confidence_level=…, n_jobs=…, …)
sim.full_sample = True
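
The CI calculation described in the second bullet can be sketched roughly as follows. This is only an illustration of a CI based on the standard error of the mean, not the code added by this PR: the function name sem_confidence_interval, the sample values, and the choice of a normal quantile (rather than, say, a t quantile) are all assumptions.

```python
import math
from statistics import NormalDist, mean, stdev


def sem_confidence_interval(values, confidence_level=0.95):
    """Return (lower, upper) bounds for the mean of `values`.

    Hypothetical helper: uses a normal approximation, which is an
    assumption and may differ from what the simulator does internally.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a standard error")

    centre = mean(values)
    # standard error of the mean: sample standard deviation / sqrt(n)
    sem = stdev(values) / math.sqrt(n)
    # two-sided quantile of the standard normal for the requested level
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    return centre - z * sem, centre + z * sem


# made-up predictions from one model for one simulated feature value
simulated_predictions = [1.0, 2.0, 3.0, 4.0, 5.0]
lower, upper = sem_confidence_interval(simulated_predictions, confidence_level=0.95)
print(f"mean={mean(simulated_predictions):.3f}, CI=({lower:.3f}, {upper:.3f})")
```

Unlike bootstrap sampling, this needs only one pass over the predictions of a single model, at the cost of relying on the approximation chosen for the quantile.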

@j-ittner j-ittner added the API New feature or request label Aug 29, 2021
@j-ittner j-ittner added this to the 1.2.0 milestone Aug 29, 2021
@j-ittner j-ittner self-assigned this Aug 29, 2021
@mgelsm mgelsm self-requested a review September 7, 2021 08:27
@j-ittner j-ittner merged commit 43c7136 into 1.2.x Sep 7, 2021
@j-ittner j-ittner deleted the feature/experimental_full_sample_simulation branch September 7, 2021 09:04
j-ittner added a commit that referenced this pull request Sep 8, 2021
* BUILD: pin Jinja2 version to <3.0 to prevent breaking the sphinx build (#286)

* BUILD: pin Jinja2 version to <3.0 to prevent breaking the sphinx build

* BUILD: update package requirements

* DEV: update conda development environment in environment.yml

* Update version number to 1.2.0

* Update version to 1.0.4

* DEV: pin scipy to 1.5 in environment.yml (#287)

* API: remove FACET 1.0 legacy inspections (#288)

* BUILD: pin jinja2 to ~=2.11 to prevent incompatibility with sphinx (#289)

* BUILD: standardize azure pipeline across BCG Gamma repos (#292)

* BUILD: create individual build tasks per package in single/matrix builds

* DOC: remove comments

* BUILD: use stable versions of pytools and sklearndf in matrix builds

* FIX: revert path to /dist in CopyFiles task

* BUILD: add more packages to conda build recipe

* BUILD: use python 3.8 in conda host environment

* BUILD: update build requirements incl. increased min requirements detail

* BUILD: remove dev/* trigger

* BUILD: downgrade typing_inspect to 0.6 in max build

* BUILD: change numpy requirements syntax; improve upper version bounds

* BUILD: add ipython to min/max matrix test dependencies

* BUILD: use pip version syntax in pyproject.toml

* BUILD: remove indirect package dependencies from default dependencies

* BUILD: update pyproject.toml

* BUILD: relax max build requirement for scipy

* BUILD: use global variables for project and package name

* BUILD: update standardized azure-pipelines.yml

* FIX: re-remove _balancer.py

* BUILD: fix pytools minimum versions in pyproject.toml

* FIX: joblib version in min matrix build

* BUILD: disable nightly builds on 1.1.x branch

* BUILD: tweaks for 1.2.0 release (#295)

* TEST: fix subsample to remove version dependence on random generator

* FIX: de-duplicate sample index before intersecting with test splits

* API: remove deprecated mirror package facet.simulation.partition

* FIX: update nbsphinx to ~=0.8.5 to address new issue with jinja 3.0.0

* BUILD: fix max matrix versions at maximum

* FIX: correct shap version spec in pyproject.toml

* DOC: update RELEASE_NOTES.rst

* BUILD: require pytools ~=1.2 and sklearndf ~= 1.2

* BUILD: require numpy 1.17

* BUILD: remove dev/* from azure pipeline triggers

* FIX: rephrase max numpy requirement due to possible bug in conda-build

* FIX: build sklearndf 1.2.x in azure-pipelines.yml

* BUILD: use python 3.8 in host requirements

* BUILD: remove unneeded trailing .* from pyproject.toml

* BUILD: add missing version requirements to conda run config

* BUILD: set matplotlib to 3.4 in max test

* BUILD: change matplotlib==3 back to matplotlib==3.* for tox

* BUILD: add inherited requirements so we can use them for testing

* FIX: change boruta into boruta_py

* FIX: change boruta_py back into boruta for tox builds (but not conda)

* BUILD: make min requirements more specific to speed up build time

* BUILD: remove conda dependencies with boruta

Co-authored-by: Jörg Schneider <46053259+joerg-schneider@users.noreply.github.com>

* BUILD: relax upper bounds of package requirements (#296)

* BUILD: relax upper bounds of package requirements

* BUILD: replace matplotlib with matplotlib-base

* BUILD: use matplotlib-base only for conda

* API: improve parameter checking for StratifiedBootstrapCV

* TEST: add unit test for StratifiedBootstrapCV

* FIX: don't require arg y in non-stratified bootstraps

* DEV: update package requirements for facet-develop conda environment

* API: implement full sample simulation as experimental feature (#297)

* API: add hidden option to simulate single models on the full sample

* DOC: fix an outdated (but hidden) docstring

* TEST: validate partition frequencies

* FIX: correct sort order of imports

* Update __init__.py

* FIX: use lists not tuples to create simulation results data frame

Co-authored-by: Jörg Schneider <46053259+joerg-schneider@users.noreply.github.com>
Labels
API New feature or request
2 participants