add ensemble_bootstrap; add backend argument passed to joblib; add base_model_method argument to predict method #71

Open: wants to merge 7 commits into base: main

@chenyangkang (Owner) commented Feb 11, 2025

  1. ensemble_bootstrap argument: defaults to False. If True, the data are bootstrapped once for each ensemble, so users can estimate ensemble-level uncertainty that accounts for variance in the data.
  2. joblib_backend argument: defaults to 'loky'. Other available values include 'threading'; 'multiprocessing' will not work because results are returned as a generator. On some systems only 'threading' works.
  3. base_model_method argument: defaults to None. If None, predict or predict_proba is used depending on the task. This argument is handy if you have a custom base model class with a special prediction function. Note that the dummy model still predicts 0, so the ensemble-aggregated result is an average of zeros and your special prediction function's output. It therefore only makes sense if your special prediction function uses 0 as the absence/control value.
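
To make the caveat in item 3 concrete, here is a minimal sketch of how a method name might be dispatched and how a zero-predicting dummy model dilutes the aggregated result. All class and function names below (`DummyModel`, `CustomModel`, `predict_special`, `predict_with`, `ensemble_mean`) are hypothetical and are not stemflow's internals; the fallback is also simplified to `predict` only, ignoring the `predict_proba` case.

```python
from statistics import mean


class DummyModel:
    """Hypothetical placeholder model that always predicts 0."""

    def predict(self, X):
        return [0.0 for _ in X]


class CustomModel:
    """Hypothetical base model exposing a special prediction method."""

    def predict(self, X):
        return [1.0 for _ in X]

    def predict_special(self, X):
        return [10.0 for _ in X]


def predict_with(model, X, base_model_method=None):
    # Use the named method when the model has it; otherwise fall back to
    # plain `predict` (this is what happens for the dummy model here).
    name = base_model_method or "predict"
    method = getattr(model, name, None)
    if method is None:
        method = model.predict
    return method(X)


def ensemble_mean(models, X, base_model_method=None):
    # Average the per-model predictions point by point.
    per_model = [predict_with(m, X, base_model_method) for m in models]
    return [mean(col) for col in zip(*per_model)]


X = [[0.1], [0.2]]
models = [CustomModel(), DummyModel()]
result = ensemble_mean(models, X, base_model_method="predict_special")
print(result)  # [5.0, 5.0]: the dummy's zeros halve the special output
```

In this toy case the special method returns 10 but the aggregate is 5, which is only meaningful if 0 really is the absence/control value of the special method.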

Only updated for AdaSTEM and STEM, not for SphereAdaSTEM.

codecov bot commented Feb 11, 2025

Codecov Report

Attention: Patch coverage is 97.67442% with 4 lines in your changes missing coverage. Please review.

Project coverage is 90.76%. Comparing base (61efb24) to head (6baf66a).
Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| tests/test_model_custom_base_model_method.py | 93.54% | 2 Missing ⚠️ |
| stemflow/model/AdaSTEM.py | 96.55% | 1 Missing ⚠️ |
| stemflow/model/static_func_AdaSTEM.py | 92.30% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #71      +/-   ##
==========================================
+ Coverage   90.36%   90.76%   +0.40%     
==========================================
  Files          35       38       +3     
  Lines        2594     2751     +157     
==========================================
+ Hits         2344     2497     +153     
- Misses        250      254       +4     


@chenyangkang changed the title from "add ensemble_bootstrap; add backend argument passed to joblib" to "add ensemble_bootstrap; add backend argument passed to joblib; add base_model_method argument to predict method" on Feb 11, 2025
@chenyangkang (Owner, Author) commented:

When you resample/bootstrap the dataframe, or otherwise change its structure, inside a subprocess, thread, or other Python worker, a new copy of the data is stored in that variable, doubling the memory cost. Resampling only the index solves this.
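
A rough sketch of the index-only idea, using NumPy arrays as a stand-in for the dataframe (with pandas, the equivalent lookup would presumably be `df.iloc[idx]`). The helper names `bootstrap_indices` and `rows_for` are hypothetical and not part of stemflow; the point is only that each ensemble holds an integer index array, and rows are materialized for the small subset actually needed.

```python
import numpy as np


def bootstrap_indices(n_rows, seed):
    """Draw n_rows row positions with replacement (one draw per ensemble)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_rows, size=n_rows)


# Toy data standing in for the shared training table (hypothetical).
data = np.arange(20, dtype=float).reshape(10, 2)

# Each ensemble keeps only an integer index array, not a resampled copy
# of the whole table.
ensemble_indices = {ens: bootstrap_indices(len(data), seed=ens) for ens in range(3)}


def rows_for(ens, positions):
    """Materialize only the requested part of an ensemble's bootstrap sample."""
    idx = ensemble_indices[ens][positions]
    return data[idx]  # fancy indexing copies just these rows


subset = rows_for(0, slice(0, 4))
print(subset.shape)  # (4, 2)
```

Seeding by ensemble number, as assumed here, also keeps each ensemble's resample reproducible across workers.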
