[BUG] The sampling method of the BRFClassifier is different from the paper #838
Comments
That is correct. I had the impression that we had fixed this. The reference should be removed and the docstring should be updated.
Thanks for your answer. The reference is still on the website https://pypi.org/project/imbalanced-learn/#id17
Basically, the reference is everywhere. We should remove the citation from the docs and docstrings, and instead say that we have implemented a variation of random forests adapted to imbalanced data sets.
A sudden announcement 😀, thank you for remembering my first issue.
Indeed, we can get the right behaviour just by changing the default. So let's do that.
Looking at it in more detail, I think that I forgot to change the default of […]. Note that, beforehand, we were passing the minority class entirely to […].
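To make the "change the default" idea concrete, here is a minimal sketch from the user side. It assumes the `sampling_strategy`, `replacement` and `bootstrap` parameters of `BalancedRandomForestClassifier` found in recent imbalanced-learn releases; since the comment above does not name which defaults are meant, the values below are only an illustration of the paper's per-tree balanced bootstrap, not the actual fix.

```python
# Minimal sketch: ask for a balanced resample of *both* classes for every tree.
# The parameter values below are an assumption about what reproduces the
# paper's sampling, not the library's documented fix.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.ensemble import BalancedRandomForestClassifier

# A small imbalanced toy problem (~10% minority class).
X, y = make_classification(
    n_samples=1_000,
    weights=[0.9, 0.1],
    random_state=0,
)
print(Counter(y))

brf = BalancedRandomForestClassifier(
    n_estimators=100,
    sampling_strategy="all",  # resample every class, not only the majority
    replacement=True,         # draw the resampled cases with replacement
    bootstrap=False,          # the balancing step already acts as the bootstrap
    random_state=0,
)
brf.fit(X, y)
```

The intent of this configuration is that each tree sees its own resample of the minority class rather than the full minority class, which is what the paper describes.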
Describe the bug
Hi. The sampling method of BRF is described in the paper Using Random Forest to Learn Imbalanced Data. My interpretation is that the minority samples in each sub-training set are drawn by bootstrap, so each sub-training set is balanced. These balanced sub-training sets are then given to the trees of a traditional random forest.
But in the code of `_local_parallel_build_trees` in `imblearn/ensemble/_forest.py`, I found that all minority samples in the training set are used in every sub-training set, so the minority samples in each tree's sub-training set are the same.
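To make the reported difference concrete, here is a rough standalone sketch (not imbalanced-learn code; the sample counts and helper names are made up) contrasting the per-tree sampling described in the paper with the behaviour described above:

```python
import numpy as np

rng = np.random.default_rng(0)
minority_idx = np.arange(50)        # indices of minority-class samples
majority_idx = np.arange(50, 500)   # indices of majority-class samples
n_min = len(minority_idx)

def paper_subset(rng):
    """Paper's BRF: bootstrap the minority class, then draw the same number
    of majority cases with replacement -> minority samples differ per tree."""
    min_boot = rng.choice(minority_idx, size=n_min, replace=True)
    maj_boot = rng.choice(majority_idx, size=n_min, replace=True)
    return np.concatenate([min_boot, maj_boot])

def reported_subset(rng):
    """Reported behaviour: every tree keeps the whole minority class and only
    the majority class is under-sampled -> minority samples are identical
    across trees."""
    maj_sub = rng.choice(majority_idx, size=n_min, replace=False)
    return np.concatenate([minority_idx, maj_sub])

a, b = paper_subset(rng), paper_subset(rng)
print(np.array_equal(np.sort(a[:n_min]), np.sort(b[:n_min])))  # almost always False

c, d = reported_subset(rng), reported_subset(rng)
print(np.array_equal(c[:n_min], d[:n_min]))  # True: same minority samples every tree
```

The first comparison shows why the paper's scheme gives each tree a different view of the minority class, which is exactly the discrepancy this issue reports.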