ZeroDivisionError in merge_ebms #485
Comments
Hi @jfleh, are you using sample weights when generating either of the models? When fitting EBMs, we sum the sample weights of all samples that fall within each bin of each term and store that information in the model's ebm.bin_weights_ attribute. The exception above says that the total of the sample weights for some term is zero (the total across the whole term, not just a zero in one of its bins). I could see a model built with extremely small sample weights producing this naturally, but the conditions would have to be almost impossibly special. The more likely scenario is a bug somewhere in the merge_ebms function, probably related to merging pairs, where a spurious term is somehow created during the merge. I can look through merge_ebms and try to figure that out, but it would be easier with some more information about the model first. If the model is private and cannot be posted here, can you look at the bin_weights_ attributes of your models and see whether any term has all zeros in its weights? If your model is public, could you use the ebm.to_json(FILE_NAME) function to export a JSON representation of the models and post them here, or email them to interpret@microsoft.com? Documentation link:
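For anyone who wants to run the bin_weights_ check described above, here is a minimal sketch. It assumes bin_weights_ is a sequence with one entry per term, each entry being a (possibly nested) array of per-bin weights. The helper names total_weight and zero_weight_terms are hypothetical and not part of interpret.

```python
def total_weight(weights):
    """Sum every bin weight of one term, however deeply it is nested."""
    # numpy arrays expose tolist(); plain nested lists pass through unchanged
    if hasattr(weights, "tolist"):
        weights = weights.tolist()
    if isinstance(weights, (list, tuple)):
        return sum(total_weight(w) for w in weights)
    return float(weights)


def zero_weight_terms(bin_weights):
    """Return the indices of terms whose bin weights sum to exactly zero."""
    return [i for i, w in enumerate(bin_weights) if total_weight(w) == 0.0]


# Toy stand-in for ebm.bin_weights_: two main-effect terms and one pair term
example_bin_weights = [
    [0.0, 3.0, 2.0],           # zero in one bin only, total is non-zero
    [0.0, 0.0],                # every bin is zero: the suspicious case
    [[0.0, 0.0], [0.0, 1.0]],  # pair term stored as a 2-D grid
]
suspicious = zero_weight_terms(example_bin_weights)
```

On a fitted model this would be called as zero_weight_terms(ebm.bin_weights_), and the returned indices identify the terms worth inspecting.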
Hi @paulbkoch, thanks for the response. I do indeed see lots of zeroes in the bin_weights_. I also noticed that I was trying to combine two models that were exactly identical (as a result of being fitted on the same dataset with the same random_state). I am attaching the model. It was created with default parameters and fitted on synthetic data. The predictors are independent of the targets, so there is nothing that can actually be learned from this dataset. I am curious whether the problem is caused by something in the model or by the fact that the two models are identical.
I am still getting the same error, now also with models that are trained on real data that should be able to pick up effects.
…n when a term in the resulting merged model has only a single non-missing bin
I've pushed a fix for this issue, which will be included in our next release. For details, see 0c6c985. In the meantime, you can avoid the issue by not merging models that have features with only 1 value. Such features are entirely useless anyway, so removing them should not affect the model's performance. You can do this with:
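The snippet from the original comment is not preserved in this copy of the thread. As a stand-in, here is a minimal sketch of one way to find such features before fitting. It assumes the training data is available as a dict mapping feature names to lists of values, and that missing values are encoded as None or NaN. The helper names is_missing and single_valued_features are hypothetical.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def single_valued_features(columns):
    """Return names of features with at most one distinct non-missing value."""
    constant = []
    for name, values in columns.items():
        distinct = {v for v in values if not is_missing(v)}
        if len(distinct) <= 1:
            constant.append(name)
    return constant


# Toy data: "b" is constant, "c" has one non-missing value plus missing entries
data = {
    "a": [1, 2, 3],
    "b": [5, 5, 5],
    "c": [None, 1.0, float("nan")],
}
to_drop = single_valued_features(data)
kept = {name: vals for name, vals in data.items() if name not in to_drop}
```

With a pandas DataFrame, the same check can be expressed with DataFrame.nunique, which ignores missing values by default; the models would then be fitted on the remaining columns before merging.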
Thanks @jfleh for reporting this. It was a good bug to fix.
I am trying to merge two EBMs (classifier or regressor, it does not matter which) and I get the following error:
The two models have been fitted on the exact same dataset.