ZeroDivisionError in merge_ebms #485

jfleh · 2023-11-08T14:22:42Z

I am trying to merge two EBMs (classifier or regressor, it does not matter which) and I get the following error:

Traceback (most recent call last):
  File "/code/trainingmanagerapi.py", line 725, in multiple_local_training
    fitted_model = merge_ebms([fitted_model, ebm2])
  File "/usr/local/lib/python3.9/site-packages/interpret/glassbox/_ebm/_merge_ebms.py", line 719, in merge_ebms
    ) = process_terms(n_classes, ebm.bagged_scores_, ebm.bin_weights_, ebm.bag_weights_)
  File "/usr/local/lib/python3.9/site-packages/interpret/glassbox/_ebm/_utils.py", line 235, in process_terms
    score_mean = np.average(scores, weights=weights)
  File "<__array_function__ internals>", line 180, in average
  File "/usr/local/lib/python3.9/site-packages/numpy/lib/function_base.py", line 547, in average
    raise ZeroDivisionError(
ZeroDivisionError: Weights sum to zero, can't be normalized

The two models have been fitted on the exact same dataset.
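
The final frame of the traceback can be reproduced without interpret at all. The following is a minimal sketch using made-up scores and weights (not values from the reported models); the helper name `weighted_mean_or_none` is hypothetical and exists only for illustration.

```python
import numpy as np


def weighted_mean_or_none(scores, weights):
    """Return the weighted mean, or None when the weights sum to zero."""
    try:
        return float(np.average(scores, weights=weights))
    except ZeroDivisionError:
        # np.average refuses to normalize by a zero weight total.
        return None


# Made-up per-bin scores for a single term.
scores = np.array([0.5, -0.25, 0.0])

# Non-zero total weight: the weighted mean is well defined.
ok = weighted_mean_or_none(scores, np.array([1.0, 1.0, 2.0]))

# All-zero weights: np.average raises ZeroDivisionError, so None is returned.
bad = weighted_mean_or_none(scores, np.zeros(3))
```

This only demonstrates the NumPy behaviour at the bottom of the stack; it says nothing about why the weights reach that state inside merge_ebms.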

paulbkoch · 2023-11-08T18:46:27Z

Hi @jfleh, are you using sample weights when generating either of the models? When fitting EBMs, we sum the sample weights of all the samples within each bin of each term and store that information in the ebm.bin_weights_ attribute of the model. The exception above says that the total of the sample weights for some term is zero (the total across the whole term, not just a zero in one of the bins). I could see a model built with extremely small sample weights doing this naturally, but the conditions would have to be almost impossibly special. The more likely scenario is a bug somewhere in the merge_ebms function, probably in the merging of pairs, where a spurious term is somehow created during the merge.

I can look through merge_ebms and try to figure that out, but it would be easier with some more information about the model first. If the model is private and cannot be posted here, can you look at the bin_weights_ attributes of your models and see whether any term has all zeros in its weights? If your model is public, could you use the ebm.to_json(FILE_NAME) function to export a JSON representation of the models and post them here or email them to interpret@microsoft.com?

Documentation link:
https://interpret.ml/docs/ExplainableBoostingClassifier.html#interpret.glassbox.ExplainableBoostingClassifier.to_json
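
The check requested above could be sketched roughly as follows. It assumes bin_weights_ behaves like a list with one NumPy array per term, as the description above suggests; the arrays here are stand-in data rather than a real fitted model, and `terms_with_zero_total_weight` is a hypothetical helper name.

```python
import numpy as np


def terms_with_zero_total_weight(bin_weights):
    """Return indices of terms whose bin weights sum to zero overall."""
    return [i for i, w in enumerate(bin_weights) if np.sum(w) == 0]


# Stand-in for ebm.bin_weights_ (assumed layout: one array per term).
example_bin_weights = [
    np.array([0.0, 12.0, 8.0]),            # one empty bin, total is non-zero
    np.array([0.0, 0.0, 0.0]),             # every bin empty, total is zero
    np.array([[0.0, 3.0], [5.0, 0.0]]),    # 2-D weights, as for a pair term
]

suspect_terms = terms_with_zero_total_weight(example_bin_weights)
```

With a fitted model, the same helper would presumably be called as `terms_with_zero_total_weight(ebm.bin_weights_)`. Individual zero bins are not a problem on their own; only a term whose total is zero would trigger the exception described above.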

jfleh · 2023-11-09T14:48:42Z

Hi @paulbkoch, thanks for the response. I do indeed see lots of zeros in bin_weights_. I also noticed that I was trying to combine two models that were exactly identical (as a result of being fitted on the same dataset with the same random_state). I am attaching the model. It was created with default parameters and fit on synthetic data. The predictors are independent of the targets, so there is nothing that can actually be learned from this dataset. I am curious whether the problem is caused by something in the model or by the fact that the two models are identical.
model1.txt

jfleh · 2023-11-27T11:19:38Z

I am still getting the same error, now also with models that are trained on real data that should be able to pick up effects.

paulbkoch · 2023-12-04T07:18:44Z

I've pushed a fix for this issue which will be included in our next release. For details see: 0c6c985

In the meantime, you can avoid this issue by not merging models that have features with only 1 value. Such features are entirely useless anyway, so removing them should not affect the model's performance. You can do this with:

import numpy as np
ebm.remove_terms([i for i, scores in enumerate(ebm.term_scores_) if np.sum(np.abs(scores)) == 0])
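
To see what the one-liner selects, here is a small sketch on stand-in data. It assumes term_scores_ is a list with one NumPy array per term; `zero_score_term_indices` is a hypothetical helper, and the arrays are invented for illustration rather than taken from a fitted model.

```python
import numpy as np


def zero_score_term_indices(term_scores):
    """Return indices of terms whose scores are all exactly zero."""
    return [
        i
        for i, scores in enumerate(term_scores)
        if np.sum(np.abs(scores)) == 0
    ]


# Stand-in for ebm.term_scores_ (assumed layout: one array per term).
example_term_scores = [
    np.array([0.0, 0.3, -0.3, 0.0]),   # term that contributes to predictions
    np.array([0.0, 0.0, 0.0]),         # term with no effect at all
    np.zeros((3, 3)),                  # 2-D scores with no effect, as for a pair
]

to_remove = zero_score_term_indices(example_term_scores)
```

The resulting list is what would be passed to ebm.remove_terms in the workaround above. Summing absolute values matters here: a plain sum would also flag terms whose positive and negative scores happen to cancel.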

Thanks @jfleh for reporting this. It was a good bug to fix.

paulbkoch added a commit that referenced this issue Dec 4, 2023

fixes issue #485 where merge_ebms raises a ZeroDivisionError exception when a term in the resulting merged model has only a single non-missing bin

0c6c985

paulbkoch closed this as completed Dec 4, 2023
