Using weights creates a confusing confusion matrix and inaccurate accuracy score #114
Hi Doug, the formatting issue in the confusion matrix printed in the training logs has been fixed. The fix will be included in an upcoming release (the next one or the one after). Note that after some exploration, the model prediction and evaluation (e.g. programmatic access) appear to be correct in this case, i.e. it is only a display issue.
This is possible. If a training dataset is small, the model self-evaluation will be noisy. Having example weights (for either training or evaluation) further increases this noise.

If you use gradient boosted trees (GBT), the self-evaluation is computed on the validation dataset (which is extracted from the training dataset if not provided). So, if the training dataset is small, the validation dataset is also small, and a discrepancy between the self-evaluation and an evaluation on a test set is expected.

If you use random forests, the self-evaluation is computed using an out-of-bag evaluation. The out-of-bag evaluation is a conservative estimate of the model quality. If the dataset is small, this estimate can be poor. In addition, if the model contains a small number of trees, the out-of-bag evaluation can be biased (in the conservative direction).
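The point about noisy self-evaluation can be made concrete: the sampling error of an accuracy estimate shrinks with the square root of the evaluation-set size. A minimal sketch of the binomial standard error (illustrative only, not part of the library):

```python
import math

def accuracy_std_error(p: float, n: int) -> float:
    """Standard error of an accuracy estimate computed from
    n i.i.d. examples (binomial approximation)."""
    return math.sqrt(p * (1.0 - p) / n)

# A validation split of a few hundred examples is far noisier
# than an evaluation over the full ~15k-example dataset.
se_small = accuracy_std_error(0.95, 500)
se_large = accuracy_std_error(0.95, 15000)
```

Example weights widen this spread further, since a few high-weight examples can dominate the estimate.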
I'm using RandomForestLearner to train a 10-class categorization model using roughly 15000 examples and 12 features. My example set is imbalanced in terms of category distribution, so I need to use class-based weighting to boost the under-represented classes.
I'm post-processing my dataset with weights computed from the entire set:
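The weighting snippet is not visible in this copy of the issue, but a common pattern is inverse-frequency weighting computed from the label distribution. A minimal sketch (the weighting scheme and the `weight` column name are assumptions, not taken from the issue):

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: weight(c) = n / (k * count(c)),
    so under-represented classes get proportionally larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[c]) for c in labels]

# The rare class "b" receives a larger weight than the frequent class "a".
weights = class_weights(["a", "a", "a", "b"])
```

With ydf, such weights are typically attached to the dataset as a numeric column whose name is passed to the learner's `weights` argument, though the exact call used in the issue is not shown.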
The resulting model is effective, but the confusion matrix is confusing. Here is part of the output from `model.describe()`:

Basically unreadable. Here it is again, from `model.self_evaluation()`:

Without weights, the confusion matrix prints integers, as I would expect. With weights, it prints floating-point numbers that don't make much sense. I also believe the accuracy number is incorrect: running predictions with the same model on the training dataset, I count only 186 incorrect predictions out of 15405 (1.2%).
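The floats and the accuracy gap are both consistent with a weighted evaluation: each example contributes its weight (not 1) to its confusion-matrix cell, and accuracy becomes weight-normalized, so it can legitimately differ from the raw 1.2% error rate. A minimal sketch of that arithmetic (illustrative, not the library's actual implementation):

```python
def weighted_confusion_matrix(y_true, y_pred, n_classes, weights=None):
    """Each example adds its weight (not 1) to cell [true][pred],
    so non-integer weights yield a float-valued matrix."""
    if weights is None:
        weights = [1.0] * len(y_true)
    cm = [[0.0] * n_classes for _ in range(n_classes)]
    for t, p, w in zip(y_true, y_pred, weights):
        cm[t][p] += w
    return cm

def weighted_accuracy(y_true, y_pred, weights):
    """Weight-normalized accuracy: correct weight / total weight."""
    correct = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == p)
    return correct / sum(weights)

# One minority-class mistake with weight 3.0: the unweighted accuracy
# is 3/4, but the weighted accuracy drops to 3/6.
y_true, y_pred = [0, 0, 0, 1], [0, 0, 0, 0]
w = [1.0, 1.0, 1.0, 3.0]
cm = weighted_confusion_matrix(y_true, y_pred, 2, w)
```

If the minority classes carry large weights, a handful of minority-class mistakes can pull the weighted accuracy well below the unweighted one.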