Generalize KNNRegressor to multitarget case #328
Conversation
@ablaom could you hold on merging this while I review this PR?
@OkonSamuel I think those suggestions are fair. Are you happy to push the relevant changes to save @mateuszbaran opening a new PR?
@ablaom Ok
@OkonSamuel Thanks for that! Looking more carefully, it seems the testing around weights for regression is a bit light. I'm going to look into this, unless you want to do it?
src/NearestNeighbors.jl (outdated diff)
```julia
if m.weights == :uniform
    preds[i,:] .= sum(values .* w_) / sum(w_)
else
    preds[i,:] .= sum(values .* w_ .* (1.0 .- dists_ ./ sum(dists_))) / (sum(w_) - 1)
```
I'm having trouble understanding the formula on line 147 (which predates this PR). @tlienart, can you recall how this was deduced or where it came from?

Naively, it seems to me that we want to write the prediction as a weighted sum of the target values, where the kth weight is simultaneously proportional to the prescribed sample weight `w_[k]` and inversely proportional to the distance `dist_[k]`. That is, `prediction[i,:] = sum(w_ ./ dist_ .* values) / c`, where `c` is a normalisation constant chosen so that the weights (the coefficients of `values` in the sum) add up to one: `c = sum(w_ ./ dist_)`. However, I can't reconcile this with the given formula. Am I missing something?
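For concreteness, here is a minimal sketch (toy data, not code from the PR; the explicit `dims=1` handling of the multitarget case is an assumption) comparing the prediction described above with the formula currently in the source:

```julia
# Toy data for a single query point: k = 3 neighbours, 2 targets.
values = [1.0 2.0; 3.0 4.0; 5.0 6.0]   # k × ntargets matrix of neighbour targets
w_     = [1.0, 2.0, 1.0]               # per-observation (sample) weights
dists_ = [0.5, 1.0, 2.0]               # distances to the k neighbours

# Prediction proposed in the comment above: coefficients proportional to the
# sample weight and inversely proportional to the distance, summing to one.
c = sum(w_ ./ dists_)
pred_proposed = sum(w_ ./ dists_ .* values; dims=1) / c

# The `else` branch currently in the source, for comparison.
pred_current = sum(values .* w_ .* (1.0 .- dists_ ./ sum(dists_)); dims=1) / (sum(w_) - 1)
```

Running both on the same toy data makes the discrepancy raised here easy to inspect.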
Yeah, I had the same thoughts when I went through. I would propose avoiding passing weights to `fit`, i.e. setting the `supports_weights` trait to `false`, since the weights needed for KNN models are not per-observation weights but per-neighbour weights. So we should stick to using the weights derived from the `weights` parameter passed during KNN model construction (i.e. `:uniform`, `:distance`).
> Yeah had the same thoughts when i went through. I would propose avoid passing weights to fit. i.e setting supports_weights trait to false since the weights […]

I don't see anything wrong with mixing per-sample weights with an inverse square law for the "neighbour" weights (if it's done in a meaningful way). Also, presently, these two KNN models are among the few models that support sample weights and are therefore used for testing 🙌
The numerator in this formula does make sense to me: multiplying by `1 - dist[i]/sum(dist)` should behave numerically better when the distance to one of the neighbors is close to 0. The normalization constant in the denominator looks wrong, though. Here is a comparison of different distance-based weights: https://perun.pmf.uns.ac.rs/radovanovic/publications/2016-kais-knn-weighting.pdf
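To make the concern about the denominator concrete, here is a small check (toy numbers, not from the PR) of whether the coefficients that the current `else` branch assigns to `values` sum to one:

```julia
# Coefficient given to each neighbour's target values by the current formula,
# for one query point.
coeffs(w_, dists_) = w_ .* (1.0 .- dists_ ./ sum(dists_)) ./ (sum(w_) - 1)

# With unit sample weights the coefficients do sum to one ...
sum(coeffs([1.0, 1.0, 1.0], [0.5, 1.0, 2.0]))   # == 1.0

# ... but with non-uniform sample weights they do not, which is the sense in
# which the `sum(w_) - 1` normalisation looks suspect.
sum(coeffs([1.0, 2.0, 1.0], [0.5, 1.0, 2.0]))   # ≈ 0.905
```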
@mateuszbaran Thanks for pointing out the paper. Worth noting that the evaluation there is for classifiers making point predictions (the MLJ classifier is probabilistic and so needs a normalisation not needed there), and the testing was restricted to time-series applications, whereas the current PR is about regression. That said, the paper nicely summarises a number of weighting schemes that probably cover the cases in common use for both probabilistic classification and deterministic regression (the MLJ cases).

(Interestingly, I don't see the `1 - dist[i]/sum(dist)` weighting in the paper, although maybe it's a special case of Macleod??)

It would be nice to implement them all and cite the paper for the definitions, but I would deem that out of scope for the current PR.

For the record, sk-learn implements `1/dist[i]` (with no epsilon-smoothing) and uniform weighting, but also allows a custom function.

I propose we keep the current `1 - dist[i]/sum(dist)` weight (and the `1/dist[i]` weight currently used for the classifier) and do the normalisation post facto, as we do for classification. (I don't believe there's a more efficient way if we are mixing in sample weights.) We'll view this as a "bug fix" (patch release) and open a new issue to generalise and get consistency between regression and classification, which will be breaking. We can do that at the same time as migrating the interface to its own package (#244).

@OkonSamuel @mateuszbaran Happy with this?
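To illustrate what normalising post facto could look like for the regressor, here is a minimal sketch (the function name `predict_row`, the toy data, and the explicit `dims=1` multitarget handling are illustrative assumptions, not code from the PR):

```julia
# Prediction for one query point: mix the per-sample weights `w_` with the
# existing `1 - dist/sum(dist)` neighbour weights, then divide by the actual
# sum of the combined weights rather than `sum(w_) - 1`.
function predict_row(values, w_, dists_)
    w_combined = w_ .* (1.0 .- dists_ ./ sum(dists_))
    return vec(sum(w_combined .* values; dims=1) ./ sum(w_combined))
end

# Toy example: k = 3 neighbours, 2 targets.
predict_row([1.0 2.0; 3.0 4.0; 5.0 6.0], [1.0, 2.0, 1.0], [0.5, 1.0, 2.0])
```

This keeps the current weighting scheme but guarantees the coefficients of `values` sum to one, in line with the post-facto normalisation already done for classification.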
Okay, I'll just merge this as is, despite the mysterious normalization, and we'll fix that when we migrate.
Yes, that paper isn't about kNN regression, but it still nicely collects many weighting functions that could be used here as well.
Your plan sounds great 👍.
Thanks for your help with the PR; I realise it was a bit more involved than you probably thought it would be 😄
No problem! Also thank you for your quick help in getting this done. I learned quite a bit and I think it was worth it.
Continuation of #327