
Isotonic Regression #223

Merged
merged 10 commits into rust-ml:master on Jul 26, 2022

Conversation

wildart
Contributor

@wildart wildart commented May 27, 2022

An isotonic regression implementation based on the pool adjacent violators (PAV) algorithm from Best, M.J., Chakravarti, N. Active set algorithms for isotonic regression; a unifying framework. Mathematical Programming 47, 425–439 (1990).
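For context, the PAV algorithm merges adjacent blocks whose means violate the monotonicity constraint and assigns each block its pooled mean. A minimal standalone sketch of the idea (unit weights, increasing case only; the `pav` function and all names are illustrative, not the PR's linfa code):

```rust
// Sketch of pool adjacent violators (PAV) with unit weights.
// Fits the closest non-decreasing step function to `y`.
fn pav(y: &[f64]) -> Vec<f64> {
    // Each block stores (sum, count) of the values pooled into it.
    let mut blocks: Vec<(f64, usize)> = Vec::new();
    for &v in y {
        blocks.push((v, 1));
        // Pool the last two blocks while their means violate monotonicity.
        while blocks.len() > 1 {
            let n = blocks.len();
            let (s1, c1) = blocks[n - 2];
            let (s2, c2) = blocks[n - 1];
            if s1 / c1 as f64 <= s2 / c2 as f64 {
                break;
            }
            blocks.truncate(n - 2);
            blocks.push((s1 + s2, c1 + c2));
        }
    }
    // Expand each block's mean back over its member positions.
    let mut fitted = Vec::with_capacity(y.len());
    for (s, c) in blocks {
        let mean = s / c as f64;
        fitted.extend(std::iter::repeat(mean).take(c));
    }
    fitted
}

fn main() {
    // 3.0 and 2.0 violate monotonicity, so they pool to their mean 2.5.
    let fit = pav(&[1.0, 3.0, 2.0, 4.0]);
    assert_eq!(fit, vec![1.0, 2.5, 2.5, 4.0]);
}
```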

Comment on lines 46 to 50
impl Default for IsotonicRegression {
    fn default() -> Self {
        IsotonicRegression::new()
    }
}
Collaborator

This can just be derived
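A hedged illustration of this suggestion: when `new()` only fills in field defaults, `#[derive(Default)]` can replace the manual impl (the field below is hypothetical, added just to make the snippet compile):

```rust
// Derived instead of hand-written `impl Default`.
#[derive(Default)]
struct IsotonicRegression {
    increasing: bool, // hypothetical field for illustration
}

fn main() {
    let model = IsotonicRegression::default(); // derived, no manual impl needed
    assert!(!model.increasing); // bool defaults to false
}
```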

assert_eq!(y.dim(), n_samples);

// use correlation for determining relationship between x & y
let x = X.slice(s![.., 0]);
Collaborator

Do X.col(0)

// use correlation for determining relationship between x & y
let x = X.slice(s![.., 0]);
let rho = DatasetBase::from(stack![Axis(1), x, y]).pearson_correlation();
let incresing = rho.get_coeffs()[0] >= F::zero();
Collaborator

increasing

let mut i0 = B_zero.2.clone();
i0.extend(&(B_plus.2));
J[i] = (v0, w0, i0);
J.remove(i + 1);
Collaborator

Vector removal is O(n), which is inefficient inside loops. Can you change J to a different data structure so removals are less expensive? Hashmap or Vec<Option<...>> are good options, but you'll need to change the way you index into J.
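An illustrative sketch of the `Vec<Option<...>>` option (not the PR's actual code; the block layout is hypothetical): instead of `J.remove(i + 1)` shifting every later element, a merge leaves a `None` tombstone behind in O(1), and later passes skip `None` slots.

```rust
fn main() {
    // Hypothetical blocks of (weighted mean, weight).
    let mut j: Vec<Option<(f64, f64)>> = vec![
        Some((1.0, 1.0)),
        Some((3.0, 1.0)),
        Some((2.0, 2.0)),
    ];

    // Merge block i+1 into block i in O(1) instead of calling remove().
    let i = 1;
    let (v1, w1) = j[i].take().unwrap();
    let (v2, w2) = j[i + 1].take().unwrap();
    j[i] = Some(((v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2));

    // The merged-away slot stays as a tombstone; iteration skips None.
    assert!(j[i + 1].is_none());
    assert_eq!(j[i], Some((7.0 / 3.0, 3.0)));
}
```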

@wildart
Contributor Author

wildart commented Jun 6, 2022

I rewrote the algorithm and added another implementation.

@wildart wildart requested a review from YuhanLiin June 8, 2022 18:34
@YuhanLiin
Collaborator

YuhanLiin commented Jun 12, 2022

What's the difference and tradeoffs between PVA and AlgorithmA? Is there a reason why we want to include both instead of just including one?

If we really wanted both, then we should get rid of IsotonicRegression, and implement both PVA and AlgorithmA publicly and impl Fit and Predict on both structs. The common code between both algorithms can be added to the IR trait, which can now be private. This significantly decreases the public API surface.
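A rough sketch of that shape (all names and bodies are hypothetical simplifications, not linfa's actual `Fit`/`Predict` traits): the shared logic lives in a crate-private trait, and only the two algorithm structs are public.

```rust
// Crate-private trait holding code common to both algorithms.
trait Ir {
    fn isotonic_fit(&self, y: &[f64]) -> Vec<f64>;
}

// Only the algorithm structs are exposed publicly.
pub struct Pva;
pub struct AlgorithmA;

impl Ir for Pva {
    fn isotonic_fit(&self, y: &[f64]) -> Vec<f64> {
        y.to_vec() // stub; the real PAV logic would go here
    }
}

impl Ir for AlgorithmA {
    fn isotonic_fit(&self, y: &[f64]) -> Vec<f64> {
        y.to_vec() // stub; the real AlgorithmA logic would go here
    }
}

fn main() {
    // Both structs share the private trait's interface.
    assert_eq!(Pva.isotonic_fit(&[1.0, 2.0]), vec![1.0, 2.0]);
    assert_eq!(AlgorithmA.isotonic_fit(&[1.0, 2.0]), vec![1.0, 2.0]);
}
```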

@wildart
Contributor Author

wildart commented Jul 3, 2022

I removed the AlgorithmA code and left only the PVA code. Both algorithms have O(n) complexity; AlgorithmA is simply the dual of PVA, so the solutions they produce are very similar.

After all, I do not think we need two algorithms implemented.

@YuhanLiin
Collaborator

Mention the paper/textbook the algorithm is from in the public docs and you're good to go

@codecov-commenter

Codecov Report

Merging #223 (51a5752) into master (d4bd9c9) will decrease coverage by 0.06%.
The diff coverage is 60.47%.

@@            Coverage Diff             @@
##           master     #223      +/-   ##
==========================================
- Coverage   55.44%   55.38%   -0.07%     
==========================================
  Files          95       96       +1     
  Lines        8774     8916     +142     
==========================================
+ Hits         4865     4938      +73     
- Misses       3909     3978      +69     
Impacted Files Coverage Δ
algorithms/linfa-linear/src/isotonic.rs 60.47% <60.47%> (ø)
...gorithms/linfa-preprocessing/src/linear_scaling.rs 72.03% <0.00%> (-4.24%) ⬇️
algorithms/linfa-bayes/src/base_nb.rs 62.96% <0.00%> (-3.71%) ⬇️
algorithms/linfa-kernel/src/lib.rs 60.21% <0.00%> (-3.06%) ⬇️
src/composing/multi_target_model.rs 64.44% <0.00%> (-3.00%) ⬇️
algorithms/linfa-linear/src/glm/mod.rs 55.55% <0.00%> (-2.99%) ⬇️
algorithms/linfa-svm/src/permutable_kernel.rs 45.12% <0.00%> (-1.87%) ⬇️
...linfa-clustering/src/appx_dbscan/cells_grid/mod.rs 49.12% <0.00%> (-1.76%) ⬇️
algorithms/linfa-nn/src/balltree.rs 54.33% <0.00%> (-1.58%) ⬇️
...lgorithms/linfa-clustering/src/optics/algorithm.rs 48.82% <0.00%> (-1.47%) ⬇️
... and 24 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@YuhanLiin YuhanLiin merged commit 66d1f45 into rust-ml:master Jul 26, 2022