
Isotonic Regression #223

Merged
merged 10 commits into rust-ml:master on Jul 26, 2022

Conversation

wildart
Contributor

@wildart wildart commented May 27, 2022

An isotonic regression implementation based on the pool adjacent violators (PAV) algorithm from Best, M.J., Chakravarti, N. Active set algorithms for isotonic regression; a unifying framework. Mathematical Programming 47, 425–439 (1990).
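For context, the PAV algorithm merges adjacent blocks whose means violate the monotonicity constraint and assigns each block its pooled mean. A minimal standalone sketch of the idea (unit weights, increasing case only; the `pav` function and all names are illustrative, not the PR's linfa code):

```rust
// Sketch of pool adjacent violators (PAV) with unit weights.
// Fits the closest non-decreasing step function to `y`.
fn pav(y: &[f64]) -> Vec<f64> {
    // Each block stores (sum, count) of the values pooled into it.
    let mut blocks: Vec<(f64, usize)> = Vec::new();
    for &v in y {
        blocks.push((v, 1));
        // Pool the last two blocks while their means violate monotonicity.
        while blocks.len() > 1 {
            let n = blocks.len();
            let (s1, c1) = blocks[n - 2];
            let (s2, c2) = blocks[n - 1];
            if s1 / c1 as f64 <= s2 / c2 as f64 {
                break;
            }
            blocks.truncate(n - 2);
            blocks.push((s1 + s2, c1 + c2));
        }
    }
    // Expand each block's mean back over its member positions.
    let mut fitted = Vec::with_capacity(y.len());
    for (s, c) in blocks {
        let mean = s / c as f64;
        fitted.extend(std::iter::repeat(mean).take(c));
    }
    fitted
}

fn main() {
    // 3.0 and 2.0 violate monotonicity, so they pool to their mean 2.5.
    let fit = pav(&[1.0, 3.0, 2.0, 4.0]);
    assert_eq!(fit, vec![1.0, 2.5, 2.5, 4.0]);
}
```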

Comment on lines 46 to 50
impl Default for IsotonicRegression {
    fn default() -> Self {
        IsotonicRegression::new()
    }
}
Collaborator

This can just be derived
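A hedged illustration of this suggestion: when `new()` only fills in field defaults, `#[derive(Default)]` can replace the manual impl (the field below is hypothetical, added just to make the snippet compile):

```rust
// Derived instead of hand-written `impl Default`.
#[derive(Default)]
struct IsotonicRegression {
    increasing: bool, // hypothetical field for illustration
}

fn main() {
    let model = IsotonicRegression::default(); // derived, no manual impl needed
    assert!(!model.increasing); // bool defaults to false
}
```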

assert_eq!(y.dim(), n_samples);

// use correlation for determining relationship between x & y
let x = X.slice(s![.., 0]);
Collaborator

Do X.col(0)

// use correlation for determining relationship between x & y
let x = X.slice(s![.., 0]);
let rho = DatasetBase::from(stack![Axis(1), x, y]).pearson_correlation();
let incresing = rho.get_coeffs()[0] >= F::zero();
Collaborator

increasing

let mut i0 = B_zero.2.clone();
i0.extend(&(B_plus.2));
J[i] = (v0, w0, i0);
J.remove(i + 1);
Collaborator

Vector removal is O(n), which is inefficient inside loops. Can you change J to a different data structure so removals are less expensive? Hashmap or Vec<Option<...>> are good options, but you'll need to change the way you index into J.
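An illustrative sketch of the `Vec<Option<...>>` option (not the PR's actual code; the block layout is hypothetical): instead of `J.remove(i + 1)` shifting every later element, a merge leaves a `None` tombstone behind in O(1), and later passes skip `None` slots.

```rust
fn main() {
    // Hypothetical blocks of (weighted mean, weight).
    let mut j: Vec<Option<(f64, f64)>> = vec![
        Some((1.0, 1.0)),
        Some((3.0, 1.0)),
        Some((2.0, 2.0)),
    ];

    // Merge block i+1 into block i in O(1) instead of calling remove().
    let i = 1;
    let (v1, w1) = j[i].take().unwrap();
    let (v2, w2) = j[i + 1].take().unwrap();
    j[i] = Some(((v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2));

    // The merged-away slot stays as a tombstone; iteration skips None.
    assert!(j[i + 1].is_none());
    assert_eq!(j[i], Some((7.0 / 3.0, 3.0)));
}
```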

@wildart
Contributor Author

wildart commented Jun 6, 2022

I rewrote the algorithm and added another implementation.

@wildart wildart requested a review from YuhanLiin June 8, 2022 18:34
@YuhanLiin
Collaborator

YuhanLiin commented Jun 12, 2022

What's the difference and tradeoffs between PVA and AlgorithmA? Is there a reason why we want to include both instead of just including one?

If we really wanted both, then we should get rid of IsotonicRegression, and implement both PVA and AlgorithmA publicly and impl Fit and Predict on both structs. The common code between both algorithms can be added to the IR trait, which can now be private. This significantly decreases the public API surface.
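A rough sketch of that shape (all names and bodies are hypothetical simplifications, not linfa's actual `Fit`/`Predict` traits): the shared logic lives in a crate-private trait, and only the two algorithm structs are public.

```rust
// Crate-private trait holding code common to both algorithms.
trait Ir {
    fn isotonic_fit(&self, y: &[f64]) -> Vec<f64>;
}

// Only the algorithm structs are exposed publicly.
pub struct Pva;
pub struct AlgorithmA;

impl Ir for Pva {
    fn isotonic_fit(&self, y: &[f64]) -> Vec<f64> {
        y.to_vec() // stub; the real PAV logic would go here
    }
}

impl Ir for AlgorithmA {
    fn isotonic_fit(&self, y: &[f64]) -> Vec<f64> {
        y.to_vec() // stub; the real AlgorithmA logic would go here
    }
}

fn main() {
    // Both structs share the private trait's interface.
    assert_eq!(Pva.isotonic_fit(&[1.0, 2.0]), vec![1.0, 2.0]);
    assert_eq!(AlgorithmA.isotonic_fit(&[1.0, 2.0]), vec![1.0, 2.0]);
}
```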

@wildart
Contributor Author

wildart commented Jul 3, 2022

I removed the AlgorithmA code and left only the PVA code. Both algorithms have O(n) complexity; AlgorithmA is simply the dual of PVA, so the solutions they produce are very similar.

After all, I do not think we need two algorithms implemented.

@YuhanLiin
Collaborator

Mention the paper/textbook the algorithm is from in the public docs and you're good to go

@codecov-commenter

Codecov Report

Merging #223 (51a5752) into master (d4bd9c9) will decrease coverage by 0.06%.
The diff coverage is 60.47%.

@@            Coverage Diff             @@
##           master     #223      +/-   ##
==========================================
- Coverage   55.44%   55.38%   -0.07%     
==========================================
  Files          95       96       +1     
  Lines        8774     8916     +142     
==========================================
+ Hits         4865     4938      +73     
- Misses       3909     3978      +69     
Impacted Files Coverage Δ
algorithms/linfa-linear/src/isotonic.rs 60.47% <60.47%> (ø)
...gorithms/linfa-preprocessing/src/linear_scaling.rs 72.03% <0.00%> (-4.24%) ⬇️
algorithms/linfa-bayes/src/base_nb.rs 62.96% <0.00%> (-3.71%) ⬇️
algorithms/linfa-kernel/src/lib.rs 60.21% <0.00%> (-3.06%) ⬇️
src/composing/multi_target_model.rs 64.44% <0.00%> (-3.00%) ⬇️
algorithms/linfa-linear/src/glm/mod.rs 55.55% <0.00%> (-2.99%) ⬇️
algorithms/linfa-svm/src/permutable_kernel.rs 45.12% <0.00%> (-1.87%) ⬇️
...linfa-clustering/src/appx_dbscan/cells_grid/mod.rs 49.12% <0.00%> (-1.76%) ⬇️
algorithms/linfa-nn/src/balltree.rs 54.33% <0.00%> (-1.58%) ⬇️
...lgorithms/linfa-clustering/src/optics/algorithm.rs 48.82% <0.00%> (-1.47%) ⬇️
... and 24 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@YuhanLiin YuhanLiin merged commit 66d1f45 into rust-ml:master Jul 26, 2022