[linear-ridge-cholesky] Implement a linear regression model with SPU #297

magic-hya · 2023-08-10T12:05:09Z

What problem does this PR solve?

I have read the CLA Document and I hereby sign the CLA
Issue Number: Fixed #274

Implements ridge regression using Cholesky decomposition; the emul and test unit tests were run in the SPU image.

Possible side effects?

Performance:
On the diabetes dataset, the error relative to sklearn is between 0.01 and 0.02.
Backward compatibility:

Signed-off-by: magic-hya <huangya@asiainfo.com>

github-actions · 2023-08-10T12:05:27Z

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

Candicepan · 2023-08-10T12:31:48Z

Please sign the CLA by following the prompt above.

tpppppub

Hi, apart from the changes under sml/linear_model, please do not commit the files under .vscode and examples.

magic-hya · 2023-08-11T02:08:29Z

Hi, apart from the changes under sml/linear_model, please do not commit the files under .vscode and examples.

OK.


magic-hya · 2023-08-11T06:18:30Z

recheck

magic-hya · 2023-08-11T06:33:38Z

I have read the CLA Document and I hereby sign the CLA

anakinxc · 2023-08-11T06:39:13Z

recheck

anakinxc · 2023-08-11T06:43:56Z

Hi @magic-hya

Please use buildifier to format bazel files.
Also format python files with black formatter

Thanks

deadlywing

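Also, the sml directory structure has been refactored. Please place the files according to the current directory structure and update the corresponding BUILD.bazel files.

Thanks for your support.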

deadlywing · 2023-08-11T02:11:57Z

sml/linear_model/BUILD.bazel

+    name = "ridge",
+    srcs = ["ridge.py"],
+    deps = [
+        "//sml/utils:fxp_approx",


fxp_approx does not appear to be used; this dep can be removed.

deadlywing · 2023-08-11T02:14:02Z

sml/linear_model/ridge.py

+          dot(X.T, X)
+    """
+
+    def __init__(self, alpha=1.0, solver="lsqr") -> None:


An option controlling whether the bias term is fitted also needs to be supported.
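One common way to support such an option is to center the data before solving, so that the bias is not penalized. The sketch below uses plain NumPy rather than SPU/JAX code; the function name `fit_ridge`, its signature, and the toy data are assumptions for illustration and are not taken from the PR.

```python
import numpy as np


def fit_ridge(x, y, alpha=1.0, fit_bias=True):
    """Return (coef, bias) for ridge regression; bias is 0.0 when fit_bias is False.

    Hypothetical helper: the bias is handled by centering, so it is not penalized.
    """
    if fit_bias:
        x_mean = x.mean(axis=0)
        y_mean = y.mean(axis=0)
        x_c, y_c = x - x_mean, y - y_mean
    else:
        x_c, y_c = x, y
    # Regularized normal equations on the (possibly centered) data.
    a = x_c.T @ x_c + alpha * np.eye(x.shape[1])
    coef = np.linalg.solve(a, x_c.T @ y_c)
    # Recover the bias from the means once the coefficients are known.
    bias = y_mean - x_mean @ coef if fit_bias else 0.0
    return coef, bias


rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = x @ np.array([2.0, -1.0, 0.5]) + 10.0  # noiseless data with bias 10
coef, bias = fit_ridge(x, y, alpha=1e-6, fit_bias=True)
```

With `fit_bias=False` the same function solves the uncentered problem and reports a bias of 0.0.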

deadlywing · 2023-08-11T02:15:49Z

sml/linear_model/ridge.py

+        if y.ndim == 1:
+            y = y.reshape(-1, 1)
+        alpha = jnp.asarray(self.alpha, dtype=x.dtype).ravel()
+        print(f"<<<solver: {self.solver}")


Please don't put print statements in the main algorithm logic.

deadlywing · 2023-08-11T07:14:02Z

sml/linear_model/ridge_test.py

+
+        x1, x2, y = dsutil.load_dataset_by_config(dataset_config)
+        result = spsim.sim_jax(sim, proc)(x1, x2, y)
+        print(result[:10])


The unit test needs to compare against the plaintext sklearn result.
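A comparison of this kind can be sketched as a test that asserts on the difference instead of printing it. Since SPU and sklearn may not be available wherever this snippet runs, a float32 computation stands in for the SPU output and a float64 closed-form solution stands in for the plaintext sklearn result. Both stand-ins, the helper `ridge_predict`, and the tolerance are assumptions for illustration only.

```python
import unittest

import numpy as np


def ridge_predict(x, y, alpha, dtype):
    """Closed-form ridge fit and in-sample prediction at the given precision."""
    x = x.astype(dtype)
    y = y.astype(dtype)
    a = x.T @ x + dtype(alpha) * np.eye(x.shape[1], dtype=dtype)
    coef = np.linalg.solve(a, x.T @ y)
    return x @ coef


class RidgeComparisonTest(unittest.TestCase):
    def test_ridge(self):
        rng = np.random.default_rng(0)
        x = rng.normal(size=(100, 5))
        y = x @ np.arange(1.0, 6.0) + 0.1 * rng.normal(size=100)
        reference = ridge_predict(x, y, 1.0, np.float64)  # stand-in for sklearn
        candidate = ridge_predict(x, y, 1.0, np.float32)  # stand-in for SPU output
        # Fail the test when the two results drift apart, rather than printing them.
        np.testing.assert_allclose(candidate, reference, atol=1e-3)
```

In a real test the reference would come from the plaintext library and the candidate from the simulator, with the tolerance chosen from the observed fixed-point error.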

deadlywing · 2023-08-11T07:14:50Z

sml/linear_model/ridge_emul.py

+
+        # run
+        result = emulator.run(proc)(x1, x2, y)
+        print(result[:10])


Please compare against the plaintext sklearn result and state the size of the error.

deadlywing · 2023-08-11T07:16:38Z

sml/linear_model/ridge.py

+
+
+class Solver(Enum):
+    SVD = 'svd'  # not supported


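extmath.py in utils currently has an SVD implementation, but we found that it overflows fairly easily under fixed-point arithmetic. If you are interested, you can call the API there to implement this solver.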
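For reference, the textbook SVD form of the ridge solution can be sketched in plain NumPy as below. This does not use the API in extmath.py (whose interface is not shown in this thread) and says nothing about fixed-point behaviour; the function name `ridge_svd` and the toy data are assumptions for illustration.

```python
import numpy as np


def ridge_svd(x, y, alpha=1.0):
    """Ridge coefficients from the thin SVD X = U diag(s) V^T."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    # Each singular direction is scaled by s / (s^2 + alpha).
    d = s / (s**2 + alpha)
    return vt.T @ (d * (u.T @ y))


rng = np.random.default_rng(0)
x = rng.normal(size=(60, 4))
y = rng.normal(size=60)
w_svd = ridge_svd(x, y, alpha=1.0)
# Same problem solved through the regularized normal equations.
w_ne = np.linalg.solve(x.T @ x + 1.0 * np.eye(4), x.T @ y)
```

Both routes give the same coefficients in floating point; the squared singular values in the denominator are one place where a fixed-point implementation would need care with range.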

deadlywing · 2023-08-11T07:18:16Z

sml/linear_model/ridge.py

+    Xy = jnp.dot(x.T, y)
+
+    for i in range(n_features):
+        A = A.at[i, i].set(A[i][i] + alpha[0])


Just add a diagonal matrix directly; there is no need to assign in a loop.
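The suggestion can be sketched as follows, in plain NumPy rather than SPU/JAX code. The function name `ridge_cholesky` and the toy data are assumptions for illustration and are not the PR's implementation; in JAX the analogous expression would use `jnp.eye`.

```python
import numpy as np


def ridge_cholesky(x, y, alpha=1.0):
    """Solve (X^T X + alpha * I) w = X^T y via a Cholesky factorization.

    Hypothetical helper for illustration only.
    """
    n_features = x.shape[1]
    # Add alpha to every diagonal entry in one vectorized step,
    # instead of looping over A.at[i, i].set(...).
    a = x.T @ x + alpha * np.eye(n_features)
    xy = x.T @ y
    # A is symmetric positive definite for alpha > 0, so A = L L^T exists.
    chol = np.linalg.cholesky(a)
    # Solve L z = X^T y, then L^T w = z.
    z = np.linalg.solve(chol, xy)
    return np.linalg.solve(chol.T, z)


rng = np.random.default_rng(0)
x = rng.normal(size=(50, 4))
y = x @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=50)
w = ridge_cholesky(x, y, alpha=1.0)
```

A dedicated triangular solver would exploit the structure of `chol`; the general `np.linalg.solve` is used here only to keep the sketch dependent on NumPy alone.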


magic-hya · 2023-08-13T07:39:14Z

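1. Added the fit_bias option
2. Optimized the matrix computation
3. Formatted the Bazel and Python code
4. Added a comparison test against the sklearn algorithm; the error is below 0.005
Test results: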

[sklearn_result]---------------------------------------------
[199.37653   73.73293  172.66196  161.55786  128.74493  103.354576
  83.321686 128.3768   160.04437  206.40292 ]
[spsim_result]-----------------------------------------------  
[199.3735    73.72925  172.65936  161.55386  128.74008  103.35206
  83.318665 128.37201  160.04198  206.40111 ]
[absolute_error]---------------------------------------------  
[0.00302124 0.003685   0.00259399 0.0039978  0.00485229 0.0025177
 0.00302124 0.00479126 0.00239563 0.0018158  0.00493622 0.00495148
 0.00419617 0.00210571 0.00326538 0.00457764 0.00273132 0.00369263
 0.00405884 0.00289154]

deadlywing

Also, please make sure the committed code has been:
formatted with black (Python code), with import statements re-sorted by isort

deadlywing · 2023-08-14T06:28:03Z

sml/linear_model/ridge.py

+
+from scipy import linalg
+from enum import Enum
+from scipy import linalg, sparse


Please clean up the import statements.

deadlywing · 2023-08-14T06:50:37Z

sml/linear_model/ridge.py

+        """
+        if y.ndim == 1:
+            y = y.reshape(-1, 1)
+        alpha = jnp.asarray(self.alpha, dtype=x.dtype).ravel()


Converting this to an array seems unnecessary? Perhaps it could be converted directly to a float.

deadlywing · 2023-08-14T06:51:34Z

sml/linear_model/tests/ridge_test.py

+
+
+class UnitTests(unittest.TestCase):
+    def test_simple(self):


The test function could be renamed to test_ridge to make its meaning explicit.


magic-hya · 2023-08-14T08:21:21Z

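1. The code has been formatted with black + isort; please check
2. alpha now uses a float directly; this will be adjusted later if multivariate regression is needed
3. The test function name has been fixed
4. Re-ran the test case and changed the display precision of the results
Test results: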

Executing tests from //sml/linear_model/tests:ridge_test
-----------------------------------------------------------------------------
No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
.
----------------------------------------------------------------------
Ran 1 test in 0.838s

OK
[sklearn_result]---------------------------------------------
[199.37653   73.73293  172.66196  161.55786  128.74493  103.354576
  83.321686 128.3768   160.04437  206.40292 ]
[spsim_result]-----------------------------------------------
[199.37305   73.728676 172.6597   161.5527   128.73994  103.349915
  83.31765  128.37166  160.04211  206.39943 ]
[absolute_error]---------------------------------------------
[0.00348 0.00426 0.00226 0.00516 0.00499 0.00466 0.00404 0.00514 0.00226
 0.00349 0.00556 0.00314 0.00468 0.00363 0.00404 0.0065  0.00435 0.00391
 0.0041  0.00439]

deadlywing

LGTM

[linear-ridge-cholesky] Implement a linear regression model with SPU (1201a4b)

tpppppub requested changes Aug 10, 2023

magic-hya added 3 commits August 11, 2023 10:40

Merge branch 'secretflow:main' into linear-ridge-cholesky (4c4c1d2)

[linear-ridge-cholesky] Implement a linear regression model with SPU (81f8e7e)

Merge branch 'linear-ridge-cholesky' of https://github.com/magic-hya/spu into linear-ridge-cholesky (e3393aa)

deadlywing reviewed Aug 11, 2023

magic-hya added 4 commits August 11, 2023 18:52

[linear-ridge-cholesky] Implement a linear regression model with SPU (fa1d667)

Merge branch 'main' into linear-ridge-cholesky (cdccda8)

add bias (109697e)

add bias (6ecb940)

magic-hya requested a review from deadlywing August 13, 2023 07:39

deadlywing reviewed Aug 14, 2023

cleanup (9dbd20c)

magic-hya requested a review from deadlywing August 14, 2023 08:21

deadlywing approved these changes Aug 14, 2023

magic-hya requested a review from tpppppub August 14, 2023 09:05

tpppppub approved these changes Aug 14, 2023

deadlywing merged commit 9857c82 into secretflow:main Aug 14, 2023

github-actions bot locked and limited conversation to collaborators Aug 14, 2023
