Sparse root VJP for Lasso penalty #274

tristandeleu · 2022-07-14T18:44:41Z

We have been working with @QB3 on a rework of #17, adding support functions to enable implicit differentiation with sparsity-inducing penalties. The main difference is that we mask the linear system solved in root_vjp based on the support of the solution, instead of restricting the system to that support, so that the computation stays jit-compatible. For now, this functionality has only been added to ProximalGradient, but we can add it to other solvers as well if we agree on the API.
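To illustrate why masking is the jit-friendly option, here is a minimal sketch. It is not the code in this PR: `support_mask` and `apply_on_support` are hypothetical names, and the support is assumed to be the set of non-zero coordinates, as for Lasso. Restricting with boolean indexing such as `v[x != 0]` produces an array whose shape depends on the data, which `jax.jit` cannot trace, whereas multiplying by a 0/1 mask keeps every shape fixed.

```python
import jax
import jax.numpy as jnp


def support_mask(x):
    # Hypothetical helper: 0/1 mask with the same shape as x,
    # equal to 1 on the non-zero coordinates of x.
    return (x != 0).astype(x.dtype)


@jax.jit
def apply_on_support(x, v):
    # Masking keeps the shape of v fixed, so this traces under jit.
    return support_mask(x) * v


x = jnp.array([0.0, 1.5, 0.0, -0.3])
v = jnp.array([1.0, 2.0, 3.0, 4.0])
out = apply_on_support(x, v)
```

The entries of `v` outside the support are zeroed out, while the array keeps its original length.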

Specifying the support explicitly ensures that the Jacobian is non-zero only for coordinates in the support of the solution. In the case of Lasso, this also lets us solve the linear system with CG instead of normal CG, since the matrix to invert is then guaranteed to be symmetric. We have written a benchmark to showcase the advantages of masking the linear system to the support for Lasso:

Number of samples: 100
Number of features: 1000
Size of the support of the solution: 5
Numerical Jacobian: [-0.68067934 -0.97320587 -0.87455234 -0.65993433 -1.13817755]
Jacobian w/o support, CG (15.620 sec.): [ 2.5179840e+09 -1.3495498e+09  1.2744818e+07  1.1016900e+10 -1.7797000e+09] (size of the support: 5)
Jacobian w/o support, normal CG (2.249 sec.): [-0.68141943 -0.973256   -0.8743981  -0.6596938  -1.1382269 ] (size of the support: 1000)
Jacobian w/ restricted support (1.473 sec.): [-0.6814198  -0.973256   -0.8743989  -0.65969384 -1.1382272 ] (size of the support: 5)
Jacobian w/ masked support (1.503 sec.): [-0.6814196  -0.97325593 -0.87439895 -0.6596938  -1.1382273 ] (size of the support: 5)
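The "masked" and "restricted" variants compared above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual root_vjp implementation: `masked_solve` is a hypothetical name, the matrix `H` stands in for a symmetric positive-definite system such as `X.T @ X`, and the support is passed in as a 0/1 mask. The masked operator acts as `H` on the support and as the identity elsewhere, so it remains symmetric and CG applies, while all shapes stay fixed for jit.

```python
import jax
import jax.numpy as jnp
from jax.scipy.sparse.linalg import cg


@jax.jit
def masked_solve(H, b, mask):
    # Hypothetical sketch: solve H[S, S] u[S] = b[S] without changing shapes.
    def matvec(v):
        # H on the support, identity off the support (still symmetric).
        return mask * (H @ (mask * v)) + (1.0 - mask) * v

    u, _ = cg(matvec, mask * b, tol=1e-6, maxiter=100)
    return mask * u


# Small synthetic problem (made-up data, for illustration only).
X = jax.random.normal(jax.random.PRNGKey(0), (20, 6))
H = X.T @ X
b = jnp.arange(1.0, 7.0)
mask = jnp.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0])
u_masked = masked_solve(H, b, mask)

# Reference: explicitly restrict the system to the support (not jit-friendly
# in general, since the support size depends on the data).
idx = jnp.array([0, 2, 5])
u_restricted = jnp.linalg.solve(H[idx][:, idx], b[idx])
```

On the support, `u_masked` should agree with `u_restricted` up to solver tolerance, and it is zero elsewhere.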

(more details about the PR to be added)

mblondel · 2022-07-18T17:35:53Z

Thanks @tristandeleu and @QB3 for this work, this is very interesting! If I understand your main point correctly, masking is almost as fast as restricting, but it is jit-friendly and allows using CG in the case of the Lasso?

I would be curious to know the benchmark results on GPU.

By the way, I see the PR is currently marked as draft. Let me know when you would like me to review :)

tristandeleu and others added 12 commits July 14, 2022 14:34

Add support functions

2ebe8e5

Add support function in root_vjp

0f70ddb

Add support function in ProximalGradient solver

1cc424e

Add documentation

156b41a

Add test for support and jit

d0af896

Add benchmark for sparse root vjp

b9e7e92

Fix comment in implicit_diff_test

9455576

Add benchmark with solve_normal_cg

de7056e

Add size of the support in benchmark

82676cd

added bench sparse gradient

3d92196

little fix, lambda max

3ee2dfe

Decrease the tolerance in lasso_skl_jac

e65d5ed
