Remove scikit-learn as a dependency #33

Open
tcrasset opened this issue Mar 21, 2022 · 0 comments

scikit-learn is a huge dependency.
Currently, we only use it in one place, the call to svd_flip in svd.py:

def SVD(
    df: pd.DataFrame, svd_flip: bool = True
) -> Tuple[NDArray[Any], NDArray[Any], NDArray[Any]]:
    """
    ...
    svd_flip: bool
        Whether to use svd_flip on U and V or not.
    ...
    """
    U, s, V = linalg.svd(df, full_matrices=False)
    if svd_flip:
        U, V = sklearn.utils.extmath.svd_flip(U, V)
    return U, s, V

and svd_flip itself is only a few lines long, so it could be copied into the project instead:

def svd_flip(u, v, u_based_decision=True):
    """Sign correction to ensure deterministic output from SVD.

    Adjusts the columns of u and the rows of v such that the loadings in the
    columns in u that are largest in absolute value are always positive.
    [...]

    """
    if u_based_decision:
        # columns of u, rows of v
        max_abs_cols = np.argmax(np.abs(u), axis=0)
        signs = np.sign(u[max_abs_cols, range(u.shape[1])])
        u *= signs
        v *= signs[:, np.newaxis]
    else:
        # rows of v, columns of u
        max_abs_rows = np.argmax(np.abs(v), axis=1)
        signs = np.sign(v[range(v.shape[0]), max_abs_rows])
        u *= signs
        v *= signs[:, np.newaxis]
    return u, v

tas17 self-assigned this on Jun 2, 2022