check user and item matrix for nan entries also in bpr gpu version #731

Open
wants to merge 1 commit into
base: main
@jmorlock jmorlock commented Jan 25, 2025

For some model parameters, the matrix factorization in BayesianPersonalizedRanking fails.
When that happens, some (or all) entries of the user and item matrices become NaN.

This affects both the CPU and the GPU version, but only the CPU version already has a check for it. This pull request adds a similar check to the GPU version and consolidates the source code.
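
As a rough illustration of what such a check amounts to, here is a minimal NumPy-only sketch. The helper name `check_factors_finite` and the use of `ValueError` are assumptions made for this example, not the library's actual implementation; a GPU matrix would first have to be copied to host memory before a test like this could run on it.

```python
import numpy as np


def check_factors_finite(user_factors, item_factors):
    """Hypothetical helper: raise ValueError if either matrix contains NaN."""
    named = (("user_factors", user_factors), ("item_factors", item_factors))
    for name, matrix in named:
        # np.asarray accepts host arrays; GPU data needs a host copy first.
        host = np.asarray(matrix, dtype=np.float32)
        if np.isnan(host).any():
            raise ValueError(f"NaN encountered in {name}")


good = np.ones((2, 3), dtype=np.float32)
bad = np.full((4, 3), np.nan, dtype=np.float32)

check_factors_finite(good, good)  # passes silently

try:
    check_factors_finite(good, bad)
except ValueError as err:
    print(err)
```

Failing loudly right after fitting makes the problem visible at its source, rather than later as odd recommendation results.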

Side note: without this check, the resulting behavior is quite misleading.
No error is raised, but recommend returns items the user already liked, even with filter_already_liked_items set to True. The following test reproduces this:

import implicit
import implicit.gpu.bpr  # explicit import of the GPU BPR submodule used below
import numpy as np
import scipy.sparse as sparse

def test_matrix_nan():
    num_users = 2
    num_items = 4
    factors = 3

    # customer 0 liked items 0 and 1, customer 1 liked items 2 and 3
    customers = np.array([0, 0, 1, 1])
    items = np.array([0, 1, 2, 3])
    quantity = np.ones(len(items))

    user_items = sparse.csr_matrix((quantity, (customers, items)))

    user_factors = implicit.gpu._cuda.Matrix(
        np.full((num_users, factors), np.nan, dtype=np.float32)
    )

    item_factors = implicit.gpu._cuda.Matrix(
        np.full((num_items, factors), np.nan, dtype=np.float32)
    )

    # simulate a failed fit by setting both matrices to NaN
    model = implicit.gpu.bpr.BayesianPersonalizedRanking()
    model.user_factors = user_factors
    model.item_factors = item_factors

    (ids, scores) = model.recommend(
        userid=0,
        user_items=user_items[0],
        N=1,
        filter_already_liked_items=True,
        filter_items=None,
        recalculate_user=False,
        items=None,
    )
    # FAILS: with NaN factors, an already liked item is returned
    assert ids[0] not in {0, 1}
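
To see why all-NaN factors make any ranking unreliable, the following NumPy-only sketch may help. It does not reproduce the library's GPU selection logic; the `sentinel` value standing in for a "filtered" score is purely hypothetical. It only shows that every score computed from NaN factors is itself NaN, and that NaN cannot be ordered relative to any other value.

```python
import numpy as np

factors = 3
user_vector = np.full(factors, np.nan, dtype=np.float32)
item_factors = np.full((4, factors), np.nan, dtype=np.float32)

# Scores are dot products of item and user factors, so every score is NaN.
scores = item_factors @ user_vector

# Hypothetical very low score that a filter might assign to liked items.
sentinel = np.finfo(np.float32).min

print(np.isnan(scores).all())           # all scores are NaN
print(bool((scores > sentinel).any()))  # NaN is never greater than anything
print(bool((scores < sentinel).any()))  # NaN is never less than anything
```

Because every comparison involving NaN evaluates to False, a top-N selection over such scores has no meaningful order to rely on, which is consistent with the filter appearing to have no effect.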
