Hi,
Am I building the dataset correctly so that it is valid for fastFM?
So basically, I have a dataframe containing my user-item interactions, along with the context/features and the labels. I then split this dataframe into two: 1) X, which contains the user-item interactions along with the features, and 2) y, which is the rating.
I then convert the dataframe X into a list of Python dictionaries and use sklearn's DictVectorizer to create a scipy sparse matrix, which I feed to the fastFM model. Here is the code example:
feature_cols = ['profile_id_encoded', 'item_id_encoded',
                'popularity_score', 'is_last_interaction']
X_train = train_interaction[feature_cols]
y_train = train_interaction['ratings'].values.squeeze()
X_val = val_interaction[feature_cols]
y_val = val_interaction['ratings'].values.squeeze()
# X_train and X_val are DataFrames, while y_train and y_val are NumPy arrays
X_train_dicts = X_train.to_dict('records')
X_val_dicts = X_val.to_dict('records')
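from fastFM import als
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics import mean_squared_error
import scipy.sparse as sp
# fit the vectorizer on the training data only
vec = DictVectorizer()
X_train_sparse = vec.fit_transform(X_train_dicts)
# convert the CSR matrix into a csc_matrix for fastFM
fm_X_train = sp.csc_matrix(X_train_sparse)
fm = als.FMRegression(n_iter=10000, init_stdev=0.1, l2_reg_w=0, l2_reg_V=0, rank=5)
fm.fit(fm_X_train, y_train)
# prepare for prediction: reuse the fitted vectorizer (transform, not
# fit_transform) so the validation columns match the training columns
X_val_sparse = vec.transform(X_val_dicts)
fm_X_val = sp.csc_matrix(X_val_sparse)
y_pred = fm.predict(fm_X_val)
# sklearn's signature is mean_squared_error(y_true, y_pred)
print(mean_squared_error(y_val, y_pred))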
The MSE is bad, though: 93%.
Did I do things correctly here? I'd really appreciate any help, thank you.
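For reference, here is a minimal sketch of the vectorizing step on its own, in case it helps narrow things down. It illustrates two points that may matter. DictVectorizer one-hot encodes only string values and passes numeric values through as plain numeric columns, so integer-encoded IDs would need to be cast to strings to become per-user and per-item indicator columns. And a vectorizer fitted on the training records can be reused with `transform` so that the validation matrix has the same columns. The toy records and the helper `ids_to_str` are made up for illustration and are not part of the original code. Whether this explains the high MSE is an assumption, not something verified on the real data.

```python
from sklearn.feature_extraction import DictVectorizer
import scipy.sparse as sp

# Toy records standing in for X_train.to_dict('records'); values are made up.
train_records = [
    {"profile_id_encoded": 0, "item_id_encoded": 10,
     "popularity_score": 0.5, "is_last_interaction": 1},
    {"profile_id_encoded": 1, "item_id_encoded": 11,
     "popularity_score": 0.2, "is_last_interaction": 0},
    {"profile_id_encoded": 0, "item_id_encoded": 11,
     "popularity_score": 0.9, "is_last_interaction": 0},
]
val_records = [
    {"profile_id_encoded": 1, "item_id_encoded": 10,
     "popularity_score": 0.4, "is_last_interaction": 1},
]

ID_COLS = ("profile_id_encoded", "item_id_encoded")


def ids_to_str(records):
    """Hypothetical helper: cast ID fields to str so that DictVectorizer
    one-hot encodes them instead of treating them as numeric values."""
    return [
        {k: (str(v) if k in ID_COLS else v) for k, v in rec.items()}
        for rec in records
    ]


# Fit on the training records only, then reuse the same vectorizer.
vec = DictVectorizer()
fm_X_train = sp.csc_matrix(vec.fit_transform(ids_to_str(train_records)))
fm_X_val = sp.csc_matrix(vec.transform(ids_to_str(val_records)))

# One column per distinct user and item, plus the two numeric features.
print(vec.feature_names_)
print(fm_X_train.shape, fm_X_val.shape)
```

With string IDs, the feature names look like `profile_id_encoded=0` and `item_id_encoded=11`, and both matrices share the same number of columns. IDs that appear only in the validation records are silently dropped by `transform`.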