[BUG] The negative sampling for ranking gives AUC 0.00 value when we add sampling layer to the Model class
#596
Comments
@rnyak, please triage this bug.
So it looks like this has something to do with eager/graph mode. Running the following with run_eagerly=True gives a non-zero AUC:

import pyarrow as pa
import tensorflow as tf
from merlin.models.tf.data_augmentation.negative_sampling import UniformNegativeSampling
import merlin.models.tf as mm
from merlin.io import Dataset
from merlin.datasets.entertainment import get_movielens

train, valid = get_movielens(variant="ml-100k")
schema = train.schema
schema = schema.without(["rating"])

# keep only positives
train_df = train.to_ddf().compute()
train = Dataset(train_df[train_df["rating_binary"] == 1])
train.schema = schema

model = mm.Model(
    mm.InputBlock(schema),
    UniformNegativeSampling(schema, 5, seed=42),
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("rating_binary"),
)
# eager mode is enabled here
model.compile("adam", metrics=[tf.keras.metrics.AUC()], run_eagerly=True)
model.fit(train, batch_size=2048, epochs=1)
# => 25/25 [======] - 5s 187ms/step - loss: 0.4996 - auc_1: 0.5002
@rnyak For now, since this doesn't work correctly in the model context (without eager mode), we should probably consider it an unsupported feature and only recommend using it in the dataloader in examples/documentation.
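For context, here is a minimal pure-Python sketch of what uniform negative sampling for ranking is expected to produce from a batch that contains only positives. It is not the Merlin Models implementation: `add_uniform_negatives` and its parameters are hypothetical names used for illustration, and the sketch ignores the case where a sampled item happens to be a true positive.

```python
import random


def add_uniform_negatives(positives, n_items, n_per_positive, seed=42):
    """Append uniformly sampled negatives to a batch of positive (user, item) pairs.

    Hypothetical helper for illustration only; not the Merlin Models code.
    Returns a list of (user, item, label) rows.
    """
    rng = random.Random(seed)
    # Every original interaction is kept with label 1.
    rows = [(user, item, 1) for user, item in positives]
    # For each positive, draw n_per_positive random items and label them 0.
    for user, _ in positives:
        for _ in range(n_per_positive):
            rows.append((user, rng.randrange(n_items), 0))
    return rows


batch = add_uniform_negatives(
    [(0, 10), (1, 11), (2, 12), (3, 13)], n_items=100, n_per_positive=5
)
labels = [label for _, _, label in batch]
print(len(batch), sum(labels))  # 24 rows, 4 of them positive
```

The point of the sketch is that the targets must be expanded together with the features. If the loss or metric still receives the original all-positive targets, ranking metrics such as AUC have nothing to rank against.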
@oliverholworthy this looks like it is working in graph mode now.
I still get zero AUC when running the following:

import pyarrow
import tensorflow as tf
from merlin.models.tf.transforms.negative_sampling import InBatchNegatives
import merlin.models.tf as mm
from merlin.io import Dataset
from merlin.datasets.entertainment import get_movielens

train, valid = get_movielens(variant="ml-100k")
schema = train.schema
schema = schema.without(["rating"])

# keep only positives
train_df = train.to_ddf().compute()
train = Dataset(train_df[train_df["rating_binary"] == 1])
train.schema = schema

model = mm.Model(
    mm.InputBlockV2(schema, aggregation=None),
    InBatchNegatives(schema, 5, seed=42),
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("rating_binary"),
)
# default (graph) mode, no run_eagerly
model.compile("adam", metrics=[tf.keras.metrics.AUC()])
model.fit(train, batch_size=2048, epochs=1)
# 25/25 - 1s 13ms/step - loss: 0.2768 - auc_2: 0.0000e+00 - regularization_loss: 0.0000e+00
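One possible reading of an AUC of exactly 0 alongside a low loss, not confirmed anywhere in this thread, is that the metric only ever sees positive labels, i.e. the sampled negatives do not reach it. The sketch below is a plain-Python ROC AUC that treats a 0/0 rate as 0, which I believe is also how tf.keras.metrics.AUC handles an absent class; `roc_auc` and `safe_div` are hypothetical helpers, not library functions.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero (assumed convention)."""
    return num / den if den else 0.0


def roc_auc(labels, scores, num_thresholds=200):
    """Approximate ROC AUC with evenly spaced thresholds and the trapezoidal rule."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for i in range(num_thresholds):
        t = i / (num_thresholds - 1)
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        # (false positive rate, true positive rate) at this threshold
        points.append((safe_div(fp, n_neg), safe_div(tp, n_pos)))
    points.sort()
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Both classes present and perfectly separated: the curve covers the full unit square.
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))  # 1.0

# Only positives present: the false positive rate is always 0, so the area collapses.
print(roc_auc([1, 1, 1, 1], [0.9, 0.8, 0.7, 0.6]))  # 0.0
```

If this is what is happening, a quick check would be to inspect the label values that actually arrive at the metric during a training step and confirm whether any zeros are present.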
Bug description
The negative sampling for ranking gives an AUC of 0.00 and a binary classification accuracy of 0.999 when we add a sampling layer to the mm.Model() class.
Steps/Code to reproduce bug
Please run the code below with any synthetic dataset and corresponding schema to repro the issue:
Expected behavior
We should be getting AUC > 0
Environment details
merlin-tensorflow-training:22.05
image with the latest main branches pulled.