
[BUG] The negative sampling for ranking gives AUC 0.00 value when we add sampling layer to the Model class #596

Closed
rnyak opened this issue Jul 20, 2022 · 5 comments
Labels: bug (Something isn't working), P1, S2

rnyak (Contributor) commented Jul 20, 2022

Bug description

The negative sampling for ranking gives an AUC of 0.00 and a binary classification accuracy of 0.999 when we add the sampling layer to the mm.Model() class.

Steps/Code to reproduce bug

Please run the code below with any synthetic dataset and a corresponding schema to reproduce the issue:

import tensorflow as tf
import merlin.models.tf as mm
from merlin.models.tf.data_augmentation.negative_sampling import UniformNegativeSampling

# `train` is any Merlin Dataset whose `schema` includes a binary "label" target.
sampling = UniformNegativeSampling(schema, 5, seed=42)

model = mm.Model(
    mm.InputBlock(schema),
    sampling,
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("label"),
)

BATCH_SIZE = 2048
model.compile("adam", run_eagerly=False, metrics=[tf.keras.metrics.AUC()])
model.fit(train, batch_size=BATCH_SIZE)
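
For a fully self-contained reproduction, a synthetic dataset can be generated with merlin.datasets.synthetic.generate_data before running the snippet above. This is only a sketch: the "e-commerce" preset and its "click" binary target are assumptions here; pass whichever binary target your generated schema actually contains to BinaryClassificationTask.

from merlin.datasets.synthetic import generate_data

# Assumption: the "e-commerce" synthetic preset, which includes a binary "click" target.
train, valid = generate_data("e-commerce", 10_000, set_sizes=(0.8, 0.2))
schema = train.schema  # use this schema (and e.g. "click" as the task target) in the snippet above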

Expected behavior

We should be getting AUC > 0

Environment details

  • Platform: merlin-tensorflow-training:22.05 image with the latest main branches pulled

rnyak added the bug (Something isn't working) and status/needs-triage labels on Jul 20, 2022
viswa-nvidia commented

@rnyak, please triage this bug.

oliverholworthy (Member) commented

So it looks like this has something to do with eager/graph mode. Running with model.compile(..., run_eagerly=True) produces a non-zero AUC. Example below:

import pyarrow as pa
import tensorflow as tf
from merlin.models.tf.data_augmentation.negative_sampling import UniformNegativeSampling
import merlin.models.tf as mm
from merlin.io import Dataset
from merlin.datasets.entertainment import get_movielens


train, valid = get_movielens(variant="ml-100k")
schema = train.schema
schema = schema.without(["rating"])

# keep only positives
train_df = train.to_ddf().compute()
train = Dataset(train_df[train_df["rating_binary"] == 1])
train.schema = schema

model = mm.Model(
    mm.InputBlock(schema),
    UniformNegativeSampling(schema, 5, seed=42),
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("rating_binary"),
)

model.compile("adam", metrics=[tf.keras.metrics.AUC()], run_eagerly=True)
model.fit(train, batch_size=2048, epochs=1)
# => 25/25 [======] - 5s 187ms/step - loss: 0.4996 - auc_1: 0.5002

oliverholworthy (Member) commented

@rnyak For now, since this doesn't work correctly in the model context (without eager mode), we should probably treat it as an unsupported feature and only recommend using it in the dataloader in examples/documentation.
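
For reference, the underlying technique can also be applied at the data level, outside both the model and the dataloader. The sketch below is a generic pandas illustration of uniform negative sampling, not the Merlin dataloader transform itself; the item_id / rating_binary column names are assumptions matching the MovieLens snippets in this thread, and accidental false negatives (sampled items the user actually interacted with) are not filtered out.

import numpy as np
import pandas as pd

def add_uniform_negatives(df, item_col="item_id", target_col="rating_binary",
                          n_per_positive=5, seed=42):
    """Append n_per_positive uniformly sampled negative rows per positive row."""
    rng = np.random.default_rng(seed)
    positives = df[df[target_col] == 1]
    # Repeat each positive row n_per_positive times to build the negative candidates.
    negatives = positives.loc[positives.index.repeat(n_per_positive)].copy()
    # Replace each item with one drawn uniformly from the observed item catalogue.
    negatives[item_col] = rng.choice(df[item_col].unique(), size=len(negatives))
    negatives[target_col] = 0
    return pd.concat([positives, negatives], ignore_index=True)

# Usage on the positives-only frame from the snippets in this thread:
# train = Dataset(add_uniform_negatives(train_df))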

rnyak (Contributor, Author) commented Oct 5, 2022

@oliverholworthy this looks like it's working in graph mode now.

oliverholworthy (Member) commented

I still get zero AUC with run_eagerly=False (the default) when this sampling layer (now called InBatchNegatives) is in the model and only positives are passed as examples.

import pyarrow
import tensorflow as tf
from merlin.models.tf.transforms.negative_sampling import InBatchNegatives
import merlin.models.tf as mm
from merlin.io import Dataset
from merlin.datasets.entertainment import get_movielens


train, valid = get_movielens(variant="ml-100k")
schema = train.schema
schema = schema.without(["rating"])

# keep only positives
train_df = train.to_ddf().compute()
train = Dataset(train_df[train_df["rating_binary"] == 1])
train.schema = schema

model = mm.Model(
    mm.InputBlockV2(schema, aggregation=None),
    InBatchNegatives(schema, 5, seed=42),
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("rating_binary"),
)

model.compile("adam", metrics=[tf.keras.metrics.AUC()])
model.fit(train, batch_size=2048, epochs=1)
# 25/25 - 1s 13ms/step - loss: 0.2768 - auc_2: 0.0000e+00 - regularization_loss: 0.0000e+00

rnyak closed this as completed on Oct 28, 2022