Updates Models to support new dataloader format for lists (__values and __offsets in dict) and scalar (1D) #999
Conversation
merlin/models/tf/inputs/base.py
Outdated
@@ -208,7 +200,13 @@ def InputBlock(
        name="continuous_projection",
    )

    return ParallelBlock(branches, aggregation=aggregation, post=post, is_input=True, **kwargs)
    _pre = PrepareFeatures(schema)
Does every model have PrepareFeatures as the first layer? If so, maybe we don't need this in the InputBlock?
I changed that. As there are many blocks that don't use InputBlockV2 and InputBlock, I moved the calling of PrepareFeatures to Model, Encoder, and to the blocks that can be used as pre for fit() / evaluate(), as well as the ones that can be used as a transform for Loader.map().
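For context, the `__values`/`__offsets` convention that PrepareFeatures consumes can be illustrated with a small framework-free sketch (plain Python lists stand in for tensors, and `split_list_feature` is a hypothetical helper, not part of Merlin Models):

```python
def split_list_feature(values, offsets):
    # offsets[i]:offsets[i + 1] delimits row i of the flattened values array.
    return [values[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]

batch = {
    "item_ids__values": [10, 11, 12, 13, 14],
    "item_ids__offsets": [0, 2, 5],  # two rows: [10, 11] and [12, 13, 14]
}
rows = split_list_feature(batch["item_ids__values"], batch["item_ids__offsets"])
# rows == [[10, 11], [12, 13, 14]]
```

In the library itself the same row-splitting is done with TensorFlow ops (e.g. building a tf.RaggedTensor from row splits) rather than Python loops.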
Re: issue with the loader returning SparseTensors. I think this is because in some cases we're now setting the value_count.max property (specifying the shape of ragged dimensions), which wasn't previously set. Until we remove the sparse output option in the dataloader (NVIDIA-Merlin/dataloader#103), one intermediate step could be to update …
@@ -26,8 +26,7 @@
    "name": "item_age_days_norm",
    "type": "FLOAT",
    "valueCount": {
        "min": "1",
        "max": "4"
Why do we need to remove the max value count from this schema and others?
The issue is that when min and max are provided and are different, the dataloader returns a tf.SparseTensor, but now MM can only take either tf.Tensor or our convention for ragged tensors (__values and __offsets). So I removed the max to enforce that it returns the ragged tensors.
I know you already have a draft PR that will make it return ragged tensors in such cases. As soon as it is merged, we can add the value_count.max back to the test dataset schemas if needed.
Does that make sense?
layer_inputs = {
    k: v
    for k, v in inputs.items()
    if k.replace("__values", "").replace("__offsets", "") in maybe_schema.column_names
}
What situation is this change required for? If we're handling the ragged inputs in the first layer, how are the values/offsets making their way through to a ParallelBlock?
You are right. In general we use PrepareFeatures in the Model right at the beginning, to ensure that lists are converted from our ragged tensor convention (__values and __offsets) to tf.RaggedTensor.
But for RetrievalModelV2 we cannot apply PrepareFeatures in Model, because in that case the Encoders (which are tf.keras.Model) would receive tf.RaggedTensor as inputs, and that would cause problems when the query encoder is saved to be served on Triton.
That is why in the RetrievalModelV2 constructor we set prep_features=False and let the Encoder class set prep_features=True. As the RetrievalModelV2 is a ParallelBlock of Encoders, that is where it receives the ragged convention (__values and __offsets), and this change ensures that the ragged representation is cascaded to the Encoder block.
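A minimal, self-contained sketch of the filtering this diff performs: keep a key if its base name, with the `__values`/`__offsets` suffix stripped, appears in the block's schema. Here `column_names` and `inputs` are hypothetical stand-ins for `maybe_schema.column_names` and a real batch dict:

```python
# Columns this block's schema knows about (hypothetical example).
column_names = ["item_ids", "price"]

inputs = {
    "item_ids__values": [1, 2, 3],
    "item_ids__offsets": [0, 1, 3],
    "price": [9.9, 5.0],
    "other__values": [7],      # not in the schema, filtered out
    "other__offsets": [0, 1],  # not in the schema, filtered out
}

# Strip the list-feature suffixes before checking schema membership,
# so a ragged feature's __values/__offsets pair travels together.
layer_inputs = {
    k: v
    for k, v in inputs.items()
    if k.replace("__values", "").replace("__offsets", "") in column_names
}
# layer_inputs keeps item_ids__values, item_ids__offsets, and price
```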
merlin/models/tf/core/encoder.py
Outdated
@@ -159,11 +166,14 @@ def batch_predict(

        return merlin.io.Dataset(predictions)

-    def call(self, inputs, training=False, testing=False, targets=None, **kwargs):
+    def call(self, inputs, targets=None, training=False, testing=False, **kwargs):
Is the re-ordering of the params a functional change? E.g., are there any places where these are being passed in as positional params instead of keyword? One option could be to enforce this by adding a * after inputs, so we can be sure the order of the keyword arguments doesn't matter:
def call(self, inputs, *, targets=None, training=False, testing=False, **kwargs):
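The effect of the bare `*` can be shown with a small sketch (a free function rather than the real method, which also takes `self`):

```python
def call(inputs, *, targets=None, training=False, testing=False, **kwargs):
    """Sketch of the suggested keyword-only signature."""
    return {"targets": targets, "training": training, "testing": testing}

# Keyword arguments may come in any order:
result = call({"x": 1}, testing=True, targets="labels")

# Passing targets positionally is now a TypeError, so a later re-ordering
# of the keyword parameters cannot silently change behavior:
try:
    call({"x": 1}, "labels")
    positional_ok = True
except TypeError:
    positional_ok = False
```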
Done
merlin/models/tf/models/base.py
Outdated
is_list=True,
is_ragged=is_ragged,
properties=properties,
dims=((0, None), (shape[1], shape[1])),
This can be simplified to dims=(None, shape[1]). Equivalent to what is already here.
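The equivalence rests on how each entry of dims is normalized: a plain int n is shorthand for the fixed bound (n, n), and None for the unbounded (0, None). A small illustrative sketch of that normalization rule (the real logic lives in merlin.schema's Shape/Dimension classes; this helper is hypothetical):

```python
def normalize_dim(d):
    """Expand the dims shorthand: None -> unbounded, int n -> fixed (n, n)."""
    if d is None:
        return (0, None)
    if isinstance(d, int):
        return (d, d)
    return tuple(d)

shape_1 = 4  # stand-in for shape[1]
expanded = tuple(normalize_dim(d) for d in (None, shape_1))
# expanded == ((0, None), (4, 4)), the same as ((0, None), (shape[1], shape[1]))
```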
Done
rerun tests
Fixes #819
It also resolves Capture shapes everywhere (#823) for the Merlin Models library.
Goals ⚽
This PR changes Merlin Models to support dataloader changes from PR #101:
- List features are now provided as two dict keys, {feature_name}__values and {feature_name}__offsets, instead of a tuple
- Scalar features are now provided as 1D tensors

It also updates code from using the ColumnSchema "value_count" properties to use the new ColumnSchema shape API.

Implementation Details 🚧
Merlin Models had code in many places that converted the tuple representation into RaggedTensor, SparseTensor, or Tensor.
This PR centralizes the conversion from the dataloader representation of list features and eliminates duplicate list-conversion code.
- Centralizes list-feature preparation in dedicated blocks (PrepareFeatures, PrepareListFeatures)
- Updates the list transform blocks (ListToDense, ListToRagged, ListToSparse)
- Uses the new ColumnSchema shape API
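The centralization idea can be sketched in plain Python (not the actual PrepareFeatures code; lists stand in for tensors and the helper name is hypothetical): walk the batch dict once, pair each `__values` key with its `__offsets`, and emit per-row lists, while scalar (1D) features pass through unchanged.

```python
def prepare_features(batch):
    out = {}
    for key, values in batch.items():
        if key.endswith("__offsets"):
            continue  # consumed together with its __values counterpart
        if key.endswith("__values"):
            name = key[: -len("__values")]
            offsets = batch[name + "__offsets"]
            # Rebuild the ragged rows from the flat values + row offsets.
            out[name] = [
                values[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)
            ]
        else:
            out[key] = values  # scalar (1D) features pass through
    return out

batch = {"ids__values": [1, 2, 3], "ids__offsets": [0, 1, 3], "age": [0.5, 0.2]}
prepared = prepare_features(batch)
# prepared == {"ids": [[1], [2, 3]], "age": [0.5, 0.2]}
```

In the library the per-row lists would instead be a tf.RaggedTensor, but the dict-walking structure is the same.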
Other
- Updates examples/usecases/transformers-next-item-prediction.ipynb to fix tagging and make it simpler