Updates Models to support new dataloader format for lists (__values and __offsets in dict) and scalar (1D) #999
Conversation
merlin/models/tf/inputs/base.py
Outdated
@@ -208,7 +200,13 @@ def InputBlock(
        name="continuous_projection",
    )

    return ParallelBlock(branches, aggregation=aggregation, post=post, is_input=True, **kwargs)
    _pre = PrepareFeatures(schema)
Does every model have PrepareFeatures as the first layer? If so, maybe we don't need this in the InputBlock?
I changed that. As there are many blocks that don't use InputBlockV2 and InputBlock, I moved the calling of PrepareFeatures to Model, Encoder, and to the blocks that can be used as pre for fit() / evaluate(), as well as the ones that can be used as a transform for Loader.map().
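For context, the `__values`/`__offsets` convention that PrepareFeatures consumes can be illustrated with a small framework-free sketch (plain Python lists stand in for tensors, and `split_list_feature` is a hypothetical helper, not part of Merlin Models):

```python
def split_list_feature(values, offsets):
    # offsets[i]:offsets[i + 1] delimits row i of the flattened values array.
    return [values[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]

batch = {
    "item_ids__values": [10, 11, 12, 13, 14],
    "item_ids__offsets": [0, 2, 5],  # two rows: [10, 11] and [12, 13, 14]
}
rows = split_list_feature(batch["item_ids__values"], batch["item_ids__offsets"])
# rows == [[10, 11], [12, 13, 14]]
```

In the library itself the same row-splitting is done with TensorFlow ops (e.g. building a tf.RaggedTensor from row splits) rather than Python loops.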
Re: issue with the loader returning SparseTensors. I think this is because in some cases we're now setting the value_count.max property (specifying the shape of ragged dimensions), which wasn't previously set. Until we remove the sparse output option in the dataloader (NVIDIA-Merlin/dataloader#103), one intermediate step could be to update …
@@ -26,8 +26,7 @@
    "name": "item_age_days_norm",
    "type": "FLOAT",
    "valueCount": {
        "min": "1",
        "max": "4"
Why do we need to remove the max value count from this schema and others?
The issue is that when min and max are provided and are different, the dataloader returns a tf.SparseTensor, but now MM can only take either tf.Tensor or our convention for ragged tensors (__values and __offsets). So I removed the max to enforce that it returns the ragged tensors.
I know you already have a draft PR that will make it return ragged tensors in such cases. As soon as it is merged, we can add the value_count.max back to the test dataset schemas if needed.
Does that make sense?
layer_inputs = {
    k: v
    for k, v in inputs.items()
    if k.replace("__values", "").replace("__offsets", "") in maybe_schema.column_names
}
What situation is this change required for? If we're handling the ragged inputs in the first layer, how are the values/offsets making their way through to a ParallelBlock?
You are right. In general we use PrepareFeatures in the Model right at the beginning, to ensure that lists are converted from our ragged tensor convention (__values and __offsets) to tf.RaggedTensor.
But for RetrievalModelV2 we cannot apply PrepareFeatures in Model, because in that case the Encoders (which are tf.keras.Model) would receive tf.RaggedTensor as inputs, and that would cause problems when the query encoder is saved to be served on Triton.
That is why in the RetrievalModelV2 constructor we set prep_features=False and let the Encoder class set prep_features=True. As the RetrievalModelV2 is a ParallelBlock of Encoders, that is where it receives the ragged convention (__values and __offsets), and this change ensures that the ragged representation is cascaded to the Encoder block.
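A minimal, self-contained sketch of the filtering this diff performs: keep a key if its base name, with the `__values`/`__offsets` suffix stripped, appears in the block's schema. Here `column_names` and `inputs` are hypothetical stand-ins for `maybe_schema.column_names` and a real batch dict:

```python
# Columns this block's schema knows about (hypothetical example).
column_names = ["item_ids", "price"]

inputs = {
    "item_ids__values": [1, 2, 3],
    "item_ids__offsets": [0, 1, 3],
    "price": [9.9, 5.0],
    "other__values": [7],      # not in the schema, filtered out
    "other__offsets": [0, 1],  # not in the schema, filtered out
}

# Strip the list-feature suffixes before checking schema membership,
# so a ragged feature's __values/__offsets pair travels together.
layer_inputs = {
    k: v
    for k, v in inputs.items()
    if k.replace("__values", "").replace("__offsets", "") in column_names
}
# layer_inputs keeps item_ids__values, item_ids__offsets, and price
```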
merlin/models/tf/core/encoder.py
Outdated
@@ -159,11 +166,14 @@ def batch_predict(

        return merlin.io.Dataset(predictions)

-    def call(self, inputs, training=False, testing=False, targets=None, **kwargs):
+    def call(self, inputs, targets=None, training=False, testing=False, **kwargs):
Is the re-ordering of the params a functional change? E.g., are there any places where these are being passed in as positional params instead of keyword? One option could be to enforce this by adding a * after inputs, so we can be sure the order of the keyword arguments doesn't matter:
def call(self, inputs, *, targets=None, training=False, testing=False, **kwargs):
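The effect of the bare `*` can be shown with a small sketch (a free function rather than the real method, which also takes `self`):

```python
def call(inputs, *, targets=None, training=False, testing=False, **kwargs):
    """Sketch of the suggested keyword-only signature."""
    return {"targets": targets, "training": training, "testing": testing}

# Keyword arguments may come in any order:
result = call({"x": 1}, testing=True, targets="labels")

# Passing targets positionally is now a TypeError, so a later re-ordering
# of the keyword parameters cannot silently change behavior:
try:
    call({"x": 1}, "labels")
    positional_ok = True
except TypeError:
    positional_ok = False
```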
Done
merlin/models/tf/models/base.py
Outdated
is_list=True,
is_ragged=is_ragged,
properties=properties,
dims=((0, None), (shape[1], shape[1])),
This can be simplified to dims=(None, shape[1]). Equivalent to what is already here.
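The equivalence rests on how each entry of dims is normalized: a plain int n is shorthand for the fixed bound (n, n), and None for the unbounded (0, None). A small illustrative sketch of that normalization rule (the real logic lives in merlin.schema's Shape/Dimension classes; this helper is hypothetical):

```python
def normalize_dim(d):
    """Expand the dims shorthand: None -> unbounded, int n -> fixed (n, n)."""
    if d is None:
        return (0, None)
    if isinstance(d, int):
        return (d, d)
    return tuple(d)

shape_1 = 4  # stand-in for shape[1]
expanded = tuple(normalize_dim(d) for d in (None, shape_1))
# expanded == ((0, None), (4, 4)), the same as ((0, None), (shape[1], shape[1]))
```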
Done
rerun tests
Fixes #819
It also resolves Capture shapes everywhere (#823) for the Merlin Models library.
Goals ⚽
This PR changes Merlin Models to support dataloader changes from PR #101:
- List features are now provided as two dict keys, {feature_name}__values and {feature_name}__offsets, instead of a tuple
- Scalar features are now provided as 1D tensors

It also updates code from using the ColumnSchema "value_count" properties to use the new ColumnSchema shape API.

Implementation Details 🚧
Merlin Models had code in many places that converted the tuple representation into RaggedTensor, SparseTensor, or Tensor.
This PR centralizes the conversion from the dataloader representation of list features and eliminates duplicate list-conversion code.
- Centralizes list-feature preparation in dedicated blocks (PrepareFeatures, PrepareListFeatures)
- Updates the list transform blocks (ListToDense, ListToRagged, ListToSparse)
- Uses the new ColumnSchema shape API
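The centralization idea can be sketched in plain Python (not the actual PrepareFeatures code; lists stand in for tensors and the helper name is hypothetical): walk the batch dict once, pair each `__values` key with its `__offsets`, and emit per-row lists, while scalar (1D) features pass through unchanged.

```python
def prepare_features(batch):
    out = {}
    for key, values in batch.items():
        if key.endswith("__offsets"):
            continue  # consumed together with its __values counterpart
        if key.endswith("__values"):
            name = key[: -len("__values")]
            offsets = batch[name + "__offsets"]
            # Rebuild the ragged rows from the flat values + row offsets.
            out[name] = [
                values[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)
            ]
        else:
            out[key] = values  # scalar (1D) features pass through
    return out

batch = {"ids__values": [1, 2, 3], "ids__offsets": [0, 1, 3], "age": [0.5, 0.2]}
prepared = prepare_features(batch)
# prepared == {"ids": [[1], [2, 3]], "age": [0.5, 0.2]}
```

In the library the per-row lists would instead be a tf.RaggedTensor, but the dict-walking structure is the same.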
Other
- Updates examples/usecases/transformers-next-item-prediction.ipynb to fix tagging and make it simpler