[WIP] add docstrings in mms classes and functions #1101

Merged — 2 commits, May 22, 2023
87 changes: 45 additions & 42 deletions merlin/models/tf/blocks/dlrm.py
@@ -34,52 +34,55 @@ def DLRMBlock(
*,
embedding_dim: int = None,
embedding_options: EmbeddingOptions = None,
embeddings: Optional[Block] = None,
Contributor:
Let's not move the order of arguments if possible - it's a breaking change in the unlikely case where people provide all of the arguments in order as opposed to using named arguments.

Member:
The * on line 34 forces the remaining arguments to be keyword arguments, which makes them robust to reordering here.

Contributor (author) @rnyak, May 22, 2023:
@nv-alaiacano thanks for your comment. Makes sense. I talked to Oliver, and I think we are safe changing the position of the arg thanks to the * there: it forces everything after it to be a keyword arg rather than a positional arg.
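The keyword-only behavior discussed above can be sketched outside the codebase (a minimal illustration; this is not the real `DLRMBlock` signature):

```python
# Minimal illustration (not the actual DLRMBlock signature): the bare `*`
# makes every parameter after it keyword-only, so reordering those
# parameters in the definition cannot break positional callers.
def dlrm_like(schema, *, embedding_dim=None, embeddings=None, bottom_block=None):
    return embedding_dim

dlrm_like("schema", embedding_dim=64)   # OK: bound by keyword
try:
    dlrm_like("schema", 64)             # rejected: 64 cannot bind positionally
except TypeError as exc:
    print("TypeError:", exc)
```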

bottom_block: Optional[Block] = None,
top_block: Optional[Block] = None,
embeddings: Optional[Block] = None,
) -> SequentialBlock:
"""Builds the DLRM architecture, as proposed in the following
`paper <https://arxiv.org/pdf/1906.00091.pdf>`_ [1]_.

References
----------
.. [1] Naumov, Maxim, et al. "Deep learning recommendation model for
personalization and recommendation systems." arXiv preprint arXiv:1906.00091 (2019).

Parameters
----------
schema : Schema
The `Schema` with the input features
bottom_block : Block
The `Block` that combines the continuous features (typically a `MLPBlock`)
top_block : Optional[Block], optional
The optional `Block` that combines the outputs of bottom layer and of
the factorization machine layer, by default None
embedding_dim : Optional[int], optional
Dimension of the embeddings, by default None
embedding_options : EmbeddingOptions
Options for the input embeddings.
- embedding_dim_default: int - Default dimension of the embedding
table, when the feature is not found in ``embedding_dims``, by default 64
- infer_embedding_sizes : bool, Automatically defines the embedding
dimension from the feature cardinality in the schema, by default False,
which needs to be kept False for the DLRM architecture.

Returns
-------
SequentialBlock
The DLRM block

Raises
------
ValueError
The schema is required by DLRM
ValueError
The bottom_block is required by DLRM
ValueError
The embedding_dim (X) needs to match the last layer of bottom MLP (Y).
ValueError
Only one-of `embeddings` or `embedding_options` can be used.
`paper <https://arxiv.org/pdf/1906.00091.pdf>`_ [1]_.

References
----------
.. [1] Naumov, Maxim, et al. "Deep learning recommendation model for
personalization and recommendation systems." arXiv preprint arXiv:1906.00091 (2019).

Parameters
----------
schema : Schema
The `Schema` with the input features
embedding_dim : Optional[int], optional
Dimension of the embeddings, by default None
embedding_options : EmbeddingOptions
Options for the input embeddings.
- embedding_dim_default: int - Default dimension of the embedding
table, when the feature is not found in ``embedding_dims``, by default 64
- infer_embedding_sizes : bool, Automatically defines the embedding
dimension from the feature cardinality in the schema, by default False,
which needs to be kept False for the DLRM architecture.
embeddings : Optional[Block]
If provided, creates a ParallelBlock with an EmbeddingTable for each
categorical feature in the schema, by default None.
bottom_block : Block
The `Block` that combines the continuous features (typically a `MLPBlock`)
top_block : Optional[Block], optional
The optional `Block` that combines the outputs of the bottom layer and of
the factorization machine layer, by default None

Returns
-------
SequentialBlock
The DLRM block

Raises
------
ValueError
The schema is required by DLRM
ValueError
The bottom_block is required by DLRM
ValueError
The embedding_dim (X) needs to match the last layer of bottom MLP (Y).
ValueError
Only one-of `embeddings` or `embedding_options` can be used.
"""
if schema is None:
raise ValueError("The schema is required by DLRM")
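The `embedding_dim` / bottom-MLP constraint from the Raises section can be sketched as a standalone check (a hypothetical helper for illustration, not code from this PR):

```python
# Hypothetical sketch of the validation the Raises section describes:
# DLRM's dot-product interaction needs the bottom MLP's last layer to
# produce vectors of the same size as the embeddings.
def check_dlrm_dims(embedding_dim, bottom_mlp_dims):
    last = bottom_mlp_dims[-1]
    if embedding_dim != last:
        raise ValueError(
            f"The embedding_dim ({embedding_dim}) needs to match "
            f"the last layer of bottom MLP ({last})."
        )

check_dlrm_dims(64, [128, 64])   # OK: last bottom-MLP layer == embedding_dim
```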
12 changes: 12 additions & 0 deletions merlin/models/tf/blocks/interaction.py
@@ -236,6 +236,18 @@ def call(self, inputs: tf.Tensor, **kwargs) -> tf.Tensor:
return 0.5 * tf.subtract(summed_square, squared_sum)

def compute_output_shape(self, input_shapes):
"""Computes the output shape based on the input shapes

Parameters
----------
input_shapes : tf.TensorShape
The input shapes

Returns
-------
tf.TensorShape
The output shape
"""
if len(input_shapes) != 3:
raise ValueError("Found shape {} without 3 dimensions".format(input_shapes))
return (input_shapes[0], input_shapes[2])
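The shape documented here follows from the interaction itself. A standalone NumPy sketch of the same `0.5 * (summed_square - squared_sum)` trick (for illustration only, not the block's actual implementation):

```python
import numpy as np

# Standalone sketch of the factorization-machine second-order trick used
# in call(): 0.5 * ((sum of vectors)^2 - sum of squared vectors) equals
# the sum of elementwise products x_i * x_j over all pairs i < j.
def pairwise_interaction(inputs):
    # inputs: (batch, num_features, dim)
    summed_square = np.square(inputs.sum(axis=1))   # (batch, dim)
    squared_sum = np.square(inputs).sum(axis=1)     # (batch, dim)
    return 0.5 * (summed_square - squared_sum)      # (batch, dim)

x = pairwise_interaction(np.ones((2, 3, 4)))
print(x.shape)  # (2, 4) -- matching compute_output_shape above
```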
27 changes: 27 additions & 0 deletions merlin/models/tf/prediction_tasks/classification.py
@@ -99,12 +99,39 @@ def __init__(
)

def call(self, inputs, training=False, **kwargs):
"""Projects the inputs to a single logit with the output layer and applies the output activation

Parameters
----------
inputs : tf.Tensor
Input tensor
training : bool, optional
Flag that indicates whether it is training or not, by default False

Returns
-------
tf.Tensor
Tensor with the classification probabilities
"""
return self.output_activation(self.output_layer(inputs))

def compute_output_shape(self, input_shape):
"""Computes the output shape based on the input shape

Parameters
----------
input_shape : tf.TensorShape
The input shape

Returns
-------
tf.TensorShape
The output shape
"""
return self.output_layer.compute_output_shape(input_shape)

def get_config(self):
"""Return a Python dict containing the configuration of the model."""
config = super().get_config()
config = maybe_serialize_keras_objects(
self,
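The `call` docstring above can be illustrated with a minimal NumPy sketch, assuming a Dense(1) output layer and a sigmoid activation (hypothetical standalone names, not the task's actual internals):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical standalone version of the task's call(): the output layer
# projects each input row to one logit, and the activation turns the logit
# into a probability; compute_output_shape therefore reports (batch, 1).
def binary_head(inputs, w, b):
    logits = inputs @ w + b        # "output_layer": Dense(1) -> (batch, 1)
    return sigmoid(logits)         # "output_activation"

rng = np.random.default_rng(0)
probs = binary_head(rng.normal(size=(8, 16)), rng.normal(size=(16, 1)), 0.0)
print(probs.shape)  # (8, 1)
```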
13 changes: 13 additions & 0 deletions merlin/models/tf/prediction_tasks/regression.py
@@ -105,9 +105,22 @@ def call(self, inputs: tf.Tensor, training=False, **kwargs) -> tf.Tensor:
return self.output_activation(self.output_layer(inputs))

def compute_output_shape(self, input_shape):
"""Computes the output shape based on the input shape

Parameters
----------
input_shape : tf.TensorShape
The input shape

Returns
-------
tf.TensorShape
The output shape
"""
return self.output_layer.compute_output_shape(input_shape)

def get_config(self):
"""Return a Python dict containing the configuration of the model."""
config = super().get_config()
config = maybe_serialize_keras_objects(
self, config, {"output_layer": tf.keras.layers.serialize}
15 changes: 11 additions & 4 deletions merlin/models/tf/prediction_tasks/retrieval.py
@@ -39,10 +39,6 @@ class ItemRetrievalTask(MultiClassClassificationTask):
The schema object including features to use and their properties.
samplers: List[ItemSampler]
List of samplers for negative sampling, by default `[InBatchSampler()]`
post_logits: Optional[PredictionBlock]
Optional extra pre-call block for post-processing the logits, by default None.
You can for example use `post_logits = mm.PopularitySamplingBlock(item_fequency)`
for populariy sampling correction.
target_name: Optional[str]
If specified, name of the target tensor to retrieve from dataloader.
Defaults to None.
@@ -52,9 +48,17 @@
task_block: Block
The `Block` that applies additional layers op to inputs.
Defaults to None.
post_logits: Optional[PredictionBlock]
Optional extra pre-call block for post-processing the logits, by default None.
You can, for example, use `post_logits = mm.PopularitySamplingBlock(item_frequency)`
for popularity sampling correction.
logits_temperature: float
Temperature T used to reduce model overconfidence: the logits are divided by T.
Defaults to 1.
cache_query: bool
Add query embeddings to the context block, by default False
store_negative_ids: bool
Returns negative items ids as part of the output, by default False
Returns
-------
PredictionTask
@@ -112,6 +116,7 @@ def _build_prediction_call(
store_negative_ids: bool = False,
**kwargs,
):
"""Returns a SequentialBlock of ItemRetrievalScorer() and LogitsTemperatureScaler()"""
if samplers is None or len(samplers) == 0:
samplers = (InBatchSampler(),)

@@ -134,6 +139,7 @@
@property
@property
def retrieval_scorer(self):
def find_retrieval_scorer_block(block):
"""Returns the ItemRetrievalScorer layer"""
if isinstance(block, ItemRetrievalScorer):
return block

@@ -156,6 +162,7 @@ def set_retrieval_cache_query(self, value: bool):
self.retrieval_scorer.cache_query = value

def get_config(self):
"""Return a Python dict containing the configuration of the model."""
config = super(ItemRetrievalTask, self).get_config()
del config["pre"]
if self.samplers:
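The `logits_temperature` parameter documented in this file can be illustrated standalone — a sketch of temperature scaling in general, not the `LogitsTemperatureScaler` source:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Sketch of temperature scaling: dividing the logits by T > 1 flattens the
# softmax distribution, so the model's top prediction is less confident.
def scale_logits(logits, temperature=1.0):
    return logits / temperature

logits = np.array([2.0, 0.0, -1.0])
p1 = softmax(logits)                    # T = 1: sharper distribution
p2 = softmax(scale_logits(logits, 2.0)) # T = 2: softer distribution
print(p1.max() > p2.max())  # True
```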