
[BUG] Model schemas and serving signatures don't align with each other #924

Closed · karlhigley opened this issue Dec 15, 2022 · 6 comments
Labels: bug (Something isn't working), status/needs-triage

@karlhigley (Contributor)

Bug description

The saved input and output schema files for Tensorflow models contain information that doesn't match the saved model's serving signature. The serving signature also seems to record dimensions inconsistently.

Steps/Code to reproduce bug

import pytest
import tensorflow as tf

import merlin.models.tf as mm
from merlin.datasets.synthetic import generate_data
from merlin.io import Dataset
from merlin.schema import Tags
from merlin.schema.io.tensorflow_metadata import TensorflowMetadata

def test_model_schemas_match_serving_signature(tmpdir):

    train = generate_data("sequence-testing", num_rows=100)

    seq_schema = train.schema.select_by_tag(Tags.SEQUENCE).select_by_tag(Tags.CATEGORICAL)

    target = train.schema.select_by_tag(Tags.ITEM_ID).column_names[0]
    predict_last = mm.SequencePredictLast(schema=seq_schema, target=target)

    input_schema = seq_schema
    output_schema = seq_schema.select_by_name(target)

    train = Dataset(train.to_ddf(columns=input_schema.column_names).compute())
    loader = mm.Loader(train, batch_size=16, shuffle=False)

    d_model = 48
    query_encoder = mm.Encoder(
        mm.InputBlockV2(
            input_schema,
            embeddings=mm.Embeddings(
                input_schema.select_by_tag(Tags.CATEGORICAL), sequence_combiner=None
            ),
        ),
        mm.MLPBlock([d_model]),
        mm.GPT2Block(d_model=d_model, n_head=2, n_layer=2),
        tf.keras.layers.Lambda(lambda x: tf.reduce_mean(x, axis=1)),
    )

    model = mm.RetrievalModelV2(
        query=query_encoder,
        output=mm.ContrastiveOutput(output_schema, negative_samplers="in-batch"),
    )

    model.compile(metrics={})
    model.fit(loader, epochs=1, pre=predict_last)

    query_encoder.save(
        f"{tmpdir}/query_encoder"
    )

    input_schema = TensorflowMetadata.from_json_file(f"{tmpdir}/query_encoder/.merlin/input_schema.json").to_merlin_schema()
    output_schema = TensorflowMetadata.from_json_file(f"{tmpdir}/query_encoder/.merlin/output_schema.json").to_merlin_schema()

    breakpoint()

Then compare the input and output schema (and especially the value counts) to the serving signature:

(Pdb) input_schema
[
    {'name': 'item_id_seq', 'tags': {<Tags.ITEM: 'item'>, <Tags.CATEGORICAL: 'categorical'>, <Tags.ITEM_ID: 'item_id'>, <Tags.ID: 'id'>, <Tags.SEQUENCE: 'sequence'>}, 'properties': {'domain': {'min': 1, 'max': 51996, 'name': 'item_id_seq'}, 'value_count': {'min': 4}}, 'dtype': dtype('int64'), 'is_list': True, 'is_ragged': True},
    {'name': 'categories', 'tags': {<Tags.SEQUENCE: 'sequence'>, <Tags.ITEM: 'item'>, <Tags.LIST: 'list'>, <Tags.CATEGORICAL: 'categorical'>}, 'properties': {'domain': {'min': 1, 'max': 331, 'name': 'categories'}, 'value_count': {'min': 4}}, 'dtype': dtype('int64'), 'is_list': True, 'is_ragged': True}
]

(Pdb) output_schema
[
    {'name': 'output_1', 'tags': set(), 'properties': {}, 'dtype': dtype('float32'), 'is_list': True, 'is_ragged': False}
]
root@b3178cb48bcd:/# saved_model_cli show --tag_set serve --signature_def serving_default --dir /tmp/pytest-of-root/pytest-19/test_model_schemas_match_serv0/query_encoder/

The given SavedModel SignatureDef contains the following input(s):
  inputs['categories_1'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 1)
      name: serving_default_categories_1:0
  inputs['categories_2'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 1)
      name: serving_default_categories_2:0
  inputs['item_id_seq_1'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 1)
      name: serving_default_item_id_seq_1:0
  inputs['item_id_seq_2'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 1)
      name: serving_default_item_id_seq_2:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 48)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

Two (potential?) issues:

  • The input value counts indicate that at least four values should be supplied, but the serving signature shape (-1, 1) is only variable along the batch dimension and seems to expect one value per example instead of four.
  • The output shape (-1, 48) is correctly tracked in the serving signature, but the output schema doesn't contain corresponding value counts. We know it's a list from the is_list flag and that it has a fixed shape from the is_ragged flag, but the schema doesn't tell us what that fixed shape is.

Expected behavior

  • Inputs with variable lengths have shape (-1,-1) in the serving signature
  • Inputs with fixed lengths have shape (-1,fixed_length) in the serving signature
  • Output schemas for fixed length outputs have value counts indicating the shape ({min=48, max=48})
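The expected mapping from schema attributes to serving-signature shapes can be sketched in plain Python. This is a hypothetical helper for illustration only, not Merlin's actual API; the function name and the dict-based value_count are made up:

```python
# Hypothetical sketch: derive the serving-signature shape a column should
# advertise, from Merlin-style schema attributes (is_list, is_ragged,
# value_count). Names and structure are illustrative, not Merlin's API.

def expected_signature_shape(is_list, is_ragged, value_count=None):
    """Return the (batch, length) shape a serving signature should advertise."""
    if not is_list:
        return (-1,)  # scalar feature: only the batch dimension is variable
    if is_ragged:
        return (-1, -1)  # variable-length list: both dimensions are unknown
    # Fixed-length list: the length should be recoverable from the value counts.
    if value_count and value_count.get("min") == value_count.get("max"):
        return (-1, value_count["min"])
    raise ValueError("fixed-length list without matching value counts")

# A ragged input like item_id_seq should map to (-1, -1):
print(expected_signature_shape(is_list=True, is_ragged=True, value_count={"min": 4}))
# A fixed-length output like output_1 needs {min: 48, max: 48} in its schema:
print(expected_signature_shape(is_list=True, is_ragged=False,
                               value_count={"min": 48, "max": 48}))
```

Under this mapping, the missing output value counts are exactly what prevents reconstructing the (-1, 48) shape from the schema alone.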

Environment details

  • Merlin version: main branch of all Merlin repos
  • Platform: Docker (nvcr.io/nvstaging/merlin/merlin-ci-runner:latest)
  • Python version: 3.8.10
  • Tensorflow version (GPU?): tensorflow-gpu 2.9.2

Additional context

We may not yet have a good way to transfer information about fixed shapes and variable shapes in the Merlin schemas/file format. If that's the crux of the issue, then it may be time to design our own Protobuf schema that matches the way we think about e.g. shapes instead of trying to force it. If trying to write a bug fix turns into a significant headache, that's probably a sign that there's a deeper issue for us to resolve, and it's totally fine at that point to close this bug as "won't fix" and wait for changes in Core to make it easier.

@oliverholworthy (Member)

Opened a PR to handle the output schema case (adding value counts for fixed-length outputs) here: #926

For the ragged list inputs (categories and item_id_seq), I think the value_count.min = 4 is not strictly a limitation of the model. It's a side effect of feature attributes from the dataset schema flowing through to the model input schema. The dataset happens to have a minimum sequence length of 4, so the model was only given sequences of at least 4 during training. That doesn't necessarily mean the model can't produce a response for shorter sequences, though I suppose it could in theory be constructed that way.

So maybe dropping the value_count.min from the input schema would resolve this issue?

@oliverholworthy (Member)

Or does this issue also include the question of how we should represent the information about how a ragged feature like categories gets split up (e.g. into categories_1, a 2-d array of scalars holding the values, and categories_2, a 2-d array of scalars holding the row lengths, which together represent the ragged sequence of values)?
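The values-plus-row-lengths split in question can be illustrated with a minimal pure-Python sketch (no Merlin/TF dependency; the function names are made up for illustration):

```python
# Minimal sketch of the ragged-feature split discussed above: a variable-length
# feature like `categories` becomes two flat arrays, one holding the
# concatenated values and one holding the per-row lengths.
# Function names are illustrative, not Merlin's API.

def split_ragged(rows):
    """Decompose a list of variable-length rows into (values, row_lengths)."""
    values = [v for row in rows for v in row]
    lengths = [len(row) for row in rows]
    return values, lengths

def join_ragged(values, lengths):
    """Reassemble the original rows from (values, row_lengths)."""
    rows, start = [], 0
    for n in lengths:
        rows.append(values[start:start + n])
        start += n
    return rows

rows = [[2, 4, 5], [23, 4], [1, 34, 54, 7]]
values, lengths = split_ragged(rows)
print(values)   # [2, 4, 5, 23, 4, 1, 34, 54, 7]
print(lengths)  # [3, 2, 4]
assert join_ragged(values, lengths) == rows
```

The two flat arrays correspond to the `_1` (values) and `_2` (lengths) inputs seen in the serving signature above.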

@karlhigley (Contributor, Author)

If the models are completely flexible about sequence length, I suppose the minimum would be 0? Honestly, the way we represent shapes is a bit confusing to me. This is one of the things I'd like to straighten out by proposing a standard to be adopted (or revised until it can be).

@rnyak (Contributor)

rnyak commented Jan 18, 2023

@sararb and @marcromeyn: what does the TF model expect for an input list of length 4? Does it expect [[1], [2], [3], [4]] or [1, 2, 3, 4]?

@sararb (Contributor)

sararb commented Jan 25, 2023

For general context: in Merlin frameworks, the same list feature can be represented in three different formats:

1. TF dense tensor: all rows have the same fixed length:

"item_id_seq": <tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[ 2,  4,  5,  0],
       [23,  4,  0,  0],
       [ 1, 34, 54,  7]], dtype=int32)>

2. TF RaggedTensor: a TF structure for variable-length lists:

"item_id_seq": <tf.RaggedTensor [[2, 4, 5], [23, 4], [1, 34, 54, 7]]> || shape=(3, None)

3. Tuple of values and lengths: the variable-length list feature is represented by two tensors, one of values and one of lengths:

"item_id_seq": (
    <tf.Tensor: shape=(9, 1), dtype=int32, numpy=array([ 2,  4,  5, 23,  4,  1, 34, 54,  7])>,
    <tf.Tensor: shape=(3, 1), dtype=int32, numpy=array([3, 2, 4])>,
)

==> Merlin Models supports two input formats: case 1 and case 3.

In case 1, padding is applied in the data loader and dense tensors are passed to the model. The SavedModel signature is as follows:
The given SavedModel SignatureDef contains the following input(s):
  inputs['categories'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 4)
      name: serving_default_categories:0
  inputs['item_id_seq'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 4)
      name: serving_default_item_id_seq:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 48)
      name: StatefulPartitionedCall:0
In case 3 (related to the code shared in this bug ticket), the data loader passes the list inputs as a tuple of two tensors. The SavedModel signature is as follows:
The given SavedModel SignatureDef contains the following input(s):
  inputs['categories_1'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 1)
      name: serving_default_categories_1:0
  inputs['categories_2'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 1)
      name: serving_default_categories_2:0
  inputs['item_id_seq_1'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 1)
      name: serving_default_item_id_seq_1:0
  inputs['item_id_seq_2'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 1)
      name: serving_default_item_id_seq_2:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 48)
      name: StatefulPartitionedCall:0

You can see that in case 3, each variable-length list feature is represented with two inputs, item_id_seq_1 and item_id_seq_2, to account for the tensors of values and lengths. So the expected signature for variable lengths is not one variable with shape (-1, -1) but rather two variables, each with shape (-1, 1).

  • Representing variable-length list inputs with one tensor corresponds to case 2 (tf.RaggedTensor), but the error we hit suggests that TF does not generate a SavedModel signature when the model's inputs are tf.RaggedTensor(s).
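The dense-tensor format in case 1 can be reproduced with a small padding sketch. This is pure Python with an illustrative function name; Merlin's data loader does the equivalent internally with TF tensors:

```python
# Sketch of the right-padding that produces the case 1 dense format:
# variable-length rows are padded with zeros to a common length, yielding
# the fixed (-1, max_len) serving-signature shape. Illustrative name only.

def pad_rows(rows, max_len=None, pad_value=0):
    """Right-pad variable-length rows to a common length."""
    if max_len is None:
        max_len = max(len(row) for row in rows)
    return [row + [pad_value] * (max_len - len(row)) for row in rows]

rows = [[2, 4, 5], [23, 4], [1, 34, 54, 7]]
print(pad_rows(rows))
# [[2, 4, 5, 0], [23, 4, 0, 0], [1, 34, 54, 7]]
```

Padding trades the flexibility of case 3 for a signature with a concrete second dimension, which is why case 1 yields shape (-1, 4) above.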
