[FEA] Save input and output schema when .save methods are called on models #669

Open
oliverholworthy opened this issue Aug 22, 2022 · 3 comments

@oliverholworthy
Member

oliverholworthy commented Aug 22, 2022

Relates to: NVIDIA-Merlin/Merlin#545

🚀 Feature request

Provide a consistent .save interface for all models.

This .save method should save the model artifact along with the schema to a directory provided.

Motivation

Model artifacts alone are not enough to unambiguously infer the correct input schema for a model.
Saving the input schema along with models will enable serving code in systems to figure out the correct inputs required for models from an artifact without requiring the user to provide the schema at serving time.

Part of: NVIDIA-Merlin/Merlin#489

Proposed interface

Create a runtime-checkable protocol that specifies the common methods expected on a model object.
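
A minimal sketch of what such a protocol could look like, using only the standard library's `typing.Protocol`. The names `MerlinModel`, `ToyModel`, and `NotAModel` are hypothetical and chosen for illustration; the issue does not specify the protocol's name or its full method set.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MerlinModel(Protocol):
    """Hypothetical protocol: any object exposing a ``save(path)`` method."""

    def save(self, path: str) -> None:
        """Write model artifacts and schema metadata to ``path``."""
        ...


class ToyModel:
    """Stand-in model that satisfies the protocol structurally."""

    def save(self, path: str) -> None:
        pass


class NotAModel:
    """Object without a ``save`` method; it does not satisfy the protocol."""


# isinstance works because the protocol is decorated with @runtime_checkable.
toy_is_model = isinstance(ToyModel(), MerlinModel)
other_is_model = isinstance(NotAModel(), MerlinModel)
```

Note that a `runtime_checkable` protocol only verifies that the named methods exist; it does not check their signatures or return types, so serving code would still need to handle a `save` that behaves differently than expected.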

This .save method on a model will write out the model artifact(s) along with the input schema to a directory provided.

Files saved in merlin metadata directory.

Saved model directory structure

  • model artifacts to be saved in top-level of directory
  • merlin metadata to be saved in merlin_metadata subdirectory.

e.g. calling model.save("my_merlin_model") on a TensorFlow backend should result in the following directory structure:


my_merlin_model
├── merlin_metadata
│   ├── input_schema.json
│   ├── output_schema.json
│   └── model.json
├── assets
├── keras_metadata.pb
├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00001
    └── variables.index
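
The metadata part of this layout could be written roughly as follows. This is a sketch under stated assumptions, not the Merlin implementation: the helper names `save_merlin_metadata` and `load_input_schema` are hypothetical, the schemas are represented as plain dictionaries rather than Merlin schema objects, and the contents of `model.json` are invented for the example. Only the directory and file names come from the structure above.

```python
import json
import tempfile
from pathlib import Path

METADATA_DIR = "merlin_metadata"  # subdirectory name from the proposed layout


def save_merlin_metadata(model_dir, input_schema, output_schema, model_info):
    """Write schema and model metadata as JSON next to the model artifacts.

    Hypothetical helper: schemas are plain dicts here (an assumption).
    """
    metadata_path = Path(model_dir) / METADATA_DIR
    metadata_path.mkdir(parents=True, exist_ok=True)
    files = {
        "input_schema.json": input_schema,
        "output_schema.json": output_schema,
        "model.json": model_info,
    }
    for name, content in files.items():
        (metadata_path / name).write_text(json.dumps(content, indent=2))
    return metadata_path


def load_input_schema(model_dir):
    """Read the input schema back, as serving code might do."""
    schema_file = Path(model_dir) / METADATA_DIR / "input_schema.json"
    return json.loads(schema_file.read_text())


# Usage: write the metadata into a temporary directory and read it back.
example_input_schema = {"columns": [{"name": "user_id", "dtype": "int64"}]}
example_output_schema = {"columns": [{"name": "score", "dtype": "float32"}]}
example_model_info = {"framework": "tensorflow"}  # illustrative content only

with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "my_merlin_model"
    save_merlin_metadata(
        model_dir, example_input_schema, example_output_schema, example_model_info
    )
    saved_files = sorted(p.name for p in (model_dir / METADATA_DIR).iterdir())
    restored_schema = load_input_schema(model_dir)
```

In a real `.save` implementation the framework-specific artifacts (for TensorFlow, `saved_model.pb`, `variables`, and so on) would be written to the top level of the same directory by the backend's own save routine.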

Sub-Tasks

@EvenOldridge
Member

@marcromeyn @oliverholworthy what's the status of this? It is blocking the creation of end-to-end example notebooks that use Merlin Systems.

@oliverholworthy oliverholworthy changed the title [FEA] Save input schema when .save methods are called on models [FEA] Save input and output schema when .save methods are called on models Oct 5, 2022
@rnyak
Contributor

rnyak commented Oct 19, 2022

partially addressed by #680

@oliverholworthy
Member Author

oliverholworthy commented Oct 19, 2022

This PR (#680) handles the first two tasks.

  • Adding the protocol
  • Updating the save method of TensorFlow models

There is a bit more to think about beyond this: adding an output schema for TensorFlow models, and figuring out how to enforce that a schema is created. The Merlin Models API is flexible enough at the moment that we can't always guarantee that a schema is available, at least not in a way that provides any more information than the saved model signature does (we could infer the schema from the saved model, as we currently do in Merlin Systems).
