Introduce ServableModuleValidator Callback #13614

Merged: 41 commits merged into master from investigate_serving on Jul 15, 2022

Conversation

@tchaton (Contributor) commented Jul 12, 2022

What does this PR do?

Fixes #13480

Context: improve model mobility from training to serving. MLOps engineers would argue that a production model shouldn't be trained, and resources wasted, if the model can't be served to provide value to customers. They would argue that the model should be unit tested as early as possible to validate its conformity with its production usage. This PR investigates adding serving functionality to PyTorch Lightning.

As a user, you would add the ServableModuleValidator callback to your Trainer and make your model subclass ServableModule.

The ServableModule requires three hooks to be implemented in order to fully describe how the model behaves when served.

  • configure_payload: Returns an example payload object. Lightning can provide example payloads for images, text, videos, etc.
  • configure_serialization: Provides serialization / deserialization methods. Lightning can provide these for common data types.
  • serve_step: The logic to run when a request is received.
from typing import Dict
import torch
from pytorch_lightning import Trainer
from pytorch_lightning.serve import ServableModuleValidator, ServableModule
from pytorch_lightning.demos.boring_classes import BoringModel

class ServableBoringModel(BoringModel, ServableModule):
    def configure_payload(self) -> ...:
        return {"body": {"x": list(range(32))}}

    def configure_serialization(self):
        class Tensor:
            @staticmethod
            def deserialize(x):
                return torch.tensor(x).float()

            @staticmethod
            def serialize(x):
                return x.numpy().tolist()

        return {"x": Tensor.deserialize}, {"output": Tensor.serialize}

    def serve_step(self, x: torch.Tensor) -> Dict[str, torch.Tensor]:
        return {"output": self.forward(x)}

trainer = Trainer(max_epochs=1, limit_train_batches=2, limit_val_batches=0, callbacks=[ServableModuleValidator()])
trainer.fit(ServableBoringModel())
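
For intuition, the check performed is roughly equivalent to the sketch below, which chains the three hooks in-process. This is a simplification of my own, not the PR's implementation: the actual callback starts a server and sends the payload as a real request.

def validate_servable(module: ServableModule) -> Dict[str, list]:
    # An example request body, as the server would receive it.
    payload = module.configure_payload()
    deserializers, serializers = module.configure_serialization()

    # Deserialize each payload field into the tensors serve_step expects.
    inputs = {name: fn(payload["body"][name]) for name, fn in deserializers.items()}

    # Run the serving logic without autograd, then serialize the outputs
    # back into JSON-friendly types.
    with torch.no_grad():
        outputs = module.serve_step(**inputs)
    return {name: serializers[name](value) for name, value in outputs.items()}

print(validate_servable(ServableBoringModel()))

A failure anywhere in this round trip surfaces before any training resources are spent, which is exactly the motivation above.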

Ideally, the API would rely only on type annotations to apply deserialization and serialization, and tensor shapes could be encoded in the types:

from pydantic import BaseModel
from typing import TypedDict

class Tensor(BaseModel):
    @staticmethod
    def deserialize(x):
        return torch.tensor(x).float()

    @staticmethod
    def serialize(x):
        return x.numpy().tolist()

class OutputTensor(TypedDict):
    output: Tensor[2]

class ServableBoringModel(BoringModel, ServableModule):
    def configure_payload(self) -> ...:
        return {"body": {"x": list(range(32))}}

    def serve_step(self, x: Tensor[32]) -> OutputTensor:
        return {"output": self.forward(x)}

callback = ServableModuleValidator(
    optimization="trace|script|onnx|tensor_rt|...",
    server="fastapi|torch_serve|ml_server|sagemaker|triton",
)
trainer = Trainer(max_epochs=1, limit_train_batches=2, limit_val_batches=0, callbacks=[callback])
trainer.fit(ServableBoringModel())
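
For intuition on how the annotation-driven variant could work under the hood, here is a small self-contained sketch (my own, not part of this PR): the deserializers are resolved from the parameter annotations of serve_step, making configure_serialization unnecessary. TensorCodec is a hypothetical stand-in for the shape-carrying Tensor[32] above.

import typing

import torch

class TensorCodec:
    # Hypothetical codec class, used purely as a type annotation.
    @staticmethod
    def deserialize(x):
        return torch.tensor(x).float()

    @staticmethod
    def serialize(x):
        return x.numpy().tolist()

def serve_step(x: TensorCodec) -> dict:
    ...

def resolve_deserializers(fn):
    # Read the parameter annotations instead of asking the user for a hook.
    hints = typing.get_type_hints(fn)
    hints.pop("return", None)
    return {name: hint.deserialize for name, hint in hints.items()}

deserializers = resolve_deserializers(serve_step)
assert deserializers["x"]([1, 2, 3]).dtype == torch.float32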

Extra functionalities for sanity serving

  • Performance testing: latency and RPS (see the sketch after this list).
  • Apply the serving check at the start and end of quantization and pruning.
  • etc.
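
As a rough illustration of the performance-testing bullet, latency and RPS against a served model could be measured with something like the sketch below. The URL and route are assumptions for illustration, not an API this PR exposes.

import time

import requests

URL = "http://127.0.0.1:8080/predict"  # hypothetical endpoint
payload = {"body": {"x": list(range(32))}}

n_requests = 100
latencies = []
start = time.perf_counter()
for _ in range(n_requests):
    t0 = time.perf_counter()
    requests.post(URL, json=payload).raise_for_status()
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

print(f"mean latency: {1000 * sum(latencies) / n_requests:.1f} ms")
print(f"RPS: {n_requests / elapsed:.1f}")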

This would be even more impactful once https://github.com/pytorch/torchdynamo is available in PyTorch, enabling models and their transforms to be served in pure Python with further optimization.

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

cc @Borda @tchaton @justusschock @awaelchli @carmocca @ananthsub @ninginthecloud @jjenniferdai @rohitgr7 @akihironitta

@tchaton changed the title from "Investigate Serving Callack" to "Investigate Sanity Serving Callback" on Jul 12, 2022
Review threads (now resolved) on:
  • src/pytorch_lightning/callbacks/sanity_serving.py
  • tests/tests_pytorch/callbacks/test_sanity_serving.py
@tchaton tchaton marked this pull request as ready for review July 12, 2022 11:51
@tchaton tchaton requested a review from carmocca July 12, 2022 12:00
@zippeurfou (Contributor) commented:

I am not sure I am a fan of SanityServe being a callback.
Here are a few reasons:

  1. Serving and model training are two different things (separation of concerns). There are multiple scenarios where they are not as tightly coupled as this (e.g. serving as model composition, or serving when you depend on a third party).
  2. Serving looks at different performance metrics (e.g. RPS, uptime, ...), and you do care about these benchmarks. I am not sure tying it to the training step helps (not a blocker now, but how do I load test here? Or how would we help users do load testing in the future?), which brings me to the next point.
  3. Hardware, auto-scaling, etc. are important there. I am not sure we allow a good separation by putting it in a callback.

I am bringing this up because I think it's important that SanityServe provides value, otherwise no one will use it. However, to provide value we should think about what it can provide that you would not get from the Serve module.

@tchaton changed the title from "Investigate Sanity Serving Callback" to "Investigate ServableModuleValidator Callback" on Jul 12, 2022
@zippeurfou (Contributor) commented:

Spoke with @tchaton async and we synced on it.

tchaton and others added 7 commits July 12, 2022 14:22
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
@adriangonz left a comment:

This looks great @tchaton!

Following up from our brief discussion on Slack, I've added a couple of comments below. It would be great to hear your thoughts!

@williamFalcon (Contributor) commented Jul 14, 2022

The premise here is that every model needs to be served, which is not true.

For example, a model that folds proteins does not need “serving”: it runs once in a while to generate a new protein sequence for a lab to synthesize.

So, no, we can’t “force” every model to be “production ready”.

However, if a team opts to force their particular research code to always be production ready, they should have a mechanism to enforce that behavior and can opt in.

Review threads (now resolved) on:
  • src/pytorch_lightning/serve/servable_module.py
  • src/pytorch_lightning/serve/servable_module_validator.py
@justusschock (Member) commented Jul 14, 2022

@williamFalcon that is why this was designed as a combination of an optional mixin and a callback, instead of being added directly to the LightningModule and the Trainer. So this is opt-in as currently implemented :)

Meaning that all the hooks for class ServableBoringModel(BoringModel, ServableModule) come from ServableModule, which is optional to use, and the validation of these hooks only happens when adding the ServableModuleValidator as a callback :)
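
A minimal sketch of that opt-in mechanism (my own simplification, not the PR's exact code): the callback engages only when the module actually mixes in ServableModule, so plain LightningModules are unaffected unless both pieces are opted into.

from pytorch_lightning import Callback
from pytorch_lightning.serve import ServableModule

class ServableModuleValidatorSketch(Callback):
    # Simplified: the real callback also serves the model and sends the
    # configured payload through the (de)serialization round trip.
    def on_train_start(self, trainer, pl_module):
        if not isinstance(pl_module, ServableModule):
            raise TypeError(
                "This callback requires the LightningModule to also "
                "subclass ServableModule."
            )
        # ... validate configure_payload / configure_serialization / serve_step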

@Borda Borda requested review from lantiga and justusschock July 14, 2022 22:47
@tchaton tchaton self-assigned this Jul 15, 2022
@awaelchli (Contributor) left a comment:

unblock

@carmocca carmocca added this to the app:0.6 milestone Jul 15, 2022
@tchaton tchaton modified the milestones: app:0.6, pl:1.7 Jul 15, 2022
@lexierule lexierule merged commit 5e26840 into master Jul 15, 2022
@lexierule lexierule deleted the investigate_serving branch July 15, 2022 15:07
Labels
  • design: Includes a design discussion
  • feature: Is an improvement or enhancement
  • lightningmodule: pl.LightningModule
  • ready: PRs ready to be merged
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Add Inference support to PyTorch Lightning