
Investigate what work is required on the systems side for serving TF session-based models. What kind of SLA do we need for practical performance? Under what batch size/data? #202

Closed
viswa-nvidia opened this issue Sep 8, 2022 · 3 comments

@viswa-nvidia

No description provided.

@viswa-nvidia (Author)

@karlhigley, for completing the example in 22.10, do we need to complete this investigation? Are we already working on this?

@karlhigley (Contributor)

There are no models to investigate this with yet AFAIK, so I'd be very surprised if we completed an end-to-end example in 22.10.

@karlhigley (Contributor)

I also have no way to independently determine SLAs other than pulling a number out of the air, so the title of this issue doesn't make much sense 😅
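A minimal latency measurement could at least ground the SLA discussion in observed numbers rather than a guess. The sketch below times inference on a TF SavedModel at a few batch sizes; the model path, input shape, and batch sizes are placeholder assumptions for illustration, not anything defined in this issue or the Merlin codebase.

```python
# Hypothetical latency benchmark for a TF session-based model.
# MODEL_DIR, SEQ_LEN, and BATCH_SIZES are placeholders, not real project values.
import time

import numpy as np
import tensorflow as tf

MODEL_DIR = "saved_model_dir"  # placeholder path to a TF SavedModel
SEQ_LEN = 20                   # assumed session (sequence) length
BATCH_SIZES = [1, 32, 256]
WARMUP, ITERS = 10, 100

model = tf.keras.models.load_model(MODEL_DIR)

for batch_size in BATCH_SIZES:
    # Illustrative integer item-ID input for a session-based model.
    batch = np.random.randint(1, 10_000, size=(batch_size, SEQ_LEN))

    # Warm up to exclude graph tracing / allocation from the timing.
    for _ in range(WARMUP):
        model.predict_on_batch(batch)

    start = time.perf_counter()
    for _ in range(ITERS):
        model.predict_on_batch(batch)
    latency_ms = (time.perf_counter() - start) / ITERS * 1000

    print(f"batch_size={batch_size}: {latency_ms:.2f} ms/batch")
```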

@viswa-nvidia added this to the Merlin 22.12 milestone on Oct 25, 2022