Recently we've realized that an easy place for new contributors to improve torchserve is to either:
1. Add a new example
2. Add a new feature to the base handler

Option 1 is easiest, but it means that users who want to benefit from an example need to go through its source code and adapt it to their own use case.
Option 2 is a bit less easy but still OK, because the benefits go to anyone using torchserve. Unfortunately, it's slowly making the base handler unmaintainable, since it now includes code for model optimization, profiling, and model interpretability: https://github.com/pytorch/serve/blob/master/ts/torch_handler/base_handler.py
This problem will continue getting worse as we add runtime exports, profiling techniques, and other useful workflows for model serving, all of which will be gated behind slow error-handling code that encourages users to pip install missing dependencies.

So how can we continue making improvements to the base handler while keeping it simple and modular?
Option 1: Inheritance
Instead of adding features to the base handler, we can create a new handler such as `class ExtendedHandler(BaseHandler):`.

The benefit is that the code remains modular; the con is that to use, say, both profiling and a runtime, users would need to resort to multiple inheritance, which can be hard to debug.
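To make the multiple-inheritance concern concrete, here's a minimal sketch. The classes below are hypothetical stand-ins (TorchServe's real `BaseHandler` lives in `ts.torch_handler.base_handler`); the point is that combining two feature handlers forces users into MRO territory.

```python
# Hypothetical stand-in for ts.torch_handler.base_handler.BaseHandler
class BaseHandler:
    def initialize(self, context):
        self.model = object()  # stand-in for the loaded model


class ProfilingHandler(BaseHandler):
    def initialize(self, context):
        super().initialize(context)
        self.profiling_enabled = True


class IPEXHandler(BaseHandler):
    def initialize(self, context):
        super().initialize(context)
        self.runtime = "ipex"


# To get both features, users end up writing multiple inheritance,
# which only works if every handler cooperates via super() -- and the
# resulting method resolution order can be hard to reason about.
class ProfiledIPEXHandler(ProfilingHandler, IPEXHandler):
    pass


h = ProfiledIPEXHandler()
h.initialize(context=None)
print(h.profiling_enabled, h.runtime)  # -> True ipex
```

If either feature handler forgets to call `super().initialize()`, the other feature silently disappears, which is exactly the kind of debugging pain described above.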
Option 2: Generic interfaces
Instead of having a line that looks like this in our code
self.model = ipex.optimize(self.model)
We can add a generic `optimize()` to the base handler which specializes to a particular implementation depending on what's in `config.properties`.

The benefit is that this is very modular, but it requires more work to create a future-proof interface, and it needs users to change Java code to support their use case.
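One possible shape for such an interface is a registry keyed by a config value. This is a sketch under assumptions: the `optimizer` key, the registry, and the config-as-dict representation are all hypothetical, not TorchServe's actual `config.properties` handling.

```python
# Hypothetical registry of optimization backends. In a real version the
# values would call e.g. ipex.optimize(model) or a TensorRT export.
OPTIMIZERS = {
    "ipex": lambda model: f"ipex-optimized({model})",
    "tensorrt": lambda model: f"trt-optimized({model})",
}


def optimize(model, config):
    """Generic hook: specialize based on a config.properties-style dict."""
    backend = config.get("optimizer")
    if backend is None:
        return model  # nothing configured: leave the model untouched
    if backend not in OPTIMIZERS:
        raise ValueError(f"unknown optimizer backend: {backend}")
    return OPTIMIZERS[backend](model)


print(optimize("resnet18", {"optimizer": "ipex"}))  # -> ipex-optimized(resnet18)
print(optimize("resnet18", {}))                     # -> resnet18
```

The upside is that the base handler only ever calls `optimize(self.model, config)`; the downside, as noted, is committing to a registry interface that has to anticipate future backends.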
Option 3: Dynamic runtime loads
Instead of having code in the base handler we can load it at runtime
The benefit is that this is very modular and doesn't require any changes to base handler code, but given that torchserve is used via a CLI tool and not by just running a Python file, it's tricky to figure out where this change needs to live.
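For illustration, a runtime load usually means `importlib` plus a friendly error when the dependency is missing. This is a sketch: the helper name and the error message are assumptions, and the stdlib `math` module stands in for a real backend like `intel_extension_for_pytorch`.

```python
import importlib


def load_runtime(name):
    """Import an optional backend only when the user asks for it, so the
    base handler carries no backend-specific imports at module load time."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise RuntimeError(
            f"runtime '{name}' requested but not installed; "
            f"try: pip install {name}"
        ) from exc


# stdlib stand-in for a heavyweight optional dependency
rt = load_runtime("math")
print(rt.sqrt(4.0))  # -> 2.0
```

The open question from above remains: since torchserve is launched through a CLI, it's unclear which layer should own the call to such a loader.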
Option 4: Utility functions
Another simple approach is to move helpful utility functions to a different file called `handler_utils.py`.

A good candidate is moving a function like `def _infer_with_profiler(self, data):` (https://github.com/pytorch/serve/blob/master/ts/torch_handler/base_handler.py#L229).

That said, this approach isn't perfect, since even if modularized, profiling would still need a footprint in the base handler like `if is_profiler_enabled:` (https://github.com/pytorch/serve/blob/master/ts/torch_handler/base_handler.py#L213).
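A minimal sketch of what that split could look like. The function and flag names mirror the linked code but are simplified stand-ins, not the actual TorchServe implementation (the real profiler is torch.profiler, not a wall-clock timer).

```python
import time

# --- would live in handler_utils.py ---------------------------------------
def infer_with_profiler(infer_fn, data):
    """Run inference while collecting a simple wall-clock profile."""
    start = time.perf_counter()
    result = infer_fn(data)
    elapsed = time.perf_counter() - start
    return result, elapsed


# --- the footprint that stays in the base handler --------------------------
is_profiler_enabled = True


def handle(data):
    infer = lambda d: [x * 2 for x in d]  # stand-in for the model forward
    if is_profiler_enabled:               # this branch cannot be moved out
        result, _elapsed = infer_with_profiler(infer, data)
        return result
    return infer(data)


print(handle([1, 2, 3]))  # -> [2, 4, 6]
```

The utility function moves cleanly, but the `if is_profiler_enabled:` branch stays behind, which is the residual footprint described above.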
Option 5: Python decorators
Not a silver bullet, but Python decorators could make the code more maintainable:

@optimize
@profile
@metrics

For example, @metrics would be a decorator that keeps track of a function's start and end time. This works well for @metrics and maybe for @profile, but @optimize would require passing the right argument, namely the model, which is not a parameter of `inference` but a property of the handler class. Maybe there's a larger discussion here in that handlers need to hold less state.

Relatedly, we could use a Python `contextmanager` to allocate a runtime, so users can write something like `with ipex` (or tensorrt, etc.) and not have to worry about changes to the base handler.

Option 6: ?
There may be other options, but I think this is an important problem to figure out in order to make it simpler for new contributors to add their changes.
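As a concrete sketch of Option 5's two ideas, here is a timing decorator plus a context manager that allocates a "runtime". All names here (`metrics`, `runtime`, `last_elapsed`) are hypothetical, not part of TorchServe; real backend setup would replace the dictionary stand-in.

```python
import time
from contextlib import contextmanager


def metrics(fn):
    """Decorator that records a function's start/end wall-clock time."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    return wrapper


@metrics
def inference(data):
    return [x + 1 for x in data]  # stand-in for the model forward


@contextmanager
def runtime(name):
    # In a real version, backend setup (e.g. ipex or tensorrt) would go here.
    state = {"backend": name}
    try:
        yield state
    finally:
        state.clear()  # teardown happens without touching the base handler


print(inference([1, 2]))      # -> [2, 3]
with runtime("ipex") as rt:
    print(rt["backend"])      # -> ipex
```

Note that `@metrics` works because it only needs the decorated function's own arguments; `@optimize` would still need access to `self.model`, which illustrates the state problem raised above.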
cc: @HamidShojanazeri @chauhang @lxning @maaquib @nskool @min-jean-cho