forked from apache/tvm
Commit
[Runtime] Allow for parameter sharing in GraphRuntime (apache#3384)
Summary: In multi-threaded applications that run multiple inferences on the same model in parallel (for example, a TTS system handling multiple requests), it can be useful to share the model's parameters among those instances. This improves cache utilization: multiple cores can use the same set of weights instead of evicting each other's identical copies from a shared cache.

Because the underlying `NDArray` instances in `data_entry_` are reference-counted, this is a simple modification of the `GraphRuntime::LoadParams` logic to copy parameters from an existing GraphRuntime instance instead.

This is a little ugly in that we need both the pre-existing GraphRuntime instance and the 'serialized' params (since we need to know the set of names we should copy). Avoiding this would require imposing additional assumptions (i.e. storing the set of param names in GraphRuntime, and enforcing that shared param names are identical to the parameters set in the preceding `LoadParams` call), so it seems unavoidable.

Test Plan: Unit test added.
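The sharing idea described in the summary can be illustrated with a minimal, self-contained sketch. This is not the code from this commit and does not use TVM: `ToyRuntime`, `Tensor`, and the `ShareParams` signature below are hypothetical stand-ins, with `std::shared_ptr` playing the role of the reference-counted `NDArray`. The list of names is passed explicitly here, which mirrors why the real change still needs the serialized params: the receiving instance has to know which entries to copy.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a reference-counted tensor handle
// (assumption: shared_ptr approximates NDArray's ref-counted sharing).
using Tensor = std::shared_ptr<std::vector<float>>;

// Hypothetical toy runtime that maps parameter names to tensor handles.
class ToyRuntime {
 public:
  // Loads parameters by allocating a private copy of each tensor.
  void LoadParams(
      const std::unordered_map<std::string, std::vector<float>>& params) {
    for (const auto& kv : params) {
      entries_[kv.first] = std::make_shared<std::vector<float>>(kv.second);
    }
  }

  // Shares parameters with `other`: for each requested name, copy the
  // handle only, so both runtimes point at the same underlying buffer.
  void ShareParams(const ToyRuntime& other,
                   const std::vector<std::string>& names) {
    for (const auto& name : names) {
      auto it = other.entries_.find(name);
      if (it == other.entries_.end()) {
        throw std::out_of_range("parameter not found in source runtime: " +
                                name);
      }
      entries_[name] = it->second;  // bumps the reference count, no data copy
    }
  }

  // Returns the handle stored under `name` (throws if it is missing).
  const Tensor& Get(const std::string& name) const {
    return entries_.at(name);
  }

 private:
  std::unordered_map<std::string, Tensor> entries_;
};
```

In this sketch, a second `ToyRuntime` that calls `ShareParams(first, {"w", "b"})` ends up holding the same buffers as `first` rather than its own copies, so per-thread instances add no extra weight memory.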
Showing 4 changed files with 99 additions and 4 deletions.