[Runtime] Allow for parameter sharing in GraphRuntime #3384
Conversation
Summary: In multi-threaded applications where we run multiple inferences on the same model in parallel (consider e.g. a TTS system handling multiple requests), it can be useful to share the parameters of a model amongst these instances. This improves the cache utilization behaviour of the system, as multiple cores can use the same set of weights instead of evicting identical copies of the weights from a shared cache. As the underlying `NDArray` instances in `data_entry_` implement a ref-count-based sharing system, this is a simple modification of the `GraphRuntime::LoadParams` logic to instead copy parameters from an existing GraphRuntime instance. This is a little ugly in that we need both the pre-existing GraphRuntime instance and the 'serialized' params (since we need to know the set of names we should copy), but without imposing additional assumptions (i.e. storing the set of param names in GraphRuntime, and enforcing that shared param names are identical to the parameters set in the preceding `LoadParams` call), this seems unavoidable.

Test Plan: Unit test added.
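For context, the sharing logic under review looks roughly like the following. This is a simplified sketch reconstructed from the review excerpts below; the method name `ShareParams`, the `dmlc::Stream` parameter, and the skip-on-missing-input handling are assumptions, not the exact merged code.

// Sketch: make this runtime's parameter entries point at the NDArrays
// already owned by 'other'. The serialized params stream is read only to
// recover the set of parameter names to copy.
void GraphRuntime::ShareParams(const GraphRuntime& other, dmlc::Stream* strm) {
  std::vector<std::string> names;
  CHECK(strm->Read(&names)) << "Invalid parameters file format";
  for (size_t i = 0; i < names.size(); ++i) {
    int in_idx = GetInputIndex(names[i]);
    if (in_idx < 0) continue;  // name is not an input of this graph
    uint32_t eid = this->entry_id(input_nodes_[in_idx], 0);
    CHECK_LT(eid, data_entry_.size());
    // Assigning the NDArray handle only bumps a reference count; both
    // runtimes now read the same underlying weight buffer.
    data_entry_[eid] = other.GetInput(GetInputIndex(names[i]));
  }
}

Because only handles are copied, the weight tensors exist once in memory regardless of how many runtime instances share them.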
LGTM.

@ajtulloch Can you take a look at the CI failure?
std::vector<std::string> names;
CHECK(strm->Read(&names)) << "Invalid parameters file format";
uint64_t sz;
strm->Read(&sz);
This read may also need a CHECK, just like lines 189 and 193 above; see the sketch below.
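A sketch of the suggested guard, mirroring the error message already used for the names read (the message text is illustrative):

uint64_t sz;
CHECK(strm->Read(&sz)) << "Invalid parameters file format";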
CHECK_LT(eid, data_entry_.size());
CHECK_EQ(data_entry_[eid].use_count(), 1);
data_entry_[eid] = other.GetInput(GetInputIndex(names[i]));
CHECK_GT(data_entry_[eid].use_count(), 1);
For consistency, the checks on lines 205, 206, and 208 may need a message, just like the other checks have; see the sketch below.
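A sketch of those checks with messages attached (the message wording is illustrative, not from the patch):

CHECK_LT(eid, data_entry_.size()) << "invalid entry id " << eid;
CHECK_EQ(data_entry_[eid].use_count(), 1) << "parameter " << names[i] << " is already shared";
CHECK_GT(data_entry_[eid].use_count(), 1) << "failed to share parameter " << names[i];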
data_entry_[eid].use_count() should increase by exactly 1 after taking one more reference, so a check that only verifies the count is > 1 seems logically incomplete. How about some logic like the following?
int prev_count = other.GetInput(in_idx).use_count();
...
CHECK_EQ(data_entry_[eid].use_count(), prev_count + 1);
uint32_t eid = this->entry_id(input_nodes_[in_idx], 0);
CHECK_LT(eid, data_entry_.size());
CHECK_EQ(data_entry_[eid].use_count(), 1);
data_entry_[eid] = other.GetInput(GetInputIndex(names[i]));
To reuse the variable, this should be data_entry_[eid] = other.GetInput(in_idx).
One more question: is other.GetInputIndex(names[i]) always equal to this->GetInputIndex(names[i])? I would guess the logic should be other.GetInput(other.GetInputIndex(names[i])); if the two are always the same, we can reuse in_idx, as sketched below.
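A sketch of the reuse the reviewer has in mind, assuming in_idx was computed from names[i] just above and that both runtimes index their inputs identically:

int in_idx = GetInputIndex(names[i]);
uint32_t eid = this->entry_id(input_nodes_[in_idx], 0);
// Reusing in_idx avoids the second name lookup; this is only correct if
// 'other' maps names[i] to the same input index as 'this'.
data_entry_[eid] = other.GetInput(in_idx);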
@ajtulloch Can you address @huajsj's comment?
Thanks @ajtulloch @huajsj, this PR is now merged.