-
Notifications
You must be signed in to change notification settings - Fork 281
Conversation
@@ -79,14 +87,16 @@ class GraphAllocator { | |||
} | |||
|
|||
// constructor | |||
explicit GraphAllocator(const IndexedGraph* idx) : idx_(idx) { | |||
this->Init(dmlc::GetEnv("NNVM_EXEC_MATCH_RANGE", 16), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove NNVM_EXEC_MATCH_RANGE? what strategy are you using instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to sweep between [1, NNVM_EXEC_MATCH_RANGE], and select the best value
src/pass/plan_memory.cc
Outdated
@@ -98,6 +108,21 @@ class GraphAllocator { | |||
num_match_color_ = pass::ColorNodeGroup( | |||
*idx_, importance, num_match_color_, &node_color_); | |||
} | |||
// Init the free memory pool | |||
if (shared_pool != nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider another strategy where you don't insert shared pool to free memory, instead you run planing separately, sort both shared_pool and new pool, then match from small to big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could try this
tests/python/test_graph.py
Outdated
@@ -130,6 +129,12 @@ def test_plan_memory(): | |||
y = sym.add(y, y) | |||
g = graph.create(y) | |||
g._set_json_attr("shape_attr_key", "shape") | |||
reused_bytes = 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's not clear. This is a test case where in the shared pool, there're 16 bytes on device 0, another 1 byte on device 0, and 32 bytes on device 9. Without the shared pool it takes 64 bytes. With this shared pool, total allocated mem should be 64 + 32 = 96 bytes for this graph. Should I test in finer granularity??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this is ok
src/pass/plan_memory.cc
Outdated
shape_vec[eid_out].Size() == shape_vec[eid_in].Size() && | ||
dtype_vec[eid_out] == dtype_vec[eid_in]) { | ||
// inplace optimization | ||
storage[eid_out] = storage[eid_in]; | ||
(*storage)[eid_out] = (*storage)[eid_in]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider add a reference in to *storage in beginning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually started with reference. But python lint gives a complaint that i either pass reference variable which i don't modify, or pass a pointer to a variable which mutate. I don't remember if it's enforced in dmlc/mxnet or dmlc/nnvm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/pass/plan_memory.cc:144: Is this a non-const reference? If so, make const or use a pointer: StorageVector& storage [runtime/references] [2]
This is the error message when I pass as reference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can pass by pointer but make reference inside the function.
src/pass/plan_memory.cc
Outdated
} | ||
} | ||
std::multimap<size_t, uint32_t>::reverse_iterator rit; | ||
for (rit = eids.rbegin(); rit != eids.rend(); ++rit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto rit = eids.rbegin()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
* Imbalance version of shared pool during plan memory * Bug fix for no shared_pool case * Auto search and updated shared mem pool * Cleanup unused code * Cleanup logging code * Add unit test for shared storage * Remove shared pool in PlanMemory. Fix lint warnings * Fix lint warnings * Use reference instead of ptrs
* Imbalance version of shared pool during plan memory * Bug fix for no shared_pool case * Auto search and updated shared mem pool * Cleanup unused code * Cleanup logging code * Add unit test for shared storage * Remove shared pool in PlanMemory. Fix lint warnings * Fix lint warnings * Use reference instead of ptrs
apache/mxnet#5035