-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[uTVM][Runtime] Introduce Virtual Memory Allocator to CRT #5124
Conversation
cc @tmoreau89 @ajtulloch @ZihengJiang @weberlo please help to take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Liangfu for this PR! I've added some comments
src/runtime/crt/graph_runtime.c
Outdated
if (reader->NextArrayItem(reader)) { | ||
reader->ReadInteger(reader, &(attr->shape[shape_count][5])); ndim++; | ||
reader->ReadInteger(reader, attr_shape_ptr + 5); ndim++; | ||
reader->NextArrayItem(reader); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't this code block be written as a loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! I successfully converted this into a for loop.
src/runtime/crt/crt_runtime_api.c
Outdated
@@ -79,7 +79,7 @@ int TVMModGetFunction(TVMModuleHandle mod, | |||
if (!strcmp(func_name, "load_params")) { | |||
*out = &TVMGraphRuntime_LoadParams; | |||
} else { | |||
status -1; | |||
status = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we enable linting on these sources to avoid such typoes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we do have the CRT sources checked by the cpplint, but it failed to catch the case. As the CRT is growing, I think we might need cppcheck to ensure the compliance to MISRA-C rules, since it contains a very nice checker misra.py .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I caught this with -Wall flag in gcc.
apps/bundle_deploy/Makefile
Outdated
@@ -37,6 +37,9 @@ demo: $(build_dir)/demo $(build_dir)/bundle.so $(build_dir)/bundle_c.so $(build_ | |||
TVM_NUM_THREADS=1 $(build_dir)/demo $(build_dir)/bundle.so $(build_dir)/cat.bin | |||
TVM_NUM_THREADS=1 $(build_dir)/demo $(build_dir)/bundle_c.so $(build_dir)/cat.bin | |||
|
|||
test_crt: $(build_dir)/test $(build_dir)/test_bundle_c.so $(build_dir)/test_data.bin $(build_dir)/test_output.bin | |||
TVM_NUM_THREADS=1 $(build_dir)/test $(build_dir)/test_bundle_c.so $(build_dir)/test_data.bin $(build_dir)/test_output.bin $(build_dir)/test_graph.json $(build_dir)/test_params.bin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to see some unit tests for the virtual memory allocator; and add those to CI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test coverage has been extended in cpptest, see tests/cpp/crt_memory_test.cc
I wonder if VMem allocator is a bit overkill for cases like micro controller. If we have a constant memory usage and do not do frequent alloc and free, perhaps we will be good by use an arena style allocator(which just allocs but do not free) and frees everything once we are done with the task. It would be nice to do a quick benchmark of the vmemory allocator vs alternatives(in terms of binary size and efficiency) |
I think we do need to reuse workspace memory between convolution layers, and reusing the space requires recycling the space with TVMBackendFreeWorkspace. |
The workspace memory could have a different strategy. The way it works is that we create a different arena for workspace, along with a counter.
This will work because all workspace memory are temporal. It also guarantees a constant time allocation |
I agree we definitely need an arena style allocator, but can we introduce that in a separate PR? @tqchen |
OK, however, note that havinbg a arena based allocator will likely mean remove the vmalloc based versions(for simplicity reasons). Will let @tmoreau89 to manage this PR :) |
That make sense, since we can have strategies for both runtime and workspace memories. |
@tmoreau89 please followup @liangfu please rebase against the master |
ping @liangfu |
Rebased. Sorry for the late update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one last minor comment, I think we can then merge it and move on to the arena based allocator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Liangfu for these changes! One more question I have before approving the changes. Are the C sources currently being linted?
Yes, all sources (including C and C++ source code) under |
Change-Id: I0f79f909a04d1c677aabb80f202f0612c5ce7f2a
Change-Id: I37104c09e28112b1974fa2b064c809d0a8d686c3
Change-Id: I35800470bcbfcf96652494f359711cb4c2d34398
Change-Id: I72071289843fff4031c0df8796868a0b9fbc57ee
Change-Id: I32dba85ac1660c77f51c2d0d8ab6436ed0c01c74
Change-Id: I939fa43027a5c7529c5c7c6bd8d6e6beb91b7581
Change-Id: Ie4ff6b64fdfe6810836cf8fd44dace82a20c4581
LGTM from my side, @liangfu let us also make sure we add the crt test to the CI. possibly by running some of the quick tests in the apps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @liangfu the changes LGTM
* initial crt_memory and memory leak fix in graph_runtime Change-Id: I0f79f909a04d1c677aabb80f202f0612c5ce7f2a * fix memory leak Change-Id: I37104c09e28112b1974fa2b064c809d0a8d686c3 * clean up Change-Id: I039b12015a1d56c8f4120867cd5a5292da34f3e3 * implement vrealloc Change-Id: I35800470bcbfcf96652494f359711cb4c2d34398 * allocate from stack memory for most of the variables Change-Id: I72071289843fff4031c0df8796868a0b9fbc57ee * allocate from stack memory for all of the variables Change-Id: I32dba85ac1660c77f51c2d0d8ab6436ed0c01c74 * lint Change-Id: If12cd240685d7791fc60bc0cfb66389cdc186b73 * lint Change-Id: I7c9d90c11b60b8edda2427ebd189ebe535af2100 * facilitate the growth of TVM_CRT_MAX_NDIM Change-Id: I939fa43027a5c7529c5c7c6bd8d6e6beb91b7581 * extend test coverage of vmalloc Change-Id: Ie4ff6b64fdfe6810836cf8fd44dace82a20c4581 * lint Change-Id: Ibf3c06619ef296df5c49f3945cb6428777781d69 * move logging.h to src * fix an error in macOS * remove logging.h * use cflags for gcc * fix compilation error
* initial crt_memory and memory leak fix in graph_runtime Change-Id: I0f79f909a04d1c677aabb80f202f0612c5ce7f2a * fix memory leak Change-Id: I37104c09e28112b1974fa2b064c809d0a8d686c3 * clean up Change-Id: I039b12015a1d56c8f4120867cd5a5292da34f3e3 * implement vrealloc Change-Id: I35800470bcbfcf96652494f359711cb4c2d34398 * allocate from stack memory for most of the variables Change-Id: I72071289843fff4031c0df8796868a0b9fbc57ee * allocate from stack memory for all of the variables Change-Id: I32dba85ac1660c77f51c2d0d8ab6436ed0c01c74 * lint Change-Id: If12cd240685d7791fc60bc0cfb66389cdc186b73 * lint Change-Id: I7c9d90c11b60b8edda2427ebd189ebe535af2100 * facilitate the growth of TVM_CRT_MAX_NDIM Change-Id: I939fa43027a5c7529c5c7c6bd8d6e6beb91b7581 * extend test coverage of vmalloc Change-Id: Ie4ff6b64fdfe6810836cf8fd44dace82a20c4581 * lint Change-Id: Ibf3c06619ef296df5c49f3945cb6428777781d69 * move logging.h to src * fix an error in macOS * remove logging.h * use cflags for gcc * fix compilation error
* initial crt_memory and memory leak fix in graph_runtime Change-Id: I0f79f909a04d1c677aabb80f202f0612c5ce7f2a * fix memory leak Change-Id: I37104c09e28112b1974fa2b064c809d0a8d686c3 * clean up Change-Id: I039b12015a1d56c8f4120867cd5a5292da34f3e3 * implement vrealloc Change-Id: I35800470bcbfcf96652494f359711cb4c2d34398 * allocate from stack memory for most of the variables Change-Id: I72071289843fff4031c0df8796868a0b9fbc57ee * allocate from stack memory for all of the variables Change-Id: I32dba85ac1660c77f51c2d0d8ab6436ed0c01c74 * lint Change-Id: If12cd240685d7791fc60bc0cfb66389cdc186b73 * lint Change-Id: I7c9d90c11b60b8edda2427ebd189ebe535af2100 * facilitate the growth of TVM_CRT_MAX_NDIM Change-Id: I939fa43027a5c7529c5c7c6bd8d6e6beb91b7581 * extend test coverage of vmalloc Change-Id: Ie4ff6b64fdfe6810836cf8fd44dace82a20c4581 * lint Change-Id: Ibf3c06619ef296df5c49f3945cb6428777781d69 * move logging.h to src * fix an error in macOS * remove logging.h * use cflags for gcc * fix compilation error
As proposed in #5060, this PR brings an implement of a memory container that returns addresses from a single stack.