Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Object] Introduce POD-C Compliant tvm::Map #5740

Merged
merged 3 commits into from
Jun 19, 2020
Merged

[Object] Introduce POD-C Compliant tvm::Map #5740

merged 3 commits into from
Jun 19, 2020

Conversation

junrushao
Copy link
Member

@junrushao junrushao commented Jun 6, 2020

Related discussion: https://discuss.tvm.ai/t/discuss-runtime-array-containers-array-adt-string/4582.

This PR introduces tvm::Map and tvm::MapNode:

  1. MapNode is a POD-C compliant alternative to `std::unordered_map<ObjectRef, ObjectRef>;
  2. Map is the shared reference to the container which implements copy-on-write semantics.

Features

  • Removes dependency to std::unordered_map
  • The new impl is POD-C compatible in terms of Itanium C++ ABI
  • Copy-on-write semantic is preserved in tvm::Map
  • tvm::MapNode is kept to be mutable and iterable

Breaking changes

  • MapNode no longer has the field std::unordered_map<ObjectRef, ObjectRef> data, and all its methods currently used are implemented to MapNode itself
  • Iterating through MapNode is natively supported directly via interfaces like MapNode::begin()

Performance

To show that the introduction of tvm::Map won't hurt performance, I also did some rough benchmark on my laptop on the success insertion performance.

Hardware: MacBook Pro (16-inch, 2019)
Memory: 8 GB * 2, DDR4, 2667 MHz
CPU: Intel i7-9750H CPU @ 2.60GHz
OS: macOS Catalina 10.15.5
Compiler: Apple clang version 11.0.3 (clang-1103.0.32.62)

Number of Operations Avg Total Time (ms) tvm::MapNode Avg Total Time (ms) std::unordered_map
10 0.000309454 0.000996719
100 0.00309778 0.00949672
1000 0.0183351 0.0847953
10000 0.405679 0.926011
100000 4.29819 10.6617
1000000 57.6163 255.175
10000000 825.093 3743.89

Scripts are available on gist. Note that the measured time may depend on operation, randomness, cache line, compiler, etc.

Credits

We use [1] for placing an item on a table of size of power-of-2. We learned that triangle numbers that could iterate through the power-of-2-sized table through [2]. The idea that linked list can be placed in the same table is learned from [3].

[1] Fibonacci Hashing: https://programmingpraxis.com/2018/06/19/fibonacci-hash/
[2] Quadratic probing with triangle numbers: https://fgiesen.wordpress.com/2015/02/22/triangular-numbers-mod-2n/
[3] Flat hash map: https://github.com/skarupke/flat_hash_map

Thank you @mbrookhart for the valuable discussion!

Also CC @tqchen @icemelon9 @yzhliu @zhiics @comaniac @kevinthesun @jwfromm @jroesch if you guys are interested.

@tqchen tqchen self-assigned this Jun 6, 2020
@junrushao junrushao changed the title [Draft][Object] Introduce runtime::Map [Object][Runtime] Introduce runtime::Map Jun 8, 2020
@junrushao junrushao marked this pull request as ready for review June 8, 2020 07:56
3rdparty/compiler-rt/builtin_fp16.h Outdated Show resolved Hide resolved
3rdparty/compiler-rt/int_lib.h Outdated Show resolved Hide resolved
3rdparty/compiler-rt/int_lib.h Outdated Show resolved Hide resolved
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
@junrushao
Copy link
Member Author

Maybe we should get rid of __builtin_clz as it is the slow path anyway

@zhiics
Copy link
Member

zhiics commented Jun 9, 2020

@junrushao1994 +1 MoveFrom should be way more expensive than it though.

include/tvm/runtime/container.h Outdated Show resolved Hide resolved
src/ir/expr.cc Show resolved Hide resolved
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
@junrushao
Copy link
Member Author

junrushao commented Jun 10, 2020

Some actionable items per offline discussion with @tqchen:

A1. We move things back to include/tvm/node, instead of runtime. This is mainly due to the concern of keeping tvm runtime minimal. Once we find a strong use case that really need HashMap to exist in tvm runtime, we can bring it back.
A2. We will implement a specialization for small maps due to the concern of memory overhead.
A3. We will have unordered_map as a backup option, which can be turned on using macro or something like that.
A4. We will add a lot of comments on how this hash map works.

So it requires some amount of work. I will turn it back to draft until the work is done :-)

@junrushao junrushao marked this pull request as draft June 10, 2020 21:53
@junrushao
Copy link
Member Author

A1 and A2 is done over the commits above

@junrushao
Copy link
Member Author

A4 is done with the last commit

@junrushao
Copy link
Member Author

A3 is done over the last commits

@junrushao junrushao marked this pull request as ready for review June 13, 2020 15:43
include/tvm/runtime/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
@junrushao junrushao changed the title [Object][Runtime] Introduce runtime::Map [Object][Runtime] Introduce POD-C Compliant tvm::Map Jun 14, 2020
@junrushao junrushao changed the title [Object][Runtime] Introduce POD-C Compliant tvm::Map [Object] Introduce POD-C Compliant tvm::Map Jun 14, 2020
@junrushao
Copy link
Member Author

@zhiics @icemelon9 could you guys take another look? much appreciated!

include/tvm/runtime/container.h Outdated Show resolved Hide resolved
src/node/serialization.cc Show resolved Hide resolved
src/node/serialization.cc Outdated Show resolved Hide resolved
src/node/serialization.cc Outdated Show resolved Hide resolved
cmake/config.cmake Outdated Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Outdated Show resolved Hide resolved
include/tvm/node/container.h Show resolved Hide resolved
@junrushao
Copy link
Member Author

junrushao commented Jun 16, 2020

Summary of the change:

  • Remove most of the short names (i, n, m, d, b) and use more informative ones instead.
  • Added comments between important code blocks (insert vs rehash)
  • Add test cases to cover the switch between SmallMapNode and DenseMapNode

@junrushao
Copy link
Member Author

@tqchen Just responded to all the review comments, added test cases and code comments, could you take another look?

include/tvm/node/container.h Outdated Show resolved Hide resolved
@tqchen
Copy link
Member

tqchen commented Jun 18, 2020

Copy link
Member

@zhiics zhiics left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only some nitpicks. Otherwise, LGTM.

include/tvm/node/container.h Show resolved Hide resolved
tests/cpp/container_test.cc Outdated Show resolved Hide resolved
tests/cpp/container_test.cc Show resolved Hide resolved
@junrushao
Copy link
Member Author

@zhiics Could you take another look? Thanks!

@tqchen tqchen merged commit 55e02af into apache:master Jun 19, 2020
@tqchen
Copy link
Member

tqchen commented Jun 19, 2020

THanks @junrushao1994 @zhiics !

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jun 30, 2020
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Jul 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants