-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass function args by reference using span<> #359
Conversation
d8fe1cd
to
dcf6c97
Compare
Codecov Report
@@ Coverage Diff @@
## master #359 +/- ##
=======================================
Coverage 99.17% 99.17%
=======================================
Files 49 49
Lines 13232 13236 +4
=======================================
+ Hits 13123 13127 +4
Misses 109 109 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good, but please add a basic unit test for span.
Perhaps also two test cases: one using std::vector as input and one using fizzy::stack.
lib/fizzy/execute.cpp
Outdated
std::vector<uint64_t> locals = std::move(args); | ||
std::vector<uint64_t> locals; | ||
locals.reserve(args.size() + code.local_count); | ||
std::copy_n(args.begin(), args.size(), std::back_inserter(locals)); | ||
locals.resize(locals.size() + code.local_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need reserve
+ resize
or can we stay with one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this seems to now expand it to args.size() + local_count + local_count
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could first resize
here, then just copy_n
without back_inserter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is "minimal work" version.
- Firstly we allocate memory for both args and locals (
locals.size()
is still 0 at this point). - We copy args.
- We fill locals with zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the end I used the what @gumb0 suggested. Micro-benchmarks included.
lib/fizzy/span.hpp
Outdated
/// https://en.cppreference.com/w/cpp/container/span | ||
/// Only `const T` is supported. | ||
template <typename T, typename = typename std::enable_if<std::is_const_v<T>>::type> | ||
struct span |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why it's a struct
and not class
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it's not a standard method, I think it should be called Span
like our other classes like Stack
and OperandStask
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from C++20: https://en.cppreference.com/w/cpp/container/span.
But it is a class so I will change to class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not, because it's not in std
.
But it's just a nitpick, I don't care much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave it as is to be a drop-in replacement for standardized "container".
New benchmarksHaswell / clang-10
Cloud / gcc-9
|
test/unittests/span_test.cpp
Outdated
TEST(span, vector) | ||
{ | ||
std::vector<uint64_t> vec{1, 2, 3, 4, 5, 6}; | ||
span<const uint64_t> s(&vec[1], 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
span<const uint64_t> s(&vec[1], 3); | |
const span<const uint64_t> s(&vec[1], 3); |
test/unittests/span_test.cpp
Outdated
stack.push(13); | ||
|
||
constexpr auto num_items = 2; | ||
span<const uint64_t> s(stack.rend() - num_items, num_items); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
span<const uint64_t> s(stack.rend() - num_items, num_items); | |
const span<const uint64_t> s(stack.rend() - num_items, num_items); |
Not happy about regressions. Can you also benchmark the |
13f3967
to
0a2cdfd
Compare
Final benchmarks:
The last commit alone (locals initialization micro-optimization):
And the
Maybe someone wants to explain why 3/8 case is outstanding? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
|
test/unittests/api_test.cpp
Outdated
@@ -7,6 +7,7 @@ | |||
#include <gtest/gtest.h> | |||
#include <lib/fizzy/limits.hpp> | |||
#include <test/utils/asserts.hpp> | |||
#include <test/utils/execute_helpers.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this file needs it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has the overload for execute(..., {...})
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But no new test was added, why wasn't this needed before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, the args were passed by std::vector
which has constructor for {}
.
test/unittests/end_to_end_test.cpp
Outdated
@@ -33,7 +33,8 @@ TEST(end_to_end, milestone1) | |||
const auto module = parse(wasm); | |||
auto instance = instantiate(module); | |||
|
|||
EXPECT_THAT(execute(*instance, 0, {20, 22}), Result(20 + 22 + 20)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wouldn't work with span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because std::span
is a reference type and {20, 22}
is a temporary object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in the "external API" we should have a wrapper then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can have this in public API. But, I'm not convinced this is super useful. In normal usage you rarely pass constant args to wasm. They come usually from other sources like CLI, etc.
For now I placed this in fizzy::test
.
27b8366
to
70a4507
Compare
The latest benchmark with GCC10, Haswell, 4 GHz.
|
test/utils/execute_helpers.hpp
Outdated
|
||
namespace fizzy::test | ||
{ | ||
inline execution_result execute(const Module& module, FuncIdx func_idx, std::vector<uint64_t> args) | ||
inline execution_result execute( | ||
Instance& instance, FuncIdx func_idx, std::initializer_list<uint64_t> args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it not help in tests where you declare args as arrays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not included this overload in end_to_end tests where it has small number of case that could be converted to std::array
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's 4 cases, I'd still use it.
@@ -613,8 +611,8 @@ execution_result execute( | |||
const auto& code = instance.module.codesec[code_idx]; | |||
auto* const memory = instance.memory.get(); | |||
|
|||
std::vector<uint64_t> locals = std::move(args); | |||
locals.resize(locals.size() + code.local_count); | |||
std::vector<uint64_t> locals(args.size() + code.local_count, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this zeroing it out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. First we allocate and zero for args + locals. Then we overwrite args with proper values.
test/unittests/execute_test.cpp
Outdated
@@ -1003,9 +1004,10 @@ TEST(execute, reuse_args) | |||
"0061736d01000000010b0260027e7e017e6000017e03030200010a19020e002000200180210120002001820b08" | |||
"004217420510000b"); | |||
|
|||
const std::vector<uint64_t> args{20, 3}; | |||
const std::array<uint64_t, 2> args{20, 3}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this change needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed, but no need to use std::vector
to allocate 2 args.
|
||
[[gnu::noinline]] auto init_locals_2(fizzy::span<const uint64_t> args, uint32_t local_count) | ||
{ | ||
std::vector<uint64_t> locals(args.size() + local_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the same as locals(args.size() + local_count, 0);
e.g. the one used in execute.cpp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The 0
value is default.
test/unittests/execute_call_test.cpp
Outdated
Instance&, std::vector<uint64_t> args, int) -> execution_result { | ||
return fizzy::execute(*instance1, *func_idx, std::move(args)); | ||
Instance&, span<const uint64_t> args, int) -> execution_result { | ||
return fizzy::execute(*instance1, *func_idx, std::vector(args.begin(), args.end())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this?
It seems args
passed to this lambda should live long enough on the caller side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A mistake.
This passes function arguments by reference (instead of by value with
std::vector
). This eliminates one allocation per call because - previously one allocation was done to create the vector, then another to create locals.Changing
std::vector
toconst std::vector&
may have been enough, butspan
is more generic and easier to refactor with.The
snap
does not have unit tests as it is trivial and not all methods are currently used after full transition.