Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small stack optimization #265

Merged
merged 2 commits into from
Jun 4, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 33 additions & 11 deletions lib/fizzy/stack.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -58,32 +58,54 @@ class Stack : public std::vector<T>

class OperandStack
{
/// The size of the pre-allocated internal storage: 128 bytes.
static constexpr auto small_storage_size = 128 / sizeof(uint64_t);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to fine-tune this, e.g. benchmark it with 64, 256, 512? Or should we postpone that until we have a better set of inputs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tuned it to current benchmark set - the minimal value that fits all benchmarks (and unit tests by coincidence) in small storage.


/// The pointer to the top item, or below the stack bottom if stack is empty.
///
/// This pointer always alias m_storage, but it is kept as the first field
/// because it is accessed the most. Therefore, it must be initialized
/// in the constructor after the m_storage.
uint64_t* m_top;

/// The storage for items.
std::unique_ptr<uint64_t[]> m_storage;
/// The bottom of the stack. Set in the constructor and never modified.
///
/// TODO: This pointer is rarely used and may be removed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment still valid?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I just added it. There is some experimentation needed for it, with probably no performance implications. But it is worth just for sanity and "simpler" design.

/// The value can be recomputed as (m_large_storage ? m_large_storage : m_small_storage).
/// Moreover, methods using it may be replaced by ones not needing bottom of the stack.
uint64_t* m_bottom;

/// The pre-allocated internal storage.
uint64_t m_small_storage[small_storage_size];

/// The unbounded storage for items.
std::unique_ptr<uint64_t[]> m_large_storage;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be std::array

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unique_ptr<std::array>?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I put comment on the wrong line, sorry, I meant it for the stack storage above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guessed that later.

We only take the pointer of m_small_storage once in constructor. So I don't think it is even worth to include <array> header as we are not using any of std::array features.


public:
/// Default constructor. Sets the top item pointer to below the stack bottom.
/// Default constructor.
///
/// Based on @p max_stack_height decides if to use small pre-allocated storage or allocate
/// large storage.
/// Sets the top item pointer to below the stack bottom.
explicit OperandStack(size_t max_stack_height)
: m_storage{std::make_unique<uint64_t[]>(max_stack_height)}
{
m_top = m_storage.get() - 1;
if (max_stack_height <= small_storage_size)
{
m_bottom = &m_small_storage[0];
}
else
{
m_large_storage = std::make_unique<uint64_t[]>(max_stack_height);
m_bottom = &m_large_storage[0];
}
m_top = m_bottom - 1;
}

OperandStack(const OperandStack&) = delete;
OperandStack& operator=(const OperandStack&) = delete;

/// The current number of items on the stack (aka stack height).
[[nodiscard]] size_t size() const noexcept
{
return static_cast<size_t>(m_top + 1 - m_storage.get());
}
[[nodiscard]] size_t size() noexcept { return static_cast<size_t>(m_top + 1 - m_bottom); }

/// Returns the reference to the top item.
/// Requires non-empty stack.
Expand Down Expand Up @@ -121,11 +143,11 @@ class OperandStack
{
assert(new_size <= size());
// For new_size == 0, the m_top will point below the storage.
m_top = m_storage.get() + new_size - 1;
m_top = m_bottom + new_size - 1;
}

/// Returns iterator to the bottom of the stack.
[[nodiscard]] const uint64_t* rbegin() const noexcept { return m_storage.get(); }
[[nodiscard]] const uint64_t* rbegin() const noexcept { return m_bottom; }

/// Returns end iterator counting from the bottom of the stack.
[[nodiscard]] const uint64_t* rend() const noexcept { return m_top + 1; }
Expand Down
48 changes: 47 additions & 1 deletion test/unittests/execute_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1007,4 +1007,50 @@ TEST(execute, reuse_args)
EXPECT_THAT(args, ElementsAre(20, 3));

EXPECT_THAT(execute(parse(wasm), 1, {}), Result(23 % (23 / 5)));
}
}

TEST(execute, stack_abuse)
{
/* wat2wasm
(func (param i32) (result i32)
local.get 0
i32.const 1
i32.const 2
i32.const 3
i32.const 4
i32.const 5
i32.const 6
i32.const 7
i32.const 8
i32.const 9
i32.const 10
i32.const 11
i32.const 12
i32.const 13
i32.const 14
i32.const 15
i32.const 16
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
i32.add
)
*/
const auto wasm = from_hex(
"0061736d0100000001060160017f017f030201000a360134002000410141024103410441054106410741084109"
"410a410b410c410d410e410f41106a6a6a6a6a6a6a6a6a6a6a6a6a6a6a6a0b");

EXPECT_THAT(execute(parse(wasm), 0, {1000}), Result(1136));
}