Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PVF Worker Process #1964

Merged
merged 18 commits into from
Feb 20, 2024
Merged

PVF Worker Process #1964

merged 18 commits into from
Feb 20, 2024

Conversation

igor-egorov
Copy link
Contributor

@igor-egorov igor-egorov commented Jan 30, 2024

Referenced issues

LimeChain/polkadot-conformance#26
#1800
https://github.com/qdrvm/KAGOME-audit/issues/23 (CPU time limit)

Description of the Change

Execute PVF call to runtime in a separate process thus it could be terminated on deadline.
All the configurations are supported:

  • WasmEdge Interpreted
  • WasmEdge Compiled
  • Binaryen
  • WAVM

Many thanks to @turuslan for help!

Benefits

Correct behavior.

Possible Drawbacks

  1. Memory-related issues have to be addressed separately.
  2. Further development could be continued to dedicate "prepare worker" too (now "execute worker" is separated only).

Usage Examples or Tests

New CLI flags introduced:

  1. Mention --parachain-single-process to employ the former behavior which would not spawn additional child processes.
  2. Set --parachain-check-deadline=5000 to use 5 seconds deadline for PVF blocks execution (if the first flag is not set).

Alternate Designs

Use fork? - not reasonable. Current approach is not going to spend double size of memory too, but much easier to handle control flow.

igor-egorov and others added 12 commits November 17, 2023 16:12
Signed-off-by: Igor Egorov <igor@qdrvm.io>
Signed-off-by: turuslan <turuslan.devbox@gmail.com>
Signed-off-by: Igor Egorov <igor@qdrvm.io>

# Conflicts:
#	core/parachain/pvf/pvf_impl.cpp
#	core/parachain/pvf/pvf_impl.hpp
#	core/runtime/binaryen/core_api_factory_impl.cpp
#	core/runtime/binaryen/instance_environment_factory.cpp
#	core/runtime/common/module_instance.cpp
#	core/runtime/wavm/core_api_factory_impl.cpp
#	core/runtime/wavm/instance_environment_factory.cpp
Signed-off-by: Igor Egorov <igor@qdrvm.io>
Signed-off-by: Igor Egorov <igor@qdrvm.io>
Signed-off-by: Igor Egorov <igor@qdrvm.io>
Signed-off-by: Igor Egorov <igor@qdrvm.io>
Thanks to @turuslan!

Signed-off-by: Igor Egorov <igor@qdrvm.io>
Signed-off-by: turuslan <turuslan.devbox@gmail.com>
Signed-off-by: turuslan <turuslan.devbox@gmail.com>
Signed-off-by: Igor Egorov <igor@qdrvm.io>
@kamilsa kamilsa requested review from kamilsa and removed request for iceseer February 2, 2024 09:13
log,
"Creating wasm module with non-default :heappages value set to {}",
pages);
// TODO: should pvf use ":heappages" from relay chain?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace with issue link

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 22 to 23
// run(scale::encode())
// run(make_shared<Buffer>(scale::encode()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be removed?

exe,
bp::args({"pvf-worker"}),
bp::std_out > pipe_stdout,
// bp::std_err > bp::null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

std::make_shared<libp2p::log::Configurator>()));
auto r = logging_system->configure();
if (not r.message.empty()) {
(r.has_error ? std::cerr : std::cout) << r.message << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why print warnings to cout? Especialy if it's used for IPC.

(r.has_error ? std::cerr : std::cout) << r.message << std::endl;
}
if (r.has_error) {
exit(EXIT_FAILURE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is here it exit() but below it's return?

argv0().value(),
common::Buffer{scale::encode(input).value()},
cb);
// this will go away as soon as callWasm will have an async form
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an issue for it?

void runWorker(boost::asio::io_context &io_context,
std::shared_ptr<libp2p::basic::Scheduler> scheduler,
std::chrono::milliseconds timeout,
const std::string &exe,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be filesystem::path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really needed as we don't perform any manipulations over the value and use/pass it as is.

std::chrono::milliseconds timeout,
const std::string &exe,
common::Buffer input_,
Cb cb_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the signature and name it's not quite clear what is this callback. Please comment.

#include <string>

namespace kagome {
inline auto &argv0() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's "/proc/self/exe", but also why not call this function 'get_executable_name' or something similar?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want LINUX ? read("/proc/self/exe") : APPLE ? _dyld_get_image_name(0) : ERROR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found argv0 the simplest and safe enough way to implement this. Anyways, thanks for the pointing the possible way too!


if (argc > 1) {
std::string_view name{argv[1]};
if (name == "pvf-worker") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be added to the output of kagome --help

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to leave it undocumented as it is not supposed for manual use (at least the node run with the flag expects binary input).

};

io_context.post([scheduler, timeout, cb] {
scheduler->schedule([cb]() mutable { cb(std::errc::timed_out); },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it going to be called in a separate thread, or how is it guaranteed that the thread where it's scheduled to execute will be available?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we use the "main" io_context, thus this surely outlive the executor.

Signed-off-by: Igor Egorov <igor@qdrvm.io>
@igor-egorov igor-egorov merged commit 5d419f2 into master Feb 20, 2024
12 of 13 checks passed
@igor-egorov igor-egorov deleted the feature/pvf_worker_process branch February 20, 2024 11:42
iceseer added a commit that referenced this pull request Feb 23, 2024
iceseer pushed a commit that referenced this pull request Feb 23, 2024
Signed-off-by: Igor Egorov <igor@qdrvm.io>
Signed-off-by: turuslan <turuslan.devbox@gmail.com>
Co-authored-by: turuslan <turuslan.devbox@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants