-
Notifications
You must be signed in to change notification settings - Fork 913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamic plugin cleanups #3679
Dynamic plugin cleanups #3679
Conversation
/* Prevent a race between was_pending and still_pending */ | ||
new_reltimer(c->ld->timers, rpc_hook, time_from_msec(1), | ||
call_rpc_command_hook, rpc_hook); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice cleanup ! Will test soon
lightningd/plugin.c
Outdated
@@ -456,7 +495,12 @@ static struct io_plan *plugin_write_json(struct io_conn *conn, | |||
static void plugin_conn_finish(struct io_conn *conn, struct plugin *plugin) | |||
{ | |||
plugin->stdout_conn = NULL; | |||
if (plugin->start_cmd) { | |||
plugin_cmd_succeeded(plugin->start_cmd, plugin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we succeed in this case ? I would have expected us to return an error if plugin dies before completing the command
EDIT: It's for plugin stop
, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I wonder what would happen if a plugin started with plugin start
exit()
after receving init
then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outdated: modified in a following commit
@@ -190,6 +190,7 @@ void plugin_kill(struct plugin *plugin, char *fmt, ...) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the dynamic case though, it's 60 seconds to both complete getmanifest and init, which seems fair.
Given that normally-functioning plugins usually return is less than 2sec, I find it huge. But I can live with it (or with ctrl+c) :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I came across this comment in another commit, so I'll precise: this a comment for the commit message of 62f0916
Travis TL;DR : diff --git a/lightningd/lightningd.c b/lightningd/lightningd.c
index ddfa141d7..2013501be 100644
--- a/lightningd/lightningd.c
+++ b/lightningd/lightningd.c
@@ -123,8 +123,8 @@ static struct lightningd *new_lightningd(const tal_t *ctx)
* is a nod to keeping it minimal and explicit: we need this code for
* testing, but its existence means we're not actually testing the
* same exact code users will be running. */
- ld->dev_debug_subprocess = NULL;
#if DEVELOPER
+ ld->dev_debug_subprocess = NULL;
ld->dev_disconnect_fd = -1;
ld->dev_subdaemon_fail = false;
ld->dev_allow_localhost = false; |
e38a971
to
bf300f4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests nit and a small addition, other than that lgtm : I did some tests, including against plugins in lightningd/plugins and esplora Bitcoin backend (--disable-plugin bcli
).
@@ -1240,7 +1384,7 @@ void plugins_config(struct plugins *plugins) | |||
{ | |||
struct plugin *p; | |||
list_for_each(&plugins->plugins, p, list) { | |||
if (p->plugin_state == UNCONFIGURED) | |||
if (p->plugin_state == NEEDS_INIT) | |||
plugin_config(p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't add a comment directly in the function, but you don't set the AWAITING_INIT_RESPONSE
in plugin_config()
, after sending the init request.
21a25ef
to
4fe882f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 4fe882f
Yep, fixed... |
4fe882f
to
1392bf8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 1392bf8
The symptom (under heavy load and valgrind) in test_plugin_command: lightningd: common/json_stream.c:237: json_stream_output_: Assertion `!js->reader' failed. This is because we try to call `getmanifest` again on `pay` which has not yet responded to init. The minimal fix for this is to keep proper state, so we can tell the difference between "not yet called getmanifest" and "not yet finished init". Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice cleanup, just found some minor nits and questions for my own curiosity 🙂
/* FIXME(cdecker) This just builds a string for the request because | ||
* the json_stream is tightly bound to the command interface. It | ||
* should probably be generalized and fixed up. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥳
@@ -87,7 +87,6 @@ struct plugin { | |||
*/ | |||
struct plugins { | |||
struct list_head plugins; | |||
size_t pending_manifests; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much nicer state tracking than just counting missing manifests 👍
def run(self): | ||
time.sleep(1) | ||
print("Exiting!") | ||
os._exit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, I usually use sys.exit(1)
for these, but the docs explicitly suggest using os._exit(1)
after a fork
call.
|
||
@plugin.init() | ||
def init(options, configuration, plugin): | ||
FailThread() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same could be achieved by the following:
def init(options, configuration, plugin):
threading.Timer(1, lambda: os._exit(1))
or if the print
is actually needed we can do this instead:
def init(options, configuration, plugin):
def die():
print("Exiting!")
os._exit(1)
threading.Timer(1.0, die)
😉
lightningd/plugin_control.c
Outdated
tal_del_destructor2(dp->plugin, plugin_dynamic_crash, dp); | ||
|
||
list_for_each(&dp->plugin->plugins->plugins, p, list) { | ||
if (p->plugin_state != INIT_COMPLETE) | ||
return; | ||
} | ||
|
||
/* No plugin unconfigured left, return the plugin list */ | ||
was_pending(plugin_dynamic_list_plugins(dp->cmd, dp->plugin->plugins)); | ||
} | ||
|
||
/** | ||
* Send the init message to the plugin. We don't care about its response, | ||
* but it's considered the last part of the handshake : once it responds | ||
* it is considered configured. | ||
*/ | ||
static void plugin_dynamic_config(struct dynamic_plugin *dp) | ||
{ | ||
struct jsonrpc_request *req; | ||
|
||
req = jsonrpc_request_start(dp->plugin, "init", dp->plugin->log, | ||
plugin_dynamic_config_callback, dp); | ||
plugin_populate_init_request(dp->plugin, req); | ||
jsonrpc_request_end(req); | ||
dp->plugin->plugin_state = AWAITING_INIT_RESPONSE; | ||
plugin_request_send(dp->plugin, req); | ||
} | ||
|
||
static void plugin_dynamic_manifest_callback(const char *buffer, | ||
const jsmntok_t *toks, | ||
const jsmntok_t *idtok, | ||
struct dynamic_plugin *dp) | ||
{ | ||
if (!plugin_parse_getmanifest_response(buffer, toks, idtok, dp->plugin)) | ||
return was_pending(plugin_dynamic_error(dp, "Gave a bad response to getmanifest")); | ||
|
||
if (!dp->plugin->dynamic) | ||
return was_pending(plugin_dynamic_error(dp, "Not a dynamic plugin")); | ||
|
||
/* We got the manifest, now send the init message */ | ||
dp->plugin->plugin_state = NEEDS_INIT; | ||
plugin_dynamic_config(dp); | ||
} | ||
|
||
/** | ||
* This starts a plugin : spawns the process, connect its stdout and stdin, | ||
* then sends it a getmanifest request. | ||
*/ | ||
static struct command_result *plugin_start(struct dynamic_plugin *dp) | ||
{ | ||
int stdin, stdout; | ||
mode_t prev_mask; | ||
char **p_cmd; | ||
struct jsonrpc_request *req; | ||
struct plugin *p = dp->plugin; | ||
|
||
p->dynamic = false; | ||
p_cmd = tal_arrz(NULL, char *, 2); | ||
p_cmd[0] = p->cmd; | ||
/* In case the plugin create files, this is a better default. */ | ||
prev_mask = umask(dp->cmd->ld->initial_umask); | ||
p->pid = pipecmdarr(&stdin, &stdout, &pipecmd_preserve, p_cmd); | ||
umask(prev_mask); | ||
if (p->pid == -1) | ||
return plugin_dynamic_error(dp, "Error running command"); | ||
else | ||
log_debug(dp->cmd->ld->plugins->log, "started(%u) %s", p->pid, p->cmd); | ||
tal_free(p_cmd); | ||
p->buffer = tal_arr(p, char, 64); | ||
p->stop = false; | ||
/* Give the plugin 20 seconds to respond to `getmanifest`, so we don't hang | ||
* too long on the RPC caller. */ | ||
p->timeout_timer = new_reltimer(dp->cmd->ld->timers, dp, | ||
time_from_sec((20)), | ||
plugin_dynamic_timeout, dp); | ||
|
||
/* Besides the timeout we could also have the plugin crash before | ||
* completing the handshake. In that case we'll get notified and we | ||
* can clean up the `struct dynamic_plugin` and return an appropriate | ||
* error. | ||
* | ||
* The destructor is deregistered in the following places: | ||
* | ||
* - plugin_dynamic_error in case of a timeout or a crash | ||
* - plugin_dynamic_config_callback if the handshake completes | ||
*/ | ||
tal_add_destructor2(p, plugin_dynamic_crash, dp); | ||
|
||
/* Create two connections, one read-only on top of the plugin's stdin, and one | ||
* write-only on its stdout. */ | ||
io_new_conn(p, stdout, plugin_stdout_conn_init, p); | ||
io_new_conn(p, stdin, plugin_stdin_conn_init, p); | ||
req = jsonrpc_request_start(p, "getmanifest", p->log, | ||
plugin_dynamic_manifest_callback, dp); | ||
jsonrpc_request_end(req); | ||
plugin_request_send(p, req); | ||
p->plugin_state = AWAITING_GETMANIFEST_RESPONSE; | ||
return command_still_pending(dp->cmd); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice cleanup 👍
/** | ||
* Called when trying to start a plugin through RPC, it starts the plugin and | ||
* will give a result 20 seconds later at the most. | ||
*/ | ||
static struct command_result * | ||
plugin_dynamic_start(struct command *cmd, const char *plugin_path) | ||
{ | ||
struct dynamic_plugin *dp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment still references 20 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't add the review comment on that line since it'd confuse GH to have review comments on unchanged lines.
@@ -482,7 +483,8 @@ static struct io_plan *plugin_read_json(struct io_conn *conn, | |||
|
|||
if (err) { | |||
plugin_kill(plugin, "%s", err); | |||
return io_close(plugin->stdout_conn); | |||
/* plugin_kill frees plugin */ | |||
return io_close(NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting, what's the effect of io_close(NULL)
? Is it just intended as a no-op?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, just like the one above it, it's a cheats way of getting an io_plan object to return
err = plugin_send_getmanifest(p); | ||
if (!err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this leak err
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost: it's allocated off of the plugin which we'd be freeing if there is an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I use this pattern through this code quite a bit.
1392bf8
to
41dfd66
Compare
… not. This will allow us to simplify the caller's command handling. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we know whether the command completed or not, we can correctly call command_still_pending() if it didn't complete. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have this now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
… for state. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This will let us unify the startup and runtime-started infrastructure. Note that there are two kinds of notifications: 1. Starting a single plugin (i.e. `plugin start`) 2. Starting multiple plugins (i.e. `plugin rescan` or `plugin startdir`). In the latter case, we want the command to complete only once *all* the plugins are dead/finished. We also call plugin_kill() in all cases, and correctly return afterwards (it matters once we use the same paths for dynamic plugins, which don't cause a fatal error if they don't startup). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This actually passes fine, but it's an interesting case to test. Fixed-by: Darosior <darosior@protonmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This will allow the dynamic starting code to use them too. Also lets us move dev_debug_subprocess under #if DEVELOPER. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We modify the slow_init() so it doesn't go too slowly for this test. This demonstrates a crash, where we currently try to fail a command multiple times. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means we now clean up options in startup plugins (that was only done by dynamic code!), and now they both share the 60 second timeout instead of 20 seconds for dynamic. For the dynamic case though, it's 60 seconds to both complete getmanifest and init, which seems fair. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of calling plugin_kill() and returning, have them uniformly return an error string or NULL, and have the top level (plugin_read_json) do the plugin_kill() call. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once again, this unifies plugin_kill() into the caller. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is what I expected from plugin_kill, and now all the callers do the equivalent anywat, it's easy. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They are children of the plugin, so this Just Works. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This way the caller doesn't have to "know" that it should use strerror(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
That's more convenient for most callers, which don't need a fmt. Fixed-by: Darosior <darosior@protonmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's not needed now that plugin_kill frees the plugin immediately. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Make the destructor call check_plugins_resolved(), unless it was uninitialized (`opt_disable_plugin`). 2. Remove redundant list_del (destructor already does it). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The previous implementation was a bit lazy: in particular, since we didn't remember the disabled plugins, we would load them on rescan. Changelog-Changed: config: the `plugin-disable` option works even if specified before the plugin is found.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's no reason to assign the plugin vars inside the callback, so do that outside, and the tal_steal() is redundant (the plugin is already the conn parent). And reduce duplication by making plugin_conn_finish call plugin_kill: just make sure we don't call plugin_conn_finish again if plugin_kill is called externally. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We unified this timeout with the 60 second startup timeout. Reported-by: @cdecker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
41dfd66
to
b2ec70a
Compare
This patch series is about unifying the dynamic and startup plugin paths. In doing so it clears up some latent bugs, and reduces code duplication.