WIP: implementation of ethstats plugin #1216
Conversation
Started the implementation of an ethstats plugin for trinity: added the basic plugin scaffolding and implemented some websockets interaction, with fake data for now. The plan is to add the stats that don't depend on py-evm/the blockchain (latency reporting, OS, client version, etc.), plus more logging and some documentation. After that I will proceed with the blockchain-dependent stats (the most complicated part).
Minor meta comments. (Only skimmed plugin body, as it mostly seems to be a placeholder.)
Extra-meta: looks like you haven't specified an e-mail for the git repo that GitHub can recognise (the account does not get attributed with the commit). Fixed.
Some structural feedback. Let me know if you have any questions.
return self.is_enabled

def start(self, context: PluginContext) -> None:
    asyncio.ensure_future(self.stats_client.run())
@cburgdorf how do we handle cancellations for things like this? Does the plugin API already take care of this?
There's nothing yet. It's probably worth opening an issue to collect specific use cases that need to be addressed. In this case, are you referring to how we would stop/restart this plugin, how it would recover from failure (auto-restart), or something completely different?
I'm interested in the clean-shutdown aspect (which might be what he meant too). In this case, there is no cancel token being passed to the stats client from any parent service, so nothing causes it to shut down smoothly.
I guess the plugin could listen for the shutdown event and call await self.stats_client.cancel(). Any other good options?
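For illustration only, a minimal sketch of that idea in plain asyncio; the shutdown_event and the stats client's run()/cancel() coroutines are assumptions made for the example, not trinity's actual plugin API:

import asyncio


class EthstatsPluginSketch:
    # Hypothetical wrapper that cancels its stats client once shutdown is signalled.

    def __init__(self, stats_client, shutdown_event: asyncio.Event) -> None:
        self.stats_client = stats_client
        self.shutdown_event = shutdown_event

    def start(self) -> None:
        asyncio.ensure_future(self.stats_client.run())
        # Watch for the shutdown signal and cancel the client so it can exit cleanly.
        asyncio.ensure_future(self._watch_shutdown())

    async def _watch_shutdown(self) -> None:
        await self.shutdown_event.wait()
        await self.stats_client.cancel()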
What happens as of now is that inside kill_trinity_gracefully (line 291 in ca05454) we call
plugin_manager.shutdown()
which in turn calls
plugin.stop()
on each plugin. Notice however that we have two instances of PluginManager in play. This particular instance, which lives in the main process, is only responsible for plugins that either take over the whole main process (e.g. attach) or that run in their own isolated process. For the first category there is really nothing to do, as they completely redefine what the trinity executable actually is (so this whole kill_trinity_gracefully just won't "exist" for them anyway). For a BaseIsolatedPlugin, what happens is that we call kill_process_gracefully for each of them.
py-evm/trinity/extensibility/plugin.py, lines 180 to 182 in ca05454:

def stop(self) -> None:
    self.context.event_bus.stop()
    kill_process_gracefully(self._process, self.logger)
That is basically the same takedown routine that we use for the networking and database processes today. It is up to the plugin what to make of that. Also, kill_process_gracefully blocks until each process has gone down (I guess that happens sequentially, which means the more isolated plugins there are, the longer the shutdown takes, and we should be able to optimize that).
Now for the other PluginManager instance, which lives inside the networking process and handles all the plugins that want to run in that shared process: I guess we don't call plugin_manager.shutdown() there, which basically means the plugins in that process don't get a fair chance to shut down gracefully. I'm also not sure whether a synchronous shutdown is the right model there. Tbh, it's a grey area that still needs more thought.
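As an aside on the "we should be able to optimize that" remark above, one rough sketch (not what the code does today) of how the sequential takedown of isolated plugins could be parallelized, assuming kill_process_gracefully blocks until the process has exited:

from concurrent.futures import ThreadPoolExecutor

# kill_process_gracefully is the helper quoted above; its import is omitted here.

def shutdown_isolated_plugins(plugins, logger) -> None:
    # Each kill_process_gracefully call blocks until its process has exited,
    # so run them in threads to take the plugin processes down concurrently
    # instead of one after another.
    with ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(kill_process_gracefully, plugin._process, logger)
            for plugin in plugins
        ]
        for future in futures:
            future.result()  # surface any exceptions from the takedown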
Reimplemented the whole plugin as a BaseIsolatedPlugin; hopefully this resolves the issues with graceful shutdown.
I've addressed some of the comments, added basic support for static stats (versions etc.), and proceeded with adding some basic blockchain stats (latest block, peer count) by getting them from
@evgeniuz lemme know when you'd like another round of review.
Pushed a version with most of the comments addressed; also changed the architecture a bit, so the client now uses a producer/consumer pattern. I'm planning a small change to this, though: the consumer (which receives messages from the stats server) will not process them, but rather will put them onto
In future these can be split further if needed (e.g. one can await new blocks and send the stat when one is available, another can await new peers, etc., another can send
The further plan is to add
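For readers unfamiliar with the pattern, a rough sketch of what such a producer/consumer split could look like with an asyncio.Queue; the websocket object is assumed to be an async-iterable connection (e.g. from the websockets library) and the message strings are placeholders, not the real ethstats protocol:

import asyncio


async def consumer(websocket, queue: asyncio.Queue) -> None:
    # Receive raw messages from the stats server and hand them off for
    # processing instead of handling them inline.
    async for raw_message in websocket:
        await queue.put(raw_message)


async def server_message_handler(websocket, queue: asyncio.Queue) -> None:
    # Process queued server messages and send the requested stats back.
    while True:
        message = await queue.get()
        if message == 'node-ping':  # placeholder message format
            await websocket.send('node-pong')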
Leaving final approval to @cburgdorf since he's more familiar with the plugin API. One thing I am concerned about (though it doesn't need to block getting this merged) is clean shutdown. I think there is a decent amount of stuff happening here that won't shut down cleanly.
block = self.chain.get_canonical_head()
peers = len(self.peer_pool)

await client.send_node_ping()
Each of these await statements needs to be wrapped in self.wait so that they will be properly cancelled.
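Roughly, assuming a p2p BaseService-style API where self.wait ties an awaitable to the service's cancel token (the client and handler attributes are stand-ins from the surrounding code), the wrapping would look something like this sketch:

class EthstatsServiceSketch(BaseService):
    async def _run(self) -> None:
        # Wrap a single awaitable so it is cancelled together with the service.
        await self.wait(self.client.send_node_ping())
        # Or run several long-lived handlers, stopping them all as soon as one
        # finishes or the service itself is cancelled.
        await self.wait_first(
            self.server_handler(),
            self.statistics_handler(),
        )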
Not sure if I should wrap every async call in self.wait, but now server_handler and statistics_handler are wrapped in self.wait_first and seem to cancel properly (or maybe I missed something :).
I think that should be fine.
All in all, this looks pretty good to me. The main issue is that (afaik) the plugin will probably not shut down cleanly, as it doesn't chain with an existing CancelToken, and without #1309 there is nothing else that would guarantee a clean shutdown for a plugin that runs in the networking process.
That said, I think I'd advocate for making this a BaseIsolatedPlugin anyway (and those do have a clean shutdown already).
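For context, "chaining with an existing CancelToken" would look roughly like this; the parent token and the import path are assumptions for the sketch:

from cancel_token import CancelToken  # lives under p2p.cancel_token in older py-evm trees

# Sketch: derive the client's token from an existing parent token so that
# triggering the parent also cancels the ethstats client.
parent_token = CancelToken('networking')  # placeholder for an existing token
client_token = parent_token.chain(CancelToken('ethstats'))
# client_token would then be passed down to the stats client/service.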
Other than that, I agree with the comments @pipermerriam already left.
Hi @evgeniuz, have you had a chance to review the comments above?
Fixed most of the issues and moved the logic out of
I think all the infrastructure/networking is in place, so I will now proceed with adding more stats to be sent. @cburgdorf, I have a question about the best way to get chain information: since the plugin is now isolated, I cannot use
Yes, that's the right way to do it. Will take a closer look at the PR tomorrow.
This is looking pretty good. I think there are only a few typing errors left that need to be addressed. If you fix those first, we can get this merged and then improve the stats with subsequent PRs.
Fixed the suggested type hinting and some more. I would like to add some basic chain stats before merging, though (block, hash, difficulty).
Fine by me
Wrapped the type parameter in quotes, added blockchain stats (block number, hash, difficulties) and some version-based stats. Can be merged if everything is ok.
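"Wrapped the type parameter in quotes" refers to string-quoting a parameterised annotation so it is not evaluated at runtime; a minimal illustration with made-up names, not the PR's actual types:

import asyncio
from typing import NamedTuple


class EthstatsMessage(NamedTuple):  # placeholder message type for the example
    command: str
    payload: dict


class StatsQueueSketch:
    def __init__(self) -> None:
        # Quoted so the parameterised annotation is not evaluated at runtime
        # (asyncio.Queue is not subscriptable before Python 3.9).
        self.queue: 'asyncio.Queue[EthstatsMessage]' = asyncio.Queue()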
Hey @evgeniuz, thanks for all the work you put into this. I'm going to squash this and force-push it here for another CI run, as this one went red for unrelated reasons. From here, I think it would be nice to add some test coverage with another PR.
Landed as 3eb6dc5
@evgeniuz or @cburgdorf - did you happen to check whether the plug-in actually works? I've detailed my difficulties in issue #1337. If you did get it to work: against which remote server? A locally-run instance of the JS
I didn't, but I'd expect @evgeniuz developed against something. However, I expect the implementation to have bugs and in general to be an early start. I'm really looking forward to seeing some tests added (plus @evgeniuz jumping in to provide clarification on how to use it). In addition to that, it would be great to have some coverage in the docs about how to set this up. Note that merging this was relatively low impact, as the plugin is disabled by default, and having it in the code base exposes it more broadly, leading to early adopters like you running into bugs and creating issues such as #1337, which in turn leads to further improvement :)
Yes, that's what I was using: a locally run
Closes #1187