-
Notifications
You must be signed in to change notification settings - Fork 906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kedro-telemetry
: Improve performance by switching to after_command_run
#4014
Conversation
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
…org/kedro into telemetry-improve-cli-performance
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Could you add some notes to the PR? I think you did some testing before but I am particularly interested if we will continue to receive telemetry for long-running commands like once we move to after_command_run
kedro ipython
kedro jupyter
The actual changes for telemetry to send event after command is kedro-org/kedro-plugins#766, so the description of this PR need to reflect the changes of this PR. I know there were some discussion offline, it would be beneficial to keep a summary here for future reference and know more about the context.
kedro/framework/cli/cli.py
Outdated
@@ -38,6 +39,13 @@ | |||
v{version} | |||
""" | |||
|
|||
logger = logging.getLogger(__name__) | |||
logger.setLevel(logging.ERROR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will hide WARNING
level logs as I understand, what's the reasoning behind this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial idea was that, since this would happen instead of the run stopping for an exception, it should appear only when an error occurs. But yeah it makes sense, we can lower it to a warning level.
kedro/framework/cli/cli.py
Outdated
logger = logging.getLogger(__name__) | ||
logger.setLevel(logging.ERROR) | ||
handler = logging.StreamHandler(sys.stderr) | ||
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not formatted by kedro
already? Why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I added this I had issues with the hook not being called unless this formatting was explicitly on the file, it was not using kedro
's. I'm doing some tests to see if I can make it just use kedro
's instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I recalled why I did this. Since it was not using the config from kedro
correctly, it was putting the error messages on stdout instead of stderr.
kedro/framework/cli/cli.py
Outdated
self._cli_hook_manager.hook.after_command_run( | ||
project_metadata=self._metadata, command_args=args, exit_code=0 | ||
) | ||
sys.exit(exit_code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this sys.exit
necessary? I think there is a high chance this will cause problem to use kedro CLI as sys.exit
would terminate the program eagerly.
#4026 is trying to change click
to standalone=False
to make CLI
https://stackoverflow.com/questions/60780051/python-click-command-exit-flow
This issue document the previous investigation done by @antonymilne and explains more in depth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've used it here as a way to ensure that it exits with the appropriate error code, because it was exiting with a 0 even when an error happened. It's not necessary to exit at this point, as long as when it exits, it's with the correct code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I did for now moving the sys.exit
into the block that handles the exception in case something goes wrong. So if no errors happen, click will exit normally.
) | ||
hook_called = True | ||
finally: | ||
if not hook_called: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR, @lrcouto! Could you please explain how this is possible? It looks like hook_called
will be set to True
in the try block or, if it fails, in the except block, so this line will never be executed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that's in case of no exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this finally block is there to solve that issue where the hook was called twice when there was not an exception.
Thanks for the reviews! I'm doing some tests here to try to address all of the comments. |
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
Signed-off-by: Laura Couto <laurarccouto@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @lrcouto, LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @lrcouto ! LGTM, I wrote one question about logging settings.
@@ -38,6 +39,9 @@ | |||
v{version} | |||
""" | |||
|
|||
logger = logging.getLogger(__name__) | |||
logger.addHandler(logging.StreamHandler(sys.stderr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With that setting, will every logging message be sent to stderr
? If the INFO
logging level is configured in Kedro, will INFO
messages also go to stderr
, or are they not connected to Kedro's logging settings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should affect only the messages generated from this instance of the logger, which is just the one error message on the main function.
re: commands like kedro ipython or kedro jupyter, it'll send info to heap when you reload inside the shell and after you exit it: Exited with ctrl+D here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested with a few commands, LGTM!
This is the Kedro side of the changes made for kedro-org/kedro-plugins#709. The actual changes to
kedro-telemetry
can be found at kedro-org/kedro-plugins#766.Description
Development notes
Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a
Signed-off-by
line in the commit message. See our wiki for guidance.If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
RELEASE.md
file