`kedro-telemetry`: Improve performance by switching to after_command_run #4014

lrcouto · 2024-07-16T20:38:23Z

This is the Kedro side of the changes made for kedro-org/kedro-plugins#709. The actual changes to kedro-telemetry can be found at kedro-org/kedro-plugins#766.

Description

Development notes

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

Read the contributing guidelines
Signed off each commit with a Developer Certificate of Origin (DCO)
Opened this PR as a 'Draft Pull Request' if it is work-in-progress
Updated the documentation to reflect the code changes
Added a description of this change in the RELEASE.md file
Added tests to cover my changes
Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team


noklam

Thanks for the PR. Could you add some notes to the PR? I think you did some testing before, but I am particularly interested in whether we will continue to receive telemetry for long-running commands like these once we move to after_command_run:

kedro ipython
kedro jupyter

The actual changes that make telemetry send the event after the command are in kedro-org/kedro-plugins#766, so the description of this PR needs to reflect the changes of this PR. I know there was some discussion offline; it would be beneficial to keep a summary here for future reference and to give more context.

noklam · 2024-07-30T10:00:02Z

kedro/framework/cli/cli.py

@@ -38,6 +39,13 @@
 v{version}
 """

+logger = logging.getLogger(__name__)
+logger.setLevel(logging.ERROR)


As I understand it, this will hide WARNING-level logs. What's the reasoning behind this?

My initial idea was that, since this would happen instead of the run stopping for an exception, it should appear only when an error occurs. But yes, that makes sense; we can lower it to the warning level.
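To make the concern in this thread concrete, here is a minimal standard-library sketch of how a logger set to `logging.ERROR` drops WARNING records before any handler sees them. The logger names and the in-memory stream are hypothetical and chosen only for the example; this is not the code from the PR.

```python
import io
import logging


def emitted_levels(level: int) -> list[str]:
    """Return the level names that survive a logger set to `level`."""
    stream = io.StringIO()
    logger = logging.getLogger(f"demo.cli.{level}")  # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False  # keep the demo isolated from the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s"))
    logger.addHandler(handler)

    logger.warning("a warning")  # dropped when level is ERROR
    logger.error("an error")     # kept in both cases

    logger.removeHandler(handler)
    return stream.getvalue().split()
```

Calling `emitted_levels(logging.ERROR)` keeps only the error record, while `emitted_levels(logging.WARNING)` keeps both.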

noklam · 2024-07-30T10:01:03Z

kedro/framework/cli/cli.py

+logger = logging.getLogger(__name__)
+logger.setLevel(logging.ERROR)
+handler = logging.StreamHandler(sys.stderr)
+formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")


Is this not formatted by kedro already? Why is this needed?

When I added this, I had issues with the hook not being called unless this formatting was set explicitly in the file; it was not using Kedro's. I'm running some tests to see if I can make it use Kedro's configuration instead.

OK, I recall why I did this. Since it was not picking up Kedro's config correctly, it was writing the error messages to stdout instead of stderr.
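As an illustration of the stdout versus stderr point, the sketch below attaches a `logging.StreamHandler(sys.stderr)` to a logger and checks where the message ends up. It assumes a hypothetical logger name (`demo.stderr`) and leaves `%(asctime)s` out of the format so the result is deterministic; it is not the configuration used in the PR.

```python
import contextlib
import io
import logging
import sys


def log_error_once(message: str) -> tuple[str, str]:
    """Log one error and return (stdout_text, stderr_text)."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        logger = logging.getLogger("demo.stderr")  # hypothetical logger name
        logger.propagate = False  # do not hand the record to root handlers
        # sys.stderr is resolved here, while the redirect is active
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
        try:
            logger.error(message)
        finally:
            logger.removeHandler(handler)
    return out.getvalue(), err.getvalue()
```

With an explicit stderr handler, nothing is written to stdout and the formatted record appears only on stderr.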

noklam · 2024-07-30T10:07:45Z

kedro/framework/cli/cli.py

+                self._cli_hook_manager.hook.after_command_run(
+                    project_metadata=self._metadata, command_args=args, exit_code=0
+                )
+            sys.exit(exit_code)


Is this sys.exit necessary? I think there is a high chance it will cause problems for anyone using the Kedro CLI, as sys.exit would terminate the program eagerly.

#4026 is trying to change click to standalone_mode=False to make CLI

https://stackoverflow.com/questions/60780051/python-click-command-exit-flow

This issue documents the previous investigation done by @antonymilne and explains it in more depth.

I used it here to ensure that it exits with the appropriate error code, because it was exiting with 0 even when an error happened. It's not necessary to exit at this point, as long as it exits with the correct code when it does.

What I did for now is move the sys.exit into the block that handles the exception, in case something goes wrong. So if no errors happen, click will exit normally.
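A rough sketch of the idea discussed in this thread: catch the `SystemExit`, pass its code to the hook, and exit again with the same code so a failure is not reported as 0. The functions `after_command_run`, `run_cli` and `failing_command` are hypothetical stand-ins using only the standard library; this is not the merged Kedro implementation.

```python
import sys
from typing import Callable

hook_calls: list[int] = []  # exit codes received by the stand-in hook


def after_command_run(exit_code: int) -> None:
    """Hypothetical stand-in for the real after_command_run hook."""
    hook_calls.append(exit_code)


def normalise(code: object) -> int:
    """Map SystemExit.code to an int: None means success, non-int means 1."""
    if code is None:
        return 0
    return code if isinstance(code, int) else 1


def run_cli(command: Callable[[], None]) -> None:
    """Run `command`; on SystemExit, call the hook and exit with the same code."""
    try:
        command()
    except SystemExit as exc:
        code = normalise(exc.code)
        after_command_run(exit_code=code)
        sys.exit(code)  # the only exit point, so the original code is preserved


def failing_command() -> None:
    sys.exit(2)  # mimics a command that ends with a non-zero exit code
```

Keeping `sys.exit` inside the exception handler means the normal path is left alone, and the process only exits early when the command itself requested it.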

DimedS · 2024-07-30T13:23:04Z

kedro/framework/cli/cli.py

+            )
+            hook_called = True
+        finally:
+            if not hook_called:


Thank you for the PR, @lrcouto! Could you please explain how this is possible? It looks like hook_called will be set to True in the try block or, if it fails, in the except block, so this line will never be executed?

I guess that's in case of no exception

Yes, this finally block is there to solve the issue where the hook was called twice when there was no exception.
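The flag pattern discussed here can be sketched as follows. This is a simplified illustration, assuming a hypothetical `run_once_hook` function that records hook calls in a list; the exact placement of `hook_called = True` in the PR may differ, and the exit is swallowed so the example can be run inline.

```python
from typing import Callable, List


def run_once_hook(command: Callable[[], None], calls: List[int]) -> None:
    """Run `command` and record exactly one hook call with its exit code."""
    hook_called = False
    try:
        command()
    except SystemExit as exc:
        # The command asked to exit: report its code through the hook.
        code = 0 if exc.code is None else exc.code
        calls.append(code if isinstance(code, int) else 1)
        hook_called = True
    except Exception:
        # Any other failure is reported as exit code 1, then re-raised.
        calls.append(1)
        hook_called = True
        raise
    finally:
        # Reached on every path; the flag prevents a second hook call
        # when one of the except branches has already fired it.
        if not hook_called:
            calls.append(0)
```

Without the flag, the `finally` block would fire the hook a second time on the exception paths; with it, each run produces a single call.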

lrcouto · 2024-07-31T15:01:46Z

Thanks for the reviews! I'm doing some tests here to try to address all of the comments.

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

ElenaKhaustova

Thank you, @lrcouto, LGTM!

DimedS

Thank you, @lrcouto! LGTM. I left one question about the logging settings.

DimedS · 2024-08-01T10:27:13Z

kedro/framework/cli/cli.py

@@ -38,6 +39,9 @@
 v{version}
 """

+logger = logging.getLogger(__name__)
+logger.addHandler(logging.StreamHandler(sys.stderr))


With that setting, will every logging message be sent to stderr? If the INFO logging level is configured in Kedro, will INFO messages also go to stderr, or are they not connected to Kedro's logging settings?

It should affect only the messages generated from this instance of the logger, which is just the one error message on the main function.
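A small sketch of the scoping behaviour described in this reply: a handler added to one named logger receives records from that logger only, not from unrelated loggers. The logger names are hypothetical and an in-memory stream stands in for `sys.stderr`. Note that records from child loggers of the same name would still reach the handler through propagation.

```python
import io
import logging


def handler_scope_demo() -> str:
    """Attach a handler to one named logger and log from two loggers."""
    captured = io.StringIO()  # stands in for sys.stderr in this sketch
    cli_logger = logging.getLogger("demo.framework.cli")  # hypothetical name
    other_logger = logging.getLogger("demo.pipeline")     # hypothetical name
    handler = logging.StreamHandler(captured)
    cli_logger.addHandler(handler)
    try:
        cli_logger.error("from cli")         # reaches the handler
        other_logger.error("from pipeline")  # unrelated logger, not captured
    finally:
        cli_logger.removeHandler(handler)
    return captured.getvalue()
```

Only the message logged through the logger that owns the handler ends up in the captured stream.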

lrcouto · 2024-08-01T14:55:13Z

> Thanks for the PR. Could you add some notes to the PR? I think you did some testing before, but I am particularly interested in whether we will continue to receive telemetry for long-running commands like these once we move to after_command_run:
>
> * `kedro ipython`
> * `kedro jupyter`
>
> The actual changes that make telemetry send the event after the command are in kedro-org/kedro-plugins#766, so the description of this PR needs to reflect the changes of this PR. I know there was some discussion offline; it would be beneficial to keep a summary here for future reference and to give more context.

Re: commands like kedro ipython or kedro jupyter, it will send info to Heap when you reload inside the shell and after you exit it:

Exited with ctrl+D here.

ankatiyar

tested with a few commands, LGTM!

Tentative fix for the hook issue

1e67775

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

lrcouto requested a review from ankatiyar July 16, 2024 20:38

ankatiyar reviewed Jul 17, 2024

lrcouto and others added 5 commits July 17, 2024 12:24

Merge branch 'main' into telemetry-improve-cli-performance

9fca028

Merge branch 'main' into telemetry-improve-cli-performance

29b627f

Merge branch 'main' into telemetry-improve-cli-performance

8e9e5d4

Add variable to track if hook was called already

22977d5

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Merge branch 'main' into telemetry-improve-cli-performance

3142ef8

ankatiyar mentioned this pull request Jul 19, 2024

kedro-telemetry: Improve performance by switching to after_command_run kedro-org/kedro-plugins#709

Closed

lrcouto and others added 5 commits July 19, 2024 17:23

Properly set exit code when there is an exception

6ee73b8

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Merge branch 'main' into telemetry-improve-cli-performance

a97a012

Add test coverage for exception on after_command_hook

45013cf

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Merge branch 'telemetry-improve-cli-performance' of github.com:kedro-…

bbab82c

…org/kedro into telemetry-improve-cli-performance

Add test for the finally block

66136bd

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

lrcouto marked this pull request as ready for review July 25, 2024 06:23

lrcouto requested a review from merelcht as a code owner July 25, 2024 06:23

noklam self-requested a review July 25, 2024 15:54

lrcouto added 2 commits July 26, 2024 11:03

Merge branch 'main' into telemetry-improve-cli-performance

f5fe4cc

Merge branch 'main' into telemetry-improve-cli-performance

67dd0ff

lrcouto requested review from DimedS and ElenaKhaustova July 29, 2024 21:59

noklam reviewed Jul 30, 2024

DimedS reviewed Jul 30, 2024

lrcouto and others added 6 commits July 31, 2024 14:40

Merge branch 'main' into telemetry-improve-cli-performance

184c35d

Remove redundant logger configuration on cli.py

7b592fe

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Add minimal required logger config to send messages to stderr

c5c22eb

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Call sys.exit only once

155a30d

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Lint

de64ef7

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Remove comment

f6f657b

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

lrcouto added 2 commits August 1, 2024 00:38

Move exit into exception block

b66c0fd

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Change test

43054c5

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

ElenaKhaustova approved these changes Aug 1, 2024

DimedS approved these changes Aug 1, 2024

Merge branch 'main' into telemetry-improve-cli-performance

a31f0d4

ankatiyar approved these changes Aug 1, 2024

lrcouto merged commit 78594be into main Aug 1, 2024
34 checks passed

lrcouto deleted the telemetry-improve-cli-performance branch August 1, 2024 15:20

ankatiyar restored the telemetry-improve-cli-performance branch August 13, 2024 15:59

noklam deleted the telemetry-improve-cli-performance branch November 12, 2024 10:39
