Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to raise errors during find_pipelines #3823

Merged
merged 8 commits into from
May 22, 2024
Merged

Conversation

deepyaman
Copy link
Member

@deepyaman deepyaman commented Apr 18, 2024

Description

Resolves #2910

Development notes

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

@deepyaman deepyaman force-pushed the feat/raise-errors branch 2 times, most recently from e08d464 to a2880db Compare April 18, 2024 15:15
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
@deepyaman deepyaman marked this pull request as ready for review April 18, 2024 21:50
@deepyaman deepyaman requested a review from merelcht as a code owner April 18, 2024 21:50
@deepyaman deepyaman self-assigned this Apr 18, 2024
@deepyaman deepyaman requested a review from noklam April 18, 2024 21:50
Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation wise looks fine. I think we should update the relevant documentation as well. I was wondering should we also update the starter to add the default to make this more discoverable.

pipelines = find_pipelines(raise_errors=False)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
@deepyaman
Copy link
Member Author

Implementation wise looks fine. I think we should update the relevant documentation as well. I was wondering should we also update the starter to add the default to make this more discoverable.

pipelines = find_pipelines(raise_errors=False)

Good point! See 71e0f64 for documentation added in response.

@deepyaman deepyaman enabled auto-merge (squash) April 19, 2024 17:28
@merelcht merelcht requested a review from ElenaKhaustova May 13, 2024 13:21
Copy link
Contributor

@ElenaKhaustova ElenaKhaustova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good to me!

My only question is whether we should keep the default value raise_errors=False rather than raise_errors=True. If I understood correctly, the feature was initially designed for developers, so any half-developed pipeline won't prevent another pipeline from running. However, general users find debugging hard if an issue occurs and the pipeline is still running. So we may consider making raise_errors=True the default behaviour, as the chance that developers are aware of this feature and the flag is higher than that of general users.

@deepyaman
Copy link
Member Author

My only question is whether we should keep the default value raise_errors=False rather than raise_errors=True. If I understood correctly, the feature was initially designed for developers, so any half-developed pipeline won't prevent another pipeline from running. However, general users find debugging hard if an issue occurs and the pipeline is still running. So we may consider making raise_errors=True the default behaviour, as the chance that developers are aware of this feature and the flag is higher than that of general users.

I'm OK with that, if that's the prevailing user feedback. @astrojuanlu @merelcht any opinion?

Also, I don't personally feel that behavior change is "breaking" (it's pretty easy to update your registry, and hopefully you're not running with a broken pipeline already in production), but would be happy to get others' view.

@merelcht
Copy link
Member

My only question is whether we should keep the default value raise_errors=False rather than raise_errors=True. If I understood correctly, the feature was initially designed for developers, so any half-developed pipeline won't prevent another pipeline from running. However, general users find debugging hard if an issue occurs and the pipeline is still running. So we may consider making raise_errors=True the default behaviour, as the chance that developers are aware of this feature and the flag is higher than that of general users.

I'm OK with that, if that's the prevailing user feedback. @astrojuanlu @merelcht any opinion?

Also, I don't personally feel that behavior change is "breaking" (it's pretty easy to update your registry, and hopefully you're not running with a broken pipeline already in production), but would be happy to get others' view.

I think it's a good call to make the default raise_errors=True. We have indeed heard that users struggle with debugging, so raising errors explicitly by default will hopefully help.

@astrojuanlu
Copy link
Member

Without a full understanding of #2401, I'd be wary of setting raise_errors=True on a micro release. We want to take a broader look at what kind of errors people find while debugging.

My recommendation would be to introduce the option but not change the behavior, so that we can take a more comprehensive look soon. That's also the safer option anyway in terms of backwards compatibility.

Copy link
Contributor

@ElenaKhaustova ElenaKhaustova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving the PR not to block it, since the implementation looks good to me.

Agree to decide on making the raise_errors=True later on.

@noklam noklam self-requested a review May 21, 2024 20:53
Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, a bit late to the discussion. I agree we should not change the behavior in a micro release.

In principle I like a default=True, but with concern about discoverability and the ergonomics of this. I imagine this is like a switch that you may need to turn on & off from time to time, going into pipeline_registry.py to do so feel inconsistent with other CLI options etc.

merelcht added 2 commits May 22, 2024 09:16
Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
@deepyaman deepyaman merged commit 56f2095 into main May 22, 2024
41 checks passed
@deepyaman deepyaman deleted the feat/raise-errors branch May 22, 2024 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide option to raise error explicitly for find_pipelines
5 participants