Allow extra hooks to be passed via the Kedro CLI #435
Comments
Hi Deepyaman, thanks for creating this! If I play back what you would like to do, you would like to be able to use different Hooks in Kedro without modifying your code? So Kedro automatically discovers them? |
Hey @yetudada! Kedro doesn't need to automatically discover them. For example: kedro run --hooks src.hookshot.hooks.TeePlugin or kedro run --hooks src.hookshot.hooks.TeePlugin,src.hookshot.hooks.CachePlugin If the project context already had |
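The opt-in behaviour proposed above could be sketched roughly as follows. This is an illustration only, not Kedro's implementation: `load_hooks` is a hypothetical helper, and `--hooks` is the option being requested in this issue rather than an existing flag. The sketch splits a comma-separated list of dotted paths, imports each class, and instantiates it.

```python
import importlib


def load_hooks(spec):
    """Turn 'pkg.mod.ClassA,pkg.mod.ClassB' into a list of hook instances.

    Hypothetical helper: the name and behaviour are assumptions made for
    illustration and are not part of Kedro.
    """
    hooks = []
    for dotted_path in spec.split(","):
        dotted_path = dotted_path.strip()
        if not dotted_path:
            continue  # tolerate empty entries such as a trailing comma
        module_name, _, class_name = dotted_path.rpartition(".")
        if not module_name:
            raise ValueError(f"Expected 'module.ClassName', got {dotted_path!r}")
        module = importlib.import_module(module_name)
        hook_class = getattr(module, class_name)
        hooks.append(hook_class())  # assumes a no-argument constructor
    return hooks
```

With the paths from the comment above, `load_hooks("src.hookshot.hooks.TeePlugin,src.hookshot.hooks.CachePlugin")` would return one instance of each class, assuming both are importable. The resulting list could then be appended to whatever hooks the project already declares.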
👏 Well said. It would be great to be able to run ad-hoc/one-off hooks on the fly without a code change. For now, as an alternative, you could use an env variable to toggle what the hook does. This requires you to read the env variable inside your hook and to register the hook on the pipeline.
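A minimal sketch of the environment-variable workaround, under stated assumptions: the class name `DebugHooks` and the variable `KEDRO_DEBUG_HOOK` are invented for illustration. In a real project the method would be decorated with Kedro's `hook_impl` and the class registered with the project; the decorator is left out here so the snippet runs without Kedro installed.

```python
import os


class DebugHooks:
    """Hypothetical hook that only acts when an environment variable is set."""

    ENV_VAR = "KEDRO_DEBUG_HOOK"  # assumed name, for illustration only

    def __init__(self):
        self.seen = []

    @property
    def enabled(self):
        # Read at call time so the toggle can differ between runs
        return os.environ.get(self.ENV_VAR, "").lower() in {"1", "true", "yes"}

    # In a real project this method would carry Kedro's @hook_impl decorator.
    def before_node_run(self, node):
        if not self.enabled:
            return  # hook stays registered but does nothing
        self.seen.append(node)
```

The hook is always registered in code; only its behaviour is switched, for example by running `KEDRO_DEBUG_HOOK=1 kedro run`.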
@deepyaman your timing is uncanny. We are building this as we speak 😂 However, we are going with an opt-out model, hence @yetudada's question. Basically, after user installs a plugin, we will automatically register the plugin's hooks through a dedicated I'm curious what you and @WaylonWalker think about these two models. Do you think an opt-out (auto-discovery) model or an opt-in model (your proposal) would work better? |
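For comparison, an opt-out (auto-discovery) model could look roughly like the sketch below, which relies on Python packaging entry points. The group name `example.hooks` and the `disabled` parameter are assumptions; the comment above does not say which group or opt-out mechanism Kedro will use.

```python
from importlib import metadata

HOOKS_GROUP = "example.hooks"  # hypothetical group name, not Kedro's


def installed_entry_points(group):
    """Return entry points registered under `group` by installed packages."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        return list(eps.select(group=group))
    return list(eps.get(group, []))  # Python 3.8 / 3.9


def auto_register(entry_points, disabled=()):
    """Load every discovered hook except those the user opted out of."""
    hooks = {}
    for ep in entry_points:
        if ep.name in disabled:
            continue  # opt-out: skip hooks the user has switched off
        hooks[ep.name] = ep.load()
    return hooks
```

Calling `auto_register(installed_entry_points(HOOKS_GROUP), disabled={"some_plugin"})` would load everything that is installed except the named plugin, which is the reverse of listing the hooks you want explicitly.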
Off the top of my head, a potential issue I foresee with the auto-discovery model is users who install a bunch of plugin hooks for testing, or—shudder—use a single base conda env with Kedro installed for all their projects. This isn't a big issue for commands, because it's just your I think opt-in is also easier for my use case, where I want to benchmark raw pipeline vs pipeline + plugin1 vs pipeline + plugin 2, and I like the explicitness, but these are probably smaller concerns from a dev perspective. |
Hi @deepyaman thank you for sharing, these are some really interesting questions and comments. I'm adding some of my thoughts, let me know what you make of them. It feels like the opt-in / opt-out choice really just depends on how easy it is to opt out. Indeed uninstalling stuff would be a hassle. I saw There’s a model in my head that makes me quite reticent about feeding hooks through the CLI, where it’s not a developer that messes with the job spec, so the less technical it is, or the less code knowledge you need to understand it, the better. The person on support doesn’t need to understand what hooks to activate, they can just switch to a different environment and job done, if the env is there. If not, a developer should create it, which feels intentional. |
I really do like the simplicity of installing pytest plugins and having them just work. For the most part, though, the auto-discovered plugins do not change my test. Typically those only change things outside the scope of a single test (I'm thinking about pytest-watch, parallel runner, coverage, sugar). Things that actually change the execution of my test generally require me to add them in the code, by function call or fixture. I feel that Kedro plugins can so easily fall into the second category and change the output of your pipeline that they should not be auto-discovered. As much as we --shudder-- about users with a single base environment, it's a common workflow. Much of the learning content focuses so heavily on the details of manipulating DataFrames that most new data scientists are missing out on how to manage their local dev environment and where to place their solution in the codebase. I think @lorenabalan brings up a good point about support. If I am running support on an issue, I want to quickly get some information about it before making any big code changes. It would be nice to be able to quickly add a retry-on-fail or full-failure-report hook without a change to the prod environment. Whether this just sits in your team's template (hooks list), turned off by default, or the framework allows it to be enabled with a flag isn't a big difference to me. Both require the forethought that we may want some extra information from PROD at some point. Both solutions require me to reach into PROD to install the hook, so making the code change to add it to the list is not a big deal. Note: my PROD env is using kedro-docker; other folks with a different prod solution may have a different opinion.
Agree that this is of paramount importance.
Would be happy with this. It's the current way of specifying hooks plus an alternative solution to this issue, which is to specify the same (almost same, as it seems to overwrite rather than append) list as an environment variable.
Eeeh. I don't think I like only depending on this because Kedro only supports one level of inheritance in conf. If I already have a
From a support perspective, this makes sense. The current implementation only allows one set of hooks to be defined across envs, and your aforementioned suggestion to have hooks defined on a per-env basis makes this more flexible. However, from a dev perspective, I want more control. Creating a new env mirroring a non- Can we support both? It's not like the |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Auto-registration for plugin hooks is one thing, but imagine there's a pipeline with some |
@foxale I just happened to see the email notification for this, but do you either want to reopen this or create a more targeted issue for your use case (and link back here, if necessary)? |
Description
Is your feature request related to a problem? A clear and concise description of what the problem is: "I'm always frustrated when ..."
I should be able to run with or without a set of optional hooks. For example:
or
Context
Why is this change important to you? How would you use it? How can it benefit other users?
Not all hooks are core to functionality. Using something like a debugging hook shouldn't require a code change.
This is also helpful for testing different hooks as part of a CI/CD process.
Possible Implementation
See deepyaman/kedro-accelerator@0f792f9#diff-ebf803a458716ccda133fadc42e45057 (would add it to KedroContext though).