-
Notifications
You must be signed in to change notification settings - Fork 906
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend path-like keys in the catalog #1934
Comments
Hi @PetitLepton, we haven't heard this request before, but we'd be happy to accept a contribution for it 🙂 |
Coming back to this. The offending code is kedro/kedro/framework/context/context.py Lines 91 to 92 in e8f1bfd
We spoke about this at length when debating #2965 and related "Kedro as a library" issues. The current hardcoded list is less than ideal, but this "magic" behavior is deeply ingrained in Kedro at the moment, and the only realistic way to move past it is to introduce variables in the filepaths so that the user is in control: whatever:
type: my_project.extras.datasets.templated_sql_dataset.TemplatedSQLQueryDataSet
filepath: ${kedro:project_root}/data/01_raw/template.sql I think the proposed solution, namely
Goes in the opposite direction of expanding this behavior. If you want to avoid mangling there's a workaround which is not using This probably deserves some more discussion. |
Hi @PetitLepton, @astrojuanlu - I believe #3561 resolves this issue and gets rid of the hardcoding. We now use relative paths across the board, does the example above still have issues? |
I think this issue is still there: kedro/kedro/framework/context/context.py Lines 93 to 94 in f54c6fb
Recently I saw a user doing lots of
so it's a pattern that our users are putting in the wild now. Instead of "let's extend or make configurable the keys that receive a special treatment", I think we should find ways to support something like this: whatever:
type: my_project.extras.datasets.templated_sql_dataset.TemplatedSQLQueryDataSet
filepath: ${kedro:project_root}/data/01_raw/template.sql which would also solve @PetitLepton original request in #1934 (comment). This essentially goes back to option 3 in our original debate #2924 (comment), imitating what Intake does https://intake.readthedocs.io/en/latest/catalog.html#yaml-format, and @yetudada's question here #2819 (comment) which we addressed by adding more docs #3247. |
Not much has changed since my last comment. I'm personally not happy about this bit of magic that continues to bite users but no solutions have been proposed. About this particular proposal of expanding the path-like keys, I'm turning this into a Discussion to signal it's an enhancement proposal. |
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Description
In the current implementation of the parsing of the context, the function
_convert_paths_to_absolute_posix
— which converts relative to absolute paths — restrict the conversion to three types of keys, namelyThis is very restrictive and can lead to unexpected issues while implementing custom data set as explained in the example below.
Context
Let's imagine that I want to extend the
SQLQueryDataSet
by using a Jinja template and feed it with parameters. I would have done something like the followingWhile this works when running a pipeline or using
kedro ipython
, it potentially breaks when used in a notebook undernotebooks
with a catalog entry likebecause the relative paths are not properly parsed when loading the context.
One can solve this particular example by replacing the two paths by
filepath
andpath
for example which belong to the list of parsed keys mentioned at the top of this issue. In terms of user experience, it is nevertheless pretty unintuitive. Another possibility is to have nested elements like"template": {"filepath": ...}
.Possible Implementation
Could we use something like a regex containing
path
instead of a rigid list?Thanks in advance!
The text was updated successfully, but these errors were encountered: