Improve DataCatalog and ConfigLoader with autocompletion and a meaningful representation when they are printed #1721
Comments
LOVE this idea. I've added it to the interactive workflow milestone which I still need to populate a bit more since I've had related thoughts that I haven't had a chance to write down properly yet. Just a few notes for now:
|
I agree it becomes tricky once
It may be worth discussing what should |
I just came across this page which is well worth reading to get some more ideas:
|
you can also use |
I am working on #2676 and debugging it is hard. I want to repurpose this issue so that it is not Jupyter-focused. If we want users to use this as a component, it needs a nicer public API and a nicer string representation. Right now it is hard to find out what is available without going through the source code. In
|
Yeah the internal representation of namespaced datasets with a double underscore is really annoying - I hit this when doing an IDE prototype (#2821). It would be great if the catalog had a presentation-layer representation available in the public API |
The cleanup will go into the Redesign Catalog and Datasets milestone. But before this happens we can still make the public interface nicer, and maybe deprecate or unify the rest later. Good shout about namespaces, I haven't tried that. |
One more: when I worked with CST, they often had a large catalog. A few things that I think are useful
I was wondering whether we could make this better. Can we have Reference: |
I did some more research and I couldn't find any auto-completion feature for |
found a nicer solution to use I think the full feature needs to be designed properly, but some of the non-breaking parts, like printing, would already be useful, and we can change them afterwards. I would like to do this sooner rather than later. |
You can also do a Rich-specific repr: https://rich.readthedocs.io/en/stable/pretty.html#typing |
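For readers unfamiliar with the Rich repr protocol linked above, here is a minimal sketch of what it could look like on a dataset-like object. `SketchDataset` and its fields are hypothetical stand-ins, not Kedro classes; only the `__rich_repr__` hook itself comes from Rich. The snippet does not import Rich, so it runs without it.

```python
class SketchDataset:
    """Hypothetical stand-in for a dataset; not a Kedro class."""

    def __init__(self, filepath, load_args=None):
        self.filepath = filepath
        self.load_args = load_args or {}

    def __rich_repr__(self):
        # A (name, value) tuple is rendered by Rich as name=value.
        yield "filepath", self.filepath
        # A (name, value, default) tuple is omitted by Rich when value == default.
        yield "load_args", self.load_args, {}


# Inspect what the hook yields without needing Rich installed.
fields = list(SketchDataset("temp.csv", {"sep": ";"}).__rich_repr__())
print(fields)
```

If Rich is installed, passing such an object to `rich.pretty.pprint` should pick up this hook; plain `repr()` is unaffected unless `__repr__` is also defined.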
Hello, My two cents on the implementation details:
|
I agree with going ahead with |
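I've just started, and here is a fun fact: a __str__ method already exists in

from kedro.extras.datasets.pandas import CSVDataSet

ds = CSVDataSet(
    filepath=r"temp.csv",
    load_args={"sep": ";"},
)
ds          # bare expression in a notebook cell: displayed via __repr__
repr(ds)    # explicit call to __repr__
print(ds)   # print() uses __str__
str(ds)     # explicit call to __str__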
This tutorial claims that:
... which is what does our |
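As background for the snippet above, here is a minimal sketch of how Python chooses between `__str__` and `__repr__`. The two classes are hypothetical and exist only to show the fallback rules; they say nothing about how Kedro datasets are actually implemented.

```python
class OnlyStr:
    """Defines __str__ only, so repr() falls back to object.__repr__."""

    def __str__(self):
        return "OnlyStr(filepath=temp.csv)"


class OnlyRepr:
    """Defines __repr__ only, so str() falls back to __repr__."""

    def __repr__(self):
        return "OnlyRepr(filepath=temp.csv)"


only_str = OnlyStr()
only_repr = OnlyRepr()

print(str(only_str))    # uses the custom __str__
print(repr(only_str))   # default repr: class name plus memory address
print(str(only_repr))   # falls back to the custom __repr__
print(repr(only_repr))  # uses the custom __repr__
```

The asymmetry matters for notebooks: evaluating a bare variable shows the repr, so an object that only defines `__str__` still displays as a memory address there.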
I confirm from the private repository (before open sourcing) that the Maybe we can just make |
Re-reading this:
Would autocompletion of dataset names as strings be possible? See a similar thing for pandas DataFrame columns: The problem with dynamic properties is that some dataset names that are valid in YAML would be illegal as Python attribute names (the same problem as with pandas columns), and they would also pollute the namespace of the |
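To make the concern about illegal names concrete, here is a small sketch that checks which names could be used with dot access. The dataset names are made-up examples, and `usable_as_attribute` is a hypothetical helper, not part of Kedro.

```python
import keyword


def usable_as_attribute(name: str) -> bool:
    """True if `obj.<name>` would be valid Python syntax."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Made-up dataset names; all of them are valid YAML keys.
names = ["cars", "model_input", "raw.cars", "cars@pandas", "2021_sales", "class"]
results = {name: usable_as_attribute(name) for name in names}

for name, ok in results.items():
    print(f"{name!r}: {'attribute access works' if ok else 'string key only'}")
```

Names that fail the check can still be reached through a string key such as `catalog["raw.cars"]`, which is why completion of string keys avoids the problem.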
What is missing from this issue? |
Closed because this was done in #3981 |
Dataset discovery
Running the variable in a notebook cell is the most common way to inspect a variable in Jupyter. Currently catalog gives us a useless memory address. It would be much nicer if it printed what is available, potentially by just wrapping catalog.list() for simplicity.
demo:
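The original demo image is not available here. As a rough sketch of the idea, a `__repr__` that wraps `list()` could look like the following. `ReprCatalog` is a hypothetical stand-in, not Kedro's `DataCatalog`, and the output format is only one possible choice.

```python
class ReprCatalog:
    """Hypothetical stand-in for a data catalog; not Kedro's DataCatalog."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def list(self):
        # Sorted dataset names, standing in for catalog.list().
        return sorted(self._datasets)

    def __repr__(self):
        # Show what is available instead of the default memory address.
        names = ", ".join(repr(name) for name in self.list())
        return f"{type(self).__name__}([{names}])"


catalog = ReprCatalog({"cars": object(), "boats": object()})
print(repr(catalog))
```

Because Jupyter displays the repr of the last expression in a cell, evaluating `catalog` on its own would then show the dataset names.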
Autocompletion
Under the hood, auto-completion works by checking dir(object).
Possible solutions:
datasets, so the user can do catalog. + Tab. It will show the other methods available in catalog too, though. We can probably make datasets show up at the top of the autocompletion and the other methods at the bottom. I think it most likely just has to be a list. Of course, we have to implement __getattr__ so that it actually looks at catalog.datasets too when we do catalog.dataset_name.
demo:
catalog[ + Tab, which shows all the datasets.
Bonus
session and context also have a useless __repr__, but they are not too useful in the interactive workflow, so this has lower priority.
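The two autocompletion options from the Autocompletion section can be sketched together as follows. `CompletingCatalog` is a hypothetical stand-in, not Kedro's `DataCatalog`; `_ipython_key_completions_` is the hook IPython looks for when completing `obj[`, while `__dir__` and `__getattr__` are standard Python. This is one possible shape under those assumptions, not the design that was eventually implemented.

```python
class CompletingCatalog:
    """Hypothetical stand-in for a data catalog; not Kedro's DataCatalog."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def __getitem__(self, name):
        return self._datasets[name]

    def _ipython_key_completions_(self):
        # IPython calls this to offer completions for catalog[ + Tab.
        return list(self._datasets)

    def __dir__(self):
        # Expose dataset names that are valid identifiers for catalog. + Tab.
        # Note: dir() sorts the result, so ordering is up to the completer.
        valid = [name for name in self._datasets if name.isidentifier()]
        return valid + list(super().__dir__())

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        # Going through __dict__ avoids recursion if _datasets is not set yet.
        try:
            return self.__dict__["_datasets"][name]
        except KeyError:
            raise AttributeError(name) from None


catalog = CompletingCatalog({"cars": 1, "raw.cars": 2})
print(catalog.cars)                         # attribute-style access
print(catalog["raw.cars"])                  # key access works for any name
print(catalog._ipython_key_completions_())  # names offered after catalog[
```

Key completion covers every dataset name, whereas attribute completion only covers names that are valid identifiers and mixes them with the catalog's own methods.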