-
Notifications
You must be signed in to change notification settings - Fork 601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ddl): distinguish views from tables while listing them #8864
feat(ddl): distinguish views from tables while listing them #8864
Conversation
10ee2e2
to
012aa23
Compare
ibis/backends/clickhouse/__init__.py
Outdated
self, | ||
like: str | None = None, | ||
database: str | None = None, | ||
no_views: bool = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally in this case I prefer the following pattern:
no_views: bool = False, | |
include_views: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, makes sense, I will switch to include_views
.
Before we go down this path, let's consider whether the general approach is going to work for all the varieties of table like things one may want. There are tables, views, materialized views, streams, and probably others. At some point someone will ask to be able to list only table like objects of type X. I think adding a flag for each of those to Instead, how about we have a We could also not implement the latter until someone asks for it. |
Thanks @cpcloud , I also think that exposing I will proceed with this approach then. |
I'm not opposed to explicit The one that immediately comes to mind is
|
As another example to this, when |
This isn't true in general, just for the Flink backend. All other backends will create physical persistent storage when |
IMO this isn't a huge loss, but would require some documentation in Another option would be to materialize the table, but I don't know if we want to do that. Repeated use of large CSVs will be faster, but the initial function call will scale with the CSV file rather than the number of sample lines like it does now. |
Another API option would be to have an con.list_tables() # defaults to listing all table-like things within the current hierarchy
con.list_tables(include=["table"]) # only list tables
con.list_tables(include=["table", "view"]) # only list tables and views I like this better because:
|
I'm not a big fan of APIs that encode flags with dynamic values like strings when there's an API to encode that information in a statically knowable and searchable way like a method name. Strings-as-flags are also not easily tab-completable, and if you want encode what values are supported (and perhaps show that somewhere) you have to build something for that. Encoding these APIs as methods doesn't have these problems.
I don't think ignoring the flag is any better, if someone asks for
It's true that accessing the list of materialized views (say) is backend-specific, however, that's the goal here: to expose this information to users. I proposed a solution for this already ( |
What I'm getting at is that for most use cases in ibis the user shouldn't care whether something is a view/table/materialized-view/dataframe/..., they're all treated the same by the expression api (and represented by a When executing an unbound expression, all that matters is that there is a corresponding table-like-thing with a given name and schema for each unbound table - one backend might use a view for Given this, I'd argue that we'd want the best named and most intuitive API to return all table-like-things, rather than just database tables. I'm not a huge fan of |
As a side question:
def create_view(
self,
name: str,
obj: ir.Table,
*,
database: str | None = None,
overwrite: bool = False,
) -> ir.Table:
return self.create_table(
name, obj=obj, temp=None, database=database, overwrite=overwrite
) Given that the notion of |
Agree that |
451a9f7
to
08198f7
Compare
08198f7
to
5a16345
Compare
Thank you all for the discussion. The first pass is ready for review now:
|
@mfatihaktas What's the status of this one, is this blocked or does it need a review? |
Thanks for checking on this. Regarding the status, I marked this PR as ready-for-review some time ago. I am busy with another project this week, but will try to resolve the conflicts before EOD on Friday. |
36bd8ec
to
ea11d93
Compare
ea11d93
to
ec6c4e9
Compare
no_views
arg to list_tables()
Closing in favor of the approach described here: #8382 (comment) |
Description of changes
First step towards addressing #8382.
This
no_views
arg tolist_tables()
for clickhouse, Flink and pyspark backends.Asking for early feedback. If the change is reasonable, will proceed to make the same change for the remaining backends.
Edit: After the discussion on this PR:
list_tables()
andlist_views()
explicit.TODO
comments to decide for those during the reviews.list()
to list tables and views together.