Expose Druid functions via INFORMATION_SCHEMA.ROUTINES table. #14346
Comments
Thank you for the proposal, Abhishek. What does a deterministic routine mean? How is the IS_DETERMINISTIC field supposed to be used? |
@abhishekagarwal87, AFAIK, in Druid, all the functions documented here are deterministic by default. Found an old PR that attempted to add One usecase for |
Could you link the documentation that you are basing this on? I tried googling for "INFORMATION_SCHEMA.ROUTINES" and I keep finding docs from different databases that have more or fewer columns in the report. Which docs are you following? |
@vogievetsky, the proposal is based on the SQL 1999 specification - http://web.cecs.pdx.edu/~len/sql1999.pdf. Specifically for
The column definitions are sort of scattered throughout the spec. Then I was also looking at a few vendor implementations. As a side note, the SQL 1999 spec notes several columns that may not apply to Druid. For example,
I think 2 makes sense, but let me know what your thoughts are. |
Thank you for the link |
As a follow-up to the proposed design implementation, we can add a new column,
So with those goals in mind, I think an implementation that will work:
/**
 * @return an optional description about the Sql aggregator.
 */
@Nullable
default String functionDescription() {
  return null;
}
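A minimal sketch of how a default method like the one above could behave, assuming a hypothetical interface `SqlFunctionInfo` and two made-up implementations (none of these names come from Druid). One implementation overrides the description; the other inherits the `null` default, which callers then have to handle. The `@Nullable` annotation is left out so the sketch has no external dependencies.

```java
import java.util.Optional;

class FunctionDescriptionSketch {
    /** Hypothetical stand-in for an interface implemented by SQL functions. */
    interface SqlFunctionInfo {
        String name();

        /** Returns an optional description, or null when none is provided. */
        default String functionDescription() {
            return null;
        }
    }

    /** Made-up function that supplies its own description. */
    static final class Described implements SqlFunctionInfo {
        public String name() {
            return "EXAMPLE_DESCRIBED_FN";
        }

        @Override
        public String functionDescription() {
            return "Illustrative description text.";
        }
    }

    /** Made-up function that relies on the null default. */
    static final class NoDescription implements SqlFunctionInfo {
        public String name() {
            return "EXAMPLE_UNDESCRIBED_FN";
        }
    }

    /** Callers must tolerate null, so fall back to a placeholder string. */
    static String descriptionOrDefault(SqlFunctionInfo fn) {
        return Optional.ofNullable(fn.functionDescription()).orElse("(no description)");
    }

    public static void main(String[] args) {
        System.out.println(descriptionOrDefault(new Described()));
        System.out.println(descriptionOrDefault(new NoDescription()));
    }
}
```

Because the method has a default, existing implementations would keep compiling unchanged and could add descriptions incrementally.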
Applications: A few high-level use cases that can use the description (the details are not fully fleshed out):
Comments or feedback is appreciated! |
@abhishekrb19 - the use case behind the column The |
@abhishekagarwal87, thank you for the comments. Yes, removing As far as |
This is very useful for client-side tools. The web console can also benefit from this change to implement simpler autocompletion suggestions. I'm wondering for |
I like this proposal. But there's one thing: what's the relationship between the description in the code and the function description in the current markdown file? I don't think developers want to write these descriptions twice in two different places. So, maybe one way is that, during the mvn source generation phase, we can extract the description from the markdown files and generate some string constants that can be referenced from the default implementation of |
@FrankChen021, please see the updated description -- the |
Good suggestion! I'll think about it more after the main implementation is in place. Also, @vtlim had some thoughts on this. |
Description
Expose Druid functions and operators programmatically via the Druid SQL interface.
Motivation:
Design:
Add a new table INFORMATION_SCHEMA.ROUTINES that exposes SQL functions and operators. The INFORMATION_SCHEMA.ROUTINES table will include the following columns:
druid
INFORMATION_SCHEMA
APPROX_COUNT_DISTINCT_DS_THETA
FUNCTION
YES for aggregator functions; NO for scalar functions.
Note that Druid-specific columns such as IS_AGGREGATOR and SIGNATURES are also included besides the standard set (columns 1 - 5).
Example usage:
To see information about all the aggregator functions, including ones loaded from extensions, run the following SQL query:
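One way such a query might look, as a sketch: it assumes the IS_AGGREGATOR column holds `YES` or `NO` as described in the design, and wraps the SQL in the JSON body shape (`{"query": "..."}`) that Druid's SQL HTTP endpoint (`POST /druid/v2/sql`) accepts. The class and method names are illustrative only, and nothing is sent over the network here.

```java
class RoutinesQuerySketch {
    /** SQL listing aggregator functions; assumes IS_AGGREGATOR is 'YES' or 'NO'. */
    static String aggregatorRoutinesQuery() {
        return "SELECT * FROM INFORMATION_SCHEMA.ROUTINES WHERE IS_AGGREGATOR = 'YES'";
    }

    /** Wraps a SQL string in a JSON request body, escaping backslashes and double quotes. */
    static String toJsonBody(String sql) {
        String escaped = sql.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"query\":\"" + escaped + "\"}";
    }

    public static void main(String[] args) {
        // Print the request body that a client could POST to the SQL endpoint.
        System.out.println(toJsonBody(aggregatorRoutinesQuery()));
    }
}
```

Since extension functions are registered in the same operator table, a query like this would be expected to return them alongside the built-in aggregators.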
Other alternatives:
Custom sys tables were considered. However, INFORMATION_SCHEMA.ROUTINES is the SQL standard to expose stored procedures, routines, and built-in types. The proposal is based on the SQL-1999 specification - see relevant sections 20.45, 4.24, 20.69.
Implementation sketch:
Calcite knows about the registered operators, including the ones registered in extensions, so extracting this information from Calcite would be the best way. We can therefore use the DruidOperatorTable, which implements Calcite's operator table interface. The DruidOperatorTable is aware of all the registered operators at runtime, so we can wire it into the new INFORMATION_SCHEMA table.
Future work:
In addition to the above columns, I think we could also add more columns over time, including:
Getting this information may involve adding more functionality to DruidOperatorTable.
Note that this proposal is only for the Druid functions and operators. Exposing data types would be a separate proposal on its own.
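The implementation sketch above (reading registered operators from the operator table and exposing them as rows) can be illustrated roughly as follows. This is a sketch under stated assumptions: a plain map stands in for DruidOperatorTable, `EXAMPLE_SCALAR_FN` is a made-up function name, and no Calcite or Druid API is used. Each row carries the values named in the design: `druid`, `INFORMATION_SCHEMA`, the function name, `FUNCTION`, and `YES` or `NO` for the aggregator flag.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RoutinesRowsSketch {
    /** Hypothetical stand-in for an operator table: function name -> is it an aggregator? */
    static Map<String, Boolean> registeredOperators() {
        Map<String, Boolean> ops = new LinkedHashMap<>();
        ops.put("APPROX_COUNT_DISTINCT_DS_THETA", true);
        ops.put("EXAMPLE_SCALAR_FN", false); // made-up name for illustration
        return ops;
    }

    /** Turns each registered operator into one row of the routines table. */
    static List<List<String>> toRoutineRows(Map<String, Boolean> operators) {
        List<List<String>> rows = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : operators.entrySet()) {
            rows.add(List.of(
                "druid",
                "INFORMATION_SCHEMA",
                e.getKey(),
                "FUNCTION",
                e.getValue() ? "YES" : "NO"));
        }
        return rows;
    }

    public static void main(String[] args) {
        for (List<String> row : toRoutineRows(registeredOperators())) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Building rows from the registry at query time, rather than from a static list, is what would let functions loaded from extensions appear without extra bookkeeping.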