Sql backend implementation #430
Conversation
- The API names for accessing the Druid dimensions now all map correctly to the physical dimension names in Druid.
String sql = writeSql(sqlWriter, relToSql, localBuilder);
return "(" + sql + ")";
})
.collect(Collectors.joining("\nUNION ALL\n"));
Ok, so after looking into it further, it appears this is supported by MySQL/PostgreSQL (not sure what else), but it is not supported by standard SQL, so for example SQLite will fail with this method.
So it looks like the best way to accomplish the ORDER BY and LIMIT functionality on subqueries (these are used by TopN over each bucket, and the results are unioned together) would be to make multiple queries and aggregate the results manually, like @michael-mclawhorn suggested.
I've filed a bug report with Calcite, but I'm not sure how much can be done about this issue.
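For illustration, here is a minimal JDBC sketch of that manual approach; the table and column names are made up and this is not Fili's code. Each time bucket gets its own ORDER BY/LIMIT query, and the rows are concatenated in Java rather than via UNION ALL over ordered subqueries.

```java
// Hypothetical sketch of the manual workaround: run one ORDER BY/LIMIT query per time
// bucket over JDBC and concatenate the results in Java, instead of emitting a single
// statement whose UNION ALL subqueries each carry ORDER BY/LIMIT (accepted by
// MySQL/PostgreSQL, rejected by SQLite and standard SQL). Table/column names are invented.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.List;

public class TopNPerBucketSketch {
    private static final String TOP_N_FOR_BUCKET =
            "SELECT page, SUM(added) AS total "
            + "FROM wikiticker "
            + "WHERE event_time >= ? AND event_time < ? "
            + "GROUP BY page "
            + "ORDER BY total DESC "
            + "LIMIT 5";

    /** Runs the TopN query once per [start, end) bucket and merges the rows in bucket order. */
    public static List<String[]> topNPerBucket(Connection connection, List<Timestamp[]> buckets)
            throws SQLException {
        List<String[]> merged = new ArrayList<>();
        for (Timestamp[] bucket : buckets) {
            try (PreparedStatement statement = connection.prepareStatement(TOP_N_FOR_BUCKET)) {
                statement.setTimestamp(1, bucket[0]);
                statement.setTimestamp(2, bucket[1]);
                try (ResultSet rows = statement.executeQuery()) {
                    while (rows.next()) {
                        merged.add(new String[] {rows.getString("page"), rows.getString("total")});
                    }
                }
            }
        }
        return merged;
    }
}
```

The per-bucket round trips cost more than a single statement, but they stay within what every JDBC-accessible database accepts.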
Fili has initial support for using a SQL database as the backend instead of Druid. Please
note that there are some restrictions since Fili is optimized for Druid.

## Setup
Maybe mention that DruidDimensionLoader should be turned off?
/**
 * Provides a mapping from Druid's {@link Aggregation} to a {@link SqlAggregationBuilder}.
 */
public interface DruidSqlAggregationConverter {
@archolewa I've done something similar to what you mentioned in #413, but tied it specifically to aggregations. Let me know if you have any concerns with this approach.
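For context, here is a minimal sketch of the kind of mapping such a converter performs, using Calcite's standard operator table; the class name, method name, and Optional return are illustrative assumptions, not the actual interface from this PR.

```java
// Hypothetical mapping from Druid aggregation type names to Calcite aggregate functions.
// Only the Calcite operators (SqlStdOperatorTable.SUM/MIN/MAX/COUNT) and the Druid type
// names are real; the class shape is not Fili's code.
import java.util.Optional;
import org.apache.calcite.sql.SqlAggFunction;
import org.apache.calcite.sql.fun.SqlStdOperatorTable;

public class SimpleDruidSqlAggregationConverter {
    public Optional<SqlAggFunction> fromDruidType(String druidAggregationType) {
        switch (druidAggregationType) {
            case "longSum":
            case "doubleSum":
                return Optional.of(SqlStdOperatorTable.SUM);
            case "longMin":
            case "doubleMin":
                return Optional.of(SqlStdOperatorTable.MIN);
            case "longMax":
            case "doubleMax":
                return Optional.of(SqlStdOperatorTable.MAX);
            case "count":
                return Optional.of(SqlStdOperatorTable.COUNT);
            default:
                // Aggregations with no direct SQL equivalent (e.g. sketches) map to nothing.
                return Optional.empty();
        }
    }
}
```

A map-based registry would work just as well; the point is only that each Druid aggregation type resolves to a Calcite SqlAggFunction, or to nothing if unsupported.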
 */
public class FilterEvaluator implements ReflectiveVisitor {
    private RelBuilder builder;
    private final List<String> dimensions;
This should probably be a Set
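For example, a minimal hypothetical version of that field as a Set (the field name mirrors the snippet above, but the class body is illustrative): duplicates are dropped automatically when the same dimension appears in several filter clauses, and membership checks are constant time.

```java
// Hypothetical sketch of the suggested change: collect filtered dimensions into a Set.
import java.util.LinkedHashSet;
import java.util.Set;

public class FilterDimensionCollector {
    // LinkedHashSet keeps insertion order while de-duplicating repeated dimensions.
    private final Set<String> dimensions = new LinkedHashSet<>();

    public void addDimension(String apiName) {
        dimensions.add(apiName);
    }

    public Set<String> getDimensions() {
        return dimensions;
    }
}
```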
}
return sub;
case DIVIDE:
if (arithmeticPostAggregation.getFields().size() != 2) {
This PR needs to be rebased on #413; this was changed there.
 * Small utility class to help with connecting to databases, building, and writing SQL.
 */
public class CalciteHelper {
    public static final String DEFAULT_SCHEMA = "PUBLIC";
I don't think this is true for all databases. Another reason this should be specified in the Table logic and not here.
 * @throws SQLException if reading from the database fails.
 * @throws IllegalStateException if no {@link JDBCType#TIMESTAMP} column could be found.
 */
public static String getTimestampColumn(Connection connection, String schema, String table)
Should also be specified in the Table; repurpose this to check that the time column is actually a TIMESTAMP.
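A minimal sketch of that repurposing, using only standard JDBC metadata; the class and method names and the exception choices are assumptions, not this PR's code. Given a configured time column, it verifies that the column's JDBC type really is TIMESTAMP.

```java
// Hypothetical validator: instead of discovering a timestamp column, confirm that the
// time column named in the table configuration has a TIMESTAMP type.
import java.sql.Connection;
import java.sql.JDBCType;
import java.sql.ResultSet;
import java.sql.SQLException;

public final class TimestampColumnValidator {
    public static void validateTimestampColumn(
            Connection connection,
            String schema,
            String table,
            String timeColumn
    ) throws SQLException {
        try (ResultSet columns = connection.getMetaData()
                .getColumns(null, schema, table, timeColumn)) {
            if (!columns.next()) {
                throw new IllegalStateException("Column " + timeColumn + " not found on " + table);
            }
            int dataType = columns.getInt("DATA_TYPE");
            if (dataType != JDBCType.TIMESTAMP.getVendorTypeNumber()) {
                throw new IllegalStateException("Column " + timeColumn + " is not a TIMESTAMP");
            }
        }
    }
}
```

Failing fast here keeps the table configuration as the single source of truth, with this check only validating it.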
private void initializeSqlBackend(ObjectMapper mapper) {
String dbUrl = SYSTEM_CONFIG.getStringProperty(DATABASE_URL);
String driver = SYSTEM_CONFIG.getStringProperty(DATABASE_DRIVER);
String schema = SYSTEM_CONFIG.getStringProperty(DATABASE_SCHEMA);
remove schema from here as well
private static final String TRUE = "TRUE"
private static final String FALSE = "FALSE"
private static final String FIRST_COMMENT = "added project"
// this is the first result in the database
clarify "this"
Added security module with chaining filters.
- Need to properly distinguish between API and field names
- Before this, everything was assumed to have the same field and API name
- Need to also correct this for metrics, and there should probably be a test for this
- Maybe delete the SqlTimeConverter comments?
- Update and add comments to a lot of code
- Refactoring (SqlConverter -> SqlBackedClient + DefaultSqlBackedClient)
- SqlBackedClient/DefaultSqlBackedClient now only handle the top level operations and shouldn't need to be overridden
- Cleaned up how aggregations are converted from Druid to SQL
- The DruidQuery is now converted into multiple SQL queries which should be processed in order (only affects TopN queries)
- Calcite already handles this to make sure all columns needed are available
- The dimensions were not being mapped to their field names correctly
- The dimensions should be the last position in the order by
- The timegrain to date part functions didn't work correctly for ZonedTimeGrains
- Also added a test to show that timezones are mapped correctly
- Both now have tests showing that the mapping to field names is working
@archolewa @michael-mclawhorn Here's the WIP Part 2 of #413
Here's what I can think of that needs to be done:
- SqlDimensionLoader instead of relying on a backing druid instance to sql data
- SqlRequestHandler once we decide on a schema
- DruidQueryToSqlConverter (AbstractBinderFactory? Custom Filter/Having/PostAggregation evaluator?)

Not yet implemented:
- GroupByQuery
- LookbackQuery