Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluate functions in INSERT..SELECT #1010

Merged
merged 2 commits into from
Dec 23, 2016

Conversation

marcocitus
Copy link
Member

@marcocitus marcocitus commented Dec 2, 2016

While writing https://www.citusdata.com/blog/2016/11/29/event-aggregation-at-scale-with-postgresql/ I ran into a number of usability issues with INSERT..SELECT. In particular, it errors out when using functions like now(), date_trunc(), the timestamptz type, or even simple string||int concatenation, which can be very inconvenient, especially for doing roll-ups.

The problem with doing function evaluation for INSERT .. SELECT is that we perform different transformation on the query trees in the planner for each shard. While we could pass the entire query tree into the plan for each task and subsequently perform function evaluation on each query tree, this can cause substantial overhead.

This PR introduces a generic list of relation<->shard mappings ("RelationShard") that gets passed down with every (router planner) task, such that the per-shard query tree can be reconstructed from the jobQuery in the router executor and applies function evaluation to the SELECT part of the INSERT .. SELECT. The infrastructure will also allow us to deparse and thus perform function evaluation on more complex queries (e.g. SELECT) in the executor.

The RelationShard list replaces the existing selectShardList used for INSERT..SELECT lcoking, since it contains the same information. This change reduces overhead since selectShardList was a list of ShardInterval structs, whereas RelationShard only contains a relationId and a shardId.

Part of #961

@marcocitus marcocitus force-pushed the feature/insert_select_functions branch from a5729bd to 4e0446f Compare December 2, 2016 15:53
* subquery RTE that returns no results.
*/
static void
ConvertRteToSubqueryWithEmptyResult(RangeTblEntry *rte)
Copy link
Member Author

@marcocitus marcocitus Dec 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved (without change) from multi_router_planner.c as a dependency of UpdateRelationToShardNames.

*
*/
bool
UpdateRelationToShardNames(Node *node, List *relationShardList)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adapted from UpdateRelationNames multi_router_planner.c

I moved it into its own file because it's now being called both directly from the router planner as well as from the router executor via RebuildQueryStrings, it thus didn't seem very specific to the router planner anymore, but rather to deparsing shard queries in general.

@marcocitus marcocitus force-pushed the feature/insert_select_functions branch 4 times, most recently from be621fb to f6b3067 Compare December 2, 2016 16:32
@ozgune ozgune added this to the 6.1 Release milestone Dec 9, 2016
@marcocitus marcocitus force-pushed the feature/insert_select_functions branch from f6b3067 to 1a216c6 Compare December 9, 2016 12:58
@codecov-io
Copy link

codecov-io commented Dec 9, 2016

Current coverage is 89.79% (diff: 99.41%)

Merging #1010 into master will increase coverage by 0.01%

@@             master      #1010   diff @@
==========================================
  Files            72         73     +1   
  Lines         18573      18645    +72   
  Methods        1143       1147     +4   
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          16674      16742    +68   
- Misses         1899       1903     +4   
  Partials          0          0          

Powered by Codecov. Last update b7d0a32...11031bc

@marcocitus marcocitus force-pushed the feature/insert_select_functions branch 3 times, most recently from bed0024 to 6f4659a Compare December 15, 2016 10:46
Copy link
Member

@mtuncer mtuncer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an issue with returning incorrect results

@@ -1475,8 +1475,11 @@ int
GetRTEIdentity(RangeTblEntry *rte)
{
Assert(rte->rtekind == RTE_RELATION);
Assert(IsA(rte->values_lists, IntList));
Assert(list_length(rte->values_lists) == 1);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we changing the behavior here. Perhaps we should keep asserts after NULL check.

@@ -293,7 +288,7 @@ CreateInsertSelectRouterPlan(Query *originalQuery,
workerJob->jobQuery = originalQuery;

/* for now we do not support any function evaluation */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment is obsolete with the change

@@ -1873,7 +1869,7 @@ RouterSelectTask(Query *originalQuery, RelationRestrictionContext *restrictionCo
*/
static bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a line about relationShardList in function comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a line.

{
relationShard = (RelationShard *) lfirst(relationShardCell);

if (newRte->relid == relationShard->relationId)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am finding this difficult to understand. The previous GetRteIdentity call was used to differentiate between multiple usages of the same relation id. Especially when the same relation was used in different levels of nested subqueries.

We have given each relation RTE a unique Id in the beginning and search for that particular id since relation Id by itself was not enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I found an example query: a_table is distributed along integer column a. Values 8 and 24 fall into a different shards lets name it shard_8 and shard_24. And those shards happened to be on the same worker.

select * from a_table where a = 8 UNION select * from a_table where a = 24;

After shard renaming we expect that query to be translated to

select * from a_table_shard_8 where a = 8 UNION select * from a_table_shard_24 where a = 24;

However, with these changes it becomes

select * from a_table_shard_8 where a = 8 UNION select * from a_table_shard_8 where a = 24;

and return incorrect results.

We should either reject this query, or return correct results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another query with the same problem

 select * from a_table where a > 8 and a = 8 union select * from a_table where a = 24;

it becomes

 select * from a_table_shard_8 where a > 8 and a = 8 union select * from a_table_shard_8 where a = 24;

Copy link
Member Author

@marcocitus marcocitus Dec 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I had rteIdentity in RelationShard at first, but removed it because values_list doesn't get passed down to the executor and using relation ID seemed sufficient, but I keep forgetting that we allow queries that reference multiple shards of the same table if they happen to be on the same machine (even though I opened #692 myself 😶 ).

I would be very much in favour of erroring out on queries with >1 shards per relation, since it seems better to error out early than to wait until a shard rebalance / tenant isolation / placement invalidation / value changes and allowing it leads to some rather ugly hacks. We now also have the necessary information in the metadata to warn/error when using shards that are not co-located or reference tables, which was missing when we were discussing #692.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an error message. Discussed this with @ozgune and he's fine with erroring out for these queries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also run this by @anarazel. He did not think this is a good idea.

@marcocitus marcocitus force-pushed the feature/insert_select_functions branch 2 times, most recently from 767473e to 7f66a62 Compare December 21, 2016 11:36
@@ -1478,6 +1478,11 @@ GetRTEIdentity(RangeTblEntry *rte)
Assert(IsA(rte->values_lists, IntList));
Assert(list_length(rte->values_lists) == 1);

if (rte->values_lists == NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should happen before Asserts. In fact GetRTEIdentity functions is never called. Consider removing the function too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing GetRTEIdentity also requires removing matching function IdentifyRTE

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, removed.

{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(rteCell);

if (rte->rtekind != RTE_SUBQUERY)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ignore CTEs by choice ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. We only support very trivial CTEs right now, but should still be supported and might otherwise slip through later.

* to prevent concurrent DML statements on those shards.
*/
void
LockRelationShardListResources(List *relationShardList, LOCKMODE lockMode)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actual function name and name in comments are different. Consider using more descriptive name. LockRelationShardListResources made me think it is locking the list, not individual shards in ShardList.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was asked to change LockShardResources to LockShardListResources in another PR so it seems slightly inconsistent, but I'm fine with either form.

@marcocitus marcocitus force-pushed the feature/insert_select_functions branch from 231baaf to 2a703fe Compare December 23, 2016 09:23
}
}

foreach(cteCell, query->cteList)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would go for only referenced RTEs but guess it does not matter here.

@mtuncer
Copy link
Member

mtuncer commented Dec 23, 2016

🚢

@mtuncer
Copy link
Member

mtuncer commented Dec 23, 2016

don't forget to squash

@marcocitus marcocitus force-pushed the feature/insert_select_functions branch from 2a703fe to 11031bc Compare December 23, 2016 11:47
@marcocitus marcocitus merged commit 6b947c4 into master Dec 23, 2016
@aamederen aamederen deleted the feature/insert_select_functions branch December 27, 2016 17:15
DimCitus pushed a commit that referenced this pull request Jan 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants