Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Join with WHERE clause targeting single shard across joined tables treated as cross-shard #5742

Closed
aquarapid opened this issue Jan 20, 2020 · 3 comments
Labels
Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature)

Comments

@aquarapid
Copy link
Contributor

aquarapid commented Jan 20, 2020

Scenario:

SQL schema:

create table t1 (
  id integer,
  merchant_id bigint,
  primary key (id),
  key (merchant_id)
) engine=innodb;

create table t2 (
  id integer,
  t1_id integer,
  merchant_id bigint,
  primary key (id),
  key (merchant_id)
) engine=innodb;

vSchema:

{
    "ks1": {
        "sharded": true,
        "tables": {
            "t1": {
                "column_vindexes": [
                    {
                        "column": "merchant_id",
                        "name": "hash"
                    }
                ]
            },
            "t2": {
                "column_vindexes": [
                    {
                        "column": "merchant_id",
                        "name": "hash"
                    }
                ]
            }
        },
        "vindexes": {
            "hash": {
                "type": "hash"
            }
        }
    }
}

The following join query with an aggregate does not work, even though it's obviously targeting a single shard with the compound WHERE clause that covers both the joined tables:

$ vtexplain -schema-file schema.sql -vschema-file vschema.json -shards 4 \n
-sql 'select distinct t1.id \n
from t1 inner join t2 on t1.id = t2.t1_id \n
where t1.merchant_id = 1 and t2.merchant_id = 1;'

ERROR: vtexplain execute error in 'select distinct t1.id from t1 inner join t2 on t1.id = t2.t1_id where t1.merchant_id = 1 and t2.merchant_id = 1': unsupported: cross-shard query with aggregates

If you add the merchant_id equality to the ON clause of the join, it works as expected:

$ vtexplain -schema-file schema.sql -vschema-file vschema.json -shards 4 -sql 'SELECT distinct t1.id from t1 inner join t2 on t1.id = t2.t1_id and t1.merchant_id = t2.merchant_id where t1.merchant_id = 1 and t2.merchant_id = 1;'
----------------------------------------------------------------------
SELECT distinct t1.id from t1 inner join t2 on t1.id = t2.t1_id and t1.merchant_id = t2.merchant_id where t1.merchant_id = 1 and t2.merchant_id = 1

1 ks1/-40: select distinct t1.id from t1 join t2 on t1.id = t2.t1_id and t1.merchant_id = t2.merchant_id where t1.merchant_id = 1 and t2.merchant_id = 1 limit 10001

----------------------------------------------------------------------

Vitess should be able to recognize these are equivalent. In fact, the MySQL parser normalizes these queries to exactly the same query (if you EXPLAIN the query and do show warnings):

/* select#1 */ select distinct `vt_commerce`.`t1`.`id` AS `id` from `vt_commerce`.`t1` join `vt_commerce`.`t2` where ((`vt_commerce`.`t1`.`id` = `vt_commerce`.`t2`.`t1_id`) and (`vt_commerce`.`t2`.`merchant_id` = 1) and (`vt_commerce`.`t1`.`merchant_id` = 1))
@sougou sougou changed the title Left join with WHERE clause targeting single shard across joined tables treated as cross-shard Join with WHERE clause targeting single shard across joined tables treated as cross-shard Jan 31, 2020
@sougou
Copy link
Contributor

sougou commented Jan 31, 2020

I've updated the title and description because this is a normal join issue (not left join). The issue is still valid. As of #5551, we detect more single-shard left-join constructs.

@harshit-gangal
Copy link
Member

Within-shard left joins

@GuptaManan100
Copy link
Member

This has been added Gen4 in #8569.

The planner outputs for this query are as follows -
V3

{
  "QueryType": "SELECT",
  "Original": "select distinct t1.id from t1 inner join t2 on t1.id = t2.t1_id where t1.merchant_id = 1 and t2.merchant_id = 1",
  "Instructions": {
    "OperatorType": "Distinct",
    "Inputs": [
      {
        "OperatorType": "Join",
        "Variant": "Join",
        "JoinColumnIndexes": "-1",
        "JoinVars": {
          "t1_id": 0
        },
        "TableName": "t1_t2",
        "Inputs": [
          {
            "OperatorType": "Route",
            "Variant": "SelectEqualUnique",
            "Keyspace": {
              "Name": "ks1",
              "Sharded": true
            },
            "FieldQuery": "select t1.id from t1 where 1 != 1",
            "Query": "select t1.id from t1 where t1.merchant_id = 1",
            "Table": "t1",
            "Values": [
              1
            ],
            "Vindex": "hash"
          },
          {
            "OperatorType": "Route",
            "Variant": "SelectEqualUnique",
            "Keyspace": {
              "Name": "ks1",
              "Sharded": true
            },
            "FieldQuery": "select 1 from t2 where 1 != 1",
            "Query": "select 1 from t2 where t2.t1_id = :t1_id and t2.merchant_id = 1",
            "Table": "t2",
            "Values": [
              1
            ],
            "Vindex": "hash"
          }
        ]
      }
    ]
  }
}

Gen4 -

{
  "QueryType": "SELECT",
  "Original": "select distinct t1.id from t1 inner join t2 on t1.id = t2.t1_id where t1.merchant_id = 1 and t2.merchant_id = 1",
  "Instructions": {
    "OperatorType": "Route",
    "Variant": "SelectEqualUnique",
    "Keyspace": {
      "Name": "ks1",
      "Sharded": true
    },
    "FieldQuery": "select t1.id from t1, t2 where 1 != 1",
    "Query": "select distinct t1.id from t1, t2 where t1.merchant_id = 1 and t2.merchant_id = 1 and t1.id = t2.t1_id",
    "Table": "t1, t2",
    "Values": [
      1
    ],
    "Vindex": "hash"
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
None yet
Development

No branches or pull requests

4 participants