Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incompatible implementation of jsonpath extraction #30352

Closed
zyguan opened this issue Dec 2, 2021 · 3 comments · Fixed by #35320
Closed

Incompatible implementation of jsonpath extraction #30352

zyguan opened this issue Dec 2, 2021 · 3 comments · Fixed by #35320
Labels
affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.0 affects-6.1 This bug affects the 6.1.x(LTS) versions. severity/major sig/execution SIG execution type/bug The issue is confirmed as a bug.

Comments

@zyguan
Copy link
Contributor

zyguan commented Dec 2, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

/* t */ drop table if exists t;
/* t */ create table t (a varchar(100));
/* t */ insert into t values ('{"labels":[{"name":"foo"}]}'),('{"labels":[{"name":"foo"},{"name":"bar"}]}');
/* t */ select a->'$.labels[*].name' from t;

2. What did you expect to see? (Required)

/* t */ drop table if exists t;
-- t >> 0 rows affected
/* t */ create table t (a varchar(100));
-- t >> 0 rows affected
/* t */ insert into t values ('{"labels":[{"name":"foo"}]}'),('{"labels":[{"name":"foo"},{"name":"bar"}]}');
-- t >> 2 rows affected
/* t */ select a->'$.labels[*].name' from t;
-- t >> +-----------------------+
-- t    | a->'$.labels[*].name' |
-- t    +-----------------------+
-- t    | ["foo"]               |
-- t    | ["foo", "bar"]        |
-- t    +-----------------------+

3. What did you see instead (Required)

/* t */ drop table if exists t;
-- t >> 0 rows affected
/* t */ create table t (a varchar(100));
-- t >> 0 rows affected
/* t */ insert into t values ('{"labels":[{"name":"foo"}]}'),('{"labels":[{"name":"foo"},{"name":"bar"}]}');
-- t >> 2 rows affected
/* t */ select a->'$.labels[*].name' from t;
-- t >> +-----------------------+
-- t    | a->'$.labels[*].name' |
-- t    +-----------------------+
-- t    | "foo"                 | <- not an array
-- t    | ["foo", "bar"]        |
-- t    +-----------------------+

4. What is your TiDB version? (Required)

master (6475e89)

@zyguan zyguan added type/bug The issue is confirmed as a bug. sig/sql-infra SIG: SQL Infra severity/major labels Dec 2, 2021
@bb7133 bb7133 added sig/execution SIG execution and removed sig/sql-infra SIG: SQL Infra labels Dec 2, 2021
@bb7133
Copy link
Member

bb7133 commented Dec 2, 2021

I think this is related to the type system and json implementations. /cc @zyguan @XuHuaiyu

@jebter jebter added affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects the 5.4.x(LTS) versions. labels Jan 11, 2022
@VelocityLight VelocityLight added the affects-6.1 This bug affects the 6.1.x(LTS) versions. label May 20, 2022
@ywqzzy
Copy link
Contributor

ywqzzy commented May 24, 2022

In tidb the current extract implementation is the following code:

func (bj BinaryJSON) Extract(pathExprList []PathExpression) (ret BinaryJSON, found bool) {
	buf := make([]BinaryJSON, 0, 1)
	for _, pathExpr := range pathExprList {
		buf = bj.extractTo(buf, pathExpr)
	}
	if len(buf) == 0 {
		found = false
	} else if len(pathExprList) == 1 && len(buf) == 1 {
		// If pathExpr contains asterisks, len(elemList) won't be 1
		// even if len(pathExprList) equals to 1.
		found = true
		ret = buf[0]
	} else {
		found = true
		ret = buildBinaryArray(buf)
	}
	return
}

When the buf len is 1 and pathExprList = 1, It will not construct an array, which is not compatible with MySQL.
Modify the code into the following code will fix the issue.

func (bj BinaryJSON) Extract(pathExprList []PathExpression) (ret BinaryJSON, found bool) {
	buf := make([]BinaryJSON, 0, 1)
	for _, pathExpr := range pathExprList {
		buf = bj.extractTo(buf, pathExpr)
	}
	if len(buf) == 0 {
		found = false
	} else {
		found = true
		ret = buildBinaryArray(buf)
	}
	return
}

@mengxin9014
Copy link
Contributor

In tidb the current extract implementation is the following code:

func (bj BinaryJSON) Extract(pathExprList []PathExpression) (ret BinaryJSON, found bool) {
	buf := make([]BinaryJSON, 0, 1)
	for _, pathExpr := range pathExprList {
		buf = bj.extractTo(buf, pathExpr)
	}
	if len(buf) == 0 {
		found = false
	} else if len(pathExprList) == 1 && len(buf) == 1 {
		// If pathExpr contains asterisks, len(elemList) won't be 1
		// even if len(pathExprList) equals to 1.
		found = true
		ret = buf[0]
	} else {
		found = true
		ret = buildBinaryArray(buf)
	}
	return
}

When the buf len is 1 and pathExprList = 1, It will not construct an array, which is not compatible with MySQL. Modify the code into the following code will fix the issue.

func (bj BinaryJSON) Extract(pathExprList []PathExpression) (ret BinaryJSON, found bool) {
	buf := make([]BinaryJSON, 0, 1)
	for _, pathExpr := range pathExprList {
		buf = bj.extractTo(buf, pathExpr)
	}
	if len(buf) == 0 {
		found = false
	} else {
		found = true
		ret = buildBinaryArray(buf)
	}
	return
}

It is not exactly. If the buf len is 1 and pathExprList = 1, it will not construct an array if pathExpr does not contain asterisks in MYSQL.
for example:

/* t */ drop table if exists t;
-- t >> 0 rows affected
/* t */ create table t (a varchar(100));
-- t >> 0 rows affected
/* t */ insert into t values ('{"labels":[{"name":"foo"}]}'),('{"labels":[{"name":"foo"},{"name":"bar"}]}');
-- t >> 2 rows affected
/* t */ select a->'$.labels[0].name' from t;
-- t >> +-----------------------+
-- t    | a->'$.labels[0].name' |
-- t    +-----------------------+
-- t    |  "foo"                |
-- t    |  "foo"                |
-- t    +-----------------------+
/* t */ select a->'$.labels[1].name' from t;
-- t >> +-----------------------+
-- t    | a->'$.labels[0].name' |
-- t    +-----------------------+
-- t    |  NULL                 |
-- t    |  "bar"                |
-- t    +-----------------------+

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.0 affects-6.1 This bug affects the 6.1.x(LTS) versions. severity/major sig/execution SIG execution type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants