index problem of substring function and charset support #1098

Closed
zyguan opened this issue Apr 14, 2016 · 4 comments

@zyguan
Contributor

zyguan commented Apr 14, 2016

It should be `if pos > int64(len(str)) || pos < int64(0) {` here; otherwise `substring(s, 1)` always returns an empty string. Besides, I see that the code doesn't handle the charset of the string for now, which leads to the following result:

mysql> select substring("你好世界", 1, 3);
+---------------------------------+
| substring("你好世界", 1, 3)     |
+---------------------------------+
| 你                              |
+---------------------------------+
1 row in set (0.00 sec)

I'm willing to help; however, I'm not sure how to get the charset info from the context of such a builtin function call.
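
A minimal sketch of the two points above, assuming a simplified byte-based implementation. `substringBytes` is a hypothetical helper, not TiDB's actual code, and it ignores negative positions (which MySQL counts from the end of the string). It shows the proposed bounds check and why slicing by bytes returns only `你`, since each of these characters takes three bytes in UTF-8.

```go
package main

import "fmt"

// substringBytes is a hypothetical, simplified SUBSTRING(str, pos, length)
// that works on bytes. pos is 1-based, as in SQL. Negative positions are
// not handled in this sketch and simply yield an empty string.
func substringBytes(str string, pos, length int64) string {
	if length <= 0 {
		return ""
	}
	pos-- // convert the 1-based position to a 0-based byte offset

	// The check proposed in the issue: only offsets outside [0, len(str)]
	// are rejected. A check such as pos <= 0 would also reject pos = 1.
	if pos > int64(len(str)) || pos < int64(0) {
		return ""
	}

	// Clamp the end of the slice to the end of the string.
	end := int64(len(str))
	if length < end-pos {
		end = pos + length
	}
	return str[pos:end]
}

func main() {
	fmt.Printf("%q\n", substringBytes("hello", 1, 5))    // "hello"
	fmt.Printf("%q\n", substringBytes("你好世界", 1, 3)) // "你": 3 bytes, 1 character
}
```

Counting bytes is only correct for single-byte charsets, which is why the charset of the argument matters here.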


  1. What version of Go are you using (go version)?
    go1.6
  2. What operating system and processor architecture are you using (go env)?
    GOHOSTARCH="amd64"
    GOHOSTOS="linux"
  3. What did you do?
    exec select substring("hello", 1, 5);
  4. What did you expect to see?
+--------------------------+
| substring("hello", 1, 5) |
+--------------------------+
| hello                    |
+--------------------------+
  5. What did you see instead?
+--------------------------+
| substring("hello", 1, 5) |
+--------------------------+
|                          |
+--------------------------+
@shenli
Member

shenli commented Apr 14, 2016

@zyguan Thanks for your report!
A string literal is wrapped into an ast.ValueExpr, and you can get the charset/collation info from the ValueExpr.
https://github.com/pingcap/tidb/blob/master/parser/parser.y#L2008

@zyguan
Contributor Author

zyguan commented Apr 15, 2016

Thanks for your reply, @shenli .

I see, builtin functions only accept an array of types.Datum as the input.
https://github.com/pingcap/tidb/blob/master/evaluator/evaluator.go#L647

However, the charset/collation info is stored in the exprNode.Type field. Thus, there may be no way for us to get this info within these builtin functions currently (I thought we could get it from the Context). I think it might be better to pass datums as well as their types to a FuncCall.
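
To illustrate the idea of passing types along with datums, here is a rough sketch under stated assumptions. `fieldType`, `typedDatum`, `window`, and `substringTyped` are all hypothetical names invented for this example; they are not TiDB's real types or signatures. The function counts characters when the charset is `utf8` or `utf8mb4` and falls back to bytes otherwise.

```go
package main

import "fmt"

// fieldType and typedDatum are hypothetical stand-ins for "a datum plus its
// type"; they do not mirror TiDB's actual structures.
type fieldType struct {
	Charset string
}

type typedDatum struct {
	Value string
	Type  fieldType
}

// window turns a 1-based pos and a length into a clamped [start, end) range
// over n units (bytes or characters). ok is false when the range is empty.
func window(pos, length, n int64) (start, end int64, ok bool) {
	if pos < 1 || length <= 0 || pos > n {
		return 0, 0, false
	}
	start = pos - 1
	end = n
	if length < n-start {
		end = start + length
	}
	return start, end, true
}

// substringTyped picks character or byte semantics based on the charset
// carried with the argument. Negative positions are not handled here.
func substringTyped(arg typedDatum, pos, length int64) string {
	switch arg.Type.Charset {
	case "utf8", "utf8mb4":
		runes := []rune(arg.Value) // one element per Unicode code point
		start, end, ok := window(pos, length, int64(len(runes)))
		if !ok {
			return ""
		}
		return string(runes[start:end])
	default:
		start, end, ok := window(pos, length, int64(len(arg.Value)))
		if !ok {
			return ""
		}
		return arg.Value[start:end]
	}
}

func main() {
	s := "你好世界"
	fmt.Println(substringTyped(typedDatum{s, fieldType{"utf8"}}, 1, 3))   // 你好世
	fmt.Println(substringTyped(typedDatum{s, fieldType{"binary"}}, 1, 3)) // 你
}
```

With the type attached to each argument, the builtin itself can decide how to count, without needing anything from the Context.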

@shenli
Member

shenli commented Apr 15, 2016

@zyguan Yes, we have a plan to add collation info into datum.

@shenli shenli added the todo label Apr 15, 2016
zyguan added a commit to zyguan/tidb that referenced this issue Apr 15, 2016
* evaluator: fix the incorrect behavior of `substring` function,
             multi-byte charset still hasn't been supported.
shenli pushed a commit that referenced this issue Apr 16, 2016
Fix the incorrect behavior of `substring` function, multi-byte charset still hasn't been supported.
@morgo
Contributor

morgo commented Nov 24, 2018

This issue has been resolved in TiDB. Confirming with a recent version:

tidb> select substring("你好世界", 1, 3);
+---------------------------------+
| substring("你好世界", 1, 3)     |
+---------------------------------+
| 你好世                          |
+---------------------------------+
1 row in set (0.00 sec)

tidb> select substring("hello", 1, 5);
+--------------------------+
| substring("hello", 1, 5) |
+--------------------------+
| hello                    |
+--------------------------+
1 row in set (0.00 sec)

tidb> select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v2.1.0-rc.3-219-g1e0876fe8-dirty
Git Commit Hash: 1e0876fe810a832721aac52275dd2b7792fd2892
Git Branch: flush
UTC Build Time: 2018-11-24 01:12:47
GoVersion: go version go1.11 linux/amd64
Race Enabled: false
TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e
Check Table Before Drop: false
1 row in set (0.00 sec)

I am going to close this issue for now, but please feel free to re-open if you have any followup questions. Thanks!

@morgo morgo closed this as completed Nov 24, 2018
wddevries pushed a commit to wddevries/tidb that referenced this issue Oct 25, 2024
* domain: add repository worker

Signed-off-by: xhe <xw897002528@gmail.com>

* sessionctx, domain: add function gate

Signed-off-by: xhe <xw897002528@gmail.com>

* repository: refine context usage

Signed-off-by: xhe <xw897002528@gmail.com>

* repository: add recover and session getter

Signed-off-by: xhe <xw897002528@gmail.com>

* sessionctx: hide variable

Signed-off-by: xhe <xw897002528@gmail.com>

* fix bazel

Signed-off-by: xhe <xw897002528@gmail.com>

* fix check

Signed-off-by: xhe <xw897002528@gmail.com>

* repository: refine owner management

Signed-off-by: xhe <xw897002528@gmail.com>

* repository: free memref

Signed-off-by: xhe <xw897002528@gmail.com>

* fix check

Signed-off-by: xhe <xw897002528@gmail.com>

---------

Signed-off-by: xhe <xw897002528@gmail.com>
wddevries pushed a commit to wddevries/tidb that referenced this issue Nov 5, 2024
wddevries pushed a commit to wddevries/tidb that referenced this issue Nov 8, 2024
wddevries pushed a commit to wddevries/tidb that referenced this issue Nov 12, 2024
wddevries pushed a commit to wddevries/tidb that referenced this issue Nov 13, 2024
wddevries pushed a commit to wddevries/tidb that referenced this issue Nov 26, 2024
hawkingrei pushed a commit to wddevries/tidb that referenced this issue Nov 27, 2024