
alter table add index reports 'panic in the recoverable goroutine' #22453

Closed
tangenta opened this issue Jan 20, 2021 · 7 comments · Fixed by #22458
Labels: severity/major, sig/sql-infra (SIG: SQL Infra), type/bug (The issue is confirmed as a bug.)

@tangenta (Contributor) commented:

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

drop table if exists t;
set @@tidb_enable_clustered_index=true;
create table t (a int, b enum('Alice'), c int, primary key (c, b));
insert into t values (-1,'Alice',100);
insert into t values (-1,'Alice',7000);
split table t between (0,'Alice') and (10000,'Alice') regions 2;
alter table t add index idx (c);

2. What did you expect to see? (Required)

Query OK, 0 rows affected (2.53 sec)

3. What did you see instead (Required)

[ERROR] [misc.go:115] ["panic in the recoverable goroutine"] [label=ddl] [funcInfo=backfillWorker.run] [r={}] [stack="goroutine 696 [running]:\ngithub.com/pingcap/tidb/util.GetStack(...)\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/util/misc.go:76\ngithub.com/pingcap/tidb/util.Recover(0x36eae97, 0x3, 0x370ea96, 0x12, 0x0, 0x0)\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/util/misc.go:119 +0x328\npanic(0x3513a60, 0xc000cbc400)\n\t/usr/local/go/src/runtime/panic.go:969 +0x1b9\ngithub.com/pingcap/tidb/util/codec.CutOne(0xc00fabfb14, 0x6, 0x10, 0x0, 0x1, 0xc01068e340, 0x0, 0x4, 0x10, 0x0, ...)\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/util/codec/codec.go:895 +0x125\ngithub.com/pingcap/tidb/kv.NewCommonHandle(0xc00fabfb0b, 0xf, 0x19, 0x46fb8d0, 0x31, 0x0)\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/kv/key.go:248 +0xd4\ngithub.com/pingcap/tidb/tablecodec.DecodeRowKey(0xc00fabfb00, 0x1a, 0x24, 0x0, 0x0, 0x1, 0x372a4a4)\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/tablecodec/tablecodec.go:269 +0x225\ngithub.com/pingcap/tidb/ddl.tryDecodeToHandleString(0xc00fabfb00, 0x1a, 0x24, 0x2, 0xc000ce0c38)\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/ddl/backfilling.go:422 +0x49\ngithub.com/pingcap/tidb/ddl.(*reorgBackfillTask).String(0xc00f8d34c0, 0xc010019d18, 0x1)\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/ddl/backfilling.go:185 +0x87\ngithub.com/pingcap/tidb/ddl.(*backfillWorker).run(0xc000cc39e0, 0xc000ccc0b0, 0x3b01ec0, 0xc0105c3e10)\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/ddl/backfilling.go:299 +0x467\ncreated by github.com/pingcap/tidb/ddl.(*worker).writePhysicalTableRecord\n\t/home/tangenta/gopath/src/github.com/pingcap/tidb/ddl/backfilling.go:598 +0xa2a\n"]

4. What is your TiDB version? (Required)

master 8ddd41c960caaebbdeb28da33c781fca1464f05f

@tangenta tangenta added the type/bug The issue is confirmed as a bug. label Jan 20, 2021
@tangenta tangenta self-assigned this Jan 20, 2021
@tangenta (Contributor, Author) commented:

As mentioned in #20727,

Root Cause

For now, ADD INDEX (as well as some other DDL jobs that need data reorganization) assumes that the 'startKey' and 'endKey' of a region can always be decoded.

But that is not always the case: splitting regions on a clustered-index table can produce a region whose start_key / end_key is not a valid, decodable row key. The sketch below illustrates the failure mode.
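
To see concretely why such a boundary key can crash the decoder rather than merely fail, here is a minimal standalone Go sketch. It does not use TiDB's real codec package; cutUint and the byte layout are simplified stand-ins for the kind of unchecked cutting that the codec.CutOne frame in the stack trace above performs:

package main

import "fmt"

// cutUint mimics the shape of an "unchecked cut" decoder: it assumes the
// encoded datum is one flag byte followed by exactly 8 value bytes.
// On a well-formed row key this is safe; on a boundary key that ends in the
// middle of a datum, the slice expression panics instead of returning an error.
func cutUint(b []byte) (val, remain []byte) {
    return b[1:9], b[9:]
}

func main() {
    // Flag byte + 8-byte big-endian integer, i.e. a complete datum.
    wellFormed := []byte{0x03, 0, 0, 0, 0, 0, 0, 0x27, 0x10}
    val, _ := cutUint(wellFormed)
    fmt.Printf("decoded datum bytes: %x\n", val)

    defer func() {
        if r := recover(); r != nil {
            fmt.Println("panic while decoding truncated key:", r)
        }
    }()
    // A boundary key that stops mid-datum, standing in for a split key on
    // the clustered-index table above.
    truncated := []byte{0x03, 0, 0, 0, 0, 0}
    cutUint(truncated) // panics: slice bounds out of range
}

A decoder written this way has no error path for a key that ends mid-datum, so a caller can only survive by recovering from the panic or by validating the key up front.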

To show the ADD INDEX progress, TiDB uses tryDecodeToHandleString everywhere; it is meant to tolerate decode errors:

tidb/ddl/backfilling.go, lines 421 to 439 at commit 91a9d30:

func tryDecodeToHandleString(key kv.Key) string {
    handle, err := tablecodec.DecodeRowKey(key)
    if err != nil {
        recordPrefixIdx := bytes.Index(key, []byte("_r"))
        if recordPrefixIdx == -1 {
            return fmt.Sprintf("key: %x", key)
        }
        handleBytes := key[recordPrefixIdx+2:]
        terminatedWithZero := len(handleBytes) > 0 && handleBytes[len(handleBytes)-1] == 0
        if terminatedWithZero {
            handle, err := tablecodec.DecodeRowKey(key[:len(key)-1])
            if err == nil {
                return handle.String() + ".next"
            }
        }
        return fmt.Sprintf("%x", handleBytes)
    }
    return handle.String()
}

However, DecodeRowKey itself can panic in some cases, so those cases need to be handled as well. One possible shape of such handling is sketched below.
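
A minimal sketch, assuming a recover-based guard is acceptable here and written as if it lived in the same file as tryDecodeToHandleString; decodeRowKeyNoPanic is a hypothetical helper, not necessarily what #22458 actually does:

import (
    "fmt"

    "github.com/pingcap/tidb/kv"
    "github.com/pingcap/tidb/tablecodec"
)

// decodeRowKeyNoPanic is a hypothetical wrapper: because the key passed in may
// be an arbitrary region boundary rather than a well-formed row key, it turns a
// panic inside the decoder into an ordinary error, which
// tryDecodeToHandleString already knows how to handle.
func decodeRowKeyNoPanic(key kv.Key) (handle kv.Handle, err error) {
    defer func() {
        if r := recover(); r != nil {
            handle = nil
            err = fmt.Errorf("invalid record key %x: %v", key, r)
        }
    }()
    return tablecodec.DecodeRowKey(key)
}

tryDecodeToHandleString could then call this wrapper instead of tablecodec.DecodeRowKey directly, so a malformed boundary key degrades to the "key: %x" fallback instead of crashing the backfill worker goroutine.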

@ti-srebot (Contributor) commented Jan 21, 2021:

Please edit this comment or add a new comment to complete the following information:

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make sure that the 'component' and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

[v5.0.0-rc]

6. Fixed versions

v5.0.0

@ti-srebot (Contributor) commented:

( FixedVersions AffectedVersions ) fields are empty.
The values in ( FixedVersions AffectedVersions ) fields are incorrect.

@ti-srebot (Contributor) commented:

( AffectedVersions ) fields are empty.
The values in ( FixedVersions AffectedVersions ) fields are incorrect.

@ti-srebot (Contributor) commented:

The values in ( FixedVersions ) fields are incorrect.

