DM panic when import data via lightning physical backend #8209

Closed
fubinzh opened this issue Feb 9, 2023 · 3 comments · Fixed by #8217
Labels
affects-6.6; area/dm (Issues or PRs related to DM); found/automation (Bugs found by automation cases); severity/critical; type/bug (The issue is confirmed as a bug)

Comments


fubinzh commented Feb 9, 2023

What did you do?

  1. Run a DM task that imports data with the Lightning physical backend

What did you expect to see?

  1. The data import should succeed

What did you see instead?

dm-worker panicked with "send on closed channel":

2023-02-08 19:42:37 | {"pod":"dm-dm-worker-0","container":"dm-worker","log":"2023-02-08T19:42:37.551472559+08:00 stderr F panic: send on closed channel","namespace":"dm-lighting-physical-backend-big-tps-1560715-1-410"}

[root@centos76_vm dm]# kubectl logs -p dm-dm-worker-0 --kubeconfig kubeconfig.yml -n dm-lighting-physical-backend-big-tps-1560715-1-410
starting dm-worker ...
/dm-worker --name=dm-dm-worker-0 --join=dm-dm-master:8261 --advertise-addr=dm-dm-worker-0.dm-dm-worker-peer:8262 --worker-addr=0.0.0.0:8262 --config=/etc/dm-worker/dm-worker.toml

+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  # | CHECK ITEM                                                                                                                         | TYPE        | PASSED |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  1 | Source csv files size is proper                                                                                                    | performance | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  2 | the checkpoints are valid                                                                                                          | critical    | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  3 | table schemas are valid                                                                                                            | critical    | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  4 | Cluster version check passed                                                                                                       | critical    | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  5 | Lightning has the correct storage permission                                                                                       | critical    | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  6 | sorted-kv-dir:/var/lib/dm-worker/dumped_data.dm_lightning_physical_backend.sorting and data-source-dir:/var/lib/dm-worker/dumped_d | performance | false  |
|    | ata.dm_lightning_physical_backend are in the same disk, may slow down performance                                                  |             |        |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  7 | local disk resources are rich, estimate sorted data size 210.4GiB, local available is 1.513TiB                                     | critical    | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  8 | Cluster available is rich, available is 5.097TiB, we need 631.1GiB                                                                 | performance | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
|  9 | Cluster doesn't have too many empty regions                                                                                        | performance | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
| 10 | Cluster region distribution is balanced                                                                                            | critical    | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+
| 11 | no CDC or PiTR task found                                                                                                          | critical    | true   |
+----+------------------------------------------------------------------------------------------------------------------------------------+-------------+--------+

panic: send on closed channel

goroutine 20707 [running]:
github.com/pingcap/tidb/br/pkg/membuf.(*Pool).release(...)
        github.com/pingcap/tidb@v1.1.0-beta.0.20221201110602-e307642d9fa5/br/pkg/membuf/buffer.go:107
github.com/pingcap/tidb/br/pkg/membuf.(*Buffer).Destroy(0xc00207a550)
        github.com/pingcap/tidb@v1.1.0-beta.0.20221201110602-e307642d9fa5/br/pkg/membuf/buffer.go:170 +0xc6
github.com/pingcap/tidb/br/pkg/lightning/backend/local.(*Writer).Close(0xc07ee48500, {0x3e292e8?, 0xc003b88380?})
        github.com/pingcap/tidb@v1.1.0-beta.0.20221201110602-e307642d9fa5/br/pkg/lightning/backend/local/engine.go:1165 +0x196
github.com/pingcap/tidb/br/pkg/lightning/backend.(*LocalEngineWriter).Close(...)
        github.com/pingcap/tidb@v1.1.0-beta.0.20221201110602-e307642d9fa5/br/pkg/lightning/backend/backend.go:431
github.com/pingcap/tidb/br/pkg/lightning/restore.(*TableRestore).restoreEngine.func3(0x1e95246?, 0xc000ac8120)
        github.com/pingcap/tidb@v1.1.0-beta.0.20221201110602-e307642d9fa5/br/pkg/lightning/restore/table_restore.go:580 +0x1ee
created by github.com/pingcap/tidb/br/pkg/lightning/restore.(*TableRestore).restoreEngine
        github.com/pingcap/tidb@v1.1.0-beta.0.20221201110602-e307642d9fa5/br/pkg/lightning/restore/table_restore.go:567 +0xd1b
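
The trace shows a writer's Close releasing a buffer back to a pool whose channel had already been closed. The following is a minimal sketch of that ordering problem only. It is not the actual membuf code, and the names pool, release, destroy, and releaseAfterDestroy are hypothetical. It assumes the pool recycles blocks through a buffered channel and that destroying the pool closes that channel.

```go
package main

import "fmt"

// pool is a hypothetical stand-in for a buffer pool that recycles
// byte blocks through a buffered channel.
type pool struct {
	blocks chan []byte
}

func newPool(size int) *pool {
	return &pool{blocks: make(chan []byte, size)}
}

// release hands a block back to the pool. If the channel has been
// closed, the send case is selected and the runtime panics.
func (p *pool) release(b []byte) {
	select {
	case p.blocks <- b:
	default: // pool is full, so the block is dropped
	}
}

// destroy closes the channel (an assumption made for this sketch).
func (p *pool) destroy() {
	close(p.blocks)
}

// releaseAfterDestroy releases a buffer after the pool was destroyed
// and returns the recovered panic message.
func releaseAfterDestroy() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	p := newPool(1)
	buf := make([]byte, 8)
	p.destroy()    // the pool goes away first
	p.release(buf) // a late release then sends on the closed channel
	return "no panic"
}

func main() {
	fmt.Println(releaseAfterDestroy()) // prints "send on closed channel"
}
```

In this sketch the panic goes away if every outstanding buffer is released before the pool is destroyed. Whether the fix in #8217 takes that approach is not stated in this issue.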

Versions of the cluster

[2023/02/08 10:48:35.168 +00:00] [INFO] [version.go:47] ["Welcome to dm-worker"] [release-version=v6.6.0-alpha] [git-hash=caa73987b7a3f08e701e45190ad5c10bade542b9] [git-branch=heads/refs/tags/v6.6.0-alpha] [utc-build-time="2023-02-07 11:42:54"] [go-version="go version go1.19.5 linux/amd64"] [failpoint-build=false]

Current status of the DM cluster (execute query-status <task-name> in dmctl)

2023-02-08T11:42:09.926Z	INFO	k8s/dmcluster.go:54	exec on dm result{podName 15 0 dm-dm-master-0 <nil>} {containerName 15 0 dm-master <nil>} {cmd 1 0  [/bin/sh -c /dmctl --master-addr=127.0.0.1:8261 query-status -s source-0 dm_lightning_physical_backend]} {stdout 15 0 {
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "sourceStatus": {
                "source": "source-0",
                "worker": "dm-dm-worker-0",
                "result": null,
                "relayStatus": null
            },
            "subTaskStatus": [
                {
                    "name": "dm_lightning_physical_backend",
                    "stage": "Running",
                    "unit": "Load",
                    "result": null,
                    "unresolvedDDLLockID": "",
                    "load": {
                        "finishedBytes": "40979858242",
                        "totalBytes": "204899411561",
                        "progress": "20.00 %",
                        "metaBinlog": "(mysql-bin.000177, 802391258)",
                        "metaBinlogGTID": "0bd4b9c7-a79e-11ed-871e-22301c6856d8:1-375812",
                        "bps": "65964886"
                    },
                    "validation": null
                }
            ]
        }
    ]
} <nil>} {stderr 15 0  <nil>}
@fubinzh fubinzh added area/dm Issues or PRs related to DM. type/bug The issue is confirmed as a bug. labels Feb 9, 2023
@fubinzh fubinzh changed the title from "DM panic" to "DM panic when import data via lightning physical backend" Feb 9, 2023

fubinzh commented Feb 9, 2023

/severity Critical

lance6716 (Contributor) commented

This seems to be a bug on the Lightning side. I will try to reproduce it.


fubinzh commented Mar 14, 2023

/found automation

@ti-chi-bot ti-chi-bot added the found/automation Bugs found by automation cases label Mar 14, 2023