[Bug]: cn crashed by panic: interface conversion: interface {} is nil, not *config.ParameterUnit during stability test on distributed mode #16724

Closed
1 task done
aressu1985 opened this issue Jun 6, 2024 · 4 comments
Labels: kind/bug (Something isn't working), needs-triage, phase/testing, resolved/v1.2.1, severity/s0 (Extreme impact: Cause the application to break down and seriously affect the use)

@aressu1985
Contributor

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

1.2-dev

Commit ID

e6b2868

Other Environment Information

- Hardware parameters:
3*CN: 16C 64G
1*DN: 16C 64G
3*LOG: 4C 16G
2*PROXY: 3C 6G
- OS type:
- Others:

Actual Behavior

During a stability test in distributed mode, one CN crashed with a panic:
image

mo-log:
https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22Jyy%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-nightly-e6b2868-20240605224953%5C%22%7D%20%7C%3D%20%60panic%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221717624446705%22,%22to%22:%221717626469448%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

1. Run a mo cluster with the config in this issue.
2. Run TPC-H 10G loop test processes in one independent tenant.
3. Run TPC-C long-running test processes with 10 warehouses and 10 terminals in one independent tenant, prepare mode.
4. Run sysbench mixed-case (insert/delete/update/select) long-running test processes with 75 terminals in one independent tenant, non-prepare mode.
5. Run another sysbench mixed-case (insert/delete/update/select) long-running test process with 75 terminals in one independent tenant, non-prepare mode.

Additional information

No response

@aressu1985 aressu1985 added kind/bug Something isn't working needs-triage severity/s0 Extreme impact: Cause the application to break down and seriously affect the use labels Jun 6, 2024
@aressu1985 aressu1985 added this to the 1.2.1 milestone Jun 6, 2024
@daviszhen
Contributor

At first glance, this task started executing before the mo server had been initialized. At that point the global pu had not been initialized yet.

@xzxiong
Contributor

xzxiong commented Jun 7, 2024

Root cause:
The StorageUsage task is launched by the task service, and it runs before initMOServer has called setGlobalPu during initialization, so the call to getGlobalPu fails.
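The failure mode described above can be illustrated with a minimal Go sketch. It assumes the global parameter unit is held in an `atomic.Value` and read back with a single-value type assertion; the thread does not show the real code, so `ParameterUnit`, `globalPu`, `getGlobalPuUnchecked`, `getGlobalPuChecked`, and `panicMessage` are hypothetical stand-ins, not MatrixOne's actual definitions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// ParameterUnit is a hypothetical stand-in for config.ParameterUnit.
type ParameterUnit struct {
	Name string
}

// globalPu is a hypothetical holder for the global parameter unit.
// Nothing is stored in it until setGlobalPu runs.
var globalPu atomic.Value

func setGlobalPu(pu *ParameterUnit) { globalPu.Store(pu) }

// getGlobalPuUnchecked uses a single-value type assertion, which panics
// when Load returns nil because nothing has been stored yet.
func getGlobalPuUnchecked() *ParameterUnit {
	return globalPu.Load().(*ParameterUnit)
}

// getGlobalPuChecked uses the comma-ok form and reports whether a
// non-nil parameter unit is available instead of panicking.
func getGlobalPuChecked() (*ParameterUnit, bool) {
	pu, ok := globalPu.Load().(*ParameterUnit)
	return pu, ok && pu != nil
}

// panicMessage runs f and returns the recovered panic text, or "" if
// f did not panic.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	// A task that runs before setGlobalPu hits the unchecked assertion.
	fmt.Println("before init:", panicMessage(func() { getGlobalPuUnchecked() }))

	_, ok := getGlobalPuChecked()
	fmt.Println("checked lookup ready:", ok)

	setGlobalPu(&ParameterUnit{Name: "cn"})
	fmt.Println("after init:", getGlobalPuUnchecked().Name)
}
```

The checked lookup only turns the crash into a recoverable condition. The task still has to decide whether to retry or skip when the unit is not yet set.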

Startup order

  1. getHAKeeperClient starts receiving heartbeats, which triggers the task service to start working
  2. initMOServer calls setGlobalPu
    image

Startup log

MOServer had not finished initializing.
Note: MOServer prints the log line Server Listening ... when it starts.

link
image

@daviszhen
Contributor

This has already been fixed.

@aressu1985
Contributor Author

fixed


4 participants