Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DB::Exception: Session is locked by a concurrent client. (SESSION_IS_LOCKED) #7906

Closed
ranjeetranjan opened this issue Mar 12, 2024 · 18 comments
Assignees
Labels
bug Something isn't working driver:clickhouse Issues related to the ClickHouse driver help wanted Community contributions are welcome. question The issue is a question. Please use Stack Overflow for questions.

Comments

@ranjeetranjan
Copy link

ranjeetranjan commented Mar 12, 2024

Getting error

{
    "error": "Error: Code: 373. DB::Exception: Session is locked by a concurrent client. (SESSION_IS_LOCKED) (version 22.3.20.29 (official build))\n"
}

To Reproduce
As per the CLickHouse expert

ClickHouse shows an error message if there are two or more parallel queries with the same session_id. Usually the session_id must be different each time

In details
The error occurs within ClickHouse's internals when a session is trying to be acquired, but it's already locked by another client. This might happen due to concurrent queries or operations trying to access the same session simultaneously, causing contention.

Version:
0.34.61

@igorlukanin
Copy link
Member

Hi @ranjeetranjan 👋 Could you please provide more context here?

@ranjeetranjan
Copy link
Author

Hi @igorlukanin I am getting below error in the cubejs

{"message":"Error querying db","error":"Error: socket hang up\n at QueryQueue.parseResult (/usr/src/app/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryQueue.js:384:13)\n at QueryQueue.executeInQueue (/usr/src/app/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryQueue.js:356:19)\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\n at PreAggregationLoadCache.keyQueryResult (/usr/src/app/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregations.ts:435:75)\n at async Promise.all (index 0)\n at PreAggregationLoader.loadPreAggregationWithKeys (/usr/src/app/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregations.ts:645:30)\n at PreAggregationLoader.loadPreAggregation (/usr/src/app/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregations.ts:575:22)\n at preAggregationPromise (/usr/src/app/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregations.ts:2214:30)\n at QueryOrchestrator.fetchQuery (/usr/src/app/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryOrchestrator.ts:241:9)\n at OrchestratorApi.executeQuery (/usr/src/app/node_modules/@cubejs-backend/server-core/src/core/OrchestratorApi.ts:98:20)\n at /usr/src/app/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:605:13\n at async Promise.all (index 1)\n at RefreshScheduler.refreshPreAggregations (/usr/src/app/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:590:5)\n at async Promise.all (index 1)\n at RefreshScheduler.runScheduledRefresh (/usr/src/app/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:285:9)","requestId":"scheduler-383984a0-dae6-4222-83dd-b0527fbc6b3f"}

To dig into I open a discussion on the ClickHouse Slack community and they recommend

ClickHouse shows that error message if there are two or more parallel queries with the same session_id. Usually, the session_id must be different each time.

As per the Clickhouse community suggestion session_id should be unique. The same session ID creating issues.

@paveltiunov
Copy link
Member

@ranjeetranjan Each connection in the connection pool allocated by Cube has a unique session_id. So, there is either a bug somewhere in connection pooling logic or there is some edge case with your workload that leads to the problem. Either way, we need a test to reproduce it as a first step. Also, a fix contribution would be helpful here as well.

@paveltiunov paveltiunov added bug Something isn't working help wanted Community contributions are welcome. labels Mar 24, 2024
Copy link

If you are interested in working on this issue, please provide go ahead and provide PR for that.
We'd be happy to review it and merge it.
If this is the first time you are contributing a Pull Request to Cube, please check our contribution guidelines.
You can also post any questions while contributing in the #contributors channel in the Cube Slack.

@paveltiunov paveltiunov added the question The issue is a question. Please use Stack Overflow for questions. label Mar 24, 2024
@ranjeetranjan
Copy link
Author

@paveltiunov We utilize a Dashboard containing over 10 to 15 cards for data visualization. We run parallel requests for all these cards to efficiently retrieve and display the data on the Dashboard. It's observed that when parallel requests are made, they often share the same session ID, which could lead to conflicts or inconsistencies. Also we used to retry if we get the continued wait.

@matthieuCarlini
Copy link

matthieuCarlini commented Sep 27, 2024

Hi,
it seems I have the same problem.
I've a cube core (v0.36.2) and clickhouse (24.7.3.42) both on premise.

I have 2 cubes on 2 tables clickhouse. When I made a view joining the 2 cubes, I receive a error in my playground :
Code: 373. DB::Exception: Session 6f5ae44f-f919-4394-80b3-92d043521739 is locked by a concurrent client. (SESSION_IS_LOCKED) (version 24.7.3.42 (official build))

The complete log on clickhouse side is :

2024.09.27 12:15:16.082504 [ 556216 ] {} <Error> DynamicQueryHandler: Code: 373. DB::Exception: Session 0f361321-fd30-4382-a030-f3be3a508469 is locked by a concurrent client. (SESSION_IS_LOCKED), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000d5f7a3b
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x00000000079f8ccc
2. DB::Exception::Exception<String const&>(int, FormatStringHelperImpl<std::type_identity<String const&>::type>, String const&) @ 0x0000000007a0b40b
3. DB::Session::makeSessionContext(String const&, std::chrono::duration<long long, std::ratio<1l, 1000000000l>>, bool) @ 0x00000000115ed2c9
4. DB::HTTPHandler::processQuery(DB::HTTPServerRequest&, DB::HTMLForm&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&, std::optional<DB::CurrentThread::QueryScope>&, StrongTypedef<unsigned long, ProfileEvents::EventTag> const&) @ 0x0000000012b56741
5. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&, StrongTypedef<unsigned long, ProfileEvents::EventTag> const&) @ 0x0000000012b5de39
6. DB::HTTPServerConnection::run() @ 0x0000000012bdf993
7. Poco::Net::TCPServerConnection::start() @ 0x00000000159a2267
8. Poco::Net::TCPServerDispatcher::run() @ 0x00000000159a26f9
9. Poco::PooledThread::run() @ 0x000000001596f841
10. Poco::ThreadImpl::runnableEntry(void*) @ 0x000000001596ddfd
11. start_thread @ 0x0000000000007ea7
12. ? @ 0x00000000000fba6f
 (version 24.7.3.42 (official build))

@paveltiunov : I don't know how I can help in the resolution of this issue. We're all on premise so I can test further if it can help.

@igorlukanin
Copy link
Member

@matthieuCarlini Thanks for adding some info here.

I've tried to reproduce this in Cube Cloud on v0.36.3 with ClickHouse v24.10.1.243 in the following way but it worked without issues for me.

Data model:

cubes:
  - name: countries
    sql_table: countries

    joins:
      - name: country_codes
        relationship: one_to_one
        sql: "{countries.name} = {country_codes.name}"

    dimensions:
      - name: name
        sql: name
        type: string

  - name: country_codes
    sql_table: country_codes

    dimensions:
      - name: name
        sql: name
        type: string

      - name: code
        sql: code
        type: string

views:
  - name: country_info

    cubes:
      - join_path: countries
        includes:
          - name: name
            alias: territory_name

      - join_path: countries.country_codes
        includes: '*'

Query:

{
  "limit": 5000,
  "dimensions": [
    "country_info.territory_name",
    "country_info.name",
    "country_info.code"
  ]
}

ClickHouse credentials:
Screenshot 2024-09-30 at 14 23 48

Results:
Screenshot 2024-09-30 at 14 23 04


@matthieuCarlini @ranjeetranjan It would be great if you can help reproduce this. For starters, if you connect to this public ClickHouse server instead of your own one, are you able to reproduce this issue?

@igorlukanin igorlukanin self-assigned this Sep 30, 2024
@igorlukanin igorlukanin added the driver:clickhouse Issues related to the ClickHouse driver label Sep 30, 2024
@ranjeetranjan
Copy link
Author

@matthieuCarlini @igorlukanin Earlier I was using node:latest as build-step for the docker images but when I changed to cubejs/cube:v0.35.55 as docker images the same problem did not come.

@matthieuCarlini
Copy link

matthieuCarlini commented Sep 30, 2024

@igorlukanin : thanks for the test model online.
Your example work for us to. So we investigate a little bit. It seems that the 2 joined cubes hit a sql in clickhouse that takes to much time for the max_execution_time of our clickhouse server.
We look for a better approach, and it don't seems to be a cube problem.
Thanks for your time, my apologies for this mistake analysis.

@igorlukanin
Copy link
Member

@matthieuCarlini Thanks for the update! Indeed, joins in ClickHouse are an interesting thing 😄

@ranjeetranjan Do you think this was also an issue in your case? It would be great if you can test this on the latest version of Cube, because if this does not reproduce, then we'll call it a day and close this issue.

@igorlukanin
Copy link
Member

@matthieuCarlini I wonder if this also deserves a ticket in ClickHouse repo, since the error message is so wildly different from the apparent root cause.

@ranjeetranjan
Copy link
Author

@igorlukanin This came only when we had joins with lot of data but when we use cubejs/cube:v0.35.55 the issues did not came. After changing the build images #6477 also did not came.

@igorlukanin
Copy link
Member

Whew! Looks like we can close this now. Thanks for collaborating, folks!

@hungquach
Copy link

@igorlukanin @ranjeetranjan I've been experiencing the same issue since Cube v0.36.
When I extended the time range, In v0.35, it was fine to wait for all requests, but in v0.36+ I get Error: socket hang up in Cube after a few seconds. Additionally, ClickHouse sometimes returns Error: Code: 373. DB::Exception: Session is locked by a concurrent client.

v1.1: KO 🛑
v1.0: KO 🛑
v0.36: KO 🛑
v0.35: OK ✅

@igorlukanin
Copy link
Member

@hungquach Let's wait for this PR to be merged and released, it might drastically improve how ClickHouse works with Cube: #8928

@mcheshkov
Copy link
Member

@hungquach I've just merged ClickHouse client upgrade, next release should have those changes included. I'll ping again when release is ready.

@hungquach
Copy link

@mcheshkov @igorlukanin thank you for the update. I will do some tests when the release is ready.

@mcheshkov
Copy link
Member

v1.1.7 was just released, it's already available on Docker Hub, soon will be on Cloud.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working driver:clickhouse Issues related to the ClickHouse driver help wanted Community contributions are welcome. question The issue is a question. Please use Stack Overflow for questions.
Projects
None yet
Development

No branches or pull requests

6 participants