Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ALTER timeout on ReplicatedMergeTree #3891

Closed
vasyaabr opened this issue Dec 21, 2018 · 13 comments
Closed

ALTER timeout on ReplicatedMergeTree #3891

vasyaabr opened this issue Dec 21, 2018 · 13 comments
Assignees
Labels
bug Confirmed user-visible misbehaviour in official release

Comments

@vasyaabr
Copy link

vasyaabr commented Dec 21, 2018

On release 18.16.0, when applying ALTER TABLE ... ADD COLUMN on ReplicatedMergeTree table, ALTER completed on current replica, but other replicas coudn't download and apply ALTER action.

On ALTER we see this error:

Alter is not finished because timeout when waiting for some replicas: site2-deac-dwh-cas3 (replication_alter_columns_timeout = 60). Alter will be done asynchronously.
IN:ALTER TABLE ... ADD COLUMN ... Int64

In Clickhouse error logs when each replica starts, started to appear multiple (~1-2 each minute) errors (on release 18.16.0):

2018.12.20 13:54:11.661354 [ 17 ] {} ...Turnovers (StorageReplicatedMergeTree): DB::StorageReplicatedMergeTree::queueTask()::<lambda(DB::StorageReplicatedMergeTree::LogEntryPtr&)>: Poco::Exception. Code: 1000,
e.code() = 0, e.displayText() = No message received, e.what() = No message received

After replicas restart, ALTER applies on all other replicas, new ALTER works fine, but after 10-20 minutes replicas work, new ALTER will hang again.

Number of replicas = 3, quorum write used (insert_quorum=2), replication_alter_partitions_sync = 2. Problem first appeared on server version: 18.16.0, on 18.14.11 all is ok.

Tried to increase http_connection_timeout to 15, but it was useless.

@vasyaabr
Copy link
Author

After downgrade to 18.14.11 problem disapeared.

@ztlpn
Copy link
Contributor

ztlpn commented Dec 21, 2018

Can you look up log messages with ReplicatedMergeTreeAlterThread in them on replicas that couldn't apply ALTER?

@ztlpn ztlpn added the st-need-info We need extra data to continue (waiting for response) label Dec 21, 2018
@vasyaabr
Copy link
Author

On fail we got 1 line for each table:

2018.12.20 20:43:17.023006 [ 33 ] {} funprodwh.Turnovers (ReplicatedMergeTreeAlterThread): void DB::ReplicatedMergeTreeAlterThread::run(): Code: 999, e.displayText() = Coordination::Exception: Session expired (Session expired), e.what() = Coordination::Exception, Stack trace:
8. /usr/bin/clickhouse-server(DB::ReplicatedMergeTreeAlterThread::run()+0x97) [0x71f41f7]

On success we got:

2018.12.21 07:08:58.891630 [ 28 ] {} funtestwh.Turnovers (ReplicatedMergeTreeAlterThread): Changed version of 'columns' node in ZooKeeper. Waiting for structure write lock.
2018.12.21 07:08:58.891681 [ 28 ] {} funtestwh.Turnovers (ReplicatedMergeTreeAlterThread): Columns list changed in ZooKeeper. Applying changes locally.
2018.12.21 07:08:59.006207 [ 28 ] {} funtestwh.Turnovers (ReplicatedMergeTreeAlterThread): Applied changes to table.
2018.12.21 07:08:59.006317 [ 28 ] {} funtestwh.Turnovers (ReplicatedMergeTreeAlterThread): ALTER-ing parts
2018.12.21 07:08:59.255045 [ 28 ] {} funtestwh.Turnovers (ReplicatedMergeTreeAlterThread): No parts ALTER-ed
2018.12.21 07:08:59.255159 [ 28 ] {} funtestwh.Turnovers (ReplicatedMergeTreeAlterThread): Execution took 365 ms.

@ztlpn
Copy link
Contributor

ztlpn commented Dec 21, 2018

Is this the first line with the session expired exception? I need the first similar error and some lines preceding it. If it is possible, just upload the logs somewhere so that I can look at them, that will be easier.

ztlpn pushed a commit that referenced this issue Dec 26, 2018
…2947 #3891

Previously after a SessionExpired event the context->zookeeper field was
reinitialized with the old expired ZooKeeper instance. This led to inability
to get new paths. Better not cache the ZooKeeper instance and get it at the
start of each request.
@ztlpn ztlpn added bug Confirmed user-visible misbehaviour in official release and removed st-need-info We need extra data to continue (waiting for response) labels Dec 26, 2018
@ztlpn ztlpn self-assigned this Dec 26, 2018
@ztlpn
Copy link
Contributor

ztlpn commented Dec 26, 2018

The No message received error is a separate problem. @alesapin is looking into it.

@alesapin
Copy link
Member

Fixed here #3932

@andywow
Copy link

andywow commented Jan 29, 2019

I got same problem on 19.1.6. Downgrade to 18.14.11 helped.

@ztlpn
Copy link
Contributor

ztlpn commented Jan 29, 2019

@andywow By the same problem you mean ALTER timeout or the No message received message?

@andywow
Copy link

andywow commented Jan 29, 2019

@ztlpn, No message received message

@ztlpn
Copy link
Contributor

ztlpn commented Jan 29, 2019

These are two unrelated issues. I'll close this one as presumably it is fixed in 19.1.6. Let us continue the investigation in the separate issue.

@ztlpn ztlpn closed this as completed Jan 29, 2019
@alesapin
Copy link
Member

@andywow Can you describe in #4047?

@andywow
Copy link

andywow commented Jan 31, 2019

@alesapin, here is:

2019.01.31 15:39:16.661361 [ 8 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45182_45182_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:16.680103 [ 29 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:16.685351 [ 3 ] {} <Debug> logdb.logs (MergerMutator): Merging 5 parts: from 20190131_45168_45168_0 to 20190131_45172_45172_0 into tmp_merge_20190131_45168_45172_1
2019.01.31 15:39:16.856240 [ 46 ] {} <Debug> TCPHandler: Connected ClickHouse client version 19.1.0, revision: 54413, database: logdb, user: fluentd.
2019.01.31 15:39:16.862385 [ 46 ] {125fbafa-42b4-41a2-8983-e3b855d6149f} <Debug> executeQuery: (from 10.128.8.1:57990, user: fluentd) INSERT INTO logs FORMAT JSONEachRow
2019.01.31 15:39:16.910831 [ 51 ] {cbca2d58-9e2b-40e8-b338-bfdbd432efcb} <Debug> logdb.logs (Replicated OutputStream): Wrote block with ID '20190131_15236628106915869693_15382222068285301741', 6014 rows
2019.01.31 15:39:17.139649 [ 31 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375542 - log-0000375542
2019.01.31 15:39:17.166239 [ 31 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:17.451641 [ 51 ] {cbca2d58-9e2b-40e8-b338-bfdbd432efcb} <Debug> MemoryTracker: Peak memory usage (for query): 60.83 MiB.
2019.01.31 15:39:17.451860 [ 51 ] {} <Information> TCPHandler: Processed in 2.310 sec.
2019.01.31 15:39:17.453015 [ 51 ] {} <Information> TCPHandler: Done processing connection.
2019.01.31 15:39:17.455569 [ 10 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45184_45184_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:17.456064 [ 26 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375543 - log-0000375543
2019.01.31 15:39:17.492125 [ 26 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:17.875326 [ 23 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375544 - log-0000375544
2019.01.31 15:39:17.899918 [ 23 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:17.909871 [ 5 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45185_45185_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:18.046970 [ 48 ] {e58e925f-cbe6-48b9-b2b4-be95f895a047} <Debug> logdb.logs (Replicated OutputStream): Wrote block with ID '20190131_16347631984774990372_16569375850032031628', 114 rows
2019.01.31 15:39:18.110888 [ 36 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375545 - log-0000375545
2019.01.31 15:39:18.126868 [ 36 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:18.197389 [ 52 ] {} <Debug> TCPHandler: Connected ClickHouse client version 19.1.0, revision: 54413, database: logdb, user: fluentd.
2019.01.31 15:39:18.203598 [ 52 ] {ee75f492-9571-4748-b9d3-b95b5f948ab8} <Debug> executeQuery: (from 10.130.4.1:53622, user: fluentd) INSERT INTO logs FORMAT JSONEachRow
2019.01.31 15:39:18.248505 [ 3 ] {} <Debug> logdb.logs (MergerMutator): Selected MergeAlgorithm: Horizontal
2019.01.31 15:39:18.284327 [ 25 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375546 - log-0000375546
2019.01.31 15:39:18.309793 [ 25 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:18.398901 [ 48 ] {e58e925f-cbe6-48b9-b2b4-be95f895a047} <Debug> MemoryTracker: Peak memory usage (for query): 50.53 MiB.
2019.01.31 15:39:18.399061 [ 48 ] {} <Information> TCPHandler: Processed in 2.543 sec.
2019.01.31 15:39:18.399808 [ 48 ] {} <Information> TCPHandler: Done processing connection.
2019.01.31 15:39:18.441570 [ 51 ] {} <Debug> TCPHandler: Connected ClickHouse client version 19.1.0, revision: 54413, database: logdb, user: fluentd.
2019.01.31 15:39:18.447510 [ 51 ] {c19ee605-bc0e-43bc-8b04-2eb894656013} <Debug> executeQuery: (from 10.131.4.1:56840, user: fluentd) INSERT INTO logs FORMAT JSONEachRow
2019.01.31 15:39:18.458356 [ 16 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45186_45186_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:18.487681 [ 18 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetched part 20190131_45175_45175_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:18.488393 [ 16 ] {} <Error> logdb.logs (StorageReplicatedMergeTree): DB::StorageReplicatedMergeTree::queueTask()::<lambda(DB::StorageReplicatedMergeTree::LogEntryPtr&)>: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = No message received, e.what() = No message received
2019.01.31 15:39:18.533057 [ 18 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45186_45186_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:18.801163 [ 27 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375547 - log-0000375547
2019.01.31 15:39:18.836226 [ 27 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:18.867377 [ 14 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45188_45188_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:18.867465 [ 33 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375548 - log-0000375548
2019.01.31 15:39:18.886963 [ 33 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:18.907803 [ 6 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45189_45189_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:19.073651 [ 48 ] {} <Debug> TCPHandler: Connected ClickHouse client version 19.1.0, revision: 54413, database: logdb, user: fluentd.
2019.01.31 15:39:19.081883 [ 48 ] {3be7c7c5-b5f8-441b-9557-15aaa7cb1641} <Debug> executeQuery: (from 10.129.4.1:35256, user: fluentd) INSERT INTO logs FORMAT JSONEachRow
2019.01.31 15:39:19.715096 [ 50 ] {be08e484-d5c5-4bc2-9277-75961a5fd615} <Debug> logdb.logs (Replicated OutputStream): Wrote block with ID '20190131_6287150824869234995_11634570143764135831', 15 rows
2019.01.31 15:39:19.828138 [ 34 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375549 - log-0000375549
2019.01.31 15:39:19.843938 [ 34 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:19.883965 [ 46 ] {125fbafa-42b4-41a2-8983-e3b855d6149f} <Debug> logdb.logs (Replicated OutputStream): Wrote block with ID '20190130_7161329962311412675_16877035574352368739', 1 rows
2019.01.31 15:39:19.996008 [ 24 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375550 - log-0000375550
2019.01.31 15:39:20.003958 [ 50 ] {be08e484-d5c5-4bc2-9277-75961a5fd615} <Debug> MemoryTracker: Peak memory usage (for query): 50.27 MiB.
2019.01.31 15:39:20.004662 [ 50 ] {} <Information> TCPHandler: Processed in 3.384 sec.
2019.01.31 15:39:20.004756 [ 50 ] {} <Information> TCPHandler: Done processing connection.
2019.01.31 15:39:20.005508 [ 11 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetched part 20190131_45177_45177_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:20.005680 [ 7 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45190_45190_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:20.034334 [ 46 ] {125fbafa-42b4-41a2-8983-e3b855d6149f} <Debug> MemoryTracker: Peak memory usage (for query): 50.24 MiB.
2019.01.31 15:39:20.035376 [ 46 ] {} <Information> TCPHandler: Processed in 3.173 sec.
2019.01.31 15:39:20.034334 [ 7 ] {} <Error> logdb.logs (StorageReplicatedMergeTree): DB::StorageReplicatedMergeTree::queueTask()::<lambda(DB::StorageReplicatedMergeTree::LogEntryPtr&)>: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = No message received, e.what() = No message received
2019.01.31 15:39:20.035542 [ 46 ] {} <Information> TCPHandler: Done processing connection.
2019.01.31 15:39:20.048498 [ 24 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:20.077485 [ 11 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45190_45190_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:20.093179 [ 29 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375551 - log-0000375551
2019.01.31 15:39:20.156559 [ 29 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:20.625886 [ 17 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetched part 20190131_45178_45178_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:20.699732 [ 31 ] {} <Debug> logdb.logs (MergerMutator): Selected 6 parts from 20190131_45173_45173_0 to 20190131_45178_45178_0
2019.01.31 15:39:20.742512 [ 26 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375552 - log-0000375552
2019.01.31 15:39:20.768180 [ 26 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:20.799946 [ 9 ] {} <Debug> logdb.logs (MergerMutator): Merging 6 parts: from 20190131_45173_45173_0 to 20190131_45178_45178_0 into tmp_merge_20190131_45173_45178_1
2019.01.31 15:39:21.033075 [ 23 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375553 - log-0000375553
2019.01.31 15:39:21.046616 [ 23 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:21.144712 [ 36 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375554 - log-0000375554
2019.01.31 15:39:21.188217 [ 36 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:21.277075 [ 13 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetched part 20190131_45179_45179_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:21.333644 [ 7 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45193_45193_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:21.333936 [ 16 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45192_45192_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:21.334383 [ 7 ] {} <Error> logdb.logs (StorageReplicatedMergeTree): DB::StorageReplicatedMergeTree::queueTask()::<lambda(DB::StorageReplicatedMergeTree::LogEntryPtr&)>: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = No message received, e.what() = No message received
2019.01.31 15:39:21.365187 [ 15 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetched part 20190131_45180_45180_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:21.489950 [ 52 ] {ee75f492-9571-4748-b9d3-b95b5f948ab8} <Debug> logdb.logs (Replicated OutputStream): Wrote block with ID '20190131_6528039023252625981_11318475687759563261', 23 rows
2019.01.31 15:39:21.834537 [ 53 ] {} <Debug> TCPHandler: Connected ClickHouse client version 19.1.0, revision: 54413, database: logdb, user: fluentd.
2019.01.31 15:39:21.840048 [ 53 ] {d89a36b5-636b-4acf-9f27-b2ecc81e0e5c} <Debug> executeQuery: (from 10.128.4.1:60542, user: fluentd) INSERT INTO logs FORMAT JSONEachRow
2019.01.31 15:39:21.857123 [ 35 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375555 - log-0000375555
2019.01.31 15:39:21.892457 [ 35 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:21.905713 [ 27 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375556 - log-0000375556
2019.01.31 15:39:21.963052 [ 15 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45193_45193_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:21.966984 [ 4 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetched part 20190131_45181_45181_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:21.968167 [ 52 ] {ee75f492-9571-4748-b9d3-b95b5f948ab8} <Debug> MemoryTracker: Peak memory usage (for query): 50.27 MiB.
2019.01.31 15:39:21.968740 [ 52 ] {} <Information> TCPHandler: Processed in 3.765 sec.
2019.01.31 15:39:21.969419 [ 52 ] {} <Information> TCPHandler: Done processing connection.
2019.01.31 15:39:21.969745 [ 27 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:21.984459 [ 15 ] {} <Error> logdb.logs (StorageReplicatedMergeTree): DB::StorageReplicatedMergeTree::queueTask()::<lambda(DB::StorageReplicatedMergeTree::LogEntryPtr&)>: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = No message received, e.what() = No message received
2019.01.31 15:39:22.006068 [ 51 ] {c19ee605-bc0e-43bc-8b04-2eb894656013} <Debug> logdb.logs (Replicated OutputStream): Wrote block with ID '20190131_3167656419313925407_12727283588511619094', 108 rows
2019.01.31 15:39:22.015910 [ 46 ] {} <Debug> TCPHandler: Connected ClickHouse client version 19.1.0, revision: 54413, database: logdb, user: fluentd.
2019.01.31 15:39:22.036797 [ 46 ] {c80eab3c-9254-4649-9d01-46d1e35edc86} <Debug> executeQuery: (from 10.130.4.1:53634, user: fluentd) INSERT INTO logs FORMAT JSONEachRow
2019.01.31 15:39:22.048986 [ 17 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45195_45195_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:22.640416 [ 9 ] {} <Debug> logdb.logs (MergerMutator): Selected MergeAlgorithm: Horizontal
2019.01.31 15:39:22.675019 [ 4 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45193_45193_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:22.675696 [ 8 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetched part 20190131_45182_45182_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:22.696415 [ 4 ] {} <Error> logdb.logs (StorageReplicatedMergeTree): DB::StorageReplicatedMergeTree::queueTask()::<lambda(DB::StorageReplicatedMergeTree::LogEntryPtr&)>: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = No message received, e.what() = No message received
2019.01.31 15:39:22.696609 [ 51 ] {c19ee605-bc0e-43bc-8b04-2eb894656013} <Debug> MemoryTracker: Peak memory usage (for query): 50.53 MiB.
2019.01.31 15:39:22.697803 [ 51 ] {} <Information> TCPHandler: Processed in 4.250 sec.
2019.01.31 15:39:22.697838 [ 51 ] {} <Information> TCPHandler: Done processing connection.
2019.01.31 15:39:22.741025 [ 29 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375557 - log-0000375557
2019.01.31 15:39:22.780600 [ 29 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:22.819382 [ 48 ] {3be7c7c5-b5f8-441b-9557-15aaa7cb1641} <Debug> logdb.logs (Replicated OutputStream): Wrote block with ID '20190131_11005474187308461911_5406784493284169051', 3016 rows
2019.01.31 15:39:22.854585 [ 24 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375558 - log-0000375558
2019.01.31 15:39:22.898054 [ 24 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulled 1 entries to queue.
2019.01.31 15:39:22.911515 [ 8 ] {} <Debug> logdb.logs (StorageReplicatedMergeTree): Fetching part 20190131_45193_45193_0 from /clickhouse/logs/tables/1/logs/replicas/clickhouse-0.clickhouse-headless.clickhouse.svc.cluster.local
2019.01.31 15:39:22.976956 [ 50 ] {} <Debug> TCPHandler: Connected ClickHouse client version 19.1.0, revision: 54413, database: logdb, user: fluentd.
2019.01.31 15:39:22.977445 [ 38 ] {} <Debug> logdb.logs (ReplicatedMergeTreeQueue): Pulling 1 entries to queue: log-0000375559 - log-0000375559
2019.01.31 15:39:22.982398 [ 12 ] {} <Debug> logdb.logs (MergerMutator): Merge sorted 15881 rows, containing 19 columns (19 merged, 0 gathered) in 9.41 sec., 1687.71 rows/sec., 1.32 MB/sec.
2019.01.31 15:39:22.991422 [ 50 ] {ed31ae1e-bc37-4334-9809-6a8b9731c962} <Debug> executeQuery: (from 10.129.6.1:58166, user: fluentd) INSERT INTO logs FORMAT JSONEachRow
2019.01.31 15:39:23.025535 [ 48 ] {3be7c7c5-b5f8-441b-9557-15aaa7cb1641} <Debug> MemoryTracker: Peak memory usage (for query): 56.10 MiB.
2019.01.31 15:39:23.025705 [ 48 ] {} <Information> TCPHandler: Processed in 3.944 sec.
2019.01.31 15:39:23.027713 [ 48 ] {} <Information> TCPHandler: Done processing connection. 

@PangKuo
Copy link

PangKuo commented Feb 22, 2019

Is this the first line with the session expired exception? I need the first similar error and some lines preceding it. If it is possible, just upload the logs somewhere so that I can look at them, that will be easier.

version18.16.1
I have the same problem with issue #3934, is there any other way to fix it except upgrade clickhouse -server version

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Confirmed user-visible misbehaviour in official release
Projects
None yet
Development

No branches or pull requests

5 participants