broken pipe driver issue #844

Closed
piyushdatazip opened this issue Dec 12, 2022 · 17 comments
Labels
investigate stale requires a follow-up

Comments

@piyushdatazip

Issue description

cause: doRequest: transport failed to send a request to ClickHouse: write tcp 19.17.2.250:49816->10.100.164.134:80: write: broken pipe. 

I get this error when running a batch INSERT against ClickHouse, with 10,000 objects in a single batch.

Interface: database/sql

Driver version: v2.4.3

Go version: go version go1.19.1 darwin/arm64

ClickHouse Server version: Tried the latest release as well as a 4-month-old release

@piyushdatazip
Author

piyushdatazip commented Dec 12, 2022

Mentioning @jkaflik here for attention to this issue. A similar issue was reported in ClickHouse/ClickHouse#17446
for clickhouse-jdbc. I hit these errors randomly when rows are heavy in size; sometimes it happens, sometimes not.

@piyushdatazip
Author

piyushdatazip commented Dec 12, 2022

I'm also getting similar errors, such as EOF, use of closed network connection, or dial tcp 52.66.145.214:8123: connect: connection refused

cause: Post "CLICKHOUSEDSN&date_time_input_format=best_effort&default_format=Native&wait_end_of_query=1": EOF.

@jkaflik
Contributor

jkaflik commented Dec 13, 2022

Hi @piyushdatazip

What's your estimated block size? You are using the HTTP protocol, right? Do you know if that's reproducible on native?

@vnazarenko

I'm also getting write: write tcp 127.0.0.1:37686->127.0.0.1:9000: write: broken pipe during batch inserts. I have a script that inserts a pretty big amount of data (hundreds of millions) into ClickHouse in batches, so it runs for hours, and this error only starts to appear after several hours.

@rtkaratekid
Contributor

rtkaratekid commented Dec 13, 2022

Just wanted to chime in and say that I also ran across this same error today. I had not seen it before today. My insert code looks like this:

if err := batch.Send(); err != nil && err != clickhouse.ErrBatchAlreadySent {
	log.Println("Failed inserting batch of flows, retrying: ", err)

	if err := batch.Send(); err != nil {
		lostFlowCount += flowCount
		log.Println("Retry insert failed, continuing to next batch: ", err)
		log.Println("Number of lost flows: ", lostFlowCount)
	}
}

The first insert attempt fails with:
Failed inserting batch of flows, retrying: write tcp <ip>:<send port> -><ip>:9000: write: broken pipe

The retry fails with:
Retry insert failed, continuing to next batch: clickhouse: batch has already been sent
edit: I forgot to mention that although it says the batch has already been sent, there was no data inserted into the db at any point.

I'll add a batch already sent error check, but I just wanted to add that this was something I saw as well.

@jkaflik
Contributor

jkaflik commented Dec 13, 2022

@vnazarenko @rtkaratekid
Which ClickHouse version and Go client version are you running?

@rtkaratekid
Did anything change in your setup around the time this error started to appear?

@vnazarenko

go 1.19
require (
   github.com/ClickHouse/clickhouse-go/v2 v2.4.0
)

@rtkaratekid
Contributor

rtkaratekid commented Dec 13, 2022

go1.19.4
github.com/ClickHouse/clickhouse-go/v2 v2.0.14

ClickHouse server version 22.11.2 revision 54460

I don't think anything has changed, but I'm also poking away at this to see if it's my fault.

@jkaflik
Contributor

jkaflik commented Dec 14, 2022

@vnazarenko @rtkaratekid can you check whether it's reproducible on a recent client version?

@rtkaratekid
Contributor

@jkaflik I dug into this more today. What I found is that while ClickHouse was setting up (I have a bit of DB migration code as well), there would be about 100 to 200 failed inserts, and after that everything worked fine. I don't think it's a timing issue, because my migration code blocks until the migration is found to be successful, but it could be. I'll probably get around to trying it with an updated client tomorrow, but it's tricky for me to reproduce the issue consistently. It's not a huge deal for me right now; I just wanted to chime in and help if I could.

@jkaflik
Contributor

jkaflik commented Dec 15, 2022

@rtkaratekid sure. I want to get as many details from you as I can, so that I can set up a reproducible environment.
I'm still getting onboarded to this project and to ClickHouse in general, so there may be a bit of a gap in my understanding of things.
Is there anything suspicious in the CH logs?

@vnazarenko

@vnazarenko @rtkaratekid can you have a look if it's reproducible on a recent client version?

I'll try the new version in a few days.

@rf mentioned this issue Jan 13, 2023
@jkaflik
Contributor

jkaflik commented Jan 19, 2023

Hi @vnazarenko @rtkaratekid @piyushdatazip ,
I had a screen-sharing session with @n-oden and got some insight into how the broken pipe error arises when the underlying socket is broken.
We've merged a proposal by Nathan that mitigates the issue. If you use the native protocol, can you please check whether it solves the issue for you?

I will continue working to better understand what is happening and fix it.

@piyushdatazip
Author

Nice to hear that, @jkaflik. Since it was an urgent issue for my implementation, we switched to the native interface instead of database/sql, and the problem was resolved.

@jkaflik
Contributor

jkaflik commented Jan 20, 2023

@piyushdatazip our finding is that it is indeed caused by the database/sql driver implementation, which in some circumstances does not handle poisoned connections properly and keeps them until the max connection lifetime expires.
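Since a poisoned connection is said to be kept until the max connection lifetime, one possible stopgap for database/sql users is to shorten how long the pool keeps connections. The sketch below uses the standard *sql.DB pool setters; the type poolLimits, the function defaultPoolLimits, and the specific durations are hypothetical values chosen for illustration, not recommendations from the maintainers. This only bounds how long a bad connection can linger and is not the mitigation that was merged upstream.

```go
package main

import (
	"database/sql"
	"time"
)

// poolLimits groups the database/sql pool settings that bound how long a
// connection can be reused. The type and its defaults are illustrative only.
type poolLimits struct {
	MaxLifetime  time.Duration // upper bound on a connection's total age
	MaxIdleTime  time.Duration // upper bound on time spent idle in the pool
	MaxIdleConns int           // how many idle connections the pool retains
}

// defaultPoolLimits returns conservative, hypothetical values.
func defaultPoolLimits() poolLimits {
	return poolLimits{
		MaxLifetime:  5 * time.Minute,
		MaxIdleTime:  1 * time.Minute,
		MaxIdleConns: 2,
	}
}

// applyTo configures db so that connections are recycled sooner, which limits
// how long a poisoned connection can stay in the pool.
func (l poolLimits) applyTo(db *sql.DB) {
	db.SetConnMaxLifetime(l.MaxLifetime)
	db.SetConnMaxIdleTime(l.MaxIdleTime)
	db.SetMaxIdleConns(l.MaxIdleConns)
}
```

Calling defaultPoolLimits().applyTo(db) right after opening the database handle would apply these limits. Shorter lifetimes mean more reconnects, so the values are a trade-off between connection churn and exposure to stale connections.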

@jkaflik
Contributor

jkaflik commented Feb 16, 2023

Hi @vnazarenko @rtkaratekid @piyushdatazip

Any updates on this topic? I hope you don't encounter such issues anymore. If no new reports arrive, I will close this ticket.

@jkaflik added the stale requires a follow-up label Feb 16, 2023
@piyushdatazip
Author

@jkaflik you can close the issue
