encoding/csv: golang csv also so slow and consumed too much memory #16786
Comments
Can you provide two small, self-contained Go and Java programs that can be used to assess the problem? How much slower than Java is "so slow"? How much memory is "huge memory"? Are you slurping whole files into memory? Also, if you have questions about how to optimize your Go programs, you should ask elsewhere (the project does not use the issue tracker for questions; see: Questions).
As @ALTree said, we need some more information. Maybe you can also give us some more information about your CSV files, like the number of rows and columns. There are currently no options that you could specify for better performance. The internal reader used by encoding/csv uses a Maybe you can try to read directly from the files, instead of reading them into memory first? This could reduce the memory usage quite a bit. I have a CL open that optimizes encoding/csv a bit by avoiding some allocations (basically reducing allocations from 1*columns*rows to 1*rows in your case). In a simple synthetic test reading ~15 million rows of CSV I got a 17% win with my CL. If you have multiple files you could also try to parse them with multiple goroutines in parallel.
You can also avoid ReadAll if you care about memory. You should use Go's streaming APIs when your data is large. Let's move this discussion to golang-nuts@ if there's nothing concrete to do here. Reports on the bug tracker should be more concrete than "Go is slow" and "the speed and memory usage are so huge". Please provide sample code, numbers, etc.
To be fair, the standard library CSV reader is notoriously slow out of the box; we could use an issue tracking the problem. Obviously it'll need some data.
@ALTree, if you'd like to open one, please go ahead.
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
go version go1.7 linux/amd64
What operating system and processor architecture are you using (go env)?
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.
I read several CSV files, each about 50 MB. Reading is very slow and the memory usage is very high.
I compared regex with Java yesterday: #16758
I wish Go could match the speed of Java.
Go is much slower at reading CSV, and I think writing is much slower too. Is there any way to optimize this? I read CSV like:
Maybe I can set some options before reading? Seems I can't specify something like buffer size, etc. Why?