Trying to concat rows of ~55000 CSV files with a cumulative size of 1.4gb, xsv killed by oom_reaper #230
Please provide a reproduction. If you can't share the data, then please consider obfuscating or censoring it somehow. Indeed, this command should use very little memory. Its code is very simple and it is implemented in a straightforward streaming fashion: Lines 71 to 84 in 3de6c04
The only thing that's required is that each row must fit into memory.
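To illustrate what "streaming" means here, the following is a minimal Python sketch of row-wise concatenation. It is not xsv's actual code (xsv is written in Rust, and the referenced lines are not reproduced in this thread); the function name `cat_rows` is made up for the illustration. It shows why memory use should be bounded by the largest single record rather than by the total input size.

```python
import csv
import io
from typing import Iterable, TextIO


def cat_rows(inputs: Iterable[TextIO], output: TextIO) -> int:
    """Concatenate CSV inputs by row, writing the header only once.

    Hypothetical helper for illustration. Returns the number of data
    rows written.
    """
    writer = csv.writer(output, lineterminator="\n")
    wrote_header = False
    count = 0
    for handle in inputs:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            continue  # empty input: nothing to copy
        if not wrote_header:
            writer.writerow(header)
            wrote_header = True
        for row in reader:
            # Only the current record is held in memory at this point.
            writer.writerow(row)
            count += 1
    return count


if __name__ == "__main__":
    out = io.StringIO()
    cat_rows([io.StringIO("a,b\n1,2\n"), io.StringIO("a,b\n3,4\n")], out)
    print(out.getvalue(), end="")
```

Like the description above, the sketch assumes every input shares the same header and that each individual row fits into memory.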
Okay, here's a link to the tarball: https://drive.google.com/file/d/19UdCh9qFeuZsy1JOYUQvEPl773EVuvVc/view So, steps to reproduce:
By looking at Thanks for the help!
Oh, and:
Thank you for the easy reproduction! Unfortunately, this is a problem with the argv parser that xsv uses: docopt/docopt.rs#207. At some point, I'd like to move off that parser and use clap instead, but it's a big refactor. The only workaround available to you, I think, is to chunk it up into multiple xsv processes. The simplest way to do that is with xargs:
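The chunking idea can also be sketched in Python, which may help if the intermediate files need to be tracked explicitly. This is only a sketch under assumptions, not the xargs command suggested above: the helper names (`chunked`, `build_plan`, `run_plan`), the part-file prefix, and the chunk size are all made up, and no particular chunk size is known to be safe for the parser. It assumes `xsv` is on the PATH and relies only on `xsv cat rows <inputs>` writing to stdout.

```python
import subprocess
from typing import List, Sequence, Tuple

Plan = List[Tuple[List[str], str]]  # (argv, file that receives stdout)


def chunked(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def build_plan(paths: Sequence[str], size: int,
               part_prefix: str, final_output: str) -> Plan:
    """Plan one xsv process per chunk, then one more to join the parts."""
    plan: Plan = []
    parts: List[str] = []
    for index, chunk in enumerate(chunked(paths, size)):
        part = f"{part_prefix}{index}.csv"
        parts.append(part)
        plan.append((["xsv", "cat", "rows", *chunk], part))
    # Each part carries its own header, so the parts are joined with
    # xsv again rather than by plain byte concatenation.
    plan.append((["xsv", "cat", "rows", *parts], final_output))
    return plan


def run_plan(plan: Plan) -> None:
    """Run each planned command, redirecting its stdout to the target file."""
    for argv, target in plan:
        with open(target, "wb") as sink:
            subprocess.run(argv, stdout=sink, check=True)
```

For the directory in this issue, something like `run_plan(build_plan([f"{i}.csv" for i in range(1, 55162)], 1000, "part-", "all.csv"))` would start 56 chunk processes plus one final join; the value 1000 is a guess and may need tuning.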
Thanks, I ended up just using
Quite elegant! But I digress. I never thought I would see the day when a command-line parser eats all of my RAM. It's probably trying to do something far too clever!
awk can't parse csv correctly, so I'd be careful with that. It assumes the first header record only uses a single line, which might be true in your case but isn't in general.
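A small Python example of the multi-line header case, using made-up data: a quoted header field that contains a newline is cut in half by a line-based split, while a real CSV parser keeps it as one record.

```python
import csv
import io
from typing import List

# Made-up sample: the second header field contains a quoted newline.
DATA = 'id,"notes\n(free text)"\n1,hello\n'


def naive_header(text: str) -> List[str]:
    """Treat the first physical line as the header, as a line-based tool would."""
    return text.split("\n", 1)[0].split(",")


def parsed_header(text: str) -> List[str]:
    """Read the first CSV record, honoring quoted newlines."""
    return next(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    print(naive_header(DATA))   # header cut off inside the quoted field
    print(parsed_header(DATA))  # header kept as a single two-field record
```

If every header in a data set is known to fit on a single line, the line-based approach gives the same answer, which is why it can appear to work.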
I wrote the parser and abandoned it ages ago, because of this and other problems. The specific problem is that it uses backtracking to implement the "docopt" style. So it goes exponential in the worst case. I'd say it's decidedly not clever.
Hi, so I have 55161 csv files in a directory (1.csv to 55161.csv). I'm trying to concat them all with:
But xsv is being killed by the oom_reaper after exhausting all of my 32gb of RAM. Does anyone know why this is happening? I wouldn't really expect xsv cat to use very much memory at all, much less over 30gb of memory! Does anyone know what's going on?