Memory usage and leak problem with large file uploads #89

Closed
screenmeet opened this issue Jul 19, 2018 · 7 comments

@screenmeet

The current implementation blows up the process memory on large file uploads (try a 1GB+ file, for example, and watch the process memory).

We've opened a PR that writes the intermediate upload data to temporary files instead of holding it in memory.

#88
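If you want to try the approach before it's merged, here's roughly how it would be wired up. The option names below (useTempFiles, tempFileDir) and the route, field name, and paths are just placeholders for illustration and may not match the PR exactly:

```js
const express = require('express');
const fileUpload = require('express-fileupload');

const app = express();

// Stream uploads to disk instead of buffering them in RAM.
app.use(fileUpload({
  useTempFiles: true,
  tempFileDir: '/tmp/uploads/'
}));

app.post('/upload', (req, res) => {
  // "upload" is just a placeholder for the form field name.
  const file = req.files && req.files.upload;
  if (!file) return res.status(400).send('No file uploaded');

  // mv() moves the temp file to its final location; no large Buffer is held.
  file.mv('/var/data/' + file.name, (err) => {
    if (err) return res.status(500).send(err.message);
    res.send('Upload saved');
  });
});

app.listen(3000);
```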

Enjoy!

@kyleratti commented Jul 20, 2018

Wow, god bless you! I've been going nuts trying to track down a memory leak when uploading files to my API. It works fine locally on my MacBook, but in production on my Ubuntu servers it just eats the RAM until Node crashes. The files are only around 70MB in size, too, so much lower than what you're seeing, although Node is running on a VM with 1GB of RAM.

You're the man!

kyleratti added a commit to kyleratti/tuckbot-api that referenced this issue Jul 20, 2018
Unfortunately express-fileupload has a bug dealing with larger uploads due to how it handles the received file. See richardgirges/express-fileupload#89 for more details.
@screenmeet
Author

@kyleratti You're welcome.

I also observed different behavior across Node versions and OSes in terms of how badly the buffers leaked. Sometimes it would only take 2x the file size (a 70MB upload would take 140MB); other times it would jump almost exponentially (a 100MB file would eat 700MB of RAM).

There was also a secondary issue: if an upload was aborted, those intermediate chunks would just live in memory forever (until the process crashes). That problem will still be there if you stick with the memory-based approach. I strongly recommend using the intermediate files and not memory buffers...
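To make that concrete, the temp-file approach boils down to something like the sketch below (plain busboy 0.x, which express-fileupload wraps; this is illustrative, not the library's actual code): pipe each incoming file straight to a temp file and unlink it if the request is aborted, so nothing accumulates in process memory.

```js
const http = require('http');
const fs = require('fs');
const os = require('os');
const path = require('path');
const Busboy = require('busboy');

http.createServer((req, res) => {
  const busboy = new Busboy({ headers: req.headers });

  busboy.on('file', (fieldname, fileStream, filename) => {
    const tmpPath = path.join(os.tmpdir(), 'upload-' + Date.now() + '-' + filename);
    const out = fs.createWriteStream(tmpPath);

    // Data flows straight to disk; it is never held as one big Buffer.
    fileStream.pipe(out);

    // If the client aborts, drop the partial temp file so nothing lingers.
    req.on('aborted', () => {
      out.destroy();
      fs.unlink(tmpPath, () => {});
    });

    out.on('finish', () => res.end('stored at ' + tmpPath));
  });

  req.pipe(busboy);
}).listen(3000);
```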

@kyleratti

@screenmeet yeah, that's what was so confusing about debugging this. The only difference between my environments, aside from the OS, was that the final storage was on an NFS-mounted share - but Express wasn't even calling my route handler, so there was no way that was related.

I wonder if it makes more sense to save to /dev/shm and then move it?

@screenmeet
Author

@kyleratti I wouldn't put intermediate files anywhere in memory... it's just too easy to eat up all available memory if there are many concurrent uploads or people are uploading large files. Major DDoS vulnerability if you ask me.

@kyleratti

@screenmeet that's true - my use case is a private API that connects a few of my services together, so I don't have that concern, but I see your point. You get the speed advantage with /dev/shm since it's backed by memory, but you're right, you have to keep a chunk of memory reserved for that alone.
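For a setup like mine, where the concurrency is controlled, it would just be a matter of pointing the temp dir at tmpfs - again assuming the useTempFiles/tempFileDir options from the sketch earlier in the thread:

```js
const express = require('express');
const fileUpload = require('express-fileupload');

const app = express();

// tmpfs-backed temp dir: writes are fast, but every in-flight upload still
// consumes RAM, so the concurrent-upload concern above applies.
app.use(fileUpload({
  useTempFiles: true,
  tempFileDir: '/dev/shm/uploads/'
}));
```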

@richardgirges
Owner

Hi @screenmeet, still awaiting the linting fixes in the PR. There are also some merge conflicts now. In the meantime, just wanted to let everyone know that the latest v1.0.0-alpha.1 release should address some of the memory issues: we are no longer building MD5 checksums on every upload, which was slow and taxing on memory.
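For context, the expensive pattern was hashing an upload that was already sitting in memory as one big Buffer. If anyone still needs an MD5 for their own bookkeeping, it can be computed incrementally instead; a hypothetical helper using standard Node crypto/fs (not part of express-fileupload):

```js
const crypto = require('crypto');
const fs = require('fs');

// Hypothetical helper: hash a temp file chunk by chunk instead of hashing
// one large in-memory Buffer.
function md5OfFile(tempFilePath) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash('md5');
    fs.createReadStream(tempFilePath)
      .on('data', (chunk) => hash.update(chunk))
      .on('error', reject)
      .on('end', () => resolve(hash.digest('hex')));
  });
}
```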

@richardgirges
Owner

Resolved by #109
