[Bug]: Bulk download (zip generation) from S3 primary storage downloads truncated files #42248
Comments
I found that I'm hitting this issue because of the way ZipStreamer is reading the data:
So, if I change it like so:
Then my issue is fixed. The behavior was changed here: DeepDiver1975/PHPZipStreamer@22515e3. However, I believe that change is incorrect: the file size is already known by that point, so it should be passed to the function and used for the required checks, instead of relying on |
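To illustrate the idea of checking against the known file size, here is a minimal sketch in Python. It is not the PHPZipStreamer code; the function names `copy_until_empty_read` and `copy_exact` are hypothetical. The first loop trusts the stream's own end-of-data signal, so a source that stops early yields a silently truncated copy. The second loop is told the expected size and fails loudly if the stream ends before that many bytes arrive.

```python
import io


def copy_until_empty_read(src, dst, block_size=1024 * 1024):
    """Naive copy: the first empty read is treated as end of file.

    If the source stops delivering data early, the caller cannot tell,
    because the function simply returns the number of bytes it saw.
    """
    copied = 0
    while True:
        chunk = src.read(block_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
    return copied


def copy_exact(src, dst, expected_size, block_size=1024 * 1024):
    """Size-aware copy: keep reading until expected_size bytes are copied.

    An empty read before that point is reported as an error instead of
    being accepted as a normal end of file.
    """
    copied = 0
    while copied < expected_size:
        chunk = src.read(min(block_size, expected_size - copied))
        if not chunk:
            raise EOFError(
                f"stream ended after {copied} of {expected_size} bytes"
            )
        dst.write(chunk)
        copied += len(chunk)
    return copied


if __name__ == "__main__":
    # Simulated source that only delivers 10 of the 20 bytes it should hold.
    short_source = io.BytesIO(b"x" * 10)
    print(copy_until_empty_read(short_source, io.BytesIO()))  # silently short

    try:
        copy_exact(io.BytesIO(b"x" * 10), io.BytesIO(), expected_size=20)
    except EOFError as exc:
        print("detected:", exc)
```

With the size-aware variant, a short stream aborts the operation rather than producing a ZIP entry that looks valid but is missing data.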
Bug description
When downloading a large folder from S3 primary storage, the resulting ZIP file contains some corrupted (truncated) files: usually only a few, with varying original sizes (I've seen it happen with both smaller and larger files). It's not always the same files. The resulting ZIP file itself is valid: it opens and extracts fine and even passes the CRC check. Only the few files inside are truncated (as if they were packed that way to begin with).
The number of files in the folder doesn't seem to matter; size seems to be more of an issue. It can happen even if I upload a single 2.6 GB file to a folder, and download that folder, but is more prevalent with bigger ones. Downloading the files directly one by one works, so they're uploaded correctly, but downloading the whole folder most often results in some truncated files.
My instance is located on a Google Compute Engine VM, with the primary storage being a Google Cloud Storage bucket in the same region. Because of this, the bucket access is very fast, and I can upload and download at 30-50 MB/s via Nextcloud (and the VM itself can access the bucket much faster). The VM has several gigabytes of free RAM and disk space when the issue happens, and there are no logs whatsoever (only an HTTP 200/OK log line for the download request from Nginx, with the truncated size).
I've verified that this only happens with S3 primary storage. Downloads from a local external storage from the same instance work fine (with the same folder contents, tested multiple times), so I think web server issues can be ruled out (unless this is some kind of receive timeout issue caused by S3? But it seems to be downloading just fine, before it stops, and again, the ZIP itself isn't corrupted, just some files inside). I don't have a separate proxy.
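Since the archive passes its own CRC check, the truncated entries have to be found by comparing against the originals. A rough sketch of such a check in Python follows, assuming local copies of the original files are available in a directory whose layout matches the entry names inside the ZIP. The function name `find_truncated_entries` and both paths in the usage example are hypothetical.

```python
import zipfile
from pathlib import Path


def find_truncated_entries(zip_path, originals_dir):
    """Return (name, size_in_zip, original_size) for entries whose sizes differ.

    Assumes originals_dir mirrors the paths stored in the archive; entries
    with no matching local file are skipped.
    """
    originals_dir = Path(originals_dir)
    mismatches = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            original = originals_dir / info.filename
            if not original.is_file():
                continue
            expected = original.stat().st_size
            # file_size is the uncompressed size recorded for the entry.
            if info.file_size != expected:
                mismatches.append((info.filename, info.file_size, expected))
    return mismatches


if __name__ == "__main__":
    # Hypothetical paths: the downloaded archive and the folder of originals.
    for name, got, expected in find_truncated_entries("download.zip", "originals"):
        print(f"{name}: {got} bytes in ZIP, {expected} bytes expected")
```

Because a folder download typically stores entries under the folder's name, `originals_dir` would need to be the parent of the local copy of that folder for the paths to line up.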
Steps to reproduce
Expected behavior
Bulk downloads from S3 shouldn't produce a ZIP with truncated files inside, despite showing a successful download.
Installation method
Community Docker image
Nextcloud Server version
27
Operating system
Debian/Ubuntu
PHP engine version
PHP 8.2
Web server
Nginx
Database engine version
PostgreSQL
Is this bug present after an update or on a fresh install?
Fresh Nextcloud Server install
Are you using the Nextcloud Server Encryption module?
Encryption is Disabled
What user-backends are you using?
Configuration report
List of activated Apps
Nextcloud Signing status
Nextcloud Logs
No response
Additional info