-
-
Notifications
You must be signed in to change notification settings - Fork 4.6k
feat(previews): previews for large remote files without full file download #53952
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Wow, thank you for turning my suggestion into actual code, I did not expect this! I want to clarify one thing about sparse files. A sparse file is a file with a "hole"; you do not need to fill the middle with zeroes. If you write zeroes to the middle of the file, it is not a sparse file anymore. I am not very familiar with PHP as I am with other languages, but here's an untested example: $input = ...
$output = ...
$total_size = 10*1024*1024*1024; // 10GiB
$partial_size = 5242880; // 5MiB
ftruncate($output, $total_size); // create a large file without actually writing any data to disk
// Copy first part to start of file
fseek($output, 0);
stream_copy_to_stream($input, $output, $partial_size, 0);
// Copy last part to end of file
fseek($output, $total_size - $partial_size);
stream_copy_to_stream($input, $output, $partial_size, $total_size - $partial_size); |
Thanks! So I am (incorrectly) saying it is zeroes. /dev/zero actually provides null zeroes. I quickly generated a file using ftruncate to the same size as another file I generated using /dev/zero with 0 diffs between them. |
5dc4c1b to
2595107
Compare
Indeed, there is no difference between the file content. But if you copy from /dev/zero to the file, actual data is written to disk. Example: Both files appear to be 100MB of zero data: But one file consumes 100MB of disk space, and the other 0 bytes (only metadata): It will be a lot faster to use ftruncate and fseek, instead of writing from /dev/zero to file file. Using ftruncate, you only need to write 2x 5MB of data. If you copy /dev/zero to the file, you many need to write many gigabytes of zeros to the file. |
Understood, thanks for the explanation! Will implement this. |
2595107 to
d3d3819
Compare
|
Force pushed a new one using ftruncate. Thanks again! Now onto the other issue of determining if an external storage will work properly with this. I thought that |
be54fec to
abdef72
Compare
|
Spoke too soon, it was my mistake. |
abdef72 to
e62f6b8
Compare
5c31caf to
52ccc6e
Compare
|
@provokateurin Force pushed an update. Couldn't move the Also re-added in previews for the 5 sec timestamp for local files and limited the 1 and 0 sec ones for ones that require temp files. |
provokateurin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the code comments, this was really useful to understand everything!
Would you be able to add some tests for this including real videos for testing the different scenarios? That would be really nice to ensure we don't have any regressions in the future.
Also once this PR is merged it would be cool to have an issue about the idea to let ffmpeg handle all of it. I still think we could make it work somehow and make it more reliable and effective that way.
478d921 to
c47bafb
Compare
👍
I don't know how to do that but I'll work on figuring it out. On a related note, I came across this video in the
👍 |
Head branch was pushed to by a user without write access
aaef40b to
c8069ea
Compare
|
Cypress is red because the PR is coming from a fork, but I think it's fine to merge it like this. |
|
But let's wait until after the branch off, so it gets into Nextcloud 33, as we are in the feature freeze phase already. |
😭👍 |
|
Sorry 😞 |
|
Just want to throw in some cents regarding using isLocal() |
In that scenario, it should work fine and generate the preview using the |
…ownload Co-authored-by: Kate <26026535+provokateurin@users.noreply.github.com> Co-authored-by: Côme Chilliet <91878298+come-nc@users.noreply.github.com> Signed-off-by: invario <67800603+invario@users.noreply.github.com>
c8069ea to
97a0dde
Compare
|
Just want to see if the local PR's CI gets reflected in this PR |
|
Yeah! 🥳 |
In a nutshell, previews for remote videos were partially disabled for any files larger than 5MB for performance reasons. (To avoid the situation of say...downloading a 37GB video file from S3.)
(I have a separate pending PR #53634 for S3 that allows preview generation via a direct connection to S3)
See the following for more discussion and detail of the problem this addresses:
[Bug]: Movie preview not generated on remote storages if file > 5MiB #53469
fix(previews): Allow to parse big files for previews #53496
fix(previews): avoid large file downloads for remote movie storage #52079
Summary
Thanks to an idea from Derkedes in this post: #53469 (comment)
This PR does the following:
moov atomin the original MP4/MOV filemoov atomthat it located to the same offset in the new sparse fileMovieTestRemoteFileto verify that sparse file generation is working by comparing to a reference previewImportant notes:
ffmpeghas returned an error when attempting to generate a preview/still using a timestamp that requires data from the empty/null part of the sparse file. This is good and expected behavior and will result in fallback from 5 to 1 to 0 seconds.TODO
Checklist