
Add support for saving video files to object storage #4290

Merged: 24 commits into Chocobozzz:develop on Aug 17, 2021

Conversation

@pingiun (Contributor) commented Jul 30, 2021

Description

I've reworked this PR a bit, so here is the new description. I've kept the old description below.

This PR adds a job that normally runs after transcoding (unless disabled) and uploads video files to an S3-compatible object store. This improves on the current situation because files are uploaded in a job instead of during a request, and the native APIs are used, which ensures files aren't rewritten. Some model fields are added to keep track of the moving state and to make sure the last part of the job runs only once.

The movingJobsRunning field in the video model is incremented whenever a transcode job is created and only decremented once the uploads for the created videoFile are done. The new job can then detect when the last videoFile has been uploaded, because movingJobsRunning will be 0, and the cleanup part of the job can run. This uploads the master HLS playlist, removes the files from the local disk, and sets the video to published.
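A minimal sketch of that bookkeeping, under illustrative names (the Video shape and the helper functions below are hypothetical, not the PR's actual code):

```ts
// Hypothetical sketch of the movingJobsRunning bookkeeping described above.
interface Video {
  uuid: string
  movingJobsRunning: number
}

// Called whenever a transcode job is created for this video.
function onTranscodeJobCreated (video: Video): void {
  video.movingJobsRunning += 1
}

// Called once the upload of the resulting videoFile is done.
// Returns true when this was the last pending upload, meaning the
// cleanup step (upload the master HLS playlist, remove local files,
// publish the video) may now run exactly once.
function onVideoFileUploaded (video: Video): boolean {
  video.movingJobsRunning -= 1
  return video.movingJobsRunning === 0
}
```

In the real job the counter would live in the database and be updated atomically, so two concurrent uploads cannot both observe 0.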

The storage field on videoFile and videoStreamingPlaylist is used to create the correct URL to send in the API. The client then correctly loads the files from object storage (as long as CORS is correctly configured on the bucket). The streaming_playlists_url_template config field is used for custom URL generation; this is explained in the comment in default.yaml.

Old PR text before rework

This adds a job that runs after all transcodes are done and uploads all the video files to object storage (S3-compatible). New fields are added to the video, videoFile, and videoStreamingPlaylist models to keep track of transcodes and of where files are stored.

The transcodeJobsRunning field in the video model is incremented whenever a transcode job is created and decremented whenever one finishes. The job that moves files to object storage is added after each transcode job finishes, but only runs when transcodeJobsRunning is 0, so that all files can be uploaded at once.

The storage field on videoFile and videoStreamingPlaylist is used to create the correct URL to send in the API. The client then correctly loads the files from object storage (as long as CORS is correctly configured on the bucket).

Currently the old files are not removed, because I wanted to ask for feedback on how best to do this. In principle the job that moves files to object storage could delete the files right after they are uploaded, because we are sure that no transcoding jobs are running anymore. However, it might be the case that a user is currently watching a video that is streaming from those files, and I think removing the files might cause the stream to drop.

In addition to this I also need to implement a mechanism to get the files back to the local filesystem whenever local operations need to be done. I think this is only the case when transcoding, but I'm not sure about that.

TODO (need help):

  • Remove old files from the filesystem
  • Mechanism to retrieve files from object storage whenever new transcodes are needed

Related issues

Fixes #3661

Has this been tested?

  • 👍 yes, I added tests to the test suite
  • 💭 no, because this PR is a draft and still needs work
  • 🙅 no, because this PR does not update server code
  • 🙋 no, because I need help

@pingiun (Author) commented Jul 30, 2021

I already started working on the new documentation: https://framagit.org/pingiun/documentation/-/blob/pingiun-master-patch-80848/admin-remote-storage.md

@kontrollanten (Contributor) commented:
Elegant! Impressive work.

> Currently the old files are not removed, because I wanted to ask for feedback on how best to do this. In principle the job that moves files to object storage could delete the files right after they are uploaded, because we are sure that no transcoding jobs are running anymore. However, it might be the case that a user is currently watching a video that is streaming from those files, and I think removing the files might cause the stream to drop.

Can we add a video state called "processing" (or something similar) until the upload to S3 is done? That would mean that the video isn't available until the files are in the bucket.

A future request would be the possibility to set the S3 download base URL. That way it would be possible to set the file URL to a CDN, to reduce costs.

@pingiun (Author) commented Jul 30, 2021

> Can we add a video state called "processing" (or something similar) until the upload to S3 is done? That would mean that the video isn't available until the files are in the bucket.

This might actually be the most elegant solution, good idea.

@SimplyCorbett commented Jul 30, 2021

> A future request would be the possibility to set the S3 download base URL. That way it would be possible to set the file URL to a CDN, to reduce costs.

Isn't it already possible to do this by caching the /w/xyz and streaming-playlists portion of the URL using varnish or nuster or similar?

@kontrollanten (Contributor) commented:

> > A future request would be the possibility to set the S3 download base URL. That way it would be possible to set the file URL to a CDN, to reduce costs.
>
> Isn't it already possible to do this by caching the /w/xyz and streaming-playlists portion of the URL using varnish or nuster or similar?

Today that's possible, but as I understand this PR, it will change the video file URL so that the browser will download it directly from S3 (correct me if I'm wrong, @pingiun).

@pingiun (Author) commented Jul 31, 2021

> Today that's possible, but as I understand this PR, it will change the video file URL so that the browser will download it directly from S3 (correct me if I'm wrong, @pingiun).

This is correct.

@SimplyCorbett commented Jul 31, 2021

> > Today that's possible, but as I understand this PR, it will change the video file URL so that the browser will download it directly from S3 (correct me if I'm wrong, @pingiun).
>
> This is correct.

Then there needs to be a way to cache the video. This isn't a good idea. Cache mirrors are cheap. I can get high-bandwidth servers for a fairly cheap price.

Direct S3 would be many times more expensive.

@pingiun (Author) commented Aug 1, 2021

> > > Today that's possible, but as I understand this PR, it will change the video file URL so that the browser will download it directly from S3 (correct me if I'm wrong, @pingiun).
> >
> > This is correct.
>
> Then there needs to be a way to cache the video. This isn't a good idea. Cache mirrors are cheap. I can get high-bandwidth servers for a fairly cheap price.
>
> Direct S3 would be many times more expensive.

With these changes it's still possible to use the old method of S3 storage via s3fs.

Adding the suggestion from @kontrollanten should be fairly simple though; I can try to add that to this PR.

@pingiun (Author) commented Aug 1, 2021

@SimplyCorbett Can you take a look at 26cabf2 and see if this change adds the right support for you?

@pingiun (Author) commented Aug 2, 2021

> I think a base URL should be enough for most cases.

Changed in 03c6d12

@kontrollanten (Contributor) commented:
> Mechanism to retrieve files from object storage whenever new transcodes are needed

Do you have any ideas about how to solve this? I'm thinking about the following solution:

  1. In handleHLSJob, a check is done whether the video file exists in the PeerTube tmp folder. If not, it is downloaded from S3 (see the sketch after this list).
  2. When handleHLSJob is done, a new job, cleanupTmpVideo, is created.
  3. The cleanupTmpVideo job checks whether there are any video transcoding jobs in the queue; if not, it removes the tmp file.

What do you think about that?
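A rough TypeScript sketch of step 1, for discussion; ensureLocalVideoFile and downloadFromObjectStorage are hypothetical names, and the actual object-storage download is abstracted away:

```ts
import { promises as fs } from 'fs'
import { join } from 'path'

// Step 1 of the proposal: make sure the video file is on local disk
// before transcoding, downloading it from object storage if missing.
async function ensureLocalVideoFile (
  tmpDir: string,
  filename: string,
  downloadFromObjectStorage: (key: string, destination: string) => Promise<void>
): Promise<string> {
  const localPath = join(tmpDir, filename)

  try {
    await fs.access(localPath) // already in the PeerTube tmp folder
  } catch {
    await downloadFromObjectStorage(filename, localPath)
  }

  return localPath
}
```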

@Chocobozzz (Owner) commented Aug 3, 2021

Hello and thanks @pingiun

Since it's still a draft, here are some code remarks:

  • Try to use an objectStorage prefix instead of an s3 prefix for your function/variable/config names
  • Prefer a tree structure for the config: s3.streaming_playlists_bucket -> object_storage.streaming_playlists.bucket
  • Use uppercase for config constants: S3.STREAMING_PLAYLISTS_BUCKETINFO.bucket -> OBJECT_STORAGE.STREAMING_PLAYLISTS.BUCKET (maybe BUCKET_NAME instead of BUCKET)
  • I suggest renaming moveJobsRunning to pendingMovingJobs (or better, creating a dedicated videoJobInfo table with pendingMove & videoId columns, so we could also use this table to track pending transcoding jobs; see the sketch after this list)
  • https://github.com/Chocobozzz/PeerTube/pull/4290/files#diff-3e26d41ca4bda1de8e1747af70ca2af642abcc1e9e0bfb94239ff2165acfbde5R19 uses a string instead of an integer
  • I think we should store the origin object storage URL in fileUrl, without base_url injection. Instead, inject the base_url at "runtime" so admins can easily change this configuration without running a script to update DB URLs
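A sketch of what the suggested videoJobInfo table could look like with sequelize-typescript (which PeerTube's models use); the pendingMove and videoId columns follow the remark above, while the table options and the extra pendingTranscode counter are assumptions:

```ts
import { AllowNull, Column, Default, Model, Table } from 'sequelize-typescript'

// Hypothetical model for the suggested videoJobInfo table: one row
// per video, with a counter per kind of pending job.
@Table({ tableName: 'videoJobInfo' })
export class VideoJobInfoModel extends Model {

  @AllowNull(false)
  @Default(0)
  @Column
  pendingMove: number

  @AllowNull(false)
  @Default(0)
  @Column
  pendingTranscode: number

  // Foreign key to the video this row tracks
  @AllowNull(false)
  @Column
  videoId: number
}
```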

pingiun added a commit to pingiun/PeerTube that referenced this pull request Aug 3, 2021
@pingiun (Author) commented Aug 3, 2021

@Chocobozzz thanks for the good suggestions, I implemented them all in 105cc1f

Do you have an idea of the places where the file should be locally available? This is my first time working on this codebase, so not everything is in my head yet. I'm not sure under what circumstances transcodes can run, for example, so I don't know exactly how to make sure the lifecycle of local files stays correct.

@kukhariev (Contributor) commented:
@pingiun has this been tested with files larger than 5 gigabytes?

@pingiun (Author) commented Aug 3, 2021

> @pingiun has this been tested with files larger than 5 gigabytes?

Thanks for telling me about this limit; I didn't know that the official SDK would not split up large files automatically.

I added basic support for this in ee4c80a, but it might require some more thinking about aborting multipart requests when something goes wrong, so parts aren't left in the bucket. Another option would be to ask the user to set up a lifecycle rule in the bucket that deletes old object parts.

As can be seen in the commit, I made a hacky addition to the ReadStream to make streaming to the API work. I also created a PR at the official S3 SDK repo so that we might change this back in the future.
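For reference, a simplified sketch of that multipart flow with abort-on-failure, so a failed upload doesn't leave orphaned parts in the bucket. The chunking is reduced to an in-memory array here; the real code streams the file, and this is not the PR's actual implementation:

```ts
import {
  AbortMultipartUploadCommand,
  CompleteMultipartUploadCommand,
  CreateMultipartUploadCommand,
  S3Client,
  UploadPartCommand
} from '@aws-sdk/client-s3'

async function multipartUpload (client: S3Client, bucket: string, key: string, chunks: Buffer[]) {
  const { UploadId } = await client.send(
    new CreateMultipartUploadCommand({ Bucket: bucket, Key: key })
  )

  try {
    const parts: { ETag?: string, PartNumber: number }[] = []

    for (let i = 0; i < chunks.length; i++) {
      const { ETag } = await client.send(new UploadPartCommand({
        Bucket: bucket,
        Key: key,
        UploadId,
        PartNumber: i + 1, // part numbers start at 1
        Body: chunks[i]    // every part except the last must be at least 5 MiB
      }))

      parts.push({ ETag, PartNumber: i + 1 })
    }

    await client.send(new CompleteMultipartUploadCommand({
      Bucket: bucket,
      Key: key,
      UploadId,
      MultipartUpload: { Parts: parts }
    }))
  } catch (err) {
    // Abort so the provider discards the parts uploaded so far,
    // instead of silently keeping (and billing for) them.
    await client.send(new AbortMultipartUploadCommand({ Bucket: bucket, Key: key, UploadId }))
    throw err
  }
}
```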

pingiun force-pushed the add-object-storage-support branch from ea8917b to ee4c80a on August 3, 2021
Chocobozzz mentioned this pull request on Aug 5, 2021
Outdated review threads (resolved) on: server/controllers/api/videos/upload.ts, server/initializers/checker-after-init.ts, server/initializers/migrations/0660-object-storage.ts, server/lib/object-storage.ts, server/lib/video-paths.ts, server/models/video/video-file.ts, server/models/video/video-job-info.ts
@Chocobozzz (Owner) commented:
> Do you have an idea of the places where the file should be locally available?

I think you can put the files in https://github.com/Chocobozzz/PeerTube/blob/develop/config/production.yaml.example#L78

@pingiun (Author) commented Aug 5, 2021

I'm not sure why the tests are failing; the same thing happens when I run the tests locally.

> I think you can put the files in https://github.com/Chocobozzz/PeerTube/blob/develop/config/production.yaml.example#L78

I think you may have misunderstood; I meant which places in the code. That is, which parts of the code need the files to be stored locally, and need a change to support files that may not exist locally?

@Chocobozzz (Owner) commented Aug 6, 2021

> I think you may have misunderstood; I meant which places in the code.

Oh sorry, then I think everywhere we use getVideoFilePath (we use it to generate torrents and thumbnails, for transcoding, and for playlist generation, which should only happen on video upload/update and in scripts).

> I'm not sure why the tests are failing; the same thing happens when I run the tests locally.

The server build fails because of the stream/promises import: https://github.com/Chocobozzz/PeerTube/pull/4290/checks?check_run_id=3251005770#step:10:3480

@pingiun (Author) commented Aug 6, 2021

> > I'm not sure why the tests are failing; the same thing happens when I run the tests locally.
>
> The server build fails because of the stream/promises import: https://github.com/Chocobozzz/PeerTube/pull/4290/checks?check_run_id=3251005770#step:10:3480

Thanks, I didn't catch that. I see that module was only added in Node 15.0, so I will try to remove the usage.
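A small sketch of the likely workaround, assuming the code only needs an awaitable pipeline: promisifying the callback-style pipeline from 'stream' behaves the same and works on Node versions older than 15:

```ts
import { pipeline } from 'stream'
import { promisify } from 'util'

// Equivalent to `import { pipeline } from 'stream/promises'`,
// but available before Node 15.
const pipelinePromise = promisify(pipeline)
```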

pingiun marked this pull request as ready for review on August 6, 2021
pingiun added 3 commits on August 6, 2021:

Uses two config keys to support URL generation that doesn't go directly to the (S3-compatible) storage. Can be used to generate URLs to any cache server or CDN.
pingiun requested a review from Chocobozzz on August 6, 2021
@Chocobozzz (Owner) commented:
I'll update this PR, thanks @pingiun

Outdated review threads (resolved) on: config/default.yaml, server/lib/video.ts, server/models/video/video-job-info.ts
Chocobozzz force-pushed the add-object-storage-support branch from 39eb385 to 2b71af9 on August 16, 2021
@Chocobozzz (Owner) commented:
The PR is ready.

@pingiun Could you open an MR for the documentation?

Chocobozzz merged commit 0305db2 into Chocobozzz:develop on Aug 17, 2021

A review comment on this snippet:

```ts
const createMultipartCommand = new CreateMultipartUploadCommand({
  Bucket: bucketInfo.BUCKET_NAME,
  Key: key
```

We should be setting ACL here, as uploads are private by default for a lot of S3 providers.

Something like this (suggested change):

```diff
- Key: key
+ Key: key,
+ ACL: "public-read"
```

I'm happy to follow this up in a separate PR. For now I'm just setting the bucket ACL to public.

@Chocobozzz (Owner) replied:
You can review this PR and I'll resolve the conversations directly on develop :)

@pingiun (Author) replied Aug 17, 2021:

I only tested with Backblaze B2 before; this does indeed seem to be necessary for AWS S3.

@hugomd replied Aug 18, 2021:

Reflecting on this a bit more, the ACL can likely be handled by the bucket policy.

We could follow up to specifically handle private buckets, pre-signing each URL before handing it off to the client.

Here's an example policy that disables file listing but allows retrieving objects:

```json
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AddPerm",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
```

@pingiun (Author) commented Aug 17, 2021

@Chocobozzz it appears through my testing that you changed something so that local files are needed after they've been deleted. I now see the following in my logs:

```
[kubernetes.docker.internal:9000] 2021-08-17 17:21:46.663 info: Moving video 011c85e3-0c5e-459f-9f8b-4870b746ca41 in job 8.
[kubernetes.docker.internal:9000] 2021-08-17 17:21:48.332 info: Fetching HLS file hls_011c85e3-0c5e-459f-9f8b-4870b746ca41/7c6c9838-4d4f-4280-bf82-435b96d53465-360-fragmented.mp4 from object storage to /data/tmp/ecf8ea4b-217e-417f-9ed1-864c34dc47e1.mp4.
[kubernetes.docker.internal:9000] 2021-08-17 17:21:48.631 info: Creating torrent /data/torrents/e44ecb3d-b22f-4aad-a848-f4fa19cde296-360-hls.torrent.
[kubernetes.docker.internal:9000] 2021-08-17 17:21:48.639 info: Running cleanup after moving files to object storage (video 011c85e3-0c5e-459f-9f8b-4870b746ca41 in job 8)
[kubernetes.docker.internal:9000] 2021-08-17 17:21:48.842 info: Publishing video 011c85e3-0c5e-459f-9f8b-4870b746ca41.
```

So files are uploaded, deleted, and then downloaded again. This wasn't happening in my code and doesn't seem very efficient (especially for large uploads).

@pingiun (Author) commented Aug 17, 2021

Another thing that doesn't work anymore is updating the endpoint URL: videos will still be loaded from the old URL because it's saved in the database. My code would always generate the URL from scratch (following your suggestion).

@Chocobozzz (Owner) commented Aug 18, 2021

> Another thing that doesn't work anymore is updating the endpoint URL: videos will still be loaded from the old URL because it's saved in the database.

In my comment I only talked about base_url, not endpoint. IMHO, when admins change the endpoint, they do not expect PeerTube to automatically consider that previously uploaded objects use the new endpoint, because they would need to migrate the old objects first; that's part of a migration process in which PeerTube should be included. It's the same if admins disable object storage at some point: we should still serve old files from object storage.

However, the base_url should stay dynamic, because a proxy that just caches files in front of an S3 storage could easily be changed.
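A sketch of what that runtime injection could look like; buildServedFileUrl is a hypothetical name, and the premise (fileUrl stores the origin object-storage URL, base_url only replaces the origin when the URL is served) follows the remark above:

```ts
// If a base_url (e.g. a caching proxy or CDN in front of the bucket)
// is configured, swap the origin of the stored URL; otherwise serve
// the origin object-storage URL as-is.
function buildServedFileUrl (storedFileUrl: string, baseUrl?: string): string {
  if (!baseUrl) return storedFileUrl

  const served = new URL(storedFileUrl)
  const base = new URL(baseUrl)

  // Keep the object key/path, replace scheme and host with the proxy's
  served.protocol = base.protocol
  served.host = base.host

  return served.toString()
}
```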

> So files are uploaded, deleted, and then downloaded again. This wasn't happening in my code and doesn't seem very efficient (especially for large uploads).

Thanks, it should be fixed in 1f6125b

@hugomd I fixed your comment in 0d4a3c6

@hugomd commented Aug 18, 2021

> @hugomd I fixed your comment in 0d4a3c6

Thanks for that, I'll pull it in and see how I go with a private bucket!

@hugomd commented Aug 18, 2021

A nice extension to this could be pre-signing the upload URL and passing it to the client, so they can upload directly to object storage without needing to proxy via PeerTube.

PeerTube could wait until the upload is finished, and then pull the file from object storage for transcoding.
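A minimal sketch of that idea with the AWS SDK's request presigner; the function name, bucket, key, and expiry are illustrative, not an existing PeerTube API:

```ts
import { PutObjectCommand, S3Client } from '@aws-sdk/client-s3'
import { getSignedUrl } from '@aws-sdk/s3-request-presigner'

// Pre-sign a PUT URL the client can upload to directly,
// without proxying the bytes through the PeerTube server.
async function createUploadUrl (client: S3Client, bucket: string, key: string): Promise<string> {
  const command = new PutObjectCommand({ Bucket: bucket, Key: key })

  // Valid for one hour; the client PUTs the raw file body to this URL
  return getSignedUrl(client, command, { expiresIn: 3600 })
}
```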

@pingiun (Author) commented Aug 18, 2021

> In my comment I only talked about base_url, not endpoint. IMHO, when admins change the endpoint, they do not expect PeerTube to automatically consider that previously uploaded objects use the new endpoint, because they would need to migrate the old objects first; that's part of a migration process in which PeerTube should be included. It's the same if admins disable object storage at some point: we should still serve old files from object storage.
>
> However, the base_url should stay dynamic, because a proxy that just caches files in front of an S3 storage could easily be changed.

Of course, this makes sense.

I created the MR for documentation here: https://framagit.org/framasoft/peertube/documentation/-/merge_requests/98

@Chocobozzz (Owner) commented:
Perfect, I'll also update the MR when I have time. Thanks @pingiun
