Custom Video Sources, Processors, and Sinks

Builders using the Amazon Chime SDK for video can produce, modify, and consume raw video frames transmitted or received during a call. You can let the facade manage its own camera capture source, provide your own custom source, or use a provided SDK capture source as the first step in a video processing pipeline that modifies frames before transmission. This guide introduces the APIs involved in using custom video sources.

Prerequisites

  • You have read the API overview and have a basic understanding of the components covered in that document.
  • You have completed Getting Started and have a running application which uses the Amazon Chime SDK.

Note: Deploying the serverless/browser demo and receiving traffic from the demo created in this post can incur AWS charges.

Using the provided camera capture implementation as a custom source to access additional functionality

While the Amazon Chime SDK uses its own camera capture implementation internally, the same capturer can be created, maintained, and used externally before being passed to the AudioVideoFacade for transmission to remote participants. This grants access to the following features:

  • Explicit camera device and format selection.
  • Configuration, starting, stopping, and video rendering before joining the call.
  • Torch/flashlight control.

The camera capture implementation is found in DefaultCameraCaptureSource. To create and use the camera capture source, complete the following steps:

  1. Create a DefaultCameraCaptureSource. This requires a Logger as a dependency.
    let cameraCaptureSource = DefaultCameraCaptureSource(logger: logger)
  2. Call VideoCaptureSource.start() and VideoCaptureSource.stop() to start and stop the capture, respectively. Note that if no VideoSink has been attached (see later sections), captured frames will be dropped immediately.
    // Start the capture
    cameraCaptureSource.start()

    // Stop the capture when complete
    cameraCaptureSource.stop()
  3. To set the capture device, use CameraCaptureSource.switchCamera() or set CameraCaptureSource.device. You can get a list of usable devices by calling MediaDevice.listVideoDevices(). To set the format, set CameraCaptureSource.format. You can get a list of usable formats by calling MediaDevice.listSupportedVideoCaptureFormats(mediaDevice:) with a specific MediaDevice. These can be set before or after capture has started, and before or during a call.
    // Switch the camera
    cameraCaptureSource.switchCamera()

    // Get the current device and format
    let currentDevice = cameraCaptureSource.device
    let currentFormat = cameraCaptureSource.format

    // Pick a new device explicitly 
    let newDevice = MediaDevice.listVideoDevices().first { mediaDevice in
        mediaDevice.type == MediaDeviceType.videoFrontCamera
    }
    cameraCaptureSource.device = newDevice

    // Pick a new format explicitly (reversed so the highest resolutions come first)
    if let device = cameraCaptureSource.device {
        let newFormat = MediaDevice.listSupportedVideoCaptureFormats(mediaDevice: device)
            .reversed()
            .first { $0.height < 800 }
        if let format = newFormat {
            cameraCaptureSource.format = format
        }
    }
  4. To turn on the flashlight on the current camera, set CameraCaptureSource.torchEnabled. This can be set before or after capture has started, and before or during a call.
    // Turn on the torch
    cameraCaptureSource.torchEnabled = true

    // Turn off the torch
    cameraCaptureSource.torchEnabled = false
  5. To render local camera feeds before joining the call, use VideoSource.addVideoSink(sink:) with a provided VideoSink (e.g. a DefaultVideoRenderView created as described in Getting Started); a fuller pre-join preview sketch follows this list.
    // Add the render view as a sink to camera capture source
    cameraCaptureSource.addVideoSink(sink: someDefaultVideoRenderView)
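
Putting steps 1, 2, and 5 together, a pre-join camera preview might look like the following sketch. The view controller, its layout, and the logger name are illustrative assumptions rather than SDK requirements.

    import AmazonChimeSDK
    import UIKit

    class PreviewViewController: UIViewController {
        // ConsoleLogger is provided by the SDK; the name is arbitrary
        private let logger = ConsoleLogger(name: "PreviewViewController")
        private lazy var cameraCaptureSource = DefaultCameraCaptureSource(logger: logger)
        private let renderView = DefaultVideoRenderView()

        override func viewDidLoad() {
            super.viewDidLoad()
            renderView.frame = view.bounds
            view.addSubview(renderView)

            // Render local camera frames before any call is joined
            cameraCaptureSource.addVideoSink(sink: renderView)
            cameraCaptureSource.start()
        }

        override func viewDidDisappear(_ animated: Bool) {
            super.viewDidDisappear(animated)
            cameraCaptureSource.stop()
        }
    }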

To use the capture source in a call, do the following:

  1. When enabling local video, call AudioVideoControllerFacade.startLocalVideo(source:) with the camera capture source as the parameter. Ensure that the capture source is started before calling startLocalVideo(source:) so that frames are transmitted.
    // Start the camera capture source if not already started
    cameraCaptureSource.start()
    audioVideo.startLocalVideo(source: cameraCaptureSource)

Implementing a custom video source and transmitting

If builders wish to implement their own video sources (e.g. a camera capture implementation with a different configuration, or a raw data source), they can do so by implementing the VideoSource protocol and then producing VideoFrame objects containing the raw buffers in some compatible format, similar to the following snippet. See the DefaultCameraCaptureSource code for a working implementation using the AVFoundation framework.

The following snippet contains boilerplate for maintaining a list of sinks that have been added to the source; this allows any source to be forked to multiple targets (e.g. transmission and local rendering). See VideoContentHint for more information on the effect of that parameter on the downstream encoder.

class MyVideoSource: VideoSource {
    // Do not indicate any hint to downstream encoder
    var videoContentHint = VideoContentHint.none

    // Downstream video sinks
    private let sinks = NSMutableSet()

    func startProducingFrames() {
        // Note: a real source would drive frame production from a capture
        // callback or timer rather than a blocking loop
        while true {
            // Obtain a pixel buffer from the underlying source ...

            // Create frame
            let buffer = VideoFramePixelBuffer(pixelBuffer: somePixelBuffer)
            let timestampNs = someTimestamp
            let frame = VideoFrame(timestampNs: Int64(timestampNs),
                                   rotation: .rotation0,
                                   buffer: buffer)

            // Forward the frame to downstream sinks
            for sink in sinks {
                (sink as? VideoSink)?.onVideoFrameReceived(frame: frame)
            }
        }
    }

    func addVideoSink(sink: VideoSink) {
        sinks.add(sink)
    }

    func removeVideoSink(sink: VideoSink) {
        sinks.remove(sink)
    }
}
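
As a concrete variation on the above, the following sketch drives frame production from a timer instead of a blocking loop. The SolidColorSource name, the roughly 15 fps rate, and the blank 720p NV12 buffer are assumptions made for illustration; only the VideoSource and VideoSink wiring mirrors the SDK protocols.

import AmazonChimeSDK
import CoreVideo
import Foundation

class SolidColorSource: VideoSource {
    var videoContentHint = VideoContentHint.none

    private let sinks = NSMutableSet()
    private var timer: DispatchSourceTimer?

    func start() {
        let timer = DispatchSource.makeTimerSource(queue: DispatchQueue.global())
        // Roughly 15 frames per second
        timer.schedule(deadline: .now(), repeating: .milliseconds(66))
        timer.setEventHandler { [weak self] in self?.emitFrame() }
        timer.resume()
        self.timer = timer
    }

    func stop() {
        timer?.cancel()
        timer = nil
    }

    private func emitFrame() {
        // Create an empty 1280x720 NV12 buffer; a real source would fill it
        var pixelBuffer: CVPixelBuffer?
        CVPixelBufferCreate(kCFAllocatorDefault, 1280, 720,
                            kCVPixelFormatType_420YpCbCr8BiPlanarFullRange,
                            nil, &pixelBuffer)
        guard let buffer = pixelBuffer else { return }

        let timestampNs = Int64(Date().timeIntervalSince1970 * 1_000_000_000)
        let frame = VideoFrame(timestampNs: timestampNs,
                               rotation: .rotation0,
                               buffer: VideoFramePixelBuffer(pixelBuffer: buffer))
        for sink in sinks {
            (sink as? VideoSink)?.onVideoFrameReceived(frame: frame)
        }
    }

    func addVideoSink(sink: VideoSink) {
        sinks.add(sink)
    }

    func removeVideoSink(sink: VideoSink) {
        sinks.remove(sink)
    }
}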

When enabling local video, call AudioVideoControllerFacade.startLocalVideo(source:) with the custom source as the parameter. Ensure that the custom source is started before calling startLocalVideo(source:) so that frames are transmitted.

    // Create and start the custom source
    let myVideoSource = MyVideoSource()
    myVideoSource.startProducingFrames()

    // Begin transmission of frames
    audioVideo.startLocalVideo(source: myVideoSource)

Implementing a custom video processing step for local source

By combining the VideoSource and VideoSink APIs, builders can easily add a video processing step to their applications. Incoming frames can be processed and then fanned out to downstream sinks, as in the following snippet. See the example processors in the Demo code for complete, documented implementations.

class MyVideoProcessor: VideoSource, VideoSink {
    // Note: Builders may want to make this mirror intended upstream source
    // or make it a constructor parameter
    var videoContentHint = VideoContentHint.none

    // Downstream video sinks
    private let sinks = NSMutableSet()

    func onVideoFrameReceived(frame: VideoFrame) {
        guard let pixelBuffer = frame.buffer as? VideoFramePixelBuffer else {
            return
        }

        // Modify buffer ...

        let processedFrame = VideoFrame(timestampNs: frame.timestampNs,
                                        rotation: frame.rotation,
                                        buffer: VideoFramePixelBuffer(pixelBuffer: someModifiedBuffer))

        for sink in sinks {
            (sink as? VideoSink)?.onVideoFrameReceived(frame: processedFrame)
        }
    }

    func addVideoSink(sink: VideoSink) {
        sinks.add(sink)
    }

    func removeVideoSink(sink: VideoSink) {
        sinks.remove(sink)
    }
}
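
To make the elided "Modify buffer ..." step concrete, the following sketch applies a CoreImage monochrome filter and renders the result back into the frame's pixel buffer. The makeMonochrome helper and the CIPhotoEffectMono filter are illustrative choices for this example, not SDK APIs.

import AmazonChimeSDK
import CoreImage

let ciContext = CIContext()

// Hypothetical helper: returns a monochrome version of the frame,
// or nil if the buffer type or filter setup is unsupported
func makeMonochrome(frame: VideoFrame) -> VideoFrame? {
    guard let pixelBuffer = frame.buffer as? VideoFramePixelBuffer,
          let filter = CIFilter(name: "CIPhotoEffectMono") else {
        return nil
    }
    filter.setValue(CIImage(cvPixelBuffer: pixelBuffer.pixelBuffer), forKey: kCIInputImageKey)
    guard let output = filter.outputImage else {
        return nil
    }

    // Render in place for brevity; a production processor may prefer
    // writing into a separate buffer from a pool
    ciContext.render(output, to: pixelBuffer.pixelBuffer)
    return VideoFrame(timestampNs: frame.timestampNs,
                      rotation: frame.rotation,
                      buffer: pixelBuffer)
}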

To use a video frame processor, builders must use a video source external to the facade (e.g. DefaultCameraCaptureSource). Wire the source up to the processing step using VideoSource.addVideoSink(sink:). When enabling local video, call AudioVideoControllerFacade.startLocalVideo(source:) with the processor (i.e. the end of the pipeline) as the parameter. Ensure that the capture source is started so that frames are transmitted.

    let myVideoProcessor = MyVideoProcessor()
    // Add the processor as a sink to the camera capture source
    cameraCaptureSource.addVideoSink(sink: myVideoProcessor)

    // Start the capture source if not already started
    cameraCaptureSource.start()

    // Use the video processor as the source of transmitted video
    audioVideo.startLocalVideo(source: myVideoProcessor)

Implementing a custom video sink for remote sources

Though most builders will simply use DefaultVideoRenderView, they can also implement their own VideoSink/VideoRenderView (currently VideoRenderView is just an alias for VideoSink); some may want full control over the frames for remote video processing, storage, or other applications. To do so, implement the VideoSink protocol as in the following snippet.

class MyVideoSink: VideoSink {
    func onVideoFrameReceived(frame: VideoFrame) {
        // Store, render, or upload frame
    }
}
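
For example, a sink that stores or uploads frames might first convert each one to a UIImage. This helper is a sketch that assumes frames arrive as VideoFramePixelBuffer; the function name is hypothetical.

import AmazonChimeSDK
import CoreImage
import UIKit

// Hypothetical helper: converts a received frame to a UIImage
func imageFromFrame(_ frame: VideoFrame) -> UIImage? {
    guard let pixelBuffer = frame.buffer as? VideoFramePixelBuffer else {
        return nil
    }
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer.pixelBuffer)
    guard let cgImage = CIContext().createCGImage(ciImage, from: ciImage.extent) else {
        return nil
    }
    return UIImage(cgImage: cgImage)
}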

When a tile is added, simply pass the custom sink to VideoTileControllerFacade.bindVideoView(videoView:tileId:) and it will begin to receive remote frames:

func videoTileDidAdd(tileState: VideoTileState) {
    // Create a new custom sink
    let myVideoSink = MyVideoSink()

    // Bind it to the tile ID
    audioVideo.bindVideoView(videoView: myVideoSink, tileId: tileState.tileId)
}
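
When the tile is removed, unbind it so the sink stops receiving frames; a minimal sketch:

func videoTileDidRemove(tileState: VideoTileState) {
    // Stop delivery of remote frames for this tile
    audioVideo.unbindVideoView(tileId: tileState.tileId)
}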