Description
We create/allocate output frame tensors in different places. In particular, we determine the height and width of the output tensor from different sources:
- For batch APIs (CPU and GPU): from the stream metadata, which itself comes from the CodecContext
- For single frame APIs:
  - CPU (swscale and filtergraph): from the AVFrame
  - GPU: from the CodecContext
The info from the metadata / CodecContext is available as soon as we add a stream, e.g. right after we instantiate a Python VideoDecoder. The AVFrame is only available once we have decoded the frame with FFmpeg (this is the "raw output").
The source of truth really is the AVFrame. The CodecContext may be wrong, and in particular we now know that some streams may have variable height and width #312.
Details:
- For batch APIs: torchcodec/src/torchcodec/decoders/_core/VideoDecoder.cpp, lines 165 to 187 in 41c6491
- For single frame APIs: torchcodec/src/torchcodec/decoders/_core/VideoDecoder.cpp, lines 887 to 890 in 41c6491, and lines 1279 to 1286 in 41c6491
- For CUDA APIs: torchcodec/src/torchcodec/decoders/_core/CudaDevice.cpp, lines 156 to 161 in 41c6491