Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements for bcm2835-codec #3097

Merged
merged 3 commits into from
Aug 5, 2019
Merged

Improvements for bcm2835-codec #3097

merged 3 commits into from
Aug 5, 2019

Conversation

wens
Copy link
Contributor

@wens wens commented Jul 23, 2019

This is a three part improvement series.

The first part switches the driver to use the mplane mem2mem API.
This newer API is used exclusively by some applications. For applications
only supporting the old splane API, the userspace library libv4l2 has a
plugin that can convert splane requests into mplane ones.

Unfortunately it is hard to support both APIs at the same time, due to
having different buffer types, and that screws up the videobuf2 helpers.

The second part is implementing theV4L2_CID_MIN_BUFFERS_FOR_CAPTURE
parameter. Some applications require this (and it is said to be required in the
WiP stateful decoder specification). This just outputs the number given by MMAL.

The third part is just a cleanup of how device_caps is set. Changes match mainline.

With all these, I am able to use v4l2 decoding in a Chromium based application on the
Raspberry Pi 4. Unfortunately the application itself is proprietary, and is based on an
older code-base. But I will try to port the required changes onto Debian buster's
chromium package.

It works most of the time, but occasionally it borks and Chromium falls back to software
decoding. One failure type is the decoder not correctly reporting the image dimension,
then Chromium passes in a buffer that is too small. Why a resolution change event was
not triggered I have not dug in to.

Copy link
Contributor

@6by9 6by9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link in your commit text is invalid, but otherwise the change looks OK.

@wens
Copy link
Contributor Author

wens commented Jul 23, 2019

Looks like the patches in the series were merged, hence no longer visible in patchwork with the default settings. I'll replace the link with one pointing to the mailing list.

@pelwell
Copy link
Contributor

pelwell commented Jul 23, 2019

If they've been merged already then we'd prefer to cherry-pick them directly from upstream.

Copy link
Contributor

@6by9 6by9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bcm2835-codec is a little different from other decoder implementations in that there is an intermediate format conversion between the hardware and V4L2 buffers. The number of capture buffers required is therefore independent of the stream and DPB etc.

There are plans to remove the conversion, but it requires a fair amount of rework within the firmware. Until that is done, you may as well return a static value of 1.

https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-decoder.html is Hans hosting the latest copy of the decoder spec - no mention of V4L2_CID_MIN_BUFFERS_FOR_CAPTURE being mandatory.

@wens
Copy link
Contributor Author

wens commented Jul 23, 2019

If they've been merged already then we'd prefer to cherry-pick them directly from upstream.

I meant the patches referenced in the link in the commit log.
They are not required for this pull request.

@pelwell
Copy link
Contributor

pelwell commented Jul 23, 2019

As long as there is no overlap then that's fine.

Copy link
Contributor

@6by9 6by9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they've been merged already then we'd prefer to cherry-pick them directly from upstream.

It's a reference to an equivalent change set. The bcm2835-camera equivalent has been merged as b0b48b4

Copy link
Contributor

@6by9 6by9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really valid to only advertise single planar formats through the multiplanar API?

https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/pixfmt-v4l2-mplane.html#c.v4l2_pix_format_mplane

__u32 | pixelformat | The pixel format. Both single- and multi-planar four character codes can be used.

would imply potentially so.
What I do know for definite is that we can't support the genuine multiplanar formats easily as the lower levels want all planes in a single contiguous buffer, but the multi planar allocator has no options for allocating a single buffer with offsets.

I had looked at the client changes required for multi-planar within Chromium, and they weren't significant.

@6by9
Copy link
Contributor

6by9 commented Jul 23, 2019

I'm intrguiged as to how you have got the V4L2 codec code in Chromium to actually work, although it may well be down to your environment.
When I looked, it was all under the huge Ozone switch but the X11 rendering paths in Ozone were fairly impressively broken. Is this using one of the other rendering backends, more akin to how ChromeOS sets things up?

Other than the minor comments, the changes as a whole look reasonable, but they do need to be checked against other clients (mainly GStreamer and FFmpeg).

@wens
Copy link
Contributor Author

wens commented Jul 23, 2019

AFAIK one specifies different fourcc codes to use the true multi-planar formats, i.e. YUV420M vs YUV420.
So supporting the mplane API without supporting the actual multi-planar formats should be OK.

I asked on the mailing list [1] but so far it was only me and a cedrus fellow talking about it.
[1] https://www.spinics.net/lists/linux-media/msg154053.html

@wens
Copy link
Contributor Author

wens commented Jul 23, 2019

For Chromium, in GenericV4L2Device::CreateGLImage I was just bypassing Ozone and calling base::MakeRefCounted<gfx::NativePixmapDmaBuf> directly, which is essentially what Ozone does on Linux.

Another part is in V4L2VideoDecodeAccelerator::ImportBufferForPicture, which should apply to Linux in general and not just when Ozone is enabled. And since V4L2 is Linux-only, I just removed the check.

As for the X11 rendering path being broken, it only supports EGL, not GLX. I ran Chromium with --use-gl=egl so it worked, but I confess this part will need some work.

As for other clients, FFmpeg doesn't work with V4L2 on RPi in general. It complains about pts being zero or something. Unrelated to the changes I made. Gstreamer works before and after my changes.

@6by9
Copy link
Contributor

6by9 commented Jul 23, 2019

Thanks for the list link. I've half dropped off the mailing list as something in the email at one end or other seems to blackhole me :-(
From Paul's response it sounds like cedrus has a very similar requirement to our own.
The single planar APIs are also every so slightly more memory efficient. When allocating independent planes each plane gets aligned to a page boundary. Generally that is nothing compared to the several MBs for the image data.

Forcing the use of libv4l is horrid for those who want the single planar API, but may be the way to go.

I'm slightly conflicted over having the bcm2835-v4l2 and bcm2835-unicam drivers both using the single planar API whilst bcm2835-codec goes multi-planar, but switching the other two is more likely to create issues for existing apps, and the libv4l solution isn't automatic.
bcm2835-codec is new enough that I'm happy to make the change as there are unlikely to be that many people dependent on it.
That thought process isn't enough to block this PR, but is a niggle in the back of my mind. If GStreamer and FFmpeg behave, then that will probably cover 99% of users.

As for other clients, FFmpeg doesn't work with V4L2 on RPi in general. It complains about pts being zero or something.

Raspbian should have a couple of patches to FFmpeg to make it work. I'll have a look.

@wens
Copy link
Contributor Author

wens commented Jul 23, 2019

Thanks for the list link. I've half dropped off the mailing list as something in the email at one end or other seems to blackhole me :-(
From Paul's response it sounds like cedrus has a very similar requirement to our own.

AFAIK the hardware requires the different planes to be close to each other. Having a contiguous buffer is the easiest way to fulfill that requirement.

The single planar APIs are also every so slightly more memory efficient. When allocating independent planes each plane gets aligned to a page boundary. Generally that is nothing compared to the several MBs for the image data.

Right, but that only happens if you use multi-planar formats? For the single-planar, a.k.a. contiguous buffer, formats, there's only one plane.

Forcing the use of libv4l is horrid for those who want the single planar API, but may be the way to go.

What would applications do instead? Direct ioctl calls? That is actually what Chromium does for the decoder. It uses libv4l2, if specified at build time, for the encoder.

I'm slightly conflicted over having the bcm2835-v4l2 and bcm2835-unicam drivers both using the single planar API whilst bcm2835-codec goes multi-planar, but switching the other two is more likely to create issues for existing apps, and the libv4l solution isn't automatic.
bcm2835-codec is new enough that I'm happy to make the change as there are unlikely to be that many people dependent on it.

I figure it's easier to have the driver support the newer API, and have libv4l2 for backward-compatibility, than have applications all try to support both APIs. But I suppose the opposite could be argued as well, since maybe only half the drivers in the kernel use mplane.

Either way my main goal was to use V4L2 in Chromium rather than the MMAL library.

That thought process isn't enough to block this PR, but is a niggle in the back of my mind. If GStreamer and FFmpeg behave, then that will probably cover 99% of users.

As for other clients, FFmpeg doesn't work with V4L2 on RPi in general. It complains about pts being zero or something.

Raspbian should have a couple of patches to FFmpeg to make it work. I'll have a look.

Curious. I was using the newest Raspbian release already.

@wens
Copy link
Contributor Author

wens commented Jul 30, 2019

I've rebased on top of the latest rpi-4.19.y branch, and updated the patches as discussed.

I realized that for V4L2_CID_MIN_BUFFERS_FOR_CAPTURE, my original idea was to implement the control for all roles. But I just simplified it and limited it to the decoder for now.

@6by9
Copy link
Contributor

6by9 commented Jul 31, 2019

Sorry, I still haven't had a chance to test this for myself against GStreamer and FFmpeg. I'll try to do so tomorrow, but can't guarantee it.

wens added 3 commits August 1, 2019 10:06
There are two APIs for mem2mem devices, the older single-planar API and
the newer multi-planar one. Without making things overly complex, the
driver can only support one or the other. However the userspace libv4l2
library has a plugin that allows multi-planar API devices to service
single-planar consumers.

Chromium supports the multi-planar API exclusively, though this is
currently limited to ChromiumOS. It would be possible to add support
for generic Linux.

Switching to the multi-planar API would allow usage of both APIs from
userspace.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
The stateful decoder specification shows an optional step for retrieving
the miminum number of capture buffers required for the decoder to
proceed. While not a required parameter, having it makes some
applications happy.

bcm2835-codec is a little different from other decoder implementations
in that there is an intermediate format conversion between the hardware
and V4L2 buffers. The number of capture buffers required is therefore
independent of the stream and DPB etc.

There are plans to remove the conversion, but it requires a fair amount
of rework within the firmware. Until that is done, simply return a value
of 1.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Instead of filling in the struct v4l2_capability device_caps
field, fill in the struct video_device device_caps field.

That way the V4L2 core knows what the capabilities of the
video device are.

This is similar to a cleanup series by Hans Verkuil [1].

[1] https://www.spinics.net/lists/linux-media/msg153313.html

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
@wens
Copy link
Contributor Author

wens commented Aug 1, 2019

As for FFmpeg not working, someone else had reported the same issue on a different platform [1].

14:10 pH5: Chewi: looks to me like the ffmpeg v4l2m2m code doesn't fill the chroma linesizes correctly for single planar formats
14:10 pH5: Chewi: I think it is this code: https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/v4l2_buffers.c#L302
14:12 pH5: Chewi: for non-MPLANE queues only plane_info[0].bytesperline is valid, and the chroma strides are half of that.

[1] https://dri.freedesktop.org/~cbrill/dri-log/?channel=etnaviv&highlight_names=&date=2017-10-30

@6by9
Copy link
Contributor

6by9 commented Aug 1, 2019

I have patches (which I thought had been merged in Raspbian) that patch https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/v4l2_buffers.c#L323 to also support NV21 and YUV420P. YV12 isn't a defined format in FFmpeg IIRC.
https://www.raspberrypi.org/forums/viewtopic.php?t=234778#p1436166

NV12 is the standard format used on Samsung SoCs IIRC, so whoever wrote the original support appears to have done the minimum possible, although plausibly because they took the view that it's better to have no code than untested code. An assert for any of the unhandled multiple plane but single planar V4L2 formats would have been nice rather than just falling in a heap.

@6by9
Copy link
Contributor

6by9 commented Aug 1, 2019

FYI My patch to support the single planar V4L2 API in Chromium:
0001-v4l2-Initial-support-for-the-single-planar-API.zip
Whilst it built, I'm not sure I ever got Chromium in a state that would actively use it - the Ozone X11 crashes were a major headache which is why we largely dropped it.

Using libv4l adds an unknown into the mix. Should applications be using LD_PRELOAD to insert it or not? Under what conditions? What happens if you needed it and didn't preload it? And I believe there are also issues with it and dmabufs (which I thought Chromium used as it was one of the reasons I was investigating it)

Sorry, I haven't had a chance to do any testing on your patches today. I will make it a priority tomorrow.

Copy link
Contributor

@6by9 6by9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm happy with these.
I've tested with FFmpeg and it works.
I still have slight reservations over going multi-planar, but it'd be an easy one to revert if it really messes things up.

@kilograham
Copy link

I am independently working on fixing up ffmpeg to fix the pts issues... i can play with h264_v4l2m2m on Pi 4 but something is not quite right along the way as h264 is smoother in software. Not yet sure whether the issues are timestamp related or not (ffplay as an example drops frames without the correct timestamps)

@pelwell
Copy link
Contributor

pelwell commented Aug 5, 2019

No objections from me, but I'm hardly a qualified observer.

@popcornmix popcornmix merged commit 8c54e77 into raspberrypi:rpi-4.19.y Aug 5, 2019
popcornmix added a commit to raspberrypi/firmware that referenced this pull request Aug 6, 2019
kernel: net: bcmgenet: Workaround for Pi 4B network issue
See: raspberrypi/linux#3108

kernel: overlays: Add baudrate parameter to i2c3-i2c6

kernel: Add HDMI1 facility to the driver
See: raspberrypi/linux#3100

kernel: FKMS Broadcast RGB property, and KMS command line parsing update
See: raspberrypi/linux#3103

kernel: Improvements for bcm2835-codec
See: raspberrypi/linux#3097

kernel: hid: usb: Add device quirks for Freeway Airmouse T3 and MX3
See: raspberrypi/linux#3117

firmware: Fix to allow HDMI audio port route setting
See: raspberrypi/linux#3100

userland: dtoverlay: only run lxpanelctl hooks when lxpanel is running
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this pull request Aug 6, 2019
kernel: net: bcmgenet: Workaround for Pi 4B network issue
See: raspberrypi/linux#3108

kernel: overlays: Add baudrate parameter to i2c3-i2c6

kernel: Add HDMI1 facility to the driver
See: raspberrypi/linux#3100

kernel: FKMS Broadcast RGB property, and KMS command line parsing update
See: raspberrypi/linux#3103

kernel: Improvements for bcm2835-codec
See: raspberrypi/linux#3097

kernel: hid: usb: Add device quirks for Freeway Airmouse T3 and MX3
See: raspberrypi/linux#3117

firmware: Fix to allow HDMI audio port route setting
See: raspberrypi/linux#3100

userland: dtoverlay: only run lxpanelctl hooks when lxpanel is running
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants