Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alsa: HDMI passthrough support #528

Closed
julianscheel opened this issue Feb 10, 2014 · 55 comments
Closed

alsa: HDMI passthrough support #528

julianscheel opened this issue Feb 10, 2014 · 55 comments

Comments

@julianscheel
Copy link
Contributor

It would be nice to be able to use the snd_bcm2835 alsa driver not only for PCM playback but also for passthrough of encoded audio formats like AC3 through the HDMI port.

@popcornmix
Copy link
Collaborator

Do you have any references to what the alsa driver should do to support this?

Technically you can do it with spdif encapsulated AC3 or DTS sent as stereo PCM:
raspberrypi/firmware#183

You should add config options "no_hdmi_resample=1" and "hdmi_stream_channels=1" to config.txt to ensure data gets through uncorrupted.

@julianscheel
Copy link
Contributor Author

Thanks for the notes. From the alsa side there's nothing special. A 2 chanel S16_LE device should be just fine. So likely it will just work with these parameters set.
I will test and report.
Would it be possible to set these parameters at runtime instead of config.txt? Sounds like they could harm normal PCM playback in some circumstance (ie low sampling rates)

@popcornmix
Copy link
Collaborator

Actually latest (rpi-update) firmware has no_hdmi_resample=1 as a default, and

vcgencmd hdmi_stream_channels 1

will enable hdmi_stream_channels at runtime (0 to disable).

hdmi_stream_channels determines if the AV infoframe reports the actual number of channels (i.e. 2) or 0 (which means refer to stream).

I did try using "refer to stream" in a build a while back which all the hdmi devices we have here worked correctly for either stereo PCM or passthrough formats.
Unfortunately there were a few reports of PCM stereo not working so it needs to be something switchable.

I'm trying to find if there is a way of indictating to alsa driver when passthrough is in use, so the switch can be done. XBMC ALSA sink (for other platforms) uses "AES0-0x06" for passthrough and AES0=0x04" for non-passthrough formats - I think appended to the device name that is opened, but I didn't find a description of what exactly this does.

@julianscheel
Copy link
Contributor Author

The AES0 suffix to the device name can indeed be used to detect non-audio streams. When it's set to 0x06 the bit IEC958_AES0_NONAUDIO (see asounddef.h) is set. Using IEC958_AES0_NONAUDIO as indicator should be a good choice. A quick grep over the kernel shows that some drivers handle it like that.

@popcornmix
Copy link
Collaborator

I can't see where the AES0=0x60 is parsed in the (any) ALSA kernel driver.
Yes, in theory I could parse the spdif stream and extract the IEC958_AES0_NONAUDIO bit, but that would be an expensive operation compared to parsing the ALSA device name.

@julianscheel
Copy link
Contributor Author

Afaict the flags are parsed in alsa-lib when you open the device (see src/conf/pcm/iec958.conf or hdmi.conf). And are applied to the appropriate control, which the driver has to expose. See http://www.alsa-project.org/~tiwai/writing-an-alsa-driver/ch10s03.html

@julianscheel
Copy link
Contributor Author

I finally was able to do some more testing and got it working with VLC. Had to remove the arguments that VLC adds to the device (the AES ones). So adding a dedicated alsa device iec958 or hdmi which understands these sounds like a good idea.
If you instruct me how to set the no_hdmi_resample option out of the kernel instead of through vcgencmd I can try extending the alsa driver.

@popcornmix
Copy link
Collaborator

Latest firmware has no_hdmi_resample as default (for alsa). It will only resample if output sample rate is not supported, but using 44.1kHz or 48kHz should be fine.

The only thing you may want to set is hdmi_stream_channels=1 (which can be done with a vcgencmd). Actually many receivers don't care and passthrough may well work with or without this setting.

I can easily add this option as a flag when opening the alsa driver. If you find the right way to name the driver and extract the AES flags so it looks like a passthrough capable sound card that would be good.

A suggestion for how to flag passthrough it is to use channels=0 in the VC_AUDIO_CONFIG_T. structure. That will be interpreted as two channels in most uses, but will signal 0 (refer to stream header) for number of channels in the audio infoframe.

If that sounds okay I'll add that option to the next firmware update.

@julianscheel
Copy link
Contributor Author

Ok, I've got this coded together roughly already. Using channels=0 sounds ok. Is there a possibility to set any of the further iec958 status flags to the chip? Like consumer/professional mode or emphasis settings?
Right now I only care about the NONAUDIO flag which I would use to set channels=0.

As of now I created a dedicated subdevice hw:0,1 for this, probably we could also skip this and configure iec958 to use hw:0,0 as well. It seems that basically all cards seem to make up a dedicated device for spdif, even if it's actually the same device.

@popcornmix
Copy link
Collaborator

If you make any useful iec958 status bits available in the driver (e.g. where config structure is filled in) I can look at plumbing it through.

@julianscheel
Copy link
Contributor Author

The driver allows the userspace to set AES0...AES3 as parameters. Each one is a byte. Possible values are defined in asoundef.h - I could pass through any of these. Although I doubt it makes sense.
NONAUDIO is important to distinguish PCM vs encoded formats, so this one is handled anyway. Most userspace applications seem to set EMPHASIS on AES0. AES3 can be used to signal FS, don't know if it makes sense to pass that through?
Besides I don't know if any of these flags are really needed in our use-cases. What do you think?

@popcornmix
Copy link
Collaborator

I can't see any of these being important apart from the NONAUDIO one.
The 192 status bits will be set to sensible defaults by GPU. Looks like we set bits:
1: when passthrough
2: no copyright
24:27: samplerate
32:35: 24-bit samples
36:39: samplerate
all others are clear.

@julianscheel
Copy link
Contributor Author

What about the consumer/professional flag? Does it make any sense for us to set it. Otherwise I will limit the options to just allow userspace to set NONAUDIO and ignore the rest.

I just pushed my wip code: julianscheel@b0d3971
Also rebased it onto the rpi-3.10.y branch, just haven't testet it with 3.10, but should work. Additionally to the updated driver you will need a bcm2835.conf (be aware I renamed the driver to just bcm2835 to avoid a long filename with spaces) into /usr/sshare/alsa/cards.
See here: https://gist.github.com/julianscheel/9094131

I have tested this with vlc (--spdif option and --aout alsa). Works very well so far.

@popcornmix
Copy link
Collaborator

Bit 0 is set to 0 (consumer). I'm not sure what difference professional makes.

I'll test your patch. In theory I should be able to make xbmc output passthrough audio through it and can use that for testing the channels=0 code path.

@popcornmix
Copy link
Collaborator

Out of interest if driving multichannel PCM through HDMI (without passthrough) how is the speaker mapping configured?
i.e. can the application request 2.1 (left, right, lfe) and 3.0 (left, right, centre)?
Any interest in getting this working?

@julianscheel
Copy link
Contributor Author

I think professional mode is not relevant when looking at hdmi, so we can ignore it. So I will limit the mask of settable bits to IEC958_AES0_NONAUDIO.

Yes, with xbmc you should be able to get it work. Don't forget to install the bcm2835.conf and use iec958 output device.

multichannel PCM: I have no clue, but would be nice to support it. I will try to dig around a little to figure out how this is handled.

@popcornmix
Copy link
Collaborator

Good new is xbmc detects the sound card "bcm2835 IEC958/HDMI S/PDIF" as passthrough capable. I enable passthough and play, and the log says:

16:46:26 513.949951 T:3035858016    INFO: CAESinkALSA::Initialize - Opened device "iec958:CARD=ALSA,DEV=0,AES0=0x06,AES1=0x82,AES2=0x00,AES3=0x02"

but it gets opened with 2 channels (rather than 0 I was expecting), so I can't auto-switch the hdmi_stream_channels.
If I manually run "vcgencmd hdmi_stream_channels 1" before playing the file, then passthrough does work.

Any idea why channels=0 is not happening?

@julianscheel
Copy link
Contributor Author

Somehow the control mapping does not work as I expected yet. Just looking at it...

@julianscheel
Copy link
Contributor Author

I just force pushed ammended versions of the patch. They actually store the passend in argument settings properly now.
But there is one issue left: The controls are set after the hw_params call is done. So at the time were I check them they are still wrong. If you open the device two times sequentially (without anyone clearing the flags in between) you should see channels=0 now.
It seems pcm_prepare() would be a better place to signal the spdif stuff to the vchiq. Do you think it would be ok to call bcm2835_audio_set_params there again?
Maybe it would be better to have a dedicated set function just for the spdif stuff then?

@popcornmix
Copy link
Collaborator

VC_AUDIO_MSG_TYPE_CONFIG is the message that actually opens the audio driver (on GPU) including setting the AV infoframe (so channels is required at this point).

Now you can send that message more than once. The audio driver will be closed and reopened with new settings. That may cause a TV/receiver to do something (e.g. you could see some effect in LCD status indicators).

I believe we normally get the sequence of messages:
VC_AUDIO_MSG_TYPE_OPEN
VC_AUDIO_MSG_TYPE_CONFIG
VC_AUDIO_MSG_TYPE_WRITE (multiple calls)
VC_AUDIO_MSG_TYPE_START
VC_AUDIO_MSG_TYPE_WRITE (multiple calls)
VC_AUDIO_MSG_TYPE_STOP
VC_AUDIO_MSG_TYPE_CLOSE

We open the driver on the config, rather than the start, so we can submit samples to the driver from the writes before the start message comes.

I could try to do the driver open on the start message, but that would involve keeping hold of the samples submitted since the config so would have some risk of breaking things.
However if you say that not all the information required to open the hdmi driver is available when VC_AUDIO_MSG_TYPE_CONFIG is sent, then perhaps something has to change.

@julianscheel
Copy link
Contributor Author

Looking at the alsa docs (http://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html#ga1ca0dc120a484965e26cabf966502330) for snd_pcm_hw_params I understand that snd_pcm_prepare is automatically called after the params where successfully set. So from my point of view it sound like it would be safe to move the audio_set_params call into the prepare method. We'd just make sure that snd_pcm_hw_params accepts only allowed hw params, so that the pcm_prepare would not fail later on.
From my debug logs I see that the parameters are always set before snd_pcm_prepare is entered, so this should be safe. What do you think?

@popcornmix
Copy link
Collaborator

Sounds okay to me.

I know some calls come from a proper task context and some come from an interrupt context.
snd_bcm2835_pcm_trigger for example is not called from a task context, and so can't call vchiq functions (see the worker queue in bcm2835-vchiq that is needed to deal with this).

That's the only thing to watch out for, but I'd imagine prepare would be triggered by the calling task so I'd expect it to be okay.

@julianscheel
Copy link
Contributor Author

Alright, I'll give it a try.

@julianscheel
Copy link
Contributor Author

Ok, modified this and at least nothing crashes. I only have remote access to an rpi right now so I could not check that sound is really still working. Will do that tomorrow.
Still feel free to give this reworked patch a try:
julianscheel@6173510
rpi-3.10.y branch is updated as well.
Patch contains a lot of debug output at the moment...

@julianscheel
Copy link
Contributor Author

Just did some testing with the rpi and the patch seems to work fine for me. As long as I do not actually send channels=0. With the current firmware this seems to render the firmware into an erroneous state.

@popcornmix
Copy link
Collaborator

Here is a test build with a fixed 256M/256M memory split:
https://dl.dropboxusercontent.com/u/3669512/temp/start_alsa_passthrough.elf

It should work with channels=0 and set hdmi_stream_channels.
It also should support setting channels up to 8 and output multichannel PCM.
Currently it gets the speaker placement wrong, but it would be nice to determine how that should be handled.

@julianscheel
Copy link
Contributor Author

Have you tested with that firmware? For me it still seems to go mad when channels=0 is set.

[  210.652456] > snd_bcm2835_pcm_prepare()
[  210.652516] snd_bcm2835_pcm_prepare(): spdif_status=0x2008202
[  210.652528] snd_bcm2835_pcm_prepare(): channels=0
[  228.848835] bcm2835_audio_set_ctls_chan:495 bcm2835_audio_set_ctls_chan: failed on waiting for event (status=0)
[  228.849377] bcm2835_audio_set_params:560  Alsa controls not supported
[  228.849712] snd_bcm2835_pcm_prepare:311  error setting hw params

@popcornmix
Copy link
Collaborator

I tested xbmc which used the bcm2835 output for gui sounds, and switched to "bcm2835 IEC958/HDMI S/PDIF driver" when playing video with ac3/dts.
In a debugger I was seeing channels=2 for gui sounds and channels=0 for passthrough and it was working for me.

It is possible that some slightly different configuration is provoking a bug.
Can you provide an executable I can run that will provoke the failure?

@julianscheel
Copy link
Contributor Author

For me the problem raises as soon as channels=0 is set by the driver. I can trigger it with vlc as well as with a simple speaker-test:

speaker-test -D iec958:AES0=0x6 -c 2

That one will open the device, but closing on terminate will fail and afterwards no audio commands are working at all.

@popcornmix
Copy link
Collaborator

You are correct - speakertest wasn't working with that firmware. I've uploaded a fixed version (same url).

@julianscheel
Copy link
Contributor Author

Ok, finally verified that all is working well. With your latest firmware I can easily use spdif passthrough in vlc with the --spdif option :)
I've cleaned up the patch from all the debug print

For rpi-3.6.y: julianscheel@dc0d9f4
For rpi-3.10.y: julianscheel@db269a8

I tested with 3.6, but should be fine with 3.10.
Don't forget that the bcm2835.conf is required :)

@julianscheel
Copy link
Contributor Author

What do you think about merging this patch?
I think would be beneficial for many people.

@popcornmix
Copy link
Collaborator

Any thoughts on what the implication will be for existing users updating to this kernel (but otherwise not changing .conf files)?

@julianscheel
Copy link
Contributor Author

It should not cause any difference for users which do not have a config file. The soundcard will just not have a iec958 device in that case, so they have no benefit.

The only relevant difference is the renaming of the driver from "BRCM bcm2835 ALSA Driver" to just "bcm2835" which will render existing asound.state files useless. But looking at the triviality of the options the driver exposes this should not be too painful.

Still the question is how to go ahead with the config file. I assume that mainlining in libalsa won't work out as long as the alsa driver itself is not in the mainline kernel. Maybe it can be added to the firmware package for raspbian users?

@julianscheel
Copy link
Contributor Author

@popcornmix Any plans to merge this? Or do you see any outstanding things to be dealt with before merging?

@popcornmix
Copy link
Collaborator

It's actually on the 3.12.y kernel tree that we should be moving to soon.
I think I'll bump BRANCH=next firmware to 3.12.y (in addition to USB rewrite), and see if any problems are reported.

@julianscheel
Copy link
Contributor Author

Ah, sorry I only checked the 3.10 kernel. Perfect then. Thank you :)

@julianscheel
Copy link
Contributor Author

I just tested this with the latest start.elf from the firmware repository. It behaves like the old versions before your start_alsa_passthrough.elf. Did you incorporate the changes into the main firmware yet?
My testing system ran on start_alsa_passthrough.elf last weeks.

@popcornmix
Copy link
Collaborator

It looks like it's not in main firmware.

I've found the stashed code from when I was working on this, but it seemed these changes were occurring along with some multichannel audio restructuring that got abandoned for a simpler scheme, so I'll have to unpick the necessary bits (the channels=0 codepath)...

@julianscheel
Copy link
Contributor Author

Ok, do you think you can merge it soon?
I was just about to deploy new firmware to our customers to enable spdif for them. Have to wait for the merge now :)

@adn770
Copy link

adn770 commented Apr 16, 2014

I'm also working on audio pass trough scenarios but by using OpenMax in GStreamer [1].

I've already got a working sink for PCM raw audio with the information at [2] but with the the audio passtrough scenario I get an error in the attempt to allocate the buffers in the input port of audio_render.

audio_render got error: Insufficient resources (0x80001000)

Please could you give us some hint on what might be wrong ?

In order to reproduce the issue you can build with the following:

$ git clone https://github.com/metrological/buildroot
$ cd buildroot
$ make rpi_qt5webkit_defconfig
$ make

To deploy into an SD card with two partitions boot (fat32) and rootfs (ext4):
$ cp output/images/rpi-firmware/* /media//boot/
$ cp output/images/zImage /media/boot/
$ sudo tar xvf output/images/rootfs.tar -C /media/rootfs/

Download and copy some media with a dolby digital audio track, for example [3]

Once booted the rpi to reproduce the issue run the following GStreamer pipeline:

GST_DEBUG=omx*:4 gst-launch-1.0 filesrc location=/root/h264_720p_hp_5.1_6mbps_ac3_planet.mp4 ! qtdemux ! ac3parse ! queue ! omxhdmiaudiosink

[1] http://cgit.freedesktop.org/~adn770/gst-omx/log/?h=audiosink
[2] http://home.nouwen.name/RaspberryPi/documentation/ilcomponents/audio_render.html
[3] http://www.auby.no/files/video_tests/h264_720p_hp_5.1_6mbps_ac3_planet.mp4

@julianscheel
Copy link
Contributor Author

@AD-N770 I think you should open another bug, as this one is clearly about the alsa kernel driver and not about openmax.

@adn770
Copy link

adn770 commented Apr 16, 2014

@julianscheel sure, I don't want to hijack that ticket just it was so related to my scenario...

Anyway I've opened #564

@julianscheel
Copy link
Contributor Author

@popcornmix Have you had a chance to look into forwardporting the channels==0 change in the firmware?

@popcornmix
Copy link
Collaborator

I've tidied up the stashed code (which also adds multichannel and 32-bit support) but it needs a bit more testing. I'll push it out by the weekend.

@julianscheel
Copy link
Contributor Author

Perfect, thank you!

popcornmix pushed a commit to raspberrypi/firmware that referenced this issue Apr 27, 2014
kernel: config: Add CONFIG_NFS_SWAP
See: #266

firmware: alsa: fix for wrong volume when starting playback
See: raspberrypi/linux#570

firmware: Support passthrough through alsa with channels=0
See: raspberrypi/linux#528
popcornmix pushed a commit to Hexxeh/rpi-firmware that referenced this issue Apr 27, 2014
kernel: config: Add CONFIG_NFS_SWAP
See: raspberrypi/firmware#266

firmware: alsa: fix for wrong volume when starting playback
See: raspberrypi/linux#570

firmware: Support passthrough through alsa with channels=0
See: raspberrypi/linux#528
@popcornmix
Copy link
Collaborator

Please test latest rpi-update firmware.

@julianscheel
Copy link
Contributor Author

Seems to work well for me. Thanks :)
I've started some start/stop stress-testing to verify a bit more now. Will close once this ran for a while without issues.

@popcornmix
Copy link
Collaborator

Note that latest rpi-update firmware is now 3.12 kernel which includes the HDMI passthrough support in alsa driver.

@julianscheel
Copy link
Contributor Author

I'm testing the firmware alone as I am not running raspbian. I'm on my own kernel tree which has the alsa driver patch included anyway.

@julianscheel
Copy link
Contributor Author

Tests were all good for me, I'll close the ticket. Thanks for your efforts.

neuschaefer pushed a commit to neuschaefer/raspi-binary-firmware that referenced this issue Feb 27, 2017
kernel: config: Add CONFIG_NFS_SWAP
See: raspberrypi#266

firmware: alsa: fix for wrong volume when starting playback
See: raspberrypi/linux#570

firmware: Support passthrough through alsa with channels=0
See: raspberrypi/linux#528
popcornmix pushed a commit that referenced this issue Jan 22, 2018
When a tail call fails, it is documented that the tail call should
continue execution at the following instruction.  An example tail call
sequence is:

  12: (85) call bpf_tail_call#12
  13: (b7) r0 = 0
  14: (95) exit

The ARM assembler for the tail call in this case ends up branching to
instruction 14 instead of instruction 13, resulting in the BPF filter
returning a non-zero value:

  178:	ldr	r8, [sp, #588]	; insn 12
  17c:	ldr	r6, [r8, r6]
  180:	ldr	r8, [sp, #580]
  184:	cmp	r8, r6
  188:	bcs	0x1e8
  18c:	ldr	r6, [sp, #524]
  190:	ldr	r7, [sp, #528]
  194:	cmp	r7, #0
  198:	cmpeq	r6, #32
  19c:	bhi	0x1e8
  1a0:	adds	r6, r6, #1
  1a4:	adc	r7, r7, #0
  1a8:	str	r6, [sp, #524]
  1ac:	str	r7, [sp, #528]
  1b0:	mov	r6, #104
  1b4:	ldr	r8, [sp, #588]
  1b8:	add	r6, r8, r6
  1bc:	ldr	r8, [sp, #580]
  1c0:	lsl	r7, r8, #2
  1c4:	ldr	r6, [r6, r7]
  1c8:	cmp	r6, #0
  1cc:	beq	0x1e8
  1d0:	mov	r8, #32
  1d4:	ldr	r6, [r6, r8]
  1d8:	add	r6, r6, #44
  1dc:	bx	r6
  1e0:	mov	r0, #0		; insn 13
  1e4:	mov	r1, #0
  1e8:	add	sp, sp, #596	; insn 14
  1ec:	pop	{r4, r5, r6, r7, r8, sl, pc}

For other sequences, the tail call could end up branching midway through
the following BPF instructions, or maybe off the end of the function,
leading to unknown behaviours.

Fixes: 39c13c2 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
popcornmix pushed a commit that referenced this issue Jan 31, 2018
commit f4483f2 upstream.

When a tail call fails, it is documented that the tail call should
continue execution at the following instruction.  An example tail call
sequence is:

  12: (85) call bpf_tail_call#12
  13: (b7) r0 = 0
  14: (95) exit

The ARM assembler for the tail call in this case ends up branching to
instruction 14 instead of instruction 13, resulting in the BPF filter
returning a non-zero value:

  178:	ldr	r8, [sp, #588]	; insn 12
  17c:	ldr	r6, [r8, r6]
  180:	ldr	r8, [sp, #580]
  184:	cmp	r8, r6
  188:	bcs	0x1e8
  18c:	ldr	r6, [sp, #524]
  190:	ldr	r7, [sp, #528]
  194:	cmp	r7, #0
  198:	cmpeq	r6, #32
  19c:	bhi	0x1e8
  1a0:	adds	r6, r6, #1
  1a4:	adc	r7, r7, #0
  1a8:	str	r6, [sp, #524]
  1ac:	str	r7, [sp, #528]
  1b0:	mov	r6, #104
  1b4:	ldr	r8, [sp, #588]
  1b8:	add	r6, r8, r6
  1bc:	ldr	r8, [sp, #580]
  1c0:	lsl	r7, r8, #2
  1c4:	ldr	r6, [r6, r7]
  1c8:	cmp	r6, #0
  1cc:	beq	0x1e8
  1d0:	mov	r8, #32
  1d4:	ldr	r6, [r6, r8]
  1d8:	add	r6, r6, #44
  1dc:	bx	r6
  1e0:	mov	r0, #0		; insn 13
  1e4:	mov	r1, #0
  1e8:	add	sp, sp, #596	; insn 14
  1ec:	pop	{r4, r5, r6, r7, r8, sl, pc}

For other sequences, the tail call could end up branching midway through
the following BPF instructions, or maybe off the end of the function,
leading to unknown behaviours.

Fixes: 39c13c2 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue May 24, 2023
After blamed commit, nexthop_fib6_nh_bh() and nexthop_fib6_nh()
are the same.

Delete nexthop_fib6_nh_bh(), and convert /proc/net/ipv6_route
to standard rcu to avoid this splat:

[ 5723.180080] WARNING: suspicious RCU usage
[ 5723.180083] -----------------------------
[ 5723.180084] include/net/nexthop.h:516 suspicious rcu_dereference_check() usage!
[ 5723.180086]
other info that might help us debug this:

[ 5723.180087]
rcu_scheduler_active = 2, debug_locks = 1
[ 5723.180089] 2 locks held by cat/55856:
[ 5723.180091] #0: ffff9440a582afa8 (&p->lock){+.+.}-{3:3}, at: seq_read_iter (fs/seq_file.c:188)
[ 5723.180100] #1: ffffffffaac07040 (rcu_read_lock_bh){....}-{1:2}, at: rcu_lock_acquire (include/linux/rcupdate.h:326)
[ 5723.180109]
stack backtrace:
[ 5723.180111] CPU: 14 PID: 55856 Comm: cat Tainted: G S        I        6.3.0-dbx-DEV #528
[ 5723.180115] Call Trace:
[ 5723.180117]  <TASK>
[ 5723.180119] dump_stack_lvl (lib/dump_stack.c:107)
[ 5723.180124] dump_stack (lib/dump_stack.c:114)
[ 5723.180126] lockdep_rcu_suspicious (include/linux/context_tracking.h:122)
[ 5723.180132] ipv6_route_seq_show (include/net/nexthop.h:?)
[ 5723.180135] ? ipv6_route_seq_next (net/ipv6/ip6_fib.c:2605)
[ 5723.180140] seq_read_iter (fs/seq_file.c:272)
[ 5723.180145] seq_read (fs/seq_file.c:163)
[ 5723.180151] proc_reg_read (fs/proc/inode.c:316 fs/proc/inode.c:328)
[ 5723.180155] vfs_read (fs/read_write.c:468)
[ 5723.180160] ? up_read (kernel/locking/rwsem.c:1617)
[ 5723.180164] ksys_read (fs/read_write.c:613)
[ 5723.180168] __x64_sys_read (fs/read_write.c:621)
[ 5723.180170] do_syscall_64 (arch/x86/entry/common.c:?)
[ 5723.180174] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 5723.180177] RIP: 0033:0x7fa455677d2a

Fixes: 09eed11 ("neighbour: switch to standard rcu, instead of rcu_bh")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230510154646.370659-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
popcornmix pushed a commit that referenced this issue Oct 11, 2023
commit ef1148d upstream.

After blamed commit, nexthop_fib6_nh_bh() and nexthop_fib6_nh()
are the same.

Delete nexthop_fib6_nh_bh(), and convert /proc/net/ipv6_route
to standard rcu to avoid this splat:

[ 5723.180080] WARNING: suspicious RCU usage
[ 5723.180083] -----------------------------
[ 5723.180084] include/net/nexthop.h:516 suspicious rcu_dereference_check() usage!
[ 5723.180086]
other info that might help us debug this:

[ 5723.180087]
rcu_scheduler_active = 2, debug_locks = 1
[ 5723.180089] 2 locks held by cat/55856:
[ 5723.180091] #0: ffff9440a582afa8 (&p->lock){+.+.}-{3:3}, at: seq_read_iter (fs/seq_file.c:188)
[ 5723.180100] #1: ffffffffaac07040 (rcu_read_lock_bh){....}-{1:2}, at: rcu_lock_acquire (include/linux/rcupdate.h:326)
[ 5723.180109]
stack backtrace:
[ 5723.180111] CPU: 14 PID: 55856 Comm: cat Tainted: G S        I        6.3.0-dbx-DEV #528
[ 5723.180115] Call Trace:
[ 5723.180117]  <TASK>
[ 5723.180119] dump_stack_lvl (lib/dump_stack.c:107)
[ 5723.180124] dump_stack (lib/dump_stack.c:114)
[ 5723.180126] lockdep_rcu_suspicious (include/linux/context_tracking.h:122)
[ 5723.180132] ipv6_route_seq_show (include/net/nexthop.h:?)
[ 5723.180135] ? ipv6_route_seq_next (net/ipv6/ip6_fib.c:2605)
[ 5723.180140] seq_read_iter (fs/seq_file.c:272)
[ 5723.180145] seq_read (fs/seq_file.c:163)
[ 5723.180151] proc_reg_read (fs/proc/inode.c:316 fs/proc/inode.c:328)
[ 5723.180155] vfs_read (fs/read_write.c:468)
[ 5723.180160] ? up_read (kernel/locking/rwsem.c:1617)
[ 5723.180164] ksys_read (fs/read_write.c:613)
[ 5723.180168] __x64_sys_read (fs/read_write.c:621)
[ 5723.180170] do_syscall_64 (arch/x86/entry/common.c:?)
[ 5723.180174] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 5723.180177] RIP: 0033:0x7fa455677d2a

Fixes: 09eed11 ("neighbour: switch to standard rcu, instead of rcu_bh")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230510154646.370659-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants