Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snd usb error: delay: estimated xxx, actual yyy #541

Closed
audetto opened this issue Mar 5, 2014 · 14 comments
Closed

snd usb error: delay: estimated xxx, actual yyy #541

audetto opened this issue Mar 5, 2014 · 14 comments
Assignees

Comments

@audetto
Copy link
Contributor

audetto commented Mar 5, 2014

I run mpd and it has been spamming my dmesg with these errors with different xxx and yyy numbers.

I run ArchLinux 3.13.5 and firmware

Feb 21 2014 17:44:46
Copyright (c) 2012 Broadcom
version 67f299c3a92a793fb8b8efcaf94f7f553153f89a (clean) (release)

The usb card is a cm108 attached directly to the Pi (otherwise the sound is appalling).

I can read everywhere that I should pass nrpacks=1 to snd-usb-audio, but 3.13 has removed this parameter.

The more the system is loaded, the more glitches there are.

Is there a solution for 3.13?

@popcornmix
Copy link
Collaborator

Does it work better with 3.10?

@audetto
Copy link
Contributor Author

audetto commented Mar 6, 2014

yes, but that defeats the point.
3.10 + bluetooth + bluez + PS3 controller does not work, that is why I use 3.13 (this is not the FIQ issues AFAICT, more some regression in the bluetooth stack)

on 3.10: nrpacks=1 no errors logged, nrpacks=20 flooded, default=8, depends on load.

976b6c0

this removes nrpacks.
to be honest I thought the higher the value, the bigger the buffer, the better, but it is clearly not like this.

@popcornmix
Copy link
Collaborator

@P33M have you seen the linked commit? Do you think that will make audio worse for us?

@audetto have you tried:
https://github.com/raspberrypi/linux/tree/rpi-3.10.y-next
This includes the new USB FIQ FSM rewrite that may improve things.

@P33M
Copy link
Contributor

P33M commented Mar 6, 2014

Egads, that comes with a humungous commit message. This will take some studying.

@audetto
Copy link
Contributor Author

audetto commented Mar 8, 2014

I've tried the new FIQ firmware and

  1. on raspbian, not too many "delay: estimated xxx, actual yyy", but plenty of NYET (see "kernel NULL pointer dereference" with sustained network traffic #341 )
  2. on archlinux 3.10: still flooded with "delay: estimated xxx, actual yyy", very few NYET

I could not try arch 3.13 as I could not find a firmware.

I seem to have noticed the raspbian uses pulseaudio, archlinux alsa directly, but this might be unrelated.

Audio quality: more stable, less glitches when system is very loaded. In normal conditions, there are still a few glitches.

@audetto
Copy link
Contributor Author

audetto commented Mar 15, 2014

I finally managed to compile 3.13.y-next, with bitmask 0x7.
The system had a CM108 attached to Pi, a bluetooth CSR 4.0 to a powered hub. (the latter not actually being used)

I still got a few NYET and delay messages but it sounded better.
But then it all crashed badly, I think related to NFS activity, when minidlna stared scanning a small media repository.
Unfortunately I did not have a way to get any message out of it.

@audetto
Copy link
Contributor Author

audetto commented Mar 15, 2014

Ok, it does not crash the Pi.
But it kills the network stack.
From the serial console I got (I can even still log in)

(I put it on pastebin as well, as github seems to mess it up).
http://pastebin.com/QWJJCRix

there are a lot of "delay: estimated 221, actual 44" before this.

[  809.141132] Transfer to device 4 endpoint 0x7 failed - FIQ reported NYET. Data may have been lost.
[  809.158001] Transfer to device 4 endpoint 0x7 failed - FIQ reported NYET. Data may have been lost.
[  863.215709] ------------[ cut here ]------------
[  863.223879] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x2c0/0x2e0()
[  863.239476] NETDEV WATCHDOG: eth0 (smsc95xx): transmit queue 0 timed out
[  863.249789] Modules linked in: btusb ip6t_REJECT xt_hl ip6t_rt snd_usb_audio snd_usbmidi_lib evdev snd_soc_bcm2708_i2s bluetooth nf_conntrack_ipv6 regmap_mmio snd_hwdep nf_defrag_ipv6 snd_rawmidi snd_seq_device snd_soc_core ipt_REJECT leds_gpio xt_comment snd_compress xt_recent led_class regmap_i2c snd_pcm_dmaengine regmap_spi snd_pcm snd_page_alloc snd_timer xt_limit snd xt_tcpudp bcm2708_dmaengine virt_dma xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables bcm2708_rng rng_core ipv6
[  863.335144] CPU: 0 PID: 0 Comm: swapper Not tainted 3.13.6-andsoft+ #2
[  863.345073] [<c00146d8>] (unwind_backtrace+0x0/0xf0) from [<c0011904>] (show_stack+0x10/0x14)
[  863.360017] [<c0011904>] (show_stack+0x10/0x14) from [<c0024314>] (warn_slowpath_common+0x6c/0x8c)
[  863.375378] [<c0024314>] (warn_slowpath_common+0x6c/0x8c) from [<c00243c8>] (warn_slowpath_fmt+0x30/0x40)
[  863.391314] [<c00243c8>] (warn_slowpath_fmt+0x30/0x40) from [<c04a5604>] (dev_watchdog+0x2c0/0x2e0)
[  863.406618] [<c04a5604>] (dev_watchdog+0x2c0/0x2e0) from [<c002ff64>] (call_timer_fn+0x30/0x15c)
[  863.421599] [<c002ff64>] (call_timer_fn+0x30/0x15c) from [<c00303ec>] (run_timer_softirq+0x1c0/0x318)
[  863.436923] [<c00303ec>] (run_timer_softirq+0x1c0/0x318) from [<c0028808>] (__do_softirq+0xcc/0x2d4)
[  863.452194] [<c0028808>] (__do_softirq+0xcc/0x2d4) from [<c0028d3c>] (irq_exit+0xa0/0xec)
[  863.466498] [<c0028d3c>] (irq_exit+0xa0/0xec) from [<c000f1f0>] (handle_IRQ+0x34/0x84)
[  863.480681] [<c000f1f0>] (handle_IRQ+0x34/0x84) from [<c054b754>] (__irq_svc+0x34/0xc8)
[  863.495139] [<c054b754>] (__irq_svc+0x34/0xc8) from [<c000f338>] (arch_cpu_idle+0x28/0x38)
[  863.510020] [<c000f338>] (arch_cpu_idle+0x28/0x38) from [<c005f058>] (cpu_startup_entry+0x5c/0x204)
[  863.525838] [<c005f058>] (cpu_startup_entry+0x5c/0x204) from [<c0776a30>] (start_kernel+0x31c/0x388)
[  863.541793] ---[ end trace bc7e11830c8ff683 ]---
[ 1007.412059] nfs: server magazzino not responding, still trying
[ 1007.421393] nfs: server magazzino not responding, still trying

I think some other usb related parts are damaged (alsamixer does not work).

Is this related to the FIQ problem?

@audetto
Copy link
Contributor Author

audetto commented Mar 16, 2014

I've tried again and this time it did not happen.

Previously it was enough to stop and restart minidlna (with full rescan).
What I've noticed though is that in the crash case at the top of "top" alongside minidlnad I could find VCHIQ-0, using 30-40% CPU.

In the last attempt, VCHIQ-0 did not show up. Is this related?

@P33M
Copy link
Contributor

P33M commented Mar 19, 2014

Can you try with BRANCH=next firmware after the next update is pushed?

@audetto
Copy link
Contributor Author

audetto commented Mar 23, 2014

I've tested 3.13.y-next with bitmask 0x7 and I did not see much of an improvement.
Maybe a few less NYET, but as soon as minidlna started scanning the NFS mount,
I got the same as before.

tried the workaround suggested in #151, but the regvalue was already 0x31.
the pi eventually managed to reboot from the serial console (although it took a while)

(unrelated to this issue, but the new code removed all the nyet for the internal audio when used in conjunction with bluetooth. here unfortunately it is not the case)

@P33M
Copy link
Contributor

P33M commented Mar 24, 2014

Thanks for the feedback.

Is the loss of network reproducible with no other USB devices connected? I.e. if you SSH in and tell minidlna to scan an NFS share, does the network still die?

@audetto
Copy link
Contributor Author

audetto commented Mar 24, 2014

no no.
mi Pi has been runnig minidlna for 1 year.
7 times a day it downloads some mp4 and then rescans the nfs folder (where there are about 60 mp4, 1 minute scan time in total)
this has worked rock solid so far.

when I decided to make it an internet radio as well (so I added internal audio + bluetooth) I got the infamous panic (so far solved by the new FIQ code)

meanwhile I did try to replace internal audio with a usb card, and only when I play with mpd/mpc, does minidlna cause problems. the card is attached to the Pi itself. the sounds starts glitching and then the network stops.

@P33M
Copy link
Contributor

P33M commented Apr 28, 2014

The incorrect generation of delay messages should be fixed as of the latest BRANCH=next firmware.

Spurious NYET messages may occasionally cause delay messages to get printed due to the lost samples.

Do you still get network dropouts with the latest firmware?

If you do, then can you post the output of several lines of vmstat 2?

@P33M
Copy link
Contributor

P33M commented May 20, 2014

No response from OP in > 2 weeks: assuming functional.

@P33M P33M closed this as completed May 20, 2014
popcornmix pushed a commit that referenced this issue Jul 27, 2023
[ Upstream commit d6a95db ]

There's issue as follows when do fault injection:
WARNING: CPU: 1 PID: 14870 at include/linux/quotaops.h:51 dquot_disable+0x13b7/0x18c0
Modules linked in:
CPU: 1 PID: 14870 Comm: fsconfig Not tainted 6.3.0-next-20230505-00006-g5107a9c821af-dirty #541
RIP: 0010:dquot_disable+0x13b7/0x18c0
RSP: 0018:ffffc9000acc79e0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88825e41b980
RDX: 0000000000000000 RSI: ffff88825e41b980 RDI: 0000000000000002
RBP: ffff888179f68000 R08: ffffffff82087ca7 R09: 0000000000000000
R10: 0000000000000001 R11: ffffed102f3ed026 R12: ffff888179f68130
R13: ffff888179f68110 R14: dffffc0000000000 R15: ffff888179f68118
FS:  00007f450a073740(0000) GS:ffff88882fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe96f2efd8 CR3: 000000025c8ad000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 dquot_load_quota_sb+0xd53/0x1060
 dquot_resume+0x172/0x230
 ext4_reconfigure+0x1dc6/0x27b0
 reconfigure_super+0x515/0xa90
 __x64_sys_fsconfig+0xb19/0xd20
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Above issue may happens as follows:
ProcessA              ProcessB                    ProcessC
sys_fsconfig
  vfs_fsconfig_locked
   reconfigure_super
     ext4_remount
      dquot_suspend -> suspend all type quota

                 sys_fsconfig
                  vfs_fsconfig_locked
                    reconfigure_super
                     ext4_remount
                      dquot_resume
                       ret = dquot_load_quota_sb
                        add_dquot_ref
                                           do_open  -> open file O_RDWR
                                            vfs_open
                                             do_dentry_open
                                              get_write_access
                                               atomic_inc_unless_negative(&inode->i_writecount)
                                              ext4_file_open
                                               dquot_file_open
                                                dquot_initialize
                                                  __dquot_initialize
                                                   dqget
						    atomic_inc(&dquot->dq_count);

                          __dquot_initialize
                           __dquot_initialize
                            dqget
                             if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
                               ext4_acquire_dquot
			        -> Return error DQ_ACTIVE_B flag isn't set
                         dquot_disable
			  invalidate_dquots
			   if (atomic_read(&dquot->dq_count))
	                    dqgrab
			     WARN_ON_ONCE(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
	                      -> Trigger warning

In the above scenario, 'dquot->dq_flags' has no DQ_ACTIVE_B is normal when
dqgrab().
To solve above issue just replace the dqgrab() use in invalidate_dquots() with
atomic_inc(&dquot->dq_count).

Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230605140731.2427629-3-yebin10@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
popcornmix pushed a commit that referenced this issue Jul 27, 2023
[ Upstream commit d6a95db ]

There's issue as follows when do fault injection:
WARNING: CPU: 1 PID: 14870 at include/linux/quotaops.h:51 dquot_disable+0x13b7/0x18c0
Modules linked in:
CPU: 1 PID: 14870 Comm: fsconfig Not tainted 6.3.0-next-20230505-00006-g5107a9c821af-dirty #541
RIP: 0010:dquot_disable+0x13b7/0x18c0
RSP: 0018:ffffc9000acc79e0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88825e41b980
RDX: 0000000000000000 RSI: ffff88825e41b980 RDI: 0000000000000002
RBP: ffff888179f68000 R08: ffffffff82087ca7 R09: 0000000000000000
R10: 0000000000000001 R11: ffffed102f3ed026 R12: ffff888179f68130
R13: ffff888179f68110 R14: dffffc0000000000 R15: ffff888179f68118
FS:  00007f450a073740(0000) GS:ffff88882fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe96f2efd8 CR3: 000000025c8ad000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 dquot_load_quota_sb+0xd53/0x1060
 dquot_resume+0x172/0x230
 ext4_reconfigure+0x1dc6/0x27b0
 reconfigure_super+0x515/0xa90
 __x64_sys_fsconfig+0xb19/0xd20
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Above issue may happens as follows:
ProcessA              ProcessB                    ProcessC
sys_fsconfig
  vfs_fsconfig_locked
   reconfigure_super
     ext4_remount
      dquot_suspend -> suspend all type quota

                 sys_fsconfig
                  vfs_fsconfig_locked
                    reconfigure_super
                     ext4_remount
                      dquot_resume
                       ret = dquot_load_quota_sb
                        add_dquot_ref
                                           do_open  -> open file O_RDWR
                                            vfs_open
                                             do_dentry_open
                                              get_write_access
                                               atomic_inc_unless_negative(&inode->i_writecount)
                                              ext4_file_open
                                               dquot_file_open
                                                dquot_initialize
                                                  __dquot_initialize
                                                   dqget
						    atomic_inc(&dquot->dq_count);

                          __dquot_initialize
                           __dquot_initialize
                            dqget
                             if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
                               ext4_acquire_dquot
			        -> Return error DQ_ACTIVE_B flag isn't set
                         dquot_disable
			  invalidate_dquots
			   if (atomic_read(&dquot->dq_count))
	                    dqgrab
			     WARN_ON_ONCE(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
	                      -> Trigger warning

In the above scenario, 'dquot->dq_flags' has no DQ_ACTIVE_B is normal when
dqgrab().
To solve above issue just replace the dqgrab() use in invalidate_dquots() with
atomic_inc(&dquot->dq_count).

Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230605140731.2427629-3-yebin10@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants