Kernel panic on Raspberry PI 3 B+ #2576

Closed
gourry opened this issue Jun 6, 2018 · 10 comments

Comments

gourry commented Jun 6, 2018

This seems similar to #2555 but it's happening in a very different situation.

I have been running the same headless Raspbian image since the first Raspberry Pi model, upgrading it over time (it is now Raspbian Stretch). Along the way I moved it to a Raspberry Pi B+, a Raspberry Pi 2, a Raspberry Pi 3 B, and now a Raspberry Pi 3 B+.

The Raspberry Pi is connected only to an Ethernet cable and an I2C temperature sensor. No other device is attached (no video, no USB, no WLAN or Bluetooth devices).

The issues started with the newest hardware. The system is no longer stable and the kernel panics after a few hours of uptime.
The system sits idle most of the time, hosting only the transmission daemon and mosquitto. I add jobs to transmission occasionally, on average one download a week.

All packages are up to date.

uname -a
Linux raspberrypi 4.14.44-v7+ #1117 SMP Thu May 31 16:57:56 BST 2018 armv7l GNU/Linux

dmesg
[ 623.334401] Unable to handle kernel NULL pointer dereference at virtual address 00000014
[ 623.334574] pgd = 80004000
[ 623.334631] [00000014] *pgd=00000000
[ 623.334710] Internal error: Oops: 817 [#1] SMP ARM
[ 623.334794] Modules linked in: fuse appletalk psnap llc ax25 cmac sha256_generic arc4 ecb md4 md5 hmac nls_utf8 cifs ccm brcmfmac brcmutil cfg80211 rfkill i2c_bcm2835 fixed uio_pdrv_genirq uio i2c_dev i2c_bcm2708 snd_bcm2835(C) snd_pcm snd_timer snd ip_tables x_tables ipv6
[ 623.335304] CPU: 1 PID: 1495 Comm: postprocess.sh Tainted: G C 4.14.44-v7+ #1117
[ 623.335444] Hardware name: BCM2835
[ 623.335511] task: ac418000 task.stack: ac602000
[ 623.335603] PC is at tcp_close+0x304/0x4e0
[ 623.335682] LR is at __slab_free+0x258/0x3a4
[ 623.335760] pc : [<806e8ad4>] lr : [<802779e0>] psr: a0000013
[ 623.335867] sp : ac603e50 ip : 3da2a000 fp : ac603e7c
[ 623.335957] r10: 00000008 r9 : b40f7520 r8 : 00000000
[ 623.336047] r7 : 00000000 r6 : b4da2c5c r5 : 00000000 r4 : b4da2bc0
[ 623.336156] r3 : b4da2c5c r2 : 00000010 r1 : 80000013 r0 : b4da2c7d
[ 623.336267] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 623.338739] Control: 10c5383d Table: 2c7a006a DAC: 00000055
[ 623.341239] Process postprocess.sh (pid: 1495, stack limit = 0xac602210)
[ 623.343678] Stack: (0xac603e50 to 0xac604000)
[ 623.346081] 3e40: ac603e74 8071411c 807140f0 b4da2bc0
[ 623.350863] 3e60: b40f7500 bcdf9110 b40955d8 00000000 ac603e94 ac603e80 80714144 806e87dc
[ 623.355745] 3e80: b40f7500 00000000 ac603eac ac603e98 8066c208 807140fc bc056a80 b40f7520
[ 623.361030] 3ea0: ac603ebc ac603eb0 8066c2b0 8066c1e4 ac603efc ac603ec0 8028bd20 8066c2a0
[ 623.366569] 3ec0: 00000000 00000000 00000001 bc056a88 ac418528 b4cdd9c0 ac418528 ac418000
[ 623.372313] 3ee0: ac418544 80c97e04 bc056a80 00000000 ac603f0c ac603f00 8028bedc 8028bc90
[ 623.378147] 3f00: ac603f34 ac603f10 8013bc90 8028bed0 ac418000 00000000 ac603f40 bc018700
[ 623.384215] 3f20: 00000001 bc018738 ac603f74 ac603f38 80121dd8 8013bbe0 ac603f58 ac603f80
[ 623.390717] 3f40: 00000000 00000002 ac603f74 00000000 b8d07700 76f4a798 000000f8 80108204
[ 623.397534] 3f60: ac602000 00000000 ac603f94 ac603f78 80122654 80121a2c ffffffff 00000001
[ 623.404687] 3f80: 00000000 76f4a798 ac603fa4 ac603f98 8012270c 80122614 00000000 ac603fa8
[ 623.412125] 3fa0: 80108060 801226f8 00000001 00000000 00000000 00000000 00000000 ffffffff
[ 623.419683] 3fc0: 00000001 00000000 76f4a798 000000f8 76f4d0a4 000fa388 76fec000 00000000
[ 623.427534] 3fe0: 00000444 7ee349e0 76e3ed88 76ead454 60000010 00000000 00000000 00000000
[ 623.435500] [<806e8ad4>] (tcp_close) from [<80714144>] (inet_release+0x54/0x80)
[ 623.439581] [<80714144>] (inet_release) from [<8066c208>] (sock_release+0x30/0xbc)
[ 623.447912] [<8066c208>] (sock_release) from [<8066c2b0>] (sock_close+0x1c/0x24)
[ 623.455995] [<8066c2b0>] (sock_close) from [<8028bd20>] (__fput+0x9c/0x1e8)
[ 623.459937] [<8028bd20>] (__fput) from [<8028bedc>] (____fput+0x18/0x1c)
[ 623.463730] [<8028bedc>] (____fput) from [<8013bc90>] (task_work_run+0xbc/0xe0)
[ 623.467448] [<8013bc90>] (task_work_run) from [<80121dd8>] (do_exit+0x3b8/0xb9c)
[ 623.474713] [<80121dd8>] (do_exit) from [<80122654>] (do_group_exit+0x4c/0xe4)
[ 623.478378] [<80122654>] (do_group_exit) from [<8012270c>] (__wake_up_parent+0x0/0x30)
[ 623.485353] [<8012270c>] (__wake_up_parent) from [<80108060>] (ret_fast_syscall+0x0/0x28)
[ 623.492398] Code: e58430a4 e890000c e5808000 e5808004 (e5823004)
[ 623.495995] ---[ end trace c755bd4cb48b7745 ]---
[ 623.499511] Fixing recursive fault but reboot is needed!
[ 939.674448] CIFS VFS: sends on sock bb69c1c0 stuck for 15 seconds
[ 939.677767] CIFS VFS: Error -11 sending data on socket to server
[ 954.794631] CIFS VFS: sends on sock bb69c1c0 stuck for 15 seconds
[ 954.797811] CIFS VFS: Error -11 sending data on socket to server

I don't know whether this is a kernel issue or faulty hardware. Judging by the message, it looks like a kernel issue.

Contributor

pelwell commented Jun 6, 2018

What power supply are you using?

Author

gourry commented Jun 6, 2018

I'm using one like this: https://it.aliexpress.com/item/AC-110V-220V-to-DC-5V-3A-15W-Switching-Power-Supply-LED-Driver-for-LED-Strip/32828837298.html
It worked fine with all previous Raspberry Pi models. Do you think that the power supply could cause the issue?

Contributor

pelwell commented Jun 6, 2018

It's the first question we ask when we see a kernel panic in a well-used part of the upstream kernel, but on paper that supply looks adequate.

Is the kernel error log you posted representative of multiple crashes? Have you ever seen any messages saying "Under-voltage detected!"?

Assuming your answers are yes, and no, that's not an error I've seen before.

Author

gourry commented Jun 7, 2018

Thank you for the reply! I have seen the message "Under-voltage detected! (0x00050005)", immediately followed by "Voltage normalised (0x00000000)". I didn't pay much attention to it because the voltage measured at the PSU always seems stable at 5.10V, but I guess there could be a voltage drop in the connection cable.
I will try replacing all the cables with thicker ones and let it run again for a few hours.
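
For anyone reading the hex values in those messages: a minimal Python sketch that decodes them, assuming the number in parentheses is the same bitmask that `vcgencmd get_throttled` reports. The bit meanings below follow the Raspberry Pi documentation for that command; the mapping onto the kernel message is an assumption, and the name `decode_throttled` is purely illustrative.

```python
# Bit meanings as documented for `vcgencmd get_throttled`.
# Assumption: the kernel's "Under-voltage detected! (0x...)" message
# prints the same bitmask.
THROTTLE_FLAGS = {
    0: "under-voltage detected (now)",
    1: "ARM frequency capped (now)",
    2: "currently throttled",
    3: "soft temperature limit active (now)",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(value):
    """Return the labels of all flag bits that are set in `value`."""
    return [
        label
        for bit, label in sorted(THROTTLE_FLAGS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # The two values quoted in the messages above.
    for value in (0x00050005, 0x00000000):
        print(hex(value), decode_throttled(value))
```

Under that assumption, 0x00050005 has bits 0, 2, 16 and 18 set, which would mean the board was under-voltage and throttled when the message was logged.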

Contributor

pelwell commented Jun 7, 2018

Good idea - some cables are awful.
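
To get a rough feel for how much a thin cable can cost, here is a back-of-the-envelope Python sketch using Ohm's law and the standard AWG diameter formula. Only the 5.10V supply reading comes from this thread; the 1 m length, the 1 A load current, and the 28 AWG "thin cable" case are assumptions chosen for illustration, and real cables add connector resistance on top.

```python
import math

# Resistivity of annealed copper at 20 C, in ohm*metres.
COPPER_RESISTIVITY = 1.724e-8


def awg_diameter_m(awg):
    """Conductor diameter in metres from the standard AWG formula."""
    return 0.127e-3 * 92 ** ((36 - awg) / 39)


def ohms_per_metre(awg):
    """Resistance per metre of a single solid copper conductor."""
    area = math.pi * (awg_diameter_m(awg) / 2) ** 2
    return COPPER_RESISTIVITY / area


def cable_drop(awg, length_m, current_a):
    """Voltage lost in the cable, counting both the supply and return wire."""
    return 2 * length_m * ohms_per_metre(awg) * current_a


if __name__ == "__main__":
    SUPPLY_V = 5.10   # measured at the PSU (from the thread)
    LENGTH_M = 1.0    # assumed cable length
    CURRENT_A = 1.0   # assumed load current
    for awg in (28, 22):  # 28 AWG is an assumed "thin" cable for comparison
        drop = cable_drop(awg, LENGTH_M, CURRENT_A)
        print(f"{awg} AWG: drop {drop:.3f} V, {SUPPLY_V - drop:.3f} V at the board")
```

The drop scales linearly with both length and current, so a short burst of higher current on a long thin cable can pull the board voltage down even when the PSU itself reads stable.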

Even if you do have a power supply issue, if the crash logs are consistent then it may point to a kernel bug, so it's worth reviewing a few of them.
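
One way to check whether several crash logs are consistent is to pull out the fault type and the faulting function from each oops and tally them. The following is a minimal Python sketch that assumes the log format shown in the first post; the function name `summarise_oops` is illustrative, and the sample contains only lines quoted from this thread.

```python
import re
from collections import Counter

# Patterns matching the oops header and program-counter lines as they
# appear in the dmesg output above.
FAULT_RE = re.compile(
    r"Unable to handle kernel (.+?) at virtual address ([0-9a-f]+)"
)
PC_RE = re.compile(r"PC is at ([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")

SAMPLE = """\
[ 623.334401] Unable to handle kernel NULL pointer dereference at virtual address 00000014
[ 623.335603] PC is at tcp_close+0x304/0x4e0
"""


def summarise_oops(log_text):
    """Return (fault kind, address, faulting function) for each oops found."""
    results = []
    fault = None
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            fault = (match.group(1), match.group(2))
            continue
        match = PC_RE.search(line)
        if match and fault is not None:
            results.append((fault[0], fault[1], match.group(1)))
            fault = None
    return results


if __name__ == "__main__":
    oopses = summarise_oops(SAMPLE)
    # Count how often each function appears as the faulting location.
    print(Counter(func for _, _, func in oopses))
```

If most crashes land in the same function, that points towards a kernel bug; faults scattered across unrelated functions and random addresses are more typical of a hardware or power problem.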

Author

gourry commented Jun 8, 2018

Thanks. I made a "homemade" cable using 22 AWG wire and now the power problem is gone. Looking at the previous kernel oopses, I see some differences between them; for example, some of them refer to "[ 596.875698] Unable to handle kernel paging request at virtual address 6f6320a8", so I guess the low power could have been the cause.
However, the problems are not resolved: I'm still experiencing kernel lockups related to CIFS. That was the issue I was originally investigating, because my transmission daemon was getting stuck while storing data on my NAS (CIFS mount).
If this has nothing to do with the kernel, please point me to the right project and close this issue. I don't want to add noise where I'm not supposed to.
The problem shows up as the lines pasted below. There are many of them repeated throughout the syslog, including on previous days (I'm pasting a few as examples). Any idea how to solve it? I never had this issue with the Raspberry Pi 3, 2, or 1. Could it be due to the upgraded Ethernet chip?

Thanks a lot!

Jun 8 15:49:59 raspberrypi kernel: [58120.469383] INFO: task transmission-da:799 blocked for more than 120 seconds.
Jun 8 15:49:59 raspberrypi kernel: [58120.471927] Tainted: G C 4.14.44-v7+ #1117
Jun 8 15:49:59 raspberrypi kernel: [58120.474426] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 8 15:49:59 raspberrypi kernel: [58120.479425] transmission-da D 0 799 1 0x00000000
Jun 8 15:50:00 raspberrypi kernel: [58120.482071] [<8079b418>] (__schedule) from [<8079ba90>] (schedule+0x50/0xa8)
Jun 8 15:50:00 raspberrypi kernel: [58120.484626] [<8079ba90>] (schedule) from [<8014b148>] (io_schedule+0x20/0x40)
Jun 8 15:50:00 raspberrypi kernel: [58120.487156] [<8014b148>] (io_schedule) from [<8021cb74>] (wait_on_page_bit+0x110/0x130)
Jun 8 15:50:00 raspberrypi kernel: [58120.492127] [<8021cb74>] (wait_on_page_bit) from [<8021cc74>] (__filemap_fdatawait_range+0xe0/0x114)
Jun 8 15:50:00 raspberrypi kernel: [58120.497281] [<8021cc74>] (__filemap_fdatawait_range) from [<8021ccd0>] (filemap_fdatawait_range+0x28/0x38)
Jun 8 15:50:00 raspberrypi kernel: [58120.502536] [<8021ccd0>] (filemap_fdatawait_range) from [<8021f084>] (filemap_write_and_wait+0x68/0x9c)
Jun 8 15:50:00 raspberrypi kernel: [58120.508125] [<8021f084>] (filemap_write_and_wait) from [<7f545028>] (cifs_reopen_file+0x364/0x430 [cifs])
Jun 8 15:50:00 raspberrypi kernel: [58120.514027] [<7f545028>] (cifs_reopen_file [cifs]) from [<7f54a3b4>] (cifs_readpages+0x408/0x6d0 [cifs])
Jun 8 15:50:00 raspberrypi kernel: [58120.519917] [<7f54a3b4>] (cifs_readpages [cifs]) from [<8022f63c>] (__do_page_cache_readahead+0x17c/0x284)
Jun 8 15:50:00 raspberrypi kernel: [58120.525833] [<8022f63c>] (__do_page_cache_readahead) from [<8022fb34>] (force_page_cache_readahead+0xb8/0x12c)
Jun 8 15:50:00 raspberrypi kernel: [58120.531787] [<8022fb34>] (force_page_cache_readahead) from [<80267934>] (SyS_fadvise64_64+0x2d4/0x314)
Jun 8 15:50:00 raspberrypi kernel: [58120.537790] [<80267934>] (SyS_fadvise64_64) from [<8010be18>] (sys_arm_fadvise64_64+0x28/0x30)
Jun 8 15:50:00 raspberrypi kernel: [58120.543911] [<8010be18>] (sys_arm_fadvise64_64) from [<80108060>] (ret_fast_syscall+0x0/0x28)

Jun 6 22:58:38 raspberrypi kernel: [ 980.956292] INFO: task transmission-da:696 blocked for more than 120 seconds.
Jun 6 22:58:38 raspberrypi kernel: [ 980.959486] Tainted: G C 4.14.44-v7+ #1117
Jun 6 22:58:38 raspberrypi kernel: [ 980.962627] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 6 22:58:38 raspberrypi kernel: [ 980.968999] transmission-da D 0 696 1 0x00000000
Jun 6 22:58:38 raspberrypi kernel: [ 980.972262] [<8079b418>] (__schedule) from [<8079ba90>] (schedule+0x50/0xa8)
Jun 6 22:58:38 raspberrypi kernel: [ 980.975402] [<8079ba90>] (schedule) from [<8014b148>] (io_schedule+0x20/0x40)
Jun 6 22:58:38 raspberrypi kernel: [ 980.978538] [<8014b148>] (io_schedule) from [<8021cb74>] (wait_on_page_bit+0x110/0x130)
Jun 6 22:58:38 raspberrypi kernel: [ 980.984608] [<8021cb74>] (wait_on_page_bit) from [<8021cc74>] (__filemap_fdatawait_range+0xe0/0x114)
Jun 6 22:58:38 raspberrypi kernel: [ 980.990671] [<8021cc74>] (__filemap_fdatawait_range) from [<8021ccd0>] (filemap_fdatawait_range+0x28/0x38)
Jun 6 22:58:38 raspberrypi kernel: [ 980.996670] [<8021ccd0>] (filemap_fdatawait_range) from [<8021f084>] (filemap_write_and_wait+0x68/0x9c)
Jun 6 22:58:38 raspberrypi kernel: [ 981.002760] [<8021f084>] (filemap_write_and_wait) from [<7f4ed028>] (cifs_reopen_file+0x364/0x430 [cifs])
Jun 6 22:58:38 raspberrypi kernel: [ 981.008991] [<7f4ed028>] (cifs_reopen_file [cifs]) from [<7f4f23b4>] (cifs_readpages+0x408/0x6d0 [cifs])
Jun 6 22:58:38 raspberrypi kernel: [ 981.015576] [<7f4f23b4>] (cifs_readpages [cifs]) from [<8022f63c>] (__do_page_cache_readahead+0x17c/0x284)
Jun 6 22:58:38 raspberrypi kernel: [ 981.022136] [<8022f63c>] (__do_page_cache_readahead) from [<8022fb34>] (force_page_cache_readahead+0xb8/0x12c)
Jun 6 22:58:38 raspberrypi kernel: [ 981.029398] [<8022fb34>] (force_page_cache_readahead) from [<80267934>] (SyS_fadvise64_64+0x2d4/0x314)
Jun 6 22:58:38 raspberrypi kernel: [ 981.036036] [<80267934>] (SyS_fadvise64_64) from [<8010be18>] (sys_arm_fadvise64_64+0x28/0x30)
Jun 6 22:58:38 raspberrypi kernel: [ 981.042513] [<8010be18>] (sys_arm_fadvise64_64) from [<80108060>] (ret_fast_syscall+0x0/0x28)

May 31 23:45:25 raspberrypi kernel: [91052.516033] INFO: task kworker/u8:0:4705 blocked for more than 120 seconds.
May 31 23:45:25 raspberrypi kernel: [91052.518051] Tainted: G C 4.14.34-v7+ #1110
May 31 23:45:25 raspberrypi kernel: [91052.519995] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 31 23:45:25 raspberrypi kernel: [91052.523674] kworker/u8:0 D 0 4705 2 0x00000000
May 31 23:45:25 raspberrypi kernel: [91052.525645] Workqueue: writeback wb_workfn (flush-cifs-27)
May 31 23:45:25 raspberrypi kernel: [91052.527692] [<8079a0b8>] (__schedule) from [<8079a730>] (schedule+0x50/0xa8)
May 31 23:45:25 raspberrypi kernel: [91052.529785] [<8079a730>] (schedule) from [<8079aba8>] (schedule_preempt_disabled+0x18/0x1c)
May 31 23:45:25 raspberrypi kernel: [91052.533927] [<8079aba8>] (schedule_preempt_disabled) from [<8079c4b0>] (__mutex_lock.constprop.3+0x190/0x58c)
May 31 23:45:25 raspberrypi kernel: [91052.538273] [<8079c4b0>] (__mutex_lock.constprop.3) from [<8079c9c8>] (__mutex_lock_slowpath+0x1c/0x20)
May 31 23:45:25 raspberrypi kernel: [91052.542856] [<8079c9c8>] (__mutex_lock_slowpath) from [<8079ca28>] (mutex_lock+0x5c/0x60)
May 31 23:45:25 raspberrypi kernel: [91052.548351] [<8079ca28>] (mutex_lock) from [<7f521ce4>] (cifs_reopen_file+0x34/0x430 [cifs])
May 31 23:45:25 raspberrypi kernel: [91052.554206] [<7f521ce4>] (cifs_reopen_file [cifs]) from [<7f525904>] (find_writable_file+0x188/0x28c [cifs])
May 31 23:45:25 raspberrypi kernel: [91052.560495] [<7f525904>] (find_writable_file [cifs]) from [<7f526128>] (cifs_writepages+0x720/0xa40 [cifs])
May 31 23:45:25 raspberrypi kernel: [91052.566693] [<7f526128>] (cifs_writepages [cifs]) from [<8022ea5c>] (do_writepages+0x30/0x8c)
May 31 23:45:25 raspberrypi kernel: [91052.572661] [<8022ea5c>] (do_writepages) from [<802bb198>] (__writeback_single_inode+0x44/0x430)
May 31 23:45:25 raspberrypi kernel: [91052.578850] [<802bb198>] (__writeback_single_inode) from [<802bba8c>] (writeback_sb_inodes+0x20c/0x4c4)
May 31 23:45:25 raspberrypi kernel: [91052.585180] [<802bba8c>] (writeback_sb_inodes) from [<802bbdd4>] (__writeback_inodes_wb+0x90/0xd0)
May 31 23:45:25 raspberrypi kernel: [91052.591646] [<802bbdd4>] (__writeback_inodes_wb) from [<802bc058>] (wb_writeback+0x244/0x358)
May 31 23:45:25 raspberrypi kernel: [91052.598297] [<802bc058>] (wb_writeback) from [<802bca58>] (wb_workfn+0x1d4/0x4d8)
May 31 23:45:25 raspberrypi kernel: [91052.605143] [<802bca58>] (wb_workfn) from [<80137490>] (process_one_work+0x158/0x454)
May 31 23:45:25 raspberrypi kernel: [91052.612071] [<80137490>] (process_one_work) from [<801377f0>] (worker_thread+0x64/0x5b8)
May 31 23:45:25 raspberrypi kernel: [91052.619175] [<801377f0>] (worker_thread) from [<8013d860>] (kthread+0x13c/0x16c)
May 31 23:45:25 raspberrypi kernel: [91052.626322] [<8013d860>] (kthread) from [<8010810c>] (ret_from_fork+0x14/0x28)
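
Since these reports repeat many times in the syslog, a small script can help summarise them. This is a minimal Python sketch that assumes the syslog format shown above; it extracts the kernel uptime, task name and PID of each hung-task report, plus the first stack frame that belongs to a loadable module. The name `parse_hung_tasks` is illustrative, and the sample lines are copied from the first report.

```python
import re

# "[58120.469383] INFO: task transmission-da:799 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] INFO: task (.+):(\d+) blocked for more than (\d+) seconds"
)
# "(cifs_reopen_file+0x364/0x430 [cifs])" -> function name and module name
MODULE_FRAME_RE = re.compile(r"\(([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")

SAMPLE = """\
Jun 8 15:49:59 raspberrypi kernel: [58120.469383] INFO: task transmission-da:799 blocked for more than 120 seconds.
Jun 8 15:50:00 raspberrypi kernel: [58120.482071] [<8079b418>] (__schedule) from [<8079ba90>] (schedule+0x50/0xa8)
Jun 8 15:50:00 raspberrypi kernel: [58120.508125] [<8021f084>] (filemap_write_and_wait) from [<7f545028>] (cifs_reopen_file+0x364/0x430 [cifs])
"""


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report found in the log text."""
    reports = []
    current = None
    for line in log_text.splitlines():
        match = HUNG_RE.search(line)
        if match:
            current = {
                "uptime_s": float(match.group(1)),
                "task": match.group(2),
                "pid": int(match.group(3)),
                "module_frame": None,
            }
            reports.append(current)
            continue
        # Record only the first module frame following each report header.
        if current is not None and current["module_frame"] is None:
            match = MODULE_FRAME_RE.search(line)
            if match:
                current["module_frame"] = (match.group(1), match.group(2))
    return reports


if __name__ == "__main__":
    for report in parse_hung_tasks(SAMPLE):
        hours = report["uptime_s"] / 3600
        print(f"{report['task']} (pid {report['pid']}) after {hours:.1f} h:",
              report["module_frame"])
```

Grouping the reports by module frame and by uptime makes it easier to see whether every hang goes through the same CIFS path and how long after boot the first one appears.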

Author

gourry commented Jun 8, 2018

As an additional note, the issues seem to start only after some time. If I reboot the Pi everything is fine, but after 1-2 hours the lockups start happening.
Thanks!

Contributor

pelwell commented Jun 8, 2018

That sounds like a duplicate of issue #2482. Feel free to add your observations there and close this issue if you think the symptoms fit.

Author

gourry commented Jun 8, 2018

It seems to fit, thank you! I added a reference there. I hope the issue can be fixed in software since, reading that thread, it seems to be a hardware one...

gourry closed this as completed Jun 8, 2018

Contributor

pelwell commented Jun 8, 2018

I think it will turn out to be a software problem - some resource being slowly exhausted and eventually running out.
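
If a resource really is being slowly exhausted, comparing snapshots taken shortly after boot and just before the lockups might show which one. This comment does not say which resource is suspected, so the following Python sketch simply diffs two `/proc/meminfo` snapshots as one possible place to look; the choice of `/proc/meminfo` and the function names are assumptions made for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def largest_changes(before, after, top=5):
    """Return the fields whose values changed the most between snapshots."""
    deltas = {key: after[key] - before[key] for key in before.keys() & after.keys()}
    ranked = sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top]


def read_snapshot(path="/proc/meminfo"):
    """Read and parse the current memory statistics (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

A possible use is to call `read_snapshot()` once after boot, call it again after an hour or two, and pass both results to `largest_changes()`. A field that only ever moves in one direction across repeated snapshots would be a candidate for the resource being exhausted.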
