modprobe driver 3.18.5+ or 6+ make linux kernel panic crashed #811

Closed
r00t8ug83 opened this issue Feb 8, 2015 · 16 comments

Comments

@r00t8ug83

It just crashes! Unable to boot...

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.149038] Internal error: Oops: 5 [#1] PREEMPT ARM

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.257700] Process modprobe (pid: 9159, stack limit = 0xd59041b0)

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.265672] Stack: (0xd5905e88 to 0xd5906000)

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.271776] 5e80:                   bf31efe4 00007fff c00862f4 c02fe1a8 00000013 00000000

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.283416] 5ea0: de1d5000 d5905f7c d5905f50 d5905eb8 00000000 bf31efe4 d5904008 bf31f020

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.295277] 5ec0: bf31f140 00000000 b6cf0000 d5904000 00002db0 00000000 00000000 bf2fc674

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.307226] 5ee0: 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.319192] 5f00: 00000000 00000000 00000000 00000000 00000000 00000000 00000080 000bb188

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.331270] 5f20: b6c3d000 b6f20948 00000080 c000eb44 d5904000 00000000 d5905fa4 d5905f48

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.343597] 5f40: c0089970 c0087974 de1d5000 000bb188 de25cb3c de25c982 de28847c 0008519c

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.356085] 5f60: 000933dc 00000000 00000000 00000000 0000002b 0000002c 00000021 00000025

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.368737] 5f80: 00000014 00000000 00000000 00000000 00040000 b88a1c88 00000000 d5905fa8

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.381477] 5fa0: c000e8c0 c0089890 00000000 00040000 b6c3d000 000bb188 b6f20948 b6c3d000

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.394319] 5fc0: 00000000 00040000 b88a1c88 00000080 b88a1d68 000bb188 b6f20948 00000000

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.407232] 5fe0: 00000000 be99a40c b6f17fb4 b6e830d4 60000010 b6c3d000 00000000 00000000

Message from syslogd@leepi at Feb  6 15:07:20 ...
 kernel:[ 3991.445669] Code: e51bc084 e15c0005 e2455008 0a000009 (e5953014)
@r00t8ug83
Author

Running modprobe causes a kernel panic:

[ 6489.918493] Unable to handle kernel paging request at virtual address 7e303ab8
[ 6489.929348] pgd = c1f0c000
[ 6489.933779] [7e303ab8] *pgd=00000000
[ 6489.940834] Internal error: Oops: 5 [#1] PREEMPT ARM
[ 6489.947458] Modules linked in: mt7601Usta(O+) arc4 ecb md4 md5 hmac nls_utf8 cifs snd_bcm2835 snd_pcm snd_seq snd_seq_device snd_timer snd uio_pdrv_genirq uio
[ 6489.965148] CPU: 0 PID: 10249 Comm: modprobe Tainted: G           O   3.18.6+ #1
[ 6489.975771] task: da38d100 ti: d59f8000 task.ti: d59f8000
[ 6489.982890] PC is at load_module+0x1948/0x1f1c
[ 6489.989035] LR is at load_module+0x1934/0x1f1c
[ 6489.995116] pc : [<c00892e8>]    lr : [<c00892d4>]    psr: 30000013
[ 6489.995116] sp : d59f9e88  ip : bf19d578  fp : d59f9f44
[ 6490.009893] r10: bf19d410  r9 : 00000000  r8 : bf19d41c
[ 6490.016752] r7 : c05613a8  r6 : d59930e0  r5 : 7e303aa4  r4 : d59f9f48
[ 6490.024960] r3 : 00000000  r2 : 00000000  r1 : c48c6230  r0 : c0824120
[ 6490.033175] Flags: nzCV  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 6490.042047] Control: 00c5387d  Table: 01f0c008  DAC: 00000015
[ 6490.049555] Process modprobe (pid: 10249, stack limit = 0xd59f81b0)
[ 6490.057619] Stack: (0xd59f9e88 to 0xd59fa000)
[ 6490.063727] 9e80:                   bf19d41c 00007fff c008632c c02fe588 00000013 00000000
[ 6490.075385] 9ea0: de16b000 d59f9f7c d59f9f50 d59f9eb8 00000000 bf19d41c d59f8008 bf19d458
[ 6490.087272] 9ec0: bf19d578 00000000 b6ce0000 d59f8000 00002db0 00000000 00000000 bf166580
[ 6490.099232] 9ee0: 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 6490.111218] 9f00: 00000000 00000000 00000000 00000000 00000000 00000000 00000080 000f6ac0
[ 6490.123313] 9f20: b6bf6000 b6f15948 00000080 c000eb44 d59f8000 00000000 d59f9fa4 d59f9f48
[ 6490.135662] 9f40: c00899a8 c00879ac de16b000 000f6ac0 de220ee8 de220ce6 de259340 000b35d8
[ 6490.148168] 9f60: 000c3668 00000000 00000000 00000000 00000030 00000031 00000026 0000002a
[ 6490.160823] 9f80: 00000016 00000000 00000000 00000000 00040000 b71e8c88 00000000 d59f9fa8
[ 6490.173564] 9fa0: c000e8c0 c00898c8 00000000 00040000 b6bf6000 000f6ac0 b6f15948 b6bf6000
[ 6490.186407] 9fc0: 00000000 00040000 b71e8c88 00000080 b71e8d68 000f6ac0 b6f15948 00000000
[ 6490.199323] 9fe0: 00000000 befc240c b6f0cfb4 b6e77ab4 60000010 b6bf6000 ffffffff 00001fff
[ 6490.212281] [<c00892e8>] (load_module) from [<c00899a8>] (SyS_init_module+0xec/0x100)
[ 6490.224908] [<c00899a8>] (SyS_init_module) from [<c000e8c0>] (ret_fast_syscall+0x0/0x48)
[ 6490.237757] Code: e51bc084 e15c0005 e2455008 0a000009 (e5953014)
[ 6490.256444] ---[ end trace 8d62c540f2c185de ]---

@r00t8ug83
Author

When I type lsmod it just keeps loading, so I believe it has crashed. When I try to reboot, the Raspberry Pi is unable to boot up and just crashes. Is this a bug?

@r00t8ug83 r00t8ug83 changed the title modprobe driver 3.18.5+ make linux kernel panic modprobe driver 3.18.5+ make linux kernel panic crashed Feb 8, 2015
@r00t8ug83 r00t8ug83 changed the title modprobe driver 3.18.5+ make linux kernel panic crashed modprobe driver 3.18.5+ or 6+ make linux kernel panic crashed Feb 8, 2015
@popcornmix
Collaborator

What Pi do you have (e.g. B/B+/Pi2?)
Has it been reliable before you started having this issue?
Have you tried a clean sdcard install of raspbian?

@r00t8ug83
Author

Tested on a B+ and an A+ with a clean setup. Both get the kernel panic, and I have to uninstall the driver, otherwise a reboot fails to bring up the Pi.

@pelwell
Contributor

pelwell commented Feb 10, 2015

You say "have to uninstall the driver" - which driver? Are you talking about just one specific driver?

@r00t8ug83
Author

@pelwell It is a fresh build, so why would I need to uninstall a driver?

@pelwell
Contributor

pelwell commented Feb 10, 2015

I don't know why yet. Please answer the question - which driver is causing the kernel panic?

@r00t8ug83
Author

driver mt7601Usta

[  195.131026] Unable to handle kernel paging request at virtual address 7e303ab8
[  195.141859] pgd = d91dc000
[  195.146320] [7e303ab8] *pgd=00000000
[  195.152961] Internal error: Oops: 5 [#1] PREEMPT ARM
[  195.159523] Modules linked in: mt7601Usta(O+) arc4 ecb md4 md5 hmac nls_utf8 cifs snd_bcm2835 snd_pcm snd_seq snd_seq_device snd_timer snd uio_pdrv_genirq uio
[  195.177261] CPU: 0 PID: 2480 Comm: modprobe Tainted: G           O   3.18.6+ #753
[  195.187977] task: d9089b40 ti: d9174000 task.ti: d9174000
[  195.195130] PC is at load_module+0x1948/0x1f1c
[  195.201316] LR is at load_module+0x1934/0x1f1c
[  195.207445] pc : [<c00892e8>]    lr : [<c00892d4>]    psr: 30000013
[  195.207445] sp : d9175e88  ip : bf19d578  fp : d9175f44
[  195.222255] r10: bf19d410  r9 : 00000000  r8 : bf19d41c
[  195.229125] r7 : c05613a8  r6 : d9176300  r5 : 7e303aa4  r4 : d9175f48
[  195.237341] r3 : 00000000  r2 : 00000000  r1 : d9383ec8  r0 : c0824120
[  195.245567] Flags: nzCV  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  195.254445] Control: 00c5387d  Table: 191dc008  DAC: 00000015
[  195.261964] Process modprobe (pid: 2480, stack limit = 0xd91741b0)
[  195.269952] Stack: (0xd9175e88 to 0xd9176000)
[  195.276075] 5e80:                   bf19d41c 00007fff c008632c c02fe588 00000013 00000000
[  195.287755] 5ea0: de163000 d9175f7c d9175f50 d9175eb8 00000000 bf19d41c d9174008 bf19d458
[  195.299664] 5ec0: bf19d578 00000000 b6cf0000 d9174000 00002db0 00000000 00000000 bf166580
[  195.311640] 5ee0: 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  195.323621] 5f00: 00000000 00000000 00000000 00000000 00000000 00000000 00000080 000f6ac0
[  195.335726] 5f20: b6bfa000 b6f19948 00000080 c000eb44 d9174000 00000000 d9175fa4 d9175f48
[  195.348081] 5f40: c00899a8 c00879ac de163000 000f6ac0 de218ee8 de218ce6 de251340 000b35d8
[  195.360584] 5f60: 000c3668 00000000 00000000 00000000 00000030 00000031 00000026 0000002a
[  195.373259] 5f80: 00000016 00000000 00000000 00000000 00040000 b8952c88 00000000 d9175fa8
[  195.386039] 5fa0: c000e8c0 c00898c8 00000000 00040000 b6bfa000 000f6ac0 b6f19948 b6bfa000
[  195.398915] 5fc0: 00000000 00040000 b8952c88 00000080 b8952d68 000f6ac0 b6f19948 00000000
[  195.411847] 5fe0: 00000000 bebb240c b6f10fb4 b6e7bab4 60000010 b6bfa000 00000000 00000000
[  195.424819] [<c00892e8>] (load_module) from [<c00899a8>] (SyS_init_module+0xec/0x100)
[  195.437474] [<c00899a8>] (SyS_init_module) from [<c000e8c0>] (ret_fast_syscall+0x0/0x48)
[  195.450356] Code: e51bc084 e15c0005 e2455008 0a000009 (e5953014)
[  195.463278] ---[ end trace aab3b0cdfad69e78 ]---

@r00t8ug83
Author

Tested on 3.18.6+ #753: kernel panic. I have to roll back to 3.15.36+.

@pelwell
Contributor

pelwell commented Feb 10, 2015

It appears to be crashing inside the function "add_usage_links", probably because the "target_list" field of the "struct module" is invalid or corrupt.

From what I can see this is an out-of-tree module, and I think it is probably built for the wrong kernel. Where did you get it?
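
As an illustration of how the oops output supports this reading: the bracketed word on the `Code:` line, `e5953014`, is the instruction that faulted. The Python sketch below (the helper name `decode_arm_ldr_imm` is made up for this note) decodes that word as an ARM `LDR` with an immediate offset, then checks that the base register value from the dump plus the offset equals the reported faulting address. It only shows that the kernel read through a bad pointer held in r5 at offset 0x14; it does not by itself identify which structure field was being read.

```python
def decode_arm_ldr_imm(word: int):
    """Decode an ARM (A32) 'LDR Rt, [Rn, #offset]' instruction word.

    Returns (rt, rn, offset). Only the plain word-sized load with
    offset addressing (P=1, W=0, B=0, L=1) is handled in this sketch.
    """
    if (word >> 25) & 0b111 != 0b010:
        raise ValueError("not a load/store with immediate offset")
    pre_indexed = (word >> 24) & 1   # P bit
    add_offset = (word >> 23) & 1    # U bit
    byte_access = (word >> 22) & 1   # B bit
    write_back = (word >> 21) & 1    # W bit
    is_load = (word >> 20) & 1       # L bit
    if not (pre_indexed and is_load) or byte_access or write_back:
        raise ValueError("addressing mode not handled by this sketch")
    rn = (word >> 16) & 0xF          # base register
    rt = (word >> 12) & 0xF          # destination register
    imm12 = word & 0xFFF
    offset = imm12 if add_offset else -imm12
    return rt, rn, offset


# Values copied from the oops: Code: ... (e5953014), r5 : 7e303aa4
rt, rn, offset = decode_arm_ldr_imm(0xE5953014)
r5 = 0x7E303AA4
fault_address = (r5 + offset) & 0xFFFFFFFF

print(f"ldr r{rt}, [r{rn}, #{offset}]")   # ldr r3, [r5, #20]
print(hex(fault_address))                 # 0x7e303ab8
```

The computed address matches the "Unable to handle kernel paging request at virtual address 7e303ab8" line, so r5 held the invalid pointer.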

@r00t8ug83
Author

@pelwell FYI, it works on Raspberry Pi kernel 3.15.36+, the last update (Hexxeh/rpi-firmware@f74b921) before 3.18.x was applied.

It is impossible that I compiled against the wrong kernel. I got the source from here:
wget https://github.com/raspberrypi/linux/archive/rpi-3.18.y.tar.gz

@r00t8ug83
Author

@pelwell Is there any big difference between the 3.15.x and 3.18.x kernels? The same driver source code only causes a kernel panic on 3.18.x on the Raspberry Pi B+/A+.

@pelwell
Contributor

pelwell commented Feb 10, 2015

That module is not part of the kernel tree you linked to. We don't support out-of-tree modules, but I am trying to help you.

Unlike Windows, Linux drivers have to be compiled against the kernel they are to be loaded by. So, where did you get the module? Please provide a link.

  • If it is prebuilt, then you probably need to get a new one compiled against the 3.18.y kernel.
  • If you built it yourself, you will need to rebuild it against the new kernel.
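
One quick way to test the "built for the wrong kernel" theory is to compare the module's `vermagic` string with the running kernel release. The Python sketch below is a minimal illustration, not an official tool: the function names are made up, and the sample strings are illustrative rather than taken from the reporter's system. `check_module` assumes the `modinfo` utility is installed and is defined but not called here.

```python
import platform
import subprocess


def vermagic_release(vermagic: str) -> str:
    """Return the kernel release, which is the first field of a vermagic string."""
    return vermagic.split()[0]


def built_for(vermagic: str, running_release: str) -> bool:
    """True if the module's vermagic names the given kernel release."""
    return vermagic_release(vermagic) == running_release


def check_module(path: str) -> bool:
    """Compare a .ko file against the running kernel (needs modinfo installed)."""
    vermagic = subprocess.check_output(
        ["modinfo", "-F", "vermagic", path], text=True
    ).strip()
    return built_for(vermagic, platform.release())


# Illustrative strings only, not taken from the reporter's system.
sample = "3.18.6+ preempt mod_unload modversions ARMv6"
print(built_for(sample, "3.18.6+"))    # True
print(built_for(sample, "3.15.36+"))   # False
```

A matching release is necessary but not sufficient: it does not show which compiler or kernel configuration the module was built with.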

@r00t8ug83
Author

@pelwell I'm using this: https://github.com/porjo/mt7601

Please help, I just want to understand why kernel 3.18.x panics.

@pelwell
Contributor

pelwell commented Feb 10, 2015

You need to post this question in the Pi Forums. In fact I see you are already doing that.

This is not a kernel bug, you just don't know how to build this external module. Closing.

@pelwell pelwell closed this as completed Feb 10, 2015
@r00t8ug83
Author

This fixed my problem: apt-get install gcc-4.8 g++-4.8. Was the 3.18.x kernel compiled with gcc 4.8? By default the Raspberry Pi uses gcc 4.6. Could the default image switch to gcc 4.8?
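
The compiler question above can usually be answered from `/proc/version`, which on kernels of this era includes a "gcc version X.Y.Z" field. The Python sketch below is a hedged illustration: the function names are made up, the sample `/proc/version` text is invented for the example and is not a statement about which compiler actually built the 3.18.x kernel, and the regular expression assumes the older "gcc version" wording.

```python
import re


def kernel_gcc_version(proc_version: str):
    """Extract (major, minor) of the gcc that built the kernel, or None."""
    match = re.search(r"gcc version (\d+)\.(\d+)", proc_version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_gcc_version(text: str):
    """Parse the output of 'gcc -dumpversion', e.g. '4.6' or '4.6.3'."""
    parts = text.strip().split(".")
    minor = int(parts[1]) if len(parts) > 1 else 0
    return int(parts[0]), minor


def same_compiler_series(proc_version: str, local_gcc: str) -> bool:
    """True if the local gcc has the same major.minor as the kernel's gcc."""
    return kernel_gcc_version(proc_version) == parse_gcc_version(local_gcc)


# Invented sample text for illustration; read /proc/version on a real system.
sample = "Linux version 3.18.6+ (user@host) (gcc version 4.8.3 (example)) #753 PREEMPT"
print(kernel_gcc_version(sample))             # (4, 8)
print(same_compiler_series(sample, "4.6.3"))  # False
print(same_compiler_series(sample, "4.8"))    # True
```

On a real system the inputs would come from reading `/proc/version` and running `gcc -dumpversion`.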

popcornmix pushed a commit that referenced this issue Dec 1, 2017
commit 0bad47c upstream.

During each NFSv4 callback Call, an RDMA Send completion frees the
page that contains the RPC Call message. If the upper layer
determines that a retransmit is necessary, this is too soon.

One possible symptom: after a GARBAGE_ARGS response an NFSv4.1
callback request, the following BUG fires on the NFS server:

kernel: BUG: Bad page state in process kworker/0:2H  pfn:7d3ce2
kernel: page:ffffea001f4f3880 count:-2 mapcount:0 mapping:          (null) index:0x0
kernel: flags: 0x2fffff80000000()
kernel: raw: 002fffff80000000 0000000000000000 0000000000000000 fffffffeffffffff
kernel: raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
kernel: page dumped because: nonzero _refcount
kernel: Modules linked in: cts rpcsec_gss_krb5 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm
ocfs2_nodemanager ocfs2_stackglue rpcrdm a ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
rdma_cm ib_cm iw_cm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
kvm irqbypass crct10dif_pc lmul crc32_pclmul ghash_clmulni_intel pcbc iTCO_wdt
iTCO_vendor_support aesni_intel crypto_simd glue_helper cryptd pcspkr lpc_ich i2c_i801
mei_me mf d_core mei raid0 sg wmi ioatdma ipmi_si ipmi_devintf ipmi_msghandler shpchp
acpi_power_meter acpi_pad nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs
libcrc32c mlx4_en mlx4_ib mlx5_ib ib_core sd_mod sr_mod cdrom ast drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci crc32c_intel libahci drm
mlx5_core igb libata mlx4_core dca i2c_algo_bit i2c_core nvme
kernel: ptp nvme_core pps_core dm_mirror dm_region_hash dm_log dm_mod dax
kernel: CPU: 0 PID: 11495 Comm: kworker/0:2H Not tainted 4.14.0-rc3-00001-g577ce48 #811
kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
kernel: Call Trace:
kernel: dump_stack+0x62/0x80
kernel: bad_page+0xfe/0x11a
kernel: free_pages_check_bad+0x76/0x78
kernel: free_pcppages_bulk+0x364/0x441
kernel: ? ttwu_do_activate.isra.61+0x71/0x78
kernel: free_hot_cold_page+0x1c5/0x202
kernel: __put_page+0x2c/0x36
kernel: svc_rdma_put_context+0xd9/0xe4 [rpcrdma]
kernel: svc_rdma_wc_send+0x50/0x98 [rpcrdma]

This issue exists all the way back to v4.5, but refactoring and code
re-organization prevents this simple patch from applying to kernels
older than v4.12. The fix is the same, however, if someone needs to
backport it.

Reported-by: Ben Coddington <bcodding@redhat.com>
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=314
Fixes: 5d252f9 ('svcrdma: Add class for RDMA backwards ... ')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Apr 16, 2024
LOCKDEP detector reported below warning:
----------------------------------------
[   23.796949] ========================================================
[   23.796950] WARNING: possible irq lock inversion dependency detected
[   23.796952] 6.8.0fix+ #811 Not tainted
[   23.796954] --------------------------------------------------------
[   23.796954] kworker/0:1/8 just changed the state of lock:
[   23.796956] ff365325e084a9b8 (&domain->lock){..-.}-{3:3}, at: amd_iommu_flush_iotlb_all+0x1f/0x50
[   23.796969] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   23.796970]  (pd_bitmap_lock){+.+.}-{3:3}
[   23.796972]

               and interrupts could create inverse lock ordering between them.

[   23.796973]
               other info that might help us debug this:
[   23.796974] Chain exists of:
                 &domain->lock --> &dev_data->lock --> pd_bitmap_lock

[   23.796980]  Possible interrupt unsafe locking scenario:

[   23.796981]        CPU0                    CPU1
[   23.796982]        ----                    ----
[   23.796983]   lock(pd_bitmap_lock);
[   23.796985]                                local_irq_disable();
[   23.796985]                                lock(&domain->lock);
[   23.796988]                                lock(&dev_data->lock);
[   23.796990]   <Interrupt>
[   23.796991]     lock(&domain->lock);

Fix this issue by disabling interrupt when acquiring pd_bitmap_lock.

Note that this is temporary fix. We have a plan to replace custom bitmap
allocator with IDA allocator.

Fixes: 87a6f1f ("iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue")
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240404102717.6705-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>