kernel panic on 4.9.0 for s7150x2 #30

Open
duanwujie opened this issue Nov 25, 2019 · 2 comments

Comments

@duanwujie

[ 63.725792] gim: loading out-of-tree module taints kernel.
[ 63.728298] gim info:(gim_init:149) Start AMD open source GIM initialization
[ 63.728299] gim info:(gim_init:152) GPU IOV MODULE - version 1.1.4
[ 63.728299] gim info:(gim_init:154) Copyright (c) 2014-2017 Advanced Micro Devices, Inc. All rights reserved.
[ 63.728305] gim info:(parse_config_file:219) AMD GIM fb_option = 0
[ 63.728305] gim info:(parse_config_file:219) AMD GIM sched_option = 0
[ 63.728306] gim info:(parse_config_file:219) AMD GIM vf_num = 0
[ 63.728306] gim info:(parse_config_file:219) AMD GIM pf_fb = 0
[ 63.728306] gim info:(parse_config_file:219) AMD GIM vf_fb = 0
[ 63.728307] gim info:(parse_config_file:219) AMD GIM sched_interval = 0
[ 63.728307] gim info:(parse_config_file:219) AMD GIM sched_interval_us = 0
[ 63.728308] gim info:(parse_config_file:219) AMD GIM fb_clear = 0
[ 63.728308] gim info:(init_config:341) INIT CONFIG
[ 63.773658] gim info:(enumerate_all_pfs:146) pfdev :81d60000
[ 63.773659] gim info:(enumerate_all_pfs:146) pfdev :81d63000
[ 63.773660] dwj pf_count : 2
[ 63.773662] gim info:(set_new_adapter:572) curr allocated at ffffffffc0c05d80
[ 63.773662] gim info:(set_new_adapter:579) SRIOV is supported
[ 63.773665] gim info:(set_new_adapter:587) found PCI bridge device
[ 63.773667] gim info:(set_new_adapter:591) found: 02:8.0
[ 63.773691] gim info:(set_new_adapter:608) mmio_base = ffffaa7388fc0000
[ 63.773696] gim info:(set_new_adapter:610) doorbell = ffffaa7389e00000
[ 63.773697] gim error:(map_fb:369) can't iomap for BAR 0
[ 63.774281] gim info:(set_new_adapter:612) pf.fb_va = (null)
[ 63.774293] gim info:(sriov_is_ari_enabled:164) PCI_SRIOV_CAP = 0x00000002
[ 63.774295] gim info:(sriov_is_ari_enabled:174) PCI_SRIOV_CTRL = 0x00000010
[ 63.774295] gim info:(sriov_is_ari_enabled:177) PCI_SRIOV_CTRL_ARI is set --> ARI is supported
[ 63.774298] gim info:(program_ari_mode:441) Read bif_strap8 = 0x00200004
[ 63.774299] gim info:(program_ari_mode:446) program_ari_mode - Set ARI_Mode = PF_BUS
[ 63.774299] gim info:(program_ari_mode:456) Write bif_strap8 = 0x00000004
[ 63.774300] gim info:(gim_read_rom_from_reg:181) Reading VBios from ROM
[ 63.774419] gim info:(gim_read_vbios:243) VBIOS starts: 0x55, 0xaa
[ 63.774420] gim info:(gim_read_vbios:246) VBios size is 0x10000
[ 63.774429] gim info:(gim_read_vbios:249) vbios allocated at ffffaa7383ac1000
[ 63.774429] gim info:(gim_read_rom_from_reg:181) Reading VBios from ROM
[ 63.911429] gim info:(gim_read_vbios:257) BIOS Version Major 0xF Minor 0x31
[ 63.911458] gim info:(gim_read_vbios:270) Valid video BIOS image,
[ 63.911458] gim info:(gim_read_vbios:271) size = 0x10000, check sum is 0x543c00
[ 63.911464] gim info:(gim_post_vbios:302) Init Parser passed!, continue
[ 63.911467] gim info:(atom_chk_asic_status:333) ATOM_CheckAsicStatus - BIOS_SCRATCH_7 = 0x00000000
[ 63.911467] gim info:(atom_chk_asic_status:336) Isolate ATOM_S7_ASIC_INIT_COMPLETE_MASK bit(s) = 0x00000000
[ 63.911469] gim info:(atom_chk_asic_status:339) RLC_CNTL = 0x00000000
[ 63.911469] gim info:(atom_chk_asic_status:341) Isolate RLC_CNTL__RLC_ENABLE_F32_MASK = 0x00000000
[ 63.911469] gim info:(atom_chk_asic_status:348) ATOM_ASIC_NEED_POST
[ 63.911470] gim info:(gim_post_vbios:305) Asic needs a VBios post
[ 63.911470] gim info:(atom_post_vbios:200) ATOM_PostVBIOS: firmware_info passed
[ 63.911470] gim info:(atom_post_vbios:253) asic_init before, engine clock = 7530; memory clock =1e848
[ 64.233696] gim info:(atom_post_vbios:256) asic_init after
[ 64.233696] gim info:(atom_post_vbios:263) atom_init_fan_cntl before
[ 64.233696] gim info:(atom_post_vbios:265) atom_init_fan_cntl after
[ 64.233697] gim info:(gim_post_vbios:311) Post INIT_ASIC successfully!
[ 64.233708] gim info:(firmware_requires_update:510) SMU option ROM version 0x111700
[ 64.233708] gim info:(firmware_requires_update:511) versus patch version 0x111a00
[ 64.233720] gim info:(firmware_requires_update:521) RLCV option ROM version 113 versus patch version 129
[ 64.233720] gim info:(firmware_requires_update:526) TOC found, update it
[ 64.233721] gim info:(patch_firmware:586) Update smc_init table
[ 64.591918] BUG: unable to handle kernel paging request at 0000000000020000
[ 64.592161] IP: [] memcpy_erms+0x6/0x10
[ 64.592398] PGD 0

[ 64.592635] Oops: 0002 [#1] SMP
[ 64.592863] Modules linked in: gim(OE+) openvswitch(E) nf_conntrack_ipv6(E) nf_nat_ipv6(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_defrag_ipv6(E) nf_nat(E) nf_conntrack(E) libcrc32c(E) crc32c_generic(E) mptctl(E) mptbase(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) configfs(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) snd_hda_codec_hdmi(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) i915(E) drm_kms_helper(E) drm(E) intel_rapl(E) i2c_algo_bit(E) x86_pkg_temp_thermal(E) hci_uart(E) snd_hda_intel(E) intel_powerclamp(E) snd_hda_codec(E) btbcm(E) btqca(E) snd_hda_core(E) iTCO_wdt(E) snd_hwdep(E) snd_pcm(E) btintel(E) bluetooth(E) eeepc_wmi(E) asus_wmi(E) coretemp(E) iTCO_vendor_support(E) snd_timer(E) intel_lpss_acpi(E)
[ 64.594497] sparse_keymap(E) psmouse(E) mxm_wmi(E) serio_raw(E) evdev(E) joydev(E) kvm_intel(E) intel_lpss(E) mfd_core(E) efi_pstore(E) i2c_i801(E) video(E) shpchp(E) mei_me(E) mei(E) snd(E) soundcore(E) battery(E) rfkill(E) efivars(E) i2c_smbus(E) kvm(E) irqbypass(E) pcspkr(E) crct10dif_pclmul(E) crc32_pclmul(E) tpm_tis(E) acpi_als(E) ghash_clmulni_intel(E) acpi_pad(E) tpm_tis_core(E) kfifo_buf(E) industrialio(E) tpm(E) wmi(E) button(E) ipmi_watchdog(E) ipmi_poweroff(E) ipmi_devintf(E) ipmi_msghandler(E) fuse(E) autofs4(E) ext4(E) crc16(E) jbd2(E) fscrypto(E) mbcache(E) hid_generic(E) sg(E) usbhid(E) sd_mod(E) crc32c_intel(E) aesni_intel(E) aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E) ablk_helper(E) cryptd(E) ahci(E) libahci(E) xhci_pci(E) libata(E) xhci_hcd(E) r8169(E) mii(E) usbcore(E) scsi_mod(E)
[ 64.596414] usb_common(E) fan(E) thermal(E) i2c_hid(E) hid(E) fjes(E)
[ 64.597078] CPU: 7 PID: 2331 Comm: insmod Tainted: G OE 4.9.0-0.bpo.1-linx-security-amd64 #1 Linx 4.9.2-2~bpo8+1linx2
[ 64.597852] Hardware name: System manufacturer System Product Name/B365M-KYLIN, BIOS 1202 07/15/2019
[ 64.598236] task: ffff9e8680a33000 task.stack: ffffaa7389758000
[ 64.598620] RIP: 0010:[] [] memcpy_erms+0x6/0x10
[ 64.599016] RSP: 0018:ffffaa738975bb98 EFLAGS: 00010206
[ 64.599418] RAX: 0000000000020000 RBX: ffffffffc0c05d80 RCX: 0000000000020000
[ 64.599826] RDX: 0000000000020000 RSI: ffffaa7383f21000 RDI: 0000000000020000
[ 64.600240] RBP: 00000000000006c0 R08: ffff9e867fde0cc0 R09: 0000000000022000
[ 64.600651] R10: 8000000000000163 R11: 00000000000004dc R12: 0000000000000007
[ 64.601067] R13: ffffaa7383f21000 R14: ffffffffc0c05dc0 R15: ffffaa7383acb662
[ 64.601505] FS: 00007ff63fdef700(0000) GS:ffff9e86863c0000(0000) knlGS:0000000000000000
[ 64.601949] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 64.602382] CR2: 0000000000020000 CR3: 000000084000a000 CR4: 00000000003406e0
[ 64.602824] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 64.603269] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 64.603720] Stack:
[ 64.604166] ffffffffc0bb7f3d ffffaa7383acb662 f4000000c0bc3a79 ffff9e867fde1d80
[ 64.604630] ffffffffc0c05d80 0000000000000000 ffffffffc0c05d80 ffffffffc0c05d88
[ 64.605097] ffff9e86800c1980 ffffffffc0bab8bd 0000000000000001 0000000000000000
[ 64.605627] Call Trace:
[ 64.606101] [] ? patch_firmware+0x2bd/0x4e0 [gim]
[ 64.606578] [] ? gim_post_vbios+0x14d/0x200 [gim]
[ 64.607057] [] ? set_new_adapter+0x51b/0x9b0 [gim]
[ 64.607533] [] ? gim_probe+0x30/0x30 [gim]
[ 64.608001] [] ? gim_probe+0xa/0x30 [gim]
[ 64.608465] [] ? gim_init+0xbc/0x120 [gim]
[ 64.608921] [] ? do_one_initcall+0x4c/0x180
[ 64.609376] [] ? __vunmap+0x6d/0xc0
[ 64.609857] [] ? do_init_module+0x5a/0x1f1
[ 64.610302] [] ? load_module+0x23c9/0x28f0
[ 64.610750] [] ? __symbol_put+0x60/0x60
[ 64.611189] [] ? SYSC_finit_module+0x8e/0xe0
[ 64.611632] [] ? do_syscall_64+0x81/0x190
[ 64.612065] [] ? entry_SYSCALL64_slow_path+0x25/0x25
[ 64.612489] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
[ 64.613416] RIP [] memcpy_erms+0x6/0x10
[ 64.613884] RSP
[ 64.614289] CR2: 0000000000020000
[ 64.614686] ---[ end trace 43f95d5155189075 ]---
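
For anyone triaging this trace, here is a minimal Python sketch that pulls the key facts out of the oops: it decodes the page-fault error code printed after `Oops:` and lists the general-purpose registers that hold the faulting address from `CR2`. The function names and the `sample` excerpt (four lines copied from the log above) are illustrative only and are not part of GIM or the kernel.

```python
import re


def decode_page_fault_code(code: int) -> dict:
    """Decode the low bits of an x86 page-fault error code (the hex value after "Oops:")."""
    return {
        "page_present": bool(code & 0x01),  # clear: no page was mapped at the address
        "write": bool(code & 0x02),         # set: the faulting access was a write
        "user_mode": bool(code & 0x04),     # clear: the fault happened in kernel mode
    }


def summarize_oops(log: str) -> dict:
    """Extract the error code, CR2, and the registers equal to CR2 from an oops dump."""
    code = int(re.search(r"Oops: ([0-9a-f]{4})", log).group(1), 16)
    cr2 = int(re.search(r"CR2: ([0-9a-f]{16})", log).group(1), 16)
    # Matches RAX..RDI and R08..R15; RSP/RIP lines carry a segment prefix and are skipped.
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})", log)
    }
    return {
        "fault": decode_page_fault_code(code),
        "fault_address": cr2,
        "registers_at_fault_address": sorted(n for n, v in regs.items() if v == cr2),
    }


# Excerpt copied from the trace above.
sample = """\
[ 64.592635] Oops: 0002 [#1] SMP
[ 64.599418] RAX: 0000000000020000 RBX: ffffffffc0c05d80 RCX: 0000000000020000
[ 64.599826] RDX: 0000000000020000 RSI: ffffaa7383f21000 RDI: 0000000000020000
[ 64.602382] CR2: 0000000000020000 CR3: 000000084000a000 CR4: 00000000003406e0
"""

summary = summarize_oops(sample)
print(summary)
```

On this excerpt the fault decodes as a kernel-mode write to an unmapped page at 0x20000, and RDI holds that address. In the x86-64 calling convention RDI is the first argument, which for a memcpy-style function is the destination. One possible reading, not confirmed by the log alone, is that the destination was computed from the framebuffer pointer that the earlier lines report as `pf.fb_va = (null)` after `can't iomap for BAR 0`.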

@aracno974

Hi,
I've got exactly the same issue (same trace).
The hardware is an HP DL380 Gen8 with an S7150 x2.
The OS is Proxmox 5.4 (kernel 4.15.18-24-pve).
How can I resolve this?
Thank you.

@collinwebdesigns

collinwebdesigns commented Nov 27, 2020

Hi,

Same issue here on an ASUS KGPE-D16, also with an S7150 x2. The kernel boots with these parameters:

quiet reboot=cold mem=256G rcu_nocbs=0-31 amd_iommu=on iommu=pt pci=realloc enable_mtrr_cleanup=1 video=efifb:off

Linux a4d8 5.4.73-1-pve #1 SMP PVE 5.4.73-1 (Mon, 16 Nov 2020 10:52:16 +0100) x86_64 GNU/Linux

I also get these messages in dmesg:

[    3.339577] pci 0000:04:00.0: BAR 0: no space for [mem size 0x10000000 64bit pref]
[    3.339578] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x10000000 64bit pref]
[    3.339580] pci 0000:04:00.0: BAR 7: no space for [mem size 0x100000000 64bit pref]
[    3.339581] pci 0000:04:00.0: BAR 7: failed to assign [mem size 0x100000000 64bit pref]
[    3.339583] pci 0000:04:00.0: BAR 9: assigned [mem 0xb4400000-0xb83fffff 64bit pref]
[    3.339587] pci 0000:04:00.0: BAR 12: no space for [mem size 0x04000000]
[    3.339588] pci 0000:04:00.0: BAR 12: failed to assign [mem size 0x04000000]
[    3.339589] pci 0000:04:00.0: BAR 2: assigned [mem 0xb4200000-0xb43fffff 64bit pref]
[    3.339597] pci 0000:04:00.0: BAR 5: no space for [mem size 0x00040000]
[    3.339598] pci 0000:04:00.0: BAR 5: failed to assign [mem size 0x00040000]
[    3.339600] pci 0000:04:00.0: BAR 0: no space for [mem size 0x10000000 64bit pref]
[    3.339602] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x10000000 64bit pref]
[    3.339603] pci 0000:04:00.0: BAR 2: assigned [mem 0xb4200000-0xb43fffff 64bit pref]
[    3.339611] pci 0000:04:00.0: BAR 5: assigned [mem 0xb4400000-0xb443ffff]
[    3.339615] pci 0000:04:00.0: BAR 12: no space for [mem size 0x04000000]
[    3.339616] pci 0000:04:00.0: BAR 12: failed to assign [mem size 0x04000000]
[    3.339617] pci 0000:04:00.0: BAR 9: no space for [mem size 0x04000000 64bit pref]
[    3.339618] pci 0000:04:00.0: BAR 9: failed to assign [mem size 0x04000000 64bit pref]
[    3.339620] pci 0000:04:00.0: BAR 7: no space for [mem size 0x100000000 64bit pref]
[    3.339621] pci 0000:04:00.0: BAR 7: failed to assign [mem size 0x100000000 64bit pref]

I'm using GIM from https://github.com/kasperlewau/MxGPU-Virtualization.
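
To make the BAR assignment failures in the dmesg excerpt above easier to compare across machines, here is a small Python sketch that reports which BARs are still unassigned after the kernel's retries. It treats the last `assigned` or `failed to assign` message per BAR as that BAR's final state, which is a heuristic and not something the kernel guarantees. The function name and the `sample` text (lines copied from the excerpt above) are illustrative.

```python
import re
from collections import defaultdict

DEVICE = r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
BAR_FAILED = re.compile(
    DEVICE + r"BAR (?P<bar>\d+): failed to assign \[mem size (?P<size>0x[0-9a-f]+)"
)
BAR_ASSIGNED = re.compile(DEVICE + r"BAR (?P<bar>\d+): assigned \[mem ")


def unassigned_bars(dmesg: str) -> dict:
    """Return {device: {bar_index: size_in_bytes}} for BARs whose last message was a failure."""
    state = defaultdict(dict)
    for line in dmesg.splitlines():
        failed = BAR_FAILED.search(line)
        if failed:
            state[failed["dev"]][int(failed["bar"])] = int(failed["size"], 16)
            continue
        assigned = BAR_ASSIGNED.search(line)
        if assigned:
            # A later successful assignment clears an earlier failure for the same BAR.
            state[assigned["dev"]].pop(int(assigned["bar"]), None)
    return {dev: bars for dev, bars in state.items() if bars}


# Lines copied from the dmesg excerpt above.
sample = """\
[    3.339577] pci 0000:04:00.0: BAR 0: no space for [mem size 0x10000000 64bit pref]
[    3.339578] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x10000000 64bit pref]
[    3.339583] pci 0000:04:00.0: BAR 9: assigned [mem 0xb4400000-0xb83fffff 64bit pref]
[    3.339597] pci 0000:04:00.0: BAR 5: no space for [mem size 0x00040000]
[    3.339598] pci 0000:04:00.0: BAR 5: failed to assign [mem size 0x00040000]
[    3.339611] pci 0000:04:00.0: BAR 5: assigned [mem 0xb4400000-0xb443ffff]
[    3.339617] pci 0000:04:00.0: BAR 9: no space for [mem size 0x04000000 64bit pref]
[    3.339618] pci 0000:04:00.0: BAR 9: failed to assign [mem size 0x04000000 64bit pref]
"""

result = unassigned_bars(sample)
for dev, bars in result.items():
    for bar, size in sorted(bars.items()):
        print(f"{dev} BAR {bar}: unassigned, {size // (1024 * 1024)} MiB requested")
```

Feeding the script the output of `dmesg` from a failing host gives a short list to check against the `can't iomap for BAR 0` error in the original report. Whether the two failures share a cause is an assumption that this thread does not confirm.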
