Minimal Linux Kernel version #4014

Closed
denisbertini opened this issue Aug 6, 2019 · 10 comments

@denisbertini

We are having issues running Open MPI 4.0 with UCX on the Debian kernel 3.16.0-8: the failure occurs at initialisation of MPI jobs.
It seems to work fine with the newer minor versions 0.9 and 0.10. Have you encountered any issues with this particular kernel? Is there a minimum Linux kernel version required to use Open MPI with UCX?
Thanks in advance.

@shamisp
Contributor

shamisp commented Aug 6, 2019

@denisbertini UCX uses cross-memory attach (CMA), but that is standard in 3.16.x. It may also depend on the interconnect you use.

@yosefe
Contributor

yosefe commented Aug 6, 2019

@denisbertini which issues are you seeing?
UCX should auto-detect CMA support based on the glibc headers.
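
For reference, CMA is exposed to user space through `process_vm_readv` and `process_vm_writev`, declared in `<sys/uio.h>`. The sketch below is a runtime probe that reads the calling process's own memory through that syscall. It is only an illustration, not UCX's detection logic (which, per the comment above, is based on headers); the helper name `probe_cma` is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: copy a small buffer within this process using
 * process_vm_readv(). Returns 0 if the copy worked, otherwise an errno
 * value (for example ENOSYS or EPERM if the syscall is unavailable). */
int probe_cma(void)
{
    char src[] = "cma-probe";
    char dst[sizeof(src)] = {0};
    struct iovec local  = { .iov_base = dst, .iov_len = sizeof(dst) };
    struct iovec remote = { .iov_base = src, .iov_len = sizeof(src) };

    /* Read from our own address space; pid is the calling process. */
    ssize_t n = process_vm_readv(getpid(), &local, 1, &remote, 1, 0);
    if (n < 0)
        return errno;

    /* A short or mismatched copy is reported as a generic I/O error. */
    if ((size_t)n != sizeof(src) || memcmp(src, dst, sizeof(src)) != 0)
        return EIO;
    return 0;
}
```

A return value of 0 only shows that the syscall is present and permitted for this process; it says nothing about whether a given UCX build was compiled with CMA support.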

@denisbertini
Author

The problem is almost certainly linked to a buggy kernel version.
During MPI initialisation we systematically get a kernel Oops:

Aug 6 13:12:43 lxbk0538 kernel: [6502193.052726] hugetlbfs: cactus_bns_inte (26849): Using mlock ulimits for SHM_HUGETLB is deprecated
Aug 6 13:12:43 lxbk0538 kernel: [6502193.055667] PGD 107fff1067 PUD 0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.056842] Oops: 0000 [#1] SMP
Aug 6 13:12:43 lxbk0538 kernel: [6502193.058114] Modules linked in: squashfs loop osc(O) mgc(O) lustre(O) lmv(O) fld(O) mdc(O) fid(O) lov(O) ko2iblnd(O) ptlrpc(O) obdclass(O) lnet(O) libcfs(O) cfg80211 rfkill 8021q garp stp mrp llc xt_multiport iptable_filter binfmt_misc cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats xt_mark xt_owner iptable_mangle ip_tables x_tables nfsd auth_rpcgss oid_registry nfsv3 nfs_acl nfs lockd sunrpc fscache x86_pkg_temp_thermal coretemp kvm_intel kvm crc32_pclmul ast ttm drm_kms_helper joydev aesni_intel drm aes_x86_64 evdev lrw iTCO_wdt iTCO_vendor_support lpc_ich pcspkr gf128mul glue_helper ablk_helper cryptd shpchp mfd_core mei_me mei processor acpi_pad wmi thermal_sys acpi_power_meter tpm_tis tpm button crc32c_generic rdma_ucm rdma_cm iw_cm ib_uverbs ib_ipoib ib_cm ib_umad mlx4_ib ib_sa ib_mad ib_core ib_addr psmouse ipmi_watchdog ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci igb ehci_pci i2c_i801 libata ehci_hcd xhci_hcd i2c_algo_bit dca usbcore i2c_core ptp mlx4_core scsi_mod usb_common pps_core [last unloaded: libcfs]

@yosefe
Contributor

yosefe commented Aug 6, 2019

@denisbertini is there a kernel backtrace following the Oops to see which syscall was causing this?

@denisbertini
Author

Here is the complete dump:

Aug 6 13:12:43 lxbk0538 kernel: [6502193.052726] hugetlbfs: cactus_bns_inte (26849): Using mlock ulimits for SHM_HUGETLB is deprecated
Aug 6 13:12:43 lxbk0538 kernel: [6502193.055667] PGD 107fff1067 PUD 0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.056842] Oops: 0000 [#1] SMP
Aug 6 13:12:43 lxbk0538 kernel: [6502193.058114] Modules linked in: squashfs loop osc(O) mgc(O) lustre(O) lmv(O) fld(O) mdc(O) fid(O) lov(O) ko2iblnd(O) ptlrpc(O) obdclass(O) lnet(O) libcfs(O) cfg80211 rfkill 8021q garp stp mrp llc xt_multiport iptable_filter binfmt_misc cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats xt_mark xt_owner iptable_mangle ip_tables x_tables nfsd auth_rpcgss oid_registry nfsv3 nfs_acl nfs lockd sunrpc fscache x86_pkg_temp_thermal coretemp kvm_intel kvm crc32_pclmul ast ttm drm_kms_helper joydev aesni_intel drm aes_x86_64 evdev lrw iTCO_wdt iTCO_vendor_support lpc_ich pcspkr gf128mul glue_helper ablk_helper cryptd shpchp mfd_core mei_me mei processor acpi_pad wmi thermal_sys acpi_power_meter tpm_tis tpm button crc32c_generic rdma_ucm rdma_cm iw_cm ib_uverbs ib_ipoib ib_cm ib_umad mlx4_ib ib_sa ib_mad ib_core ib_addr psmouse ipmi_watchdog ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci igb ehci_pci i2c_i801 libata ehci_hcd xhci_hcd i2c_algo_bit dca usbcore i2c_core ptp mlx4_core scsi_mod usb_common pps_core [last unloaded: libcfs]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.070359] CPU: 11 PID: 26834 Comm: cactus_bns_inte Tainted: G O 3.16.0-8-amd64 #1 Debian 3.16.64-2
Aug 6 13:12:43 lxbk0538 kernel: [6502193.071444] Hardware name: Supermicro SYS-6028TR-HTFR/X10DRT-HIBF, BIOS 2.0b 08/12/2016
Aug 6 13:12:43 lxbk0538 kernel: [6502193.072522] task: ffff880fd2c8cbb0 ti: ffff880eccd9c000 task.ti: ffff880eccd9c000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.073590] RIP: 0010:[] [] put_pid+0x12/0x50
Aug 6 13:12:43 lxbk0538 kernel: [6502193.074652] RSP: 0018:ffff880eccd9fea8 EFLAGS: 00010206
Aug 6 13:12:43 lxbk0538 kernel: [6502193.075691] RAX: 000000005ffff000 RBX: ffff880ee7dd1b40 RCX: 00000000000000bb
Aug 6 13:12:43 lxbk0538 kernel: [6502193.076719] RDX: 00000000000000bb RSI: ffffffff81a83b00 RDI: ffffea002747a450
Aug 6 13:12:43 lxbk0538 kernel: [6502193.077737] RBP: 0000000000000200 R08: 0000000000000000 R09: 0000000000000001
Aug 6 13:12:43 lxbk0538 kernel: [6502193.078735] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81889a40
Aug 6 13:12:43 lxbk0538 kernel: [6502193.079714] R13: 0000000000200000 R14: 00000000fffffff4 R15: fffffffffffffff4
Aug 6 13:12:43 lxbk0538 kernel: [6502193.080675] FS: 00002ba44535a440(0000) GS:ffff88107fb60000(0000) knlGS:0000000000000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.081630] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 6 13:12:43 lxbk0538 kernel: [6502193.082582] CR2: ffffea0087479488 CR3: 0000000d4a614000 CR4: 0000000000360770
Aug 6 13:12:43 lxbk0538 kernel: [6502193.083529] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.084458] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 6 13:12:43 lxbk0538 kernel: [6502193.085374] Stack:
Aug 6 13:12:43 lxbk0538 kernel: [6502193.086274] ffff880ee7dd1b40 ffffffff8123cfb2 ffff880c165bcd98 30565359535bcd98
Aug 6 13:12:43 lxbk0538 kernel: [6502193.087179] 0030303030303030 000000003bdbaf89 00000000002080e0 ffffffff81889b10
Aug 6 13:12:43 lxbk0538 kernel: [6502193.088084] ffff880eccd9ff70 00007ffe3bd869d0 0000000000000fb0 0000000000200000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.088978] Call Trace:
Aug 6 13:12:43 lxbk0538 kernel: [6502193.089851] [] ? newseg+0x2b2/0x370
Aug 6 13:12:43 lxbk0538 kernel: [6502193.090725] [] ? ipcget+0xd9/0x1d0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.091578] [] ? SyS_shmget+0x42/0x50
Aug 6 13:12:43 lxbk0538 kernel: [6502193.092413] [] ? system_call_fast_compare_end+0x1c/0x21
Aug 6 13:12:43 lxbk0538 kernel: [6502193.093250] Code: 48 c1 e2 05 48 85 c9 48 8b 54 10 38 75 85 0f 1f 00 31 c0 c3 0f 1f 44 00 00 0f 1f 44 00 00 48 85 ff 74 1a 53 8b 47 04 48 c1 e0 05 <48> 8b 5c 07 38 8b 07 83 f8 01 74 12 f0 ff 0f 74 0d 5b f3 c3 66
Aug 6 13:12:43 lxbk0538 kernel: [6502193.095836] RSP
Aug 6 13:12:43 lxbk0538 kernel: [6502193.096651] CR2: ffffea0087479488
Aug 6 13:12:43 lxbk0538 kernel: [6502193.101038] ------------[ cut here ]------------
Aug 6 13:12:43 lxbk0538 kernel: [6502193.102641] invalid opcode: 0000 [#2] SMP
Aug 6 13:12:43 lxbk0538 kernel: [6502193.103427] Modules linked in: squashfs loop osc(O) mgc(O) lustre(O) lmv(O) fld(O) mdc(O) fid(O) lov(O) ko2iblnd(O) ptlrpc(O) obdclass(O) lnet(O) libcfs(O) cfg80211 rfkill 8021q garp stp mrp llc xt_multiport iptable_filter binfmt_misc cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats xt_mark xt_owner iptable_mangle ip_tables x_tables nfsd auth_rpcgss oid_registry nfsv3 nfs_acl nfs lockd sunrpc fscache x86_pkg_temp_thermal coretemp kvm_intel kvm crc32_pclmul ast ttm drm_kms_helper joydev aesni_intel drm aes_x86_64 evdev lrw iTCO_wdt iTCO_vendor_support lpc_ich pcspkr gf128mul glue_helper ablk_helper cryptd shpchp mfd_core mei_me mei processor acpi_pad wmi thermal_sys acpi_power_meter tpm_tis tpm button crc32c_generic rdma_ucm rdma_cm iw_cm ib_uverbs ib_ipoib ib_cm ib_umad mlx4_ib ib_sa ib_mad ib_core ib_addr psmouse ipmi_watchdog ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci igb ehci_pci i2c_i801 libata ehci_hcd xhci_hcd i2c_algo_bit dca usbcore i2c_core ptp mlx4_core scsi_mod usb_common pps_core [last unloaded: libcfs]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.112002] CPU: 11 PID: 26834 Comm: cactus_bns_inte Tainted: G O 3.16.0-8-amd64 #1 Debian 3.16.64-2
Aug 6 13:12:43 lxbk0538 kernel: [6502193.112854] Hardware name: Supermicro SYS-6028TR-HTFR/X10DRT-HIBF, BIOS 2.0b 08/12/2016
Aug 6 13:12:43 lxbk0538 kernel: [6502193.113695] task: ffff880fd2c8cbb0 ti: ffff880eccd9c000 task.ti: ffff880eccd9c000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.114527] RIP: 0010:[] [] change_page_attr_set_clr+0x444/0x470
Aug 6 13:12:43 lxbk0538 kernel: [6502193.115360] RSP: 0018:ffff880eccd9f460 EFLAGS: 00010046
Aug 6 13:12:43 lxbk0538 kernel: [6502193.116171] RAX: 0000000000000046 RBX: 0000000000000000 RCX: ffff880eccd9f480
Aug 6 13:12:43 lxbk0538 kernel: [6502193.116974] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000080000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.117764] RBP: 0000000000000005 R08: 0000000000000001 R09: 00003ffffffff000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.118544] R10: 0000000000000010 R11: 0000000000000000 R12: 0000000000000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.119313] R13: 0000000000000200 R14: 0000000000000000 R15: 0000000000000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.120073] FS: 00002ba44535a440(0000) GS:ffff88107fb60000(0000) knlGS:0000000000000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.120836] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 6 13:12:43 lxbk0538 kernel: [6502193.121601] CR2: ffffea0087479488 CR3: 0000000d4a614000 CR4: 0000000000360770
Aug 6 13:12:43 lxbk0538 kernel: [6502193.122370] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.123143] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 6 13:12:43 lxbk0538 kernel: [6502193.123910] Stack:
Aug 6 13:12:43 lxbk0538 kernel: [6502193.124662] 0000000000000010 ffffffff00000000 ffffffff00000004 0000000000000008
Aug 6 13:12:43 lxbk0538 kernel: [6502193.125436] 0000000000000000 0000000000000000 0000000000000010 0000000000000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.126203] 0000000000000001 0000000000000005 0000000000050359 0000020000000000
Aug 6 13:12:43 lxbk0538 kernel: [6502193.126962] Call Trace:
Aug 6 13:12:43 lxbk0538 kernel: [6502193.127722] [] ? _set_pages_array+0x112/0x170
Aug 6 13:12:43 lxbk0538 kernel: [6502193.128490] [] ? ttm_set_pages_caching+0x27/0x60 [ttm]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.129252] [] ? ttm_alloc_new_pages.isra.6+0xb8/0x180 [ttm]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.130010] [] ? ttm_pool_populate+0x407/0x510 [ttm]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.130763] [] ? ttm_bo_move_memcpy+0x565/0x5c0 [ttm]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.131511] [] ? __vmalloc_node_range+0x202/0x2a0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.132260] [] ? ttm_bo_handle_move_mem+0x29e/0x620 [ttm]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.133017] [] ? ttm_bo_validate+0x204/0x210 [ttm]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.133774] [] ? ast_bo_push_sysram+0x78/0xd0 [ast]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.134528] [] ? ast_crtc_do_set_base.isra.11.constprop.21+0x6d/0x350 [ast]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.135278] [] ? ast_set_index_reg_mask+0x41/0x70 [ast]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.136013] [] ? ast_crtc_mode_set+0xa6d/0xbb0 [ast]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.136743] [] ? drm_crtc_helper_set_mode+0x31e/0x580 [drm_kms_helper]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.137475] [] ? drm_crtc_helper_set_config+0xa2b/0xbc0 [drm_kms_helper]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.138221] [] ? drm_mode_set_config_internal+0x68/0xe0 [drm]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.138961] [] ? drm_fb_helper_pan_display+0x8d/0xe0 [drm_kms_helper]
Aug 6 13:12:43 lxbk0538 kernel: [6502193.139693] [] ? fb_pan_display+0x9d/0x160
Aug 6 13:12:43 lxbk0538 kernel: [6502193.140418] [] ? bit_update_start+0x1a/0x40
Aug 6 13:12:43 lxbk0538 kernel: [6502193.141128] [] ? fbcon_switch+0x3a3/0x580
Aug 6 13:12:43 lxbk0538 kernel: [6502193.141819] [] ? redraw_screen+0x17d/0x230
Aug 6 13:12:43 lxbk0538 kernel: [6502193.142488] [] ? fb_blank+0x9f/0xc0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.143140] [] ? fbcon_blank+0x20a/0x2e0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.143776] [] ? console_unlock+0x258/0x430
Aug 6 13:12:43 lxbk0538 kernel: [6502193.144389] [] ? wake_up_klogd+0x30/0x40
Aug 6 13:12:43 lxbk0538 kernel: [6502193.144984] [] ? lock_timer_base.isra.35+0x26/0x50
Aug 6 13:12:43 lxbk0538 kernel: [6502193.145567] [] ? internal_add_timer+0x2a/0x70
Aug 6 13:12:43 lxbk0538 kernel: [6502193.146130] [] ? mod_timer+0xf5/0x220
Aug 6 13:12:43 lxbk0538 kernel: [6502193.146668] [] ? do_unblank_screen+0xb9/0x1e0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.147194] [] ? bust_spinlocks+0x15/0x30
Aug 6 13:12:43 lxbk0538 kernel: [6502193.147713] [] ? oops_end+0x2f/0xe0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.148216] [] ? no_context+0x11d/0x300
Aug 6 13:12:43 lxbk0538 kernel: [6502193.148708] [] ? __inode_wait_for_writeback+0x72/0xc0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.149196] [] ? page_fault+0x28/0x30
Aug 6 13:12:43 lxbk0538 kernel: [6502193.149676] [] ? put_pid+0x12/0x50
Aug 6 13:12:43 lxbk0538 kernel: [6502193.150156] [] ? newseg+0x2b2/0x370
Aug 6 13:12:43 lxbk0538 kernel: [6502193.150636] [] ? ipcget+0xd9/0x1d0
Aug 6 13:12:43 lxbk0538 kernel: [6502193.151116] [] ? SyS_shmget+0x42/0x50
Aug 6 13:12:43 lxbk0538 kernel: [6502193.151601] [] ? system_call_fast_compare_end+0x1c/0x21
Aug 6 13:12:43 lxbk0538 kernel: [6502193.152080] Code: 00 a8 01 74 d2 be 00 10 00 00 4c 89 f7 e8 05 dc ff ff eb c3 0f 1f 00 be 00 10 00 00 4c 89 e7 e8 f3 db ff ff e9 29 fe ff ff 0f 0b <0f> 0b 0f 0b be bb 00 00 00 48 c7 c7 d8 9a 71 81 44 89 14 24 e8
Aug 6 13:12:43 lxbk0538 kernel: [6502193.153668] RSP
Aug 6 13:12:43 lxbk0538 kernel: [6502193.154174] ---[ end trace bd4bcad000598c5b ]

@yosefe
Contributor

yosefe commented Aug 7, 2019

Seems like shmget(HUGETLB) is broken.

@denisbertini
Author

OK, thanks.
Can you give me some hints on how to proceed?
What could be the reason for this?

@yosefe
Contributor

yosefe commented Aug 7, 2019

This looks like a kernel bug. If it works on a newer kernel, I'd suggest using that.
You can also check whether a small program that does only the following call fails. If it does, that pinpoints the problem so it can be reported to Debian:
shmget(IPC_PRIVATE, 2 * 1024 * 1024, IPC_CREAT | SHM_R | SHM_W | SHM_HUGETLB);
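
A minimal sketch of such a reproducer follows, assuming a glibc-based Linux system. The helper name `try_shmget` and its `extra_flags` parameter are hypothetical additions for illustration; the `shmget` arguments themselves are the ones quoted above. The helper also removes the segment again with `shmctl(IPC_RMID)` so that a successful run leaves nothing behind.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000 /* Linux value, used only if the libc headers hide it */
#endif

/* Hypothetical helper: create a private SysV shared memory segment with the
 * given extra flags, then remove it. Returns 0 on success or the errno value
 * of the call that failed. */
int try_shmget(size_t size, int extra_flags)
{
    int id = shmget(IPC_PRIVATE, size,
                    IPC_CREAT | SHM_R | SHM_W | extra_flags);
    if (id < 0) {
        int err = errno;
        fprintf(stderr, "shmget(size=%zu, extra_flags=%#o) failed: %s\n",
                size, (unsigned int)extra_flags, strerror(err));
        return err;
    }

    /* Mark the segment for destruction; nothing is attached to it. */
    if (shmctl(id, IPC_RMID, NULL) < 0) {
        int err = errno;
        fprintf(stderr, "shmctl(IPC_RMID) failed: %s\n", strerror(err));
        return err;
    }
    return 0;
}
```

Calling `try_shmget(2UL * 1024 * 1024, SHM_HUGETLB)` from a one-line `main` and returning its result is enough for a standalone test. On a healthy system with no huge pages reserved, the call is expected to fail cleanly with an error code rather than crash; on an affected kernel it may trigger the Oops, so it is best run on a node that can tolerate that.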

@denisbertini
Author

OK, I tried what you suggested, and it seems I am hitting a kernel bug here:

[ 7102.537122] ------------[ cut here ]------------
[ 7102.537955] kernel BUG at /build/linux-3W25dF/linux-3.16.64/arch/x86/mm/pageattr.c:217!
[ 7102.538769] invalid opcode: 0000 [#2] SMP
[ 7102.539585] Modules linked in: xt_mark xt_owner iptable_mangle osc(O) mgc(O) lustre(O) lmv(O) fld(O) mdc(O) fid(O) lov(O) ko2iblnd(O) ptlrpc(O) obdclass(O) lnet(O) libcfs(O) 8021q xt_multiport garp stp mrp llc iptable_filter ip_tables x_tables binfmt_misc nfsv3 nfs_acl nfs lockd sunrpc fscache cpufreq_stats cpufreq_userspace cpufreq_conservative cpufreq_powersave sha512_ssse3 sha512_generic sha256_ssse3 sha256_generic ast ttm drm_kms_helper x86_pkg_temp_thermal coretemp kvm_intel kvm crc32_pclmul aesni_intel iTCO_wdt drm joydev iTCO_vendor_support lpc_ich mei_me aes_x86_64 lrw gf128mul glue_helper ablk_helper mei evdev mfd_core cryptd shpchp pcspkr processor acpi_pad acpi_power_meter thermal_sys tpm_tis tpm button wmi crypto_null crc32c_generic rdma_ucm ib_uverbs rdma_cm iw_cm ib_ipoib ib_cm ib_umad
[ 7102.544742] mlx4_ib ib_sa ib_mad ib_core ib_addr psmouse ipmi_watchdog ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci ehci_pci xhci_hcd igb ehci_hcd libata i2c_algo_bit i2c_i801 dca usbcore i2c_core ptp mlx4_core scsi_mod usb_common pps_core [last unloaded: libcfs]
[ 7102.548188] CPU: 5 PID: 28445 Comm: writer Tainted: G C O 3.16.0-8-amd64 #1 Debian 3.16.64-2
[ 7102.549030] Hardware name: Supermicro SYS-6028TR-HTFR/X10DRT-HIBF, BIOS 2.0b 08/12/2016
[ 7102.549864] task: ffff881030ea54d0 ti: ffff880071680000 task.ti: ffff880071680000
[ 7102.550687] RIP: 0010:[] [] change_page_attr_set_clr+0x444/0x470
[ 7102.551528] RSP: 0018:ffff8800716834a0 EFLAGS: 00010046
[ 7102.552328] RAX: 0000000000000046 RBX: 0000000000000000 RCX: ffff8800716834c0
[ 7102.553119] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000080000000
[ 7102.553897] RBP: 0000000000000005 R08: 0000000000000001 R09: 00003ffffffff000
[ 7102.554661] R10: 0000000000000010 R11: 0000000000000000 R12: 0000000000000000
[ 7102.555415] R13: 0000000000000200 R14: 0000000000000000 R15: 0000000000000000
[ 7102.556158] FS: 00007ff103dcb740(0000) GS:ffff88107faa0000(0000) knlGS:0000000000000000
[ 7102.556904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7102.557647] CR2: 00007ff1030d64b0 CR3: 000000202de2a000 CR4: 0000000000360770
[ 7102.558394] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7102.559138] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 7102.559882] Stack:
[ 7102.560623] 0000000000000010 ffffffff00000000 ffffffff00000004 0000000000000008
[ 7102.561399] 0000000000000000 0000000000000000 0000000000000010 0000000000000000
[ 7102.562149] 0000000000000001 0000000000000005 0000000000070aa8 0000020000000000
[ 7102.562888] Call Trace:
[ 7102.563619] [] ? _set_pages_array+0x112/0x170
[ 7102.564370] [] ? ttm_set_pages_caching+0x27/0x60 [ttm]
[ 7102.565116] [] ? ttm_alloc_new_pages.isra.6+0xb8/0x180 [ttm]
[ 7102.565858] [] ? ttm_pool_populate+0x407/0x510 [ttm]
[ 7102.566594] [] ? ttm_bo_move_memcpy+0x565/0x5c0 [ttm]
[ 7102.567328] [] ? __vmalloc_node_range+0x202/0x2a0
[ 7102.568057] [] ? ttm_bo_handle_move_mem+0x29e/0x620 [ttm]
[ 7102.568789] [] ? ttm_bo_validate+0x204/0x210 [ttm]
[ 7102.569527] [] ? ast_bo_push_sysram+0x78/0xd0 [ast]
[ 7102.570266] [] ? ast_crtc_do_set_base.isra.11.constprop.21+0x6d/0x350 [ast]
[ 7102.571012] [] ? ast_set_index_reg_mask+0x41/0x70 [ast]
[ 7102.571760] [] ? ast_crtc_mode_set+0xa6d/0xbb0 [ast]
[ 7102.572479] [] ? drm_crtc_helper_set_mode+0x31e/0x580 [drm_kms_helper]
[ 7102.573195] [] ? drm_crtc_helper_set_config+0xa2b/0xbc0 [drm_kms_helper]
[ 7102.573918] [] ? drm_mode_set_config_internal+0x68/0xe0 [drm]
[ 7102.574634] [] ? drm_fb_helper_pan_display+0x8d/0xe0 [drm_kms_helper]
[ 7102.575350] [] ? fb_pan_display+0x9d/0x160
[ 7102.576055] [] ? bit_update_start+0x1a/0x40
[ 7102.576744] [] ? fbcon_switch+0x3a3/0x580
[ 7102.577415] [] ? redraw_screen+0x17d/0x230
[ 7102.578067] [] ? fb_blank+0x9f/0xc0
[ 7102.578698] [] ? fbcon_blank+0x20a/0x2e0
[ 7102.579314] [] ? lock_timer_base.isra.35+0x26/0x50
[ 7102.579918] [] ? internal_add_timer+0x2a/0x70
[ 7102.580503] [] ? mod_timer+0xf5/0x220
[ 7102.581065] [] ? do_unblank_screen+0xb9/0x1e0
[ 7102.581633] [] ? bust_spinlocks+0x15/0x30
[ 7102.582163] [] ? oops_end+0x2f/0xe0
[ 7102.582676] [] ? general_protection+0x28/0x30
[ 7102.583181] [] ? put_pid+0xb/0x50
[ 7102.583677] [] ? newseg+0x2b2/0x370
[ 7102.584160] [] ? ipcget+0xd9/0x1d0
[ 7102.584630] [] ? SyS_shmget+0x42/0x50
[ 7102.585095] [] ? system_call_fast_compare_end+0x1c/0x21

@yosefe
Contributor

yosefe commented Aug 8, 2019

Closing this since the issue is not related to UCX and reproduces with a standalone test.

@yosefe yosefe closed this as completed Aug 8, 2019