GW2 w/ Futex2 on 5.13.1: CoherentUIHost going nuts on CPU usage (~11Cores) #217

Atemu · 2021-07-15T16:44:17Z

I have neither the knowledge nor the ability to debug this, but it did not happen on 5.12, nor does it happen on 5.13.1 when using esync instead.

This makes the game nearly unplayable (low 20s) and stresses the whole CPU for no good reason.

damentz · 2021-07-17T18:41:23Z

@Atemu this should be fixed now in 5.13/master. We'll need to wait for @heftig to spin another build of linux-zen.

Here are the details of what I did. I'm not entirely sure what was in 5.13/futex2 before, but I believe there were porting errors and missing patches. To fix that, I took the v4 patch set [1] (based on 5.13 rc something) and added some Proton and basic fixes on top. I did a basic build test and everything's hunky dory.

[1] http://patchwork.sourceware.org/project/glibc/cover/20210603195924.361327-1-andrealmeid@collabora.com/

Atemu · 2021-07-18T21:43:25Z

Hi @damentz, thanks for looking into this. Unfortunately, it's still happening on 18cfd38 though.

If you want to reproduce this for yourself, GW2's launcher also exhibits this behaviour and can easily be installed via Lutris without downloading the whole 50G game.
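For anyone reproducing this, a rough way to put a number on the "~11 cores" is to sample the process's CPU time from `/proc`. The sketch below is only an illustration under assumptions: the process-name fragment `CoherentUI` and the helper names are hypothetical, and the real process name under Wine may differ (check `top` first).

```python
import os
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The comm field is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split only after the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cores_used(ticks_before: int, ticks_after: int,
               seconds: float, ticks_per_second: int) -> float:
    """Convert a CPU-tick delta over an interval into 'cores kept busy'."""
    return (ticks_after - ticks_before) / (ticks_per_second * seconds)


def find_pids(name_fragment: str) -> list:
    """List PIDs whose /proc/<pid>/comm contains name_fragment (Linux only)."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if name_fragment in f.read():
                    pids.append(int(entry))
        except OSError:
            pass  # process exited while we were scanning
    return pids


def total_ticks(name_fragment: str) -> int:
    """Sum CPU ticks over all currently matching processes."""
    total = 0
    for pid in find_pids(name_fragment):
        try:
            with open(f"/proc/{pid}/stat") as f:
                total += parse_cpu_ticks(f.read())
        except OSError:
            pass
    return total


def sample(name_fragment: str = "CoherentUI", seconds: float = 5.0) -> float:
    """Average number of cores used by matching processes over the interval.

    "CoherentUI" is an assumed name fragment, not a confirmed process name.
    """
    hz = os.sysconf("SC_CLK_TCK")
    before = total_ticks(name_fragment)
    time.sleep(seconds)
    after = total_ticks(name_fragment)
    return cores_used(before, after, seconds, hz)
```

Calling `sample()` while the launcher is open should return a value near 11 on an affected kernel if the title's estimate holds, and something much smaller with esync. Processes that start or exit during the interval will skew the result, so treat it as a ballpark figure.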

damentz · 2021-07-19T23:20:59Z

@Atemu thanks for giving the updated master branch a shot. I sent André an email asking if he could provide a port of his futex2-proton working branch for v5.13 and pointed him to the problems you found in this issue. Hopefully we can come out of this with a better futex2 than what we have on 5.12 🤞

The v4 patch set doesn't work well in 5.13 per GitHub issue [1]. After a response from André, he recommends using the patch from linux-tkg [2]. Even though the v4 patch set should be better, it's an in-progress patch set and he recommends using an older version if the current one doesn't work out for now.

[1] #217 (comment)
[2] https://github.com/Frogging-Family/linux-tkg/blob/master/linux-tkg-patches/5.13/0007-v5.13-futex2_interface.patch

This reverts commit 67c07b3, reversing changes made to f1a8de1.

Potentially resolves: #217

damentz · 2021-07-20T17:04:22Z

@Atemu can you try 5.13/master again? André recommended the patch by the linux-tkg folks at https://github.com/Frogging-Family/linux-tkg/blob/master/linux-tkg-patches/5.13/0007-v5.13-futex2_interface.patch, and that's what master uses now.

André pointed out that even though it's based on a much older futex2 patch set, since the futex2 patch is a work-in-progress, it's expected that there'll be regressions as new drafts come out. Or TL;DR, this should work now 🤞

damentz · 2021-07-22T14:05:49Z

@Atemu Does the latest Zen Kernel resolve the CPU usage issues with GW2?

damentz · 2021-07-24T20:46:42Z

Zen Kernel is now using the same patch set used by TKG and XanMod. If there were an issue with this patch, it would be affecting all of those kernels. Let me know if there's still a problem; otherwise I'm marking this as resolved.

Atemu · 2021-07-26T13:01:12Z

Yup, it's fixed now. Thanks!

Turns out the reason this version of futex2 resolved #217 is that it doesn't actually work. Launching a game through Proton shows that esync is used instead of fsync. Issue #228 was opened later showing this exact issue, and I reproduced it on my local system. Will try using André's latest patch set to LKML from August 5th, 2021.

This reverts commit 1efbf88, reversing changes made to 3136f92.
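To check which backend a given launch actually ended up on, one option is to scan the Proton/Wine log for the sync start-up message. This is a minimal sketch under assumptions: the marker strings below are what esync/fsync builds of Wine are commonly seen to print, the function name is hypothetical, and the log location depends on your setup, so verify both against your own log before relying on it.

```python
# Assumed marker lines; confirm them against a real log from your system.
FSYNC_MARKER = "fsync: up and running"
ESYNC_MARKER = "esync: up and running"


def detect_sync_backend(log_text: str) -> str:
    """Classify a log as 'fsync', 'esync', or 'unknown' from its marker line."""
    if FSYNC_MARKER in log_text:
        return "fsync"
    if ESYNC_MARKER in log_text:
        return "esync"
    return "unknown"
```

With a kernel whose futex2 interface is not usable, a log from a launch with fsync requested would be expected to classify as `esync`, which matches the fallback described above.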


Atemu changed the title ~~GW2 w/ Futex2: CoherentUIHost going nuts on CPU usage (~11Cores)~~ GW2 w/ Futex2 on 5.13.1: CoherentUIHost going nuts on CPU usage (~11Cores) Jul 15, 2021

damentz assigned damentz and heftig Jul 17, 2021

damentz added a commit that referenced this issue Jul 20, 2021

Merge branch '5.13/futex2-tkg' into 5.13/master

1efbf88

Potentially resolves: #217

damentz closed this as completed Jul 24, 2021

damentz mentioned this issue Aug 7, 2021

Fsync doesn't work from 5.13 and onwards #228

Closed
