Merge tag 'LA.UM.9.2.r1-03400-SDMxx0.0' of https://source.codeaurora.org/quic/la/kernel/msm-4.4 into lineage-18.1-caf-msm8998 #33

Closed
wants to merge 1 commit into from

Flamefire
Contributor

Change-Id: Ie86f072048d1863b07d74ef1e6500b5f18e75963

…org/quic/la/kernel/msm-4.4 into lineage-18.1-caf-msm8998

* tag 'LA.UM.9.2.r1-03400-SDMxx0.0' of https://source.codeaurora.org/quic/la/kernel/msm-4.4:
  diag: Use valid data_source for a valid token

Change-Id: Ie86f072048d1863b07d74ef1e6500b5f18e75963
@derfelot
Member

derfelot commented Jul 21, 2021

Thanks. Before I get to my question, I'll just reply to your previous question here, regarding what we track for the kernel.

Besides the obvious (Linux 4.4 stable) we track

  • Wireguard (https://git.zx2c4.com/wireguard-linux-compat/log/). Though I only ever merge snapshots, v1.0.20210606 being the last.
  • sdfat in fs/sdfat, from https://github.com/cryptomilk/kernel-sdfat. I always try to update that first and bump @cryptomilk for a merge if he doesn't see it, before pulling it in for updates. It is added to the kernel as a subtree, so running git subtree pull --prefix fs/sdfat/ https://github.com/cryptomilk/kernel-sdfat master should do it.
  • Same with the qcom wifi driver in drivers/staging/fw-api/, drivers/staging/qca-wifi-host-cmn/ and drivers/staging/qcacld-3.0/. Here we need to track the LA.UM.7.4.r1 branch. Anything more recent is not compatible with our firmware and causes drastically reduced speeds. luk1337 also helped to test this for me once, using a device that does have newer firmware. There won't be any more updates for this branch, so critical updates need to be checked manually in the ASBs from other branches to see if they apply to us as well.
  • Finally, the caf tag for the msm-4.4 kernel. Until now, we have tracked LA.UM.8.4.r1 at first, then LA.UM.8.4.1.r1 to get more recent updates (they were virtually identical).
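
The subtree update described in the sdfat bullet can be sketched as a small helper that only assembles the command line. This is a minimal sketch: the helper name `subtree_pull_cmd` is made up for illustration, the prefix, URL and branch are the ones quoted in the list, and nothing here runs git itself.

```python
from typing import List


def subtree_pull_cmd(prefix: str, remote: str, ref: str) -> List[str]:
    """Build the argv for `git subtree pull` (hypothetical helper, does not run git)."""
    return ["git", "subtree", "pull", "--prefix", prefix, remote, ref]


# Values taken from the sdfat bullet in the list.
SDFAT_CMD = subtree_pull_cmd(
    prefix="fs/sdfat/",
    remote="https://github.com/cryptomilk/kernel-sdfat",
    ref="master",
)

if __name__ == "__main__":
    # Print the command instead of executing it, so it can be reviewed first.
    print(" ".join(SDFAT_CMD))
```

To actually run it, the list could be passed to `subprocess.run(SDFAT_CMD, check=True)` from the root of the kernel tree. The wifi driver directories would presumably work the same way, but their source repositories are not named in this thread, so they are left out here.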

That's all I can think of. Now, for this change, I am generally fine with tracking this caf branch, I know of other devices doing this as well (e.g. cheeseburger on LOS). But I was wondering if this is the only update there is, e.g. the only change we are missing? Have you tried pulling the complete tag, or just this specific one for that commit? I haven't yet had the chance to compare LA.UM.8.4.r1 or LA.UM.8.4.1.r1 with LA.UM.9.2.r1 to see what the differences are.

Thanks

@Flamefire
Contributor Author

Thanks a lot! What a mess...
Could you (for completeness, this is a great summary!) edit in the source repos for the qcom wifi drivers? Having an example source commit where to get the fixes for ASB issues would also be good for future reference.

For my understanding: "the caf tag for the msm-4.4 kernel. [...] tracked LA.UM.8.4.r1". What is the caf tag? I have seen multiple tags named LA.UM.8.4.r1-04600-8x98.0, LA.UM.8.4.r1-04500-8x98.0, ... So I guess this is LA.UM.8.4.r1-<version>-8x98.0
Also, what do caf and LA.UM actually stand for? I haven't followed kernel dev too much previously, so this is new to me. Been more into userspace where you can better debug stuff ;)

> But I was wondering if this is the only update there is, e.g. the only change we are missing? Have you tried pulling the complete tag, or just this specific one for that commit?

I actually just took LineageOS/android_kernel_razer_msm8998@d46db1f which also uses msm8998 and 4.4 kernel and the commit did look correct and useful as far as I can tell. So no, didn't check the whole tag, just this single commit. Can double check later or tomorrow.

I also found other kernel repos tracking https://android.googlesource.com/kernel/common/+log/refs/heads/android-4.4-p instead of mainline linux. A quick rebase check shows a few commits (~50) not in our kernel although they may be superseded by upstream changes. Wanted to check which of those are still useful for us (most are tagged as ANDROID) when I got some time.

@bananafunction

@Flamefire
Contributor Author

@bananafunction Thanks for the info! I guess https://github.com/bananafunction/android_kernel_sony_msm8994 is the kernel you are using? It's on CM 14.1, so maybe you use another one? Asking because it might make sense to join forces. E.g. I'm building LOS 17.1 and 18.1 ROMs with this kernel, another user is using it for HavocOS already, ...

And WTF is up with the versioning? The latest tag from @derfelot is LA.UM.7.4.r1-06000-8x98.0 from 17 months ago, while LA.UM.7.2.r2-08800-8x98.0 was last changed 5 weeks ago O.o So how am I supposed to read those tags?

@bananafunction

bananafunction commented Jul 23, 2021

> @bananafunction Thanks for the info! I guess https://github.com/bananafunction/android_kernel_sony_msm8994 is the kernel you are using? It's on CM 14.1, so maybe you use another one? Asking because it might make sense to join forces. E.g. I'm building LOS 17.1 and 18.1 ROMs with this kernel, another user is using it for HavocOS already, ...
>
> And WTF is up with the versioning? The latest tag from @derfelot is LA.UM.7.4.r1-06000-8x98.0 from 17 months ago, while LA.UM.7.2.r2-08800-8x98.0 was last changed 5 weeks ago O.o So how am I supposed to read those tags?

@Flamefire For my kernel I am using a fork of @cryptomilk's & @derfelot's kernel: https://github.com/bananafunction/android_kernel_sony_msm8998, currently on branch lineage-18.1.
Basically I just updated/removed some stuff (e.g. no kernel wireguard support, additional patches from Google's common kernel branch 'android-4.4-p', wifi patches from CAF tag 'LA.UM.7.2.r2-08800-8x98.0'), and soon, as you are already asking in this PR, I will track the CAF tags 'LA.UM.9.2.r1', just like many msm8998 kernels.

As @derfelot mentioned, there were speed/stability issues with the wifi drivers from 'LA.UM.8.4.r1' for our devices, so those were reverted to 'LA.UM.7.4.r1'. However, I found that CAF opened a new branch (and therefore new tags) for msm8998 on Android 9 which I now use to stay up-to-date. However, I just own a lilac device and no other yoshino platform based devices, so I cannot tell if the updated 7.2.r2 wifi drivers work on all devices. If you like, you can build my kernel for testing purpose in your own build.

EDIT: for some weird reason, as I just saw, GitHub reports my kernel as forked from @Myself5's kernel. I guess this is stale information from the time before the 'whatawurst' repos were created....

@Flamefire
Contributor Author

Ah confused that last digit 😅
Yeah that sounds pretty reasonable for this kernel. Wanted to include the Google patches too once I got some time. Not trivial at this point due to conflicts and need to check what is still required. I'd guess it makes more sense to track that Google branch instead of upstream Linux as upstream Linux is already tracked by Google, with minor delay.
What was the reason for removing wireguard?

Maybe you can change the base fork or contact the support to do that? Might be worth comparing your repo with this :)

@bananafunction

> Ah confused that last digit 😅
> Yeah that sounds pretty reasonable for this kernel. Wanted to include the Google patches too once I got some time. Not trivial at this point due to conflicts and need to check what is still required. I'd guess it makes more sense to track that Google branch instead of upstream Linux as upstream Linux is already tracked by Google, with minor delay.
> What was the reason for removing wireguard?
>
> Maybe you can change the base fork or contact the support to do that? Might be worth comparing your repo with this :)

Yeah, I also thought about switching to the 'android-4.4-p' branch, but as you also mentioned, there are some conflicts, so I decided to go the easy way: cherry-pick from Google's branch and stay on linux stable :-)

Comparison of my kernel to whatawurst kernel is easy as I set it up newly after 27th of May 2021, so no long history to compare ;-)

The reason for removing kernel wireguard support is simply that I tend to forget to update stuff which is neither on linux stable, Google's android-4.4-p nor on CAF's branches... So I assume it is safer to use the wireguard android app...

@Flamefire
Contributor Author

> The reason for removing kernel wireguard support is simply that I tend to forget to update stuff which is neither on linux stable, Google's android-4.4-p nor on CAF's branches... So I assume it is safer to use the wireguard android app...

That's why I suggested we join forces and keep everything in one kernel repo, where everybody PRs the updates they remember ;)

> Comparison of my kernel to whatawurst kernel is easy as I set it up newly after 27th of May 2021, so no long history to compare ;-)

Kinda. But with merging the Linux kernel, commits accumulate quite a lot, making it harder. Having a common base repo allows using the GitHub tools and getting stuff like "xxx is ahead of yyy by zzz commits".

@derfelot
Member

derfelot commented Jul 23, 2021

We shouldn't directly track the aosp kernel, since, as you have pointed out, there are some differences with caf. Sony's kernel is based on caf, so that's why we are using that. There is nothing wrong with cherry-picking relevant commits from aosp though, but the important ones will (or should at least) also make it into caf.

as for wireguard, there are certain advantages to having direct kernel support, such as speed and battery usage (doesn't the app-only setup also require root? i can't remember). and the wireguard kernel implementation is very stable now (iirc it also made it into mainline now), so it's the preferred way.

@Flamefire
Contributor Author

Not fully sure about that. The commits in the AOSP kernel missing here are mostly prefixed by ANDROID, meaning they are specific to android. Sure got to check what in particular they do but I can't imagine it would be bad having them, rather it could be bad NOT having them.
The BACKPORT marked ones probably already landed in caf and I guess conflict resolution will show that. So no worry about them

Quick question about testing: As far as I understood the kernel is compiled into the boot.img file, so instead of flashing the whole ROM to test a new kernel build I get the same result by using fastboot boot boot.img except that the change will be reverted on reboot, which is good in case I mess anything up.
Is this correct, i.e. it is (~99%) enough to test the kernel with fastboot or am I missing anything?
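
The two ways of trying a kernel build discussed here can be sketched side by side. This is only an illustration: the helper name `fastboot_cmd` is hypothetical, it merely builds the command line, and it assumes the kernel partition is simply called `boot` on the device.

```python
from typing import List


def fastboot_cmd(image: str, persistent: bool = False) -> List[str]:
    """Return the argv for trying a boot image (hypothetical helper, runs nothing).

    persistent=False -> `fastboot boot <image>`: boots the image once without
                        writing it, so a reboot returns to the installed kernel.
    persistent=True  -> `fastboot flash boot <image>`: writes the image to the
                        boot partition (assumed to be named "boot").
    """
    if persistent:
        return ["fastboot", "flash", "boot", image]
    return ["fastboot", "boot", image]


if __name__ == "__main__":
    print(" ".join(fastboot_cmd("boot.img")))                   # temporary test boot
    print(" ".join(fastboot_cmd("boot.img", persistent=True)))  # permanent flash
```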

@derfelot
Member

derfelot commented Jul 23, 2021

Not sure about what? the important ones making it to caf?

These are just some examples from LA.UM.8.4.1.r1:

https://source.codeaurora.org/quic/la/kernel/msm-4.4/commit/?h=LA.UM.8.4.1.r1-03100-8x98.0&id=2b8fab40e5c99041902d9f8eb13d9c66670f08af
https://source.codeaurora.org/quic/la/kernel/msm-4.4/commit/?h=LA.UM.8.4.1.r1-03100-8x98.0&id=90cbaf5095aa53f37dca49c4e5b2593518cfa955
https://source.codeaurora.org/quic/la/kernel/msm-4.4/commit/?h=LA.UM.8.4.1.r1-03100-8x98.0&id=766d0c0c6c388ec3833caf9c47e9f3f50230cdc3

If they are ASB-relevant, they should be included (mostly). But i always manually go through each month's ASB to make sure that we have them all. Sometimes some are missing, but it's rare.

Using it via boot should probably be fine. But why not flash it and test, e.g. also reboot to recovery and other things (though I don't see anything here that could cause issues in that respect)? in case of issues you can always flash back the previous one.

@Flamefire
Contributor Author

About this:

> We shouldn't directly track the aosp kernel, since, as you have pointed out, there are some differences with caf.

As far as I can tell the upstream linux kernel is merged into the aosp one and some additional (mostly android specific commits) are added too. So the aosp is a strict superset of the upstream kernel.
I can't (yet) really imagine a case where it would be bad to use the aosp one as I think the "differences" are beneficial (to android). But yeah, got to check first if that holds true for at least the current state.

> But why not flash it and test, e.g. also reboot to recovery and other things

In case of a bootloop or hang I can just turn the phone off and am back to a working phone. Flashing back the previous rom is more of a hassle in those cases.
I don't see what rebooting to recovery should test. TWRP is in its own partition, so it should not be affected by the kernel at all. Or what do you have in mind?

@derfelot
Member

derfelot commented Jul 23, 2021

It was just an example of something that is at least partially dependent on the kernel. It happened once a long time ago that a caf commit caused issues with reboot to recovery, though I don't remember which one exactly. You don't have to flash the whole previous rom, just the boot.img should be enough.

as for the aosp kernel, from what i remember things like zram or binder diverged quite heavily. So i'd rather not directly merge aosp tags, but rather pick commits from it.

@Myself5
Contributor

Myself5 commented Jul 23, 2021

If I may jump in with some random infos:

Says forked from M5

If a GitHub repo is deleted (the old msm development one) the "forked from" switches to the oldest fork. In that case my repo.

LA.UM

LA=Linux Android compared to e.g. LE, which is Linux Embedded or LF=Linux Firefox (or newer, Kai)

UM is supposedly "Universal MSM". It denotes the chip/BSP-specific branches on newer revisions. E.g. the CAF Android sources are split up between LA.UM and LA.QSSI.

Versions

8.X Tags are intended for Android 10, 9.X Tags for Android 11.

7.X is Android 9. It's likely 7.4 was deprecated due to the devices being upgraded to Android 10, while 7.2 stayed on Android 9 and therefore still receives updates to that branch.

Apart from that: +1 to all (sane) Upstream Merging :)

@derfelot: Evidently I am a terrible IRC user, and I doubt that will ever get better sooo... how's 18.1? Stable enough for me to leech it yet :P?

@derfelot
Member

derfelot commented Jul 23, 2021

> @derfelot: Evidently I am a terrible IRC user, and I doubt that will ever get better sooo... how's 18.1? Stable enough for me to leech it yet :P?

Heya :) Yeah, should be good. It's been stable enough for a while I think - WiFi display being the exception, as for most < sdm845 devices.

@bananafunction

> If I may jump in with some random infos:
>
> Says forked from M5
>
> If a GitHub repo is deleted (the old msm development one) the "forked from" switches to the oldest fork. In that case my repo.

Thanks for the information. I didn't know that :-)

> Versions
>
> 8.X Tags are intended for Android 10, 9.X Tags for Android 11.
>
> 7.X is Android 9. It's likely 7.4 was deprecated due to the devices being upgraded to Android 10, while 7.2 stayed on Android 9 and therefore still receives updates to that branch.

Do you know the full scheme behind the tags? I mean, I did know that 7.x is Android 9 but for me it seems just like random that 7.2.x stays on Android 9 and 7.4.x does not (or is no longer supported). Also, is there any roadmap on e.g. how long a branch/tag will be supported? I just check https://wiki.codeaurora.org/xwiki/bin/QAEP/release regularly and then fetch the CAF branches.

@Myself5 Anyway, thank you for the random yet conversation related facts ;-)

@Myself5
Contributor

Myself5 commented Jul 23, 2021

> Do you know the full scheme behind the tags? I mean, I did know that 7.x is Android 9 but for me it seems just like random that 7.2.x stays on Android 9 and 7.4.x does not (or is no longer supported). Also, is there any roadmap on e.g. how long a branch/tag will be supported? I just check wiki.codeaurora.org/xwiki/bin/QAEP/release regularly and then fetch the CAF branches.

The second part (X.2 and X.4) is the platform. I checked with a few people, who all confirmed to me that X.2 is actually sdm660 and NOT msm8998. Nobody knows why it's listed as msm8998 though. The platform number for msm8998 is X.4, which had been discontinued as msm8998 got upgraded to 8.4.
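
Putting the pieces from this thread together, a tag such as LA.UM.8.4.r1-04600-8x98.0 can be split into its parts with a short parser. This is a rough sketch under assumptions: the field names (`major`, `platform`, `sub`, `rev`, `build`, `target`) are my own labels rather than official CAF terminology, the Android mapping is only what was stated above (7.X is Android 9, 8.X is Android 10, 9.X is Android 11), and the pattern is fitted to the tags quoted in this conversation, not to a published scheme.

```python
import re
from typing import NamedTuple, Optional

# Android release per first version number, as stated in this thread only.
ANDROID_BY_MAJOR = {7: 9, 8: 10, 9: 11}

TAG_RE = re.compile(
    r"^(?P<os>[A-Z]+)\.(?P<family>[A-Z]+)"   # LA.UM
    r"\.(?P<major>\d+)\.(?P<platform>\d+)"   # 8.4
    r"(?:\.(?P<sub>\d+))?"                   # optional extra part, e.g. the 1 in 8.4.1
    r"\.r(?P<rev>\d+)"                       # r1
    r"-(?P<build>\d+)-(?P<target>.+)$"       # 04600-8x98.0
)


class CafTag(NamedTuple):
    os: str
    family: str
    major: int
    platform: int
    sub: Optional[int]
    rev: int
    build: int
    target: str

    @property
    def branch(self) -> str:
        """Everything before the first dash, e.g. LA.UM.8.4.r1."""
        parts = [self.os, self.family, str(self.major), str(self.platform)]
        if self.sub is not None:
            parts.append(str(self.sub))
        parts.append(f"r{self.rev}")
        return ".".join(parts)

    @property
    def android(self) -> Optional[int]:
        """Android version according to the mapping above, or None if unknown."""
        return ANDROID_BY_MAJOR.get(self.major)


def parse_caf_tag(tag: str) -> CafTag:
    """Split a CAF tag into its components; raises ValueError if it does not match."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"unrecognised CAF tag: {tag!r}")
    sub = m.group("sub")
    return CafTag(
        os=m.group("os"),
        family=m.group("family"),
        major=int(m.group("major")),
        platform=int(m.group("platform")),
        sub=int(sub) if sub is not None else None,
        rev=int(m.group("rev")),
        build=int(m.group("build")),
        target=m.group("target"),
    )


if __name__ == "__main__":
    for name in ("LA.UM.8.4.r1-04600-8x98.0", "LA.UM.9.2.r1-03400-SDMxx0.0"):
        tag = parse_caf_tag(name)
        print(tag.branch, tag.build, tag.target, tag.android)
```

Within one branch, comparing the `build` field gives a simple "which tag is newer" ordering. Across branches, the numbers are not comparable, which matches the confusion about 7.4 versus 7.2 earlier in the thread.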

As far as I know there are no public roadmaps. I'd assume something like this is only accessible to customers (aka the licensees) rather than the public (and possibly a competitor).

@Flamefire
Contributor Author

Fwiw: running this via fastboot for a day now. No issues so far, vibration working.
There is a new tag already so going to check that and likely flash it.

@Flamefire
Contributor Author

Closing in favor of #35 (which is what I tested, damn confusing PRs)

@Flamefire Flamefire closed this Jul 26, 2021
Flamefire referenced this pull request in Flamefire/android_kernel_sony_msm8998 Jul 26, 2021
…graph tracing

do_task_stat() calls get_wchan(), which further does unwind_frame().
unwind_frame() restores frame->pc to original value in case function
graph tracer has modified a return address (LR) in a stack frame to hook
a function return. However, if function graph tracer has hit a filtered
function, then we can't unwind it as ftrace_push_return_trace() has
biased the index(frame->graph) with a 'huge negative'
offset(-FTRACE_NOTRACE_DEPTH).

Moreover, arm64 stack walker defines index(frame->graph) as unsigned
int, which can not compare a -ve number.

Similar problem we can have with calling of walk_stackframe() from
save_stack_trace_tsk() or dump_backtrace().

This patch fixes unwind_frame() to test the index for -ve value and
restore index accordingly before we can restore frame->pc.

Reproducer:

cd /sys/kernel/debug/tracing/
echo schedule > set_graph_notrace
echo 1 > options/display-graph
echo wakeup > current_tracer
ps -ef | grep -i agent

Above commands result in:
Unable to handle kernel paging request at virtual address ffff801bd3d1e000
pgd = ffff8003cbe97c00
[ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000006 [#1] SMP
[...]
CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
[...]
task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
PC is at unwind_frame+0x12c/0x180
LR is at get_wchan+0xd4/0x134
pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
sp : ffff8003cc6c3ab0
x29: ffff8003cc6c3ab0 x28: 0000000000000001
x27: 0000000000000026 x26: 0000000000000026
x25: 00000000000012d8 x24: 0000000000000000
x23: ffff8003c1c04000 x22: ffff000008c83000
x21: ffff8003c1c00000 x20: 000000000000000f
x19: ffff8003c1bc0000 x18: 0000fffffc593690
x17: 0000000000000000 x16: 0000000000000001
x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
x13: 0000000000000001 x12: 0000000000000000
x11: 00000000e8f4883e x10: 0000000154f47ec8
x9 : 0000000070f367c0 x8 : 0000000000000000
x7 : 00008003f7290000 x6 : 0000000000000018
x5 : 0000000000000000 x4 : ffff8003c1c03cb0
x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000

Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
[...]
[<ffff00000808892c>] unwind_frame+0x12c/0x180
[<ffff000008305008>] do_task_stat+0x864/0x870
[<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
[<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
[<ffff0000082b27e0>] seq_read+0x160/0x414
[<ffff000008289e6c>] __vfs_read+0x58/0x164
[<ffff00000828b164>] vfs_read+0x88/0x144
[<ffff00000828c2e8>] SyS_read+0x60/0xc0
[<ffff0000080834a0>] __sys_trace_return+0x0/0x4

Fixes: 20380bb (arm64: ftrace: fix a stack tracer's output under function graph tracer)
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
[catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 9f41631)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9cbc564cf7e1808a05e1e45e9196a8d138bae4a5
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
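
The signedness problem described in the commit message above can be illustrated outside the kernel. The following is a small sketch, not the kernel code: it uses Python's `ctypes` to mimic a C `unsigned int`, and it assumes a bias value of 65536 for `FTRACE_NOTRACE_DEPTH` (the message itself only calls it a "huge negative" offset). The helper names are made up for the example.

```python
import ctypes

# Assumed bias value; the commit message only describes it as a huge negative offset.
FTRACE_NOTRACE_DEPTH = 65536


def as_unsigned_int(value: int) -> int:
    """Value as it reads back after being stored in a C `unsigned int`."""
    return ctypes.c_uint(value).value


def as_signed_int(value: int) -> int:
    """The same bits reinterpreted as a C `int`."""
    return ctypes.c_int(value).value


def restore_index(stored: int) -> int:
    """Test the index as a signed number and undo the bias if it is negative.

    Sketch of the idea in the commit message, not the actual unwind_frame() code.
    """
    index = as_signed_int(stored)
    if index < 0:
        index += FTRACE_NOTRACE_DEPTH
    return index


if __name__ == "__main__":
    biased = 3 - FTRACE_NOTRACE_DEPTH   # index 3 after hitting a filtered function
    stored = as_unsigned_int(biased)    # what an unsigned field would hold
    print(stored, stored < 0)           # huge positive number, never "negative"
    print(restore_index(stored))        # back to the original index
```

With an unsigned field the `< 0` test can never fire, so the biased value would be used directly as an array index, which is consistent with the out-of-range access in the oops above.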
Flamefire referenced this pull request in Flamefire/android_kernel_sony_msm8998 Jul 26, 2021
…graph tracing

Flamefire referenced this pull request in Flamefire/android_kernel_sony_msm8998 Jul 28, 2021
…graph tracing

@Flamefire Flamefire mentioned this pull request Nov 15, 2022
ariffjenong pushed a commit to ariffjenong/android_kernel_sony_msm8998 that referenced this pull request Jan 20, 2023
[ Upstream commit bcd70260ef56e0aee8a4fc6cd214a419900b0765 ]

By keep sending L2CAP_CONF_REQ packets, chan->num_conf_rsp increases
multiple times and eventually it will wrap around the maximum number
(i.e., 255).
This patch prevents this by adding a boundary check with
L2CAP_MAX_CONF_RSP

Btmon log:
Bluetooth monitor ver 5.64
= Note: Linux version 6.1.0-rc2 (x86_64)                               0.264594
= Note: Bluetooth subsystem version 2.22                               0.264636
@ MGMT Open: btmon (privileged) version 1.22                  {0x0001} 0.272191
= New Index: 00:00:00:00:00:00 (Primary,Virtual,hci0)          [hci0] 13.877604
@ RAW Open: 9496 (privileged) version 2.22                   {0x0002} 13.890741
= Open Index: 00:00:00:00:00:00                                [hci0] 13.900426
(...)
> ACL Data RX: Handle 200 flags 0x00 dlen 1033             #32 [hci0] 14.273106
        invalid packet size (12 != 1033)
        08 00 01 00 02 01 04 00 01 10 ff ff              ............
> ACL Data RX: Handle 200 flags 0x00 dlen 1547             #33 [hci0] 14.273561
        invalid packet size (14 != 1547)
        0a 00 01 00 04 01 06 00 40 00 00 00 00 00        ........@.....
> ACL Data RX: Handle 200 flags 0x00 dlen 2061             #34 [hci0] 14.274390
        invalid packet size (16 != 2061)
        0c 00 01 00 04 01 08 00 40 00 00 00 00 00 00 04  ........@.......
> ACL Data RX: Handle 200 flags 0x00 dlen 2061             #35 [hci0] 14.274932
        invalid packet size (16 != 2061)
        0c 00 01 00 04 01 08 00 40 00 00 00 07 00 03 00  ........@.......
= bluetoothd: Bluetooth daemon 5.43                                   14.401828
> ACL Data RX: Handle 200 flags 0x00 dlen 1033             #36 [hci0] 14.275753
        invalid packet size (12 != 1033)
        08 00 01 00 04 01 04 00 40 00 00 00              ........@...

Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Ulrich Hecht <uli+cip@fpond.eu>
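
The wrap-around that this commit message describes can be shown with a toy counter. This is a sketch of the general technique (a saturating increment), not the actual kernel patch: the 8-bit width follows from the "255" in the message, while the limit value `MAX_CONF_RSP = 2` and both function names are assumptions made for the example.

```python
import ctypes

# Hypothetical limit for illustration; the real L2CAP_MAX_CONF_RSP is defined in the kernel headers.
MAX_CONF_RSP = 2


def bump_unchecked(counter: int) -> int:
    """Increment the way an 8-bit unsigned field does: 255 + 1 wraps to 0."""
    return ctypes.c_uint8(counter + 1).value


def bump_checked(counter: int, limit: int = MAX_CONF_RSP) -> int:
    """Increment only while below the limit, so the counter saturates instead of wrapping."""
    return counter + 1 if counter < limit else counter


if __name__ == "__main__":
    unchecked = checked = 0
    for _ in range(256):               # 256 configuration requests in a row
        unchecked = bump_unchecked(unchecked)
        checked = bump_checked(checked)
    print(unchecked, checked)          # wrapped counter vs. saturated counter
```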
derfelot pushed a commit to derfelot/android_kernel_sony_msm8998 that referenced this pull request Jul 30, 2023
…graph tracing

SteadyQuad pushed a commit to SteadyQuad/android_kernel_sony_msm8998 that referenced this pull request Aug 16, 2024
[ Upstream commit af9a8730ddb6a4b2edd779ccc0aceb994d616830 ]

During stress testing of the jffs2 file system, the following
abnormal printouts were found:
[ 2430.649000] Unable to handle kernel paging request at virtual address 0069696969696948
[ 2430.649622] Mem abort info:
[ 2430.649829]   ESR = 0x96000004
[ 2430.650115]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 2430.650564]   SET = 0, FnV = 0
[ 2430.650795]   EA = 0, S1PTW = 0
[ 2430.651032]   FSC = 0x04: level 0 translation fault
[ 2430.651446] Data abort info:
[ 2430.651683]   ISV = 0, ISS = 0x00000004
[ 2430.652001]   CM = 0, WnR = 0
[ 2430.652558] [0069696969696948] address between user and kernel address ranges
[ 2430.653265] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 2430.654512] CPU: 2 PID: 20919 Comm: cat Not tainted 5.15.25-g512f31242bf6 #33
[ 2430.655008] Hardware name: linux,dummy-virt (DT)
[ 2430.655517] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2430.656142] pc : kfree+0x78/0x348
[ 2430.656630] lr : jffs2_free_inode+0x24/0x48
[ 2430.657051] sp : ffff800009eebd10
[ 2430.657355] x29: ffff800009eebd10 x28: 0000000000000001 x27: 0000000000000000
[ 2430.658327] x26: ffff000038f09d80 x25: 0080000000000000 x24: ffff800009d38000
[ 2430.658919] x23: 5a5a5a5a5a5a5a5a x22: ffff000038f09d80 x21: ffff8000084f0d14
[ 2430.659434] x20: ffff0000bf9a6ac0 x19: 0169696969696940 x18: 0000000000000000
[ 2430.659969] x17: ffff8000b6506000 x16: ffff800009eec000 x15: 0000000000004000
[ 2430.660637] x14: 0000000000000000 x13: 00000001000820a1 x12: 00000000000d1b19
[ 2430.661345] x11: 0004000800000000 x10: 0000000000000001 x9 : ffff8000084f0d14
[ 2430.662025] x8 : ffff0000bf9a6b40 x7 : ffff0000bf9a6b48 x6 : 0000000003470302
[ 2430.662695] x5 : ffff00002e41dcc0 x4 : ffff0000bf9aa3b0 x3 : 0000000003470342
[ 2430.663486] x2 : 0000000000000000 x1 : ffff8000084f0d14 x0 : fffffc0000000000
[ 2430.664217] Call trace:
[ 2430.664528]  kfree+0x78/0x348
[ 2430.664855]  jffs2_free_inode+0x24/0x48
[ 2430.665233]  i_callback+0x24/0x50
[ 2430.665528]  rcu_do_batch+0x1ac/0x448
[ 2430.665892]  rcu_core+0x28c/0x3c8
[ 2430.666151]  rcu_core_si+0x18/0x28
[ 2430.666473]  __do_softirq+0x138/0x3cc
[ 2430.666781]  irq_exit+0xf0/0x110
[ 2430.667065]  handle_domain_irq+0x6c/0x98
[ 2430.667447]  gic_handle_irq+0xac/0xe8
[ 2430.667739]  call_on_irq_stack+0x28/0x54
The parameter passed to kfree was 5a5a5a5a, which corresponds to the target field of
the jffs2_inode_info structure. All members of the jffs2_inode_info structure were
found to be 5a5a5a5a except the first member, sem. These members were presumably
never initialized: 5a5a5a5a is the pattern written during memory testing to detect
uninitialized memory. The sem member is initialized in the function
jffs2_i_init_once, while the other members are initialized in the function
jffs2_init_inode_info.

The function jffs2_init_inode_info is called after iget_locked, but
iget_locked can itself trigger the destroy_inode path, which releases
the inode before its target member has been initialized. Under heavy
concurrent load, iget_locked may enter the destroy_inode branch, as
described in the code.

Since the destroy_inode path of jffs2 only releases the target,
the fix is to set target to NULL in jffs2_i_init_once.
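
The idea can be sketched in user space. This is a minimal model, not the
kernel code: struct model_inode_info, model_init_once, model_free_inode and
demo_early_destroy are hypothetical stand-ins for jffs2_inode_info,
jffs2_i_init_once, jffs2_free_inode and the early-destroy path in iget_locked,
and malloc/free stand in for the slab allocator and kfree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct jffs2_inode_info. */
struct model_inode_info {
    int sem_ready;  /* stands in for the mutex set up by the constructor */
    char *target;   /* released on destroy */
    int other;      /* stands in for fields set only by the full initializer */
};

/* Poison byte matching the 5a pattern seen in the oops. */
#define POISON 0x5a

/* Constructor, run once per object; models jffs2_i_init_once with the fix. */
void model_init_once(struct model_inode_info *f)
{
    f->sem_ready = 1;
    f->target = NULL; /* the fix: destroy is safe even before full init */
}

/* Destroy path: only releases target, as the commit message describes. */
void model_free_inode(struct model_inode_info *f)
{
    free(f->target); /* free(NULL) is a no-op, like kfree(NULL) */
    f->target = NULL;
}

/* Returns 0 if an object destroyed before full initialization is handled. */
int demo_early_destroy(void)
{
    struct model_inode_info *f = malloc(sizeof(*f));

    if (!f)
        return -1;
    memset(f, POISON, sizeof(*f)); /* simulate poisoned allocator memory */
    model_init_once(f);
    /* The full initializer never runs here, modelling the early destroy. */
    assert(f->target == NULL);
    model_free_inode(f);
    free(f);
    return 0;
}
```

Without the assignment in model_init_once, target would still hold the poison
pattern when model_free_inode runs, which is the situation the oops shows.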

Signed-off-by: Wang Yong <wang.yong12@zte.com.cn>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: Yang Tao <yang.tao172@zte.com.cn>
Cc: Xu Xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Ulrich Hecht <uli@kernel.org>