Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update #40

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

update #40

wants to merge 1 commit into from

Conversation

lsmushroom
Copy link

update

hardkernel referenced this pull request in hardkernel/linux Jun 25, 2013
commit 5e4ba61 upstream.

Martin Storsjö reports that the sequence:

        ee312ac1        vsub.f32        s4, s3, s2
        ee702ac0        vsub.f32        s5, s1, s0
        e59f0028        ldr             r0, [pc, #40]
        ee111a90        vmov            r1, s3

on Raspberry Pi (implementor 41 architecture 1 part 20 variant b rev 5)
where s3 is a denormal and s2 is zero results in incorrect behaviour -
the instruction "vsub.f32 s5, s1, s0" is not executed:

        VFP: bounce: trigger ee111a90 fpexc d0000780
        VFP: emulate: INST=0xee312ac1 SCR=0x00000000
        ...

As we can see, the instruction triggering the exception is the "vmov"
instruction, and we emulate the "vsub.f32 s4, s3, s2" but fail to
properly take account of the FPEXC_FP2V flag in FPEXC.  This is because
the test for the second instruction register being valid is bogus, and
will always skip emulation of the second instruction.

Reported-by: Martin Storsjö <martin@martin.st>
Tested-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
johnweber pushed a commit to johnweber/linux that referenced this pull request Jun 26, 2013
Multiple function in asynchronous hashing use a saved-state block,
a.k.a. struct caam_hash_state, which holds a stash of information
between requests (init/update/final). Certain values in this state
block are loaded for processing using an inline-if, and when this
is done, the potential for uninitialized data can pose conflicts.
Therefore, this patch improves initialization of state data to
prevent false assignments using uninitialized data in the state block.

This patch addresses the following traceback, originating in
ahash_final_ctx(), although a problem like this could certainly
exhibit other symptoms:

kernel BUG at arch/arm/mm/dma-mapping.c:465!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 80004000
[00000000] *pgd=00000000
Internal error: Oops: 805 [#1] PREEMPT SMP
Modules linked in:
CPU: 0    Not tainted  (3.0.15-01752-gdd441b9-dirty torvalds#40)
PC is at __bug+0x1c/0x28
LR is at __bug+0x18/0x28
pc : [<80043240>]    lr : [<8004323c>]    psr: 60000013
sp : e423fd98  ip : 60000013  fp : 0000001c
r10: e4191b84  r9 : 00000020  r8 : 00000009
r7 : 88005038  r6 : 00000001  r5 : 2d676572  r4 : e4191a60
r3 : 00000000  r2 : 00000001  r1 : 60000093  r0 : 00000033
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c53c7d  Table: 1000404a  DAC: 00000015
Process cryptomgr_test (pid: 1306, stack limit = 0xe423e2f0)
Stack: (0xe423fd98 to 0xe4240000)
fd80:                                                       11807fd1 80048544
fda0: 88005000 e4191a00 e5178040 8039dda0 00000000 00000014 2d676572 e419100
fdc0: 88005018 e4191a60 00100100 e4191a00 00000000 8039ce0c e423fea8 00000007
fde0: e4191a00 e4227000 e5178000 8039ce18 e419183c 80203808 80a94a44 00000006
fe00: 00000000 80207180 00000000 00000006 e423ff08 00000000 00000007 e5178000
fe20: e41918a4 80a949b4 8c4844e2 00000000 00000049 74227000 8c4844e2 00000e90
fe40: 0000000e 74227e90 ffff8c58 80ac29e0 e423fed4 8006a350 8c81625c e423ff5c
fe60: 00008576 e4002500 00000003 00030010 e4002500 00000003 e5180000 e4002500
fe80: e5178000 800e6d24 007fffff 00000000 00000010 e4001280 e4002500 60000013
fea0: 000000d0 804df078 00000000 00000000 00000000 00000000 00000000 00000000
fec0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 e4227000 e4226000 e4753000 e4752000 e40a5000 e40a4000
ff00: e41e7000 e41e6000 00000000 00000000 00000000 e423ff14 e423ff14 00000000
ff20: 00000400 804f9080 e5178000 e4db0b40 00000000 e4db0b80 0000047c 00000400
ff40: 00000000 8020758c 00000400 ffffffff 0000008a 00000000 e4db0b40 80206e00
ff60: e4049dbc 00000000 00000000 00000003 e423ffa4 80062978 e41a8bfc 00000000
ff80: 00000000 e4049db4 00000013 e4049db0 00000013 00000000 00000000 00000000
ffa0: e4db0b40 e4db0b40 80204cbc 00000013 00000000 00000000 00000000 80204cfc
ffc0: e4049da0 80089544 80040a40 00000000 e4db0b40 00000000 00000000 00000000
ffe0: e423ffe0 e423ffe0 e4049da0 800894c4 80040a40 80040a40 00000000 00000000
[<80043240>] (__bug+0x1c/0x28) from [<80048544>] (___dma_single_dev_to_cpu+0x84)
[<80048544>] (___dma_single_dev_to_cpu+0x84/0x94) from [<8039dda0>] (ahash_fina)
[<8039dda0>] (ahash_final_ctx+0x180/0x428) from [<8039ce18>] (ahash_final+0xc/0)
[<8039ce18>] (ahash_final+0xc/0x10) from [<80203808>] (crypto_ahash_op+0x28/0xc)
[<80203808>] (crypto_ahash_op+0x28/0xc0) from [<80207180>] (test_hash+0x214/0x5)
[<80207180>] (test_hash+0x214/0x5b8) from [<8020758c>] (alg_test_hash+0x68/0x8c)
[<8020758c>] (alg_test_hash+0x68/0x8c) from [<80206e00>] (alg_test+0x7c/0x1b8)
[<80206e00>] (alg_test+0x7c/0x1b8) from [<80204cfc>] (cryptomgr_test+0x40/0x48)
[<80204cfc>] (cryptomgr_test+0x40/0x48) from [<80089544>] (kthread+0x80/0x88)
[<80089544>] (kthread+0x80/0x88) from [<80040a40>] (kernel_thread_exit+0x0/0x8)
Code: e59f0010 e1a01003 eb126a8d e3a03000 (e5833000)
---[ end trace d52a403a1d1eaa86 ]---

Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com>
Signed-off-by: Terry Lv <r65388@freescale.com>
johnweber pushed a commit to johnweber/linux that referenced this pull request Jun 26, 2013
Multiple function in asynchronous hashing use a saved-state block,
a.k.a. struct caam_hash_state, which holds a stash of information
between requests (init/update/final). Certain values in this state
block are loaded for processing using an inline-if, and when this
is done, the potential for uninitialized data can pose conflicts.
Therefore, this patch improves initialization of state data to
prevent false assignments using uninitialized data in the state block.

This patch addresses the following traceback, originating in
ahash_final_ctx(), although a problem like this could certainly
exhibit other symptoms:

kernel BUG at arch/arm/mm/dma-mapping.c:465!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 80004000
[00000000] *pgd=00000000
Internal error: Oops: 805 [#1] PREEMPT SMP
Modules linked in:
CPU: 0    Not tainted  (3.0.15-01752-gdd441b9-dirty torvalds#40)
PC is at __bug+0x1c/0x28
LR is at __bug+0x18/0x28
pc : [<80043240>]    lr : [<8004323c>]    psr: 60000013
sp : e423fd98  ip : 60000013  fp : 0000001c
r10: e4191b84  r9 : 00000020  r8 : 00000009
r7 : 88005038  r6 : 00000001  r5 : 2d676572  r4 : e4191a60
r3 : 00000000  r2 : 00000001  r1 : 60000093  r0 : 00000033
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c53c7d  Table: 1000404a  DAC: 00000015
Process cryptomgr_test (pid: 1306, stack limit = 0xe423e2f0)
Stack: (0xe423fd98 to 0xe4240000)
fd80:                                                       11807fd1 80048544
fda0: 88005000 e4191a00 e5178040 8039dda0 00000000 00000014 2d676572 e419100
fdc0: 88005018 e4191a60 00100100 e4191a00 00000000 8039ce0c e423fea8 00000007
fde0: e4191a00 e4227000 e5178000 8039ce18 e419183c 80203808 80a94a44 00000006
fe00: 00000000 80207180 00000000 00000006 e423ff08 00000000 00000007 e5178000
fe20: e41918a4 80a949b4 8c4844e2 00000000 00000049 74227000 8c4844e2 00000e90
fe40: 0000000e 74227e90 ffff8c58 80ac29e0 e423fed4 8006a350 8c81625c e423ff5c
fe60: 00008576 e4002500 00000003 00030010 e4002500 00000003 e5180000 e4002500
fe80: e5178000 800e6d24 007fffff 00000000 00000010 e4001280 e4002500 60000013
fea0: 000000d0 804df078 00000000 00000000 00000000 00000000 00000000 00000000
fec0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 e4227000 e4226000 e4753000 e4752000 e40a5000 e40a4000
ff00: e41e7000 e41e6000 00000000 00000000 00000000 e423ff14 e423ff14 00000000
ff20: 00000400 804f9080 e5178000 e4db0b40 00000000 e4db0b80 0000047c 00000400
ff40: 00000000 8020758c 00000400 ffffffff 0000008a 00000000 e4db0b40 80206e00
ff60: e4049dbc 00000000 00000000 00000003 e423ffa4 80062978 e41a8bfc 00000000
ff80: 00000000 e4049db4 00000013 e4049db0 00000013 00000000 00000000 00000000
ffa0: e4db0b40 e4db0b40 80204cbc 00000013 00000000 00000000 00000000 80204cfc
ffc0: e4049da0 80089544 80040a40 00000000 e4db0b40 00000000 00000000 00000000
ffe0: e423ffe0 e423ffe0 e4049da0 800894c4 80040a40 80040a40 00000000 00000000
[<80043240>] (__bug+0x1c/0x28) from [<80048544>] (___dma_single_dev_to_cpu+0x84)
[<80048544>] (___dma_single_dev_to_cpu+0x84/0x94) from [<8039dda0>] (ahash_fina)
[<8039dda0>] (ahash_final_ctx+0x180/0x428) from [<8039ce18>] (ahash_final+0xc/0)
[<8039ce18>] (ahash_final+0xc/0x10) from [<80203808>] (crypto_ahash_op+0x28/0xc)
[<80203808>] (crypto_ahash_op+0x28/0xc0) from [<80207180>] (test_hash+0x214/0x5)
[<80207180>] (test_hash+0x214/0x5b8) from [<8020758c>] (alg_test_hash+0x68/0x8c)
[<8020758c>] (alg_test_hash+0x68/0x8c) from [<80206e00>] (alg_test+0x7c/0x1b8)
[<80206e00>] (alg_test+0x7c/0x1b8) from [<80204cfc>] (cryptomgr_test+0x40/0x48)
[<80204cfc>] (cryptomgr_test+0x40/0x48) from [<80089544>] (kthread+0x80/0x88)
[<80089544>] (kthread+0x80/0x88) from [<80040a40>] (kernel_thread_exit+0x0/0x8)
Code: e59f0010 e1a01003 eb126a8d e3a03000 (e5833000)
---[ end trace d52a403a1d1eaa86 ]---

Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com>
Signed-off-by: Terry Lv <r65388@freescale.com>
ffainelli pushed a commit to ffainelli/linux that referenced this pull request Jun 26, 2013
Martin Storsjö reports that the sequence:

        ee312ac1        vsub.f32        s4, s3, s2
        ee702ac0        vsub.f32        s5, s1, s0
        e59f0028        ldr             r0, [pc, torvalds#40]
        ee111a90        vmov            r1, s3

on Raspberry Pi (implementor 41 architecture 1 part 20 variant b rev 5)
where s3 is a denormal and s2 is zero results in incorrect behaviour -
the instruction "vsub.f32 s5, s1, s0" is not executed:

        VFP: bounce: trigger ee111a90 fpexc d0000780
        VFP: emulate: INST=0xee312ac1 SCR=0x00000000
        ...

As we can see, the instruction triggering the exception is the "vmov"
instruction, and we emulate the "vsub.f32 s4, s3, s2" but fail to
properly take account of the FPEXC_FP2V flag in FPEXC.  This is because
the test for the second instruction register being valid is bogus, and
will always skip emulation of the second instruction.

Cc: <stable@vger.kernel.org>
Reported-by: Martin Storsjö <martin@martin.st>
Tested-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
zehome referenced this pull request in zehome/linux Aug 10, 2013
commit 5e4ba61 upstream.

Martin Storsjö reports that the sequence:

        ee312ac1        vsub.f32        s4, s3, s2
        ee702ac0        vsub.f32        s5, s1, s0
        e59f0028        ldr             r0, [pc, hardkernel#40]
        ee111a90        vmov            r1, s3

on Raspberry Pi (implementor 41 architecture 1 part 20 variant b rev 5)
where s3 is a denormal and s2 is zero results in incorrect behaviour -
the instruction "vsub.f32 s5, s1, s0" is not executed:

        VFP: bounce: trigger ee111a90 fpexc d0000780
        VFP: emulate: INST=0xee312ac1 SCR=0x00000000
        ...

As we can see, the instruction triggering the exception is the "vmov"
instruction, and we emulate the "vsub.f32 s4, s3, s2" but fail to
properly take account of the FPEXC_FP2V flag in FPEXC.  This is because
the test for the second instruction register being valid is bogus, and
will always skip emulation of the second instruction.

Reported-by: Martin Storsjö <martin@martin.st>
Tested-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swarren pushed a commit to swarren/linux-tegra that referenced this pull request Oct 14, 2013
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  torvalds#6  torvalds#7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  torvalds#8  torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
swarren pushed a commit to swarren/linux-tegra that referenced this pull request Oct 14, 2013
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   torvalds#6   torvalds#7
[    0.603005] .... node   #1, CPUs:     torvalds#8   torvalds#9  torvalds#10  torvalds#11  torvalds#12  torvalds#13  torvalds#14  torvalds#15
[    1.200005] .... node   #2, CPUs:    torvalds#16  torvalds#17  torvalds#18  torvalds#19  torvalds#20  torvalds#21  torvalds#22  torvalds#23
[    1.796005] .... node   #3, CPUs:    torvalds#24  torvalds#25  torvalds#26  torvalds#27  torvalds#28  torvalds#29  torvalds#30  torvalds#31
[    2.393005] .... node   #4, CPUs:    torvalds#32  torvalds#33  torvalds#34  torvalds#35  torvalds#36  torvalds#37  torvalds#38  torvalds#39
[    2.996005] .... node   #5, CPUs:    torvalds#40  torvalds#41  torvalds#42  torvalds#43  torvalds#44  torvalds#45  torvalds#46  torvalds#47
[    3.600005] .... node   torvalds#6, CPUs:    torvalds#48  torvalds#49  torvalds#50  torvalds#51  #52  #53  torvalds#54  torvalds#55
[    4.202005] .... node   torvalds#7, CPUs:    torvalds#56  torvalds#57  #58  torvalds#59  torvalds#60  torvalds#61  torvalds#62  torvalds#63
[    4.811005] .... node   torvalds#8, CPUs:    torvalds#64  torvalds#65  torvalds#66  torvalds#67  torvalds#68  torvalds#69  #70  torvalds#71
[    5.421006] .... node   torvalds#9, CPUs:    torvalds#72  torvalds#73  torvalds#74  torvalds#75  torvalds#76  torvalds#77  torvalds#78  torvalds#79
[    6.032005] .... node  torvalds#10, CPUs:    torvalds#80  torvalds#81  torvalds#82  torvalds#83  torvalds#84  torvalds#85  torvalds#86  torvalds#87
[    6.648006] .... node  torvalds#11, CPUs:    torvalds#88  torvalds#89  torvalds#90  torvalds#91  torvalds#92  torvalds#93  torvalds#94  torvalds#95
[    7.262005] .... node  torvalds#12, CPUs:    torvalds#96  torvalds#97  torvalds#98  torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103
[    7.865005] .... node  torvalds#13, CPUs:   torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111
[    8.466005] .... node  torvalds#14, CPUs:   torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119
[    9.073006] .... node  torvalds#15, CPUs:   torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
TechNexion pushed a commit to TechNexion/linux that referenced this pull request Oct 25, 2013
Multiple function in asynchronous hashing use a saved-state block,
a.k.a. struct caam_hash_state, which holds a stash of information
between requests (init/update/final). Certain values in this state
block are loaded for processing using an inline-if, and when this
is done, the potential for uninitialized data can pose conflicts.
Therefore, this patch improves initialization of state data to
prevent false assignments using uninitialized data in the state block.

This patch addresses the following traceback, originating in
ahash_final_ctx(), although a problem like this could certainly
exhibit other symptoms:

kernel BUG at arch/arm/mm/dma-mapping.c:465!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 80004000
[00000000] *pgd=00000000
Internal error: Oops: 805 [#1] PREEMPT SMP
Modules linked in:
CPU: 0    Not tainted  (3.0.15-01752-gdd441b9-dirty torvalds#40)
PC is at __bug+0x1c/0x28
LR is at __bug+0x18/0x28
pc : [<80043240>]    lr : [<8004323c>]    psr: 60000013
sp : e423fd98  ip : 60000013  fp : 0000001c
r10: e4191b84  r9 : 00000020  r8 : 00000009
r7 : 88005038  r6 : 00000001  r5 : 2d676572  r4 : e4191a60
r3 : 00000000  r2 : 00000001  r1 : 60000093  r0 : 00000033
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c53c7d  Table: 1000404a  DAC: 00000015
Process cryptomgr_test (pid: 1306, stack limit = 0xe423e2f0)
Stack: (0xe423fd98 to 0xe4240000)
fd80:                                                       11807fd1 80048544
fda0: 88005000 e4191a00 e5178040 8039dda0 00000000 00000014 2d676572 e419100
fdc0: 88005018 e4191a60 00100100 e4191a00 00000000 8039ce0c e423fea8 00000007
fde0: e4191a00 e4227000 e5178000 8039ce18 e419183c 80203808 80a94a44 00000006
fe00: 00000000 80207180 00000000 00000006 e423ff08 00000000 00000007 e5178000
fe20: e41918a4 80a949b4 8c4844e2 00000000 00000049 74227000 8c4844e2 00000e90
fe40: 0000000e 74227e90 ffff8c58 80ac29e0 e423fed4 8006a350 8c81625c e423ff5c
fe60: 00008576 e4002500 00000003 00030010 e4002500 00000003 e5180000 e4002500
fe80: e5178000 800e6d24 007fffff 00000000 00000010 e4001280 e4002500 60000013
fea0: 000000d0 804df078 00000000 00000000 00000000 00000000 00000000 00000000
fec0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 e4227000 e4226000 e4753000 e4752000 e40a5000 e40a4000
ff00: e41e7000 e41e6000 00000000 00000000 00000000 e423ff14 e423ff14 00000000
ff20: 00000400 804f9080 e5178000 e4db0b40 00000000 e4db0b80 0000047c 00000400
ff40: 00000000 8020758c 00000400 ffffffff 0000008a 00000000 e4db0b40 80206e00
ff60: e4049dbc 00000000 00000000 00000003 e423ffa4 80062978 e41a8bfc 00000000
ff80: 00000000 e4049db4 00000013 e4049db0 00000013 00000000 00000000 00000000
ffa0: e4db0b40 e4db0b40 80204cbc 00000013 00000000 00000000 00000000 80204cfc
ffc0: e4049da0 80089544 80040a40 00000000 e4db0b40 00000000 00000000 00000000
ffe0: e423ffe0 e423ffe0 e4049da0 800894c4 80040a40 80040a40 00000000 00000000
[<80043240>] (__bug+0x1c/0x28) from [<80048544>] (___dma_single_dev_to_cpu+0x84)
[<80048544>] (___dma_single_dev_to_cpu+0x84/0x94) from [<8039dda0>] (ahash_fina)
[<8039dda0>] (ahash_final_ctx+0x180/0x428) from [<8039ce18>] (ahash_final+0xc/0)
[<8039ce18>] (ahash_final+0xc/0x10) from [<80203808>] (crypto_ahash_op+0x28/0xc)
[<80203808>] (crypto_ahash_op+0x28/0xc0) from [<80207180>] (test_hash+0x214/0x5)
[<80207180>] (test_hash+0x214/0x5b8) from [<8020758c>] (alg_test_hash+0x68/0x8c)
[<8020758c>] (alg_test_hash+0x68/0x8c) from [<80206e00>] (alg_test+0x7c/0x1b8)
[<80206e00>] (alg_test+0x7c/0x1b8) from [<80204cfc>] (cryptomgr_test+0x40/0x48)
[<80204cfc>] (cryptomgr_test+0x40/0x48) from [<80089544>] (kthread+0x80/0x88)
[<80089544>] (kthread+0x80/0x88) from [<80040a40>] (kernel_thread_exit+0x0/0x8)
Code: e59f0010 e1a01003 eb126a8d e3a03000 (e5833000)
---[ end trace d52a403a1d1eaa86 ]---

Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com>
Signed-off-by: Terry Lv <r65388@freescale.com>
wandboard-org pushed a commit to wandboard-org/linux that referenced this pull request Nov 6, 2013
Multiple function in asynchronous hashing use a saved-state block,
a.k.a. struct caam_hash_state, which holds a stash of information
between requests (init/update/final). Certain values in this state
block are loaded for processing using an inline-if, and when this
is done, the potential for uninitialized data can pose conflicts.
Therefore, this patch improves initialization of state data to
prevent false assignments using uninitialized data in the state block.

This patch addresses the following traceback, originating in
ahash_final_ctx(), although a problem like this could certainly
exhibit other symptoms:

kernel BUG at arch/arm/mm/dma-mapping.c:465!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 80004000
[00000000] *pgd=00000000
Internal error: Oops: 805 [#1] PREEMPT SMP
Modules linked in:
CPU: 0    Not tainted  (3.0.15-01752-gdd441b9-dirty torvalds#40)
PC is at __bug+0x1c/0x28
LR is at __bug+0x18/0x28
pc : [<80043240>]    lr : [<8004323c>]    psr: 60000013
sp : e423fd98  ip : 60000013  fp : 0000001c
r10: e4191b84  r9 : 00000020  r8 : 00000009
r7 : 88005038  r6 : 00000001  r5 : 2d676572  r4 : e4191a60
r3 : 00000000  r2 : 00000001  r1 : 60000093  r0 : 00000033
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c53c7d  Table: 1000404a  DAC: 00000015
Process cryptomgr_test (pid: 1306, stack limit = 0xe423e2f0)
Stack: (0xe423fd98 to 0xe4240000)
fd80:                                                       11807fd1 80048544
fda0: 88005000 e4191a00 e5178040 8039dda0 00000000 00000014 2d676572 e419100
fdc0: 88005018 e4191a60 00100100 e4191a00 00000000 8039ce0c e423fea8 00000007
fde0: e4191a00 e4227000 e5178000 8039ce18 e419183c 80203808 80a94a44 00000006
fe00: 00000000 80207180 00000000 00000006 e423ff08 00000000 00000007 e5178000
fe20: e41918a4 80a949b4 8c4844e2 00000000 00000049 74227000 8c4844e2 00000e90
fe40: 0000000e 74227e90 ffff8c58 80ac29e0 e423fed4 8006a350 8c81625c e423ff5c
fe60: 00008576 e4002500 00000003 00030010 e4002500 00000003 e5180000 e4002500
fe80: e5178000 800e6d24 007fffff 00000000 00000010 e4001280 e4002500 60000013
fea0: 000000d0 804df078 00000000 00000000 00000000 00000000 00000000 00000000
fec0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 e4227000 e4226000 e4753000 e4752000 e40a5000 e40a4000
ff00: e41e7000 e41e6000 00000000 00000000 00000000 e423ff14 e423ff14 00000000
ff20: 00000400 804f9080 e5178000 e4db0b40 00000000 e4db0b80 0000047c 00000400
ff40: 00000000 8020758c 00000400 ffffffff 0000008a 00000000 e4db0b40 80206e00
ff60: e4049dbc 00000000 00000000 00000003 e423ffa4 80062978 e41a8bfc 00000000
ff80: 00000000 e4049db4 00000013 e4049db0 00000013 00000000 00000000 00000000
ffa0: e4db0b40 e4db0b40 80204cbc 00000013 00000000 00000000 00000000 80204cfc
ffc0: e4049da0 80089544 80040a40 00000000 e4db0b40 00000000 00000000 00000000
ffe0: e423ffe0 e423ffe0 e4049da0 800894c4 80040a40 80040a40 00000000 00000000
[<80043240>] (__bug+0x1c/0x28) from [<80048544>] (___dma_single_dev_to_cpu+0x84)
[<80048544>] (___dma_single_dev_to_cpu+0x84/0x94) from [<8039dda0>] (ahash_fina)
[<8039dda0>] (ahash_final_ctx+0x180/0x428) from [<8039ce18>] (ahash_final+0xc/0)
[<8039ce18>] (ahash_final+0xc/0x10) from [<80203808>] (crypto_ahash_op+0x28/0xc)
[<80203808>] (crypto_ahash_op+0x28/0xc0) from [<80207180>] (test_hash+0x214/0x5)
[<80207180>] (test_hash+0x214/0x5b8) from [<8020758c>] (alg_test_hash+0x68/0x8c)
[<8020758c>] (alg_test_hash+0x68/0x8c) from [<80206e00>] (alg_test+0x7c/0x1b8)
[<80206e00>] (alg_test+0x7c/0x1b8) from [<80204cfc>] (cryptomgr_test+0x40/0x48)
[<80204cfc>] (cryptomgr_test+0x40/0x48) from [<80089544>] (kthread+0x80/0x88)
[<80089544>] (kthread+0x80/0x88) from [<80040a40>] (kernel_thread_exit+0x0/0x8)
Code: e59f0010 e1a01003 eb126a8d e3a03000 (e5833000)
---[ end trace d52a403a1d1eaa86 ]---

Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com>
Signed-off-by: Terry Lv <r65388@freescale.com>
kees pushed a commit to kees/linux that referenced this pull request Nov 8, 2013
BugLink: http://bugs.launchpad.net/bugs/1164646

commit 5e4ba61 upstream.

Martin Storsjö reports that the sequence:

        ee312ac1        vsub.f32        s4, s3, s2
        ee702ac0        vsub.f32        s5, s1, s0
        e59f0028        ldr             r0, [pc, torvalds#40]
        ee111a90        vmov            r1, s3

on Raspberry Pi (implementor 41 architecture 1 part 20 variant b rev 5)
where s3 is a denormal and s2 is zero results in incorrect behaviour -
the instruction "vsub.f32 s5, s1, s0" is not executed:

        VFP: bounce: trigger ee111a90 fpexc d0000780
        VFP: emulate: INST=0xee312ac1 SCR=0x00000000
        ...

As we can see, the instruction triggering the exception is the "vmov"
instruction, and we emulate the "vsub.f32 s4, s3, s2" but fail to
properly take account of the FPEXC_FP2V flag in FPEXC.  This is because
the test for the second instruction register being valid is bogus, and
will always skip emulation of the second instruction.

Reported-by: Martin Storsjö <martin@martin.st>
Tested-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Steve Conklin <sconklin@canonical.com>
shr-buildhost pushed a commit to shr-distribution/linux that referenced this pull request Nov 30, 2013
commit 5e4ba61 upstream.

Martin Storsjö reports that the sequence:

        ee312ac1        vsub.f32        s4, s3, s2
        ee702ac0        vsub.f32        s5, s1, s0
        e59f0028        ldr             r0, [pc, torvalds#40]
        ee111a90        vmov            r1, s3

on Raspberry Pi (implementor 41 architecture 1 part 20 variant b rev 5)
where s3 is a denormal and s2 is zero results in incorrect behaviour -
the instruction "vsub.f32 s5, s1, s0" is not executed:

        VFP: bounce: trigger ee111a90 fpexc d0000780
        VFP: emulate: INST=0xee312ac1 SCR=0x00000000
        ...

As we can see, the instruction triggering the exception is the "vmov"
instruction, and we emulate the "vsub.f32 s4, s3, s2" but fail to
properly take account of the FPEXC_FP2V flag in FPEXC.  This is because
the test for the second instruction register being valid is bogus, and
will always skip emulation of the second instruction.

Reported-by: Martin Storsjö <martin@martin.st>
Tested-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
tom3q pushed a commit to tom3q/linux that referenced this pull request Jun 10, 2014
Code should be indented using tabs rather than spaces (see CodingStyle)
and the canonical form to declare a constant static variable is using
"static const" rather than "const static". Fixes the following warnings
from checkpatch:

	$ scripts/checkpatch.pl -f drivers/gpu/drm/drm_plane_helper.c
	WARNING: storage class should be at the beginning of the declaration
	torvalds#40: FILE: drivers/gpu/drm/drm_plane_helper.c:40:
	+const static uint32_t safe_modeset_formats[] = {

	WARNING: please, no spaces at the start of a line
	torvalds#41: FILE: drivers/gpu/drm/drm_plane_helper.c:41:
	+       DRM_FORMAT_XRGB8888,$

	WARNING: please, no spaces at the start of a line
	torvalds#42: FILE: drivers/gpu/drm/drm_plane_helper.c:42:
	+       DRM_FORMAT_ARGB8888,$

Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Gnurou pushed a commit to Gnurou/linux that referenced this pull request Jul 8, 2014
After unbinding the driver memory was corrupted by double free of
clk_lookup structure. This lead to OOPS when re-binding the driver
again.

The driver allocated memory for 'clk_lookup' with devm_kzalloc. During
driver removal this memory was freed twice: once by clkdev_drop() and
second by devm code.

Kernel panic log:
[   30.839284] Unable to handle kernel paging request at virtual address 5f343173
[   30.846476] pgd = dee14000
[   30.849165] [5f343173] *pgd=00000000
[   30.852703] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
[   30.858166] Modules linked in:
[   30.861208] CPU: 0 PID: 1 Comm: bash Not tainted 3.16.0-rc2-00239-g94bdf617b07e-dirty torvalds#40
[   30.869364] task: df478000 ti: df480000 task.ti: df480000
[   30.874752] PC is at clkdev_add+0x2c/0x38
[   30.878738] LR is at clkdev_add+0x18/0x38
[   30.882732] pc : [<c0350908>]    lr : [<c03508f4>]    psr: 60000013
[   30.882732] sp : df481e78  ip : 00000001  fp : c0700ed8
[   30.894187] r10: 0000000c  r9 : 00000000  r8 : c07b0e3c
[   30.899396] r7 : 00000002  r6 : df45f9d0  r5 : df421390  r4 : c0700d6c
[   30.905906] r3 : 5f343173  r2 : c0700d84  r1 : 60000013  r0 : c0700d6c
[   30.912417] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   30.919534] Control: 10c53c7d  Table: 5ee1406a  DAC: 00000015
[   30.925262] Process bash (pid: 1, stack limit = 0xdf480240)
[   30.930817] Stack: (0xdf481e78 to 0xdf482000)
[   30.935159] 1e60:                                                       00001000 df6de610
[   30.943321] 1e80: df7f4558 c0355650 c05ec6ec c0700eb0 df6de600 df7f4510 dec9d69c 00000014
[   30.951480] 1ea0: 00167b48 df6de610 c0700e30 c0713518 00000000 c0700e30 dec9d69c 00000006
[   30.959639] 1ec0: 00167b48 c02c1b7c c02c1b64 df6de610 c07aff48 c02c0420 c06fb150 c047cc20
[   30.967798] 1ee0: df6de610 df6de610 c0700e30 df6de644 c06fb150 0000000c dec9d690 c02bef90
[   30.975957] 1f00: dec9c6c0 dece4c00 df481f80 dece4c00 0000000c c02be73c 0000000c c016ca8c
[   30.984116] 1f20: c016ca48 00000000 00000000 c016c1f4 00000000 00000000 b6f18000 df481f80
[   30.992276] 1f40: df7f66c0 0000000c df480000 df480000 b6f18000 c011094c df47839c 60000013
[   31.000435] 1f60: 00000000 00000000 df7f66c0 df7f66c0 0000000c df480000 b6f18000 c0110dd4
[   31.008594] 1f80: 00000000 00000000 0000000c b6ec05d8 0000000c b6f18000 00000004 c000f2a8
[   31.016753] 1fa0: 00001000 c000f0e0 b6ec05d8 0000000c 00000001 b6f18000 0000000c 00000000
[   31.024912] 1fc0: b6ec05d8 0000000c b6f18000 00000004 0000000c 00000001 00000000 00167b48
[   31.033071] 1fe0: 00000000 bed83a80 b6e004f0 b6e5122c 60000010 00000001 ffffffff ffffffff
[   31.041248] [<c0350908>] (clkdev_add) from [<c0355650>] (s2mps11_clk_probe+0x2b4/0x3b4)
[   31.049223] [<c0355650>] (s2mps11_clk_probe) from [<c02c1b7c>] (platform_drv_probe+0x18/0x48)
[   31.057728] [<c02c1b7c>] (platform_drv_probe) from [<c02c0420>] (driver_probe_device+0x13c/0x384)
[   31.066579] [<c02c0420>] (driver_probe_device) from [<c02bef90>] (bind_store+0x88/0xd8)
[   31.074564] [<c02bef90>] (bind_store) from [<c02be73c>] (drv_attr_store+0x20/0x2c)
[   31.082118] [<c02be73c>] (drv_attr_store) from [<c016ca8c>] (sysfs_kf_write+0x44/0x48)
[   31.090016] [<c016ca8c>] (sysfs_kf_write) from [<c016c1f4>] (kernfs_fop_write+0xc0/0x17c)
[   31.098176] [<c016c1f4>] (kernfs_fop_write) from [<c011094c>] (vfs_write+0xa0/0x1c4)
[   31.105899] [<c011094c>] (vfs_write) from [<c0110dd4>] (SyS_write+0x40/0x8c)
[   31.112931] [<c0110dd4>] (SyS_write) from [<c000f0e0>] (ret_fast_syscall+0x0/0x3c)
[   31.120481] Code: e2842018 e584501c e1a00004 e885000c (e5835000)
[   31.126596] ---[ end trace efad45bfa3a61b05 ]---
[   31.131181] Kernel panic - not syncing: Fatal exception
[   31.136368] CPU1: stopping
[   31.139054] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D       3.16.0-rc2-00239-g94bdf617b07e-dirty torvalds#40
[   31.148697] [<c0016480>] (unwind_backtrace) from [<c0012950>] (show_stack+0x10/0x14)
[   31.156419] [<c0012950>] (show_stack) from [<c0480db8>] (dump_stack+0x80/0xcc)
[   31.163622] [<c0480db8>] (dump_stack) from [<c001499c>] (handle_IPI+0x130/0x15c)
[   31.170998] [<c001499c>] (handle_IPI) from [<c000862c>] (gic_handle_irq+0x60/0x68)
[   31.178549] [<c000862c>] (gic_handle_irq) from [<c0013480>] (__irq_svc+0x40/0x70)
[   31.186009] Exception stack(0xdf4bdf88 to 0xdf4bdfd0)
[   31.191046] df80:                   ffffffed 00000000 00000000 00000000 df4bc000 c06d042c
[   31.199207] dfa0: 00000000 ffffffed c06d03c0 00000000 c070c288 00000000 00000000 df4bdfd0
[   31.207363] dfc0: c0010324 c0010328 60000013 ffffffff
[   31.212402] [<c0013480>] (__irq_svc) from [<c0010328>] (arch_cpu_idle+0x28/0x30)
[   31.219783] [<c0010328>] (arch_cpu_idle) from [<c005f150>] (cpu_startup_entry+0x2c4/0x3f0)
[   31.228027] [<c005f150>] (cpu_startup_entry) from [<400086c4>] (0x400086c4)
[   31.234968] ---[ end Kernel panic - not syncing: Fatal exception

Fixes: 7cc560d ("clk: s2mps11: Add support for s2mps11")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
heftig referenced this pull request in zen-kernel/zen-kernel Jul 18, 2014
commit 2a96dfa upstream.

After unbinding the driver memory was corrupted by double free of
clk_lookup structure. This lead to OOPS when re-binding the driver
again.

The driver allocated memory for 'clk_lookup' with devm_kzalloc. During
driver removal this memory was freed twice: once by clkdev_drop() and
second by devm code.

Kernel panic log:
[   30.839284] Unable to handle kernel paging request at virtual address 5f343173
[   30.846476] pgd = dee14000
[   30.849165] [5f343173] *pgd=00000000
[   30.852703] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
[   30.858166] Modules linked in:
[   30.861208] CPU: 0 PID: 1 Comm: bash Not tainted 3.16.0-rc2-00239-g94bdf617b07e-dirty #40
[   30.869364] task: df478000 ti: df480000 task.ti: df480000
[   30.874752] PC is at clkdev_add+0x2c/0x38
[   30.878738] LR is at clkdev_add+0x18/0x38
[   30.882732] pc : [<c0350908>]    lr : [<c03508f4>]    psr: 60000013
[   30.882732] sp : df481e78  ip : 00000001  fp : c0700ed8
[   30.894187] r10: 0000000c  r9 : 00000000  r8 : c07b0e3c
[   30.899396] r7 : 00000002  r6 : df45f9d0  r5 : df421390  r4 : c0700d6c
[   30.905906] r3 : 5f343173  r2 : c0700d84  r1 : 60000013  r0 : c0700d6c
[   30.912417] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   30.919534] Control: 10c53c7d  Table: 5ee1406a  DAC: 00000015
[   30.925262] Process bash (pid: 1, stack limit = 0xdf480240)
[   30.930817] Stack: (0xdf481e78 to 0xdf482000)
[   30.935159] 1e60:                                                       00001000 df6de610
[   30.943321] 1e80: df7f4558 c0355650 c05ec6ec c0700eb0 df6de600 df7f4510 dec9d69c 00000014
[   30.951480] 1ea0: 00167b48 df6de610 c0700e30 c0713518 00000000 c0700e30 dec9d69c 00000006
[   30.959639] 1ec0: 00167b48 c02c1b7c c02c1b64 df6de610 c07aff48 c02c0420 c06fb150 c047cc20
[   30.967798] 1ee0: df6de610 df6de610 c0700e30 df6de644 c06fb150 0000000c dec9d690 c02bef90
[   30.975957] 1f00: dec9c6c0 dece4c00 df481f80 dece4c00 0000000c c02be73c 0000000c c016ca8c
[   30.984116] 1f20: c016ca48 00000000 00000000 c016c1f4 00000000 00000000 b6f18000 df481f80
[   30.992276] 1f40: df7f66c0 0000000c df480000 df480000 b6f18000 c011094c df47839c 60000013
[   31.000435] 1f60: 00000000 00000000 df7f66c0 df7f66c0 0000000c df480000 b6f18000 c0110dd4
[   31.008594] 1f80: 00000000 00000000 0000000c b6ec05d8 0000000c b6f18000 00000004 c000f2a8
[   31.016753] 1fa0: 00001000 c000f0e0 b6ec05d8 0000000c 00000001 b6f18000 0000000c 00000000
[   31.024912] 1fc0: b6ec05d8 0000000c b6f18000 00000004 0000000c 00000001 00000000 00167b48
[   31.033071] 1fe0: 00000000 bed83a80 b6e004f0 b6e5122c 60000010 00000001 ffffffff ffffffff
[   31.041248] [<c0350908>] (clkdev_add) from [<c0355650>] (s2mps11_clk_probe+0x2b4/0x3b4)
[   31.049223] [<c0355650>] (s2mps11_clk_probe) from [<c02c1b7c>] (platform_drv_probe+0x18/0x48)
[   31.057728] [<c02c1b7c>] (platform_drv_probe) from [<c02c0420>] (driver_probe_device+0x13c/0x384)
[   31.066579] [<c02c0420>] (driver_probe_device) from [<c02bef90>] (bind_store+0x88/0xd8)
[   31.074564] [<c02bef90>] (bind_store) from [<c02be73c>] (drv_attr_store+0x20/0x2c)
[   31.082118] [<c02be73c>] (drv_attr_store) from [<c016ca8c>] (sysfs_kf_write+0x44/0x48)
[   31.090016] [<c016ca8c>] (sysfs_kf_write) from [<c016c1f4>] (kernfs_fop_write+0xc0/0x17c)
[   31.098176] [<c016c1f4>] (kernfs_fop_write) from [<c011094c>] (vfs_write+0xa0/0x1c4)
[   31.105899] [<c011094c>] (vfs_write) from [<c0110dd4>] (SyS_write+0x40/0x8c)
[   31.112931] [<c0110dd4>] (SyS_write) from [<c000f0e0>] (ret_fast_syscall+0x0/0x3c)
[   31.120481] Code: e2842018 e584501c e1a00004 e885000c (e5835000)
[   31.126596] ---[ end trace efad45bfa3a61b05 ]---
[   31.131181] Kernel panic - not syncing: Fatal exception
[   31.136368] CPU1: stopping
[   31.139054] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D       3.16.0-rc2-00239-g94bdf617b07e-dirty #40
[   31.148697] [<c0016480>] (unwind_backtrace) from [<c0012950>] (show_stack+0x10/0x14)
[   31.156419] [<c0012950>] (show_stack) from [<c0480db8>] (dump_stack+0x80/0xcc)
[   31.163622] [<c0480db8>] (dump_stack) from [<c001499c>] (handle_IPI+0x130/0x15c)
[   31.170998] [<c001499c>] (handle_IPI) from [<c000862c>] (gic_handle_irq+0x60/0x68)
[   31.178549] [<c000862c>] (gic_handle_irq) from [<c0013480>] (__irq_svc+0x40/0x70)
[   31.186009] Exception stack(0xdf4bdf88 to 0xdf4bdfd0)
[   31.191046] df80:                   ffffffed 00000000 00000000 00000000 df4bc000 c06d042c
[   31.199207] dfa0: 00000000 ffffffed c06d03c0 00000000 c070c288 00000000 00000000 df4bdfd0
[   31.207363] dfc0: c0010324 c0010328 60000013 ffffffff
[   31.212402] [<c0013480>] (__irq_svc) from [<c0010328>] (arch_cpu_idle+0x28/0x30)
[   31.219783] [<c0010328>] (arch_cpu_idle) from [<c005f150>] (cpu_startup_entry+0x2c4/0x3f0)
[   31.228027] [<c005f150>] (cpu_startup_entry) from [<400086c4>] (0x400086c4)
[   31.234968] ---[ end Kernel panic - not syncing: Fatal exception

Fixes: 7cc560d ("clk: s2mps11: Add support for s2mps11")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
koct9i pushed a commit to koct9i/linux that referenced this pull request Sep 23, 2014
ERROR: code indent should use tabs where possible
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
torvalds#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
aryabinin pushed a commit to aryabinin/linux that referenced this pull request Sep 24, 2014
ERROR: code indent should use tabs where possible
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
torvalds#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ddstreet pushed a commit to ddstreet/linux that referenced this pull request Sep 25, 2014
ERROR: code indent should use tabs where possible
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
torvalds#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
koct9i pushed a commit to koct9i/linux that referenced this pull request Sep 27, 2014
ERROR: code indent should use tabs where possible
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
torvalds#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
tom3q pushed a commit to tom3q/linux that referenced this pull request Oct 2, 2014
ERROR: code indent should use tabs where possible
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
torvalds#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
aryabinin pushed a commit to aryabinin/linux that referenced this pull request Oct 3, 2014
ERROR: code indent should use tabs where possible
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
torvalds#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
pstglia pushed a commit to pstglia/linux that referenced this pull request Oct 6, 2014
Code should be indented using tabs rather than spaces (see CodingStyle)
and the canonical form to declare a constant static variable is using
"static const" rather than "const static". Fixes the following warnings
from checkpatch:

	$ scripts/checkpatch.pl -f drivers/gpu/drm/drm_plane_helper.c
	WARNING: storage class should be at the beginning of the declaration
	torvalds#40: FILE: drivers/gpu/drm/drm_plane_helper.c:40:
	+const static uint32_t safe_modeset_formats[] = {

	WARNING: please, no spaces at the start of a line
	torvalds#41: FILE: drivers/gpu/drm/drm_plane_helper.c:41:
	+       DRM_FORMAT_XRGB8888,$

	WARNING: please, no spaces at the start of a line
	torvalds#42: FILE: drivers/gpu/drm/drm_plane_helper.c:42:
	+       DRM_FORMAT_ARGB8888,$

Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
pstglia pushed a commit to pstglia/linux that referenced this pull request Oct 6, 2014
After unbinding the driver memory was corrupted by double free of
clk_lookup structure. This lead to OOPS when re-binding the driver
again.

The driver allocated memory for 'clk_lookup' with devm_kzalloc. During
driver removal this memory was freed twice: once by clkdev_drop() and
second by devm code.

Kernel panic log:
[   30.839284] Unable to handle kernel paging request at virtual address 5f343173
[   30.846476] pgd = dee14000
[   30.849165] [5f343173] *pgd=00000000
[   30.852703] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
[   30.858166] Modules linked in:
[   30.861208] CPU: 0 PID: 1 Comm: bash Not tainted 3.16.0-rc2-00239-g94bdf617b07e-dirty torvalds#40
[   30.869364] task: df478000 ti: df480000 task.ti: df480000
[   30.874752] PC is at clkdev_add+0x2c/0x38
[   30.878738] LR is at clkdev_add+0x18/0x38
[   30.882732] pc : [<c0350908>]    lr : [<c03508f4>]    psr: 60000013
[   30.882732] sp : df481e78  ip : 00000001  fp : c0700ed8
[   30.894187] r10: 0000000c  r9 : 00000000  r8 : c07b0e3c
[   30.899396] r7 : 00000002  r6 : df45f9d0  r5 : df421390  r4 : c0700d6c
[   30.905906] r3 : 5f343173  r2 : c0700d84  r1 : 60000013  r0 : c0700d6c
[   30.912417] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   30.919534] Control: 10c53c7d  Table: 5ee1406a  DAC: 00000015
[   30.925262] Process bash (pid: 1, stack limit = 0xdf480240)
[   30.930817] Stack: (0xdf481e78 to 0xdf482000)
[   30.935159] 1e60:                                                       00001000 df6de610
[   30.943321] 1e80: df7f4558 c0355650 c05ec6ec c0700eb0 df6de600 df7f4510 dec9d69c 00000014
[   30.951480] 1ea0: 00167b48 df6de610 c0700e30 c0713518 00000000 c0700e30 dec9d69c 00000006
[   30.959639] 1ec0: 00167b48 c02c1b7c c02c1b64 df6de610 c07aff48 c02c0420 c06fb150 c047cc20
[   30.967798] 1ee0: df6de610 df6de610 c0700e30 df6de644 c06fb150 0000000c dec9d690 c02bef90
[   30.975957] 1f00: dec9c6c0 dece4c00 df481f80 dece4c00 0000000c c02be73c 0000000c c016ca8c
[   30.984116] 1f20: c016ca48 00000000 00000000 c016c1f4 00000000 00000000 b6f18000 df481f80
[   30.992276] 1f40: df7f66c0 0000000c df480000 df480000 b6f18000 c011094c df47839c 60000013
[   31.000435] 1f60: 00000000 00000000 df7f66c0 df7f66c0 0000000c df480000 b6f18000 c0110dd4
[   31.008594] 1f80: 00000000 00000000 0000000c b6ec05d8 0000000c b6f18000 00000004 c000f2a8
[   31.016753] 1fa0: 00001000 c000f0e0 b6ec05d8 0000000c 00000001 b6f18000 0000000c 00000000
[   31.024912] 1fc0: b6ec05d8 0000000c b6f18000 00000004 0000000c 00000001 00000000 00167b48
[   31.033071] 1fe0: 00000000 bed83a80 b6e004f0 b6e5122c 60000010 00000001 ffffffff ffffffff
[   31.041248] [<c0350908>] (clkdev_add) from [<c0355650>] (s2mps11_clk_probe+0x2b4/0x3b4)
[   31.049223] [<c0355650>] (s2mps11_clk_probe) from [<c02c1b7c>] (platform_drv_probe+0x18/0x48)
[   31.057728] [<c02c1b7c>] (platform_drv_probe) from [<c02c0420>] (driver_probe_device+0x13c/0x384)
[   31.066579] [<c02c0420>] (driver_probe_device) from [<c02bef90>] (bind_store+0x88/0xd8)
[   31.074564] [<c02bef90>] (bind_store) from [<c02be73c>] (drv_attr_store+0x20/0x2c)
[   31.082118] [<c02be73c>] (drv_attr_store) from [<c016ca8c>] (sysfs_kf_write+0x44/0x48)
[   31.090016] [<c016ca8c>] (sysfs_kf_write) from [<c016c1f4>] (kernfs_fop_write+0xc0/0x17c)
[   31.098176] [<c016c1f4>] (kernfs_fop_write) from [<c011094c>] (vfs_write+0xa0/0x1c4)
[   31.105899] [<c011094c>] (vfs_write) from [<c0110dd4>] (SyS_write+0x40/0x8c)
[   31.112931] [<c0110dd4>] (SyS_write) from [<c000f0e0>] (ret_fast_syscall+0x0/0x3c)
[   31.120481] Code: e2842018 e584501c e1a00004 e885000c (e5835000)
[   31.126596] ---[ end trace efad45bfa3a61b05 ]---
[   31.131181] Kernel panic - not syncing: Fatal exception
[   31.136368] CPU1: stopping
[   31.139054] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D       3.16.0-rc2-00239-g94bdf617b07e-dirty torvalds#40
[   31.148697] [<c0016480>] (unwind_backtrace) from [<c0012950>] (show_stack+0x10/0x14)
[   31.156419] [<c0012950>] (show_stack) from [<c0480db8>] (dump_stack+0x80/0xcc)
[   31.163622] [<c0480db8>] (dump_stack) from [<c001499c>] (handle_IPI+0x130/0x15c)
[   31.170998] [<c001499c>] (handle_IPI) from [<c000862c>] (gic_handle_irq+0x60/0x68)
[   31.178549] [<c000862c>] (gic_handle_irq) from [<c0013480>] (__irq_svc+0x40/0x70)
[   31.186009] Exception stack(0xdf4bdf88 to 0xdf4bdfd0)
[   31.191046] df80:                   ffffffed 00000000 00000000 00000000 df4bc000 c06d042c
[   31.199207] dfa0: 00000000 ffffffed c06d03c0 00000000 c070c288 00000000 00000000 df4bdfd0
[   31.207363] dfc0: c0010324 c0010328 60000013 ffffffff
[   31.212402] [<c0013480>] (__irq_svc) from [<c0010328>] (arch_cpu_idle+0x28/0x30)
[   31.219783] [<c0010328>] (arch_cpu_idle) from [<c005f150>] (cpu_startup_entry+0x2c4/0x3f0)
[   31.228027] [<c005f150>] (cpu_startup_entry) from [<400086c4>] (0x400086c4)
[   31.234968] ---[ end Kernel panic - not syncing: Fatal exception

Fixes: 7cc560d ("clk: s2mps11: Add support for s2mps11")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
chrillomat pushed a commit to chrillomat/linux that referenced this pull request Oct 6, 2014
this patch can fix NULL pointer issue in GPU kernel driver with the following log

[<7f240438>] (gckEVENT_AddList+0x0/0x810 [galcore]) from [<7f239ebc>] (gckCOMMAND_Commit+0xf28/0x118c [galcore])
[<7f238f94>] (gckCOMMAND_Commit+0x0/0x118c [galcore]) from [<7f2362dc>] (gckKERNEL_Dispatch+0x120c/0x24e4 [galcore])
[<7f2350d0>] (gckKERNEL_Dispatch+0x0/0x24e4 [galcore]) from [<7f222280>] (drv_ioctl+0x390/0x540 [galcore])
[<7f221ef0>] (drv_ioctl+0x0/0x540 [galcore]) from [<800facd0>] (vfs_ioctl+0x30/0x44)

The false code is at 0x217bc where the 0-pointer happens (r3 = 0)

gcuVIDMEM_NODE_PTR node = (gcuVIDMEM_NODE_PTR)(gcmUINT64_TO_PTR(Record->info.u.FreeVideoMemory.node));
   217b8:                e5953028             ldr           r3, [r5, torvalds#40]         ; 0x28

                if (node->VidMem.memory->object.type == gcvOBJ_VIDMEM)
   217bc:                e5932000             ldr           r2, [r3]
   217c0:                e5922000             ldr           r2, [r2]
   217c4:                e152000a             cmp       r2, sl
                {
                     gcmkVERIFY_OK(gckKERNEL_RemoveProcessDB(Event->kernel,

Date: Apr 23, 2014
Signed-off-by: Xianzhong <b07117@freescale.com>
Acked-by: Jason Liu
bengal pushed a commit to bengal/linux that referenced this pull request Oct 7, 2014
ERROR: code indent should use tabs where possible
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
torvalds#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
torvalds#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
torvalds#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kees pushed a commit to kees/linux that referenced this pull request Oct 9, 2014
BugLink: http://bugs.launchpad.net/bugs/1356913

commit 2a96dfa upstream.

After unbinding the driver memory was corrupted by double free of
clk_lookup structure. This lead to OOPS when re-binding the driver
again.

The driver allocated memory for 'clk_lookup' with devm_kzalloc. During
driver removal this memory was freed twice: once by clkdev_drop() and
second by devm code.

Kernel panic log:
[   30.839284] Unable to handle kernel paging request at virtual address 5f343173
[   30.846476] pgd = dee14000
[   30.849165] [5f343173] *pgd=00000000
[   30.852703] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
[   30.858166] Modules linked in:
[   30.861208] CPU: 0 PID: 1 Comm: bash Not tainted 3.16.0-rc2-00239-g94bdf617b07e-dirty torvalds#40
[   30.869364] task: df478000 ti: df480000 task.ti: df480000
[   30.874752] PC is at clkdev_add+0x2c/0x38
[   30.878738] LR is at clkdev_add+0x18/0x38
[   30.882732] pc : [<c0350908>]    lr : [<c03508f4>]    psr: 60000013
[   30.882732] sp : df481e78  ip : 00000001  fp : c0700ed8
[   30.894187] r10: 0000000c  r9 : 00000000  r8 : c07b0e3c
[   30.899396] r7 : 00000002  r6 : df45f9d0  r5 : df421390  r4 : c0700d6c
[   30.905906] r3 : 5f343173  r2 : c0700d84  r1 : 60000013  r0 : c0700d6c
[   30.912417] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   30.919534] Control: 10c53c7d  Table: 5ee1406a  DAC: 00000015
[   30.925262] Process bash (pid: 1, stack limit = 0xdf480240)
[   30.930817] Stack: (0xdf481e78 to 0xdf482000)
[   30.935159] 1e60:                                                       00001000 df6de610
[   30.943321] 1e80: df7f4558 c0355650 c05ec6ec c0700eb0 df6de600 df7f4510 dec9d69c 00000014
[   30.951480] 1ea0: 00167b48 df6de610 c0700e30 c0713518 00000000 c0700e30 dec9d69c 00000006
[   30.959639] 1ec0: 00167b48 c02c1b7c c02c1b64 df6de610 c07aff48 c02c0420 c06fb150 c047cc20
[   30.967798] 1ee0: df6de610 df6de610 c0700e30 df6de644 c06fb150 0000000c dec9d690 c02bef90
[   30.975957] 1f00: dec9c6c0 dece4c00 df481f80 dece4c00 0000000c c02be73c 0000000c c016ca8c
[   30.984116] 1f20: c016ca48 00000000 00000000 c016c1f4 00000000 00000000 b6f18000 df481f80
[   30.992276] 1f40: df7f66c0 0000000c df480000 df480000 b6f18000 c011094c df47839c 60000013
[   31.000435] 1f60: 00000000 00000000 df7f66c0 df7f66c0 0000000c df480000 b6f18000 c0110dd4
[   31.008594] 1f80: 00000000 00000000 0000000c b6ec05d8 0000000c b6f18000 00000004 c000f2a8
[   31.016753] 1fa0: 00001000 c000f0e0 b6ec05d8 0000000c 00000001 b6f18000 0000000c 00000000
[   31.024912] 1fc0: b6ec05d8 0000000c b6f18000 00000004 0000000c 00000001 00000000 00167b48
[   31.033071] 1fe0: 00000000 bed83a80 b6e004f0 b6e5122c 60000010 00000001 ffffffff ffffffff
[   31.041248] [<c0350908>] (clkdev_add) from [<c0355650>] (s2mps11_clk_probe+0x2b4/0x3b4)
[   31.049223] [<c0355650>] (s2mps11_clk_probe) from [<c02c1b7c>] (platform_drv_probe+0x18/0x48)
[   31.057728] [<c02c1b7c>] (platform_drv_probe) from [<c02c0420>] (driver_probe_device+0x13c/0x384)
[   31.066579] [<c02c0420>] (driver_probe_device) from [<c02bef90>] (bind_store+0x88/0xd8)
[   31.074564] [<c02bef90>] (bind_store) from [<c02be73c>] (drv_attr_store+0x20/0x2c)
[   31.082118] [<c02be73c>] (drv_attr_store) from [<c016ca8c>] (sysfs_kf_write+0x44/0x48)
[   31.090016] [<c016ca8c>] (sysfs_kf_write) from [<c016c1f4>] (kernfs_fop_write+0xc0/0x17c)
[   31.098176] [<c016c1f4>] (kernfs_fop_write) from [<c011094c>] (vfs_write+0xa0/0x1c4)
[   31.105899] [<c011094c>] (vfs_write) from [<c0110dd4>] (SyS_write+0x40/0x8c)
[   31.112931] [<c0110dd4>] (SyS_write) from [<c000f0e0>] (ret_fast_syscall+0x0/0x3c)
[   31.120481] Code: e2842018 e584501c e1a00004 e885000c (e5835000)
[   31.126596] ---[ end trace efad45bfa3a61b05 ]---
[   31.131181] Kernel panic - not syncing: Fatal exception
[   31.136368] CPU1: stopping
[   31.139054] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D       3.16.0-rc2-00239-g94bdf617b07e-dirty torvalds#40
[   31.148697] [<c0016480>] (unwind_backtrace) from [<c0012950>] (show_stack+0x10/0x14)
[   31.156419] [<c0012950>] (show_stack) from [<c0480db8>] (dump_stack+0x80/0xcc)
[   31.163622] [<c0480db8>] (dump_stack) from [<c001499c>] (handle_IPI+0x130/0x15c)
[   31.170998] [<c001499c>] (handle_IPI) from [<c000862c>] (gic_handle_irq+0x60/0x68)
[   31.178549] [<c000862c>] (gic_handle_irq) from [<c0013480>] (__irq_svc+0x40/0x70)
[   31.186009] Exception stack(0xdf4bdf88 to 0xdf4bdfd0)
[   31.191046] df80:                   ffffffed 00000000 00000000 00000000 df4bc000 c06d042c
[   31.199207] dfa0: 00000000 ffffffed c06d03c0 00000000 c070c288 00000000 00000000 df4bdfd0
[   31.207363] dfc0: c0010324 c0010328 60000013 ffffffff
[   31.212402] [<c0013480>] (__irq_svc) from [<c0010328>] (arch_cpu_idle+0x28/0x30)
[   31.219783] [<c0010328>] (arch_cpu_idle) from [<c005f150>] (cpu_startup_entry+0x2c4/0x3f0)
[   31.228027] [<c005f150>] (cpu_startup_entry) from [<400086c4>] (0x400086c4)
[   31.234968] ---[ end Kernel panic - not syncing: Fatal exception

Fixes: 7cc560d ("clk: s2mps11: Add support for s2mps11")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
torvalds pushed a commit that referenced this pull request Oct 29, 2014
This patch wires up the new syscall sys_bpf() on powerpc.

Passes the tests in samples/bpf:

    #0 add+sub+mul OK
    #1 unreachable OK
    #2 unreachable2 OK
    #3 out of range jump OK
    #4 out of range jump2 OK
    #5 test1 ld_imm64 OK
    #6 test2 ld_imm64 OK
    #7 test3 ld_imm64 OK
    #8 test4 ld_imm64 OK
    #9 test5 ld_imm64 OK
    #10 no bpf_exit OK
    #11 loop (back-edge) OK
    #12 loop2 (back-edge) OK
    #13 conditional loop OK
    #14 read uninitialized register OK
    #15 read invalid register OK
    #16 program doesn't init R0 before exit OK
    #17 stack out of bounds OK
    #18 invalid call insn1 OK
    #19 invalid call insn2 OK
    #20 invalid function call OK
    #21 uninitialized stack1 OK
    #22 uninitialized stack2 OK
    #23 check valid spill/fill OK
    #24 check corrupted spill/fill OK
    #25 invalid src register in STX OK
    #26 invalid dst register in STX OK
    #27 invalid dst register in ST OK
    #28 invalid src register in LDX OK
    #29 invalid dst register in LDX OK
    #30 junk insn OK
    #31 junk insn2 OK
    #32 junk insn3 OK
    #33 junk insn4 OK
    #34 junk insn5 OK
    #35 misaligned read from stack OK
    #36 invalid map_fd for function call OK
    #37 don't check return value before access OK
    #38 access memory with incorrect alignment OK
    #39 sometimes access memory with incorrect alignment OK
    #40 jump test 1 OK
    #41 jump test 2 OK
    #42 jump test 3 OK
    #43 jump test 4 OK

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[mpe: test using samples/bpf]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
dabrace referenced this pull request in dabrace/linux Nov 10, 2014
This patch wires up the new syscall sys_bpf() on powerpc.

Passes the tests in samples/bpf:

    #0 add+sub+mul OK
    #1 unreachable OK
    #2 unreachable2 OK
    #3 out of range jump OK
    #4 out of range jump2 OK
    #5 test1 ld_imm64 OK
    #6 test2 ld_imm64 OK
    #7 test3 ld_imm64 OK
    #8 test4 ld_imm64 OK
    #9 test5 ld_imm64 OK
    #10 no bpf_exit OK
    #11 loop (back-edge) OK
    #12 loop2 (back-edge) OK
    #13 conditional loop OK
    #14 read uninitialized register OK
    #15 read invalid register OK
    #16 program doesn't init R0 before exit OK
    #17 stack out of bounds OK
    #18 invalid call insn1 OK
    #19 invalid call insn2 OK
    #20 invalid function call OK
    #21 uninitialized stack1 OK
    #22 uninitialized stack2 OK
    #23 check valid spill/fill OK
    #24 check corrupted spill/fill OK
    #25 invalid src register in STX OK
    #26 invalid dst register in STX OK
    #27 invalid dst register in ST OK
    #28 invalid src register in LDX OK
    #29 invalid dst register in LDX OK
    #30 junk insn OK
    #31 junk insn2 OK
    #32 junk insn3 OK
    #33 junk insn4 OK
    #34 junk insn5 OK
    #35 misaligned read from stack OK
    #36 invalid map_fd for function call OK
    #37 don't check return value before access OK
    #38 access memory with incorrect alignment OK
    #39 sometimes access memory with incorrect alignment OK
    #40 jump test 1 OK
    #41 jump test 2 OK
    #42 jump test 3 OK
    #43 jump test 4 OK

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[mpe: test using samples/bpf]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
alfonsotames pushed a commit to alfonsotames/linux that referenced this pull request Dec 15, 2014
this patch can fix NULL pointer issue in GPU kernel driver with the following log

[<7f240438>] (gckEVENT_AddList+0x0/0x810 [galcore]) from [<7f239ebc>] (gckCOMMAND_Commit+0xf28/0x118c [galcore])
[<7f238f94>] (gckCOMMAND_Commit+0x0/0x118c [galcore]) from [<7f2362dc>] (gckKERNEL_Dispatch+0x120c/0x24e4 [galcore])
[<7f2350d0>] (gckKERNEL_Dispatch+0x0/0x24e4 [galcore]) from [<7f222280>] (drv_ioctl+0x390/0x540 [galcore])
[<7f221ef0>] (drv_ioctl+0x0/0x540 [galcore]) from [<800facd0>] (vfs_ioctl+0x30/0x44)

The false code is at 0x217bc where the 0-pointer happens (r3 = 0)

gcuVIDMEM_NODE_PTR node = (gcuVIDMEM_NODE_PTR)(gcmUINT64_TO_PTR(Record->info.u.FreeVideoMemory.node));
   217b8:                e5953028             ldr           r3, [r5, torvalds#40]         ; 0x28

                if (node->VidMem.memory->object.type == gcvOBJ_VIDMEM)
   217bc:                e5932000             ldr           r2, [r3]
   217c0:                e5922000             ldr           r2, [r2]
   217c4:                e152000a             cmp       r2, sl
                {
                     gcmkVERIFY_OK(gckKERNEL_RemoveProcessDB(Event->kernel,

Date: Apr 23, 2014
Signed-off-by: Xianzhong <b07117@freescale.com>
Acked-by: Jason Liu
(cherry picked from commit fcde214)
(cherry picked from commit 952142648d76fce2663ef649d9f988f1b7809815)
(cherry picked from commit 9d7b33678f1f944f75644e958c3ceeb7f2e4bac9)
gthelen pushed a commit to gthelen/linux that referenced this pull request Apr 1, 2015
Sumeone needs to buy a tab key.

WARNING: please, no spaces at the start of a line
torvalds#29: FILE: security/tomoyo/util.c:951:
+       struct file *exe_file;$

WARNING: please, no spaces at the start of a line
torvalds#30: FILE: security/tomoyo/util.c:952:
+       const char *cp;$

WARNING: please, no spaces at the start of a line
torvalds#31: FILE: security/tomoyo/util.c:953:
+       struct mm_struct *mm = current->mm;$

WARNING: please, no spaces at the start of a line
torvalds#40: FILE: security/tomoyo/util.c:955:
+       if (!mm)$

WARNING: suspect code indent for conditional statements (7, 15)
torvalds#40: FILE: security/tomoyo/util.c:955:
+       if (!mm)
+	       return NULL;

WARNING: please, no spaces at the start of a line
torvalds#42: FILE: security/tomoyo/util.c:957:
+       exe_file = get_mm_exe_file(mm);$

WARNING: please, no spaces at the start of a line
torvalds#43: FILE: security/tomoyo/util.c:958:
+       if (!exe_file)$

WARNING: suspect code indent for conditional statements (7, 15)
torvalds#43: FILE: security/tomoyo/util.c:958:
+       if (!exe_file)
+	       return NULL;

WARNING: please, no spaces at the start of a line
torvalds#46: FILE: security/tomoyo/util.c:961:
+       cp = tomoyo_realpath_from_path(&exe_file->f_path);$

WARNING: please, no spaces at the start of a line
torvalds#47: FILE: security/tomoyo/util.c:962:
+       fput(exe_file);$

WARNING: please, no spaces at the start of a line
torvalds#48: FILE: security/tomoyo/util.c:963:
+       return cp;$

total: 0 errors, 11 warnings, 28 lines checked

./patches/tomoyo-reduce-mmap_sem-hold-for-mm-exe_file.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: James Morris <jmorris@namei.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 26, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Oct 27, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Oct 28, 2024
[ Upstream commit 60f07e2 ]

We use uprobe in aarch64_be, which we found the tracee task would exit
due to SIGILL when we enable the uprobe trace.
We can see the replace inst from uprobe is not correct in aarch big-endian.
As in Armv8-A, instruction fetches are always treated as little-endian,
we should treat the UPROBE_SWBP_INSN as little-endian。

The test case is as following。
bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null &
bash-4.4# cd /sys/kernel/debug/tracing/
bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events
bash-4.4# echo 1 > events/uprobes/enable
bash-4.4#
bash-4.4# ps
  PID TTY          TIME CMD
  140 ?        00:00:01 bash
  237 ?        00:00:00 ps
[1]+  Illegal instruction     ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null

which we debug use gdb as following:

bash-4.4# gdb attach 155
(gdb) disassemble send
Dump of assembler code for function send:
   0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]
   0x0000000000400c44 <+20>:    str     xzr, [sp, torvalds#48]
   0x0000000000400c48 <+24>:    add     x0, sp, #0x1b
   0x0000000000400c4c <+28>:    mov     w3, #0x0                 // #0
   0x0000000000400c50 <+32>:    mov     x2, #0x1                 // #1
   0x0000000000400c54 <+36>:    mov     x1, x0
   0x0000000000400c58 <+40>:    ldr     w0, [sp, torvalds#28]
   0x0000000000400c5c <+44>:    bl      0x405e10 <mq_send>
   0x0000000000400c60 <+48>:    str     w0, [sp, torvalds#60]
   0x0000000000400c64 <+52>:    ldr     w0, [sp, torvalds#60]
   0x0000000000400c68 <+56>:    ldp     x29, x30, [sp], torvalds#64
   0x0000000000400c6c <+60>:    ret
End of assembler dump.
(gdb) info b
No breakpoints or watchpoints.
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x0000000000400c30 in send ()
(gdb) x/10x 0x400c30
0x400c30 <send>:    0xd42000a0   0xfd030091      0xe01f00b9      0xe16f0039
0x400c40 <send+16>: 0xff1700f9   0xff1b00f9      0xe06f0091      0x03008052
0x400c50 <send+32>: 0x220080d2   0xe10300aa
(gdb) disassemble 0x400c30
Dump of assembler code for function send:
=> 0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]

Signed-off-by: junhua huang <huang.junhua@zte.com.cn>
Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels")
Signed-off-by: Sasha Levin <sashal@kernel.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Oct 29, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Oct 29, 2024
[ Upstream commit 60f07e2 ]

We use uprobe in aarch64_be, which we found the tracee task would exit
due to SIGILL when we enable the uprobe trace.
We can see the replace inst from uprobe is not correct in aarch big-endian.
As in Armv8-A, instruction fetches are always treated as little-endian,
we should treat the UPROBE_SWBP_INSN as little-endian。

The test case is as following。
bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null &
bash-4.4# cd /sys/kernel/debug/tracing/
bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events
bash-4.4# echo 1 > events/uprobes/enable
bash-4.4#
bash-4.4# ps
  PID TTY          TIME CMD
  140 ?        00:00:01 bash
  237 ?        00:00:00 ps
[1]+  Illegal instruction     ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null

which we debug use gdb as following:

bash-4.4# gdb attach 155
(gdb) disassemble send
Dump of assembler code for function send:
   0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]
   0x0000000000400c44 <+20>:    str     xzr, [sp, torvalds#48]
   0x0000000000400c48 <+24>:    add     x0, sp, #0x1b
   0x0000000000400c4c <+28>:    mov     w3, #0x0                 // #0
   0x0000000000400c50 <+32>:    mov     x2, #0x1                 // #1
   0x0000000000400c54 <+36>:    mov     x1, x0
   0x0000000000400c58 <+40>:    ldr     w0, [sp, torvalds#28]
   0x0000000000400c5c <+44>:    bl      0x405e10 <mq_send>
   0x0000000000400c60 <+48>:    str     w0, [sp, torvalds#60]
   0x0000000000400c64 <+52>:    ldr     w0, [sp, torvalds#60]
   0x0000000000400c68 <+56>:    ldp     x29, x30, [sp], torvalds#64
   0x0000000000400c6c <+60>:    ret
End of assembler dump.
(gdb) info b
No breakpoints or watchpoints.
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x0000000000400c30 in send ()
(gdb) x/10x 0x400c30
0x400c30 <send>:    0xd42000a0   0xfd030091      0xe01f00b9      0xe16f0039
0x400c40 <send+16>: 0xff1700f9   0xff1b00f9      0xe06f0091      0x03008052
0x400c50 <send+32>: 0x220080d2   0xe10300aa
(gdb) disassemble 0x400c30
Dump of assembler code for function send:
=> 0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]

Signed-off-by: junhua huang <huang.junhua@zte.com.cn>
Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels")
Signed-off-by: Sasha Levin <sashal@kernel.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Oct 30, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Oct 31, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Nov 1, 2024
[ Upstream commit 60f07e2 ]

We use uprobe in aarch64_be, which we found the tracee task would exit
due to SIGILL when we enable the uprobe trace.
We can see the replace inst from uprobe is not correct in aarch big-endian.
As in Armv8-A, instruction fetches are always treated as little-endian,
we should treat the UPROBE_SWBP_INSN as little-endian。

The test case is as following。
bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null &
bash-4.4# cd /sys/kernel/debug/tracing/
bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events
bash-4.4# echo 1 > events/uprobes/enable
bash-4.4#
bash-4.4# ps
  PID TTY          TIME CMD
  140 ?        00:00:01 bash
  237 ?        00:00:00 ps
[1]+  Illegal instruction     ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null

which we debug use gdb as following:

bash-4.4# gdb attach 155
(gdb) disassemble send
Dump of assembler code for function send:
   0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]
   0x0000000000400c44 <+20>:    str     xzr, [sp, torvalds#48]
   0x0000000000400c48 <+24>:    add     x0, sp, #0x1b
   0x0000000000400c4c <+28>:    mov     w3, #0x0                 // #0
   0x0000000000400c50 <+32>:    mov     x2, #0x1                 // #1
   0x0000000000400c54 <+36>:    mov     x1, x0
   0x0000000000400c58 <+40>:    ldr     w0, [sp, torvalds#28]
   0x0000000000400c5c <+44>:    bl      0x405e10 <mq_send>
   0x0000000000400c60 <+48>:    str     w0, [sp, torvalds#60]
   0x0000000000400c64 <+52>:    ldr     w0, [sp, torvalds#60]
   0x0000000000400c68 <+56>:    ldp     x29, x30, [sp], torvalds#64
   0x0000000000400c6c <+60>:    ret
End of assembler dump.
(gdb) info b
No breakpoints or watchpoints.
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x0000000000400c30 in send ()
(gdb) x/10x 0x400c30
0x400c30 <send>:    0xd42000a0   0xfd030091      0xe01f00b9      0xe16f0039
0x400c40 <send+16>: 0xff1700f9   0xff1b00f9      0xe06f0091      0x03008052
0x400c50 <send+32>: 0x220080d2   0xe10300aa
(gdb) disassemble 0x400c30
Dump of assembler code for function send:
=> 0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]

Signed-off-by: junhua huang <huang.junhua@zte.com.cn>
Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels")
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Nov 1, 2024
[ Upstream commit 60f07e2 ]

We use uprobe in aarch64_be, which we found the tracee task would exit
due to SIGILL when we enable the uprobe trace.
We can see the replace inst from uprobe is not correct in aarch big-endian.
As in Armv8-A, instruction fetches are always treated as little-endian,
we should treat the UPROBE_SWBP_INSN as little-endian。

The test case is as following。
bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null &
bash-4.4# cd /sys/kernel/debug/tracing/
bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events
bash-4.4# echo 1 > events/uprobes/enable
bash-4.4#
bash-4.4# ps
  PID TTY          TIME CMD
  140 ?        00:00:01 bash
  237 ?        00:00:00 ps
[1]+  Illegal instruction     ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null

which we debug use gdb as following:

bash-4.4# gdb attach 155
(gdb) disassemble send
Dump of assembler code for function send:
   0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]
   0x0000000000400c44 <+20>:    str     xzr, [sp, torvalds#48]
   0x0000000000400c48 <+24>:    add     x0, sp, #0x1b
   0x0000000000400c4c <+28>:    mov     w3, #0x0                 // #0
   0x0000000000400c50 <+32>:    mov     x2, #0x1                 // #1
   0x0000000000400c54 <+36>:    mov     x1, x0
   0x0000000000400c58 <+40>:    ldr     w0, [sp, torvalds#28]
   0x0000000000400c5c <+44>:    bl      0x405e10 <mq_send>
   0x0000000000400c60 <+48>:    str     w0, [sp, torvalds#60]
   0x0000000000400c64 <+52>:    ldr     w0, [sp, torvalds#60]
   0x0000000000400c68 <+56>:    ldp     x29, x30, [sp], torvalds#64
   0x0000000000400c6c <+60>:    ret
End of assembler dump.
(gdb) info b
No breakpoints or watchpoints.
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x0000000000400c30 in send ()
(gdb) x/10x 0x400c30
0x400c30 <send>:    0xd42000a0   0xfd030091      0xe01f00b9      0xe16f0039
0x400c40 <send+16>: 0xff1700f9   0xff1b00f9      0xe06f0091      0x03008052
0x400c50 <send+32>: 0x220080d2   0xe10300aa
(gdb) disassemble 0x400c30
Dump of assembler code for function send:
=> 0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]

Signed-off-by: junhua huang <huang.junhua@zte.com.cn>
Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels")
Signed-off-by: Sasha Levin <sashal@kernel.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 1, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 2, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 3, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 5, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 5, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 6, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 6, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 6, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 6, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 7, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 7, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 7, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 8, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Nov 8, 2024
[ Upstream commit 60f07e2 ]

We use uprobe in aarch64_be, which we found the tracee task would exit
due to SIGILL when we enable the uprobe trace.
We can see the replace inst from uprobe is not correct in aarch big-endian.
As in Armv8-A, instruction fetches are always treated as little-endian,
we should treat the UPROBE_SWBP_INSN as little-endian。

The test case is as following。
bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null &
bash-4.4# cd /sys/kernel/debug/tracing/
bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events
bash-4.4# echo 1 > events/uprobes/enable
bash-4.4#
bash-4.4# ps
  PID TTY          TIME CMD
  140 ?        00:00:01 bash
  237 ?        00:00:00 ps
[1]+  Illegal instruction     ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null

which we debug use gdb as following:

bash-4.4# gdb attach 155
(gdb) disassemble send
Dump of assembler code for function send:
   0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]
   0x0000000000400c44 <+20>:    str     xzr, [sp, torvalds#48]
   0x0000000000400c48 <+24>:    add     x0, sp, #0x1b
   0x0000000000400c4c <+28>:    mov     w3, #0x0                 // #0
   0x0000000000400c50 <+32>:    mov     x2, #0x1                 // #1
   0x0000000000400c54 <+36>:    mov     x1, x0
   0x0000000000400c58 <+40>:    ldr     w0, [sp, torvalds#28]
   0x0000000000400c5c <+44>:    bl      0x405e10 <mq_send>
   0x0000000000400c60 <+48>:    str     w0, [sp, torvalds#60]
   0x0000000000400c64 <+52>:    ldr     w0, [sp, torvalds#60]
   0x0000000000400c68 <+56>:    ldp     x29, x30, [sp], torvalds#64
   0x0000000000400c6c <+60>:    ret
End of assembler dump.
(gdb) info b
No breakpoints or watchpoints.
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x0000000000400c30 in send ()
(gdb) x/10x 0x400c30
0x400c30 <send>:    0xd42000a0   0xfd030091      0xe01f00b9      0xe16f0039
0x400c40 <send+16>: 0xff1700f9   0xff1b00f9      0xe06f0091      0x03008052
0x400c50 <send+32>: 0x220080d2   0xe10300aa
(gdb) disassemble 0x400c30
Dump of assembler code for function send:
=> 0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]

Signed-off-by: junhua huang <huang.junhua@zte.com.cn>
Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels")
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Nov 8, 2024
[ Upstream commit 60f07e2 ]

We use uprobe in aarch64_be, which we found the tracee task would exit
due to SIGILL when we enable the uprobe trace.
We can see the replace inst from uprobe is not correct in aarch big-endian.
As in Armv8-A, instruction fetches are always treated as little-endian,
we should treat the UPROBE_SWBP_INSN as little-endian。

The test case is as following。
bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null &
bash-4.4# cd /sys/kernel/debug/tracing/
bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events
bash-4.4# echo 1 > events/uprobes/enable
bash-4.4#
bash-4.4# ps
  PID TTY          TIME CMD
  140 ?        00:00:01 bash
  237 ?        00:00:00 ps
[1]+  Illegal instruction     ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null

which we debug use gdb as following:

bash-4.4# gdb attach 155
(gdb) disassemble send
Dump of assembler code for function send:
   0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]
   0x0000000000400c44 <+20>:    str     xzr, [sp, torvalds#48]
   0x0000000000400c48 <+24>:    add     x0, sp, #0x1b
   0x0000000000400c4c <+28>:    mov     w3, #0x0                 // #0
   0x0000000000400c50 <+32>:    mov     x2, #0x1                 // #1
   0x0000000000400c54 <+36>:    mov     x1, x0
   0x0000000000400c58 <+40>:    ldr     w0, [sp, torvalds#28]
   0x0000000000400c5c <+44>:    bl      0x405e10 <mq_send>
   0x0000000000400c60 <+48>:    str     w0, [sp, torvalds#60]
   0x0000000000400c64 <+52>:    ldr     w0, [sp, torvalds#60]
   0x0000000000400c68 <+56>:    ldp     x29, x30, [sp], torvalds#64
   0x0000000000400c6c <+60>:    ret
End of assembler dump.
(gdb) info b
No breakpoints or watchpoints.
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x0000000000400c30 in send ()
(gdb) x/10x 0x400c30
0x400c30 <send>:    0xd42000a0   0xfd030091      0xe01f00b9      0xe16f0039
0x400c40 <send+16>: 0xff1700f9   0xff1b00f9      0xe06f0091      0x03008052
0x400c50 <send+32>: 0x220080d2   0xe10300aa
(gdb) disassemble 0x400c30
Dump of assembler code for function send:
=> 0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]

Signed-off-by: junhua huang <huang.junhua@zte.com.cn>
Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels")
Signed-off-by: Sasha Levin <sashal@kernel.org>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Nov 8, 2024
[ Upstream commit 60f07e2 ]

We use uprobe in aarch64_be, which we found the tracee task would exit
due to SIGILL when we enable the uprobe trace.
We can see the replace inst from uprobe is not correct in aarch big-endian.
As in Armv8-A, instruction fetches are always treated as little-endian,
we should treat the UPROBE_SWBP_INSN as little-endian。

The test case is as following。
bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null &
bash-4.4# cd /sys/kernel/debug/tracing/
bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events
bash-4.4# echo 1 > events/uprobes/enable
bash-4.4#
bash-4.4# ps
  PID TTY          TIME CMD
  140 ?        00:00:01 bash
  237 ?        00:00:00 ps
[1]+  Illegal instruction     ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null

which we debug use gdb as following:

bash-4.4# gdb attach 155
(gdb) disassemble send
Dump of assembler code for function send:
   0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]
   0x0000000000400c44 <+20>:    str     xzr, [sp, torvalds#48]
   0x0000000000400c48 <+24>:    add     x0, sp, #0x1b
   0x0000000000400c4c <+28>:    mov     w3, #0x0                 // #0
   0x0000000000400c50 <+32>:    mov     x2, #0x1                 // #1
   0x0000000000400c54 <+36>:    mov     x1, x0
   0x0000000000400c58 <+40>:    ldr     w0, [sp, torvalds#28]
   0x0000000000400c5c <+44>:    bl      0x405e10 <mq_send>
   0x0000000000400c60 <+48>:    str     w0, [sp, torvalds#60]
   0x0000000000400c64 <+52>:    ldr     w0, [sp, torvalds#60]
   0x0000000000400c68 <+56>:    ldp     x29, x30, [sp], torvalds#64
   0x0000000000400c6c <+60>:    ret
End of assembler dump.
(gdb) info b
No breakpoints or watchpoints.
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
0x0000000000400c30 in send ()
(gdb) x/10x 0x400c30
0x400c30 <send>:    0xd42000a0   0xfd030091      0xe01f00b9      0xe16f0039
0x400c40 <send+16>: 0xff1700f9   0xff1b00f9      0xe06f0091      0x03008052
0x400c50 <send+32>: 0x220080d2   0xe10300aa
(gdb) disassemble 0x400c30
Dump of assembler code for function send:
=> 0x0000000000400c30 <+0>:     .inst   0xa00020d4 ; undefined
   0x0000000000400c34 <+4>:     mov     x29, sp
   0x0000000000400c38 <+8>:     str     w0, [sp, torvalds#28]
   0x0000000000400c3c <+12>:    strb    w1, [sp, torvalds#27]
   0x0000000000400c40 <+16>:    str     xzr, [sp, torvalds#40]

Signed-off-by: junhua huang <huang.junhua@zte.com.cn>
Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels")
Signed-off-by: Sasha Levin <sashal@kernel.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 11, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 12, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 12, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 13, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Nov 15, 2024
…on memory

When I did memory failure tests recently, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 torvalds#40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

We're hitting a BUG_ON in PF_ANY():

PAGEFLAG(HWPoison, hwpoison, PF_ANY)

#define PF_ANY(page, enforce)	PF_POISONED_CHECK(page)

#define PF_POISONED_CHECK(page) ({					\
	VM_BUG_ON_PGFLAGS(PagePoisoned(page), page);		\
	page; })

#define	PAGE_POISON_PATTERN	-1l
static inline int PagePoisoned(const struct page *page)
{
	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
}

The offlined pages will have page->flags set to PAGE_POISON_PATTERN
while pfn is still valid:

offline_pages
  remove_pfn_range_from_zone
    page_init_poison
      memset(page, PAGE_POISON_PATTERN, size);

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by below steps:
1.Offline memory block:
 echo offline > /sys/devices/system/memory/memory12/state
2.Get offlined memory pfn:
 page-types -b n -rlN
3.Write pfn to unpoison-pfn
 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

Link: https://lkml.kernel.org/r/20240712064249.3882707-1-linmiaohe@huawei.com
Fixes: f165b37 ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant