-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wandboard imx 3.0.35 4.0.0 test #1
Wandboard imx 3.0.35 4.0.0 test #1
Conversation
(cherry picked from commit 11a509c6d0958741af52fd85661c08975d1e271d)
Thanks Stephan - I merged these. I'll be testing them over the next day or so One thing I want to do is to upstream the gpu-viv patch. It does give warnings John On 7/31/13 6:54 PM, wolfgar wrote:
|
Stephan - I merged everything into the wandboard_imx_3.0.35_4.0.0 branch from test, and added a commit to fix a problem in the MIPI CSI exclusion code, and reorganized the mipi csi init code. I've tested it on Wandboard Solo so far (with/without) the MIPI included. |
OK, I switch to wandboard_imx_3.0.35_4.0.0 branch |
To add to the difficulty, there is no difference between the MX6 solo and DL in the CPU type detection, so we can't distinguish between WB Dual and WB solo here. This function is important because it appears to be telling the kernel how much memory it can occupy, similar to the mem= argument. We need a way to tell the difference between S and DL in order to set the memory limit this way. Any ideas? |
I have solved it that way : wolfgar@f77f226 |
I wonder if there is an easier way of getting the memory config, but this works. On 8/3/13 9:06 PM, wolfgar wrote:
|
Hi John, |
The pdata->pdev is initialized at platform code, if init fails at first, it will be not initialized, and platform exit will not be called. This also fixes an oop when config usb module wrongly: Unable to handle kernel NULL pointer dereference at virtual address 0000005 pgd = ba1c4000 [0000005] *pgd=4a145831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP Modules linked in: ehci_hcd(+) usbcore CPU: 1 Not tainted (3.0.35-02451-ge361da1 torvalds#60) PC is at fsl_usb_host_uninit_ext+0xc/0x28 LR is at usb_hcd_fsl_probe+0x2c8/0x44c [ehci_hcd] pc : [<80062b58>] lr : [<7f060934>] psr: a0000013 sp : ba11be80 ip : 00005027 fp : 000a76e0 r10: 00000048 r9 : ba11a000 r8 : bfd4d608 r7 : ffffffed r6 : bfd4d600 r5 : bfc84400 r4 : 80aaee48 r3 : 80062b4c r2 : 00000000 r1 : 60000093 r0 : 00000000 Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c53c7d Table: 4a1c404a DAC: 00000015 Process modprobe (pid: 1555, stack limit = 0xba11a2f0) Stack: (0xba11be80 to 0xba11c000) be80: bfd97600 7f060934 00000000 00000000 bfd4d608 bfd4d608 80aca808 bfd4d63c bea0: 7f062bd4 80041704 ba11a000 00000000 000a76e0 802a5cec bfd4d608 802a4a14 bec0: bfd4d608 7f062bd4 bfd4d63c 00000000 80041704 802a4bac 7f062bd4 ba11bee bee0: 802a4b2 802a4254 bffd4040 bff03f38 7f065000 7f062bd4 80a934c8 bfc8cd20 bf00: 00000000 802a3be0 7f062b1c 7f062bd4 00000000 7f017ec8 7f062bd4 00000000 bf20: 7f065000 80041704 00000000 802a51a0 7f017ec8 80aae500 00000000 7f065000 bf40: 80041704 7f065058 000a79e8 8003b4c4 00000000 00000000 00000000 80a14834 bf60: 000a79e8 000a79e8 7f062c20 00000000 0000e67b 80041704 ba11a000 00000000 bf80: 000a76e0 800aa428 ba076740 800f58dc 000a79e8 0000e67b 00000000 000a75d0 bfa0: 00000080 80041580 0000e67b 00000000 000a79e8 0000e67b 000a75d0 000a76e0 bfc0: 0000e67b 00000000 000a75d0 00000080 000a6a78 00000008 000a76a0 000a76e0 bfe0: 7ec7ab50 7ec7ab40 0001a32c 2ace6490 20000010 000a79e8 4fffe821 4fffec21 [<80062b58>] (fsl_usb_host_uninit_ext+0xc/0x28) from [<7f060934>] (usb_hcd_fsl_probe+0x2c8/0x44c [ehci_hcd]) [<7f060934>] (usb_hcd_fsl_probe+0x2c8/0x44c [ehci_hcd]) from [<802a5cec>] (platform_drv_probe+0x18/0x1c) [<802a5cec>] (platform_drv_probe+0x18/0x1c) from [<802a4a14>] (driver_probe_device+0x98/0x1a4) [<802a4a14>] (driver_probe_device+0x98/0x1a4) from [<802a4bac>] (__driver_attach+0x8c/0x90) [<802a4bac>] (__driver_attach+0x8c/0x90) from [<802a4254>] (bus_for_each_dev+0x60/0x8c) [<802a4254>] (bus_for_each_dev+0x60/0x8c) from [<802a3be0>] (bus_add_driver+0x184/0x25c) [<802a3be0>] (bus_add_driver+0x184/0x25c) from [<802a51a0>] (driver_register+0x78/0x13c) [<802a51a0>] (driver_register+0x78/0x13c) from [<7f065058>] (ehci_hcd_init+0x58/0x88 [ehci_hcd]) [<7f065058>] (ehci_hcd_init+0x58/0x88 [ehci_hcd]) from [<8003b4c4>] (do_one_initcall+0x30/0x16c) [<8003b4c4>] (do_one_initcall+0x30/0x16c) from [<800aa428>] (sys_init_module+0x84/0x19c) [<800aa428>] (sys_init_module+0x84/0x19c) from [<80041580>] (ret_fast_syscall+0x0/0x30) Code: 80aaee48 e92d4010 e30e4e48 e34840aa (e590005c) ---[ end trace 719afdfe4af3a442 ]--- Signed-off-by: Peter Chen <peter.chen@freescale.com>
Revert"ENGR00254931 IPUv3 Fb: Fix display twinkling issue during suspend/resume" This reverts commit 4bd475b. That patch will lead to kernel crash when doing video playback on video16 with overlay on. The reason is that fb driver doesn't reallocate larger DMA buffer requested by V4L2 driver, while IPU hardware write to large DMA address. Other solution is needed for the original issue. kernel BUG at arch/arm/mm/dma-mapping.c:478! Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = 80004000 [00000000] *pgd=00000000 Internal error: Oops: 817 [#1] PREEMPT SMP Modules linked in: mxc_v4l2_capture ipu_csi_enc ipu_prp_enc ipu_fg_overlay_sdc ipu_bg_overlay_sdc ipu_still ov5640_camera_mipi ov5640_camera CPU: 0 Not tainted (3.0.35-2506-g556681e #1) PC is at __bug+0x1c/0x28 LR is at __bug+0x18/0x28 pc : [<800442ac>] lr : [<800442a8>] psr: 20000113 sp : 80a8fe88 ip : c09b2000 fp : 80aa3a70 r10: 80a90080 r9 : 00000040 r8 : bffecec4 r7 : 00000001 r6 : 00000002 r5 : 00000800 r4 : ce5c5e65 r3 : 00000000 r2 : 00000104 r1 : 0bfcf000 r0 : 00000033 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c53c7d Table: 4a97404a DAC: 00000015 Process swapper (pid: 0, stack limit = 0x80a8e2f0) Stack: (0x80a8fe88 to 0x80a90000) fe80: 80afde30 8004a464 00000880 ffdfd6b0 bffec000 802fa678 fea0: bffec000 00000060 5e5c5e65 ce5c5e65 bffec250 80ad4c88 bffece4c 00000001 fec0: bffecec4 00000040 0000012c 8c009480 8c009488 80a90080 ffff9527 80423f58 fee0: 00000001 80a8e000 00000096 00000001 80a9004c 80a8e000 00000103 80af77e0 ff00: 00000000 80a90040 00000003 8007a034 00000096 00000000 80a8e000 0000000a ff20: 80a94c4c 80a8e000 80039c00 80a8e000 00000096 00000000 80a8e000 00000000 ff40: 00000000 8007a570 80aa3cc0 80041874 ffffffff f2a00100 00000096 00000002 ff60: 00000001 80040a0c 80af9140 80000093 00000001 00000000 80a8e000 80af1ce4 ff80: 80511044 80aa6e7c 1000406a 412fc09a 00000000 00000000 00000000 80a8ffb0 ffa0: 8004f648 80041b04 40000013 ffffffff 80041ae0 80041d08 00000001 80aa3b3c ffc0: 80af1c40 8002f538 8c005160 80008868 800082f8 00000000 00000000 8002f538 ffe0: 00000000 10c53c7d 80aa3a6c 8002f534 80aa6e74 10008040 00000000 00000000 [<800442ac>] (__bug+0x1c/0x28) from [<8004a464>] (___dma_single_dev_to_cpu+0x84/0x94) [<8004a464>] (___dma_single_dev_to_cpu+0x84/0x94) from [<802fa678>] (fec_rx_poll+0x228/0x2c8) [<802fa678>] (fec_rx_poll+0x228/0x2c8) from [<80423f58>] (net_rx_action+0xb0/0x17c) [<80423f58>] (net_rx_action+0xb0/0x17c) from [<8007a034>] (__do_softirq+0xac/0x140) [<8007a034>] (__do_softirq+0xac/0x140) from [<8007a570>] (irq_exit+0x94/0x9c) [<8007a570>] (irq_exit+0x94/0x9c) from [<80041874>] (handle_IRQ+0x50/0xac) [<80041874>] (handle_IRQ+0x50/0xac) from [<80040a0c>] (__irq_svc+0x4c/0xe8) [<80040a0c>] (__irq_svc+0x4c/0xe8) from [<80041b04>] (default_idle+0x24/0x28) [<80041d08>] (cpu_idle+0xc8/0x108) from [<80008868>] (start_kernel+0x248/0x288) [<80008868>] (start_kernel+0x248/0x288) from [<10008040>] (0x10008040) Signed-off-by: Wayne Zou <b36644@freescale.com>
…single mode when do the crypto module speed test in single mode, meet the following dump: Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = 8c804000 [00000000] *pgd=1c84b831, *pte=00000000, *ppte=00000000 Internal error: Oops: 817 [#1] PREEMPT SMP Modules linked in: tcrypt(+) CPU: 0 Tainted: G W (3.0.35-02642-g3a18d11-dirty torvalds#92) PC is at __bug+0x1c/0x28 LR is at __bug+0x18/0x28 pc : [<80045390>] lr : [<8004538c>] psr: 60000013 sp : 8c925dd8 ip : a09b2000 fp : 881f8018 r10: 00000000 r9 : 881f8000 r8 : 8e5a0c08 r7 : 8c866840 r6 : 00000002 r5 : 00000010 r4 : 7f08bbe0 r3 : 00000000 r2 : 80ac9190 r1 : 80000093 r0 : 00000065 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c53c7d Table: 1c80404a DAC: 00000015 Process insmod (pid: 3994, stack limit = 0x8c9242f0) Stack: (0x8c925dd8 to 0x8c926000) 5dc0: 00000000 8004b548 5de0: 7f08bbe0 0f08bbe0 8db9a000 8c866840 8e5a0c08 803cd650 00000001 181f8038 5e00: 00000001 8db9a000 00000000 181f8038 881f8038 8b800000 00000010 1c866a70 5e20: 8db9a000 8db9a000 00000000 803cd26c 004eea5b 00000000 00000000 8db9a000 5e40: 8db9a000 80208db8 00000010 00000001 7f08b7b4 7f088b40 00000001 80aff320 5e60: 8e5330c0 80affe00 00000003 0000012c 8c866800 8c866840 00000000 00000000 5e80: 00000000 8c925e84 8c925e84 00000041 87654321 8b9c4be0 00000000 00001000 5ea0: 1c938000 87654321 8b9c4a0c 00000000 00001000 00000000 87654321 8b9bfd44 5ec0: 00000000 00001000 00000000 87654321 8b9c497e 00000000 00001000 00000000 5ee0: 7f08ba9c 7f08bbd0 7f08bbd0 000a7008 0000833c 80042284 7f08e000 8c924000 5f00: 00000000 7f08a3e0 7f08bbd0 000a7008 0000833c 00000010 7f08bbd0 000a7008 5f20: 0000833c 80042284 8c924000 7f08e06c 7f08ba90 00000000 000a7008 8003c588 5f40: 00000000 00000000 0000001f 00000020 00000017 00000014 00000012 00000000 5f60: 8c925f74 7f08ba90 00000000 000a7008 0000833c 80042284 8c924000 00000000 5f80: 00000000 800aae9c 8e4a6c80 800f307c 00000000 0000833c 7ee9adb4 7ee9aebf 5fa0: 00000080 80042100 0000833c 7ee9adb4 000a7020 0000833c 000a7008 7ee9aebf 5fc0: 0000833c 7ee9adb4 7ee9aebf 00000080 000001de 00000000 2ab5e000 00000000 5fe0: 7ee9abf0 7ee9abe0 0001a32c 2ac3e490 60000010 000a7020 aaaaaaaa aaaaaaaa [<80045390>] (__bug+0x1c/0x28) from [<8004b548>] (___dma_single_cpu_to_dev+0xd4/0x108) [<8004b548>] (___dma_single_cpu_to_dev+0xd4/0x108) from [<803cd650>] (ahash_digest+0x3e4/0x61c) [<803cd650>] (ahash_digest+0x3e4/0x61c) from [<80208db8>] (crypto_ahash_op+0x40/0xf0) [<80208db8>] (crypto_ahash_op+0x40/0xf0) from [<7f088b40>] (test_ahash_speed.constprop.8+0x540/0x690 [tcrypt]) [<7f088b40>] (test_ahash_speed.constprop.8+0x540/0x690 [tcrypt]) from [<7f08a3e0>] (do_test+0x1270/0x1dd0 [tcrypt]) [<7f08a3e0>] (do_test+0x1270/0x1dd0 [tcrypt]) from [<7f08e06c>] (tcrypt_mod_init+0x6c/0xc8 [tcrypt]) [<7f08e06c>] (tcrypt_mod_init+0x6c/0xc8 [tcrypt]) from [<8003c588>] (do_one_initcall+0x10c/0x170) [<8003c588>] (do_one_initcall+0x10c/0x170) from [<800aae9c>] (sys_init_module+0x74/0x19c) [<800aae9c>] (sys_init_module+0x74/0x19c) from [<80042100>] (ret_fast_syscall+0x0/0x30) Code: e59f0010 e1a01003 eb131d31 e3a03000 (e5833000) Signed-off-by: Hudson Winston <B45308@freescale.com> Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com> Signed-off-by: Jason Liu <r64343@freescale.com> Signed-off-by: Terry Lv <r65388@freescale.com>
…n/streamoff When do stream on/off in pair repeatedly without close the v4l device, the kernel dump happens: Unable to handle kernel paging request at virtual address 00200200 pgd = c0004000 [00200200] *pgd=00000000 Internal error: Oops: 805 [#1] PREEMPT Modules linked in: CPU: 0 Not tainted (3.0.35-06027-gbbea887-dirty torvalds#21) PC is at camera_callback+0x15c/0x1c8 LR is at 0x200200 pc : [<c03747d0>] lr : [<00200200>] psr: 20000193 sp : c0b0fed0 ip : 00200200 fp : daf1102c r10: daf11034 r9 : 00100100 r8 : daf11098 r7 : daf11100 r6 : daf11034 r5 : daf11000 r4 : c0b0e000 r3 : 00000000 r2 : 00000001 r1 : 00000001 r0 : daf114b8 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c53c7d Table: 8cc8c059 DAC: 00000015 ... Process swapper (pid: 0, stack limit = 0xc0b0e2e8) Stack: (0xc0b0fed0 to 0xc0b10000) fec0: c0374674 822a4000 daf11000 c0b87eb4 fee0: 00000000 00000027 c0b73904 c0b305e0 00000001 c0374090 da2bbbe0 c0b0e000 ff00: 00000000 c00acc58 00000000 c0098058 00989680 c0b305e0 c0b0e000 00000000 ff20: 00000002 00000001 c0b0e000 00000000 00000000 c00acdf8 00000000 c0b0e000 ff40: 9e4e7881 c0b305e0 c0b0e000 c00aee44 c00aed9c c0b43c6c 00000027 c00ac634 ff60: 00000270 c004257c ffffffff f2a00100 00000027 c00417cc 20000000 00000006 ff80: f40c4000 00000000 c0b0e000 c0b6a924 c0b1876c c0b18764 80004059 412fc09a ffa0: 00000000 00000000 c0063a40 c0b0ffc0 c004f758 c0042690 80000013 ffffffff ffc0: c004266c c004294c c0b1013c 00000000 c10960c0 c00088ec c0008334 00000000 ffe0: 00000000 c00337d4 10c53c7d c0b10060 c00337d0 80008040 00000000 00000000 [<c03747d0>] (camera_callback+0x15c/0x1c8) from [<c0374090>] (csi_irq_handler+ 0x7c/0x160) [<c0374090>] (csi_irq_handler+0x7c/0x160) from [<c00acc58>] ( handle_irq_event_percpu+0x50/0x19c) [<c00acc58>] (handle_irq_event_percpu+0x50/0x19c) from [<c00acdf8>] ( handle_irq_event+0x54/0x84) [<c00acdf8>] (handle_irq_event+0x54/0x84) from [<c00aee44>] (handle_fasteoi_irq +0xa8/0x160) [<c00aee44>] (handle_fasteoi_irq+0xa8/0x160) from [<c00ac634>] ( generic_handle_irq+0x2c/0x40) [<c00ac634>] (generic_handle_irq+0x2c/0x40) from [<c004257c>] (handle_IRQ +0x30/0x84) [<c004257c>] (handle_IRQ+0x30/0x84) from [<c00417cc>] (__irq_svc+0x4c/0xa8) [<c00417cc>] (__irq_svc+0x4c/0xa8) from [<c0042690>] (default_idle+0x24/0x28) [<c0042690>] (default_idle+0x24/0x28) from [<c004294c>] (cpu_idle+0x8c/0xc0) [<c004294c>] (cpu_idle+0x8c/0xc0) from [<c00088ec>] (start_kernel+0x294/0x2e4) [<c00088ec>] (start_kernel+0x294/0x2e4) from [<80008040>] (0x80008040) Code: e88c420 e595c030 e5858030 e8881800 (e58c8000) ---[ end trace 224150c26d2bd5f7 ]--- The root cause is cam->enc_counter is not re-initialized to 0 when calls STREAMOFF ioctl, and then in DQBUF ioctl wait_event_interruptible_timeout() sees the condition is true and access cam->done_q queue which has no strict check and could be empty. This patch adds the re-initialization and the sanity check. Also, add the pointer check for memcpy because the destination may be NULL on UERSPTR mode. Signed-off-by: Robby Cai <R63905@freescale.com>
This warning was reported recently: WARNING: at drivers/scsi/libfc/fc_exch.c:478 fc_seq_send+0x14f/0x160 [libfc]() (Not tainted) Hardware name: ProLiant DL120 G7 Modules linked in: tcm_fc target_core_iblock target_core_file target_core_pscsi target_core_mod configfs dm_round_robin dm_multipath 8021q garp stp llc bnx2fc cnic uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt autofs4 sunrpc pcc_cpufreq ipv6 hpilo hpwdt e1000e microcode iTCO_wdt iTCO_vendor_support serio_raw shpchp ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi ata_generic ata_piix hpsa dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Pid: 5464, comm: target_completi Not tainted 2.6.32-272.el6.x86_64 #1 Call Trace: [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0 [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20 [<ffffffffa025f7df>] ? fc_seq_send+0x14f/0x160 [libfc] [<ffffffffa035cbce>] ? ft_queue_status+0x16e/0x210 [tcm_fc] [<ffffffffa030a660>] ? target_complete_ok_work+0x0/0x4b0 [target_core_mod] [<ffffffffa030a766>] ? target_complete_ok_work+0x106/0x4b0 [target_core_mod] [<ffffffffa030a660>] ? target_complete_ok_work+0x0/0x4b0 [target_core_mod] [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0 [<ffffffff81091d66>] ? kthread+0x96/0xa0 [<ffffffff8100c14a>] ? child_rip+0xa/0x20 [<ffffffff81091cd0>] ? kthread+0x0/0xa0 [<ffffffff8100c140>] ? child_rip+0x0/0x20 It occurs because fc_seq_send can have multiple contexts executing within it at the same time, and fc_seq_send doesn't consistently use the ep->ex_lock that protects this structure. Because of that, its possible for one context to clear the INIT bit in the ep->esb_state field while another checks it, leading to the above stack trace generated by the WARN_ON in the function. We should probably undertake the effort to convert access to the fc_exch structures to use rcu, but that a larger work item. To just fix this specific issue, we can just extend the ex_lock protection through the entire fc_seq_send path Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Gris Ge <fge@redhat.com> CC: Robert Love <robert.w.love@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com>
If a too small MTU value is set with ioctl(HCISETACLMTU) or by a bogus controller, memory corruption happens due to a memcpy() call with negative length. Fix this crash on either incoming or outgoing connections with a MTU smaller than L2CAP_HDR_SIZE + L2CAP_CMD_HDR_SIZE: [ 46.885433] BUG: unable to handle kernel paging request at f56ad000 [ 46.888037] IP: [<c03d94cd>] memcpy+0x1d/0x40 [ 46.888037] *pdpt = 0000000000ac3001 *pde = 00000000373f8067 *pte = 80000000356ad060 [ 46.888037] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC [ 46.888037] Modules linked in: hci_vhci bluetooth virtio_balloon i2c_piix4 uhci_hcd usbcore usb_common [ 46.888037] CPU: 0 PID: 1044 Comm: kworker/u3:0 Not tainted 3.10.0-rc1+ torvalds#12 [ 46.888037] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 46.888037] Workqueue: hci0 hci_rx_work [bluetooth] [ 46.888037] task: f59b15b0 ti: f55c4000 task.ti: f55c4000 [ 46.888037] EIP: 0060:[<c03d94cd>] EFLAGS: 00010212 CPU: 0 [ 46.888037] EIP is at memcpy+0x1d/0x40 [ 46.888037] EAX: f56ac1c0 EBX: fffffff8 ECX: 3ffffc6e EDX: f55c5cf2 [ 46.888037] ESI: f55c6b32 EDI: f56ad000 EBP: f55c5c68 ESP: f55c5c5c [ 46.888037] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 46.888037] CR0: 8005003b CR2: f56ad000 CR3: 3557d000 CR4: 000006f0 [ 46.888037] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 46.888037] DR6: ffff0ff0 DR7: 00000400 [ 46.888037] Stack: [ 46.888037] fffffff8 00000010 00000003 f55c5cac f8c6a54c ffffffff f8c69eb2 00000000 [ 46.888037] f4783cdc f57f0070 f759c590 1001c580 00000003 0200000a 00000000 f5a88560 [ 46.888037] f5ba2600 f5a88560 00000041 00000000 f55c5d90 f8c6f4c7 00000008 f55c5cf2 [ 46.888037] Call Trace: [ 46.888037] [<f8c6a54c>] l2cap_send_cmd+0x1cc/0x230 [bluetooth] [ 46.888037] [<f8c69eb2>] ? l2cap_global_chan_by_psm+0x152/0x1a0 [bluetooth] [ 46.888037] [<f8c6f4c7>] l2cap_connect+0x3f7/0x540 [bluetooth] [ 46.888037] [<c019b37b>] ? trace_hardirqs_off+0xb/0x10 [ 46.888037] [<c01a0ff8>] ? mark_held_locks+0x68/0x110 [ 46.888037] [<c064ad20>] ? mutex_lock_nested+0x280/0x360 [ 46.888037] [<c064b9d9>] ? __mutex_unlock_slowpath+0xa9/0x150 [ 46.888037] [<c01a118c>] ? trace_hardirqs_on_caller+0xec/0x1b0 [ 46.888037] [<c064ad08>] ? mutex_lock_nested+0x268/0x360 [ 46.888037] [<c01a125b>] ? trace_hardirqs_on+0xb/0x10 [ 46.888037] [<f8c72f8d>] l2cap_recv_frame+0xb2d/0x1d30 [bluetooth] [ 46.888037] [<c01a0ff8>] ? mark_held_locks+0x68/0x110 [ 46.888037] [<c064b9d9>] ? __mutex_unlock_slowpath+0xa9/0x150 [ 46.888037] [<c01a118c>] ? trace_hardirqs_on_caller+0xec/0x1b0 [ 46.888037] [<f8c754f1>] l2cap_recv_acldata+0x2a1/0x320 [bluetooth] [ 46.888037] [<f8c491d8>] hci_rx_work+0x518/0x810 [bluetooth] [ 46.888037] [<f8c48df2>] ? hci_rx_work+0x132/0x810 [bluetooth] [ 46.888037] [<c0158979>] process_one_work+0x1a9/0x600 [ 46.888037] [<c01588fb>] ? process_one_work+0x12b/0x600 [ 46.888037] [<c015922e>] ? worker_thread+0x19e/0x320 [ 46.888037] [<c015922e>] ? worker_thread+0x19e/0x320 [ 46.888037] [<c0159187>] worker_thread+0xf7/0x320 [ 46.888037] [<c0159090>] ? rescuer_thread+0x290/0x290 [ 46.888037] [<c01602f8>] kthread+0xa8/0xb0 [ 46.888037] [<c0656777>] ret_from_kernel_thread+0x1b/0x28 [ 46.888037] [<c0160250>] ? flush_kthread_worker+0x120/0x120 [ 46.888037] Code: c3 90 8d 74 26 00 e8 63 fc ff ff eb e8 90 55 89 e5 83 ec 0c 89 5d f4 89 75 f8 89 7d fc 3e 8d 74 26 00 89 cb 89 c7 c1 e9 02 89 d6 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 8b 5d f4 8b 75 f8 8b 7d fc 89 [ 46.888037] EIP: [<c03d94cd>] memcpy+0x1d/0x40 SS:ESP 0068:f55c5c5c [ 46.888037] CR2: 00000000f56ad000 [ 46.888037] ---[ end trace 0217c1f4d78714a9 ]--- Signed-off-by: Anderson Lizardo <anderson.lizardo@openbossa.org> Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
On Thu, Jun 20, 2013 at 10:00:21AM +0200, Daniel Borkmann wrote: > After having fixed a NULL pointer dereference in SCTP 1abd165 ("net: > sctp: fix NULL pointer dereference in socket destruction"), I ran into > the following NULL pointer dereference in the crypto subsystem with > the same reproducer, easily hit each time: > > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: [<ffffffff81070321>] __wake_up_common+0x31/0x90 > PGD 0 > Oops: 0000 [#1] SMP > Modules linked in: padlock_sha(F-) sha256_generic(F) sctp(F) libcrc32c(F) [..] > CPU: 6 PID: 3326 Comm: cryptomgr_probe Tainted: GF 3.10.0-rc5+ #1 > Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011 > task: ffff88007b6cf4e0 ti: ffff88007b7cc000 task.ti: ffff88007b7cc000 > RIP: 0010:[<ffffffff81070321>] [<ffffffff81070321>] __wake_up_common+0x31/0x90 > RSP: 0018:ffff88007b7cde08 EFLAGS: 00010082 > RAX: ffffffffffffffe8 RBX: ffff88003756c130 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88003756c130 > RBP: ffff88007b7cde48 R08: 0000000000000000 R09: ffff88012b173200 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000282 > R13: ffff88003756c138 R14: 0000000000000000 R15: 0000000000000000 > FS: 0000000000000000(0000) GS:ffff88012fc60000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 0000000000000000 CR3: 0000000001a0b000 CR4: 00000000000007e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Stack: > ffff88007b7cde28 0000000300000000 ffff88007b7cde28 ffff88003756c130 > 0000000000000282 ffff88003756c128 ffffffff81227670 0000000000000000 > ffff88007b7cde78 ffffffff810722b7 ffff88007cdcf000 ffffffff81a90540 > Call Trace: > [<ffffffff81227670>] ? crypto_alloc_pcomp+0x20/0x20 > [<ffffffff810722b7>] complete_all+0x47/0x60 > [<ffffffff81227708>] cryptomgr_probe+0x98/0xc0 > [<ffffffff81227670>] ? crypto_alloc_pcomp+0x20/0x20 > [<ffffffff8106760e>] kthread+0xce/0xe0 > [<ffffffff81067540>] ? kthread_freezable_should_stop+0x70/0x70 > [<ffffffff815450dc>] ret_from_fork+0x7c/0xb0 > [<ffffffff81067540>] ? kthread_freezable_should_stop+0x70/0x70 > Code: 41 56 41 55 41 54 53 48 83 ec 18 66 66 66 66 90 89 75 cc 89 55 c8 > 4c 8d 6f 08 48 8b 57 08 41 89 cf 4d 89 c6 48 8d 42 e > RIP [<ffffffff81070321>] __wake_up_common+0x31/0x90 > RSP <ffff88007b7cde08> > CR2: 0000000000000000 > ---[ end trace b495b19270a4d37e ]--- > > My assumption is that the following is happening: the minimal SCTP > tool runs under ``echo 1 > /proc/sys/net/sctp/auth_enable'', hence > it's making use of crypto_alloc_hash() via sctp_auth_init_hmacs(). > It forks itself, heavily allocates, binds, listens and waits in > accept on sctp sockets, and then randomly kills some of them (no > need for an actual client in this case to hit this). Then, again, > allocating, binding, etc, and then killing child processes. > > The problem that might be happening here is that cryptomgr requests > the module to probe/load through cryptomgr_schedule_probe(), but > before the thread handler cryptomgr_probe() returns, we return from > the wait_for_completion_interruptible() function and probably already > have cleared up larval, thus we run into a NULL pointer dereference > when in cryptomgr_probe() complete_all() is being called. > > If we wait with wait_for_completion() instead, this panic will not > occur anymore. This is valid, because in case a signal is pending, > cryptomgr_probe() returns from probing anyway with properly calling > complete_all(). The use of wait_for_completion_interruptible is intentional so that we don't lock up the thread if a bug causes us to never wake up. This bug is caused by the helper thread using the larval without holding a reference count on it. If the helper thread completes after the original thread requesting for help has gone away and destroyed the larval, then we get the crash above. So the fix is to hold a reference count on the larval. Cc: <stable@vger.kernel.org> # 3.6+ Reported-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We triggered an oops while running trinity with 3.4 kernel: BUG: unable to handle kernel paging request at 0000000100000d07 IP: [<ffffffffa0109738>] dlci_ioctl+0xd8/0x2d4 [dlci] PGD 640c0d067 PUD 0 Oops: 0000 [#1] PREEMPT SMP CPU 3 ... Pid: 7302, comm: trinity-child3 Not tainted 3.4.24.09+ 40 Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA RIP: 0010:[<ffffffffa0109738>] [<ffffffffa0109738>] dlci_ioctl+0xd8/0x2d4 [dlci] ... Call Trace: [<ffffffff8137c5c3>] sock_ioctl+0x153/0x280 [<ffffffff81195494>] do_vfs_ioctl+0xa4/0x5e0 [<ffffffff8118354a>] ? fget_light+0x3ea/0x490 [<ffffffff81195a1f>] sys_ioctl+0x4f/0x80 [<ffffffff81478b69>] system_call_fastpath+0x16/0x1b ... It's because the net device is not a dlci device. Reported-by: Li Jinyue <lijinyue@huawei.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
commit 8b8cf89 upstream. __kernel_time_t is a long, which cannot hold a U32_MAX on 32-bit architectures. Just drop this check as it has limited value. This fixes a crash like: [ 957.905812] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/include/linux/ceph/decode.h:164! [ 957.914849] Internal error: Oops - BUG: 0 [#1] SMP ARM [ 957.919978] Modules linked in: rbd libceph libcrc32c ipmi_devintf ipmi_si ipmi_msghandler nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc [ 957.932547] CPU: 1 Tainted: G W (3.9.0-ceph-19bb6a83-highbank #1) [ 957.939881] PC is at ceph_osdc_build_request+0x8c/0x4f8 [libceph] [ 957.945967] LR is at 0xec520904 [ 957.949103] pc : [<bf13e76c>] lr : [<ec520904>] psr: 20000153 [ 957.949103] sp : ec753df8 ip : 00000001 fp : ec53e100 [ 957.960571] r10: ebef25c0 r9 : ec5fa400 r8 : ecbcc000 [ 957.965788] r7 : 00000000 r6 : 00000000 r5 : ffffffff r4 : 00000020 [ 957.972307] r3 : 51cc8143 r2 : ec520900 r1 : ec753e58 r0 : ec520908 [ 957.978827] Flags: nzCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment user [ 957.986039] Control: 10c5387d Table: 2c59c04a DAC: 00000015 [ 957.991777] Process rbd (pid: 2138, stack limit = 0xec752238) [ 957.997514] Stack: (0xec753df8 to 0xec754000) [ 958.001864] 3de0: 00000001 00000001 [ 958.010032] 3e00: 00000001 bf139744 ecbcc000 ec55a0a0 00000024 00000000 ebef25c0 fffffffe [ 958.018204] 3e20: ffffffff 00000000 00000000 00000001 ec5fa400 ebef25c0 ec53e100 bf166b68 [ 958.026377] 3e40: 00000000 0000220f fffffffe ffffffff ec753e58 bf13ff24 51cc8143 05b25ed2 [ 958.034548] 3e60: 00000001 00000000 00000000 bf1688d4 00000001 00000000 00000000 00000000 [ 958.042720] 3e80: 00000001 00000060 ec5fa400 ed53d200 ed439600 ed439300 00000001 00000060 [ 958.050888] 3ea0: ec5fa400 ed53d200 00000000 bf16a320 00000000 ec53e100 00000040 ec753eb8 [ 958.059059] 3ec0: ec51df00 ed53d7c0 ed53d200 ed53d7c0 00000000 ed53d7c0 ec5fa400 bf16ed70 [ 958.067230] 3ee0: 00000000 00000060 00000002 ed53d200 00000000 bf16acf4 ed53d7c0 ec752000 [ 958.075402] 3f00: ed980e50 e954f5d8 00000000 00000060 ed53d240 ed53d258 ec753f80 c04f44a8 [ 958.083574] 3f20: edb7910c ec664700 01ade920 c02e4c44 00000060 c016b3dc ec51de40 01adfb84 [ 958.091745] 3f40: 00000060 ec752000 ec753f80 ec752000 00000060 c0108444 00000007 ec51de48 [ 958.099914] 3f60: ed0eb8c0 00000000 00000000 ec51de40 01adfb84 00000001 00000060 c0108858 [ 958.108085] 3f80: 00000000 00000000 51cc8143 00000060 01adfb84 00000007 00000004 c000dd68 [ 958.116257] 3fa0: 00000000 c000dbc0 00000060 01adfb84 00000007 01adfb84 00000060 01adfb80 [ 958.124429] 3fc0: 00000060 01adfb84 00000007 00000004 beded1a8 00000000 01adf2f0 01ade920 [ 958.132599] 3fe0: 00000000 beded180 b6811324 b6811334 800f0010 00000007 2e7f5821 2e7f5c21 [ 958.140815] [<bf13e76c>] (ceph_osdc_build_request+0x8c/0x4f8 [libceph]) from [<bf166b68>] (rbd_osd_req_format_write+0x50/0x7c [rbd]) [ 958.152739] [<bf166b68>] (rbd_osd_req_format_write+0x50/0x7c [rbd]) from [<bf1688d4>] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) [ 958.164486] [<bf1688d4>] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) from [<bf16a320>] (rbd_dev_image_probe+0x23c/0x850 [rbd]) [ 958.175967] [<bf16a320>] (rbd_dev_image_probe+0x23c/0x850 [rbd]) from [<bf16acf4>] (rbd_add+0x3c0/0x918 [rbd]) [ 958.185975] [<bf16acf4>] (rbd_add+0x3c0/0x918 [rbd]) from [<c02e4c44>] (bus_attr_store+0x20/0x2c) [ 958.194850] [<c02e4c44>] (bus_attr_store+0x20/0x2c) from [<c016b3dc>] (sysfs_write_file+0x168/0x198) [ 958.203984] [<c016b3dc>] (sysfs_write_file+0x168/0x198) from [<c0108444>] (vfs_write+0x9c/0x170) [ 958.212768] [<c0108444>] (vfs_write+0x9c/0x170) from [<c0108858>] (sys_write+0x3c/0x70) [ 958.220768] [<c0108858>] (sys_write+0x3c/0x70) from [<c000dbc0>] (ret_fast_syscall+0x0/0x30) [ 958.229199] Code: e59d1058 e5913000 e3530000 ba000114 (e7f001f2) Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef962df upstream. Inlined xattr shared free space of inode block with inlined data or data extent record, so the size of the later two should be adjusted when inlined xattr is enabled. See ocfs2_xattr_ibody_init(). But this isn't done well when reflink. For inode with inlined data, its max inlined data size is adjusted in ocfs2_duplicate_inline_data(), no problem. But for inode with data extent record, its record count isn't adjusted. Fix it, or data extent record and inlined xattr may overwrite each other, then cause data corruption or xattr failure. One panic caused by this bug in our test environment is the following: kernel BUG at fs/ocfs2/xattr.c:1435! invalid opcode: 0000 [#1] SMP Pid: 10871, comm: multi_reflink_t Not tainted 2.6.39-300.17.1.el5uek #1 RIP: ocfs2_xa_offset_pointer+0x17/0x20 [ocfs2] RSP: e02b:ffff88007a587948 EFLAGS: 00010283 RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00000000000051e4 RDX: ffff880057092060 RSI: 0000000000000f80 RDI: ffff88007a587a68 RBP: ffff88007a587948 R08: 00000000000062f4 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000010 R13: ffff88007a587a68 R14: 0000000000000001 R15: ffff88007a587c68 FS: 00007fccff7f06e0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000015cf000 CR3: 000000007aa76000 CR4: 0000000000000660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process multi_reflink_t Call Trace: ocfs2_xa_reuse_entry+0x60/0x280 [ocfs2] ocfs2_xa_prepare_entry+0x17e/0x2a0 [ocfs2] ocfs2_xa_set+0xcc/0x250 [ocfs2] ocfs2_xattr_ibody_set+0x98/0x230 [ocfs2] __ocfs2_xattr_set_handle+0x4f/0x700 [ocfs2] ocfs2_xattr_set+0x6c6/0x890 [ocfs2] ocfs2_xattr_user_set+0x46/0x50 [ocfs2] generic_setxattr+0x70/0x90 __vfs_setxattr_noperm+0x80/0x1a0 vfs_setxattr+0xa9/0xb0 setxattr+0xc3/0x120 sys_fsetxattr+0xa8/0xd0 system_call_fastpath+0x16/0x1b Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Acked-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 960fd85 upstream. The function ext4_get_group_number() was introduced as an optimization in commit bd86298. Unfortunately, this commit incorrectly calculate the group number for file systems with a 1k block size (when s_first_data_block is 1 instead of zero). This could cause the following kernel BUG: [ 568.877799] ------------[ cut here ]------------ [ 568.877833] kernel BUG at fs/ext4/mballoc.c:3728! [ 568.877840] Oops: Exception in kernel mode, sig: 5 [#1] [ 568.877845] SMP NR_CPUS=32 NUMA pSeries [ 568.877852] Modules linked in: binfmt_misc [ 568.877861] CPU: 1 PID: 3516 Comm: fs_mark Not tainted 3.10.0-03216-g7c6809f-dirty #1 [ 568.877867] task: c0000001fb0b8000 ti: c0000001fa954000 task.ti: c0000001fa954000 [ 568.877873] NIP: c0000000002f42a4 LR: c0000000002f4274 CTR: c000000000317ef8 [ 568.877879] REGS: c0000001fa956ed0 TRAP: 0700 Not tainted (3.10.0-03216-g7c6809f-dirty) [ 568.877884] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 24000428 XER: 00000000 [ 568.877902] SOFTE: 1 [ 568.877905] CFAR: c0000000002b5464 [ 568.877908] GPR00: 0000000000000001 c0000001fa957150 c000000000c6a408 c0000001fb588000 GPR04: 0000000000003fff c0000001fa9571c0 c0000001fa9571c4 000138098c50625f GPR08: 1301200000000000 0000000000000002 0000000000000001 0000000000000000 GPR12: 0000000024000422 c00000000f33a300 0000000000008000 c0000001fa9577f0 GPR16: c0000001fb7d0100 c000000000c29190 c0000000007f46e8 c000000000a14672 GPR20: 0000000000000001 0000000000000008 ffffffffffffffff 0000000000000000 GPR24: 0000000000000100 c0000001fa957278 c0000001fdb2bc78 c0000001fa957288 GPR28: 0000000000100100 c0000001fa957288 c0000001fb588000 c0000001fdb2bd10 [ 568.877993] NIP [c0000000002f42a4] .ext4_mb_release_group_pa+0xec/0x1c0 [ 568.877999] LR [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 [ 568.878004] Call Trace: [ 568.878008] [c0000001fa957150] [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 (unreliable) [ 568.878017] [c0000001fa957200] [c0000000002fb070] .ext4_mb_discard_lg_preallocations+0x394/0x444 [ 568.878025] [c0000001fa957340] [c0000000002fb45c] .ext4_mb_release_context+0x33c/0x734 [ 568.878032] [c0000001fa957440] [c0000000002fbcf8] .ext4_mb_new_blocks+0x4a4/0x5f4 [ 568.878039] [c0000001fa957510] [c0000000002ef56c] .ext4_ext_map_blocks+0xc28/0x1178 [ 568.878047] [c0000001fa957640] [c0000000002c1a94] .ext4_map_blocks+0x2c8/0x490 [ 568.878054] [c0000001fa957730] [c0000000002c536c] .ext4_writepages+0x738/0xc60 [ 568.878062] [c0000001fa957950] [c000000000168a78] .do_writepages+0x5c/0x80 [ 568.878069] [c0000001fa9579d0] [c00000000015d1c4] .__filemap_fdatawrite_range+0x88/0xb0 [ 568.878078] [c0000001fa957aa0] [c00000000015d23c] .filemap_write_and_wait_range+0x50/0xfc [ 568.878085] [c0000001fa957b30] [c0000000002b8edc] .ext4_sync_file+0x220/0x3c4 [ 568.878092] [c0000001fa957be0] [c0000000001f849c] .vfs_fsync_range+0x64/0x80 [ 568.878098] [c0000001fa957c70] [c0000000001f84f0] .vfs_fsync+0x38/0x4c [ 568.878105] [c0000001fa957d00] [c0000000001f87f4] .do_fsync+0x54/0x90 [ 568.878111] [c0000001fa957db0] [c0000000001f8894] .SyS_fsync+0x28/0x3c [ 568.878120] [c0000001fa957e30] [c000000000009c88] syscall_exit+0x0/0x7c [ 568.878125] Instruction dump: [ 568.878130] 60000000 813d0034 81610070 38000000 7f8b4800 419e001c 813f007c 7d2bfe70 [ 568.878144] 7d604a78 7c005850 54000ffe 7c0007b4 <0b000000> e8a10076 e87f0090 7fa4eb78 [ 568.878160] ---[ end trace 594d911d9654770b ]--- In addition fix the STD_GROUP optimization so that it works for bigalloc file systems as well. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Li Zhong <lizhongfs@gmail.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b8e4a47 upstream. On a CPU that never ran anything, both the active and reserved ASID fields are set to zero. In this case the ASID_TO_IDX() macro will return -1, which is not a very useful value to index a bitmap. Instead of trying to offset the ASID so that ASID #1 is actually bit 0 in the asid_map bitmap, just always ignore bit 0 and start the search from bit 1. This makes the code a bit more readable, and without risk of OoB access. Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8d0527 upstream. commit 2f7021a "cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()" caused a regression in CPU hotplug, because it lead to a deadlock between cpufreq governor worker thread and the CPU hotplug writer task. Lockdep splat corresponding to this deadlock is shown below: [ 60.277396] ====================================================== [ 60.277400] [ INFO: possible circular locking dependency detected ] [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted [ 60.277411] ------------------------------------------------------- [ 60.277417] bash/2225 is trying to acquire lock: [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280 [ 60.277444] but task is already holding lock: [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60 [ 60.277465] which lock already depends on the new lock. [ 60.277472] the existing dependency chain (in reverse order) is: [ 60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}: [ 60.277490] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200 [ 60.277503] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410 [ 60.277514] [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60 [ 60.277522] [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0 [ 60.277532] [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0 [ 60.277543] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0 [ 60.277552] [<ffffffff81063d31>] worker_thread+0x121/0x3a0 [ 60.277560] [<ffffffff8106ae2b>] kthread+0xdb/0xe0 [ 60.277569] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0 [ 60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}: [ 60.277592] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200 [ 60.277600] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410 [ 60.277608] [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0 [ 60.277616] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0 [ 60.277624] [<ffffffff81063d31>] worker_thread+0x121/0x3a0 [ 60.277633] [<ffffffff8106ae2b>] kthread+0xdb/0xe0 [ 60.277640] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0 [ 60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}: [ 60.277661] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30 [ 60.277669] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200 [ 60.277677] [<ffffffff810621ed>] flush_work+0x3d/0x280 [ 60.277685] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120 [ 60.277693] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20 [ 60.277701] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0 [ 60.277709] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20 [ 60.277719] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100 [ 60.277728] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0 [ 60.277737] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c [ 60.277747] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110 [ 60.277759] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10 [ 60.277768] [<ffffffff815a0a68>] _cpu_down+0x88/0x330 [ 60.277779] [<ffffffff815a0d46>] cpu_down+0x36/0x50 [ 60.277788] [<ffffffff815a2748>] store_online+0x98/0xd0 [ 60.277796] [<ffffffff81452a28>] dev_attr_store+0x18/0x30 [ 60.277806] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150 [ 60.277818] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0 [ 60.277826] [<ffffffff811686fc>] SyS_write+0x4c/0xa0 [ 60.277834] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5 [ 60.277842] other info that might help us debug this: [ 60.277848] Chain exists of: (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock [ 60.277864] Possible unsafe locking scenario: [ 60.277869] CPU0 CPU1 [ 60.277873] ---- ---- [ 60.277877] lock(cpu_hotplug.lock); [ 60.277885] lock(&j_cdbs->timer_mutex); [ 60.277892] lock(cpu_hotplug.lock); [ 60.277900] lock((&(&j_cdbs->work)->work)); [ 60.277907] *** DEADLOCK *** [ 60.277915] 6 locks held by bash/2225: [ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0 [ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150 [ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150 [ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20 [ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50 [ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60 [ 60.278023] stack backtrace: [ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 [ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011 [ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38 [ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2 [ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00 [ 60.278081] Call Trace: [ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b [ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5 [ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30 [ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80 [ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200 [ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280 [ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280 [ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280 [ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140 [ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120 [ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120 [ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20 [ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0 [ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20 [ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100 [ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0 [ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c [ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110 [ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10 [ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330 [ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20 [ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50 [ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0 [ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30 [ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150 [ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0 [ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0 [ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0 [ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5 [ 60.280582] smpboot: CPU 1 is now offline The intention of that commit was to avoid warnings during CPU hotplug, which indicated that offline CPUs were getting IPIs from the cpufreq governor's work items. But the real root-cause of that problem was commit a66b2e5 (cpufreq: Preserve sysfs files across suspend/resume) because it totally skipped all the cpufreq callbacks during CPU hotplug in the suspend/resume path, and hence it never actually shut down the cpufreq governor's worker threads during CPU offline in the suspend/resume path. Reflecting back, the reason why we never suspected that commit as the root-cause earlier, was that the original issue was reported with just the halt command and nobody had brought in suspend/resume to the equation. The reason for _that_ in turn, as it turns out, is that earlier halt/shutdown was being done by disabling non-boot CPUs while tasks were frozen, just like suspend/resume.... but commit cf7df37 (reboot: migrate shutdown/reboot to boot cpu) which came somewhere along that very same time changed that logic: shutdown/halt no longer takes CPUs offline. Thus, the test-cases for reproducing the bug were vastly different and thus we went totally off the trail. Overall, it was one hell of a confusion with so many commits affecting each other and also affecting the symptoms of the problems in subtle ways. Finally, now since the original problematic commit (a66b2e5) has been completely reverted, revert this intermediate fix too (2f7021a), to fix the CPU hotplug deadlock. Phew! Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Tested-by: Peter Wu <lekensteyn@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 058ebd0 upstream. Jiri managed to trigger this warning: [] ====================================================== [] [ INFO: possible circular locking dependency detected ] [] 3.10.0+ torvalds#228 Tainted: G W [] ------------------------------------------------------- [] p/6613 is trying to acquire lock: [] (rcu_node_0){..-...}, at: [<ffffffff810ca797>] rcu_read_unlock_special+0xa7/0x250 [] [] but task is already holding lock: [] (&ctx->lock){-.-...}, at: [<ffffffff810f2879>] perf_lock_task_context+0xd9/0x2c0 [] [] which lock already depends on the new lock. [] [] the existing dependency chain (in reverse order) is: [] [] -> #4 (&ctx->lock){-.-...}: [] -> #3 (&rq->lock){-.-.-.}: [] -> #2 (&p->pi_lock){-.-.-.}: [] -> #1 (&rnp->nocb_gp_wq[1]){......}: [] -> #0 (rcu_node_0){..-...}: Paul was quick to explain that due to preemptible RCU we cannot call rcu_read_unlock() while holding scheduler (or nested) locks when part of the read side critical section was preemptible. Therefore solve it by making the entire RCU read side non-preemptible. Also pull out the retry from under the non-preempt to play nice with RT. Reported-by: Jiri Olsa <jolsa@redhat.com> Helped-out-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8965779, with some bits from commit b7b1bfc ("ipv6: split duplicate address detection and router solicitation timer") to get the __ipv6_get_lladdr() used by this patch. ] dingtianhong reported the following deadlock detected by lockdep: ====================================================== [ INFO: possible circular locking dependency detected ] 3.4.24.05-0.1-default #1 Not tainted ------------------------------------------------------- ksoftirqd/0/3 is trying to acquire lock: (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120 but task is already holding lock: (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mc->mca_lock){+.+...}: [<ffffffff810a8027>] validate_chain+0x637/0x730 [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500 [<ffffffff810a8734>] lock_acquire+0x114/0x150 [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60 [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120 [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60 [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80 [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0 [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80 [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20 [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60 [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80 [<ffffffff813d9360>] dev_change_flags+0x40/0x70 [<ffffffff813ea627>] do_setlink+0x237/0x8a0 [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600 [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310 [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0 [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40 [<ffffffff81403e20>] netlink_unicast+0x140/0x180 [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380 [<ffffffff813c4252>] sock_sendmsg+0x112/0x130 [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460 [<ffffffff813c5544>] sys_sendmsg+0x44/0x70 [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b -> #0 (&ndev->lock){+.+...}: [<ffffffff810a798e>] check_prev_add+0x3de/0x440 [<ffffffff810a8027>] validate_chain+0x637/0x730 [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500 [<ffffffff810a8734>] lock_acquire+0x114/0x150 [<ffffffff814f6c82>] rt_read_lock+0x42/0x60 [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120 [<ffffffff8149b036>] mld_newpack+0xb6/0x160 [<ffffffff8149b18b>] add_grhead+0xab/0xc0 [<ffffffff8149d03b>] add_grec+0x3ab/0x460 [<ffffffff8149d14a>] mld_send_report+0x5a/0x150 [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0 [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0 [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0 [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0 [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0 [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210 [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0 [<ffffffff8106c7de>] kthread+0xae/0xc0 [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10 actually we can just hold idev->lock before taking pmc->mca_lock, and avoid taking idev->lock again when iterating idev->addr_list, since the upper callers of mld_newpack() already take read_lock_bh(&idev->lock). Reported-by: dingtianhong <dingtianhong@huawei.com> Cc: dingtianhong <dingtianhong@huawei.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David S. Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Tested-by: Ding Tianhong <dingtianhong@huawei.com> Tested-by: Chen Weilong <chenweilong@huawei.com> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…ET pending data [ Upstream commit 8822b64 ] We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ torvalds#37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Dave Jones <davej@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 75a493e ] If the socket had an IPV6_MTU value set, ip6_append_data_mtu lost track of this when appending the second frame on a corked socket. This results in the following splat: [37598.993962] ------------[ cut here ]------------ [37598.994008] kernel BUG at net/core/skbuff.c:2064! [37598.994008] invalid opcode: 0000 [#1] SMP [37598.994008] Modules linked in: tcp_lp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media vfat fat usb_storage fuse ebtable_nat xt_CHECKSUM bridge stp llc ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat +nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi +scsi_transport_iscsi rfcomm bnep iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant arc4 iwldvm mac80211 snd_hda_intel acpi_cpufreq mperf coretemp snd_hda_codec microcode cdc_wdm cdc_acm [37598.994008] snd_hwdep cdc_ether snd_seq snd_seq_device usbnet mii joydev btusb snd_pcm bluetooth i2c_i801 e1000e lpc_ich mfd_core ptp iwlwifi pps_core snd_page_alloc mei cfg80211 snd_timer thinkpad_acpi snd tpm_tis soundcore rfkill tpm tpm_bios vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc +dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core wmi video [37598.994008] CPU 0 [37598.994008] Pid: 27320, comm: t2 Not tainted 3.9.6-200.fc18.x86_64 #1 LENOVO 27744PG/27744PG [37598.994008] RIP: 0010:[<ffffffff815443a5>] [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330 [37598.994008] RSP: 0018:ffff88003670da18 EFLAGS: 00010202 [37598.994008] RAX: ffff88018105c018 RBX: 0000000000000004 RCX: 00000000000006c0 [37598.994008] RDX: ffff88018105a6c0 RSI: ffff88018105a000 RDI: ffff8801e1b0aa00 [37598.994008] RBP: ffff88003670da78 R08: 0000000000000000 R09: ffff88018105c040 [37598.994008] R10: ffff8801e1b0aa00 R11: 0000000000000000 R12: 000000000000fff8 [37598.994008] R13: 00000000000004fc R14: 00000000ffff0504 R15: 0000000000000000 [37598.994008] FS: 00007f28eea59740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000 [37598.994008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [37598.994008] CR2: 0000003d935789e0 CR3: 00000000365cb000 CR4: 00000000000407f0 [37598.994008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [37598.994008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [37598.994008] Process t2 (pid: 27320, threadinfo ffff88003670c000, task ffff88022c162ee0) [37598.994008] Stack: [37598.994008] ffff88022e098a00 ffff88020f973fc0 0000000000000008 00000000000004c8 [37598.994008] ffff88020f973fc0 00000000000004c4 ffff88003670da78 ffff8801e1b0a200 [37598.994008] 0000000000000018 00000000000004c8 ffff88020f973fc0 00000000000004c4 [37598.994008] Call Trace: [37598.994008] [<ffffffff815fc21f>] ip6_append_data+0xccf/0xfe0 [37598.994008] [<ffffffff8158d9f0>] ? ip_copy_metadata+0x1a0/0x1a0 [37598.994008] [<ffffffff81661f66>] ? _raw_spin_lock_bh+0x16/0x40 [37598.994008] [<ffffffff8161548d>] udpv6_sendmsg+0x1ed/0xc10 [37598.994008] [<ffffffff812a2845>] ? sock_has_perm+0x75/0x90 [37598.994008] [<ffffffff815c3693>] inet_sendmsg+0x63/0xb0 [37598.994008] [<ffffffff812a2973>] ? selinux_socket_sendmsg+0x23/0x30 [37598.994008] [<ffffffff8153a450>] sock_sendmsg+0xb0/0xe0 [37598.994008] [<ffffffff810135d1>] ? __switch_to+0x181/0x4a0 [37598.994008] [<ffffffff8153d97d>] sys_sendto+0x12d/0x180 [37598.994008] [<ffffffff810dfb64>] ? __audit_syscall_entry+0x94/0xf0 [37598.994008] [<ffffffff81020ed1>] ? syscall_trace_enter+0x231/0x240 [37598.994008] [<ffffffff8166a7e7>] tracesys+0xdd/0xe2 [37598.994008] Code: fe 07 00 00 48 c7 c7 04 28 a6 81 89 45 a0 4c 89 4d b8 44 89 5d a8 e8 1b ac b1 ff 44 8b 5d a8 4c 8b 4d b8 8b 45 a0 e9 cf fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 48 [37598.994008] RIP [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330 [37598.994008] RSP <ffff88003670da18> [37599.007323] ---[ end trace d69f6a17f8ac8eee ]--- While there, also check if path mtu discovery is activated for this socket. The logic was adapted from ip6_append_data when first writing on the corked socket. This bug was introduced with commit 0c18337 ("ipv6: fix incorrect ipsec fragment"). v2: a) Replace IPV6_PMTU_DISC_DO with IPV6_PMTUDISC_PROBE. b) Don't pass ipv6_pinfo to ip6_append_data_mtu (suggestion by Gao feng, thanks!). c) Change mtu to unsigned int, else we get a warning about non-matching types because of the min()-macro type-check. Acked-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…g it expire [ Upstream commit 1eb4f75 ] We could end up expiring a route which is part of an ecmp route set. Doing so would invalidate the rt->rt6i_nsiblings calculations and could provoke the following panic: [ 80.144667] ------------[ cut here ]------------ [ 80.145172] kernel BUG at net/ipv6/ip6_fib.c:733! [ 80.145172] invalid opcode: 0000 [#1] SMP [ 80.145172] Modules linked in: 8021q nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables +snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer virtio_balloon snd soundcore i2c_piix4 i2c_core virtio_net virtio_blk [ 80.145172] CPU: 1 PID: 786 Comm: ping6 Not tainted 3.10.0+ torvalds#118 [ 80.145172] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 80.145172] task: ffff880117fa0000 ti: ffff880118770000 task.ti: ffff880118770000 [ 80.145172] RIP: 0010:[<ffffffff815f3b5d>] [<ffffffff815f3b5d>] fib6_add+0x75d/0x830 [ 80.145172] RSP: 0018:ffff880118771798 EFLAGS: 00010202 [ 80.145172] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011350e480 [ 80.145172] RDX: ffff88011350e238 RSI: 0000000000000004 RDI: ffff88011350f738 [ 80.145172] RBP: ffff880118771848 R08: ffff880117903280 R09: 0000000000000001 [ 80.145172] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011350f680 [ 80.145172] R13: ffff880117903280 R14: ffff880118771890 R15: ffff88011350ef90 [ 80.145172] FS: 00007f02b5127740(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000 [ 80.145172] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 80.145172] CR2: 00007f981322a000 CR3: 00000001181b1000 CR4: 00000000000006e0 [ 80.145172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 80.145172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 80.145172] Stack: [ 80.145172] 0000000000000001 ffff880100000000 ffff880100000000 ffff880117903280 [ 80.145172] 0000000000000000 ffff880119a4cf00 0000000000000400 00000000000007fa [ 80.145172] 0000000000000000 0000000000000000 0000000000000000 ffff88011350f680 [ 80.145172] Call Trace: [ 80.145172] [<ffffffff815eeceb>] ? rt6_bind_peer+0x4b/0x90 [ 80.145172] [<ffffffff815ed985>] __ip6_ins_rt+0x45/0x70 [ 80.145172] [<ffffffff815eee35>] ip6_ins_rt+0x35/0x40 [ 80.145172] [<ffffffff815ef1e4>] ip6_pol_route.isra.44+0x3a4/0x4b0 [ 80.145172] [<ffffffff815ef34a>] ip6_pol_route_output+0x2a/0x30 [ 80.145172] [<ffffffff81616077>] fib6_rule_action+0xd7/0x210 [ 80.145172] [<ffffffff815ef320>] ? ip6_pol_route_input+0x30/0x30 [ 80.145172] [<ffffffff81553026>] fib_rules_lookup+0xc6/0x140 [ 80.145172] [<ffffffff81616374>] fib6_rule_lookup+0x44/0x80 [ 80.145172] [<ffffffff815ef320>] ? ip6_pol_route_input+0x30/0x30 [ 80.145172] [<ffffffff815edea3>] ip6_route_output+0x73/0xb0 [ 80.145172] [<ffffffff815dfdf3>] ip6_dst_lookup_tail+0x2c3/0x2e0 [ 80.145172] [<ffffffff813007b1>] ? list_del+0x11/0x40 [ 80.145172] [<ffffffff81082a4c>] ? remove_wait_queue+0x3c/0x50 [ 80.145172] [<ffffffff815dfe4d>] ip6_dst_lookup_flow+0x3d/0xa0 [ 80.145172] [<ffffffff815fda77>] rawv6_sendmsg+0x267/0xc20 [ 80.145172] [<ffffffff815a8a83>] inet_sendmsg+0x63/0xb0 [ 80.145172] [<ffffffff8128eb93>] ? selinux_socket_sendmsg+0x23/0x30 [ 80.145172] [<ffffffff815218d6>] sock_sendmsg+0xa6/0xd0 [ 80.145172] [<ffffffff81524a68>] SYSC_sendto+0x128/0x180 [ 80.145172] [<ffffffff8109825c>] ? update_curr+0xec/0x170 [ 80.145172] [<ffffffff81041d09>] ? kvm_clock_get_cycles+0x9/0x10 [ 80.145172] [<ffffffff810afd1e>] ? __getnstimeofday+0x3e/0xd0 [ 80.145172] [<ffffffff8152509e>] SyS_sendto+0xe/0x10 [ 80.145172] [<ffffffff8164efd9>] system_call_fastpath+0x16/0x1b [ 80.145172] Code: fe ff ff 41 f6 45 2a 06 0f 85 ca fe ff ff 49 8b 7e 08 4c 89 ee e8 94 ef ff ff e9 b9 fe ff ff 48 8b 82 28 05 00 00 e9 01 ff ff ff <0f> 0b 49 8b 54 24 30 0d 00 00 40 00 89 83 14 01 00 00 48 89 53 [ 80.145172] RIP [<ffffffff815f3b5d>] fib6_add+0x75d/0x830 [ 80.145172] RSP <ffff880118771798> [ 80.387413] ---[ end trace 02f20b7a8b81ed95 ]--- [ 80.390154] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 110ecd6 ] p9_release_pages() would attempt to dereference one value past the end of pages[]. This would cause the following crashes: [ 6293.171817] BUG: unable to handle kernel paging request at ffff8807c96f3000 [ 6293.174146] IP: [<ffffffff8412793b>] p9_release_pages+0x3b/0x60 [ 6293.176447] PGD 79c5067 PUD 82c1e3067 PMD 82c197067 PTE 80000007c96f3060 [ 6293.180060] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 6293.180060] Modules linked in: [ 6293.180060] CPU: 62 PID: 174043 Comm: modprobe Tainted: G W 3.10.0-next-20130710-sasha #3954 [ 6293.180060] task: ffff8807b803b000 ti: ffff880787dde000 task.ti: ffff880787dde000 [ 6293.180060] RIP: 0010:[<ffffffff8412793b>] [<ffffffff8412793b>] p9_release_pages+0x3b/0x60 [ 6293.214316] RSP: 0000:ffff880787ddfc28 EFLAGS: 00010202 [ 6293.214316] RAX: 0000000000000001 RBX: ffff8807c96f2ff8 RCX: 0000000000000000 [ 6293.222017] RDX: ffff8807b803b000 RSI: 0000000000000001 RDI: ffffea001c7e3d40 [ 6293.222017] RBP: ffff880787ddfc48 R08: 0000000000000000 R09: 0000000000000000 [ 6293.222017] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001 [ 6293.222017] R13: 0000000000000001 R14: ffff8807cc50c070 R15: ffff8807cc50c070 [ 6293.222017] FS: 00007f572641d700(0000) GS:ffff8807f3600000(0000) knlGS:0000000000000000 [ 6293.256784] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 6293.256784] CR2: ffff8807c96f3000 CR3: 00000007c8e81000 CR4: 00000000000006e0 [ 6293.256784] Stack: [ 6293.256784] ffff880787ddfcc8 ffff880787ddfcc8 0000000000000000 ffff880787ddfcc8 [ 6293.256784] ffff880787ddfd48 ffffffff84128be8 ffff880700000002 0000000000000001 [ 6293.256784] ffff8807b803b000 ffff880787ddfce0 0000100000000000 0000000000000000 [ 6293.256784] Call Trace: [ 6293.256784] [<ffffffff84128be8>] p9_virtio_zc_request+0x598/0x630 [ 6293.256784] [<ffffffff8115c610>] ? wake_up_bit+0x40/0x40 [ 6293.256784] [<ffffffff841209b1>] p9_client_zc_rpc+0x111/0x3a0 [ 6293.256784] [<ffffffff81174b78>] ? sched_clock_cpu+0x108/0x120 [ 6293.256784] [<ffffffff84122a21>] p9_client_read+0xe1/0x2c0 [ 6293.256784] [<ffffffff81708a90>] v9fs_file_read+0x90/0xc0 [ 6293.256784] [<ffffffff812bd073>] vfs_read+0xc3/0x130 [ 6293.256784] [<ffffffff811a78bd>] ? trace_hardirqs_on+0xd/0x10 [ 6293.256784] [<ffffffff812bd5a2>] SyS_read+0x62/0xa0 [ 6293.256784] [<ffffffff841a1a00>] tracesys+0xdd/0xe2 [ 6293.256784] Code: 66 90 48 89 fb 41 89 f5 48 8b 3f 48 85 ff 74 29 85 f6 74 25 45 31 e4 66 0f 1f 84 00 00 00 00 00 e8 eb 14 12 fd 41 ff c4 49 63 c4 <48> 8b 3c c3 48 85 ff 74 05 45 39 e5 75 e7 48 83 c4 08 5b 41 5c [ 6293.256784] RIP [<ffffffff8412793b>] p9_release_pages+0x3b/0x60 [ 6293.256784] RSP <ffff880787ddfc28> [ 6293.256784] CR2: ffff8807c96f3000 [ 6293.256784] ---[ end trace 50822ee72cd360fc ]--- Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2c8a018 ] We rename the dummy in modprobe.conf like this: install dummy0 /sbin/modprobe -o dummy0 --ignore-install dummy install dummy1 /sbin/modprobe -o dummy1 --ignore-install dummy We got oops when we run the command: modprobe dummy0 modprobe dummy1 ------------[ cut here ]------------ [ 3302.187584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 3302.195411] IP: [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0 [ 3302.201844] PGD 85c94a067 PUD 8517bd067 PMD 0 [ 3302.206305] Oops: 0002 [#1] SMP [ 3302.299737] task: ffff88105ccea300 ti: ffff880eba4a0000 task.ti: ffff880eba4a0000 [ 3302.307186] RIP: 0010:[<ffffffff813fe62a>] [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0 [ 3302.316044] RSP: 0018:ffff880eba4a1dd8 EFLAGS: 00010246 [ 3302.321332] RAX: 0000000000000000 RBX: ffffffff81a9d738 RCX: 0000000000000002 [ 3302.328436] RDX: 0000000000000000 RSI: ffffffffa04d602c RDI: ffff880eba4a1dd8 [ 3302.335541] RBP: ffff880eba4a1e18 R08: dead000000200200 R09: dead000000100100 [ 3302.342644] R10: 0000000000000080 R11: 0000000000000003 R12: ffffffff81a9d788 [ 3302.349748] R13: ffffffffa04d7020 R14: ffffffff81a9d670 R15: ffff880eba4a1dd8 [ 3302.364910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3302.370630] CR2: 0000000000000008 CR3: 000000085e15e000 CR4: 00000000000427e0 [ 3302.377734] DR0: 0000000000000003 DR1: 00000000000000b0 DR2: 0000000000000001 [ 3302.384838] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 3302.391940] Stack: [ 3302.393944] ffff880eba4a1dd8 ffff880eba4a1dd8 ffff880eba4a1e18 ffffffffa04d70c0 [ 3302.401350] 00000000ffffffef ffffffffa01a8000 0000000000000000 ffffffff816111c8 [ 3302.408758] ffff880eba4a1e48 ffffffffa01a80be ffff880eba4a1e48 ffffffffa04d70c0 [ 3302.416164] Call Trace: [ 3302.418605] [<ffffffffa01a8000>] ? 0xffffffffa01a7fff [ 3302.423727] [<ffffffffa01a80be>] dummy_init_module+0xbe/0x1000 [dummy0] [ 3302.430405] [<ffffffffa01a8000>] ? 0xffffffffa01a7fff [ 3302.435535] [<ffffffff81000322>] do_one_initcall+0x152/0x1b0 [ 3302.441263] [<ffffffff810ab24b>] do_init_module+0x7b/0x200 [ 3302.446824] [<ffffffff810ad3d2>] load_module+0x4e2/0x530 [ 3302.452215] [<ffffffff8127ae40>] ? ddebug_dyndbg_boot_param_cb+0x60/0x60 [ 3302.458979] [<ffffffff810ad5f1>] SyS_init_module+0xd1/0x130 [ 3302.464627] [<ffffffff814b9652>] system_call_fastpath+0x16/0x1b [ 3302.490090] RIP [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0 [ 3302.496607] RSP <ffff880eba4a1dd8> [ 3302.500084] CR2: 0000000000000008 [ 3302.503466] ---[ end trace 8342d49cd49f78ed ]--- The reason is that when loading dummy, if __rtnl_link_register() return failed, the init_module should return and avoid take the wrong path. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 352900b ] Recently had this backtrace reported: WARNING: at lib/dma-debug.c:937 check_unmap+0x47d/0x930() Hardware name: System Product Name ATL1E 0000:02:00.0: DMA-API: device driver failed to check map error[device address=0x00000000cbfd1000] [size=90 bytes] [mapped as single] Modules linked in: xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek iTCO_wdt iTCO_vendor_support snd_hda_intel acpi_cpufreq mperf coretemp btrfs zlib_deflate snd_hda_codec snd_hwdep microcode raid6_pq libcrc32c snd_seq usblp serio_raw xor snd_seq_device joydev snd_pcm snd_page_alloc snd_timer snd lpc_ich i2c_i801 soundcore mfd_core atl1e asus_atk0110 ata_generic pata_acpi radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core pata_marvell uinput Pid: 314, comm: systemd-journal Not tainted 3.9.0-0.rc6.git2.3.fc19.x86_64 #1 Call Trace: <IRQ> [<ffffffff81069106>] warn_slowpath_common+0x66/0x80 [<ffffffff8106916c>] warn_slowpath_fmt+0x4c/0x50 [<ffffffff8138151d>] check_unmap+0x47d/0x930 [<ffffffff810ad048>] ? sched_clock_cpu+0xa8/0x100 [<ffffffff81381a2f>] debug_dma_unmap_page+0x5f/0x70 [<ffffffff8137ce30>] ? unmap_single+0x20/0x30 [<ffffffffa01569a1>] atl1e_intr+0x3a1/0x5b0 [atl1e] [<ffffffff810d53fd>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff81119636>] handle_irq_event_percpu+0x56/0x390 [<ffffffff811199ad>] handle_irq_event+0x3d/0x60 [<ffffffff8111cb6a>] handle_fasteoi_irq+0x5a/0x100 [<ffffffff8101c36f>] handle_irq+0xbf/0x150 [<ffffffff811dcb2f>] ? file_sb_list_del+0x3f/0x50 [<ffffffff81073b10>] ? irq_enter+0x50/0xa0 [<ffffffff8172738d>] do_IRQ+0x4d/0xc0 [<ffffffff811dcb2f>] ? file_sb_list_del+0x3f/0x50 [<ffffffff8171c6b2>] common_interrupt+0x72/0x72 <EOI> [<ffffffff810db5b2>] ? lock_release+0xc2/0x310 [<ffffffff8109ea04>] lg_local_unlock_cpu+0x24/0x50 [<ffffffff811dcb2f>] file_sb_list_del+0x3f/0x50 [<ffffffff811dcb6d>] fput+0x2d/0xc0 [<ffffffff811d8ea1>] filp_close+0x61/0x90 [<ffffffff811fae4d>] __close_fd+0x8d/0x150 [<ffffffff811d8ef0>] sys_close+0x20/0x50 [<ffffffff81725699>] system_call_fastpath+0x16/0x1b The usual straighforward failure to check for dma_mapping_error after a map operation is completed. This patch should fix it, the reporter wandered off after filing this bz: https://bugzilla.redhat.com/show_bug.cgi?id=954170 and I don't have hardware to test, but the fix is pretty straightforward, so I figured I'd post it for review. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Jay Cliburn <jcliburn@gmail.com> CC: Chris Snook <chris.snook@gmail.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88d84ac upstream. Fix the following: BUG: key ffff88043bdd0330 not in .data! ------------[ cut here ]------------ WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0() DEBUG_LOCKS_WARN_ON(1) Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor microcode CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1 Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013 0000000000000009 ffff880439a1d920 ffffffff8160a9a9 ffff880439a1d958 ffffffff8103d9e0 ffff88043af4a510 ffffffff81a16e11 0000000000000000 ffff88043bdd0330 0000000000000000 ffff880439a1d9b8 ffffffff8103dacc Call Trace: dump_stack warn_slowpath_common warn_slowpath_fmt lockdep_init_map ? trace_hardirqs_on_caller ? trace_hardirqs_on debug_mutex_init __mutex_init bus_register edac_create_sysfs_mci_device edac_mc_add_mc sbridge_probe pci_device_probe driver_probe_device __driver_attach ? driver_probe_device bus_for_each_dev driver_attach bus_add_driver driver_register __pci_register_driver ? 0xffffffffa0010fff sbridge_init ? 0xffffffffa0010fff do_one_initcall load_module ? unset_module_init_ro_nx SyS_init_module tracesys ---[ end trace d24a70b0d3ddf733 ]--- EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:3f:0e.0 EDAC sbridge: Driver loaded. What happens is that bus_register needs a statically allocated lock_key because the last is handed in to lockdep. However, struct mem_ctl_info embeds struct bus_type (the whole struct, not a pointer to it) and the whole thing gets dynamically allocated. Fix this by using a statically allocated struct bus_type for the MC bus. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6287e73 upstream. Commit 8ef6e62 (ARM: footbridge: use fixed PCI i/o mapping) broke booting on my netwinder. Before that, everything boots fine. Since then, it crashes on boot. With earlyprintk, I see it BUG-ing like so: kernel BUG at lib/ioremap.c:27! Internal error: Oops - BUG: 0 [#1] ARM ... [<c0139b54>] (ioremap_page_range+0x128/0x154) from [<c02e6a6c>] (dc21285_setup+0xd0/0x114) [<c02e6a6c>] (dc21285_setup+0xd0/0x114) from [<c02e4874>] (pci_common_init+0xa0/0x298) [<c02e4874>] (pci_common_init+0xa0/0x298) from [<c02e793c>] (netwinder_pci_init+0xc/0x18) [<c02e793c>] (netwinder_pci_init+0xc/0x18) from [<c02e27d0>] (do_one_initcall+0xb4/0x180) ... Russell points out it's because of overlapping PCI mappings that was added with the aforementioned commit. Rob thought the code would re-use the static mapping, but that turns out to not be the case and instead hits the BUG further down. After deleting this hunk as suggested by Russel, the system boots up fine again and all my PCI devices work (IDE, ethernet, the DC21285). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…ike page) commit e0d4075 upstream. Unfortunately, I never committed the fix to a nasty oops which can occur as a result of that commit: ------------[ cut here ]------------ kernel BUG at /home/olof/work/batch/include/linux/mm.h:414! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 490 Comm: killall5 Not tainted 3.11.0-rc3-00288-gabe0308 #53 task: e90acac0 ti: e9be8000 task.ti: e9be8000 PC is at special_mapping_fault+0xa4/0xc4 LR is at __do_fault+0x68/0x48c This doesn't show up unless you do quite a bit of testing; a simple boot test does not do this, so all my nightly tests were passing fine. The reason for this is that install_special_mapping() expects the page array to stick around, and as this was only inserting one page which was stored on the kernel stack, that's why this was blowing up. Reported-by: Olof Johansson <olof@lixom.net> Tested-by: Olof Johansson <olof@lixom.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 905a6f9 ] Otherwise we end up dereferencing the already freed net->ipv6.mrt pointer which leads to a panic (from Srivatsa S. Bhat): BUG: unable to handle kernel paging request at ffff882018552020 IP: [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6] PGD 290a067 PUD 207ffe0067 PMD 207ff1d067 PTE 8000002018552060 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: ebtable_nat ebtables nfs fscache nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables nfsd lockd nfs_acl exportfs auth_rpcgss autofs4 sunrpc 8021q garp bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter +ip6_tables ipv6 vfat fat vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support cdc_ether usbnet mii microcode i2c_i801 i2c_core lpc_ich mfd_core shpchp ioatdma dca mlx4_core be2net wmi acpi_cpufreq mperf ext4 jbd2 mbcache dm_mirror dm_region_hash dm_log dm_mod CPU: 0 PID: 7 Comm: kworker/u33:0 Not tainted 3.11.0-rc1-ea45e-a #4 Hardware name: IBM -[8737R2A]-/00Y2738, BIOS -[B2E120RUS-1.20]- 11/30/2012 Workqueue: netns cleanup_net task: ffff8810393641c0 ti: ffff881039366000 task.ti: ffff881039366000 RIP: 0010:[<ffffffffa0366b02>] [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6] RSP: 0018:ffff881039367bd8 EFLAGS: 00010286 RAX: ffff881039367fd8 RBX: ffff882018552000 RCX: dead000000200200 RDX: 0000000000000000 RSI: ffff881039367b68 RDI: ffff881039367b68 RBP: ffff881039367bf8 R08: ffff881039367b68 R09: 2222222222222222 R10: 2222222222222222 R11: 2222222222222222 R12: ffff882015a7a040 R13: ffff882014eb89c0 R14: ffff8820289e2800 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88103fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff882018552020 CR3: 0000000001c0b000 CR4: 00000000000407f0 Stack: ffff881039367c18 ffff882014eb89c0 ffff882015e28c00 0000000000000000 ffff881039367c18 ffffffffa034d9d1 ffff8820289e2800 ffff882014eb89c0 ffff881039367c58 ffffffff815bdecb ffffffff815bddf2 ffff882014eb89c0 Call Trace: [<ffffffffa034d9d1>] rawv6_close+0x21/0x40 [ipv6] [<ffffffff815bdecb>] inet_release+0xfb/0x220 [<ffffffff815bddf2>] ? inet_release+0x22/0x220 [<ffffffffa032686f>] inet6_release+0x3f/0x50 [ipv6] [<ffffffff8151c1d9>] sock_release+0x29/0xa0 [<ffffffff81525520>] sk_release_kernel+0x30/0x70 [<ffffffffa034f14b>] icmpv6_sk_exit+0x3b/0x80 [ipv6] [<ffffffff8152fff9>] ops_exit_list+0x39/0x60 [<ffffffff815306fb>] cleanup_net+0xfb/0x1a0 [<ffffffff81075e3a>] process_one_work+0x1da/0x610 [<ffffffff81075dc9>] ? process_one_work+0x169/0x610 [<ffffffff81076390>] worker_thread+0x120/0x3a0 [<ffffffff81076270>] ? process_one_work+0x610/0x610 [<ffffffff8107da2e>] kthread+0xee/0x100 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70 [<ffffffff8162a99c>] ret_from_fork+0x7c/0xb0 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70 Code: 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 4c 8b 67 30 49 89 fd e8 db 3c 1e e1 49 8b 9c 24 90 08 00 00 48 85 db 74 06 <4c> 39 6b 20 74 20 bb f3 ff ff ff e8 8e 3c 1e e1 89 d8 4c 8b 65 RIP [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6] RSP <ffff881039367bd8> CR2: ffff882018552020 Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c74f2b2 ] Requesting external module with cb_lock taken can result in the deadlock like showed below: [ 2458.111347] Showing all locks held in the system: [ 2458.111347] 1 lock held by NetworkManager/582: [ 2458.111347] #0: (cb_lock){++++++}, at: [<ffffffff8162bc79>] genl_rcv+0x19/0x40 [ 2458.111347] 1 lock held by modprobe/603: [ 2458.111347] #0: (cb_lock){++++++}, at: [<ffffffff8162baa5>] genl_lock_all+0x15/0x30 [ 2461.579457] SysRq : Show Blocked State [ 2461.580103] task PC stack pid father [ 2461.580103] NetworkManager D ffff880034b84500 4040 582 1 0x00000080 [ 2461.580103] ffff8800197ff720 0000000000000046 00000000001d5340 ffff8800197fffd8 [ 2461.580103] ffff8800197fffd8 00000000001d5340 ffff880019631700 7fffffffffffffff [ 2461.580103] ffff8800197ff880 ffff8800197ff878 ffff880019631700 ffff880019631700 [ 2461.580103] Call Trace: [ 2461.580103] [<ffffffff817355f9>] schedule+0x29/0x70 [ 2461.580103] [<ffffffff81731ad1>] schedule_timeout+0x1c1/0x360 [ 2461.580103] [<ffffffff810e69eb>] ? mark_held_locks+0xbb/0x140 [ 2461.580103] [<ffffffff817377ac>] ? _raw_spin_unlock_irq+0x2c/0x50 [ 2461.580103] [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [ 2461.580103] [<ffffffff81736398>] wait_for_completion_killable+0xe8/0x170 [ 2461.580103] [<ffffffff810b7fa0>] ? wake_up_state+0x20/0x20 [ 2461.580103] [<ffffffff81095825>] call_usermodehelper_exec+0x1a5/0x210 [ 2461.580103] [<ffffffff817362ed>] ? wait_for_completion_killable+0x3d/0x170 [ 2461.580103] [<ffffffff81095cc3>] __request_module+0x1b3/0x370 [ 2461.580103] [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [ 2461.580103] [<ffffffff8162c5c9>] ctrl_getfamily+0x159/0x190 [ 2461.580103] [<ffffffff8162d8a4>] genl_family_rcv_msg+0x1f4/0x2e0 [ 2461.580103] [<ffffffff8162d990>] ? genl_family_rcv_msg+0x2e0/0x2e0 [ 2461.580103] [<ffffffff8162da1e>] genl_rcv_msg+0x8e/0xd0 [ 2461.580103] [<ffffffff8162b729>] netlink_rcv_skb+0xa9/0xc0 [ 2461.580103] [<ffffffff8162bc88>] genl_rcv+0x28/0x40 [ 2461.580103] [<ffffffff8162ad6d>] netlink_unicast+0xdd/0x190 [ 2461.580103] [<ffffffff8162b149>] netlink_sendmsg+0x329/0x750 [ 2461.580103] [<ffffffff815db849>] sock_sendmsg+0x99/0xd0 [ 2461.580103] [<ffffffff810bb58f>] ? local_clock+0x5f/0x70 [ 2461.580103] [<ffffffff810e96e8>] ? lock_release_non_nested+0x308/0x350 [ 2461.580103] [<ffffffff815dbc6e>] ___sys_sendmsg+0x39e/0x3b0 [ 2461.580103] [<ffffffff810565af>] ? kvm_clock_read+0x2f/0x50 [ 2461.580103] [<ffffffff810218b9>] ? sched_clock+0x9/0x10 [ 2461.580103] [<ffffffff810bb2bd>] ? sched_clock_local+0x1d/0x80 [ 2461.580103] [<ffffffff810bb448>] ? sched_clock_cpu+0xa8/0x100 [ 2461.580103] [<ffffffff810e33ad>] ? trace_hardirqs_off+0xd/0x10 [ 2461.580103] [<ffffffff810bb58f>] ? local_clock+0x5f/0x70 [ 2461.580103] [<ffffffff810e3f7f>] ? lock_release_holdtime.part.28+0xf/0x1a0 [ 2461.580103] [<ffffffff8120fec9>] ? fget_light+0xf9/0x510 [ 2461.580103] [<ffffffff8120fe0c>] ? fget_light+0x3c/0x510 [ 2461.580103] [<ffffffff815dd1d2>] __sys_sendmsg+0x42/0x80 [ 2461.580103] [<ffffffff815dd222>] SyS_sendmsg+0x12/0x20 [ 2461.580103] [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b [ 2461.580103] modprobe D ffff88000f2c8000 4632 603 602 0x00000080 [ 2461.580103] ffff88000f04fba8 0000000000000046 00000000001d5340 ffff88000f04ffd8 [ 2461.580103] ffff88000f04ffd8 00000000001d5340 ffff8800377d4500 ffff8800377d4500 [ 2461.580103] ffffffff81d0b260 ffffffff81d0b268 ffffffff00000000 ffffffff81d0b2b0 [ 2461.580103] Call Trace: [ 2461.580103] [<ffffffff817355f9>] schedule+0x29/0x70 [ 2461.580103] [<ffffffff81736d4d>] rwsem_down_write_failed+0xed/0x1a0 [ 2461.580103] [<ffffffff810bb200>] ? update_cpu_load_active+0x10/0xb0 [ 2461.580103] [<ffffffff8137b473>] call_rwsem_down_write_failed+0x13/0x20 [ 2461.580103] [<ffffffff8173492d>] ? down_write+0x9d/0xb2 [ 2461.580103] [<ffffffff8162baa5>] ? genl_lock_all+0x15/0x30 [ 2461.580103] [<ffffffff8162baa5>] genl_lock_all+0x15/0x30 [ 2461.580103] [<ffffffff8162cbb3>] genl_register_family+0x53/0x1f0 [ 2461.580103] [<ffffffffa01dc000>] ? 0xffffffffa01dbfff [ 2461.580103] [<ffffffff8162d650>] genl_register_family_with_ops+0x20/0x80 [ 2461.580103] [<ffffffffa01dc000>] ? 0xffffffffa01dbfff [ 2461.580103] [<ffffffffa017fe84>] nl80211_init+0x24/0xf0 [cfg80211] [ 2461.580103] [<ffffffffa01dc000>] ? 0xffffffffa01dbfff [ 2461.580103] [<ffffffffa01dc043>] cfg80211_init+0x43/0xdb [cfg80211] [ 2461.580103] [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0 [ 2461.580103] [<ffffffff8105cb93>] ? set_memory_nx+0x43/0x50 [ 2461.580103] [<ffffffff810f75af>] load_module+0x1c6f/0x27f0 [ 2461.580103] [<ffffffff810f2c90>] ? store_uevent+0x40/0x40 [ 2461.580103] [<ffffffff810f82c6>] SyS_finit_module+0x86/0xb0 [ 2461.580103] [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b [ 2461.580103] Sched Debug Version: v0.10, 3.11.0-0.rc1.git4.1.fc20.x86_64 #1 Problem start to happen after adding net-pf-16-proto-16-family-nl80211 alias name to cfg80211 module by below commit (though that commit itself is perfectly fine): commit fb4e156 Author: Marcel Holtmann <marcel@holtmann.org> Date: Sun Apr 28 16:22:06 2013 -0700 nl80211: Add generic netlink module alias for cfg80211/nl80211 Reported-and-tested-by: Jeff Layton <jlayton@redhat.com> Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Reviewed-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa52aee upstream. vscsi->num_queues counts the number of request virtqueue which does not include the control and event virtqueue. It is wrong to subtract VIRTIO_SCSI_VQ_BASE from vscsi->num_queues. This patch fixes the following panic. (qemu) device_del scsi0 BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120 PGD 0 Oops: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 659 Comm: kworker/0:1 Not tainted 3.11.0-rc2+ #1172 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: kacpi_hotplug _handle_hotplug_event_func task: ffff88007bee1cc0 ti: ffff88007bfe4000 task.ti: ffff88007bfe4000 RIP: 0010:[<ffffffff8179b29f>] [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120 RSP: 0018:ffff88007bfe5a38 EFLAGS: 00010202 RAX: 0000000000000010 RBX: ffff880077fd0d28 RCX: 0000000000000050 RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000 RBP: ffff88007bfe5a58 R08: ffff880077f6ff00 R09: 0000000000000001 R10: ffffffff8143e673 R11: 0000000000000001 R12: 0000000000000001 R13: ffff880077fd0800 R14: 0000000000000000 R15: ffff88007bf489b0 FS: 0000000000000000(0000) GS:ffff88007ea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 0000000079f8b000 CR4: 00000000000006f0 Stack: ffff880077fd0d28 0000000000000000 ffff880077fd0800 0000000000000008 ffff88007bfe5a78 ffffffff8179b37d ffff88007bccc800 ffff88007bccc800 ffff88007bfe5a98 ffffffff8179b3b6 ffff88007bccc800 ffff880077fd0d28 Call Trace: [<ffffffff8179b37d>] virtscsi_set_affinity+0x2d/0x40 [<ffffffff8179b3b6>] virtscsi_remove_vqs+0x26/0x50 [<ffffffff8179c7d2>] virtscsi_remove+0x82/0xa0 [<ffffffff814cb6b2>] virtio_dev_remove+0x22/0x70 [<ffffffff8167ca49>] __device_release_driver+0x69/0xd0 [<ffffffff8167cb9d>] device_release_driver+0x2d/0x40 [<ffffffff8167bb96>] bus_remove_device+0x116/0x150 [<ffffffff81679936>] device_del+0x126/0x1e0 [<ffffffff81679a06>] device_unregister+0x16/0x30 [<ffffffff814cb889>] unregister_virtio_device+0x19/0x30 [<ffffffff814cdad6>] virtio_pci_remove+0x36/0x80 [<ffffffff81464ae7>] pci_device_remove+0x37/0x70 [<ffffffff8167ca49>] __device_release_driver+0x69/0xd0 [<ffffffff8167cb9d>] device_release_driver+0x2d/0x40 [<ffffffff8167bb96>] bus_remove_device+0x116/0x150 [<ffffffff81679936>] device_del+0x126/0x1e0 [<ffffffff8145edfc>] pci_stop_bus_device+0x9c/0xb0 [<ffffffff8145f036>] pci_stop_and_remove_bus_device+0x16/0x30 [<ffffffff81474a9e>] acpiphp_disable_slot+0x8e/0x150 [<ffffffff81474f6a>] hotplug_event_func+0xba/0x1a0 [<ffffffff814906c8>] ? acpi_os_release_object+0xe/0x12 [<ffffffff81475911>] _handle_hotplug_event_func+0x31/0x70 [<ffffffff810b5333>] process_one_work+0x183/0x500 [<ffffffff810b66e2>] worker_thread+0x122/0x400 [<ffffffff810b65c0>] ? manage_workers+0x2d0/0x2d0 [<ffffffff810bc5de>] kthread+0xce/0xe0 [<ffffffff810bc510>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff81ca045c>] ret_from_fork+0x7c/0xb0 [<ffffffff810bc510>] ? kthread_freezable_should_stop+0x70/0x70 Code: 01 00 00 00 74 59 45 31 e4 83 bb c8 01 00 00 02 74 46 66 2e 0f 1f 84 00 00 00 00 00 49 63 c4 48 c1 e0 04 48 8b bc 0 3 10 02 00 00 <48> 8b 47 20 48 8b 80 d0 01 00 00 48 8b 40 50 48 85 c0 74 07 be RIP [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120 RSP <ffff88007bfe5a38> CR2: 0000000000000020 ---[ end trace 99679331a3775f48 ]--- Signed-off-by: Asias He <asias@redhat.com> Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68c034f upstream. Quit from splice_write if pipe->nrbufs is 0 for avoiding oops in virtio-serial. When an application was doing splice from a kernel buffer to virtio-serial on a guest, the application received signal(SIGINT). This situation will normally happen, but the kernel executed a kernel panic by oops as follows: BUG: unable to handle kernel paging request at ffff882071c8ef28 IP: [<ffffffff812de48f>] sg_init_table+0x2f/0x50 PGD 1fac067 PUD 0 Oops: 0000 [#1] SMP Modules linked in: lockd sunrpc bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd microcode virtio_balloon virtio_net pcspkr soundcore i2c_piix4 i2c_core uinput floppy CPU: 1 PID: 908 Comm: trace-cmd Not tainted 3.10.0+ torvalds#49 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880071c64650 ti: ffff88007bf24000 task.ti: ffff88007bf24000 RIP: 0010:[<ffffffff812de48f>] [<ffffffff812de48f>] sg_init_table+0x2f/0x50 RSP: 0018:ffff88007bf25dd8 EFLAGS: 00010286 RAX: 0000001fffffffe0 RBX: ffff882071c8ef28 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880071c8ef48 RBP: ffff88007bf25de8 R08: ffff88007fd15d40 R09: ffff880071c8ef48 R10: ffffea0001c71040 R11: ffffffff8139c555 R12: 0000000000000000 R13: ffff88007506a3c0 R14: ffff88007c862500 R15: ffff880071c8ef00 FS: 00007f0a3646c740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff882071c8ef28 CR3: 000000007acbb000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff880071c8ef48 ffff88007bf25e20 ffff88007bf25e88 ffffffff8139d6fa ffff88007bf25e28 ffffffff8127a3f4 0000000000000000 0000000000000000 ffff880071c8ef48 0000100000000000 0000000000000003 ffff88007bf25e08 Call Trace: [<ffffffff8139d6fa>] port_fops_splice_write+0xaa/0x130 [<ffffffff8127a3f4>] ? selinux_file_permission+0xc4/0x120 [<ffffffff8139d650>] ? wait_port_writable+0x1b0/0x1b0 [<ffffffff811a6fe0>] do_splice_from+0xa0/0x110 [<ffffffff811a951f>] SyS_splice+0x5ff/0x6b0 [<ffffffff8161f8c2>] system_call_fastpath+0x16/0x1b Code: c1 e2 05 48 89 e5 48 83 ec 10 4c 89 65 f8 41 89 f4 31 f6 48 89 5d f0 48 89 fb e8 8d ce ff ff 41 8d 44 24 ff 48 c1 e0 05 48 01 c3 <48> 8b 03 48 83 e0 fe 48 83 c8 02 48 89 03 48 8b 5d f0 4c 8b 65 RIP [<ffffffff812de48f>] sg_init_table+0x2f/0x50 RSP <ffff88007bf25dd8> CR2: ffff882071c8ef28 ---[ end trace 86323505eb42ea8f ]--- It seems to induce pagefault in sg_init_tabel() when pipe->nrbufs is equal to zero. This may happen in a following situation: (1) The application normally does splice(read) from a kernel buffer, then does splice(write) to virtio-serial. (2) The application receives SIGINT when is doing splice(read), so splice(read) is failed by EINTR. However, the application does not finish the operation. (3) The application tries to do splice(write) without pipe->nrbufs. (4) The virtio-console driver tries to touch scatterlist structure sgl in sg_init_table(), but the region is out of bound. To avoid the case, a kernel should check whether pipe->nrbufs is empty or not when splice_write is executed in the virtio-console driver. V3: Add Reviewed-by lines and stable@ line in sign-off area. Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Amit Shah <amit.shah@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b4fbf0 upstream. Add pipe_lock/unlock for splice_write to avoid oops by following competition: (1) An application gets fds of a trace buffer, virtio-serial, pipe. (2) The application does fork() (3) The processes execute splice_read(trace buffer) and splice_write(virtio-serial) via same pipe. <parent> <child> get fds of a trace buffer, virtio-serial, pipe | fork()----------create--------+ | | splice(read) | ---+ splice(write) | +-- no competition | splice(read) | | splice(write) ---+ | | splice(read) | splice(write) splice(read) ------ competition | splice(write) Two processes share a pipe_inode_info structure. If the child execute splice(read) when the parent tries to execute splice(write), the structure can be broken. Existing virtio-serial driver does not get lock for the structure in splice_write, so this competition will induce oops. <oops messages> BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130 PGD 7223e067 PUD 72391067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: lockd bnep bluetooth rfkill sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr virtio_net virtio_balloon i2c_piix4 i2c_core microcode uinput floppy CPU: 0 PID: 1072 Comm: compete-test Not tainted 3.10.0ws+ torvalds#55 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880071b98000 ti: ffff88007b55e000 task.ti: ffff88007b55e000 RIP: 0010:[<ffffffff811a6b5f>] [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130 RSP: 0018:ffff88007b55fd78 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88007b55fe20 RCX: 0000000000000000 RDX: 0000000000001000 RSI: ffff88007a95ba30 RDI: ffff880036f9e6c0 RBP: ffff88007b55fda8 R08: 00000000000006ec R09: ffff880077626708 R10: 0000000000000003 R11: ffffffff8139ca59 R12: ffff88007a95ba30 R13: 0000000000000000 R14: ffffffff8139dd00 R15: ffff880036f9e6c0 FS: 00007f2e2e3a0740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000018 CR3: 0000000071bd1000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffff8139ca59 ffff88007b55fe20 ffff880036f9e6c0 ffffffff8139dd00 ffff8800776266c0 ffff880077626708 ffff88007b55fde8 ffffffff811a6e8e ffff88007b55fde8 ffffffff8139ca59 ffff880036f9e6c0 ffff88007b55fe20 Call Trace: [<ffffffff8139ca59>] ? alloc_buf.isra.13+0x39/0xb0 [<ffffffff8139dd00>] ? virtcons_restore+0x100/0x100 [<ffffffff811a6e8e>] __splice_from_pipe+0x7e/0x90 [<ffffffff8139ca59>] ? alloc_buf.isra.13+0x39/0xb0 [<ffffffff8139d739>] port_fops_splice_write+0xe9/0x140 [<ffffffff8127a3f4>] ? selinux_file_permission+0xc4/0x120 [<ffffffff8139d650>] ? wait_port_writable+0x1b0/0x1b0 [<ffffffff811a6fe0>] do_splice_from+0xa0/0x110 [<ffffffff811a951f>] SyS_splice+0x5ff/0x6b0 [<ffffffff8161facf>] tracesys+0xdd/0xe2 Code: 49 8b 87 80 00 00 00 4c 8d 24 d0 8b 53 04 41 8b 44 24 0c 4d 8b 6c 24 10 39 d0 89 03 76 02 89 13 49 8b 44 24 10 4c 89 e6 4c 89 ff <ff> 50 18 85 c0 0f 85 aa 00 00 00 48 89 da 4c 89 e6 4c 89 ff 41 RIP [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130 RSP <ffff88007b55fd78> CR2: 0000000000000018 ---[ end trace 24572beb7764de59 ]--- V2: Fix a locking problem for error V3: Add Reviewed-by lines and stable@ line in sign-off area Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Amit Shah <amit.shah@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea3768b upstream. We used to keep the port's char device structs and the /sys entries around till the last reference to the port was dropped. This is actually unnecessary, and resulted in buggy behaviour: 1. Open port in guest 2. Hot-unplug port 3. Hot-plug a port with the same 'name' property as the unplugged one This resulted in hot-plug being unsuccessful, as a port with the same name already exists (even though it was unplugged). This behaviour resulted in a warning message like this one: -------------------8<--------------------------------------- WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted) Hardware name: KVM sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:04.0/virtio0/virtio-ports/vport0p1' Call Trace: [<ffffffff8106b607>] ? warn_slowpath_common+0x87/0xc0 [<ffffffff8106b6f6>] ? warn_slowpath_fmt+0x46/0x50 [<ffffffff811f2319>] ? sysfs_add_one+0xc9/0x130 [<ffffffff811f23e8>] ? create_dir+0x68/0xb0 [<ffffffff811f2469>] ? sysfs_create_dir+0x39/0x50 [<ffffffff81273129>] ? kobject_add_internal+0xb9/0x260 [<ffffffff812733d8>] ? kobject_add_varg+0x38/0x60 [<ffffffff812734b4>] ? kobject_add+0x44/0x70 [<ffffffff81349de4>] ? get_device_parent+0xf4/0x1d0 [<ffffffff8134b389>] ? device_add+0xc9/0x650 -------------------8<--------------------------------------- Instead of relying on guest applications to release all references to the ports, we should go ahead and unregister the port from all the core layers. Any open/read calls on the port will then just return errors, and an unplug/plug operation on the host will succeed as expected. This also caused buggy behaviour in case of the device removal (not just a port): when the device was removed (which means all ports on that device are removed automatically as well), the ports with active users would clean up only when the last references were dropped -- and it would be too late then to be referencing char device pointers, resulting in oopses: -------------------8<--------------------------------------- PID: 6162 TASK: ffff8801147ad500 CPU: 0 COMMAND: "cat" #0 [ffff88011b9d5a90] machine_kexec at ffffffff8103232b #1 [ffff88011b9d5af0] crash_kexec at ffffffff810b9322 #2 [ffff88011b9d5bc0] oops_end at ffffffff814f4a50 #3 [ffff88011b9d5bf0] die at ffffffff8100f26b #4 [ffff88011b9d5c20] do_general_protection at ffffffff814f45e2 #5 [ffff88011b9d5c50] general_protection at ffffffff814f3db5 [exception RIP: strlen+2] RIP: ffffffff81272ae2 RSP: ffff88011b9d5d00 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880118901c18 RCX: 0000000000000000 RDX: ffff88011799982c RSI: 00000000000000d0 RDI: 3a303030302f3030 RBP: ffff88011b9d5d38 R8: 0000000000000006 R9: ffffffffa0134500 R10: 0000000000001000 R11: 0000000000001000 R12: ffff880117a1cc10 R13: 00000000000000d0 R14: 0000000000000017 R15: ffffffff81aff700 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 torvalds#6 [ffff88011b9d5d00] kobject_get_path at ffffffff8126dc5d torvalds#7 [ffff88011b9d5d40] kobject_uevent_env at ffffffff8126e551 torvalds#8 [ffff88011b9d5dd0] kobject_uevent at ffffffff8126e9eb torvalds#9 [ffff88011b9d5de0] device_del at ffffffff813440c7 -------------------8<--------------------------------------- So clean up when we have all the context, and all that's left to do when the references to the port have dropped is to free up the port struct itself. Reported-by: chayang <chayang@redhat.com> Reported-by: YOGANANTH SUBRAMANIAN <anantyog@in.ibm.com> Reported-by: FuXiangChun <xfu@redhat.com> Reported-by: Qunfang Zhang <qzhang@redhat.com> Reported-by: Sibiao Luo <sluo@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e6b11d upstream. struct memcg_cache_params has a union. Different parts of this union are used for root and non-root caches. A part with destroying work is used only for non-root caches. I fixed the same problem in another place v3.9-rc1-16204-gf101a94, but didn't notice this one. This patch fixes the kernel panic: [ 46.848187] BUG: unable to handle kernel paging request at 000000fffffffeb8 [ 46.849026] IP: [<ffffffff811a484c>] kmem_cache_destroy_memcg_children+0x6c/0xc0 [ 46.849092] PGD 0 [ 46.849092] Oops: 0000 [#1] SMP ... Signed-off-by: Andrey Vagin <avagin@openvz.org> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Generally request_irq() should be called after hardware has been initialized into a sane state. However, sdhci driver currently calls request_irq() before sdhci_init(). At least, the following kernel panic seen on i.MX6 is caused by that. The sdhci controller on i.MX6 may have noisy glitch on DAT1 line, which will trigger SDIO interrupt handling once request_irq() is called. But at this point, the SDIO interrupt handler host->sdio_irq_thread has not been registered yet. Thus, we see the NULL pointer access with wake_up_process(host->sdio_irq_thread) in mmc_signal_sdio_irq(). sdhci-pltfm: SDHCI platform and OF driver helper mmc0: no vqmmc regulator found mmc0: no vmmc regulator found Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = 80004000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0+ #3 task: 9f860000 ti: 9f862000 task.ti: 9f862000 PC is at wake_up_process+0xc/0x44 LR is at sdhci_irq+0x378/0x93c ... Backtrace: [<8004f75c>] (wake_up_process+0x0/0x44) from [<803fb698>] (sdhci_irq+0x378/0x93c) r4:9fa68000 r3:00000001 [<803fb320>] (sdhci_irq+0x0/0x93c) from [<80075154>] (handle_irq_event_percpu+0x54/0x19c) [<80075100>] (handle_irq_event_percpu+0x0/0x19c) from [<800752ec>] (handle_irq_event+0x50/0x70) [<8007529c>] (handle_irq_event+0x0/0x70) from [<80078324>] (handle_fasteoi_irq+0x9c/0x170) r5:00000001 r4:9f807900 [<80078288>] (handle_fasteoi_irq+0x0/0x170) from [<80074ac0>] (generic_handle_irq+0x28/0x38) r5:8071fd64 r4:00000036 [<80074a98>] (generic_handle_irq+0x0/0x38) from [<8000ee34>] (handle_IRQ+0x54/0xb4) r4:8072ab78 r3:00000140 [<8000ede0>] (handle_IRQ+0x0/0xb4) from [<80008600>] (gic_handle_irq+0x30/0x64) r8:00000036 r7:a080e100 r6:9f863cd0 r5:8072acbc r4:a080e10c r3:00000000 [<800085d0>] (gic_handle_irq+0x0/0x64) from [<8000e0c0>] (__irq_svc+0x40/0x54) ... ---[ end trace e9af3588936b63f0 ]--- Kernel panic - not syncing: Fatal exception in interrupt Fix the panic by simply reverse the calling sequence between request_irq() and sdhci_init(). Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
…ot up WAIT mode is enabled by default due to hardware reset, so we need to disable it during kernel boot up, otherwise, system may crash without proper setting for WAIT mode. CPUIdle driver will enable WAIT mode later. Below is the stack dump when crash, this patch fix it: Bad mode in data abort handler detected Internal error: Oops - bad mode: 0 [#1] SMP ARM Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.9+ torvalds#369 task: 807dba88 ti: 807d0000 task.ti: 807d0000 PC is at 0xffff1044 LR is at arch_cpu_idle+0x48/0x54 pc : [<ffff1044>] lr : [<8000f7dc>] psr: 60000192 sp : 807d1f60 ip : 00000000 fp : 00000000 r10: 807d8954 r9 : 8059980c r8 : 80819280 r7 : 00000001 r6 : 80819280 r5 : 00000000 r4 : 807d0000 r3 : 8001cbe0 r2 : 807d9510 r1 : 0104b000 r0 : 80819540 Flags: nZCv IRQs off FIQs on Mode IRQ_32 ISA ARM Segment kernel Control: 10c53c7d Table: af28804a DAC: 00000017 Process swapper/0 (pid: 0, stack limit = 0x807d0238) Stack: (0x807d1f60 to 0x807d2000) 1f60: 80819540 0104b000 807d9510 8001cbe0 807d0000 00000000 80819280 00000001 1f80: 80819280 8059980c 807d8954 00000000 00000000 807d1f60 8000f7dc ffff1044 1fa0: 60000192 ffffffff 807d0000 8005de44 807d89d0 808193c0 807bf084 807dc86c 1fc0: 8000406a 412fc09a 00000000 8077fb58 ffffffff ffffffff 8077f6b4 00000000 1fe0: 00000000 807bf088 00000000 10c53c7d 807d88d0 80008074 00000000 00000000 [<8000f7dc>] (arch_cpu_idle+0x48/0x54) from [<0104b000>] (0x104b000) Code: bad PC value ---[ end trace c2c7dd3b2230692c ]--- Kernel panic - not syncing: Attempted to kill the idle task Signed-off-by: Anson Huang <b20788@freescale.com>
When we rmmod gadget, the ci->driver needs to be cleared. Otherwise, when we plug in usb cable again, the driver will consider gadget is there, and go to enumeration procedure, but in fact, it was removed. ci_hdrc ci_hdrc.0: Connected to host Unable to handle kernel paging request at virtual address 7f02a42c pgd = 80004000 [7f02a42c] *pgd=3f13d811, *pte=00000000, *ppte=00000000 Internal error: Oops: 7 [#1] SMP ARM Modules linked in: usb_f_acm u_serial libcomposite configfs [last unloaded: g_serial] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0+ torvalds#42 task: 807dba88 ti: 807d0000 task.ti: 807d0000 PC is at udc_irq+0x8fc/0xea4 LR is at l2x0_cache_sync+0x5c/0x6c pc : [<803de7f4>] lr : [<8001d0f0>] psr: 20000193 sp : 807d1d98 ip : 807d1d80 fp : 807d1df4 r10: af809900 r9 : 808184d4 r8 : 00080001 r7 : 00082001 r6 : afb711f8 r5 : afb71010 r4 : ffffffea r3 : 7f02a41c r2 : afb71010 r1 : 807d1dc0 r0 : afb71068 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c53c7d Table: 3f01804a DAC: 00000017 Process swapper/0 (pid: 0, stack limit = 0x807d0238) Stack: (0x807d1d98 to 0x807d2000) 1d80: 00000000 afb71014 1da0: 000040f6 00000000 00000001 00000000 00007530 00000000 afb71010 001dcd65 1dc0: 01000680 00400000 807d1e2c afb71010 0000004e 00000000 00000000 0000004b 1de0: 808184d4 af809900 807d1e0c 807d1df8 803dbc24 803ddf04 afba75c0 0000004e 1e00: 807d1e44 807d1e10 8007a19c 803dbb9c 8108e7e0 8108e7e0 9ceddce0 af809900 1e20: 0000004e 807d0000 0000004b 00000000 00000010 00000000 807d1e5c 807d1e48 1e40: 8007a334 8007a154 af809900 0000004e 807d1e74 807d1e60 8007d3b4 8007a2f0 1e60: 0000004b 807cce3c 807d1e8c 807d1e78 80079b08 8007d300 00000180 807d8ba0 1e80: 807d1eb4 807d1e90 8000eef4 80079aec 00000000 f400010c 807d8ce4 807d1ed8 1ea0: f4000100 96d5c75d 807d1ed4 807d1eb8 80008600 8000eeac 8042699c 60000013 1ec0: ffffffff 807d1f0c 807d1f54 807d1ed8 8000e180 800085dc 807d1f20 00000046 1ee0: 9cedd275 00000010 8108f080 807de294 00000001 807de248 96d5c75d 00000010 1f00: 00000000 807d1f54 00000000 807d1f20 8005ff54 8042699c 60000013 ffffffff 1f20: 9cedd275 00000010 00000005 8108f080 8108f080 00000001 807de248 8086bd00 1f40: 807d0000 00000001 807d1f7c 807d1f58 80426af0 80426950 807d0000 00000000 1f60: 808184c0 808184c0 807d8954 805b886c 807d1f8c 807d1f80 8000f294 80426a44 1f80: 807d1fac 807d1f90 8005f110 8000f288 807d1fac 807d8908 805b4748 807dc86c 1fa0: 807d1fbc 807d1fb0 805aa58c 8005f068 807d1ff4 807d1fc0 8077c860 805aa530 1fc0: ffffffff ffffffff 8077c330 00000000 00000000 807bef88 00000000 10c53c7d 1fe0: 807d88d0 807bef84 00000000 807d1ff8 10008074 8077c594 00000000 00000000 Backtrace: [<803ddef8>] (udc_irq+0x0/0xea4) from [<803dbc24>] (ci_irq+0x94/0x14c) [<803dbb90>] (ci_irq+0x0/0x14c) from [<8007a19c>] (handle_irq_event_percpu+0x54/0x19c) r5:0000004e r4:afba75c0 [<8007a148>] (handle_irq_event_percpu+0x0/0x19c) from [<8007a334>] (handle_irq_event+0x50/0x70) [<8007a2e4>] (handle_irq_event+0x0/0x70) from [<8007d3b4>] (handle_fasteoi_irq+0xc0/0x16c) r5:0000004e r4:af809900 [<8007d2f4>] (handle_fasteoi_irq+0x0/0x16c) from [<80079b08>] (generic_handle_irq+0x28/0x38) r5:807cce3c r4:0000004b [<80079ae0>] (generic_handle_irq+0x0/0x38) from [<8000eef4>] (handle_IRQ+0x54/0xb4) r4:807d8ba0 r3:00000180 [<8000eea0>] (handle_IRQ+0x0/0xb4) from [<80008600>] (gic_handle_irq+0x30/0x64) r8:96d5c75d r7:f4000100 r6:807d1ed8 r5:807d8ce4 r4:f400010c r3:00000000 [<800085d0>] (gic_handle_irq+0x0/0x64) from [<8000e180>] (__irq_svc+0x40/0x54) Exception stack(0x807d1ed8 to 0x807d1f20) 1ec0: 807d1f20 00000046 1ee0: 9cedd275 00000010 8108f080 807de294 00000001 807de248 96d5c75d 00000010 1f00: 00000000 807d1f54 00000000 807d1f20 8005ff54 8042699c 60000013 ffffffff r7:807d1f0c r6:ffffffff r5:60000013 r4:8042699c [<80426944>] (cpuidle_enter_state+0x0/0xf4) from [<80426af0>] (cpuidle_idle_call+0xb8/0x174) r9:00000001 r8:807d0000 r7:8086bd00 r6:807de248 r5:00000001 r4:8108f080 [<80426a38>] (cpuidle_idle_call+0x0/0x174) from [<8000f294>] (arch_cpu_idle+0x18/0x5c) [<8000f27c>] (arch_cpu_idle+0x0/0x5c) from [<8005f110>] (cpu_startup_entry+0xb4/0x148) [<8005f05c>] (cpu_startup_entry+0x0/0x148) from [<805aa58c>] (rest_init+0x68/0x80) r7:807dc86c [<805aa524>] (rest_init+0x0/0x80) from [<8077c860>] (start_kernel+0x2d8/0x334) [<8077c588>] (start_kernel+0x0/0x334) from [<10008074>] (0x10008074) Code: e59031e0 e51b203c e24b1034 e2820058 (e5933010) ---[ end trace f874b2c5533c04bc ]--- Kernel panic - not syncing: Fatal exception in interrupt Tested-by: Marek Vasut <marex@denx.de> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Peter Chen <peter.chen@freescale.com>
With this patch, the conntrack refcount is initially set to zero and it is bumped once it is added to any of the list, so we fulfill Eric's golden rule which is that all released objects always have a refcount that equals zero. Andrey Vagin reports that nf_conntrack_free can't be called for a conntrack with non-zero ref-counter, because it can race with nf_conntrack_find_get(). A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero ref-counter says that this conntrack is used. So when we release a conntrack with non-zero counter, we break this assumption. CPU1 CPU2 ____nf_conntrack_find() nf_ct_put() destroy_conntrack() ... init_conntrack __nf_conntrack_alloc (set use = 1) atomic_inc_not_zero(&ct->use) (use = 2) if (!l4proto->new(ct, skb, dataoff, timeouts)) nf_conntrack_free(ct); (use = 2 !!!) ... __nf_conntrack_alloc (set use = 1) if (!nf_ct_key_equal(h, tuple, zone)) nf_ct_put(ct); (use = 0) destroy_conntrack() /* continue to work with CT */ After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get" another bug was triggered in destroy_conntrack(): <4>[67096.759334] ------------[ cut here ]------------ <2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211! ... <4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G C --------------- 2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB <4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>] [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack] <4>[67096.760255] Call Trace: <4>[67096.760255] [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30 <4>[67096.760255] [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack] <4>[67096.760255] [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack] <4>[67096.760255] [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4] <4>[67096.760255] [<ffffffff81484419>] nf_iterate+0x69/0xb0 <4>[67096.760255] [<ffffffff814b5b00>] ? dst_output+0x0/0x20 <4>[67096.760255] [<ffffffff814845d4>] nf_hook_slow+0x74/0x110 <4>[67096.760255] [<ffffffff814b5b00>] ? dst_output+0x0/0x20 <4>[67096.760255] [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910 <4>[67096.760255] [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0 <4>[67096.760255] [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140 <4>[67096.760255] [<ffffffff81444f97>] sock_sendmsg+0x117/0x140 <4>[67096.760255] [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60 <4>[67096.760255] [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20 <4>[67096.760255] [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40 <4>[67096.760255] [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff814457c9>] sys_sendto+0x139/0x190 <4>[67096.760255] [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200 <4>[67096.760255] [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290 <4>[67096.760255] [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210 <4>[67096.760255] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5 I have reused the original title for the RFC patch that Andrey posted and most of the original patch description. Cc: Eric Dumazet <edumazet@google.com> Cc: Andrew Vagin <avagin@parallels.com> Cc: Florian Westphal <fw@strlen.de> Reported-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Andrew Vagin <avagin@parallels.com>
Setting an empty security context (length=0) on a file will lead to incorrectly dereferencing the type and other fields of the security context structure, yielding a kernel BUG. As a zero-length security context is never valid, just reject all such security contexts whether coming from userspace via setxattr or coming from the filesystem upon a getxattr request by SELinux. Setting a security context value (empty or otherwise) unknown to SELinux in the first place is only possible for a root process (CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only if the corresponding SELinux mac_admin permission is also granted to the domain by policy. In Fedora policies, this is only allowed for specific domains such as livecd for setting down security contexts that are not defined in the build host policy. Reproducer: su setenforce 0 touch foo setfattr -n security.selinux foo Caveat: Relabeling or removing foo after doing the above may not be possible without booting with SELinux disabled. Any subsequent access to foo after doing the above will also trigger the BUG. BUG output from Matthew Thode: [ 473.893141] ------------[ cut here ]------------ [ 473.962110] kernel BUG at security/selinux/ss/services.c:654! [ 473.995314] invalid opcode: 0000 [torvalds#6] SMP [ 474.027196] Modules linked in: [ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I 3.13.0-grsec #1 [ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti: ffff8805f50cd488 [ 474.183707] RIP: 0010:[<ffffffff814681c7>] [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246 [ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX: 0000000000000100 [ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI: ffff8805e8aaa000 [ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09: 0000000000000006 [ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12: 0000000000000006 [ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15: 0000000000000000 [ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000 [ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4: 00000000000207f0 [ 474.556058] Stack: [ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98 ffff8805f1190a40 [ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990 ffff8805e8aac860 [ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060 ffff8805c0ac3d94 [ 474.690461] Call Trace: [ 474.723779] [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a [ 474.778049] [<ffffffff81468824>] security_compute_av+0xf4/0x20b [ 474.811398] [<ffffffff8196f419>] avc_compute_av+0x2a/0x179 [ 474.843813] [<ffffffff8145727b>] avc_has_perm+0x45/0xf4 [ 474.875694] [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31 [ 474.907370] [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e [ 474.938726] [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22 [ 474.970036] [<ffffffff811b057d>] vfs_getattr+0x19/0x2d [ 475.000618] [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91 [ 475.030402] [<ffffffff811b063b>] vfs_lstat+0x19/0x1b [ 475.061097] [<ffffffff811b077e>] SyS_newlstat+0x15/0x30 [ 475.094595] [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3 [ 475.148405] [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b [ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48 8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7 75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8 [ 475.255884] RIP [<ffffffff814681c7>] context_struct_compute_av+0xce/0x308 [ 475.296120] RSP <ffff8805c0ac3c38> [ 475.328734] ---[ end trace f076482e9d754adc ]--- Reported-by: Matthew Thode <mthode@mthode.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Cc: stable@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com>
CONFIG_X86_32 doesn't map the boot services regions into the EFI memory map (see commit 7008701 ("x86, efi: Don't map Boot Services on i386")), and so efi_lookup_mapped_addr() will fail to return a valid address. Executing the ioremap() path in efi_bgrt_init() causes the following warning on x86-32 because we're trying to ioremap() RAM, WARNING: CPU: 0 PID: 0 at arch/x86/mm/ioremap.c:102 __ioremap_caller+0x2ad/0x2c0() Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-0.rc5.git0.1.2.fc21.i686 #1 Hardware name: DellInc. Venue 8 Pro 5830/09RP78, BIOS A02 10/17/2013 00000000 00000000 c0c0df08 c09a5196 00000000 c0c0df38 c0448c1e c0b41310 00000000 00000000 c0b37bc1 00000066 c043bbfd c043bbfd 00e7dfe0 00073eff 00073eff c0c0df48 c0448ce2 00000009 00000000 c0c0df9c c043bbfd 00078d88 Call Trace: [<c09a5196>] dump_stack+0x41/0x52 [<c0448c1e>] warn_slowpath_common+0x7e/0xa0 [<c043bbfd>] ? __ioremap_caller+0x2ad/0x2c0 [<c043bbfd>] ? __ioremap_caller+0x2ad/0x2c0 [<c0448ce2>] warn_slowpath_null+0x22/0x30 [<c043bbfd>] __ioremap_caller+0x2ad/0x2c0 [<c0718f92>] ? acpi_tb_verify_table+0x1c/0x43 [<c0719c78>] ? acpi_get_table_with_size+0x63/0xb5 [<c087cd5e>] ? efi_lookup_mapped_addr+0xe/0xf0 [<c043bc2b>] ioremap_nocache+0x1b/0x20 [<c0cb01c8>] ? efi_bgrt_init+0x83/0x10c [<c0cb01c8>] efi_bgrt_init+0x83/0x10c [<c0cafd82>] efi_late_init+0x8/0xa [<c0c9bab2>] start_kernel+0x3ae/0x3c3 [<c0c9b53b>] ? repair_env_string+0x51/0x51 [<c0c9b378>] i386_start_kernel+0x12e/0x131 Switch to using early_memremap(), which won't trigger this warning, and has the added benefit of more accurately conveying what we're trying to do - map a chunk of memory. This patch addresses the following bug report, https://bugzilla.kernel.org/show_bug.cgi?id=67911 Reported-by: Adam Williamson <awilliam@redhat.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
sdata->u.ap.request_smps_work can’t be flushed synchronously under wdev_lock(wdev) since ieee80211_request_smps_ap_work itself locks the same lock. While at it, reset the driver_smps_mode when the ap is stopped to its default: OFF. This solves: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0-ipeer+ #2 Tainted: G O ------------------------------------------------------- rmmod/2867 is trying to acquire lock: ((&sdata->u.ap.request_smps_work)){+.+...}, at: [<c105b8d0>] flush_work+0x0/0x90 but task is already holding lock: (&wdev->mtx){+.+.+.}, at: [<f9b32626>] cfg80211_stop_ap+0x26/0x230 [cfg80211] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&wdev->mtx){+.+.+.}: [<c10aefa9>] lock_acquire+0x79/0xe0 [<c1607a1a>] mutex_lock_nested+0x4a/0x360 [<fb06288b>] ieee80211_request_smps_ap_work+0x2b/0x50 [mac80211] [<c105cdd8>] process_one_work+0x198/0x450 [<c105d469>] worker_thread+0xf9/0x320 [<c10669ff>] kthread+0x9f/0xb0 [<c1613397>] ret_from_kernel_thread+0x1b/0x28 -> #0 ((&sdata->u.ap.request_smps_work)){+.+...}: [<c10ae9df>] __lock_acquire+0x183f/0x1910 [<c10aefa9>] lock_acquire+0x79/0xe0 [<c105b917>] flush_work+0x47/0x90 [<c105d867>] __cancel_work_timer+0x67/0xe0 [<c105d90f>] cancel_work_sync+0xf/0x20 [<fb0765cc>] ieee80211_stop_ap+0x8c/0x340 [mac80211] [<f9b3268c>] cfg80211_stop_ap+0x8c/0x230 [cfg80211] [<f9b0d8f9>] cfg80211_leave+0x79/0x100 [cfg80211] [<f9b0da72>] cfg80211_netdev_notifier_call+0xf2/0x4f0 [cfg80211] [<c160f2c9>] notifier_call_chain+0x59/0x130 [<c106c6de>] __raw_notifier_call_chain+0x1e/0x30 [<c106c70f>] raw_notifier_call_chain+0x1f/0x30 [<c14f8213>] call_netdevice_notifiers_info+0x33/0x70 [<c14f8263>] call_netdevice_notifiers+0x13/0x20 [<c14f82a4>] __dev_close_many+0x34/0xb0 [<c14f83fe>] dev_close_many+0x6e/0xc0 [<c14f9c77>] rollback_registered_many+0xa7/0x1f0 [<c14f9dd4>] unregister_netdevice_many+0x14/0x60 [<fb06f4d9>] ieee80211_remove_interfaces+0xe9/0x170 [mac80211] [<fb055116>] ieee80211_unregister_hw+0x56/0x110 [mac80211] [<fa3e9396>] iwl_op_mode_mvm_stop+0x26/0xe0 [iwlmvm] [<f9b9d8ca>] _iwl_op_mode_stop+0x3a/0x70 [iwlwifi] [<f9b9d96f>] iwl_opmode_deregister+0x6f/0x90 [iwlwifi] [<fa405179>] __exit_compat+0xd/0x19 [iwlmvm] [<c10b8bf9>] SyS_delete_module+0x179/0x2b0 [<c1613421>] sysenter_do_call+0x12/0x32 Fixes: 687da13 ("mac80211: implement SMPS for AP") Cc: <stable@vger.kernel.org> [3.13] Reported-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Don't set read callback to NULL during reset as this leads to memory leak of both cb and its buffer. The memory is correctly freed during mei_release. The memory leak is detectable by kmemleak if application has open read call while system is going through suspend/resume. unreferenced object 0xecead780 (size 64): comm "AsyncTask #1", pid 1018, jiffies 4294949621 (age 152.440s) hex dump (first 32 bytes): 00 01 10 00 00 02 20 00 00 bf 30 f1 00 00 00 00 ...... ...0..... 00 00 00 00 00 00 00 00 36 01 00 00 00 70 da e2 ........6....p.. backtrace: [<c1a60aec>] kmemleak_alloc+0x3c/0xa0 [<c131ed56>] kmem_cache_alloc_trace+0xc6/0x190 [<c16243c9>] mei_io_cb_init+0x29/0x50 [<c1625722>] mei_cl_read_start+0x102/0x360 [<c16268f3>] mei_read+0x103/0x4e0 [<c1324b09>] vfs_read+0x89/0x160 [<c1324d5f>] SyS_read+0x4f/0x80 [<c1a7b318>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff unreferenced object 0xe2da7000 (size 512): comm "AsyncTask #1", pid 1018, jiffies 4294949621 (age 152.440s) hex dump (first 32 bytes): 00 6c da e2 7c 00 00 00 00 00 00 00 c0 eb 0c 59 .l..|..........Y 1b 00 00 00 01 00 00 00 02 10 00 00 01 00 00 00 ................ backtrace: [<c1a60aec>] kmemleak_alloc+0x3c/0xa0 [<c131f127>] __kmalloc+0xe7/0x1d0 [<c162447e>] mei_io_cb_alloc_resp_buf+0x2e/0x60 [<c162574c>] mei_cl_read_start+0x12c/0x360 [<c16268f3>] mei_read+0x103/0x4e0 [<c1324b09>] vfs_read+0x89/0x160 [<c1324d5f>] SyS_read+0x4f/0x80 [<c1a7b318>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Steven Noonan forwarded a users report where they had a problem starting vsftpd on a Xen paravirtualized guest, with this in dmesg: BUG: Bad page map in process vsftpd pte:8000000493b88165 pmd:e9cc01067 page:ffffea00124ee200 count:0 mapcount:-1 mapping: (null) index:0x0 page flags: 0x2ffc0000000014(referenced|dirty) addr:00007f97eea74000 vm_flags:00100071 anon_vma:ffff880e98f80380 mapping: (null) index:7f97eea74 CPU: 4 PID: 587 Comm: vsftpd Not tainted 3.12.7-1-ec2 #1 Call Trace: dump_stack+0x45/0x56 print_bad_pte+0x22e/0x250 unmap_single_vma+0x583/0x890 unmap_vmas+0x65/0x90 exit_mmap+0xc5/0x170 mmput+0x65/0x100 do_exit+0x393/0x9e0 do_group_exit+0xcc/0x140 SyS_exit_group+0x14/0x20 system_call_fastpath+0x1a/0x1f Disabling lock debugging due to kernel taint BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:0 val:-1 BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:1 val:1 The issue could not be reproduced under an HVM instance with the same kernel, so it appears to be exclusive to paravirtual Xen guests. He bisected the problem to commit 1667918 ("mm: numa: clear numa hinting information on mprotect") that was also included in 3.12-stable. The problem was related to how xen translates ptes because it was not accounting for the _PAGE_NUMA bit. This patch splits pte_present to add a pteval_present helper for use by xen so both bare metal and xen use the same code when checking if a PTE is present. [mgorman@suse.de: wrote changelog, proposed minor modifications] [akpm@linux-foundation.org: fix typo in comment] Reported-by: Steven Noonan <steven@uplinklabs.net> Tested-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Elena Ufimtseva <ufimtseva@gmail.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If you do echo 0 > /sys/module/edac_core/parameters/edac_mc_poll_msec the following stack trace is output because the edac module is not designed to poll with a timeout of zero. WARNING: CPU: 12 PID: 0 at lib/list_debug.c:33 __list_add+0xac/0xc0() list_add corruption. prev->next should be next (ffff8808291dd1b8), but was (null). (prev=ffff8808286fe3f8). Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 12 PID: 0 Comm: swapper/12 Not tainted 3.13.0+ #1 Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 Call Trace: <IRQ> __list_add+0xac/0xc0 __internal_add_timer+0xab/0x130 internal_add_timer+0x17/0x40 mod_timer_pinned+0xca/0x170 intel_pstate_timer_func+0x28a/0x380 call_timer_fn+0x36/0x100 run_timer_softirq+0x1ff/0x2f0 __do_softirq+0xf5/0x2e0 irq_exit+0x10d/0x120 smp_apic_timer_interrupt+0x45/0x60 apic_timer_interrupt+0x6d/0x80 <EOI> cpuidle_idle_call+0xb9/0x1f0 arch_cpu_idle+0xe/0x30 cpu_startup_entry+0x9e/0x240 start_secondary+0x1e4/0x290 kernel BUG at kernel/timer.c:1084! invalid opcode: 0000 [#1] SMP Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 12 PID: 0 Comm: swapper/12 Tainted: G W 3.13.0+ #1 Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 Call Trace: <IRQ> run_timer_softirq+0x245/0x2f0 __do_softirq+0xf5/0x2e0 irq_exit+0x10d/0x120 smp_apic_timer_interrupt+0x45/0x60 apic_timer_interrupt+0x6d/0x80 <EOI> cpuidle_idle_call+0xb9/0x1f0 arch_cpu_idle+0xe/0x30 cpu_startup_entry+0x9e/0x240 start_secondary+0x1e4/0x290 RIP cascade+0x93/0xa0 WARNING: CPU: 36 PID: 1154 at kernel/workqueue.c:1461 __queue_delayed_work+0xed/0x1a0() Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 36 PID: 1154 Comm: kworker/u481:3 Tainted: G W 3.13.0+ #1 Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 Workqueue: edac-poller edac_mc_workq_function [edac_core] Call Trace: dump_stack+0x45/0x56 warn_slowpath_common+0x7d/0xa0 warn_slowpath_null+0x1a/0x20 __queue_delayed_work+0xed/0x1a0 queue_delayed_work_on+0x27/0x50 edac_mc_workq_function+0x72/0xa0 [edac_core] process_one_work+0x17b/0x460 worker_thread+0x11b/0x400 kthread+0xd2/0xf0 ret_from_fork+0x7c/0xb0 This patch adds a range check in the edac_mc_poll_msec code to check for 0. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vladimir reported the following issue: Commit c65c187 ("slub: use lockdep_assert_held") requires remove_partial() to be called with n->list_lock held, but free_partial() called from kmem_cache_close() on cache destruction does not follow this rule, leading to a warning: WARNING: CPU: 0 PID: 2787 at mm/slub.c:1536 __kmem_cache_shutdown+0x1b2/0x1f0() Modules linked in: CPU: 0 PID: 2787 Comm: modprobe Tainted: G W 3.14.0-rc1-mm1+ #1 Hardware name: 0000000000000600 ffff88003ae1dde8 ffffffff816d9583 0000000000000600 0000000000000000 ffff88003ae1de28 ffffffff8107c107 0000000000000000 ffff880037ab2b00 ffff88007c240d30 ffffea0001ee5280 ffffea0001ee52a0 Call Trace: __kmem_cache_shutdown+0x1b2/0x1f0 kmem_cache_destroy+0x43/0xf0 xfs_destroy_zones+0x103/0x110 [xfs] exit_xfs_fs+0x38/0x4e4 [xfs] SyS_delete_module+0x19a/0x1f0 system_call_fastpath+0x16/0x1b His solution was to add a spinlock in order to quiet lockdep. Although there would be no contention to adding the lock, that lock also requires disabling of interrupts which will have a larger impact on the system. Instead of adding a spinlock to a location where it is not needed for lockdep, make a __remove_partial() function that does not test if the list_lock is held, as no one should have it due to it being freed. Also added a __add_partial() function that does not do the lock validation either, as it is not needed for the creation of the cache. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Vladimir Davydov <vdavydov@parallels.com> Suggested-by: David Rientjes <rientjes@google.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If napi is left enabled after a failed attempt to bring the interface up, we BUG: fec 2188000.ethernet eth0: no PHY, assuming direct connection to switch libphy: PHY fixed-0:00 not found fec 2188000.ethernet eth0: could not attach to PHY ------------[ cut here ]------------ kernel BUG at include/linux/netdevice.h:502! Internal error: Oops - BUG: 0 [#1] SMP ARM ... PC is at fec_enet_open+0x4d0/0x500 LR is at __dev_open+0xa4/0xfc Only enable napi after we are past all the failure paths. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
If we cannot calibrate TSC via MSR based calibration try_msr_calibrate_tsc() stores zero to fast_calibrate and returns that to the caller. This value gets then propagated further to clockevents code resulting division by zero oops like the one below: divide error: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.13.0+ torvalds#47 task: ffff880075508000 ti: ffff880075506000 task.ti: ffff880075506000 RIP: 0010:[<ffffffff810aec14>] [<ffffffff810aec14>] clockevents_config.part.3+0x24/0xa0 RSP: 0000:ffff880075507e58 EFLAGS: 00010246 RAX: ffffffffffffffff RBX: ffff880079c0cd80 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff RBP: ffff880075507e70 R08: 0000000000000001 R09: 00000000000000be R10: 00000000000000bd R11: 0000000000000003 R12: 000000000000b008 R13: 0000000000000008 R14: 000000000000b010 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff880079c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff880079fff000 CR3: 0000000001c0b000 CR4: 00000000001006f0 Stack: ffff880079c0cd80 000000000000b008 0000000000000008 ffff880075507e88 ffffffff810aecb0 ffff880079c0cd80 ffff880075507e98 ffffffff81030168 ffff880075507ed8 ffffffff81d1104f 00000000000000c3 0000000000000000 Call Trace: [<ffffffff810aecb0>] clockevents_config_and_register+0x20/0x30 [<ffffffff81030168>] setup_APIC_timer+0xc8/0xd0 [<ffffffff81d1104f>] setup_boot_APIC_clock+0x4cc/0x4d8 [<ffffffff81d0f5de>] native_smp_prepare_cpus+0x3dd/0x3f0 [<ffffffff81d02ee9>] kernel_init_freeable+0xc3/0x205 [<ffffffff8177c910>] ? rest_init+0x90/0x90 [<ffffffff8177c91e>] kernel_init+0xe/0x120 [<ffffffff8178deec>] ret_from_fork+0x7c/0xb0 [<ffffffff8177c910>] ? rest_init+0x90/0x90 Prevent this from happening by: 1) Modifying try_msr_calibrate_tsc() to return calibration value or zero if it fails. 2) Check this return value in native_calibrate_tsc() and in case of zero fallback to use normal non-MSR based calibration. [mw: Added subject and changelog] Reported-and-tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bin Gao <bin.gao@linux.intel.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1392810750-18660-1-git-send-email-mika.westerberg@linux.intel.com Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Kirill has reported the following: Task in /test killed as a result of limit of /test memory: usage 10240kB, limit 10240kB, failcnt 51 memory+swap: usage 10240kB, limit 10240kB, failcnt 0 kmem: usage 0kB, limit 18014398509481983kB, failcnt 0 Memory cgroup stats for /test: BUG: sleeping function called from invalid context at kernel/cpu.c:68 in_atomic(): 1, irqs_disabled(): 0, pid: 66, name: memcg_test 2 locks held by memcg_test/66: #0: (memcg_oom_lock#2){+.+...}, at: [<ffffffff81131014>] pagefault_out_of_memory+0x14/0x90 #1: (oom_info_lock){+.+...}, at: [<ffffffff81197b2a>] mem_cgroup_print_oom_info+0x2a/0x390 CPU: 2 PID: 66 Comm: memcg_test Not tainted 3.14.0-rc1-dirty torvalds#745 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011 Call Trace: __might_sleep+0x16a/0x210 get_online_cpus+0x1c/0x60 mem_cgroup_read_stat+0x27/0xb0 mem_cgroup_print_oom_info+0x260/0x390 dump_header+0x88/0x251 ? trace_hardirqs_on+0xd/0x10 oom_kill_process+0x258/0x3d0 mem_cgroup_oom_synchronize+0x656/0x6c0 ? mem_cgroup_charge_common+0xd0/0xd0 pagefault_out_of_memory+0x14/0x90 mm_fault_error+0x91/0x189 __do_page_fault+0x48e/0x580 do_page_fault+0xe/0x10 page_fault+0x22/0x30 which complains that mem_cgroup_read_stat cannot be called from an atomic context but mem_cgroup_print_oom_info takes a spinlock. Change oom_info_lock to a mutex. This was introduced by 947b3dd ("memcg, oom: lock mem_cgroup_print_oom_info"). Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hi John
Here are a few trivial commits I made on your test branch to be able to compile and to give it a try...