Add support for Sabre ESS9018 #1

koalo · 2013-05-04T18:26:47Z

Copied from forum, by steveha:

The main difference between them is ESS9018 is running on 64fs bit clock I2S and TDA1541A only does 48fs. Also, ESS9018 can support higher sampling rates like 176.4 khz or 192 khz. Particular for 176.4 khz, I did aware there is a custom mpd to stream native DSD signals (i.e. not DoP) over I2S.

Main tasks:

Enable support for higher bitrates in the i2s driver
Write dummy driver for ESS9018

koalo · 2013-05-06T11:11:54Z

Steve, do your boards run as I2S master?

steveha8838 · 2013-05-06T16:02:37Z

I think it should be I2S master. I bought the boards in pair - transmitter and receiver. The transmitter board is mounted on raspberry pi. The I2S signals alway sends from transmitter to receiver. Hope this help.

koalo · 2013-05-06T16:09:53Z

The interesting part is the clock: Is that generated by your board (good) or should it be generated by the raspberry (not so good)?

steveha8838 · 2013-05-06T18:47:08Z

I have no tool to measure how good is the clock of raspberry. But based on the listening test of TDA1541A, its clock should be good enough. For ESS9018, some documents has shown the DAC itself will resample the incoming signals according to the board's clock. So, it will suggest to get somthing first that can be recognised by ESS9018.

steveha8838 · 2013-05-07T14:14:18Z

Latest update. After compilation of 06 May Version. 44.1/16 can be recognised on both TDA1541A and ESS9018. For ESS9018, anything above 441.khz generated noise only. What have you done to 44.1khz ?

koalo · 2013-05-08T09:48:43Z

Done by ca53c99

Please test it!

steveha8838 · 2013-05-08T10:21:27Z

It's my pleasure. Just don't laught at me. I leant how to compile the kernel within raspberry a week ago.
Your efficiency simply faster than my leaning curve.

koalo · 2013-05-08T10:52:28Z

It wasn't meant to be an order ;-) Take your time and let me know when I can help you with something.

For me 32 bit does not sound perfect, but this could be because of my source material.

steveha8838 · 2013-05-08T11:08:10Z

I am eager to do it. Just want to provide you feedback at the earliest.

steveha8838 · 2013-05-08T19:05:20Z

Finally, I have my 1st trial with the following results :-

all modules (TDA1541a and ESS9018) can be loaded sucessfully;
All sample frequencies up to 192Khz can be played on Wyred DAC2 with sound (no noise);
- Samples frequences matched with hw_params
- However
  while playing 44.1khz song, LED display on DAC2 showing 88.2khz
  while playing 96kkhz song, LED display on DAC2 showing 192khz
  while playing 192khz song, LED display on DAC2 showing '----'
Can't test songs over 192khz, it seems you have set the limit to 192khz
FYI, DXD files can be up to 352.8khz or 384khz
Tomorrow, I will investigate the sound quality.

Hope this help

steveha8838 · 2013-05-09T06:28:06Z

The sound quality is relly good on my Wyred DAC2. At the same times, I have performed some tests using mplayer. By default, mplayer will pick S16_LE. I have no problem to stream radio contents to my dac and dac can display the correct sampling frequency. I will say 16 bits model is quit perfect.

To enforce S32_LE within mplayer, I need to specific 'plughw' instead of 'hw' on command line.

e.g. mplayer -ao alsa:device=plughw=1.0 /music/transfer/192.wav

The behavior is similar to that of mpd. The I2S signals have been upsampled.

One more point I want to add :- when streamling DSD contents, the bit clock has been set correctly to 176.4khz using the current prototype. The missing part is to treat LRCLK as data. May be I am too optimistic
at this moment.

koalo · 2013-05-10T13:08:46Z

Thank you so much for testing!!

However while playing 44.1khz song, LED display on DAC2 showing 88.2khz while playing 96kkhz song, LED display on DAC2 showing 192khz while playing 192khz song, LED display on DAC2 showing '----'

That could correspond to issue #10, but I don't know.
Although, I measured the correct frequencies with a scope.

Can't test songs over 192khz, it seems you have set the limit to 192khz FYI, DXD files can be up to 352.8khz or 384khz

ALSA doesn't support sampling rates above 192kHz. At least, there is no corresponding macro. I am sorry!

To enforce S32_LE within mplayer, I need to specific 'plughw' instead of 'hw' on command line.

For me mplayer automatically chooses 32 bit as long as the audio file itself is 32 bit. Although, for 24 bit files it uses 16 bit. What is the bit depth of your files?

One more point I want to add :- when streamling DSD contents, the bit clock has been set correctly to 176.4khz
using the current prototype. The missing part is to treat LRCLK as data. May be I am too optimistic
at this moment.

Sorry, I don't understand what you want to do :-(

steveha8838 · 2013-05-12T13:29:41Z

For mplayer, I am testing it with 24bits file. The last part is related to DSD streaming vis I2S. It makes a big differenc to PCM streaming. It guess it should be out of scope. I have been playing Raspberry PI with Wyred DAC2 for last two days. They are running smoothly and I have nothing to report further.

koalo · 2013-05-13T16:32:10Z

This is really great to hear that it runs smoothly!
DSD content could be possible in some way, but as the Raspberry Pi has no direct support for that I think we wouldn't be happy with that.

I think I can close this issue now.

On continuous loading and unloading of AR6004 ath6kl USB driver it triggers a panic due to NULL pointer dereference of 'target' pointer. while true; do sudo modprobe -v ath6kl_core; sudo modprobe -v ath6kl_usb; sudo modprobe -r usb; sudo modprobe -r ath6kl_core; done ar->htc_target can be NULL due to a race condition that can occur during driver initialization(we do 'ath6kl_hif_power_on' before initializing 'ar->htc_target' via 'ath6kl_htc_create'). 'ath6kl_hif_power_on' assigns 'ath6kl_recv_complete' as usb_complete_t/callback function for 'usb_fill_bulk_urb'. Thus the possibility of ar->htc_target being NULL via ath6kl_recv_complete -> ath6kl_usb_io_comp_work before even 'ath6kl_htc_create' is finished to initialize ar->htc_create. Worth noting is the obvious solution of doing 'ath6kl_hif_power_on' later(i.e after we are done with 'ath6kl_htc_create', causes a h/w bring up failure in AR6003 SDIO, as 'ath6kl_hif_power_on' is a pre-requisite to get the target version 'ath6kl_bmi_get_target_info'. So simply check for NULL pointer for 'ar->htc_target' and bail out. [23614.518282] BUG: unable to handle kernel NULL pointer dereference at 00000904 [23614.518463] IP: [<c012e7a6>] __ticket_spin_trylock+0x6/0x30 [23614.518570] *pde = 00000000 [23614.518664] Oops: 0000 [#1] SMP [23614.518795] Modules linked in: ath6kl_usb(O+) ath6kl_core(O) [23614.520012] EIP: 0060:[<c012e7a6>] EFLAGS: 00010286 CPU: 0 [23614.520012] EIP is at __ticket_spin_trylock+0x6/0x30 Call Trace: [<c03f2a44>] do_raw_spin_trylock+0x14/0x40 [<c06daa12>] _raw_spin_lock_bh+0x52/0x80 [<f85464b4>] ? ath6kl_htc_pipe_rx_complete+0x3b4/0x4c0 [ath6kl_core] [<f85464b4>] ath6kl_htc_pipe_rx_complete+0x3b4/0x4c0 [ath6kl_core] [<c05bc272>] ? skb_dequeue+0x22/0x70 [<c05bc272>] ? skb_dequeue+0x22/0x70 [<f855bb32>] ath6kl_core_rx_complete+0x12/0x20 [ath6kl_core] [<f848771a>] ath6kl_usb_io_comp_work+0xaa/0xb0 [ath6kl_usb] [<c015b863>] process_one_work+0x1a3/0x5e0 [<c015b7e7>] ? process_one_work+0x127/0x5e0 [<f8487670>] ? ath6kl_usb_reset_resume+0x30/0x30 [ath6kl_usb] [<c015bfde>] worker_thread+0x11e/0x3f0 Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>

It is easy to trigger this crash on 3.7.0: root@intel_westmere_ep-3:~# modprobe -r i7core_edac EDAC PCI: Removed device 0 for i7core_edac EDAC PCI controller: DEV 0000:fe:03.0 EDAC MC: Removed device 1 for i7core_edac.c i7 core #1: DEV 0000:fe:03.0 EDAC PCI: Removed device 1 for i7core_edac EDAC PCI controller: DEV 0000:ff:03.0 EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:ff:03.0 BUG: unable to handle kernel NULL pointer dereference at 0000000000000110 IP: [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80 PGD 1eaae7067 PUD 1e96e4067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: minix acpi_cpufreq freq_table mperf ioatdma processor edac_core(-) iTCO_wdt coretemp evdev hwmon lpc_ich dca mfd_core crc32c_intel ioapic [last unloaded: i7core_edac] CPU 3 Pid: 1268, comm: modprobe Not tainted 3.7.0-WR5.0.1.0_standard+ #30 Intel Corporation S5520HC/S5520HC RIP: 0010:[<ffffffff82069ee9>] [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80 RSP: 0018:ffff8801eb12de28 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000000f0 RCX: 00000000ffffffff RDX: ffff88012b452800 RSI: 0000000000000002 RDI: 00000000000000f0 RBP: ffff8801eb12de68 R08: 0000000000000000 R09: ffffea0004ad1118 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff8801eb12dee8 R14: ffff88012b452800 R15: 000000000060e518 FS: 00007f9ea95a9700(0000) GS:ffff8801efc20000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000110 CR3: 00000001262f1000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process modprobe (pid: 1268, threadinfo ffff8801eb12c000, task ffff8801e8421690) Stack: ffff88012c802a00 ffff88012b445ec0 ffff88012c802300 ffff88012b452800 0000000000000000 ffff8801eb12dee8 000000000060e080 000000000060e518 ffff8801eb12de78 ffffffff82069f56 ffff8801eb12dea8 ffffffff824ead7c Call Trace: [<ffffffff82069f56>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff824ead7c>] device_del+0x3c/0x1d0 [<ffffffffa00095a8>] edac_mc_sysfs_exit+0x1c/0x2f [edac_core] [<ffffffffa000961c>] edac_exit+0x4f/0x56 [edac_core] [<ffffffff820a3d2a>] sys_delete_module+0x17a/0x240 [<ffffffff8212da7c>] ? vm_munmap+0x5c/0x80 [<ffffffff82877682>] system_call_fastpath+0x16/0x1b Code: 90 90 55 48 89 e5 48 83 ec 40 48 89 5d d8 4c 89 65 e0 4c 89 6d e8 4c 89 75 f0 4c 89 7d f8 66 66 66 66 90 31 c0 49 89 d6 48 89 fb <48> 8b 57 20 49 89 f5 41 89 cf 4c 8d 67 20 48 85 d2 74 2c 4c 89 RIP [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80 RSP <ffff8801eb12de28> CR2: 0000000000000110 ---[ end trace b69acf12ccad1c0d ]--- Usually, edac_subsys is grabbed one time by pci at initialization. But edac_subsys may be released several times if multiple pci MCs exist. The fix just makes the operations balanced. Signed-off-by: Lans Zhang <jia.zhang@windriver.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

The check in omap_musb_mailbox does not properly check if the module has been fully initialized. The patch fixes that, and the kernel panic below: $ modprobe twl4030-usb [ 13.924743] twl4030_usb twl4030-usb.33: HW_CONDITIONS 0xe0/224; link 3 [ 13.940307] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [ 13.948883] pgd = ef27c000 [ 13.951751] [00000004] *pgd=af256831, *pte=00000000, *ppte=00000000 [ 13.958374] Internal error: Oops: 17 [#1] ARM [ 13.962921] Modules linked in: twl4030_usb(+) omap2430 libcomposite [ 13.969543] CPU: 0 Not tainted (3.8.0-rc1-n9xx-11758-ge37a37c-dirty #6) [ 13.976867] PC is at omap_musb_mailbox+0x18/0x54 [omap2430] [ 13.982727] LR is at twl4030_usb_probe+0x240/0x354 [twl4030_usb] [ 13.989013] pc : [<bf013b6c>] lr : [<bf018958>] psr: 60000013 [ 13.989013] sp : ef273cf0 ip : ef273d08 fp : ef273d04 [ 14.001068] r10: bf01b000 r9 : bf0191d8 r8 : 00000001 [ 14.006530] r7 : 00000000 r6 : ef140e10 r5 : 00000003 r4 : 00000000 [ 14.013397] r3 : bf0142dc r2 : 00000006 r1 : 00000000 r0 : 00000003 [ 14.020233] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 14.027740] Control: 10c5387d Table: af27c019 DAC: 00000015 [ 14.033752] Process modprobe (pid: 616, stack limit = 0xef272238) [ 14.040161] Stack: (0xef273cf0 to 0xef274000) [ 14.044708] 3ce0: ef254310 00000001 ef273d34 ef273d08 [ 14.053314] 3d00: bf018958 bf013b60 bf0190a4 ef254310 c0101550 c0c3a138 ef140e10 ef140e44 [ 14.061889] 3d20: bf019150 00000001 ef273d44 ef273d38 c019890c bf018724 ef273d64 ef273d48 [ 14.070495] 3d40: c01974fc c01988f8 ef140e10 bf019150 ef140e44 00000000 ef273d84 ef273d68 [ 14.079071] 3d60: c0197728 c019748c c0197694 00000000 bf019150 c0197694 ef273dac ef273d88 [ 14.087677] 3d80: c0195c38 c01976a0 ef03610c ef143eb0 c0128954 ef254780 bf019150 c0b19548 [ 14.096252] 3da0: ef273dbc ef273db0 c0197098 c0195bf0 ef273dec ef273dc0 c0196c98 c0197080 [ 14.104858] 3dc0: bf0190a4 c0b27bc0 ef273dec bf019150 bf019190 c0b27bc0 ef272000 00000001 [ 14.113433] 3de0: ef273e14 ef273df0 c0197c18 c0196b30 ef273f48 bf019190 c0b27bc0 ef272000 [ 14.122039] 3e00: 00000001 bf01b000 ef273e24 ef273e18 c0198b28 c0197ba4 ef273e34 ef273e28 [ 14.130615] 3e20: bf01b014 c0198ae8 ef273e8c ef273e38 c0008918 bf01b00c c004f730 c012ba1c [ 14.139221] 3e40: ef273e74 00000000 c00505b0 c004f72c 00000000 ef273e60 ef273f48 bf019190 [ 14.147796] 3e60: 00000001 ef273f48 bf019190 00000001 ef286340 00000001 bf0191d8 c0065414 [ 14.156402] 3e80: ef273f44 ef273e90 c0067754 c00087fc bf01919c 00007fff c0064794 00000000 [ 14.164978] 3ea0: ef273ecc f0064000 00000001 ef272000 ef272000 00067f39 bf0192b0 bf01919c [ 14.173583] 3ec0: ef273f0c ef273ed0 c00a6bf0 c00a53fc ff000000 000000d2 c0067dc8 00000000 [ 14.182159] 3ee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 14.190765] 3f00: 00000000 00000000 00000000 00000000 00000000 00000000 ffffffff 00002968 [ 14.199340] 3f20: 00080878 00067f39 00000080 c000e2e8 ef272000 00000000 ef273fa4 ef273f48 [ 14.207946] 3f40: c0067e54 c0066188 f0064000 00002968 f0065530 f0065463 f0065fb0 000012c4 [ 14.216522] 3f60: 00001664 00000000 00000000 00000000 00000014 00000015 0000000c 00000000 [ 14.225128] 3f80: 00000008 00000000 00000000 00080370 00080878 0007422c 00000000 ef273fa8 [ 14.233703] 3fa0: c000e140 c0067d80 00080370 00080878 00080878 00002968 00067f39 00000000 [ 14.242309] 3fc0: 00080370 00080878 0007422c 00000080 00074030 00067f39 bec7aef8 00000000 [ 14.250885] 3fe0: b6f05300 bec7ab68 0000e93c b6f05310 60000010 00080878 af7fe821 af7fec21 [ 14.259460] Backtrace: [ 14.262054] [<bf013b54>] (omap_musb_mailbox+0x0/0x54 [omap2430]) from [<bf018958>] (twl4030_usb_probe+0x240/0x354 [twl4030_usb]) [ 14.274200] r5:00000001 r4:ef254310 [ 14.277984] [<bf018718>] (twl4030_usb_probe+0x0/0x354 [twl4030_usb]) from [<c019890c>] (platform_drv_probe+0x20/0x24) [ 14.289123] r8:00000001 r7:bf019150 r6:ef140e44 r5:ef140e10 r4:c0c3a138 [ 14.296203] [<c01988ec>] (platform_drv_probe+0x0/0x24) from [<c01974fc>] (driver_probe_device+0x7c/0x214) [ 14.306243] [<c0197480>] (driver_probe_device+0x0/0x214) from [<c0197728>] (__driver_attach+0x94/0x98) [ 14.316009] r7:00000000 r6:ef140e44 r5:bf019150 r4:ef140e10 [ 14.321990] [<c0197694>] (__driver_attach+0x0/0x98) from [<c0195c38>] (bus_for_each_dev+0x54/0x88) [ 14.331390] r6:c0197694 r5:bf019150 r4:00000000 r3:c0197694 [ 14.337371] [<c0195be4>] (bus_for_each_dev+0x0/0x88) from [<c0197098>] (driver_attach+0x24/0x28) [ 14.346588] r6:c0b19548 r5:bf019150 r4:ef254780 [ 14.351440] [<c0197074>] (driver_attach+0x0/0x28) from [<c0196c98>] (bus_add_driver+0x174/0x244) [ 14.360687] [<c0196b24>] (bus_add_driver+0x0/0x244) from [<c0197c18>] (driver_register+0x80/0x154) [ 14.370086] r8:00000001 r7:ef272000 r6:c0b27bc0 r5:bf019190 r4:bf019150 [ 14.377136] [<c0197b98>] (driver_register+0x0/0x154) from [<c0198b28>] (platform_driver_register+0x4c/0x60) [ 14.387390] [<c0198adc>] (platform_driver_register+0x0/0x60) from [<bf01b014>] (twl4030_usb_init+0x14/0x1c [twl4030_usb]) [ 14.398895] [<bf01b000>] (twl4030_usb_init+0x0/0x1c [twl4030_usb]) from [<c0008918>] (do_one_initcall+0x128/0x1a8) [ 14.409790] [<c00087f0>] (do_one_initcall+0x0/0x1a8) from [<c0067754>] (load_module+0x15d8/0x1bf8) [ 14.419189] [<c006617c>] (load_module+0x0/0x1bf8) from [<c0067e54>] (sys_init_module+0xe0/0xf4) [ 14.428344] [<c0067d74>] (sys_init_module+0x0/0xf4) from [<c000e140>] (ret_fast_syscall+0x0/0x30) [ 14.437652] r6:0007422c r5:00080878 r4:00080370 [ 14.442504] Code: e24cb004 e59f3038 e1a05000 e593401c (e5940004) [ 14.448944] ---[ end trace dbf47e5bc5ba03c2 ]--- [ 14.453826] Kernel panic - not syncing: Fatal exception Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Felipe Balbi <balbi@ti.com>

On systems where wd and amthif is not initialized we will hit cl->dev == NULL. This condition is okay so we don't need to be laud about it. Fixes the follwing warning during suspend [ 137.061985] WARNING: at drivers/misc/mei/client.c:315 mei_cl_unlink+0x86/0x90 [mei]() [ 137.061986] Hardware name: 530U3BI/530U4BI/530U4BH [ 137.062140] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek joydev coretemp kvm_intel snd_hda_intel snd_hda_codec kvm arc4 iwldvm snd_hwdep i915 snd_pcm mac80211 ghash_clmulni_intel snd_page_alloc aesni_intel snd_seq_midi xts snd_seq_midi_event aes_x86_64 rfcomm snd_rawmidi parport_pc bnep lrw snd_seq uvcvideo i2c_algo_bit ppdev gf128mul iwlwifi snd_timer drm_kms_helper ablk_helper cryptd drm snd_seq_device videobuf2_vmalloc psmouse videobuf2_memops snd cfg80211 btusb videobuf2_core soundcore videodev lp bluetooth samsung_laptop wmi microcode mei serio_raw mac_hid video hid_generic lpc_ich parport usbhid hid r8169 [ 137.062143] Pid: 2706, comm: kworker/u:15 Tainted: G D W 3.8.0-rc2-next20130109-1-iniza-generic #1 [ 137.062144] Call Trace: [ 137.062156] [<ffffffff8105860f>] warn_slowpath_common+0x7f/0xc0 [ 137.062159] [<ffffffff8135b1ea>] ? ioread32+0x3a/0x40 [ 137.062162] [<ffffffff8105866a>] warn_slowpath_null+0x1a/0x20 [ 137.062168] [<ffffffffa0076be6>] mei_cl_unlink+0x86/0x90 [mei] [ 137.062173] [<ffffffffa0071325>] mei_reset+0xc5/0x240 [mei] [ 137.062178] [<ffffffffa0073703>] mei_pci_resume+0xa3/0x110 [mei] [ 137.062183] [<ffffffff81379cae>] pci_pm_resume+0x7e/0xe0 [ 137.062185] [<ffffffff81379c30>] ? pci_pm_thaw+0x80/0x80 [ 137.062189] [<ffffffff8145a415>] dpm_run_callback.isra.6+0x25/0x50 [ 137.062192] [<ffffffff8145a6cf>] device_resume+0x9f/0x140 [ 137.062194] [<ffffffff8145a791>] async_resume+0x21/0x50 [ 137.062200] [<ffffffff810858b0>] async_run_entry_fn+0x90/0x1c0 [ 137.062203] [<ffffffff810778e5>] process_one_work+0x155/0x460 [ 137.062207] [<ffffffff81078578>] worker_thread+0x168/0x400 [ 137.062210] [<ffffffff81078410>] ? manage_workers+0x2b0/0x2b0 [ 137.062214] [<ffffffff8107d9f0>] kthread+0xc0/0xd0 [ 137.062218] [<ffffffff8107d930>] ? flush_kthread_worker+0xb0/0xb0 [ 137.062222] [<ffffffff816bac6c>] ret_from_fork+0x7c/0xb0 [ 137.062228] [<ffffffff8107d930>] ? flush_kthread_worker+0xb0/0xb0 Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5d21cc2 ("cpuset: replace cgroup_mutex locking with cpuset internal locking") incorrectly converted proc_cpuset_show() from cgroup_lock() to cpuset_mutex. proc_cpuset_show() is accessing cgroup hierarchy proper to determine cgroup path which can't be protected by cpuset_mutex. This triggered the following RCU warning. =============================== [ INFO: suspicious RCU usage. ] 3.8.0-rc3-next-20130114-sasha-00016-ga107525-dirty raspberrypi#262 Tainted: G W ------------------------------- include/linux/cgroup.h:534 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 2 locks held by trinity/7514: #0: (&p->lock){+.+.+.}, at: [<ffffffff812b06aa>] seq_read+0x3a/0x3e0 #1: (cpuset_mutex){+.+...}, at: [<ffffffff811abae4>] proc_cpuset_show+0x84/0x190 stack backtrace: Pid: 7514, comm: trinity Tainted: G W +3.8.0-rc3-next-20130114-sasha-00016-ga107525-dirty raspberrypi#262 Call Trace: [<ffffffff81182cab>] lockdep_rcu_suspicious+0x10b/0x120 [<ffffffff811abb71>] proc_cpuset_show+0x111/0x190 [<ffffffff812b0827>] seq_read+0x1b7/0x3e0 [<ffffffff812b0670>] ? seq_lseek+0x110/0x110 [<ffffffff8128b4fb>] do_loop_readv_writev+0x4b/0x90 [<ffffffff8128b776>] do_readv_writev+0xf6/0x1d0 [<ffffffff8128b8ee>] vfs_readv+0x3e/0x60 [<ffffffff8128b960>] sys_readv+0x50/0xd0 [<ffffffff83d33d18>] tracesys+0xe1/0xe6 The operation can be performed under RCU read lock. Replace cpuset_mutex locking with RCU read locking. tj: Rewrote patch description. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>

Since commit 89c8d91 ("tty: localise the lock") I see a dead lock in one of my dummy_hcd + g_nokia test cases. The first run was usually okay, the second often resulted in a splat by lockdep and the third was usually a dead lock. Lockdep complained about tty->hangup_work and tty->legacy_mutex taken both ways: | ====================================================== | [ INFO: possible circular locking dependency detected ] | 3.7.0-rc6+ raspberrypi#204 Not tainted | ------------------------------------------------------- | kworker/2:1/35 is trying to acquire lock: | (&tty->legacy_mutex){+.+.+.}, at: [<c14051e6>] tty_lock_nested+0x36/0x80 | | but task is already holding lock: | ((&tty->hangup_work)){+.+...}, at: [<c104f6e4>] process_one_work+0x124/0x5e0 | | which lock already depends on the new lock. | | the existing dependency chain (in reverse order) is: | | -> #2 ((&tty->hangup_work)){+.+...}: | [<c107fe74>] lock_acquire+0x84/0x190 | [<c104d82d>] flush_work+0x3d/0x240 | [<c12e6986>] tty_ldisc_flush_works+0x16/0x30 | [<c12e7861>] tty_ldisc_release+0x21/0x70 | [<c12e0dfc>] tty_release+0x35c/0x470 | [<c1105e28>] __fput+0xd8/0x270 | [<c1105fcd>] ____fput+0xd/0x10 | [<c1051dd9>] task_work_run+0xb9/0xf0 | [<c1002a51>] do_notify_resume+0x51/0x80 | [<c140550a>] work_notifysig+0x35/0x3b | | -> #1 (&tty->legacy_mutex/1){+.+...}: | [<c107fe74>] lock_acquire+0x84/0x190 | [<c140276c>] mutex_lock_nested+0x6c/0x2f0 | [<c14051e6>] tty_lock_nested+0x36/0x80 | [<c1405279>] tty_lock_pair+0x29/0x70 | [<c12e0bb8>] tty_release+0x118/0x470 | [<c1105e28>] __fput+0xd8/0x270 | [<c1105fcd>] ____fput+0xd/0x10 | [<c1051dd9>] task_work_run+0xb9/0xf0 | [<c1002a51>] do_notify_resume+0x51/0x80 | [<c140550a>] work_notifysig+0x35/0x3b | | -> #0 (&tty->legacy_mutex){+.+.+.}: | [<c107f3c9>] __lock_acquire+0x1189/0x16a0 | [<c107fe74>] lock_acquire+0x84/0x190 | [<c140276c>] mutex_lock_nested+0x6c/0x2f0 | [<c14051e6>] tty_lock_nested+0x36/0x80 | [<c140523f>] tty_lock+0xf/0x20 | [<c12df8e4>] __tty_hangup+0x54/0x410 | [<c12dfcb2>] do_tty_hangup+0x12/0x20 | [<c104f763>] process_one_work+0x1a3/0x5e0 | [<c104fec9>] worker_thread+0x119/0x3a0 | [<c1055084>] kthread+0x94/0xa0 | [<c140ca37>] ret_from_kernel_thread+0x1b/0x28 | |other info that might help us debug this: | |Chain exists of: | &tty->legacy_mutex --> &tty->legacy_mutex/1 --> (&tty->hangup_work) | | Possible unsafe locking scenario: | | CPU0 CPU1 | ---- ---- | lock((&tty->hangup_work)); | lock(&tty->legacy_mutex/1); | lock((&tty->hangup_work)); | lock(&tty->legacy_mutex); | | *** DEADLOCK *** Before the path mentioned tty_ldisc_release() look like this: | tty_ldisc_halt(tty); | tty_ldisc_flush_works(tty); | tty_lock(); As it can be seen, it first flushes the workqueue and then grabs the tty_lock. Now we grab the lock first: | tty_lock_pair(tty, o_tty); | tty_ldisc_halt(tty); | tty_ldisc_flush_works(tty); so lockdep's complaint seems valid. The earlier version of this patch took the ldisc_mutex since the other user of tty_ldisc_flush_works() (tty_set_ldisc()) did this. Peter Hurley then said that it is should not be requried. Since it wasn't done earlier, I dropped this part. The code under tty_ldisc_kill() was executed earlier with the tty lock taken so it is taken again. I was able to reproduce the deadlock on v3.8-rc1, this patch fixes the problem in my testcase. I didn't notice any problems so far. Cc: Alan Cox <alan@linux.intel.com> Cc: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Allow multiple listener sockets to bind to the same port. Motivation for soresuseport would be something like a web server binding to port 80 running with multiple threads, where each thread might have it's own listener socket. This could be done as an alternative to other models: 1) have one listener thread which dispatches completed connections to workers. 2) accept on a single listener socket from multiple threads. In case #1 the listener thread can easily become the bottleneck with high connection turn-over rate. In case #2, the proportion of connections accepted per thread tends to be uneven under high connection load (assuming simple event loop: while (1) { accept(); process() }, wakeup does not promote fairness among the sockets. We have seen the disproportion to be as high as 3:1 ratio between thread accepting most connections and the one accepting the fewest. With so_reusport the distribution is uniform. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>

Motivation for soreuseport would be something like a web server binding to port 80 running with multiple threads, where each thread might have it's own listener socket. This could be done as an alternative to other models: 1) have one listener thread which dispatches completed connections to workers. 2) accept on a single listener socket from multiple threads. In case #1 the listener thread can easily become the bottleneck with high connection turn-over rate. In case #2, the proportion of connections accepted per thread tends to be uneven under high connection load (assuming simple event loop: while (1) { accept(); process() }, wakeup does not promote fairness among the sockets. We have seen the disproportion to be as high as 3:1 ratio between thread accepting most connections and the one accepting the fewest. With so_reusport the distribution is uniform. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>

Tom Herbert says: ==================== This series implements so_reuseport (SO_REUSEPORT socket option) for TCP and UDP. For TCP, so_reuseport allows multiple listener sockets to be bound to the same port. In the case of UDP, so_reuseport allows multiple sockets to bind to the same port. To prevent port hijacking all sockets bound to the same port using so_reuseport must have the same uid. Received packets are distributed to multiple sockets bound to the same port using a 4-tuple hash. The motivating case for so_resuseport in TCP would be something like a web server binding to port 80 running with multiple threads, where each thread might have it's own listener socket. This could be done as an alternative to other models: 1) have one listener thread which dispatches completed connections to workers. 2) accept on a single listener socket from multiple threads. In case #1 the listener thread can easily become the bottleneck with high connection turn-over rate. In case #2, the proportion of connections accepted per thread tends to be uneven under high connection load (assuming simple event loop: while (1) { accept(); process() }, wakeup does not promote fairness among the sockets. We have seen the disproportion to be as high as 3:1 ratio between thread accepting most connections and the one accepting the fewest. With so_reusport the distribution is uniform. The TCP implementation has a problem in that the request sockets for a listener are attached to a listener socket. If a SYN is received, a listener socket is chosen and request structure is created (SYN-RECV state). If the subsequent ack in 3WHS does not match the same port by so_reusport, the connection state is not found (reset) and the request structure is orphaned. This scenario would occur when the number of listener sockets bound to a port changes (new ones are added, or old ones closed). We are looking for a solution to this, maybe allow multiple sockets to share the same request table... The motivating case for so_reuseport in UDP would be something like a DNS server. An alternative would be to recv on the same socket from multiple threads. As in the case of TCP, the load across these threads tends to be disproportionate and we also see a lot of contection on the socket lock. Note that SO_REUSEADDR already allows multiple UDP sockets to bind to the same port, however there is no provision to prevent hijacking and nothing to distribute packets across all the sockets sharing the same bound port. This patch does not change the semantics of SO_REUSEADDR, but provides usable functionality of it for unicast. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>

For drivers that don't actually flush their queues when aggregation stop with the IEEE80211_AMPDU_TX_STOP_FLUSH or IEEE80211_AMPDU_TX_STOP_FLUSH_CONT reasons is done, like iwlwifi or iwlegacy, mac80211 can then transmit on a TID that the driver still considers busy. This happens in the following way: - IEEE80211_AMPDU_TX_STOP_FLUSH requested - driver marks TID as emptying - mac80211 removes tid_tx data, this can copy packets to the TX pending queues and also let new packets through to the driver - driver gets unexpected TX as it wasn't completely converted to the new API In iwlwifi, this lead to the following warning: WARNING: at drivers/net/wireless/iwlwifi/dvm/tx.c:442 iwlagn_tx_skb+0xc47/0xce0 Tx while agg.state = 4 Modules linked in: [...] Pid: 0, comm: kworker/0:0 Tainted: G W 3.1.0 #1 Call Trace: [<c1046e42>] warn_slowpath_common+0x72/0xa0 [<c1046f13>] warn_slowpath_fmt+0x33/0x40 [<fddffa17>] iwlagn_tx_skb+0xc47/0xce0 [iwldvm] [<fddfcaa3>] iwlagn_mac_tx+0x23/0x40 [iwldvm] [<fd8c98b6>] __ieee80211_tx+0xf6/0x3c0 [mac80211] [<fd8cbe00>] ieee80211_tx+0xd0/0x100 [mac80211] [<fd8cc176>] ieee80211_xmit+0x96/0xe0 [mac80211] [<fd8cc578>] ieee80211_subif_start_xmit+0x348/0xc80 [mac80211] [<c1445207>] dev_hard_start_xmit+0x337/0x6d0 [<c145eee9>] sch_direct_xmit+0xa9/0x210 [<c14462c0>] dev_queue_xmit+0x1b0/0x8e0 Fortunately, solving this problem is easy as the station is being destroyed, so such transmit packets can only happen due to races. Instead of trying to close the race just let the race not reach the drivers by making two changes: 1) remove the explicit aggregation session teardown in the managed mode code, the same thing will be done when the station is removed, in __sta_info_destroy. 2) When aggregation stop with AGG_STOP_DESTROY_STA is requested, leave the tid_tx data around as stopped. It will be cleared and freed in cleanup_single_sta later, but until then any racy packets will be put onto the tid_tx pending queue instead of transmitted which is fine since the station is being removed. Signed-off-by: Johannes Berg <johannes.berg@intel.com>

device eth0 entered promiscuous mode BUG: sleeping function called from invalid context at mm/slub.c:930 in_atomic(): 1, irqs_disabled(): 0, pid: 5911, name: brctl INFO: lockdep is turned off. Pid: 5911, comm: brctl Tainted: GF W O 3.6.0-0.rc7.git1.4.fc18.x86_64 #1 Call Trace: [<ffffffff810a29ca>] __might_sleep+0x18a/0x240 [<ffffffff811b5d77>] __kmalloc+0x67/0x2d0 [<ffffffffa00a61a9>] ? qlcnic_alloc_lb_filters_mem+0x59/0xa0 [qlcnic] [<ffffffffa00a61a9>] qlcnic_alloc_lb_filters_mem+0x59/0xa0 [qlcnic] [<ffffffffa009e1c1>] qlcnic_set_multi+0x81/0x100 [qlcnic] [<ffffffff8159cccf>] __dev_set_rx_mode+0x5f/0xb0 [<ffffffff8159cd4f>] dev_set_rx_mode+0x2f/0x50 [<ffffffff8159d00c>] dev_set_promiscuity+0x3c/0x50 [<ffffffffa05ed728>] br_add_if+0x1e8/0x400 [bridge] [<ffffffffa05ee2df>] add_del_if+0x5f/0x90 [bridge] [<ffffffffa05eee0b>] br_dev_ioctl+0x4b/0x90 [bridge] [<ffffffff8159d613>] dev_ifsioc+0x373/0x3b0 [<ffffffff8159d78f>] dev_ioctl+0x13f/0x860 [<ffffffff812dd6e1>] ? avc_has_perm_flags+0x31/0x2c0 [<ffffffff8157c18d>] sock_do_ioctl+0x5d/0x70 [<ffffffff8157c21d>] sock_ioctl+0x7d/0x2c0 [<ffffffff812df922>] ? inode_has_perm.isra.48.constprop.61+0x62/0xa0 [<ffffffff811e4979>] do_vfs_ioctl+0x99/0x5a0 [<ffffffff812df9f7>] ? file_has_perm+0x97/0xb0 [<ffffffff810d716d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff811e4f19>] sys_ioctl+0x99/0xa0 [<ffffffff816e7369>] system_call_fastpath+0x16/0x1b br0: port 1(eth0) entered forwarding state br0: port 1(eth0) entered forwarding state Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>

This patch reduces the critical section protected by sco_conn_lock in sco_conn_ready function. The lock is acquired only when it is really needed. This patch fixes the following lockdep warning which is generated when the host terminates a SCO connection. Today, this warning is a false positive. There is no way those two threads reported by lockdep are running at the same time since hdev->workqueue (where rx_work is queued) is single-thread. However, if somehow this behavior is changed in future, we will have a potential deadlock. ====================================================== [ INFO: possible circular locking dependency detected ] 3.8.0-rc1+ #7 Not tainted ------------------------------------------------------- kworker/u:1H/1018 is trying to acquire lock: (&(&conn->lock)->rlock){+.+...}, at: [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth] but task is already holding lock: (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}: [<ffffffff81083011>] lock_acquire+0xb1/0xe0 [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80 [<ffffffffa003436e>] sco_connect_cfm+0xbe/0x350 [bluetooth] [<ffffffffa0015d6c>] hci_event_packet+0xd3c/0x29b0 [bluetooth] [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth] [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0 [<ffffffff81050022>] worker_thread+0x2b2/0x3e0 [<ffffffff81056021>] kthread+0xd1/0xe0 [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0 -> #0 (&(&conn->lock)->rlock){+.+...}: [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70 [<ffffffff81083011>] lock_acquire+0xb1/0xe0 [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80 [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth] [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth] [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth] [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth] [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth] [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth] [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0 [<ffffffff81050022>] worker_thread+0x2b2/0x3e0 [<ffffffff81056021>] kthread+0xd1/0xe0 [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(slock-AF_BLUETOOTH-BTPROTO_SCO); lock(&(&conn->lock)->rlock); lock(slock-AF_BLUETOOTH-BTPROTO_SCO); lock(&(&conn->lock)->rlock); *** DEADLOCK *** 4 locks held by kworker/u:1H/1018: #0: (hdev->name#2){.+.+.+}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0 #1: ((&hdev->rx_work)){+.+.+.}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0 #2: (&hdev->lock){+.+.+.}, at: [<ffffffffa000fbe9>] hci_disconn_complete_evt.isra.54+0x59/0x3c0 [bluetooth] #3: (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth] stack backtrace: Pid: 1018, comm: kworker/u:1H Not tainted 3.8.0-rc1+ #7 Call Trace: [<ffffffff813e92f9>] print_circular_bug+0x1fb/0x20c [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70 [<ffffffff81083011>] lock_acquire+0xb1/0xe0 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth] [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth] [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth] [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth] [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth] [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth] [<ffffffffa000fbd0>] ? hci_disconn_complete_evt.isra.54+0x40/0x3c0 [bluetooth] [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth] [<ffffffff81202e90>] ? __dynamic_pr_debug+0x80/0x90 [<ffffffff8133ff7d>] ? kfree_skb+0x2d/0x40 [<ffffffffa0021644>] ? hci_send_to_monitor+0x1a4/0x1c0 [bluetooth] [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth] [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0 [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0 [<ffffffff8104fdc1>] ? worker_thread+0x51/0x3e0 [<ffffffffa0004450>] ? hci_tx_work+0x800/0x800 [bluetooth] [<ffffffff81050022>] worker_thread+0x2b2/0x3e0 [<ffffffff8104fd70>] ? busy_worker_rebind_fn+0x100/0x100 [<ffffffff81056021>] kthread+0xd1/0xe0 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0 [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0 Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>

Power supply subsystem creates thermal zone device for the property 'POWER_SUPPLY_PROP_TEMP' which requires thermal subsystem to be ready before 'ab8500 battery temperature monitor' driver is initialized. ab8500 btemp driver is initialized with subsys_initcall whereas thermal subsystem is initialized with fs_initcall which causes thermal_zone_device_register(...) to crash since the required structure 'thermal_class' is not initialized yet: Unable to handle kernel NULL pointer dereference at virtual address 000000a4 pgd = c0004000 [000000a4] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 Tainted: G W (3.8.0-rc4-00001-g632fda8-dirty #1) PC is at _raw_spin_lock+0x18/0x54 LR is at get_device_parent+0x50/0x1b8 pc : [<c02f1dd0>] lr : [<c01cb248>] psr: 60000013 sp : ef04bdc8 ip : 00000000 fp : c0446180 r10: ef216e38 r9 : c03af5d0 r8 : ef275c18 r7 : 00000000 r6 : c0476c14 r5 : ef275c18 r4 : ef095840 r3 : ef04a000 r2 : 00000001 r1 : 00000000 r0 : 000000a4 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5787d Table: 0000404a DAC: 00000015 Process swapper/0 (pid: 1, stack limit = 0xef04a238) Stack: (0xef04bdc8 to 0xef04c000) [...] [<c02f1dd0>] (_raw_spin_lock+0x18/0x54) from [<c01cb248>] (get_device_parent+0x50/0x1b8) [<c01cb248>] (get_device_parent+0x50/0x1b8) from [<c01cb8d8>] (device_add+0xa4/0x574) [<c01cb8d8>] (device_add+0xa4/0x574) from [<c020b91c>] (thermal_zone_device_register+0x118/0x938) [<c020b91c>] (thermal_zone_device_register+0x118/0x938) from [<c0202030>] (power_supply_register+0x170/0x1f8) [<c0202030>] (power_supply_register+0x170/0x1f8) from [<c02055ec>] (ab8500_btemp_probe+0x208/0x47c) [<c02055ec>] (ab8500_btemp_probe+0x208/0x47c) from [<c01cf0dc>] (platform_drv_probe+0x14/0x18) [<c01cf0dc>] (platform_drv_probe+0x14/0x18) from [<c01cde70>] (driver_probe_device+0x74/0x20c) [<c01cde70>] (driver_probe_device+0x74/0x20c) from [<c01ce094>] (__driver_attach+0x8c/0x90) [<c01ce094>] (__driver_attach+0x8c/0x90) from [<c01cc640>] (bus_for_each_dev+0x4c/0x80) [<c01cc640>] (bus_for_each_dev+0x4c/0x80) from [<c01cd6b4>] (bus_add_driver+0x16c/0x23c) [<c01cd6b4>] (bus_add_driver+0x16c/0x23c) from [<c01ce54c>] (driver_register+0x78/0x14c) [<c01ce54c>] (driver_register+0x78/0x14c) from [<c00086ac>] (do_one_initcall+0xfc/0x164) [<c00086ac>] (do_one_initcall+0xfc/0x164) from [<c02e89c8>] (kernel_init+0x120/0x2b8) [<c02e89c8>] (kernel_init+0x120/0x2b8) from [<c000e358>] (ret_from_fork+0x14/0x3c) Code: e3c3303f e5932004 e2822001 e5832004 (e1903f9f) ---[ end trace ed9df72941b5bada ]--- Signed-off-by: Rajanikanth H.V <rajanikanth.hv@stericsson.com> Cc: stable@vger.kernel.org Signed-off-by: Anton Vorontsov <anton@enomsg.org>

commit 9ae794a upstream. System may crash in ahci_hw_interrupt() or ahci_thread_fn() when accessing the interrupt status in a port's private_data if the port is actually a DUMMY port. 00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller <snip console output for linux-3.15-rc1> [ 9.352080] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x1 impl SATA mode [ 9.352084] ahci 0000:00:1f.2: flags: 64bit ncq sntf pm led clo pio slum part ccc [ 9.368155] Console: switching to colour frame buffer device 128x48 [ 9.439759] mgag200 0000:11:00.0: fb0: mgadrmfb frame buffer device [ 9.446765] mgag200 0000:11:00.0: registered panic notifier [ 9.470166] scsi1 : ahci [ 9.479166] scsi2 : ahci [ 9.488172] scsi3 : ahci [ 9.497174] scsi4 : ahci [ 9.506175] scsi5 : ahci [ 9.515174] scsi6 : ahci [ 9.518181] ata1: SATA max UDMA/133 abar m2048@0x95c00000 port 0x95c00100 irq 91 [ 9.526448] ata2: DUMMY [ 9.529182] ata3: DUMMY [ 9.531916] ata4: DUMMY [ 9.534650] ata5: DUMMY [ 9.537382] ata6: DUMMY [ 9.576196] [drm] Initialized mgag200 1.0.0 20110418 for 0000:11:00.0 on minor 0 [ 9.845257] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 9.865161] ata1.00: ATAPI: Optiarc DVD RW AD-7580S, FX04, max UDMA/100 [ 9.891407] ata1.00: configured for UDMA/100 [ 9.900525] scsi 1:0:0:0: CD-ROM Optiarc DVD RW AD-7580S FX04 PQ: 0 ANSI: 5 [ 10.247399] iTCO_vendor_support: vendor-support=0 [ 10.261572] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11 [ 10.269764] iTCO_wdt: unable to reset NO_REBOOT flag, device disabled by hardware/BIOS [ 10.301932] sd 0:2:0:0: [sda] 570310656 512-byte logical blocks: (291 GB/271 GiB) [ 10.317085] sd 0:2:0:0: [sda] Write Protect is off [ 10.328326] sd 0:2:0:0: [sda] Write cache: disabled, read cache: disabled, supports DPO and FUA [ 10.375452] BUG: unable to handle kernel NULL pointer dereference at 000000000000003c [ 10.384217] IP: [<ffffffffa0133df0>] ahci_hw_interrupt+0x100/0x130 [libahci] [ 10.392101] PGD 0 [ 10.394353] Oops: 0000 [#1] SMP [ 10.397978] Modules linked in: sr_mod(+) cdrom sd_mod iTCO_wdt crc_t10dif iTCO_vendor_support crct10dif_common ahci libahci libata lpc_ich mfd_core mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm drm i2c_core megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [ 10.426499] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.15.0-rc1 #1 [ 10.433495] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011 [ 10.443886] task: ffffffff81906460 ti: ffffffff818f0000 task.ti: ffffffff818f0000 [ 10.452239] RIP: 0010:[<ffffffffa0133df0>] [<ffffffffa0133df0>] ahci_hw_interrupt+0x100/0x130 [libahci] [ 10.462838] RSP: 0018:ffff880033c03d98 EFLAGS: 00010046 [ 10.468767] RAX: 0000000000a400a4 RBX: ffff880029a6bc18 RCX: 00000000fffffffa [ 10.476731] RDX: 00000000000000a4 RSI: ffff880029bb0000 RDI: ffff880029a6bc18 [ 10.484696] RBP: ffff880033c03dc8 R08: 0000000000000000 R09: ffff88002f800490 [ 10.492661] R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000000 [ 10.500625] R13: ffff880029a6bd98 R14: 0000000000000000 R15: ffffc90000194000 [ 10.508590] FS: 0000000000000000(0000) GS:ffff880033c00000(0000) knlGS:0000000000000000 [ 10.517623] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 10.524035] CR2: 000000000000003c CR3: 00000000328ff000 CR4: 00000000000007b0 [ 10.531999] Stack: [ 10.534241] 0000000000000017 ffff880031ba7d00 000000000000005c ffff880031ba7d00 [ 10.542535] 0000000000000000 000000000000005c ffff880033c03e10 ffffffff810c2a1e [ 10.550827] ffff880031ae2900 000000008108fb4f ffff880031ae2900 ffff880031ae2984 [ 10.559121] Call Trace: [ 10.561849] <IRQ> [ 10.563994] [<ffffffff810c2a1e>] handle_irq_event_percpu+0x3e/0x1a0 [ 10.571309] [<ffffffff810c2bbd>] handle_irq_event+0x3d/0x60 [ 10.577631] [<ffffffff810c4fdd>] try_one_irq.isra.6+0x8d/0xf0 [ 10.584142] [<ffffffff810c5313>] note_interrupt+0x173/0x1f0 [ 10.590460] [<ffffffff810c2a8e>] handle_irq_event_percpu+0xae/0x1a0 [ 10.597554] [<ffffffff810c2bbd>] handle_irq_event+0x3d/0x60 [ 10.603872] [<ffffffff810c5727>] handle_edge_irq+0x77/0x130 [ 10.610199] [<ffffffff81014b8f>] handle_irq+0xbf/0x150 [ 10.616040] [<ffffffff8109ff4e>] ? vtime_account_idle+0xe/0x50 [ 10.622654] [<ffffffff815fca1a>] ? atomic_notifier_call_chain+0x1a/0x20 [ 10.630140] [<ffffffff816038cf>] do_IRQ+0x4f/0xf0 [ 10.635490] [<ffffffff815f8aed>] common_interrupt+0x6d/0x6d [ 10.641805] <EOI> [ 10.643950] [<ffffffff8149ca9f>] ? cpuidle_enter_state+0x4f/0xc0 [ 10.650972] [<ffffffff8149ca98>] ? cpuidle_enter_state+0x48/0xc0 [ 10.657775] [<ffffffff8149cb47>] cpuidle_enter+0x17/0x20 [ 10.663807] [<ffffffff810b0070>] cpu_startup_entry+0x2c0/0x3d0 [ 10.670423] [<ffffffff815dfcc7>] rest_init+0x77/0x80 [ 10.676065] [<ffffffff81a60f47>] start_kernel+0x40f/0x41a [ 10.682190] [<ffffffff81a60941>] ? repair_env_string+0x5c/0x5c [ 10.688799] [<ffffffff81a60120>] ? early_idt_handlers+0x120/0x120 [ 10.695699] [<ffffffff81a605ee>] x86_64_start_reservations+0x2a/0x2c [ 10.702889] [<ffffffff81a60733>] x86_64_start_kernel+0x143/0x152 [ 10.709689] Code: a0 fc ff 85 c0 8b 4d d4 74 c3 48 8b 7b 08 89 ca 48 c7 c6 60 66 13 a0 31 c0 e8 9d 70 28 e1 8b 4d d4 eb aa 0f 1f 84 00 00 00 00 00 <45> 8b 64 24 3c 48 89 df e8 23 47 4c e1 41 83 fc 01 19 c0 48 83 [ 10.731470] RIP [<ffffffffa0133df0>] ahci_hw_interrupt+0x100/0x130 [libahci] [ 10.739441] RSP <ffff880033c03d98> [ 10.743333] CR2: 000000000000003c [ 10.747032] ---[ end trace b6e82636970e2690 ]--- [ 10.760190] Kernel panic - not syncing: Fatal exception in interrupt [ 10.767291] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) Cc: Alexander Gordeev <agordeev@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-of-by: David Milburn <dmilburn@redhat.com> Fixes: 5ca72c4 ("AHCI: Support multiple MSIs")

commit a585f87 upstream. The scenario here is that someone calls enable_irq_wake() from somewhere in the code. This will result in the lockdep producing a backtrace as can be seen below. In my case, this problem is triggered when using the wl1271 (TI WlCore) driver found in drivers/net/wireless/ti/ . The problem cause is rather obvious from the backtrace, but let's outline the dependency. enable_irq_wake() grabs the IRQ buslock in irq_set_irq_wake(), which in turns calls mxs_gpio_set_wake_irq() . But mxs_gpio_set_wake_irq() calls enable_irq_wake() again on the one-level-higher IRQ , thus it tries to grab the IRQ buslock again in irq_set_irq_wake() . Because the spinlock in irq_set_irq_wake()->irq_get_desc_buslock()->__irq_get_desc_lock() is not marked as recursive, lockdep will spew the stuff below. We know we can safely re-enter the lock, so use IRQ_GC_INIT_NESTED_LOCK to fix the spew. ============================================= [ INFO: possible recursive locking detected ] 3.10.33-00012-gf06b763-dirty raspberrypi#61 Not tainted --------------------------------------------- kworker/0:1/18 is trying to acquire lock: (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88 but task is already holding lock: (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&irq_desc_lock_class); lock(&irq_desc_lock_class); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/0:1/18: #0: (events){.+.+.+}, at: [<c0036308>] process_one_work+0x134/0x4a4 #1: ((&fw_work->work)){+.+.+.}, at: [<c0036308>] process_one_work+0x134/0x4a4 #2: (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88 stack backtrace: CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 3.10.33-00012-gf06b763-dirty raspberrypi#61 Workqueue: events request_firmware_work_func [<c0013eb4>] (unwind_backtrace+0x0/0xf0) from [<c0011c74>] (show_stack+0x10/0x14) [<c0011c74>] (show_stack+0x10/0x14) from [<c005bb08>] (__lock_acquire+0x140c/0x1a64) [<c005bb08>] (__lock_acquire+0x140c/0x1a64) from [<c005c6a8>] (lock_acquire+0x9c/0x104) [<c005c6a8>] (lock_acquire+0x9c/0x104) from [<c051d5a4>] (_raw_spin_lock_irqsave+0x44/0x58) [<c051d5a4>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c00685f0>] (__irq_get_desc_lock+0x48/0x88) [<c00685f0>] (__irq_get_desc_lock+0x48/0x88) from [<c0068e78>] (irq_set_irq_wake+0x20/0xf4) [<c0068e78>] (irq_set_irq_wake+0x20/0xf4) from [<c027260c>] (mxs_gpio_set_wake_irq+0x1c/0x24) [<c027260c>] (mxs_gpio_set_wake_irq+0x1c/0x24) from [<c0068cf4>] (set_irq_wake_real+0x30/0x44) [<c0068cf4>] (set_irq_wake_real+0x30/0x44) from [<c0068ee4>] (irq_set_irq_wake+0x8c/0xf4) [<c0068ee4>] (irq_set_irq_wake+0x8c/0xf4) from [<c0310748>] (wlcore_nvs_cb+0x10c/0x97c) [<c0310748>] (wlcore_nvs_cb+0x10c/0x97c) from [<c02be5e8>] (request_firmware_work_func+0x38/0x58) [<c02be5e8>] (request_firmware_work_func+0x38/0x58) from [<c0036394>] (process_one_work+0x1c0/0x4a4) [<c0036394>] (process_one_work+0x1c0/0x4a4) from [<c0036a4c>] (worker_thread+0x138/0x394) [<c0036a4c>] (worker_thread+0x138/0x394) from [<c003cb74>] (kthread+0xa4/0xb0) [<c003cb74>] (kthread+0xa4/0xb0) from [<c000ee00>] (ret_from_fork+0x14/0x34) wlcore: loaded Signed-off-by: Marek Vasut <marex@denx.de> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

…s during lockd_up commit 679b033 upstream. We had a Fedora ABRT report with a stack trace like this: kernel BUG at net/sunrpc/svc.c:550! invalid opcode: 0000 [#1] SMP [...] CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1 Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013 task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000 RIP: 0010:[<ffffffffa0305fa8>] [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc] RSP: 0018:ffff88003f9b9de0 EFLAGS: 00010206 RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286 RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360 R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600 R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000 FS: 00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0 Stack: ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000 ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60 ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008 Call Trace: [<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd] [<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd] [<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50 [<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd] [<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd] [<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50 [<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20 [<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0 [<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd] [<ffffffff811b8b34>] vfs_write+0xb4/0x1f0 [<ffffffff811c3f99>] ? putname+0x29/0x40 [<ffffffff811b9569>] SyS_write+0x49/0xa0 [<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0 [<ffffffff816962e9>] system_call_fastpath+0x16/0x1b Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 RIP [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc] RSP <ffff88003f9b9de0> Evidently, we created some lockd sockets and then failed to create others. make_socks then returned an error and we tried to tear down the svc, but svc->sv_permsocks was not empty so we ended up tripping over the BUG() in svc_destroy(). Fix this by ensuring that we tear down any live sockets we created when socket creation is going to return an error. Fixes: 786185b (SUNRPC: move per-net operations from...) Reported-by: Raphos <raphoszap@laposte.net> Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 10164c2 upstream. Fix driver new_id sysfs-attribute removal deadlock by making sure to not hold any locks that the attribute operations grab when removing the attribute. Specifically, usb_serial_deregister holds the table mutex when deregistering the driver, which includes removing the new_id attribute. This can lead to a deadlock as writing to new_id increments the attribute's active count before trying to grab the same mutex in usb_serial_probe. The deadlock can easily be triggered by inserting a sleep in usb_serial_deregister and writing the id of an unbound device to new_id during module unload. As the table mutex (in this case) is used to prevent subdriver unload during probe, it should be sufficient to only hold the lock while manipulating the usb-serial driver list during deregister. A racing probe will then either fail to find a matching subdriver or fail to get the corresponding module reference. Since v3.15-rc1 this also triggers the following lockdep warning: ====================================================== [ INFO: possible circular locking dependency detected ] 3.15.0-rc2 raspberrypi#123 Tainted: G W ------------------------------------------------------- modprobe/190 is trying to acquire lock: (s_active#4){++++.+}, at: [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94 but task is already holding lock: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (table_lock){+.+.+.}: [<c0075f84>] __lock_acquire+0x1694/0x1ce4 [<c0076de8>] lock_acquire+0xb4/0x154 [<c03af3cc>] _raw_spin_lock+0x4c/0x5c [<c02bbc24>] usb_store_new_id+0x14c/0x1ac [<bf007eb4>] new_id_store+0x68/0x70 [usbserial] [<c025f568>] drv_attr_store+0x30/0x3c [<c01690e0>] sysfs_kf_write+0x5c/0x60 [<c01682c0>] kernfs_fop_write+0xd4/0x194 [<c010881c>] vfs_write+0xbc/0x198 [<c0108e4c>] SyS_write+0x4c/0xa0 [<c000f880>] ret_fast_syscall+0x0/0x48 -> #0 (s_active#4){++++.+}: [<c03a7a28>] print_circular_bug+0x68/0x2f8 [<c0076218>] __lock_acquire+0x1928/0x1ce4 [<c0076de8>] lock_acquire+0xb4/0x154 [<c0166b70>] __kernfs_remove+0x254/0x310 [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94 [<c0169fb8>] remove_files.isra.1+0x48/0x84 [<c016a2fc>] sysfs_remove_group+0x58/0xac [<c016a414>] sysfs_remove_groups+0x34/0x44 [<c02623b8>] driver_remove_groups+0x1c/0x20 [<c0260e9c>] bus_remove_driver+0x3c/0xe4 [<c026235c>] driver_unregister+0x38/0x58 [<bf007fb4>] usb_serial_bus_deregister+0x84/0x88 [usbserial] [<bf004db4>] usb_serial_deregister+0x6c/0x78 [usbserial] [<bf005330>] usb_serial_deregister_drivers+0x2c/0x4c [usbserial] [<bf016618>] usb_serial_module_exit+0x14/0x1c [sierra] [<c009d6cc>] SyS_delete_module+0x184/0x210 [<c000f880>] ret_fast_syscall+0x0/0x48 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(table_lock); lock(s_active#4); lock(table_lock); lock(s_active#4); *** DEADLOCK *** 1 lock held by modprobe/190: #0: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial] stack backtrace: CPU: 0 PID: 190 Comm: modprobe Tainted: G W 3.15.0-rc2 raspberrypi#123 [<c0015e10>] (unwind_backtrace) from [<c0013728>] (show_stack+0x20/0x24) [<c0013728>] (show_stack) from [<c03a9a54>] (dump_stack+0x24/0x28) [<c03a9a54>] (dump_stack) from [<c03a7cac>] (print_circular_bug+0x2ec/0x2f8) [<c03a7cac>] (print_circular_bug) from [<c0076218>] (__lock_acquire+0x1928/0x1ce4) [<c0076218>] (__lock_acquire) from [<c0076de8>] (lock_acquire+0xb4/0x154) [<c0076de8>] (lock_acquire) from [<c0166b70>] (__kernfs_remove+0x254/0x310) [<c0166b70>] (__kernfs_remove) from [<c0167aa0>] (kernfs_remove_by_name_ns+0x4c/0x94) [<c0167aa0>] (kernfs_remove_by_name_ns) from [<c0169fb8>] (remove_files.isra.1+0x48/0x84) [<c0169fb8>] (remove_files.isra.1) from [<c016a2fc>] (sysfs_remove_group+0x58/0xac) [<c016a2fc>] (sysfs_remove_group) from [<c016a414>] (sysfs_remove_groups+0x34/0x44) [<c016a414>] (sysfs_remove_groups) from [<c02623b8>] (driver_remove_groups+0x1c/0x20) [<c02623b8>] (driver_remove_groups) from [<c0260e9c>] (bus_remove_driver+0x3c/0xe4) [<c0260e9c>] (bus_remove_driver) from [<c026235c>] (driver_unregister+0x38/0x58) [<c026235c>] (driver_unregister) from [<bf007fb4>] (usb_serial_bus_deregister+0x84/0x88 [usbserial]) [<bf007fb4>] (usb_serial_bus_deregister [usbserial]) from [<bf004db4>] (usb_serial_deregister+0x6c/0x78 [usbserial]) [<bf004db4>] (usb_serial_deregister [usbserial]) from [<bf005330>] (usb_serial_deregister_drivers+0x2c/0x4c [usbserial]) [<bf005330>] (usb_serial_deregister_drivers [usbserial]) from [<bf016618>] (usb_serial_module_exit+0x14/0x1c [sierra]) [<bf016618>] (usb_serial_module_exit [sierra]) from [<c009d6cc>] (SyS_delete_module+0x184/0x210) [<c009d6cc>] (SyS_delete_module) from [<c000f880>] (ret_fast_syscall+0x0/0x48) Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

[ Upstream commit dc8eaaa ] When I open the LOCKDEP config and run these steps: modprobe 8021q vconfig add eth2 20 vconfig add eth2.20 30 ifconfig eth2 xx.xx.xx.xx then the Call Trace happened: [32524.386288] ============================================= [32524.386293] [ INFO: possible recursive locking detected ] [32524.386298] 3.14.0-rc2-0.7-default+ #35 Tainted: G O [32524.386302] --------------------------------------------- [32524.386306] ifconfig/3103 is trying to acquire lock: [32524.386310] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0 [32524.386326] [32524.386326] but task is already holding lock: [32524.386330] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40 [32524.386341] [32524.386341] other info that might help us debug this: [32524.386345] Possible unsafe locking scenario: [32524.386345] [32524.386350] CPU0 [32524.386352] ---- [32524.386354] lock(&vlan_netdev_addr_lock_key/1); [32524.386359] lock(&vlan_netdev_addr_lock_key/1); [32524.386364] [32524.386364] *** DEADLOCK *** [32524.386364] [32524.386368] May be due to missing lock nesting notation [32524.386368] [32524.386373] 2 locks held by ifconfig/3103: [32524.386376] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81431d42>] rtnl_lock+0x12/0x20 [32524.386387] #1: (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40 [32524.386398] [32524.386398] stack backtrace: [32524.386403] CPU: 1 PID: 3103 Comm: ifconfig Tainted: G O 3.14.0-rc2-0.7-default+ #35 [32524.386409] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [32524.386414] ffffffff81ffae40 ffff8800d9625ae8 ffffffff814f68a2 ffff8800d9625bc8 [32524.386421] ffffffff810a35fb ffff8800d8a8d9d0 00000000d9625b28 ffff8800d8a8e5d0 [32524.386428] 000003cc00000000 0000000000000002 ffff8800d8a8e5f8 0000000000000000 [32524.386435] Call Trace: [32524.386441] [<ffffffff814f68a2>] dump_stack+0x6a/0x78 [32524.386448] [<ffffffff810a35fb>] __lock_acquire+0x7ab/0x1940 [32524.386454] [<ffffffff810a323a>] ? __lock_acquire+0x3ea/0x1940 [32524.386459] [<ffffffff810a4874>] lock_acquire+0xe4/0x110 [32524.386464] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0 [32524.386471] [<ffffffff814fc07a>] _raw_spin_lock_nested+0x2a/0x40 [32524.386476] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0 [32524.386481] [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0 [32524.386489] [<ffffffffa0500cab>] vlan_dev_set_rx_mode+0x2b/0x50 [8021q] [32524.386495] [<ffffffff8141addf>] __dev_set_rx_mode+0x5f/0xb0 [32524.386500] [<ffffffff8141af8b>] dev_set_rx_mode+0x2b/0x40 [32524.386506] [<ffffffff8141b3cf>] __dev_open+0xef/0x150 [32524.386511] [<ffffffff8141b177>] __dev_change_flags+0xa7/0x190 [32524.386516] [<ffffffff8141b292>] dev_change_flags+0x32/0x80 [32524.386524] [<ffffffff8149ca56>] devinet_ioctl+0x7d6/0x830 [32524.386532] [<ffffffff81437b0b>] ? dev_ioctl+0x34b/0x660 [32524.386540] [<ffffffff814a05b0>] inet_ioctl+0x80/0xa0 [32524.386550] [<ffffffff8140199d>] sock_do_ioctl+0x2d/0x60 [32524.386558] [<ffffffff81401a52>] sock_ioctl+0x82/0x2a0 [32524.386568] [<ffffffff811a7123>] do_vfs_ioctl+0x93/0x590 [32524.386578] [<ffffffff811b2705>] ? rcu_read_lock_held+0x45/0x50 [32524.386586] [<ffffffff811b39e5>] ? __fget_light+0x105/0x110 [32524.386594] [<ffffffff811a76b1>] SyS_ioctl+0x91/0xb0 [32524.386604] [<ffffffff815057e2>] system_call_fastpath+0x16/0x1b ======================================================================== The reason is that all of the addr_lock_key for vlan dev have the same class, so if we change the status for vlan dev, the vlan dev and its real dev will hold the same class of addr_lock_key together, so the warning happened. we should distinguish the lock depth for vlan dev and its real dev. v1->v2: Convert the vlan_netdev_addr_lock_key to an array of eight elements, which could support to add 8 vlan id on a same vlan dev, I think it is enough for current scene, because a netdev's name is limited to IFNAMSIZ which could not hold 8 vlan id, and the vlan dev would not meet the same class key with its real dev. The new function vlan_dev_get_lockdep_subkey() will return the subkey and make the vlan dev could get a suitable class key. v2->v3: According David's suggestion, I use the subclass to distinguish the lock key for vlan dev and its real dev, but it make no sense, because the difference for subclass in the lock_class_key doesn't mean that the difference class for lock_key, so I use lock_depth to distinguish the different depth for every vlan dev, the same depth of the vlan dev could have the same lock_class_key, I import the MAX_LOCK_DEPTH from the include/linux/sched.h, I think it is enough here, the lockdep should never exceed that value. v3->v4: Add a huge array of locking keys will waste static kernel memory and is not a appropriate method, we could use _nested() variants to fix the problem, calculate the depth for every vlan dev, and use the depth as the subclass for addr_lock_key. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

[ Upstream commit b14878c ] Currently, it is possible to create an SCTP socket, then switch auth_enable via sysctl setting to 1 and crash the system on connect: Oops[#1]: CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1 task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000 [...] Call Trace: [<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80 [<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4 [<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c [<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8 [<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214 [<ffffffff8043af68>] sctp_rcv+0x588/0x630 [<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24 [<ffffffff803acb50>] ip6_input+0x2c0/0x440 [<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564 [<ffffffff80310650>] process_backlog+0xb4/0x18c [<ffffffff80313cbc>] net_rx_action+0x12c/0x210 [<ffffffff80034254>] __do_softirq+0x17c/0x2ac [<ffffffff800345e0>] irq_exit+0x54/0xb0 [<ffffffff800075a4>] ret_from_irq+0x0/0x4 [<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48 [<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148 [<ffffffff805a88b0>] start_kernel+0x37c/0x398 Code: dd0900b8 000330f8 0126302d <dcc60000> 50c0fff1 0047182a a48306a0 03e00008 00000000 ---[ end trace b530b0551467f2fd ]--- Kernel panic - not syncing: Fatal exception in interrupt What happens while auth_enable=0 in that case is, that ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs() when endpoint is being created. After that point, if an admin switches over to auth_enable=1, the machine can crash due to NULL pointer dereference during reception of an INIT chunk. When we enter sctp_process_init() via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk, the INIT verification succeeds and while we walk and process all INIT params via sctp_process_param() we find that net->sctp.auth_enable is set, therefore do not fall through, but invoke sctp_auth_asoc_set_default_hmac() instead, and thus, dereference what we have set to NULL during endpoint initialization phase. The fix is to make auth_enable immutable by caching its value during endpoint initialization, so that its original value is being carried along until destruction. The bug seems to originate from the very first days. Fix in joint work with Daniel Borkmann. Reported-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Tested-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 10164c2 upstream. Fix driver new_id sysfs-attribute removal deadlock by making sure to not hold any locks that the attribute operations grab when removing the attribute. Specifically, usb_serial_deregister holds the table mutex when deregistering the driver, which includes removing the new_id attribute. This can lead to a deadlock as writing to new_id increments the attribute's active count before trying to grab the same mutex in usb_serial_probe. The deadlock can easily be triggered by inserting a sleep in usb_serial_deregister and writing the id of an unbound device to new_id during module unload. As the table mutex (in this case) is used to prevent subdriver unload during probe, it should be sufficient to only hold the lock while manipulating the usb-serial driver list during deregister. A racing probe will then either fail to find a matching subdriver or fail to get the corresponding module reference. Since v3.15-rc1 this also triggers the following lockdep warning: ====================================================== [ INFO: possible circular locking dependency detected ] 3.15.0-rc2 raspberrypi#123 Tainted: G W ------------------------------------------------------- modprobe/190 is trying to acquire lock: (s_active#4){++++.+}, at: [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94 but task is already holding lock: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (table_lock){+.+.+.}: [<c0075f84>] __lock_acquire+0x1694/0x1ce4 [<c0076de8>] lock_acquire+0xb4/0x154 [<c03af3cc>] _raw_spin_lock+0x4c/0x5c [<c02bbc24>] usb_store_new_id+0x14c/0x1ac [<bf007eb4>] new_id_store+0x68/0x70 [usbserial] [<c025f568>] drv_attr_store+0x30/0x3c [<c01690e0>] sysfs_kf_write+0x5c/0x60 [<c01682c0>] kernfs_fop_write+0xd4/0x194 [<c010881c>] vfs_write+0xbc/0x198 [<c0108e4c>] SyS_write+0x4c/0xa0 [<c000f880>] ret_fast_syscall+0x0/0x48 -> #0 (s_active#4){++++.+}: [<c03a7a28>] print_circular_bug+0x68/0x2f8 [<c0076218>] __lock_acquire+0x1928/0x1ce4 [<c0076de8>] lock_acquire+0xb4/0x154 [<c0166b70>] __kernfs_remove+0x254/0x310 [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94 [<c0169fb8>] remove_files.isra.1+0x48/0x84 [<c016a2fc>] sysfs_remove_group+0x58/0xac [<c016a414>] sysfs_remove_groups+0x34/0x44 [<c02623b8>] driver_remove_groups+0x1c/0x20 [<c0260e9c>] bus_remove_driver+0x3c/0xe4 [<c026235c>] driver_unregister+0x38/0x58 [<bf007fb4>] usb_serial_bus_deregister+0x84/0x88 [usbserial] [<bf004db4>] usb_serial_deregister+0x6c/0x78 [usbserial] [<bf005330>] usb_serial_deregister_drivers+0x2c/0x4c [usbserial] [<bf016618>] usb_serial_module_exit+0x14/0x1c [sierra] [<c009d6cc>] SyS_delete_module+0x184/0x210 [<c000f880>] ret_fast_syscall+0x0/0x48 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(table_lock); lock(s_active#4); lock(table_lock); lock(s_active#4); *** DEADLOCK *** 1 lock held by modprobe/190: #0: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial] stack backtrace: CPU: 0 PID: 190 Comm: modprobe Tainted: G W 3.15.0-rc2 raspberrypi#123 [<c0015e10>] (unwind_backtrace) from [<c0013728>] (show_stack+0x20/0x24) [<c0013728>] (show_stack) from [<c03a9a54>] (dump_stack+0x24/0x28) [<c03a9a54>] (dump_stack) from [<c03a7cac>] (print_circular_bug+0x2ec/0x2f8) [<c03a7cac>] (print_circular_bug) from [<c0076218>] (__lock_acquire+0x1928/0x1ce4) [<c0076218>] (__lock_acquire) from [<c0076de8>] (lock_acquire+0xb4/0x154) [<c0076de8>] (lock_acquire) from [<c0166b70>] (__kernfs_remove+0x254/0x310) [<c0166b70>] (__kernfs_remove) from [<c0167aa0>] (kernfs_remove_by_name_ns+0x4c/0x94) [<c0167aa0>] (kernfs_remove_by_name_ns) from [<c0169fb8>] (remove_files.isra.1+0x48/0x84) [<c0169fb8>] (remove_files.isra.1) from [<c016a2fc>] (sysfs_remove_group+0x58/0xac) [<c016a2fc>] (sysfs_remove_group) from [<c016a414>] (sysfs_remove_groups+0x34/0x44) [<c016a414>] (sysfs_remove_groups) from [<c02623b8>] (driver_remove_groups+0x1c/0x20) [<c02623b8>] (driver_remove_groups) from [<c0260e9c>] (bus_remove_driver+0x3c/0xe4) [<c0260e9c>] (bus_remove_driver) from [<c026235c>] (driver_unregister+0x38/0x58) [<c026235c>] (driver_unregister) from [<bf007fb4>] (usb_serial_bus_deregister+0x84/0x88 [usbserial]) [<bf007fb4>] (usb_serial_bus_deregister [usbserial]) from [<bf004db4>] (usb_serial_deregister+0x6c/0x78 [usbserial]) [<bf004db4>] (usb_serial_deregister [usbserial]) from [<bf005330>] (usb_serial_deregister_drivers+0x2c/0x4c [usbserial]) [<bf005330>] (usb_serial_deregister_drivers [usbserial]) from [<bf016618>] (usb_serial_module_exit+0x14/0x1c [sierra]) [<bf016618>] (usb_serial_module_exit [sierra]) from [<c009d6cc>] (SyS_delete_module+0x184/0x210) [<c009d6cc>] (SyS_delete_module) from [<c000f880>] (ret_fast_syscall+0x0/0x48) Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

[ Upstream commit dc8eaaa ] When I open the LOCKDEP config and run these steps: modprobe 8021q vconfig add eth2 20 vconfig add eth2.20 30 ifconfig eth2 xx.xx.xx.xx then the Call Trace happened: [32524.386288] ============================================= [32524.386293] [ INFO: possible recursive locking detected ] [32524.386298] 3.14.0-rc2-0.7-default+ #35 Tainted: G O [32524.386302] --------------------------------------------- [32524.386306] ifconfig/3103 is trying to acquire lock: [32524.386310] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0 [32524.386326] [32524.386326] but task is already holding lock: [32524.386330] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40 [32524.386341] [32524.386341] other info that might help us debug this: [32524.386345] Possible unsafe locking scenario: [32524.386345] [32524.386350] CPU0 [32524.386352] ---- [32524.386354] lock(&vlan_netdev_addr_lock_key/1); [32524.386359] lock(&vlan_netdev_addr_lock_key/1); [32524.386364] [32524.386364] *** DEADLOCK *** [32524.386364] [32524.386368] May be due to missing lock nesting notation [32524.386368] [32524.386373] 2 locks held by ifconfig/3103: [32524.386376] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81431d42>] rtnl_lock+0x12/0x20 [32524.386387] #1: (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40 [32524.386398] [32524.386398] stack backtrace: [32524.386403] CPU: 1 PID: 3103 Comm: ifconfig Tainted: G O 3.14.0-rc2-0.7-default+ #35 [32524.386409] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [32524.386414] ffffffff81ffae40 ffff8800d9625ae8 ffffffff814f68a2 ffff8800d9625bc8 [32524.386421] ffffffff810a35fb ffff8800d8a8d9d0 00000000d9625b28 ffff8800d8a8e5d0 [32524.386428] 000003cc00000000 0000000000000002 ffff8800d8a8e5f8 0000000000000000 [32524.386435] Call Trace: [32524.386441] [<ffffffff814f68a2>] dump_stack+0x6a/0x78 [32524.386448] [<ffffffff810a35fb>] __lock_acquire+0x7ab/0x1940 [32524.386454] [<ffffffff810a323a>] ? __lock_acquire+0x3ea/0x1940 [32524.386459] [<ffffffff810a4874>] lock_acquire+0xe4/0x110 [32524.386464] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0 [32524.386471] [<ffffffff814fc07a>] _raw_spin_lock_nested+0x2a/0x40 [32524.386476] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0 [32524.386481] [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0 [32524.386489] [<ffffffffa0500cab>] vlan_dev_set_rx_mode+0x2b/0x50 [8021q] [32524.386495] [<ffffffff8141addf>] __dev_set_rx_mode+0x5f/0xb0 [32524.386500] [<ffffffff8141af8b>] dev_set_rx_mode+0x2b/0x40 [32524.386506] [<ffffffff8141b3cf>] __dev_open+0xef/0x150 [32524.386511] [<ffffffff8141b177>] __dev_change_flags+0xa7/0x190 [32524.386516] [<ffffffff8141b292>] dev_change_flags+0x32/0x80 [32524.386524] [<ffffffff8149ca56>] devinet_ioctl+0x7d6/0x830 [32524.386532] [<ffffffff81437b0b>] ? dev_ioctl+0x34b/0x660 [32524.386540] [<ffffffff814a05b0>] inet_ioctl+0x80/0xa0 [32524.386550] [<ffffffff8140199d>] sock_do_ioctl+0x2d/0x60 [32524.386558] [<ffffffff81401a52>] sock_ioctl+0x82/0x2a0 [32524.386568] [<ffffffff811a7123>] do_vfs_ioctl+0x93/0x590 [32524.386578] [<ffffffff811b2705>] ? rcu_read_lock_held+0x45/0x50 [32524.386586] [<ffffffff811b39e5>] ? __fget_light+0x105/0x110 [32524.386594] [<ffffffff811a76b1>] SyS_ioctl+0x91/0xb0 [32524.386604] [<ffffffff815057e2>] system_call_fastpath+0x16/0x1b ======================================================================== The reason is that all of the addr_lock_key for vlan dev have the same class, so if we change the status for vlan dev, the vlan dev and its real dev will hold the same class of addr_lock_key together, so the warning happened. we should distinguish the lock depth for vlan dev and its real dev. v1->v2: Convert the vlan_netdev_addr_lock_key to an array of eight elements, which could support to add 8 vlan id on a same vlan dev, I think it is enough for current scene, because a netdev's name is limited to IFNAMSIZ which could not hold 8 vlan id, and the vlan dev would not meet the same class key with its real dev. The new function vlan_dev_get_lockdep_subkey() will return the subkey and make the vlan dev could get a suitable class key. v2->v3: According David's suggestion, I use the subclass to distinguish the lock key for vlan dev and its real dev, but it make no sense, because the difference for subclass in the lock_class_key doesn't mean that the difference class for lock_key, so I use lock_depth to distinguish the different depth for every vlan dev, the same depth of the vlan dev could have the same lock_class_key, I import the MAX_LOCK_DEPTH from the include/linux/sched.h, I think it is enough here, the lockdep should never exceed that value. v3->v4: Add a huge array of locking keys will waste static kernel memory and is not a appropriate method, we could use _nested() variants to fix the problem, calculate the depth for every vlan dev, and use the depth as the subclass for addr_lock_key. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

[ Upstream commit b14878c ] Currently, it is possible to create an SCTP socket, then switch auth_enable via sysctl setting to 1 and crash the system on connect: Oops[#1]: CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1 task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000 [...] Call Trace: [<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80 [<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4 [<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c [<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8 [<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214 [<ffffffff8043af68>] sctp_rcv+0x588/0x630 [<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24 [<ffffffff803acb50>] ip6_input+0x2c0/0x440 [<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564 [<ffffffff80310650>] process_backlog+0xb4/0x18c [<ffffffff80313cbc>] net_rx_action+0x12c/0x210 [<ffffffff80034254>] __do_softirq+0x17c/0x2ac [<ffffffff800345e0>] irq_exit+0x54/0xb0 [<ffffffff800075a4>] ret_from_irq+0x0/0x4 [<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48 [<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148 [<ffffffff805a88b0>] start_kernel+0x37c/0x398 Code: dd0900b8 000330f8 0126302d <dcc60000> 50c0fff1 0047182a a48306a0 03e00008 00000000 ---[ end trace b530b0551467f2fd ]--- Kernel panic - not syncing: Fatal exception in interrupt What happens while auth_enable=0 in that case is, that ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs() when endpoint is being created. After that point, if an admin switches over to auth_enable=1, the machine can crash due to NULL pointer dereference during reception of an INIT chunk. When we enter sctp_process_init() via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk, the INIT verification succeeds and while we walk and process all INIT params via sctp_process_param() we find that net->sctp.auth_enable is set, therefore do not fall through, but invoke sctp_auth_asoc_set_default_hmac() instead, and thus, dereference what we have set to NULL during endpoint initialization phase. The fix is to make auth_enable immutable by caching its value during endpoint initialization, so that its original value is being carried along until destruction. The bug seems to originate from the very first days. Fix in joint work with Daniel Borkmann. Reported-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Tested-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 4797ec2 upstream. The following happens when trying to run a kvm guest on a kernel configured for 64k pages. This doesn't happen with 4k pages: BUG: failure at include/linux/mm.h:297/put_page_testzero()! Kernel panic - not syncing: BUG! CPU: 2 PID: 4228 Comm: qemu-system-aar Tainted: GF 3.13.0-0.rc7.31.sa2.k32v1.aarch64.debug #1 Call trace: [<fffffe0000096034>] dump_backtrace+0x0/0x16c [<fffffe00000961b4>] show_stack+0x14/0x1c [<fffffe000066e648>] dump_stack+0x84/0xb0 [<fffffe0000668678>] panic+0xf4/0x220 [<fffffe000018ec78>] free_reserved_area+0x0/0x110 [<fffffe000018edd8>] free_pages+0x50/0x88 [<fffffe00000a759c>] kvm_free_stage2_pgd+0x30/0x40 [<fffffe00000a5354>] kvm_arch_destroy_vm+0x18/0x44 [<fffffe00000a1854>] kvm_put_kvm+0xf0/0x184 [<fffffe00000a1938>] kvm_vm_release+0x10/0x1c [<fffffe00001edc1c>] __fput+0xb0/0x288 [<fffffe00001ede4c>] ____fput+0xc/0x14 [<fffffe00000d5a2c>] task_work_run+0xa8/0x11c [<fffffe0000095c14>] do_notify_resume+0x54/0x58 In arch/arm/kvm/mmu.c:unmap_range(), we end up doing an extra put_page() on the stage2 pgd which leads to the BUG in put_page_testzero(). This happens because a pud_huge() test in unmap_range() returns true when it should always be false with 2-level pages tables used by 64k pages. This patch removes support for huge puds if 2-level pagetables are being used. Signed-off-by: Mark Salter <msalter@redhat.com> [catalin.marinas@arm.com: removed #ifndef around PUD_SIZE check] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit aa07c71 upstream. After setting ACL for directory, I got two problems that caused by the cached zero-length default posix acl. This patch make sure nfsd4_set_nfs4_acl calls ->set_acl with a NULL ACL structure if there are no entries. Thanks for Christoph Hellwig's advice. First problem: ............ hang ........... Second problem: [ 1610.167668] ------------[ cut here ]------------ [ 1610.168320] kernel BUG at /root/nfs/linux/fs/nfsd/nfs4acl.c:239! [ 1610.168320] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 1610.168320] Modules linked in: nfsv4(OE) nfs(OE) nfsd(OE) rpcsec_gss_krb5 fscache ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw auth_rpcgss nfs_acl snd_intel8x0 ppdev lockd snd_ac97_codec ac97_bus snd_pcm snd_timer e1000 pcspkr parport_pc snd parport serio_raw joydev i2c_piix4 sunrpc(OE) microcode soundcore i2c_core ata_generic pata_acpi [last unloaded: nfsd] [ 1610.168320] CPU: 0 PID: 27397 Comm: nfsd Tainted: G OE 3.15.0-rc1+ #15 [ 1610.168320] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 1610.168320] task: ffff88005ab653d0 ti: ffff88005a944000 task.ti: ffff88005a944000 [ 1610.168320] RIP: 0010:[<ffffffffa034d5ed>] [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP: 0018:ffff88005a945b00 EFLAGS: 00010293 [ 1610.168320] RAX: 0000000000000001 RBX: ffff88006700bac0 RCX: 0000000000000000 [ 1610.168320] RDX: 0000000000000000 RSI: ffff880067c83f00 RDI: ffff880068233300 [ 1610.168320] RBP: ffff88005a945b48 R08: ffffffff81c64830 R09: 0000000000000000 [ 1610.168320] R10: ffff88004ea85be0 R11: 000000000000f475 R12: ffff880068233300 [ 1610.168320] R13: 0000000000000003 R14: 0000000000000002 R15: ffff880068233300 [ 1610.168320] FS: 0000000000000000(0000) GS:ffff880077800000(0000) knlGS:0000000000000000 [ 1610.168320] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1610.168320] CR2: 00007f5bcbd3b0b9 CR3: 0000000001c0f000 CR4: 00000000000006f0 [ 1610.168320] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1610.168320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1610.168320] Stack: [ 1610.168320] ffffffff00000000 0000000b67c83500 000000076700bac0 0000000000000000 [ 1610.168320] ffff88006700bac0 ffff880068233300 ffff88005a945c08 0000000000000002 [ 1610.168320] 0000000000000000 ffff88005a945b88 ffffffffa034e2d5 000000065a945b68 [ 1610.168320] Call Trace: [ 1610.168320] [<ffffffffa034e2d5>] nfsd4_get_nfs4_acl+0x95/0x150 [nfsd] [ 1610.168320] [<ffffffffa03400d6>] nfsd4_encode_fattr+0x646/0x1e70 [nfsd] [ 1610.168320] [<ffffffff816a6e6e>] ? kmemleak_alloc+0x4e/0xb0 [ 1610.168320] [<ffffffffa0327962>] ? nfsd_setuser_and_check_port+0x52/0x80 [nfsd] [ 1610.168320] [<ffffffff812cd4bb>] ? selinux_cred_prepare+0x1b/0x30 [ 1610.168320] [<ffffffffa0341caa>] nfsd4_encode_getattr+0x5a/0x60 [nfsd] [ 1610.168320] [<ffffffffa0341e07>] nfsd4_encode_operation+0x67/0x110 [nfsd] [ 1610.168320] [<ffffffffa033844d>] nfsd4_proc_compound+0x21d/0x810 [nfsd] [ 1610.168320] [<ffffffffa0324d9b>] nfsd_dispatch+0xbb/0x200 [nfsd] [ 1610.168320] [<ffffffffa00850cd>] svc_process_common+0x46d/0x6d0 [sunrpc] [ 1610.168320] [<ffffffffa0085433>] svc_process+0x103/0x170 [sunrpc] [ 1610.168320] [<ffffffffa032472f>] nfsd+0xbf/0x130 [nfsd] [ 1610.168320] [<ffffffffa0324670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 1610.168320] [<ffffffff810a5202>] kthread+0xd2/0xf0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] [<ffffffff816c1ebc>] ret_from_fork+0x7c/0xb0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] Code: 78 02 e9 e7 fc ff ff 31 c0 31 d2 31 c9 66 89 45 ce 41 8b 04 24 66 89 55 d0 66 89 4d d2 48 8d 04 80 49 8d 5c 84 04 e9 37 fd ff ff <0f> 0b 90 0f 1f 44 00 00 55 8b 56 08 c7 07 00 00 00 00 8b 46 0c [ 1610.168320] RIP [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP <ffff88005a945b00> [ 1610.257313] ---[ end trace 838254e3e352285b ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit c5450db upstream. While accessing cur_policy during executing events CPUFREQ_GOV_START, CPUFREQ_GOV_STOP, CPUFREQ_GOV_LIMITS, same mutex lock is not taken, dbs_data->mutex, which leads to race and data corruption while running continious suspend resume test. This is seen with ondemand governor with suspend resume test using rtcwake. Unable to handle kernel NULL pointer dereference at virtual address 00000028 pgd = ed610000 [00000028] *pgd=adf11831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM Modules linked in: nvhost_vi CPU: 1 PID: 3243 Comm: rtcwake Not tainted 3.10.24-gf5cf9e5 #1 task: ee708040 ti: ed61c000 task.ti: ed61c000 PC is at cpufreq_governor_dbs+0x400/0x634 LR is at cpufreq_governor_dbs+0x3f8/0x634 pc : [<c05652b8>] lr : [<c05652b0>] psr: 600f0013 sp : ed61dcb0 ip : 000493e0 fp : c1cc14f0 r10: 00000000 r9 : 00000000 r8 : 00000000 r7 : eb725280 r6 : c1cc1560 r5 : eb575200 r4 : ebad7740 r3 : ee708040 r2 : ed61dca8 r1 : 001ebd24 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: ad61006a DAC: 00000015 [<c05652b8>] (cpufreq_governor_dbs+0x400/0x634) from [<c055f700>] (__cpufreq_governor+0x98/0x1b4) [<c055f700>] (__cpufreq_governor+0x98/0x1b4) from [<c0560770>] (__cpufreq_set_policy+0x250/0x320) [<c0560770>] (__cpufreq_set_policy+0x250/0x320) from [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) from [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c) [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) from [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) from [<c004b4b8>] (tegra_pm_notify+0x114/0x134) [<c004b4b8>] (tegra_pm_notify+0x114/0x134) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c) [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) from [<c00ad38c>] (enter_state+0xec/0x128) [<c00ad38c>] (enter_state+0xec/0x128) from [<c00ad400>] (pm_suspend+0x38/0xa4) [<c00ad400>] (pm_suspend+0x38/0xa4) from [<c00ac114>] (state_store+0x70/0xc0) [<c00ac114>] (state_store+0x70/0xc0) from [<c027b1e8>] (kobj_attr_store+0x14/0x20) [<c027b1e8>] (kobj_attr_store+0x14/0x20) from [<c019cd9c>] (sysfs_write_file+0x104/0x184) [<c019cd9c>] (sysfs_write_file+0x104/0x184) from [<c0143038>] (vfs_write+0xd0/0x19c) [<c0143038>] (vfs_write+0xd0/0x19c) from [<c0143414>] (SyS_write+0x4c/0x78) [<c0143414>] (SyS_write+0x4c/0x78) from [<c000f080>] (ret_fast_syscall+0x0/0x30) Code: e1a00006 eb084346 e59b0020 e5951024 (e5903028) ---[ end trace 0488523c8f6b0f9d ]--- Signed-off-by: Bibek Basu <bbasu@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 0430e49 upstream. Commit 8aac627 "move exit_task_namespaces() outside of exit_notify" introduced the kernel opps since the kernel v3.10, which happens when Apparmor and IMA-appraisal are enabled at the same time. ---------------------------------------------------------------------- [ 106.750167] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 106.750221] IP: [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.750241] PGD 0 [ 106.750254] Oops: 0000 [#1] SMP [ 106.750272] Modules linked in: cuse parport_pc ppdev bnep rfcomm bluetooth rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd snd_hda_codec_realtek dcdbas snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi psmouse snd_seq microcode serio_raw snd_timer snd_seq_device snd soundcore video lpc_ich coretemp mac_hid lp parport mei_me mei nbd hid_generic e1000e usbhid ahci ptp hid libahci pps_core [ 106.750658] CPU: 6 PID: 1394 Comm: mysqld Not tainted 3.13.0-rc7-kds+ #15 [ 106.750673] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A08 09/19/2012 [ 106.750689] task: ffff8800de804920 ti: ffff880400fca000 task.ti: ffff880400fca000 [ 106.750704] RIP: 0010:[<ffffffff811ec7da>] [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.750725] RSP: 0018:ffff880400fcba60 EFLAGS: 00010286 [ 106.750738] RAX: 0000000000000000 RBX: 0000000000000100 RCX: ffff8800d51523e7 [ 106.750764] RDX: ffffffffffffffea RSI: ffff880400fcba34 RDI: ffff880402d20020 [ 106.750791] RBP: ffff880400fcbae0 R08: 0000000000000000 R09: 0000000000000001 [ 106.750817] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800d5152300 [ 106.750844] R13: ffff8803eb8df510 R14: ffff880400fcbb28 R15: ffff8800d51523e7 [ 106.750871] FS: 0000000000000000(0000) GS:ffff88040d200000(0000) knlGS:0000000000000000 [ 106.750910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 106.750935] CR2: 0000000000000018 CR3: 0000000001c0e000 CR4: 00000000001407e0 [ 106.750962] Stack: [ 106.750981] ffffffff813434eb ffff880400fcbb20 ffff880400fcbb18 0000000000000000 [ 106.751037] ffff8800de804920 ffffffff8101b9b9 0001800000000000 0000000000000100 [ 106.751093] 0000010000000000 0000000000000002 000000000000000e ffff8803eb8df500 [ 106.751149] Call Trace: [ 106.751172] [<ffffffff813434eb>] ? aa_path_name+0x2ab/0x430 [ 106.751199] [<ffffffff8101b9b9>] ? sched_clock+0x9/0x10 [ 106.751225] [<ffffffff8134a68d>] aa_path_perm+0x7d/0x170 [ 106.751250] [<ffffffff8101b945>] ? native_sched_clock+0x15/0x80 [ 106.751276] [<ffffffff8134aa73>] aa_file_perm+0x33/0x40 [ 106.751301] [<ffffffff81348c5e>] common_file_perm+0x8e/0xb0 [ 106.751327] [<ffffffff81348d78>] apparmor_file_permission+0x18/0x20 [ 106.751355] [<ffffffff8130c853>] security_file_permission+0x23/0xa0 [ 106.751382] [<ffffffff811c77a2>] rw_verify_area+0x52/0xe0 [ 106.751407] [<ffffffff811c789d>] vfs_read+0x6d/0x170 [ 106.751432] [<ffffffff811cda31>] kernel_read+0x41/0x60 [ 106.751457] [<ffffffff8134fd45>] ima_calc_file_hash+0x225/0x280 [ 106.751483] [<ffffffff8134fb52>] ? ima_calc_file_hash+0x32/0x280 [ 106.751509] [<ffffffff8135022d>] ima_collect_measurement+0x9d/0x160 [ 106.751536] [<ffffffff810b552d>] ? trace_hardirqs_on+0xd/0x10 [ 106.751562] [<ffffffff8134f07c>] ? ima_file_free+0x6c/0xd0 [ 106.751587] [<ffffffff81352824>] ima_update_xattr+0x34/0x60 [ 106.751612] [<ffffffff8134f0d0>] ima_file_free+0xc0/0xd0 [ 106.751637] [<ffffffff811c9635>] __fput+0xd5/0x300 [ 106.751662] [<ffffffff811c98ae>] ____fput+0xe/0x10 [ 106.751687] [<ffffffff81086774>] task_work_run+0xc4/0xe0 [ 106.751712] [<ffffffff81066fad>] do_exit+0x2bd/0xa90 [ 106.751738] [<ffffffff8173c958>] ? retint_swapgs+0x13/0x1b [ 106.751763] [<ffffffff8106780c>] do_group_exit+0x4c/0xc0 [ 106.751788] [<ffffffff81067894>] SyS_exit_group+0x14/0x20 [ 106.751814] [<ffffffff8174522d>] system_call_fastpath+0x1a/0x1f [ 106.751839] Code: c3 0f 1f 44 00 00 55 48 89 e5 e8 22 fe ff ff 5d c3 0f 1f 44 00 00 55 65 48 8b 04 25 c0 c9 00 00 48 8b 80 28 06 00 00 48 89 e5 5d <48> 8b 40 18 48 39 87 c0 00 00 00 0f 94 c0 c3 0f 1f 80 00 00 00 [ 106.752185] RIP [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.752214] RSP <ffff880400fcba60> [ 106.752236] CR2: 0000000000000018 [ 106.752258] ---[ end trace 3c520748b4732721 ]--- ---------------------------------------------------------------------- The reason for the oops is that IMA-appraisal uses "kernel_read()" when file is closed. kernel_read() honors LSM security hook which calls Apparmor handler, which uses current->nsproxy->mnt_ns. The 'guilty' commit changed the order of cleanup code so that nsproxy->mnt_ns was not already available for Apparmor. Discussion about the issue with Al Viro and Eric W. Biederman suggested that kernel_read() is too high-level for IMA. Another issue, except security checking, that was identified is mandatory locking. kernel_read honors it as well and it might prevent IMA from calculating necessary hash. It was suggested to use simplified version of the function without security and locking checks. This patch introduces special version ima_kernel_read(), which skips security and mandatory locking checking. It prevents the kernel oops to happen. Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

[ Upstream commit 9709674 ] Alexey gave a AddressSanitizer[1] report that finally gave a good hint at where was the origin of various problems already reported by Dormando in the past [2] Problem comes from the fact that UDP can have a lockless TX path, and concurrent threads can manipulate sk_dst_cache, while another thread, is holding socket lock and calls __sk_dst_set() in ip4_datagram_release_cb() (this was added in linux-3.8) It seems that all we need to do is to use sk_dst_check() and sk_dst_set() so that all the writers hold same spinlock (sk->sk_dst_lock) to prevent corruptions. TCP stack do not need this protection, as all sk_dst_cache writers hold the socket lock. [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel AddressSanitizer: heap-use-after-free in ipv4_dst_check Read of size 2 by thread T15453: [<ffffffff817daa3a>] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116 [<ffffffff8175b789>] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531 [<ffffffff81830a36>] ip4_datagram_release_cb+0x46/0x390 ??:0 [<ffffffff8175eaea>] release_sock+0x17a/0x230 ./net/core/sock.c:2413 [<ffffffff81830882>] ip4_datagram_connect+0x462/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b ./arch/x86/kernel/entry_64.S:629 Freed by thread T15455: [<ffffffff8178d9b8>] dst_destroy+0xa8/0x160 ./net/core/dst.c:251 [<ffffffff8178de25>] dst_release+0x45/0x80 ./net/core/dst.c:280 [<ffffffff818304c1>] ip4_datagram_connect+0xa1/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b ./arch/x86/kernel/entry_64.S:629 Allocated by thread T15453: [<ffffffff8178d291>] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171 [<ffffffff817db3b7>] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406 [< inlined >] __ip_route_output_key+0x3e8/0xf70 __mkroute_output ./net/ipv4/route.c:1939 [<ffffffff817dde08>] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161 [<ffffffff817deb34>] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249 [<ffffffff81830737>] ip4_datagram_connect+0x317/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b ./arch/x86/kernel/entry_64.S:629 [2] <4>[196727.311203] general protection fault: 0000 [#1] SMP <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1 <4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013 <4>[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000 <4>[196727.311377] RIP: 0010:[<ffffffff815f8c7f>] [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80 <4>[196727.311399] RSP: 0018:ffff885effd23a70 EFLAGS: 00010282 <4>[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040 <4>[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200 <4>[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800 <4>[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 <4>[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce <4>[196727.311510] FS: 0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000 <4>[196727.311554] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0 <4>[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>[196727.311713] Stack: <4>[196727.311733] ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42 <4>[196727.311784] ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0 <4>[196727.311834] ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0 <4>[196727.311885] Call Trace: <4>[196727.311907] <IRQ> <4>[196727.311912] [<ffffffff815b7f42>] dst_destroy+0x32/0xe0 <4>[196727.311959] [<ffffffff815b86c6>] dst_release+0x56/0x80 <4>[196727.311986] [<ffffffff81620bd5>] tcp_v4_do_rcv+0x2a5/0x4a0 <4>[196727.312013] [<ffffffff81622b5a>] tcp_v4_rcv+0x7da/0x820 <4>[196727.312041] [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360 <4>[196727.312070] [<ffffffff815de02d>] ? nf_hook_slow+0x7d/0x150 <4>[196727.312097] [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360 <4>[196727.312125] [<ffffffff815fda92>] ip_local_deliver_finish+0xb2/0x230 <4>[196727.312154] [<ffffffff815fdd9a>] ip_local_deliver+0x4a/0x90 <4>[196727.312183] [<ffffffff815fd799>] ip_rcv_finish+0x119/0x360 <4>[196727.312212] [<ffffffff815fe00b>] ip_rcv+0x22b/0x340 <4>[196727.312242] [<ffffffffa0339680>] ? macvlan_broadcast+0x160/0x160 [macvlan] <4>[196727.312275] [<ffffffff815b0c62>] __netif_receive_skb_core+0x512/0x640 <4>[196727.312308] [<ffffffff811427fb>] ? kmem_cache_alloc+0x13b/0x150 <4>[196727.312338] [<ffffffff815b0db1>] __netif_receive_skb+0x21/0x70 <4>[196727.312368] [<ffffffff815b0fa1>] netif_receive_skb+0x31/0xa0 <4>[196727.312397] [<ffffffff815b1ae8>] napi_gro_receive+0xe8/0x140 <4>[196727.312433] [<ffffffffa00274f1>] ixgbe_poll+0x551/0x11f0 [ixgbe] <4>[196727.312463] [<ffffffff815fe00b>] ? ip_rcv+0x22b/0x340 <4>[196727.312491] [<ffffffff815b1691>] net_rx_action+0x111/0x210 <4>[196727.312521] [<ffffffff815b0db1>] ? __netif_receive_skb+0x21/0x70 <4>[196727.312552] [<ffffffff810519d0>] __do_softirq+0xd0/0x270 <4>[196727.312583] [<ffffffff816cef3c>] call_softirq+0x1c/0x30 <4>[196727.312613] [<ffffffff81004205>] do_softirq+0x55/0x90 <4>[196727.312640] [<ffffffff81051c85>] irq_exit+0x55/0x60 <4>[196727.312668] [<ffffffff816cf5c3>] do_IRQ+0x63/0xe0 <4>[196727.312696] [<ffffffff816c5aaa>] common_interrupt+0x6a/0x6a <4>[196727.312722] <EOI> <1>[196727.313071] RIP [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80 <4>[196727.313100] RSP <ffff885effd23a70> <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]--- <0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt Reported-by: Alexey Preobrazhensky <preobr@google.com> Reported-by: dormando <dormando@rydia.ne> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 8141ed9 ("ipv4: Add a socket release callback for datagram sockets") Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 00159a2 upstream. When booting a kexec/kdump kernel on a system that has specific memory hotplug regions the boot will fail with warnings like: swapper/0: page allocation failure: order:9, mode:0x84d0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-65.el7.x86_64 #1 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011 0000000000000000 ffff8800341bd8c8 ffffffff815bcc67 ffff8800341bd950 ffffffff8113b1a0 ffff880036339b00 0000000000000009 00000000000084d0 ffff8800341bd950 ffffffff815b87ee 0000000000000000 0000000000000200 Call Trace: [<ffffffff815bcc67>] dump_stack+0x19/0x1b [<ffffffff8113b1a0>] warn_alloc_failed+0xf0/0x160 [<ffffffff815b87ee>] ? __alloc_pages_direct_compact+0xac/0x196 [<ffffffff8113f14f>] __alloc_pages_nodemask+0x7ff/0xa00 [<ffffffff815b417c>] vmemmap_alloc_block+0x62/0xba [<ffffffff815b41e9>] vmemmap_alloc_block_buf+0x15/0x3b [<ffffffff815b1ff6>] vmemmap_populate+0xb4/0x21b [<ffffffff815b461d>] sparse_mem_map_populate+0x27/0x35 [<ffffffff815b400f>] sparse_add_one_section+0x7a/0x185 [<ffffffff815a1e9f>] __add_pages+0xaf/0x240 [<ffffffff81047359>] arch_add_memory+0x59/0xd0 [<ffffffff815a21d9>] add_memory+0xb9/0x1b0 [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90 [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90 [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5 [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160 [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6 [<ffffffff81a1fd58>] ? acpi_sleep_proc_init+0x2a/0x2a [<ffffffff810020e2>] do_one_initcall+0xe2/0x190 [<ffffffff819e20c4>] kernel_init_freeable+0x17c/0x207 [<ffffffff819e18d0>] ? do_early_param+0x88/0x88 [<ffffffff8159fea0>] ? rest_init+0x80/0x80 [<ffffffff8159feae>] kernel_init+0xe/0x180 [<ffffffff815cca2c>] ret_from_fork+0x7c/0xb0 [<ffffffff8159fea0>] ? rest_init+0x80/0x80 Mem-Info: Node 0 DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU 0: hi: 42, btch: 7 usd: 0 active_anon:0 inactive_anon:0 isolated_anon:0 active_file:0 inactive_file:0 isolated_file:0 unevictable:0 dirty:0 writeback:0 unstable:0 free:872 slab_reclaimable:13 slab_unreclaimable:1880 mapped:0 shmem:0 pagetables:0 bounce:0 free_cma:0 because the system has run out of memory at boot time. This occurs because of the following sequence in the boot: Main kernel boots and sets E820 map. The second kernel is booted with a map generated by the kdump service using memmap= and memmap=exactmap. These parameters are added to the kernel parameters of the kexec/kdump kernel. The kexec/kdump kernel has limited memory resources so as not to severely impact the main kernel. The system then panics and the kdump/kexec kernel boots (which is a completely new kernel boot). During this boot ACPI is initialized and the kernel (as can be seen above) traverses the ACPI namespace and finds an entry for a memory device to be hotadded. ie) [<ffffffff815a1e9f>] __add_pages+0xaf/0x240 [<ffffffff81047359>] arch_add_memory+0x59/0xd0 [<ffffffff815a21d9>] add_memory+0xb9/0x1b0 [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90 [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90 [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5 [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160 [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6 At this point the kernel adds page table information and the the kexec/kdump kernel runs out of memory. This can also be reproduced by using the memmap=exactmap and mem=X parameters on the main kernel and booting. This patchset resolves the problem by adding a kernel parameter, acpi_no_memhotplug, to disable ACPI memory hotplug. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit d5653f2 upstream. Fix NULL pointer exceptions when platform data is not supplied. Trace of one exception: Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = c0004000 [00000008] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM Modules linked in: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-12045-gead5dd4687a6-dirty raspberrypi#1628 task: eea80000 ti: eea88000 task.ti: eea88000 PC is at max77693_muic_probe+0x27c/0x528 LR is at regmap_write+0x50/0x60 pc : [<c041d1c8>] lr : [<c02eba60>] psr: 20000113 sp : eea89e38 ip : 00000000 fp : c098a834 r10: ee1a5a10 r9 : 00000005 r8 : c098a83c r7 : 0000000a r6 : c098a774 r5 : 00000005 r4 : eeb006d0 r3 : c0697bd8 r2 : 00000000 r1 : 00000001 r0 : 00000000 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 4000404a DAC: 00000015 Process swapper/0 (pid: 1, stack limit = 0xeea88240) Stack: (0xeea89e38 to 0xeea8a000) 9e20: c08499fc eeb006d0 9e40: 00000000 00000000 c0915f98 00000001 00000000 ee1a5a10 c098a730 c09a88b8 9e60: 00000000 c098a730 c0915f98 00000000 00000000 c02d6aa0 c02d6a88 ee1a5a10 9e80: c0a712c8 c02d54e4 00001204 c0628b00 ee1a5a10 c098a730 ee1a5a44 00000000 9ea0: eea88000 c02d57b4 00000000 c098a730 c02d5728 c02d3a24 ee813e5c eeb9d534 9ec0: c098a730 ee22f700 c097c720 c02d4b14 c08174ec c098a730 00000006 c098a730 9ee0: 00000006 c092fd30 c09b8500 c02d5df8 00000000 c093cbb8 00000006 c0008928 9f00: 000000c3 ef7fc785 00000000 ef7fc794 00000000 c08af968 00000072 eea89f30 9f20: ef7fc85e c065f198 000000c3 c003e87c 00000003 00000000 c092fd3c 00000000 9f40: c08af618 c0826d58 00000006 00000006 c0956f58 c093cbb8 00000006 c092fd30 9f60: c09b8500 000000c3 c092fd3c c08e8510 00000000 c08e8bb0 00000006 00000006 9f80: c08e8510 c0c0c0c0 00000000 c0628fac 00000000 00000000 00000000 00000000 9fa0: 00000000 c0628fb4 00000000 c000f038 00000000 00000000 00000000 00000000 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 c0c0c0c0 c0c0c0c0 [<c041d1c8>] (max77693_muic_probe) from [<c02d6aa0>] (platform_drv_probe+0x18/0x48) [<c02d6aa0>] (platform_drv_probe) from [<c02d54e4>] (driver_probe_device+0x140/0x384) [<c02d54e4>] (driver_probe_device) from [<c02d57b4>] (__driver_attach+0x8c/0x90) [<c02d57b4>] (__driver_attach) from [<c02d3a24>] (bus_for_each_dev+0x54/0x88) [<c02d3a24>] (bus_for_each_dev) from [<c02d4b14>] (bus_add_driver+0xe8/0x204) [<c02d4b14>] (bus_add_driver) from [<c02d5df8>] (driver_register+0x78/0xf4) [<c02d5df8>] (driver_register) from [<c0008928>] (do_one_initcall+0xc4/0x174) [<c0008928>] (do_one_initcall) from [<c08e8bb0>] (kernel_init_freeable+0xfc/0x1c8) [<c08e8bb0>] (kernel_init_freeable) from [<c0628fb4>] (kernel_init+0x8/0xec) [<c0628fb4>] (kernel_init) from [<c000f038>] (ret_from_fork+0x14/0x3c) Code: caffffe7 e59d200c e3550001 b3a05001 (e5923008) ---[ end trace 85db969ce011bde7 ]--- Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Fixes: 190d7cf Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 72abc8f upstream. I hit the same assert failed as Dolev Raviv reported in Kernel v3.10 shows like this: [ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297) [ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G O 3.10.40 #1 [ 9641.234116] [<c0011a6c>] (unwind_backtrace+0x0/0x12c) from [<c000d0b0>] (show_stack+0x20/0x24) [ 9641.234137] [<c000d0b0>] (show_stack+0x20/0x24) from [<c0311134>] (dump_stack+0x20/0x28) [ 9641.234188] [<c0311134>] (dump_stack+0x20/0x28) from [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) [ 9641.234265] [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) [ 9641.234307] [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) [ 9641.234327] [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) from [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) [ 9641.234344] [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) from [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) [ 9641.234363] [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) from [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) [ 9641.234382] [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) from [<c00f62d8>] (new_slab+0x78/0x238) [ 9641.234400] [<c00f62d8>] (new_slab+0x78/0x238) from [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) [ 9641.234419] [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) from [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) [ 9641.234459] [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) from [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) [ 9641.234553] [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) from [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) [ 9641.234606] [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) from [<c00c17c0>] (filemap_fault+0x304/0x418) [ 9641.234638] [<c00c17c0>] (filemap_fault+0x304/0x418) from [<c00de694>] (__do_fault+0xd4/0x530) [ 9641.234665] [<c00de694>] (__do_fault+0xd4/0x530) from [<c00e10c0>] (handle_pte_fault+0x480/0xf54) [ 9641.234690] [<c00e10c0>] (handle_pte_fault+0x480/0xf54) from [<c00e2bf8>] (handle_mm_fault+0x140/0x184) [ 9641.234716] [<c00e2bf8>] (handle_mm_fault+0x140/0x184) from [<c0316688>] (do_page_fault+0x150/0x3ac) [ 9641.234737] [<c0316688>] (do_page_fault+0x150/0x3ac) from [<c000842c>] (do_DataAbort+0x3c/0xa0) [ 9641.234759] [<c000842c>] (do_DataAbort+0x3c/0xa0) from [<c0314e38>] (__dabt_usr+0x38/0x40) After analyzing the code, I found a condition that may cause this failed in correct operations. Thus, I think this assertion is wrong and should be removed. Suppose there are two clean znodes and one dirty znode in TNC. So the per-filesystem atomic_t @clean_zn_cnt is (2). If commit start, dirty_znode is set to COW_ZNODE in get_znodes_to_commit() in case of potentially ops on this znode. We clear COW bit and DIRTY bit in write_index() without @tnc_mutex locked. We don't increase @clean_zn_cnt in this place. As the comments in write_index() shows, if another process hold @tnc_mutex and dirty this znode after we clean it, @clean_zn_cnt would be decreased to (1). We will increase @clean_zn_cnt to (2) with @tnc_mutex locked in free_obsolete_znodes() to keep it right. If shrink_tnc() performs between decrease and increase, it will release other 2 clean znodes it holds and found @clean_zn_cnt is less than zero (1 - 2 = -1), then hit the assertion. Because free_obsolete_znodes() will soon correct @clean_zn_cnt and no harm to fs in this case, I think this assertion could be removed. 2 clean zondes and 1 dirty znode, @clean_zn_cnt == 2 Thread A (commit) Thread B (write or others) Thread C (shrinker) ->write_index ->clear_bit(DIRTY_NODE) ->clear_bit(COW_ZNODE) @clean_zn_cnt == 2 ->mutex_locked(&tnc_mutex) ->dirty_cow_znode ->!ubifs_zn_cow(znode) ->!test_and_set_bit(DIRTY_NODE) ->atomic_dec(&clean_zn_cnt) ->mutex_unlocked(&tnc_mutex) @clean_zn_cnt == 1 ->mutex_locked(&tnc_mutex) ->shrink_tnc ->destroy_tnc_subtree ->atomic_sub(&clean_zn_cnt, 2) ->ubifs_assert <- hit ->mutex_unlocked(&tnc_mutex) @clean_zn_cnt == -1 ->mutex_lock(&tnc_mutex) ->free_obsolete_znodes ->atomic_inc(&clean_zn_cnt) ->mutux_unlock(&tnc_mutex) @clean_zn_cnt == 0 (correct after shrink) Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 60e1751 upstream. Avoid that closing /dev/infiniband/umad<n> or /dev/infiniband/issm<n> triggers a use-after-free. __fput() invokes f_op->release() before it invokes cdev_put(). Make sure that the ib_umad_device structure is freed by the cdev_put() call instead of f_op->release(). This avoids that changing the port mode from IB into Ethernet and back to IB followed by restarting opensmd triggers the following kernel oops: general protection fault: 0000 [#1] PREEMPT SMP RIP: 0010:[<ffffffff810cc65c>] [<ffffffff810cc65c>] module_put+0x2c/0x170 Call Trace: [<ffffffff81190f20>] cdev_put+0x20/0x30 [<ffffffff8118e2ce>] __fput+0x1ae/0x1f0 [<ffffffff8118e35e>] ____fput+0xe/0x10 [<ffffffff810723bc>] task_work_run+0xac/0xe0 [<ffffffff81002a9f>] do_notify_resume+0x9f/0xc0 [<ffffffff814b8398>] int_signal+0x12/0x17 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=75051 Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 7adb5c8 upstream. At probe time, the musb_am335x driver register its childs by calling of_platform_populate(), which registers all childs in the devicetree hierarchy recursively. On the other side, the driver's remove() function uses of_device_unregister() to remove each child of musb_am335x's. However, when musb_dsps is loaded, its devices are attached to the musb_am335x device as musb_am335x childs. Hence, musb_am335x remove() will attempt to unregister the devices registered by musb_dsps, which produces a kernel panic. In other words, the childs in the "struct device" hierarchy are not the same as the childs in the "devicetree" hierarchy. Ideally, we should enforce the removal of the devices registered by musb_am335x *only*, instead of all its child devices. However, because of the recursive nature of of_platform_populate, this doesn't seem possible. Therefore, as the only solution at hand, this commit disables musb_am335x driver removal capability, preventing it from being ever removed. This was originally suggested by Sebastian Siewior: https://www.mail-archive.com/linux-omap@vger.kernel.org/msg104946.html And for reference, here's the panic upon module removal: musb-hdrc musb-hdrc.0.auto: remove, state 4 usb usb1: USB disconnect, device number 1 musb-hdrc musb-hdrc.0.auto: USB bus 1 deregistered Unable to handle kernel NULL pointer dereference at virtual address 0000008 pgd = de11c000 [0000008] *pgd=9e174831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] ARM Modules linked in: musb_am335x(-) musb_dsps musb_hdrc usbcore usb_common CPU: 0 PID: 623 Comm: modprobe Not tainted 3.15.0-rc4-00001-g24efd13 raspberrypi#69 task: de1b7500 ti: de122000 task.ti: de122000 PC is at am335x_shutdown+0x10/0x28 LR is at am335x_shutdown+0xc/0x28 pc : [<c0327798>] lr : [<c0327794>] psr: a0000013 sp : de123df8 ip : 00000004 fp : 00028f00 r10: 00000000 r9 : de122000 r8 : c000e6c4 r7 : de0e3c10 r6 : de0e3800 r5 : de624010 r4 : de1ec750 r3 : de0e3810 r2 : 00000000 r1 : 00000001 r0 : 00000000 Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 9e11c019 DAC: 00000015 Process modprobe (pid: 623, stack limit = 0xde122240) Stack: (0xde123df8 to 0xde124000) 3de0: de0e3810 bf054488 3e00: bf05444c de624010 60000013 bf043650 000012fc de624010 de0e3810 bf043a20 3e20: de0e3810 bf04b240 c0635b88 c02ca37c c02ca364 c02c8db0 de1b7500 de0e3844 3e40: de0e3810 c02c8e28 c0635b88 de02824c de0e3810 c02c884c de0e3800 de0e3810 3e60: de0e3818 c02c5b20 bf05417c de0e3800 de0e3800 c0635b88 de0f2410 c02ca838 3e80: bf05417c de0e3800 bf055438 c02ca8cc de0e3c10 bf054194 de0e3c10 c02ca37c 3ea0: c02ca364 c02c8db0 de1b7500 de0e3c44 de0e3c10 c02c8e28 c0635b88 de02824c 3ec0: de0e3c10 c02c884c de0e3c10 de0e3c10 de0e3c18 c02c5b20 de0e3c10 de0e3c10 3ee0: 00000000 bf059000 a0000013 c02c5bc0 00000000 bf05900c de0e3c10 c02c5c48 3f00: de0dd0c0 de1ec970 de0f2410 bf05929c de0f2444 bf05902c de0f2410 c02ca37c 3f20: c02ca364 c02c8db0 bf05929c de0f2410 bf05929c c02c94c8 bf05929c 00000000 3f40: 00000800 c02c8ab4 bf0592e0 c007fc40 c00dd820 6273756d 336d615f 00783533 3f60: c064a0ac de1b7500 de122000 de1b7500 c000e590 00000001 c000e6c4 c0060160 3f80: 00028e70 00028e70 00028ea4 00000081 60000010 00028e70 00028e70 00028ea4 3fa0: 00000081 c000e500 00028e70 00028e70 00028ea4 00000800 becb59f8 00027608 3fc0: 00028e70 00028e70 00028ea4 00000081 00000001 00000001 00000000 00028f00 3fe0: b6e6b6f0 becb59d4 000160e8 b6e6b6fc 60000010 00028ea4 00000000 00000000 [<c0327798>] (am335x_shutdown) from [<bf054488>] (dsps_musb_exit+0x3c/0x4c [musb_dsps]) [<bf054488>] (dsps_musb_exit [musb_dsps]) from [<bf043650>] (musb_shutdown+0x80/0x90 [musb_hdrc]) [<bf043650>] (musb_shutdown [musb_hdrc]) from [<bf043a20>] (musb_remove+0x24/0x68 [musb_hdrc]) [<bf043a20>] (musb_remove [musb_hdrc]) from [<c02ca37c>] (platform_drv_remove+0x18/0x1c) [<c02ca37c>] (platform_drv_remove) from [<c02c8db0>] (__device_release_driver+0x70/0xc8) [<c02c8db0>] (__device_release_driver) from [<c02c8e28>] (device_release_driver+0x20/0x2c) [<c02c8e28>] (device_release_driver) from [<c02c884c>] (bus_remove_device+0xdc/0x10c) [<c02c884c>] (bus_remove_device) from [<c02c5b20>] (device_del+0x104/0x198) [<c02c5b20>] (device_del) from [<c02ca838>] (platform_device_del+0x14/0x9c) [<c02ca838>] (platform_device_del) from [<c02ca8cc>] (platform_device_unregister+0xc/0x20) [<c02ca8cc>] (platform_device_unregister) from [<bf054194>] (dsps_remove+0x18/0x38 [musb_dsps]) [<bf054194>] (dsps_remove [musb_dsps]) from [<c02ca37c>] (platform_drv_remove+0x18/0x1c) [<c02ca37c>] (platform_drv_remove) from [<c02c8db0>] (__device_release_driver+0x70/0xc8) [<c02c8db0>] (__device_release_driver) from [<c02c8e28>] (device_release_driver+0x20/0x2c) [<c02c8e28>] (device_release_driver) from [<c02c884c>] (bus_remove_device+0xdc/0x10c) [<c02c884c>] (bus_remove_device) from [<c02c5b20>] (device_del+0x104/0x198) [<c02c5b20>] (device_del) from [<c02c5bc0>] (device_unregister+0xc/0x20) [<c02c5bc0>] (device_unregister) from [<bf05900c>] (of_remove_populated_child+0xc/0x14 [musb_am335x]) [<bf05900c>] (of_remove_populated_child [musb_am335x]) from [<c02c5c48>] (device_for_each_child+0x44/0x70) [<c02c5c48>] (device_for_each_child) from [<bf05902c>] (am335x_child_remove+0x18/0x30 [musb_am335x]) [<bf05902c>] (am335x_child_remove [musb_am335x]) from [<c02ca37c>] (platform_drv_remove+0x18/0x1c) [<c02ca37c>] (platform_drv_remove) from [<c02c8db0>] (__device_release_driver+0x70/0xc8) [<c02c8db0>] (__device_release_driver) from [<c02c94c8>] (driver_detach+0xb4/0xb8) [<c02c94c8>] (driver_detach) from [<c02c8ab4>] (bus_remove_driver+0x4c/0xa0) [<c02c8ab4>] (bus_remove_driver) from [<c007fc40>] (SyS_delete_module+0x128/0x1cc) [<c007fc40>] (SyS_delete_module) from [<c000e500>] (ret_fast_syscall+0x0/0x48) Fixes: 97238b3 ("usb: musb: dsps: use proper child nodes") Acked-by: George Cherian <george.cherian@ti.com> Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit e4adcff upstream. We need to delete un-finished td from current request's td list at ep_dequeue API, otherwise, this non-user td will be remained at td list before this request is freed. So if we do ep_queue-> ep_dequeue->ep_queue sequence, when the complete interrupt for the second ep_queue comes, we search td list for this request, the first td (added by the first ep_queue) will be handled, and its status is still active, so we will consider the this transfer still not be completed, but in fact, it has completed. It causes the peripheral side considers it never receives current data for this transfer. We met this problem when do "Error Recovery Test - Device Configured" test item for USBCV2 MSC test, the host has never received ACK for the IN token for CSW due to peripheral considers it does not get this CBW, the USBCV test log like belows: -------------------------------------------------------------------------- INFO Issuing BOT MSC Reset, reset should always succeed INFO Retrieving status on CBW endpoint INFO CBW endpoint status = 0x0 INFO Retrieving status on CSW endpoint INFO CSW endpoint status = 0x0 INFO Issuing required command (Test Unit Ready) to verify device has recovered INFO Issuing CBW (attempt #1): INFO |----- CBW LUN = 0x0 INFO |----- CBW Flags = 0x0 INFO |----- CBW Data Transfer Length = 0x0 INFO |----- CBW CDB Length = 0x6 INFO |----- CBW CDB-00 = 0x0 INFO |----- CBW CDB-01 = 0x0 INFO |----- CBW CDB-02 = 0x0 INFO |----- CBW CDB-03 = 0x0 INFO |----- CBW CDB-04 = 0x0 INFO |----- CBW CDB-05 = 0x0 INFO Issuing CSW : try 1 INFO CSW Bulk Request timed out! ERROR Failed CSW phase : should have been success or stall FAIL (5.3.4) The CSW status value must be 0x00, 0x01, or 0x02. ERROR BOTCommonMSCRequest failed: error=80004000 Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Signed-off-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 7cd2b0a upstream. Oleg reports a division by zero error on zero-length write() to the percpu_pagelist_fraction sysctl: divide error: 0000 [#1] SMP DEBUG_PAGEALLOC CPU: 1 PID: 9142 Comm: badarea_io Not tainted 3.15.0-rc2-vm-nfs+ #19 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800d5aeb6e0 ti: ffff8800d87a2000 task.ti: ffff8800d87a2000 RIP: 0010: percpu_pagelist_fraction_sysctl_handler+0x84/0x120 RSP: 0018:ffff8800d87a3e78 EFLAGS: 00010246 RAX: 0000000000000f89 RBX: ffff88011f7fd000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000010 RBP: ffff8800d87a3e98 R08: ffffffff81d002c8 R09: ffff8800d87a3f50 R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000060 R13: ffffffff81c3c3e0 R14: ffffffff81cfddf8 R15: ffff8801193b0800 FS: 00007f614f1e9740(0000) GS:ffff88011f440000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f614f1fa000 CR3: 00000000d9291000 CR4: 00000000000006e0 Call Trace: proc_sys_call_handler+0xb3/0xc0 proc_sys_write+0x14/0x20 vfs_write+0xba/0x1e0 SyS_write+0x46/0xb0 tracesys+0xe1/0xe6 However, if the percpu_pagelist_fraction sysctl is set by the user, it is also impossible to restore it to the kernel default since the user cannot write 0 to the sysctl. This patch allows the user to write 0 to restore the default behavior. It still requires a fraction equal to or larger than 8, however, as stated by the documentation for sanity. If a value in the range [1, 7] is written, the sysctl will return EINVAL. This successfully solves the divide by zero issue at the same time. Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

…refcnt an atomic_t commit a5049a8 upstream. Hello, So, this patch should do. Joe, Vivek, can one of you guys please verify that the oops goes away with this patch? Jens, the original thread can be read at http://thread.gmane.org/gmane.linux.kernel/1720729 The fix converts blkg->refcnt from int to atomic_t. It does some overhead but it should be minute compared to everything else which is going on and the involved cacheline bouncing, so I think it's highly unlikely to cause any noticeable difference. Also, the refcnt in question should be converted to a perpcu_ref for blk-mq anyway, so the atomic_t is likely to go away pretty soon anyway. Thanks. ------- 8< ------- __blkg_release_rcu() may be invoked after the associated request_queue is released with a RCU grace period inbetween. As such, the function and callbacks invoked from it must not dereference the associated request_queue. This is clearly indicated in the comment above the function. Unfortunately, while trying to fix a different issue, 2a4fd07 ("blkcg: move bulk of blkcg_gq release operations to the RCU callback") ignored this and added [un]locking of @blkg->q->queue_lock to __blkg_release_rcu(). This of course can cause oops as the request_queue may be long gone by the time this code gets executed. general protection fault: 0000 [#1] SMP CPU: 21 PID: 30 Comm: rcuos/21 Not tainted 3.15.0 #1 Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.3:57 12/25/2013 task: ffff880854021de0 ti: ffff88085403c000 task.ti: ffff88085403c000 RIP: 0010:[<ffffffff8162e9e5>] [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60 RSP: 0018:ffff88085403fdf0 EFLAGS: 00010086 RAX: 0000000000020000 RBX: 0000000000000010 RCX: 0000000000000000 RDX: 000060ef80008248 RSI: 0000000000000286 RDI: 6b6b6b6b6b6b6b6b RBP: ffff88085403fdf0 R08: 0000000000000286 R09: 0000000000009f39 R10: 0000000000020001 R11: 0000000000020001 R12: ffff88103c17a130 R13: ffff88103c17a080 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88107fca0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000006e5ab8 CR3: 000000000193d000 CR4: 00000000000407e0 Stack: ffff88085403fe18 ffffffff812cbfc2 ffff88103c17a130 0000000000000000 ffff88103c17a130 ffff88085403fec0 ffffffff810d1d28 ffff880854021de0 ffff880854021de0 ffff88107fcaec58 ffff88085403fe80 ffff88107fcaec30 Call Trace: [<ffffffff812cbfc2>] __blkg_release_rcu+0x72/0x150 [<ffffffff810d1d28>] rcu_nocb_kthread+0x1e8/0x300 [<ffffffff81091d81>] kthread+0xe1/0x100 [<ffffffff8163813c>] ret_from_fork+0x7c/0xb0 Code: ff 47 04 48 8b 7d 08 be 00 02 00 00 e8 55 48 a4 ff 5d c3 0f 1f 00 66 66 66 66 90 55 48 89 e5 +fa 66 66 90 66 66 90 b8 00 00 02 00 <f0> 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f +b7 RIP [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60 RSP <ffff88085403fdf0> The request_queue locking was added because blkcg_gq->refcnt is an int protected with the queue lock and __blkg_release_rcu() needs to put the parent. Let's fix it by making blkcg_gq->refcnt an atomic_t and dropping queue locking in the function. Given the general heavy weight of the current request_queue and blkcg operations, this is unlikely to cause any noticeable overhead. Moreover, blkcg_gq->refcnt is likely to be converted to percpu_ref in the near future, so whatever (most likely negligible) overhead it may add is temporary. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Joe Lawrence <joe.lawrence@stratus.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Link: http://lkml.kernel.org/g/alpine.DEB.2.02.1406081816540.17948@jlaw-desktop.mno.stratus.com Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit dc78327 upstream. With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE, the following is triggered at early boot: SMP: Total of 8 processors activated. devtmpfs: initialized Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = fffffe0000050000 [00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407 Internal error: Oops: 96000006 [#1] SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ raspberrypi#44 task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000 PC is at __list_add+0x10/0xd4 LR is at free_one_page+0x270/0x638 ... Call trace: __list_add+0x10/0xd4 free_one_page+0x26c/0x638 __free_pages_ok.part.52+0x84/0xbc __free_pages+0x74/0xbc init_cma_reserved_pageblock+0xe8/0x104 cma_init_reserved_areas+0x190/0x1e4 do_one_initcall+0xc4/0x154 kernel_init_freeable+0x204/0x2a8 kernel_init+0xc/0xd4 This happens because init_cma_reserved_pageblock() calls __free_one_page() with pageblock_order as page order but it is bigger than MAX_ORDER. This in turn causes accesses past zone->free_list[]. Fix the problem by changing init_cma_reserved_pageblock() such that it splits pageblock into individual MAX_ORDER pages if pageblock is bigger than a MAX_ORDER page. In cases where !CONFIG_HUGETLB_PAGE_SIZE_VARIABLE, which is all architectures expect for ia64, powerpc and tile at the moment, the â��pageblock_order > MAX_ORDERâ�� condition will be optimised out since both sides of the operator are constants. In cases where pageblock size is variable, the performance degradation should not be significant anyway since init_cma_reserved_pageblock() is called only at boot time at most MAX_CMA_AREAS times which by default is eight. Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Reported-by: Mark Salter <msalter@redhat.com> Tested-by: Mark Salter <msalter@redhat.com> Tested-by: Christopher Covington <cov@codeaurora.org> Cc: Mel Gorman <mgorman@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 5a6024f upstream. When hot-adding and onlining CPU, kernel panic occurs, showing following call trace. BUG: unable to handle kernel paging request at 0000000000001d08 IP: [<ffffffff8114acfd>] __alloc_pages_nodemask+0x9d/0xb10 PGD 0 Oops: 0000 [#1] SMP ... Call Trace: [<ffffffff812b8745>] ? cpumask_next_and+0x35/0x50 [<ffffffff810a3283>] ? find_busiest_group+0x113/0x8f0 [<ffffffff81193bc9>] ? deactivate_slab+0x349/0x3c0 [<ffffffff811926f1>] new_slab+0x91/0x300 [<ffffffff815de95a>] __slab_alloc+0x2bb/0x482 [<ffffffff8105bc1c>] ? copy_process.part.25+0xfc/0x14c0 [<ffffffff810a3c78>] ? load_balance+0x218/0x890 [<ffffffff8101a679>] ? sched_clock+0x9/0x10 [<ffffffff81105ba9>] ? trace_clock_local+0x9/0x10 [<ffffffff81193d1c>] kmem_cache_alloc_node+0x8c/0x200 [<ffffffff8105bc1c>] copy_process.part.25+0xfc/0x14c0 [<ffffffff81114d0d>] ? trace_buffer_unlock_commit+0x4d/0x60 [<ffffffff81085a80>] ? kthread_create_on_node+0x140/0x140 [<ffffffff8105d0ec>] do_fork+0xbc/0x360 [<ffffffff8105d3b6>] kernel_thread+0x26/0x30 [<ffffffff81086652>] kthreadd+0x2c2/0x300 [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60 [<ffffffff815f20ec>] ret_from_fork+0x7c/0xb0 [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60 In my investigation, I found the root cause is wq_numa_possible_cpumask. All entries of wq_numa_possible_cpumask is allocated by alloc_cpumask_var_node(). And these entries are used without initializing. So these entries have wrong value. When hot-adding and onlining CPU, wq_update_unbound_numa() is called. wq_update_unbound_numa() calls alloc_unbound_pwq(). And alloc_unbound_pwq() calls get_unbound_pool(). In get_unbound_pool(), worker_pool->node is set as follow: 3592 /* if cpumask is contained inside a NUMA node, we belong to that node */ 3593 if (wq_numa_enabled) { 3594 for_each_node(node) { 3595 if (cpumask_subset(pool->attrs->cpumask, 3596 wq_numa_possible_cpumask[node])) { 3597 pool->node = node; 3598 break; 3599 } 3600 } 3601 } But wq_numa_possible_cpumask[node] does not have correct cpumask. So, wrong node is selected. As a result, kernel panic occurs. By this patch, all entries of wq_numa_possible_cpumask are allocated by zalloc_cpumask_var_node to initialize them. And the panic disappeared. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: bce9038 ("workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]") Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 2a96dfa upstream. After unbinding the driver memory was corrupted by double free of clk_lookup structure. This lead to OOPS when re-binding the driver again. The driver allocated memory for 'clk_lookup' with devm_kzalloc. During driver removal this memory was freed twice: once by clkdev_drop() and second by devm code. Kernel panic log: [ 30.839284] Unable to handle kernel paging request at virtual address 5f343173 [ 30.846476] pgd = dee14000 [ 30.849165] [5f343173] *pgd=00000000 [ 30.852703] Internal error: Oops: 805 [#1] PREEMPT SMP ARM [ 30.858166] Modules linked in: [ 30.861208] CPU: 0 PID: 1 Comm: bash Not tainted 3.16.0-rc2-00239-g94bdf617b07e-dirty raspberrypi#40 [ 30.869364] task: df478000 ti: df480000 task.ti: df480000 [ 30.874752] PC is at clkdev_add+0x2c/0x38 [ 30.878738] LR is at clkdev_add+0x18/0x38 [ 30.882732] pc : [<c0350908>] lr : [<c03508f4>] psr: 60000013 [ 30.882732] sp : df481e78 ip : 00000001 fp : c0700ed8 [ 30.894187] r10: 0000000c r9 : 00000000 r8 : c07b0e3c [ 30.899396] r7 : 00000002 r6 : df45f9d0 r5 : df421390 r4 : c0700d6c [ 30.905906] r3 : 5f343173 r2 : c0700d84 r1 : 60000013 r0 : c0700d6c [ 30.912417] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 30.919534] Control: 10c53c7d Table: 5ee1406a DAC: 00000015 [ 30.925262] Process bash (pid: 1, stack limit = 0xdf480240) [ 30.930817] Stack: (0xdf481e78 to 0xdf482000) [ 30.935159] 1e60: 00001000 df6de610 [ 30.943321] 1e80: df7f4558 c0355650 c05ec6ec c0700eb0 df6de600 df7f4510 dec9d69c 00000014 [ 30.951480] 1ea0: 00167b48 df6de610 c0700e30 c0713518 00000000 c0700e30 dec9d69c 00000006 [ 30.959639] 1ec0: 00167b48 c02c1b7c c02c1b64 df6de610 c07aff48 c02c0420 c06fb150 c047cc20 [ 30.967798] 1ee0: df6de610 df6de610 c0700e30 df6de644 c06fb150 0000000c dec9d690 c02bef90 [ 30.975957] 1f00: dec9c6c0 dece4c00 df481f80 dece4c00 0000000c c02be73c 0000000c c016ca8c [ 30.984116] 1f20: c016ca48 00000000 00000000 c016c1f4 00000000 00000000 b6f18000 df481f80 [ 30.992276] 1f40: df7f66c0 0000000c df480000 df480000 b6f18000 c011094c df47839c 60000013 [ 31.000435] 1f60: 00000000 00000000 df7f66c0 df7f66c0 0000000c df480000 b6f18000 c0110dd4 [ 31.008594] 1f80: 00000000 00000000 0000000c b6ec05d8 0000000c b6f18000 00000004 c000f2a8 [ 31.016753] 1fa0: 00001000 c000f0e0 b6ec05d8 0000000c 00000001 b6f18000 0000000c 00000000 [ 31.024912] 1fc0: b6ec05d8 0000000c b6f18000 00000004 0000000c 00000001 00000000 00167b48 [ 31.033071] 1fe0: 00000000 bed83a80 b6e004f0 b6e5122c 60000010 00000001 ffffffff ffffffff [ 31.041248] [<c0350908>] (clkdev_add) from [<c0355650>] (s2mps11_clk_probe+0x2b4/0x3b4) [ 31.049223] [<c0355650>] (s2mps11_clk_probe) from [<c02c1b7c>] (platform_drv_probe+0x18/0x48) [ 31.057728] [<c02c1b7c>] (platform_drv_probe) from [<c02c0420>] (driver_probe_device+0x13c/0x384) [ 31.066579] [<c02c0420>] (driver_probe_device) from [<c02bef90>] (bind_store+0x88/0xd8) [ 31.074564] [<c02bef90>] (bind_store) from [<c02be73c>] (drv_attr_store+0x20/0x2c) [ 31.082118] [<c02be73c>] (drv_attr_store) from [<c016ca8c>] (sysfs_kf_write+0x44/0x48) [ 31.090016] [<c016ca8c>] (sysfs_kf_write) from [<c016c1f4>] (kernfs_fop_write+0xc0/0x17c) [ 31.098176] [<c016c1f4>] (kernfs_fop_write) from [<c011094c>] (vfs_write+0xa0/0x1c4) [ 31.105899] [<c011094c>] (vfs_write) from [<c0110dd4>] (SyS_write+0x40/0x8c) [ 31.112931] [<c0110dd4>] (SyS_write) from [<c000f0e0>] (ret_fast_syscall+0x0/0x3c) [ 31.120481] Code: e2842018 e584501c e1a00004 e885000c (e5835000) [ 31.126596] ---[ end trace efad45bfa3a61b05 ]--- [ 31.131181] Kernel panic - not syncing: Fatal exception [ 31.136368] CPU1: stopping [ 31.139054] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 3.16.0-rc2-00239-g94bdf617b07e-dirty raspberrypi#40 [ 31.148697] [<c0016480>] (unwind_backtrace) from [<c0012950>] (show_stack+0x10/0x14) [ 31.156419] [<c0012950>] (show_stack) from [<c0480db8>] (dump_stack+0x80/0xcc) [ 31.163622] [<c0480db8>] (dump_stack) from [<c001499c>] (handle_IPI+0x130/0x15c) [ 31.170998] [<c001499c>] (handle_IPI) from [<c000862c>] (gic_handle_irq+0x60/0x68) [ 31.178549] [<c000862c>] (gic_handle_irq) from [<c0013480>] (__irq_svc+0x40/0x70) [ 31.186009] Exception stack(0xdf4bdf88 to 0xdf4bdfd0) [ 31.191046] df80: ffffffed 00000000 00000000 00000000 df4bc000 c06d042c [ 31.199207] dfa0: 00000000 ffffffed c06d03c0 00000000 c070c288 00000000 00000000 df4bdfd0 [ 31.207363] dfc0: c0010324 c0010328 60000013 ffffffff [ 31.212402] [<c0013480>] (__irq_svc) from [<c0010328>] (arch_cpu_idle+0x28/0x30) [ 31.219783] [<c0010328>] (arch_cpu_idle) from [<c005f150>] (cpu_startup_entry+0x2c4/0x3f0) [ 31.228027] [<c005f150>] (cpu_startup_entry) from [<400086c4>] (0x400086c4) [ 31.234968] ---[ end Kernel panic - not syncing: Fatal exception Fixes: 7cc560d ("clk: s2mps11: Add support for s2mps11") Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com> Signed-off-by: Mike Turquette <mturquette@linaro.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 3f1f9b8 upstream. This fixes the following lockdep complaint: [ INFO: possible circular locking dependency detected ] 3.16.0-rc2-mm1+ #7 Tainted: G O ------------------------------------------------------- kworker/u24:0/4356 is trying to acquire lock: (&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0 but task is already holding lock: (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180 which lock already depends on the new lock. Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->i_es_lock); lock(&(&sbi->s_es_lru_lock)->rlock); lock(&ei->i_es_lock); lock(&(&sbi->s_es_lru_lock)->rlock); *** DEADLOCK *** 6 locks held by kworker/u24:0/4356: #0: ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560 #1: ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560 #2: (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90 #3: (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0 #4: (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550 #5: (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180 stack backtrace: CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G O 3.16.0-rc2-mm1+ #7 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: writeback bdi_writeback_workfn (flush-253:0) ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007 ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568 ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930 Call Trace: [<ffffffff815df0bb>] dump_stack+0x4e/0x68 [<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c [<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00 [<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe [<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0 [<ffffffff810a8707>] lock_acquire+0x87/0x120 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0 [<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70 [<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0 [<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0 [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0 [<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180 [<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550 [<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00 ... Reported-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Minchan Kim <minchan@kernel.org> Cc: Zheng Liu <gnehzuil.liu@gmail.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 07d410e upstream. commit fb78b81 provide a workaround for kernel panic, but bring potential deadlock risk. that is in sirfsoc_rx_tmo_process_tl while enter into sirfsoc_uart_pio_rx_chars cpu hold uart_port->lock, if uart interrupt comes cpu enter into sirfsoc_uart_isr and deadlock occurs in getting uart_port->lock. the patch replace spin_lock version to spin_lock_irq* version to avoid spinlock dead lock issue. let function tty_flip_buffer_push in tasklet outof spin_lock_irq* protect area to avoid add the pair of spin_lock and spin_unlock for tty_flip_buffer_push. BTW drop self defined unused spinlock protect of tx_lock/rx_lock. 56274.220464] BUG: spinlock lockup suspected on CPU#0, swapper/0/0 [56274.223648] lock: 0xc05d9db0, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0 [56274.231278] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.10.35 #1 [56274.238241] [<c0015530>] (unwind_backtrace+0x0/0xf4) from [<c00120d8>] (show_stack+0x10/0x14) [56274.246742] [<c00120d8>] (show_stack+0x10/0x14) from [<c01b11b0>] (do_raw_spin_lock+0x110/0x184) [56274.255501] [<c01b11b0>] (do_raw_spin_lock+0x110/0x184) from [<c02124c8>] (sirfsoc_uart_isr+0x20/0x42c) [56274.264874] [<c02124c8>] (sirfsoc_uart_isr+0x20/0x42c) from [<c0075790>] (handle_irq_event_percpu+0x54/0x17c) [56274.274758] [<c0075790>] (handle_irq_event_percpu+0x54/0x17c) from [<c00758f4>] (handle_irq_event+0x3c/0x5c) [56274.284561] [<c00758f4>] (handle_irq_event+0x3c/0x5c) from [<c0077fa0>] (handle_level_irq+0x98/0xfc) [56274.293670] [<c0077fa0>] (handle_level_irq+0x98/0xfc) from [<c0074f44>] (generic_handle_irq+0x2c/0x3c) [56274.302952] [<c0074f44>] (generic_handle_irq+0x2c/0x3c) from [<c000ef80>] (handle_IRQ+0x40/0x90) [56274.311706] [<c000ef80>] (handle_IRQ+0x40/0x90) from [<c000dc80>] (__irq_svc+0x40/0x70) [56274.319697] [<c000dc80>] (__irq_svc+0x40/0x70) from [<c038113c>] (_raw_spin_unlock_irqrestore+0x10/0x48) [56274.329158] [<c038113c>] (_raw_spin_unlock_irqrestore+0x10/0x48) from [<c0200034>] (tty_port_tty_get+0x58/0x90) [56274.339213] [<c0200034>] (tty_port_tty_get+0x58/0x90) from [<c0212008>] (sirfsoc_uart_pio_rx_chars+0x1c/0xc8) [56274.349097] [<c0212008>] (sirfsoc_uart_pio_rx_chars+0x1c/0xc8) from [<c0212ef8>] (sirfsoc_rx_tmo_process_tl+0xe4/0x1fc) [56274.359853] [<c0212ef8>] (sirfsoc_rx_tmo_process_tl+0xe4/0x1fc) from [<c0027c04>] (tasklet_action+0x84/0x114) [56274.369739] [<c0027c04>] (tasklet_action+0x84/0x114) from [<c0027db4>] (__do_softirq+0x120/0x200) [56274.378585] [<c0027db4>] (__do_softirq+0x120/0x200) from [<c0027f44>] (do_softirq+0x54/0x5c) [56274.386998] [<c0027f44>] (do_softirq+0x54/0x5c) from [<c00281ec>] (irq_exit+0x9c/0xd0) [56274.394899] [<c00281ec>] (irq_exit+0x9c/0xd0) from [<c000ef84>] (handle_IRQ+0x44/0x90) [56274.402790] [<c000ef84>] (handle_IRQ+0x44/0x90) from [<c000dc80>] (__irq_svc+0x40/0x70) [56274.410774] [<c000dc80>] (__irq_svc+0x40/0x70) from [<c0288af4>] (cpuidle_enter_state+0x50/0xe0) [56274.419532] [<c0288af4>] (cpuidle_enter_state+0x50/0xe0) from [<c0288c34>] (cpuidle_idle_call+0xb0/0x148) [56274.429080] [<c0288c34>] (cpuidle_idle_call+0xb0/0x148) from [<c000f3ac>] (arch_cpu_idle+0x8/0x38) [56274.438016] [<c000f3ac>] (arch_cpu_idle+0x8/0x38) from [<c0059344>] (cpu_startup_entry+0xfc/0x140) [56274.446956] [<c0059344>] (cpu_startup_entry+0xfc/0x140) from [<c04a3a54>] (start_kernel+0x2d8/0x2e4) Signed-off-by: Qipan Li <Qipan.Li@csr.com> Signed-off-by: Barry Song <Baohua.Song@csr.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

[ Upstream commit 5924f17 ] When in repair-mode and TCP_RECV_QUEUE is set, we end up calling tcp_push with mss_now being 0. If data is in the send-queue and tcp_set_skb_tso_segs gets called, we crash because it will divide by mss_now: [ 347.151939] divide error: 0000 [#1] SMP [ 347.152907] Modules linked in: [ 347.152907] CPU: 1 PID: 1123 Comm: packetdrill Not tainted 3.16.0-rc2 #4 [ 347.152907] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 347.152907] task: f5b88540 ti: f3c82000 task.ti: f3c82000 [ 347.152907] EIP: 0060:[<c1601359>] EFLAGS: 00210246 CPU: 1 [ 347.152907] EIP is at tcp_set_skb_tso_segs+0x49/0xa0 [ 347.152907] EAX: 00000b67 EBX: f5acd080 ECX: 00000000 EDX: 00000000 [ 347.152907] ESI: f5a28f40 EDI: f3c88f00 EBP: f3c83d10 ESP: f3c83d00 [ 347.152907] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 347.152907] CR0: 80050033 CR2: 083158b0 CR3: 35146000 CR4: 000006b0 [ 347.152907] Stack: [ 347.152907] c167f9d9 f5acd080 000005b4 00000002 f3c83d20 c16013e6 f3c88f00 f5acd080 [ 347.152907] f3c83da0 c1603b5a f3c83d38 c10a0188 00000000 00000000 f3c83d84 c10acc85 [ 347.152907] c1ad5ec0 00000000 00000000 c1ad679c 010003e0 00000000 00000000 f3c88fc8 [ 347.152907] Call Trace: [ 347.152907] [<c167f9d9>] ? apic_timer_interrupt+0x2d/0x34 [ 347.152907] [<c16013e6>] tcp_init_tso_segs+0x36/0x50 [ 347.152907] [<c1603b5a>] tcp_write_xmit+0x7a/0xbf0 [ 347.152907] [<c10a0188>] ? up+0x28/0x40 [ 347.152907] [<c10acc85>] ? console_unlock+0x295/0x480 [ 347.152907] [<c10ad24f>] ? vprintk_emit+0x1ef/0x4b0 [ 347.152907] [<c1605716>] __tcp_push_pending_frames+0x36/0xd0 [ 347.152907] [<c15f4860>] tcp_push+0xf0/0x120 [ 347.152907] [<c15f7641>] tcp_sendmsg+0xf1/0xbf0 [ 347.152907] [<c116d920>] ? kmem_cache_free+0xf0/0x120 [ 347.152907] [<c106a682>] ? __sigqueue_free+0x32/0x40 [ 347.152907] [<c106a682>] ? __sigqueue_free+0x32/0x40 [ 347.152907] [<c114f0f0>] ? do_wp_page+0x3e0/0x850 [ 347.152907] [<c161c36a>] inet_sendmsg+0x4a/0xb0 [ 347.152907] [<c1150269>] ? handle_mm_fault+0x709/0xfb0 [ 347.152907] [<c15a006b>] sock_aio_write+0xbb/0xd0 [ 347.152907] [<c1180b79>] do_sync_write+0x69/0xa0 [ 347.152907] [<c1181023>] vfs_write+0x123/0x160 [ 347.152907] [<c1181d55>] SyS_write+0x55/0xb0 [ 347.152907] [<c167f0d8>] sysenter_do_call+0x12/0x28 This can easily be reproduced with the following packetdrill-script (the "magic" with netem, sk_pacing and limit_output_bytes is done to prevent the kernel from pushing all segments, because hitting the limit without doing this is not so easy with packetdrill): 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1460> +0 > S. 0:0(0) ack 1 <mss 1460> +0.1 < . 1:1(0) ack 1 win 65000 +0 accept(3, ..., ...) = 4 // This forces that not all segments of the snd-queue will be pushed +0 `tc qdisc add dev tun0 root netem delay 10ms` +0 `sysctl -w net.ipv4.tcp_limit_output_bytes=2` +0 setsockopt(4, SOL_SOCKET, 47, [2], 4) = 0 +0 write(4,...,10000) = 10000 +0 write(4,...,10000) = 10000 // Set tcp-repair stuff, particularly TCP_RECV_QUEUE +0 setsockopt(4, SOL_TCP, 19, [1], 4) = 0 +0 setsockopt(4, SOL_TCP, 20, [1], 4) = 0 // This now will make the write push the remaining segments +0 setsockopt(4, SOL_SOCKET, 47, [20000], 4) = 0 +0 `sysctl -w net.ipv4.tcp_limit_output_bytes=130000` // Now we will crash +0 write(4,...,1000) = 1000 This happens since ec34232 (tcp: fix retransmission in repair mode). Prior to that, the call to tcp_push was prevented by a check for tp->repair. The patch fixes it, by adding the new goto-label out_nopush. When exiting tcp_sendmsg and a push is not required, which is the case for tp->repair, we go to this label. When repairing and calling send() with TCP_RECV_QUEUE, the data is actually put in the receive-queue. So, no push is required because no data has been added to the send-queue. Cc: Andrew Vagin <avagin@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Fixes: ec34232 (tcp: fix retransmission in repair mode) Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Acked-by: Andrew Vagin <avagin@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 1a112d1 upstream. 1871ee1 ("libata: support the ata host which implements a queue depth less than 32") directly used ata_port->scsi_host->can_queue from ata_qc_new() to determine the number of tags supported by the host; unfortunately, SAS controllers doing SATA don't initialize ->scsi_host leading to the following oops. BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: [<ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0 PGD 0 Oops: 0002 [#1] SMP Modules linked in: isci libsas scsi_transport_sas mgag200 drm_kms_helper ttm CPU: 1 PID: 518 Comm: udevd Not tainted 3.16.0-rc6+ raspberrypi#62 Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013 task: ffff880c1a00b280 ti: ffff88061a000000 task.ti: ffff88061a000000 RIP: 0010:[<ffffffff814e0618>] [<ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0 RSP: 0018:ffff88061a003ae8 EFLAGS: 0001001 RAX: 0000000000000001 RBX: ffff88000241ca80 RCX: 00000000000000fa RDX: 0000000000000020 RSI: 0000000000000020 RDI: ffff8806194aa298 RBP: ffff88061a003ae8 R08: ffff8806194a8000 R09: 0000000000000000 R10: 0000000000000000 R11: ffff88000241ca80 R12: ffff88061ad58200 R13: ffff8806194aa298 R14: ffffffff814e67a0 R15: ffff8806194a8000 FS: 00007f3ad7fe3840(0000) GS:ffff880627620000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000058 CR3: 000000061a118000 CR4: 00000000001407e0 Stack: ffff88061a003b20 ffffffff814e96e1 ffff88000241ca80 ffff88061ad58200 ffff8800b6bf6000 ffff880c1c988000 ffff880619903850 ffff88061a003b68 ffffffffa0056ce1 ffff88061a003b48 0000000013d6e6f8 ffff88000241ca80 Call Trace: [<ffffffff814e96e1>] ata_sas_queuecmd+0xa1/0x430 [<ffffffffa0056ce1>] sas_queuecommand+0x191/0x220 [libsas] [<ffffffff8149afee>] scsi_dispatch_cmd+0x10e/0x300 [<ffffffff814a3bc5>] scsi_request_fn+0x2f5/0x550 [<ffffffff81317613>] __blk_run_queue+0x33/0x40 [<ffffffff8131781a>] queue_unplugged+0x2a/0x90 [<ffffffff8131ceb4>] blk_flush_plug_list+0x1b4/0x210 [<ffffffff8131d274>] blk_finish_plug+0x14/0x50 [<ffffffff8117eaa8>] __do_page_cache_readahead+0x198/0x1f0 [<ffffffff8117ee21>] force_page_cache_readahead+0x31/0x50 [<ffffffff8117ee7e>] page_cache_sync_readahead+0x3e/0x50 [<ffffffff81172ac6>] generic_file_read_iter+0x496/0x5a0 [<ffffffff81219897>] blkdev_read_iter+0x37/0x40 [<ffffffff811e307e>] new_sync_read+0x7e/0xb0 [<ffffffff811e3734>] vfs_read+0x94/0x170 [<ffffffff811e43c6>] SyS_read+0x46/0xb0 [<ffffffff811e33d1>] ? SyS_lseek+0x91/0xb0 [<ffffffff8171ee29>] system_call_fastpath+0x16/0x1b Code: 00 00 00 88 50 29 83 7f 08 01 19 d2 83 e2 f0 83 ea 50 88 50 34 c6 81 1d 02 00 00 40 c6 81 17 02 00 00 00 5d c3 66 0f 1f 44 00 00 <89> 14 25 58 00 00 00 Fix it by introducing ata_host->n_tags which is initialized to ATA_MAX_QUEUE - 1 in ata_host_init() for SAS controllers and set to scsi_host_template->can_queue in ata_host_register() for !SAS ones. As SAS hosts are never registered, this will give them the same ATA_MAX_QUEUE - 1 as before. Note that we can't use scsi_host->can_queue directly for SAS hosts anyway as they can go higher than the libata maximum. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Reported-by: Jesse Brandeburg <jesse.brandeburg@gmail.com> Reported-by: Peter Hurley <peter@hurleysoftware.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Fixes: 1871ee1 ("libata: support the ata host which implements a queue depth less than 32") Cc: Kevin Hao <haokexin@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Jiri Slaby <jslaby@suse.cz>

commit 0b462c8 upstream. While a queue is being destroyed, all the blkgs are destroyed and its ->root_blkg pointer is set to NULL. If someone else starts to drain while the queue is in this state, the following oops happens. NULL pointer dereference at 0000000000000028 IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 PGD e4a1067 PUD b773067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched] CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000 RIP: 0010:[<ffffffff8144e944>] [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 RSP: 0018:ffff88000efd7bf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001 R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450 R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28 FS: 00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0 Stack: ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80 ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58 ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450 Call Trace: [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60 [<ffffffff81427641>] __blk_drain_queue+0x71/0x180 [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0 [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120 [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50 [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40 [<ffffffff8142d476>] blk_release_queue+0x26/0xd0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff81427505>] blk_put_queue+0x15/0x20 [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0 [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0 [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20 [<ffffffff817930e2>] device_release+0x32/0xa0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff817934d7>] put_device+0x17/0x20 [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0 [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40 [<ffffffff817d1257>] sdev_store_delete+0x27/0x30 [<ffffffff81792ca8>] dev_attr_store+0x18/0x30 [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50 [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170 [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0 [<ffffffff811f69bd>] SyS_write+0x4d/0xc0 [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b 776687b ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero") made it easier to trigger this bug by making blk_queue_bypass_start() drain even when it loses the first bypass test to blk_cleanup_queue(); however, the bug has always been there even before the commit as blk_queue_bypass_start() could race against queue destruction, win the initial bypass test but perform the actual draining after blk_cleanup_queue() already destroyed all blkgs. Fix it by skippping calling into policy draining if all the blkgs are already gone. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Shirish Pargaonkar <spargaonkar@suse.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Reported-by: Jet Chen <jet.chen@intel.com> Tested-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ghost assigned koalo May 4, 2013

koalo closed this as completed May 13, 2013

koalo mentioned this issue May 13, 2013

Add 32 bit support - Done, but needs testing! #4

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for Sabre ESS9018 #1

Add support for Sabre ESS9018 #1

koalo commented May 4, 2013

koalo commented May 6, 2013

steveha8838 commented May 6, 2013

koalo commented May 6, 2013

steveha8838 commented May 6, 2013

steveha8838 commented May 7, 2013

koalo commented May 8, 2013

steveha8838 commented May 8, 2013

koalo commented May 8, 2013

steveha8838 commented May 8, 2013

steveha8838 commented May 8, 2013

steveha8838 commented May 9, 2013

koalo commented May 10, 2013

steveha8838 commented May 12, 2013

koalo commented May 13, 2013

Add support for Sabre ESS9018 #1

Add support for Sabre ESS9018 #1

Comments

koalo commented May 4, 2013

koalo commented May 6, 2013

steveha8838 commented May 6, 2013

koalo commented May 6, 2013

steveha8838 commented May 6, 2013

steveha8838 commented May 7, 2013

koalo commented May 8, 2013

steveha8838 commented May 8, 2013

koalo commented May 8, 2013

steveha8838 commented May 8, 2013

steveha8838 commented May 8, 2013

steveha8838 commented May 9, 2013

koalo commented May 10, 2013

steveha8838 commented May 12, 2013

koalo commented May 13, 2013