forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New commits to master #2
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I hit an oops when merging reloc roots fails, the reason is that new reloc roots may be added and we should make sure we cleanup all reloc roots. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
The closing parenthesis is in the wrong place. We want to check "sizeof(*arg->clone_sources) * arg->clone_sources_count" instead of "sizeof(*arg->clone_sources * arg->clone_sources_count)". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> cc: stable@vger.kernel.org
While enabling these machines, we found we would sometimes lose an interrupt if we change hardware volume during playback, and that disabling msi fixed this issue. (Losing the interrupt caused underruns and crackling audio, as the one second timeout is usually bigger than the period size.) The machines were all machines from HP, running AMD Hudson controller, and Realtek ALC282 codec. Cc: stable@vger.kernel.org BugLink: https://bugs.launchpad.net/bugs/1260225 Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit ec39f64 (drm/radeon/dpm: Convert to use devm_hwmon_register_with_groups) converted one usage of dev_get_drvdata, but there were two more. bug: https://bugs.freedesktop.org/show_bug.cgi?id=72457 Signed-off-by: Martin Andersson <g02maran@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
I implemented support for this, but forget to hook up the callback so the driver can actually use it. On asics with a dedicated DMA engine, we use the DMA engine for buffer migration so this is just for testing purposes. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise we end up with a rather strange looking result. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes improperly set up display params for 2D tiling on oland. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
This causes a race condition between drm_dev_unregister() and pci_driver.shutdown at shutdown or driver unload time. We need to revisit how to properly support kexec within the drm. This reverts commit 846ae41.
The hugepage code had the exact same bug that regular pages had in commit 7485d0d ("futexes: Remove rw parameter from get_futex_key()"). The regular page case was fixed by commit 9ea7150 ("futex: Fix regression with read only mappings"), but the transparent hugepage case (added in a5b338f: "thp: update futex compound knowledge") case remained broken. Found by Dave Jones and his trinity tool. Reported-and-tested-by: Dave Jones <davej@fedoraproject.org> Cc: stable@kernel.org # v2.6.38+ Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Darren Hart <dvhart@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When debugging the read-only hugepage case, I was confused by the fact that get_futex_key() did an access_ok() only for the non-shared futex case, since the user address checking really isn't in any way specific to the private key handling. Now, it turns out that the shared key handling does effectively do the equivalent checks inside get_user_pages_fast() (it doesn't actually check the address range on x86, but does check the page protections for being a user page). So it wasn't actually a bug, but the fact that we treat the address differently for private and shared futexes threw me for a loop. Just move the check up, so that it gets done for both cases. Also, use the 'rw' parameter for the type, even if it doesn't actually matter any more (it's a historical artifact of the old racy i386 "page faults from kernel space don't check write protections"). Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull xfs bugfixes from Ben Myers: - fix for buffer overrun in agfl with growfs on v4 superblock - return EINVAL if requested discard length is less than a block - fix possible memory corruption in xfs_attrlist_by_handle() * tag 'xfs-for-linus-v3.13-rc4' of git://oss.sgi.com/xfs/xfs: xfs: growfs overruns AGFL buffer on V4 filesystems xfs: don't perform discard if the given range length is less than block size xfs: underflow bug in xfs_attrlist_by_handle()
…/kernel/git/dhowells/linux-fs Pull misc keyrings fixes from David Howells: "These break down into five sets: - A patch to error handling in the big_key type for huge payloads. If the payload is larger than the "low limit" and the backing store allocation fails, then big_key_instantiate() doesn't clear the payload pointers in the key, assuming them to have been previously cleared - but only one of them is. Unfortunately, the garbage collector still calls big_key_destroy() when sees one of the pointers with a weird value in it (and not NULL) which it then tries to clean up. - Three patches to fix the keyring type: * A patch to fix the hash function to correctly divide keyrings off from keys in the topology of the tree inside the associative array. This is only a problem if searching through nested keyrings - and only if the hash function incorrectly puts the a keyring outside of the 0 branch of the root node. * A patch to fix keyrings' use of the associative array. The __key_link_begin() function initially passes a NULL key pointer to assoc_array_insert() on the basis that it's holding a place in the tree whilst it does more allocation and stuff. This is only a problem when a node contains 16 keys that match at that level and we want to add an also matching 17th. This should easily be manufactured with a keyring full of keyrings (without chucking any other sort of key into the mix) - except for (a) above which makes it on average adding the 65th keyring. * A patch to fix searching down through nested keyrings, where any keyring in the set has more than 16 keyrings and none of the first keyrings we look through has a match (before the tree iteration needs to step to a more distal node). Test in keyutils test suite: http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=8b4ae963ed92523aea18dfbb8cab3f4979e13bd1 - A patch to fix the big_key type's use of a shmem file as its backing store causing audit messages and LSM check failures. This is done by setting S_PRIVATE on the file to avoid LSM checks on the file (access to the shmem file goes through the keyctl() interface and so is gated by the LSM that way). This isn't normally a problem if a key is used by the context that generated it - and it's currently only used by libkrb5. Test in keyutils test suite: http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=d9a53cbab42c293962f2f78f7190253fc73bd32e - A patch to add a generated file to .gitignore. - A patch to fix the alignment of the system certificate data such that it it works on s390. As I understand it, on the S390 arch, symbols must be 2-byte aligned because loading the address discards the least-significant bit" * tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: KEYS: correct alignment of system_certificate_list content in assembly file Ignore generated file kernel/x509_certificate_list security: shmem: implement kernel private shmem inodes KEYS: Fix searching of nested keyrings KEYS: Fix multiple key add into associative array KEYS: Fix the keyring hash function KEYS: Pre-clear struct key on allocation
…nux-vfio Pull iommu fixes from Alex Williamson: "arm/smmu driver updates via Will Deacon fixing locking around page table walks and a couple other issues" * tag 'iommu-fixes-for-v3.13-rc4' of git://github.com/awilliam/linux-vfio: iommu/arm-smmu: fix error return code in arm_smmu_device_dt_probe() iommu/arm-smmu: remove potential NULL dereference on mapping path iommu/arm-smmu: use mutex instead of spinlock for locking page tables
netback seems to be somewhat confused about the napi budget parameter. The parameter is supposed to limit the number of skbs processed in each poll, but netback has this confused with grant operations. This patch fixes that, properly limiting the work done in each poll. Note that this limit makes sure we do not process any more data from the shared ring than we intend to pass back from the poll. This is important to prevent tx_queue potentially growing without bound. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the RING_FINAL_CHECK_FOR_REQUESTS in xenvif_build_tx_gops to a check for RING_HAS_UNCONSUMED_REQUESTS as the former call has the side effect of advancing the ring event pointer and therefore inviting another interrupt from the frontend before the napi poll has actually finished, thereby defeating the point of napi. The event pointer is updated by RING_FINAL_CHECK_FOR_REQUESTS in xenvif_poll, the napi poll function, if the work done is less than the budget i.e. when actually transitioning back to interrupt mode. Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When explicitly hashing the end of a string with the word-at-a-time interface, we have to be careful which end of the word we pick up. On big-endian CPUs, the upper-bits will contain the data we're after, so ensure we generate our masks accordingly (and avoid hashing whatever random junk may have been sitting after the string). This patch adds a new dcache helper, bytemask_from_count, which creates a mask appropriate for the CPU endianness. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Whilst architectures may be able to do better than this (which they can, by simply defining their own macro), this is a generic stab at a zero_bytemask implementation for the asm-generic, big-endian word-at-a-time implementation. On arm64, a clz instruction is used to implement the fls efficiently. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When CPSW and Davinci MDIO are build as modules, CPSW crashes when accessing CPSW registers in CPSW probe. The same is working in built-in as the CPSW clocks are enabled in Davindi MDIO probe, SO Enabling the clocks before accessing the version register and moving out the other register access to cpsw device open. Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
We should be writing bits here but instead we're writing the numbers that correspond to the bits we want to write. Fix it by wrapping the numbers in the BIT() macro. This fixes gpios acting as interrupts. Cc: stable@vger.kernel.org Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
…kernel/git/jdelvare/staging Pull hwmon fixes from Jean Delvare. * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: hwmon: Prevent some divide by zeros in FAN_TO_REG() hwmon: (w83l768ng) Fix fan speed control range hwmon: (w83l786ng) Fix fan speed control mode setting and reporting hwmon: (lm90) Unregister hwmon device if interrupt setup fails
…nel/git/groeck/linux-staging Pull hwmon fix from Guenter Roeck: "Fix HIH-6130 driver to work with BeagleBone" * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: HIH-6130: Support I2C bus drivers without I2C_FUNC_SMBUS_QUICK
…rnel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A dvb core deadlock fix, a couple videobuf2 fixes an a series of media driver fixes" * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (30 commits) [media] videobuf2-dma-sg: fix possible memory leak [media] vb2: regression fix: always set length field. [media] mt9p031: Include linux/of.h header [media] rtl2830: add parent for I2C adapter [media] media: marvell-ccic: use devm to release clk [media] ths7303: Declare as static a private function [media] em28xx-video: Swap release order to avoid lock nesting [media] usbtv: Add support for PAL video source [media] media_tree: Fix spelling errors [media] videobuf2: Add support for file access mode flags for DMABUF exporting [media] radio-shark2: Mark shark_resume_leds() inline to kill compiler warning [media] radio-shark: Mark shark_resume_leds() inline to kill compiler warning [media] af9035: unlock on error in af9035_i2c_master_xfer() [media] af9033: fix broken I2C [media] v4l: omap3isp: Don't check for missing get_fmt op on remote subdev [media] af9035: fix broken I2C and USB I/O [media] wm8775: fix broken audio routing [media] marvell-ccic: drop resource free in driver remove [media] tef6862/radio-tea5764: actually assign clamp result [media] cx231xx: use after free on error path in probe ...
According to the manual, if a port is set for level detection using the corresponding bit in the edge/level select register and an external level interrupt signal is asserted, the corresponding bit in INTDT does not use the FF to hold the input. Thus, writing 1 to the corresponding bits in INTCLR cannot clear the corresponding bits in the INTDT register. Instead, when an external input signal is stopped, the corresponding bit in INTDT is cleared automatically. Since the INTDT bit cannot be cleared for the level interrupts until the interrupt signal is stopped, we end up with the infinite loop when using deferred (threaded) IRQ handling. Since a deferred interrupt is disabled by the low-level handler and re-enabled only when the deferred handler is completed, Fix the issue by dropping disabled interrupts from the pending mask as suggested by Laurent Pinchart <laurent.pinchart@ideasonboard.com> Changes in V2: * Drop disabled interrupts from pending mask altogether instead of dropping level interrupts one by one once they get handled. Signed-off-by: Valentine Barshak <valentine.barshak@cogentembedded.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Magnus Damm <damm@opensource.se> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
neigh_priv_len is defined as u8. With all debug enabled struct ipoib_neigh has 200 bytes. The largest part is sk_buff_head with 96 bytes and here the spinlock with 72 bytes. The size value still fits in this u8 leaving some room for more. On -RT struct ipoib_neigh put on weight and has 392 bytes. The main reason is sk_buff_head with 288 and the fatty here is spinlock with 192 bytes. This does no longer fit into into neigh_priv_len and gcc complains. This patch changes neigh_priv_len from being 8bit to 16bit. Since the following element (dev_id) is 16bit followed by a spinlock which is aligned, the struct remains with a total size of 3200 (allmodconfig) / 2048 (with as much debug off as possible) bytes on x86-64. On x86-32 the struct is 1856 (allmodconfig) / 1216 (with as much debug off as possible) bytes long. The numbers were gained with and without the patch to prove that this change does not increase the size of the struct. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
There is a mistake in checking the gso_prefix mask when passing large packets to a guest. The wrong shift is applied to the bit - the raw skb gso type is used rather then the translated one. This leads to large packets being handed to the guest without the GSO metadata. This patch fixes the check. The mistake manifested as errors whilst running Microsoft HCK large packet offload tests between a pair of Windows 8 VMs. I have verified this patch fixes those errors. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
…/git/dtor/input Pull input fixes from Dmitry Torokhov: "A fix for recent sysfs breakage in serio subsystem plus a fixup to adxl34x driver" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: adxl34x - Fix bug in definition of ADXL346_2D_ORIENT Input: serio - fix sysfs layout
…el/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Still a slightly high amount of changes than wished, but they are all good regression and/or device-specific fixes. Majority of commits are for HD-audio, an HDMI ctl index fix that hits old graphics boards, regression fixes for AD codecs and a few quirks. Other than that, two major fixes are included: a 64bit ABI fix for compress offload, and 64bit dma_addr_t truncation fix, which had hit on PAE kernels" * tag 'sound-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Add static DAC/pin mapping for AD1986A codec ALSA: hda - One more Dell headset detection quirk ALSA: hda - hdmi: Fix IEC958 ctl indexes for some simple HDMI devices ALSA: hda - Mute all aamix inputs as default ALSA: compress: Fix 64bit ABI incompatibility ALSA: memalloc.h - fix wrong truncation of dma_addr_t ALSA: hda - Another Dell headset detection quirk ALSA: hda - A Dell headset detection quirk ALSA: hda - Remove quirk for Dell Vostro 131 ALSA: usb-audio: fix uninitialized variable compile warning ALSA: hda - fix mic issues on Acer Aspire E-572
If a muxed i2c bus gets created the default retry count and timeout of the muxed bus is zero. Hence it it possible that you end up with a situation where the parent controller sets a default retry count and timeout which gets applied and used while the muxed bus (using the same controller) has a default retry count of zero and a default timeout of 1s (set in i2c_add_adapter()). This can be solved by initializing the retry count and timeout of the muxed bus with the values used by the the parent at creation time. Signed-off-by: Elie De Brauwer <eliedebrauwer@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
In multiple functions the vcpu_id is used as an offset into a bitfield. Ag malicious user could specify a vcpu_id greater than 255 in order to set or clear bits in kernel memory. This could be used to elevate priveges in the kernel. This patch verifies that the vcpu_id provided is less than 255. The api documentation already specifies that the vcpu_id must be less than max_vcpus, but this is currently not checked. Reported-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Under guest controllable circumstances apic_get_tmcct will execute a divide by zero and cause a crash. If the guest cpuid support tsc deadline timers and performs the following sequence of requests the host will crash. - Set the mode to periodic - Set the TMICT to 0 - Set the mode bits to 11 (neither periodic, nor one shot, nor tsc deadline) - Set the TMICT to non-zero. Then the lapic_timer.period will be 0, but the TMICT will not be. If the guest then reads from the TMCCT then the host will perform a divide by 0. This patch ensures that if the lapic_timer.period is 0, then the division does not occur. Reported-by: Andrew Honig <ahonig@google.com> Cc: stable@vger.kernel.org Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 8f34a1d ("arm64: ptrace: use HW_BREAKPOINT_EMPTY type for disabled breakpoints") fixed an issue with GDB trying to zero breakpoint control registers. The problem there is that the arch hw_breakpoint code will attempt to create a (disabled), execute breakpoint of length 0. This will fail validation and report unexpected failure to GDB. To avoid this, we treated disabled breakpoints as HW_BREAKPOINT_EMPTY, but that seems to have broken with recent kernels, causing watchpoints to be treated as TYPE_INST in the core code and returning ENOSPC for any further breakpoints. This patch fixes the problem by prioritising the `enable' field of the breakpoint: if it is cleared, we simply update the perf_event_attr to indicate that the thing is disabled and don't bother changing either the type or the length. This reinforces the behaviour that the breakpoint control register is essentially read-only apart from the enable bit when disabling a breakpoint. Cc: <stable@vger.kernel.org> Reported-by: Aaron Liu <liucy214@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
…ux/kernel/git/rostedt/linux-trace Pull ftrace fix from Steven Rostedt: "This fixes a long standing bug in the ftrace profiler. The problem is that the profiler only initializes the online CPUs, and not possible CPUs. This causes issues if the user takes CPUs online or offline while the profiler is running. If we online a CPU after starting the profiler, we lose all the trace information on the CPU going online. If we offline a CPU after running a test and start a new test, it will not clear the old data from that CPU. This bug causes incorrect data to be reported to the user if they online or offline CPUs during the profiling" * tag 'trace-fixes-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Initialize the ftrace profiler for each possible cpu
…/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from Konrad Rzeszutek Wilk: - Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use scratch pages. - Fix block API header for ARM32 and ARM64 to have proper layout - On ARM when mapping guests, stick on PTE_SPECIAL - When using SWIOTLB under ARM, don't call swiotlb functions twice - When unmapping guests memory and if we fail, don't return pages which failed to be unmapped. - Grant driver was using the wrong address on ARM. * tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Seperate the auto-translate logic properly (v2) xen/block: Correctly define structures in public headers on ARM32 and ARM64 arm: xen: foreign mapping PTEs are special. xen/arm64: do not call the swiotlb functions twice xen: privcmd: do not return pages which we have failed to unmap XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfn
…kvm-master Patch queue for 3.13 - 2013-12-18 This fixes some grave issues we've only found after 3.13-rc1: - Make the modularized HV/PR book3s kvm work well as modules - Fix some race conditions - Fix compilation with certain compilers (booke) - Fix THP for book3s_hv - Fix preemption for book3s_pr Alexander Graf (4): KVM: PPC: Book3S: PR: Don't clobber our exit handler id KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy KVM: PPC: Book3S: PR: Enable interrupts earlier Aneesh Kumar K.V (1): powerpc: book3s: kvm: Don't abuse host r2 in exit path Paul Mackerras (5): KVM: PPC: Book3S HV: Fix physical address calculations KVM: PPC: Book3S HV: Refine barriers in guest entry/exit KVM: PPC: Book3S HV: Make tbacct_lock irq-safe KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call KVM: PPC: Book3S HV: Don't drop low-order page address bits Scott Wood (1): powerpc/kvm/booke: Fix build break due to stack frame size warning pingfan liu (1): powerpc: kvm: fix rare but potential deadlock scene
Sasha Levin found a NULL pointer dereference that is due to a missing page table lock, which in turn is due to the pmd entry in question being a transparent huge-table entry. The code - introduced in commit 1998cc0 ("mm: make madvise(MADV_WILLNEED) support swap file prefetch") - correctly checks for this situation using pmd_none_or_trans_huge_or_clear_bad(), but it turns out that that function doesn't work correctly. pmd_none_or_trans_huge_or_clear_bad() expected that pmd_bad() would trigger if the transparent hugepage bit was set, but it doesn't do that if pmd_numa() is also set. Note that the NUMA bit only gets set on real NUMA machines, so people trying to reproduce this on most normal development systems would never actually trigger this. Fix it by removing the very subtle (and subtly incorrect) expectation, and instead just checking pmd_trans_huge() explicitly. Reported-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Andrea Arcangeli <aarcange@redhat.com> [ Additionally remove the now stale test for pmd_trans_huge() inside the pmd_bad() case - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
…airness policy" This reverts commit 73f038b. The NUMA behaviour of this patch is less than ideal. An alternative approch is to interleave allocations only within local zones which is implemented in the next patch. Cc: stable@vger.kernel.org Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 81c0a2b ("mm: page_alloc: fair zone allocator policy") meant to bring aging fairness among zones in system, but it was overzealous and badly regressed basic workloads on NUMA systems. Due to the way kswapd and page allocator interacts, we still want to make sure that all zones in any given node are used equally for all allocations to maximize memory utilization and prevent thrashing on the highest zone in the node. While the same principle applies to NUMA nodes - memory utilization is obviously improved by spreading allocations throughout all nodes - remote references can be costly and so many workloads prefer locality over memory utilization. The original change assumed that zone_reclaim_mode would be a good enough predictor for that, but it turned out to be as indicative as a coin flip. Revert the NUMA aspect of the fairness until we can find a proper way to make it configurable and agree on a sane default. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: <stable@kernel.org> # 3.12 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In struct page we have enough space to fit long-size page->ptl there, but we use dynamically-allocated page->ptl if size(spinlock_t) is larger than sizeof(int). It hurts 64-bit architectures with CONFIG_GENERIC_LOCKBREAK, where sizeof(spinlock_t) == 8, but it easily fits into struct page. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull KVM fixes from Paolo Bonzini: "The PPC folks had a large amount of changes queued for 3.13, and now they are fixing the bugs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Don't drop low-order page address bits powerpc: book3s: kvm: Don't abuse host r2 in exit path powerpc/kvm/booke: Fix build break due to stack frame size warning KVM: PPC: Book3S: PR: Enable interrupts earlier KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu KVM: PPC: Book3S: PR: Don't clobber our exit handler id powerpc: kvm: fix rare but potential deadlock scene KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call KVM: PPC: Book3S HV: Make tbacct_lock irq-safe KVM: PPC: Book3S HV: Refine barriers in guest entry/exit KVM: PPC: Book3S HV: Fix physical address calculations
…linux/kernel/git/djbw/dmaengine Pull dmaengine fixes from Dan Williams: - deprecation of net_dma to be removed in 3.14 - crash regression fix in pl330 from the dmaengine_unmap rework - crash regression fix for any channel running raid ops without CONFIG_ASYNC_TX_DMA from dmaengine_unmap - memory leak regression in mv_xor from dmaengine_unmap - build warning regressions in mv_xor, fsldma, ppc4xx, txx9, and at_hdmac from dmaengine_unmap - sleep in atomic regression in dma_async_memcpy_pg_to_pg - new fix in mv_xor for handling channel initialization failures * tag 'dmaengine-fixes-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine: net_dma: mark broken dma: pl330: ensure DMA descriptors are zero-initialised dmaengine: fix sleep in atomic dmaengine: mv_xor: fix oops when channels fail to initialise dma: mv_xor: Use dmaengine_unmap_data for the self-tests dmaengine: fix enable for high order unmap pools dma: fix build warnings in txx9 dmatest: fix build warning on mips dma: fix fsldma build warnings dma: fix build warnings in ppc4xx dmaengine: at_hdmac: remove unused function dma: mv_xor: remove mv_desc_get_dest_addr()
Some pstore backing devices use on board flash as persistent storage. These have limited numbers of write cycles so it is a poor idea to use them from high frequency operations. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
…git/arm64/linux Pull arm64 ptrace fix from Catalin Marinas. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events
…nux/kernel/git/vgupta/arc Pull ARC fix from Vineet Gupta: "Fix busted syscall table due to unistd header inclusion issue" * tag 'arc-fixes-for-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: Allow conditional multiple inclusion of uapi/asm/unistd.h
Commit 597d795 ('mm: do not allocate page->ptl dynamically, if spinlock_t fits to long') restructures some allocators that are compiled even if USE_SPLIT_PTLOCKS arn't used. It results in compilation failure: mm/memory.c:4282:6: error: 'struct page' has no member named 'ptl' mm/memory.c:4288:12: error: 'struct page' has no member named 'ptl' Add in the missing ifdef. Fixes: 597d795 ('mm: do not allocate page->ptl dynamically, if spinlock_t fits to long') Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull xfs bugfixes from Ben Myers: "This contains fixes for some asserts related to project quotas, a memory leak, a hang when disabling group or project quotas before disabling user quotas, Dave's email address, several fixes for the alignment of file allocation to stripe unit/width geometry, a fix for an assertion with xfs_zero_remaining_bytes, and the behavior of metadata writeback in the face of IO errors. Details: - fix memory leak in xfs_dir2_node_removename - fix quota assertion in xfs_setattr_size - fix quota assertions in xfs_qm_vop_create_dqattach - fix for hang when disabling group and project quotas before disabling user quotas - fix Dave Chinner's email address in MAINTAINERS - fix for file allocation alignment - fix for assertion in xfs_buf_stale by removing xfsbdstrat - fix for alignment with swalloc mount option - fix for "retry forever" semantics on IO errors" * tag 'xfs-for-linus-v3.13-rc5' of git://oss.sgi.com/xfs/xfs: xfs: abort metadata writeback on permanent errors xfs: swalloc doesn't align allocations properly xfs: remove xfsbdstrat error xfs: align initial file allocations correctly MAINTAINERS: fix incorrect mail address of XFS maintainer xfs: fix infinite loop by detaching the group/project hints from user dquot xfs: fix assertion failure at xfs_setattr_nonsize xfs: fix false assertion at xfs_qm_vop_create_dqattach xfs: fix memory leak in xfs_dir2_node_removename
Commit 1bf49dd ("./Makefile: export initial ramdisk compression config option") started setting the INITRD_COMPRESS environment variable depending on which decompression models the kernel had available. That is completely broken. For example, we by default have CONFIG_RD_LZ4 enabled, and are able to decompress such an initrd, but the user tools to *create* such an initrd may not be availble. So trying to tell dracut to generate an lz4-compressed image just because we can decode such an image is completely inappropriate. Cc: J P <ppandit@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Beulich <JBeulich@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
AshishNamdev
pushed a commit
that referenced
this pull request
Dec 22, 2013
…is completed Currently, when mounting pstore file system, a read callback of efi_pstore driver runs mutiple times as below. - In the first read callback, scan efivar_sysfs_list from head and pass a kmsg buffer of a entry to an upper pstore layer. - In the second read callback, rescan efivar_sysfs_list from the entry and pass another kmsg buffer to it. - Repeat the scan and pass until the end of efivar_sysfs_list. In this process, an entry is read across the multiple read function calls. To avoid race between the read and erasion, the whole process above is protected by a spinlock, holding in open() and releasing in close(). At the same time, kmemdup() is called to pass the buffer to pstore filesystem during it. And then, it causes a following lockdep warning. To make the dynamic memory allocation runnable without taking spinlock, holding off a deletion of sysfs entry if it happens while scanning it via efi_pstore, and deleting it after the scan is completed. To implement it, this patch introduces two flags, scanning and deleting, to efivar_entry. On the code basis, it seems that all the scanning and deleting logic is not needed because __efivars->lock are not dropped when reading from the EFI variable store. But, the scanning and deleting logic is still needed because an efi-pstore and a pstore filesystem works as follows. In case an entry(A) is found, the pointer is saved to psi->data. And efi_pstore_read() passes the entry(A) to a pstore filesystem by releasing __efivars->lock. And then, the pstore filesystem calls efi_pstore_read() again and the same entry(A), which is saved to psi->data, is used for resuming to scan a sysfs-list. So, to protect the entry(A), the logic is needed. [ 1.143710] ------------[ cut here ]------------ [ 1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110() [ 1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) [ 1.144058] Modules linked in: [ 1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2 [ 1.144058] 0000000000000009 ffff8800797e9ae0 ffffffff816614a5 ffff8800797e9b28 [ 1.144058] ffff8800797e9b18 ffffffff8105510d 0000000000000080 0000000000000046 [ 1.144058] 00000000000000d0 00000000000003af ffffffff81ccd0c0 ffff8800797e9b78 [ 1.144058] Call Trace: [ 1.144058] [<ffffffff816614a5>] dump_stack+0x54/0x74 [ 1.144058] [<ffffffff8105510d>] warn_slowpath_common+0x7d/0xa0 [ 1.144058] [<ffffffff8105517c>] warn_slowpath_fmt+0x4c/0x50 [ 1.144058] [<ffffffff8131290f>] ? vsscanf+0x57f/0x7b0 [ 1.144058] [<ffffffff810bbd74>] lockdep_trace_alloc+0x104/0x110 [ 1.144058] [<ffffffff81192da0>] __kmalloc_track_caller+0x50/0x280 [ 1.144058] [<ffffffff815147bb>] ? efi_pstore_read_func.part.1+0x12b/0x170 [ 1.144058] [<ffffffff8115b260>] kmemdup+0x20/0x50 [ 1.144058] [<ffffffff815147bb>] efi_pstore_read_func.part.1+0x12b/0x170 [ 1.144058] [<ffffffff81514800>] ? efi_pstore_read_func.part.1+0x170/0x170 [ 1.144058] [<ffffffff815148b4>] efi_pstore_read_func+0xb4/0xe0 [ 1.144058] [<ffffffff81512b7b>] __efivar_entry_iter+0xfb/0x120 [ 1.144058] [<ffffffff8151428f>] efi_pstore_read+0x3f/0x50 [ 1.144058] [<ffffffff8128d7ba>] pstore_get_records+0x9a/0x150 [ 1.158207] [<ffffffff812af25c>] ? selinux_d_instantiate+0x1c/0x20 [ 1.158207] [<ffffffff8128ce30>] ? parse_options+0x80/0x80 [ 1.158207] [<ffffffff8128ced5>] pstore_fill_super+0xa5/0xc0 [ 1.158207] [<ffffffff811ae7d2>] mount_single+0xa2/0xd0 [ 1.158207] [<ffffffff8128ccf8>] pstore_mount+0x18/0x20 [ 1.158207] [<ffffffff811ae8b9>] mount_fs+0x39/0x1b0 [ 1.158207] [<ffffffff81160550>] ? __alloc_percpu+0x10/0x20 [ 1.158207] [<ffffffff811c9493>] vfs_kern_mount+0x63/0xf0 [ 1.158207] [<ffffffff811cbb0e>] do_mount+0x23e/0xa20 [ 1.158207] [<ffffffff8115b51b>] ? strndup_user+0x4b/0xf0 [ 1.158207] [<ffffffff811cc373>] SyS_mount+0x83/0xc0 [ 1.158207] [<ffffffff81673cc2>] system_call_fastpath+0x16/0x1b [ 1.158207] ---[ end trace 61981bc62de9f6f4 ]--- Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Tested-by: Madper Xie <cxie@redhat.com> Cc: stable@kernel.org Signed-off-by: Matt Fleming <matt.fleming@intel.com>
AshishNamdev
pushed a commit
that referenced
this pull request
Dec 22, 2013
The patch fixes the following lockdep warning, which is 100% reproducible on network restart: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0+ torvalds#47 Tainted: GF ------------------------------------------------------- kworker/1:1/27 is trying to acquire lock: ((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70 but task is already holding lock: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&adapter->mutex){+.+...}: [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390 [<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 -> #0 ((&(&adapter->watchdog_task)->work)){+.+...}: [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); lock(&adapter->mutex); lock((&(&adapter->watchdog_task)->work)); *** DEADLOCK *** 3 locks held by kworker/1:1/27: #0: (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #1: ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510 #2: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000] stack backtrace: CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF 3.12.0+ torvalds#47 Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501 05/31/2007 Workqueue: events e1000_reset_task [e1000] ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002 ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8 ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00 Call Trace: [<ffffffff816b54a2>] dump_stack+0x49/0x5f [<ffffffff810ba936>] print_circular_bug+0x216/0x310 [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108a5eb>] flush_work+0x3b/0x70 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000] [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000] [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000] [<ffffffff8108b972>] process_one_work+0x1d2/0x510 [<ffffffff8108b906>] ? process_one_work+0x166/0x510 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0 [<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0 [<ffffffff81092c1e>] kthread+0xee/0x110 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70 == The issue background == The problem occurs, because e1000_down(), which is called under adapter->mutex by e1000_reset_task(), tries to synchronously cancel e1000 auxiliary works (reset_task, watchdog_task, phy_info_task, fifo_stall_task), which take adapter->mutex in their handlers. So the question is what does adapter->mutex protect there? The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to private mutex from rtnl") as a replacement for rtnl_lock() taken in the asynchronous handlers. It targeted on fixing a similar lockdep warning issued when e1000_down() was called under rtnl_lock(), and it fixed it, but unfortunately it introduced the lockdep warning described above. Anyway, that said the source of this bug is that the asynchronous works were made to take rtnl_lock() some time ago, so let's look deeper and find why it was added there. The rtnl_lock() was added to asynchronous handlers by commit 338c15 ("e1000: fix occasional panic on unload") in order to prevent asynchronous handlers from execution after the module is unloaded (e1000_down() is called) as it follows from the comment to the commit: > Net drivers in general have an issue where timers fired > by mod_timer or work threads with schedule_work are running > outside of the rtnl_lock. > > With no other lock protection these routines are vulnerable > to races with driver unload or reset paths. > > The longer term solution to this might be a redesign with > safer locks being taken in the driver to guarantee no > reentrance, but for now a safe and effective fix is > to take the rtnl_lock in these routines. I'm not sure if this locking scheme fixed the problem or just made it unlikely, although I incline to the latter. Anyway, this was long time ago when e1000 auxiliary works were implemented as timers scheduling real work handlers in their routines. The e1000_down() function only canceled the timers, but left the real handlers running if they were running, which could result in work execution after module unload. Today, the e1000 driver uses sane delayed works instead of the pair timer+work to implement its delayed asynchronous handlers, and the e1000_down() synchronously cancels all the works so that the problem that commit 338c15 tried to cope with disappeared, and we don't need any locks in the handlers any more. Moreover, any locking there can potentially result in a deadlock. So, this patch reverts commits 0ef4ee and 338c15. Fixes: 0ef4eed ("e1000: convert to private mutex from rtnl") Fixes: 338c15e ("e1000: fix occasional panic on unload") Cc: Tushar Dave <tushar.n.dave@intel.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
AshishNamdev
pushed a commit
that referenced
this pull request
Dec 22, 2013
Dave Jones reported a use after free in UDP stack : [ 5059.434216] ========================= [ 5059.434314] [ BUG: held lock freed! ] [ 5059.434420] 3.13.0-rc3+ torvalds#9 Not tainted [ 5059.434520] ------------------------- [ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there! [ 5059.434815] (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.435012] 3 locks held by named/863: [ 5059.435086] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940 [ 5059.435295] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410 [ 5059.435500] #2: (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.435734] stack backtrace: [ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ torvalds#9 [loadavg: 0.21 0.06 0.06 1/115 1365] [ 5059.436052] Hardware name: /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010 [ 5059.436223] 0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0 [ 5059.436390] ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0 [ 5059.436554] ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48 [ 5059.436718] Call Trace: [ 5059.436769] <IRQ> [<ffffffff8153a372>] dump_stack+0x4d/0x66 [ 5059.436904] [<ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160 [ 5059.437037] [<ffffffff8141c490>] ? __sk_free+0x110/0x230 [ 5059.437147] [<ffffffff8112da2a>] kmem_cache_free+0x6a/0x150 [ 5059.437260] [<ffffffff8141c490>] __sk_free+0x110/0x230 [ 5059.437364] [<ffffffff8141c5c9>] sk_free+0x19/0x20 [ 5059.437463] [<ffffffff8141cb25>] sock_edemux+0x25/0x40 [ 5059.437567] [<ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280 [ 5059.437685] [<ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.437805] [<ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240 [ 5059.437925] [<ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70 [ 5059.438038] [<ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0 [ 5059.438155] [<ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00 [ 5059.438269] [<ffffffff8149d7f5>] udp_rcv+0x15/0x20 [ 5059.438367] [<ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410 [ 5059.438492] [<ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410 [ 5059.438621] [<ffffffff81468653>] ip_local_deliver+0x43/0x80 [ 5059.438733] [<ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0 [ 5059.438843] [<ffffffff81468926>] ip_rcv+0x296/0x3f0 [ 5059.438945] [<ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940 [ 5059.439074] [<ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940 [ 5059.442231] [<ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10 [ 5059.442231] [<ffffffff81430d83>] __netif_receive_skb+0x13/0x60 [ 5059.442231] [<ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0 [ 5059.442231] [<ffffffff814334e0>] napi_gro_receive+0x70/0xa0 [ 5059.442231] [<ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169] [ 5059.442231] [<ffffffff81432bc9>] net_rx_action+0x129/0x1e0 [ 5059.442231] [<ffffffff810478cd>] __do_softirq+0xed/0x240 [ 5059.442231] [<ffffffff81047e25>] irq_exit+0x125/0x140 [ 5059.442231] [<ffffffff81004241>] do_IRQ+0x51/0xc0 [ 5059.442231] [<ffffffff81542bef>] common_interrupt+0x6f/0x6f We need to keep a reference on the socket, by using skb_steal_sock() at the right place. Note that another patch is needed to fix a race in udp_sk_rx_dst_set(), as we hold no lock protecting the dst. Fixes: 421b388 ("udp: ipv4: Add udp early demux") Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>
AshishNamdev
pushed a commit
that referenced
this pull request
Feb 22, 2014
… into fixes mvebu fixes for v3.13 (incremental #2) - allow building and booting DT and non-DT plat-orion SoCs - catch proper return value for kirkwood_pm_init() - properly check return of of_iomap to solve boot hangs (mirabox, others) - remove a compile warning on Armada 370 with non-SMP. * tag 'mvebu-fixes-3.13-2' of git://git.infradead.org/linux-mvebu: ARM: mvebu: fix compilation warning on Armada 370 (i.e. non-SMP) ARM: mvebu: Fix kernel hang in mvebu_soc_id_init() when of_iomap failed ARM: kirkwood: kirkwood_pm_init() should return void ARM: orion: provide C-style interrupt handler for MULTI_IRQ_HANDLER Signed-off-by: Olof Johansson <olof@lixom.net>
AshishNamdev
pushed a commit
that referenced
this pull request
Feb 22, 2014
sdata->u.ap.request_smps_work can’t be flushed synchronously under wdev_lock(wdev) since ieee80211_request_smps_ap_work itself locks the same lock. While at it, reset the driver_smps_mode when the ap is stopped to its default: OFF. This solves: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0-ipeer+ #2 Tainted: G O ------------------------------------------------------- rmmod/2867 is trying to acquire lock: ((&sdata->u.ap.request_smps_work)){+.+...}, at: [<c105b8d0>] flush_work+0x0/0x90 but task is already holding lock: (&wdev->mtx){+.+.+.}, at: [<f9b32626>] cfg80211_stop_ap+0x26/0x230 [cfg80211] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&wdev->mtx){+.+.+.}: [<c10aefa9>] lock_acquire+0x79/0xe0 [<c1607a1a>] mutex_lock_nested+0x4a/0x360 [<fb06288b>] ieee80211_request_smps_ap_work+0x2b/0x50 [mac80211] [<c105cdd8>] process_one_work+0x198/0x450 [<c105d469>] worker_thread+0xf9/0x320 [<c10669ff>] kthread+0x9f/0xb0 [<c1613397>] ret_from_kernel_thread+0x1b/0x28 -> #0 ((&sdata->u.ap.request_smps_work)){+.+...}: [<c10ae9df>] __lock_acquire+0x183f/0x1910 [<c10aefa9>] lock_acquire+0x79/0xe0 [<c105b917>] flush_work+0x47/0x90 [<c105d867>] __cancel_work_timer+0x67/0xe0 [<c105d90f>] cancel_work_sync+0xf/0x20 [<fb0765cc>] ieee80211_stop_ap+0x8c/0x340 [mac80211] [<f9b3268c>] cfg80211_stop_ap+0x8c/0x230 [cfg80211] [<f9b0d8f9>] cfg80211_leave+0x79/0x100 [cfg80211] [<f9b0da72>] cfg80211_netdev_notifier_call+0xf2/0x4f0 [cfg80211] [<c160f2c9>] notifier_call_chain+0x59/0x130 [<c106c6de>] __raw_notifier_call_chain+0x1e/0x30 [<c106c70f>] raw_notifier_call_chain+0x1f/0x30 [<c14f8213>] call_netdevice_notifiers_info+0x33/0x70 [<c14f8263>] call_netdevice_notifiers+0x13/0x20 [<c14f82a4>] __dev_close_many+0x34/0xb0 [<c14f83fe>] dev_close_many+0x6e/0xc0 [<c14f9c77>] rollback_registered_many+0xa7/0x1f0 [<c14f9dd4>] unregister_netdevice_many+0x14/0x60 [<fb06f4d9>] ieee80211_remove_interfaces+0xe9/0x170 [mac80211] [<fb055116>] ieee80211_unregister_hw+0x56/0x110 [mac80211] [<fa3e9396>] iwl_op_mode_mvm_stop+0x26/0xe0 [iwlmvm] [<f9b9d8ca>] _iwl_op_mode_stop+0x3a/0x70 [iwlwifi] [<f9b9d96f>] iwl_opmode_deregister+0x6f/0x90 [iwlwifi] [<fa405179>] __exit_compat+0xd/0x19 [iwlmvm] [<c10b8bf9>] SyS_delete_module+0x179/0x2b0 [<c1613421>] sysenter_do_call+0x12/0x32 Fixes: 687da13 ("mac80211: implement SMPS for AP") Cc: <stable@vger.kernel.org> [3.13] Reported-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
AshishNamdev
pushed a commit
that referenced
this pull request
Feb 22, 2014
sparc_cpu_model isn't in asm/system.h any more, so remove a comment about it. Signed-off-by: David Howells <dhowells@redhat.com> cc: "David S. Miller" <davem@davemloft.net> cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
AshishNamdev
pushed a commit
that referenced
this pull request
Feb 22, 2014
Pull sparc fixes from David Miller: "Three minor fixes from David Howells and Paul Gortmaker" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: Sparc: sparc_cpu_model isn't in asm/system.h any more [ver #2] sparc32: make copy_to/from_user_page() usable from modular code sparc32: fix build failure for arch_jump_label_transform
AshishNamdev
pushed a commit
that referenced
this pull request
Mar 23, 2014
vmxnet3's netpoll driver is incorrectly coded. It directly calls vmxnet3_do_poll, which is the driver internal napi poll routine. As the netpoll controller method doesn't block real napi polls in any way, there is a potential for race conditions in which the netpoll controller method and the napi poll method run concurrently. The result is data corruption causing panics such as this one recently observed: PID: 1371 TASK: ffff88023762caa0 CPU: 1 COMMAND: "rs:main Q:Reg" #0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b #1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92 #2 [ffff88023abd58b0] oops_end at ffffffff8152b570 #3 [ffff88023abd58e0] die at ffffffff81010e0b #4 [ffff88023abd5910] do_trap at ffffffff8152add4 #5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95 #6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b [exception RIP: vmxnet3_rq_rx_complete+1968] RIP: ffffffffa00f1e80 RSP: ffff88023abd5ac8 RFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff88023b5dcee0 RCX: 00000000000000c0 RDX: 0000000000000000 RSI: 00000000000005f2 RDI: ffff88023b5dcee0 RBP: ffff88023abd5b48 R8: 0000000000000000 R9: ffff88023a3b6048 R10: 0000000000000000 R11: 0000000000000002 R12: ffff8802398d4cd8 R13: ffff88023af35140 R14: ffff88023b60c890 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3] #8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3] torvalds#9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7 The fix is to do as other drivers do, and have the poll controller call the top half interrupt handler, which schedules a napi poll properly to recieve frames Tested by myself, successfully. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Shreyas Bhatewara <sbhatewara@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: "David S. Miller" <davem@davemloft.net> CC: stable@vger.kernel.org Reviewed-by: Shreyas N Bhatewara <sbhatewara@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
AshishNamdev
pushed a commit
that referenced
this pull request
Jul 2, 2014
…kup detector Peter Wu noticed the following splat on his machine when updating /proc/sys/kernel/watchdog_thresh: BUG: sleeping function called from invalid context at mm/slub.c:965 in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init 3 locks held by init/1: #0: (sb_writers#3){.+.+.+}, at: [<ffffffff8117b663>] vfs_write+0x143/0x180 #1: (watchdog_proc_mutex){+.+.+.}, at: [<ffffffff810e02d3>] proc_dowatchdog+0x33/0x110 #2: (cpu_hotplug.lock){.+.+.+}, at: [<ffffffff810589c2>] get_online_cpus+0x32/0x80 Preemption disabled at:[<ffffffff810e0384>] proc_dowatchdog+0xe4/0x110 CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-testing torvalds#34 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x4e/0x7a __might_sleep+0x11d/0x190 kmem_cache_alloc_trace+0x4e/0x1e0 perf_event_alloc+0x55/0x440 perf_event_create_kernel_counter+0x26/0xe0 watchdog_nmi_enable+0x75/0x140 update_timers_all_cpus+0x53/0xa0 proc_dowatchdog+0xe4/0x110 proc_sys_call_handler+0xb3/0xc0 proc_sys_write+0x14/0x20 vfs_write+0xad/0x180 SyS_write+0x49/0xb0 system_call_fastpath+0x16/0x1b NMI watchdog: disabled (cpu0): hardware events not enabled What happened is after updating the watchdog_thresh, the lockup detector is restarted to utilize the new value. Part of this process involved disabling preemption. Once preemption was disabled, perf tried to allocate a new event (as part of the restart). This caused the above BUG_ON as you can't sleep with preemption disabled. The preemption restriction seemed agressive as we are not doing anything on that particular cpu, but with all the online cpus (which are protected by the get_online_cpus lock). Remove the restriction and the BUG_ON goes away. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Michal Hocko <mhocko@suse.cz> Reported-by: Peter Wu <peter@lekensteyn.nl> Tested-by: Peter Wu <peter@lekensteyn.nl> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [3.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
AshishNamdev
pushed a commit
that referenced
this pull request
Jul 2, 2014
cxgb4_netdev maybe lead to dead lock, since it uses a spin lock, and be called in both thread and softirq context, but not disable BH, the lockdep report is below; In fact, cxgb4_netdev only reads adap_rcu_list with RCU protection, so not need to hold spin lock again. ================================= [ INFO: inconsistent lock state ] 3.14.7+ torvalds#24 Tainted: G C O --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. radvd/3794 [HC0[0]:SC1[1]:HE1:SE0] takes: (adap_rcu_lock){+.?...}, at: [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4] {SOFTIRQ-ON-W} state was registered at: [<ffffffff810fca81>] __lock_acquire+0x34a/0xe48 [<ffffffff810fd98b>] lock_acquire+0x82/0x9d [<ffffffff815d6ff8>] _raw_spin_lock+0x34/0x43 [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4] [<ffffffffa0998beb>] cxgb4_inet6addr_handler+0x117/0x12c [cxgb4] [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c [<ffffffff815da9f9>] __atomic_notifier_call_chain+0x44/0x6e [<ffffffff815daa32>] atomic_notifier_call_chain+0xf/0x11 [<ffffffff815b1356>] inet6addr_notifier_call_chain+0x16/0x18 [<ffffffffa01f72e5>] ipv6_add_addr+0x404/0x46e [ipv6] [<ffffffffa01f8df0>] addrconf_add_linklocal+0x5f/0x95 [ipv6] [<ffffffffa01fc3e9>] addrconf_notify+0x632/0x841 [ipv6] [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c [<ffffffff810e09a1>] __raw_notifier_call_chain+0x9/0xb [<ffffffff810e09b2>] raw_notifier_call_chain+0xf/0x11 [<ffffffff8151b3b7>] call_netdevice_notifiers_info+0x4e/0x56 [<ffffffff8151b3d0>] call_netdevice_notifiers+0x11/0x13 [<ffffffff8151c0a6>] netdev_state_change+0x1f/0x38 [<ffffffff8152f004>] linkwatch_do_dev+0x3b/0x49 [<ffffffff8152f184>] __linkwatch_run_queue+0x10b/0x144 [<ffffffff8152f1dd>] linkwatch_event+0x20/0x27 [<ffffffff810d7bc0>] process_one_work+0x1cb/0x2ee [<ffffffff810d7e3b>] worker_thread+0x12e/0x1fc [<ffffffff810dd391>] kthread+0xc4/0xcc [<ffffffff815dc48c>] ret_from_fork+0x7c/0xb0 irq event stamp: 3388 hardirqs last enabled at (3388): [<ffffffff810c6c85>] __local_bh_enable_ip+0xaa/0xd9 hardirqs last disabled at (3387): [<ffffffff810c6c2d>] __local_bh_enable_ip+0x52/0xd9 softirqs last enabled at (3288): [<ffffffffa01f1d5b>] rcu_read_unlock_bh+0x0/0x2f [ipv6] softirqs last disabled at (3289): [<ffffffff815ddafc>] do_softirq_own_stack+0x1c/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(adap_rcu_lock); <Interrupt> lock(adap_rcu_lock); *** DEADLOCK *** 5 locks held by radvd/3794: #0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffffa020b85a>] rawv6_sendmsg+0x74b/0xa4d [ipv6] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff8151ac6b>] rcu_lock_acquire+0x0/0x29 #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa01f4cca>] rcu_lock_acquire.constprop.16+0x0/0x30 [ipv6] #3: (rcu_read_lock){.+.+..}, at: [<ffffffff810e09b4>] rcu_lock_acquire+0x0/0x29 #4: (rcu_read_lock){.+.+..}, at: [<ffffffffa0998782>] rcu_lock_acquire.constprop.40+0x0/0x30 [cxgb4] stack backtrace: CPU: 7 PID: 3794 Comm: radvd Tainted: G C O 3.14.7+ torvalds#24 Hardware name: Supermicro X7DBU/X7DBU, BIOS 6.00 12/03/2007 ffffffff81f15990 ffff88012fdc36a8 ffffffff815d0016 0000000000000006 ffff8800c80dc2a0 ffff88012fdc3708 ffffffff815cc727 0000000000000001 0000000000000001 ffff880100000000 ffffffff81015b02 ffff8800c80dcb58 Call Trace: <IRQ> [<ffffffff815d0016>] dump_stack+0x4e/0x71 [<ffffffff815cc727>] print_usage_bug+0x1ec/0x1fd [<ffffffff81015b02>] ? save_stack_trace+0x27/0x44 [<ffffffff810fbfaa>] ? check_usage_backwards+0xa0/0xa0 [<ffffffff810fc640>] mark_lock+0x11b/0x212 [<ffffffff810fca0b>] __lock_acquire+0x2d4/0xe48 [<ffffffff810fbfaa>] ? check_usage_backwards+0xa0/0xa0 [<ffffffff810fbff6>] ? check_usage_forwards+0x4c/0xa6 [<ffffffff810c6c8a>] ? __local_bh_enable_ip+0xaf/0xd9 [<ffffffff810fd98b>] lock_acquire+0x82/0x9d [<ffffffffa09989ea>] ? clip_add+0x2c/0x116 [cxgb4] [<ffffffffa0998782>] ? rcu_read_unlock+0x23/0x23 [cxgb4] [<ffffffff815d6ff8>] _raw_spin_lock+0x34/0x43 [<ffffffffa09989ea>] ? clip_add+0x2c/0x116 [cxgb4] [<ffffffffa09987b0>] ? rcu_lock_acquire.constprop.40+0x2e/0x30 [cxgb4] [<ffffffffa0998782>] ? rcu_read_unlock+0x23/0x23 [cxgb4] [<ffffffffa09989ea>] clip_add+0x2c/0x116 [cxgb4] [<ffffffffa0998beb>] cxgb4_inet6addr_handler+0x117/0x12c [cxgb4] [<ffffffff810fd99d>] ? lock_acquire+0x94/0x9d [<ffffffff810e09b4>] ? raw_notifier_call_chain+0x11/0x11 [<ffffffff815da98b>] notifier_call_chain+0x32/0x5c [<ffffffff815da9f9>] __atomic_notifier_call_chain+0x44/0x6e [<ffffffff815daa32>] atomic_notifier_call_chain+0xf/0x11 [<ffffffff815b1356>] inet6addr_notifier_call_chain+0x16/0x18 [<ffffffffa01f72e5>] ipv6_add_addr+0x404/0x46e [ipv6] [<ffffffff810fde6a>] ? trace_hardirqs_on+0xd/0xf [<ffffffffa01fb634>] addrconf_prefix_rcv+0x385/0x6ea [ipv6] [<ffffffffa0207950>] ndisc_rcv+0x9d3/0xd76 [ipv6] [<ffffffffa020d536>] icmpv6_rcv+0x592/0x67b [ipv6] [<ffffffff810c6c85>] ? __local_bh_enable_ip+0xaa/0xd9 [<ffffffff810c6c85>] ? __local_bh_enable_ip+0xaa/0xd9 [<ffffffff810fd8dc>] ? lock_release+0x14e/0x17b [<ffffffffa020df97>] ? rcu_read_unlock+0x21/0x23 [ipv6] [<ffffffff8150df52>] ? rcu_read_unlock+0x23/0x23 [<ffffffffa01f4ede>] ip6_input_finish+0x1e4/0x2fc [ipv6] [<ffffffffa01f540b>] ip6_input+0x33/0x38 [ipv6] [<ffffffffa01f5557>] ip6_mc_input+0x147/0x160 [ipv6] [<ffffffffa01f4ba3>] ip6_rcv_finish+0x7c/0x81 [ipv6] [<ffffffffa01f5397>] ipv6_rcv+0x3a1/0x3e2 [ipv6] [<ffffffff8151ef96>] __netif_receive_skb_core+0x4ab/0x511 [<ffffffff810fdc94>] ? mark_held_locks+0x71/0x99 [<ffffffff8151f0c0>] ? process_backlog+0x69/0x15e [<ffffffff8151f045>] __netif_receive_skb+0x49/0x5b [<ffffffff8151f0cf>] process_backlog+0x78/0x15e [<ffffffff8151f571>] ? net_rx_action+0x1a2/0x1cc [<ffffffff8151f47b>] net_rx_action+0xac/0x1cc [<ffffffff810c69b7>] ? __do_softirq+0xad/0x218 [<ffffffff810c69ff>] __do_softirq+0xf5/0x218 [<ffffffff815ddafc>] do_softirq_own_stack+0x1c/0x30 <EOI> [<ffffffff810c6bb6>] do_softirq+0x38/0x5d [<ffffffffa01f1d5b>] ? ip6_copy_metadata+0x156/0x156 [ipv6] [<ffffffff810c6c78>] __local_bh_enable_ip+0x9d/0xd9 [<ffffffffa01f1d88>] rcu_read_unlock_bh+0x2d/0x2f [ipv6] [<ffffffffa01f28b4>] ip6_finish_output2+0x381/0x3d8 [ipv6] [<ffffffffa01f49ef>] ip6_finish_output+0x6e/0x73 [ipv6] [<ffffffffa01f4a70>] ip6_output+0x7c/0xa8 [ipv6] [<ffffffff815b1bfa>] dst_output+0x18/0x1c [<ffffffff815b1c9e>] ip6_local_out+0x1c/0x21 [<ffffffffa01f2489>] ip6_push_pending_frames+0x37d/0x427 [ipv6] [<ffffffff81558af8>] ? skb_orphan+0x39/0x39 [<ffffffffa020b85a>] ? rawv6_sendmsg+0x74b/0xa4d [ipv6] [<ffffffffa020ba51>] rawv6_sendmsg+0x942/0xa4d [ipv6] [<ffffffff81584cd2>] inet_sendmsg+0x3d/0x66 [<ffffffff81508930>] __sock_sendmsg_nosec+0x25/0x27 [<ffffffff8150b0d7>] sock_sendmsg+0x5a/0x7b [<ffffffff810fd8dc>] ? lock_release+0x14e/0x17b [<ffffffff8116d756>] ? might_fault+0x9e/0xa5 [<ffffffff8116d70d>] ? might_fault+0x55/0xa5 [<ffffffff81508cb1>] ? copy_from_user+0x2a/0x2c [<ffffffff8150b70c>] ___sys_sendmsg+0x226/0x2d9 [<ffffffff810fcd25>] ? __lock_acquire+0x5ee/0xe48 [<ffffffff810fde01>] ? trace_hardirqs_on_caller+0x145/0x1a1 [<ffffffff8118efcb>] ? slab_free_hook.isra.71+0x50/0x59 [<ffffffff8115c81f>] ? release_pages+0xbc/0x181 [<ffffffff810fd99d>] ? lock_acquire+0x94/0x9d [<ffffffff81115e97>] ? read_seqcount_begin.constprop.25+0x73/0x90 [<ffffffff8150c408>] __sys_sendmsg+0x3d/0x5b [<ffffffff8150c433>] SyS_sendmsg+0xd/0x19 [<ffffffff815dc53d>] system_call_fastpath+0x1a/0x1f Reported-by: Ben Greear <greearb@candelatech.com> Cc: Casey Leedom <leedom@chelsio.com> Cc: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
AshishNamdev
pushed a commit
that referenced
this pull request
Oct 2, 2014
This reverts commit 8b37e1b. It's broken as it changes led_blink_set() in a way that it can now sleep (while synchronously waiting for workqueue to be cancelled). That's a problem, because it's possible that this function gets called from atomic context (tpt_trig_timer() takes a readlock and thus disables preemption). This has been brought up 3 weeks ago already [1] but no proper fix has materialized, and I keep seeing the problem since 3.17-rc1. [1] https://lkml.org/lkml/2014/8/16/128 BUG: sleeping function called from invalid context at kernel/workqueue.c:2650 in_atomic(): 1, irqs_disabled(): 0, pid: 2335, name: wpa_supplicant 5 locks held by wpa_supplicant/2335: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c7c92>] rtnl_lock+0x12/0x20 #1: (&wdev->mtx){+.+.+.}, at: [<ffffffffc06e649c>] cfg80211_mgd_wext_siwessid+0x5c/0x180 [cfg80211] #2: (&local->mtx){+.+.+.}, at: [<ffffffffc0817dea>] ieee80211_prep_connection+0x17a/0x9a0 [mac80211] #3: (&local->chanctx_mtx){+.+.+.}, at: [<ffffffffc08081ed>] ieee80211_vif_use_channel+0x5d/0x2a0 [mac80211] #4: (&trig->leddev_list_lock){.+.+..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211] CPU: 0 PID: 2335 Comm: wpa_supplicant Not tainted 3.17.0-rc3 #1 Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008 ffff8800360b5a50 ffff8800751f76d8 ffffffff8159e97f ffff8800360b5a30 ffff8800751f76e8 ffffffff810739a5 ffff8800751f77b0 ffffffff8106862f ffffffff810685d0 0aa2209200000000 ffff880000000004 ffff8800361c59d0 Call Trace: [<ffffffff8159e97f>] dump_stack+0x4d/0x66 [<ffffffff810739a5>] __might_sleep+0xe5/0x120 [<ffffffff8106862f>] flush_work+0x5f/0x270 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90 [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211] [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211] [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211] [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211] [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211] [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211] [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211] [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211] [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211] [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211] [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211] [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211] [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211] [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211] [<ffffffffc06e36c0>] ? cfg80211_wext_giwessid+0x50/0x50 [cfg80211] [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211] [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0 [<ffffffff81584fa0>] ? ioctl_standard_iw_point+0x3e0/0x3e0 [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100 [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0 [<ffffffff814cfb29>] dev_ioctl+0x329/0x620 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0 [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520 [<ffffffff815a67fb>] ? sysret_check+0x1b/0x56 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0 [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f wlan0: send auth to 00:0b:6b:3c:8c:e4 (try 1/3) wlan0: authenticated wlan0: associate with 00:0b:6b:3c:8c:e4 (try 1/3) wlan0: RX AssocResp from 00:0b:6b:3c:8c:e4 (capab=0x431 status=0 aid=2) wlan0: associated IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready cfg80211: Calling CRDA for country: NA wlan0: Limiting TX power to 27 (27 - 0) dBm as advertised by 00:0b:6b:3c:8c:e4 ================================= [ INFO: inconsistent lock state ] 3.17.0-rc3 #1 Not tainted --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes: ((&(&led_cdev->blink_work)->work)){+.?...}, at: [<ffffffff810685d0>] flush_work+0x0/0x270 {SOFTIRQ-ON-W} state was registered at: [<ffffffff81094dbe>] __lock_acquire+0x30e/0x1a30 [<ffffffff81096c81>] lock_acquire+0x91/0x110 [<ffffffff81068608>] flush_work+0x38/0x270 [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211] [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211] [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211] [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211] [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211] [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211] [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211] [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211] [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211] [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211] [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211] [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211] [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211] [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211] [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211] [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0 [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0 [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100 [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0 [<ffffffff814cfb29>] dev_ioctl+0x329/0x620 [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0 [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520 [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0 [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f irq event stamp: 493416 hardirqs last enabled at (493416): [<ffffffff81068a5f>] __cancel_work_timer+0x6f/0x100 hardirqs last disabled at (493415): [<ffffffff81067e9f>] try_to_grab_pending+0x1f/0x160 softirqs last enabled at (493408): [<ffffffff81053ced>] _local_bh_enable+0x1d/0x50 softirqs last disabled at (493409): [<ffffffff81054c75>] irq_exit+0xa5/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock((&(&led_cdev->blink_work)->work)); <Interrupt> lock((&(&led_cdev->blink_work)->work)); *** DEADLOCK *** 2 locks held by swapper/0/0: #0: (((&tpt_trig->timer))){+.-...}, at: [<ffffffff810b4c50>] call_timer_fn+0x0/0x180 #1: (&trig->leddev_list_lock){.+.?..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211] stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0-rc3 #1 Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008 ffffffff8246eb30 ffff88007c203b00 ffffffff8159e97f ffffffff81a194c0 ffff88007c203b50 ffffffff81599c29 0000000000000001 ffffffff00000001 ffff880000000000 0000000000000006 ffffffff81a194c0 ffffffff81093ad0 Call Trace: <IRQ> [<ffffffff8159e97f>] dump_stack+0x4d/0x66 [<ffffffff81599c29>] print_usage_bug+0x1f4/0x205 [<ffffffff81093ad0>] ? check_usage_backwards+0x140/0x140 [<ffffffff810944d3>] mark_lock+0x223/0x2b0 [<ffffffff81094d60>] __lock_acquire+0x2b0/0x1a30 [<ffffffff81096c81>] lock_acquire+0x91/0x110 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff81068608>] flush_work+0x38/0x270 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90 [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff8109469d>] ? trace_hardirqs_on_caller+0xad/0x1c0 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffff810b4cc5>] call_timer_fn+0x75/0x180 [<ffffffff810b4c50>] ? process_timeout+0x10/0x10 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff810b50ac>] run_timer_softirq+0x1fc/0x2f0 [<ffffffff81054805>] __do_softirq+0x115/0x2e0 [<ffffffff81054c75>] irq_exit+0xa5/0xb0 [<ffffffff810049b3>] do_IRQ+0x53/0xf0 [<ffffffff815a74af>] common_interrupt+0x6f/0x6f <EOI> [<ffffffff8147b56e>] ? cpuidle_enter_state+0x6e/0x180 [<ffffffff8147b732>] cpuidle_enter+0x12/0x20 [<ffffffff8108bba0>] cpu_startup_entry+0x330/0x360 [<ffffffff8158fb51>] rest_init+0xc1/0xd0 [<ffffffff8158fa90>] ? csum_partial_copy_generic+0x170/0x170 [<ffffffff81af3ff2>] start_kernel+0x44f/0x45a [<ffffffff81af399c>] ? set_init_arg+0x53/0x53 [<ffffffff81af35ad>] x86_64_start_reservations+0x2a/0x2c [<ffffffff81af36a0>] x86_64_start_kernel+0xf1/0xf4 Cc: Vincent Donnefort <vdonnefort@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Bryan Wu <cooloney@gmail.com>
AshishNamdev
pushed a commit
that referenced
this pull request
Oct 2, 2014
While debugging a cpufreq-related hardware failure on a system I saw the following lockdep warning: ========================= [ BUG: held lock freed! ] 3.17.0-rc4+ #1 Tainted: G E ------------------------- insmod/2247 is freeing memory ffff88006e1b1400-ffff88006e1b17ff, with a lock still held there! (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 3 locks held by insmod/2247: #0: (subsys mutex#5){+.+.+.}, at: [<ffffffff81485579>] subsys_interface_register+0x69/0x120 #1: (cpufreq_rwsem){.+.+.+}, at: [<ffffffff8156cf73>] __cpufreq_add_dev.isra.21+0x73/0xb80 #2: (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 stack backtrace: CPU: 0 PID: 2247 Comm: insmod Tainted: G E 3.17.0-rc4+ #1 Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 08/24/2013 0000000000000000 000000008f3063c4 ffff88006f87bb30 ffffffff8171b358 ffff88006bcf3750 ffff88006f87bb68 ffffffff810e09e1 ffff88006e1b1400 ffffea0001b86c00 ffffffff8156d327 ffff880073003500 0000000000000246 Call Trace: [<ffffffff8171b358>] dump_stack+0x4d/0x66 [<ffffffff810e09e1>] debug_check_no_locks_freed+0x171/0x180 [<ffffffff8156d327>] ? __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff8121412b>] kfree+0xab/0x2b0 [<ffffffff8156d327>] __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff81724cf7>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff8156da8e>] cpufreq_add_dev+0xe/0x10 [<ffffffff814855d1>] subsys_interface_register+0xc1/0x120 [<ffffffff8156bcf2>] cpufreq_register_driver+0x112/0x340 [<ffffffff8121415a>] ? kfree+0xda/0x2b0 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffffa003562e>] pcc_cpufreq_init+0x4af/0xe81 [pcc_cpufreq] [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff81002144>] do_one_initcall+0xd4/0x210 [<ffffffff811f7472>] ? __vunmap+0xd2/0x120 [<ffffffff81127155>] load_module+0x1315/0x1b70 [<ffffffff811222a0>] ? store_uevent+0x70/0x70 [<ffffffff811229d9>] ? copy_module_from_fd.isra.44+0x129/0x180 [<ffffffff81127b86>] SyS_finit_module+0xa6/0xd0 [<ffffffff81725b69>] system_call_fastpath+0x16/0x1b cpufreq: __cpufreq_add_dev: ->get() failed insmod: ERROR: could not insert module pcc-cpufreq.ko: No such device The warning occurs in the __cpufreq_add_dev() code which does down_write(&policy->rwsem); ... if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { policy->cur = cpufreq_driver->get(policy->cpu); if (!policy->cur) { pr_err("%s: ->get() failed\n", __func__); goto err_get_freq; } If cpufreq_driver->get(policy->cpu) returns an error we execute the code at err_get_freq, which does not up the policy->rwsem. This causes the lockdep warning. Trivial patch to up the policy->rwsem in the error path. After the patch has been applied, and an error occurs in the cpufreq_driver->get(policy->cpu) call we will now see cpufreq: __cpufreq_add_dev: ->get() failed cpufreq: __cpufreq_add_dev: ->get() failed modprobe: ERROR: could not insert 'pcc_cpufreq': No such device Fixes: 4e97b63 (cpufreq: Initialize governor for a new policy under policy->rwsem) Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.14+ <stable@vger.kernel.org> # 3.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
updating to latest commits