writing to rootfs causes system crash with the branch block-mpage-bvecs-for-next #2

dongsupark · 2014-12-07T16:14:03Z

When being tested with the branch block-mpage-bvecs-for-next, the kernel crash immediately after writing some blocks to the root filesystem. Have seen this on a QEMU VM with ext4 root filesystem.

The cx23885 driver still used sg++ instead of sg = sg_next(sg). This worked with vb1 since that filled in the sglist manually, page-by-page, but it fails with vb2 which uses core scatterlist code that can combine contiguous scatterlist entries into one larger entry. This bug led to the following crash as reported by Mariusz: [20712.990258] BUG: Bad page state in process vb2-cx23885[0] pfn:2ca34 [20712.990265] page:ffffea00009c3b60 count:-1 mapcount:0 mapping: (null) index:0x0 [20712.990266] flags: 0x4000000000000000() [20712.990268] page dumped because: nonzero _count [20712.990269] Modules linked in: tun binfmt_misc nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_mark xt_REDIRECT xt_limit xt_conntrack xt_nat xt_tcpudp iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables sit ip_tunnel nvidia(PO) stb6100 stv090x cx88_dvb videobuf_dvb cx88_vp3054_i2c tuner kvm_amd kvm cx8802 k10temp cx8800 cx88xx btcx_risc videobuf_dma_sg videobuf_core usb_storage ds2490 usbhid ftdi_sio cx23885 tveeprom cx2341x videobuf2_dvb videobuf2_core videobuf2_dma_sg videobuf2_memops asus_atk0110 snd_emu10k1 snd_hwdep snd_util_mem snd_ac97_codec ac97_bus snd_rawmidi snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd w1_therm wire ipv6 [20712.990301] CPU: 2 PID: 26942 Comm: vb2-cx23885[0] Tainted: P B W O 3.18.0-rc5-00001-gb3652d1 #2 [20712.990303] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 2105 07/23/2010 [20712.990305] ffffffff81765734 ffff880137683a78 ffffffff815b6b32 0000000000000006 [20712.990307] ffffea00009c3b60 ffff880137683aa8 ffffffff8108ec27 ffffffff81765712 [20712.990309] ffffffff8189c840 0000000000000246 ffffea00009c3b60 ffff880137683b78 [20712.990312] Call Trace: [20712.990317] [<ffffffff815b6b32>] dump_stack+0x46/0x58 [20712.990321] [<ffffffff8108ec27>] bad_page+0xe9/0x107 [20712.990323] [<ffffffff810912ca>] get_page_from_freelist+0x3b2/0x505 [20712.990326] [<ffffffff8109150a>] __alloc_pages_nodemask+0xed/0x65f [20712.990330] [<ffffffff81047a52>] ? ttwu_do_activate.constprop.78+0x57/0x5c [20712.990332] [<ffffffff81049ff3>] ? try_to_wake_up+0x21b/0x22d [20712.990336] [<ffffffff810070f4>] dma_generic_alloc_coherent+0x6e/0xf5 [20712.990339] [<ffffffff810261a9>] gart_alloc_coherent+0x105/0x114 [20712.990341] [<ffffffff81025963>] ? flush_gart+0x39/0x3d [20712.990343] [<ffffffff810260a4>] ? gart_map_sg+0x3a0/0x3a0 [20712.990349] [<ffffffffa0141a1e>] cx23885_risc_databuffer+0xa7/0x133 [cx23885] [20712.990354] [<ffffffffa0142764>] cx23885_buf_prepare+0x121/0x134 [cx23885] [20712.990359] [<ffffffffa0144210>] buffer_prepare+0x14/0x16 [cx23885] [20712.990363] [<ffffffffa011f101>] __buf_prepare+0x190/0x279 [videobuf2_core] [20712.990366] [<ffffffffa011d906>] ? vb2_queue_or_prepare_buf+0xb8/0xc0 [videobuf2_core] [20712.990369] [<ffffffffa011f34b>] vb2_internal_qbuf+0x51/0x1e5 [videobuf2_core] [20712.990372] [<ffffffffa0120537>] vb2_thread+0x199/0x1f6 [videobuf2_core] [20712.990376] [<ffffffffa012039e>] ? vb2_fop_write+0xdf/0xdf [videobuf2_core] [20712.990379] [<ffffffff81043e61>] kthread+0xdf/0xe7 [20712.990381] [<ffffffff81043d82>] ? kthread_create_on_node+0x16d/0x16d [20712.990384] [<ffffffff815bd46c>] ret_from_fork+0x7c/0xb0 [20712.990386] [<ffffffff81043d82>] ? kthread_create_on_node+0x16d/0x16d Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Reported-by: Mariusz Bialonczyk <manio@skyboo.net> Tested-by: Mariusz Bialonczyk <manio@skyboo.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

…nux/kernel/git/aegl/linux Pull pstore update #2 from Tony Luck: "Couple of pstore-ram enhancements to allow use of different memory attributes" * tag 'please-pull-morepstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: pstore-ram: Allow optional mapping with pgprot_noncached pstore-ram: Fix hangs by using write-combine mappings

Fix the following lockdep report: ============================================= [ INFO: possible recursive locking detected ] 3.17.0+ #3 Tainted: G E --------------------------------------------- kworker/u64:3/299 is trying to acquire lock: (&epc->mutex){+.+.+.}, at: [<ffffffffa074e07a>] process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] but task is already holding lock: (&epc->mutex){+.+.+.}, at: [<ffffffffa074e34e>] rx_data+0x9e/0x1f0 [iw_cxgb4] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&epc->mutex); lock(&epc->mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u64:3/299: #0: ("%s""iw_cxgb4"){.+.+.+}, at: [<ffffffff8106f14d>] process_one_work+0x13d/0x4d0 #1: (skb_work){+.+.+.}, at: [<ffffffff8106f14d>] process_one_work+0x13d/0x4d0 #2: (&epc->mutex){+.+.+.}, at: [<ffffffffa074e34e>] rx_data+0x9e/0x1f0 [iw_cxgb4] stack backtrace: CPU: 2 PID: 299 Comm: kworker/u64:3 Tainted: G E 3.17.0+ #3 Hardware name: Dell Inc. PowerEdge T110/0X744K, BIOS 1.2.1 01/28/2010 Workqueue: iw_cxgb4 process_work [iw_cxgb4] ffff8800b91593d0 ffff8800b8a2f9f8 ffffffff815df107 0000000000000001 ffff8800b9158750 ffff8800b8a2fa28 ffffffff8109f0e2 ffff8800bb768a00 ffff8800b91593d0 ffff8800b9158750 0000000000000000 ffff8800b8a2fa88 Call Trace: [<ffffffff815df107>] dump_stack+0x49/0x62 [<ffffffff8109f0e2>] print_deadlock_bug+0xf2/0x100 [<ffffffff810a0f04>] validate_chain+0x454/0x700 [<ffffffff810a1574>] __lock_acquire+0x3c4/0x580 [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] [<ffffffff810a17cc>] lock_acquire+0x9c/0x110 [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] [<ffffffff815e111b>] mutex_lock_nested+0x4b/0x360 [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] [<ffffffff810c181a>] ? del_timer_sync+0xaa/0xd0 [<ffffffff810c1770>] ? try_to_del_timer_sync+0x70/0x70 [<ffffffffa074e07a>] process_mpa_request+0x1aa/0x3e0 [iw_cxgb4] [<ffffffffa074a3ec>] ? update_rx_credits+0xec/0x140 [iw_cxgb4] [<ffffffffa074e381>] rx_data+0xd1/0x1f0 [iw_cxgb4] [<ffffffff8109ff23>] ? mark_held_locks+0x73/0xa0 [<ffffffff815e4b90>] ? _raw_spin_unlock_irqrestore+0x40/0x70 [<ffffffff810a020d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff810a02dd>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffa074c931>] process_work+0x51/0x80 [iw_cxgb4] [<ffffffff8106f1c8>] process_one_work+0x1b8/0x4d0 [<ffffffff8106f14d>] ? process_one_work+0x13d/0x4d0 [<ffffffff8106f600>] worker_thread+0x120/0x3c0 [<ffffffff8106f4e0>] ? process_one_work+0x4d0/0x4d0 [<ffffffff81074a0e>] kthread+0xde/0x100 [<ffffffff815e4b40>] ? _raw_spin_unlock_irq+0x30/0x40 [<ffffffff81074930>] ? __init_kthread_worker+0x70/0x70 [<ffffffff815e512c>] ret_from_fork+0x7c/0xb0 [<ffffffff81074930>] ? __init_kthread_worker+0x70/0x70 Based on original work by Steve Wise <swise@opengridcomputing.com>. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>

…/git/viro/vfs Pull vfs pile #2 from Al Viro: "Next pile (and there'll be one or two more). The large piece in this one is getting rid of /proc/*/ns/* weirdness; among other things, it allows to (finally) make nameidata completely opaque outside of fs/namei.c, making for easier further cleanups in there" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: coda_venus_readdir(): use file_inode() fs/namei.c: fold link_path_walk() call into path_init() path_init(): don't bother with LOOKUP_PARENT in argument fs/namei.c: new helper (path_cleanup()) path_init(): store the "base" pointer to file in nameidata itself make default ->i_fop have ->open() fail with ENXIO make nameidata completely opaque outside of fs/namei.c kill proc_ns completely take the targets of /proc/*/ns/* symlinks to separate fs bury struct proc_ns in fs/proc copy address of proc_ns_ops into ns_common new helpers: ns_alloc_inum/ns_free_inum make proc_ns_operations work with struct ns_common * instead of void * switch the rest of proc_ns_operations to working with &...->ns netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns common object embedded into various struct ....ns

Currently the calculations of stolen time for PPC Book3S HV guests uses fields in both the vcpu struct and the kvmppc_vcore struct. The fields in the kvmppc_vcore struct are protected by the vcpu->arch.tbacct_lock of the vcpu that has taken responsibility for running the virtual core. This works correctly but confuses lockdep, because it sees that the code takes the tbacct_lock for a vcpu in kvmppc_remove_runnable() and then takes another vcpu's tbacct_lock in vcore_stolen_time(), and it thinks there is a possibility of deadlock, causing it to print reports like this: ============================================= [ INFO: possible recursive locking detected ] 3.18.0-rc7-kvm-00016-g8db4bc6 torvalds#89 Not tainted --------------------------------------------- qemu-system-ppc/6188 is trying to acquire lock: (&(&vcpu->arch.tbacct_lock)->rlock){......}, at: [<d00000000ecb1fe8>] .vcore_stolen_time+0x48/0xd0 [kvm_hv] but task is already holding lock: (&(&vcpu->arch.tbacct_lock)->rlock){......}, at: [<d00000000ecb25a0>] .kvmppc_remove_runnable.part.3+0x30/0xd0 [kvm_hv] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&vcpu->arch.tbacct_lock)->rlock); lock(&(&vcpu->arch.tbacct_lock)->rlock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by qemu-system-ppc/6188: #0: (&vcpu->mutex){+.+.+.}, at: [<d00000000eb93f98>] .vcpu_load+0x28/0xe0 [kvm] #1: (&(&vcore->lock)->rlock){+.+...}, at: [<d00000000ecb41b0>] .kvmppc_vcpu_run_hv+0x530/0x1530 [kvm_hv] #2: (&(&vcpu->arch.tbacct_lock)->rlock){......}, at: [<d00000000ecb25a0>] .kvmppc_remove_runnable.part.3+0x30/0xd0 [kvm_hv] stack backtrace: CPU: 40 PID: 6188 Comm: qemu-system-ppc Not tainted 3.18.0-rc7-kvm-00016-g8db4bc6 torvalds#89 Call Trace: [c000000b2754f3f0] [c000000000b31b6c] .dump_stack+0x88/0xb4 (unreliable) [c000000b2754f470] [c0000000000faeb8] .__lock_acquire+0x1878/0x2190 [c000000b2754f600] [c0000000000fbf0c] .lock_acquire+0xcc/0x1a0 [c000000b2754f6d0] [c000000000b2954c] ._raw_spin_lock_irq+0x4c/0x70 [c000000b2754f760] [d00000000ecb1fe8] .vcore_stolen_time+0x48/0xd0 [kvm_hv] [c000000b2754f7f0] [d00000000ecb25b4] .kvmppc_remove_runnable.part.3+0x44/0xd0 [kvm_hv] [c000000b2754f880] [d00000000ecb43ec] .kvmppc_vcpu_run_hv+0x76c/0x1530 [kvm_hv] [c000000b2754f9f0] [d00000000eb9f46c] .kvmppc_vcpu_run+0x2c/0x40 [kvm] [c000000b2754fa60] [d00000000eb9c9a4] .kvm_arch_vcpu_ioctl_run+0x54/0x160 [kvm] [c000000b2754faf0] [d00000000eb94538] .kvm_vcpu_ioctl+0x498/0x760 [kvm] [c000000b2754fcb0] [c000000000267eb4] .do_vfs_ioctl+0x444/0x770 [c000000b2754fd90] [c0000000002682a4] .SyS_ioctl+0xc4/0xe0 [c000000b2754fe30] [c0000000000092e4] syscall_exit+0x0/0x98 In order to make the locking easier to analyse, we change the code to use a spinlock in the kvmppc_vcore struct to protect the stolen_tb and preempt_tb fields. This lock needs to be an irq-safe lock since it is used in the kvmppc_core_vcpu_load_hv() and kvmppc_core_vcpu_put_hv() functions, which are called with the scheduler rq lock held, which is an irq-safe lock. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>

David reported that perf can segfault when adding an uprobe event like this: $ perf probe -x /lib64/libc-2.14.90.so -a 'malloc size=%di' (gdb) bt #0 parse_eh_frame_hdr (hdr=0x0, hdr_size=2596, hdr_vaddr=71788, ehdr=0x7fffffffd390, eh_frame_vaddr= 0x7fffffffd378, table_entries=0x8808d8, table_encoding=0x8808e0 "") at dwarf_getcfi_elf.c:79 #1 0x000000385f81615a in getcfi_scn_eh_frame (hdr_vaddr=71788, hdr_scn=0x8839b0, shdr=0x7fffffffd2f0, scn=<optimized out>, ehdr=0x7fffffffd390, elf=0x882b30) at dwarf_getcfi_elf.c:231 #2 getcfi_shdr (ehdr=0x7fffffffd390, elf=0x882b30) at dwarf_getcfi_elf.c:283 #3 dwarf_getcfi_elf (elf=0x882b30) at dwarf_getcfi_elf.c:309 #4 0x00000000004d5bac in debuginfo__find_probes (pf=0x7fffffffd4f0, dbg=Unhandled dwarf expression opcode 0xfa) at util/probe-finder.c:993 #5 0x00000000004d634a in debuginfo__find_trace_events (dbg=0x880840, pev=<optimized out>, tevs=0x880f88, max_tevs=<optimized out>) at util/probe-finder.c:1200 torvalds#6 0x00000000004aed6b in try_to_find_probe_trace_events (target=0x881b20 "/lib64/libpthread-2.14.90.so", max_tevs=128, tevs=0x880f88, pev=0x859b30) at util/probe-event.c:482 torvalds#7 convert_to_probe_trace_events (target=0x881b20 "/lib64/libpthread-2.14.90.so", max_tevs=128, tevs=0x880f88, pev=0x859b30) at util/probe-event.c:2356 torvalds#8 add_perf_probe_events (pevs=<optimized out>, npevs=1, max_tevs=128, target=0x881b20 "/lib64/libpthread-2.14.90.so", force_add=false) at util/probe-event.c:2391 torvalds#9 0x000000000044014f in __cmd_probe (argc=<optimized out>, argv=0x7fffffffe2f0, prefix=Unhandled dwarf expression opcode 0xfa) at at builtin-probe.c:488 torvalds#10 0x0000000000440313 in cmd_probe (argc=5, argv=0x7fffffffe2f0, prefix=<optimized out>) at builtin-probe.c:506 torvalds#11 0x000000000041d133 in run_builtin (p=0x805680, argc=5, argv=0x7fffffffe2f0) at perf.c:341 torvalds#12 0x000000000041c8b2 in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:400 torvalds#13 run_argv (argv=<optimized out>, argcp=<optimized out>) at perf.c:444 torvalds#14 main (argc=5, argv=0x7fffffffe2f0) at perf.c:559 And I found a related commit (5704c8c4fa71 "getcfi_scn_eh_frame: Don't crash and burn when .eh_frame bits aren't there.") in elfutils that can lead to a unexpected crash like this. To safely use the function, it needs to check the .eh_frame section is a PROGBITS type. Reported-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David Ahern <dsahern@gmail.com> Cc: Mark Wielaard <mjw@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20141230090533.GH6081@sejong Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Kirill A. Shutemov says: This simple test-case trigers few locking asserts in kernel: int main(int argc, char **argv) { unsigned int block_size = 16 * 4096; struct nl_mmap_req req = { .nm_block_size = block_size, .nm_block_nr = 64, .nm_frame_size = 16384, .nm_frame_nr = 64 * block_size / 16384, }; unsigned int ring_size; int fd; fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC); if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0) exit(1); if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0) exit(1); ring_size = req.nm_block_nr * req.nm_block_size; mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); return 0; } +++ exited with 0 +++ BUG: sleeping function called from invalid context at /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616 in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init 3 locks held by init/1: #0: (reboot_mutex){+.+...}, at: [<ffffffff81080959>] SyS_reboot+0xa9/0x220 #1: ((reboot_notifier_list).rwsem){.+.+..}, at: [<ffffffff8107f379>] __blocking_notifier_call_chain+0x39/0x70 #2: (rcu_callback){......}, at: [<ffffffff810d32e0>] rcu_do_batch.isra.49+0x160/0x10c0 Preemption disabled at:[<ffffffff8145365f>] __delay+0xf/0x20 CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-00009-gbddf4c4818e0 torvalds#253 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014 ffff88017b3d8000 ffff88027bc03c38 ffffffff81929ceb 0000000000000102 0000000000000000 ffff88027bc03c68 ffffffff81085a9d 0000000000000002 ffffffff81ca2a20 0000000000000268 0000000000000000 ffff88027bc03c98 Call Trace: <IRQ> [<ffffffff81929ceb>] dump_stack+0x4f/0x7b [<ffffffff81085a9d>] ___might_sleep+0x16d/0x270 [<ffffffff81085bed>] __might_sleep+0x4d/0x90 [<ffffffff8192e96f>] mutex_lock_nested+0x2f/0x430 [<ffffffff81932fed>] ? _raw_spin_unlock_irqrestore+0x5d/0x80 [<ffffffff81464143>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffff8182fc3d>] netlink_set_ring+0x1ed/0x350 [<ffffffff8182e000>] ? netlink_undo_bind+0x70/0x70 [<ffffffff8182fe20>] netlink_sock_destruct+0x80/0x150 [<ffffffff817e484d>] __sk_free+0x1d/0x160 [<ffffffff817e49a9>] sk_free+0x19/0x20 [..] Cong Wang says: We can't hold mutex lock in a rcu callback, [..] Thomas Graf says: The socket should be dead at this point. It might be simpler to add a netlink_release_ring() function which doesn't require locking at all. Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name> Diagnosed-by: Cong Wang <cwang@twopensource.com> Suggested-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>

In dev_queue_xmit() net_cls protected with rcu-bh. [ 270.730026] =============================== [ 270.730029] [ INFO: suspicious RCU usage. ] [ 270.730033] 4.2.0-rc3+ #2 Not tainted [ 270.730036] ------------------------------- [ 270.730040] include/linux/cgroup.h:353 suspicious rcu_dereference_check() usage! [ 270.730041] other info that might help us debug this: [ 270.730043] rcu_scheduler_active = 1, debug_locks = 1 [ 270.730045] 2 locks held by dhclient/748: [ 270.730046] #0: (rcu_read_lock_bh){......}, at: [<ffffffff81682b70>] __dev_queue_xmit+0x50/0x960 [ 270.730085] #1: (&qdisc_tx_lock){+.....}, at: [<ffffffff81682d60>] __dev_queue_xmit+0x240/0x960 [ 270.730090] stack backtrace: [ 270.730096] CPU: 0 PID: 748 Comm: dhclient Not tainted 4.2.0-rc3+ #2 [ 270.730098] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011 [ 270.730100] 0000000000000001 ffff8800bafeba58 ffffffff817ad487 0000000000000007 [ 270.730103] ffff880232a0a780 ffff8800bafeba88 ffffffff810ca4f2 ffff88022fb23e00 [ 270.730105] ffff880232a0a780 ffff8800bafebb68 ffff8800bafebb68 ffff8800bafebaa8 [ 270.730108] Call Trace: [ 270.730121] [<ffffffff817ad487>] dump_stack+0x4c/0x65 [ 270.730148] [<ffffffff810ca4f2>] lockdep_rcu_suspicious+0xe2/0x120 [ 270.730153] [<ffffffff816a62d2>] task_cls_state+0x92/0xa0 [ 270.730158] [<ffffffffa00b534f>] cls_cgroup_classify+0x4f/0x120 [cls_cgroup] [ 270.730164] [<ffffffff816aac74>] tc_classify_compat+0x74/0xc0 [ 270.730166] [<ffffffff816ab573>] tc_classify+0x33/0x90 [ 270.730170] [<ffffffffa00bcb0a>] htb_enqueue+0xaa/0x4a0 [sch_htb] [ 270.730172] [<ffffffff81682e26>] __dev_queue_xmit+0x306/0x960 [ 270.730174] [<ffffffff81682b70>] ? __dev_queue_xmit+0x50/0x960 [ 270.730176] [<ffffffff816834a3>] dev_queue_xmit_sk+0x13/0x20 [ 270.730185] [<ffffffff81787770>] dev_queue_xmit+0x10/0x20 [ 270.730187] [<ffffffff8178b91c>] packet_snd.isra.62+0x54c/0x760 [ 270.730190] [<ffffffff8178be25>] packet_sendmsg+0x2f5/0x3f0 [ 270.730203] [<ffffffff81665245>] ? sock_def_readable+0x5/0x190 [ 270.730210] [<ffffffff817b64bb>] ? _raw_spin_unlock+0x2b/0x40 [ 270.730216] [<ffffffff8173bcbc>] ? unix_dgram_sendmsg+0x5cc/0x640 [ 270.730219] [<ffffffff8165f367>] sock_sendmsg+0x47/0x50 [ 270.730221] [<ffffffff8165f42f>] sock_write_iter+0x7f/0xd0 [ 270.730232] [<ffffffff811fd4c7>] __vfs_write+0xa7/0xf0 [ 270.730234] [<ffffffff811fe5b8>] vfs_write+0xb8/0x190 [ 270.730236] [<ffffffff811fe8c2>] SyS_write+0x52/0xb0 [ 270.730239] [<ffffffff817b6bae>] entry_SYSCALL_64_fastpath+0x12/0x76 Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net>

The cgroup attaches inode->i_wb via mark_inode_dirty and when set_page_writeback is called, __inc_wb_stat() updates i_wb's stat. So, we need to explicitly call set_page_dirty->__mark_inode_dirty in prior to any writebacking pages. This patch should resolve the following kernel panic reported by Andreas Reis. https://bugzilla.kernel.org/show_bug.cgi?id=101801 --- Comment #2 from Andreas Reis <andreas.reis@gmail.com> --- BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 IP: [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90 PGD 2951ff067 PUD 2df43f067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 7 PID: 10356 Comm: gcc Tainted: G W 4.2.0-1-cu #1 Hardware name: Gigabyte Technology Co., Ltd. G1.Sniper M5/G1.Sniper M5, BIOS T01 02/03/2015 task: ffff880295044f80 ti: ffff880295140000 task.ti: ffff880295140000 RIP: 0010:[<ffffffff8149deea>] [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90 RSP: 0018:ffff880295143ac8 EFLAGS: 00010082 RAX: 0000000000000003 RBX: ffffea000a526d40 RCX: 0000000000000001 RDX: 0000000000000020 RSI: 0000000000000001 RDI: 0000000000000088 RBP: ffff880295143ae8 R08: 0000000000000000 R09: ffff88008f69bb30 R10: 00000000fffffffa R11: 0000000000000000 R12: 0000000000000088 R13: 0000000000000001 R14: ffff88041d099000 R15: ffff880084a205d0 FS: 00007f8549374700(0000) GS:ffff88042f3c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a8 CR3: 000000033e1d5000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Stack: 0000000000000000 ffffea000a526d40 ffff880084a20738 ffff880084a20750 ffff880295143b48 ffffffff811cc91e ffff880000000000 0000000000000296 0000000000000000 ffff880417090198 0000000000000000 ffffea000a526d40 Call Trace: [<ffffffff811cc91e>] __test_set_page_writeback+0xde/0x1d0 [<ffffffff813fee87>] do_write_data_page+0xe7/0x3a0 [<ffffffff813faeea>] gc_data_segment+0x5aa/0x640 [<ffffffff813fb0b8>] do_garbage_collect+0x138/0x150 [<ffffffff813fb3fe>] f2fs_gc+0x1be/0x3e0 [<ffffffff81405541>] f2fs_balance_fs+0x81/0x90 [<ffffffff813ee357>] f2fs_unlink+0x47/0x1d0 [<ffffffff81239329>] vfs_unlink+0x109/0x1b0 [<ffffffff8123e3d7>] do_unlinkat+0x287/0x2c0 [<ffffffff8123ebc6>] SyS_unlink+0x16/0x20 [<ffffffff81942e2e>] entry_SYSCALL_64_fastpath+0x12/0x71 Code: 41 5e 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 49 89 f5 41 54 49 89 fc 53 48 83 ec 08 65 ff 05 e6 d9 b6 7e <48> 8b 47 20 48 63 ca 65 8b 18 48 63 db 48 01 f3 48 39 cb 7d 0a RIP [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90 RSP <ffff880295143ac8> CR2: 00000000000000a8 ---[ end trace 5132449a58ed93a3 ]--- note: gcc[10356] exited with preempt_count 2 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Hayes Wang says: ==================== r8152: issues fix v2: Replace patch #2 with "r8152: fix wakeup settings". v1: These patches are used to fix issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>

Hayes Wang says: ==================== r8152: device reset v3: For patch #2, remove cancel_delayed_work(). v2: For patch #1, remove usb_autopm_get_interface(), usb_autopm_put_interface(), and the checking of intf->condition. For patch #2, replace the original method with usb_queue_reset_device() to reset the device. v1: Although the driver works normally, we find the device may get all 0xff data when transmitting packets on certain platforms. It would break the device and no packet could be transmitted. The reset is necessary to recover the hw for this situation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>

Nikolay has reported a hang when a memcg reclaim got stuck with the following backtrace: PID: 18308 TASK: ffff883d7c9b0a30 CPU: 1 COMMAND: "rsync" #0 __schedule at ffffffff815ab152 #1 schedule at ffffffff815ab76e #2 schedule_timeout at ffffffff815ae5e5 #3 io_schedule_timeout at ffffffff815aad6a #4 bit_wait_io at ffffffff815abfc6 #5 __wait_on_bit at ffffffff815abda5 torvalds#6 wait_on_page_bit at ffffffff8111fd4f torvalds#7 shrink_page_list at ffffffff81135445 torvalds#8 shrink_inactive_list at ffffffff81135845 torvalds#9 shrink_lruvec at ffffffff81135ead torvalds#10 shrink_zone at ffffffff811360c3 torvalds#11 shrink_zones at ffffffff81136eff torvalds#12 do_try_to_free_pages at ffffffff8113712f torvalds#13 try_to_free_mem_cgroup_pages at ffffffff811372be torvalds#14 try_charge at ffffffff81189423 torvalds#15 mem_cgroup_try_charge at ffffffff8118c6f5 torvalds#16 __add_to_page_cache_locked at ffffffff8112137d torvalds#17 add_to_page_cache_lru at ffffffff81121618 torvalds#18 pagecache_get_page at ffffffff8112170b torvalds#19 grow_dev_page at ffffffff811c8297 torvalds#20 __getblk_slow at ffffffff811c91d6 torvalds#21 __getblk_gfp at ffffffff811c92c1 torvalds#22 ext4_ext_grow_indepth at ffffffff8124565c torvalds#23 ext4_ext_create_new_leaf at ffffffff81246ca8 torvalds#24 ext4_ext_insert_extent at ffffffff81246f09 torvalds#25 ext4_ext_map_blocks at ffffffff8124a848 torvalds#26 ext4_map_blocks at ffffffff8121a5b7 torvalds#27 mpage_map_one_extent at ffffffff8121b1fa torvalds#28 mpage_map_and_submit_extent at ffffffff8121f07b torvalds#29 ext4_writepages at ffffffff8121f6d5 torvalds#30 do_writepages at ffffffff8112c490 torvalds#31 __filemap_fdatawrite_range at ffffffff81120199 torvalds#32 filemap_flush at ffffffff8112041c torvalds#33 ext4_alloc_da_blocks at ffffffff81219da1 torvalds#34 ext4_rename at ffffffff81229b91 torvalds#35 ext4_rename2 at ffffffff81229e32 torvalds#36 vfs_rename at ffffffff811a08a5 torvalds#37 SYSC_renameat2 at ffffffff811a3ffc torvalds#38 sys_renameat2 at ffffffff811a408e torvalds#39 sys_rename at ffffffff8119e51e torvalds#40 system_call_fastpath at ffffffff815afa89 Dave Chinner has properly pointed out that this is a deadlock in the reclaim code because ext4 doesn't submit pages which are marked by PG_writeback right away. The heuristic was introduced by commit e62e384 ("memcg: prevent OOM with too many dirty pages") and it was applied only when may_enter_fs was specified. The code has been changed by c3b94f4 ("memcg: further prevent OOM with too many dirty pages") which has removed the __GFP_FS restriction with a reasoning that we do not get into the fs code. But this is not sufficient apparently because the fs doesn't necessarily submit pages marked PG_writeback for IO right away. ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily submit the bio. Instead it tries to map more pages into the bio and mpage_map_one_extent might trigger memcg charge which might end up waiting on a page which is marked PG_writeback but hasn't been submitted yet so we would end up waiting for something that never finishes. Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2) before we go to wait on the writeback. The page fault path, which is the only path that triggers memcg oom killer since 3.12, shouldn't require GFP_NOFS and so we shouldn't reintroduce the premature OOM killer issue which was originally addressed by the heuristic. As per David Chinner the xfs is doing similar thing since 2.6.15 already so ext4 is not the only affected filesystem. Moreover he notes: : For example: IO completion might require unwritten extent conversion : which executes filesystem transactions and GFP_NOFS allocations. The : writeback flag on the pages can not be cleared until unwritten : extent conversion completes. Hence memory reclaim cannot wait on : page writeback to complete in GFP_NOFS context because it is not : safe to do so, memcg reclaim or otherwise. Cc: stable@vger.kernel.org # 3.9+ [tytso@mit.edu: corrected the control flow] Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages") Reported-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

The shm implementation internally uses shmem or hugetlbfs inodes for shm segments. As these inodes are never directly exposed to userspace and only accessed through the shm operations which are already hooked by security modules, mark the inodes with the S_PRIVATE flag so that inode security initialization and permission checking is skipped. This was motivated by the following lockdep warning: ====================================================== [ INFO: possible circular locking dependency detected ] 4.2.0-0.rc3.git0.1.fc24.x86_64+debug #1 Tainted: G W ------------------------------------------------------- httpd/1597 is trying to acquire lock: (&ids->rwsem){+++++.}, at: shm_close+0x34/0x130 but task is already holding lock: (&mm->mmap_sem){++++++}, at: SyS_shmdt+0x4b/0x180 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&mm->mmap_sem){++++++}: lock_acquire+0xc7/0x270 __might_fault+0x7a/0xa0 filldir+0x9e/0x130 xfs_dir2_block_getdents.isra.12+0x198/0x1c0 [xfs] xfs_readdir+0x1b4/0x330 [xfs] xfs_file_readdir+0x2b/0x30 [xfs] iterate_dir+0x97/0x130 SyS_getdents+0x91/0x120 entry_SYSCALL_64_fastpath+0x12/0x76 -> #2 (&xfs_dir_ilock_class){++++.+}: lock_acquire+0xc7/0x270 down_read_nested+0x57/0xa0 xfs_ilock+0x167/0x350 [xfs] xfs_ilock_attr_map_shared+0x38/0x50 [xfs] xfs_attr_get+0xbd/0x190 [xfs] xfs_xattr_get+0x3d/0x70 [xfs] generic_getxattr+0x4f/0x70 inode_doinit_with_dentry+0x162/0x670 sb_finish_set_opts+0xd9/0x230 selinux_set_mnt_opts+0x35c/0x660 superblock_doinit+0x77/0xf0 delayed_superblock_init+0x10/0x20 iterate_supers+0xb3/0x110 selinux_complete_init+0x2f/0x40 security_load_policy+0x103/0x600 sel_write_load+0xc1/0x750 __vfs_write+0x37/0x100 vfs_write+0xa9/0x1a0 SyS_write+0x58/0xd0 entry_SYSCALL_64_fastpath+0x12/0x76 ... Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Reported-by: Morten Stevens <mstevens@fedoraproject.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

It turns out that a PV domU also requires the "Xen PV" APIC driver. Otherwise, the flat driver is used and we get stuck in busy loops that never exit, such as in this stack trace: (gdb) target remote localhost:9999 Remote debugging using localhost:9999 __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56 56 while (native_apic_mem_read(APIC_ICR) & APIC_ICR_BUSY) (gdb) bt #0 __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56 #1 __default_send_IPI_shortcut (shortcut=<optimized out>, dest=<optimized out>, vector=<optimized out>) at ./arch/x86/include/asm/ipi.h:75 #2 apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54 #3 0xffffffff81011336 in arch_irq_work_raise () at arch/x86/kernel/irq_work.c:47 #4 0xffffffff8114990c in irq_work_queue (work=0xffff88000fc0e400) at kernel/irq_work.c:100 #5 0xffffffff8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633 torvalds#6 0xffffffff8110ca60 in vprintk_emit (facility=0, level=<optimized out>, dict=0x0 <irq_stack_union>, dictlen=<optimized out>, fmt=<optimized out>, args=<optimized out>) at kernel/printk/printk.c:1778 torvalds#7 0xffffffff816010c8 in printk (fmt=<optimized out>) at kernel/printk/printk.c:1868 torvalds#8 0xffffffffc00013ea in ?? () torvalds#9 0x0000000000000000 in ?? () Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Orabug: 20189959 PID: 614 TASK: ffff882a739da580 CPU: 3 COMMAND: "ocfs2dc" #0 [ffff882ecc3759b0] machine_kexec at ffffffff8103b35d #1 [ffff882ecc375a20] crash_kexec at ffffffff810b95b5 #2 [ffff882ecc375af0] oops_end at ffffffff815091d8 #3 [ffff882ecc375b20] die at ffffffff8101868b #4 [ffff882ecc375b50] do_trap at ffffffff81508bb0 #5 [ffff882ecc375ba0] do_invalid_op at ffffffff810165e5 torvalds#6 [ffff882ecc375c40] invalid_op at ffffffff815116fb [exception RIP: ocfs2_ci_checkpointed+208] RIP: ffffffffa0a7e940 RSP: ffff882ecc375cf0 RFLAGS: 00010002 RAX: 0000000000000001 RBX: 000000000000654b RCX: ffff8812dc83f1f8 RDX: 00000000000017d9 RSI: ffff8812dc83f1f8 RDI: ffffffffa0b2c318 RBP: ffff882ecc375d20 R8: ffff882ef6ecfa60 R9: ffff88301f272200 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff R13: ffff8812dc83f4f0 R14: 0000000000000000 R15: ffff8812dc83f1f8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 torvalds#7 [ffff882ecc375d28] ocfs2_check_meta_downconvert at ffffffffa0a7edbd [ocfs2] torvalds#8 [ffff882ecc375d38] ocfs2_unblock_lock at ffffffffa0a84af8 [ocfs2] torvalds#9 [ffff882ecc375dc8] ocfs2_process_blocked_lock at ffffffffa0a85285 [ocfs2] torvalds#10 [ffff882ecc375e18] ocfs2_downconvert_thread_do_work at ffffffffa0a85445 [ocfs2] torvalds#11 [ffff882ecc375e68] ocfs2_downconvert_thread at ffffffffa0a854de [ocfs2] torvalds#12 [ffff882ecc375ee8] kthread at ffffffff81090da7 torvalds#13 [ffff882ecc375f48] kernel_thread_helper at ffffffff81511884 assert is tripped because the tran is not checkpointed and the lock level is PR. Some time ago, chmod command had been executed. As result, the following call chain left the inode cluster lock in PR state, latter on causing the assert. system_call_fastpath -> my_chmod -> sys_chmod -> sys_fchmodat -> notify_change -> ocfs2_setattr -> posix_acl_chmod -> ocfs2_iop_set_acl -> ocfs2_set_acl -> ocfs2_acl_set_mode Here is how. 1119 int ocfs2_setattr(struct dentry *dentry, struct iattr *attr) 1120 { 1247 ocfs2_inode_unlock(inode, 1); <<< WRONG thing to do. .. 1258 if (!status && attr->ia_valid & ATTR_MODE) { 1259 status = posix_acl_chmod(inode, inode->i_mode); 519 posix_acl_chmod(struct inode *inode, umode_t mode) 520 { .. 539 ret = inode->i_op->set_acl(inode, acl, ACL_TYPE_ACCESS); 287 int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, ... 288 { 289 return ocfs2_set_acl(NULL, inode, NULL, type, acl, NULL, NULL); 224 int ocfs2_set_acl(handle_t *handle, 225 struct inode *inode, ... 231 { .. 252 ret = ocfs2_acl_set_mode(inode, di_bh, 253 handle, mode); 168 static int ocfs2_acl_set_mode(struct inode *inode, struct buffer_head ... 170 { 183 if (handle == NULL) { >>> BUG: inode lock not held in ex at this point <<< 184 handle = ocfs2_start_trans(OCFS2_SB(inode->i_sb), 185 OCFS2_INODE_UPDATE_CREDITS); ocfs2_setattr.#1247 we unlock and at #1259 call posix_acl_chmod. When we reach ocfs2_acl_set_mode.torvalds#181 and do trans, the inode cluster lock is not held in EX mode (it should be). How this could have happended? We are the lock master, were holding lock EX and have released it in ocfs2_setattr.#1247. Note that there are no holders of this lock at this point. Another node needs the lock in PR, and we downconvert from EX to PR. So the inode lock is PR when do the trans in ocfs2_acl_set_mode.torvalds#184. The trans stays in core (not flushed to disc). Now another node want the lock in EX, downconvert thread gets kicked (the one that tripped assert abovt), finds an unflushed trans but the lock is not EX (it is PR). If the lock was at EX, it would have flushed the trans ocfs2_ci_checkpointed -> ocfs2_start_checkpoint before downconverting (to NULL) for the request. ocfs2_setattr must not drop inode lock ex in this code path. If it does, takes it again before the trans, say in ocfs2_set_acl, another cluster node can get in between, execute another setattr, overwriting the one in progress on this node, resulting in a mode acl size combo that is a mix of the two. Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

GIT fc4597fb8f32648009bfbe2793bef4a9cd3e9460 commit b9e43f49f67229f5135b8f0f5abdbf72fa31caa1 Author: Theodore Ts'o <tytso@mit.edu> Date: Sun Aug 16 10:03:57 2015 -0400 Revert "ext4: remove block_device_ejected" This reverts commit 08439fec266c3cc5702953b4f54bdf5649357de0. Unfortunately we still need to test for bdi->dev to avoid a crash when a USB stick is yanked out while a file system is mounted: usb 2-2: USB disconnect, device number 2 Buffer I/O error on dev sdb1, logical block 15237120, lost sync page write JBD2: Error -5 detected when updating journal superblock for sdb1-8. BUG: unable to handle kernel paging request at 34beb000 IP: [<c136ce88>] __percpu_counter_add+0x18/0xc0 *pdpt = 0000000023db9001 *pde = 0000000000000000 Oops: 0000 [#1] SMP CPU: 0 PID: 4083 Comm: umount Tainted: G U OE 4.1.1-040101-generic #201507011435 Hardware name: LENOVO 7675CTO/7675CTO, BIOS 7NETC2WW (2.22 ) 03/22/2011 task: ebf06b50 ti: ebebc000 task.ti: ebebc000 EIP: 0060:[<c136ce88>] EFLAGS: 00010082 CPU: 0 EIP is at __percpu_counter_add+0x18/0xc0 EAX: f21c8e88 EBX: f21c8e88 ECX: 00000000 EDX: 00000001 ESI: 00000001 EDI: 00000000 EBP: ebebde60 ESP: ebebde40 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 34beb000 CR3: 33354200 CR4: 000007f0 Stack: c1abe100 edcb0098 edcb00ec ffffffff f21c8e68 ffffffff f21c8e68 f286d160 ebebde84 c1160454 00000010 00000282 f72a77f8 00000984 f72a77f8 f286d160 f286d170 ebebdea0 c11e613f 00000000 00000282 f72a77f8 edd7f4d0 00000000 Call Trace: [<c1160454>] account_page_dirtied+0x74/0x110 [<c11e613f>] __set_page_dirty+0x3f/0xb0 [<c11e6203>] mark_buffer_dirty+0x53/0xc0 [<c124a0cb>] ext4_commit_super+0x17b/0x250 [<c124ac71>] ext4_put_super+0xc1/0x320 [<c11f04ba>] ? fsnotify_unmount_inodes+0x1aa/0x1c0 [<c11cfeda>] ? evict_inodes+0xca/0xe0 [<c11b925a>] generic_shutdown_super+0x6a/0xe0 [<c10a1df0>] ? prepare_to_wait_event+0xd0/0xd0 [<c1165a50>] ? unregister_shrinker+0x40/0x50 [<c11b92f6>] kill_block_super+0x26/0x70 [<c11b94f5>] deactivate_locked_super+0x45/0x80 [<c11ba007>] deactivate_super+0x47/0x60 [<c11d2b39>] cleanup_mnt+0x39/0x80 [<c11d2bc0>] __cleanup_mnt+0x10/0x20 [<c1080b51>] task_work_run+0x91/0xd0 [<c1011e3c>] do_notify_resume+0x7c/0x90 [<c1720da5>] work_notify Code: 8b 55 e8 e9 f4 fe ff ff 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 ec 20 89 5d f4 89 c3 89 75 f8 89 d6 89 7d fc 89 cf 8b 48 14 <64> 8b 01 89 45 ec 89 c2 8b 45 08 c1 fa 1f 01 75 ec 89 55 f0 89 EIP: [<c136ce88>] __percpu_counter_add+0x18/0xc0 SS:ESP 0068:ebebde40 CR2: 0000000034beb000 ---[ end trace dd564a7bea834ecd ]--- Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101011 Signed-off-by: Theodore Ts'o <tytso@mit.edu> commit 395ae54bd8775508a9616817188cabbcd6f53260 Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Date: Fri Aug 14 17:19:43 2015 -0500 ALSA: usb: handle descriptor with SYNC_NONE illegal value The M-Audio Transit exposes an interface with a SYNC_NONE attribute. This is not a valid value according to the USB audio classspec. However there is a sync endpoint associated to this record. Changing the logic to try to use this sync endpoint allows for seamless transitions between altset 2 and altset 3. If any errors happen, the behavior remains the same. $ more /proc/asound/card1/stream0 M-Audio Transit USB at usb-0000:00:14.0-2, full speed : USB Audio Playback: Status: Stop Interface 1 Altset 1 Format: S24_3LE Channels: 2 Endpoint: 3 OUT (ADAPTIVE) Rates: 48001 - 96000 (continuous) Interface 1 Altset 2 Format: S24_3LE Channels: 2 Endpoint: 3 OUT (NONE) Rates: 8000 - 48000 (continuous) Interface 1 Altset 3 Format: S16_LE Channels: 2 Endpoint: 3 OUT (ASYNC) Rates: 8000 - 48000 (continuous) Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> commit 630184477e7eccb2b31ee4c20b6905ca5fa4b3a8 Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Date: Fri Aug 14 17:19:42 2015 -0500 ALSA: usb: fix corrupted pointers due to interface setting change When a transition occurs between alternate settings that do not use the same synchronization method, the substream pointers were not reset. This prevents audio from being played during the second transition. Identified and tested with M-Audio Transit device (0763:2006 Midiman M-Audio Transit) Details of the issue: First playback to adaptive endpoint: $ aplay -Dhw:1,0 ~/24_96.wav Playing WAVE '/home/plb/24_96.wav' : Signed 24 bit Little Endian in 3bytes, Rate 96000 Hz, Stereo [ 3169.297556] usb 1-2: setting usb interface 1:1 [ 3169.297568] usb 1-2: Creating new playback data endpoint #3 [ 3169.298563] usb 1-2: Setting params for ep #3 (type 0, 3 urbs), ret=0 [ 3169.298574] usb 1-2: Starting data EP @ffff880035fc8000 first playback to asynchronous endpoint: $ aplay -Dhw:1,0 ~/16_48.wav Playing WAVE '/home/plb/16_48.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo [ 3204.520251] usb 1-2: setting usb interface 1:3 [ 3204.520264] usb 1-2: Creating new playback data endpoint #3 [ 3204.520272] usb 1-2: Creating new capture sync endpoint #83 [ 3204.521162] usb 1-2: Setting params for ep #3 (type 0, 4 urbs), ret=0 [ 3204.521177] usb 1-2: Setting params for ep #83 (type 1, 4 urbs), ret=0 [ 3204.521182] usb 1-2: Starting data EP @ffff880035fce000 [ 3204.521204] usb 1-2: Starting sync EP @ffff8800bd616000 second playback to adaptive endpoint: no audio and error on terminal: $ aplay -Dhw:1,0 ~/24_96.wav Playing WAVE '/home/plb/24_96.wav' : Signed 24 bit Little Endian in 3bytes, Rate 96000 Hz, Stereo aplay: pcm_write:1939: write error: Input/output error [ 3239.483589] usb 1-2: setting usb interface 1:1 [ 3239.483601] usb 1-2: Re-using EP 3 in iface 1,1 @ffff880035fc8000 [ 3239.484590] usb 1-2: Setting params for ep #3 (type 0, 4 urbs), ret=0 [ 3239.484606] usb 1-2: Setting params for ep #83 (type 1, 4 urbs), ret=0 This last line shows that a sync endpoint is used when it shouldn't. The sync endpoint is no longer valid and the pointers are corrupted Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> commit 493f133f47750aa5566fafa9403617e3f0506f8c Author: Len Brown <len.brown@intel.com> Date: Wed Mar 25 23:20:37 2015 -0400 intel_idle: Skylake Client Support Skylake Client CPU idle Power states (C-states) are similar to the previous generation, Broadwell. However, Skylake does get its own table with updated worst-case latency and average energy-break-even residency values. Signed-off-by: Len Brown <len.brown@intel.com> commit d248b61f611463cca906d5663a9a0de63ade97a9 Author: Hai Li <hali@codeaurora.org> Date: Thu Aug 13 17:49:29 2015 -0400 drm/msm/dsi: Introduce DSI configuration module With more platforms supported, the DSI host configuration array keeps expanding. This change moves those to a separate dsi_cfg module. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 1bf4d7c5651a7cfcdcd77389b42d266441ecf444 Author: Hai Li <hali@codeaurora.org> Date: Thu Aug 13 17:45:53 2015 -0400 drm/msm/dsi: Make each PHY type compilation independent On a certain platform, only one type of DSI PHY is used. This change allows the user to only compile the PHY type which is being used. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 5c8290284402bf7d2c12269402b3177b899c78b7 Author: Hai Li <hali@codeaurora.org> Date: Thu Aug 13 17:45:52 2015 -0400 drm/msm/dsi: Split PHY drivers to separate files This change moves each PHY type specific code into separate files. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 29e61690130adb1c27053558d2f21af88ae0334e Author: Hai Li <hali@codeaurora.org> Date: Thu Aug 13 17:45:51 2015 -0400 drm/msm/dsi: Return void from msm_dsi_phy_disable() We are not checking the return value from msm_dsi_phy_disable(). Change the return type to void. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit fae11c1106ad8304c09e3b9bf95dd6d03f4a5afa Author: Hai Li <hali@codeaurora.org> Date: Thu Aug 13 17:45:50 2015 -0400 drm/msm/dsi: Specify bitmask to set source PLL The bit position to configure source PLL will change on new types of PHYs. The caller should pass down this information. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 29f034d776209042f7aaaf1518a66841c1d42233 Author: jilai wang <jilaiw@codeaurora.org> Date: Wed Aug 5 15:33:29 2015 -0400 drm/msm/mdp: Clear pending interrupt status before enable interrupt Pending interrupt status needs to be cleared before enable the interrupt. Otherwise it's possible to get a pending interrupt instead of an incoming interrupt. Signed-off-by: Jilai Wang <jilaiw@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 8089082fae1975ad9d5abbd37c0ee8f688be28a0 Author: jilai wang <jilaiw@codeaurora.org> Date: Fri Jul 31 10:13:26 2015 -0400 drm/msm/mdp5: Add rotation (hflip/vflip) support to MDP5 planes (v2) MDP5 SSPPs can flip the input source horizontally or vertically. This change is to add this support to MDP5 planes. v1: Initial change v2: Use existing "rotation" property instead of creating msm specific properties. In order to be compatiable with legacy non-atomic set_property, switch to drm_atomic_helper_plane_set_property helper function. Signed-off-by: Jilai Wang <jilaiw@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 095022b9aa0546a0f3d3fb5f42c82b4004103864 Author: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Date: Mon Jul 27 15:12:39 2015 +0100 drm/msm: add calls to prepare and unprepare panel Prepare the panel before it's enabled and un-prepare after disable, this will make sure that the regulators are switched on and off correctly. Tested it on APQ8064 based IFC6410 with panel. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 09992e4d46935798b4a1cf3a734e6a0f3470f107 Author: Archit Taneja <architt@codeaurora.org> Date: Mon Aug 3 14:09:36 2015 +0530 drm/msm/dsi: Modify dsi manager bridge ops to work with external bridges The dsi bridge ops call drm_panel functions to set up the connected drm_panel. Add checks to make sure these aren't called when we're connected to an external bridge. Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit c118e29033aa5b38b593ebd0e02f8b1224c20ed3 Author: Archit Taneja <architt@codeaurora.org> Date: Fri Jul 31 14:06:10 2015 +0530 drm/msm/dsi: Allow dsi to connect to an external bridge There are platforms where the DSI output can be connected to another encoder bridge chip (DSI to HDMI, DSI to LVDS etc). Add support for external bridge support to the dsi driver. We assume that the external bridge chip would be of the type drm_bridge. The dsi driver's internal drm_bridge (msm_dsi->bridge) is linked to the external bridge's drm_bridge struct. In the case we're connected to an external bridge, we don't need to create and manage a connector within our driver, it's the bridge driver's responsibility to create one. v2: - Move the external bridge attaching stuff to dsi manager to make things cleaner. - Force the bridge to connect to a video mode encoder for now (the dsi mode flags may have not been populated by modeset_init) Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 6f054ec5b9ced3041f29541ae79402198678fc06 Author: Archit Taneja <architt@codeaurora.org> Date: Mon Aug 3 14:08:33 2015 +0530 drm/msm/dsi: Create a helper to check if there is a connected device Create a helper msm_dsi_device_connected() which checks whether we have a device connected to the dsi host or not. This check gets messy when we have support external bridges too. Having an inline function makes it more legible. For now, the check only consists of msm_dsi->panel being non-NULL. Later, this will check if we have an external bridge or not. This helper isn't used in dsi_connector related code as that's specific to only when a drm_panel is connected. Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit a9ddac9c5765712fa7eace55feeaf7c4ac75e32b Author: Archit Taneja <architt@codeaurora.org> Date: Mon Aug 3 14:05:45 2015 +0530 drm/msm/dsi: Refer to connected device as 'device' instead of 'panel' We currently support only panels connected to dsi output. We're going to also support external bridge chips now. Change 'panel_node' to 'device_node' in the struct msm_dsi_host and 'panel_flags' to 'device_flags' in msm_dsi. This makes things sound a bit more generic. Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 60d05cb4eabcfcab00784677c2a55ed1b9bda2ec Author: Archit Taneja <architt@codeaurora.org> Date: Thu Jun 25 14:36:35 2015 +0530 drm/msm/dsi: Make TE gpio optional Platforms containing only DSI video mode devices don't need a TE gpio. Make TE gpio optional. Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 3d6df06249277aabaf895951855c4ed704b038bb Author: Archit Taneja <architt@codeaurora.org> Date: Tue Jun 9 14:17:22 2015 +0530 drm/msm: mdp4 lvds: get panel node via of graph parsing We currently get the output connected to LVDS by looking for a phandle called 'qcom,lvds-panel' under the mdp DT node. Use the more standard of_graph approach to create an lvds output port, and retrieve the panel node from the port's endpoint data. v3 - Fix return value checks of of_graph_* calls. Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit f7009d266d8b2f4b54da42399aaa536d74fe3e7c Author: Archit Taneja <architt@codeaurora.org> Date: Thu Jun 25 11:43:40 2015 +0530 drm/msm: dsi host: Use device graph parsing to parse connected panel The dsi host looks for the connected panel node by parsing for a child named 'panel'. This hierarchy isn't very flexible. The connected panel is forced to be a child to the dsi host, and hence, a mipi dsi device. This isn't suitable for dsi devices that don't use mipi dsi as their control bus. Follow the of_graph approach of creating ports and endpoints to represent the connections between the dsi host and the panel connected to it. In our case, the dsi host will only have one output port, linked to the panel's input port. Update DT binding documentation with device graph usage info. v3: - Fix return value checks of of_graph_* calls. - Don't make port a mandatory DT property - Fix defer check when no panel node specified - Rename parse_dt func to align with other dsi_host funcs Reviewed-by: Hai Li <hali@codeaurora.org> Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 14bb28b0f91f868f081f71eb0c7b590e13527c3c Author: Archit Taneja <architt@codeaurora.org> Date: Thu Jun 4 15:01:57 2015 +0530 drm/msm: dsi host: add missing of_node_put() Decrement device node refcount if of_get_child_by_name is successfully called. Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 78b1d470d57dd7a6e0efda63ebad97f0d44e817c Author: Hai Li <hali@codeaurora.org> Date: Mon Jul 27 13:49:45 2015 -0400 drm/msm: Enable clocks during enable/disable_vblank() callbacks AHB clock should be enabled before accessing registers during enable/disable_vblank(). Since these 2 callbacks are called in atomic context while clk_prepare may cause thread sleep, a work is scheduled to control vblanks. v2: fixup spinlock initialization Signed-off-by: Hai Li <hali@codeaurora.org> [add comment about cancel_work_sync() before drm_irq_uninstall()] Signed-off-by: Rob Clark <robdclark@gmail.com> commit 8a94b0aa372ebf7375c8ea861cb9bbf84b39d2df Author: jilai wang <jilaiw@codeaurora.org> Date: Wed Jul 8 18:25:40 2015 -0400 drm/msm/mdp5: Add support for msm8x74v1 msm8x74v1 has different MDP5 version (v1.0) from msm8x74v2 (v1.2). Add a separate config data to support msm8x74v1. Signed-off-by: Jilai Wang <jilaiw@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 8155ad4ce67d2f3418a4a72144c10114d21f0ece Author: jilai wang <jilaiw@codeaurora.org> Date: Tue Jul 7 17:17:28 2015 -0400 drm/msm/mdp5: Add DMA pipe planes for MDP5 This change is to add planes which use DMA pipes for MDP5. Signed-off-by: Jilai Wang <jilaiw@codeaurora.org> [slight comment adjust to s/Construct public planes/Construct video planes/ since DMA planes are public planes too, they just can't scale or CSC] Signed-off-by: Rob Clark <robdclark@gmail.com> commit 3498409f0315b93f969f87c31d014a9819f6fa7d Author: jilai wang <jilaiw@codeaurora.org> Date: Wed Jul 8 18:12:40 2015 -0400 drm/msm/mdp: Add capabilities to MDP planes (v2) MDP planes can be implemented using different type of HW pipes, RGB/VIG/DMA pipes for MDP5 and RGB/VG/DMA pipes for MDP4. Each type of pipe has different HW capabilities such as scaling, color space conversion, decimation... Add a variable in plane data structure to specify the difference of each plane which comes from mdp5_cfg data and use it to differenciate the plane operation. V1: Initial change V2: Fix a typo in mdp4_kms.h Signed-off-by: Jilai Wang <jilaiw@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit bef799fb77dc30d32362b6850e997f1c29fc99eb Author: Stephane Viau <sviau@codeaurora.org> Date: Mon Jul 6 16:35:31 2015 -0400 drm/msm/mdp5: add more YUV formats for MDP5 Add packed YUV422 and planar YUV420 formats to MDP supported formats. Signed-off-by: Stephane Viau <sviau@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 9cc137a3ffa3162a0d5af822149a5cec1b42c24e Author: Wentao Xu <wentaox@codeaurora.org> Date: Mon Jul 6 16:35:30 2015 -0400 drm/msm/mdp5: use 2 memory clients for YUV formats on newer mdp5 Newer MDP5 uses 2 shared memory pool clients for certain YUV formats. For example, if VIG0 is used to fetch data in YUYV format, it will use VIG0_Y for Y component, and VIG0_Cr for UV packed. Signed-off-by: Wentao Xu <wentaox@codeaurora.org> [rebase] Signed-off-by: Stephane Viau <sviau@codeaurora.org> commit ff78a6b3771f48d1d5585e5d08ab4ae6fd606ab0 Author: Wentao Xu <wentaox@codeaurora.org> Date: Mon Jul 6 16:35:29 2015 -0400 drm/msm/mdp: mark if a MDP format is YUV at definition This makes it easy to determine if a format is YUV. The old method of using chroma sample type incorrectly marks YUV444 as RGB format. Signed-off-by: Wentao Xu <wentaox@codeaurora.org> [rebase] Signed-off-by: Stephane Viau <sviau@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 02b3ee466443ba6780562fb2af5fe0ad5bf059f6 Author: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Date: Mon Jul 6 11:09:41 2015 +0200 drm/msm/dp: use flags argument of devm_gpiod_get to set direction Since 39b2bbe3d715 (gpio: add flags argument to gpiod_get*() functions) which appeared in v3.17-rc1, the gpiod_get* functions take an additional parameter that allows to specify direction and initial value for output. Use this to simplify the driver. Furthermore this is one caller less that stops us making the flags argument to gpiod_get*() mandatory. Acked-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 328e1a633c9bc26c36ecd320246e4a9b2726e81a Author: Hai Li <hali@codeaurora.org> Date: Fri Jul 3 10:09:46 2015 -0400 drm/msm/dsi: Save/Restore PLL status across PHY reset Reset DSI PHY silently changes its PLL registers to reset status, which will make cached status in clock driver invalid and result in wrong output rate of link clocks. The current restore mechanism in DSI PLL does not cover all the cases. This change is to recover PLL status after PHY reset to match HW status with cached status in clock driver. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit da882cd1ee132ecbb4a4848a6b0797ea2ed4bee7 Author: Markus Elfring <elfring@users.sourceforge.net> Date: Sat Jun 27 22:23:28 2015 +0200 drm/msm/dsi: One function call less in dsi_init() after error detection The dsi_destroy() function was called in two cases by the dsi_init() function during error handling even if the passed variable contained a null pointer. * This implementation detail could be improved by adjustments for jump targets according to the Linux coding style convention. * Drop an unnecessary initialisation for the variable "msm_dsi" then. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> [add couple missing ERR_PTR()'s] Signed-off-by: Rob Clark <robdclark@gmail.com> commit a60bbb2764b73a3f54c7462f3d28f960b7811682 Author: Markus Elfring <elfring@users.sourceforge.net> Date: Sat Jun 27 22:05:31 2015 +0200 drm/msm/dsi: Delete an unnecessary check before the function call "dsi_destroy" The dsi_destroy() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Rob Clark <robdclark@gmail.com> commit b96b3a06d1211ba86674db99a6aafe39ef4cbed2 Author: Hai Li <hali@codeaurora.org> Date: Fri Jun 26 16:03:26 2015 -0400 drm/msm/mdp5: Allocate CTL0/1 for dual DSI single FLUSH This change takes advantage of a HW feature that synchronize flush operation on CTL1 to CTL0, to keep dual DSI pipes in sync. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit c71716b17bc772e9c38f85a4b496bbfac0dd32f0 Author: Hai Li <hali@codeaurora.org> Date: Fri Jun 26 16:03:25 2015 -0400 drm/msm/mdp5: Allocate CTL for each display interface In MDP5, CTL contains information of the whole pipeline whose output goes down to a display interface. In various cases, one interface may require 2 CRTCs, but only one CTL. Some interfaces also require to use certain CTLs. Instead of allocating CTL for each active CRTC, this change is to associate a CTL with each interface. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 129877819c0a5f8d419fe67ae08a8a7c811afa5e Author: jilai wang <jilaiw@codeaurora.org> Date: Thu Jun 25 17:37:42 2015 -0400 drm/msm/mdp5: Add plane blending operation support for MDP5 (v2) This change is to add properties alpha/zpos/blend_mode to mdp5 plane for alpha blending operation to generate the blended output. v1: Initial change v2: Change "premultilied" property to enum (Rob's comment) Signed-off-by: Jilai Wang <jilaiw@codeaurora.org> [Don't actually expose alpha/premultiplied props to userspace yet pending a chance for discussion and some userspace to exercise it] Signed-off-by: Rob Clark <robdclark@gmail.com> commit 4ff696eafaa50d6d649d256d20528b475104a500 Author: Rob Clark <robdclark@gmail.com> Date: Tue Jul 28 11:05:03 2015 -0400 drm/msm: don't install plane properties on crtc This was a hold-over from the pre-atomic days and legacy userspace that only understood CRTCs. Fortunately we don't have any properties, so this doesn't change anything. But before we start growing some plane properties, we should fix this. Signed-off-by: Rob Clark <robdclark@gmail.com> commit 01199361c665245d557b8eefef56d648ddb3867a Author: Archit Taneja <architt@codeaurora.org> Date: Thu Jun 25 11:29:24 2015 +0530 drm/msm/dsi: Report PHY errors only when they really occur DSI PHY errors are falsely reported whenever a dsi error occurs. This is because DSI_DLN0_PHY_ERR isn't only used as a status register, but also used to mask PHY errors. Currently, we end up reading the mask bits too and therefore always report errors. Ignore the register mask bits and check for only the status/clear bits. Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 9b7a9fc29a48026d797cbf237121850c1c241df4 Author: Hai Li <hali@codeaurora.org> Date: Wed Jun 24 19:13:40 2015 -0400 drm/msm: Set different display size limitation on each target The maximum output width of one pipeline depends on the LayerMixer's capability. It may be different on each target. Also, MDP5 doesn't have vertical limitation in one frame, as long as the pixel clock can be supported. This change obtains the maximum LM resolution from configuration table and treat it as the whole pipe's limitation for MDP5. The size limit on MDP4 is not changed. Signed-off-by: Hai Li <hali@codeaurora.org> commit 5cf3a4553fc8395c4ad38077f8cee6c91f832393 Author: Rob Clark <robdclark@gmail.com> Date: Mon Jul 27 20:52:50 2015 -0400 drm/msm/hdmi: standardize on lead chip for compatible names For all of these devices, msm89xy was the lead chip, so standardize the compatible names to align with convention used by rest of the qcom/msm drivers. Signed-off-by: Rob Clark <robdclark@gmail.com> commit 3a84f8469e2687b9fdcf83d615b8001a2443566a Author: Stephane Viau <sviau@codeaurora.org> Date: Fri Jun 19 16:04:47 2015 -0400 drm/msm: Add support for msm8x94 This change adds the MDP and HDMI support for msm8x94. Note that HDMI PHY registers are not being accessed anymore from the driver. Signed-off-by: Stephane Viau <sviau@codeaurora.org> [rename compatible s/8x94/8994/ since preference is to not trust the marketing folks who invent chip #'s but instead name things after the lead chip.. we should rename some 80XY to 89XY to standardize on the lead chip but leave that for another patch. Also, update dt bindings doc] Signed-off-by: Rob Clark <robdclark@gmail.com> commit da32855219f86f27cad1b12be2264ffb0b97b9fa Author: Stephane Viau <sviau@codeaurora.org> Date: Fri Jun 19 16:04:46 2015 -0400 drm/msm/hdmi: remove ->reset() from HDMI PHY ->reset() currently only accesses HDMI core registers, and yet it is located in hdmi_phy*. Since no PHY registers are being accessed during ->reset(), it would be better to bring that function in hdmi core module where HDMI core registers are usually being accessed. This will also help for msm8x94 for which no PHY registers accesses are done (->phy_init == NULL) but the HDMI PHY reset from HDMI core still needs to be done. Note: SW_RESET_PLL bit is not written in hdmi_phy_8x60_reset(); this write should not affect anything if the corresponding field is not writable. Signed-off-by: Stephane Viau <sviau@codeaurora.org> [fixed warning about unused 'phy' in hpd_enable() while merging] Signed-off-by: Rob Clark <robdclark@gmail.com> commit dcefc117cc192f215d04c4e7cbae6b76a9bafcf4 Author: Hai Li <hali@codeaurora.org> Date: Thu Jun 18 10:14:21 2015 -0400 drm/msm/dsi: Add support for msm8x94 DSI controller on msm8x94 is version 1.3, which requires different power supplies and works with 20nm DSI PHY. This change is to add the basic support for this version. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit ab8909b032ffccc15384879dd5798b8647d6ab8a Author: Hai Li <hali@codeaurora.org> Date: Thu Jun 11 10:56:46 2015 -0400 drm/msm/dsi: Use pinctrl in DSI driver Some targets use pinctrl framework to configure some pins. This change allows DSI driver to set default and sleep pinctrl status. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 678565c3cb2100a8f03c23592f13f6b78e69a590 Author: Hai Li <hali@codeaurora.org> Date: Wed Jun 10 13:18:18 2015 -0400 drm/msm/dsi: Rename *dual panel* to *dual DSI* The current term of *dual panel* in DSI driver code causes confusion. It is supposed to indicate the panel using two DSI links. Rename it to *dual DSI*. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 13351cd17791694f2dcc96dc920e58b090b18c31 Author: Hai Li <hali@codeaurora.org> Date: Wed Jun 10 13:18:17 2015 -0400 drm/msm/dsi: Update source PLL selection in DSI PHY The source PLL to be used by each DSI PHY should be decided by DSI manager based on dual DSI information, while the register programming to select PLL is different from one type of PHY to another. This change adds the H/W difference to PHY configuration and updates the interface between DSI manager and PHY. With this change, PLL selection can be supported on different targets. Signed-off-by: Hai Li <hali@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit c6a57a50ad562a2e6fc6ac3218b710caea73a58b Author: jilai wang <jilaiw@codeaurora.org> Date: Thu Apr 2 17:49:01 2015 -0400 drm/msm/hdmi: add hdmi hdcp support (V3) Add HDMI HDCP support including HDCP PartI/II/III authentication. V1: Initial Change V2: Address Bjorn&Rob's comments Refactor the authentication process to use single work instead of multiple work for different authentication stages. V3: Update to align with qcom SCM api. Signed-off-by: Jilai Wang <jilaiw@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> commit 2d3584eb871da2a6fa72e3d50781f33b0312589a Author: Rob Clark <robdclark@gmail.com> Date: Mon Jul 27 19:37:12 2015 -0400 drm/msm: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com> commit c0015bf3a34961342a27b021672049e535ab36a1 Author: Alexander Aring <alex.aring@gmail.com> Date: Sat Aug 15 11:00:33 2015 +0200 ieee802154: 6lowpan: fix non-lowpan wpan interfaces We receive all 802.15.4 frames on the packet handler "lowpan_rcv" this patch checks if the wpan device belongs to a lowpan interface. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> commit 0751272880f3a0c74c786ecfaba2b3d98748482f Author: Alexander Aring <alex.aring@gmail.com> Date: Sat Aug 15 11:00:32 2015 +0200 ieee802154: 6lowpan: fix packet layer registration This patch fixes 802.15.4 packet layer registration when mutliple lowpan interfaces will be added. We need to register the packet layer at the first lowpan interface and deregister it at the last interface. This done by open_count variable which is protected by rtnl. Additional do a quiet fix by adding dev_put(real_dev) when netdev registration fails, which fix the refcount for the wpan dev. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> commit 4481c0767e52eea674794de4b9123c9bc3d24f24 Author: Peter Poklop <peter.poklop@gmail.com> Date: Sat Aug 15 20:47:09 2015 +0200 Bluetooth: btusb: mark 0c10:0000 devices with BTUSB_SWAVE This patch enables quirk handling for Silicon Wave based devices and fixes kernel bug with id 42985. T: Bus=01 Lev=01 Prnt=01 Port=07 Cnt=04 Dev#= 6 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0c10 ProdID=0000 Rev=15.00 S: Manufacturer=SiW S: Product=SiW S: SerialNumber=340A05F61100 C:* #Ifs= 2 Cfg#= 1 Atr=a0 MxPwr= 50mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms Signed-off-by: Peter Poklop <peter.poklop@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> commit e294a5371b2e0bd22d4a917d4c354a52a7057b6e Author: Theodore Ts'o <tytso@mit.edu> Date: Sat Aug 15 14:59:44 2015 -0400 ext4: ratelimit the file system mounted message The xfstests ext4/305 will mount and unmount the same file system over 4,000 times, and each one of these will cause a system log message. Ratelimit this message since if we are getting more than a few dozen of these messages, they probably aren't going to be helpful. Signed-off-by: Theodore Ts'o <tytso@mit.edu> commit 0048b4837affd153897ed1222283492070027aa9 Author: Ming Lei <ming.lei@canonical.com> Date: Sun Aug 9 03:41:51 2015 -0400 blk-mq: fix race between timeout and freeing request Inside timeout handler, blk_mq_tag_to_rq() is called to retrieve the request from one tag. This way is obviously wrong because the request can be freed any time and some fiedds of the request can't be trusted, then kernel oops might be triggered[1]. Currently wrt. blk_mq_tag_to_rq(), the only special case is that the flush request can share same tag with the request cloned from, and the two requests can't be active at the same time, so this patch fixes the above issue by updating tags->rqs[tag] with the active request(either flush rq or the request cloned from) of the tag. Also blk_mq_tag_to_rq() gets much simplified with this patch. Given blk_mq_tag_to_rq() is mainly for drivers and the caller must make sure the request can't be freed, so in bt_for_each() this helper is replaced with tags->rqs[tag]. [1] kernel oops log [ 439.696220] BUG: unable to handle kernel NULL pointer dereference at 0000000000000158^M [ 439.697162] IP: [<ffffffff812d89ba>] blk_mq_tag_to_rq+0x21/0x6e^M [ 439.700653] PGD 7ef765067 PUD 7ef764067 PMD 0 ^M [ 439.700653] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC ^M [ 439.700653] Dumping ftrace buffer:^M [ 439.700653] (ftrace buffer empty)^M [ 439.700653] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw^M [ 439.700653] CPU: 6 PID: 2779 Comm: stress-ng-sigfd Not tainted 4.2.0-rc5-next-20150805+ #265^M [ 439.730500] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011^M [ 439.730500] task: ffff880605308000 ti: ffff88060530c000 task.ti: ffff88060530c000^M [ 439.730500] RIP: 0010:[<ffffffff812d89ba>] [<ffffffff812d89ba>] blk_mq_tag_to_rq+0x21/0x6e^M [ 439.730500] RSP: 0018:ffff880819203da0 EFLAGS: 00010283^M [ 439.730500] RAX: ffff880811b0e000 RBX: ffff8800bb465f00 RCX: 0000000000000002^M [ 439.730500] RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000^M [ 439.730500] RBP: ffff880819203db0 R08: 0000000000000002 R09: 0000000000000000^M [ 439.730500] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000202^M [ 439.730500] R13: ffff880814104800 R14: 0000000000000002 R15: ffff880811a2ea00^M [ 439.730500] FS: 00007f165b3f5740(0000) GS:ffff880819200000(0000) knlGS:0000000000000000^M [ 439.730500] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b^M [ 439.730500] CR2: 0000000000000158 CR3: 00000007ef766000 CR4: 00000000000006e0^M [ 439.730500] Stack:^M [ 439.730500] 0000000000000008 ffff8808114eed90 ffff880819203e00 ffffffff812dc104^M [ 439.755663] ffff880819203e40 ffffffff812d9f5e 0000020000000000 ffff8808114eed80^M [ 439.755663] Call Trace:^M [ 439.755663] <IRQ> ^M [ 439.755663] [<ffffffff812dc104>] bt_for_each+0x6e/0xc8^M [ 439.755663] [<ffffffff812d9f5e>] ? blk_mq_rq_timed_out+0x6a/0x6a^M [ 439.755663] [<ffffffff812d9f5e>] ? blk_mq_rq_timed_out+0x6a/0x6a^M [ 439.755663] [<ffffffff812dc1b3>] blk_mq_tag_busy_iter+0x55/0x5e^M [ 439.755663] [<ffffffff812d88b4>] ? blk_mq_bio_to_request+0x38/0x38^M [ 439.755663] [<ffffffff812d8911>] blk_mq_rq_timer+0x5d/0xd4^M [ 439.755663] [<ffffffff810a3e10>] call_timer_fn+0xf7/0x284^M [ 439.755663] [<ffffffff810a3d1e>] ? call_timer_fn+0x5/0x284^M [ 439.755663] [<ffffffff812d88b4>] ? blk_mq_bio_to_request+0x38/0x38^M [ 439.755663] [<ffffffff810a46d6>] run_timer_softirq+0x1ce/0x1f8^M [ 439.755663] [<ffffffff8104c367>] __do_softirq+0x181/0x3a4^M [ 439.755663] [<ffffffff8104c76e>] irq_exit+0x40/0x94^M [ 439.755663] [<ffffffff81031482>] smp_apic_timer_interrupt+0x33/0x3e^M [ 439.755663] [<ffffffff815559a4>] apic_timer_interrupt+0x84/0x90^M [ 439.755663] <EOI> ^M [ 439.755663] [<ffffffff81554350>] ? _raw_spin_unlock_irq+0x32/0x4a^M [ 439.755663] [<ffffffff8106a98b>] finish_task_switch+0xe0/0x163^M [ 439.755663] [<ffffffff8106a94d>] ? finish_task_switch+0xa2/0x163^M [ 439.755663] [<ffffffff81550066>] __schedule+0x469/0x6cd^M [ 439.755663] [<ffffffff8155039b>] schedule+0x82/0x9a^M [ 439.789267] [<ffffffff8119b28b>] signalfd_read+0x186/0x49a^M [ 439.790911] [<ffffffff8106d86a>] ? wake_up_q+0x47/0x47^M [ 439.790911] [<ffffffff811618c2>] __vfs_read+0x28/0x9f^M [ 439.790911] [<ffffffff8117a289>] ? __fget_light+0x4d/0x74^M [ 439.790911] [<ffffffff811620a7>] vfs_read+0x7a/0xc6^M [ 439.790911] [<ffffffff8116292b>] SyS_read+0x49/0x7f^M [ 439.790911] [<ffffffff81554c17>] entry_SYSCALL_64_fastpath+0x12/0x6f^M [ 439.790911] Code: 48 89 e5 e8 a9 b8 e7 ff 5d c3 0f 1f 44 00 00 55 89 f2 48 89 e5 41 54 41 89 f4 53 48 8b 47 60 48 8b 1c d0 48 8b 7b 30 48 8b 53 38 <48> 8b 87 58 01 00 00 48 85 c0 75 09 48 8b 97 88 0c 00 00 eb 10 ^M [ 439.790911] RIP [<ffffffff812d89ba>] blk_mq_tag_to_rq+0x21/0x6e^M [ 439.790911] RSP <ffff880819203da0>^M [ 439.790911] CR2: 0000000000000158^M [ 439.790911] ---[ end trace d40af58949325661 ]---^M Cc: <stable@vger.kernel.org> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com> commit 596f5aad2a704b72934e5abec1b1b4114c16f45b Author: Ming Lei <ming.lei@canonical.com> Date: Sun Aug 9 03:41:50 2015 -0400 blk-mq: fix buffer overflow when reading sysfs file of 'pending' There may be lots of pending requests so that the buffer of PAGE_SIZE can't hold them at all. One typical example is scsi-mq, the queue depth(.can_queue) of scsi_host and blk-mq is quite big but scsi_device's queue_depth is a bit small(.cmd_per_lun), then it is quite easy to have lots of pending requests in hw queue. This patch fixes the following warning and the related memory destruction. [ 359.025101] fill_read_buffer: blk_mq_hw_sysfs_show+0x0/0x7d returned bad count^M [ 359.055595] irq event stamp: 15537^M [ 359.055606] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC ^M [ 359.055614] Dumping ftrace buffer:^M [ 359.055660] (ftrace buffer empty)^M [ 359.055672] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw^M [ 359.055678] CPU: 4 PID: 21631 Comm: stress-ng-sysfs Not tainted 4.2.0-rc5-next-20150805 #434^M [ 359.055679] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011^M [ 359.055682] task: ffff8802161cc000 ti: ffff88021b4a8000 task.ti: ffff88021b4a8000^M [ 359.055693] RIP: 0010:[<ffffffff811541c5>] [<ffffffff811541c5>] __kmalloc+0xe8/0x152^M Cc: <stable@vger.kernel.org> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com> commit da0b5e40ab10f44019a1ecec09fadfcd5bdb76b6 Author: Dan Carpenter <dan.carpenter@oracle.com> Date: Sat Aug 15 11:38:13 2015 -0400 ext4: silence a format string false positive Static checkers complain that the format string should be "%s". It does not make a difference for the current code. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> commit 9810446836ab5a4b34a749bb77f3eb5836f982cc Author: Dan Carpenter <dan.carpenter@oracle.com> Date: Sat Aug 15 11:30:31 2015 -0400 ext4: simplify some code in read_mmp_block() My static check complains because we have: if (!*bh) return -ENOMEM; if (*bh) { The second check is unnecessary. I've simplified this code by moving the "if (!*bh)" checks around. Also Andreas Dilger says we should probably print a warning if sb_getblk() fails. [ Restructured the code so that we print a warning message as well if the mmp block doesn't check out, and to print the error code to disambiguate between the error cases. - TYT ] Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> commit c642dc9e1aaed953597e7092d7df329e6234096e Author: Eric Sandeen <sandeen@redhat.com> Date: Sat Aug 15 10:45:06 2015 -0400 ext4: don't manipulate recovery flag when freezing no-journal fs At some point along this sequence of changes: f6e63f9 ext4: fold ext4_nojournal_sops into ext4_sops bb04457 ext4: support freezing ext2 (nojournal) file systems 9ca9238 ext4: Use separate super_operations structure for no_journal filesystems ext4 started setting needs_recovery on filesystems without journals when they are unfrozen. This makes no sense, and in fact confuses blkid to the point where it doesn't recognize the filesystem at all. (freeze ext2; unfreeze ext2; run blkid; see no output; run dumpe2fs, see needs_recovery set on fs w/ no journal). To fix this, don't manipulate the INCOMPAT_RECOVER feature on filesystems without journals. Reported-by: Stu Mark <smark@datto.com> Reviewed-by: Jan Kara <jack@suse.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org commit 3c1dd7411ccc1d151bd4f5d2b6d07c7272d86dc0 Author: Benjamin Cama <benoar@dolka.fr> Date: Tue Jul 14 16:25:58 2015 +0200 ARM: orion5x: fix legacy orion5x IRQ numbers Since v3.18, attempts to deliver IRQ0 are rejected, breaking orion5x. Fix this by increasing all interrupts by one, as did 5d6bed2a9c8b for dove. Also, force MULTI_IRQ_HANDLER for all orion platforms (including dove) as the specific handler is needed to shift back IRQ numbers by one. [gregory.clement@free-electrons.com]: moved the select MULTI_IRQ_HANDLER from PLAT_ORION_LEGACY to ARCH_ORION5X as it broke the build for dove. Fixes: a71b092a9c68 ("ARM: Convert handle_IRQ to use __handle_domain_irq") Signed-off-by: Benjamin Cama <benoar@dolka.fr> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Cc: <stable@vger.kernel.org> commit 601e4576594543200bde9201e4d23242e73a778b Author: Ricard Wanderlof <ricard.wanderlof@axis.com> Date: Thu Aug 13 15:10:19 2015 +0200 ASoC: ssm2518: Add explicit device tree support Add OF match table to SSM2518 to allow direct matching without going through I2C subsystem. Signed-off-by: Ricard Wanderlof <ricardw@axis.com> Acked-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org> commit 1afb9c539daebc2c8a7b33d0e0b8fc9f74671b02 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jul 30 12:40:35 2015 +0530 thermal/cpu_cooling: update policy limits if clipped_freq < policy->max policy->max is the maximum allowed frequency defined by user and clipped_freq is the maximum that thermal constraints allow. If clipped_freq is lower than policy->max, then we need to readjust policy->max. But, if clipped_freq is greater than policy->max, we don't need to do anything. We used to call cpufreq_verify_within_limits() in this case, but it doesn't change anything in this case. Lets skip this unnecessary call and write a comment that explains this. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> commit abcbcc25cb3edfc3c9af210a88c9386e353191fe Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jul 30 12:40:34 2015 +0530 thermal/cpu_cooling: rename max_freq as clipped_freq in notifier That's what it is for, lets name it properly. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> commit 59f0d21883f39d27f14408d4ca211dce80658963 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jul 30 12:40:33 2015 +0530 thermal/cpu_cooling: rename cpufreq_val as clipped_freq That's what it is for, lets name it properly. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> commit a24af233a1fd09002cabc05d6da248cc5656a2e1 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jul 30 12:40:32 2015 +0530 thermal/cpu_cooling: convert 'switch' block to 'if' block in notifier We just need to take care of single event here and there is no need to increase indentation level of most of the code (which causes lines longer that 80 columns to break). Kill the switch block. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> commit 166529c9b6f91b97d771e2e7ebf748aadb239b44 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jul 30 12:40:31 2015 +0530 thermal/cpu_cooling: quit early after updating policy If a valid cpufreq_dev is found for policy->cpu, we should update the policy and quit the for loop. There is no need to keep traversing the list of cpufreq_dev's. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> commit 76fd38ce21de506a3867768fac42729eb6d7dedf Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jul 30 12:40:30 2015 +0530 thermal/cpu_cooling: No need to initialize max_freq to 0 Its always set before getting used, don't initialize it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> commit 02373d7c69b4270bbab930f8a81b0721be794347 Author: Russell King <rmk+kernel@arm.linux.org.uk> Date: Wed Aug 12 15:22:16 2015 +0530 thermal: cpu_cooling: fix lockdep problems in cpu_cooling A recent change to the cpu_cooling code introduced a AB-BA deadlock scenario between the cpufreq_policy_notifier_list rwsem and the cooling_cpufreq_lock. This is caused by cooling_cpufreq_lock being held before the registration/removal of the notifier block (an operation which takes the rwsem), and the notifier code itself which takes the locks in the reverse order: ====================================================== [ INFO: possible circular locking dependency detected ] 3.18.0+ #1453 Not tainted ------------------------------------------------------- rc.local/770 is trying to acquire lock: (cooling_cpufreq_lock){+.+.+.}, at: [<c04abfc4>] cpufreq_thermal_notifier+0x34/0xfc but task is already holding lock: ((cpufreq_policy_notifier_list).rwsem){++++.+}, at: [<c0042f04>] __blocking_notifier_call_chain+0x34/0x68 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 ((cpufreq_policy_notifier_list).rwsem){++++.+}: [<c06bc3b0>] down_write+0x44/0x9c [<c0043444>] blocking_notifier_chain_register+0x28/0xd8 [<c04ad610>] cpufreq_register_notifier+0x68/0x90 [<c04abe4c>] __cpufreq_cooling_register.part.1+0x120/0x180 [<c04abf44>] __cpufreq_cooling_register+0x98/0xa4 [<c04abf8c>] cpufreq_cooling_register+0x18/0x1c [<bf0046f8>] imx_thermal_probe+0x1c0/0x470 [imx_thermal] [<c037cef8>] platform_drv_probe+0x50/0xac [<c037b710>] driver_probe_device+0x114/0x234 [<c037b8cc>] __driver_attach+0x9c/0xa0 [<c0379d68>] bus_for_each_dev+0x5c/0x90 [<c037b204>] driver_attach+0x24/0x28 [<c037ae7c>] bus_add_driver+0xe0/0x1d8 [<c037c0cc>] driver_register+0x80/0xfc [<c037cd80>] __platform_driver_register+0x50/0x64 [<bf007018>] 0xbf007018 [<c0008a5c>] do_one_initcall+0x88/0x1d8 [<c0095da4>] load_module+0x1768/0x1ef8 [<c0096614>] SyS_init_module+0xe0/0xf4 [<c000ec00>] ret_fast_syscall+0x0/0x48 -> #0 (cooling_cpufreq_lock){+.+.+.}: [<c00619f8>] lock_acquire+0xb0/0x124 [<c06ba3b4>] mutex_lock_nested+0x5c/0x3d8 [<c04abfc4>] cpufreq_thermal_notifier+0x34/0xfc [<c0042bf4>] notifier_call_chain+0x4c/0x8c [<c0042f20>] __blocking_notifier_call_chain+0x50/0x68 [<c0042f58>] blocking_notifier_call_chain+0x20/0x28 [<c04ae62c>] cpufreq_set_policy+0x7c/0x1d0 [<c04af3cc>] store_scaling_governor+0x74/0x9c [<c04ad418>] store+0x90/0xc0 [<c0175384>] sysfs_kf_write+0x54/0x58 [<c01746b4>] kernfs_fop_write+0xdc/0x190 [<c010dcc0>] vfs_write+0xac/0x1b4 [<c010dfec>] SyS_write+0x44/0x90 [<c000ec00>] ret_fast_syscall+0x0/0x48 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock((cpufreq_policy_notifier_list).rwsem); lock(cooling_cpufreq_lock); lock((cpufreq_policy_notifier_list).rwsem); lock(cooling_cpufreq_lock); *** DEADLOCK *** 7 locks held by rc.local/770: #0: (sb_writers#6){.+.+.+}, at: [<c010dda0>] vfs_write+0x18c/0x1b4 #1: (&of->mutex){+.+.+.}, at: [<c0174678>] kernfs_fop_write+0xa0/0x190 #2: (s_active#52){.+.+.+}, at: [<c0174680>] kernfs_fop_write+0xa8/0x190 #3: (cpu_hotplug.lock){++++++}, at: [<c0026a60>] get_online_cpus+0x34/0x90 #4: (cpufreq_rwsem){.+.+.+}, at: [<c04ad3e0>] store+0x58/0xc0 #5: (&policy->rwsem){+.+.+.}, at: [<c04ad3f8>] store+0x70/0xc0 #6: ((cpufreq_policy_notifier_list).rwsem){++++.+}, at: [<c0042f04>] __blocking_notifier_call_chain+0x34/0x68 stack backtrace: CPU: 0 PID: 770 Comm: rc.local Not tainted 3.18.0+ #1453 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Backtrace: [<c00121c8>] (dump_backtrace) from [<c0012360>] (show_stack+0x18/0x1c) r6:c0b85a80 r5:c0b75630 r4:00000000 r3:00000000 [<c0012348>] (show_stack) from [<c06b6c48>] (dump_stack+0x7c/0x98) [<c06b6bcc>] (dump_stack) from [<c06b42a4>] (print_circular_bug+0x28c/0x2d8) r4:c0b85a80 r3:d0071d40 [<c06b4018>] (print_circular_bug) from [<c00613b0>] (__lock_acquire+0x1acc/0x1bb0) r10:c0b50660 r8:c09e6d80 r7:d0071d40 r6:c11d0f0c r5:00000007 r4:d0072240 [<c005f8e4>] (__lock_acquire) from [<c00619f8>] (lock_acquire+0xb0/0x124) r10:00000000 r9:c04abfc4 r8:00000000 r7:00000000 r6:00000000 r5:c0a06f0c r4:00000000 [<c0061948>] (lock_acquire) from [<c06ba3b4>] (mutex_lock_nested+0x5c/0x3d8) r10:ec853800 r9:c0a06ed4 r8:d0071d40 r7:c0a06ed4 r6:c11d0f0c r5:00000000 r4:c04abfc4 [<c06ba358>] (mutex_lock_nested) from [<c04abfc4>] (cpufreq_thermal_notifier+0x34/0xfc) r10:ec853800 r9:ec85380c r8:d00d7d3c r7:c0a06ed4 r6:d00d7d3c r5:00000000 r4:fffffffe [<c04abf90>] (cpufreq_thermal_notifier) from [<c0042bf4>] (notifier_call_chain+0x4c/0x8c) r7:00000000 r6:00000000 r5:00000000 r4:fffffffe [<c0042ba8>] (notifier_call_chain) from [<c0042f20>] (__blocking_notifier_call_chain+0x50/0x68) r8:c0a072a4 r7:00000000 r6:d00d7d3c r5:ffffffff r4:c0a06fc8 r3:ffffffff [<c0042ed0>] (__blocking_notifier_call_chain) from [<c0042f58>] (blocking_notifier_call_chain+0x20/0x28) r7:ec98b540 r6:c13ebc80 r5:ed76e600 r4:d00d7d3c [<c0042f38>] (blocking_notifier_call_chain) from [<c04ae62c>] (cpufreq_set_policy+0x7c/0x1d0) [<c04ae5b0>] (cpufreq_set_policy) from [<c04af3cc>] (store_scaling_governor+0x74/0x9c) r7:ec98b540 r6:0000000c r5:ec98b540 r4:ed76e600 [<c04af358>] (store_scaling_governor) from [<c04ad418>] (store+0x90/0xc0) r6:0000000c r5:ed76e6d4 r4:ed76e600 [<c04ad388>] (store) from [<c0175384>] (sysfs_kf_write+0x54/0x58) r8:0000000c r7:d00d7f78 r6:ec98b540 r5:0000000c r4:ec853800 r3:0000000c [<c0175330>] (sysfs_kf_write) from [<c01746b4>] (kernfs_fop_write+0xdc/0x190) r6:ec98b540 r5:00000000 r4:00000000 r3:c0175330 [<c01745d8>] (kernfs_fop_write) from [<c010dcc0>] (vfs_write+0xac/0x1b4) r10:0162aa70 r9:d00d6000 r8:0000000c r7:d00d7f78 r6:0162aa70 r5:0000000c r4:eccde500 [<c010dc14>] (vfs_write) from [<c010dfec>] (SyS_write+0x44/0x90) r10:0162aa70 r8:0000000c r7:eccde500 r6:eccde500 r5:00000000 r4:00000000 [<c010dfa8>] (SyS_write) from [<c000ec00>] (ret_fast_syscall+0x0/0x48) r10:00000000 r8:c000edc4 r7:00000004 r6:000216cc r5:0000000c r4:0162aa70 Solve this by moving to finer grained locking - use one mutex to protect the cpufreq_dev_list as a whole, and a separate lock to ensure correct ordering of cpufreq notifier registration and removal. cooling_list_lock is taken within cooling_cpufreq_lock on (un)registration to preserve the behavior of the code, i.e. to atomically add/remove to the list and (un)register the notifier. Fixes: 2dcd851fe4b4 ("thermal: cpu_cooling: Update always cpufreq policy with Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> commit 92f26189b181a65fcb1ff6220a4bf45d44502e4a Author: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Date: Thu Aug 13 19:06:05 2015 +0530 auxdisplay: ks0108: initialize local parport variable The local variable ks0108_parport is used by other functions to write to the parallel port. We missed initializing it when we converted the driver to use new Parallel Port codes. Fixes: 4edd70c133f3 ("auxdisplay: ks0108: use new parport device model") Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 22de5f8bf8c0c03b775cc35cbe0e2c4d43d75f5a Author: Jaegeuk Kim <jaegeuk@kernel.org> Date: Tue Aug 11 21:59:49 2015 -0700 f2fs: skip checkpoint if there is no dirty and prefree segments We should avoid needless checkpoints when there is no dirty and prefree segment. eviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> commit 133acc6fa7c50d11da0138ad5039bb299c343dd5 Author: Chao Yu <chao2.yu@samsung.com> Date: Tue Jul 28 18:33:46 2015 +0800 f2fs: shrink free_nids entries This patch introduces __count_free_nids/try_to_free_nids and registers them in slab shrinker for shrinking under memory pressure. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> commit 2d69049ad68145747641e336681a43d17bfb0aaf Author: Chao Yu <chao2.yu@samsung.com> Date: Wed Aug 12 17:48:21 2015 +0800 f2fs: avoid clear valid page In f2fs_delete_entry, if last dirent is remove from the dentry page, we will try to punch that page since it has no valid date in it. But truncate_hole which is used for punching could fail because of no memory or IO error, if that happened, we'd better skip clearing this valid dentry page. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> commit 19b1f190271a345137c1139bef3573173d714cfd Author: Chao Yu <chao2.yu@samsung.com> Date: Wed Aug 12 17:47:08 2015 +0800 MAINTAINERS: add myself as a dedicated reviewer of f2fs I volunteer to be a dedicated reviewer of f2fs, add my email address in maintainship entry of f2fs. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> commit f706ffe3f7a6b8fdc7251d68d8689a0d9546500c Author: Jaegeuk Kim <jaegeuk@kernel.org> Date: Tue Aug 11 12:45:39 2015 -0700 f2fs: do not write any node pages related to orphan inodes We should not write node pages when deleting orphan inodes. In order to do that, we can eaisly set POR_DOING flag earlier before entering orphan inode routine. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> commit 4c278394b0feb7aadc538be12ab0474b106a7255 Author: Jaegeuk Kim <jaegeuk@kernel.org> Date: Tue Aug 11 16:01:30 2015 -0700 f2fs: avoid a build warning If F2FS_CHECK_FS is turned off, we can get a build warning for unused variable. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> commit 8c14bfadeac2a01b305ef4434907295b81b58db2 Author: Chao Yu <chao2.yu@samsung.com> Date: Fri Aug 7 17:58:43 2015 +0800 f2fs: handle error of f2fs_iget correctly In recover_orphan_inode, whenever f2fs_iget fail, we will make kernel panic, but it's not reasonable, because f2fs_iget can fail due to a lot of reasons including out of memory. So we change error handling method as below: a) when finding no entry for the orphan inode, bug_on for catching bugs; b) for other reasons, report it to caller. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> commit 47e70ca46f9074efe6573263c0de5bef0af829de Author: Jaegeuk Kim <jaegeuk@kernel.org> Date: Tue Aug 11 10:17:27 2015 -0700 f2fs: do not assign a new segment for dio under space shortage If there is not enough free segment, we should not assign a new segment explicitly. Otherwise, we can run out of free segment. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> commit 4d283ec908e617fa28bcb06bce310206f0655d67 Author: Andy Lutomirski <luto@kernel.org> Date: Thu Aug 13 13:18:48 2015 -0700 x86/kvm: Rename VMX's segment access rights defines VMX encodes access rights differently from LAR, and the latter is most likely what x86 people think of when they think of "access rights". Rename them to avoid confusion. Cc: kvm@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> commit 5417cb576a80d3fab0437c13163e8f71aa58dc50 Author: Alexandre Belloni <alexandre.belloni@free-electrons.com> Date: Tue Aug 4 11:33:59 2015 +0200 rtc: rx8025: check time validity when necessary Check time validity when reading time as this is when we need to know. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> commit 115a7a530d303f47a37468b3c284dcba999a8ac5 Author: Alexandre Belloni <alexandre.belloni@free-electrons.com> Date: Tue Aug 4 11:24:33 2015 +0200 rtc: rx8025: fix RX8025_BIT_CTRL2_CTFG initialization RX8025_BIT_CTRL2_CTFG was set to 0 only when it was already 0. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> commit bf4b02d33764ce8d06a1e6091a7ea673cd7d3935 Author: Alexandre Belloni <alexandre.belloni@free-electrons.com> Date: Tue Aug 4 10:56:50 2015 +0200 rtc: rx8025: remove useless initialization irq_freq is already initialized to 1 in rtc_device_register() Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> commit 25f8014d5f584803d1706a2c785d4b4f12c874d5 Author: Alexandre Belloni <alexandre.belloni@free-electrons.com> Date: Tue Aug 4 10:48:20 2015 +0200 rtc: rx8025: reset validity when setting time Wait for the user to set the time to reset the validity bits. Until then, the time may be invalid. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> commit 647ce61d8fbfd69c4743a5e8aff935954cae326c Author: Alexandre Belloni <alexandre.belloni@free-electrons.com> Date: Tue Aug 4 10:46:22 2015 +0200 rtc: rx8025: fix rx8025_init_client() rx8025_init_client is modifying ctrl[0] and writing it to RX8025_REG_CTRL2 but ctrl[0] is actually RX8025_REG_CTRL1. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> commit 42e0f616eff588f4d211511b76e17af846ad5bf7 Author: Alexandre Belloni <alexandre.belloni@free-electrons.com> Date: Tue Aug 4 00:45:37 2015 +0200 rtc: rx8025: continue without alarm when irq request fails Instead of bailing out, disable alarms and continue when devm_request_threaded_irq() fails. This allows to still provide some functionality. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> commit 89b43e9954f105656cea8985a361be45b107495f Author: Alexandre Belloni <alexandre.belloni@free-electrons.com> Date: Tue Aug 4 00:40:25 2015 +0200 rtc: rx8025: cleanup accessors Remove useless error messages, at that point, the user already knows something went wrong but will not be able to do anything about it anyway. It is also highly unlikely that some registers are readable/writable but not some other ones. Also, transform rx8025_read_reg to be more resemblant to i2c_smbus_read_byte_data() Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> commit 8f2e038d69db32e113a09c0e74616d5dc0637740 Author: Alexandre Belloni <alexandre.belloni@free-electrons.com> Date: Sun Jul 26 10:13:31 2015 +0200 rtc: rx8025: don't reset the time Stop setting the time to epoch when it is invalid. The proper way to handle that is to return an error whe…

…t initialized. In case something goes wrong with power well initialization we were calling intel_prepare_ddi during boot while encoder list isnt't initilized. [ 9.618747] i915 0000:00:02.0: Invalid ROM contents [ 9.631446] [drm] failed to find VBIOS tables [ 9.720036] BUG: unable to handle kernel NULL pointer dereference at 00000000 00000058 [ 9.721986] IP: [<ffffffffa014eb72>] ddi_get_encoder_port+0x82/0x190 [i915] [ 9.723736] PGD 0 [ 9.724286] Oops: 0000 [#1] PREEMPT SMP [ 9.725386] Modules linked in: intel_powerclamp snd_hda_intel(+) coretemp crc 32c_intel snd_hda_codec snd_hda_core serio_raw snd_pcm snd_timer i915(+) parport _pc parport pinctrl_sunrisepoint pinctrl_intel nfsd nfs_acl [ 9.730635] CPU: 0 PID: 497 Comm: systemd-udevd Not tainted 4.3.0-rc2-eywa-10 967-g72de2cfd-dirty #2 [ 9.732785] Hardware name: Intel Corporation Cannonlake Client platform/Skyla ke DT DDR4 RVP8, BIOS CNLSE2R1.R00.X021.B00.1508040310 08/04/2015 [ 9.735785] task: ffff88008a704700 ti: ffff88016a1ac000 task.ti: ffff88016a1a c000 [ 9.737584] RIP: 0010:[<ffffffffa014eb72>] [<ffffffffa014eb72>] ddi_get_enco der_port+0x82/0x190 [i915] [ 9.739934] RSP: 0000:ffff88016a1af710 EFLAGS: 00010296 [ 9.741184] RAX: 000000000000004e RBX: ffff88008a9edc98 RCX: 0000000000000001 [ 9.742934] RDX: 000000000000004e RSI: ffffffff81fc1e82 RDI: 00000000ffffffff [ 9.744634] RBP: ffff88016a1af730 R08: 0000000000000000 R09: 0000000000000578 [ 9.746333] R10: 0000000000001065 R11: 0000000000000578 R12: fffffffffffffff8 [ 9.748033] R13: ffff88016a1af7a8 R14: ffff88016a1af794 R15: 0000000000000000 [ 9.749733] FS: 00007eff2e1e07c0(0000) GS:ffff88016fc00000(0000) knlGS:00000 00000000000 [ 9.751683] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9.753083] CR2: 0000000000000058 CR3: 000000016922b000 CR4: 00000000003406f0 [ 9.754782] Stack: [ 9.755332] ffff88008a9edc98 ffff88008a9ed800 ffffffffa01d07b0 00000000fffb9 09e [ 9.757232] ffff88016a1af7d8 ffffffffa0154ea7 0000000000000246 ffff88016a370 080 [ 9.759182] ffff88016a370080 ffff88008a9ed800 0000000000000246 ffff88008a9ed c98 [ 9.761132] Call Trace: [ 9.761782] [<ffffffffa0154ea7>] intel_prepare_ddi+0x67/0x860 [i915] [ 9.763332] [<ffffffff81a56996>] ? _raw_spin_unlock_irqrestore+0x26/0x40 [ 9.765031] [<ffffffffa00fad01>] ? gen9_read32+0x141/0x360 [i915] [ 9.766531] [<ffffffffa00b43e1>] skl_set_power_well+0x431/0xa80 [i915] [ 9.768181] [<ffffffffa00b4a63>] skl_power_well_enable+0x13/0x20 [i915] [ 9.769781] [<ffffffffa00b2188>] intel_power_well_enable+0x28/0x50 [i915] [ 9.771481] [<ffffffffa00b4d52>] intel_display_power_get+0x92/0xc0 [i915] [ 9.773180] [<ffffffffa00b4fcb>] intel_display_set_init_power+0x3b/0x40 [i91 5] [ 9.774980] [<ffffffffa00b5170>] intel_power_domains_init_hw+0x120/0x520 [i9 15] [ 9.776780] [<ffffffffa0194c61>] i915_driver_load+0xb21/0xf40 [i915] So let's protect this case. My first attempt was to remove the intel_prepare_ddi, but Daniel had pointed out this is really needed to restore those registers values. And Imre pointed out that this case was without the flag protection and this was actually where things were going bad. So I've just checked and this indeed solves my issue. The regressing intel_prepare_ddi call was added in commit 1d2b952 Author: Damien Lespiau <damien.lespiau@intel.com> Date: Fri Mar 6 18:50:53 2015 +0000 drm/i915/skl: Restore the DDI translation tables when enabling PW1 Cc: Imre Deak <imre.deak@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> [Jani: regression reference] Signed-off-by: Jani Nikula <jani.nikula@intel.com>

Dmitry Vyukov reported the following using trinity and the memory error detector AddressSanitizer (https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel). [ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on address ffff88002e280000 [ 124.576801] ffff88002e280000 is located 131938492886538 bytes to the left of 28857600-byte region [ffffffff81282e0a, ffffffff82e0830a) [ 124.578633] Accessed by thread T10915: [ 124.579295] inlined in describe_heap_address ./arch/x86/mm/asan/report.c:164 [ 124.579295] #0 ffffffff810dd277 in asan_report_error ./arch/x86/mm/asan/report.c:278 [ 124.580137] #1 ffffffff810dc6a0 in asan_check_region ./arch/x86/mm/asan/asan.c:37 [ 124.581050] #2 ffffffff810dd423 in __tsan_read8 ??:0 [ 124.581893] #3 ffffffff8107c093 in get_wchan ./arch/x86/kernel/process_64.c:444 The address checks in the 64bit implementation of get_wchan() are wrong in several ways: - The lower bound of the stack is not the start of the stack page. It's the start of the stack page plus sizeof (struct thread_info) - The upper bound must be: top_of_stack - TOP_OF_KERNEL_STACK_PADDING - 2 * sizeof(unsigned long). The 2 * sizeof(unsigned long) is required because the stack pointer points at the frame pointer. The layout on the stack is: ... IP FP ... IP FP. So we need to make sure that both IP and FP are in the bounds. Fix the bound checks and get rid of the mix of numeric constants, u64 and unsigned long. Making all unsigned long allows us to use the same function for 32bit as well. Use READ_ONCE() when accessing the stack. This does not prevent a concurrent wakeup of the task and the stack changing, but at least it avoids TOCTOU. Also check task state at the end of the loop. Again that does not prevent concurrent changes, but it avoids walking for nothing. Add proper comments while at it. Reported-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Based-on-patch-from: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: kasan-dev <kasan-dev@googlegroups.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20150930083302.694788319@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Commit 1a3d595 ("MIPS: Tidy up FPU context switching") removed FP context saving from the asm-written resume function in favour of reusing existing code to perform the same task. However it only removed the FP context saving code from the r4k_switch.S implementation of resume. Octeon uses its own implementation in octeon_switch.S, so remove FP context saving there too in order to prevent attempting to save context twice. That formerly led to an exception from the second save as follows because the FPU had already been disabled by the first save: do_cpu invoked from kernel context![#1]: CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.3.0-rc2-dirty #2 task: 800000041f84a008 ti: 800000041f864000 task.ti: 800000041f864000 $ 0 : 0000000000000000 0000000010008ce1 0000000000100000 ffffffffbfffffff $ 4 : 800000041f84a008 800000041f84ac08 800000041f84c000 0000000000000004 $ 8 : 0000000000000001 0000000000000000 0000000000000000 0000000000000001 $12 : 0000000010008ce3 0000000000119c60 0000000000000036 800000041f864000 $16 : 800000041f84ac08 800000000792ce80 800000041f84a008 ffffffff81758b00 $20 : 0000000000000000 ffffffff8175ae50 0000000000000000 ffffffff8176c740 $24 : 0000000000000006 ffffffff81170300 $28 : 800000041f864000 800000041f867d90 0000000000000000 ffffffff815f3fa0 Hi : 0000000000fa8257 Lo : ffffffffe15cfc00 epc : ffffffff8112821c resume+0x9c/0x200 ra : ffffffff815f3fa0 __schedule+0x3f0/0x7d8 Status: 10008ce2 KX SX UX KERNEL EXL Cause : 1080002c (ExcCode 0b) PrId : 000d0601 (Cavium Octeon+) Modules linked in: Process kthreadd (pid: 2, threadinfo=800000041f864000, task=800000041f84a008, tls=0000000000000000) Stack : ffffffff81604218 ffffffff815f7e08 800000041f84a008 ffffffff811681b0 800000041f84a008 ffffffff817e9878 0000000000000000 ffffffff81770000 ffffffff81768340 ffffffff81161398 0000000000000001 0000000000000000 0000000000000000 ffffffff815f4424 0000000000000000 ffffffff81161d68 ffffffff81161be8 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ffffffff8111e16c 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff8112821c>] resume+0x9c/0x200 [<ffffffff815f3fa0>] __schedule+0x3f0/0x7d8 [<ffffffff815f4424>] schedule+0x34/0x98 [<ffffffff81161d68>] kthreadd+0x180/0x198 [<ffffffff8111e16c>] ret_from_kernel_thread+0x14/0x1c Tested using cavium_octeon_defconfig on an EdgeRouter Lite. Fixes: 1a3d595 ("MIPS: Tidy up FPU context switching") Reported-by: Aaro Koskinen <aaro.koskinen@nokia.com> Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-kernel@vger.kernel.org Cc: Chandrakala Chavva <cchavva@caviumnetworks.com> Cc: David Daney <david.daney@cavium.com> Cc: Leonid Rosenboim <lrosenboim@caviumnetworks.com> Patchwork: https://patchwork.linux-mips.org/patch/11166/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

My colleague ran into a program stall on a x86_64 server, where n_tty_read() was waiting for data even if there was data in the buffer in the pty. kernel stack for the stuck process looks like below. #0 [ffff88303d107b58] __schedule at ffffffff815c4b20 #1 [ffff88303d107bd0] schedule at ffffffff815c513e #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818 #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2 #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23 #5 [ffff88303d107dd0] tty_read at ffffffff81368013 torvalds#6 [ffff88303d107e20] __vfs_read at ffffffff811a3704 torvalds#7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57 torvalds#8 [ffff88303d107f00] sys_read at ffffffff811a4306 torvalds#9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7 There seems to be two problems causing this issue. First, in drivers/tty/n_tty.c, __receive_buf() stores the data and updates ldata->commit_head using smp_store_release() and then checks the wait queue using waitqueue_active(). However, since there is no memory barrier, __receive_buf() could return without calling wake_up_interactive_poll(), and at the same time, n_tty_read() could start to wait in wait_woken() as in the following chart. __receive_buf() n_tty_read() ------------------------------------------------------------------------ if (waitqueue_active(&tty->read_wait)) /* Memory operations issued after the RELEASE may be completed before the RELEASE operation has completed */ add_wait_queue(&tty->read_wait, &wait); ... if (!input_available_p(tty, 0)) { smp_store_release(&ldata->commit_head, ldata->read_head); ... timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout); ------------------------------------------------------------------------ The second problem is that n_tty_read() also lacks a memory barrier call and could also cause __receive_buf() to return without calling wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken() as in the chart below. __receive_buf() n_tty_read() ------------------------------------------------------------------------ spin_lock_irqsave(&q->lock, flags); /* from add_wait_queue() */ ... if (!input_available_p(tty, 0)) { /* Memory operations issued after the RELEASE may be completed before the RELEASE operation has completed */ smp_store_release(&ldata->commit_head, ldata->read_head); if (waitqueue_active(&tty->read_wait)) __add_wait_queue(q, wait); spin_unlock_irqrestore(&q->lock,flags); /* from add_wait_queue() */ ... timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout); ------------------------------------------------------------------------ There are also other places in drivers/tty/n_tty.c which have similar calls to waitqueue_active(), so instead of adding many memory barrier calls, this patch simply removes the call to waitqueue_active(), leaving just wake_up*() behind. This fixes both problems because, even though the memory access before or after the spinlocks in both wake_up*() and add_wait_queue() can sneak into the critical section, it cannot go past it and the critical section assures that they will be serialized (please see "INTER-CPU ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a better explanation). Moreover, the resulting code is much simpler. Latency measurement using a ping-pong test over a pty doesn't show any visible performance drop. Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

When running kprobe test on arm64 rt kernel, it reports the below warning: root@qemu7:~# modprobe kprobe_example BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917 in_atomic(): 0, irqs_disabled(): 128, pid: 484, name: modprobe CPU: 0 PID: 484 Comm: modprobe Not tainted 4.1.6-rt5 #2 Hardware name: linux,dummy-virt (DT) Call trace: [<ffffffc0000891b8>] dump_backtrace+0x0/0x128 [<ffffffc000089300>] show_stack+0x20/0x30 [<ffffffc00061dae8>] dump_stack+0x1c/0x28 [<ffffffc0000bbad0>] ___might_sleep+0x120/0x198 [<ffffffc0006223e8>] rt_spin_lock+0x28/0x40 [<ffffffc000622b30>] __aarch64_insn_write+0x28/0x78 [<ffffffc000622e48>] aarch64_insn_patch_text_nosync+0x18/0x48 [<ffffffc000622ee8>] aarch64_insn_patch_text_cb+0x70/0xa0 [<ffffffc000622f40>] aarch64_insn_patch_text_sync+0x28/0x48 [<ffffffc0006236e0>] arch_arm_kprobe+0x38/0x48 [<ffffffc00010e6f4>] arm_kprobe+0x34/0x50 [<ffffffc000110374>] register_kprobe+0x4cc/0x5b8 [<ffffffbffc002038>] kprobe_init+0x38/0x7c [kprobe_example] [<ffffffc000084240>] do_one_initcall+0x90/0x1b0 [<ffffffc00061c498>] do_init_module+0x6c/0x1cc [<ffffffc0000fd0c0>] load_module+0x17f8/0x1db0 [<ffffffc0000fd8cc>] SyS_finit_module+0xb4/0xc8 Convert patch_lock to raw loc kto avoid this issue. Although the problem is found on rt kernel, the fix should be applicable to mainline kernel too. Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>

When freqdomain_cpus attribute is read from an offlined cpu, it will cause crash. This change prevents calling cpufreq_show_cpus when policy driver_data is NULL. Crash info: [ 170.814949] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 170.814990] IP: [<ffffffff813b2490>] _find_next_bit.part.0+0x10/0x70 [ 170.815021] PGD 227d30067 PUD 229e56067 PMD 0 [ 170.815043] Oops: 0000 [#2] SMP [ 170.816022] CPU: 3 PID: 3121 Comm: cat Tainted: G D OE 4.3.0-rc3+ torvalds#33 ... ... [ 170.816657] Call Trace: [ 170.816672] [<ffffffff813b2505>] ? find_next_bit+0x15/0x20 [ 170.816696] [<ffffffff8160e47c>] cpufreq_show_cpus+0x5c/0xd0 [ 170.816722] [<ffffffffa031a409>] show_freqdomain_cpus+0x19/0x20 [acpi_cpufreq] [ 170.816749] [<ffffffff8160e65b>] show+0x3b/0x60 [ 170.816769] [<ffffffff8129b31c>] sysfs_kf_seq_show+0xbc/0x130 [ 170.816793] [<ffffffff81299be3>] kernfs_seq_show+0x23/0x30 [ 170.816816] [<ffffffff81240f2c>] seq_read+0xec/0x390 [ 170.816837] [<ffffffff8129a64a>] kernfs_fop_read+0x10a/0x160 [ 170.816861] [<ffffffff8121d9b7>] __vfs_read+0x37/0x100 [ 170.816883] [<ffffffff813217c0>] ? security_file_permission+0xa0/0xc0 [ 170.816909] [<ffffffff8121e2e3>] vfs_read+0x83/0x130 [ 170.816930] [<ffffffff8121f035>] SyS_read+0x55/0xc0 ... ... [ 170.817185] ---[ end trace bc6eadf82b2b965a ]--- Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.2+ <stable@vger.kernel.org> # 4.2+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

writing to rootfs causes system crash with the branch block-mpage-bvecs-for-next #2

writing to rootfs causes system crash with the branch block-mpage-bvecs-for-next #2

dongsupark commented Dec 7, 2014

writing to rootfs causes system crash with the branch block-mpage-bvecs-for-next #2

writing to rootfs causes system crash with the branch block-mpage-bvecs-for-next #2

Comments

dongsupark commented Dec 7, 2014