Added backlight support for some samsung laptops #11
Models: * N120 * R468/R418 * X320/X420/X520 * R510/P510 * N350 * R470/R420 * R528/R728 * SQ1S
I'm not doing github pulls. The pull requests are seriously Please don't press the "pull request" github button. Do proper kernel
How are pull requests seriously misdesigned (apart from the fact that you might be used to a different kind of workflow)?
I'm not doing linux kernel pulls. The kernel pulls are seriously Please don't press the "pull request" kernel button. Do proper github GitHub
I honestly would like to know why github pull requests are misdesigned. I'll grant that I didn't actually create git but they seem to work just fine, is there something I am missing?
Wow, great discussion went on there. Schacon raised perfectly valid points and Torvalds was basically "f this, I don't care, you're crazy". Great response!
On Fri, Sep 9, 2011 at 12:49 AM, Nils Werner
Can you read? "If the merge message doesn't tell me who the merge is from and what If you can't understand that, then yes, you're crazy. Or just terminally stupid. The quality of github "issues" and comments really is very low. This
First, I agree with Scott: In many cases people delete their fork (or at least the branch). So where would the message point you to? The pull request of the pulling repository will much more likely be around for a long time. Also, what if the branch you'll pull from has changed in the meantime? You'd end up with changes that are not documented in the pull request and thus not reviewed by the ones discussing the pull request. As soon as the PR is posted you must put them out of reach of the author to keep them from sneaking in changes.
Also, you did notice that you've proven my point right there, right?
On Fri, Sep 9, 2011 at 12:10 PM, Nils Werner
That's a "implementation problem". It's not an argument for doing crap. Simple solution: if people delete the branch or repository, consider You can make the "pull request" namespace separate from the branch git pull git://github.com/ and then if there i a previous pull request, add a number to it (so it Or something along those lines. The important part is that YOU MUST
We actually do this in the kernel on purpose sometimes - people fix up That said, again, you could do the same thing: if somebody changes a
Umm, considering that the pull requests used to have no documentation As soon as the PR is posted you must put them out of reach of the
Umm. I'm not polite. Big news. I'd rather be acerbic than stupid.
A decentralized system that doesn't accept disappearing nodes sounds more like a design problem.
Years after the branch has been merged? Is that a problem we wanted to solve?
I meant malicious changes. Hierarchies are shallow, elite circles basically nonexistent so that's a real issue. And the biggest strength of GitHub.
That's the first constructive comment in this discussion. And it sounds like a good idea, apart from the problem that you'd lose the link to the PR which, to many, is more useful than being able to immediately recognise the source. It would also probably require lots of modifications to the daemon, and very disciplined contributors (always make sure to use dead-end topic branches, not everybody does that). Separating the two simply improves the workflow a lot. It'd be interesting to hear what @schacon has to say about it.
When was that? Months ago? I am talking about your comment 2 days ago.
A personal, unrelated note: Being unable to lead an objective discussion. Judging people, then insulting them just to prove a point. Recognising one's flaws but being unwilling to change them, instead bragging about them. Missing the ability to reflect on one's actions during interactions with others. That sounds pretty stupid to me. Anyways, I'm moving on.
I thought we were talking about pull requests and branches? When did a branch become a node?
Except that, as indicated by Scott Chacon [0], the most common scenario is to perform the pull request locally on your machine, allowing you to pull the code and then review it without said code being changed before merging. I can understand your argument in relation to pull requests done using the button on the website though.
[0] https://github.com/torvalds/diveclog/pull/18
* Ingo Molnar <mingo@elte.hu> wrote:

> The patch below addresses these concerns, serializes the output, tidies up the
> printout, resulting in this new output:

There's one bug remaining that my patch does not address: the vCPUs are not printed in order:

# vCPU #0's dump:
# vCPU #2's dump:
# vCPU #24's dump:
# vCPU #5's dump:
# vCPU #39's dump:
# vCPU #38's dump:
# vCPU #51's dump:
# vCPU #11's dump:
# vCPU #10's dump:
# vCPU #12's dump:

This is undesirable as the order of printout is highly random, so successive dumps are difficult to compare.

The patch below serializes the signalling itself. (this is on top of the previous patch)

The patch also tweaks the vCPU printout line a bit so that it does not start with '#', which is discarded if such messages are pasted into Git commit messages.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
If the pte mapping in generic_perform_write() is unmapped between iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the "copied" parameter to ->end_write can be zero. ext4 couldn't cope with it with delayed allocations enabled. This skips the i_disksize enlargement logic if copied is zero and no new data was appended to the inode.

gdb> bt
#0 0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
#1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
#2 0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
#3 generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
#4 0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0xffff88001e26be40) at mm/filemap.c:2600
#5 0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimized out>, pos=<value optimized out>) at mm/filemap.c:2632
#6 0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) at fs/ext4/file.c:136
#7 0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, ppos=0xffff88001e26bf48) at fs/read_write.c:406
#8 0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4000, pos=0xffff88001e26bf48) at fs/read_write.c:435
#9 0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4000) at fs/read_write.c:487
#10 <signal handler called>
#11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
#12 0x0000000000000000 in ?? ()
gdb> print offset
$22 = 0xffffffffffffffff
gdb> print idx
$23 = 0xffffffff
gdb> print inode->i_blkbits
$24 = 0xc
gdb> up
#1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
2512 if (ext4_da_should_update_i_disksize(page, end)) {
gdb> print start
$25 = 0x0
gdb> print end
$26 = 0xffffffffffffffff
gdb> print pos
$27 = 0x108000
gdb> print new_i_size
$28 = 0x108000
gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
$29 = 0xd9000
gdb> down
2467 for (i = 0; i < idx; i++)
gdb> print i
$30 = 0xd44acbee

This is 100% reproducible with some autonuma development code tuned in a very aggressive manner (not normal way even for knumad) which does "exotic" changes to the ptes. It wouldn't normally trigger but I don't see why it can't happen normally if the page is added to swap cache in between the two faults leading to "copied" being zero (which then hangs in ext4). So it should be fixed. Especially possible with lumpy reclaim (albeit disabled if compaction is enabled) as that would ignore the young bits in the ptes.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
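The gdb values above (start = 0x0, end = 0xffffffffffffffff, idx = 0xffffffff) are consistent with an unsigned "start + copied - 1" computation wrapping around when copied is zero. The following is only a user-space sketch of that arithmetic and of a guard in the spirit of the commit message. The function names, the 4 KiB page size and the exact form of the guard are assumptions for illustration, not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size; matches len=0x1000 and i_blkbits=0xc in the trace. */
#define SKETCH_PAGE_SIZE 0x1000u

/*
 * Offset of the last byte written within the page (hypothetical helper).
 * With start == 0 and copied == 0 the subtraction wraps to UINT64_MAX,
 * which is the value gdb prints for "end".
 */
uint64_t sketch_end_offset(uint64_t pos, uint64_t copied)
{
    uint64_t start = pos & (SKETCH_PAGE_SIZE - 1);
    return start + copied - 1;
}

/*
 * Guarded decision (assumed form): only consider enlarging the on-disk
 * size when something was actually copied and the file grew past it.
 */
bool sketch_may_update_disksize(uint64_t pos, uint64_t copied,
                                uint64_t i_disksize)
{
    uint64_t new_i_size = pos + copied;
    return copied != 0 && new_i_size > i_disksize;
}
```

With the values from the trace (pos = 0x108000, copied = 0, i_disksize = 0xd9000) the unguarded comparison would still see new_i_size > i_disksize, so the check on copied is what keeps the wrapped offset from ever being used.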
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not update the real num tx queues. netdev_queue_update_kobjects() is already called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when upper layer driver, e.g., FCoE protocol stack is monitoring the netdev event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove extra queues allocated for FCoE, the associated txq sysfs kobjects are already removed, and trying to update the real num queues would cause something like below:

...
PID: 25138 TASK: ffff88021e64c440 CPU: 3 COMMAND: "kworker/3:3"
#0 [ffff88021f007760] machine_kexec at ffffffff810226d9
#1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
#2 [ffff88021f0078a0] oops_end at ffffffff813bca78
#3 [ffff88021f0078d0] no_context at ffffffff81029e72
#4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
#5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
#6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
#7 [ffff88021f007b10] page_fault at ffffffff813bc045
[exception RIP: sysfs_find_dirent+17]
RIP: ffffffff81178611 RSP: ffff88021f007bc0 RFLAGS: 00010246
RAX: ffff88021e64c440 RBX: ffffffff8156cc63 RCX: 0000000000000004
RDX: ffffffff8156cc63 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88021f007be0 R8: 0000000000000004 R9: 0000000000000008
R10: ffffffff816fed00 R11: 0000000000000004 R12: 0000000000000000
R13: ffffffff8156cc63 R14: 0000000000000000 R15: ffff8802222a0000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
#9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
#19 [ffff88021f007e68] worker_thread at ffffffff81060513
#20 [ffff88021f007ee8] kthread at ffffffff810648b6
#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
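The rule described above (ignore queue-count updates once unregistration has begun) can be modelled in user space roughly as follows. This is a sketch under stated assumptions: the enum, the struct, the counter standing in for the sysfs kobject update, and the choice to report "ignored" via a boolean are all hypothetical and do not reproduce the kernel's data structures or return conventions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the registration states named in the message. */
enum sketch_reg_state {
    SKETCH_REGISTERED,
    SKETCH_UNREGISTERING,
    SKETCH_UNREGISTERED
};

struct sketch_netdev {
    enum sketch_reg_state reg_state;
    unsigned int real_num_tx_queues;
    unsigned int kobject_updates;  /* counts simulated sysfs kobject updates */
};

/* Returns true if the queue count was changed, false if the request was ignored. */
bool sketch_set_real_num_tx_queues(struct sketch_netdev *dev, unsigned int txq)
{
    /* The queue kobjects are already removed once unregistration has started. */
    if (dev->reg_state == SKETCH_UNREGISTERING ||
        dev->reg_state == SKETCH_UNREGISTERED)
        return false;

    dev->kobject_updates++;        /* stands in for the sysfs kobject update */
    dev->real_num_tx_queues = txq;
    return true;
}
```

In this model a late callback from an upper-layer driver (the FCoE case in the trace) simply leaves the device untouched instead of walking sysfs entries that no longer exist.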
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s"
#0 [d72d3ad0] crash_kexec at c028cfdb
#1 [d72d3b24] oops_end at c05c5322
#2 [d72d3b38] __bad_area_nosemaphore at c0227e60
#3 [d72d3bec] bad_area at c0227fb6
#4 [d72d3c00] do_page_fault at c05c72ec
#5 [d72d3c80] error_code (via page_fault) at c05c47a4
EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000
DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50
CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002
#6 [d72d3cb4] isolate_migratepages at c030b15a
#7 [d72d3d1] zone_watermark_ok at c02d26cb
#8 [d72d3d2c] compact_zone at c030b8de
#9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431
DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788
SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50
CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
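One plausible reading of "calls pfn_valid when necessary" is that the scanner re-checks validity on the first PFN and again every time it crosses a MAX_ORDER_NR_PAGES boundary, skipping the whole block when the check fails. The toy model below illustrates that idea only; the block size, the per-block validity array and every sketch_ name are assumptions, and the real patch may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for MAX_ORDER_NR_PAGES; must be a power of two. */
#define SKETCH_MAX_ORDER_NR_PAGES 8u

/* Toy memory map: one validity flag per MAX_ORDER-sized block. */
struct sketch_zone {
    const bool *block_valid;
    size_t nr_blocks;
};

/* Hypothetical equivalent of a PFN validity check for the toy map. */
bool sketch_pfn_valid(const struct sketch_zone *z, size_t pfn)
{
    size_t block = pfn / SKETCH_MAX_ORDER_NR_PAGES;
    return block < z->nr_blocks && z->block_valid[block];
}

/* Scan [start, end) and return how many PFNs were safe to touch. */
size_t sketch_scan(const struct sketch_zone *z, size_t start, size_t end)
{
    size_t touched = 0;
    size_t pfn = start;

    while (pfn < end) {
        bool at_boundary = (pfn & (SKETCH_MAX_ORDER_NR_PAGES - 1)) == 0;

        /* Re-validate on the first PFN and at each block boundary. */
        if ((pfn == start || at_boundary) && !sketch_pfn_valid(z, pfn)) {
            /* Jump to the first PFN of the next block, past the hole. */
            pfn = (pfn | (SKETCH_MAX_ORDER_NR_PAGES - 1)) + 1;
            continue;
        }
        touched++;  /* stands in for looking at the struct page */
        pfn++;
    }
    return touched;
}
```

In the diagram's terms, a scan that starts at m just below the hole touches the remaining PFNs of its own block, skips the hole block entirely, and resumes at the next boundary H where validity is checked again.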
fixed:

WARNING: please, no space before tabs
#11: FILE: adt7411.c:11:
+ * ^I use power-down mode for suspend?, interrupt handling?$

not fixed as all other macros around it are the same structure and this one is only 2 chars longer:

WARNING: line over 80 characters
#229: FILE: adt7411.c:229:
+static ADT7411_BIT_ATTR(fast_sampling, ADT7411_REG_CFG3, ADT7411_CFG3_ADC_CLK_225);

Signed-off-by: Frans Meulenbroeks <fransmeulenbroeks@gmail.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72ec #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d1] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8de #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb0] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. 
BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72ec #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d1] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8de #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb0] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. 
BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72ec #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d1] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8de #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb0] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. 
BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BugLink: http://bugs.launchpad.net/bugs/907778

commit ea51d13 upstream.

If the pte mapping in generic_perform_write() is unmapped between iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the "copied" parameter to ->end_write can be zero. ext4 couldn't cope with it with delayed allocations enabled. This skips the i_disksize enlargement logic if copied is zero and no new data was appended to the inode.

    gdb> bt
    #0  0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
    #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
    #2  0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
    #3  generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
    #4  0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0xffff88001e26be40) at mm/filemap.c:2600
    #5  0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimized out>, pos=<value optimized out>) at mm/filemap.c:2632
    #6  0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) at fs/ext4/file.c:136
    #7  0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, ppos=0xffff88001e26bf48) at fs/read_write.c:406
    #8  0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4000, pos=0xffff88001e26bf48) at fs/read_write.c:435
    #9  0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4000) at fs/read_write.c:487
    #10 <signal handler called>
    #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
    #12 0x0000000000000000 in ?? ()
    gdb> print offset
    $22 = 0xffffffffffffffff
    gdb> print idx
    $23 = 0xffffffff
    gdb> print inode->i_blkbits
    $24 = 0xc
    gdb> up
    #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
    2512            if (ext4_da_should_update_i_disksize(page, end)) {
    gdb> print start
    $25 = 0x0
    gdb> print end
    $26 = 0xffffffffffffffff
    gdb> print pos
    $27 = 0x108000
    gdb> print new_i_size
    $28 = 0x108000
    gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
    $29 = 0xd9000
    gdb> down
    2467            for (i = 0; i < idx; i++)
    gdb> print i
    $30 = 0xd44acbee

This is 100% reproducible with some autonuma development code tuned in a very aggressive manner (not the normal way even for knumad) which does "exotic" changes to the ptes. It wouldn't normally trigger but I don't see why it can't happen normally if the page is added to swap cache in between the two faults leading to "copied" being zero (which then hangs in ext4). So it should be fixed. Especially possible with lumpy reclaim (albeit disabled if compaction is enabled) as that would ignore the young bits in the ptes.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>
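The values printed in the gdb session (start = 0x0, end = 0xffffffffffffffff, idx = 0xffffffff with i_blkbits = 0xc) are consistent with an unsigned wrap-around when copied is zero. The sketch below reproduces that arithmetic in user space. It is an illustration under assumptions, not the ext4 code: the function names, the 4096-byte page size, the `start + copied - 1` formula and the shape of the guard are all hypothetical, inferred from the commit text and the gdb output.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for the sketch. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Offset of the last byte written within the page (assumed formula).
 * With a page-aligned pos and copied == 0 this wraps to UINT64_MAX.
 */
uint64_t sketch_end_offset(uint64_t pos, uint64_t copied)
{
	uint64_t start = pos & (SKETCH_PAGE_SIZE - 1);

	return start + copied - 1;
}

/* Block index derived from the offset, truncated to 32 bits like idx in the trace. */
uint32_t sketch_block_index(uint64_t offset, unsigned int blkbits)
{
	return (uint32_t)(offset >> blkbits);
}

/*
 * Guard in the spirit of the fix: only consider enlarging i_disksize when
 * something was copied and the write extended past the on-disk size.
 */
bool sketch_should_check_disksize(uint64_t copied, uint64_t new_i_size,
				  uint64_t i_disksize)
{
	return copied != 0 && new_i_size > i_disksize;
}
```

A loop bounded by an index of 0xffffffff over a single page's buffers matches the apparent hang at line 2467, which is why skipping the check entirely for a zero-length copy is the simpler remedy.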
commit cf96b8e upstream.

ASan reports a memory leak caused by evlist not being deleted on exit in perf-report, perf-script and perf-data. The problem is caused by session->evlist not being deleted, which is allocated in perf_session__read_header, called in perf_session__new if perf_data is in read mode. In case of write mode, the session->evlist is filled by the caller. This patch solves the problem by calling evlist__delete in perf_session__delete if perf_data is in read mode.

Changes in v2:
 - call evlist__delete from within perf_session__delete

v1: https://lore.kernel.org/lkml/20210621234317.235545-1-rickyman7@gmail.com/

ASan report follows:

    $ ./perf script report flamegraph
    =================================================================
    ==227640==ERROR: LeakSanitizer: detected memory leaks

    <SNIP unrelated>

    Indirect leak of 2704 byte(s) in 1 object(s) allocated from:
        #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
        #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
        #2 0x7f999e in evlist__new /home/user/linux/tools/perf/util/evlist.c:77:26
        #3 0x8ad938 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3797:20
        #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
        #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
        #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
        #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
        #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
        #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
        #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
        #11 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)

    Indirect leak of 568 byte(s) in 1 object(s) allocated from:
        #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
        #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
        #2 0x80ce88 in evsel__new_idx /home/user/linux/tools/perf/util/evsel.c:268:24
        #3 0x8aed93 in evsel__new /home/user/linux/tools/perf/util/evsel.h:210:9
        #4 0x8ae07e in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3853:11
        #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
        #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
        #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
        #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
        #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
        #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
        #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
        #12 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)

    Indirect leak of 264 byte(s) in 1 object(s) allocated from:
        #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
        #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
        #2 0xbe3e70 in xyarray__new /home/user/linux/tools/lib/perf/xyarray.c:10:23
        #3 0xbd7754 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:361:21
        #4 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
        #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
        #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
        #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
        #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
        #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
        #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
        #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
        #12 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)

    Indirect leak of 32 byte(s) in 1 object(s) allocated from:
        #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
        #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
        #2 0xbd77e0 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:365:14
        #3 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
        #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
        #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
        #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
        #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
        #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
        #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
        #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
        #11 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)

    Indirect leak of 7 byte(s) in 1 object(s) allocated from:
        #0 0x4b8207 in strdup (/home/user/linux/tools/perf/perf+0x4b8207)
        #1 0x8b4459 in evlist__set_event_name /home/user/linux/tools/perf/util/header.c:2292:16
        #2 0x89d862 in process_event_desc /home/user/linux/tools/perf/util/header.c:2313:3
        #3 0x8af319 in perf_file_section__process /home/user/linux/tools/perf/util/header.c:3651:9
        #4 0x8aa6e9 in perf_header__process_sections /home/user/linux/tools/perf/util/header.c:3427:9
        #5 0x8ae3e7 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3886:2
        #6 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
        #7 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
        #8 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
        #9 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
        #10 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
        #11 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
        #12 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
        #13 0x7f5260654b74 (/lib64/libc.so.6+0x27b74)

    SUMMARY: AddressSanitizer: 3728 byte(s) leaked in 7 allocation(s).

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210624231926.212208-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: stable@vger.kernel.org # 5.10.228
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
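The ownership rule this commit message describes (the session frees the evlist only when it allocated it, i.e. in read mode) can be sketched as follows. This is a simplified model and not the perf source: every `sketch_` name, the struct layouts and the live-object counter are hypothetical and exist only to make the rule checkable.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for perf's evlist and session objects. */
struct sketch_evlist {
	int nr_entries;
};

struct sketch_session {
	bool read_mode;               /* true when the data file is opened for reading */
	struct sketch_evlist *evlist; /* owned by the session only in read mode */
};

/* Number of evlists currently allocated, used here to observe leaks. */
int sketch_live_evlists;

struct sketch_evlist *sketch_evlist_new(void)
{
	struct sketch_evlist *e = calloc(1, sizeof(*e));

	if (e)
		sketch_live_evlists++;
	return e;
}

void sketch_evlist_delete(struct sketch_evlist *e)
{
	if (!e)
		return;
	free(e);
	sketch_live_evlists--;
}

/* Read mode allocates an evlist internally; write mode borrows the caller's. */
struct sketch_session *sketch_session_new(bool read_mode,
					  struct sketch_evlist *caller_evlist)
{
	struct sketch_session *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->read_mode = read_mode;
	s->evlist = read_mode ? sketch_evlist_new() : caller_evlist;
	if (read_mode && !s->evlist) {
		free(s);
		return NULL;
	}
	return s;
}

/* Free the evlist only when the session allocated it (read mode). */
void sketch_session_delete(struct sketch_session *s)
{
	if (!s)
		return;
	if (s->read_mode)
		sketch_evlist_delete(s->evlist);
	free(s);
}
```

Tying the free to the same condition that governed the allocation keeps write-mode callers in control of their own evlist and avoids a double free on that path.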
commit 5b188cc upstream.

Disable strict aliasing, as has been done in the kernel proper for decades (literally since before git history) to fix issues where gcc will optimize away loads in code that looks 100% correct, but is _technically_ undefined behavior, and thus can be thrown away by the compiler.

E.g. arm64's vPMU counter access test casts a uint64_t (unsigned long) pointer to a u64 (unsigned long long) pointer when setting PMCR.N via u64p_replace_bits(), which gcc-13 detects and optimizes away, i.e. ignores the result and uses the original PMCR.

The issue is most easily observed by making set_pmcr_n() noinline and wrapping the call with printf(), e.g. sans comments, for this code:

    printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);
    set_pmcr_n(&pmcr, pmcr_n);
    printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);

gcc-13 generates:

    0000000000401c90 <set_pmcr_n>:
      401c90: f9400002 ldr x2, [x0]
      401c94: b3751022 bfi x2, x1, #11, #5
      401c98: f9000002 str x2, [x0]
      401c9c: d65f03c0 ret

    0000000000402660 <test_create_vpmu_vm_with_pmcr_n>:
      402724: aa1403e3 mov x3, x20
      402728: aa1503e2 mov x2, x21
      40272c: aa1603e0 mov x0, x22
      402730: aa1503e1 mov x1, x21
      402734: 940060ff bl 41ab30 <_IO_printf>
      402738: aa1403e1 mov x1, x20
      40273c: 910183e0 add x0, sp, #0x60
      402740: 97fffd54 bl 401c90 <set_pmcr_n>
      402744: aa1403e3 mov x3, x20
      402748: aa1503e2 mov x2, x21
      40274c: aa1503e1 mov x1, x21
      402750: aa1603e0 mov x0, x22
      402754: 940060f7 bl 41ab30 <_IO_printf>

with the value stored in [sp + 0x60] ignored by both printf() above and in the test proper, resulting in a false failure due to vcpu_set_reg() simply storing the original value, not the intended value.

    $ ./vpmu_counter_access
    Random seed: 0x6b8b4567
    orig = 3040, next = 3040, want = 0
    orig = 3040, next = 3040, want = 0
    ==== Test Assertion Failure ====
      aarch64/vpmu_counter_access.c:505: pmcr_n == get_pmcr_n(pmcr)
      pid=71578 tid=71578 errno=9 - Bad file descriptor
         1 0x400673: run_access_test at vpmu_counter_access.c:522
         2 (inlined by) main at vpmu_counter_access.c:643
         3 0x4132d7: __libc_start_call_main at libc-start.o:0
         4 0x413653: __libc_start_main at ??:0
         5 0x40106f: _start at ??:0
      Failed to update PMCR.N to 0 (received: 6)

Somewhat bizarrely, gcc-11 also exhibits the same behavior, but only if set_pmcr_n() is marked noinline, whereas gcc-13 fails even if set_pmcr_n() is inlined in its sole caller.

Cc: stable@vger.kernel.org
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116912
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
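The commit addresses the problem with a build setting (disabling strict aliasing). As a point of comparison only, the sketch below shows a different technique that avoids the pointer cast altogether: the value is copied through `memcpy()`, which is well defined regardless of aliasing rules. The names are hypothetical, and the field position (5 bits starting at bit 11) is an assumption taken from the `bfi x2, x1, #11, #5` instruction in the disassembly.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the kernel-style u64 type (unsigned long long). */
typedef unsigned long long sketch_u64;

_Static_assert(sizeof(sketch_u64) == sizeof(uint64_t),
	       "sketch assumes both types are 64 bits wide");

/* Field layout assumed from the bfi instruction: 5 bits at bit 11. */
#define SKETCH_PMCR_N_SHIFT 11
#define SKETCH_PMCR_N_MASK  (0x1fULL << SKETCH_PMCR_N_SHIFT)

/* Extract the N field from a PMCR value. */
uint64_t sketch_get_pmcr_n(uint64_t pmcr)
{
	return (pmcr & SKETCH_PMCR_N_MASK) >> SKETCH_PMCR_N_SHIFT;
}

/*
 * Replace the N field in *pmcr. The value is moved into and out of a
 * sketch_u64 with memcpy() instead of casting uint64_t * to sketch_u64 *.
 */
void sketch_set_pmcr_n(uint64_t *pmcr, uint64_t n)
{
	sketch_u64 tmp;

	memcpy(&tmp, pmcr, sizeof(tmp));
	tmp = (tmp & ~SKETCH_PMCR_N_MASK) |
	      (((sketch_u64)n << SKETCH_PMCR_N_SHIFT) & SKETCH_PMCR_N_MASK);
	memcpy(pmcr, &tmp, sizeof(tmp));
}
```

Starting from the value 0x3040 seen in the test output, the N field reads back as 6, and setting it to 0 leaves 0x40. Compilers typically lower such fixed-size `memcpy()` calls to plain loads and stores, so the copy usually costs nothing at run time.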
[ 123.491737][ T1] Unexpected kernel BRK exception at EL1
[ 123.497593][ T1] Internal error: ptrace BRK handler: f20003e8 [#1] PREEMPT SMP
[ 123.500785][ T1] Modules linked in:
[ 123.502567][ T1] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.8.0-rc3-next-20200630-00003-g15e24419c239-dirty #11
[ 123.507468][ T1] Hardware name: linux,dummy-virt (DT)
[ 123.509826][ T1] pstate: 80400005 (Nzcv daif +PAN -UAO BTYPE=--)
[ 123.512609][ T1] pc : of_unittest_untrack_overlay+0x64/0x134
[ 123.515245][ T1] lr : of_unittest_untrack_overlay+0x64/0x134
[ 123.517848][ T1] sp : ffff00006a65fb30
[ 123.519668][ T1] x29: ffff00006a65fb30 x28: 0000000000000000
[ 123.522295][ T1] x27: ffff00006a65fc30 x26: ffffa00016b86f00
[ 123.524937][ T1] x25: 0000000000000000 x24: 0000000000000000
[ 123.527592][ T1] x23: ffffa00014c72540 x22: ffffa00016b86000
[ 123.530191][ T1] x21: 0000000000000000 x20: 00000000ffffffff
[ 123.532845][ T1] x19: 00000000ffffffff x18: 0000000000002690
[ 123.535547][ T1] x17: 0000000000002718 x16: 00000000000014b8
[ 123.538299][ T1] x15: 0000000000000001 x14: 0080000000000000
[ 123.541055][ T1] x13: 0000000000000002 x12: ffff94000298d209
[ 123.543801][ T1] x11: 1ffff4000298d208 x10: ffff94000298d208
[ 123.546580][ T1] x9 : dfffa00000000000 x8 : ffffa00014c69047
[ 123.549247][ T1] x7 : 0000000000000001 x6 : ffffa00014c69040
[ 123.552026][ T1] x5 : ffff00006a654040 x4 : 0000000000000000
[ 123.554799][ T1] x3 : ffffa00011d59d04 x2 : 00000000ffffffff
[ 123.557541][ T1] x1 : ffff00006a654040 x0 : 0000000000000000
[ 123.560390][ T1] Call trace:
[ 123.561935][ T1] of_unittest_untrack_overlay+0x64/0x134
[ 123.564469][ T1] of_unittest+0x2220/0x2438
[ 123.566585][ T1] do_one_initcall+0x470/0xa10
[ 123.568751][ T1] kernel_init_freeable+0x510/0x5f0
[ 123.571123][ T1] kernel_init+0x18/0x1e8
[ 123.573078][ T1] ret_from_fork+0x10/0x18
[ 123.575119][ T1] Code: 97978a9c d4210000 14000024 97978a99 (d4207d00)
[ 123.578138][ T1] ---[ end trace c4e049fb5e3b0ba0 ]---
[ 123.580449][ T1] Kernel panic - not syncing: Fatal exception
[ 123.583116][ T1] Kernel Offset: disabled
[ 123.585066][ T1] CPU features: 0x240002,20002004
[ 123.587259][ T1] Memory Limit: none
[ 123.588986][ T1] ---[ end Kernel panic - not syncing: Fatal exception ]---

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Disable strict aliasing, as has been done in the kernel proper for decades (literally since before git history) to fix issues where gcc will optimize away loads in code that looks 100% correct, but is _technically_ undefined behavior, and thus can be thrown away by the compiler.

E.g. arm64's vPMU counter access test casts a uint64_t (unsigned long) pointer to a u64 (unsigned long long) pointer when setting PMCR.N via u64p_replace_bits(), which gcc-13 detects and optimizes away, i.e. ignores the result and uses the original PMCR.

The issue is most easily observed by making set_pmcr_n() noinline and wrapping the call with printf(), e.g. sans comments, for this code:

  printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);
  set_pmcr_n(&pmcr, pmcr_n);
  printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);

gcc-13 generates:

  0000000000401c90 <set_pmcr_n>:
    401c90: f9400002 ldr x2, [x0]
    401c94: b3751022 bfi x2, x1, #11, #5
    401c98: f9000002 str x2, [x0]
    401c9c: d65f03c0 ret

  0000000000402660 <test_create_vpmu_vm_with_pmcr_n>:
    402724: aa1403e3 mov x3, x20
    402728: aa1503e2 mov x2, x21
    40272c: aa1603e0 mov x0, x22
    402730: aa1503e1 mov x1, x21
    402734: 940060ff bl 41ab30 <_IO_printf>
    402738: aa1403e1 mov x1, x20
    40273c: 910183e0 add x0, sp, #0x60
    402740: 97fffd54 bl 401c90 <set_pmcr_n>
    402744: aa1403e3 mov x3, x20
    402748: aa1503e2 mov x2, x21
    40274c: aa1503e1 mov x1, x21
    402750: aa1603e0 mov x0, x22
    402754: 940060f7 bl 41ab30 <_IO_printf>

with the value stored in [sp + 0x60] ignored by both printf() above and in the test proper, resulting in a false failure due to vcpu_set_reg() simply storing the original value, not the intended value.

  $ ./vpmu_counter_access
  Random seed: 0x6b8b4567
  orig = 3040, next = 3040, want = 0
  orig = 3040, next = 3040, want = 0
  ==== Test Assertion Failure ====
    aarch64/vpmu_counter_access.c:505: pmcr_n == get_pmcr_n(pmcr)
    pid=71578 tid=71578 errno=9 - Bad file descriptor
       1 0x400673: run_access_test at vpmu_counter_access.c:522
       2 (inlined by) main at vpmu_counter_access.c:643
       3 0x4132d7: __libc_start_call_main at libc-start.o:0
       4 0x413653: __libc_start_main at ??:0
       5 0x40106f: _start at ??:0
    Failed to update PMCR.N to 0 (received: 6)

Somewhat bizarrely, gcc-11 also exhibits the same behavior, but only if set_pmcr_n() is marked noinline, whereas gcc-13 fails even if set_pmcr_n() is inlined in its sole caller.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116912
Signed-off-by: Sean Christopherson <seanjc@google.com>
[ Upstream commit 5bf1557 ]

test_progs uses glibc specific functions backtrace() and backtrace_symbols_fd() to print backtrace in case of SIGSEGV.

Recent commit (see fixes) updated test_progs.c to define stub versions of the same functions with attribute "weak" in order to allow linking test_progs against musl libc. Unfortunately this broke the backtrace handling for glibc builds.

As it turns out, glibc defines backtrace() and backtrace_symbols_fd() as weak:

  $ llvm-readelf --symbols /lib64/libc.so.6 \
    | grep -P '( backtrace_symbols_fd| backtrace)$'
  4910: 0000000000126b40 161 FUNC WEAK DEFAULT 16 backtrace
  6843: 0000000000126f90 852 FUNC WEAK DEFAULT 16 backtrace_symbols_fd

So does test_progs:

  $ llvm-readelf --symbols test_progs \
    | grep -P '( backtrace_symbols_fd| backtrace)$'
  2891: 00000000006ad190 15 FUNC WEAK DEFAULT 13 backtrace
  11215: 00000000006ad1a0 41 FUNC WEAK DEFAULT 13 backtrace_symbols_fd

In such a situation the dynamic linker is not obliged to favour the glibc implementation over the one defined in test_progs.

Compiling with the following simple modification to test_progs.c demonstrates the issue:

  $ git diff
  ...
  --- a/tools/testing/selftests/bpf/test_progs.c
  +++ b/tools/testing/selftests/bpf/test_progs.c
  @@ -1817,6 +1817,7 @@ int main(int argc, char **argv)
          if (err)
                  return err;
  +       *(int *)0xdeadbeef = 42;
          err = cd_flavor_subdir(argv[0]);
          if (err)
                  return err;

  $ ./test_progs
  [0]: Caught signal #11!
  Stack trace:
  <backtrace not supported>
  Segmentation fault (core dumped)

Resolve this by hiding the stub definitions behind a __GLIBC__ macro check instead of using the "weak" attribute.

Fixes: c9a83e7 ("selftests/bpf: Fix compile if backtrace support missing in libc")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Tony Ambardar <tony.ambardar@gmail.com>
Reviewed-by: Tony Ambardar <tony.ambardar@gmail.com>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/bpf/20241003210307.3847907-1-eddyz87@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
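The approach named in the commit message above, compile-time selection on __GLIBC__ rather than link-time weak symbols, might look roughly like the following. This is a sketch under assumptions, not the actual test_progs.c change: the stub bodies and the dump_stack_to_fd() helper are hypothetical. Note that __GLIBC__ is only defined once a glibc header has been included, which is why <stdio.h> comes before the check.

```c
#include <assert.h>
#include <stdio.h>   /* any libc header; on glibc this defines __GLIBC__ */
#include <unistd.h>

#ifdef __GLIBC__
#include <execinfo.h> /* real backtrace() and backtrace_symbols_fd() */
#else
/* Stubs for C libraries without backtrace support, e.g. musl. */
static int backtrace(void **buffer, int size)
{
	(void)buffer;
	(void)size;
	return 0;
}

static void backtrace_symbols_fd(void *const *buffer, int size, int fd)
{
	static const char msg[] = "<backtrace not supported>\n";

	(void)buffer;
	(void)size;
	if (write(fd, msg, sizeof(msg) - 1) < 0) {
		/* Nothing useful can be done if the write fails. */
	}
}
#endif

/* Hypothetical helper: print the current stack to fd, return frame count. */
int dump_stack_to_fd(int fd)
{
	void *frames[32];
	int n;

	assert(fd >= 0);
	n = backtrace(frames, 32);
	backtrace_symbols_fd(frames, n, fd);
	return n;
}
```

Because the stubs are static and only compiled when __GLIBC__ is absent, a glibc build never contains a competing definition, so symbol resolution order no longer matters.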
Until now the Identification Request command was used to detect devices. The command is not strictly mandatory for displays to implement but should at least result in a valid error response. Some devices seem not to send back even an error/null response, but instead respond only with null bytes, causing the detection to abort. Now the first chunk of the capability string is requested as the first detection step. As the capabilities request command is effectively mandatory, this should improve compatibility with badly programmed displays. May fix issues #11 and #20.
…4_xdr_dec_open_noattr

The `nfs4_xdr_dec_open()` function does not properly check the return status of the `ACCESS` operation. This oversight can result in out-of-bounds memory access when decoding NFSv4 compound requests.

For instance, in an NFSv4 compound request `{5, PUTFH, OPEN, GETFH, ACCESS, GETATTR}`, if the `ACCESS` operation (step 4) returns an error, the function proceeds to decode the subsequent `GETATTR` operation (step 5) without validating the RPC buffer's state. This can cause an RPC buffer overflow, leading to a system panic.

This issue can be reliably reproduced by running multiple `fsstress` tests in the same directory exported by the Ganesha NFS server.

This patch introduces proper error handling for the `ACCESS` operation in `nfs4_xdr_dec_open()` and `nfs4_xdr_dec_open_noattr()`. When an error is detected, the decoding process is terminated gracefully to prevent further buffer corruption and ensure system stability.

#7 [ffffa42b17337bc0] page_fault at ffffffff906010fe
    [exception RIP: xdr_set_page_base+61]
    RIP: ffffffffc12166dd RSP: ffffa42b17337c78 RFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffffa42b17337db8 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffa42b17337db8
    RBP: 0000000000000000 R8: ffff904948b0a650 R9: 0000000000000000
    R10: 8080808080808080 R11: ffff904ac3c68be4 R12: 0000000000000009
    R13: ffffa42b17337db8 R14: ffff904aa6aee000 R15: ffffffffc11f7f50
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ffffa42b17337c78] xdr_set_next_buffer at ffffffffc1217b0b [sunrpc]
#9 [ffffa42b17337c90] xdr_inline_decode at ffffffffc1218259 [sunrpc]
#10 [ffffa42b17337cb8] __decode_op_hdr at ffffffffc128d2c2 [nfsv4]
#11 [ffffa42b17337cf0] decode_getfattr_generic.constprop.124 at ffffffffc12980a2 [nfsv4]
#12 [ffffa42b17337d58] nfs4_xdr_dec_open at ffffffffc1298374 [nfsv4]
#13 [ffffa42b17337db0] call_decode at ffffffffc11f8144 [sunrpc]
#14 [ffffa42b17337e28] __rpc_execute at ffffffffc1206ad5 [sunrpc]
#15 [ffffa42b17337e80] rpc_async_schedule at ffffffffc1206e39 [sunrpc]
#16 [ffffa42b17337e98] process_one_work at ffffffff8fcfe397
#17 [ffffa42b17337ed8] worker_thread at ffffffff8fcfea60
#18 [ffffa42b17337f10] kthread at ffffffff8fd04406
#19 [ffffa42b17337f50] ret_from_fork at ffffffff9060023f

Signed-off-by: changxin.liu <changxin.liu@lenovonetapp.com>
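The rule the patch description above relies on, stop decoding a compound reply at the first operation that fails, can be modelled in user space as follows. This is a simplified illustration and not the NFS client's XDR code: struct reply, decode_compound() and the status array are all hypothetical, and a reply is reduced to one status word per operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a compound reply: one status per returned op. */
struct reply {
	const int *status; /* status[i] is the result of operation i */
	size_t nops;       /* number of operations present in the reply */
};

/*
 * Decode up to `expected` operations in order. Returns 0 if all succeed,
 * the first nonzero status otherwise, or -1 if the reply is shorter than
 * expected. *decoded is set to the number of operations consumed.
 */
int decode_compound(const struct reply *r, size_t expected, size_t *decoded)
{
	assert(r && decoded);

	*decoded = 0;
	for (size_t i = 0; i < expected; i++) {
		if (i >= r->nops)
			return -1; /* truncated reply: never read past the data */
		(*decoded)++;
		if (r->status[i] != 0)
			return r->status[i]; /* e.g. ACCESS failed: skip GETATTR */
	}
	return 0;
}
```

With the five-operation example from the description, a failure at the fourth operation (ACCESS) returns before the fifth (GETATTR) is ever touched, since a server stops processing a compound at the first error and the reply contains nothing after it.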
there is a global spinlock between reset and clk, if locked in reset, then print some debug information, maybe dead-lock when uart driver try to disable clk. Backtrace stopped: frame did not save the PC (gdb) thread 4 [Switching to thread 4 (Thread 4)] #0 cpu_relax () at ./arch/riscv/include/asm/vdso/processor.h:22 22 ./arch/riscv/include/asm/vdso/processor.h: No such file or directory. (gdb) bt #0 cpu_relax () at ./arch/riscv/include/asm/vdso/processor.h:22 #1 arch_spin_lock (lock=lock@entry=0xffffffff81a57cd0 <enable_lock>) at ./include/asm-generic/spinlock.h:49 #2 do_raw_spin_lock (lock=lock@entry=0xffffffff81a57cd0 <enable_lock>) at ./include/linux/spinlock.h:186 #3 0xffffffff80aa21ce in __raw_spin_lock_irqsave (lock=0xffffffff81a57cd0 <enable_lock>) at ./include/linux/spinlock_api_smp.h:111 #4 _raw_spin_lock_irqsave (lock=lock@entry=0xffffffff81a57cd0 <enable_lock>) at kernel/locking/spinlock.c:162 #5 0xffffffff80563416 in clk_enable_lock () at ./include/linux/spinlock.h:325 torvalds#6 0xffffffff805648de in clk_core_disable_lock (core=0xffffffd900512500) at drivers/clk/clk.c:1062 torvalds#7 0xffffffff8056527e in clk_disable (clk=<optimized out>) at drivers/clk/clk.c:1084 torvalds#8 clk_disable (clk=0xffffffd9048b5100) at drivers/clk/clk.c:1079 torvalds#9 0xffffffff8059e5d4 in serial_pxa_console_write (co=<optimized out>, s=0xffffffff81a68250 <text> "[ 14.708612] [RESET][spacemit_reset_set][373]:assert = 1, id = 59 \n", count=<optimized out>) at drivers/tty/serial/pxa_k1x.c:1724 torvalds#10 0xffffffff8004a34c in call_console_driver (dropped_text=0xffffffff81a68650 <dropped_text> "", len=69, text=0xffffffff81a68250 <text> "[ 14.708612] [RESET][spacemit_reset_set][373]:assert = 1, id = 59 \n", con=0xffffffff81964c10 <serial_pxa_console>) at kernel/printk/printk.c:1942 torvalds#11 console_emit_next_record (con=con@entry=0xffffffff81964c10 <serial_pxa_console>, ext_text=<optimized out>, dropped_text=0xffffffff81a68650 <dropped_text> "", handover=0xffffffc80578baa7, 
text=0xffffffff81a68250 <text> "[ 14.708612] [RESET][spacemit_reset_set][373]:assert = 1, id = 59 \n") at kernel/printk/printk.c:2731
#12 0xffffffff8004a49a in console_flush_all (handover=0xffffffc80578baa7, next_seq=<synthetic pointer>, do_cond_resched=false) at kernel/printk/printk.c:2793
#13 console_unlock () at kernel/printk/printk.c:2860
#14 0xffffffff8004b388 in vprintk_emit (facility=facility@entry=0, level=<optimized out>, level@entry=-1, dev_info=dev_info@entry=0x0, fmt=<optimized out>, args=<optimized out>) at kernel/printk/printk.c:2268
#15 0xffffffff8004b3ae in vprintk_default (fmt=<optimized out>, args=<optimized out>) at kernel/printk/printk.c:2279
#16 0xffffffff8004b646 in vprintk (fmt=fmt@entry=0xffffffff813be470 "\001\066[RESET][%s][%d]:assert = %d, id = %d \n", args=args@entry=0xffffffc80578bbd8) at kernel/printk/printk_safe.c:50
#17 0xffffffff80a880d6 in _printk (fmt=fmt@entry=0xffffffff813be470 "\001\066[RESET][%s][%d]:assert = %d, id = %d \n") at kernel/printk/printk.c:2289
#18 0xffffffff80a90bb6 in spacemit_reset_set (rcdev=rcdev@entry=0xffffffff81f563a8 <k1x_reset_controller+8>, id=id@entry=59, assert=assert@entry=true) at drivers/reset/reset-spacemit-k1x.c:373
#19 0xffffffff805823b6 in spacemit_reset_update (assert=true, id=59, rcdev=0xffffffff81f563a8 <k1x_reset_controller+8>) at drivers/reset/reset-spacemit-k1x.c:401
#20 spacemit_reset_update (assert=true, id=59, rcdev=0xffffffff81f563a8 <k1x_reset_controller+8>) at drivers/reset/reset-spacemit-k1x.c:387
#21 spacemit_reset_assert (rcdev=0xffffffff81f563a8 <k1x_reset_controller+8>, id=59) at drivers/reset/reset-spacemit-k1x.c:413
#22 0xffffffff8058158e in reset_control_assert (rstc=0xffffffd902b2f280) at drivers/reset/core.c:485
#23 0xffffffff807ccf96 in cpp_disable_clocks (cpp_dev=cpp_dev@entry=0xffffffd904cc9040) at drivers/media/platform/spacemit/camera/cam_cpp/k1x_cpp.c:960
#24 0xffffffff807cd0b2 in cpp_release_hardware (cpp_dev=cpp_dev@entry=0xffffffd904cc9040) at drivers/media/platform/spacemit/camera/cam_cpp/k1x_cpp.c:1038
#25 0xffffffff807cd990 in cpp_close_node (sd=<optimized out>, fh=<optimized out>) at drivers/media/platform/spacemit/camera/cam_cpp/k1x_cpp.c:1135
#26 0xffffffff8079525e in subdev_close (file=0xffffffd906645d00) at drivers/media/v4l2-core/v4l2-subdev.c:105
#27 0xffffffff8078e49e in v4l2_release (inode=<optimized out>, filp=0xffffffd906645d00) at drivers/media/v4l2-core/v4l2-dev.c:459
#28 0xffffffff80154974 in __fput (file=0xffffffd906645d00) at fs/file_table.c:320
#29 0xffffffff80154aa2 in ____fput (work=<optimized out>) at fs/file_table.c:348
#30 0xffffffff8002677e in task_work_run () at kernel/task_work.c:179
#31 0xffffffff800053b4 in resume_user_mode_work (regs=0xffffffc80578bee0) at ./include/linux/resume_user_mode.h:49
#32 do_work_pending (regs=0xffffffc80578bee0, thread_info_flags=<optimized out>) at arch/riscv/kernel/signal.c:478
#33 0xffffffff800039c6 in handle_exception () at arch/riscv/kernel/entry.S:374
Backtrace stopped: frame did not save the PC
(gdb) thread 1
[Switching to thread 1 (Thread 1)]
#0 0xffffffff80047e9c in arch_spin_lock (lock=lock@entry=0xffffffff81a57cd8 <g_cru_lock>) at ./include/asm-generic/spinlock.h:49
49 ./include/asm-generic/spinlock.h: No such file or directory.
(gdb) bt
#0 0xffffffff80047e9c in arch_spin_lock (lock=lock@entry=0xffffffff81a57cd8 <g_cru_lock>) at ./include/asm-generic/spinlock.h:49
#1 do_raw_spin_lock (lock=lock@entry=0xffffffff81a57cd8 <g_cru_lock>) at ./include/linux/spinlock.h:186
#2 0xffffffff80aa21ce in __raw_spin_lock_irqsave (lock=0xffffffff81a57cd8 <g_cru_lock>) at ./include/linux/spinlock_api_smp.h:111
#3 _raw_spin_lock_irqsave (lock=0xffffffff81a57cd8 <g_cru_lock>) at kernel/locking/spinlock.c:162
#4 0xffffffff8056c4cc in ccu_mix_disable (hw=0xffffffff81956858 <sdh2_clk+120>) at ./include/linux/spinlock.h:325
#5 0xffffffff80564832 in clk_core_disable (core=0xffffffd900529900) at drivers/clk/clk.c:1051
#6 clk_core_disable (core=0xffffffd900529900) at drivers/clk/clk.c:1031
#7 0xffffffff805648e6 in clk_core_disable_lock (core=0xffffffd900529900) at drivers/clk/clk.c:1063
#8 0xffffffff8056527e in clk_disable (clk=<optimized out>) at drivers/clk/clk.c:1084
#9 clk_disable (clk=clk@entry=0xffffffd904fafa80) at drivers/clk/clk.c:1079
#10 0xffffffff808bb898 in clk_disable_unprepare (clk=0xffffffd904fafa80) at ./include/linux/clk.h:1085
#11 0xffffffff808bb916 in spacemit_sdhci_runtime_suspend (dev=<optimized out>) at drivers/mmc/host/sdhci-of-k1x.c:1469
#12 0xffffffff8066e8e2 in pm_generic_runtime_suspend (dev=<optimized out>) at drivers/base/power/generic_ops.c:25
#13 0xffffffff80670398 in __rpm_callback (cb=cb@entry=0xffffffff8066e8ca <pm_generic_runtime_suspend>, dev=dev@entry=0xffffffd9018a2810) at drivers/base/power/runtime.c:395
#14 0xffffffff806704b8 in rpm_callback (cb=cb@entry=0xffffffff8066e8ca <pm_generic_runtime_suspend>, dev=dev@entry=0xffffffd9018a2810) at drivers/base/power/runtime.c:529
#15 0xffffffff80670bdc in rpm_suspend (dev=0xffffffd9018a2810, rpmflags=<optimized out>) at drivers/base/power/runtime.c:672
#16 0xffffffff806716de in pm_runtime_work (work=0xffffffd9018a2948) at drivers/base/power/runtime.c:974
#17 0xffffffff800236f4 in process_one_work (worker=worker@entry=0xffffffd9013ee9c0, work=0xffffffd9018a2948) at kernel/workqueue.c:2289
#18 0xffffffff80023ba6 in worker_thread (__worker=0xffffffd9013ee9c0) at kernel/workqueue.c:2436
#19 0xffffffff80028bb2 in kthread (_create=0xffffffd9017de840) at kernel/kthread.c:376
#20 0xffffffff80003934 in handle_exception () at arch/riscv/kernel/entry.S:249
Backtrace stopped: frame did not save the PC
(gdb)

Change-Id: Ia95b41ffd6c1893c9c5e9c1c9fc0c155ea902d2c
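The two backtraces read as a lock-order inversion. One thread logs from inside the reset path and then spins on enable_lock in the console driver. The other already holds enable_lock in clk_core_disable_lock() and spins on g_cru_lock. One common way to break such a cycle is to keep logging out of the critical section. The user-space sketch below illustrates that pattern only and is not taken from the commit: `cru_lock`, `reset_reg`, and `reset_set()` are hypothetical stand-ins, and a C11 `atomic_flag` replaces the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the shared lock and the reset register. */
static atomic_flag cru_lock = ATOMIC_FLAG_INIT;
static uint64_t reset_reg;

static void cru_lock_acquire(void)
{
	/* Spin until the flag was previously clear, i.e. we now own the lock. */
	while (atomic_flag_test_and_set_explicit(&cru_lock, memory_order_acquire))
		;
}

static void cru_lock_release(void)
{
	atomic_flag_clear_explicit(&cru_lock, memory_order_release);
}

/* Single-threaded probe: true if nobody holds the lock at this moment. */
static bool cru_lock_is_free(void)
{
	if (atomic_flag_test_and_set_explicit(&cru_lock, memory_order_acquire))
		return false;
	cru_lock_release();
	return true;
}

/*
 * Update one reset bit under the lock, then format the debug message only
 * after the lock has been dropped. Returns the snprintf() result.
 */
static int reset_set(unsigned int id, bool assert_line, char *log, size_t len)
{
	assert(id < 64);

	cru_lock_acquire();
	if (assert_line)
		reset_reg |= UINT64_C(1) << id;
	else
		reset_reg &= ~(UINT64_C(1) << id);
	cru_lock_release();

	/* Logging happens here, outside the critical section. */
	return snprintf(log, len, "[RESET] assert = %d, id = %u\n",
			(int)assert_line, id);
}
```

Because the message is produced after `cru_lock_release()`, a console path that needs a second lock can no longer be entered while the first lock is held.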
There is an invalid-instruction crash when running node.js:

[ 443.219580] node[3123]: unhandled signal 4 code 0x1 at 0x00000038be663620
[ 443.226499] CPU: 5 PID: 3123 Comm: node Not tainted 6.6.36+ #11
[ 443.232501] Hardware name: spacemit k1-x deb1 board (DT)
[ 443.237875] epc : 00000038be663620 ra : 00000038be652e00 sp : 0000003ff310a000
[ 443.245195] gp : 000000000447d6d0 tp : 0000003f82e2b780 t0 : 0000003e5c000000
[ 443.252501] t1 : 00000000000d31b8 t2 : 0000000000000063 s0 : 0000003ff310a050
[ 443.259815] s1 : 0000003ff3109fd0 a0 : 0000003c1e11ba29 a1 : 0000000000000004
[ 443.267121] a2 : 00000000000d31b8 a3 : 0000000000000003 a4 : 000000000019759e
[ 443.274435] a5 : 0000000000000075 a6 : 000000000000006c a7 : 0000000000000065
[ 443.281749] s2 : 00000000010df958 s3 : 0000000000000001 s4 : 0000003e5c0d31b8
[ 443.289054] s5 : 00000000045442e0 s6 : 0000000004544260 s7 : 0000000ba8d91399
[ 443.296368] s8 : 0000000000000000 s9 : 00000038be650168 s10: 0000000ba8d9fa81
[ 443.303674] s11: 0000000000000000 t3 : 00000038be650198 t4 : 0000002200000000
[ 443.310980] t5 : 0000000000000008 t6 : 00000038be663620
[ 443.316352] status: 8000000200006020 badaddr: 0000000000800e13 cause: 0000000000000002

The opcode 0x00800e13 is a valid instruction, 'li t3, 8', so the instruction in memory is not the problem. The cause of the issue is stale i-cache data: when user space requests an i-cache flush, the i-cache of every core related to the process should be flushed.

Change-Id: I0a06c77a2a3c1aa7aaf1e930eaa774d405e6fddb
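To check that the word reported in badaddr is a legal instruction, it can be decoded by hand with the RISC-V I-type field layout. The sketch below does this; the struct and function names are illustrative and do not come from the kernel. It shows 0x00800e13 decoding to ADDI with rs1 = x0, rd = x28 (t3) and immediate 8.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a RISC-V I-type instruction word. */
struct itype {
	uint32_t opcode;
	uint32_t rd;
	uint32_t funct3;
	uint32_t rs1;
	int32_t imm;
};

static struct itype decode_itype(uint32_t insn)
{
	struct itype d;

	d.opcode = insn & 0x7f;
	d.rd = (insn >> 7) & 0x1f;
	d.funct3 = (insn >> 12) & 0x7;
	d.rs1 = (insn >> 15) & 0x1f;
	/* imm[11:0] sits in bits 31:20 and is sign-extended. */
	d.imm = (int32_t)(insn >> 20);
	if (d.imm & 0x800)
		d.imm -= 0x1000;
	return d;
}

/* ADDI (opcode 0x13, funct3 0) with rs1 = x0 is printed as "li rd, imm". */
static int is_li(struct itype d)
{
	return d.opcode == 0x13 && d.funct3 == 0 && d.rs1 == 0;
}

/* Self-check for the word from the crash log above. */
static void check_badaddr_word(void)
{
	struct itype d = decode_itype(0x00800e13u);

	assert(is_li(d));
	assert(d.rd == 28);	/* x28 is t3 */
	assert(d.imm == 8);
}
```

A legal encoding at the faulting address is consistent with the commit's conclusion that the core executed stale i-cache contents rather than what is in memory.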
Petr Machata says:

====================
vxlan: Support user-defined reserved bits

Currently the VXLAN header validation works by vxlan_rcv() going feature by feature, each feature clearing the bits that it consumes. If anything is left unparsed at the end, the packet is rejected.

Unfortunately there are machines out there that send VXLAN packets with reserved bits set, even if they are configured to not use the corresponding features. One such report is here[1], and we have heard similar complaints from our customers as well.

This patchset adds an attribute that makes it configurable which bits the user wishes to tolerate and which they consider reserved. This was recommended in [1] as well.

A knob like that inevitably allows users to set as reserved bits that are in fact required for the features enabled by the netdevice, such as GPE. This is detected, and such configurations are rejected.

In patches #1..#7, the reserved bits validation code is gradually moved away from the unparsed approach described above, to one where a given set of valid bits is precomputed and then the packet is validated against that.

In patch #8, this precomputed set is made configurable through a new attribute IFLA_VXLAN_RESERVED_BITS.

Patches #9 and #10 massage the testsuite a bit, so that patch #11 can introduce a selftest for the reserved bits feature.

The corresponding iproute2 support is available in [2].

[1] https://lore.kernel.org/netdev/db8b9e19-ad75-44d3-bfb2-46590d426ff5@proxmox.com/
[2] https://github.com/pmachata/iproute2/commits/vxlan_reserved_bits/
====================

Link: https://patch.msgid.link/cover.1733412063.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
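The precomputed-mask approach described in the cover letter can be sketched as follows. This is a simplified illustration, not the kernel's code: the flag values, `struct tunnel_cfg`, `cfg_set_reserved()`, and `hdr_flags_ok()` are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical header flag bits, for illustration only. */
#define HDR_F_VNI 0x08000000u
#define HDR_F_GPE 0x04000000u

struct tunnel_cfg {
	uint32_t feature_bits;	/* bits consumed by the enabled features */
	uint32_t reserved_bits;	/* bits that must be zero on receive */
};

/*
 * Install a user-supplied reserved mask. A mask that marks a bit as
 * reserved while an enabled feature needs it is rejected.
 */
static bool cfg_set_reserved(struct tunnel_cfg *cfg, uint32_t reserved)
{
	assert(cfg);

	if (reserved & cfg->feature_bits)
		return false;
	cfg->reserved_bits = reserved;
	return true;
}

/* Receive-side check: one AND against the precomputed mask. */
static bool hdr_flags_ok(const struct tunnel_cfg *cfg, uint32_t flags)
{
	assert(cfg);

	return (flags & cfg->reserved_bits) == 0;
}
```

With the mask computed once at configuration time, the receive path no longer has to walk the features one by one, and tolerating an extra bit only means clearing it from `reserved_bits`.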
[ 123.491737][ T1] Unexpected kernel BRK exception at EL1
[ 123.497593][ T1] Internal error: ptrace BRK handler: f20003e8 [#1] PREEMPT SMP
[ 123.500785][ T1] Modules linked in:
[ 123.502567][ T1] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.8.0-rc3-next-20200630-00003-g15e24419c239-dirty #11
[ 123.507468][ T1] Hardware name: linux,dummy-virt (DT)
[ 123.509826][ T1] pstate: 80400005 (Nzcv daif +PAN -UAO BTYPE=--)
[ 123.512609][ T1] pc : of_unittest_untrack_overlay+0x64/0x134
[ 123.515245][ T1] lr : of_unittest_untrack_overlay+0x64/0x134
[ 123.517848][ T1] sp : ffff00006a65fb30
[ 123.519668][ T1] x29: ffff00006a65fb30 x28: 0000000000000000
[ 123.522295][ T1] x27: ffff00006a65fc30 x26: ffffa00016b86f00
[ 123.524937][ T1] x25: 0000000000000000 x24: 0000000000000000
[ 123.527592][ T1] x23: ffffa00014c72540 x22: ffffa00016b86000
[ 123.530191][ T1] x21: 0000000000000000 x20: 00000000ffffffff
[ 123.532845][ T1] x19: 00000000ffffffff x18: 0000000000002690
[ 123.535547][ T1] x17: 0000000000002718 x16: 00000000000014b8
[ 123.538299][ T1] x15: 0000000000000001 x14: 0080000000000000
[ 123.541055][ T1] x13: 0000000000000002 x12: ffff94000298d209
[ 123.543801][ T1] x11: 1ffff4000298d208 x10: ffff94000298d208
[ 123.546580][ T1] x9 : dfffa00000000000 x8 : ffffa00014c69047
[ 123.549247][ T1] x7 : 0000000000000001 x6 : ffffa00014c69040
[ 123.552026][ T1] x5 : ffff00006a654040 x4 : 0000000000000000
[ 123.554799][ T1] x3 : ffffa00011d59d04 x2 : 00000000ffffffff
[ 123.557541][ T1] x1 : ffff00006a654040 x0 : 0000000000000000
[ 123.560390][ T1] Call trace:
[ 123.561935][ T1] of_unittest_untrack_overlay+0x64/0x134
[ 123.564469][ T1] of_unittest+0x2220/0x2438
[ 123.566585][ T1] do_one_initcall+0x470/0xa10
[ 123.568751][ T1] kernel_init_freeable+0x510/0x5f0
[ 123.571123][ T1] kernel_init+0x18/0x1e8
[ 123.573078][ T1] ret_from_fork+0x10/0x18
[ 123.575119][ T1] Code: 97978a9c d4210000 14000024 97978a99 (d4207d00)
[ 123.578138][ T1] ---[ end trace c4e049fb5e3b0ba0 ]---
[ 123.580449][ T1] Kernel panic - not syncing: Fatal exception
[ 123.583116][ T1] Kernel Offset: disabled
[ 123.585066][ T1] CPU features: 0x240002,20002004
[ 123.587259][ T1] Memory Limit: none
[ 123.588986][ T1] ---[ end Kernel panic - not syncing: Fatal exception ]---

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Once we are inside the 'ext4_xattr_delete_inode' function and trying to delete the inode, the 'xattr_sem' should be unlocked.

We need trylock here to avoid false-positive warning from lockdep about reclaim circular dependency.

This fixes the following KASAN reported issue:

==================================================================
BUG: KASAN: slab-use-after-free in ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
Read of size 4 at addr ffff888012c120c4 by task repro/2065

CPU: 1 UID: 0 PID: 2065 Comm: repro Not tainted 6.13.0-rc2+ #11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x1fd/0x300
? tcp_gro_dev_warn+0x260/0x260
? _printk+0xc0/0x100
? read_lock_is_recursive+0x10/0x10
? irq_work_queue+0x72/0xf0
? __virt_addr_valid+0x17b/0x4b0
print_address_description+0x78/0x390
print_report+0x107/0x1f0
? __virt_addr_valid+0x17b/0x4b0
? __virt_addr_valid+0x3ff/0x4b0
? __phys_addr+0xb5/0x160
? ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
kasan_report+0xcc/0x100
? ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
? ext4_xattr_delete_inode+0xd30/0xd30
? __ext4_journal_ensure_credits+0x5f0/0x5f0
? __ext4_journal_ensure_credits+0x2b/0x5f0
? inode_update_timestamps+0x410/0x410
ext4_xattr_delete_inode+0xb64/0xd30
? ext4_truncate+0xb70/0xdc0
? ext4_expand_extra_isize_ea+0x1d20/0x1d20
? __ext4_mark_inode_dirty+0x670/0x670
? ext4_journal_check_start+0x16f/0x240
? ext4_inode_is_fast_symlink+0x2f2/0x3a0
ext4_evict_inode+0xc8c/0xff0
? ext4_inode_is_fast_symlink+0x3a0/0x3a0
? do_raw_spin_unlock+0x53/0x8a0
? ext4_inode_is_fast_symlink+0x3a0/0x3a0
evict+0x4ac/0x950
? proc_nr_inodes+0x310/0x310
? trace_ext4_drop_inode+0xa2/0x220
? _raw_spin_unlock+0x1a/0x30
? iput+0x4cb/0x7e0
do_unlinkat+0x495/0x7c0
? try_break_deleg+0x120/0x120
? 0xffffffff81000000
? __check_object_size+0x15a/0x210
? strncpy_from_user+0x13e/0x250
? getname_flags+0x1dc/0x530
__x64_sys_unlinkat+0xc8/0xf0
do_syscall_64+0x65/0x110
entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x434ffd
Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
RSP: 002b:00007ffc50fa7b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000107
RAX: ffffffffffffffda RBX: 00007ffc50fa7e18 RCX: 0000000000434ffd
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005
RBP: 00007ffc50fa7be0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffc50fa7e08 R14: 00000000004bbf30 R15: 0000000000000001
</TASK>

The buggy address belongs to the object at ffff888012c12000
which belongs to the cache filp of size 360
The buggy address is located 196 bytes inside of
freed 360-byte region [ffff888012c12000, ffff888012c12168)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12c12
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x40(head|node=0|zone=0)
page_type: f5(slab)
raw: 0000000000000040 ffff888000ad7640 ffffea0000497a00 dead000000000004
raw: 0000000000000000 0000000000100010 00000001f5000000 0000000000000000
head: 0000000000000040 ffff888000ad7640 ffffea0000497a00 dead000000000004
head: 0000000000000000 0000000000100010 00000001f5000000 0000000000000000
head: 0000000000000001 ffffea00004b0481 ffffffffffffffff 0000000000000000
head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888012c11f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff888012c12000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888012c12080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff888012c12100: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
 ffff888012c12180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

Reported-by: syzbot+b244bda78289b00204ed@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b244bda78289b00204ed
Signed-off-by: Bhupesh <bhupesh@igalia.com>
[ Upstream commit 5bf1557 ]

test_progs uses the glibc-specific functions backtrace() and backtrace_symbols_fd() to print a backtrace in case of SIGSEGV.

A recent commit (see Fixes) updated test_progs.c to define stub versions of the same functions with the attribute "weak" in order to allow linking test_progs against musl libc. Unfortunately this broke the backtrace handling for glibc builds.

As it turns out, glibc defines backtrace() and backtrace_symbols_fd() as weak:

  $ llvm-readelf --symbols /lib64/libc.so.6 \
      | grep -P '( backtrace_symbols_fd| backtrace)$'
  4910: 0000000000126b40 161 FUNC WEAK DEFAULT 16 backtrace
  6843: 0000000000126f90 852 FUNC WEAK DEFAULT 16 backtrace_symbols_fd

So does test_progs:

  $ llvm-readelf --symbols test_progs \
      | grep -P '( backtrace_symbols_fd| backtrace)$'
  2891: 00000000006ad190 15 FUNC WEAK DEFAULT 13 backtrace
  11215: 00000000006ad1a0 41 FUNC WEAK DEFAULT 13 backtrace_symbols_fd

In such a situation the dynamic linker is not obliged to favour the glibc implementation over the one defined in test_progs.

Compiling with the following simple modification to test_progs.c demonstrates the issue:

  $ git diff
  ...
  --- a/tools/testing/selftests/bpf/test_progs.c
  +++ b/tools/testing/selftests/bpf/test_progs.c
  @@ -1817,6 +1817,7 @@ int main(int argc, char **argv)
          if (err)
                  return err;

  +       *(int *)0xdeadbeef = 42;
          err = cd_flavor_subdir(argv[0]);
          if (err)
                  return err;

  $ ./test_progs
  [0]: Caught signal #11!
  Stack trace:
  <backtrace not supported>
  Segmentation fault (core dumped)

Resolve this by hiding the stub definitions behind a __GLIBC__ macro check instead of using the "weak" attribute.

Fixes: c9a83e7 ("selftests/bpf: Fix compile if backtrace support missing in libc")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Tony Ambardar <tony.ambardar@gmail.com>
Reviewed-by: Tony Ambardar <tony.ambardar@gmail.com>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/bpf/20241003210307.3847907-1-eddyz87@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
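The resolution described in the commit message, stubs selected by a __GLIBC__ check rather than by weak linkage, can be sketched like this. It is a minimal stand-alone illustration and not the actual test_progs.c change: `dump_stack_to_fd()` and `MAX_FRAMES` are hypothetical, and the stubs are made static here so the snippet cannot clash with a libc that does provide these symbols.

```c
#include <assert.h>
#include <unistd.h>	/* on glibc this also pulls in the __GLIBC__ definition */

#ifdef __GLIBC__
#include <execinfo.h>	/* real backtrace() and backtrace_symbols_fd() */
#else
/* Stubs for C libraries without execinfo.h, such as musl. */
static int backtrace(void **buffer, int size)
{
	(void)buffer;
	(void)size;
	return 0;
}

static void backtrace_symbols_fd(void *const *buffer, int size, int fd)
{
	static const char msg[] = "<backtrace not supported>\n";

	(void)buffer;
	(void)size;
	if (write(fd, msg, sizeof(msg) - 1) < 0) {
		/* Nothing sensible to do if the write fails. */
	}
}
#endif

#define MAX_FRAMES 32

/* Hypothetical helper: print the current call stack to fd. */
static int dump_stack_to_fd(int fd)
{
	void *frames[MAX_FRAMES];
	int n = backtrace(frames, MAX_FRAMES);

	assert(n >= 0 && n <= MAX_FRAMES);
	backtrace_symbols_fd(frames, n, fd);
	return n;
}
```

Because the choice is made by the preprocessor, a glibc build never contains a competing definition, so symbol resolution order between two weak symbols no longer matters.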
…atch-fixes

WARNING: Possible repeated word: 'to'
#11:
set as null leaving it to to be accessed. Additionally, the read-only

WARNING: Please use correct Fixes: style 'Fixes: <12 chars of sha1> ("<title line>")' - ie: 'Fixes: fatal: not a ("nux-next'")'
#21:
Fixes: 8f9e8f5 ("ocfs2: Fix Q_GETNEXTQUOTA for filesystem without quotas")

WARNING: Reported-by: should be immediately followed by Closes: with a URL to the report
#23:
Reported-by: syzbot+d173bf8a5a7faeede34c@syzkaller.appspotmail.com
Tested-by: syzbot+d173bf8a5a7faeede34c@syzkaller.appspotmail.com

ERROR: space required before the open brace '{'
#47: FILE: fs/ocfs2/quota_global.c:896:
+ if (!sb_has_quota_active(sb, type)){

total: 1 errors, 3 warnings, 15 lines checked

NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ocfs2-fix-slab-use-after-free-due-to-dangling-pointer-dqi_priv.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Changwei Ge <gechangwei@live.cn>
Cc: Dennis Lam <dennis.lamerice@gmail.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Models: