PineNote config categorization & update + IPsec enablement #8
Force-pushed from e999f61 to 3ac45f8
Force-pushed from 5147912 to 095c9c7
Force-pushed from 095c9c7 to 93e43f4
When creating a trace_probe we would set nr_args prior to truncating the arguments to MAX_TRACE_ARGS. However, we would only initialize arguments up to the limit.

This caused invalid memory access when attempting to set up probes with more than 128 fetchargs.

  BUG: kernel NULL pointer dereference, address: 0000000000000020
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0000 [#1] PREEMPT SMP PTI
  CPU: 0 UID: 0 PID: 1769 Comm: cat Not tainted 6.11.0-rc7+ #8
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
  RIP: 0010:__set_print_fmt+0x134/0x330

Resolve the issue by applying the MAX_TRACE_ARGS limit earlier. Return an error when there are too many arguments instead of silently truncating.

Link: https://lore.kernel.org/all/20240930202656.292869-1-mikel@mikelr.com/
Fixes: 035ba76 ("tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init")
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
The referenced commits introduced a two-step process for deleting FTEs:

- Lock the FTE, delete it from hardware, set the hardware deletion function to NULL and unlock the FTE.
- Lock the parent flow group, delete the software copy of the FTE, and remove it from the xarray.

However, this approach encounters a race condition if a rule with the same match value is added simultaneously. In this scenario, fs_core may set the hardware deletion function to NULL prematurely, causing a panic during subsequent rule deletions.

To prevent this, ensure the active flag of the FTE is checked under a lock, which will prevent the fs_core layer from attaching a new steering rule to an FTE that is in the process of deletion.

[ 438.967589] MOSHE: 2496 mlx5_del_flow_rules del_hw_func
[ 438.968205] ------------[ cut here ]------------
[ 438.968654] refcount_t: decrement hit 0; leaking memory.
[ 438.969249] WARNING: CPU: 0 PID: 8957 at lib/refcount.c:31 refcount_warn_saturate+0xfb/0x110
[ 438.970054] Modules linked in: act_mirred cls_flower act_gact sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core zram zsmalloc fuse [last unloaded: cls_flower]
[ 438.973288] CPU: 0 UID: 0 PID: 8957 Comm: tc Not tainted 6.12.0-rc1+ #8
[ 438.973888] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 438.974874] RIP: 0010:refcount_warn_saturate+0xfb/0x110
[ 438.975363] Code: 40 66 3b 82 c6 05 16 e9 4d 01 01 e8 1f 7c a0 ff 0f 0b c3 cc cc cc cc 48 c7 c7 10 66 3b 82 c6 05 fd e8 4d 01 01 e8 05 7c a0 ff <0f> 0b c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90
[ 438.976947] RSP: 0018:ffff888124a53610 EFLAGS: 00010286
[ 438.977446] RAX: 0000000000000000 RBX: ffff888119d56de0 RCX: 0000000000000000
[ 438.978090] RDX: ffff88852c828700 RSI: ffff88852c81b3c0 RDI: ffff88852c81b3c0
[ 438.978721] RBP: ffff888120fa0e88 R08: 0000000000000000 R09: ffff888124a534b0
[ 438.979353] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888119d56de0
[ 438.979979] R13: ffff888120fa0ec0 R14: ffff888120fa0ee8 R15: ffff888119d56de0
[ 438.980607] FS: 00007fe6dcc0f800(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000
[ 438.983984] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 438.984544] CR2: 00000000004275e0 CR3: 0000000186982001 CR4: 0000000000372eb0
[ 438.985205] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 438.985842] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 438.986507] Call Trace:
[ 438.986799] <TASK>
[ 438.987070] ? __warn+0x7d/0x110
[ 438.987426] ? refcount_warn_saturate+0xfb/0x110
[ 438.987877] ? report_bug+0x17d/0x190
[ 438.988261] ? prb_read_valid+0x17/0x20
[ 438.988659] ? handle_bug+0x53/0x90
[ 438.989054] ? exc_invalid_op+0x14/0x70
[ 438.989458] ? asm_exc_invalid_op+0x16/0x20
[ 438.989883] ? refcount_warn_saturate+0xfb/0x110
[ 438.990348] mlx5_del_flow_rules+0x2f7/0x340 [mlx5_core]
[ 438.990932] __mlx5_eswitch_del_rule+0x49/0x170 [mlx5_core]
[ 438.991519] ? mlx5_lag_is_sriov+0x3c/0x50 [mlx5_core]
[ 438.992054] ? xas_load+0x9/0xb0
[ 438.992407] mlx5e_tc_rule_unoffload+0x45/0xe0 [mlx5_core]
[ 438.993037] mlx5e_tc_del_fdb_flow+0x2a6/0x2e0 [mlx5_core]
[ 438.993623] mlx5e_flow_put+0x29/0x60 [mlx5_core]
[ 438.994161] mlx5e_delete_flower+0x261/0x390 [mlx5_core]
[ 438.994728] tc_setup_cb_destroy+0xb9/0x190
[ 438.995150] fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
[ 438.995650] fl_change+0x11a4/0x13c0 [cls_flower]
[ 438.996105] tc_new_tfilter+0x347/0xbc0
[ 438.996503] ? ___slab_alloc+0x70/0x8c0
[ 438.996929] rtnetlink_rcv_msg+0xf9/0x3e0
[ 438.997339] ? __netlink_sendskb+0x4c/0x70
[ 438.997751] ? netlink_unicast+0x286/0x2d0
[ 438.998171] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 438.998625] netlink_rcv_skb+0x54/0x100
[ 438.999020] netlink_unicast+0x203/0x2d0
[ 438.999421] netlink_sendmsg+0x1e4/0x420
[ 438.999820] __sock_sendmsg+0xa1/0xb0
[ 439.000203] ____sys_sendmsg+0x207/0x2a0
[ 439.000600] ? copy_msghdr_from_user+0x6d/0xa0
[ 439.001072] ___sys_sendmsg+0x80/0xc0
[ 439.001459] ? ___sys_recvmsg+0x8b/0xc0
[ 439.001848] ? generic_update_time+0x4d/0x60
[ 439.002282] __sys_sendmsg+0x51/0x90
[ 439.002658] do_syscall_64+0x50/0x110
[ 439.003040] entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: 718ce4d ("net/mlx5: Consolidate update FTE for all removal changes")
Fixes: cefc235 ("net/mlx5: Fix FTE cleanup")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241107183527.676877-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Force-pushed from 2ca3519 to eb880c6
The stanzas starting with ``file: <path>/Kconfig[.ext]`` should be sorted alphabetically. Fix the sorting so that the stanzas are actually sorted like that.
Move the various USB module configuration options under their appropriate header. This makes it easier to see to what category they belong and where to look in the kernel sources. It's also the categorization as used by the Debian kernel, which makes comparing with that easier.
The ``I2C_COMPAT`` module was dropped due to upstream commit 7e72208 ("i2c: Remove I2C_COMPAT config symbol and related code") (part of kernel 6.12).

Link: https://git.kernel.org/linus/7e722083fcc3e148838fc3dc2486c498908c4677
The CYTTSP4* modules were dropped due to upstream commit 25162a4 ("Input: cyttsp4 - remove driver") (part of kernel 6.12).

Link: https://git.kernel.org/linus/25162a4f64f8ba0065f300977589fe1f6af332f0
Before updating the configuration, categorize the modules first.
Cryptography, which includes hashing algorithms, is used in MANY places and you want to use the modules provided by the kernel, so enable the most common ones as modules, just like is being done in the Debian kernel package. Only make ``CRYPTO_SHA256`` built-in just like the Debian kernel even though the precise reason for that does not apply here. Also enable support for crypto hardware modules.

Link: https://www.wireguard.com/papers/wireguard.pdf
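As a rough sketch of the shape such a change could take in the defconfig: only ``CRYPTO_SHA256`` being built-in and hardware crypto support being enabled come from the commit message above. The two ``=m`` algorithms are illustrative picks, and mapping "crypto hardware modules" to the ``CRYPTO_HW`` symbol is an assumption; the actual list is in the diff, which is not reproduced here.

```text
# Built-in, matching the Debian kernel
CONFIG_CRYPTO_SHA256=y
# Illustrative examples of common algorithms built as modules
# (assumed picks, not taken from the diff)
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_CHACHA20POLY1305=m
# Hardware crypto device support (assumed to be what the message refers to)
CONFIG_CRYPTO_HW=y
```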
Several modules were defined more than once: two went from "XYZ is not set" to being defined and the others were just defined twice.
Categorize the modules under the ``/net/`` directory. It's (quite) possible I missed several, but those can be added later. The networking drivers, which can be found under ``/drivers/net``, are deliberately not part of this commit. Drop ``WIRELESS_EXT`` as that module is not user selectable.
The primary motivation for updating was removing the ``CFG80211_WEXT`` module, as Wireless Extensions have been deprecated for over a decade now, so there really is no reason to enable it for the PineNote. Furthermore, make the config more in line with Debian's, but leave out the options that Debian only enables because it wants to support ancient stuff.
Enable the modules for strongSwan's IPsec support. While at it, also enable ``INET_DIAG`` as that's really useful.

Link: https://docs.strongswan.org/docs/latest/install/kernelModules.html
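For orientation, a fragment along these lines would cover the core IPsec pieces. This is a sketch, not the actual diff: the symbols are real kernel options that the strongSwan kernel-module documentation lists, but which subset this commit enables, and whether each is ``=m`` or ``=y``, is an assumption here.

```text
# XFRM user-space configuration interface used by the IKE daemon
CONFIG_XFRM_USER=m
# IPv4 AH/ESP/IPComp transformations
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
# IPv6 counterparts
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
# Netfilter IPsec "policy" match
CONFIG_NETFILTER_XT_MATCH_POLICY=m
# Socket monitoring interface, enabled "while at it"
CONFIG_INET_DIAG=m
```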
Drop ``CONFIG_WLAN_VENDOR_CISCO`` as that module doesn't exist. There is a ``CONFIG_NET_VENDOR_CISCO``, but that's not used in the pinenote_defconfig.
Drop ``BT_HS`` as that module was removed in upstream commit e7b0229 ("Bluetooth: Remove BT_HS") (part of kernel 6.9).

The BT_HCIUART* modules are in ``drivers/bluetooth`` and seem to depend on tty modules, so do that separately.
The added configuration is what's already configured in the PineNote kernel but it's clearer when it's configured explicitly.
Categorize the module before improving its configuration.
The SCMI specification provides a standardized interface for power, performance and resource management on a SoC. The low-level management actions are performed by a system controller that directly controls the SoC hardware or platform. The system controller provides SCMI interfaces to its clients. A typical example of an SCMI client is an Operating System kernel.

To enable basic support for that:

- Explicitly make ``ARM_SCMI_TRANSPORT_MAILBOX``, ``ARM_SCMI_POWER_DOMAIN`` and ``ARM_SCMI_PERF_DOMAIN`` builtin to make sure they're enabled even if upstream changes the default away from 'y'
- Make the SCMI settings related to power and pinctrl builtin as they can be needed for (initial) power operations and early boot
- Build the other modules for the basic SCMI framework as modules

This makes the SCMI configuration in line with what was used when testing the TF-A "feat(rk3568): support SCMI for clock/reset domain" patch set.

Link: https://developer.arm.com/documentation/102886/001/
Link: https://developer.arm.com/documentation/den0056/latest/
Link: https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/31265
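The first bullet can be sketched as the fragment below. The three explicitly named symbols come from the commit message; ``ARM_SCMI_PROTOCOL=y`` is an assumption added here because those options sit underneath it. The power/pinctrl and remaining framework options are left out because the message does not name them.

```text
# Assumed prerequisite: the SCMI core the options below depend on
CONFIG_ARM_SCMI_PROTOCOL=y
# Named in the commit message: made explicit so an upstream
# default change cannot silently disable them
CONFIG_ARM_SCMI_TRANSPORT_MAILBOX=y
CONFIG_ARM_SCMI_POWER_DOMAIN=y
CONFIG_ARM_SCMI_PERF_DOMAIN=y
```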
The use of 'Update' is a bit of an understatement as the configuration is completely changed ... because the old one did not make any sense.

The problem:
The performance and powersave governors only set it to max and min respectively, which is rarely useful or a good idea, so remove those. The ``ARM_RK3399_DMC_DEVFREQ`` module is really only possibly useful on rk3399 based devices, but apparently it does not work there either, so remove that option as well.

The solution:
Replace the configuration with what's used on Debian, or at least the parts that are or could be useful for the PineNote.

First explicitly enable the generic DVFS support as that is useful and important (and effectively enabled in the current PN kernel).

Secondly, make the SIMPLE_ONDEMAND governor built-in as generally a/the ondemand governor is useful. Furthermore it was made built-in on Debian as well as the ARM Mali GPU driver needs a devfreq governor and the SIMPLE_ONDEMAND is excellent for that.

Also make the PASSIVE governor available (as module) as that's also available in the Debian kernel and it uses OPP tables (which are defined in the DeviceTree).

Keep the USERSPACE governor, but make it a module. Not sure if it's useful, but to reduce the risk of breaking things, don't remove it.

Make devfreq-event support explicitly available and enable the ``DEVFREQ_EVENT_ROCKCHIP_DFI`` driver as a module as that is used/configured in the rk356x SoCs.
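Put together, the description above corresponds to a devfreq section roughly like the following. This is a sketch reconstructed from the commit message, not copied from the diff: the ``=y``/``=m`` values follow the text, while writing the removed options as "is not set" lines (rather than simply deleting them) is an assumption made here for readability.

```text
# Generic DVFS (devfreq) support, made explicit
CONFIG_PM_DEVFREQ=y
# Built-in, as on Debian (the Mali GPU driver needs a governor)
CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=y
# Available as modules
CONFIG_DEVFREQ_GOV_PASSIVE=m
CONFIG_DEVFREQ_GOV_USERSPACE=m
# Removed: these only pin the frequency to max or min
# CONFIG_DEVFREQ_GOV_PERFORMANCE is not set
# CONFIG_DEVFREQ_GOV_POWERSAVE is not set
# Removed: only relevant to rk3399
# CONFIG_ARM_RK3399_DMC_DEVFREQ is not set
# devfreq-event support and the Rockchip DFI driver used on rk356x
CONFIG_PM_DEVFREQ_EVENT=y
CONFIG_DEVFREQ_EVENT_ROCKCHIP_DFI=m
```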
Force-pushed from eb880c6 to edababd
Also drop ``ARCH_RANDOM`` as that doesn't exist.
Force-pushed from 71a1391 to 0aed29f
This cleans up the configuration a LOT, but it's still not complete. As this was all taking a bit more time than anticipated, I stopped trying to make the configuration more in line with that of the Debian kernel. When adding new modules, I did use the configuration as used in the Debian kernel, but I did not change the configuration of existing items. I do not expect any regressions, and there may be some improvements.
Drop ``VIDEO_V4L2_SUBDEV_API`` as it's not user selectable.
The ``MEDIA_PLATFORM_SUPPORT`` and ``VIDEO_DEV`` options were already effectively enabled in the PineNote kernel, but V4L2 is used a lot in embedded devices/SoCs, so let's make it explicit.
The ``rk356x.dtsi`` file contains the configuration shared by all rk3566 and rk3568 based devices, so improve the support for the SoC by adding the modules that were missing. The configuration I used for those modules matches what is used in the Debian kernel and therefore a lot of them are built as modules.

The configuration for the modules which were already present was left untouched even though it regularly differs from what the Debian kernel uses. The difference usually boils down to ``=y`` vs ``=m`` as kernel modules are built as loadable modules by default in the Debian kernel.

The only setting that was changed was enabling ``HW_RANDOM`` as module as that is a dependency of ``HW_RANDOM_ROCKCHIP``. As it turns out, a lot of other HW_RANDOM_* modules take their default from ``HW_RANDOM``, so explicitly disable them even though most wouldn't be enabled due to other missing dependencies. I prefer explicitly enabling modules instead of modules getting enabled 'accidentally' due to enabling of some other kernel module.

In a similar vein, explicitly disable ``CLK_RK3576`` and ``CLK_RK3588`` similar to other CLK_RK* which depend on ``ARM64`` as they would otherwise get enabled due to ``COMMON_CLK_ROCKCHIP``.

Also add ``PCI_ENDPOINT`` because that's a dependency of ``PCIE_ROCKCHIP_DW_EP``.
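The explicitly named settings can be sketched as the fragment below. Only symbols mentioned in the commit message are shown; the ``=m`` value for ``HW_RANDOM_ROCKCHIP`` and the ``=y`` values for the two PCI options are assumptions, and the individual HW_RANDOM_* drivers that get disabled are omitted because the message does not list them.

```text
# Hardware RNG core as a module, needed by the Rockchip driver
CONFIG_HW_RANDOM=m
CONFIG_HW_RANDOM_ROCKCHIP=m
# Would otherwise be pulled in via COMMON_CLK_ROCKCHIP on ARM64
# CONFIG_CLK_RK3576 is not set
# CONFIG_CLK_RK3588 is not set
# Endpoint framework is a dependency of the DesignWare endpoint driver
CONFIG_PCI_ENDPOINT=y
CONFIG_PCIE_ROCKCHIP_DW_EP=y
```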
The ``rk3566.dtsi`` file contains the configuration shared by all rk3566 based devices, so add the missing modules from that too.
The (upstream) ``rk3566-pinenote.dtsi`` file contains the configuration shared by all PineNote devices, so add the missing modules from that too.
Start by adding the missing modules from rk3566-quartz64-a.dts.

There were 3 previous commits related to Quartz64 Model A:

- 540f666 ("add some kernel options to make the kernel compatible to the Quartz64-A")
- 7135c9a ("pinenote defconfig: add a few sata and pcie related kernel options for the Quartz64-A")
- e30c02f ("a few more quartz-64-a defconfig settings")

Update the items which were added there to match what's currently used in the Debian kernel as there is no need to differentiate from it.
Enable the USB_VIDEO_CLASS modules to support the Logitech StreamCam as requested by Caffeine (Camden) on IRC. As this is a USB device and the module is a generic one, other devices which use UVC should work now too.
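A minimal sketch of what enabling UVC typically involves is shown below. ``USB_VIDEO_CLASS`` is the option named above; the two media-support lines are assumed prerequisites and ``=m`` is an assumed value, so the actual diff may differ.

```text
# Assumed prerequisites for USB camera drivers
CONFIG_MEDIA_CAMERA_SUPPORT=y
CONFIG_MEDIA_USB_SUPPORT=y
# Generic USB Video Class driver (covers the Logitech StreamCam)
CONFIG_USB_VIDEO_CLASS=m
```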
Force-pushed from 0aed29f to 82bce21
commit 44d1745 upstream.

Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock on x86 due to a chain of locks and SRCU synchronizations. Translating the below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the fairness of r/w semaphores).

    CPU0                     CPU1                     CPU2
1   lock(&kvm->slots_lock);
2                                                     lock(&vcpu->mutex);
3                                                     lock(&kvm->srcu);
4                            lock(cpu_hotplug_lock);
5                            lock(kvm_lock);
6                            lock(&kvm->slots_lock);
7                                                     lock(cpu_hotplug_lock);
8   sync(&kvm->srcu);

Note, there are likely more potential deadlocks in KVM x86, e.g. the same pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with __kvmclock_cpufreq_notifier():

  cpuhp_cpufreq_online()
  |
  -> cpufreq_online()
     |
     -> cpufreq_gov_performance_limits()
        |
        -> __cpufreq_driver_target()
           |
           -> __target_index()
              |
              -> cpufreq_freq_transition_begin()
                 |
                 -> cpufreq_notify_transition()
                    |
                    -> ... __kvmclock_cpufreq_notifier()

But, actually triggering such deadlocks is beyond rare due to the combination of dependencies and timings involved. E.g. the cpufreq notifier is only used on older CPUs without a constant TSC, mucking with the NX hugepage mitigation while VMs are running is very uncommon, and doing so while also onlining/offlining a CPU (necessary to generate contention on cpu_hotplug_lock) would be even more unusual.

The most robust solution to the general cpu_hotplug_lock issue is likely to switch vm_list to be an RCU-protected list, e.g. so that x86's cpufreq notifier doesn't have to take kvm_lock. For now, settle for fixing the most blatant deadlock, as switching to an RCU-protected list is a much more involved change, but add a comment in locking.rst to call out that care needs to be taken when holding kvm_lock and walking vm_list.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S O
  ------------------------------------------------------
  tee/35048 is trying to acquire lock:
  ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm]

  but task is already holding lock:
  ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm]

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #3 (kvm_lock){+.+.}-{3:3}:
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         kvm_dev_ioctl+0x4fb/0xe50 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #2 (cpu_hotplug_lock){++++}-{0:0}:
         cpus_read_lock+0x2e/0xb0
         static_key_slow_inc+0x16/0x30
         kvm_lapic_set_base+0x6a/0x1c0 [kvm]
         kvm_set_apic_base+0x8f/0xe0 [kvm]
         kvm_set_msr_common+0x9ae/0xf80 [kvm]
         vmx_set_msr+0xa54/0xbe0 [kvm_intel]
         __kvm_set_msr+0xb6/0x1a0 [kvm]
         kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm]
         kvm_vcpu_ioctl+0x485/0x5b0 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #1 (&kvm->srcu){.+.+}-{0:0}:
         __synchronize_srcu+0x44/0x1a0
         synchronize_srcu_expedited+0x21/0x30
         kvm_swap_active_memslots+0x110/0x1c0 [kvm]
         kvm_set_memslot+0x360/0x620 [kvm]
         __kvm_set_memory_region+0x27b/0x300 [kvm]
         kvm_vm_ioctl_set_memory_region+0x43/0x60 [kvm]
         kvm_vm_ioctl+0x295/0x650 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #0 (&kvm->slots_lock){+.+.}-{3:3}:
         __lock_acquire+0x15ef/0x2e30
         lock_acquire+0xe0/0x260
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         set_nx_huge_pages+0x179/0x1e0 [kvm]
         param_attr_store+0x93/0x100
         module_attr_store+0x22/0x40
         sysfs_kf_write+0x81/0xb0
         kernfs_fop_write_iter+0x133/0x1d0
         vfs_write+0x28d/0x380
         ksys_write+0x70/0xe0
         __x64_sys_write+0x1f/0x30
         x64_sys_call+0x281b/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

Cc: Chao Gao <chao.gao@intel.com>
Fixes: 0bf5049 ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
Cc: stable@vger.kernel.org
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240830043600.127750-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On a 64-bit build:

  drivers/usb/gadget/function/f_gud.c: In function ‘f_gud_alloc_func_inst’:
  drivers/usb/gadget/function/f_gud.c:855:21: warning: conversion from ‘long unsigned int’ to ‘u32’ {aka ‘unsigned int’} changes value from ‘18446744073709551615’ to ‘4294967295’ [-Woverflow]
    855 | opts->connectors = ~0UL;

Change to unsigned.

Fixes #8

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
This PR does the following updates:
- Add the missing modules from ``rk356x.dtsi``, ``rk3566.dtsi`` and ``rk3566-pinenote.dtsi``, and update the module config for Quartz64-A

This also makes it more in line with Debian's kernel config.