-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dummy bat #29
Closed
Closed
Dummy bat #29
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
commit 97367c9 upstream. It turned out that the current implementation of the port subscription is racy. The subscription contains two linked lists, and we have to add to or delete from both lists. Since both connection and disconnection procedures perform the same order for those two lists (i.e. src list, then dest list), when a deletion happens during a connection procedure, the src list may be deleted before the dest list addition completes, and this may lead to a use-after-free or an Oops, even though the access to both lists are protected via mutex. The simple workaround for this race is to change the access order for the disconnection, namely, dest list, then src list. This assures that the connection has been established when disconnecting, and also the concurrent deletion can be avoided. Reported-and-tested-by: folkert <folkert@vanheusden.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210801182754.GP890690@belle.intranet.vanheusden.com Link: https://lore.kernel.org/r/20210803114312.2536-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7e71b85 ] U-Boot attempts to fix up the "clock-frequency" property of the "/sysclk" node: https://elixir.bootlin.com/u-boot/v2021.04/source/arch/arm/cpu/armv8/fsl-layerscape/fdt.c#L512 but fails to do so: ## Booting kernel from Legacy Image at a1000000 ... Image Name: Created: 2021-06-08 10:31:38 UTC Image Type: AArch64 Linux Kernel Image (gzip compressed) Data Size: 15431370 Bytes = 14.7 MiB Load Address: 80080000 Entry Point: 80080000 Verifying Checksum ... OK ## Flattened Device Tree blob at a0000000 Booting using the fdt blob at 0xa0000000 Uncompressing Kernel Image Loading Device Tree to 00000000fbb19000, end 00000000fbb22717 ... OK Unable to update property /sysclk:clock-frequency, err=FDT_ERR_NOTFOUND Starting kernel ... All Layerscape SoCs except LS1028A use "sysclk" as the node name, and not "clock-sysclk". So change the node name of LS1028A accordingly. Fixes: 8897f32 ("arm64: dts: Add support for NXP LS1028A SoC") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f9613aa ] Commit e76bdfd ("ARM: imx: Added perf functionality to mmdc driver") introduced imx_mmdc_remove(), the mmdc_base need be unmapped in it if config PERF_EVENTS is enabled. If imx_mmdc_perf_init() fails, the mmdc_base also need be unmapped. Fixes: e76bdfd ("ARM: imx: Added perf functionality to mmdc driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f07ec85 ] clock source is prepared and enabled by clk_prepare_enable() in probe function, but no disable or unprepare in remove and error path. Fixes: 9454a0c ("ARM: imx: add mmdc ipg clock operation for mmdc") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fd8e838 ] The AR803x PHY used on this modules seems to require the reset line to be asserted for around 10ms in order to avoid rare cases where the PHY gets stuck in an incoherent state that prevents it to function correctly. The previous value of 2ms was found to be problematic on some setups, causing intermittent issues where the PHY would be unresponsive every once in a while on some sytems, with a low occurrence (it typically took around 30 consecutive reboots to encounter the issue). Bumping the delay to the 10ms makes the issue dissapear, with more than 2500 consecutive reboots performed without the issue showing-up. Fixes: 208d7ba ("ARM: imx: initial SolidRun HummingBoard support") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Hervé Codina <herve.codina@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 828db68 ] NXP and AzureWave don't recommend using SDIO bus mode 3.3V@50MHz due to noise affecting the wireless throughput. Colibri iMX6ULL uses only 3.3V signaling for Wi-Fi module AW-CM276NF. Limit the SDIO Clock on Colibri iMX6ULL to 25MHz. Fixes: c2e4987 ("ARM: dts: imx6ull: add Toradex Colibri iMX6ULL support") Signed-off-by: Oleksandr Suvorov <oleksandr.suvorov@toradex.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 20fb739 ] The function imx_mmdc_perf_init recently had a 3rd argument added to it but the equivalent macro was not updated and is still the older 2 argument version. Fix this by adding in the missing 3rd argumement mmdc_ipg_clk. Fixes: f07ec85 ("ARM: imx: add missing clk_disable_unprepare()") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d9e30a ] The pinctrl_power_button/pinctrl_power_out each define single GPIO pinmux, except it is exactly the other one than the matching gpio-keys and gpio-poweroff DT nodes use for that functionality. Swap the two GPIOs to correct this error. Fixes: 50d29fd ("ARM: dts: imx53: Add power GPIOs on M53Menlo") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Fabio Estevam <festevam@gmail.com> Cc: NXP Linux Team <linux-imx@nxp.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee7ab3f ] Some SFP modules are not detected when i2c-fast-mode is enabled even when clock-frequency is already set to 100000. The I2C bus violates the timing specifications when run in fast mode. So disable fast mode on Turris Mox. Same change was already applied for uDPU (also Armada 3720 board with SFP) in commit fe3ec63 ("arm64: dts: uDPU: remove i2c-fast-mode"). Fixes: 7109d81 ("arm64: dts: marvell: add DTS for Turris Mox") Signed-off-by: Pali Rohár <pali@kernel.org> Reviewed-by: Marek Behún <kabel@kernel.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4511781 ] The following scenario describes an echo test for Samsung USBC Headset (AKG) with VID/PID (0x04e8/0xa051). We first start a capture stream(USB IN transfer) in 96Khz/24bit/1ch mode. In clock find source function, we get value 0x2 for clock selector and 0x1 for clock source. Kernel-4.14 behavior Since clock source is valid so clock selector was not set again. We pass through this function and start a playback stream(USB OUT transfer) in 48Khz/32bit/2ch mode. This time we get value 0x1 for clock selector and 0x1 for clock source. Finally clock id with this setting is 0x9. Kernel-5.10 behavior Clock selector was always set one more time even it is valid. When we start a playback stream, we will get 0x2 for clock selector and 0x1 for clock source. In this case clock id becomes 0xA. This is an incorrect clock source setting and results in severe noises. We see wrong data rate in USB IN transfer. (From 288 bytes/ms becomes 144 bytes/ms) It should keep in 288 bytes/ms. This earphone works fine on older kernel version load because this is a newly-added behavior. Fixes: d2e8f64 ("ALSA: usb-audio: Explicitly set up the clock selector") Signed-off-by: chihhao.chen <chihhao.chen@mediatek.com> Link: https://lore.kernel.org/r/1627100621-19225-1-git-send-email-chihhao.chen@mediatek.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 24b5b19 ] Enabling the framebuffer leads to a system hang. Running, as a debug hack, the store_pan() function in drivers/video/fbdev/core/fbsysfs.c without taking the console_lock, allows to see the crash backtrace on the serial line. ~ # echo 0 0 > /sys/class/graphics/fb0/pan [ 9.719414] Unhandled exception: IPSR = 00000005 LR = fffffff1 [ 9.726937] CPU: 0 PID: 49 Comm: sh Not tainted 5.13.0-rc5 intel#9 [ 9.733008] Hardware name: STM32 (Device Tree Support) [ 9.738296] PC is at clk_gate_is_enabled+0x0/0x28 [ 9.743426] LR is at stm32f4_pll_div_set_rate+0xf/0x38 [ 9.748857] pc : [<0011e4be>] lr : [<0011f9e3>] psr: 0100000b [ 9.755373] sp : 00bc7be0 ip : 00000000 fp : 001f3ac4 [ 9.760812] r10: 002610d0 r9 : 01efe920 r8 : 00540560 [ 9.766269] r7 : 02e7ddb0 r6 : 0173eed8 r5 : 00000000 r4 : 004027c0 [ 9.773081] r3 : 0011e4bf r2 : 02e7ddb0 r1 : 0173eed8 r0 : 1d3267b8 [ 9.779911] xPSR: 0100000b [ 9.782719] CPU: 0 PID: 49 Comm: sh Not tainted 5.13.0-rc5 intel#9 [ 9.788791] Hardware name: STM32 (Device Tree Support) [ 9.794120] [<0000afa1>] (unwind_backtrace) from [<0000a33f>] (show_stack+0xb/0xc) [ 9.802421] [<0000a33f>] (show_stack) from [<0000a8df>] (__invalid_entry+0x4b/0x4c) The `pll_num' field in the post_div_data configuration contained a wrong value which also referenced an uninitialized hardware clock when clk_register_pll_div() was called. Fixes: 517633e ("clk: stm32f4: Add post divisor for I2S & SAI PLLs") Signed-off-by: Dario Binacchi <dariobin@libero.it> Reviewed-by: Gabriel Fernandez <gabriel.fernandez@st.com> Link: https://lore.kernel.org/r/20210725160725.10788-1-dariobin@libero.it Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
…d-regulator [ Upstream commit c68ef4a ] This device tree include file describes a fixed-regulator connecting smps7_reg output (1.8V) to some 1.8V rail and consumers (vdds_1v8_main). This regulator does not physically exist. I assume it was introduced as a wrapper around smps7_reg to provide a speaking signal name "vdds_1v8_main" as label. This fixed-regulator without real function was not an issue in driver code until Commit 98e48cd ("regulator: core: resolve supply for boot-on/always-on regulators") introduced a new check for regulator initialization which makes Palmas regulator registration fail: [ 5.407712] ldo1: supplied by vsys_cobra [ 5.412748] ldo2: supplied by vsys_cobra [ 5.417603] palmas-pmic 48070000.i2c:palmas@48:palmas_pmic: failed to register 48070000.i2c:palmas@48:palmas_pmic regulator The reason is that the supply-chain of regulators is too long and goes from ldo3 through the virtual vdds_1v8_main regulator and then back to smps7. This adds a cross-dependency of probing Palmas regulators and the fixed-regulator which leads to probe deferral by the new check and is no longer resolved. Since we do not control what device tree files including this one reference (either &vdds_1v8_main or &smps7_reg or both) we keep both labels for smps7 for compatibility. Fixes: 98e48cd ("regulator: core: resolve supply for boot-on/always-on regulators") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 135cbd3 ] Since 00b80ac ("spi: imx: mx51-ecspi: Move some initialisation to prepare_message hook."), the MX51_ECSPI_CONFIG write no longer happens in prepare_transfer hook, but rather in prepare_message hook, however the MX51_ECSPI_CONFIG delay is still left in prepare_transfer hook and thus has no effect. This leads to low bus frequency operation problems described in 6fd8b85 ("spi: spi-imx: Fix out-of-order CS/SCLK operation at low speeds") again. Move the MX51_ECSPI_CONFIG write delay into the prepare_message hook as well, thus reinstating the low bus frequency fix. Fixes: 00b80ac ("spi: imx: mx51-ecspi: Move some initialisation to prepare_message hook.") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210703022300.296114-1-marex@denx.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 53ca18a ] The spi_imx->spi_bus_clk may be uninitialized and thus also zero in mx51_ecspi_prepare_message(), which would lead to division by zero in kernel. Since bitbang .setup_transfer callback which initializes the spi_imx->spi_bus_clk is called after bitbang prepare_message callback, iterate over all the transfers in spi_message, find the one with lowest bus frequency, and use that bus frequency for the delay calculation. Note that it is not possible to move this CONFIGREG delay back into the .setup_transfer callback, because that is invoked too late, after the GPIO chipselects were already configured. Fixes: 135cbd3 ("spi: imx: mx51-ecspi: Reinstate low-speed CONFIGREG delay") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210726100102.5188-1-marex@denx.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c04243 ] Media event code 3 is defined in the MMC-6 spec as follows: "MediaRemoval: The media has been removed from the specified slot, and the Drive is unable to access the media without user intervention. This applies to media changers only." This indicated that treating the condition as an EJECT_REQUEST was appropriate. However, doing so had the unfortunate side-effect of causing the drive tray to be physically ejected on resume. Instead treat the event as a MEDIA_CHANGE request. Fixes: 7dd753c ("scsi: sr: Return appropriate error code when disk is ejected") Link: https://bugzilla.kernel.org/show_bug.cgi?id=213759 Link: https://lore.kernel.org/r/20210726114913.6760-1-limanyi@uniontech.com Signed-off-by: Li Manyi <limanyi@uniontech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c592b46 ] If a vb2_queue sets q->min_buffers_needed then when the number of queued buffers reaches q->min_buffers_needed, vb2_core_qbuf() will call the start_streaming() callback. If start_streaming() returns an error, then that error was just returned by vb2_core_qbuf(), but the buffer was still queued. However, userspace expects that if VIDIOC_QBUF fails, the buffer is returned dequeued. So if start_streaming() fails, then remove the buffer from the queue, thus avoiding this unwanted side-effect. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Fixes: b3379c6 ("[media] vb2: only call start_streaming if sufficient buffers are queued") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7199dde ] Commit dea7a9f ("dmaengine: imx-dma: remove dma_slave_config direction usage") changes the method from a "configuration when called" to an "configuration when used". Due to this, only the cyclic DMA type gets configured correctly, while the generic DMA type is left non-configured. Without this additional call, the struct imxdma_channel::word_size member is stuck at DMA_SLAVE_BUSWIDTH_UNDEFINED and imxdma_prep_slave_sg() always returns NULL. Signed-off-by: Juergen Borleis <jbe@pengutronix.de> Fixes: dea7a9f ("dmaengine: imx-dma: remove dma_slave_config direction usage") Link: https://lore.kernel.org/r/20210729071821.9857-1-jbe@pengutronix.de Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d51c590 ] GSO expects inner transport header offset to be valid when skb->encapsulation flag is set. GSO uses this value to calculate the length of an individual segment of a GSO packet in skb_gso_transport_seglen(). However, tcp/udp gro_complete callbacks don't update the skb->inner_transport_header when processing an encapsulated TCP/UDP segment. As a result a GRO skb has ->inner_transport_header set to a value carried over from earlier skb processing. This can have mild to tragic consequences. From miscalculating the GSO segment length to triggering a page fault [1], when trying to read TCP/UDP header at an address past the skb->data page. The latter scenario leads to an oops report like so: BUG: unable to handle page fault for address: ffff9fa7ec00d008 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 123f201067 P4D 123f201067 PUD 123f209067 PMD 0 Oops: 0000 [intel#1] SMP NOPTI CPU: 44 PID: 0 Comm: swapper/44 Not tainted 5.4.53-cloudflare-2020.7.21 intel#1 Hardware name: HYVE EDGE-METAL-GEN10/HS-1811DLite1, BIOS V2.15 02/21/2020 RIP: 0010:skb_gso_transport_seglen+0x44/0xa0 Code: c0 41 83 e0 11 f6 87 81 00 00 00 20 74 30 0f b7 87 aa 00 00 00 0f [...] RSP: 0018:ffffad8640bacbb8 EFLAGS: 00010202 RAX: 000000000000feda RBX: ffff9fcc8d31bc00 RCX: ffff9fa7ec00cffc RDX: ffff9fa7ebffdec0 RSI: 000000000000feda RDI: 0000000000000122 RBP: 00000000000005c4 R08: 0000000000000001 R09: 0000000000000000 R10: ffff9fe588ae3800 R11: ffff9fe011fc92f0 R12: ffff9fcc8d31bc00 R13: ffff9fe0119d4300 R14: 00000000000005c4 R15: ffff9fba57d70900 FS: 0000000000000000(0000) GS:ffff9fe68df00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff9fa7ec00d008 CR3: 0000003e99b1c000 CR4: 0000000000340ee0 Call Trace: <IRQ> skb_gso_validate_network_len+0x11/0x70 __ip_finish_output+0x109/0x1c0 ip_sublist_rcv_finish+0x57/0x70 ip_sublist_rcv+0x2aa/0x2d0 ? ip_rcv_finish_core.constprop.0+0x390/0x390 ip_list_rcv+0x12b/0x14f __netif_receive_skb_list_core+0x2a9/0x2d0 netif_receive_skb_list_internal+0x1b5/0x2e0 napi_complete_done+0x93/0x140 veth_poll+0xc0/0x19f [veth] ? mlx5e_napi_poll+0x221/0x610 [mlx5_core] net_rx_action+0x1f8/0x790 __do_softirq+0xe1/0x2bf irq_exit+0x8e/0xc0 do_IRQ+0x58/0xe0 common_interrupt+0xf/0xf </IRQ> The bug can be observed in a simple setup where we send IP/GRE/IP/TCP packets into a netns over a veth pair. Inside the netns, packets are forwarded to dummy device: trafgen -> [veth A]--[veth B] -forward-> [dummy] For veth B to GRO aggregate packets on receive, it needs to have an XDP program attached (for example, a trivial XDP_PASS). Additionally, for UDP, we need to enable GSO_UDP_L4 feature on the device: ip netns exec A ethtool -K AB rx-udp-gro-forwarding on The last component is an artificial delay to increase the chances of GRO batching happening: ip netns exec A tc qdisc add dev AB root \ netem delay 200us slot 5ms 10ms packets 2 bytes 64k With such a setup in place, the bug can be observed by tracing the skb outer and inner offsets when GSO skb is transmitted from the dummy device: tcp: FUNC DEV SKB_LEN NH TH ENC INH ITH GSO_SIZE GSO_TYPE ip_finish_output dumB 2830 270 290 1 294 254 1383 (tcpv4,gre,) ^^^ udp: FUNC DEV SKB_LEN NH TH ENC INH ITH GSO_SIZE GSO_TYPE ip_finish_output dumB 2818 270 290 1 294 254 1383 (gre,udp_l4,) ^^^ Fix it by updating the inner transport header offset in tcp/udp gro_complete callbacks, similar to how {inet,ipv6}_gro_complete callbacks update the inner network header offset, when skb->encapsulation flag is set. [1] https://lore.kernel.org/netdev/CAKxSbF01cLpZem2GFaUaifh0S-5WYViZemTicAg7FCHOnh6kug@mail.gmail.com/ Fixes: bf296b1 ("tcp: Add GRO support") Fixes: f993bc2 ("net: core: handle encapsulation offloads when computing segment lengths") Fixes: e20cf8d ("udp: implement GRO for plain UDP sockets.") Reported-by: Alex Forster <aforster@cloudflare.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
….port_fdb_add [ Upstream commit e11e865 ] The SJA1105 switch family leaves it up to software to decide where within the FDB to install a static entry, and to concatenate destination ports for already existing entries (the FDB is also used for multicast entries), it is not as simple as just saying "please add this entry". This means we first need to search for an existing FDB entry before adding a new one. The driver currently manages to fool itself into thinking that if an FDB entry already exists, there is nothing to be done. But that FDB entry might be dynamically learned, case in which it should be replaced with a static entry, but instead it is left alone. This patch checks the LOCKEDS ("locked/static") bit from found FDB entries, and lets the code "goto skip_finding_an_index;" if the FDB entry was not static. So we also need to move the place where we set LOCKEDS = true, to cover the new case where a dynamic FDB entry existed but was dynamic. Fixes: 291d1e7 ("net: dsa: sja1105: Add support for FDB and MDB management") Fixes: 1da7382 ("net: dsa: sja1105: Add FDB operations for P/Q/R/S series") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
…y with statically added ones [ Upstream commit 6c5fc15 ] The procedure to add a static FDB entry in sja1105 is concurrent with dynamic learning performed on all bridge ports and the CPU port. The switch looks up the FDB from left to right, and also learns dynamically from left to right, so it is possible that between the moment when we pick up a free slot to install an FDB entry, another slot to the left of that one becomes free due to an address ageing out, and that other slot is then immediately used by the switch to learn dynamically the same address as we're trying to add statically. The result is that we succeeded to add our static FDB entry, but it is being shadowed by a dynamic FDB entry to its left, and the switch will behave as if our static FDB entry did not exist. We cannot really prevent this from happening unless we make the entire process to add a static FDB entry a huge critical section where address learning is temporarily disabled on _all_ ports, and then re-enabled according to the configuration done by sja1105_port_set_learning. However, that is kind of disruptive for the operation of the network. What we can do alternatively is to simply read back the FDB for dynamic entries located before our newly added static one, and delete them. This will guarantee that our static FDB entry is now operational. It will still not guarantee that there aren't dynamic FDB entries to the _right_ of that static FDB entry, but at least those entries will age out by themselves since they aren't hit, and won't bother anyone. Fixes: 291d1e7 ("net: dsa: sja1105: Add support for FDB and MDB management") Fixes: 1da7382 ("net: dsa: sja1105: Add FDB operations for P/Q/R/S series") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a5e63c7 ] The logic for discerning between KSZ8051 and KSZ87XX PHYs is incorrect such that the that KSZ87XX switch is not identified correctly. ksz8051_ksz8795_match_phy_device() uses the parameter ksz_phy_id to discriminate whether it was called from ksz8051_match_phy_device() or from ksz8795_match_phy_device() but since PHY_ID_KSZ87XX is the same value as PHY_ID_KSZ8051, this doesn't work. Instead use a bool to discriminate the caller. Without this patch, the KSZ8795 switch port identifies as: ksz8795-switch spi3.1 ade1 (uninitialized): PHY [dsa-0.1:03] driver [Generic PHY] With the patch, it identifies correctly: ksz8795-switch spi3.1 ade1 (uninitialized): PHY [dsa-0.1:03] driver [Micrel KSZ87XX Switch] Fixes: 8b95599 ("net: phy: micrel: Discern KSZ8051 and KSZ8795 PHYs") Signed-off-by: Steve Bennett <steveb@workware.net.au> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7fe74df ] Replace pci_enable_device() with pcim_enable_device(), pci_disable_device() and pci_release_regions() will be called in release automatically. Fixes: 1da177e ("Linux-2.6.12-rc2") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b87f43 ] The tqmx86 MFD driver was passing IRQ 0 for "no IRQ" in the past. This causes warnings with newer kernels. Prepare the gpio-tqmx86 driver for the fixed MFD driver by handling a missing IRQ properly. Fixes: b868db9 ("gpio: tqmx86: Add GPIO from for this IO controller") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae954bb ] In commit 58acd10 ("sctp: update active_key for asoc when old key is being replaced"), sctp_auth_asoc_init_active_key() is called to update the active_key right after the old key is deleted and before the new key is added, and it caused that the active_key could be found with the key_id. In Ying Xu's testing, the BUG_ON in sctp_auth_asoc_init_active_key() was triggered: [ ] kernel BUG at net/sctp/auth.c:416! [ ] RIP: 0010:sctp_auth_asoc_init_active_key.part.8+0xe7/0xf0 [sctp] [ ] Call Trace: [ ] sctp_auth_set_key+0x16d/0x1b0 [sctp] [ ] sctp_setsockopt.part.33+0x1ba9/0x2bd0 [sctp] [ ] __sys_setsockopt+0xd6/0x1d0 [ ] __x64_sys_setsockopt+0x20/0x30 [ ] do_syscall_64+0x5b/0x1a0 So fix it by moving the active_key update after sh_keys is added. Fixes: 58acd10 ("sctp: update active_key for asoc when old key is being replaced") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9fdc5d8 ] Pauseframe control is set to symmetric mode by default on the NFP. Pause frames can not be configured through ethtool now, but ethtool can report the supported mode. Fixes: 265aeb5 ("nfp: add support for .get_link_ksettings()") Signed-off-by: Fei Qin <fei.qin@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4039146 ] The patch fixing the returned value of ip6_skb_dst_mtu (int -> unsigned int) was rebased between its initial review and the version applied. In the meantime fade564 was applied, which added a new variable (int) used as the returned value. This lead to a mismatch between the function prototype and the variable used as the return value. Fixes: 40fc305 ("net: ipv6: fix return value of ip6_skb_dst_mtu") Cc: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28bbbb9 ] When cross compiling a MIPS kernel on a BSD based HOSTCC leads to errors like SYNC include/config/auto.conf.cmd - due to: .config egrep: empty (sub)expression UPD include/config/kernel.release HOSTCC scripts/dtc/dtc.o - due to target missing It turns out that egrep uses this egrep pattern: (|MINOR_|PATCHLEVEL_) This is not valid syntax or gives undefined results according to POSIX 9.5.3 ERE Grammar https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html It seems to be silently accepted by the Linux egrep implementation while a BSD host complains. Such patterns can be replaced by a transformation like "(|p1|p2)" -> "(p1|p2)?" Fixes: 48c35b2 ("[MIPS] There is no __GNUC_MAJOR__") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fb65382 ] Set the error code if bnx2x_alloc_fw_stats_mem() fails. The current code returns success. Fixes: ad5afc8 ("bnx2x: Separate VF and PF logic") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit af35fc3 ] Syzbot reported uninit value pegasus_probe(). The problem was in missing error handling. get_interrupt_interval() internally calls read_eprom_word() which can fail in some cases. For example: failed to receive usb control message. These cases should be handled to prevent uninit value bug, since read_eprom_word() will not initialize passed stack variable in case of internal failure. Fail log: BUG: KMSAN: uninit-value in get_interrupt_interval drivers/net/usb/pegasus.c:746 [inline] BUG: KMSAN: uninit-value in pegasus_probe+0x10e7/0x4080 drivers/net/usb/pegasus.c:1152 CPU: 1 PID: 825 Comm: kworker/1:1 Not tainted 5.12.0-rc6-syzkaller #0 ... Workqueue: usb_hub_wq hub_event Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x24c/0x2e0 lib/dump_stack.c:120 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x5c/0xa0 mm/kmsan/kmsan_instr.c:197 get_interrupt_interval drivers/net/usb/pegasus.c:746 [inline] pegasus_probe+0x10e7/0x4080 drivers/net/usb/pegasus.c:1152 .... Local variable ----data.i@pegasus_probe created at: get_interrupt_interval drivers/net/usb/pegasus.c:1151 [inline] pegasus_probe+0xe57/0x4080 drivers/net/usb/pegasus.c:1152 get_interrupt_interval drivers/net/usb/pegasus.c:1151 [inline] pegasus_probe+0xe57/0x4080 drivers/net/usb/pegasus.c:1152 Reported-and-tested-by: syzbot+02c9f70f3afae308464a@syzkaller.appspotmail.com Fixes: 1da177e ("Linux-2.6.12-rc2") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20210804143005.439-1-paskripkin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 57a1681 ] The function tpci200_register called by tpci200_install and tpci200_unregister called by tpci200_uninstall are in pair. However, tpci200_unregister has some cleanup operations not in the tpci200_register. So the error handling code of tpci200_pci_probe has many different double free issues. Fix this problem by moving those cleanup operations out of tpci200_unregister, into tpci200_pci_remove and reverting the previous commit 9272e5d ("ipack/carriers/tpci200: Fix a double free in tpci200_pci_probe"). Fixes: 9272e5d ("ipack/carriers/tpci200: Fix a double free in tpci200_pci_probe") Cc: stable@vger.kernel.org Reported-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Link: https://lore.kernel.org/r/20210810100323.3938492-1-mudongliangabcd@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50f05bd ] The error handling code in tpci200_register does not free interface_regs allocated by ioremap and the current version of error handling code is problematic. Fix this by refactoring the error handling code and free interface_regs when necessary. Fixes: 4398679 ("ipack: add error handling for ioremap_nocache") Cc: stable@vger.kernel.org Reported-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Link: https://lore.kernel.org/r/20210810100323.3938492-2-mudongliangabcd@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
… different parents [ Upstream commit 3f79f6f ] Cross-rename lacks a check when that would prevent exchanging a directory and subvolume from different parent subvolume. This causes data inconsistencies and is caught before commit by tree-checker, turning the filesystem to read-only. Calling the renameat2 with RENAME_EXCHANGE flags like renameat2(AT_FDCWD, namesrc, AT_FDCWD, namedest, (1 << 1)) on two paths: namesrc = dir1/subvol1/dir2 namedest = subvol2/subvol3 will cause key order problem with following write time tree-checker report: [1194842.307890] BTRFS critical (device loop1): corrupt leaf: root=5 block=27574272 slot=10 ino=258, invalid previous key objectid, have 257 expect 258 [1194842.322221] BTRFS info (device loop1): leaf 27574272 gen 8 total ptrs 11 free space 15444 owner 5 [1194842.331562] BTRFS info (device loop1): refs 2 lock_owner 0 current 26561 [1194842.338772] item 0 key (256 1 0) itemoff 16123 itemsize 160 [1194842.338793] inode generation 3 size 16 mode 40755 [1194842.338801] item 1 key (256 12 256) itemoff 16111 itemsize 12 [1194842.338809] item 2 key (256 84 2248503653) itemoff 16077 itemsize 34 [1194842.338817] dir oid 258 type 2 [1194842.338823] item 3 key (256 84 2363071922) itemoff 16043 itemsize 34 [1194842.338830] dir oid 257 type 2 [1194842.338836] item 4 key (256 96 2) itemoff 16009 itemsize 34 [1194842.338843] item 5 key (256 96 3) itemoff 15975 itemsize 34 [1194842.338852] item 6 key (257 1 0) itemoff 15815 itemsize 160 [1194842.338863] inode generation 6 size 8 mode 40755 [1194842.338869] item 7 key (257 12 256) itemoff 15801 itemsize 14 [1194842.338876] item 8 key (257 84 2505409169) itemoff 15767 itemsize 34 [1194842.338883] dir oid 256 type 2 [1194842.338888] item 9 key (257 96 2) itemoff 15733 itemsize 34 [1194842.338895] item 10 key (258 12 256) itemoff 15719 itemsize 14 [1194842.339163] BTRFS error (device loop1): block=27574272 write time tree block corruption detected [1194842.339245] ------------[ cut here ]------------ [1194842.443422] WARNING: CPU: 6 PID: 26561 at fs/btrfs/disk-io.c:449 csum_one_extent_buffer+0xed/0x100 [btrfs] [1194842.511863] CPU: 6 PID: 26561 Comm: kworker/u17:2 Not tainted 5.14.0-rc3-git+ #793 [1194842.511870] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008 [1194842.511876] Workqueue: btrfs-worker-high btrfs_work_helper [btrfs] [1194842.511976] RIP: 0010:csum_one_extent_buffer+0xed/0x100 [btrfs] [1194842.512068] RSP: 0018:ffffa2c284d77da0 EFLAGS: 00010282 [1194842.512074] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffff928867bd9978 [1194842.512078] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff928867bd9970 [1194842.512081] RBP: ffff92876b958000 R08: 0000000000000001 R09: 00000000000c0003 [1194842.512085] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [1194842.512088] R13: ffff92875f989f98 R14: 0000000000000000 R15: 0000000000000000 [1194842.512092] FS: 0000000000000000(0000) GS:ffff928867a00000(0000) knlGS:0000000000000000 [1194842.512095] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1194842.512099] CR2: 000055f5384da1f0 CR3: 0000000102fe4000 CR4: 00000000000006e0 [1194842.512103] Call Trace: [1194842.512128] ? run_one_async_free+0x10/0x10 [btrfs] [1194842.631729] btree_csum_one_bio+0x1ac/0x1d0 [btrfs] [1194842.631837] run_one_async_start+0x18/0x30 [btrfs] [1194842.631938] btrfs_work_helper+0xd5/0x1d0 [btrfs] [1194842.647482] process_one_work+0x262/0x5e0 [1194842.647520] worker_thread+0x4c/0x320 [1194842.655935] ? process_one_work+0x5e0/0x5e0 [1194842.655946] kthread+0x135/0x160 [1194842.655953] ? set_kthread_struct+0x40/0x40 [1194842.655965] ret_from_fork+0x1f/0x30 [1194842.672465] irq event stamp: 1729 [1194842.672469] hardirqs last enabled at (1735): [<ffffffffbd1104f5>] console_trylock_spinning+0x185/0x1a0 [1194842.672477] hardirqs last disabled at (1740): [<ffffffffbd1104cc>] console_trylock_spinning+0x15c/0x1a0 [1194842.672482] softirqs last enabled at (1666): [<ffffffffbdc002e1>] __do_softirq+0x2e1/0x50a [1194842.672491] softirqs last disabled at (1651): [<ffffffffbd08aab7>] __irq_exit_rcu+0xa7/0xd0 The corrupted data will not be written, and filesystem can be unmounted and mounted again (all changes since the last commit will be lost). Add the missing check for new_ino so that all non-subvolumes must reside under the same parent subvolume. There's an exception allowing to exchange two subvolumes from any parents as the directory representing a subvolume is only a logical link and does not have any other structures related to the parent subvolume, unlike files, directories etc, that are always in the inode namespace of the parent subvolume. Fixes: cdd1fed ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT") CC: stable@vger.kernel.org # 4.7+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e0bff43 ] The Renoir XHCI controller apparently doesn't resume reliably with the standard D3hot-to-D0 delay. Increase it to 20ms. [Alex: I talked to the AMD USB hardware team and the AMD Windows team and they are not aware of any HW errata or specific issues. The HW works fine in Windows. I was told Windows uses a rather generous default delay of 100ms for PCI state transitions.] Link: https://lore.kernel.org/r/20210722025858.220064-1-alexander.deucher@amd.com Signed-off-by: Marcin Bachry <hegel666@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Prike Liang <prike.liang@amd.com> Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 65ca89c ] The commit 2e6b836 ("ASoC: intel: atom: Fix reference to PCM buffer address") changed the reference of PCM buffer address to substream->runtime->dma_addr as the buffer address may change dynamically. However, I forgot that the dma_addr field is still not set up for the CONTINUOUS buffer type (that this driver uses) yet in 5.14 and earlier kernels, and it resulted in garbage I/O. The problem will be fixed in 5.15, but we need to address it quickly for now. The fix is to deduce the address again from the DMA pointer with virt_to_phys(), but from the right one, substream->runtime->dma_area. Fixes: 2e6b836 ("ASoC: intel: atom: Fix reference to PCM buffer address") Reported-and-tested-by: Hans de Goede <hdegoede@redhat.com> Cc: <stable@vger.kernel.org> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/2048c6aa-2187-46bd-6772-36a4fb3c5aeb@redhat.com Link: https://lore.kernel.org/r/20210819152945.8510-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 22f7496 ] Patch series "mm, memcg: memory.{low,min} reclaim fix & cleanup", v4. This series contains a fix for a edge case in my earlier protection calculation patches, and a patch to make the area overall a little more robust to hopefully help avoid this in future. This patch (of 2): A cgroup can have both memory protection and a memory limit to isolate it from its siblings in both directions - for example, to prevent it from being shrunk below 2G under high pressure from outside, but also from growing beyond 4G under low pressure. Commit 9783aa9 ("mm, memcg: proportional memory.{low,min} reclaim") implemented proportional scan pressure so that multiple siblings in excess of their protection settings don't get reclaimed equally but instead in accordance to their unprotected portion. During limit reclaim, this proportionality shouldn't apply of course: there is no competition, all pressure is from within the cgroup and should be applied as such. Reclaim should operate at full efficiency. However, mem_cgroup_protected() never expected anybody to look at the effective protection values when it indicated that the cgroup is above its protection. As a result, a query during limit reclaim may return stale protection values that were calculated by a previous reclaim cycle in which the cgroup did have siblings. When this happens, reclaim is unnecessarily hesitant and potentially slow to meet the desired limit. In theory this could lead to premature OOM kills, although it's not obvious this has occurred in practice. Workaround the problem by special casing reclaim roots in mem_cgroup_protection. These memcgs are never participating in the reclaim protection because the reclaim is internal. We have to ignore effective protection values for reclaim roots because mem_cgroup_protected might be called from racing reclaim contexts with different roots. Calculation is relying on root -> leaf tree traversal therefore top-down reclaim protection invariants should hold. The only exception is the reclaim root which should have effective protection set to 0 but that would be problematic for the following setup: Let's have global and A's reclaim in parallel: | A (low=2G, usage = 3G, max = 3G, children_low_usage = 1.5G) |\ | C (low = 1G, usage = 2.5G) B (low = 1G, usage = 0.5G) for A reclaim we have B.elow = B.low C.elow = C.low For the global reclaim A.elow = A.low B.elow = min(B.usage, B.low) because children_low_usage <= A.elow C.elow = min(C.usage, C.low) With the effective values resetting we have A reclaim A.elow = 0 B.elow = B.low C.elow = C.low and global reclaim could see the above and then B.elow = C.elow = 0 because children_low_usage > A.elow Which means that protected memcgs would get reclaimed. In future we would like to make mem_cgroup_protected more robust against racing reclaim contexts but that is likely more complex solution than this simple workaround. [hannes@cmpxchg.org - large part of the changelog] [mhocko@suse.com - workaround explanation] [chris@chrisdown.name - retitle] Fixes: 9783aa9 ("mm, memcg: proportional memory.{low,min} reclaim") Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Chris Down <chris@chrisdown.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Chris Down <chris@chrisdown.name> Acked-by: Roman Gushchin <guro@fb.com> Link: http://lkml.kernel.org/r/cover.1594638158.git.chris@chrisdown.name Link: http://lkml.kernel.org/r/044fb8ecffd001c7905d27c0c2ad998069fdc396.1594638158.git.chris@chrisdown.name Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
…claim [ Upstream commit f56ce41 ] We've noticed occasional OOM killing when memory.low settings are in effect for cgroups. This is unexpected and undesirable as memory.low is supposed to express non-OOMing memory priorities between cgroups. The reason for this is proportional memory.low reclaim. When cgroups are below their memory.low threshold, reclaim passes them over in the first round, and then retries if it couldn't find pages anywhere else. But when cgroups are slightly above their memory.low setting, page scan force is scaled down and diminished in proportion to the overage, to the point where it can cause reclaim to fail as well - only in that case we currently don't retry, and instead trigger OOM. To fix this, hook proportional reclaim into the same retry logic we have in place for when cgroups are skipped entirely. This way if reclaim fails and some cgroups were scanned with diminished pressure, we'll try another full-force cycle before giving up and OOMing. [akpm@linux-foundation.org: coding-style fixes] Link: https://lkml.kernel.org/r/20210817180506.220056-1-hannes@cmpxchg.org Fixes: 9783aa9 ("mm, memcg: proportional memory.{low,min} reclaim") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Leon Yang <lnyng@fb.com> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Chris Down <chris@chrisdown.name> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> [5.4+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fdd92b6 ] We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros have disabled it. Warn the stragglers that still use "-o mand" that we'll be dropping support for that mount option. Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e34328 ] I got a problem on MIPS with Big-Endian is turned on: every time when NF trying to change TCP MSS it returns because of new.v16 was greater than old.v16. But real MSS was 1460 and my rule was like this: add rule table chain tcp option maxseg size set 1400 And 1400 is lesser that 1460, not greater. Later I founded that main causer is cast from u32 to __be16. Debugging: In example MSS = 1400(HEX: 0x578). Here is representation of each byte like it is in memory by addresses from left to right(e.g. [0x0 0x1 0x2 0x3]). LE — Little-Endian system, BE — Big-Endian, left column is type. LE BE u32: [78 05 00 00] [00 00 05 78] As you can see, u32 representation will be casted to u16 from different half of 4-byte address range. But actually nf_tables uses registers and store data of various size. Actually TCP MSS stored in 2 bytes. But registers are still u32 in definition: struct nft_regs { union { u32 data[20]; struct nft_verdict verdict; }; }; So, access like regs->data[priv->sreg] exactly u32. So, according to table presents above, per-byte representation of stored TCP MSS in register will be: LE BE (u32)regs->data[]: [78 05 00 00] [05 78 00 00] ^^ ^^ We see that register uses just half of u32 and other 2 bytes may be used for some another data. But in nft_exthdr_tcp_set_eval() it casted just like u32 -> __be16: new.v16 = src But u32 overfill __be16, so it get 2 low bytes. For clarity draw one more table(<xx xx> means that bytes will be used for cast). LE BE u32: [<78 05> 00 00] [00 00 <05 78>] (u32)regs->data[]: [<78 05> 00 00] [05 78 <00 00>] As you can see, for Little-Endian nothing changes, but for Big-endian we take the wrong half. In my case there is some other data instead of zeros, so new MSS was wrongly greater. For shooting this bug I used solution for ports ranges. Applying of this patch does not affect Little-Endian systems. Signed-off-by: Sergey Marinkevich <sergey.marinkevich@eltex-co.ru> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Hulk Robot <hulkrobot@huawei.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Linux 5.4.143 # gpg: Signature made Thu 26 Aug 2021 08:56:00 PM CST # gpg: using RSA key E27E5D8A3403A2EF66873BBCDEA66FF797772CDC # gpg: Can't check signature: No public key # By Thomas Gleixner (13) and others # Via Sasha Levin * tag 'v5.4.143': (264 commits) Linux 5.4.143 netfilter: nft_exthdr: fix endianness of tcp option cast fs: warn about impending deprecation of mandatory locks mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim mm, memcg: avoid stale protection values when cgroup is above protection ASoC: intel: atom: Fix breakage for PCM buffer address setup PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI btrfs: prevent rename2 from exchanging a subvol with a directory from different parents ipack: tpci200: fix memory leak in the tpci200_register ipack: tpci200: fix many double free issues in tpci200_pci_probe slimbus: ngd: reset dma setup during runtime pm slimbus: messaging: check for valid transaction id slimbus: messaging: start transaction ids from 1 instead of zero tracing / histogram: Fix NULL pointer dereference on strcmp() on NULL event name ALSA: hda - fix the 'Capture Switch' value change notifications mmc: dw_mmc: Fix hang on data CRC error ovl: add splice file read write helper iavf: Fix ping is lost after untrusted VF had tried to change MAC i40e: Fix ATR queue selection ovs: clear skb->tstamp in forwarding path ... Signed-off-by: Ranjan Dutta <ranjan.dutta@intel.com> # Conflicts: # arch/arm64/include/asm/arch_timer.h
As EHL OSE FW switch to support runtime D3 (RTD3), so driver also add the RTD3 support according to finish power runtime power management. Signed-off-by: Even Xu <even.xu@intel.com>
Add PTP_PEROUT_FREQ_ADJ check before updating the TGPIO_COMPV registers. COMPV register is to be updated accordingly only when FREQ_ADJ flag is not enabled. Signed-off-by: Lakshmi Sowjanya D <lakshmi.sowjanya.d@intel.com>
According to RTD3 implementation logic, in interrupt handler, it's safe to access HW resource directly, no need call runtime pm apis to do runtime pm suspend/resume. Signed-off-by: Even Xu <even.xu@intel.com>
add return value check for wait_event_interruptible_timeout. Signed-off-by: Even Xu <even.xu@intel.com>
Re-number the barno of EHL_COMMUNITY to resolve gpio pins mismatch issue as the community2 isn't exposed to the user. Signed-off-by: Lakshmi Sowjanya D <lakshmi.sowjanya.d@intel.com>
To be compliant with i915 display driver requirements, i915 power-up must be done before any HDA communication takes place, including parsing the bus capabilities. Otherwise the initial codec probe may fail. Move i915 initialization earlier in the SOF HDA sequence. This sequence is now aligned with the snd-hda-intel driver where the display_power() call is before snd_hdac_bus_parse_capabilities() and rest of the capability parsing. Also remove unnecessary ifdef around hda_codec_i915_init(). There's a dummy implementation provided if CONFIG_SND_SOC_SOF_HDA is not enabled. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20200206200223.7715-3-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
added HEALTH property and Enabled it with workaroud values. Signed-off-by: Kothapeta, BikshapathiX <bikshapathix.kothapeta@intel.com>
PR is on different branch |
sys-oak
pushed a commit
that referenced
this pull request
Nov 30, 2021
[ Upstream commit d412137 ] The perf_buffer fails on system with offline cpus: # test_progs -t perf_buffer test_perf_buffer:PASS:nr_cpus 0 nsec test_perf_buffer:PASS:nr_on_cpus 0 nsec test_perf_buffer:PASS:skel_load 0 nsec test_perf_buffer:PASS:attach_kprobe 0 nsec test_perf_buffer:PASS:perf_buf__new 0 nsec test_perf_buffer:PASS:epoll_fd 0 nsec skipping offline CPU #24 skipping offline CPU #25 skipping offline CPU #26 skipping offline CPU #27 skipping offline CPU #28 skipping offline CPU #29 skipping offline CPU #30 skipping offline CPU #31 test_perf_buffer:PASS:perf_buffer__poll 0 nsec test_perf_buffer:PASS:seen_cpu_cnt 0 nsec test_perf_buffer:FAIL:buf_cnt got 24, expected 32 Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED Changing the test to check online cpus instead of possible. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211021114132.8196-2-jolsa@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 14, 2021
[ Upstream commit d412137 ] The perf_buffer fails on system with offline cpus: # test_progs -t perf_buffer test_perf_buffer:PASS:nr_cpus 0 nsec test_perf_buffer:PASS:nr_on_cpus 0 nsec test_perf_buffer:PASS:skel_load 0 nsec test_perf_buffer:PASS:attach_kprobe 0 nsec test_perf_buffer:PASS:perf_buf__new 0 nsec test_perf_buffer:PASS:epoll_fd 0 nsec skipping offline CPU #24 skipping offline CPU #25 skipping offline CPU #26 skipping offline CPU #27 skipping offline CPU #28 skipping offline CPU #29 skipping offline CPU #30 skipping offline CPU #31 test_perf_buffer:PASS:perf_buffer__poll 0 nsec test_perf_buffer:PASS:seen_cpu_cnt 0 nsec test_perf_buffer:FAIL:buf_cnt got 24, expected 32 Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED Changing the test to check online cpus instead of possible. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211021114132.8196-2-jolsa@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Apr 6, 2022
commit 22e2100 upstream. The trace_hardirqs_{on,off}() require the caller to setup frame pointer properly. This because these two functions use macro 'CALLER_ADDR1' (aka. __builtin_return_address(1)) to acquire caller info. If the $fp is used for other purpose, the code generated this macro (as below) could trigger memory access fault. 0xffffffff8011510e <+80>: ld a1,-16(s0) 0xffffffff80115112 <+84>: ld s2,-8(a1) # <-- paging fault here The oops message during booting if compiled with 'irqoff' tracer enabled: [ 0.039615][ T0] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000f8 [ 0.041925][ T0] Oops [#1] [ 0.042063][ T0] Modules linked in: [ 0.042864][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc1-00233-g9a20c48d1ed2 #29 [ 0.043568][ T0] Hardware name: riscv-virtio,qemu (DT) [ 0.044343][ T0] epc : trace_hardirqs_on+0x56/0xe2 [ 0.044601][ T0] ra : restore_all+0x12/0x6e [ 0.044721][ T0] epc : ffffffff80126a5c ra : ffffffff80003b94 sp : ffffffff81403db0 [ 0.044801][ T0] gp : ffffffff8163acd8 tp : ffffffff81414880 t0 : 0000000000000020 [ 0.044882][ T0] t1 : 0098968000000000 t2 : 0000000000000000 s0 : ffffffff81403de0 [ 0.044967][ T0] s1 : 0000000000000000 a0 : 0000000000000001 a1 : 0000000000000100 [ 0.045046][ T0] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.045124][ T0] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000054494d45 [ 0.045210][ T0] s2 : ffffffff80003b94 s3 : ffffffff81a8f1b0 s4 : ffffffff80e27b50 [ 0.045289][ T0] s5 : ffffffff81414880 s6 : ffffffff8160fa00 s7 : 00000000800120e8 [ 0.045389][ T0] s8 : 0000000080013100 s9 : 000000000000007f s10: 0000000000000000 [ 0.045474][ T0] s11: 0000000000000000 t3 : 7fffffffffffffff t4 : 0000000000000000 [ 0.045548][ T0] t5 : 0000000000000000 t6 : ffffffff814aa368 [ 0.045620][ T0] status: 0000000200000100 badaddr: 00000000000000f8 cause: 000000000000000d [ 0.046402][ T0] [<ffffffff80003b94>] restore_all+0x12/0x6e This because the $fp(aka. $s0) register is not used as frame pointer in the assembly entry code. resume_kernel: REG_L s0, TASK_TI_PREEMPT_COUNT(tp) bnez s0, restore_all REG_L s0, TASK_TI_FLAGS(tp) andi s0, s0, _TIF_NEED_RESCHED beqz s0, restore_all call preempt_schedule_irq j restore_all To fix above issue, here we add one extra level wrapper for function trace_hardirqs_{on,off}() so they can be safely called by low level entry code. Signed-off-by: Changbin Du <changbin.du@gmail.com> Fixes: 3c46979 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Apr 12, 2022
commit 22e2100 upstream. The trace_hardirqs_{on,off}() require the caller to setup frame pointer properly. This because these two functions use macro 'CALLER_ADDR1' (aka. __builtin_return_address(1)) to acquire caller info. If the $fp is used for other purpose, the code generated this macro (as below) could trigger memory access fault. 0xffffffff8011510e <+80>: ld a1,-16(s0) 0xffffffff80115112 <+84>: ld s2,-8(a1) # <-- paging fault here The oops message during booting if compiled with 'irqoff' tracer enabled: [ 0.039615][ T0] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000f8 [ 0.041925][ T0] Oops [#1] [ 0.042063][ T0] Modules linked in: [ 0.042864][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc1-00233-g9a20c48d1ed2 #29 [ 0.043568][ T0] Hardware name: riscv-virtio,qemu (DT) [ 0.044343][ T0] epc : trace_hardirqs_on+0x56/0xe2 [ 0.044601][ T0] ra : restore_all+0x12/0x6e [ 0.044721][ T0] epc : ffffffff80126a5c ra : ffffffff80003b94 sp : ffffffff81403db0 [ 0.044801][ T0] gp : ffffffff8163acd8 tp : ffffffff81414880 t0 : 0000000000000020 [ 0.044882][ T0] t1 : 0098968000000000 t2 : 0000000000000000 s0 : ffffffff81403de0 [ 0.044967][ T0] s1 : 0000000000000000 a0 : 0000000000000001 a1 : 0000000000000100 [ 0.045046][ T0] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.045124][ T0] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000054494d45 [ 0.045210][ T0] s2 : ffffffff80003b94 s3 : ffffffff81a8f1b0 s4 : ffffffff80e27b50 [ 0.045289][ T0] s5 : ffffffff81414880 s6 : ffffffff8160fa00 s7 : 00000000800120e8 [ 0.045389][ T0] s8 : 0000000080013100 s9 : 000000000000007f s10: 0000000000000000 [ 0.045474][ T0] s11: 0000000000000000 t3 : 7fffffffffffffff t4 : 0000000000000000 [ 0.045548][ T0] t5 : 0000000000000000 t6 : ffffffff814aa368 [ 0.045620][ T0] status: 0000000200000100 badaddr: 00000000000000f8 cause: 000000000000000d [ 0.046402][ T0] [<ffffffff80003b94>] restore_all+0x12/0x6e This because the $fp(aka. $s0) register is not used as frame pointer in the assembly entry code. resume_kernel: REG_L s0, TASK_TI_PREEMPT_COUNT(tp) bnez s0, restore_all REG_L s0, TASK_TI_FLAGS(tp) andi s0, s0, _TIF_NEED_RESCHED beqz s0, restore_all call preempt_schedule_irq j restore_all To fix above issue, here we add one extra level wrapper for function trace_hardirqs_{on,off}() so they can be safely called by low level entry code. Signed-off-by: Changbin Du <changbin.du@gmail.com> Fixes: 3c46979 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
May 10, 2024
[ Upstream commit 743edc8 ] As previously explained, the rehash delayed work migrates filters from one region to another. This is done by iterating over all chunks (all the filters with the same priority) in the region and in each chunk iterating over all the filters. When the work runs out of credits it stores the current chunk and entry as markers in the per-work context so that it would know where to resume the migration from the next time the work is scheduled. Upon error, the chunk marker is reset to NULL, but without resetting the entry markers despite being relative to it. This can result in migration being resumed from an entry that does not belong to the chunk being migrated. In turn, this will eventually lead to a chunk being iterated over as if it is an entry. Because of how the two structures happen to be defined, this does not lead to KASAN splats, but to warnings such as [1]. Fix by creating a helper that resets all the markers and call it from all the places the currently only reset the chunk marker. For good measures also call it when starting a completely new rehash. Add a warning to avoid future cases. [1] WARNING: CPU: 7 PID: 1076 at drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c:407 mlxsw_afk_encode+0x242/0x2f0 Modules linked in: CPU: 7 PID: 1076 Comm: kworker/7:24 Tainted: G W 6.9.0-rc3-custom-00880-g29e61d91b77b #29 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work RIP: 0010:mlxsw_afk_encode+0x242/0x2f0 [...] Call Trace: <TASK> mlxsw_sp_acl_atcam_entry_add+0xd9/0x3c0 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x109/0x290 mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x470 process_one_work+0x151/0x370 worker_thread+0x2cb/0x3e0 kthread+0xd0/0x100 ret_from_fork+0x34/0x50 </TASK> Fixes: 6f9579d ("mlxsw: spectrum_acl: Remember where to continue rehash migration") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/cc17eed86b41dd829d39b07906fec074a9ce580e.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Jun 26, 2024
[ Upstream commit e64746e ] cpumask_of_node() can be called for NUMA_NO_NODE inside do_map_benchmark() resulting in the following sanitizer report: UBSAN: array-index-out-of-bounds in ./arch/x86/include/asm/topology.h:72:28 index -1 is out of range for type 'cpumask [64][1]' CPU: 1 PID: 990 Comm: dma_map_benchma Not tainted 6.9.0-rc6 #29 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) ubsan_epilogue (lib/ubsan.c:232) __ubsan_handle_out_of_bounds (lib/ubsan.c:429) cpumask_of_node (arch/x86/include/asm/topology.h:72) [inline] do_map_benchmark (kernel/dma/map_benchmark.c:104) map_benchmark_ioctl (kernel/dma/map_benchmark.c:246) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Use cpumask_of_node() in place when binding a kernel thread to a cpuset of a particular node. Note that the provided node id is checked inside map_benchmark_ioctl(). It's just a NUMA_NO_NODE case which is not handled properly later. Found by Linux Verification Center (linuxtesting.org). Fixes: 65789da ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Acked-by: Barry Song <baohua@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Aug 13, 2024
[ Upstream commit eef0532 ] Run bpf_tcp_ca selftests (./test_progs -t bpf_tcp_ca) on a Loongarch platform, some "Segmentation fault" errors occur: ''' test_dctcp:PASS:bpf_dctcp__open_and_load 0 nsec test_dctcp:FAIL:bpf_map__attach_struct_ops unexpected error: -524 #29/1 bpf_tcp_ca/dctcp:FAIL test_cubic:PASS:bpf_cubic__open_and_load 0 nsec test_cubic:FAIL:bpf_map__attach_struct_ops unexpected error: -524 #29/2 bpf_tcp_ca/cubic:FAIL test_dctcp_fallback:PASS:dctcp_skel 0 nsec test_dctcp_fallback:PASS:bpf_dctcp__load 0 nsec test_dctcp_fallback:FAIL:dctcp link unexpected error: -524 #29/4 bpf_tcp_ca/dctcp_fallback:FAIL test_write_sk_pacing:PASS:open_and_load 0 nsec test_write_sk_pacing:FAIL:attach_struct_ops unexpected error: -524 #29/6 bpf_tcp_ca/write_sk_pacing:FAIL test_update_ca:PASS:open 0 nsec test_update_ca:FAIL:attach_struct_ops unexpected error: -524 settcpca:FAIL:setsockopt unexpected setsockopt: \ actual -1 == expected -1 (network_helpers.c:99: errno: No such file or directory) \ Failed to call post_socket_cb start_test:FAIL:start_server_str unexpected start_server_str: \ actual -1 == expected -1 test_update_ca:FAIL:ca1_ca1_cnt unexpected ca1_ca1_cnt: \ actual 0 <= expected 0 #29/9 bpf_tcp_ca/update_ca:FAIL #29 bpf_tcp_ca:FAIL Caught signal #11! Stack trace: ./test_progs(crash_handler+0x28)[0x5555567ed91c] linux-vdso.so.1(__vdso_rt_sigreturn+0x0)[0x7ffffee408b0] ./test_progs(bpf_link__update_map+0x80)[0x555556824a78] ./test_progs(+0x94d68)[0x5555564c4d68] ./test_progs(test_bpf_tcp_ca+0xe8)[0x5555564c6a88] ./test_progs(+0x3bde54)[0x5555567ede54] ./test_progs(main+0x61c)[0x5555567efd54] /usr/lib64/libc.so.6(+0x22208)[0x7ffff2aaa208] /usr/lib64/libc.so.6(__libc_start_main+0xac)[0x7ffff2aaa30c] ./test_progs(_start+0x48)[0x55555646bca8] Segmentation fault ''' This is because BPF trampoline is not implemented on Loongarch yet, "link" returned by bpf_map__attach_struct_ops() is NULL. test_progs crashs when this NULL link passes to bpf_link__update_map(). This patch adds NULL checks for all links in bpf_tcp_ca to fix these errors. If "link" is NULL, goto the newly added label "out" to destroy the skel. v2: - use "goto out" instead of "return" as Eduard suggested. Fixes: 06da9f3 ("selftests/bpf: Test switching TCP Congestion Control algorithms.") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/b4c841492bd4ed97964e4e61e92827ce51bf1dc9.1720615848.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 10, 2024
[ Upstream commit 743edc8 ] As previously explained, the rehash delayed work migrates filters from one region to another. This is done by iterating over all chunks (all the filters with the same priority) in the region and in each chunk iterating over all the filters. When the work runs out of credits it stores the current chunk and entry as markers in the per-work context so that it would know where to resume the migration from the next time the work is scheduled. Upon error, the chunk marker is reset to NULL, but without resetting the entry markers despite being relative to it. This can result in migration being resumed from an entry that does not belong to the chunk being migrated. In turn, this will eventually lead to a chunk being iterated over as if it is an entry. Because of how the two structures happen to be defined, this does not lead to KASAN splats, but to warnings such as [1]. Fix by creating a helper that resets all the markers and call it from all the places the currently only reset the chunk marker. For good measures also call it when starting a completely new rehash. Add a warning to avoid future cases. [1] WARNING: CPU: 7 PID: 1076 at drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c:407 mlxsw_afk_encode+0x242/0x2f0 Modules linked in: CPU: 7 PID: 1076 Comm: kworker/7:24 Tainted: G W 6.9.0-rc3-custom-00880-g29e61d91b77b #29 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work RIP: 0010:mlxsw_afk_encode+0x242/0x2f0 [...] Call Trace: <TASK> mlxsw_sp_acl_atcam_entry_add+0xd9/0x3c0 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x109/0x290 mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x470 process_one_work+0x151/0x370 worker_thread+0x2cb/0x3e0 kthread+0xd0/0x100 ret_from_fork+0x34/0x50 </TASK> Fixes: 6f9579d ("mlxsw: spectrum_acl: Remember where to continue rehash migration") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/cc17eed86b41dd829d39b07906fec074a9ce580e.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 10, 2024
[ Upstream commit e64746e ] cpumask_of_node() can be called for NUMA_NO_NODE inside do_map_benchmark() resulting in the following sanitizer report: UBSAN: array-index-out-of-bounds in ./arch/x86/include/asm/topology.h:72:28 index -1 is out of range for type 'cpumask [64][1]' CPU: 1 PID: 990 Comm: dma_map_benchma Not tainted 6.9.0-rc6 #29 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) ubsan_epilogue (lib/ubsan.c:232) __ubsan_handle_out_of_bounds (lib/ubsan.c:429) cpumask_of_node (arch/x86/include/asm/topology.h:72) [inline] do_map_benchmark (kernel/dma/map_benchmark.c:104) map_benchmark_ioctl (kernel/dma/map_benchmark.c:246) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Use cpumask_of_node() in place when binding a kernel thread to a cpuset of a particular node. Note that the provided node id is checked inside map_benchmark_ioctl(). It's just a NUMA_NO_NODE case which is not handled properly later. Found by Linux Verification Center (linuxtesting.org). Fixes: 65789da ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Acked-by: Barry Song <baohua@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 29, 2024
[ Upstream commit 36ac9e7 ] Reinitialize the whole EST structure would also reset the mutex lock which is embedded in the EST structure, and then trigger the following warning. To address this, move the lock to struct stmmac_priv. We also need to reacquire the mutex lock when doing this initialization. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 505 at kernel/locking/mutex.c:587 __mutex_lock+0xd84/0x1068 Modules linked in: CPU: 3 PID: 505 Comm: tc Not tainted 6.9.0-rc6-00053-g0106679839f7-dirty #29 Hardware name: NXP i.MX8MPlus EVK board (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mutex_lock+0xd84/0x1068 lr : __mutex_lock+0xd84/0x1068 sp : ffffffc0864e3570 x29: ffffffc0864e3570 x28: ffffffc0817bdc78 x27: 0000000000000003 x26: ffffff80c54f1808 x25: ffffff80c9164080 x24: ffffffc080d723ac x23: 0000000000000000 x22: 0000000000000002 x21: 0000000000000000 x20: 0000000000000000 x19: ffffffc083bc3000 x18: ffffffffffffffff x17: ffffffc08117b080 x16: 0000000000000002 x15: ffffff80d2d40000 x14: 00000000000002da x13: ffffff80d2d404b8 x12: ffffffc082b5a5c8 x11: ffffffc082bca680 x10: ffffffc082bb2640 x9 : ffffffc082bb2698 x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001 x5 : ffffff8178fe0d48 x4 : 0000000000000000 x3 : 0000000000000027 x2 : ffffff8178fe0d50 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: __mutex_lock+0xd84/0x1068 mutex_lock_nested+0x28/0x34 tc_setup_taprio+0x118/0x68c stmmac_setup_tc+0x50/0xf0 taprio_change+0x868/0xc9c Fixes: b2aae65 ("net: stmmac: add mutex lock to protect est parameters") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240513014346.1718740-2-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 36ac9e7) [Harshit: CVE-2024-38594; resolved conflicts due to missing commit: 5ca63ff ("net: stmmac: Report taprio offload status") in 6.6.y] Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.