Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4.1 abbbi rc1 #51

Closed
wants to merge 109 commits into from
Closed

4.1 abbbi rc1 #51

wants to merge 109 commits into from

Conversation

ohporter
Copy link
Contributor

@ohporter ohporter commented Nov 6, 2015

This series supports the Arrow BeagleBone Black Industrial compatible board. It adds support for a componentized drm adv7511 driver with audio support from the vendor (modified from upstream version). Ultimately, these changes should be integrated with the adv7511.c driver upstream. It also backports the upstream componentized support for tilcdc. am335x-abbbi.dts is added and am335x-boneblack.dts is converted to use the newer DT graph bindings to match external encoders with drm drivers like tilcdc.

This is tested on both BBB Rev. A5A and the ABBI Rev. A0 board. The kernel images/modules are identical and the hdmi encoder matched to the tilcdc driver is selected by the loaded DTB (either tda998x or adihdmi).

A U-Boot change is required in order to identify the ABBBI board and load the correct .dtb. I've pushed this support (just a two line change on top of the last beagleboard.org U-Boot release patch I found) to https://github.com/konsulko/u-boot/tree/v2015-10-abbbi. I don't see a bb.org U-Boot tree to which I could send a pull request for that change.

RobertCNelson and others added 30 commits October 30, 2015 12:07
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
PRU IRAM addresses need to be masked before being handled to
remoteproc. This is due to PRU Binutils' lack of separate address
spaces for IRAM and DRAM.

Signed-off-by: Dimitar Dimitrov <dinuxbg@gmail.com>
Signed-off-by: John Syne <john3909@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
This patch introduces regmap_get_max_register() function which would be
used by the infrastructures like nvmem framework built on top of
regmap.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch introduces regmap_get_reg_stride() function which would
be used by the infrastructures like nvmem framework built on top of
regmap. Mostly this function would be used for sanity checks on inputs
within such infrastructure.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
The beaglebone family of boards contain two I2C busses enabled.
The first one with a baseboard identification EEPROM and a
cape I2C bus.

Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Mark (and unmark) device nodes with the POPULATE flag as appropriate.
This is required to avoid multi probing when using I2C and device
overlays containing a mux.
This patch is also more careful with the release of the adapter device
which caused a deadlock with muxes, and does not break the build
on !OF since the node flag accessors are not defined then.

Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
This patch adds just providers part of the framework just to enable easy
review.

Up until now, NVMEM drivers like eeprom were stored in drivers/misc,
where they all had to duplicate pretty much the same code to register
a sysfs file, allow in-kernel users to access the content of the devices
they were driving, etc.

This was also a problem as far as other in-kernel users were involved,
since the solutions used were pretty much different from on driver to
another, there was a rather big abstraction leak.

This introduction of this framework aims at solving this. It also
introduces DT representation for consumer devices to go get the data
they require (MAC Addresses, SoC/Revision ID, part numbers, and so on)
from the nvmems.

Having regmap interface to this framework would give much better
abstraction for nvmems on different buses.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
[Maxime Ripard: intial version of eeprom framework]
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
This patch adds just consumers part of the framework just to enable easy
review.

Up until now, nvmem drivers were stored in drivers/misc, where they all
had to duplicate pretty much the same code to register a sysfs file,
allow in-kernel users to access the content of the devices they were
driving, etc.

This was also a problem as far as other in-kernel users were involved,
since the solutions used were pretty much different from on driver to
another, there was a rather big abstraction leak.

This introduction of this framework aims at solving this. It also
introduces DT representation for consumer devices to go get the data they
require (MAC Addresses, SoC/Revision ID, part numbers, and so on) from
the nvmems.

Having regmap interface to this framework would give much better
abstraction for nvmems on different buses.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
[Maxime Ripard: intial version of the framework]
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
This patch adds read/write apis which are based on nvmem_device. It is
common that the drivers like omap cape manager or qcom cpr driver to
access bytes directly at particular offset in the eeprom and not from
nvmem cell info in DT. These driver would need to get access to the nvmem
directly, which is what these new APIS provide.

These wrapper apis would help such users to avoid code duplication in
there drivers and also avoid them reading a big eeprom blob and parsing
it internally in there driver.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
This patch adds bindings for simple nvmem framework which allows nvmem
consumers to talk to nvmem providers to get access to nvmem cell data.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
[Maxime Ripard: intial version of eeprom framework]
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
This patch add basic how-to and api summary documentation for simple
NVMEM framework.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
This patch adds QFPROM support driver which is used by other drivers
like thermal sensor and cpufreq.

On MSM parts there are some efuses (called qfprom) these fuses store
things like calibration data, speed bins.. etc. Drivers like cpufreq,
thermal sensors would read out this data for configuring the driver.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
This patch adds bindings for qfprom found in QCOM SOCs. QFPROM driver
is based on simple nvmem framework.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Now that we have the nvmem framework, we can consolidate the common
driver code. Move the driver to the framework, and hopefully, it will
fix the sysfs file creation race.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
[srinivas.kandagatla: Moved to regmap based EEPROM framework]
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
The patch adds support for the On Chip One Time Programmable Peripheral
(OCOTP) on the Vybrid platform.

Signed-off-by: Sanchayan Maity <maitysanchayan@gmail.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This driver handles the i.MX On-Chip OTP Controller found in
i.MX6Q/D, i.MX6S/DL, i.MX6SL, and i.MX6SX SoCs. Currently it
just returns the values stored in the shadow registers.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch brings read-only support for the On-Chip OTP cells
in the i.MX23 and i.MX28 processor. The driver implements the
new NVMEM provider API.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are some SoC specified values store in eFuse,
such as the cpu_leakage and cpu_version,
this driver can expose these values to /sys base on nvmem.

Signed-off-by: Caesar Wang <caesar.wang@rock-chips.com>
Signed-off-by: ZhengShunQian <zhengsq@rock-chips.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The position to read/write must be less than max
register size.

Signed-off-by: ZhengShunQian <zhengsq@rock-chips.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's pointless to test (cell->bit_offset || cell->bit_offset).
nvmem_shift_read_buffer_in_place() should be called when
(cell->bit_offset || cell->nbits).

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A tmp buffer is allocated if cell->bit_offset || cell->nbits.
So the tmp buffer needs to be freed at the same condition to avoid leak.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jyri Sarha and others added 7 commits November 3, 2015 16:20
Add support for an external compontised DRM encoder. The external
encoder can be connected to tilcdc trough device tree graph binding.
The binding document for tilcdc has been updated. The current
implementation supports only tda998x encoder.

To be able to filter out the unsupported video modes the tilcdc driver
needs to hijack the external connectors helper functions. The tilcdc
installes new helper functions that are otherwise identical to
orignals, but the mode_valid() call-back check the mode first localy,
before calling the original call-back. The tilcdc dirver restores the
original helper functions before it is unbound from the external
device.

I got the idea and some lines of code from Jean-Francois Moine's
"drm/tilcdc: Change the interface with the tda998x driver"-patch.

Signed-off-by: Jyri Sarha <jsarha@ti.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
…upport

Adds a CONFIG_DRM_TILCDC_SLAVE_COMPAT module for "ti,tilcdc,slave"
node conversion. The implementation is in tilcdc_slave_compat.c and it
uses tilcdc_slave_compat.dts as a basis for creating a DTS
overlay. The DTS overlay adds an external tda998x encoder to tilcdc
that corresponds to the old tda998x based slave encoder.

Signed-off-by: Jyri Sarha <jsarha@ti.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Hacked driver that has audio support. Use this temporarily until
audio support can be added to the upstream adv7511 driver.

Signed-off-by: Jason Kridner <jdk@ti.com>
[Remove slave hacks and use adv75xx compatible strings]
Signed-off-by: Matt Porter <mporter@konsulko.com>
Convert the driver over the the device model component framework, making
use of the drm encoder/connector helpers. This allows adihdmi to be
dynamically selected as an external encoder for drm drivers like tilcdc
that support the DT graph binding which defines ports and remote-endpoints
to attach external encoders.

Also, this driver was modified by another developer to support audio and
tweak some settings.  Along the way it seems to have been reformatted to
4 space tabs which is hard to work with alongside the standard 8 space tabs
in the kernel coding standard. As such, this is reformatted to standard 8
space tabs so it's a bit more readable.

The component and audio support should be merged into the upstream driver
so this adihdmi driver can be removed.

Signed-off-by: Matt Porter <mporter@konsulko.com>
Convert tda998x to build as a module and add adihdmi as a module.

Signed-off-by: Jason Kridner <jdk@ti.com>
[Squash and eliminate deprecated slave overlay support]
Signed-off-by: Matt Porter <mporter@konsulko.com>
Use new binding for the external tda19988 HDMI encoder.

Signed-off-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
[Resolved conflicts merging into beagleboard kernel]
Signed-off-by: Matt Porter <mporter@konsulko.com>
Adds a dts file for the Arrow BeagleBone Black Industrial board.
This BBB variant differs in that it uses an industrial temp rated
ADV7511W HDMI encoder rather than the NXP HDMI encoder on the
tradtional BBB.

Signed-off-by: Matt Porter <mporter@konsulko.com>
RobertCNelson added a commit to RobertCNelson/ti-linux-kernel-dev that referenced this pull request Nov 6, 2015
add support for Arrow BeagleBone Black Industrial compatible board

Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
RobertCNelson added a commit to RobertCNelson/ti-linux-kernel-dev that referenced this pull request Nov 6, 2015
add support for Arrow BeagleBone Black Industrial compatible board

Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
@RobertCNelson
Copy link
Member

Thanks @ohporter !

btw, is this board "available"? I'd like to pick one up, so i don't break anything. ;)

All merged up for "r29" nett week. (kernel is frozen for sunday's weekly snapshot image), so it'll get pushed out at the latest by thursday..

Regards,

@RobertCNelson
Copy link
Member

@ohporter can you do me a favor:

please run:

wget https://raw.githubusercontent.com/RobertCNelson/boot-scripts/master/device/bone/tester/show-eeprom.sh
sudo /bin/bash show-eeprom.sh

and i can add it our bb.org eeprom database..

Regards,

@ohporter
Copy link
Contributor Author

ohporter commented Nov 6, 2015

@RobertCNelson wow, that was quick! Thanks for picking this up. I will talk to the ODM and explain to them that they need to send you one so it can be maintained. Will sync with you offline about that.

Here's the eeprom info:

# /bin/bash show-eeprom.sh
eeprom: [�U3�A335BNLTAIA0]
eeprom raw: [00000000  aa 55 33 ee 41 33 33 35  42 4e 4c 54 41 49 41 30  |.U3.A335BNLTAIA0|]

@RobertCNelson
Copy link
Member

Thanks @ohporter just added that to the eeprom list, we need to keep these guys from stepping on each other. ;)

https://github.com/beagleboard/image-builder/blob/master/readme.md

Regards,

thess pushed a commit to thess/ti-linux-kernel-dev that referenced this pull request Nov 8, 2015
add support for Arrow BeagleBone Black Industrial compatible board

Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
RobertCNelson pushed a commit that referenced this pull request Mar 9, 2016
… switches

commit 5ff8eaa upstream.

If cgroup writeback is in use, an inode is associated with a cgroup
for writeback.  If the inode's main dirtier changes to another cgroup,
the association gets updated asynchronously.  Nothing was pinning the
superblock while such switches are in progress and superblock could go
away while async switching is pending or in progress leading to
crashes like the following.

 kernel BUG at fs/jbd2/transaction.c:319!
 invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
 CPU: 1 PID: 29158 Comm: kworker/1:10 Not tainted 4.5.0-rc3 #51
 Hardware name: Google Google, BIOS Google 01/01/2011
 Workqueue: events inode_switch_wbs_work_fn
 task: ffff880213dbbd40 ti: ffff880209264000 task.ti: ffff880209264000
 RIP: 0010:[<ffffffff803e6922>]  [<ffffffff803e6922>] start_this_handle+0x382/0x3e0
 RSP: 0018:ffff880209267c30  EFLAGS: 00010202
 ...
 Call Trace:
  [<ffffffff803e6be4>] jbd2__journal_start+0xf4/0x190
  [<ffffffff803cfc7e>] __ext4_journal_start_sb+0x4e/0x70
  [<ffffffff803b31ec>] ext4_evict_inode+0x12c/0x3d0
  [<ffffffff8035338b>] evict+0xbb/0x190
  [<ffffffff80354190>] iput+0x130/0x190
  [<ffffffff80360223>] inode_switch_wbs_work_fn+0x343/0x4c0
  [<ffffffff80279819>] process_one_work+0x129/0x300
  [<ffffffff80279b16>] worker_thread+0x126/0x480
  [<ffffffff8027ed14>] kthread+0xc4/0xe0
  [<ffffffff809771df>] ret_from_fork+0x3f/0x70

Fix it by bumping s_active while cgroup association switching is in
flight.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Tahsin Erdogan <tahsin@google.com>
Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com
Fixes: d10c809 ("writeback: implement foreign cgroup inode bdi_writeback switching")
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request Jun 29, 2016
commit 9ce119f upstream.

A line discipline which does not define a receive_buf() method can
can cause a GPF if data is ever received [1]. Oddly, this was known
to the author of n_tracesink in 2011, but never fixed.

[1] GPF report
    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: [<          (null)>]           (null)
    PGD 3752d067 PUD 37a7b067 PMD 0
    Oops: 0010 [#1] SMP KASAN
    Modules linked in:
    CPU: 2 PID: 148 Comm: kworker/u10:2 Not tainted 4.4.0-rc2+ #51
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: events_unbound flush_to_ldisc
    task: ffff88006da94440 ti: ffff88006db60000 task.ti: ffff88006db60000
    RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
    RSP: 0018:ffff88006db67b50  EFLAGS: 00010246
    RAX: 0000000000000102 RBX: ffff88003ab32f88 RCX: 0000000000000102
    RDX: 0000000000000000 RSI: ffff88003ab330a6 RDI: ffff88003aabd388
    RBP: ffff88006db67c48 R08: ffff88003ab32f9c R09: ffff88003ab31fb0
    R10: ffff88003ab32fa8 R11: 0000000000000000 R12: dffffc0000000000
    R13: ffff88006db67c20 R14: ffffffff863df820 R15: ffff88003ab31fb8
    FS:  0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000000 CR3: 0000000037938000 CR4: 00000000000006e0
    Stack:
     ffffffff829f46f1 ffff88006da94bf8 ffff88006da94bf8 0000000000000000
     ffff88003ab31fb0 ffff88003aabd438 ffff88003ab31ff8 ffff88006430fd90
     ffff88003ab32f9c ffffed0007557a87 1ffff1000db6cf78 ffff88003ab32078
    Call Trace:
     [<ffffffff8127cf91>] process_one_work+0x8f1/0x17a0 kernel/workqueue.c:2030
     [<ffffffff8127df14>] worker_thread+0xd4/0x1180 kernel/workqueue.c:2162
     [<ffffffff8128faaf>] kthread+0x1cf/0x270 drivers/block/aoe/aoecmd.c:1302
     [<ffffffff852a7c2f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
    Code:  Bad RIP value.
    RIP  [<          (null)>]           (null)
     RSP <ffff88006db67b50>
    CR2: 0000000000000000
    ---[ end trace a587f8947e54d6ea ]---

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request Jan 12, 2017
commit 1c7de2b upstream.

There is at least one Chelsio 10Gb card which uses VPD area to store some
non-standard blocks (example below).  However pci_vpd_size() returns the
length of the first block only assuming that there can be only one VPD "End
Tag".

Since 4e1a635 ("vfio/pci: Use kernel VPD access functions"), VFIO
blocks access beyond that offset, which prevents the guest "cxgb3" driver
from probing the device.  The host system does not have this problem as its
driver accesses the config space directly without pci_read_vpd().

Add a quirk to override the VPD size to a bigger value.  The maximum size
is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h.
We do not read the tag as the cxgb3 driver does as the driver supports
writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes
boundary.  The quirk is registered for all devices supported by the cxgb3
driver.

This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3
driver itself accesses VPD directly and the problem only exists with the
vfio-pci driver (when cxgb3 is not running on the host and may not be even
loaded) which blocks accesses beyond the first block of VPD data.  However
vfio-pci itself does not have quirks mechanism so we add it to PCI.

This is the controller:
Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]

This is what I parsed from its VPD:
===
b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K'
 0000 Large item 42 bytes; name 0x2 Identifier String
	b'10 Gigabit Ethernet-SR PCI Express Adapter'
 002d Large item 74 bytes; name 0x10
	#00 [EC] len=7: b'D76809 '
	#0a [FN] len=7: b'46K7897'
	#14 [PN] len=7: b'46K7897'
	#1e [MN] len=4: b'1037'
	#25 [FC] len=4: b'5769'
	#2c [SN] len=12: b'YL102035603V'
	#3b [NA] len=12: b'00145E992ED1'
 007a Small item 1 bytes; name 0xf End Tag

 0c00 Large item 16 bytes; name 0x2 Identifier String
	b'S310E-SR-X      '
 0c13 Large item 234 bytes; name 0x10
	#00 [PN] len=16: b'TBD             '
	#13 [EC] len=16: b'110107730D2     '
	#26 [SN] len=16: b'97YL102035603V  '
	#39 [NA] len=12: b'00145E992ED1'
	#48 [V0] len=6: b'175000'
	#51 [V1] len=6: b'266666'
	#5a [V2] len=6: b'266666'
	#63 [V3] len=6: b'2000  '
	#6c [V4] len=2: b'1 '
	#71 [V5] len=6: b'c2    '
	#7a [V6] len=6: b'0     '
	#83 [V7] len=2: b'1 '
	#88 [V8] len=2: b'0 '
	#8d [V9] len=2: b'0 '
	#92 [VA] len=2: b'0 '
	#97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
 0d00 Large item 252 bytes; name 0x11
	#00 [VC] len=16: b'122310_1222 dp  '
	#13 [VD] len=16: b'610-0001-00 H1\x00\x00'
	#26 [VE] len=16: b'122310_1353 fp  '
	#39 [VF] len=16: b'610-0001-00 H1\x00\x00'
	#4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
 0dff Small item 0 bytes; name 0xf End Tag

10f3 Large item 13315 bytes; name 0x62
!!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00'
===

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request Feb 17, 2017
The RCU sched protection was changed to RCU only and so all IRQ-off and
preempt-off disabled region were changed to the relevant rcu-read-lock
primitives. One was missed and triggered:
|[ BUG: bad unlock balance detected! ]
|4.4.30-rt41 #51 Tainted: G        W
|btattach/345 is trying to release lock (
|Unable to handle kernel paging request at virtual address 6b6b6bbb
|Backtrace:
|[<c016b5a0>] (lock_release) from [<c0804844>] (rt_spin_unlock+0x20/0x30)
|[<c0804824>] (rt_spin_unlock) from [<c0138954>] (put_pwq_unlocked+0xa4/0x118)
|[<c01388b0>] (put_pwq_unlocked) from [<c0138b2c>] (destroy_workqueue+0x164/0x1b0)
|[<c01389c8>] (destroy_workqueue) from [<c078e1ac>] (hci_unregister_dev+0x120/0x21c)
|[<c078e08c>] (hci_unregister_dev) from [<c054f658>] (hci_uart_tty_close+0x90/0xbc)
|[<c054f5c8>] (hci_uart_tty_close) from [<c03a2be8>] (tty_ldisc_close+0x50/0x58)
|[<c03a2b98>] (tty_ldisc_close) from [<c03a2cb4>] (tty_ldisc_kill+0x18/0x78)
|[<c03a2c9c>] (tty_ldisc_kill) from [<c03a3528>] (tty_ldisc_release+0x100/0x134)
|[<c03a3428>] (tty_ldisc_release) from [<c039cd68>] (tty_release+0x3bc/0x460)
|[<c039c9ac>] (tty_release) from [<c020cc08>] (__fput+0xe0/0x1b4)
|[<c020cb28>] (__fput) from [<c020cd3c>] (____fput+0x10/0x14)
|[<c020cd2c>] (____fput) from [<c013e0d4>] (task_work_run+0xa4/0xb8)
|[<c013e030>] (task_work_run) from [<c0121754>] (do_exit+0x40c/0x8b0)
|[<c0121348>] (do_exit) from [<c0122ff8>] (do_group_exit+0x54/0xc4)

Cc: stable-rt@vger.kernel.org
Reported-by: John Keeping <john@metanate.com>
Tested-by: John Keeping <john@metanate.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
RobertCNelson pushed a commit that referenced this pull request May 8, 2017
commit 9d7f29c upstream.

calculate_min_delta() may incorrectly access a 4th element of buf2[]
which only has 3 elements. This may trigger undefined behaviour and has
been reported to cause strange crashes in start_kernel() sometime after
timer initialization when built with GCC 5.3, possibly due to
register/stack corruption:

sched_clock: 32 bits at 200MHz, resolution 5ns, wraps every 10737418237ns
CPU 0 Unable to handle kernel paging request at virtual address ffffb0aa, epc == 8067daa8, ra == 8067da84
Oops[#1]:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.18 #51
task: 8065e3e0 task.stack: 80644000
$ 0   : 00000000 00000001 00000000 00000000
$ 4   : 8065b4d0 00000000 805d0000 00000010
$ 8   : 00000010 80321400 fffff000 812de408
$12   : 00000000 00000000 00000000 ffffffff
$16   : 00000002 ffffffff 80660000 806a666c
$20   : 806c0000 00000000 00000000 00000000
$24   : 00000000 00000010
$28   : 80644000 80645ed0 00000000 8067da84
Hi    : 00000000
Lo    : 00000000
epc   : 8067daa8 start_kernel+0x33c/0x500
ra    : 8067da84 start_kernel+0x318/0x500
Status: 11000402 KERNEL EXL
Cause : 4080040c (ExcCode 03)
BadVA : ffffb0aa
PrId  : 0501992c (MIPS 1004Kc)
Modules linked in:
Process swapper/0 (pid: 0, threadinfo=80644000, task=8065e3e0, tls=00000000)
Call Trace:
[<8067daa8>] start_kernel+0x33c/0x500
Code: 24050240  0c0131f9  24849c64 <a200b0a8> 41606020  000000c0  0c1a45e6 00000000  0c1a5f44

UBSAN also detects the same issue:

================================================================
UBSAN: Undefined behaviour in arch/mips/kernel/cevt-r4k.c:85:41
load of address 80647e4c with insufficient space
for an object of type 'unsigned int'
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.18 #47
Call Trace:
[<80028f70>] show_stack+0x88/0xa4
[<80312654>] dump_stack+0x84/0xc0
[<8034163c>] ubsan_epilogue+0x14/0x50
[<803417d8>] __ubsan_handle_type_mismatch+0x160/0x168
[<8002dab0>] r4k_clockevent_init+0x544/0x764
[<80684d34>] time_init+0x18/0x90
[<8067fa5c>] start_kernel+0x2f0/0x500
=================================================================

buf2[] is intentionally only 3 elements so that the last element is the
median once 5 samples have been inserted, so explicitly prevent the
possibility of comparing against the 4th element rather than extending
the array.

Fixes: 1fa4055 ("MIPS: cevt-r4k: Dynamically calculate min_delta_ns")
Reported-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Tested-by: Rabin Vincent <rabinv@axis.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15892/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request May 24, 2017
commit 768d64f upstream.

Driver should provide its own struct device for all DMA-mapping calls instead
of extracting device pointer from DMA engine channel. Although this is harmless
from the driver operation perspective on ARM architecture, it is always good
to use the DMA mapping API in a proper way. This patch fixes following DMA API
debug warning:

WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:1241 check_sync+0x520/0x9f4
samsung-uart 12c20000.serial: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000006df0f580] [size=64 bytes]
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00137-g07ca963 #51
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c011aaa4>] (unwind_backtrace) from [<c01127c0>] (show_stack+0x20/0x24)
[<c01127c0>] (show_stack) from [<c06ba5d8>] (dump_stack+0x84/0xa0)
[<c06ba5d8>] (dump_stack) from [<c0139528>] (__warn+0x14c/0x180)
[<c0139528>] (__warn) from [<c01395a4>] (warn_slowpath_fmt+0x48/0x50)
[<c01395a4>] (warn_slowpath_fmt) from [<c0729058>] (check_sync+0x520/0x9f4)
[<c0729058>] (check_sync) from [<c072967c>] (debug_dma_sync_single_for_device+0x88/0xc8)
[<c072967c>] (debug_dma_sync_single_for_device) from [<c0803c10>] (s3c24xx_serial_start_tx_dma+0x100/0x2f8)
[<c0803c10>] (s3c24xx_serial_start_tx_dma) from [<c0804338>] (s3c24xx_serial_tx_chars+0x198/0x33c)

Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Fixes: 62c37ee ("serial: samsung: add dma reqest/release functions")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request Aug 8, 2017
commit 768d64f upstream.

Driver should provide its own struct device for all DMA-mapping calls instead
of extracting device pointer from DMA engine channel. Although this is harmless
from the driver operation perspective on ARM architecture, it is always good
to use the DMA mapping API in a proper way. This patch fixes following DMA API
debug warning:

WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:1241 check_sync+0x520/0x9f4
samsung-uart 12c20000.serial: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000006df0f580] [size=64 bytes]
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00137-g07ca963 #51
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c011aaa4>] (unwind_backtrace) from [<c01127c0>] (show_stack+0x20/0x24)
[<c01127c0>] (show_stack) from [<c06ba5d8>] (dump_stack+0x84/0xa0)
[<c06ba5d8>] (dump_stack) from [<c0139528>] (__warn+0x14c/0x180)
[<c0139528>] (__warn) from [<c01395a4>] (warn_slowpath_fmt+0x48/0x50)
[<c01395a4>] (warn_slowpath_fmt) from [<c0729058>] (check_sync+0x520/0x9f4)
[<c0729058>] (check_sync) from [<c072967c>] (debug_dma_sync_single_for_device+0x88/0xc8)
[<c072967c>] (debug_dma_sync_single_for_device) from [<c0803c10>] (s3c24xx_serial_start_tx_dma+0x100/0x2f8)
[<c0803c10>] (s3c24xx_serial_start_tx_dma) from [<c0804338>] (s3c24xx_serial_tx_chars+0x198/0x33c)

Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Fixes: 62c37ee ("serial: samsung: add dma reqest/release functions")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request May 31, 2018
[ Upstream commit 7f582b2 ]

syzkaller found a reliable way to crash the host, hitting a BUG()
in __tcp_retransmit_skb()

Malicous MSG_FASTOPEN is the root cause. We need to purge write queue
in tcp_connect_init() at the point we init snd_una/write_seq.

This patch also replaces the BUG() by a less intrusive WARN_ON_ONCE()

kernel BUG at net/ipv4/tcp_output.c:2837!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 5276 Comm: syz-executor0 Not tainted 4.17.0-rc3+ #51
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__tcp_retransmit_skb+0x2992/0x2eb0 net/ipv4/tcp_output.c:2837
RSP: 0000:ffff8801dae06ff8 EFLAGS: 00010206
RAX: ffff8801b9fe61c0 RBX: 00000000ffc18a16 RCX: ffffffff864e1a49
RDX: 0000000000000100 RSI: ffffffff864e2e12 RDI: 0000000000000005
RBP: ffff8801dae073a0 R08: ffff8801b9fe61c0 R09: ffffed0039c40dd2
R10: ffffed0039c40dd2 R11: ffff8801ce206e93 R12: 00000000421eeaad
R13: ffff8801ce206d4e R14: ffff8801ce206cc0 R15: ffff8801cd4f4a80
FS:  0000000000000000(0000) GS:ffff8801dae00000(0063) knlGS:00000000096bc900
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000020000000 CR3: 00000001c47b6000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 tcp_retransmit_skb+0x2e/0x250 net/ipv4/tcp_output.c:2923
 tcp_retransmit_timer+0xc50/0x3060 net/ipv4/tcp_timer.c:488
 tcp_write_timer_handler+0x339/0x960 net/ipv4/tcp_timer.c:573
 tcp_write_timer+0x111/0x1d0 net/ipv4/tcp_timer.c:593
 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
 expire_timers kernel/time/timer.c:1363 [inline]
 __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
 invoke_softirq kernel/softirq.c:365 [inline]
 irq_exit+0x1d1/0x200 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:525 [inline]
 smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863

Fixes: cf60af0 ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request Jun 4, 2018
[ Upstream commit 7f582b2 ]

syzkaller found a reliable way to crash the host, hitting a BUG()
in __tcp_retransmit_skb()

Malicous MSG_FASTOPEN is the root cause. We need to purge write queue
in tcp_connect_init() at the point we init snd_una/write_seq.

This patch also replaces the BUG() by a less intrusive WARN_ON_ONCE()

kernel BUG at net/ipv4/tcp_output.c:2837!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 5276 Comm: syz-executor0 Not tainted 4.17.0-rc3+ #51
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__tcp_retransmit_skb+0x2992/0x2eb0 net/ipv4/tcp_output.c:2837
RSP: 0000:ffff8801dae06ff8 EFLAGS: 00010206
RAX: ffff8801b9fe61c0 RBX: 00000000ffc18a16 RCX: ffffffff864e1a49
RDX: 0000000000000100 RSI: ffffffff864e2e12 RDI: 0000000000000005
RBP: ffff8801dae073a0 R08: ffff8801b9fe61c0 R09: ffffed0039c40dd2
R10: ffffed0039c40dd2 R11: ffff8801ce206e93 R12: 00000000421eeaad
R13: ffff8801ce206d4e R14: ffff8801ce206cc0 R15: ffff8801cd4f4a80
FS:  0000000000000000(0000) GS:ffff8801dae00000(0063) knlGS:00000000096bc900
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000020000000 CR3: 00000001c47b6000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 tcp_retransmit_skb+0x2e/0x250 net/ipv4/tcp_output.c:2923
 tcp_retransmit_timer+0xc50/0x3060 net/ipv4/tcp_timer.c:488
 tcp_write_timer_handler+0x339/0x960 net/ipv4/tcp_timer.c:573
 tcp_write_timer+0x111/0x1d0 net/ipv4/tcp_timer.c:593
 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
 expire_timers kernel/time/timer.c:1363 [inline]
 __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
 invoke_softirq kernel/softirq.c:365 [inline]
 irq_exit+0x1d1/0x200 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:525 [inline]
 smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863

Fixes: cf60af0 ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
crow-misia pushed a commit to crow-misia/linux that referenced this pull request Feb 25, 2019
commit cf657d2 upstream.

Due to quadratic behavior of x25_new_lci(), syzbot was able
to trigger an rcu stall.

Fix this by not blocking BH for the whole duration of
the function, and inserting a reschedule point when possible.

If we care enough, using a bitmap could get rid of the quadratic
behavior.

syzbot report :

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu:    0-...!: (10500 ticks this GP) idle=4fa/1/0x4000000000000002 softirq=283376/283376 fqs=0
rcu:     (t=10501 jiffies g=383105 q=136)
rcu: rcu_preempt kthread starved for 10502 jiffies! g383105 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
rcu: RCU grace-period kthread stack dump:
rcu_preempt     I28928    10      2 0x80000000
Call Trace:
 context_switch kernel/sched/core.c:2844 [inline]
 __schedule+0x817/0x1cc0 kernel/sched/core.c:3485
 schedule+0x92/0x180 kernel/sched/core.c:3529
 schedule_timeout+0x4db/0xfd0 kernel/time/timer.c:1803
 rcu_gp_fqs_loop kernel/rcu/tree.c:1948 [inline]
 rcu_gp_kthread+0x956/0x17a0 kernel/rcu/tree.c:2105
 kthread+0x357/0x430 kernel/kthread.c:246
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
NMI backtrace for cpu 0
CPU: 0 PID: 8759 Comm: syz-executor2 Not tainted 5.0.0-rc4+ beagleboard#51
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 nmi_cpu_backtrace.cold+0x63/0xa4 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x1be/0x236 lib/nmi_backtrace.c:62
 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x183/0x1cf kernel/rcu/tree.c:1211
 print_cpu_stall kernel/rcu/tree.c:1348 [inline]
 check_cpu_stall kernel/rcu/tree.c:1422 [inline]
 rcu_pending kernel/rcu/tree.c:3018 [inline]
 rcu_check_callbacks.cold+0x500/0xa4a kernel/rcu/tree.c:2521
 update_process_times+0x32/0x80 kernel/time/timer.c:1635
 tick_sched_handle+0xa2/0x190 kernel/time/tick-sched.c:161
 tick_sched_timer+0x47/0x130 kernel/time/tick-sched.c:1271
 __run_hrtimer kernel/time/hrtimer.c:1389 [inline]
 __hrtimer_run_queues+0x33e/0xde0 kernel/time/hrtimer.c:1451
 hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1509
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1035 [inline]
 smp_apic_timer_interrupt+0x120/0x570 arch/x86/kernel/apic/apic.c:1060
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
 </IRQ>
RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline]
RIP: 0010:queued_write_lock_slowpath+0x13e/0x290 kernel/locking/qrwlock.c:86
Code: 00 00 fc ff df 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 <41> 0f b6 55 00 41 38 d7 7c eb 84 d2 74 e7 48 89 df e8 6c 0f 4f 00
RSP: 0018:ffff88805f117bd8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000300 RBX: ffffffff89413ba0 RCX: 1ffffffff1282774
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89413ba0
RBP: ffff88805f117c70 R08: 1ffffffff1282774 R09: fffffbfff1282775
R10: fffffbfff1282774 R11: ffffffff89413ba3 R12: 00000000000000ff
R13: fffffbfff1282774 R14: 1ffff1100be22f7d R15: 0000000000000003
 queued_write_lock include/asm-generic/qrwlock.h:104 [inline]
 do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203
 __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline]
 _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312
 x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267
 x25_bind+0x273/0x340 net/x25/af_x25.c:705
 __sys_bind+0x23f/0x290 net/socket.c:1505
 __do_sys_bind net/socket.c:1516 [inline]
 __se_sys_bind net/socket.c:1514 [inline]
 __x64_sys_bind+0x73/0xb0 net/socket.c:1514
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457e39
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fafccd0dc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e39
RDX: 0000000000000012 RSI: 0000000020000240 RDI: 0000000000000004
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fafccd0e6d4
R13: 00000000004bdf8b R14: 00000000004ce4b8 R15: 00000000ffffffff
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 8752 Comm: syz-executor4 Not tainted 5.0.0-rc4+ beagleboard#51
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__x25_find_socket+0x78/0x120 net/x25/af_x25.c:328
Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 a6 00 00 00 4d 8b 64 24 68 4d 85 e4 74 7f e8 03 97 3d fb 49 83 ec 68 74 74 e8 f8 96 3d fb <49> 8d bc 24 88 04 00 00 48 89 f8 48 c1 e8 03 0f b6 04 18 84 c0 74
RSP: 0018:ffff8880639efc58 EFLAGS: 00000246
RAX: 0000000000040000 RBX: dffffc0000000000 RCX: ffffc9000e677000
RDX: 0000000000040000 RSI: ffffffff863244b8 RDI: ffff88806a764628
RBP: ffff8880639efc80 R08: ffff8880a80d05c0 R09: fffffbfff1282775
R10: fffffbfff1282774 R11: ffffffff89413ba3 R12: ffff88806a7645c0
R13: 0000000000000001 R14: ffff88809f29ac00 R15: 0000000000000000
FS:  00007fe8d0c58700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32823000 CR3: 00000000672eb000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 x25_new_lci net/x25/af_x25.c:357 [inline]
 x25_connect+0x374/0xdf0 net/x25/af_x25.c:786
 __sys_connect+0x266/0x330 net/socket.c:1686
 __do_sys_connect net/socket.c:1697 [inline]
 __se_sys_connect net/socket.c:1694 [inline]
 __x64_sys_connect+0x73/0xb0 net/socket.c:1694
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457e39
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fe8d0c57c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e39
RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000004
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe8d0c586d4
R13: 00000000004be378 R14: 00000000004ceb00 R15: 00000000ffffffff

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Andrew Hendry <andrew.hendry@gmail.com>
Cc: linux-x25@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request Mar 18, 2019
commit cf657d2 upstream.

Due to quadratic behavior of x25_new_lci(), syzbot was able
to trigger an rcu stall.

Fix this by not blocking BH for the whole duration of
the function, and inserting a reschedule point when possible.

If we care enough, using a bitmap could get rid of the quadratic
behavior.

syzbot report :

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu:    0-...!: (10500 ticks this GP) idle=4fa/1/0x4000000000000002 softirq=283376/283376 fqs=0
rcu:     (t=10501 jiffies g=383105 q=136)
rcu: rcu_preempt kthread starved for 10502 jiffies! g383105 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
rcu: RCU grace-period kthread stack dump:
rcu_preempt     I28928    10      2 0x80000000
Call Trace:
 context_switch kernel/sched/core.c:2844 [inline]
 __schedule+0x817/0x1cc0 kernel/sched/core.c:3485
 schedule+0x92/0x180 kernel/sched/core.c:3529
 schedule_timeout+0x4db/0xfd0 kernel/time/timer.c:1803
 rcu_gp_fqs_loop kernel/rcu/tree.c:1948 [inline]
 rcu_gp_kthread+0x956/0x17a0 kernel/rcu/tree.c:2105
 kthread+0x357/0x430 kernel/kthread.c:246
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
NMI backtrace for cpu 0
CPU: 0 PID: 8759 Comm: syz-executor2 Not tainted 5.0.0-rc4+ #51
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 nmi_cpu_backtrace.cold+0x63/0xa4 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x1be/0x236 lib/nmi_backtrace.c:62
 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x183/0x1cf kernel/rcu/tree.c:1211
 print_cpu_stall kernel/rcu/tree.c:1348 [inline]
 check_cpu_stall kernel/rcu/tree.c:1422 [inline]
 rcu_pending kernel/rcu/tree.c:3018 [inline]
 rcu_check_callbacks.cold+0x500/0xa4a kernel/rcu/tree.c:2521
 update_process_times+0x32/0x80 kernel/time/timer.c:1635
 tick_sched_handle+0xa2/0x190 kernel/time/tick-sched.c:161
 tick_sched_timer+0x47/0x130 kernel/time/tick-sched.c:1271
 __run_hrtimer kernel/time/hrtimer.c:1389 [inline]
 __hrtimer_run_queues+0x33e/0xde0 kernel/time/hrtimer.c:1451
 hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1509
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1035 [inline]
 smp_apic_timer_interrupt+0x120/0x570 arch/x86/kernel/apic/apic.c:1060
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
 </IRQ>
RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline]
RIP: 0010:queued_write_lock_slowpath+0x13e/0x290 kernel/locking/qrwlock.c:86
Code: 00 00 fc ff df 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 <41> 0f b6 55 00 41 38 d7 7c eb 84 d2 74 e7 48 89 df e8 6c 0f 4f 00
RSP: 0018:ffff88805f117bd8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000300 RBX: ffffffff89413ba0 RCX: 1ffffffff1282774
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89413ba0
RBP: ffff88805f117c70 R08: 1ffffffff1282774 R09: fffffbfff1282775
R10: fffffbfff1282774 R11: ffffffff89413ba3 R12: 00000000000000ff
R13: fffffbfff1282774 R14: 1ffff1100be22f7d R15: 0000000000000003
 queued_write_lock include/asm-generic/qrwlock.h:104 [inline]
 do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203
 __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline]
 _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312
 x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267
 x25_bind+0x273/0x340 net/x25/af_x25.c:705
 __sys_bind+0x23f/0x290 net/socket.c:1505
 __do_sys_bind net/socket.c:1516 [inline]
 __se_sys_bind net/socket.c:1514 [inline]
 __x64_sys_bind+0x73/0xb0 net/socket.c:1514
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457e39
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fafccd0dc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e39
RDX: 0000000000000012 RSI: 0000000020000240 RDI: 0000000000000004
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fafccd0e6d4
R13: 00000000004bdf8b R14: 00000000004ce4b8 R15: 00000000ffffffff
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 8752 Comm: syz-executor4 Not tainted 5.0.0-rc4+ #51
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__x25_find_socket+0x78/0x120 net/x25/af_x25.c:328
Code: 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 a6 00 00 00 4d 8b 64 24 68 4d 85 e4 74 7f e8 03 97 3d fb 49 83 ec 68 74 74 e8 f8 96 3d fb <49> 8d bc 24 88 04 00 00 48 89 f8 48 c1 e8 03 0f b6 04 18 84 c0 74
RSP: 0018:ffff8880639efc58 EFLAGS: 00000246
RAX: 0000000000040000 RBX: dffffc0000000000 RCX: ffffc9000e677000
RDX: 0000000000040000 RSI: ffffffff863244b8 RDI: ffff88806a764628
RBP: ffff8880639efc80 R08: ffff8880a80d05c0 R09: fffffbfff1282775
R10: fffffbfff1282774 R11: ffffffff89413ba3 R12: ffff88806a7645c0
R13: 0000000000000001 R14: ffff88809f29ac00 R15: 0000000000000000
FS:  00007fe8d0c58700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32823000 CR3: 00000000672eb000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 x25_new_lci net/x25/af_x25.c:357 [inline]
 x25_connect+0x374/0xdf0 net/x25/af_x25.c:786
 __sys_connect+0x266/0x330 net/socket.c:1686
 __do_sys_connect net/socket.c:1697 [inline]
 __se_sys_connect net/socket.c:1694 [inline]
 __x64_sys_connect+0x73/0xb0 net/socket.c:1694
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457e39
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fe8d0c57c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e39
RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000004
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe8d0c586d4
R13: 00000000004be378 R14: 00000000004ceb00 R15: 00000000ffffffff

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Andrew Hendry <andrew.hendry@gmail.com>
Cc: linux-x25@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
crow-misia pushed a commit to crow-misia/linux that referenced this pull request Mar 24, 2019
commit fc8efd2 upstream.

LTP testcase mtest06 [1] can trigger a crash on s390x running 5.0.0-rc8.
This is a stress test, where one thread mmaps/writes/munmaps memory area
and other thread is trying to read from it:

  CPU: 0 PID: 2611 Comm: mmap1 Not tainted 5.0.0-rc8+ beagleboard#51
  Hardware name: IBM 2964 N63 400 (z/VM 6.4.0)
  Krnl PSW : 0404e00180000000 00000000001ac8d8 (__lock_acquire+0x7/0x7a8)
  Call Trace:
  ([<0000000000000000>]           (null))
   [<00000000001adae4>] lock_acquire+0xec/0x258
   [<000000000080d1ac>] _raw_spin_lock_bh+0x5c/0x98
   [<000000000012a780>] page_table_free+0x48/0x1a8
   [<00000000002f6e54>] do_fault+0xdc/0x670
   [<00000000002fadae>] __handle_mm_fault+0x416/0x5f0
   [<00000000002fb138>] handle_mm_fault+0x1b0/0x320
   [<00000000001248cc>] do_dat_exception+0x19c/0x2c8
   [<000000000080e5ee>] pgm_check_handler+0x19e/0x200

page_table_free() is called with NULL mm parameter, but because "0" is a
valid address on s390 (see S390_lowcore), it keeps going until it
eventually crashes in lockdep's lock_acquire.  This crash is
reproducible at least since 4.14.

Problem is that "vmf->vma" used in do_fault() can become stale.  Because
mmap_sem may be released, other threads can come in, call munmap() and
cause "vma" be returned to kmem cache, and get zeroed/re-initialized and
re-used:

handle_mm_fault                           |
  __handle_mm_fault                       |
    do_fault                              |
      vma = vmf->vma                      |
      do_read_fault                       |
        __do_fault                        |
          vma->vm_ops->fault(vmf);        |
            mmap_sem is released          |
                                          |
                                          | do_munmap()
                                          |   remove_vma_list()
                                          |     remove_vma()
                                          |       vm_area_free()
                                          |         # vma is released
                                          | ...
                                          | # same vma is allocated
                                          | # from kmem cache
                                          | do_mmap()
                                          |   vm_area_alloc()
                                          |     memset(vma, 0, ...)
                                          |
      pte_free(vma->vm_mm, ...);          |
        page_table_free                   |
          spin_lock_bh(&mm->context.lock);|
            <crash>                       |

Cache mm_struct to avoid using potentially stale "vma".

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/mtest06/mmap1.c

Link: http://lkml.kernel.org/r/5b3fdf19e2a5be460a384b936f5b56e13733f1b8.1551595137.git.jstancek@redhat.com
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Rafael Aquini <aquini@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit that referenced this pull request Feb 20, 2020
[ Upstream commit 04d36d7 ]

The introduction of a split between the reference count on rxrpc_local
objects and the usage count didn't quite go far enough.  A number of kernel
work items need to make use of the socket to perform transmission.  These
also need to get an active count on the local object to prevent the socket
from being closed.

Fix this by getting the active count in those places.

Also split out the raw active count get/put functions as these places tend
to hold refs on the rxrpc_local object already, so getting and putting an
extra object ref is just a waste of time.

The problem can lead to symptoms like:

    BUG: kernel NULL pointer dereference, address: 0000000000000018
    ..
    CPU: 2 PID: 818 Comm: kworker/u9:0 Not tainted 5.5.0-fscache+ #51
    ...
    RIP: 0010:selinux_socket_sendmsg+0x5/0x13
    ...
    Call Trace:
     security_socket_sendmsg+0x2c/0x3e
     sock_sendmsg+0x1a/0x46
     rxrpc_send_keepalive+0x131/0x1ae
     rxrpc_peer_keepalive_worker+0x219/0x34b
     process_one_work+0x18e/0x271
     worker_thread+0x1a3/0x247
     kthread+0xe6/0xeb
     ret_from_fork+0x1f/0x30

Fixes: 730c5fd ("rxrpc: Fix local endpoint refcounting")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
@pdp7
Copy link
Contributor

pdp7 commented Jun 11, 2020

@ohporter I don't see am335x-abbbi.dts in mainline. Is there a reason that you did not submit it?

RobertCNelson pushed a commit that referenced this pull request Mar 4, 2024
[ Upstream commit 9dbe086 ]

A crash was found when dumping SMC-R connections. It can be reproduced
by following steps:

- environment: two RNICs on both sides.
- run SMC-R between two sides, now a SMC_LGR_SYMMETRIC type link group
  will be created.
- set the first RNIC down on either side and link group will turn to
  SMC_LGR_ASYMMETRIC_LOCAL then.
- run 'smcss -R' and the crash will be triggered.

 BUG: kernel NULL pointer dereference, address: 0000000000000010
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W   E      6.7.0-rc6+ #51
 RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
 Call Trace:
  <TASK>
  ? __die+0x24/0x70
  ? page_fault_oops+0x66/0x150
  ? exc_page_fault+0x69/0x140
  ? asm_exc_page_fault+0x26/0x30
  ? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
  smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
  smc_diag_dump+0x26/0x60 [smc_diag]
  netlink_dump+0x19f/0x320
  __netlink_dump_start+0x1dc/0x300
  smc_diag_handler_dump+0x6a/0x80 [smc_diag]
  ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
  sock_diag_rcv_msg+0x121/0x140
  ? __pfx_sock_diag_rcv_msg+0x10/0x10
  netlink_rcv_skb+0x5a/0x110
  sock_diag_rcv+0x28/0x40
  netlink_unicast+0x22a/0x330
  netlink_sendmsg+0x240/0x4a0
  __sock_sendmsg+0xb0/0xc0
  ____sys_sendmsg+0x24e/0x300
  ? copy_msghdr_from_user+0x62/0x80
  ___sys_sendmsg+0x7c/0xd0
  ? __do_fault+0x34/0x1a0
  ? do_read_fault+0x5f/0x100
  ? do_fault+0xb0/0x110
  __sys_sendmsg+0x4d/0x80
  do_syscall_64+0x45/0xf0
  entry_SYSCALL_64_after_hwframe+0x6e/0x76

When the first RNIC is set down, the lgr->lnk[0] will be cleared and an
asymmetric link will be allocated in lgr->link[SMC_LINKS_PER_LGR_MAX - 1]
by smc_llc_alloc_alt_link(). Then when we try to dump SMC-R connections
in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
in this issue. So fix it by accessing the right link.

Fixes: f16a7dd ("smc: netlink interface for SMC sockets")
Reported-by: henaumars <henaumars@sina.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Link: https://lore.kernel.org/r/1703662835-53416-1-git-send-email-guwen@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.