USB storage device causing occasional block/genhd.c panic #213

zerxy · 2013-02-05T04:06:17Z

Error occurs during boot on my 256 MB Raspberry Pi. dmesg output includes:

WARNING: at block/genhd.c:1582 disk_clear_events+0x148/0x164()

Operating system is fully up-to-date using rpi-update. I'm using two USB storage devices which are connected to my Pi via a powered hub. Using LVM2 to join them together. I have patched LVM2 as described here: https://lists.fedorahosted.org/pipermail/lvm2-commits/2012-November/000391.html

EDIT: Previously thought issue was influenced by gpu_mem setting in config.txt. This does not appear to be the case.

popcornmix · 2013-02-05T10:58:39Z

I've just tried gpu_mem=8 and gpu_mem=16 on a fresh, rpi-updated image, and both booted happily.
Do you have any other options in config.txt?
Can you put gpu_mem back to 64 and check that it still boots?

zerxy · 2013-02-05T14:49:19Z

Couldn't reproduce this error with gpu_mem=8 or gpu_mem=16, but did reproduce it with gpu_mem=32. Full dmesg output:

[    0.000000] Booting Linux on physical CPU 0
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.6.11+ (dc4@dc4-arm-01) (gcc version 4.7.2 20120731 (prerelease) (crosstool-NG linaro-1.13.1+bzr2458 - Linaro GCC 2012.08) ) #368 PREEMPT Sun Feb 3 18:35:57 GMT 2013
[    0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
[    0.000000] Machine: BCM2708
[    0.000000] cma: CMA: reserved 16 MiB at 0d000000
[    0.000000] Memory policy: ECC disabled, Data cache writeback
[    0.000000] On node 0 totalpages: 57344
[    0.000000] free_area_init_node: node 0, pgdat c053b834, node_mem_map c05e5000
[    0.000000]   Normal zone: 448 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 56896 pages, LIFO batch:15
[    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[    0.000000] pcpu-alloc: [0] 0
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 56896
[    0.000000] Kernel command line: dma.dmachans=0x7f35 bcm2708_fb.fbwidth=1920 bcm2708_fb.fbheight=1200 bcm2708.boardrev=0x1000002 bcm2708.serial=0x47d1d2b8 smsc95xx.macaddr=B8:27:EB:D1:D2:B8 sdhci-bcm2708.emmc_clock_freq=100000000 vc_mem.mem_base=0xec00000 vc_mem.mem_size=0x10000000  smsc95xx.turbo_mode=N sdhci-bcm2708.enable_llm=0 dwc_otg.lpm_enable=0 dwc_otg.microframe_schedule=1 dwc_otg.fiq_fix_enable=1 console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=cfq rootwait ro
[    0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[    0.000000] Memory: 224MB = 224MB total
[    0.000000] Memory: 204880k/204880k available, 24496k reserved, 0K highmem
[    0.000000] Virtual kernel memory layout:
[    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
[    0.000000]     fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
[    0.000000]     vmalloc : 0xce800000 - 0xff000000   ( 776 MB)
[    0.000000]     lowmem  : 0xc0000000 - 0xce000000   ( 224 MB)
[    0.000000]     modules : 0xbf000000 - 0xc0000000   (  16 MB)
[    0.000000]       .text : 0xc0008000 - 0xc04e5460   (4982 kB)
[    0.000000]       .init : 0xc04e6000 - 0xc0506f24   ( 132 kB)
[    0.000000]       .data : 0xc0508000 - 0xc053c060   ( 209 kB)
[    0.000000]        .bss : 0xc053c084 - 0xc05e4738   ( 674 kB)
[    0.000000] NR_IRQS:330
[    0.000000] sched_clock: 32 bits at 1000kHz, resolution 1000ns, wraps every 4294967ms
[    0.000000] Console: colour dummy device 80x30
[    0.000000] console [tty1] enabled
[    0.001042] Calibrating delay loop... 464.48 BogoMIPS (lpj=2322432)
[    0.060061] pid_max: default: 32768 minimum: 301
[    0.060406] Mount-cache hash table entries: 512
[    0.061164] Initializing cgroup subsys cpuacct
[    0.061220] Initializing cgroup subsys devices
[    0.061253] Initializing cgroup subsys freezer
[    0.061284] Initializing cgroup subsys blkio
[    0.061386] CPU: Testing write buffer coherency: ok
[    0.061719] hw perfevents: enabled with v6 PMU driver, 3 counters available
[    0.061871] Setting up static identity map for 0x39d198 - 0x39d1f4
[    0.063375] devtmpfs: initialized
[    0.074433] NET: Registered protocol family 16
[    0.080913] DMA: preallocated 4096 KiB pool for atomic coherent allocations
[    0.082139] bcm2708.uart_clock = 0
[    0.083478] hw-breakpoint: found 6 breakpoint and 1 watchpoint registers.
[    0.083533] hw-breakpoint: maximum watchpoint size is 4 bytes.
[    0.083571] mailbox: Broadcom VideoCore Mailbox driver
[    0.083665] bcm2708_vcio: mailbox at f200b880
[    0.083766] bcm_power: Broadcom power driver
[    0.083805] bcm_power_open() -> 0
[    0.083830] bcm_power_request(0, 8)
[    0.584520] bcm_mailbox_read -> 00000080, 0
[    0.584560] bcm_power_request -> 0
[    0.584587] Serial: AMBA PL011 UART driver
[    0.584728] dev:f1: ttyAMA0 at MMIO 0x20201000 (irq = 83) is a PL011 rev3
[    0.917727] console [ttyAMA0] enabled
[    0.941350] bio: create slab <bio-0> at 0
[    0.946222] SCSI subsystem initialized
[    0.950319] usbcore: registered new interface driver usbfs
[    0.955900] usbcore: registered new interface driver hub
[    0.961498] usbcore: registered new device driver usb
[    0.967980] Switching to clocksource stc
[    0.972229] FS-Cache: Loaded
[    0.975366] CacheFiles: Loaded
[    0.990178] NET: Registered protocol family 2
[    0.995461] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[    1.002809] TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
[    1.009393] TCP: Hash tables configured (established 8192 bind 8192)
[    1.015823] TCP: reno registered
[    1.019076] UDP hash table entries: 256 (order: 0, 4096 bytes)
[    1.024973] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[    1.031520] NET: Registered protocol family 1
[    1.036439] RPC: Registered named UNIX socket transport module.
[    1.042486] RPC: Registered udp transport module.
[    1.047209] RPC: Registered tcp transport module.
[    1.051921] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    1.059195] bcm2708_dma: DMA manager at f2007000
[    1.063980] bcm2708_gpio: bcm2708_gpio_probe c0515d98
[    1.069445] vc-mem: phys_addr:0x00000000 mem_base=0x0ec00000 mem_size:0x10000000(256 MiB)
[    1.078601] audit: initializing netlink socket (disabled)
[    1.084202] type=2000 audit(0.930:1): initialized
[    1.207033] VFS: Disk quotas dquot_6.5.2
[    1.211068] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
[    1.218124] FS-Cache: Netfs 'nfs' registered for caching
[    1.223886] NFS: Registering the id_resolver key type
[    1.229079] Key type id_resolver registered
[    1.233377] Key type id_legacy registered
[    1.237746] msgmni has been set to 432
[    1.243302] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    1.251016] io scheduler noop registered
[    1.255065] io scheduler deadline registered
[    1.259393] io scheduler cfq registered (default)
[    1.307508] Console: switching to colour frame buffer device 240x75
[    1.339978] kgdb: Registered I/O driver kgdboc.
[    1.345363] vc-cma: Videocore CMA driver
[    1.349408] vc-cma: vc_cma_base      = 0x00000000
[    1.354276] vc-cma: vc_cma_size      = 0x00000000 (0 MiB)
[    1.359806] vc-cma: vc_cma_initial   = 0x00000000 (0 MiB)
[    1.374691] brd: module loaded
[    1.383060] loop: module loaded
[    1.386679] vchiq: vchiq_init_state: slot_zero = 0xcd000000, is_master = 0
[    1.394573] Loading iSCSI transport class v2.0-870.
[    1.400488] usbcore: registered new interface driver smsc95xx
[    1.406664] dwc_otg: version 3.00a 10-AUG-2012 (platform bus)
[    1.617789] Core Release: 2.80a
[    1.621050] Setting default values for core params
[    1.626072] Finished setting default values for core params
[    1.836896] Using Buffer DMA mode
[    1.840321] Periodic Transfer Interrupt Enhancement - disabled
[    1.846317] Multiprocessor Interrupt Enhancement - disabled
[    1.852025] OTG VER PARAM: 0, OTG VER FLAG: 0
[    1.856519] Dedicated Tx FIFOs mode
[    1.861016] dwc_otg: Microframe scheduler enabled
[    1.861397] dwc_otg bcm2708_usb: DWC OTG Controller
[    1.866609] dwc_otg bcm2708_usb: new USB bus registered, assigned bus number 1
[    1.874134] dwc_otg bcm2708_usb: irq 32, io mem 0x00000000
[    1.879818] Init: Port Power? op_state=1
[    1.883876] Init: Power Port (0)
[    1.887307] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[    1.894304] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.901696] usb usb1: Product: DWC OTG Controller
[    1.906547] usb usb1: Manufacturer: Linux 3.6.11+ dwc_otg_hcd
[    1.912467] usb usb1: SerialNumber: bcm2708_usb
[    1.917911] hub 1-0:1.0: USB hub found
[    1.921813] hub 1-0:1.0: 1 port detected
[    1.926247] dwc_otg: FIQ enabled
[    1.926267] dwc_otg: NAK holdoff enabled
[    1.926288] Module dwc_common_port init
[    1.926523] Initializing USB Mass Storage driver...
[    1.931745] usbcore: registered new interface driver usb-storage
[    1.938004] USB Mass Storage support registered.
[    1.942942] usbcore: registered new interface driver libusual
[    1.949134] mousedev: PS/2 mouse device common for all mice
[    1.955653] bcm2835-cpufreq: min=700000 max=700000 cur=700000
[    1.961475] bcm2835-cpufreq: switching to governor powersavebcm2835-cpufreq: switching to governor powersave
[    1.971637] cpuidle: using governor ladder
[    1.976068] cpuidle: using governor menu
[    1.987041] sdhci: Secure Digital Host Controller Interface driver
[    2.000150] sdhci: Copyright(c) Pierre Ossman
[    2.011477] sdhci: Disable low-latency mode
[    2.062349] mmc0: SDHCI controller on BCM2708_Arasan [platform] using platform's DMA
[    2.077354] mmc0: BCM2708 SDHC host at 0x20300000 DMA 2 IRQ 77
[    2.092375] sdhci-pltfm: SDHCI platform and OF driver helper
[    2.110759] usbcore: registered new interface driver usbhid
[    2.123484] Indeed it is in host mode hprt0 = 00021501
[    2.140474] usbhid: USB HID core driver
[    2.172621] TCP: cubic registered
[    2.192214] Initializing XFRM netlink socket
[    2.222144] NET: Registered protocol family 17
[    2.244245] mmc0: could read SD Status register (SSR) at the 3th attempt
[    2.262341] Key type dns_resolver registered
[    2.282715] VFP support v0.3: implementor 41 architecture 1 part 20 variant b rev 5
[    2.297678] mmc0: new SDHC card at address e624
[    2.312166] mmcblk0: mmc0:e624 SD08G 7.40 GiB
[    2.324520] registered taskstats version 1
[    2.340012]  mmcblk0: p1 p2
[    2.370625] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
[    2.386042] VFS: Mounted root (ext4 filesystem) readonly on device 179:2.
[    2.403926] devtmpfs: mounted
[    2.414547] Freeing init memory: 128K
[    2.425367] usb 1-1: new high-speed USB device number 2 using dwc_otg
[    2.439770] Indeed it is in host mode hprt0 = 00001101
[    2.652483] usb 1-1: New USB device found, idVendor=0424, idProduct=9512
[    2.666962] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    2.682154] hub 1-1:1.0: USB hub found
[    2.693276] hub 1-1:1.0: 3 ports detected
[    2.982349] usb 1-1.1: new high-speed USB device number 3 using dwc_otg
[    3.112719] usb 1-1.1: New USB device found, idVendor=0424, idProduct=ec00
[    3.130066] usb 1-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    3.149153] smsc95xx v1.0.4
[    3.216941] smsc95xx 1-1.1:1.0: eth0: register 'smsc95xx' at usb-bcm2708_usb-1.1, smsc95xx USB 2.0 Ethernet, b8:27:eb:d1:d2:b8
[    3.332406] usb 1-1.2: new low-speed USB device number 4 using dwc_otg
[    3.444395] usb 1-1.2: New USB device found, idVendor=060b, idProduct=0595
[    3.472900] usb 1-1.2: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[    3.506237] usb 1-1.2: Product: USB Keyboard
[    3.538207] input: USB Keyboard as /devices/platform/bcm2708_usb/usb1/1-1/1-1.2/1-1.2:1.0/input/input0
[    3.565853] hid-generic 0003:060B:0595.0001: input,hidraw0: USB HID v1.10 Keyboard [USB Keyboard] on usb-bcm2708_usb-1.2/input0
[    3.605168] input: USB Keyboard as /devices/platform/bcm2708_usb/usb1/1-1/1-1.2/1-1.2:1.1/input/input1
[    3.623455] hid-generic 0003:060B:0595.0002: input,hiddev0,hidraw1: USB HID v1.10 Device [USB Keyboard] on usb-bcm2708_usb-1.2/input1
[    3.722440] usb 1-1.3: new high-speed USB device number 5 using dwc_otg
[    3.842686] usb 1-1.3: New USB device found, idVendor=0409, idProduct=005a
[    3.857871] usb 1-1.3: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    3.876193] hub 1-1.3:1.0: USB hub found
[    3.888768] hub 1-1.3:1.0: 4 ports detected
[    4.182353] usb 1-1.3.1: new high-speed USB device number 6 using dwc_otg
[    4.320184] usb 1-1.3.1: New USB device found, idVendor=0bda, idProduct=0119
[    4.342704] usb 1-1.3.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    4.360799] usb 1-1.3.1: Product: USB2.0-CRW
[    4.373474] usb 1-1.3.1: Manufacturer: Generic
[    4.386053] usb 1-1.3.1: SerialNumber: 20090815198100000
[    4.402410] scsi0 : usb-storage 1-1.3.1:1.0
[    4.512401] usb 1-1.3.2: new high-speed USB device number 7 using dwc_otg
[    4.644524] usb 1-1.3.2: New USB device found, idVendor=13fe, idProduct=3800
[    4.662027] usb 1-1.3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    4.679292] usb 1-1.3.2: Product: Patriot Memory
[    4.692607] usb 1-1.3.2: Manufacturer:
[    4.710230] usb 1-1.3.2: SerialNumber: 0701257A10B25A31
[    4.727330] udevd[145]: starting version 175
[    4.760087] scsi1 : usb-storage 1-1.3.2:1.0
[    5.414676] scsi 0:0:0:0: Direct-Access     Generic- SD/MMC           1.00 PQ: 0 ANSI: 0 CCS
[    5.783746] scsi 1:0:0:0: Direct-Access              Patriot Memory   PMAP PQ: 0 ANSI: 0 CCS
[    5.824779] sd 1:0:0:0: [sdb] 62570496 512-byte logical blocks: (32.0 GB/29.8 GiB)
[    5.863038] sd 1:0:0:0: [sdb] Write Protect is off
[    5.892258] sd 1:0:0:0: [sdb] Mode Sense: 23 00 00 00
[    5.893042] sd 1:0:0:0: [sdb] No Caching mode page present
[    5.922255] sd 1:0:0:0: [sdb] Assuming drive cache: write through
[    5.956667] sd 1:0:0:0: [sdb] No Caching mode page present
[    5.988197] sd 1:0:0:0: [sdb] Assuming drive cache: write through
[    6.015686]  sdb: sdb1
[    6.042427] sd 1:0:0:0: [sdb] No Caching mode page present
[    6.071557] sd 1:0:0:0: [sdb] Assuming drive cache: write through
[    6.102240] sd 1:0:0:0: [sdb] Attached SCSI removable disk
[    6.154594] sd 0:0:0:0: [sda] 62333952 512-byte logical blocks: (31.9 GB/29.7 GiB)
[    6.191948] sd 0:0:0:0: [sda] Write Protect is off
[    6.222209] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
[    6.223235] sd 0:0:0:0: [sda] No Caching mode page present
[    6.250779] sd 0:0:0:0: [sda] Assuming drive cache: write through
[    6.286949] sd 0:0:0:0: [sda] No Caching mode page present
[    6.313645] sd 0:0:0:0: [sda] Assuming drive cache: write through
[    6.348988]  sda: sda1
[    6.376076] sd 0:0:0:0: [sda] No Caching mode page present
[    6.407014] sd 0:0:0:0: [sda] Assuming drive cache: write through
[    6.432186] sd 0:0:0:0: [sda] Attached SCSI removable disk
[    6.507982] Registered led device: led0
[    8.906027] device-mapper: ioctl: 4.23.0-ioctl (2012-07-25) initialised: dm-devel@redhat.com
[    9.146250] ------------[ cut here ]------------
[    9.159509] WARNING: at block/genhd.c:1582 disk_clear_events+0x148/0x164()
[    9.174813] Modules linked in: dm_mod evdev leds_gpio led_class
[    9.189316] [<c0013a7c>] (unwind_backtrace+0x0/0xf0) from [<c001e2c8>] (warn_slowpath_common+0x4c/0x64)
[    9.207304] [<c001e2c8>] (warn_slowpath_common+0x4c/0x64) from [<c001e2fc>] (warn_slowpath_null+0x1c/0x24)
[    9.225602] [<c001e2fc>] (warn_slowpath_null+0x1c/0x24) from [<c01de260>] (disk_clear_events+0x148/0x164)
[    9.243824] [<c01de260>] (disk_clear_events+0x148/0x164) from [<c00f130c>] (check_disk_change+0x1c/0x54)
[    9.262046] [<c00f130c>] (check_disk_change+0x1c/0x54) from [<c0261ec0>] (sd_open+0x68/0x138)
[    9.279321] [<c0261ec0>] (sd_open+0x68/0x138) from [<c00f2118>] (__blkdev_get+0x2b0/0x3f0)
[    9.296388] [<c00f2118>] (__blkdev_get+0x2b0/0x3f0) from [<c00f23e0>] (blkdev_get+0x188/0x31c)
[    9.313828] [<c00f23e0>] (blkdev_get+0x188/0x31c) from [<c00bfc80>] (do_dentry_open.isra.16+0x1d4/0x254)
[    9.332160] [<c00bfc80>] (do_dentry_open.isra.16+0x1d4/0x254) from [<c00bfdd0>] (finish_open+0x20/0x38)
[    9.350440] [<c00bfdd0>] (finish_open+0x20/0x38) from [<c00cf07c>] (do_last.isra.35+0x4e8/0xba8)
[    9.368136] [<c00cf07c>] (do_last.isra.35+0x4e8/0xba8) from [<c00cf7e4>] (path_openat+0xa8/0x480)
[    9.385935] [<c00cf7e4>] (path_openat+0xa8/0x480) from [<c00cfe6c>] (do_filp_open+0x2c/0x80)
[    9.403254] [<c00cfe6c>] (do_filp_open+0x2c/0x80) from [<c00c09ec>] (do_sys_open+0xe8/0x184)
[    9.420462] [<c00c09ec>] (do_sys_open+0xe8/0x184) from [<c000da60>] (ret_fast_syscall+0x0/0x30)
[    9.437888] ---[ end trace 46c5c93ac717e025 ]---
[   12.214742] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[   12.610455] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[   13.279127] bcm2708 watchdog, heartbeat=10 sec (nowayout=0)
[   19.268393] EXT4-fs (dm-0): mounted filesystem without journal. Opts: (null)
[   23.674924] smsc95xx 1-1.1:1.0: eth0: link up, 100Mbps, full-duplex, lpa 0x49E1
[   26.558188] Adding 102396k swap on /var/swap.  Priority:-1 extents:1 across:102396k SS
[   30.563963] watchdog stopped

zerxy · 2013-02-05T14:52:05Z

config.txt contents:

# uncomment if you get no picture on HDMI for a default "safe" mode
#hdmi_safe=1

# uncomment this if your display has a black border of unused pixels visible
# and your display can output without overscan
#disable_overscan=1

# uncomment the following to adjust overscan. Use positive numbers if console
# goes off screen, and negative if there is too much border
overscan_left=0
overscan_right=0
overscan_top=0
overscan_bottom=0

# uncomment to force a console size. By default it will be display's size minus
# overscan.
#framebuffer_width=1920
#framebuffer_height=1200
#framebuffer_depth=32

# uncomment if hdmi display is not detected and composite is being output
#hdmi_force_hotplug=1

# uncomment to force a specific HDMI mode (this will force VGA)
#hdmi_group=1
#hdmi_mode=1

# uncomment to force a HDMI mode rather than DVI. This can make audio work in
# DMT (computer monitor) modes
#hdmi_drive=2

# uncomment to increase signal to HDMI, if you have interference, blanking, or
# no display
#config_hdmi_boost=4

# uncomment for composite PAL
#sdtv_mode=2

#uncomment to overclock the arm. 700 MHz is the default.
#arm_freq=980

# for more options see http://elinux.org/RPi_config.txt
#core_freq=250
#h264_freq=250
#isp_freq=250
#v3d_freq=250
#sdram_freq=480
#over_voltage=8
#over_voltage_sdram=7
force_turbo=1
initial_turbo=60
#init_emmc_clock=100000000
current_limit_override=0x5A000020
gpu_mem=32
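
To answer the "any other options" question at a glance, the uncommented settings can be pulled out of the file. This is a minimal sketch, assuming the file consists of plain name=value lines and # comments as shown above; the function name active_settings is illustrative, and this is not how the firmware itself parses config.txt.

```python
def active_settings(config_text):
    """Return {name: value} for every uncommented name=value line."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not name=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


# Excerpt of the config.txt posted above.
SAMPLE_CONFIG = """\
#arm_freq=980
#over_voltage=8
force_turbo=1
initial_turbo=60
#init_emmc_clock=100000000
current_limit_override=0x5A000020
gpu_mem=32
"""

for name, value in active_settings(SAMPLE_CONFIG).items():
    print(name, "=", value)
```

Values are kept as strings so that entries such as the hexadecimal current_limit_override are reported exactly as written.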

zerxy · 2013-02-05T14:56:25Z

Please note that the WARNING: at block/genhd.c:1582 disk_clear_events+0x148/0x164() error isn't produced at every boot, just now and then.

popcornmix · 2013-02-05T15:00:20Z

I'm not convinced it is related to gpu_mem=32.
Do you have a USB disk attached? Any different without it?
Can you try multiple boots with gpu_mem=32 and gpu_mem=64 and report how many succeed?

zerxy · 2013-02-05T15:12:10Z

I'm also not convinced it is related to gpu_mem setting. Two 32 GB USB storage devices attached. Using LVM2 to join them into a single volume.

Just booted five times with gpu_mem=32 and only once did the error occur.
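
A tally like this can be automated if the kernel log from each boot is saved as text. The following is a minimal sketch, assuming the logs are available as strings; the function name count_genhd_warnings and the two sample lines (copied from the dmesg output above) are for illustration only.

```python
import re

# The line number and offsets can differ between kernel builds, so only the
# file name and function name are matched literally.
WARNING_RE = re.compile(r"WARNING: at block/genhd\.c:\d+ disk_clear_events\+")


def count_genhd_warnings(log_text):
    """Return how many log lines contain the disk_clear_events warning."""
    return sum(1 for line in log_text.splitlines() if WARNING_RE.search(line))


# One line from a boot that showed the warning, one from ordinary output.
BAD_LINE = "[    9.159509] WARNING: at block/genhd.c:1582 disk_clear_events+0x148/0x164()"
OK_LINE = "[    6.507982] Registered led device: led0"

boots = [OK_LINE, BAD_LINE, OK_LINE]
affected = sum(1 for log in boots if count_genhd_warnings(log) > 0)
print(f"{affected} of {len(boots)} boots showed the warning")
```

On a live system the text could come from running dmesg, for example via subprocess.run(["dmesg"], capture_output=True, text=True).stdout, though reading the kernel log may require elevated privileges on some setups.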

popcornmix · 2013-02-05T17:30:07Z

Might be worth checking voltage between TP1 and TP2 or using a powered hub to connect the USB disks.

zerxy · 2013-02-06T02:44:30Z

I am using a powered hub to connect the USB storage.

zerxy · 2013-02-08T07:07:44Z

Updated to Feb 7 kernel 3.6.11+ #371 via rpi-update. Nothing has changed. Error still occurs occasionally, only ever during boot.

zerxy · 2013-02-08T07:17:34Z

Please note that this error has only recently appeared for me. Only thing changed recently hardware-wise is the addition of another USB storage device which is connected along with previous USB storage device to my Pi via a powered hub. Using LVM2 to join them together. I have patched LVM2 as described here: https://lists.fedorahosted.org/pipermail/lvm2-commits/2012-November/000391.html

popcornmix · 2013-02-08T10:49:30Z

I don't think this is related to gpu_mem. Can you close this and open a USB issue about the occasional block/genhd.c panic?

zerxy · 2013-02-08T20:08:50Z

Renamed and edited this issue. Hopefully this is now acceptable to you.

zerxy · 2013-02-12T13:48:06Z

Updated to Feb 12 kernel 3.6.11+ #375 via rpi-update. Nothing has changed. Error still occurs occasionally, only ever during boot.

zerxy · 2013-02-16T23:53:38Z

Error still present in Feb 16 kernel 3.6.11+ #377.

zerxy · 2013-03-02T06:20:00Z

Error still present in Mar 1 kernel 3.6.11+ #385.

ghollingworth · 2013-03-02T07:21:27Z

What is the WARN in disk_clear_events? What does it say?

zerxy · 2013-03-02T07:36:29Z

dmesg output:

[    9.146250] ------------[ cut here ]------------
[    9.159509] WARNING: at block/genhd.c:1582 disk_clear_events+0x148/0x164()
[    9.174813] Modules linked in: dm_mod evdev leds_gpio led_class
[    9.189316] [<c0013a7c>] (unwind_backtrace+0x0/0xf0) from [<c001e2c8>] (warn_slowpath_common+0x4c/0x64)
[    9.207304] [<c001e2c8>] (warn_slowpath_common+0x4c/0x64) from [<c001e2fc>] (warn_slowpath_null+0x1c/0x24)
[    9.225602] [<c001e2fc>] (warn_slowpath_null+0x1c/0x24) from [<c01de260>] (disk_clear_events+0x148/0x164)
[    9.243824] [<c01de260>] (disk_clear_events+0x148/0x164) from [<c00f130c>] (check_disk_change+0x1c/0x54)
[    9.262046] [<c00f130c>] (check_disk_change+0x1c/0x54) from [<c0261ec0>] (sd_open+0x68/0x138)
[    9.279321] [<c0261ec0>] (sd_open+0x68/0x138) from [<c00f2118>] (__blkdev_get+0x2b0/0x3f0)
[    9.296388] [<c00f2118>] (__blkdev_get+0x2b0/0x3f0) from [<c00f23e0>] (blkdev_get+0x188/0x31c)
[    9.313828] [<c00f23e0>] (blkdev_get+0x188/0x31c) from [<c00bfc80>] (do_dentry_open.isra.16+0x1d4/0x254)
[    9.332160] [<c00bfc80>] (do_dentry_open.isra.16+0x1d4/0x254) from [<c00bfdd0>] (finish_open+0x20/0x38)
[    9.350440] [<c00bfdd0>] (finish_open+0x20/0x38) from [<c00cf07c>] (do_last.isra.35+0x4e8/0xba8)
[    9.368136] [<c00cf07c>] (do_last.isra.35+0x4e8/0xba8) from [<c00cf7e4>] (path_openat+0xa8/0x480)
[    9.385935] [<c00cf7e4>] (path_openat+0xa8/0x480) from [<c00cfe6c>] (do_filp_open+0x2c/0x80)
[    9.403254] [<c00cfe6c>] (do_filp_open+0x2c/0x80) from [<c00c09ec>] (do_sys_open+0xe8/0x184)
[    9.420462] [<c00c09ec>] (do_sys_open+0xe8/0x184) from [<c000da60>] (ret_fast_syscall+0x0/0x30)
[    9.437888] ---[ end trace 46c5c93ac717e025 ]---

ghollingworth · 2013-03-02T07:49:18Z

Yes, that's already in the bug. I was asking what the code in genhd.c is that caused the warning to be triggered.

zerxy · 2013-03-02T07:55:05Z

I don't understand why you need my assistance in determining the problem with genhd.c

ghollingworth · 2013-03-02T08:06:13Z

Because I'm only one person and have a thousand issues to deal with, it may help you solve the problem.

Otherwise I'll probably get around to it in a year or so.

Gordon

Director of software, Raspberry Pi

zerxy · 2013-03-02T08:09:39Z

Don't take your frustration out on me. If your employers actually knew what's good for them you would already have plenty of assistance from paid co-workers.

zerxy · 2013-03-02T08:11:56Z

What exactly do you want me to do to help you? I will try my best to help.

licaon-kter · 2013-03-02T08:34:01Z

@zerxy: Take it down a notch, would you? Remember that the "Raspberry Pi Foundation" is a charity first; they are all volunteers (or most of them anyway). Yes, they might be employed by Broadcom, but that's the day job. This is the after-hours part. ;)

zerxy · 2013-03-02T08:37:06Z

The charity thing is used as an excuse. Lots of Raspberry Pis sold = lots of money made = no excuse for poor support.

ghollingworth · 2013-03-02T14:39:25Z

If more than about ten people have the same problem

ghollingworth · 2013-03-02T14:44:01Z

Sorry, bad iPhone!

If more than about ten people have the same problem, then it's worth investing some of my time in it. Otherwise I will try to enable you to help yourself (which I was trying to do).

We do spend significant money on software, but there are a lot of minor bugs that may be finger trouble, Linux driver problems or just hardware problems. The point of communicating through this is to try to help you diagnose the problem and put it into the right box first.

Gordon

zerxy · 2013-03-02T15:52:28Z

Looking around, it appears that the problem I have, which has only been cosmetic for me so far, is not Raspberry Pi specific anyway.

zerxy · 2013-03-04T15:17:04Z

Lines 1580 to 1585 from block/genhd.c

        /* then, fetch and clear pending events */
        spin_lock_irq(&ev->lock);
        WARN_ON_ONCE(ev->clearing & mask);      /* cleared by workfn */
        pending = ev->pending & mask;
        ev->pending &= ~mask;
        spin_unlock_irq(&ev->lock);

zerxy · 2013-03-04T20:08:07Z

This might be the solution: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/block/genhd.c?id=12c2bdb232168511c8dd54d6626549391a228918

popcornmix · 2013-03-04T21:15:44Z

It does look like that is likely the fix. Unfortunately the patch doesn't apply cleanly.
Might be worth testing the 3.8 kernel which has this in.
#225

zerxy · 2013-03-05T16:23:04Z

Tried moving WARN_ON_ONCE(ev->clearing & mask); /* cleared by workfn */ line to the bottom, like so:

        /* then, fetch and clear pending events */
        spin_lock_irq(&ev->lock);
        pending = ev->pending & mask;
        ev->pending &= ~mask;
        spin_unlock_irq(&ev->lock);
        WARN_ON_ONCE(ev->clearing & mask);      /* cleared by workfn */

genhd.c compiles without error and the kernel produced has booted without the WARNING: at block/genhd.c:1582 disk_clear_events+0x148/0x164() error or any other error related to genhd.c in the few test boots I've performed.

zerxy · 2013-03-28T07:46:35Z

Tests have shown that the above change actually does not eliminate this error but instead makes it much less likely to occur. It is now quite rare.
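
One way to read this result: the warning condition is simply "some bit of mask is still set in ev->clearing when the check runs", so moving the check changes when it is evaluated, not what it tests. The sketch below is a sequential toy model in Python, not kernel code. It assumes, purely as a hypothesis, that some second code path can set bits in ev->clearing after the work function has cleared them; the names DiskEvents, other_path_sets_clearing and run are invented for illustration, and nothing in this thread establishes which real path is responsible.

```python
MEDIA_CHANGE = 0x1  # illustrative bit value, not taken from the kernel headers


class DiskEvents:
    """Toy stand-in for the two bitmask fields used in the excerpt."""

    def __init__(self):
        self.clearing = 0
        self.pending = 0


def request_clear(ev, mask):
    # The caller marks which events it wants the work function to clear.
    ev.clearing |= mask


def workfn(ev):
    # Per the comment in the excerpt, clearing is "cleared by workfn".
    ev.clearing = 0


def other_path_sets_clearing(ev, mask):
    # HYPOTHETICAL second actor: re-sets bits after workfn has run.
    ev.clearing |= mask


def fetch_and_check(ev, mask):
    """Model of the excerpt: return (warned, pending)."""
    warned = bool(ev.clearing & mask)  # the WARN_ON_ONCE condition
    pending = ev.pending & mask
    ev.pending &= ~mask
    return warned, pending


def run(interleave_other):
    ev = DiskEvents()
    request_clear(ev, MEDIA_CHANGE)
    workfn(ev)
    if interleave_other:
        other_path_sets_clearing(ev, MEDIA_CHANGE)
    warned, _ = fetch_and_check(ev, MEDIA_CHANGE)
    return warned


print("no interleaving:", run(False))
print("with interleaving:", run(True))
```

In this model the check fires only in the interleaved case, and it would fire there whether the check sits before or after the pending update, which would be consistent with the warning becoming rarer rather than disappearing.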

psergiu · 2013-04-07T20:13:21Z

Same issue. The errors appear only when both of the following are true:

- Two USB storage devices are present at boot (USB flash + USB HDD).
- The last field of /etc/fstab for the filesystems on those devices is not 0 (thus forcing fsck at boot time).

Changing the last field of /etc/fstab back to 0 (preventing fsck from running) causes the error to no longer appear.

Using Raspbian with "stable" 3.6.11+ rev 371.
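
To see which entries would be checked at boot, the sixth field of each fstab line can be inspected. This is a minimal sketch, assuming the standard whitespace-separated fstab layout; the function name entries_with_fsck and the sample device names and mount points are hypothetical, not taken from anyone's system in this thread.

```python
def entries_with_fsck(fstab_text):
    """Return (device, mount_point, passno) for entries whose sixth field is not 0."""
    result = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        # A missing sixth field (fs_passno) is treated as 0, i.e. no check.
        passno = fields[5] if len(fields) > 5 else "0"
        if passno != "0":
            result.append((fields[0], fields[1], passno))
    return result


# Hypothetical example file; devices and mount points are made up.
SAMPLE_FSTAB = """\
# <device>        <mount>    <type> <options>        <dump> <pass>
/dev/mmcblk0p2    /          ext4   defaults,noatime 0      1
/dev/sda1         /mnt/flash ext4   defaults         0      2
/dev/sdb1         /mnt/hdd   ext4   defaults         0      0
"""

for device, mount_point, passno in entries_with_fsck(SAMPLE_FSTAB):
    print(device, mount_point, passno)
```

On a real system the text would come from reading /etc/fstab; setting the reported entries' last field to 0 corresponds to the workaround described above.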

ghollingworth · 2013-07-17T20:33:18Z

Can you test again with the latest build?

P33M · 2014-05-05T12:54:37Z

Two 3TB hard disks with /etc/fstab fsck column set nonzero works for me.

zerxy · 2014-05-05T18:27:09Z

This stopped being an issue for me with kernel 3.8 and later.

[ Upstream commit 4f4178f ] Fixes this warning introduced by commit 5b8f15f ("net: cdc_mbim: handle IPv6 Neigbor Solicitations"): =============================== [ INFO: suspicious RCU usage. ] 3.15.0-rc3 #213 Tainted: G W O ------------------------------- net/8021q/vlan_core.c:69 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 no locks held by ksoftirqd/0/3. stack backtrace: CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G W O 3.15.0-rc3 #213 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011 0000000000000001 ffff880232533bf0 ffffffff813a5ee6 0000000000000006 ffff880232530090 ffff880232533c20 ffffffff81076b94 0000000000000081 0000000000000000 ffff8802085ac000 ffff88007fc8ea00 ffff880232533c50 Call Trace: [<ffffffff813a5ee6>] dump_stack+0x4e/0x68 [<ffffffff81076b94>] lockdep_rcu_suspicious+0xfa/0x103 [<ffffffff813978a6>] __vlan_find_dev_deep+0x54/0x94 [<ffffffffa04a1938>] cdc_mbim_rx_fixup+0x379/0x66a [cdc_mbim] [<ffffffff813ab76f>] ? _raw_spin_unlock_irqrestore+0x3a/0x49 [<ffffffff81079671>] ? trace_hardirqs_on_caller+0x192/0x1a1 [<ffffffffa059bd10>] usbnet_bh+0x59/0x287 [usbnet] [<ffffffff8104067d>] tasklet_action+0xbb/0xcd [<ffffffff81040057>] __do_softirq+0x14c/0x30d [<ffffffff81040237>] run_ksoftirqd+0x1f/0x50 [<ffffffff8105f13e>] smpboot_thread_fn+0x172/0x18e [<ffffffff8105efcc>] ? SyS_setgroups+0xdf/0xdf [<ffffffff810594b0>] kthread+0xb5/0xbd [<ffffffff813a84b1>] ? __wait_for_common+0x13b/0x170 [<ffffffff810593fb>] ? __kthread_parkme+0x5c/0x5c [<ffffffff813b147c>] ret_from_fork+0x7c/0xb0 [<ffffffff810593fb>] ? __kthread_parkme+0x5c/0x5c Fixes: 5b8f15f ("net: cdc_mbim: handle IPv6 Neigbor Solicitations") Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ecdd095 upstream.

Inside set_status, the transfer needs to be set up again, so we have to
drain I/O before the transition; otherwise an oops may be triggered, like
the following:

divide error: 0000 [#1] SMP KASAN
CPU: 0 PID: 2935 Comm: loop7 Not tainted 4.10.0-rc7+ #213
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88006ba1e840 task.stack: ffff880067338000
RIP: 0010:transfer_xor+0x1d1/0x440 drivers/block/loop.c:110
RSP: 0018:ffff88006733f108 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800688d7000 RCX: 0000000000000059
RDX: 0000000000000000 RSI: 1ffff1000d743f43 RDI: ffff880068891c08
RBP: ffff88006733f160 R08: ffff8800688d7001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800688d7000
R13: ffff880067b7d000 R14: dffffc0000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006c17e0 CR3: 0000000066e3b000 CR4: 00000000001406f0
Call Trace:
 lo_do_transfer drivers/block/loop.c:251 [inline]
 lo_read_transfer drivers/block/loop.c:392 [inline]
 do_req_filebacked drivers/block/loop.c:541 [inline]
 loop_handle_cmd drivers/block/loop.c:1677 [inline]
 loop_queue_work+0xda0/0x49b0 drivers/block/loop.c:1689
 kthread_worker_fn+0x4c3/0xa30 kernel/kthread.c:630
 kthread+0x326/0x3f0 kernel/kthread.c:227
 ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
Code: 03 83 e2 07 41 29 df 42 0f b6 04 30 4d 8d 44 24 01 38 d0 7f 08 84 c0 0f 85 62 02 00 00 44 89 f8 41 0f b6 48 ff 25 ff 01 00 00 99 <f7> 7d c8 48 63 d2 48 03 55 d0 48 89 d0 48 89 d7 48 c1 e8 03 83
RIP: transfer_xor+0x1d1/0x440 drivers/block/loop.c:110 RSP: ffff88006733f108
---[ end trace 0166f7bd3b0c0933 ]---

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Zero sized buffer objects tend to make various bits of the GEM
infrastructure complain:

WARNING: CPU: 1 PID: 2323 at drivers/gpu/drm/drm_mm.c:389 drm_mm_insert_node_generic+0x258/0x2f0
Modules linked in:

CPU: 1 PID: 2323 Comm: drm-api-test Tainted: G W 4.9.0-rc4-00906-g693af44 #213
Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
task: ffff8000d7353400 task.stack: ffff8000d7720000
PC is at drm_mm_insert_node_generic+0x258/0x2f0
LR is at drm_vma_offset_add+0x4c/0x70

Zero sized buffers serve no appreciable value to the user so disallow
them at create time.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

rxrpc_service_prealloc_one() doesn't set the socket pointer on any new call
it preallocates, but does add it to the rxrpc net namespace call list. This,
however, causes rxrpc_put_call() to oops when the call is discarded when the
socket is closed. rxrpc_put_call() needs the socket to be able to reach the
namespace so that it can use a lock held therein.

Fix this by setting a call's socket pointer immediately before discarding it.

This can be triggered by unloading the kafs module, resulting in an oops like
the following:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: rxrpc_put_call+0x1e2/0x32d
PGD 0 P4D 0
Oops: 0000 [#1] SMP
Modules linked in: kafs(E-)
CPU: 3 PID: 3037 Comm: rmmod Tainted: G E 4.12.0-fscache+ #213
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff8803fc92e2c0 task.stack: ffff8803fef74000
RIP: 0010:rxrpc_put_call+0x1e2/0x32d
RSP: 0018:ffff8803fef77e08 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8803fab99ac0 RCX: 000000000000000f
RDX: ffffffff81c50a40 RSI: 000000000000000c RDI: ffff8803fc92ea88
RBP: ffff8803fef77e30 R08: ffff8803fc87b941 R09: ffffffff82946d20
R10: ffff8803fef77d10 R11: 00000000000076fc R12: 0000000000000005
R13: ffff8803fab99c20 R14: 0000000000000001 R15: ffffffff816c6aee
FS: 00007f915a059700(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 00000003fef39000 CR4: 00000000001406e0
Call Trace:
 rxrpc_discard_prealloc+0x325/0x341
 rxrpc_listen+0xf9/0x146
 kernel_listen+0xb/0xd
 afs_close_socket+0x3e/0x173 [kafs]
 afs_exit+0x1f/0x57 [kafs]
 SyS_delete_module+0x10f/0x19a
 do_syscall_64+0x8a/0x149
 entry_SYSCALL64_slow_path+0x25/0x25

Fixes: 2baec2c ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit 082f230 upstream. Local random address needs to be updated before creating connection if RPA from LE Direct Advertising Report was resolved in host. Otherwise remote device might ignore connection request due to address mismatch. This was affecting following qualification test cases: GAP/CONN/SCEP/BV-03-C, GAP/CONN/GCEP/BV-05-C, GAP/CONN/DCEP/BV-05-C Before patch: < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #11350 [hci0] 84680.231216 Address: 56:BC:E8:24:11:68 (Resolvable) Identity type: Random (0x01) Identity: F2:F1:06:3D:9C:42 (Static) > HCI Event: Command Complete (0x0e) plen 4 #11351 [hci0] 84680.246022 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #11352 [hci0] 84680.246417 Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Own address type: Random (0x01) Filter policy: Accept all advertisement, inc. directed unresolved RPA (0x02) > HCI Event: Command Complete (0x0e) plen 4 #11353 [hci0] 84680.248854 LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #11354 [hci0] 84680.249466 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 #11355 [hci0] 84680.253222 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 18 #11356 [hci0] 84680.458387 LE Direct Advertising Report (0x0b) Num reports: 1 Event type: Connectable directed - ADV_DIRECT_IND (0x01) Address type: Random (0x01) Address: 53:38:DA:46:8C:45 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Direct address type: Random (0x01) Direct address: 7C:D6:76:8C:DF:82 (Resolvable) Identity type: Random (0x01) Identity: F2:F1:06:3D:9C:42 (Static) RSSI: -74 dBm (0xb6) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #11357 [hci0] 84680.458737 Scanning: Disabled (0x00) Filter 
duplicates: Disabled (0x00) > HCI Event: Command Complete (0x0e) plen 4 #11358 [hci0] 84680.469982 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #11359 [hci0] 84680.470444 Scan interval: 60.000 msec (0x0060) Scan window: 60.000 msec (0x0060) Filter policy: White list is not used (0x00) Peer address type: Random (0x01) Peer address: 53:38:DA:46:8C:45 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Own address type: Random (0x01) Min connection interval: 30.00 msec (0x0018) Max connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Min connection length: 0.000 msec (0x0000) Max connection length: 0.000 msec (0x0000) > HCI Event: Command Status (0x0f) plen 4 #11360 [hci0] 84680.474971 LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection Cancel (0x08|0x000e) plen 0 #11361 [hci0] 84682.545385 > HCI Event: Command Complete (0x0e) plen 4 #11362 [hci0] 84682.551014 LE Create Connection Cancel (0x08|0x000e) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #11363 [hci0] 84682.551074 LE Connection Complete (0x01) Status: Unknown Connection Identifier (0x02) Handle: 0 Role: Master (0x00) Peer address type: Public (0x00) Peer address: 00:00:00:00:00:00 (OUI 00-00-00) Connection interval: 0.00 msec (0x0000) Connection latency: 0 (0x0000) Supervision timeout: 0 msec (0x0000) Master clock accuracy: 0x00 After patch: < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #210 [hci0] 667.152459 Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Own address type: Random (0x01) Filter policy: Accept all advertisement, inc. 
directed unresolved RPA (0x02) > HCI Event: Command Complete (0x0e) plen 4 #211 [hci0] 667.153613 LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #212 [hci0] 667.153704 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 #213 [hci0] 667.154584 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 18 #214 [hci0] 667.182619 LE Direct Advertising Report (0x0b) Num reports: 1 Event type: Connectable directed - ADV_DIRECT_IND (0x01) Address type: Random (0x01) Address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Direct address type: Random (0x01) Direct address: 7C:C1:57:A5:B7:A8 (Resolvable) Identity type: Random (0x01) Identity: F4:28:73:5D:38:B0 (Static) RSSI: -70 dBm (0xba) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #215 [hci0] 667.182704 Scanning: Disabled (0x00) Filter duplicates: Disabled (0x00) > HCI Event: Command Complete (0x0e) plen 4 #216 [hci0] 667.183599 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #217 [hci0] 667.183645 Address: 7C:C1:57:A5:B7:A8 (Resolvable) Identity type: Random (0x01) Identity: F4:28:73:5D:38:B0 (Static) > HCI Event: Command Complete (0x0e) plen 4 #218 [hci0] 667.184590 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #219 [hci0] 667.184613 Scan interval: 60.000 msec (0x0060) Scan window: 60.000 msec (0x0060) Filter policy: White list is not used (0x00) Peer address type: Random (0x01) Peer address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Own address type: Random (0x01) Min connection interval: 30.00 msec (0x0018) Max connection interval: 50.00 msec (0x0028) Connection latency: 0 
(0x0000) Supervision timeout: 420 msec (0x002a) Min connection length: 0.000 msec (0x0000) Max connection length: 0.000 msec (0x0000) > HCI Event: Command Status (0x0f) plen 4 #220 [hci0] 667.186558 LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #221 [hci0] 667.485824 LE Connection Complete (0x01) Status: Success (0x00) Handle: 0 Role: Master (0x00) Peer address type: Random (0x01) Peer address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Master clock accuracy: 0x07 @ MGMT Event: Device Connected (0x000b) plen 13 {0x0002} [hci0] 667.485996 LE Address: 11:22:33:44:55:66 (OUI 11-22-33) Flags: 0x00000000 Data length: 0 Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

We got a syzkaller report of an aarch64 alignment fault when KFENCE is
enabled. When the size supplied by the user BPF program is an odd number,
such as 399 or 407, struct skb_shared_info ends up at an unaligned address
and is accessed unaligned. As seen below:

BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032

Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
 __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
 arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
 arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
 atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
 __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
 skb_clone+0xf4/0x214 net/core/skbuff.c:1481
 ____bpf_clone_redirect net/core/filter.c:2433 [inline]
 bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
 bpf_prog_d3839dd9068ceb51+0x80/0x330
 bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
 bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
 bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
 bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
 __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
 __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381

kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407, cache=kmalloc-512

allocated by task 15074 on cpu 0 at 1342.585390s:
 kmalloc include/linux/slab.h:568 [inline]
 kzalloc include/linux/slab.h:675 [inline]
 bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
 bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
 bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
 __do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
 __se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
 __arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381

To fix the problem, we adjust @size so that (@size + @headroom) is a
multiple of SMP_CACHE_BYTES. This makes sure that struct skb_shared_info
is aligned to a cache line.

Fixes: 1cf1cae ("bpf: introduce BPF_PROG_TEST_RUN command")
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20221102081620.1465154-1-zhongbaisong@huawei.com
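The size adjustment described in the commit message above can be illustrated with a little arithmetic. This is only a sketch of the rounding idea, not the kernel code: `aligned_size` and `round_up` are hypothetical helper names, and the cache-line size of 64 bytes is an assumption (the real SMP_CACHE_BYTES depends on the architecture).

```python
SMP_CACHE_BYTES = 64  # assumed cache-line size; the real value is architecture-specific


def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple of `multiple` (hypothetical helper)."""
    return -(-value // multiple) * multiple


def aligned_size(size: int, headroom: int, cache_bytes: int = SMP_CACHE_BYTES) -> int:
    """Smallest size' >= size such that (size' + headroom) is a multiple of cache_bytes.

    Hypothetical helper: whatever is placed directly after `headroom + size'`
    bytes of a cache-line-aligned buffer then starts on a cache-line boundary.
    """
    return round_up(size + headroom, cache_bytes) - headroom


if __name__ == "__main__":
    # The odd sizes quoted in the commit message, with an illustrative headroom of 32.
    for size in (399, 407):
        adjusted = aligned_size(size, headroom=32)
        print(size, "->", adjusted, "end offset", adjusted + 32)
```

With these assumed values both 399 and 407 are padded so that the data area ends on a 64-byte boundary, which is the property the fix relies on.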

[ Upstream commit 3531b27 ]

Fix a missed "goto out" to unlock on error to clean up this splat:

WARNING: lock held when returning to user space!
6.6.0-rc3-lizhijian+ #213 Not tainted
------------------------------------------------
cxl/673 is leaving the kernel with locks still held!
1 lock held by cxl/673:
 #0: ffffffffa013b9d0 (cxl_region_rwsem){++++}-{3:3}, at: commit_store+0x7d/0x3e0 [cxl_core]

In terms of user visible impact of this bug for backports:

cxl_region_invalidate_memregion() on x86 invokes wbinvd, which is a
problematic instruction for virtualized environments. So, on virtualized
x86, cxl_region_invalidate_memregion() returns an error. This failure case
got missed because CXL memory-expander device passthrough is not a
production use case, and emulation of CXL devices is typically limited to
kernel development builds with CONFIG_CXL_REGION_INVALIDATION_TEST=y,
which makes cxl_region_invalidate_memregion() succeed.

In other words, the expected exposure of this bug is limited to CXL
subsystem development environments using QEMU that neglected
CONFIG_CXL_REGION_INVALIDATION_TEST=y.

Fixes: d1257d0 ("cxl/region: Move cache invalidation before region teardown, and before setup")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20231025085450.2514906-1-lizhijian@fujitsu.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
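The bug class behind that splat, an early return on an error path that skips the unlock, can be shown with a small userspace analogy. This is a sketch only and does not mirror the CXL code: `region_lock`, `invalidate_memregion`, `commit_store_buggy` and `commit_store_fixed` are hypothetical names, and the error value -1 is arbitrary.

```python
import threading

# Hypothetical stand-in for the lock named in the splat (cxl_region_rwsem).
region_lock = threading.Lock()


def invalidate_memregion(fail: bool) -> int:
    """Hypothetical helper: returns a non-zero error code when `fail` is set."""
    return -1 if fail else 0


def commit_store_buggy(fail: bool) -> int:
    """Error path returns directly and leaves the lock held."""
    region_lock.acquire()
    rc = invalidate_memregion(fail)
    if rc:
        return rc  # bug: the lock is never released on this path
    region_lock.release()
    return 0


def commit_store_fixed(fail: bool) -> int:
    """Every exit, including the error path, goes through the unlock."""
    region_lock.acquire()
    try:
        return invalidate_memregion(fail)
    finally:
        region_lock.release()  # plays the role of the shared "out:" label


if __name__ == "__main__":
    print(commit_store_fixed(True), region_lock.locked())
    print(commit_store_buggy(True), region_lock.locked())
    region_lock.release()  # clean up after the buggy variant
```

In C the same effect is usually achieved by jumping to a common exit label that drops the lock, which is what the "goto out" in the commit message refers to.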

When the mirred action is used on a classful egress qdisc and a packet is mirrored or redirected to self we hit a qdisc lock deadlock. See trace below.

[..... other info removed for brevity....]
[ 82.890906]
[ 82.890906] ============================================
[ 82.890906] WARNING: possible recursive locking detected
[ 82.890906] 6.8.0-05205-g77fadd89fe2d-dirty #213 Tainted: G W
[ 82.890906] --------------------------------------------
[ 82.890906] ping/418 is trying to acquire lock:
[ 82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at: __dev_queue_xmit+0x1778/0x3550
[ 82.890906]
[ 82.890906] but task is already holding lock:
[ 82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at: __dev_queue_xmit+0x1778/0x3550
[ 82.890906]
[ 82.890906] other info that might help us debug this:
[ 82.890906] Possible unsafe locking scenario:
[ 82.890906]
[ 82.890906] CPU0
[ 82.890906] ----
[ 82.890906] lock(&sch->q.lock);
[ 82.890906] lock(&sch->q.lock);
[ 82.890906]
[ 82.890906] *** DEADLOCK ***
[ 82.890906]
[..... other info removed for brevity....]

Example setup (eth0->eth0) to recreate

    tc qdisc add dev eth0 root handle 1: htb default 30
    tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
        action mirred egress redirect dev eth0

Another example (eth0->eth1->eth0) to recreate

    tc qdisc add dev eth0 root handle 1: htb default 30
    tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
        action mirred egress redirect dev eth1

    tc qdisc add dev eth1 root handle 1: htb default 30
    tc filter add dev eth1 handle 1: protocol ip prio 2 matchall \
        action mirred egress redirect dev eth0

We fix this by adding an owner field (CPU id) to struct Qdisc set after root qdisc is entered. When the softirq enters it a second time, if the qdisc owner is the same CPU, the packet is dropped to break the loop.

Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
Closes: https://lore.kernel.org/netdev/20240314111713.5979-1-renmingshuai@huawei.com/
Fixes: 3bcb846 ("net: get rid of spin_trylock() in net_tx_action()")
Fixes: e578d9c ("net: sched: use counter to break reclassify loops")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20240415210728.36949-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
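The owner-field idea described in the commit message above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel patch: `toy_qdisc` and `toy_xmit()` are hypothetical, the CPU id is passed in as a plain integer, and no real lock is taken.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct Qdisc. */
struct toy_qdisc {
    int owner;                   /* CPU id currently inside this qdisc, -1 when idle */
    struct toy_qdisc *redirect;  /* target of a mirred-like redirect, or NULL */
    int sent;                    /* packets that left normally */
    int dropped;                 /* packets dropped to break a loop */
};

/* Returns 0 if the packet was sent, -1 if it was dropped. */
static int toy_xmit(struct toy_qdisc *q, int cpu)
{
    int ret = 0;

    /* Re-entry by the CPU that already owns this qdisc: in the real
     * kernel this is where the second lock acquisition would deadlock,
     * so the packet is dropped instead. */
    if (q->owner == cpu) {
        q->dropped++;
        return -1;
    }

    /* The real code would take the root qdisc lock here. */
    q->owner = cpu;

    if (q->redirect)
        ret = toy_xmit(q->redirect, cpu);  /* redirect re-enters the transmit path */
    else
        q->sent++;

    q->owner = -1;
    return ret;
}
```

Pointing `redirect` back at the same `toy_qdisc` models the eth0->eth0 setup, and two structures pointing at each other model eth0->eth1->eth0. In both cases the packet is dropped on the second entry into the first qdisc rather than recursing indefinitely.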


ghost assigned ghollingworth Feb 9, 2013

P33M closed this as completed May 5, 2014

shawaj mentioned this issue Feb 14, 2017

clk-bcm2835: Mark used PLLs and dividers CRITICAL #1846

Closed
