Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel oops on plugging in any V4l2 device since applying media_build updates for PCTV 461e DVB-S2 #524

Closed
tvjon opened this issue Feb 6, 2014 · 4 comments

Comments

@tvjon
Copy link

tvjon commented Feb 6, 2014

Linux rpi 3.10.28+ #634 PREEMPT Sun Feb 2 15:16:25 GMT 2014 armv6l GNU/L
-------------------------------------------------------------------------------------------------------
[  101.828153] usb 1-1.2: new high-speed USB device number 7 using dwc_otg
[  101.929822] usb 1-1.2: New USB device found, idVendor=2013, idProduct=0258
[  101.929860] usb 1-1.2: New USB device strings: Mfr=3, Product=1, SerialNumber=2
[  101.929879] usb 1-1.2: Product: PCTV 461
[  101.929914] usb 1-1.2: Manufacturer: PCTV
[  101.929933] usb 1-1.2: SerialNumber: 0011319010
[  101.990440] media: Linux media interface: v0.10
[  102.015030] Linux video capture interface: v2.00
[  102.015095] WARNING: You are using an experimental version of the media stack.
[  102.015095]  As the driver is backported to an older kernel, it doesn't offer
[  102.015095]  enough quality for its usage in production.
[  102.015095]  Use it with care.
[  102.015095] Latest git patches (needed if you report a bug to linux-media@vger.kernel.org):
[  102.015095]  261cb200e7227820cd0056435d7c1a3a9c476766 [media] af9035: add ID [2040:f900] Hauppauge WinTV-MiniStick 2
[  102.015095]  dee88f4378d18b4e7ebca22c82dab2ed5dfd178c [media] nuvoton-cir: Don't touch PS/2 interrupts while initializing
[  102.015095]  8d2b022911c2fa72085df39921dc5cd963bc159f [media] cx23885: Fix tuning regression for TeVii S471
[  102.025513] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[  102.025566] pgd = d2ebc000
[  102.025582] [00000004] *pgd=00000000
[  102.025607] Internal error: Oops: 5 [#1] PREEMPT ARM
[  102.025622] Modules linked in: v4l2_common(O) videodev(O) media(O) bnep rfcomm bluetooth rfkill snd_bcm2835 snd_soc_wm8804 snd_soc_bcm2708_i2s evdev regmap_mmio joydev snd_soc_core snd_compress regmap_i2c regmap_spi snd_pcm snd_page_alloc snd_seq snd_seq_device snd_timer leds_gpio led_class snd spi_bcm2708 i2c_bcm2708
[  102.025731] CPU: 0 PID: 2542 Comm: modprobe Tainted: G           O 3.10.28+ #634
[  102.025750] task: d2cd6f60 ti: d2e60000 task.ti: d2e60000
[  102.025789] PC is at module_put+0x28/0x6c
[  102.025812] LR is at load_module+0x1780/0x1ddc
[  102.025831] pc : [<c005f9c0>]    lr : [<c0061f10>]    psr: a0000013
[  102.025831] sp : d2e61ea0  ip : 00000000  fp : c005f77c
[  102.025848] r10: 00000001  r9 : d2e60000  r8 : d2f00580
[  102.025862] r7 : 00000001  r6 : bf17adb0  r5 : bf17adbc  r4 : d2e60000
[  102.025876] r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : bf17adb0
[  102.025892] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  102.025906] Control: 00c5387d  Table: 12ebc008  DAC: 00000015
[  102.025921] Process modprobe (pid: 2542, stack limit = 0xd2e601b0)
[  102.025935] Stack: (0xd2e61ea0 to 0xd2e62000)
[  102.025958] 1ea0: d2e61f58 c0061f10 bf17adbc 00007fff c005ede4 c007e9ac c005f82c d89e6000
[  102.025981] 1ec0: d2e60000 bf17adf8 d2e60000 d2e60000 c02284fc c0531144 bf17aa88 00000005
[  102.026003] 1ee0: bf17aab0 0000000a 00000000 00000000 6e72656b 00006c65 00000000 00000000
[  102.026025] 1f00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  102.026046] 1f20: 00000000 00000000 00000000 3cb60e57 000000d2 0000360c b6f34000 b6f1c948
[  102.026068] 1f40: 00000080 c000dc68 d2e60000 00000000 00000000 c0062644 d89e6000 0000360c
[  102.026089] 1f60: d89e7744 d89e7605 d89e881c 00000ef8 00001678 00000000 00000000 00000000
[  102.026110] 1f80: 0000001f 00000020 00000019 00000017 00000015 00000000 00000000 00060000
[  102.026132] 1fa0: b8365630 c000dac0 00000000 00060000 b6f34000 0000360c b6f1c948 b6f34000
[  102.026154] 1fc0: 00000000 00060000 b8365630 00000080 b8364e88 0000360c b6f1c948 00000000
[  102.026176] 1fe0: 00000000 bee6c91c b6f13fb4 b6e7eec4 60000010 b6f34000 00000000 00000000
[  102.026216] [<c005f9c0>] (module_put+0x28/0x6c) from [<c0061f10>] (load_module+0x1780/0x1ddc)
[  102.026251] [<c0061f10>] (load_module+0x1780/0x1ddc) from [<c0062644>] (SyS_init_module+0xd8/0xec)
[  102.026290] [<c0062644>] (SyS_init_module+0xd8/0xec) from [<c000dac0>] (ret_fast_syscall+0x0/0x30)
[  102.026314] Code: e5943004 e2833001 e5843004 e590314c (e5932004) 
[  102.026367] ---[ end trace 7bc1951145d16b77 ]---
[  102.026392] note: modprobe[2542] exited with preempt_count 1
@popcornmix
Copy link
Collaborator

Can you explain what "media_build updates for PCTV 461e DVB-S2" are?
Is this something you've locally changed and rebuilt?
Is this an upstream commit we've pulled into kernel tree?

@tvjon
Copy link
Author

tvjon commented Feb 9, 2014

Yes,
http://git.linuxtv.org/anttip/media_tree.git/shortlog/refs/heads/pctv_461e
Also:
https://lwn.net/Articles/573081/

If possible, I wanted to add DVB drivers without having to compile the kernel, particularly as I'm trying to build natively on RPi.
The build went ahead ok (took several hours of course), & installing also went ok.

I was hoping to get this tuner working as I think you're already aware that the 290e tuner currently won't initialise because of a "GPIO range error" I reported a couple of weeks or so ago.

Thank you for helping.

@popcornmix
Copy link
Collaborator

I think you'll have to report this to linuxtv (or whoever produces the driver).
We can't really be expected to support out of tree drivers (and I have no immediate thoughts as to what the problem is).
This message doesn't really inspire confidence:

[  102.015095] WARNING: You are using an experimental version of the media stack.
[  102.015095]  As the driver is backported to an older kernel, it doesn't offer
[  102.015095]  enough quality for its usage in production.
[  102.015095]  Use it with care.

@P33M
Copy link
Contributor

P33M commented Jul 14, 2015

Please report issues regarding upstream subsystems to the relevant maintainers. We can't support backports from various development versions here, as the workload would multiply rapidly.

@P33M P33M closed this as completed Jul 14, 2015
popcornmix pushed a commit that referenced this issue Aug 19, 2016
drm_connector_register_all requires a few too many locks because our
connector_list locking is busted. Add another FIXME+hack to work
around this. This should address the below lockdep splat:

======================================================
[ INFO: possible circular locking dependency detected ]
4.7.0-rc5+ #524 Tainted: G           O
-------------------------------------------------------
kworker/u8:0/6 is trying to acquire lock:
 (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120

but task is already holding lock:
 ((fb_notifier_list).rwsem){++++.+}, at: [<ffffffff810ac195>] __blocking_notifier_call_chain+0x35/0x70

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 ((fb_notifier_list).rwsem){++++.+}:
       [<ffffffff810df611>] lock_acquire+0xb1/0x200
       [<ffffffff819a55b4>] down_write+0x44/0x80
       [<ffffffff810abf91>] blocking_notifier_chain_register+0x21/0xb0
       [<ffffffff814c7448>] fb_register_client+0x18/0x20
       [<ffffffff814c6c86>] backlight_device_register+0x136/0x260
       [<ffffffffa0127eb2>] intel_backlight_device_register+0xa2/0x160 [i915]
       [<ffffffffa00f46be>] intel_connector_register+0xe/0x10 [i915]
       [<ffffffffa0112bfb>] intel_dp_connector_register+0x1b/0x80 [i915]
       [<ffffffff8159dfea>] drm_connector_register+0x4a/0x80
       [<ffffffff8159fe44>] drm_connector_register_all+0x64/0xf0
       [<ffffffff815a2a64>] drm_modeset_register_all+0x174/0x1c0
       [<ffffffff81599b72>] drm_dev_register+0xc2/0xd0
       [<ffffffffa00621d7>] i915_driver_load+0x1547/0x2200 [i915]
       [<ffffffffa006d80f>] i915_pci_probe+0x4f/0x70 [i915]
       [<ffffffff814a2135>] local_pci_probe+0x45/0xa0
       [<ffffffff814a349b>] pci_device_probe+0xdb/0x130
       [<ffffffff815c07e3>] driver_probe_device+0x223/0x440
       [<ffffffff815c0ad5>] __driver_attach+0xd5/0x100
       [<ffffffff815be386>] bus_for_each_dev+0x66/0xa0
       [<ffffffff815c002e>] driver_attach+0x1e/0x20
       [<ffffffff815bf9be>] bus_add_driver+0x1ee/0x280
       [<ffffffff815c1810>] driver_register+0x60/0xe0
       [<ffffffff814a1a10>] __pci_register_driver+0x60/0x70
       [<ffffffffa01a905b>] i915_init+0x5b/0x62 [i915]
       [<ffffffff8100042d>] do_one_initcall+0x3d/0x150
       [<ffffffff811a935b>] do_init_module+0x5f/0x1d9
       [<ffffffff81124416>] load_module+0x20e6/0x27e0
       [<ffffffff81124d63>] SYSC_finit_module+0xc3/0xf0
       [<ffffffff81124dae>] SyS_finit_module+0xe/0x10
       [<ffffffff819a83a9>] entry_SYSCALL_64_fastpath+0x1c/0xac

-> #0 (&dev->mode_config.mutex){+.+.+.}:
       [<ffffffff810df0ac>] __lock_acquire+0x10fc/0x1260
       [<ffffffff810df611>] lock_acquire+0xb1/0x200
       [<ffffffff819a3097>] mutex_lock_nested+0x67/0x3c0
       [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120
       [<ffffffff8158f79b>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2b/0x80
       [<ffffffff8158f81d>] drm_fb_helper_set_par+0x2d/0x50
       [<ffffffffa0105f7a>] intel_fbdev_set_par+0x1a/0x60 [i915]
       [<ffffffff814c13c6>] fbcon_init+0x586/0x610
       [<ffffffff8154d16a>] visual_init+0xca/0x130
       [<ffffffff8154e611>] do_bind_con_driver+0x1c1/0x3a0
       [<ffffffff8154eaf6>] do_take_over_console+0x116/0x180
       [<ffffffff814bd3a7>] do_fbcon_takeover+0x57/0xb0
       [<ffffffff814c1e48>] fbcon_event_notify+0x658/0x750
       [<ffffffff810abcae>] notifier_call_chain+0x3e/0xb0
       [<ffffffff810ac1ad>] __blocking_notifier_call_chain+0x4d/0x70
       [<ffffffff810ac1e6>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff814c748b>] fb_notifier_call_chain+0x1b/0x20
       [<ffffffff814c86b1>] register_framebuffer+0x251/0x330
       [<ffffffff8158fa9f>] drm_fb_helper_initial_config+0x25f/0x3f0
       [<ffffffffa0106b48>] intel_fbdev_initial_config+0x18/0x30 [i915]
       [<ffffffff810adfd8>] async_run_entry_fn+0x48/0x150
       [<ffffffff810a3947>] process_one_work+0x1e7/0x750
       [<ffffffff810a3efb>] worker_thread+0x4b/0x4f0
       [<ffffffff810aad4f>] kthread+0xef/0x110
       [<ffffffff819a85ef>] ret_from_fork+0x1f/0x40

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((fb_notifier_list).rwsem);
                               lock(&dev->mode_config.mutex);
                               lock((fb_notifier_list).rwsem);
  lock(&dev->mode_config.mutex);

 *** DEADLOCK ***

6 locks held by kworker/u8:0/6:
 #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff810a38c9>] process_one_work+0x169/0x750
 #1:  ((&entry->work)){+.+.+.}, at: [<ffffffff810a38c9>] process_one_work+0x169/0x750
 #2:  (registration_lock){+.+.+.}, at: [<ffffffff814c8487>] register_framebuffer+0x27/0x330
 #3:  (console_lock){+.+.+.}, at: [<ffffffff814c86ce>] register_framebuffer+0x26e/0x330
 #4:  (&fb_info->lock){+.+.+.}, at: [<ffffffff814c78dd>] lock_fb_info+0x1d/0x40
 #5:  ((fb_notifier_list).rwsem){++++.+}, at: [<ffffffff810ac195>] __blocking_notifier_call_chain+0x35/0x70

stack backtrace:
CPU: 2 PID: 6 Comm: kworker/u8:0 Tainted: G           O    4.7.0-rc5+ #524
Hardware name: Intel Corp. Broxton P/NOTEBOOK, BIOS APLKRVPA.X64.0138.B33.1606250842 06/25/2016
Workqueue: events_unbound async_run_entry_fn
 0000000000000000 ffff8800758577f0 ffffffff814507a5 ffffffff828b9900
 ffffffff828b9900 ffff880075857830 ffffffff810dc6fa ffff880075857880
 ffff88007584d688 0000000000000005 0000000000000006 ffff88007584d6b0
Call Trace:
 [<ffffffff814507a5>] dump_stack+0x67/0x92
 [<ffffffff810dc6fa>] print_circular_bug+0x1aa/0x200
 [<ffffffff810df0ac>] __lock_acquire+0x10fc/0x1260
 [<ffffffff810df611>] lock_acquire+0xb1/0x200
 [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120
 [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120
 [<ffffffff819a3097>] mutex_lock_nested+0x67/0x3c0
 [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120
 [<ffffffff810fa85f>] ? rcu_read_lock_sched_held+0x7f/0x90
 [<ffffffff81208218>] ? kmem_cache_alloc_trace+0x248/0x2b0
 [<ffffffff815afdc5>] ? drm_modeset_lock_all+0x25/0x120
 [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120
 [<ffffffff8158f79b>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2b/0x80
 [<ffffffff8158f81d>] drm_fb_helper_set_par+0x2d/0x50
 [<ffffffffa0105f7a>] intel_fbdev_set_par+0x1a/0x60 [i915]
 [<ffffffff814c13c6>] fbcon_init+0x586/0x610
 [<ffffffff8154d16a>] visual_init+0xca/0x130
 [<ffffffff8154e611>] do_bind_con_driver+0x1c1/0x3a0
 [<ffffffff8154eaf6>] do_take_over_console+0x116/0x180
 [<ffffffff814bd3a7>] do_fbcon_takeover+0x57/0xb0
 [<ffffffff814c1e48>] fbcon_event_notify+0x658/0x750
 [<ffffffff810abcae>] notifier_call_chain+0x3e/0xb0
 [<ffffffff810ac1ad>] __blocking_notifier_call_chain+0x4d/0x70
 [<ffffffff810ac1e6>] blocking_notifier_call_chain+0x16/0x20
 [<ffffffff814c748b>] fb_notifier_call_chain+0x1b/0x20
 [<ffffffff814c86b1>] register_framebuffer+0x251/0x330
 [<ffffffff815b7e8d>] ? vga_switcheroo_client_fb_set+0x5d/0x70
 [<ffffffff8158fa9f>] drm_fb_helper_initial_config+0x25f/0x3f0
 [<ffffffffa0106b48>] intel_fbdev_initial_config+0x18/0x30 [i915]
 [<ffffffff810adfd8>] async_run_entry_fn+0x48/0x150
 [<ffffffff810a3947>] process_one_work+0x1e7/0x750
 [<ffffffff810a38c9>] ? process_one_work+0x169/0x750
 [<ffffffff810a3efb>] worker_thread+0x4b/0x4f0
 [<ffffffff810a3eb0>] ? process_one_work+0x750/0x750
 [<ffffffff810aad4f>] kthread+0xef/0x110
 [<ffffffff819a85ef>] ret_from_fork+0x1f/0x40
 [<ffffffff810aac60>] ? kthread_stop+0x2e0/0x2e0

v2: Rebase onto the right branch (hand-editing patches ftw) and add more
reporters.

Reported-by: Imre Deak <imre.deak@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Jiri Kosina <jikos@kernel.org>
Cc: Jiri Kosina <jikos@kernel.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
popcornmix pushed a commit that referenced this issue Jan 22, 2018
When a tail call fails, it is documented that the tail call should
continue execution at the following instruction.  An example tail call
sequence is:

  12: (85) call bpf_tail_call#12
  13: (b7) r0 = 0
  14: (95) exit

The ARM assembler for the tail call in this case ends up branching to
instruction 14 instead of instruction 13, resulting in the BPF filter
returning a non-zero value:

  178:	ldr	r8, [sp, #588]	; insn 12
  17c:	ldr	r6, [r8, r6]
  180:	ldr	r8, [sp, #580]
  184:	cmp	r8, r6
  188:	bcs	0x1e8
  18c:	ldr	r6, [sp, #524]
  190:	ldr	r7, [sp, #528]
  194:	cmp	r7, #0
  198:	cmpeq	r6, #32
  19c:	bhi	0x1e8
  1a0:	adds	r6, r6, #1
  1a4:	adc	r7, r7, #0
  1a8:	str	r6, [sp, #524]
  1ac:	str	r7, [sp, #528]
  1b0:	mov	r6, #104
  1b4:	ldr	r8, [sp, #588]
  1b8:	add	r6, r8, r6
  1bc:	ldr	r8, [sp, #580]
  1c0:	lsl	r7, r8, #2
  1c4:	ldr	r6, [r6, r7]
  1c8:	cmp	r6, #0
  1cc:	beq	0x1e8
  1d0:	mov	r8, #32
  1d4:	ldr	r6, [r6, r8]
  1d8:	add	r6, r6, #44
  1dc:	bx	r6
  1e0:	mov	r0, #0		; insn 13
  1e4:	mov	r1, #0
  1e8:	add	sp, sp, #596	; insn 14
  1ec:	pop	{r4, r5, r6, r7, r8, sl, pc}

For other sequences, the tail call could end up branching midway through
the following BPF instructions, or maybe off the end of the function,
leading to unknown behaviours.

Fixes: 39c13c2 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
popcornmix pushed a commit that referenced this issue Jan 31, 2018
commit f4483f2 upstream.

When a tail call fails, it is documented that the tail call should
continue execution at the following instruction.  An example tail call
sequence is:

  12: (85) call bpf_tail_call#12
  13: (b7) r0 = 0
  14: (95) exit

The ARM assembler for the tail call in this case ends up branching to
instruction 14 instead of instruction 13, resulting in the BPF filter
returning a non-zero value:

  178:	ldr	r8, [sp, #588]	; insn 12
  17c:	ldr	r6, [r8, r6]
  180:	ldr	r8, [sp, #580]
  184:	cmp	r8, r6
  188:	bcs	0x1e8
  18c:	ldr	r6, [sp, #524]
  190:	ldr	r7, [sp, #528]
  194:	cmp	r7, #0
  198:	cmpeq	r6, #32
  19c:	bhi	0x1e8
  1a0:	adds	r6, r6, #1
  1a4:	adc	r7, r7, #0
  1a8:	str	r6, [sp, #524]
  1ac:	str	r7, [sp, #528]
  1b0:	mov	r6, #104
  1b4:	ldr	r8, [sp, #588]
  1b8:	add	r6, r8, r6
  1bc:	ldr	r8, [sp, #580]
  1c0:	lsl	r7, r8, #2
  1c4:	ldr	r6, [r6, r7]
  1c8:	cmp	r6, #0
  1cc:	beq	0x1e8
  1d0:	mov	r8, #32
  1d4:	ldr	r6, [r6, r8]
  1d8:	add	r6, r6, #44
  1dc:	bx	r6
  1e0:	mov	r0, #0		; insn 13
  1e4:	mov	r1, #0
  1e8:	add	sp, sp, #596	; insn 14
  1ec:	pop	{r4, r5, r6, r7, r8, sl, pc}

For other sequences, the tail call could end up branching midway through
the following BPF instructions, or maybe off the end of the function,
leading to unknown behaviours.

Fixes: 39c13c2 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
popcornmix pushed a commit that referenced this issue Jun 14, 2022
[ Upstream commit df5cd36 ]

When we boot a machine using a devicetree, the generic DT code goes
through all nodes with a 'device_type = "memory"' property, and collects
all memory banks mentioned there. However it does not check for the
status property, so any nodes which are explicitly "disabled" will still
be added as a memblock.
This ends up badly for QEMU, when booting with secure firmware on
arm/arm64 machines, because QEMU adds a node describing secure-only
memory:
===================
	secram@e000000 {
		secure-status = "okay";
		status = "disabled";
		reg = <0x00 0xe000000 0x00 0x1000000>;
		device_type = "memory";
	};
===================

The kernel will eventually use that memory block (which is located below
the main DRAM bank), but accesses to that will be answered with an
SError:
===================
[    0.000000] Internal error: synchronous external abort: 96000050 [#1] PREEMPT SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0-rc6-00014-g10c8acb8b679 #524
[    0.000000] Hardware name: linux,dummy-virt (DT)
[    0.000000] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.000000] pc : new_slab+0x190/0x340
[    0.000000] lr : new_slab+0x184/0x340
[    0.000000] sp : ffff80000a4b3d10
....
==================
The actual crash location and call stack will be somewhat random, and
depend on the specific allocation of that physical memory range.

As the DT spec[1] explicitly mentions standard properties, add a simple
check to skip over disabled memory nodes, so that we only use memory
that is meant for non-secure code to use.

That fixes booting a QEMU arm64 VM with EL3 enabled ("secure=on"), when
not using UEFI. In this case the QEMU generated DT will be handed on
to the kernel, which will see the secram node.
This issue is reproducible when using TF-A together with U-Boot as
firmware, then booting with the "booti" command.

When using U-Boot as an UEFI provider, the code there [2] explicitly
filters for disabled nodes when generating the UEFI memory map, so we
are safe.
EDK/2 only reads the first bank of the first DT memory node [3] to learn
about memory, so we got lucky there.

[1] https://github.com/devicetree-org/devicetree-specification/blob/main/source/chapter3-devicenodes.rst#memory-node (after the table)
[2] https://source.denx.de/u-boot/u-boot/-/blob/master/lib/fdtdec.c#L1061-1063
[3] https://github.com/tianocore/edk2/blob/master/ArmVirtPkg/PrePi/FdtParser.c

Reported-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220517101410.3493781-1-andre.przywara@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
sigmaris pushed a commit to sigmaris/linux that referenced this issue Jun 23, 2022
[ Upstream commit df5cd36 ]

When we boot a machine using a devicetree, the generic DT code goes
through all nodes with a 'device_type = "memory"' property, and collects
all memory banks mentioned there. However it does not check for the
status property, so any nodes which are explicitly "disabled" will still
be added as a memblock.
This ends up badly for QEMU, when booting with secure firmware on
arm/arm64 machines, because QEMU adds a node describing secure-only
memory:
===================
	secram@e000000 {
		secure-status = "okay";
		status = "disabled";
		reg = <0x00 0xe000000 0x00 0x1000000>;
		device_type = "memory";
	};
===================

The kernel will eventually use that memory block (which is located below
the main DRAM bank), but accesses to that will be answered with an
SError:
===================
[    0.000000] Internal error: synchronous external abort: 96000050 [raspberrypi#1] PREEMPT SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0-rc6-00014-g10c8acb8b679 raspberrypi#524
[    0.000000] Hardware name: linux,dummy-virt (DT)
[    0.000000] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.000000] pc : new_slab+0x190/0x340
[    0.000000] lr : new_slab+0x184/0x340
[    0.000000] sp : ffff80000a4b3d10
....
==================
The actual crash location and call stack will be somewhat random, and
depend on the specific allocation of that physical memory range.

As the DT spec[1] explicitly mentions standard properties, add a simple
check to skip over disabled memory nodes, so that we only use memory
that is meant for non-secure code to use.

That fixes booting a QEMU arm64 VM with EL3 enabled ("secure=on"), when
not using UEFI. In this case the QEMU generated DT will be handed on
to the kernel, which will see the secram node.
This issue is reproducible when using TF-A together with U-Boot as
firmware, then booting with the "booti" command.

When using U-Boot as an UEFI provider, the code there [2] explicitly
filters for disabled nodes when generating the UEFI memory map, so we
are safe.
EDK/2 only reads the first bank of the first DT memory node [3] to learn
about memory, so we got lucky there.

[1] https://github.com/devicetree-org/devicetree-specification/blob/main/source/chapter3-devicenodes.rst#memory-node (after the table)
[2] https://source.denx.de/u-boot/u-boot/-/blob/master/lib/fdtdec.c#L1061-1063
[3] https://github.com/tianocore/edk2/blob/master/ArmVirtPkg/PrePi/FdtParser.c

Reported-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220517101410.3493781-1-andre.przywara@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
popcornmix pushed a commit that referenced this issue May 30, 2023
After commit 27a2195 ("power: supply: core: auto-exposure
of simple-battery data") we call power_supply_get_battery_info() in
__power_supply_register(), but it causes test_battery crash with NULL
pointer:

[    7.524846] __power_supply_register: Expected proper parent device for 'test_battery'
[    7.524856] CPU 3 Unable to handle kernel paging request at virtual address 0000000000000278, era == 9000000002fb279c, ra == 9000000003173434
[    7.524862] Oops[#1]:
[    7.524866] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.3.0+ #524
[    7.524870] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022
[    7.524872] pc 9000000002fb279c ra 9000000003173434 tp 90000001001f0000 sp 90000001001f3c00
[    7.524875] a0 0000000000000000 a1 0000000000000000 a2 0000000000000000 a3 9000000004553e13
[    7.524878] a4 9000000004553e16 a5 ffffffffffffffff a6 9000000003fd0948 a7 0000000000000030
[    7.524881] t0 9000000003173434 t1 90000000035e2c00 t2 0000000000000001 t3 00000000fffff2b0
[    7.524883] t4 0000000000000007 t5 fffffffffffffffe t6 0000000000000000 t7 0000000000000010
[    7.524886] t8 00000000000000b4 u0 00000001c0840c76 s9 0000000000000000 s0 0000000000000000
[    7.524889] s1 900000000370c458 s2 0000000000000001 s3 9000000101f81000 s4 9000000101f81038
[    7.524891] s5 9000000101f813a0 s6 900000000370c398 s7 9000000003eebd18 s8 90000000037c0070
[    7.524894]    ra: 9000000003173434 power_supply_get_battery_info+0xe4/0x710
[    7.591224]   ERA: 9000000002fb279c __dev_fwnode+0x8/0x20
[    7.853583]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[    7.859740]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[    7.864073]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[    7.868839]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
[    7.873430] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
[    7.878884]  BADV: 0000000000000278
[    7.882346]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
[    7.888056] Modules linked in:
[    7.891088] Process swapper/0 (pid: 1, threadinfo=0000000091357ee8, task=00000000313d98cb)
[    7.899307] Stack : 00000000000000b4 900000000279aca8 0000000000000001 0000000000000000
[    7.907274]         0000000000000049 0000000000000408 9000000100008c00 90000000035b6370
[    7.915238]         900000000b801080 90000000029a440c 9000000003eb4680 9000000003eb4680
[    7.923202]         9000000101f81038 900000000357cdcc 0000000000000001 900000000370c458
[    7.931166]         9000000101f81000 90000000037c0070 9000000003eebd18 900000000370c398
[    7.939131]         90000000045f4530 9000000101f81038 0000000000000000 0000000000000001
[    7.947095]         900000000370c458 9000000101f81000 0000000000000000 9000000003173e64
[    7.955059]         0000000000000026 9000000003de3880 9000000100562040 900000000357e07c
[    7.963023]         000000000000026f 0000000000000001 90000001001f3d68 fffffffffffffff5
[    7.970987]         90000000036d2ca8 90000000027429d8 90000000037c0030 0000000000000001
[    7.978951]         ...
[    7.981379] Call Trace:
[    7.981382] [<9000000002fb279c>] __dev_fwnode+0x8/0x20
[    7.988916] [<9000000003173434>] power_supply_get_battery_info+0xe4/0x710
[    7.995666] [<9000000003173e64>] __power_supply_register+0x404/0x580
[    8.001984] [<900000000379d368>] test_power_init+0x6c/0x124
[    8.007527] [<90000000026e08f8>] do_one_initcall+0x58/0x1ec
[    8.013066] [<9000000003761614>] kernel_init_freeable+0x290/0x310
[    8.019127] [<90000000035b40e8>] kernel_init+0x24/0x11c
[    8.024324] [<90000000026e2048>] ret_from_kernel_thread+0xc/0xa4
[    8.030295]
[    8.031769] Code: 4c000020  0015002c  03400000 <28c9e08c> 40000d80  02c06184  4c000020  28ca0084  4c000020

Root cause: psy->dev.parent is NULL in power_supply_get_battery_info(),
so change the else branch to be 'else if (psy->dev.parent)' and return
-ENOENT if psy->dev.parent is NULL.

Fixes: 27a2195 ("power: supply: core: auto-exposure of simple-battery data")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants