CM4 hangs when performing large writes on PCI BARs mapped as normal memory #4928

Closed
TobleMiner opened this issue Mar 5, 2022 · 8 comments

@TobleMiner
Contributor

TobleMiner commented Mar 5, 2022

Describe the bug

Writes of more than ~256 bytes to PCI BARs that have been mapped as normal memory freeze the BCM2711.

Many drivers assume it is acceptable to ioremap PCI BARs as normal memory (ioremap_wc) rather than as device memory. However, performing large writes through such mappings freezes the BCM2711. I've tested this with a Raspberry Pi CM4, a carrier board, and multiple PCIe devices (e1000e NIC, SM750 GPU, AMD RX570 GPU).
The issue does not occur when the BAR is mapped as device memory. However, device memory comes with additional restrictions compared with normal memory, most notably the need for aligned accesses. Many callers do not expect this, especially when such memory is exposed to userspace via mmap (e.g. on framebuffer devices).
Other arm and arm64 platforms do not seem to have this issue (tested on Ampere Altra). Is this expected behaviour and a known limitation of the PCIe controller on the BCM2711?

Steps to reproduce the behaviour

  1. Insert a PCIe peripheral with at least one memory BAR into the carrier board
  2. Compile the PCI BAR write test kernel module
  3. Determine the PCI vendor and device id of your PCIe peripheral (lspci -nn)
  4. Pick a suitable memory BAR from the peripheral for the test (dmesg | grep BAR, any mem BAR >=1kB will be fine)
  5. Ensure the driver for the PCIe peripheral is not loaded
  6. While monitoring the kernel log, insert the bar write test kernel module (insmod bar-write-test.ko vendor_id=0x8086 device_id=0x10d3 bar=1, replace vendor, device id and bar with the ones from your peripheral)
  7. The Pi will freeze, showing Trying 2^9... as the last kernel log message

Repeat the same test, but insert the bar write test module with the additional parameter device_mem=1 (insmod bar-write-test.ko vendor_id=0x8086 device_id=0x10d3 bar=1 device_mem=1). This maps the BAR as device memory, and the test completes successfully without hanging the entire system.
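For readers without the module source at hand, the write-size sweep that the "Trying 2^n..." messages suggest can be sketched roughly as below. This is a userspace stand-in and not the actual bar-write-test module: the function name `sweep_write_sizes` is hypothetical, the "BAR" is just an ordinary buffer, and the fill pattern 0xA5 is an arbitrary choice. In a kernel module the memset would be a bulk write to the ioremapped BAR instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch of a write-size sweep: for every power of two from
 * 2^2 up to the size of the mapping, write that many bytes to the start
 * of the mapping. Returns the largest exponent that was written, or 0 if
 * the mapping is smaller than 4 bytes.
 */
static unsigned sweep_write_sizes(uint8_t *bar, size_t bar_size)
{
    unsigned last = 0;

    assert(bar != NULL);

    for (unsigned exp = 2;
         exp < 8 * sizeof(size_t) && ((size_t)1 << exp) <= bar_size;
         exp++) {
        size_t len = (size_t)1 << exp;

        printf("Trying 2^%u...\n", exp);
        /* Stand-in for the bulk write to the mapped BAR. */
        memset(bar, 0xA5, len);
        last = exp;
    }
    return last;
}
```

With a mapping of 0x00080000 bytes, as in the logs below, the sweep runs from 2^2 to 2^19, which matches the last size reported in the successful device-memory run.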

The bug can also be reproduced with the sm750fb driver and an SM750-based graphics card. Those tend to be quite expensive and hard to come by, though, so I consider the reproducer above more precise and accessible.

Device(s)

Raspberry Pi CM4

System

rpi-issue:

Raspberry Pi reference 2022-01-28
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, fbe448ccdc995d295d24c7596e5f0ef62cc2488f, stage

Firmware version: a26faf97e3bf76bcc23949d7cdab2f96f399a0c3 (clean) (release) (start)
Kernel version: 5.15.24-v8+

This has also been tested with Linux 5.10.92-v8+; the behaviour is exactly the same.

Logs

Log output was taken from serial with loglevel set to 9.

Log output when remapping as normal memory:

pi@raspberrypi:~$ sudo insmod /bar-write-test.ko vendor_id=0x8086 device_id=0x10d3 bar=1
[  521.810311] BAR1 has size 0x00080000
[  521.813952] Mapping BAR1 as regular memory
[  521.818127] Testing BAR write sizes
[  521.821650] Trying 2^2...
[  521.824280] Trying 2^3...
[  521.826926] Trying 2^4...
[  521.829574] Trying 2^5...
[  521.832202] Trying 2^6...
[  521.834848] Trying 2^7...
[  521.837500] Trying 2^8...
[  521.840136] Trying 2^9...

Log output when remapping as device memory:

pi@raspberrypi:~$ sudo insmod /bar-write-test.ko vendor_id=0x8086 device_id=0x10d3 bar=1 device_mem=1
[  463.080111] bar_write_test: loading out-of-tree module taints kernel.
[  463.087753] pci 0000:00:00.0: enabling device (0000 -> 0002)
[  463.093512] bar_write_test 0000:01:00.0: enabling device (0000 -> 0002)
[  463.100177] BAR1 has size 0x00080000
[  463.103770] Mapping BAR1 as device memory
[  463.107815] Testing BAR write sizes
[  463.111383] Trying 2^2...
[  463.114088] Trying 2^3...
[  463.116734] Trying 2^4...
[  463.119353] Trying 2^5...
[  463.121985] Trying 2^6...
[  463.124608] Trying 2^7...
[  463.127247] Trying 2^8...
[  463.129896] Trying 2^9...
[  463.132554] Trying 2^10...
[  463.135350] Trying 2^11...
[  463.138229] Trying 2^12...
[  463.141271] Trying 2^13...
[  463.144626] Trying 2^14...
[  463.148644] Trying 2^15...
[  463.153984] Trying 2^16...
[  463.161933] Trying 2^17...
[  463.175103] Trying 2^18...
[  463.198826] Trying 2^19...
[  463.243287] BAR seems to be writable at all sizes
[  463.248008] bar_write_test: probe of 0000:01:00.0 failed with error -1

Additional context

No response

@pelwell
Contributor

pelwell commented Mar 5, 2022

The BCM2711 only supports up to 32-bit accesses to the Root Complex - that is a fundamental hardware limitation. Only device drivers that map as device memory will work.

Closing as Can't Fix.
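To illustrate what the 32-bit restriction stated above means for code that fills a BAR, here is a minimal userspace sketch that copies a buffer using only aligned 32-bit stores. The helper name `copy_to_bar32` is hypothetical, and this is an assumption about one way to respect the limit rather than a documented workaround; a kernel driver would use the kernel's I/O accessors such as writel() on a device-memory mapping rather than plain pointer stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy len bytes from src to dst using only aligned 32-bit stores.
 * dst must be 4-byte aligned. If len is not a multiple of 4, the final
 * store is zero-padded, so the destination must have room for len
 * rounded up to the next multiple of 4.
 */
static void copy_to_bar32(volatile uint32_t *dst, const void *src, size_t len)
{
    const uint8_t *s = src;

    assert(((uintptr_t)dst & 3u) == 0);

    while (len >= 4) {
        uint32_t w;

        memcpy(&w, s, 4);   /* src may be unaligned, so load via memcpy */
        *dst++ = w;         /* exactly one aligned 32-bit store */
        s += 4;
        len -= 4;
    }
    if (len) {
        uint32_t w = 0;

        memcpy(&w, s, len); /* pad the trailing partial word with zeros */
        *dst = w;
    }
}
```

The destination pointer is declared volatile so that the compiler emits each 32-bit store individually instead of merging or widening them.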

@mi-hol

mi-hol commented Mar 8, 2022

The BCM2711 only supports up to 32-bit accesses to the Root Complex - that is a fundamental hardware limitation. Only device drivers that map as device memory will work.

This is a rather important fact, but I failed to locate this information in any document published by the foundation.
The search terms I used were site:raspberrypi.com +BCM2711 +hardware +limitation & site:raspberrypi.com +BCM2711 +hardware +limitation. Hopefully it's just my fault?

@geerlingguy

@mi-hol I don't think it's ever really been documented that thoroughly.

As time progresses, and the community has been testing more and more cards and devices on the Pi, these 'features' are being uncovered. Until the CM4, only a handful of people on the planet ever tried any PCI Express devices other than the VL805 chip on a BCM2711 (afaik).

It might be nice at some point to have more thorough documentation about what calls to avoid when working on drivers for the CM4 in particular, though. I've seen some interesting projects people are doing (or trying to do) with a CM4 where it's a decent fit, but some of the projects stall out when they run into these driver issues, and it's really a process of manual debugging.

I've been documenting the rough edges I've hit so far and hope to summarize most of what I learned in an upcoming post/video though. At least that'll be something when people search for a concise list!

@mi-hol

mi-hol commented Mar 10, 2022

some of the projects stall out when they run into these driver issues

@pelwell therefore these learnings should be added to RPi's official documentation. Just closing this issue without any follow-up on the documentation side doesn't feel right.

@pelwell
Contributor

pelwell commented Mar 10, 2022

From page 8 of the CM4 datasheet (https://datasheets.raspberrypi.com/cm4/cm4-datasheet.pdf):
[Image: cm4_pcie_warning]

@mi-hol

mi-hol commented Mar 10, 2022

@pelwell great to see it mentioned in the datasheet. From my point of view, though, it's a bit hidden there.
How about mentioning this limitation in a separate section in the obvious place?
Maybe the text could be modelled after an existing section like
https://www.raspberrypi.com/documentation/computers/compute-module.html#attaching-and-enabling-peripherals ?

@pelwell
Contributor

pelwell commented Mar 10, 2022

If you care about it that strongly, submit a pull request: https://github.com/raspberrypi/documentation/pulls

@m1geo

m1geo commented Feb 12, 2024

Does anyone know if this 32GB limit exists for the BCM2712 used on the Pi 5?
