Test GPU (ASRock Rack M2_VGA, based on SM750) #62

Closed
geerlingguy opened this issue Jan 24, 2021 · 75 comments

Comments

@geerlingguy
Owner

This is a fun one—it's definitely not the kind of GPU where people would ask "will it run Crysis", especially since a headline feature is 2D graphics acceleration (not even 3D yet!).

But the ASRock Rack M2_VGA is an M.2 form factor graphics card that sports a lone VGA port and 16MB (yeah, MB, not GB) of DDR graphics memory.

[Image: asrock-rack-m2-vga]

I doubt it will even be as fast as the built-in graphics on the Pi, but it would be interesting to see if it works. It uses the SiliconMotion SM750 graphics chip, which actually supports up to two DVI/HDMI/VGA displays, as well as two video inputs which can be overlaid on those outputs.

The chip is mostly known for being helpful in embedded or server graphics situations, and is not a 'powerhouse' by any means. Just a little utilitarian chip that sips less than 2W of power maximum (making it suitable for lower-power scenarios where you still need a display or two, but don't do gaming or ML/AI applications on it).

There seems to have been a mainline driver (SM750) for a few years now, and it would be interesting to see if it 'just works' (compared to the other cards). The chip itself appears to rely on a BIOS (and was designed in 2012), so that gives me a little pause.

But the chip is simple enough and documented enough that I wonder if we could bring it up manually if the driver starts barfing on memory allocations like all the other GPUs I've tested have (AMD and Nvidia).

@geerlingguy
Owner Author

Other server vendors also build the same chip (SM750) into full PCIe cards, e.g. Sunix's VGA0419L PCI-E card.

And in this video, a rep demos a USB version of the video card and a few other form factors.

Finally, some commentary from Phoronix seems to indicate the early versions of the driver were a bit rough (but hopefully adequate for ARM64... we'll see!).

@geerlingguy
Owner Author

geerlingguy commented Jan 24, 2021

I was hoping the card would leech power off the PCI-E bus but it looks like you have to use the included 4-pin molex to 3-pin jumper to power the board, otherwise it doesn't show up on the Pi at all.

I'm using an external / separate PSU and it looks like it puts 5V on the outer pins and GND in the middle.

Anyways, here's the deets:

pi@raspberrypi:~ $ sudo lspci -vvvv
...
01:00.0 VGA compatible controller: Silicon Motion, Inc. SM750 (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Silicon Motion, Inc. SM750
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 255
	Region 0: Memory at <unassigned> (32-bit, prefetchable) [disabled]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
		Vector table: BAR=5 offset=00000000
		PBA: BAR=5 offset=00000000
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [140 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-

@geerlingguy
Owner Author

$ dmesg
...
[    1.013251] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges:
[    1.013270] brcm-pcie fd500000.pcie:   No bus range found for /scb/pcie@7d500000, using [bus 00-ff]
[    1.013327] brcm-pcie fd500000.pcie:      MEM 0x0600000000..0x0603ffffff -> 0x00f8000000
[    1.013383] brcm-pcie fd500000.pcie:   IB MEM 0x0000000000..0x00ffffffff -> 0x0100000000
[    1.047197] brcm-pcie fd500000.pcie: link up, 2.5 GT/s x1 (SSC)
[    1.047488] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00
[    1.047503] pci_bus 0000:00: root bus resource [bus 00-ff]
[    1.047520] pci_bus 0000:00: root bus resource [mem 0x600000000-0x603ffffff] (bus address [0xf8000000-0xfbffffff])
[    1.047572] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400
[    1.047792] pci 0000:00:00.0: PME# supported from D0 D3hot
[    1.051200] pci 0000:00:00.0: bridge configuration invalid ([bus ff-ff]), reconfiguring
[    1.051391] pci 0000:01:00.0: [126f:0750] type 00 class 0x030000
[    1.051459] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x03ffffff pref]
[    1.051487] pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x001fffff]
[    1.051576] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[    1.051768] pci 0000:01:00.0: supports D1
[    1.051779] pci 0000:01:00.0: PME# supported from D0 D1 D3hot
[    1.051973] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    1.055078] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    1.055115] pci 0000:00:00.0: BAR 8: no space for [mem size 0x06000000]
[    1.055127] pci 0000:00:00.0: BAR 8: failed to assign [mem size 0x06000000]
[    1.055143] pci 0000:01:00.0: BAR 0: no space for [mem size 0x04000000 pref]
[    1.055155] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x04000000 pref]
[    1.055166] pci 0000:01:00.0: BAR 1: no space for [mem size 0x00200000]
[    1.055178] pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00200000]
[    1.055190] pci 0000:01:00.0: BAR 6: no space for [mem size 0x00010000 pref]
[    1.055201] pci 0000:01:00.0: BAR 6: failed to assign [mem size 0x00010000 pref]
[    1.055214] pci 0000:00:00.0: PCI bridge to [bus 01]

So I might need to expand the BAR space. Good thing I don't see any IO BAR in there 😅
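
As a quick sanity check on why those assignments fail, here is a back-of-the-envelope calculation in Python that uses only the numbers from the log above. The helper name `window_size` is made up for illustration.

```python
# Outbound PCIe memory window reported by brcm-pcie in the log above.
WINDOW_START = 0x0600000000
WINDOW_END = 0x0603FFFFFF

# Sizes the kernel tried (and failed) to assign, taken from the BAR lines.
REQUESTS = {
    "bridge BAR 8": 0x06000000,
    "BAR 0 (pref)": 0x04000000,
    "BAR 1": 0x00200000,
    "BAR 6 (ROM, pref)": 0x00010000,
}

MIB = 1024 * 1024


def window_size(start: int, end: int) -> int:
    """Size in bytes of an inclusive address range."""
    return end - start + 1


if __name__ == "__main__":
    size = window_size(WINDOW_START, WINDOW_END)
    print(f"window: {size / MIB:g} MiB")
    for name, req in REQUESTS.items():
        print(f"{name}: {req / MIB:g} MiB")
    print("bridge window fits:", REQUESTS["bridge BAR 8"] <= size)
```

The window works out to 64 MiB while the bridge asks for 96 MiB, and BAR 0 alone would already fill the whole window, which would explain why nothing behind the bridge gets an address.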

Anyways, that's all for now, I'm going to recompile the kernel later (maybe tomorrow, maybe next week) and add in the SM750 driver, and see where we end up from there. All using the 64-bit Pi OS beta for now.

@geerlingguy
Owner Author

geerlingguy commented Jan 24, 2021

Enabled:

Device Drivers > Staging drivers > Silicon Motion SM750 framebuffer support

Cross-compiling on kernel rpi-5.10.y branch.

@PixlRainbow

as well as two video inputs which can be overlaid on those outputs.

Pi HUD? 🤔

@69K-ram

69K-ram commented Jan 28, 2021

I was just about to suggest this "GPU" to you after seeing the LTT video about it, but you already have one lol.
Good luck! Maybe this one is so low spec it will just work?

@geerlingguy
Owner Author

Reference for anyone who doesn't subscribe to LTT: THIS is a Graphics Card..??

(I wish I had a team of tech nuts who could do all the testing and script writing for me... just jelly :)

@geerlingguy
Owner Author

Well that's annoying. After recompiling the kernel and booting with the VGA card powered up and plugged in, the Pi starts booting to the point where the display starts flashing the cursor... then completely freezes.

Here's what that looks like:

[Image: IMG_3508]

(That signal's coming through the HDMI port on the IO Board, not from the VGA output.)

I've tried rebooting multiple times, but it always gets stuck a few seconds in and then there's no response from anything. If I turn off power to the VGA card, it boots just fine.

@geerlingguy
Owner Author

I used sudo raspi-config to boot to console (instead of GUI), and also to disable the splash screen, and then rebooted with the VGA card enabled so I could see what was coming out in the console.

Every single time it gets stuck around here:

[Image: IMG_3509]

[   OK   ] Started Forward Password Requests to Plymouth Directory Watch.

So maybe it's something trying to initialize graphics rendering (the Plymouth Boot Screen...). The annoying thing is there doesn't seem to be a separate driver for the SM750, so I can't just disable it using a modprobe blacklist.

This could be the shortest test cycle ever—I can't yet think of anything else to try :/

@geerlingguy
Owner Author

Just noting that without the card powered up, when it boots, it hits that 'Show Plymouth Boot Screen' in the log, then the screen flashes to black, then the screen comes back, outputs some more startup data, then ends with:

           Starting Terminate Plymouth Boot Screen...

So something in the Plymouth Boot Screen process seems to kinda explode when this card is active.

@geerlingguy
Owner Author

Uninstalling Plymouth, since I don't see how to completely disable it otherwise (all the online guides assume you're using GRUB, but Pi OS does not):

sudo apt-get purge -y 'plymouth*'

@geerlingguy
Owner Author

Well, now it locks up completely at:

[   OK   ] Started udev Kernel Device Manager

So at least it doesn't lock up at the Plymouth stage anymore :D

@geerlingguy
Owner Author

I'm recompiling the kernel without Silicon Motion SM750 framebuffer support to see if it makes a difference.

@geerlingguy
Owner Author

Okay, so without that compiled in, the thing boots with the board powered up... and even better, with the updated kernel (which has the better BAR space mapping committed a few months ago), the registers all look good:

[    1.245812] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges:
[    1.247456] brcm-pcie fd500000.pcie:   No bus range found for /scb/pcie@7d500000, using [bus 00-ff]
[    1.249244] brcm-pcie fd500000.pcie:      MEM 0x0600000000..0x063fffffff -> 0x00c0000000
[    1.251088] brcm-pcie fd500000.pcie:   IB MEM 0x0000000000..0x00ffffffff -> 0x0100000000
[    1.286525] brcm-pcie fd500000.pcie: link up, 2.5 GT/s PCIe x1 (SSC)
[    1.287844] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00
[    1.288874] pci_bus 0000:00: root bus resource [bus 00-ff]
[    1.289826] pci_bus 0000:00: root bus resource [mem 0x600000000-0x63fffffff] (bus address [0xc0000000-0xffffffff])
[    1.291763] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400
[    1.292968] pci 0000:00:00.0: PME# supported from D0 D3hot
[    1.297601] pci 0000:00:00.0: bridge configuration invalid ([bus ff-ff]), reconfiguring
[    1.299700] pci 0000:01:00.0: [126f:0750] type 00 class 0x030000
[    1.300732] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x03ffffff pref]
[    1.302611] pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x001fffff]
[    1.303687] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[    1.305835] pci 0000:01:00.0: supports D1
[    1.306794] pci 0000:01:00.0: PME# supported from D0 D1 D3hot
[    1.307945] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    1.323777] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    1.324778] pci 0000:00:00.0: BAR 8: assigned [mem 0x600000000-0x605ffffff]
[    1.326596] pci 0000:01:00.0: BAR 0: assigned [mem 0x600000000-0x603ffffff pref]
[    1.328471] pci 0000:01:00.0: BAR 1: assigned [mem 0x604000000-0x6041fffff]
[    1.330393] pci 0000:01:00.0: BAR 6: assigned [mem 0x604200000-0x60420ffff pref]
[    1.332347] pci 0000:00:00.0: PCI bridge to [bus 01]
[    1.333381] pci 0000:00:00.0:   bridge window [mem 0x600000000-0x605ffffff]
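
For anyone who wants to double-check a log like this without squinting at hex, here is a small Python sketch that parses the "assigned" lines and confirms each device BAR sits inside the bridge window. The four sample lines are copied from the output above; the regex and the helper names (`parse_assigned`, `inside`) are just illustrative.

```python
import re

# A few lines copied from the dmesg output above.
DMESG = """\
pci 0000:00:00.0: BAR 8: assigned [mem 0x600000000-0x605ffffff]
pci 0000:01:00.0: BAR 0: assigned [mem 0x600000000-0x603ffffff pref]
pci 0000:01:00.0: BAR 1: assigned [mem 0x604000000-0x6041fffff]
pci 0000:01:00.0: BAR 6: assigned [mem 0x604200000-0x60420ffff pref]
"""

BAR_RE = re.compile(
    r"pci (?P<dev>\S+): BAR (?P<bar>\d+): assigned "
    r"\[mem 0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)


def parse_assigned(text):
    """Return a list of (device, bar, start, end) for each 'assigned' line."""
    out = []
    for m in BAR_RE.finditer(text):
        out.append((m["dev"], int(m["bar"]), int(m["start"], 16), int(m["end"], 16)))
    return out


def inside(child, parent):
    """True if child's inclusive [start, end] range sits within parent's."""
    return parent[2] <= child[2] and child[3] <= parent[3]


if __name__ == "__main__":
    bars = parse_assigned(DMESG)
    bridge, *children = bars
    for dev, bar, start, end in bars:
        print(f"{dev} BAR {bar}: {(end - start + 1) // 1024} KiB")
    print("all inside bridge window:", all(inside(c, bridge) for c in children))
```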

Going to do some testing and see what happens with X.org.

@geerlingguy
Owner Author

Well, nothing. It only outputs over HDMI.

@geerlingguy
Owner Author

Hmm... driver being worked on over here: https://gitlab.com/sudipm/sm750/-/tree/sm750/drivers/gpu/drm/sm750 (from 2 years ago, ha!).

See: https://github.com/raspberrypi/linux/blob/rpi-5.10.y/drivers/staging/sm750fb/TODO

I emailed one of the contacts listed in the driver docs... who knows if I'll get a response.

@geerlingguy
Owner Author

So... I tried pulling the patch from that drm driver and popping it into the 5.10.y branch—but when I build I get:

  CC [M]  drivers/gpu/drm/sm750/smi_drv.o
drivers/gpu/drm/sm750/smi_drv.c:12:10: fatal error: drm/drmP.h: No such file or directory
 #include <drm/drmP.h>
          ^~~~~~~~~~~~
compilation terminated.
make[4]: *** [scripts/Makefile.build:279: drivers/gpu/drm/sm750/smi_drv.o] Error 1
make[3]: *** [scripts/Makefile.build:496: drivers/gpu/drm/sm750] Error 2
make[3]: *** Waiting for unfinished jobs....

So it looks like that driver may be built for an older kernel version and isn't compatible with the latest kernel :/

@geerlingguy
Owner Author

Looks like that change was in kernel 5.5—so maybe running the current stable kernel version (5.4) would let it work? It's worth a try... See DisplayLink/evdi#185

@geerlingguy
Owner Author

Yeah... so doing a few manual patches to that patch to try to make it compile leads to dozens more compile errors, so I'm going to drop trying to get it working on 5.10.y, and jump back to 5.4.y.

@geerlingguy
Owner Author

geerlingguy commented Jan 29, 2021

With the patch in place, the kernel build on 5.4.y seemed to work. Copied it over to the Pi, rebooted and...

[    1.324708] pci 0000:00:00.0: BAR 8: no space for [mem size 0x06000000]
[    1.325653] pci 0000:00:00.0: BAR 8: failed to assign [mem size 0x06000000]
[    1.327425] pci 0000:01:00.0: BAR 0: no space for [mem size 0x04000000 pref]
[    1.329274] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x04000000 pref]
[    1.331175] pci 0000:01:00.0: BAR 1: no space for [mem size 0x00200000]
[    1.332164] pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00200000]
[    1.334106] pci 0000:01:00.0: BAR 6: no space for [mem size 0x00010000 pref]
[    1.336053] pci 0000:01:00.0: BAR 6: failed to assign [mem size 0x00010000 pref]

Have to adjust the BAR space manually on this old kernel.

With 1 GB BAR space, getting:

[    1.324861] pci 0000:00:00.0: BAR 8: assigned [mem 0x600000000-0x605ffffff]
[    1.326682] pci 0000:01:00.0: BAR 0: assigned [mem 0x600000000-0x603ffffff pref]
[    1.328536] pci 0000:01:00.0: BAR 1: assigned [mem 0x604000000-0x6041fffff]
[    1.330481] pci 0000:01:00.0: BAR 6: assigned [mem 0x604200000-0x60420ffff pref]

...but nothing in dmesg mentioning SM750, nor anything out of the VGA output. So... how do I get Linux/RPi to recognize the card using the new driver? Maybe I have to manually load it in somehow?

@geerlingguy
Owner Author

Oh... well... that's not right:

$ uname -r
5.10.9-v8+

@geerlingguy
Owner Author

Ugh... I can't get these drivers to build for the life of me!

In file included from drivers/gpu/drm/sm750/smi_drv.c:15:
drivers/gpu/drm/sm750/smi_drv.h:178:32: error: field ‘mem_global_ref’ has incomplete type
    struct drm_global_reference mem_global_ref;
                                ^~~~~~~~~~~~~~
drivers/gpu/drm/sm750/smi_drv.h:179:29: error: field ‘bo_global_ref’ has incomplete type
    struct ttm_bo_global_ref bo_global_ref;
                             ^~~~~~~~~~~~~
drivers/gpu/drm/sm750/smi_drv.h:296:50: warning: ‘struct reservation_object’ declared inside parameter list will not be visible outside of this definition or declaration
      uint32_t flags, struct sg_table *sg, struct reservation_object *resv, struct smi_bo **psmibo);
                                                  ^~~~~~~~~~~~~~~~~~
drivers/gpu/drm/sm750/smi_drv.c: In function ‘smi_drm_freeze’:
drivers/gpu/drm/sm750/smi_drv.c:107:2: error: implicit declaration of function ‘drm_kms_helper_poll_disable’; did you mean ‘drm_fb_helper_pan_display’? [-Werror=implicit-function-declaration]
  drm_kms_helper_poll_disable(dev);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  drm_fb_helper_pan_display
drivers/gpu/drm/sm750/smi_drv.c: In function ‘smi_drm_resume’:
drivers/gpu/drm/sm750/smi_drv.c:154:2: error: implicit declaration of function ‘drm_kms_helper_poll_enable’; did you mean ‘drm_fb_helper_fill_info’? [-Werror=implicit-function-declaration]
  drm_kms_helper_poll_enable(dev);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~
  drm_fb_helper_fill_info
drivers/gpu/drm/sm750/smi_drv.c:144:21: warning: unused variable ‘sdev’ [-Wunused-variable]
  struct smi_device *sdev = dev->dev_private;
                     ^~~~
drivers/gpu/drm/sm750/smi_drv.c: At top level:
drivers/gpu/drm/sm750/smi_drv.c:276:39: error: ‘DRIVER_IRQ_SHARED’ undeclared here (not in a function); did you mean ‘TIMER_IRQSAFE’?
  .driver_features = DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM |
                                       ^~~~~~~~~~~~~~~~~
                                       TIMER_IRQSAFE
drivers/gpu/drm/sm750/smi_drv.c:277:9: error: ‘DRIVER_PRIME’ undeclared here (not in a function); did you mean ‘DRIVER_NAME’?
         DRIVER_PRIME | DRIVER_MODESET,
         ^~~~~~~~~~~~
         DRIVER_NAME
drivers/gpu/drm/sm750/smi_drv.c:309:3: error: ‘struct drm_driver’ has no member named ‘gem_prime_res_obj’; did you mean ‘gem_prime_export’?
  .gem_prime_res_obj = smi_gem_prime_res_obj,
   ^~~~~~~~~~~~~~~~~
   gem_prime_export
drivers/gpu/drm/sm750/smi_drv.c:309:23: error: initialization of ‘struct sg_table * (*)(struct drm_gem_object *)’ from incompatible pointer type ‘struct reservation_object * (*)(struct drm_gem_object *)’ [-Werror=incompatible-pointer-types]
  .gem_prime_res_obj = smi_gem_prime_res_obj,
                       ^~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/sm750/smi_drv.c:309:23: note: (near initialization for ‘driver.gem_prime_get_sg_table’)
In file included from drivers/gpu/drm/sm750/smi_fbdev.c:15:
drivers/gpu/drm/sm750/smi_drv.h:178:32: error: field ‘mem_global_ref’ has incomplete type
    struct drm_global_reference mem_global_ref;
                                ^~~~~~~~~~~~~~
drivers/gpu/drm/sm750/smi_drv.h:179:29: error: field ‘bo_global_ref’ has incomplete type
    struct ttm_bo_global_ref bo_global_ref;
                             ^~~~~~~~~~~~~
drivers/gpu/drm/sm750/smi_drv.h:296:50: warning: ‘struct reservation_object’ declared inside parameter list will not be visible outside of this definition or declaration
      uint32_t flags, struct sg_table *sg, struct reservation_object *resv, struct smi_bo **psmibo);
                                                  ^~~~~~~~~~~~~~~~~~
drivers/gpu/drm/sm750/smi_fbdev.c: In function ‘smifb_create_object’:
drivers/gpu/drm/sm750/smi_fbdev.c:48:21: warning: unused variable ‘cdev’ [-Wunused-variable]
  struct smi_device *cdev = dev->dev_private;
                     ^~~~
cc1: some warnings being treated as errors
make[4]: *** [scripts/Makefile.build:266: drivers/gpu/drm/sm750/smi_drv.o] Error 1
make[4]: *** Waiting for unfinished jobs....
drivers/gpu/drm/sm750/smi_fbdev.c: In function ‘smifb_create’:
drivers/gpu/drm/sm750/smi_fbdev.c:161:2: error: implicit declaration of function ‘drm_fb_helper_fill_fix’; did you mean ‘drm_fb_helper_fill_info’? [-Werror=implicit-function-declaration]
  drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
  ^~~~~~~~~~~~~~~~~~~~~~
  drm_fb_helper_fill_info
  AR      drivers/i3c/built-in.a
drivers/gpu/drm/sm750/smi_fbdev.c:163:2: error: implicit declaration of function ‘drm_fb_helper_fill_var’; did you mean ‘drm_fb_helper_fill_info’? [-Werror=implicit-function-declaration]
  drm_fb_helper_fill_var(info, &gfbdev->helper, sizes->fb_width,
  ^~~~~~~~~~~~~~~~~~~~~~
  drm_fb_helper_fill_info
drivers/gpu/drm/sm750/smi_fbdev.c:175:58: error: ‘struct ttm_buffer_object’ has no member named ‘vma_node’
  drm_vma_offset_remove(&bo->bo.bdev->vma_manager, &bo->bo.vma_node);
                                                          ^
  CC [M]  drivers/gpu/drm/tiny/ili9341.o
drivers/gpu/drm/sm750/smi_fbdev.c: In function ‘smi_fbdev_destroy’:
drivers/gpu/drm/sm750/smi_fbdev.c:205:4: error: implicit declaration of function ‘drm_gem_object_unreference_unlocked’; did you mean ‘drm_gem_object_put_unlocked’? [-Werror=implicit-function-declaration]
    drm_gem_object_unreference_unlocked(gfb->obj);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    drm_gem_object_put_unlocked
  AR      drivers/idle/built-in.a
cc1: some warnings being treated as errors
make[4]: *** [scripts/Makefile.build:266: drivers/gpu/drm/sm750/smi_fbdev.o] Error 1
make[3]: *** [scripts/Makefile.build:500: drivers/gpu/drm/sm750] Error 2
make[3]: *** Waiting for unfinished jobs....

I think I may have to throw in the towel on this GPU as well.

Why are all the [okay but not great] working drivers only written for Windows?

@69K-ram

69K-ram commented Jan 29, 2021

Respect for even getting this far, I wouldn't have the slightest idea where to start, and I sure wouldn't have tried that long.

In the LTT video, he mentions it's using a Windows feature called WARP. And the Wikipedia article about WARP says it uses "just-in-time compilation to x86 machine code". Is it possible the driver is dependent on x86 instructions or something like that? I have no idea how the Linux driver for it works though.

@PixlRainbow

The thing is though, WARP is a DirectX feature, not a driver feature.

@Coreforge

Could you try disabling fbdev emulation? Since it crashes after udev, it could be the console trying to access the fbdev, which could lock up the whole Pi if something doesn't work, such as 64-bit writes. You'll have to add
drm_kms_helper.fbdev_emulation=0
to cmdline.txt. Most guides that mention GRUB are also just changing the kernel command line, so you can do the same by adding the arguments to cmdline.txt.
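
If you'd rather script that change than edit by hand, something like the following Python sketch could work. It only does the string handling; `add_kernel_param` is a made-up helper, and the sample command line in the demo is hypothetical, not taken from a real Pi. The one constraint worth remembering is that cmdline.txt has to stay a single line.

```python
def add_kernel_param(cmdline: str, param: str) -> str:
    """Return cmdline with param appended, unless it is already present.

    cmdline.txt must stay a single line, so everything is joined with
    single spaces and the result ends with exactly one newline.
    """
    params = cmdline.split()
    if param not in params:
        params.append(param)
    return " ".join(params) + "\n"


if __name__ == "__main__":
    # Hypothetical example contents; a real cmdline.txt will differ.
    before = "console=tty1 root=/dev/mmcblk0p2 rootwait\n"
    print(add_kernel_param(before, "drm_kms_helper.fbdev_emulation=0"), end="")
```

Running it twice on the same text leaves the line unchanged, so it is safe to re-run after a kernel rebuild.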

@geerlingguy
Owner Author

I added drm_kms_helper.fbdev_emulation=0 to /boot/cmdline.txt on the Pi, and am building the latest 5.10.y kernel with the SM750 staging driver... we'll see how this goes.

@geerlingguy
Owner Author

@Coreforge - Sadly, that seems to have not made any difference:

[Video: udev-lockup.mov]

@paulwratt

(I wish I was in a position to help out with some of this diagnosis)

geerlingguy changed the title from "Test GPU (ASRock Rack M2_VGA)" to "Test GPU (ASRock Rack M2_VGA, based on SM750)" on Feb 23, 2022

@TobleMiner

TobleMiner commented Mar 5, 2022

I opened an issue detailing this behaviour on raspberrypi/linux and got a response from a former Broadcom, now Raspberry Pi foundation engineer: raspberrypi/linux#4928 (comment)

It seems to boil down to the fact that the PCIe root complex on the BCM2711 only supports aligned accesses of up to 32 bits, so remapping PCIe BARs as normal memory will never work properly: no one expects normal memory to have alignment requirements. As a result, a whole bunch of PCI drivers, and the userspace software using those drivers, will just not work on the Pi 4 unless they are rewritten to do only aligned accesses. That would be a huge amount of work though and not useful beyond just "making it work" on the Pi.
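
To picture what that restriction means for a driver, here is a toy Python model. It is not kernel code, and treating "up to 32 bits" as 8-, 16- and 32-bit naturally aligned accesses is an interpretation, not something stated in the linked comment. `access_ok` and `split_access` are invented names.

```python
def access_ok(addr: int, size: int) -> bool:
    """True if a single access of `size` bytes at `addr` fits the rule.

    Assumed rule: 1-, 2- or 4-byte accesses, each naturally aligned.
    """
    return size in (1, 2, 4) and addr % size == 0


def split_access(addr: int, size: int):
    """Break one access into pieces that all satisfy access_ok().

    Greedy: at each step take the largest naturally aligned chunk (4, 2 or
    1 bytes) that fits in what is left.
    """
    pieces = []
    while size > 0:
        for chunk in (4, 2, 1):
            if chunk <= size and addr % chunk == 0:
                pieces.append((addr, chunk))
                addr += chunk
                size -= chunk
                break
    return pieces


if __name__ == "__main__":
    print(access_ok(0x1000, 8))     # a 64-bit write is rejected
    print(split_access(0x1000, 8))  # it becomes two 32-bit writes
    print(split_access(0x1001, 4))  # unaligned 32-bit: 1 + 2 + 1 bytes
```

The painful part in practice is that the splitting has to happen inside every driver and library that touches the BAR, since the CPU will happily issue wider or unaligned accesses to anything mapped as normal memory.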

@pelwell

pelwell commented Mar 6, 2022

the BCM2711 can only support aligned 32bit access.

It supports aligned accesses up to 32 bits. The linked comment has been amended.

@TobleMiner

Ah, sorry for the confusion. I've now edited the comment to reflect that, thanks for bringing it up!

@geerlingguy
Owner Author

That would be a huge amount of work though and not useful beyond just "making it work" on the Pi.

I don't think anyone has the idea it would be work that would be mainlined at any point, but there are some use cases where a particular driver or device is useful to get working on the Pi — for example people running storage controllers for disk storage on the Pi using old HBA cards — so it's nice to know all the corner cases where just a few lines of modified code will fix it.

I think for simpler graphics cards like SM750-based cards, it might be feasible to maintain a patch (especially considering the driver hasn't changed in years) that gives full or close to full functionality, for the few crazy people who want to use it (e.g. for adding more displays or using any of the 2D rendering built in). Heck, maybe some casino startup wants to start building Pi slot machines :D

@geerlingguy
Owner Author

All right, so I managed to apply @TobleMiner's patch in this comment to the latest kernel source, and compile the kernel with CONFIG_FB_SM750=m (under Device Drivers -> Staging drivers -> Silicon Motion SM750 framebuffer support in menuconfig).

I tried booting an image with the full Pi OS and window manager, but when it initialized (with one HDMI display in HDMI0 and VGA connected to the M2_VGA card), it seemed to lock up. Got further than usual though. I'm going to try just a console version.

@Coreforge

I don't think a window manager is fully working on any GPU yet. Instead of reflashing, just booting it up without the GPU and disabling graphical boot in raspi-config should be faster and has the same effect.

@geerlingguy
Owner Author

geerlingguy commented Mar 16, 2022

@Coreforge - Heh, too late ;)

I was also working on my build script a tiny bit.

Before:

pi@m2:~ $ uname -a
Linux m2 5.10.92-v8+ #1514 SMP PREEMPT Mon Jan 17 17:39:38 GMT 2022 aarch64 GNU/Linux

After:

pi@m2:~ $ uname -a
Linux m2 5.15.28-v8+ #1 SMP PREEMPT Wed Mar 16 21:43:17 UTC 2022 aarch64 GNU/Linux

pi@m2:~ $ cat /sys/class/graphics/fb0/virtual_size
1024,768

[Image: IMG_0941]

Pardon the crusty display. The other one I have with VGA input is in use but will soon be rotated out of that position!

@geerlingguy
Owner Author

I noticed the SM750 driver has a number of memset_io calls that would need to be adjusted to work with the Pi SoC just like the other cards @Coreforge was working on (e.g. https://github.com/geerlingguy/linux/pull/1/files).
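
As a rough illustration of what such an adjustment has to achieve (this is a Python model, not the actual patch), the fill below only ever issues single-byte writes for the unaligned head and tail and aligned 32-bit writes for the bulk. The assumption here is that the stock arm64 `memset_io` may use 64-bit stores for the aligned middle of a buffer, which is what the Pi's PCIe can't handle. `fill_io32` is a hypothetical name.

```python
import struct


def fill_io32(mem: bytearray, offset: int, value: int, count: int):
    """Fill `count` bytes at `offset` with the byte `value`.

    Mimics a memset that only issues byte writes (unaligned head and tail)
    and aligned 32-bit writes (bulk). Returns the list of (offset, size)
    writes so the access pattern can be inspected.
    """
    writes = []
    end = offset + count
    byte = value & 0xFF
    word = struct.pack("<I", byte * 0x01010101)  # the byte repeated 4 times

    # Head: single bytes until the offset is 4-byte aligned.
    while offset < end and offset % 4:
        mem[offset] = byte
        writes.append((offset, 1))
        offset += 1

    # Body: aligned 32-bit words.
    while end - offset >= 4:
        mem[offset:offset + 4] = word
        writes.append((offset, 4))
        offset += 4

    # Tail: whatever bytes are left.
    while offset < end:
        mem[offset] = byte
        writes.append((offset, 1))
        offset += 1

    return writes


if __name__ == "__main__":
    mem = bytearray(16)
    print(fill_io32(mem, 3, 0xAA, 10))
    print(mem.hex())
```

In the kernel the equivalent would be a loop built on the 8-bit and 32-bit MMIO write helpers, but the exact shape depends on the driver, so treat this purely as a picture of the access pattern.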

@geerlingguy
Owner Author

Working on a patch here: geerlingguy/linux#2 (so far it just has the fb console working as @TobleMiner had earlier).

@geerlingguy
Owner Author

With just the patch from @TobleMiner I was getting a weird artifact after the blinking cursor in the console over VGA. I updated my patch (see link above) with a few more memset swaps, and it seems like those artifacts are gone. Going to try on X to see if I can get a desktop.

@geerlingguy
Owner Author

geerlingguy commented Apr 12, 2022

Rebooting with X and the driver loaded results in the screen hanging at some point (prior to seeing any possible errors), so I added:

echo "blacklist sm750fb" | sudo tee /etc/modprobe.d/blacklist-sm750fb.conf

Then after boot, I ran:

sudo modprobe sm750fb

Dmesg shows:

[  142.774551] sm750fb: module is from the staging directory, the quality is unknown, you have been warned.
[  142.776466] no options.
[  142.777017] pci 0000:00:00.0: enabling device (0000 -> 0002)
[  142.777073] sm750fb 0000:01:00.0: enabling device (0000 -> 0002)
[  142.777112] sm750fb 0000:01:00.0: no specific g_option.
[  142.777129] mmio phyAddr = 604000000
[  142.777191] mmio virtual addr = 0000000054575e1d
[  142.777216] video memory phyAddr = 600000000, size = 16777216 bytes
[  142.777247] video memory vaddr = 000000009bddbcb5
[  143.369500] use simul primary mode
[  143.369519] crtc->cursor.mmio = 00000000f44c08d1
[  143.369565] ret = 5,fb_find_mode failed,with driver prepared modes
[  143.369575] success! use specified mode:1024x768-32@60 in kernel prepared default modedb
[  143.369582] Member of info->var is :
               xres=1024
               yres=768
               xres_virtual=1024
               yres_virtual=768
               xoffset=0
               yoffset=0
               bits_per_pixel=32
                ...
[  143.369591] fix->smem_start = 600000000
[  143.369597] fix->smem_len = 1000000
[  143.369603] fix->mmio_start = 604000000
[  143.369609] fix->mmio_len = 200000
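
Those numbers are at least self-consistent. A quick Python check, using only values from the log above, shows how much of the 16 MB of video memory a 1024x768 32-bit mode needs. Reading `fix->smem_len` as hexadecimal is an assumption, though it matches the 16777216 bytes printed earlier in the same log.

```python
# Values taken from the sm750fb dmesg output above.
XRES, YRES, BITS_PER_PIXEL = 1024, 768, 32
SMEM_LEN = 0x1000000  # fix->smem_len, read as hex (16777216 bytes)


def framebuffer_bytes(xres: int, yres: int, bpp: int) -> int:
    """Bytes needed for one frame, assuming tightly packed scanlines."""
    return xres * yres * (bpp // 8)


if __name__ == "__main__":
    frame = framebuffer_bytes(XRES, YRES, BITS_PER_PIXEL)
    print("bytes per frame:", frame)
    print("whole frames that fit:", SMEM_LEN // frame)
```

One frame is 3 MiB, so the mode itself fits comfortably in video memory and the blank VGA output is probably not a memory-size problem.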

However, the VGA input on my display shows nothing. Via SSH, I see:

pi@m2x:~ $ sudo xrandr -q
Can't open display 

(Note: Within X, I can see xrandr -q outputting the HDMI-1 and HDMI-2 info, but not VGA.)

Module seems loaded:

01:00.0 VGA compatible controller: Silicon Motion, Inc. SM750 (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Silicon Motion, Inc. SM750
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 63
	Region 0: Memory at 600000000 (32-bit, prefetchable) [size=64M]
	Region 1: Memory at 604000000 (32-bit, non-prefetchable) [size=2M]
	Expansion ROM at 604200000 [virtual] [disabled] [size=64K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s (ok), Width x1 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+ NROPrPrP- LTR-
			 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
		Vector table: BAR=5 offset=00000000
		PBA: BAR=5 offset=00000000
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [140 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Kernel driver in use: sm750fb
	Kernel modules: sm750fb

@geerlingguy
Owner Author

I switched 'Boot' to "To CLI" in Raspberry Pi Configuration and rebooted.

At the CLI, xrandr -q still gives "Can't open display", and I can still modprobe the driver, though of course the display isn't active.

After I did that, I tried startx and got this interesting feedback in the Xorg log file:

[   337.295] (--) PCI:*(1@0:0:0) 126f:0750:126f:0750 rev 161, Mem @ 0x600000000/67108864, 0x604000000/2097152, BIOS @ 0x????????/65536
[   337.295] (II) LoadModule: "glx"
[   337.295] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[   337.297] (II) Module glx: vendor="X.Org Foundation"
[   337.297] 	compiled for 1.20.11, module version = 1.0.0
[   337.297] 	ABI class: X.Org Server Extension, version 10.0
[   337.297] (==) Matched siliconmotion as autoconfigured driver 0
[   337.297] (==) Matched modesetting as autoconfigured driver 1
[   337.297] (==) Matched fbdev as autoconfigured driver 2
[   337.298] (==) Assigned the driver to the xf86ConfigLayout
[   337.298] (II) LoadModule: "siliconmotion"
[   337.298] (WW) Warning, couldn't open module siliconmotion
[   337.298] (EE) Failed to load module "siliconmotion" (module does not exist, 0)
[   337.298] (II) LoadModule: "modesetting"
[   337.298] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
[   337.299] (II) Module modesetting: vendor="X.Org Foundation"
[   337.299] 	compiled for 1.20.11, module version = 1.20.11
[   337.299] 	Module class: X.Org Video Driver
[   337.299] 	ABI class: X.Org Video Driver, version 24.1
[   337.299] (II) LoadModule: "fbdev"
[   337.299] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so
[   337.299] (II) Module fbdev: vendor="X.Org Foundation"
[   337.299] 	compiled for 1.20.0, module version = 0.5.0
[   337.299] 	Module class: X.Org Video Driver
[   337.299] 	ABI class: X.Org Video Driver, version 24.0
[   337.299] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[   337.299] (II) FBDEV: driver for framebuffer: fbdev
[   337.299] (WW) Falling back to old probe method for modesetting
[   337.300] (II) Loading sub module "fbdevhw"
[   337.300] (II) LoadModule: "fbdevhw"
[   337.300] (II) Loading /usr/lib/xorg/modules/libfbdevhw.so
[   337.300] (II) Module fbdevhw: vendor="X.Org Foundation"
[   337.300] 	compiled for 1.20.11, module version = 0.0.2
[   337.300] 	ABI class: X.Org Video Driver, version 24.1
[   337.300] (**) FBDEV(1): claimed PCI slot 1@0:0:0
[   337.300] (II) FBDEV(1): using default device
[   337.300] (II) modeset(G0): using drv /dev/dri/card1
[   337.300] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[   337.300] (EE) Screen 0 deleted because of no matching config section.
[   337.300] (II) UnloadModule: "modesetting"
[   337.301] (II) FBDEV(0): Creating default Display subsection in Screen section
	"Default Screen Section" for depth/fbbpp 24/32
[   337.301] (==) FBDEV(0): Depth 24, (==) framebuffer bpp 32
[   337.301] (==) FBDEV(0): RGB weight 888
[   337.301] (==) FBDEV(0): Default visual is TrueColor
[   337.301] (==) FBDEV(0): Using gamma correction (1.0, 1.0, 1.0)
[   337.301] (II) FBDEV(0): hardware: sm750_fb1 (video memory: 16384kB)
[   337.301] (EE) 
[   337.301] (EE) Backtrace:
[   337.304] (EE) 0: /usr/lib/xorg/Xorg (OsLookupColor+0x188) [0x5588eef1a8]
[   337.304] (EE) unw_get_proc_info failed: no unwind info found [-10]
[   337.304] (EE) 
[   337.305] (EE) Segmentation fault at address 0x124
[   337.305] (EE) 
Fatal server error:
[   337.305] (EE) Caught signal 11 (Segmentation fault). Server aborting
[   337.305] (EE) 
[   337.305] (EE) 
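
To make a log like this easier to scan, here is a minimal sketch that keeps only the warning and error lines and decodes the memory sizes on the PCI line. The helper names `problems` and `mem_regions_mib` are made up for illustration, and the regular expressions are only checked against the lines shown above:

```python
import re

# Timestamp, two-character marker such as (EE) or (WW), then the message.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\((..)\)\s*(.*)$")

# Lines copied from the Xorg log above.
SAMPLE = [
    "[   337.295] (--) PCI:*(1@0:0:0) 126f:0750:126f:0750 rev 161, "
    "Mem @ 0x600000000/67108864, 0x604000000/2097152, BIOS @ 0x????????/65536",
    '[   337.298] (EE) Failed to load module "siliconmotion" (module does not exist, 0)',
    "[   337.300] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support",
    "[   337.301] (II) FBDEV(0): hardware: sm750_fb1 (video memory: 16384kB)",
    "[   337.301] (EE) ",
    "[   337.305] (EE) Segmentation fault at address 0x124",
]


def problems(lines):
    """Return (marker, message) pairs for non-empty (EE) and (WW) lines."""
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        marker, message = match.group(2), match.group(3).strip()
        if marker in ("EE", "WW") and message:
            found.append((marker, message))
    return found


def mem_regions_mib(pci_line):
    """Decode 'address/size-in-bytes' pairs into (address, size in MiB)."""
    pairs = re.findall(r"(0x[0-9a-fA-F]+)/(\d+)", pci_line)
    return [(int(addr, 16), int(size) // (1024 * 1024)) for addr, size in pairs]


for marker, message in problems(SAMPLE):
    print(marker, message)
print(mem_regions_mib(SAMPLE[0]))
```

The decoded sizes, 64 MiB and 2 MiB, match the `[size=64M]` and `[size=2M]` regions in the lspci output, while fbdev itself reports 16384kB of video memory.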

@geerlingguy
Owner Author

geerlingguy commented Apr 12, 2022

lol someone else with a similar issue gave up trying to get an SM501 working and swapped over to an Nvidia GT710.

After reading this post on the SM710, I also tried sudo startx, but hit the same problem.

Someone else was having a similar error when running tigervnc and had to append an LD_PRELOAD path to their vnc server command (TigerVNC/tigervnc#800 (comment)), but I tried that and it made no difference. It's still a segmentation fault at address 0x124, probably some memory copy/access that's still broken in the code somewhere.
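
For reference, an LD_PRELOAD override of this kind boils down to setting one environment variable before the program starts. The sketch below only builds that environment; the helper name `preload_env` and the library path are hypothetical placeholders, not the actual path from the linked comment:

```python
import os


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Keep anything already preloaded; entries are separated by colons.
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


# Hypothetical path, for illustration only.
env = preload_env("/path/to/override.so", base_env={"PATH": "/usr/bin"})
print(env["LD_PRELOAD"])
```

The resulting dictionary could then be passed as `env=` to `subprocess.run` when launching the server command.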

@geerlingguy
Owner Author

Huh... same error but on a different GPU, the core inside the Rockchip RK3399, here: https://forum.armbian.com/topic/12985-potential-opp-issue-with-nanopi-m4v2/#comment-95405

Update: X11 fails to start, consistently segfaulting at OsLookupColor+0x188.

@PixlRainbow

PixlRainbow commented Apr 13, 2022

I ran into a similar issue on the rk356x, crashing in the same place.
It seems to come down to a mismatch where one library expects 24 bits per pixel but another expects 32, plus memory alignment issues that crept in because the code was not being tested on RISC platforms.

https://gitlab.freedesktop.org/mesa/mesa/-/issues/6142
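
To illustrate why a 24 vs 32 bit disagreement is dangerous in general, here is a small arithmetic sketch. It is not taken from the linked issue and does not claim to reproduce the actual bug; the 1920x1080 resolution and the helper name `framebuffer_bytes` are assumptions for the example:

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Size in bytes of a tightly packed framebuffer with no row padding."""
    bytes_per_pixel = bits_per_pixel // 8
    stride = width * bytes_per_pixel  # bytes per row
    return stride * height


# Hypothetical resolution, chosen only for the example.
WIDTH, HEIGHT = 1920, 1080

packed_24 = framebuffer_bytes(WIDTH, HEIGHT, 24)  # 3 bytes per pixel
padded_32 = framebuffer_bytes(WIDTH, HEIGHT, 32)  # 4 bytes per pixel

# If one side allocates for 24 bpp and the other writes 32 bpp pixels,
# this many bytes land outside the allocation.
overrun = padded_32 - packed_24

print(packed_24, padded_32, overrun)
```

The Xorg log earlier in this thread reports "Depth 24" with "framebuffer bpp 32", so both numbers are in play on this setup: 24 bits of colour stored in 32-bit pixels.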

@geerlingguy
Owner Author

@PixlRainbow - That would make sense :( That issue seems more closely aligned with issue #4 on the Radeon 5450, though it's possibly related here too.

@geerlingguy
Owner Author

Just as an FYI, I also tested with https://gist.github.com/Coreforge/91da3d410ec7eb0ef5bc8dee24b91359?permalink_comment_id=4134159#gistcomment-4134159 (a memcpy.so override that helped with Xorg on the Radeon 5450), but that had no effect. I still get the segfault at address 0x124, with OsLookupColor+0x188 at the top of the backtrace.

@geerlingguy
Owner Author

Going to mark this as closed/complete, as the card is working about as far as I think we can expect without SiliconMotion getting involved, ideally writing a DRM driver for it instead of the ancient FB code that's currently in the kernel.

@supercomputer7

Is there a place to buy this card or a variant of this? Seems like a neat solution for lean graphics, and not only for the Raspberry Pi use case this issue solved.
