Attach BCM43602 to a VM cause system freeze and shutdown #3734

Closed
Ninlives opened this issue Mar 23, 2018 · 14 comments
Labels
C: other eol-4.0 Closed because Qubes 4.0 has reached end-of-life (EOL) hardware support P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.

Comments

@Ninlives

Qubes OS version:

R4.0rc5

Affected component(s):


Steps to reproduce the behavior:

Run qvm-pci attach <vmname> <backend>:<bdf>. I've tried every combination of the permissive and no-strict-reset options.

Expected behavior:

The adapter is attached to the VM (especially sys-net) so that Wi-Fi works.

Actual behavior:

The system freezes immediately when the adapter is attached to a running VM, or when a VM with the adapter attached starts, and then shuts down after a few seconds.

General notes:

The freeze also happens during the setup step, so I had to detach the adapter from sys-net to finish the installation.
Someone suggested using sudo xl pci-attach <vmname> <bdf> to attach the adapter directly to a VM. The command seemed to work, but qvm-pci still showed the adapter as not used by any VM, and the network manager didn't show any Wi-Fi device.


Related issues:

@andrewdavidwong
Member

Based on our issue reporting guidelines, this appears to be too localized for qubes-issues, since it depends on a specific hardware configuration. We ask that you please send this to the qubes-users mailing list instead. qubes-users is intended for these sorts of issues and receives much more traffic, which means that your issue is more likely to receive a response there. If, after reading our issue reporting guidelines, you believe we are mistaken, please leave a comment briefly explaining why. We'll be happy to take another look, and, if appropriate, reopen this issue. Thank you for your understanding.

@andrewdavidwong andrewdavidwong added the R: not applicable E.g., help/support requests, questions, discussions, "not a bug," not enough info, not actionable. label Mar 24, 2018
@arno01

arno01 commented Jan 30, 2020

If a search engine brought you here, here is what helped me make the BCM43602 work again on my MacBookPro14,3 (mid-2017).

  1. start dom0 with qubes.skip_autostart.

  2. start sys-net in the following sequence:

qvm-start sys-net
sleep 3
sudo xl pci-attach sys-net '03:00.0,permissive=1'
qvm-run -p sys-net "sudo cp ~/brcmfmac43602-pcie.txt /usr/lib/firmware/brcm/brcmfmac43602-pcie.txt"
qvm-run -p sys-net "echo 1 | sudo tee /sys/bus/pci/rescan"

Enjoy your Wifi card at its full speed! 2.4GHz & 5GHz wifi networks are working :-)

You can get the brcmfmac43602-pcie.txt file here: https://bugzilla.kernel.org/show_bug.cgi?id=193121#c52 (see "attachment 285753" from Simon Siebert, 2019-11-02 16:41:15 UTC).

Make sure to set macaddr=00:90:4c:0d:f4:3e in that file! It's the default MAC address that comes from the firmware.

Refs.


I've also noticed that if I remove the brcmfmac kernel driver in sys-net (domU) and modprobe it back again, the card never works again.
I've tried resetting the PCI bus, removing the card, etc. Only a reboot helps.

@marmarek
Member

If that helps, it may be easier to:

  1. Place the file in the template.
  2. Attach device with (to avoid doing it each time manually): qvm-pci at -p -o permissive=True sys-net dom0:03_00.0

I'd recommend first checking which MAC address should really be set there. If two users with the same MAC address end up on the same network, the address will be duplicated, which basically means it won't work for either of them.
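For reference, the 03:00.0 address printed by lspci and the dom0:03_00.0 device ID passed to qvm-pci name the same device. A minimal sketch of the conversion, assuming the device is exposed by dom0 and sits in PCI domain 0000 (to_qubes_devid is a hypothetical helper name, not a Qubes tool):

```shell
# Convert an lspci-style BDF (03:00.0 or 0000:03:00.0) into the
# dom0:BB_DD.F form used on the qvm-pci command line in this thread.
# to_qubes_devid is a hypothetical helper, not part of Qubes OS.
to_qubes_devid() {
    bdf=${1#0000:}      # drop the optional PCI domain prefix
    bus=${bdf%%:*}      # bus number, the part before the first colon
    rest=${bdf#*:}      # device.function, the part after it
    printf 'dom0:%s_%s\n' "$bus" "$rest"
}
```

It only rewrites the string; it does not check that the device exists, so the result is best compared against the qvm-pci listing before use.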

@arno01

arno01 commented Jan 31, 2020

qvm-pci attach -p -o permissive=True sys-net dom0:03_00.0 and then qvm-start sys-net makes the system freeze immediately.

I've tried booting with noreboot loglvl=debug guest_loglvl=debug iommu=no-igfx,debug console_to_ring=true.

image

domU logs the first time 03:00.0 initializes there

domU logs after qvm-run -p sys-net "echo 1 | sudo tee /sys/bus/pci/rescan"

[Thu Jan 30 12:55:02 2020] pci 0000:00:09.0: [14e4:43ba] type 00 class 0x028000
[Thu Jan 30 12:55:02 2020] pci 0000:00:09.0: reg 0x10: [mem 0x82400000-0x82407fff 64bit]
[Thu Jan 30 12:55:02 2020] pci 0000:00:09.0: reg 0x18: [mem 0x82000000-0x823fffff 64bit]
[Thu Jan 30 12:55:02 2020] pci 0000:00:09.0: supports D1 D2
[Thu Jan 30 12:55:02 2020] pci 0000:00:09.0: BAR 2: assigned [mem 0xf2400000-0xf27fffff 64bit]
[Thu Jan 30 12:55:02 2020] pci 0000:00:09.0: BAR 0: assigned [mem 0xf2000000-0xf2007fff 64bit]
[Thu Jan 30 12:55:02 2020] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[Thu Jan 30 12:55:02 2020] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[Thu Jan 30 12:55:02 2020] usbcore: registered new interface driver brcmfmac
[Thu Jan 30 12:55:02 2020] brcmfmac 0000:00:09.0: enabling device (0000 -> 0002)
[Thu Jan 30 12:55:02 2020] xen: --> pirq=16 -> irq=21 (gsi=21)
[Thu Jan 30 12:55:03 2020] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43602-pcie for chip BCM43602/2
[Thu Jan 30 12:55:03 2020] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43602-pcie for chip BCM43602/2
[Thu Jan 30 12:55:03 2020] brcmfmac: brcmf_c_process_clm_blob: no clm_blob available (err=-2), device may have limited channels available
[Thu Jan 30 12:55:03 2020] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM43602/2 wl0: Nov 10 2015 06:38:10 version 7.35.177.61 (r598657) FWID 01-ea662a8c
[Thu Jan 30 12:55:03 2020] brcmfmac 0000:00:09.0 wls9: renamed from wlan0
[Thu Jan 30 12:55:03 2020] IPv6: ADDRCONF(NETDEV_UP): wls9: link is not ready
[Thu Jan 30 12:55:03 2020] IPv6: ADDRCONF(NETDEV_UP): wls9: link is not ready
[Thu Jan 30 12:55:03 2020] IPv6: ADDRCONF(NETDEV_UP): wls9: link is not ready
[Thu Jan 30 12:55:04 2020] IPv6: ADDRCONF(NETDEV_UP): wls9: link is not ready
[Thu Jan 30 12:55:05 2020] IPv6: ADDRCONF(NETDEV_UP): wls9: link is not ready
[Thu Jan 30 12:55:06 2020] IPv6: ADDRCONF(NETDEV_CHANGE): wls9: link becomes ready
[Thu Jan 30 12:55:06 2020] brcmfmac: brcmf_inetaddr_changed: fail to get arp ip table err:-52
  • domU logs after modprobe -r brcmfmac
[Thu Jan 30 16:00:07 2020] usbcore: deregistering interface driver brcmfmac
[Thu Jan 30 16:00:07 2020] brcmfmac: brcmf_cfg80211_get_tx_power: error (-5)
[Thu Jan 30 16:00:07 2020] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[Thu Jan 30 16:00:07 2020] brcmfmac: brcmf_link_down: WLC_DISASSOC failed (-5)
[Thu Jan 30 16:00:07 2020] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[Thu Jan 30 16:00:07 2020] brcmfmac: brcmf_set_pmk: failed to change PSK in firmware (len=0)
[Thu Jan 30 16:00:08 2020] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[Thu Jan 30 16:00:08 2020] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[Thu Jan 30 16:00:08 2020] brcmfmac: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[Thu Jan 30 16:00:08 2020] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-5)
[Thu Jan 30 16:00:08 2020] brcmfmac: brcmf_msgbuf_delete_flowring: Failed to submit RING_DELETE, flowring will be removed
  • domU logs after modprobe brcmfmac - the 03:00.0 card is not working anymore:
[Thu Jan 30 16:01:47 2020] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[Thu Jan 30 16:01:47 2020] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[Thu Jan 30 16:01:47 2020] usbcore: registered new interface driver brcmfmac
[Thu Jan 30 16:01:47 2020] net_ratelimit: 2 callbacks suppressed
[Thu Jan 30 16:01:47 2020] brcmfmac: brcmf_chip_recognition: chip backplane type 15 is not supported
[Thu Jan 30 16:01:47 2020] brcmfmac: brcmf_pcie_probe: failed 14e4:43ba

Sometimes:

[Thu Jan 30 11:59:12 2020] brcmfmac: brcmf_msgbuf_query_dcmd: Timeout on response for query command
[Thu Jan 30 11:59:12 2020] brcmfmac: brcmf_c_preinit_dcmds: Retreiving cur_etheraddr failed, -5
[Thu Jan 30 11:59:12 2020] brcmfmac: brcmf_bus_started: failed: -5
[Thu Jan 30 11:59:12 2020] brcmfmac: brcmf_attach: dongle is not responding: err=-5
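The three outcomes in the logs above (firmware loaded, probe failed, firmware not responding) can be told apart with a small filter. This is only a sketch: brcm_probe_status is a hypothetical helper, and the patterns are copied from the excerpts in this thread, so other kernel versions may word the messages differently.

```shell
# Read kernel log text on stdin and report how the brcmfmac probe went.
# Failure patterns are checked first, so a log that contains both an
# earlier success and a later failure is reported as a failure.
# brcm_probe_status is a hypothetical helper name.
brcm_probe_status() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'brcmf_pcie_probe: failed'; then
        echo probe-failed
    elif printf '%s\n' "$log" | grep -q 'dongle is not responding'; then
        echo firmware-timeout
    elif printf '%s\n' "$log" | grep -q 'brcmf_c_preinit_dcmds: Firmware:'; then
        echo ok
    else
        echo unknown
    fi
}
```

Inside sys-net it could be fed with sudo dmesg | brcm_probe_status after each attach or modprobe attempt.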

lspci tree

[arno@dom0 ~]$ lspci -t -nn -v
-[0000:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5910]
           +-01.0-[01]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] [1002:67ef]
           |            \-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
           +-01.1-[04-79]----00.0-[05-79]--+-00.0-[06]----00.0  Intel Corporation JHL6540 Thunderbolt 3 NHI (C step) [Alpine Ridge 4C 2016] [8086:15d2]
           |                               +-01.0-[08-40]--
           |                               +-02.0-[07]----00.0  Intel Corporation JHL6540 Thunderbolt 3 USB Controller (C step) [Alpine Ridge 4C 2016] [8086:15d4]
           |                               \-04.0-[41-79]--
           +-01.2-[7a-ef]----00.0-[7b-ef]--+-00.0-[7c]----00.0  Intel Corporation JHL6540 Thunderbolt 3 NHI (C step) [Alpine Ridge 4C 2016] [8086:15d2]
           |                               +-01.0-[7e-b6]--
           |                               +-02.0-[7d]----00.0  Intel Corporation JHL6540 Thunderbolt 3 USB Controller (C step) [Alpine Ridge 4C 2016] [8086:15d4]
           |                               \-04.0-[b7-ef]--
           +-14.0  Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [8086:a12f]
           +-15.0  Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #0 [8086:a160]
           +-16.0  Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 [8086:a13a]
           +-19.0  Intel Corporation 100 Series/C230 Series Chipset Family Serial IO UART Controller #2 [8086:a166]
           +-1b.0-[02]----00.0  Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 [144d:a804]
           +-1c.0-[03]----00.0  Broadcom Inc. and subsidiaries BCM43602 802.11ac Wireless LAN SoC [14e4:43ba]
           +-1e.0  Intel Corporation 100 Series/C230 Series Chipset Family Serial IO UART #0 [8086:a127]
           +-1e.1  Intel Corporation 100 Series/C230 Series Chipset Family Serial IO UART #1 [8086:a128]
           +-1e.2  Intel Corporation 100 Series/C230 Series Chipset Family Serial IO GSPI #0 [8086:a129]
           +-1e.3  Intel Corporation 100 Series/C230 Series Chipset Family Serial IO GSPI #1 [8086:a12a]
           +-1f.0  Intel Corporation Sunrise Point-H LPC Controller [8086:a151]
           +-1f.2  Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller [8086:a121]
           +-1f.3  Intel Corporation 100 Series/C230 Series Chipset Family HD Audio Controller [8086:a170]
           \-1f.4  Intel Corporation 100 Series/C230 Series Chipset Family SMBus [8086:a123]

lspci of 03:00.0

[arno@dom0 ~]$ sudo lspci -s 03:00.0 -k -vv -nn 
03:00.0 Network controller [0280]: Broadcom Inc. and subsidiaries BCM43602 802.11ac Wireless LAN SoC [14e4:43ba] (rev 02)
	Subsystem: Apple Inc. Device [106b:0173]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 256 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at 82400000 (64-bit, non-prefetchable) [size=32K]
	Region 2: Memory at 82000000 (64-bit, non-prefetchable) [size=4M]
	Capabilities: [48] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=2 PME-
	Capabilities: [58] MSI: Enable+ Count=1/16 Maskable- 64bit+
		Address: 00000000fee00958  Data: 0000
	Capabilities: [68] Vendor Specific Information: Len=44 <?>
	Capabilities: [ac] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr+ NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 1024 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <2us, L1 <32us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s (ok), Width x1 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via WAKE#
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
			 AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [13c v1] Device Serial Number 1f-27-90-ff-ff-50-8c-85
	Capabilities: [150 v1] Power Budgeting <?>
	Capabilities: [160 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [1b0 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [220 v1] Resizable BAR <?>
	Capabilities: [240 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=0us PortTPowerOnTime=50us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Kernel driver in use: pciback
	Kernel modules: brcmfmac

PCI register configuration of 03:00.0

This dump was taken after the device had been activated in the domU.

[arno@dom0 ~]$ setpci --dumpregs | awk -v slot='03:00.0' 'NR>1 && !/ E?CAP/{
>   reg = tolower($NF)
>   printf "%s=",reg
>   system("setpci -s " slot " " reg)
> }'
vendor_id=14e4
device_id=43ba
command=0006
status=0010
revision=02
class_prog=00
class_device=0280
cache_line_size=40
latency_timer=00
header_type=00
bist=00
base_address_0=82400004
base_address_1=00000000
base_address_2=82000004
base_address_3=00000000
base_address_4=00000000
base_address_5=00000000
cardbus_cis=00000000
subsystem_vendor_id=106b
subsystem_id=0173
rom_address=00000000
interrupt_line=ff
interrupt_pin=01
min_gnt=00
max_lat=00
primary_bus=04
secondary_bus=00
subordinate_bus=00
sec_latency_timer=82
io_base=00
io_limit=00
sec_status=0000
memory_base=0000
memory_limit=0000
pref_memory_base=0000
pref_memory_limit=0000
pref_base_upper32=00000000
pref_limit_upper32=0173106b
io_base_upper16=0000
io_limit_upper16=0000
bridge_rom_address=00000000
bridge_control=0000
cb_cardbus_base=82400004
cb_capabilities=0000
cb_sec_status=0000
cb_bus_number=04
cb_cardbus_number=00
cb_subordinate_bus=00
cb_cardbus_latency=82
cb_memory_base_0=00000000
cb_memory_limit_0=00000000
cb_memory_base_1=00000000
cb_memory_limit_1=00000000
cb_io_base_0=106b
cb_io_base_0_hi=0173
cb_io_limit_0=0000
cb_io_limit_0_hi=0000
cb_io_base_1=0048
cb_io_base_1_hi=0000
cb_io_limit_1=0000
cb_io_limit_1_hi=0000
cb_subsystem_vendor_id=ffff
cb_subsystem_id=ffff
cb_legacy_mode_base=ffffffff

Before I passed the device to the domU, the command register was 0002; it changed to 0006 afterwards.
One more interesting observation: if I remove this wireless adapter in the domU, or shut the domU down, the registers change as follows:

  • command changes from 0006 to 0000
  • pref_limit_upper32 changes from 0173106b to 0157106b (difference: 0x1c0000)
  • interrupt_line changes from ff to 00
  • base_address_0 changes from 82400004 to 00000004
  • base_address_2 changes from 82000004 to 00000004
  • cache_line_size changes from 40 to 00
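The command values in this list are easier to read once decoded. In the PCI command register, bit 0 enables I/O space, bit 1 enables memory space, and bit 2 enables bus mastering. A minimal sketch that decodes just those three bits (decode_pci_command is a hypothetical helper name):

```shell
# Decode the low three bits of a PCI command register value given as
# hex digits, in the form setpci prints it (for example 0006).
# decode_pci_command is a hypothetical helper name.
decode_pci_command() {
    val=$((0x$1))
    out=
    [ $((val & 1)) -ne 0 ] && out="$out io"
    [ $((val & 2)) -ne 0 ] && out="$out mem"
    [ $((val & 4)) -ne 0 ] && out="$out busmaster"
    out=${out# }            # strip the leading space, if any
    echo "${out:-none}"
}
```

Read this way, 0002 is memory decoding only, 0006 adds bus mastering (matching the Mem+ BusMaster+ flags in the lspci output above), and 0000 means the device no longer decodes memory at all.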

After that, I cannot get the 03:00.0 card working again. I tried removing it and rescanning the PCI bus, but it never comes back; only a reboot helps. Do you have any suggestions for what else I could try, so that sys-net can be restarted without breaking the card?

I have been trying this script to perform a hot reset of the 1c.0-[03] bridge (see the lspci tree above), but without success. The card doesn't come back.

@arno01

arno01 commented Jan 31, 2020

I'd recommend first checking which MAC address should really be set there. If two users with the same MAC address end up on the same network, the address will be duplicated, which basically means it won't work for either of them.

That MAC address is in the firmware. Even if I do not set macaddr= at all, the card always gets that MAC address by default. :/
Maybe I could check which MAC address the card gets when I boot into macOS and try setting that address in the /lib/firmware/brcm/brcmfmac43602-pcie.txt file.

$ strings /lib/firmware/brcm/brcmfmac43602-pcie.bin | grep ^macaddr
macaddr=%s
macaddr=00:90:4c:0d:f4:3e
macaddr

There is also an interesting section, "NVRAM from EFI", in this doc: https://wireless.wiki.kernel.org/en/users/drivers/brcm80211
However, I do not seem to have the /sys/firmware/efi/efivars/nvram-74b00bd9-805a-4d61-b51f-43268123d113 file, nor anything similar. I've tried grepping through all the files there but didn't find anything. Maybe I should try parsing some info there from macOS ?

@marmarek
Member

Maybe I should try parsing some info there from macOS ?

First thing I'd check is what MAC address you see there.

@arno01

arno01 commented Feb 7, 2020

I have been tinkering more with the BCM43602 wireless adapter these days and have figured out a way to restart sys-net without having to reboot. 👹

$ sudo lspci -t -nn -v
  +-1c.0-[03]----00.0  Broadcom Inc. and subsidiaries BCM43602 802.11ac Wireless LAN SoC [14e4:43ba]

The whole trick was to unbind the parent PCI bridge from pcieport (the PCI Express Port Bus Driver), after which I could restart sys-net as many times as I wanted without breaking the BCM43602 adapter:

echo 0000:00:1c.0 | sudo tee /sys/bus/pci/drivers/pcieport/unbind
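The bridge address 0000:00:1c.0 above is specific to this machine. A hedged sketch of how the parent bridge of a device could be looked up from sysfs instead, by resolving the device's symlink and taking the directory above it (parent_bridge is a hypothetical helper; the optional second argument exists only so the function can be tried against a copy of the tree):

```shell
# Print the PCI address of the bridge directly above a device by
# resolving its sysfs symlink and taking the parent directory name.
# parent_bridge is a hypothetical helper name; the sysfs root argument
# defaults to /sys.
parent_bridge() {
    dev=$1
    root=${2:-/sys}
    # cd -P / pwd -P resolve the symlink to the physical device path
    real=$(cd -P "$root/bus/pci/devices/$dev" && pwd -P) || return 1
    basename "$(dirname "$real")"
}
```

On the machine in this thread, parent_bridge 0000:03:00.0 should print 0000:00:1c.0, which can then be written to /sys/bus/pci/drivers/pcieport/unbind as shown above. For a device that sits directly on the root bus the result is the root complex directory (for example pci0000:00), not a bridge, so the output is worth checking before unbinding anything.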

What's more interesting, and related to my previous observations, is that if I don't unbind the parent PCI bridge from pcieport and then restart sys-net (attaching 03:00.0 as described before), the BCM43602 adapter's subsystem device ID changes from 0x0173 to 0x0157 🧐

03:00.0 Network controller [0280]: Broadcom Inc. and subsidiaries BCM43602 802.11ac Wireless LAN SoC [14e4:43ba] (rev 02)
	Subsystem: Apple Inc. Device [106b:0173]     ===>>> [106b:0157] !!! 👈

But as soon as I unbind the parent PCI bridge from pcieport again and restart sys-net, the BCM43602 adapter works again, even though its subsystem device ID is now 0x0157 instead of 0x0173. 🤔

One more observation: when I remove the BCM43602 adapter with echo 1 | sudo tee /sys/bus/pci/devices/0000\:00\:1c.0/0000\:03\:00.0/remove and then issue a rescan on its parent PCI bridge with echo 1 | sudo tee /sys/bus/pci/devices/0000\:00\:1c.0/rescan, the BCM43602 adapter (03:00.0) becomes visible again only if the parent PCI bridge was unbound from pcieport.

Does this all make any sense, @marmarek ? 🤓

@arno01

arno01 commented Feb 7, 2020

Maybe I should try parsing some info there from macOS ?

First thing I'd check is what MAC address you see there.

I see a different MAC address in macOS, so I've taken it from there and replaced the macaddr= value in /usr/lib/firmware/brcm/brcmfmac43602-pcie.txt so that it is unique. It's working fine with that MAC address. (I believe it would probably work with any MAC address I put there.)
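The macaddr= replacement described here can be scripted. A minimal sketch, assuming the NVRAM text file holds one key=value pair per line (set_nvram_macaddr is a hypothetical helper, and the address in the usage note is a placeholder, not a value from this thread):

```shell
# Set or replace the macaddr= line in a brcmfmac NVRAM text file.
# Assumes one key=value pair per line. The file is rewritten through a
# temporary copy so a failed sed does not truncate it.
# set_nvram_macaddr is a hypothetical helper name.
set_nvram_macaddr() {
    file=$1
    mac=$2
    tmp=$(mktemp) || return 1
    if grep -q '^macaddr=' "$file"; then
        sed "s/^macaddr=.*/macaddr=$mac/" "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    else
        cat "$file" > "$tmp"
        printf 'macaddr=%s\n' "$mac" >> "$tmp"
    fi
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, set_nvram_macaddr ~/brcmfmac43602-pcie.txt 02:00:00:aa:bb:cc, with the placeholder replaced by the address read in macOS, before the file is copied into /usr/lib/firmware/brcm/.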

@arno01

arno01 commented Feb 7, 2020

One more thing: it seems to be impossible to attach any device using xl pci-attach when sys-net is started without any PCI device attached to it. Is this expected?

[arno@dom0 ~]$ qvm-pci | grep sys-net
[arno@dom0 ~]$ 
[arno@dom0 ~]$ qvm-start sys-net
[arno@dom0 ~]$ sudo xl pci-attach sys-net '0000:03:00.0,permissive=1'
libxl: error: libxl_pci.c:1575:libxl__device_pci_add: Domain 5:PCI device 0000:03:00.0 already assigned to a different guest?
libxl: error: libxl_pci.c:1735:device_pci_add_done: Domain 5:libxl__device_pci_add  failed for PCI device 0:3:0.0 (rc -1)
libxl: error: libxl_device.c:1414:device_addrm_aocomplete: unable to add device

Additionally, a question I had on my mind a few days ago: there seems to be no way to pass strictreset=0 via xl pci-attach, whereas it accepts permissive=1 without an issue. I presume this functionality wasn't added at the time no-strict-reset option was added only to libvirt?

@marmarek
Member

marmarek commented Feb 7, 2020

Does this all make any sense, @marmarek ? 🤓

I don't know what pcieport driver does on unbind...

One more thing: it seems to be impossible to attach any device using xl pci-attach when sys-net is started without any PCI device attached to it. Is this expected?

Make sure virt_mode of sys-net is set to hvm. Without PCI devices at boot it defaults to PVH, which doesn't support PCI passthrough.

I presume this functionality wasn't added at the time no-strict-reset option was added only to libvirt?

strictreset is a libvirt-only thing; xl doesn't care about a missing reset. permissive is a different thing, applicable to both. For xl there is also rdm_policy=relaxed (see man xl.cfg).

@arno01

arno01 commented Feb 8, 2020

Thanks for the hint on rdm_policy=relaxed, though I'm not sure it will do any good in my case.
AFAIK, from what I've seen on the net, it is known to help with devices that share an RMRR and have PCI passthrough issues.

One more thing: it seems to be impossible to attach any device using xl pci-attach when sys-net is started without any PCI device attached to it. Is this expected?

Make sure virt_mode of sys-net is set to hvm. Without PCI devices at boot it defaults to PVH, which doesn't support PCI passthrough.

virt_mode of sys-net always seems to be set to hvm, even when I remove all PCI devices from sys-net.
At least that is what qvm-prefs -g sys-net virt_mode returns.
I compared the output of virsh -c xen dumpxml sys-net without PCI devices (bef) and with a single PCI device, 00:15.0 (aft):

[arno@dom0 ~]$ diff -Nur bef aft
--- bef	2020-02-08 10:09:49.219035654 +0100
+++ aft	2020-02-08 10:10:26.220035640 +0100
@@ -1,4 +1,4 @@
-<domain type='xen' id='1'>
+<domain type='xen' id='3'>
   <name>sys-net</name>
   <uuid>4f9e4095-d96c-4e70-87fd-b872a874c956</uuid>
   <memory unit='KiB'>409600</memory>
@@ -7,7 +7,7 @@
   <os>
     <type arch='x86_64' machine='xenfv'>hvm</type>
     <loader type='rom'>hvmloader</loader>
-    <cmdline>root=/dev/mapper/dmroot ro nomodeset console=hvc0 rd_NO_PLYMOUTH rd.plymouth.enable=0 plymouth.enable=0 xen_scrub_pages=0 nopat</cmdline>
+    <cmdline>root=/dev/mapper/dmroot ro nomodeset console=hvc0 rd_NO_PLYMOUTH rd.plymouth.enable=0 plymouth.enable=0 xen_scrub_pages=0 nopat iommu=soft swiotlb=8192</cmdline>
     <boot dev='cdrom'/>
     <boot dev='hd'/>
   </os>
@@ -16,6 +16,9 @@
     <apic/>
     <pae/>
     <viridian/>
+    <xen>
+      <e820_host state='on'/>
+    </xen>
   </features>
   <cpu mode='host-passthrough'>
     <feature policy='disable' name='vmx'/>
@@ -61,6 +64,12 @@
     <video>
       <model type='vga' vram='16384' heads='1' primary='yes'/>
     </video>
+    <hostdev mode='subsystem' type='pci' managed='yes' nostrictreset='yes'>
+      <driver name='xen'/>
+      <source>
+        <address domain='0x0000' bus='0x00' slot='0x15' function='0x0'/>
+      </source>
+    </hostdev>
     <memballoon model='xen'/>
   </devices>
 </domain>

Is there a way to make sure sys-net gets e820_host=1 and iommu=soft swiotlb=8192 when it starts without PCI devices?

@arno01

arno01 commented Feb 8, 2020

Is there a way to make sure sys-net gets e820_host=1 and iommu=soft swiotlb=8192 when it starts without PCI devices?

Adding these parameters didn't help.

I've also tried adding the 00:15.0 PCI device after sys-net was up and running; it fails too:

[arno@dom0 ~]$ qvm-pci | grep sys-net
[arno@dom0 ~]$ qvm-start sys-net
[arno@dom0 ~]$ cat 15-adapter.xml 
<hostdev mode='subsystem' type='pci' managed='yes' nostrictreset='yes'>
  <driver name='xen'/>
  <source>
    <address domain='0x0000' bus='0x00' slot='0x15' function='0x0'/>
  </source>
</hostdev>
[arno@dom0 ~]$ virsh -c xen attach-device sys-net 15-adapter.xml 
error: Failed to attach device from 15-adapter.xml
error: internal error: Unable to reset PCI device 0000:00:15.0: internal error: libxenlight failed to attach pci device 0000:00:15.0

sys-net always seems to be started in HVM mode:

[arno@dom0 ~]$ virsh -c xen dumpxml sys-net |grep -i hvm
    <type arch='x86_64' machine='xenfv'>hvm</type>
    <loader type='rom'>hvmloader</loader>

@ctr49

ctr49 commented Mar 1, 2021

Is anyone still working on this? None of the listed workarounds seems to work for me.
Permanently assigning the Wi-Fi card to sys-net freezes the system, and dynamically assigning it after start just results in:

brcmfmac: brcmf_chip_recognition: SB chip is not supported
brcmfmac: brcmf_pcie_probe: failed 14e4:43ba

This is similar to what was reported in a previous comment, but I have the problem from the first start of sys-net, not only upon restart.

@andrewdavidwong andrewdavidwong added C: other P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists. hardware support and removed R: not applicable E.g., help/support requests, questions, discussions, "not a bug," not enough info, not actionable. labels Mar 1, 2021
@andrewdavidwong andrewdavidwong added this to the Release 4.0 updates milestone Mar 1, 2021
@andrewdavidwong andrewdavidwong added the needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. label Mar 1, 2021
@andrewdavidwong andrewdavidwong added the eol-4.0 Closed because Qubes 4.0 has reached end-of-life (EOL) label Aug 5, 2023
@github-actions

github-actions bot commented Aug 5, 2023

This issue is being closed because:

If anyone believes that this issue should be reopened and reassigned to an active milestone, please leave a brief comment.
(For example, if a bug still affects Qubes OS 4.1, then the comment "Affects 4.1" will suffice.)

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 5, 2023
@andrewdavidwong andrewdavidwong removed the needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. label Aug 5, 2023