Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WSL2] Null bytes in virtual com port with a stlinkv2.1 #111

Closed
bhav97 opened this issue Nov 23, 2021 · 13 comments
Closed

[WSL2] Null bytes in virtual com port with a stlinkv2.1 #111

bhav97 opened this issue Nov 23, 2021 · 13 comments
Labels
hardware Incompatible device or hub

Comments

@bhav97
Copy link

bhav97 commented Nov 23, 2021

Hi

I am running into a weird issue while using the stlink on my STM8 Nucleo-32 board. I have updated the stlink firmware to the latest available (V2J27M7) but the issue persists.

If I open the serial port on a windows application, everything works expected. After using usbipd to attach to wsl, I only read 0x0 from the serial port after an initial burst of data.

programming features of the stlink seem to be working as expected (using stm8flash). I have not used the mass storage device.

$ sudo tio /dev/ttyACM0
[tio 21:11:05] tio v1.32
[tio 21:11:05] Press ctrl-t q to quit
[tio 21:11:05] Connected
lloxxx
      Helloxxx
              Helloxxx
                      Helloxxx
                              Helloxxx
                                      Helloxxx
                                              Helloxxx
                                                      Helxx
                                                           Helloxxx
                                                                   Helloxxx
                                                                           Helloxxx
                                                                                   Heelloxxx
                                                                                            Helloxxx
                                                                                                    Helloxxx
                                                                                                            Helloxloxxx
                                                                                                                       Helloxxx
       Helloxxx
               Helloxxx
                       Helloxxx
                               Helloxxx
                                       Helloxxx
                                               Helloxxx
                                                       Helloxxx
                                                               Hello
[tio 21:11:08] Switched to hexadecimal mode
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[tio 21:11:09] Switched to normal mode

[tio 21:11:11] Disconnected

I don't see any exceptions in the usbipd log, and I'm not sure if it is a usbipd problem but I was hoping you could point me in the right direction
usbipd.log

@dorssel
Copy link
Owner

dorssel commented Nov 23, 2021

Apart from the large burst of:

dbug: UsbIpServer.AttachedClient[1000]
USBSUP_XFER_STALL -> EPIPE -> -32

log entries (which is a bit odd, but not wrong per se), I see nothing weird in the log. But tio may not be the best low-level serial port tool out there. In fact, from the man page tio is more for generic tty terminals than serial ports. I personally use minicom for low level serial communication.

@bhav97
Copy link
Author

bhav97 commented Nov 23, 2021

minicom gives similar output

Welcome to minicom 2.7.1

OPTIONS: I18n
Compiled on Dec 23 2019, 02:06:26.
Port /dev/ttyACM0, 21:32:24

Press CTRL-A Z for help on special keys

�xx
   Helloxxx
           Hello

















CTRL-A Z for help | 115200 8N1 | NOR | Minicom 2.7.1 | VT102 | Offline | ttyACM0                                        

And switching to hex output gives a bunch of zeros
I do see some entries in dmesg (not sure if this is a problem)

[63003.141124] vhci_hcd vhci_hcd.0: pdev(0) rhport(0) sockfd(3)
[63003.141127] vhci_hcd vhci_hcd.0: devid(65538) speed(2) speed_str(full-speed)
[63003.141176] vhci_hcd vhci_hcd.0: Device attached
[63003.429083] vhci_hcd: vhci_device speed not set
[63003.498983] usb 1-1: new full-speed USB device number 20 using vhci_hcd
[63003.588881] vhci_hcd: vhci_device speed not set
[63003.658870] usb 1-1: SetAddress Request (20) to port 0
[63003.718455] usb 1-1: New USB device found, idVendor=0483, idProduct=374b, bcdDevice= 1.00
[63003.718458] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[63003.718460] usb 1-1: Product: STM32 STLink
[63003.718461] usb 1-1: Manufacturer: STMicroelectronics
[63003.718462] usb 1-1: SerialNumber: 0671FF393731435243151045
[63003.779680] cdc_acm 1-1:1.2: ttyACM0: USB ACM device
[63134.402834] vhci_hcd: unlink->seqnum 274634
[63134.402838] vhci_hcd: urb->status -104
[63134.404555] vhci_hcd: unlink->seqnum 274636
[63134.404557] vhci_hcd: urb->status -104
[63186.528971] vhci_hcd: unlink->seqnum 279864
[63186.528973] vhci_hcd: urb->status -104
[63186.529739] vhci_hcd: unlink->seqnum 279865
[63186.529742] vhci_hcd: urb->status -104
[63186.530487] vhci_hcd: unlink->seqnum 279866
[63186.530489] vhci_hcd: the urb (seqnum 279866) was already given back
[63186.530490] vhci_hcd: unlink->seqnum 279867
[63186.530490] vhci_hcd: the urb (seqnum 279867) was already given back
[63186.530501] vhci_hcd: unlink->seqnum 279868
[63186.530501] vhci_hcd: the urb (seqnum 279868) was already given back
[63314.592310] vhci_hcd: connection closed
[63314.592450] vhci_hcd: stop threads
[63314.592454] vhci_hcd: release socket
[63314.592458] vhci_hcd: disconnect device
[63314.592509] usb 1-1: USB disconnect, device number 20

@dorssel
Copy link
Owner

dorssel commented Nov 24, 2021

Given the USBSUP_XFER_STALL -> EPIPE entries in the log, those messages (urb->status -104 and the urb (seqnum XXX) was already given back) are perfectly normal. They are Linux debug messages, not warnings or errors.

The question really is, why does the pipe stall so often in the first place? I cannot explain that.

The first few stalls are normal, for endpoint 0, during device setup.
But the real communication is on endpoint 5. And the first weird thing is pending count = 17. That is a lot of pending requests. URB requests usually overlap (it is essentially multiple outstanding reads for performance), but usual numbers for pending count are in the order of 4 or 5, not 17. Then, at some point, those all get cancelled twice (!) Aborting endpoint 132 and Aborting endpoint 133, after which a burst of USBSUP_XFER_STALL is returned. I guess this is what leads to the NUL bytes you seem to receive.

Unfortunately, I have no further clues on what is going wrong here...

@dorssel dorssel added the hardware Incompatible device or hub label Nov 28, 2021
@dorssel
Copy link
Owner

dorssel commented Jan 23, 2022

@bhav97
Could you please try PR #224, the installer is at https://github.com/dorssel/usbipd-win/actions/runs/1734739747.
This may very well solve the problem.

@bhav97
Copy link
Author

bhav97 commented Jan 23, 2022

Thanks for the update!
Unfortunately it didn't work. I still have pending count = 16 in my logs and null bytes in the the serial log

usbip.log

@dorssel
Copy link
Owner

dorssel commented Jan 23, 2022

Thanks for testing. I'll keep trying...

@bhav97
Copy link
Author

bhav97 commented Jan 23, 2022

Hi! There was an error from my side. While installing there was a conflict due to powertoys awake and I think the installation wasn't completed properly. I uninstalled and reinstalled and got some logs again.

No difference but here's the updated log file. Let me know if you need anything else
usbip.log

@dorssel
Copy link
Owner

dorssel commented Mar 16, 2022

@bhav97
Another issue that may be related to yours was fixed in the current master build. Could you give https://github.com/dorssel/usbipd-win/actions/runs/1991685584 a try?

@bhav97
Copy link
Author

bhav97 commented Mar 19, 2022

Hi @dorssel , thanks for the update
I tried the 2d175 build, but I still receive the NULL bytes after the initial burst of data.

I'll collect some tshark logs as well

usbipd.log

Here's a tshark capture of attach, read data, detach
usbipd.zip

@dorssel
Copy link
Owner

dorssel commented Mar 19, 2022

@bhav97
These last tshark captures are interesting.

  1. At the start of the communication (endpoint = 5), there is 1 'x' character incoming, and 1 'x' character outgoing. This is the only outgoing character in the entire capture (packets 90-94)
  2. Then a burst of "Helloxxx\a" bunched up in full packets of 32 bytes (packets 95-140). The last packet gets truncated at the 16-byte mark.
  3. Then it is followed by the NUL bytes. Sometimes 1, sometimes up to 4 in a packet (packets 142-922).
  4. Then some sort of a "line reset" with a new line coding request (packet 924-926)
  5. Then a single interrupt on endpoint 4 (packet 927)
  6. And finally from packet 963 the same message "Helloxxx\a", but this time 1 byte per packet. And this never ends, and there are no NUL characters.

To me this looks like: it is a custom firmware, where the device is supposed to spit out "Helloxxx\a" repeatedly. The packets quickly return 32 characters at a time, but the last one ends after 22 packets with a truncated packet at the 16-byte mark. Then NULs. Looks a lot like "buffer overrun" in the buffered serial output.
Then a line state switch, after which the device returns the same message (apparently unbuffered this time, 1 character per read). And that goes fine.

I'm sorry, but I don't see anything wrong in the USBIP protocol, nor in the USB request, nor in usbipd-win handling those. Are you sure the device is programmed correctly? What happens on a bare-metal Linux?

@bhav97
Copy link
Author

bhav97 commented Mar 27, 2022

Hi @dorssel , You're right the firmware simply prints out Helloxxx over UART. The onboard STlink serves the UART data over the VCOM.

It works on Ubuntu20.04 and Win11.

  1. On WSL there seems to be some sort of an echo (I sometimes see multiple outgoing characters, one at a time) vs nothing like that on windows
  2. On WSL there are USB frames with multiple characters of the string whereas on windows the data always is read one character at a time.

I'm not sure if this behavior is because of different handling of bulk requests on these two platforms so I'll try collecting tshark logs on Ubuntu as well for a more appropriate comparison.

Btw I use tio/minicom on WSL/Ubuntu and PuTTY on Windows. Here's the capture from the Windows host
wincap.zip

@dorssel
Copy link
Owner

dorssel commented Mar 27, 2022

@bhav97
I think it may be timing. On native Windows, or bare Linux, the device driver is loaded as soon as the device boots. So the virtual COM port is polled immediately, and starts picking up the characters one-by-one immediately.

On WSL, the device is first booted, and then it takes a while before the attach happens and the Linux tty driver starts reading. In this time, it looks like the device is filling a buffer internally, because the USB requests haven't started yet. When the attach and Linux driver are finally loaded, the first USB requests start draining the buffer, returning 32 characters at a time.

Your WSL capture did show that after a while, also on WSL the characters come in 1 at a time, without the NULs. Can you confirm this?

My guess: on WSL the device needs to buffer the period from USB power up to 'attach + Linux driver load'. That goes somewhat OK (32 characters at a time for a while) but end in NULs because the device has overflowed its buffers. You don't see this on native Windows/bare Linux, because there is hardly any delay between power up and driver load.

@dorssel
Copy link
Owner

dorssel commented Apr 28, 2022

Root cause seems to be user firmware. Closing issue.

@dorssel dorssel closed this as completed Apr 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hardware Incompatible device or hub
Projects
None yet
Development

No branches or pull requests

2 participants