USB: File writing hang regression for big files #3452
Comments
I have also seen multiple reports of DFU CRC mismatches or Invalid Manifests when trying to install via USB while the current firmware has the aforementioned USB CDC changes. I am currently trying to test whether they are directly linked, but it seems suspicious that there are no reports of these symptoms when installing via USB on firmware that doesn't have the changes to the USB CDC parameters... I think it would be beneficial to double check them before they hit an RC... |
Greetings. We were unable to reproduce it on our test systems. Could you provide additional details: OS version you are using. We will continue testing on other systems in the meantime. |
I have not been able to replicate the DFU CRC mismatches myself, but I noticed that the people reporting them are under similar conditions to the qFlipper write timeouts. The qFlipper timeouts, however, I can replicate consistently.

I am using Fedora Linux 37, but I have had it happen first hand on Windows too, and have heard of it happening on Mac too.

Installed firmware that causes these issues is anything with those USB CDC parameters changed. I have been testing on the latest OFW dev branch at the time of writing, and also on the Momentum dev branch after re-inserting the USB CDC changes. At the time of reporting this issue, I was testing on the OFW dev branch, and on the XFW dev branch too before reverting the USB CDC changes, which fixed it. Also at the time of reporting, there were reports of it happening on RM too, and there too removing the USB CDC changes fixed it (quote "FIX GLITCHY INSTALL" from his git history).

Target firmware is anything. The issue is random, so a firmware with a bigger resources.tar has a higher chance of encountering it. While installing Momentum / XFW (around 6MB of resources.tar), it happens 4 out of 5 times, at a random point during the writing of resources.tar. I have gotten it to happen while installing the OFW dev branch too, though with a lower chance due to the dramatically smaller resources.tar.

PC is anything. I have observed it first hand on my desktop PC running a Ryzen 5600 and a B550 motherboard, on both front and back USB ports, on both Fedora and Windows. Also on my laptop, a Huawei MateBook with a Ryzen 3500U, on both Arch Linux and Windows. And as I said, I have heard from dozens of people on very diverse platforms, though by now it has gone quiet because reverting those parameter changes downstream fixes the issue entirely.

To get some rough expectations of the frequency of this issue, I tested on my main desktop machine on Fedora. Installed firmware is OFW dev branch (1070064).
once again, due to the size of the resources.tar providing a larger window for the bug to manifest itself. I mean no disrespect, but I strongly feel that this is a problem on the Flipper side, rather than on the PC side. After the issue occurs, there is no way to re-establish USB functionality until the Flipper is rebooted. You can connect to another PC, restart the PC, or restart qFlipper, but nothing will help. The USB interface of the Flipper is simply stuck. |
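The size argument above can be made concrete with a small sketch. It assumes, purely for illustration, that a transfer is split into equal chunks and that each chunk write hangs independently with the same small probability; neither the chunking nor any particular probability comes from the firmware or from measurements. The function name `hang_probability` is hypothetical.

```c
/* Sketch only: assumes each chunk write hangs independently with the
 * same probability p_chunk (an assumption, not a measured value). */

/* Probability that at least one of n_chunks writes hangs. */
double hang_probability(double p_chunk, unsigned long n_chunks) {
    double survive = 1.0; /* probability that no chunk has hung so far */
    for (unsigned long i = 0; i < n_chunks; ++i) {
        survive *= 1.0 - p_chunk;
    }
    return 1.0 - survive;
}
```

Under this model the hang probability only grows with the number of chunks, which would be consistent with a large resources.tar failing far more often than a small one.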
Compiling the OFW dev branch locally with DEBUG=1 and attaching a debugger after the hang (in this case a crash) reveals that it is having issues at
|
Thank you for such complete feedback. I will try to replicate this issue myself and will get back to you today. |
Just wanted to say that this is still an issue. I have also gotten multiple reports from people installing Momentum firmware with the web updater that it stops while uploading resources.tar to the Flipper. This doesn't seem to be an issue with Momentum, since they are on OFW, trying to install another firmware, and it is failing while uploading the file, rather than while installing it. |
Please, is there ANY info on what that USB CDC change did? So far I've seen no info on it and no benefits from it (no faster speeds or anything), just random file writes failing and hanging the USB interface indefinitely. |
I was never able to reproduce it on any of our systems. My colleague managed to get it once on a home Windows PC, however it involved using a faulty SD card and fully wiping it prior to the update attempt. Are you using a USB hub by any chance? |
There is no hub. Using otherwise the same system and setup, with the CDC changes included it happens quite often when writing large files over RPC; with the CDC changes reverted it never happens. There are no other variables that affect this issue, it's just "CDC parameter changes" or "no CDC parameter changes" in the currently installed firmware. I've even heard now from people struggling to install Momentum because of this: after I told them to downgrade to OFW 0.97.1 to be sure that this bug isn't included, they couldn't even downgrade! Downgrading from OFW to a lower OFW to fix the issue would not work either; they had to use DFU. |
Alright, thank you for the details. Would you be able to capture a stack trace on the official firmware? |
Oddly enough, I've only gotten the crash once, which is what I posted. It's usually just a hang: the Flipper continues to work, but the USB interface is dead. My assumption is that the crash I got there is one of the many possible side effects of silent corruption caused by the changes we are debating, while the most common outcome is the USB interface locking up and qFlipper timing out. I will try to find the time to get more info, but with all due respect, we are talking about just 4 parameter changes, only 4 lines, that were changed with no additional context. It worked perfectly before, and it randomly breaks after, with no explanation or known improvement from changing those 4 numbers.

works:
#define CDC0_RXD_EP 0x01
#define CDC0_NTF_EP 0x83
#define CDC1_TXD_EP 0x85
#define CDC1_NTF_EP 0x86

breaks:
#define CDC0_RXD_EP 0x02
#define CDC0_NTF_EP 0x81
#define CDC1_TXD_EP 0x84
#define CDC1_NTF_EP 0x83

The linked PR #3358 that introduced this aimed to fix the CDC parameter changes were only denoted as
with no known improvement, bugfix, or other benefit related to the change. And yet, it causes issues for many users; I've seen at least 20 so far myself. |
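For readers comparing the two sets of values quoted in the comment above: in a USB endpoint address (`bEndpointAddress`), bit 7 encodes the direction (1 = IN, device to host) and the low four bits encode the endpoint number. The following is a minimal sketch that decodes such values; the helper names `ep_is_in`, `ep_number` and `ep_describe` are illustrative and are not taken from the firmware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit 7 of bEndpointAddress: 1 = IN (device to host), 0 = OUT. */
bool ep_is_in(uint8_t addr) {
    return (addr & 0x80u) != 0;
}

/* Bits 3..0 of bEndpointAddress: the endpoint number. */
uint8_t ep_number(uint8_t addr) {
    return (uint8_t)(addr & 0x0Fu);
}

/* Writes a label such as "EP3 IN" into buf; returns snprintf's result. */
int ep_describe(uint8_t addr, char *buf, size_t len) {
    return snprintf(buf, len, "EP%u %s",
                    (unsigned)ep_number(addr),
                    ep_is_in(addr) ? "IN" : "OUT");
}
```

Decoded this way, the "works" values are EP1 OUT, EP3 IN, EP5 IN and EP6 IN, while the "breaks" values are EP2 OUT, EP1 IN, EP4 IN and EP3 IN. Only the four defines quoted in the comment are covered here; the remaining endpoint defines are not shown in this thread.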
Sorry! I'll go stand in a corner... I forgot to write that I had tested that file and got the same issue. But just to be sure I tried it again, and now 0.97.1 worked flawlessly. Sorry for the time-waste and thanks for all that you do!!! |
Apologies for my standalone ticket, I discovered this browsing another user's open ticket. I am experiencing the same issue, and needed to downgrade my OFW to 0.97.1 in order for the Momentum FW to be installed. Quite curious. 17 [default] Binding on background is not deferred as requested by the DeferredPropertyNames class info because one or more of its sub-objects contain an id. |
indeed, the issue seems resolved with #3705. thank you! |
Describe the bug.
Have 10+ user reports, including first hand, of qFlipper consistently hanging during the update process while writing resources.tar (although at random points throughout the write operation). I tested back through the commit history and narrowed it down to #3358. In particular, reverting the USB CDC parameter changes fixes the issues.
Reproduction
Target
Flipper USB -> qFlipper on all operating systems
Logs
In Flipper serial debug logs, write operations flow through normally until they suddenly stop, for no apparent reason, and a few seconds later qFlipper reports a timeout error. No other information is present in the logs.
Anything else?
The PR in question mentions "switch from 2x unidirectional data endpoints to 1x bidirectional".
Seems like this is what is causing issues, although not clear why. Is there any more info on this change?