Improvements for async device management and exclusive access to device #376
…covery of devices is asynchronous (and because of the use of multiple async PortManagers in the test platform v3):
- Make sure all list operations use a lock of the collection
- PortTcpIpManager._networkDevices made static as it acts as a complement to the NanoFrameworkDevices collection; if there are multiple, the device list management does not work properly for a non-static _networkDevices.

System-wide exclusive access to a device via system-wide mutex, required to avoid competition between e.g. Device Explorer auto-discovery, test platform etc. If this is used, it is possible for the nanoFramework tools to wait for a device to become available instead of throwing an exception. If an exception is thrown, that is an indication of a real problem, not a competition with other nanoFramework tools. Not sure if that also means the WaitAndRetry in the PortSerialManager should be removed.

Use of system-wide exclusive access in the serial DeviceWatcher; async part of that operation moved from PortSerialManager to DeviceWatcher. Also updated PortTcpIpManager+DeviceWatcher.

Serial ports: DeviceWatcher now has a static method to find all serial ports, incl. exclusion list, so that all nanoFramework devices use the same method. Also to be used in the test platform v3.

Port*Manager returns the added or already registered device for PortBase.AddDevice. Makes use in test platform code easier.

Fixed OnDeviceEnumerationCompleted logic if both AddDevice and a device watcher are used. The OnDeviceEnumerationCompleted is now (indirectly) fired by the DeviceWatcher.

Also fixed at least one statement that would result in an exception if NanoNetworkDevices have been discovered before NanoSerialDevices.

Signed-off-by: Frank Robijn <robijn@good-heavens.nl>
…ecause it was not read-only. SonarCloud may well be right that the way NanoFrameworkDevices is used in this library is questionable, but changing is more work. As it is now (readonly object instance), NanoFrameworkDevices is a valid lock-target as the instance never changes. Also added an exception handling in GlobalExclusiveDeviceAccess in case of killed processes - change was not committed previously.
- Before the changes in this PR, the PortSerialManager.OnDeviceEnumerationCompleted event was postponed if a new device was detected before the addition of previously detected devices was completed. In the modified version that was no longer the case. Behaviour is restored in this version.
- The DeviceWatcher.GetPortNames was changed to a static method, as it apparently is not used by any code in nanoFramework. As it is a public library, this may break the code of someone else. Original method reinstated and new static method made internal. The public version of the method is made part of the PortSerialManager, as external code is more likely to interact with the PortSerialManager than with DeviceWatcher.

Addition:
- It is allowed to change the PortExclusionList after the DeviceWatcher has been started. Made sure to add locks in the right places.
josesimoes
left a comment
Frank, I can see where you're going and I see value in this global lock approach.
A few comments to address.
Also, there seems to be something wrong with the execution of the rescan devices call. Not only does it take a serious amount of time, but it never flags the completion to the caller...
I have these COM ports on my machine (2 of the COM ports are nanoDevices)
Last comment: the algorithm is trying to reach the serial Bluetooth devices (when it shouldn't).
nanoFramework.Tools.DebugLibrary.Shared/PortComposite/PortCompositeDeviceManager.cs
nanoFramework.Tools.DebugLibrary.Shared/PortSerial/DeviceWatcher.cs
Oh yes, a lot of time. If you reserve a COM port for a virtual device and don't have that device running, it also takes ages before the enumeration completes. I've checked in the debugger that this has nothing to do with the locking mechanism. I suspect it is the WaitAndRetry in PortSerialManager.OnDeviceAdded. The total wait time alone is about 10 seconds for each port that is not connected to a nanoDevice, plus ten times the amount of time the underlying code needs to determine there is no nanoDevice connected to the port.

I have several of those reserved-for-virtual-device ports, and I've noticed that although the PortSerialManager.OnDeviceAdded method is called at the same time for all ports (as the method is called async and in parallel for all ports), the method never completes at the same time. It completes for one port at a time. The time between successive completions of the method for ports that are not connected to a device is a few seconds. That is very similar to the last wait period of the WaitAndRetry (= 2500 ms). I suspect that somewhere in the code that connects to the device there is a lock, so that the connections to the various devices are not tried in parallel but are forced to be sequential. If this is correct, then in your case, with 4 non-nanoDevice ports, the total time before completion is about 10 + (6-2-1)*2.5 + (a bit for the 2 connected devices) = 20 seconds.

I don't know why the WaitAndRetry is needed; for nanoDevices only one try (no retry) seems to be sufficient. I also didn't investigate the long return times. Instead I aim to have a configuration where you can exclude all special ports (in addition to the exclusion filter in the library).
I cannot reproduce that. The only way that can happen is if somewhere deep in the library there is code that tries to connect to a device via a port that never returns. But then that should also happen in the current code (if you would remove the "rogue port" filter).
I think that's solved in the next commit. Please check again (I don't have BT COM ports, can't test that.)
I've investigated the timing issue a bit further. No devices connected, just 6 COM ports that are unresponsive. The code that makes all code execute sequentially is the first operation that uses the wire protocol to query the nanoDevice. That fails of course, but only after some timeout. What causes all calls to be sequential is that communication via the wire protocol is protected via a static lock: Engine._syncReqLock. So the device watcher launches parallel threads for the six ports, and then they wait in turn to be able to send a message, each to a different port. There is no need for such a system-wide lock, nothing in the DebugEngine is static.

This is the only lock: if the lock is static, scanning the six ports takes 26 seconds. If the lock is not static but per Engine instance, scanning the ports takes about 10 seconds (= total wait time for the wait and retry loop). I presume that the fact that the lock is static will also hurt the test framework if you run tests on multiple real hardware devices, as the communication between the test host and devices for deployment and to get the results is also done via the wire protocol.

It seems easy to get rid of the static lock. To make it non-static may be a bit risky: if there is code out there that creates two Engines for the same port, then that code may fail after the change. That is not the case if _syncReqLock is a dictionary with the NanoDevice.DeviceId as key and the SemaphoreSlim as value. I'll see whether I can implement the dictionary later today; I'll add the commit to this PR.
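The dictionary idea from the comment above could look roughly like the following. This is only a sketch under assumptions, not the code that was committed: the class name `KeyedRequestLock`, its members, and the generic key type are hypothetical, and the real `Engine` may need a different shape. It hands out one `SemaphoreSlim` per key, so requests to the same device are serialized while requests to different devices can run in parallel.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the library): one SemaphoreSlim per device key.
public sealed class KeyedRequestLock<TKey>
{
    private readonly ConcurrentDictionary<TKey, SemaphoreSlim> _locks =
        new ConcurrentDictionary<TKey, SemaphoreSlim>();

    // Returns the semaphore for the key, creating it on first use.
    // Under a race the factory may run more than once, but only one
    // instance is stored and returned to every caller.
    public SemaphoreSlim For(TKey key)
    {
        return _locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
    }

    // Runs the request while holding the semaphore that belongs to the key.
    public async Task<TResult> RunExclusiveAsync<TResult>(TKey key, Func<Task<TResult>> request)
    {
        SemaphoreSlim gate = For(key);
        await gate.WaitAsync().ConfigureAwait(false);
        try
        {
            return await request().ConfigureAwait(false);
        }
        finally
        {
            // Always release, also when the request throws or times out.
            gate.Release();
        }
    }
}
```

With a key such as the device ID, two Engines created for the same device still share one semaphore, which is the property that a plain per-instance lock would lose.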
Ellerbach
left a comment
Couple of comments, all up, logic seems good to me.
nanoFramework.Tools.DebugLibrary.Shared/NFDevice/GlobalExclusiveDeviceAccess.cs
nanoFramework.Tools.DebugLibrary.Shared/PortComposite/PortCompositeDeviceManager.cs
nanoFramework.Tools.DebugLibrary.Shared/PortTcpIp/PortTcpIpManager.cs
nanoFramework.Tools.DebugLibrary.Shared/PortSerial/DeviceWatcher.cs
The rescan issue is coming from this call: For some reason the call to Maybe we should add a timeout on the serial port workflow so a faulty device can't block/hinder the discovery process.
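One way such a timeout could be sketched, assuming the serial probe can be expressed as a `Task<bool>` that accepts a `CancellationToken`. The names `ProbeWithTimeout` and `TryProbeAsync` are hypothetical and do not exist in the library; this only illustrates the pattern of racing the probe against a delay.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the library): bounds the time spent on one port.
public static class ProbeWithTimeout
{
    // Returns true only if the probe finished within the timeout and reported a device.
    public static async Task<bool> TryProbeAsync(
        Func<CancellationToken, Task<bool>> probe,
        TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource())
        {
            Task<bool> probeTask = probe(cts.Token);
            Task delayTask = Task.Delay(timeout, cts.Token);

            Task finished = await Task.WhenAny(probeTask, delayTask).ConfigureAwait(false);

            // Stops the delay timer, or asks a slow probe to give up.
            cts.Cancel();

            if (finished != probeTask)
            {
                // Timed out: the port is treated as having no device.
                return false;
            }

            return await probeTask.ConfigureAwait(false);
        }
    }
}
```

A probe that ignores the token keeps running in the background after the timeout, so the underlying serial code would still need to honor cancellation for the port to be released.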
@frobijn discovery and rescan working much smoother now! This is making very good progress. 👍🏻
I didn't analyse all the serial communication code, there's a lot and I'm not very familiar with that.
@josesimoes Your minor changes caused a merge conflict that VS2022 could not handle "because of ... codepage". Automerge resulted in an empty file. I hope that I've got all your changes in the new commit. Kept the static GetPortNames method, see above.
The result of a "hanging" device is that the "enumeration complete" event is never fired. The discovery process continues. As external software (like the Device Explorer) also listens to the "device added" event, the effect may be minimal.
Apologies for getting in the way...
Not exactly. There are other callers (like the VS device manager) that disable the rescan button while a scan operation is running. If the scan is never flagged as completed, the button is never re-enabled. That happens every now and then. The only remedy is to close and restart VS to release the resources. Let's not worry about this for now.
Ok. Then I think that I have addressed all issues/suggestions raised so far.
@frobijn one last improvement that ensures the devices are really closed (and therefore made available) when removed from the device list. This resolves a situation on device rescan where it wasn't possible to resume communication with a connected device until the application was restarted. Please take a look. If you're OK, let's have this merged!
josesimoes
left a comment
All good!
Great job improving this. 💯 👏🏻
No comments, looks good. Let's merge!
@frobijn thank you again for your contribution! 🙏😄 .NET nanoFramework is all about community involvement, and no contribution is too small. Please edit it and add an entry with your GitHub username in the appropriate location (names are sorted alphabetically): (Feel free to adjust your name if it's not correct)


Description
One feature of the proposed test platform v3 should be in this library:
System-wide exclusive access to a device via system-wide mutex, required to avoid competition between e.g. Device Explorer auto-discovery, test platform etc. If this is used, it is possible for the nanoFramework tools to wait for a device to become available instead of throwing an exception. If an exception is thrown, that is an indication of a real problem, not a competition with other nanoFramework tools. Not sure if that also means the WaitAndRetry in the PortSerialManager should be removed.
Use of System-wide exclusive access in the PortSerialManager+DeviceWatcher; async part of that operation moved from PortSerialManager to DeviceWatcher. Also updated PortTcpIpManager+DeviceWatcher.
The other changes have been triggered by the use of the debugger library in test platform v3:
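Local locks for global NanoFrameworkDevices collection because discovery of devices is asynchronous (and because of the use of multiple async PortManagers in the test platform v3):
- Make sure all list operations use a lock of the collection
- PortTcpIpManager._networkDevices made static as it acts as a complement to the NanoFrameworkDevices collection; if there are multiple, the device list management does not work properly for a non-static _networkDevices.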
Serial ports: DeviceWatcher now has a static method to find all serial ports, incl exclusion list, so that all nanoFramework tools can use the same method to find the ports. Also to be used in the test platform v3.
Port*Manager returns the added or already registered device for PortBase.AddDevice. Makes use in test platform code easier.
Fixed OnDeviceEnumerationCompleted logic if both AddDevice and a device watcher are used. The OnDeviceEnumerationCompleted is now (indirectly) fired by the DeviceWatcher.
Also fixed at least one statement that would result in an exception if NanoNetworkDevices have been discovered before NanoSerialDevices.
.editorconfig added
Motivation and Context
These changes are a result of the work for test platform v3: improvements and one extra feature that should be in this library rather than in test platform v3. But the changes are probably also relevant for other applications, so they are presented in a separate PR.
In the proposed test platform v3 I've introduced a mechanism that nanoFramework tools can use to get exclusive access to a device. The tool waits for the device to become available (optionally with timeout) rather than throwing an exception. If an exception is thrown, either a non-nanoFramework application is using the device or it is not a nanoFramework tool. This mechanism is needed in the test platform to prevent issues if the developer starts multiple test hosts. But it also prevents interference from the Device Explorer in Visual Studio, if the VS extension would use the same mechanism.
That mechanism should not be part of the test platform, but should be in this library. I'm prepared to ensure that the mechanism is working correctly in nanoff (and of course in the test platform v2 and v3). It should work out of the box for the VS Extension once the submodule reference to the nf-debugger repository has been updated, and would prevent device detection issues if two Visual Studio instances are open and try to discover devices at the same time. That requires this library and NuGet package to be updated first.
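For readers unfamiliar with the idea, a minimal sketch of waiting for exclusive access with a named, system-wide mutex is shown here. It is not the GlobalExclusiveDeviceAccess implementation from this PR: the class name `ExclusiveDeviceAccessSketch`, the mutex name prefix, and the `TryAcquire` signature are assumptions made for illustration, and the port name is assumed to contain no path separators.

```csharp
using System;
using System.Threading;

// Hypothetical sketch (not the PR's GlobalExclusiveDeviceAccess class).
public sealed class ExclusiveDeviceAccessSketch : IDisposable
{
    private readonly Mutex _mutex;

    private ExclusiveDeviceAccessSketch(Mutex mutex)
    {
        _mutex = mutex;
    }

    // Waits up to 'timeout' for the device; returns null if another tool still holds it.
    public static ExclusiveDeviceAccessSketch TryAcquire(string portName, TimeSpan timeout)
    {
        // The name is an assumption; every cooperating tool must use the same one.
        var mutex = new Mutex(false, @"Global\nanoFramework-device-" + portName);
        try
        {
            if (mutex.WaitOne(timeout))
            {
                return new ExclusiveDeviceAccessSketch(mutex);
            }
        }
        catch (AbandonedMutexException)
        {
            // The previous owner was killed without releasing; this thread now owns the mutex.
            return new ExclusiveDeviceAccessSketch(mutex);
        }

        mutex.Dispose();
        return null;
    }

    // Must be called on the thread that acquired the mutex.
    public void Dispose()
    {
        _mutex.ReleaseMutex();
        _mutex.Dispose();
    }
}
```

A `Mutex` is owned by a thread, so acquiring and releasing have to happen on the same thread. Code that awaits between the two calls may resume on another thread, which is one reason a real implementation needs more care than this sketch.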
The other changes are related to the (ease of) use of the library in the test platform. The library seems to be created for device watcher-type applications, and some code that modifies the global list of devices is not properly protected against simultaneous updates. Tried to fix that. Also: an application like test platform may create a debugger for a specific device that has not previously been discovered via a device watcher, but the application has to get the result from the global list anyway instead of having the device as a return value of the call.
I've noticed that there is some specialized code to prevent a connection to certain serial devices. Made that code available via a static method, so an application like test platform v3 can use exactly the same code to know which serial ports are present.
How Has This Been Tested?
Using the Serial Test App, at least for the serial devices. Don't know how to connect a device via TCP/IP (is that supposed to work?). The global exclusive access was tested by running one Serial Test App in the debugger and one outside Visual Studio.
Types of changes
Checklist: