New connections can't be made after some time. #255

LeoBortoni · 2020-07-14T16:37:59Z

bleak version: 0.7.0
Python version: 3.7.7
Operating System: Windows 10

Description

This is what I have been trying to do for the past two months. My goal is to create an application that reads some number N of sensors at the same time via Bluetooth, collects their data and publishes it to my data bank. It is a simple thing: the app loads the list of sensor addresses and splits it into sublists (simultaneos_connections_list). For each sensor in a sublist, I create a thread or async task to read that sensor (yes, I created a wrapper around the bleak lib that allows me to use threads). Each thread or async task creates a Bleak client, searches for the sensor and tries to connect, gets its data and publishes it to the data bank. Then the next group creates its threads and performs exactly the same task for other sensor addresses. Once all groups are finished, the while (True) loop restarts and the application creates the threads again.

Everything works pretty well for some time (both the async and the thread version). The problem occurs after 3 or 4 hours of running the application. After this time, bleak just can't connect to any sensor. It finds the Bluetooth address and tries to connect, but it always stops at the "Get Services..." log and then raises an empty exception, which I assume to be a timeout exception. I have tested N = 1, 2 and 3 simultaneous connections. I have tested many different approaches and will list them below, together with my assumptions about why this is happening.

In each thread, I try to connect to the sensor up to 20 times, each time calling the client's connect method like this: client.connect(timeout = 3). After 20 failed attempts, the thread throws an exception "Could not find the address".
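For readers who want to reproduce the structure of this loop, here is a minimal sketch of the async variant. It is not the original code: `chunk`, `read_with_retries`, `poll_once` and the `read_once` coroutine are hypothetical names, and `read_once(address)` stands in for "create a client, connect, read, disconnect". Only the group size, the 20 attempts and the 3 second timeout come from the description above.

```python
import asyncio


def chunk(addresses, size):
    """Split the address list into groups of at most `size` entries."""
    return [addresses[i:i + size] for i in range(0, len(addresses), size)]


async def read_with_retries(address, read_once, attempts=20, timeout=3.0):
    """Call `read_once(address)` until it succeeds or `attempts` is exhausted.

    `read_once` is a hypothetical coroutine function supplied by the caller;
    each call is limited to `timeout` seconds.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return await asyncio.wait_for(read_once(address), timeout)
        except Exception as exc:  # includes asyncio.TimeoutError
            last_error = exc
    raise ConnectionError(
        f"Could not read {address} after {attempts} attempts"
    ) from last_error


async def poll_once(addresses, read_once, group_size=3, attempts=20, timeout=3.0):
    """Read every sensor once, `group_size` sensors at a time.

    Returns a dict mapping each address to its reading, or to the exception
    raised after the last failed attempt.
    """
    results = {}
    for group in chunk(addresses, group_size):
        outcomes = await asyncio.gather(
            *(read_with_retries(a, read_once, attempts, timeout) for a in group),
            return_exceptions=True,
        )
        results.update(zip(group, outcomes))
    return results
```

The outer while (True) loop would then simply call `poll_once` repeatedly and publish the successful entries of the returned dict.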

Note > For each measurement on my sensors, I create a thread and a bleak client in that thread, try to connect and read, publish the result to the data bank and then delete the client.

What is even weirder is that once the application can't connect to the sensors anymore, it will not work again even if I stop the app and restart it. It can't connect, just as if it had been running for 4 hours and then stopped. Even if I restart the Bluetooth module in Windows, it does not connect to any sensor. I have a test.py program which just connects to a sensor, reads its data and prints it on the screen, just to test the bleak library, and even this program doesn't work anymore. The only thing that allows me to connect to my sensors again is restarting the computer. Then I am able to run test.py or the main application with threads, and it works.

One thing to note > Even after bleak stops connecting and can't connect to any sensor unless I restart the computer, I CAN still connect to the sensor devices using the Windows Bluetooth interface. Only the bleak module doesn't work anymore.

One interesting thing that has pointed me in some direction is as follows. Once, I left the app with threads running and after 3 hours it stopped connecting to the sensors. Nothing new, but I left it running anyway, just to see if it would connect again after some hours. Then, after 2 more hours, I got a memory error from Python: Process is terminated due stackoverflow exception. The application was forcibly closed, and the error box that appeared said: 'A new guard page for the stack cannot be created'. That is pretty weird, since I create the client in a thread and delete it after the connection attempt, whether it fails or succeeds. Pictures of the errors are attached.

So, let me describe what I have tried.

What I Did

The first thing I tried was, on each cycle of the while (True) loop, to execute a subprocess running a PowerShell script that turns the Windows Bluetooth module off and on. It doesn't work.

Then I tried something more severe. In administrator mode, I ran a program that forces a restart of all the Windows Bluetooth services on each pass of the while (True) loop. It did not work.

Then I tried to track the memory usage of the application in a run where it was left running for 3 hours and then stopped connecting. Attached is the graph of the memory usage. By my calculations, the highest peak of memory use is exactly the moment when the application can't connect anymore. That peak is the last sample collected; after that, bleak can't connect.

Then I tried to force garbage collection after each pass of the while (True) loop with import gc and gc.collect(). It did not work.

I have also tracked the highest memory usage of my application at the point where it stops connecting to the sensors after 3 hours; the trace is attached.
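As an illustration of this kind of tracking, here is a minimal sketch using the standard library's tracemalloc module. It is not the tool used for the attached graphs, and `measure_iteration` and `top_growth` are hypothetical helper names. One caveat worth stating: tracemalloc only sees allocations made through Python's own allocator, so memory held by .NET objects behind pythonnet would presumably not show up here.

```python
import gc
import tracemalloc


def top_growth(before, after, limit=5):
    """Return (location, size_diff) for the lines whose allocations changed most."""
    stats = after.compare_to(before, "lineno")
    return [(str(stat.traceback), stat.size_diff) for stat in stats[:limit]]


def measure_iteration(work, limit=5):
    """Run `work()` once and report the net Python-heap change in bytes.

    Garbage collection is forced before each snapshot so that only memory
    that is still referenced (or leaked) is counted.
    """
    if not tracemalloc.is_tracing():
        tracemalloc.start()
    gc.collect()
    before = tracemalloc.take_snapshot()
    work()
    gc.collect()
    after = tracemalloc.take_snapshot()
    growth = sum(stat.size_diff for stat in after.compare_to(before, "lineno"))
    return growth, top_growth(before, after, limit)
```

Calling `measure_iteration` around one pass of the measurement loop and logging the returned growth gives a per-iteration figure that can be plotted over time.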

And this result really makes sense, because when I collect data from just one sensor, that is, leave the whole application running with a single sensor device (one thread) and publish its data to the data bank, the application runs for 10 hours and then stops connecting. This is interesting because perhaps some stack in discovery.py is overloading the OS Bluetooth driver, which only gets back to normal after the machine is restarted.

Now prepare yourself for the grand finale.

OK, I thought, I could try to force the garbage collection in another way: create a process instead of a thread, using the subprocess library. And that is what I did, with some simple code, and then I left it running. For each measurement, I create a process, and inside this process I create a bleak client, do my reads and publish them to my data bank. IT WORKS. Right now the application is completing 24 hours of running with 3 simultaneous connections (this means that in my main process I create 3 threads at a time, and in each thread I create a process).
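A minimal sketch of the process-per-measurement idea, for anyone who wants to try the same workaround. This is not the attached code: `WORKER_SOURCE` and `measure_in_subprocess` are hypothetical names, and the worker body is a placeholder where the real connect, read and disconnect would go. The point is only that everything the worker allocated is released when its process exits.

```python
import json
import subprocess
import sys

# Hypothetical worker: in a real application this would create the client,
# connect, read the sensor and disconnect before printing the result.
WORKER_SOURCE = """
import json, sys
address = sys.argv[1]
print(json.dumps({"address": address, "value": len(address)}))
"""


def measure_in_subprocess(address, timeout=60.0):
    """Run one measurement in a fresh Python process and return its result.

    The worker reports its result as one JSON document on stdout. When the
    process exits, the operating system reclaims everything it allocated.
    """
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE, address],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if completed.returncode != 0:
        raise RuntimeError(
            f"worker for {address} failed: {completed.stderr.strip()}"
        )
    return json.loads(completed.stdout)
```

The cost is one interpreter start-up per measurement, which is the overhead referred to in the next paragraph.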

But this is not a good solution, since I am creating a lot of processes and this is too costly. Besides, I need to use the async version of this application, without threads. The threads were only an attempt to find a solution.

So, I think this is everything I can remember right now. Please let me know if you have already seen something like this.

hbldh · 2020-07-20T10:17:17Z

Very good analysis of the problem. It is now proven that #133 is real and might need solving.
When creating a process and closing it, all .NET objects are released by Python when the process ends, so it should work there without problems. That is your only option currently, at least in version 0.7.1 and lower.

I would be much helped by you installing the develop branch of bleak and running your long-running code now, since I made some changes in cbc6069. I have some more ideas for improving this, but this is a first attempt.

LeoBortoni · 2020-07-21T14:08:18Z

I did as you asked and repeated the long-running experiment with the develop branch, version 0.7.2a3. The behaviour is still the same. I tested both calling and not calling the garbage collector after each pass of the while (True) loop; the result is the same. Here is a picture of the test with 0.7.2a3 calling the garbage collector. At approximately 4200 s, my data bank records the last sample collected via Bluetooth. After this it stops completely, and later my machine stopped working and crashed. I got a black screen with a blinking horizontal bar, and I had to force a shutdown.

hbldh · 2020-07-29T20:02:29Z

@LeoBortoni Could you please send me the code you run, along with the memory measurement solution that you have and I will try to make sense of this problem in some way? If you don't want to publish it here then send it via mail to me.

LeoBortoni · 2020-07-30T15:52:18Z

bleak_approachs.zip
Here are both programs: the one that works (creating multiple processes with threads) and the one that doesn't.
In bleak_which_works.py you will need to change the arguments on line 48 to the arguments that you use to execute a Python program. Try to let bleak_which_dosent_work.py run for one day; I am curious whether your machine will crash with a black screen as mine did. To generate the memory graphs I am using the application described here: https://medium.com/zendesk-engineering/hunting-for-memory-leaks-in-python-applications-6824d0518774

hbldh · 2020-08-09T16:09:44Z

@LeoBortoni Thank you for this. I will run this during this week and get back to you.

I have been experimenting extensively with the pythonnet package and the Windows UWP Bluetooth LE API these past two weeks. I have gotten approximately nowhere, but I think the problem is that no event handler deregistration can currently be done. All events can have handlers added (with +=, in which case the handler method has its counter incremented), but when removing a handler (with -=) they all respond with an error that the method cannot be found. I think it has something to do with pythonnet and the fact that UWP is not directly supported by it. Handlers are left with their counters incremented, and these are never decremented.

These links are related:

I want to do something like this, but it is apparently impossible with the UWP/WinRT solutions.

This might be solved by implementing #180.

hbldh · 2020-08-19T18:33:56Z

@LeoBortoni I have run your code and tried to modify the pythonnet code in different ways to alleviate the problem, but it is still there. I have been unsuccessful in solving the issue directly.

However, I solved a different problem: I implemented a different .NET backend, using the winrt package instead. If you are willing to check out the feature/winrt branch and run your code using that backend instead, I would appreciate that! Import everything as usual, but you have to pass handles to read_gatt_char/write_gatt_char/start_notify, because winrt has a bug handling UUID values right now, and I do not know if it will ever be solved... Handles can be found e.g. by running the service_explorer.py example.

In case it solves your memory problem I will merge the winrt backend into the codebase and leave it as an optional Windows backend. It seems to be a bit slower, but if it manages to do gc correctly then it has a clear edge...

Run pip install winrt and then run your code with handles.

hbldh · 2020-08-20T14:46:24Z

This is me running connect, notify and disconnect with the current develop branch for 10 minutes

This is me running connect, notify and disconnect with the feature/winrt branch for 10 minutes

There are comments about C# not collecting memory unless needed, so the graphs might not actually show unavailable memory, e.g. here. That is, the feature/winrt branch might still work, but the memory is shown as spoken for.

In your example, try to remove everything that is not bleak (influxdb and suchlike). I cannot understand why your plot has such a spiky behaviour at the end. It does drop down at the end, so the program does release memory when reaching the end. I want to know if the crash is really due to bleak or due to other components you use in your code.

I will run longer sessions during the weekend to see if I can crash my system as well. I've got 16 GB of RAM so it might take a while...

hbldh · 2020-10-20T21:37:13Z

There are multiple improvements in version 0.9.0 regarding the Windows backend. It might be so that this is solved there, if you want to try it out.

dlech · 2022-07-25T21:00:32Z

We dropped the pythonnet backend some time ago. If this issue also exists in the winrt backend, let's start a new issue.

hbldh self-assigned this Jul 20, 2020

hbldh added the Backend: pythonnet and bug labels Jul 20, 2020

hbldh added this to the Version 0.8.0 milestone Jul 20, 2020

hbldh added a commit that referenced this issue Jul 20, 2020

Additional work for fixing #255 and #133.

7b55dfa

hbldh modified the milestones: Version 0.8.0, Version 0.X.Y Aug 18, 2020

hbldh mentioned this issue Aug 19, 2020

Change from pythonnet to WinRT #180

Closed

hbldh mentioned this issue Sep 2, 2020

Release/v0.8.0 #279

Merged

hbldh closed this as completed in cbc6069 Sep 14, 2020

hbldh reopened this Sep 28, 2020

hbldh mentioned this issue Sep 28, 2020

Disconnected callbacks firing multiple times when multiple connections are made in a session #312

Closed

dlech mentioned this issue Aug 5, 2021

bleak-winrt #609

Merged

dlech closed this as completed Jul 25, 2022
