Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

App doesn't sync after coming from the background #2396

Closed
kingalg opened this issue Mar 29, 2024 · 16 comments
Closed

App doesn't sync after coming from the background #2396

kingalg opened this issue Mar 29, 2024 · 16 comments
Assignees
Labels
bug Something isn't working iOS mobile

Comments

@kingalg
Copy link
Collaborator

kingalg commented Mar 29, 2024

Version: 2.1.2-alpha.14 (367)
System: iOS

Issue: The App doesn't sync after coming from the background. The app on iOS never really works in the background, but for some time it was syncing correctly after it was reopened without a need for swiping up or force-quiting. Currently, it doesn't sync.

Steps to recreate: Have a working app on ios, exchange a few messages with another user on a different platform. Put the app on iOS into the background (leave and check a different app or let the iPhone turn the screen off). Do not swipe the app up, force quit it or restart the phone - make sure that it's still in the background. Open it again and try to send/receive a message. It should sync after a short time, but it never does.

Notes: I checked it again on our prod version and it is working there. So, this is a regression that only appeared at some point in one of the 2.1.2 alphas.

@kingalg kingalg added bug Something isn't working iOS mobile labels Mar 29, 2024
@kingalg kingalg added this to Quiet Mar 29, 2024
@kingalg kingalg moved this to Backlog - Mobile in Quiet Mar 29, 2024
@holmesworcester holmesworcester moved this from Backlog - Mobile to Sprint in Quiet Mar 29, 2024
@holmesworcester
Copy link
Contributor

We should fix this. It's a blocker!

@holmesworcester
Copy link
Contributor

Since this is a regression, it's worth investing in a test, which also means fixing mobile e2e tests before the release.

At quick glance, it looks like Detox can test sending the app to background and returning from background, so let's do it:

https://wix.github.io/Detox/docs/next/api/device/#devicelaunchappparams

@holmesworcester
Copy link
Contributor

holmesworcester commented Apr 2, 2024

Let's first fix the issue and then see if there is a way to test this issue with our current setup. If there is, let's do it.

If there isn't, let's not.

Ideas for how to test this if we want to:

  1. Try to send a message, if it is not greyed out that should mean that it has connected correctly to the backend.
  2. Build some visual indicator in the app frontend that could show us if we are not able to connect to the Tor control port or know that Tor is working.
  3. Another idea: add a Tor connection indicator similar to the one we show in Desktop alphas in mobile alphas.
  4. Try 2-player Detox tests now (it wasn't possible to in the past because of problems with Tor, but maybe it's not hard to do and so easy that we should just do it.)
  5. If it's useful to use the new "trying to connect..." indicator of connection status then we can merge that work.

@holmesworcester
Copy link
Contributor

Lucas can't reproduce.

It should sync after a short time, but it never does.

How short? Once the app goes into background and comes back, Tor has to restart and then it has to reconnect. This could take some time.

Notes: I checked it again on our prod version and it is working there. So, this is a regression that only appeared at some point in one of the 2.1.2 alphas.

This is the most interesting and confusing part. Why would the last release behave differently? Is it possible that in production, when you switch away and switch back, the app is not getting sent to the background?

Could there be some difference in performance or memory usage, or in the context in which Kinga is testing, that changes how quickly the app goes into the background?

@holmesworcester
Copy link
Contributor

holmesworcester commented Apr 2, 2024

I think overall we should treat this as an iOS+Tor reliability question to resolve in the future, and assume that the app will sync if given enough time and enough tries.

Let's move this out of sprint since we can't reproduce it.

Also, @kingalg and @leblowl are you testing on the same device? Which device?

@holmesworcester holmesworcester moved this from Sprint to Next Sprint in Quiet Apr 2, 2024
@leblowl
Copy link
Collaborator

leblowl commented Apr 2, 2024

I'm testing on an iPhone SE 2nd gen

@holmesworcester
Copy link
Contributor

holmesworcester commented Apr 2, 2024

Notes: Lucas also sees long syncing times after returning to foreground in production, and suspects that we could reproduce the error on desktop as well after restarting.

These are the logs when there is a delay in restarting the connection:

[31;1mbackend:libp2p:websockets [0mconnect zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion:80 [31m+10s[0m

  [31;1mbackend:libp2p:websockets [0m/dns4/dmjdcthp7lo6igbeuatzgeuuyjg25vkamkxosgsp6rhotxt3ezahexad.onion/tcp/80/ws/p2p/QmWkhUw5redKNGLqu7fK4oqSYNXJ9RtHym9tmnjUzaLBnn connected /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [31m+9s[0m
  [31;1mbackend:libp2p:websockets [0mnew outbound connection /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [31m+1ms[0m

  [36;1mbackend:libp2p:websockets:err [0merror upgrading outbound connection /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3. Details: stream ended before 24 bytes became available [36m+46s[0m

  [31;1mbackend:libp2p:websockets [0mconnect zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion:80 [31m+38s[0m

  [31;1mbackend:libp2p:websockets [0m/dns4/dmjdcthp7lo6igbeuatzgeuuyjg25vkamkxosgsp6rhotxt3ezahexad.onion/tcp/80/ws/p2p/QmWkhUw5redKNGLqu7fK4oqSYNXJ9RtHym9tmnjUzaLBnn connected /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [31m+4s[0m
  [31;1mbackend:libp2p:websockets [0mnew outbound connection /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [31m+1ms[0m

  [36;1mbackend:libp2p:websockets:err [0merror upgrading outbound connection /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3. Details: stream ended before 24 bytes became available [36m+44s[0m

  [31;1mbackend:libp2p:websockets [0mconnect zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion:80 [31m+40s[0m

  [31;1mbackend:libp2p:websockets [0m/dns4/dmjdcthp7lo6igbeuatzgeuuyjg25vkamkxosgsp6rhotxt3ezahexad.onion/tcp/80/ws/p2p/QmWkhUw5redKNGLqu7fK4oqSYNXJ9RtHym9tmnjUzaLBnn connected /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [31m+2s[0m
  [31;1mbackend:libp2p:websockets [0mnew outbound connection /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [31m+2ms[0m

  [36;1mbackend:libp2p:websockets:err [0merror upgrading outbound connection /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3. Details: stream ended before 24 bytes became available [36m+42s[0m

  [31;1mbackend:libp2p:websockets [0mconnect zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion:80 [31m+40s[0m

  [31;1mbackend:libp2p:websockets [0m/dns4/dmjdcthp7lo6igbeuatzgeuuyjg25vkamkxosgsp6rhotxt3ezahexad.onion/tcp/80/ws/p2p/QmWkhUw5redKNGLqu7fK4oqSYNXJ9RtHym9tmnjUzaLBnn connected /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [31m+1s[0m
  [31;1mbackend:libp2p:websockets [0mnew outbound connection /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [31m+2ms[0m

  [35;1mbackend:Libp2pService [0mQmWkhUw5redKNGLqu7fK4oqSYNXJ9RtHym9tmnjUzaLBnn connected to Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 [35m+3m[0m
  [35;1mbackend:Libp2pService [0mQmWkhUw5redKNGLqu7fK4oqSYNXJ9RtHym9tmnjUzaLBnn is connected to 1 peers [35m+1ms[0m
  [35;1mbackend:Libp2pService [0mQmWkhUw5redKNGLqu7fK4oqSYNXJ9RtHym9tmnjUzaLBnn has 1 open connections [35m+0ms[0m
  [31;1mbackend:libp2p:websockets [0moutbound connection /dns4/zedbjrdnxpexhfxxwljjmw6csodt7o5j3d2ebd6upit4aiggdjhec5ad.onion/tcp/80/ws/p2p/Qmc9ogVhL9kL5ZJKPbfjKTMvATVkQCoNfTmLEUxvadfds3 upgraded [31m+5s[0m

Error upgrading outbound connection is something we see elsewhere when we use the wrong PSK:

#1897 (comment)

It may mean other things too.

@kingalg
Copy link
Collaborator Author

kingalg commented Apr 3, 2024

@holmesworcester Device - I'm also using iPhone SE 2nd gen
Usually, it should sync after a few seconds (and this is how it's working on prod for me). So, most often, it will be a few seconds, but on rare occasions when it's longer, it's no more than half a minute. On our current alpha version, I have been waiting for 10-20 minutes without success. For me, on prod it's syncing every time and on alpha, it's not syncing at all. @leblowl could you give me your step by step or record what you are doing? Maybe we do something differently.

@leblowl
Copy link
Collaborator

leblowl commented Apr 3, 2024

@kingalg I just open the app, connect to a community, then put the app in the background, do something else (open maps or something that requires a lot of resources), wait maybe 20 seconds, then open the app again. I was able to reproduce this on production, where it didn't sync after 15 minutes or more. I think restarting the app fixes things. But I can play it with more, maybe it happens more often on the current alpha then production. Either way I think there is likely a bug here that we can fix.

@kingalg
Copy link
Collaborator Author

kingalg commented Apr 4, 2024

@leblowl did you send any message from another app when this one was in the background? And yes, restarting the app fixed it, as the issue is only with the app not syncing after returning from the background (it was working exactly like this in the past, but after Wiktor introduced some changes, the app was syncing correctly). I'm really curious why we are getting such different results on the same version and the same device. That's interesting.

@holmesworcester
Copy link
Contributor

Lucas can reproduce this because it happens frequently enough but probably anyone can.

@leblowl leblowl moved this from Next Sprint to Sprint in Quiet Apr 4, 2024
@leblowl leblowl moved this from Sprint to In progress in Quiet Apr 4, 2024
@leblowl
Copy link
Collaborator

leblowl commented Apr 6, 2024

I looked into this more, and it looks like since we are now randomizing ports when iOS resumes from background, the HTTP tunnel port that the backend uses to connect to Tor changes. Libp2p has an auto-dialer that re-dials peers, which is how we re-connect to peers when the app resumes. However, in the version of libp2p that we are using, it dials peers sequentially and once the HTTP tunnel port changes, it looks like it just gets hung up waiting for a socket to connect and it never does and we never receive a ECONNREFUSED error, so the dialer just waits. The dialTimeout option is one way to prevent something like this, and we weren't using it. So I hooked that up in #2414, but it still means there is a delay for however long dialTimeout is set to, which is currently 2 minutes.

Ideally, to really fix this, when the app goes to background (or resumes), we can find a way to close any sockets that reference the old HTTP tunnel port.

@holmesworcester
Copy link
Contributor

Let's make a new issue for the more complete fix.

@holmesworcester
Copy link
Contributor

@leblowl leblowl moved this from In progress to (Mobile) Ready for QA in Quiet Apr 16, 2024
@leblowl
Copy link
Collaborator

leblowl commented Apr 16, 2024

@kingalg in 2.2.0, after maybe 3 minutes, the app should resume syncing. For a more thorough fix, we have #2424 which will be in the next release ideally.

@kingalg
Copy link
Collaborator Author

kingalg commented Apr 22, 2024

iOS 371
A temporary fix is working.

@kingalg kingalg closed this as completed Apr 22, 2024
@kingalg kingalg moved this from (Mobile) Ready for QA to Done in Quiet Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working iOS mobile
Projects
Status: Done
Development

No branches or pull requests

3 participants