[ACM] Perform cloud ping or reachability test on network events; non-blocking background reachability test #2811

Merged


@avtolstoy avtolstoy commented Aug 14, 2024

Description

  1. On network state changes, re-evaluation is performed once the cloud connection is fully established (i.e. it is postponed during the handshake). It is also postponed during OTA and while a post-OTA reset is pending. This now happens solely in the connection manager, outside of the network manager
  2. Re-evaluation now happens in the background without blocking the system (the normal internet test is still performed in a blocking fashion)
  3. Refactored/fixed the RTT calculation and added a score calculation; both take packet losses, reordering and duplicates into account
  4. The final score is penalized on losses, and exponentially on sequential losses (this is not ideal yet, but should be good enough for now; a better approach would be something similar to TCP with a congestion window, bursting, miss indication and so on, and this is a simpler variant of that to some extent)
  5. Changed some timings/constants (5 s total test, fixed 250 ms gap between tx packets, 10 tx packets)
  6. DNS resolutions are avoided during the background check where possible; the previously resolved cloud IP is used instead (again, if available). DNS resolutions are blocking and sometimes take a while (TODO: fix the DNS handling later)
  7. For the most part, ACM now calls into the system_cloud layer to get the cloud address (a bunch of similar code was removed to save flash space)
  8. Outgoing DNS packets go through the network interface that the DNS servers were provisioned for (e.g. servers obtained from DHCP on WiFi use WiFi as the outgoing interface, cellular ones provisioned through PPP use cellular, etc.)
  9. A periodic preferred network check is scheduled every 5 minutes if the last ACM evaluation did not choose the preferred network for some reason (e.g. the cloud is not reachable due to no internet connectivity)
  10. Fixed a bug where session resumption did not trigger the reachability (internet) test
  11. The preferred network is not chosen if it fails the reachability test
  12. The X.prefer() API does not trigger an immediate cloud connection migration; instead, a check is performed, and migration happens only if the cloud is reachable through X
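
The selection behavior described in items 9, 11 and 12 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the connection manager code from this PR: `Candidate`, `Decision`, `chooseInterface` and `kPreferredRecheckInterval` are hypothetical names, and the fallback rule (pick the reachable interface with the lowest score) is an assumption based on higher scores indicating poorer connectivity.

```cpp
#include <chrono>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration; not Device OS structures.
struct Candidate {
    std::string name;   // e.g. "WiFi" or "Cellular"
    bool preferred;     // marked through X.prefer()
    bool reachable;     // passed the cloud reachability test
    uint32_t score;     // assumption: lower is better
};

struct Decision {
    std::string chosen;     // empty if no interface is reachable
    bool recheckPreferred;  // true if a periodic preferred-network check is needed
};

// Interval from item 9 of the description.
constexpr std::chrono::minutes kPreferredRecheckInterval(5);

inline Decision chooseInterface(const std::vector<Candidate>& candidates) {
    const Candidate* best = nullptr;
    const Candidate* preferred = nullptr;
    for (const auto& c : candidates) {
        if (c.preferred) {
            preferred = &c;
        }
        if (!c.reachable) {
            continue; // unreachable interfaces are never chosen
        }
        if (!best || c.score < best->score) {
            best = &c;
        }
    }
    Decision d{std::string(), false};
    if (preferred && preferred->reachable) {
        // Items 11/12: the preferred network wins only if the cloud is reachable through it
        d.chosen = preferred->name;
        return d;
    }
    if (best) {
        d.chosen = best->name; // assumed fallback: lowest score among reachable interfaces
    }
    // Item 9: the preferred network was not chosen, so check it again later
    d.recheckPreferred = (preferred != nullptr);
    return d;
}
```

With a preferred but unreachable WiFi and a reachable cellular interface, this sketch picks cellular and asks for a recheck after `kPreferredRecheckInterval`; once WiFi passes the test, it is chosen and no recheck is requested.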

Dependencies

Depends on particle-iot/lwip#16

Minimal test app

#include "application.h"

SerialLogHandler dbg(LOG_LEVEL_ALL);
SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(SEMI_AUTOMATIC);

/* executes once at startup */
void setup() {
    //WiFi.prefer();
    //Cellular.prefer();
    waitUntil(Serial.isConnected);
    Particle.connect();
}

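/* executes continuously after setup() runs */
void loop() {
    // Single-character commands are read from the USB serial console
    if (Serial.available() > 0) {
        char c = Serial.read();
        switch (c) {
            case 'w': {
                // Bounce WiFi to generate network state change events
                WiFi.disconnect();
                delay(1);
                WiFi.connect();
                break;
            }
            case 'c': {
                // Bounce cellular to generate network state change events
                Cellular.disconnect();
                delay(1);
                Cellular.connect();
                break;
            }
            case 'W': {
                // Prefer WiFi; migration happens only if the cloud is reachable through it
                WiFi.prefer();
                break;
            }
            case 'C': {
                // Prefer cellular; same reachability check applies
                Cellular.prefer();
                break;
            }
            case 'N': {
                Network.prefer();
                break;
            }
        }
    }
}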

Poor connectivity results (see score)

0000024174 [system.cm] TRACE: 5: total=2884 consecutive=1 penalty=2884 resultingScore=2884 new=5768
0000024220 [system.cm] TRACE: 5: total=2884 consecutive=2 penalty=5768 resultingScore=5768 new=11536
0000024268 [system.cm] TRACE: 5: total=2884 consecutive=3 penalty=11536 resultingScore=11536 new=23072
0000024322 [system.cm] TRACE: 5: total=2884 consecutive=4 penalty=23072 resultingScore=23072 new=46144
0000024378 [system.cm] TRACE: 5: total=2884 consecutive=5 penalty=46144 resultingScore=46144 new=92288
0000024429 [system.cm] TRACE: 5: total=2884 consecutive=6 penalty=92288 resultingScore=92288 new=184576
0000024481 [system.cm] TRACE: 5: total=2884 consecutive=7 penalty=184576 resultingScore=184576 new=369152
0000024537 [system.cm] TRACE: 5: total=2884 consecutive=8 penalty=369152 resultingScore=369152 new=738304
0000024598 [system.cm] INFO: WiFi: 2/10 packets (0 tx errors) 385/1847 bytes received, avg rtt: 1442, mask=0003, score=369152
0000024661 [system.cm] TRACE: 4: total=2809 consecutive=1 penalty=624 resultingScore=2809 new=3433
0000024710 [system.cm] INFO: Cellular: 9/10 packets (0 tx errors) 2301/2590 bytes received, avg rtt: 312, mask=01ff, score=381
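
The numbers in the trace above are consistent with a simple scheme: start from the total RTT, add a penalty of the average RTT shifted left by the number of consecutive losses for every lost packet, then divide by the number of received packets. The sketch below reproduces the two logged scores under that reading. It is inferred from the log output only, not taken from the implementation; `reachabilityScore` is a hypothetical name, and the interpretation of `mask` as one bit per received reply is an assumption.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper reconstructed from the trace lines; not the Device OS code.
// mask:     assumed to have bit i set when the reply to tx packet i was received
// txCount:  number of packets sent (10 in this PR), must be <= 32
// totalRtt: sum of the RTTs of all received replies ("total=" in the trace)
inline uint64_t reachabilityScore(uint32_t mask, unsigned txCount, uint64_t totalRtt) {
    unsigned received = 0;
    for (unsigned i = 0; i < txCount; ++i) {
        if (mask & (1u << i)) {
            ++received;
        }
    }
    if (received == 0) {
        // Nothing came back: treat as the worst possible score (assumption)
        return std::numeric_limits<uint64_t>::max();
    }
    const uint64_t avgRtt = totalRtt / received;
    uint64_t score = totalRtt;
    unsigned consecutive = 0;
    for (unsigned i = 0; i < txCount; ++i) {
        if (mask & (1u << i)) {
            consecutive = 0; // a received packet ends the loss run
            continue;
        }
        ++consecutive;
        // The penalty doubles with every additional sequential loss
        score += avgRtt << consecutive;
    }
    return score / received;
}
```

For the WiFi line (`mask=0003`, `total=2884`) this yields 369152, and for the cellular line (`mask=01ff`, `total=2809`) it yields 381, matching the logged `score` values.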

@avtolstoy avtolstoy added this to the 5.9.0 milestone Aug 14, 2024
@avtolstoy avtolstoy marked this pull request as ready for review August 21, 2024 21:45
@avtolstoy avtolstoy merged commit c0c2a55 into develop Aug 22, 2024
13 checks passed
@avtolstoy avtolstoy deleted the fix/acm-perform-cloud-ping-or-test-on-network-events branch August 22, 2024 15:00