Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cronvoy: Map Envoy Mobile errors to Cronet API errors #1594

Closed
carloseltuerto opened this issue Jul 15, 2021 · 15 comments
Closed

Cronvoy: Map Envoy Mobile errors to Cronet API errors #1594

carloseltuerto opened this issue Jul 15, 2021 · 15 comments
Assignees

Comments

@carloseltuerto
Copy link
Contributor

carloseltuerto commented Jul 15, 2021

The Cronet API exposes an official list of error codes that must have a one-to-one correspondence with Envoy Mobile errors. These errors also have a one-to-one relation with a Cronet Internal Error Code, except for the last one.

  // Error code indicating the host being sent the request could not be resolved to an IP address.
  ERROR_HOSTNAME_NOT_RESOLVED = 1;      // Internal error: NAME_NOT_RESOLVED

  // Error code indicating the device was not connected to any network.
  ERROR_INTERNET_DISCONNECTED = 2;      // Internal error: INTERNET_DISCONNECTED

  // Error code indicating that as the request was processed the network configuration changed.
  // When {@link #getErrorCode} returns this code, this exception may be cast to a
  // {@link QuicException}.
  ERROR_NETWORK_CHANGED = 3;            // Internal error: NETWORK_CHANGED

  // Error code indicating a timeout expired. Timeouts expiring while attempting to connect will be
  // reported as the more specific {@link #ERROR_CONNECTION_TIMED_OUT}.
  ERROR_TIMED_OUT = 4;                  // Internal error: TIMED_OUT

  // Error code indicating the connection was closed unexpectedly.
  ERROR_CONNECTION_CLOSED = 5;          // Internal error: CONNECTION_CLOSED

  // Error code indicating the connection attempt timed out.
  ERROR_CONNECTION_TIMED_OUT = 6;       // Internal error: CONNECTION_TIMED_OUT

  // Error code indicating the connection attempt was refused.
  ERROR_CONNECTION_REFUSED = 7;         // Internal error: CONNECTION_REFUSED

  // Error code indicating the connection was unexpectedly reset.
  ERROR_CONNECTION_RESET = 8;           // Internal error: CONNECTION_RESET

  // Error code indicating the IP address being contacted is unreachable, meaning there is no route
  // to the specified host or network.
  ERROR_ADDRESS_UNREACHABLE = 9;        // Internal error: ADDRESS_UNREACHABLE

  // Error code indicating an error related to the <a href="https://www.chromium.org/quic">QUIC</a>
  // protocol. When {@link #getErrorCode} returns this code, this exception can be cast to {@link
  // QuicException} for more information.
  ERROR_QUIC_PROTOCOL_FAILED = 10;      // Internal error: QUIC_PROTOCOL_ERROR

  // Error code indicating another type of error was encountered. {@link
  // #getCronetInternalErrorCode} can be consulted to get a more specific cause.
  ERROR_OTHER = 11;                     // Catch all for all other internal errors.

Requires changes in Envoy.

@DavidSchinazi
Copy link

@carloseltuerto is this issue not a duplicate of #1550 ?

@carloseltuerto
Copy link
Contributor Author

They overlap, but this issue is more important: those are the Public error codes - there is no way around to have them all implemented.

Issue 1550 exposes all the internal codes that are presumably needed - that includes these plus other that are driving some business logic in the Java Cronet, and that are used in some tests, often to test a regression.

@DavidSchinazi
Copy link

I don't think "there is no way around to have them all implemented" is correct, we can return a subset of error codes right?

@carloseltuerto
Copy link
Contributor Author

The best deal is to implement the public ones, plus the few extra ones in Issue 1550. We don't need to match all of the 235 internal codes :-)

@DavidSchinazi
Copy link

My point was that we can get away with not implementing some of the public ones: if we return one error in some scenarios instead of another it shouldn't be a huge compatibility problem

@carloseltuerto
Copy link
Contributor Author

carloseltuerto commented Jul 19, 2021

Well, for sure, one test won't pass. It injects an error condition, and the correct public error code must be returned:

https://github.com/envoyproxy/envoy-mobile/blob/main/test/java/org/chromium/net/CronetUrlRequestTest.java#L2042

@alyssawilk
Copy link
Contributor

OK, I'm working on a PR to wire up the error flags. Once we do that I think we can track most of these.

@RyanTheOptimist I'd like your thoughts on ERROR_INTERNET_DISCONNECTED / ERROR_NETWORK_CHANGED, if we can detect these from Java because I don't think we can in C++ yet. Also I'm not sure what ERROR_QUIC_PROTOCOL_FAILED maps to.

We don't differentiate upstream connection failures in Envoy so I think for now we just pick one chrome error code and use it for all 3 errors.

ERROR_HOSTNAME_NOT_RESOLVED DnsResolutionFailed once https://github.com/envoyproxy/envoy/pull/19588/files lands
ERROR_INTERNET_DISCONNECTED = 2;
ERROR_NETWORK_CHANGED = 3;
ERROR_TIMED_OUT = StreamIdleTimeout or DurationTimeout
ERROR_CONNECTION_CLOSED => UpstreamConnectionTermination
ERROR_CONNECTION_TIMED_OUT = n/a (falls under UpstreamConnectionFailure)
ERROR_CONNECTION_REFUSED = UpstreamConnectionFailure (refused == unreachable)
ERROR_CONNECTION_RESET UpstreamRemoteReset
ERROR_ADDRESS_UNREACHABLE = UpstreamConnectionFailure (refused == unreachable)
ERROR_QUIC_PROTOCOL_FAILED = 10;
ERROR_OTHER = (switch statement default)

@RyanTheOptimist
Copy link
Contributor

For INTERNET_DISCONNECTED and NETWORK_CHANGED, Chrome gets this info from the Network Change Notifier, which hook into Android notifications via the Java API https://developer.android.com/reference/android/net/ConnectivityManager.NetworkCallback so Envoy Mobile Java should be able to do something similar. And at some point we'll need to wire these signals down to Envoy.

ERROR_QUIC_PROTOCOL_FAILED is the Cronet layer exposing the network error code ERR_QUIC_PROTOCOL_ERROR: https://source.chromium.org/chromium/chromium/src/+/main:components/cronet/native/url_request.cc;l=122?q=ERROR_QUIC_PROTOCOL_FAILED&ss=chromium Basically this is what most errors map to when QUIC is used as the transport. That being said, this bucket is way too broadly applied in Chrome but we've never fixed this.

I'm slightly surprised that Envoy does not differentiate between refused and unreachable, but c'est la vie. Other than debugging, I doubt this difference is important.

@alyssawilk
Copy link
Contributor

For ERROR_QUIC_PROTOCOL_FAILED would it make sense to either convert known errors (connection failure etc) to ERROR_QUIC_PROTOCOL_FAILED when we've negotiated HTTP/3, or use it instead of ERROR_OTHER when HTTP/3 is in use?

@RyanTheOptimist
Copy link
Contributor

Converting known errors to ERROR_QUIC_PROTOCOL_FAILED when HTTP/3 was negotiated is probably most similar to Chrome. On the otherhand using ERROR_QUIC_PROTOCOL_FAILED instead of ERROR_OTHER is probably more accurate. I'd be inclined to either do the latter or just skip ERROR_QUIC_PROTOCOL_FAILED completely. Your first suggestion might be safest, if anyone has logic wired up to that error code, though I'm a bit skeptical that anyone does.

alyssawilk added a commit that referenced this issue Jan 26, 2022
Risk Level: low
Testing:
Docs Changes: inline
Part of #1594

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@colibie
Copy link
Contributor

colibie commented Oct 18, 2022

For ERROR_QUIC_PROTOCOL_FAILED would it make sense to either convert known errors (connection failure etc) to ERROR_QUIC_PROTOCOL_FAILED when we've negotiated HTTP/3, or use it instead of ERROR_OTHER when HTTP/3 is in use?

@alyssawilk I need a flag in finalStreamIntel that would tell me if HTTP/3 was used or not. I plan to reverse map the responseFlags to the quic errors in java when quic is used, but there's no way to know quic was used when the onError callback is called.

@alyssawilk
Copy link
Contributor

@colibie SG, on it.

RyanTheOptimist pushed a commit that referenced this issue Oct 20, 2022
Description: Converts EM's responseFlags to Cronet's API Errors
Additional Description: Note that this change does not include ERR_NETWORK_CHANGED, ERR_INTERNET_DISCONNECT and QUIC_PROTOCOL_FAILED as they aren't implemented yet
Risk Level: Low
Testing: Test included
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Chidera Olibie <colibie@google.com>
alyssawilk added a commit that referenced this issue Oct 20, 2022
Risk Level: low
Testing: unit tests
Docs Changes: n/a
Release Notes: yes
Part of #1594

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Contributor

And done, thanks for your patience :-)

colibie added a commit to colibie/envoy-mobile that referenced this issue Oct 22, 2022
…roxy#2568)

Description: Converts EM's responseFlags to Cronet's API Errors
Additional Description: Note that this change does not include ERR_NETWORK_CHANGED, ERR_INTERNET_DISCONNECT and QUIC_PROTOCOL_FAILED as they aren't implemented yet
Risk Level: Low
Testing: Test included
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Chidera Olibie <colibie@google.com>

Signed-off-by: Chidera <colibie@google.com>
colibie pushed a commit to colibie/envoy-mobile that referenced this issue Oct 22, 2022
Risk Level: low
Testing: unit tests
Docs Changes: n/a
Release Notes: yes
Part of envoyproxy#1594

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Chidera Olibie <colibie@google.com>
RyanTheOptimist pushed a commit that referenced this issue Nov 2, 2022
Description: As a follow up for PR2568, the PR maps the network_disconnected and quic exception errors. The network_changed error has been removed as this is not relevant to envoymobile.

Signed-off-by: Chidera Olibie <colibie@google.com>
@colibie
Copy link
Contributor

colibie commented Nov 3, 2022

@carloseltuerto you may close this issue. The last commit mapped all the errors.

@alyssawilk
Copy link
Contributor

Wohoo! Thanks so much for all your work on this!

jpsim added a commit that referenced this issue Nov 4, 2022
…builder-function

* origin/main:
  remove the use of deprecated flag (#2658)
  Cronvoy: Map EM Errors to Cronet API Errors II (#1594) (#2633)
  build: revert boring patch (#2651)
  Bump Lyft Support Rotation (#2654)
  fix one issue blocking bumping Envoy (#2649)
  ci: remove Snow from Lyft EM rotation (#2650)
  ci: increasing timeouts (#2653)
  python: Apply Envoy python-yapf formatting (#2648)
  repo: Shellcheck cleanups (#2646)
  repo: Switch `pip_install` -> `pip_parse` (#2647)
  Use safe_malloc instead of new when creating new_envoy_map_entry (#2632)
  python: Pin requirement hashes (#2643)
  Disable flaky TestConfig.StringAccessors (#2642)
  ci: migrate from set-output to GITHUB_OUTPUT (#2625)
  Add a comment to addPlatformFilter (#2634)
  Allow Cronvoy to build with proguard. (#2635)
  Update Envoy (#2630)
  Add support for Platform and Native filters to C++ EngineBuilder (#2626)
  Register getaddrinfo in extension_registry (#2627)
  dns: stop using cares DNS resolver (#2618)

Signed-off-by: JP Simard <jp@jpsim.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants