[OSM-Restart] Sidecar Envoys are not able to connect with OSM after osm controller was restarted. #2145
Comments
@fredstanley Thanks for reporting this issue. May I ask how you are restarting the controller? I also see that you are using a forked version of OSM. (I recently tried restarting the controller pod using …
@shashankram we still have not moved to the latest version. Is the fix available only in the v0.5 tag? Yes, I tried restarting the same way (by issuing the delete pod command).
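For reference, a minimal sketch of that restart, assuming the default osm-system namespace and the standard app=osm-controller pod label (adjust for your install):

    # Delete the controller pod; its Deployment recreates it immediately.
    kubectl -n osm-system delete pod -l app=osm-controller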
I don't know which base upstream version you are using, but a lot has changed over the last few releases. I recommend you try upstream v0.5 and see if the issue is reproducible. If it is, I can look into that specific version and test it in my environment. Let me know if that sounds reasonable.
OK, I will try to upgrade to v0.5 and update here.
@fredstanley I encountered the same issue before. It was fixed after specifying the CA bundle secret name with …
By default, the CA bundle secret argument is always passed: https://github.com/openservicemesh/osm/blob/main/charts/osm/templates/osm-deployment.yaml#L37
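If you want to confirm which arguments your controller was actually started with, one way (assuming the default osm-system namespace and deployment name) is to dump the container args:

    # Print the args the osm-controller container was launched with,
    # to verify the CA bundle secret argument is present.
    kubectl -n osm-system get deployment osm-controller \
      -o jsonpath='{.spec.template.spec.containers[0].args}'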
@fredstanley are you able to reproduce this issue with the latest release?
@shashankram The rebase to upstream OSM was a bit involved in our case (we had some private changes). Give us a couple of weeks; I will update you on this issue.
This change adds an e2e test to test the connectivity between client and server before/during/after osm-controller restarts. Previously this was resulting in 503s due to issue openservicemesh#2131 which has been fixed. Resolves openservicemesh#2146 and tests openservicemesh#2145. Signed-off-by: Shashank Ram <shashr2204@gmail.com>
@fredstanley we added a test in #2212 to ensure connectivity is not impacted after OSM controller restarts, and it's passing right now. There was a bug related to connectivity being impacted after a controller restart which has been addressed: #2131
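A rough sketch of that kind of connectivity check you can run yourself (the namespace, deployment, and URL below are hypothetical placeholders for your own client/server pair, and curl is assumed to be available in the client container):

    # Poll the server through the mesh while restarting the controller;
    # any non-200 response indicates the data path was disrupted.
    while true; do
      code=$(kubectl -n client exec deploy/client -c client -- \
        curl -s -o /dev/null -w '%{http_code}' http://server.server)
      echo "$(date +%T) HTTP $code"
      sleep 1
    done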
@shashankram I moved to v0.6.0 and it seems to work fine. Thanks.
I am testing a restart of osm-controller. When I restart the controller, the services' sidecars (Envoys) are unable to connect to osm-controller due to a certificate failure.
Environment:
- OSM version (osm version): Version: dev; Commit: c495d19; Date: 2020-09-10-19:20
- Kubernetes version (kubectl version):
  Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:12:48Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"}
  Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-20T12:43:34Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
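For context, the sequence that triggers this is roughly the following (the namespace, label, and pod names assume a default install and are placeholders):

    # 1. Restart the controller by deleting its pod.
    kubectl -n osm-system delete pod -l app=osm-controller
    # 2. Watch any workload's Envoy sidecar logs for xDS/TLS errors.
    kubectl -n <app-namespace> logs <app-pod> -c envoy -f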
Logs from the sidecar Envoy:
=====================
[2020-12-04 19:27:54.988][1][debug][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:48] Establishing new gRPC bidi stream for rpc StreamAggregatedResources(stream .envoy.service.discovery.v3.DiscoveryRequest) returns (stream .envoy.service.discovery.v3.DiscoveryResponse);
[2020-12-04 19:27:54.989][1][debug][router] [source/common/router/router.cc:426] [C0][S16105689025710459465] cluster 'osm-controller' match for URL '/envoy.service.discovery.v3.AggregatedDiscoveryService/StreamAggregatedResources'
[2020-12-04 19:27:54.989][1][debug][router] [source/common/router/router.cc:583] [C0][S16105689025710459465] router decoding headers:
':method', 'POST'
':path', '/envoy.service.discovery.v3.AggregatedDiscoveryService/StreamAggregatedResources'
':authority', 'osm-controller'
':scheme', 'https'
'te', 'trailers'
'content-type', 'application/grpc'
'x-envoy-internal', 'true'
'x-forwarded-for', '10.52.185.122'
[2020-12-04 19:27:54.989][1][debug][pool] [source/common/http/conn_pool_base.cc:71] queueing request due to no available connections
[2020-12-04 19:27:54.989][1][debug][pool] [source/common/conn_pool/conn_pool_base.cc:53] creating a new connection
[2020-12-04 19:27:54.989][1][debug][client] [source/common/http/codec_client.cc:35] [C22] connecting
[2020-12-04 19:27:54.990][1][debug][connection] [source/common/network/connection_impl.cc:753] [C22] connecting to 10.50.196.193:15128
[2020-12-04 19:27:54.990][1][debug][connection] [source/common/network/connection_impl.cc:769] [C22] connection in progress
[2020-12-04 19:27:54.990][1][debug][http2] [source/common/http/http2/codec_impl.cc:1063] [C22] updating connection-level initial window size to 268435456
[2020-12-04 19:27:54.991][1][debug][connection] [source/common/network/connection_impl.cc:616] [C22] connected
[2020-12-04 19:27:54.992][1][debug][connection] [source/extensions/transport_sockets/tls/ssl_socket.cc:190] [C22] handshake expecting read
[2020-12-04 19:27:55.007][1][debug][connection] [source/extensions/transport_sockets/tls/ssl_socket.cc:197] [C22] handshake error: 1
[2020-12-04 19:27:55.007][1][debug][connection] [source/extensions/transport_sockets/tls/ssl_socket.cc:225] [C22] TLS error: 67108971:RSA routines:OPENSSL_internal:BLOCK_TYPE_IS_NOT_01 67109000:RSA routines:OPENSSL_internal:PADDING_CHECK_FAILED 184549382:X.509 certificate routines:OPENSSL_internal:public key routines 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2020-12-04 19:27:55.008][1][debug][connection] [source/common/network/connection_impl.cc:208] [C22] closing socket: 0
[2020-12-04 19:27:55.010][1][debug][client] [source/common/http/codec_client.cc:92] [C22] disconnect. resetting 0 pending requests
[2020-12-04 19:27:55.010][1][debug][pool] [source/common/conn_pool/conn_pool_base.cc:255] [C22] client disconnected, failure reason: TLS error: 67108971:RSA routines:OPENSSL_internal:BLOCK_TYPE_IS_NOT_01 67109000:RSA routines:OPENSSL_internal:PADDING_CHECK_FAILED 184549382:X.509 certificate routines:OPENSSL_internal:public key routines 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
[2020-12-04 19:27:55.010][1][debug][router] [source/common/router/router.cc:1022] [C0][S16105689025710459465] upstream reset: reset reason connection failure
[2020-12-04 19:27:55.010][1][debug][http] [source/common/http/async_client_impl.cc:99] async http request response headers (end_stream=true):
':status', '200'
'content-type', 'application/grpc'
'grpc-status', '14'
'grpc-message', 'upstream connect error or disconnect/reset before headers. reset reason: connection failure'
[2020-12-04 19:27:55.010][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:93] StreamAggregatedResources gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.listener.v3.Listener failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.route.v3.RouteConfiguration failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.route.v3.RouteConfiguration failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment failed
[2020-12-04 19:27:55.010][1][debug][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC update for type.googleapis.com/envoy.config.cluster.v3.Cluster failed
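The TLS errors above (BLOCK_TYPE_IS_NOT_01, PADDING_CHECK_FAILED, CERTIFICATE_VERIFY_FAILED) mean the certificate presented by the controller no longer verifies against the root CA the sidecar trusts, which is consistent with the controller minting a new root certificate on restart. A quick way to check whether the root CA changed across a restart (this assumes the default osm-ca-bundle secret with a ca.crt key in the osm-system namespace; names may differ in your install):

    # Record the CA's subject/serial/dates before and after the restart;
    # if they differ, the controller generated a new root CA that existing
    # sidecars cannot verify.
    kubectl -n osm-system get secret osm-ca-bundle \
      -o jsonpath='{.data.ca\.crt}' | base64 -d | \
      openssl x509 -noout -subject -serial -dates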