Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky test: 1/10k: ControlChannelConnectivityStateMonitoring #5468

Open
easwars opened this issue Jun 24, 2022 · 7 comments
Open

Flaky test: 1/10k: ControlChannelConnectivityStateMonitoring #5468

easwars opened this issue Jun 24, 2022 · 7 comments

Comments

@easwars
Copy link
Contributor

easwars commented Jun 24, 2022

Test error log:
--- FAIL: Test (9.95s)
    --- FAIL: Test/ControlChannelConnectivityStateMonitoring (5.02s)
        tlogger.go:116: INFO server.go:598 [core] [Server #525] Server created  (t=+533.807µs)
        fake_rls_server.go:53: Started fake RLS server at "127.0.0.1:35475"
        balancer_test.go:763: Registered child policy with name "test-child-policyTest/ControlChannelConnectivityStateMonitoring"
        tlogger.go:116: INFO server.go:598 [core] [Server #526] Server created  (t=+1.244216ms)
        balancer_test.go:772: Started TestService backend at: "127.0.0.1:38063"
        tlogger.go:116: INFO config.go:144 [rls] Received JSON service config: {
              "routeLookupConfig": {
                "grpc_keybuilders": [
                  {
                    "names": [
                      {
                        "service": "grpc.testing.TestService"
                      }
                    ],
                    "headers": [
                      {
                        "key": "k1",
                        "names": [
                          "n1"
                        ]
                      },
                      {
                        "key": "k2",
                        "names": [
                          "n2"
                        ]
                      }
                    ]
                  }
                ],
                "lookup_service": "127.0.0.1:35475",
                "lookup_service_timeout": "5s",
                "max_age": "0.100s",
                "cache_size_bytes": "1024"
              },
              "routeLookupChannelServiceConfig": {
                "loadBalancingConfig": [
                  {
                    "pick_first": {}
                  }
                ]
              },
              "childPolicy": [
                {
                  "test-child-policyTest/ControlChannelConnectivityStateMonitoring": {}
                }
              ],
              "childPolicyConfigTargetFieldName": "Backend"
            }  (t=+2.370632ms)
        tlogger.go:116: INFO server.go:786 [core] [Server #525 ListenSocket #527] ListenSocket created  (t=+3.815951ms)
        tlogger.go:116: INFO server.go:786 [core] [Server #526 ListenSocket #528] ListenSocket created  (t=+7.234996ms)
        tlogger.go:116: INFO clientconn.go:105 [core] [Channel #529] Channel created  (t=+8.666116ms)
        tlogger.go:116: INFO clientconn.go:1579 [core] [Channel #529] original dial target is: "rls-e2e:///"  (t=+8.798917ms)
        tlogger.go:116: INFO clientconn.go:1586 [core] [Channel #529] parsed dial target is: {Scheme:rls-e2e Authority: Endpoint: URL:{Scheme:rls-e2e Opaque: User: Host: Path:/ RawPath: ForceQuery:false RawQuery: Fragment: RawFragment:}}  (t=+8.961119ms)
        tlogger.go:116: INFO clientconn.go:263 [core] [Channel #529] Channel authority set to ""  (t=+9.104421ms)
        tlogger.go:116: INFO resolver_conn_wrapper.go:175 [core] [Channel #529] Resolver state updated: {
              "Addresses": null,
              "ServiceConfig": {
                "Config": {
                  "Config": null,
                  "LB": null,
                  "Methods": {}
                },
                "Err": null
              },
              "Attributes": null
            } (service config updated)  (t=+9.363025ms)
        tlogger.go:116: INFO balancer_conn_wrappers.go:271 [core] [Channel #529] Channel switches to new LB policy "rls_experimental"  (t=+9.625628ms)
        tlogger.go:116: INFO balancer.go:260 [rls] [rls-experimental-lb 0x4000336600] Creating control channel to RLS server at: 127.0.0.1:35475  (t=+10.007133ms)
        tlogger.go:116: INFO control_channel.go:127 [rls] [rls-control-channel 0x4000238000] Disabling service config from the name resolver and instead using: {"loadBalancingConfig": [{"pick_first": {}}]}  (t=+15.280104ms)
        tlogger.go:116: INFO clientconn.go:105 [core] [Channel #530] Channel created  (t=+15.543207ms)
        tlogger.go:116: INFO clientconn.go:1579 [core] [Channel #530] original dial target is: "127.0.0.1:35475"  (t=+16.076014ms)
        tlogger.go:116: INFO clientconn.go:1584 [core] [Channel #530] dial target "127.0.0.1:35475" parse failed: parse "127.0.0.1:35475": first path segment in URL cannot contain colon  (t=+16.279217ms)
        tlogger.go:116: INFO clientconn.go:1599 [core] [Channel #530] fallback to scheme "passthrough"  (t=+16.411719ms)
        tlogger.go:116: INFO clientconn.go:1607 [core] [Channel #530] parsed dial target is: {Scheme:passthrough Authority: Endpoint:127.0.0.1:35475 URL:{Scheme:passthrough Opaque: User: Host: Path:/127.0.0.1:35475 RawPath: ForceQuery:false RawQuery: Fragment: RawFragment:}}  (t=+16.616622ms)
        tlogger.go:116: INFO clientconn.go:263 [core] [Channel #530] Channel authority set to "127.0.0.1:35475"  (t=+16.813024ms)
        tlogger.go:116: INFO resolver_conn_wrapper.go:175 [core] [Channel #530] Resolver state updated: {
              "Addresses": [
                {
                  "Addr": "127.0.0.1:35475",
                  "ServerName": "",
                  "Attributes": null,
                  "BalancerAttributes": null,
                  "Type": 0,
                  "Metadata": null
                }
              ],
              "ServiceConfig": null,
              "Attributes": null
            } (resolver returned new addresses)  (t=+17.124028ms)
        tlogger.go:116: INFO clientconn.go:631 [core] [Channel #530] ignoring service config from resolver (<nil>) and applying the default because service config is disabled  (t=+17.322831ms)
        tlogger.go:116: INFO balancer_conn_wrappers.go:271 [core] [Channel #530] Channel switches to new LB policy "pick_first"  (t=+17.552034ms)
        tlogger.go:116: INFO clientconn.go:725 [core] [Channel #530 SubChannel #531] Subchannel created  (t=+17.804538ms)
        tlogger.go:116: INFO control_channel.go:83 [rls] [rls-control-channel 0x4000238000] Control channel created to RLS server at: 127.0.0.1:35475  (t=+18.030441ms)
        tlogger.go:116: INFO balancer.go:282 [rls] [rls-experimental-lb 0x4000336600] Child policy changed to "test-child-policyTest/ControlChannelConnectivityStateMonitoring"  (t=+18.174542ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:IDLE Picker:0x40003e6780}  (t=+18.329945ms)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS"
            }  (t=+18.698049ms)
        tlogger.go:116: INFO picker_wrapper.go:161 [core] blockingPicker: the picked transport is not ready, loop back to repick  (t=+26.99106ms)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #530 SubChannel #531] Subchannel Connectivity change to CONNECTING  (t=+27.139[46](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:47)2ms)
        tlogger.go:116: INFO clientconn.go:1212 [core] [Channel #530 SubChannel #531] Subchannel picks a new address "127.0.0.1:35[47](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:48)5" to connect  (t=+27.354365ms)
        tlogger.go:116: INFO control_channel.go:135 [rls] [rls-control-channel 0x4000238000] Starting connectivity state monitoring goroutine  (t=+27.554168ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to CONNECTING  (t=+28.981587ms)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #530 SubChannel #531] Subchannel Connectivity change to READY  (t=+29.108088ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to READY  (t=+29.326591ms)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x40003e6780] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:} with targets [127.0.0.1:38063], headerData "", err: <nil>  (t=+30.70601ms)
        tlogger.go:116: INFO child_policy.go:84 [rls] [rls-child-policy-wrapper 127.0.0.1:38063 0x40002338c0] Created  (t=+30.837111ms)
        tlogger.go:116: INFO balancergroup.go:100 [rls] [rls-experimental-lb 0x4000336600] Creating child policy of type test-child-policyTest/ControlChannelConnectivityStateMonitoring  (t=+30.933613ms)
        tlogger.go:116: INFO balancer.go:571 [rls] [rls-experimental-lb 0x4000336600] Child policy "127.0.0.1:38063" added to BalancerGroup  (t=+31.042614ms)
        tlogger.go:116: INFO balancer.go:368 [rls] [rls-experimental-lb 0x4000336600] Pushing new state to child policy "127.0.0.1:38063": {ResolverState:{Addresses:[] ServiceConfig:0x4000258120 Attributes:<nil>} BalancerConfig:0x4000233a10}  (t=+31.223817ms)
        tlogger.go:116: INFO clientconn.go:725 [core] [Channel #529 SubChannel #534] Subchannel created  (t=+31.336218ms)
        tlogger.go:116: INFO balancergroup.go:499 [rls] [rls-experimental-lb 0x4000336600] Balancer state update from locality 127.0.0.1:38063, new state: {ConnectivityState:IDLE Picker:0x4000233b30}  (t=+31.437519ms)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x40003e6780] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:}  (t=+31.540721ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:CONNECTING Picker:0x400027f200}  (t=+31.630222ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #529] Channel Connectivity change to CONNECTING  (t=+31.700523ms)
        tlogger.go:116: INFO balancer.go:546 [rls] [rls-experimental-lb 0x4000336600] Child policy "127.0.0.1:38063" has new state 0x400000c900  (t=+31.828125ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:IDLE Picker:0x400027f380}  (t=+31.902726ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #529] Channel Connectivity change to IDLE  (t=+31.968727ms)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #529 SubChannel #534] Subchannel Connectivity change to CONNECTING  (t=+32.102228ms)
        tlogger.go:116: INFO clientconn.go:1212 [core] [Channel #529 SubChannel #534] Subchannel picks a new address "127.0.0.1:38063" to connect  (t=+32.22793ms)
        tlogger.go:116: INFO balancergroup.go:499 [rls] [rls-experimental-lb 0x4000336600] Balancer state update from locality 127.0.0.1:38063, new state: {ConnectivityState:CONNECTING Picker:0x4000233ec0}  (t=+32.519734ms)
        tlogger.go:116: INFO balancer.go:546 [rls] [rls-experimental-lb 0x4000336600] Child policy "127.0.0.1:38063" has new state 0x400000c978  (t=+32.629335ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:CONNECTING Picker:0x400027f6e0}  (t=+32.705236ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #529] Channel Connectivity change to CONNECTING  (t=+32.779337ms)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #529 SubChannel #534] Subchannel Connectivity change to READY  (t=+33.5756[48](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:49)ms)
        tlogger.go:116: INFO balancergroup.go:[49](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:50)9 [rls] [rls-experimental-lb 0x4000336600] Balancer state update from locality 127.0.0.1:38063, new state: {ConnectivityState:READY Picker:0x40003fb020}  (t=+33.71105ms)
        tlogger.go:116: INFO balancer.go:546 [rls] [rls-experimental-lb 0x4000336600] Child policy "127.0.0.1:38063" has new state 0x400000cae0  (t=+33.822051ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x400027ff80}  (t=+33.899352ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #529] Channel Connectivity change to READY  (t=+33.971053ms)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #530 SubChannel #531] Subchannel Connectivity change to IDLE  (t=+36.00568ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to IDLE  (t=+36.857092ms)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS"
            }  (t=+136.815426ms)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #530 SubChannel #531] Subchannel Connectivity change to CONNECTING  (t=+137.14733ms)
        tlogger.go:116: INFO clientconn.go:1212 [core] [Channel #530 SubChannel #531] Subchannel picks a new address "127.0.0.1:35475" to connect  (t=+137.403834ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to CONNECTING  (t=+137.940641ms)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #530 SubChannel #531] Subchannel Connectivity change to TRANSIENT_FAILURE  (t=+139.230458ms)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to TRANSIENT_FAILURE  (t=+139.426061ms)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x400027ff80] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:} with targets [], headerData "", err: rpc error: code = Unavailable desc = connection closed before server preface received  (t=+139.838966ms)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x400027ff80] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:}  (t=+140.034069ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x400037cc60}  (t=+140.201471ms)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS",
              "keyMap": {
                "k1": "v1"
              }
            }  (t=+140.85168ms)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x400037cc60] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1} with targets [], headerData "", err: rpc error: code = Unavailable desc = connection closed before server preface received  (t=+141.174984ms)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x400037cc60] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1}  (t=+141.326186ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x400037d440}  (t=+141.468488ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x400037d740}  (t=+241.705526ms)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS",
              "keyMap": {
                "k1": "v1"
              }
            }  (t=+242.295734ms)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x400037d740] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1} with targets [], headerData "", err: rpc error: code = Unavailable desc = connection closed before server preface received  (t=+242.541337ms)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x400037d740] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1}  (t=+242.631438ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x4000436240}  (t=+242.732639ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x4000436360}  (t=+343.807488ms)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS",
              "keyMap": {
                "k1": "v1"
              }
            }  (t=+443.834123ms)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x4000436360] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1} with targets [], headerData "", err: rpc error: code = Unavailable desc = connection closed before server preface received  (t=+444.066627ms)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x4000436360] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1}  (t=+444.177428ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x4000184d80}  (t=+444.279129ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x40004364e0}  (t=+544.806971ms)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS",
              "keyMap": {
                "k1": "v1"
              }
            }  (t=+545.557281ms)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x40004364e0] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1} with targets [], headerData "", err: rpc error: code = Unavailable desc = connection closed before server preface received  (t=+545.925886ms)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x40004364e0] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1}  (t=+546.073688ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x4000436fc0}  (t=+546.23079ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x40004370e0}  (t=+646.479828ms)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS",
              "keyMap": {
                "k1": "v1"
              }
            }  (t=+746.686065ms)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x40004370e0] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1} with targets [], headerData "", err: rpc error: code = Unavailable desc = connection closed before server preface received  (t=+746.982869ms)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x40004370e0] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1}  (t=+747.081671ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x4000185[50](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:51)0}  (t=+747.180172ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x4000185680}  (t=+848.289921ms)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS",
              "keyMap": {
                "k1": "v1"
              }
            }  (t=+947.545946ms)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x4000185680] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1} with targets [], headerData "", err: rpc error: code = Unavailable desc = connection closed before server preface received  (t=+947.9227[51](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:52)ms)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x4000185680] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1}  (t=+948.073253ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x4000437e60}  (t=+948.230755ms)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x400009e360}  (t=+1.048285991s)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS",
              "keyMap": {
                "k1": "v1"
              }
            }  (t=+1.048[52](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:53)7294s)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x4000437e60] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1} with targets [], headerData "", err: rpc error: code = Unavailable desc = connection closed before server preface received  (t=+1.048758597s)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x4000437e60] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1}  (t=+1.048846198s)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x4000185bc0}  (t=+1.048937099s)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #[53](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:54)0 SubChannel #531] Subchannel Connectivity change to IDLE  (t=+1.139600009s)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to IDLE  (t=+1.139817412s)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x40003e6d20}  (t=+1.149314239s)
        tlogger.go:116: INFO control_channel.go:212 [rls] [rls-control-channel 0x4000238000] Sending RLS request {
              "targetType": "grpc",
              "reason": "REASON_MISS",
              "keyMap": {
                "k1": "v1"
              }
            }  (t=+1.149895947s)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #530 SubChannel #531] Subchannel Connectivity change to CONNECTING  (t=+1.15014715s)
        tlogger.go:116: INFO clientconn.go:1212 [core] [Channel #530 SubChannel #531] Subchannel picks a new address "127.0.0.1:3[54](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:55)75" to connect  (t=+1.150298252s)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to CONNECTING  (t=+1.150720058s)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #530 SubChannel #531] Subchannel Connectivity change to READY  (t=+1.152216278s)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to READY  (t=+1.152507182s)
        tlogger.go:116: INFO control_channel.go:170 [rls] [rls-control-channel 0x4000238000] Connectivity state is READY  (t=+1.152615883s)
        tlogger.go:116: INFO picker.go:305 [rls] [rls-picker 0x40003e6d20] Received RLS response for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1} with targets [127.0.0.1:38063], headerData "", err: <nil>  (t=+1.1[55](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:56)632723s)
        tlogger.go:116: INFO picker.go:311 [rls] [rls-picker 0x40003e6d20] Removing pending request entry for key {path:/grpc.testing.TestService/EmptyCall keys:k1=v1}  (t=+1.155747825s)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x4000336600] New balancer.State: {ConnectivityState:READY Picker:0x400009f080}  (t=+1.155842226s)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x400016a800] New balancer.State: {ConnectivityState:READY Picker:0x4000185ec0}  (t=+1.395396623s)
        tlogger.go:116: INFO balancer.go:441 [rls] [rls-experimental-lb 0x400016aa00] New balancer.State: {ConnectivityState:READY Picker:0x40003e7[56](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:57)0}  (t=+1.515330324s)
        balancer_test.go:824: Timed out waiting for resetBackoffDone
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #529] Channel Connectivity change to SHUTDOWN  (t=+5.011408384s)
        tlogger.go:116: INFO control_channel.go:188 [rls] [rls-control-channel 0x4000238000] Closing control channel  (t=+5.011648387s)
        tlogger.go:116: INFO clientconn.go:418 [core] [Channel #530] Channel Connectivity change to SHUTDOWN  (t=+5.011755188s)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #530 SubChannel #531] Subchannel Connectivity change to SHUTDOWN  (t=+5.01187389s)
        tlogger.go:116: INFO clientconn.go:14[57](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:58) [core] [Channel #530 SubChannel #531] Subchannel deleted  (t=+5.011980691s)
        tlogger.go:116: INFO clientconn.go:1056 [core] [Channel #530] Channel deleted  (t=+5.012052792s)
        tlogger.go:116: INFO clientconn.go:1101 [core] [Channel #529 SubChannel #534] Subchannel Connectivity change to SHUTDOWN  (t=+5.012199794s)
        tlogger.go:116: INFO clientconn.go:1457 [core] [Channel #529 SubChannel #534] Subchannel deleted  (t=+5.012307396s)
        tlogger.go:116: INFO clientconn.go:1056 [core] [Channel #529] Channel deleted  (t=+5.012382997s)
        tlogger.go:116: INFO server.go:737 [core] [Server #526 ListenSocket #528] ListenSocket deleted  (t=+5.012814403s)
        tlogger.go:116: INFO server.go:737 [core] [Server #525 ListenSocket #527] ListenSocket deleted  (t=+5.01308[63](https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true#step:8:64)06s)
FAIL
FAIL	google.golang.org/grpc/balancer/rls

https://github.com/grpc/grpc-go/runs/7033783730?check_suite_focus=true

@easwars
Copy link
Contributor Author

easwars commented Jul 12, 2022

@easwars
Copy link
Contributor Author

easwars commented Jul 22, 2022

The reason why this test is flaky is as follows:

  • Test creates a ClientConn with service config setting the LB policy to "rls"
  • RLS LB policy initializes the control channel to the RLS server. As part of this, it spawns a goroutine to monitor the control channel connectivity state changes.
  • But by the time the first RPC is successfully made, the above goroutine has not gotten a chance to run yet
  • And at this time, the test stops the RLS server. This moves the control channel to IDLE and it is only now that the monitoring goroutine gets to run, and it has already missed the first transition to READY.

In a production environment, this scenario is extremely unlikely to occur, and even if it does occur, the only problem is that the RLS LB policy will fail to reset backoff state for the first time that the control channel transitions back to READY. Subsequent transitions will be handled properly.

@easwars
Copy link
Contributor Author

easwars commented Jul 22, 2022

Our current state change API is lossy because state changes can be lost between the former returning and the caller invoking GetState.

@zasweq
Copy link
Contributor

zasweq commented Aug 11, 2022

@easwars
Copy link
Contributor Author

easwars commented Nov 28, 2022

Blocked on #5818.

@easwars
Copy link
Contributor Author

easwars commented Nov 28, 2022

FAILED in 1 out of 10000 in 43.9s

@dfawley dfawley changed the title Flaky test: ControlChannelConnectivityStateMonitoring Flaky test: 1/10k: ControlChannelConnectivityStateMonitoring Nov 28, 2022
@easwars easwars removed their assignment Aug 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants