Bigtable: 'test_create_instance_w_two_clusters' flakes with '504 Deadline Exceeded' #5928

Closed · tseaver opened this issue Sep 11, 2018 · 9 comments · Fixed by #6579 or #8450
Labels: api: bigtable, flaky, testing, type: process

tseaver commented Sep 11, 2018

/cc @sduskis, @vikas-jamdar

From a failing system-test run:

___________ TestInstanceAdminAPI.test_create_instance_w_two_clusters ___________

self = <tests.system.TestInstanceAdminAPI testMethod=test_create_instance_w_two_clusters>

    def test_create_instance_w_two_clusters(self):
        from google.cloud.bigtable import enums
        from google.cloud.bigtable.table import ClusterState
        _PRODUCTION = enums.Instance.Type.PRODUCTION
        ALT_INSTANCE_ID = 'dif' + unique_resource_id('-')
        instance = Config.CLIENT.instance(ALT_INSTANCE_ID,
                                          instance_type=_PRODUCTION,
                                          labels=LABELS)
    
        ALT_CLUSTER_ID_1 = ALT_INSTANCE_ID + '-c1'
        ALT_CLUSTER_ID_2 = ALT_INSTANCE_ID + '-c2'
        LOCATION_ID_2 = 'us-central1-f'
        STORAGE_TYPE = enums.StorageType.HDD
        cluster_1 = instance.cluster(
            ALT_CLUSTER_ID_1, location_id=LOCATION_ID, serve_nodes=SERVE_NODES,
            default_storage_type=STORAGE_TYPE)
        cluster_2 = instance.cluster(
            ALT_CLUSTER_ID_2, location_id=LOCATION_ID_2,
            serve_nodes=SERVE_NODES, default_storage_type=STORAGE_TYPE)
        operation = instance.create(clusters=[cluster_1, cluster_2])
        # We want to make sure the operation completes.
        operation.result(timeout=10)
    
        # Make sure this instance gets deleted after the test case.
        self.instances_to_delete.append(instance)
    
        # Create a new instance and make sure it is the same.
        instance_alt = Config.CLIENT.instance(ALT_INSTANCE_ID)
        instance_alt.reload()
    
        self.assertEqual(instance, instance_alt)
        self.assertEqual(instance.display_name, instance_alt.display_name)
        self.assertEqual(instance.type_, instance_alt.type_)
    
        clusters, failed_locations = instance_alt.list_clusters()
        self.assertEqual(failed_locations, [])
    
        clusters.sort(key=lambda x: x.name)
        alt_cluster_1, alt_cluster_2 = clusters
    
        self.assertEqual(cluster_1.location_id, alt_cluster_1.location_id)
        self.assertEqual(alt_cluster_1.state, enums.Cluster.State.READY)
        self.assertEqual(cluster_1.serve_nodes, alt_cluster_1.serve_nodes)
        self.assertEqual(cluster_1.default_storage_type,
                         alt_cluster_1.default_storage_type)
        self.assertEqual(cluster_2.location_id, alt_cluster_2.location_id)
        self.assertEqual(alt_cluster_2.state, enums.Cluster.State.READY)
        self.assertEqual(cluster_2.serve_nodes, alt_cluster_2.serve_nodes)
        self.assertEqual(cluster_2.default_storage_type,
                         alt_cluster_2.default_storage_type)
    
        # Test list clusters in project via 'client.list_clusters'
        clusters, failed_locations = Config.CLIENT.list_clusters()
        self.assertFalse(failed_locations)
        found = set([cluster.name for cluster in clusters])
        self.assertTrue({alt_cluster_1.name,
                         alt_cluster_2.name,
                         Config.CLUSTER.name}.issubset(found))
    
        temp_table_id = 'test-get-cluster-states'
        temp_table = instance.table(temp_table_id)
>       temp_table.create()

tests/system.py:280: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
google/cloud/bigtable/table.py:218: in create
    table=table, initial_splits=splits)
google/cloud/bigtable_admin_v2/gapic/bigtable_table_admin_client.py:327: in create_table
    request, retry=retry, timeout=timeout, metadata=metadata)
../api_core/google/api_core/gapic_v1/method.py:139: in __call__
    return wrapped_func(*args, **kwargs)
../api_core/google/api_core/retry.py:260: in retry_wrapped_func
    on_error=on_error,
../api_core/google/api_core/retry.py:177: in retry_target
    return target()
../api_core/google/api_core/timeout.py:206: in func_with_timeout
    return func(*args, **kwargs)
../api_core/google/api_core/grpc_helpers.py:61: in error_remapped_callable
    six.raise_from(exceptions.from_grpc_error(exc), exc)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

value = DeadlineExceeded('Deadline Exceeded',)
from_value = <_Rendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEED...all.cc","file_line":1099,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>

    def raise_from(value, from_value):
>       raise value
E       DeadlineExceeded: 504 Deadline Exceeded

../.nox/sys-2-7/lib/python2.7/site-packages/six.py:737: DeadlineExceeded
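
For context, the "504" here is api_core's HTTP-style remapping of the gRPC status code; a minimal illustration (assuming google-api-core and grpcio are installed):

    import grpc
    from google.api_core import exceptions

    # api_core maps grpc.StatusCode.DEADLINE_EXCEEDED onto an
    # HTTP-flavored exception whose code is 504, which is exactly
    # what surfaces at the bottom of the traceback above.
    exc = exceptions.DeadlineExceeded('Deadline Exceeded')
    assert exc.code == 504
    assert exc.grpc_status_code == grpc.StatusCode.DEADLINE_EXCEEDED
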
tseaver added the testing, api: bigtable, type: process, and flaky labels on Sep 11, 2018
sduskis commented Sep 13, 2018

We probably need to increase the timeout from the current 130 seconds to something more like 15 minutes. CreateTable can take longer than two minutes, depending on conditions.
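
In the meantime, a caller can override the per-call deadline by invoking the generated admin client directly. A minimal sketch (the project, instance, and table names are placeholders, and this bypasses the higher-level Table wrapper):

    from google.cloud import bigtable_admin_v2

    admin_client = bigtable_admin_v2.BigtableTableAdminClient()
    parent = admin_client.instance_path('my-project', 'my-instance')

    # Pass an explicit 15-minute deadline instead of the generated
    # default from bigtable_table_admin_client_config.py.
    admin_client.create_table(
        parent,
        'my-table',
        bigtable_admin_v2.types.Table(),
        timeout=900,
    )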

tseaver commented Sep 13, 2018

@sduskis The deadline in bigtable_table_admin_client_config.py is set via autosynth, so we need the upstream configuration fixed rather than changing the generated file here by hand.
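
For reference, the generated config is a plain dict of per-method deadlines. An illustrative excerpt of its shape (the values here are assumptions consistent with the ~130-second default discussed above, not the literal file contents):

    config = {
        "interfaces": {
            "google.bigtable.admin.v2.BigtableTableAdmin": {
                "retry_params": {
                    "default": {
                        "initial_rpc_timeout_millis": 60000,
                        "total_timeout_millis": 130000,
                    },
                },
                "methods": {
                    # Autosynth regenerates this file, so manual edits
                    # here would be clobbered on the next synth run.
                    "CreateTable": {
                        "timeout_millis": 130000,
                        "retry_codes_name": "non_idempotent",
                        "retry_params_name": "default",
                    },
                },
            },
        },
    }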

sduskis commented Sep 13, 2018

Understood. I'm in the process of making those upstream changes now.

tseaver commented Sep 17, 2018

sduskis commented Sep 17, 2018

FYI, my internal changes touched more than just CreateTable. I wanted to fix other RPCs as well. Hopefully, we'll get a fix out this week.

sduskis commented Sep 18, 2018

@tseaver, I got the changes out. See this googleapis commit for details.

A synth regen should hopefully fix this flake.

tseaver reopened this on Sep 28, 2018
tseaver commented Oct 17, 2018

Another failure today.

tseaver commented Jun 19, 2019

@sduskis I just saw temp_table.create() fail again with a 504. It looks to me like the googleapis commit you linked did not raise the timeout for CreateInstance.
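
A possible test-side mitigation while the generated deadlines get sorted out (a sketch only, not necessarily what the linked fixes do; `operation` and `temp_table` are the objects from the test above): give the long-running create real headroom and retry the admin call on 504s.

    from google.api_core import exceptions
    from google.api_core.retry import Retry, if_exception_type

    # Allow the two-cluster CreateInstance operation up to ten minutes
    # instead of the ten seconds the test uses today.
    operation.result(timeout=600)

    # Retry the CreateTable call on DEADLINE_EXCEEDED with exponential
    # backoff, giving up after 15 minutes overall.
    retry_504 = Retry(
        predicate=if_exception_type(exceptions.DeadlineExceeded),
        deadline=900,
    )
    retry_504(temp_table.create)()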
