Description
It seems that InfluxDB Cloud on GCP [1] was down sometime between 14:16 and 15:12 PDT (possibly for longer than this). I wasn't able to view the metrics in the InfluxDB dashboard (from my home network), and several of my applications that I am trying InfluxDB out with (running on multiple different GCP VMs) also got "Connection timed out" errors [2]. Ideally, the client should retry once the backend comes back up. Instead, I see a dip in the dashboard for that period [3], although I am confident that the metrics client was being called to write the metrics.
I did not restart my system, yet writes made after this InfluxDB downtime are visible in the dashboard.
If retry is already implemented, how can I configure the retry strategy to allow more retries over a longer period of time?
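To illustrate the behavior I have in mind, here is a minimal sketch using only the Python standard library. The names `write_with_retry`, `max_retries`, `base_delay`, and `max_delay` are hypothetical and are not taken from the InfluxDB client API; the `write` argument stands in for whatever call actually sends a batch.

```python
import random
import time


def write_with_retry(write, batch, max_retries=5, base_delay=1.0,
                     max_delay=60.0, sleep=time.sleep):
    """Call write(batch), retrying connection failures with exponential backoff.

    Hypothetical helper for illustration only; not the InfluxDB client API.
    """
    attempt = 0
    while True:
        try:
            return write(batch)
        except (ConnectionError, TimeoutError):
            # Assumption: a real client would likely need to catch the
            # connection exceptions raised by its HTTP library here as well.
            if attempt >= max_retries:
                raise  # give up once max_retries retries have been used
            # Delay doubles on each retry and is capped at max_delay,
            # with up to 10% random jitter added on top.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
            attempt += 1
```

With these example defaults the helper would wait roughly 1 + 2 + 4 + 8 + 16 seconds in total before giving up, so covering an outage of about an hour would need a larger `max_retries` or `max_delay`.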
[1] My host is https://us-central1-1.gcp.cloud2.influxdata.com/.
[2] The batch item wasn't processed successfully because: HTTPSConnectionPool(host='us-central1-1.gcp.cloud2.influxdata.com', port=443): Max retries exceeded with url: /api/v2/write?org=<REDACTED>&bucket=<REDACTED>+Bucket&precision=ns (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa44a34b6d8>: Failed to establish a new connection: [Errno 110] Connection timed out'))
[3]