Enhance Snowflake SQL API Hook with Retry Logic for Query Status Polling #50514

@jonathanleek

Description

Apache Airflow Provider(s)

snowflake

Versions of Apache Airflow Providers

We’ve identified an issue with the SnowflakeSqlApiHook used by the SnowflakeSqlApiOperator in the Apache Airflow Snowflake provider. In certain cases, after submitting a query to Snowflake’s SQL API, the initial request succeeds, but polling the status endpoint fails due to a RemoteDisconnected error, caused by the remote Snowflake endpoint forcefully closing a connection. This is likely due to the reuse of a stale connection or other transient TCP-level issues.

Why This Matters
• Ensures robustness of asynchronous query execution using the SQL API.
• Avoids failed tasks due to transient connection pool or network issues.
• Aligns with best practices in client design for polling APIs over HTTP.

Apache Airflow version

2.9.3

Operating System

Debian

Deployment

Astronomer

Deployment details

No response

What happened

Summary of Issue
• The initial POST to submit the query succeeds.
• The subsequent GET to the statementStatusUrl fails with:
    RemoteDisconnected('Remote end closed connection without response')
  The failure is raised at the application level and results in the task being marked as failed.
• Retrying at the task/operator level is not viable when:
    • The task is considered successful from Airflow’s point of view (query submitted).
    • The query is not idempotent (e.g., it modifies data), so re-running it would cause issues.

Snowflake confirmed this is a client-side (Airflow hook) implementation gap:
• The polling request should retry on connection-level failures (like RemoteDisconnected), possibly with exponential backoff.
• This would avoid needing to retry the entire query from the start.
• The current implementation makes a single attempt to poll the status endpoint, which is fragile in cloud network environments.
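
The retry behavior suggested above could look roughly like the following. This is a minimal sketch, not the hook's actual code: `poll_with_retry`, `poll_once`, `RETRYABLE`, and all parameter names and defaults are hypothetical. It uses only the standard library; if the hook issues its requests through the `requests` library, the exception to catch would likely be `requests.exceptions.ConnectionError` instead, since `requests` wraps the underlying error.

```python
import http.client
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

# Connection-level errors treated as transient (an assumption of this sketch).
# RemoteDisconnected is the error reported in this issue.
RETRYABLE: Tuple[Type[BaseException], ...] = (
    http.client.RemoteDisconnected,
    ConnectionError,
    TimeoutError,
)


def poll_with_retry(
    poll_once: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call poll_once(), retrying connection-level failures with backoff.

    Only the status poll is retried; the query itself is never resubmitted,
    so this stays safe for non-idempotent statements.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return poll_once()
        except RETRYABLE:
            if attempt == max_attempts:
                raise  # retry budget exhausted: surface the original error
            # Exponential backoff (1s, 2s, 4s, ...) capped at max_delay,
            # plus up to 10% jitter so concurrent tasks do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Any other exception (for example, an error status returned by the API) propagates immediately, so genuine query failures are not masked by the retry loop.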

What you think should happen instead

No response

How to reproduce

  1. Create a long-running query.
  2. Kill the TCP connection during polling.
  3. The logs will show something like:
    RemoteDisconnected('Remote end closed connection without response') ... raise ValueError({"status": "error", "message": str(e)})
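
For a local approximation of step 2 that does not need a Snowflake account, the sketch below starts a throwaway TCP server that reads one request and then closes the socket without answering. It is an illustration only: `start_disconnecting_server`, `poll_status_once`, and the request path are hypothetical, and the client uses the standard library's `http.client` rather than the hook itself.

```python
import http.client
import socket
import threading
from typing import Tuple


def start_disconnecting_server() -> Tuple[socket.socket, int]:
    """Start a one-shot server that drops the connection without responding."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)

    def serve_once() -> None:
        conn, _ = server.accept()
        conn.recv(65536)  # read the request first so the close is a clean one
        conn.close()      # no status line, no body: the remote end just goes away

    threading.Thread(target=serve_once, daemon=True).start()
    return server, server.getsockname()[1]


def poll_status_once(port: int, path: str = "/statements/hypothetical-id") -> int:
    """Issue a single GET, standing in for one status poll."""
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", path)
        return client.getresponse().status
    finally:
        client.close()


if __name__ == "__main__":
    server, port = start_disconnecting_server()
    try:
        poll_status_once(port)
    except http.client.RemoteDisconnected as exc:
        print(repr(exc))
    finally:
        server.close()
```

A single unguarded poll against this server is expected to fail in the same way as the GET described in this report, which makes it a convenient target for exercising retry logic in a unit test.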

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct
