Skip to content

Conversation

vandop
Copy link

@vandop vandop commented Oct 1, 2025

fixes #863

Rationale for this change

When using the Flight SQL JDBC driver with connection pooling and a catalog parameter, ArrowFlightSqlClientHandler.close() performs a CloseSession RPC that can fail during gRPC channel shutdown. These transient failures (UNAVAILABLE or INTERNAL with "Connection closed after GOAWAY") bubble up as SQLException and cause noisy errors in pooling frameworks like Apache Commons DBCP.

What changes are included in this PR?

Core Changes

  • Enhanced error suppression in ArrowFlightSqlClientHandler.close() for both:
    • CloseSession RPC calls
    • AutoCloseable resource cleanup
  • Code refactoring to eliminate duplication:
    • Created reusable helper methods: isBenignCloseException(), logSuppressedCloseException(), handleBenignCloseException()

Error Suppression Logic

Suppress only these specific transient shutdown conditions:

  • FlightStatusCode.UNAVAILABLE
  • FlightStatusCode.INTERNAL with message containing "Connection closed after GOAWAY"

Logging Improvements

  • Context-aware logging: DEBUG mode shows full stack traces, INFO mode shows clean messages
  • Fixed typo: "Supressed" → "Suppressed" in existing PreparedStatement error handling
  • Consistent format: All suppression follows same logging pattern

Are there any user-facing changes?

Yes - Improved user experience:

  • Eliminates noisy shutdown errors when using catalog parameters with connection pooling
  • Maintains error visibility for genuine failures (non-transient errors still throw SQLException)
  • Better compatibility with pooling frameworks (JMeter, DBCP, etc.)

How was this patch tested?

Integration Testing

  • Environment: JDK 21, JMeter 5.6.3
  • Target: Dremio Cloud (data.dremio.cloud:443)
  • Test URL: jdbc:arrow-flight-sql://data.dremio.cloud:443/?token=<PAT>&catalog=<PROJECT_ID>
  • JMeter Dremio test suite: Dremio performance tests were patched with a custom build with this change and are back to a working state (previously were using the legacy Dremio JDBC driver)

Validation Steps

  1. Before patch: Intermittent UNAVAILABLE/INTERNAL("Connection closed after GOAWAY") errors during connection close
  2. After patch: Benign errors logged as INFO, genuine failures still surface as SQLException
  3. Regression testing: All existing functionality preserved

Vando Pereira and others added 3 commits September 23, 2025 17:52
…hen catalog is set

This change addresses race conditions during gRPC channel shutdown that occur
when using connection pooling with catalog parameters. The CloseSession RPC
can fail with UNAVAILABLE or 'Connection closed after GOAWAY' errors during
normal connection cleanup.

Key improvements:
- Refactored duplicate exception handling code into reusable helper methods
- Added comprehensive error suppression for both AutoCloseable cleanup and CloseSession
- Follows the established ARROW-17785 pattern from PreparedStatement.close()
- Improved logging with context-aware debug/info messages
- Fixed typo in existing error suppression logging

The refactoring eliminates code duplication while maintaining identical
functionality and improving maintainability.
ARROW-17785: [Flight SQL][JDBC] Suppress benign CloseSession errors when catalog is set
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to unit test this?

Vando Pereira added 2 commits October 3, 2025 10:41
…on error suppression

Add unit and integration tests for the error suppression functionality
in ArrowFlightSqlClientHandler:

- ArrowFlightSqlClientHandlerTest: 18 unit tests covering error detection,
  logging, and exception handling logic using Mockito and reflection
- ArrowFlightSqlClientHandlerIntegrationTest: 4 integration tests with
  real FlightServer to validate error suppression in realistic scenarios

Tests verify that benign gRPC shutdown errors (UNAVAILABLE and INTERNAL
with GOAWAY) are properly suppressed while genuine failures are correctly
propagated as exceptions.
…reliability

This commit addresses bugs introduced in the error suppression implementation:

1. Fixed NullPointerException in isBenignCloseException() when
   FlightRuntimeException.getMessage() returns null. Added null check
   before calling contains() on the message string.

2. Fixed unit test setup to avoid attempting real server connections
   during test initialization. Tests now use reflection to test private
   methods without requiring actual network connections.

3. Fixed Mockito unnecessary stubbing warnings by making all mock
   objects lenient, allowing tests to create comprehensive mocks
   without triggering warnings when not all stubbings are used.

4. Simplified integration tests to focus on testable scenarios.
   Removed tests that required mocking gRPC service methods (closeSession)
   which are not routed through FlightProducer, making them difficult
   to test in isolation.

Test Results:
- 21 tests total (15 unit + 1 integration + 5 builder tests)
- All tests passing with 0 failures and 0 errors
- Comprehensive coverage of error suppression logic via reflection-based
  unit tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ArrowFlightSqlClientHandler.close() performs a CloseSession RPC that can fail during gRPC channel shutdown - similar to ARROW-17785
2 participants