alexguo-db and others added 30 commits May 27, 2025 13:37
…ide properties (apache#2885)

- The Mashup engine converts all property keys to lower case before
opening a connection with the driver.
- As a result, server-side property passthrough with Power BI/PQTest
doesn't work, because none of the properties will start with `SSP_`.
- This PR changes the prefix to `ssp_` and converts keys to lowercase,
so the driver still works with properties set with `SSP_`.
- Tested by setting a server-side property through PQTest.
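
A minimal sketch of the case-insensitive prefix handling described above, assuming a hypothetical helper that extracts server-side properties from the connection options (names are illustrative, not the driver's actual API):

```csharp
using System;
using System.Collections.Generic;

static class ServerSidePropertyExtractor
{
    // The PR lowercases the prefix to "ssp_" so that keys lowercased by the
    // Mashup engine are still recognized.
    private const string Prefix = "ssp_";

    public static Dictionary<string, string> Extract(IReadOnlyDictionary<string, string> options)
    {
        var serverSideProperties = new Dictionary<string, string>();
        foreach (var kvp in options)
        {
            // Compare case-insensitively so both "SSP_foo" and "ssp_foo" match.
            if (kvp.Key.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            {
                // Strip the prefix; the remainder is the server-side property name.
                serverSideProperties[kvp.Key.Substring(Prefix.Length)] = kvp.Value;
            }
        }
        return serverSideProperties;
    }
}
```
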
…che#2884)

# Arrow-ADBC Catalog Parameter Support

## Overview
This branch adds support for properly handling catalog parameters in the
C# ADBC driver for Databricks, with special handling for the legacy
"SPARK" catalog. The implementation ensures backward compatibility with
tools that expect "SPARK" to act as a default catalog alias while also
supporting multiple catalog configurations.

## Key Changes

1. **Legacy SPARK Catalog Handling**: Added special handling for the
"SPARK" catalog name, treating it as a null catalog to trigger default
catalog behavior in the underlying API.

2. **Multiple Catalog Support**: Enhanced the implementation to respect
the `EnableMultipleCatalogSupport` configuration setting, with proper
behavior for both enabled and disabled states.

3. **Metadata Query Optimization**: Implemented optimized metadata query
handling (GetCatalogs, GetSchemas, GetTables, GetColumns) that avoids
unnecessary RPC calls when possible.

4. **Protected Method Visibility**: Modified method visibility in base
classes to allow proper overriding in derived classes, enabling
specialized catalog handling.

5. **Comprehensive Testing**: Added tests to verify the behavior of
metadata queries with different catalog settings and configurations.

## Implementation Details

The implementation follows a clean approach that:
- Properly handles the "SPARK" catalog as a special case
- Returns synthetic results for certain metadata queries to avoid
unnecessary network calls
- Maintains schema compatibility between real and synthetic results
- Provides proper encapsulation of the multiple catalog support flag

These changes improve compatibility with various client tools while
maintaining performance and adhering to the ADBC specification.
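
A minimal sketch of the legacy-catalog handling described above, assuming the driver normalizes the catalog name before issuing metadata calls (the method and class names here are illustrative, not the actual driver surface):

```csharp
using System;

static class CatalogNormalization
{
    private const string LegacySparkCatalog = "SPARK";

    // Sketch: map the legacy "SPARK" catalog to "no catalog" so the server's
    // default-catalog behavior applies, and drop the catalog filter entirely
    // when multiple-catalog support is disabled.
    public static string? NormalizeCatalog(string? catalogName, bool enableMultipleCatalogSupport)
    {
        if (string.Equals(catalogName, LegacySparkCatalog, StringComparison.OrdinalIgnoreCase))
            return null;

        if (!enableMultipleCatalogSupport)
            return null;

        return catalogName;
    }
}
```
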
…data Optimization (apache#2886)

## Arrow ADBC: Primary Key and Foreign Key Metadata Optimization

### Description

This PR adds support for optimizing Primary Key and Foreign Key metadata
queries in the C# Databricks ADBC driver. It introduces a new connection
parameter `adbc.databricks.enable_pk_fk` that allows users to control
whether the driver should make PK/FK metadata calls to the server or
return empty results for improved performance.

### Background

Primary Key and Foreign Key metadata queries can be expensive
operations, particularly in Databricks environments where they may not
be fully supported in certain catalogs. This implementation provides a
way to optimize these operations by:

1. Allowing users to disable PK/FK metadata calls entirely via
configuration
2. Automatically returning empty results for legacy catalogs (SPARK,
hive_metastore) where PK/FK metadata is not supported
3. Ensuring that empty results maintain schema compatibility with real
metadata responses

### Proposed Changes

- Add new connection parameter `adbc.databricks.enable_pk_fk` to control
PK/FK metadata behavior (default: true)
- Implement special handling for legacy catalogs (SPARK, hive_metastore)
to return empty results without server calls
- Modify method visibility in base classes to allow proper overriding in
derived classes
- Add comprehensive test coverage for the new functionality

### How is this tested?

Added unit tests that verify:
1. The correct behavior of the `ShouldReturnEmptyPkFkResult` method with
various combinations of settings
2. Schema compatibility between empty results and real metadata
responses
3. Proper handling of different catalog scenarios

These tests ensure that the optimization works correctly while
maintaining compatibility with client applications that expect
consistent schema structures.
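
A minimal sketch of the decision described above, assuming a predicate similar in spirit to `ShouldReturnEmptyPkFkResult` (the real signature and the construction of the schema-compatible empty result are not shown in this PR description):

```csharp
using System;
using System.Collections.Generic;

static class PkFkOptimization
{
    // Legacy catalogs where PK/FK metadata is not supported.
    private static readonly HashSet<string> LegacyCatalogs =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "SPARK", "hive_metastore" };

    public static bool ShouldReturnEmptyPkFkResult(bool enablePkFk, string? catalogName)
    {
        // adbc.databricks.enable_pk_fk=false disables PK/FK metadata calls entirely.
        if (!enablePkFk)
            return true;

        // For legacy catalogs, skip the server round trip and return an empty
        // (but schema-compatible) result instead.
        return catalogName != null && LegacyCatalogs.Contains(catalogName);
    }
}
```
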
…o handle presigned URL expiration in CloudFetch (apache#2855)

### Problem
The Databricks driver's CloudFetch functionality was not properly
handling expired cloud file URLs, which could lead to failed downloads
and errors during query execution. The system needed a way to track,
cache, and refresh presigned URLs before they expire.

### Solution
- Improve `CloudFetchResultFetcher` class that:
  - Manages a cache of cloud file URLs with their expiration times
  - Proactively refreshes URLs that are about to expire
  - Provides thread-safe access to URL information
- Added an `IClock` interface and implementations to facilitate testing
with controlled time
- Extended the `IDownloadResult` interface to support URL refreshing and
expiration checking
- Updated namespace from
`Apache.Arrow.Adbc.Drivers.Apache.Databricks.CloudFetch` to
`Apache.Arrow.Adbc.Drivers.Databricks.CloudFetch` for better
organization
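
A minimal sketch of the expiration tracking described above, with a simple `IClock` abstraction so the refresh logic can be tested with controlled time (the exact types in the driver may differ):

```csharp
using System;

// Abstraction over "now" so expiration logic can be unit tested with a fake clock.
public interface IClock
{
    DateTimeOffset UtcNow { get; }
}

public sealed class SystemClock : IClock
{
    public DateTimeOffset UtcNow => DateTimeOffset.UtcNow;
}

// A cached presigned URL plus the logic to decide when it should be refreshed.
public sealed class CachedCloudFileUrl
{
    // Refresh a bit before the URL actually expires.
    private static readonly TimeSpan RefreshMargin = TimeSpan.FromMinutes(1);

    public CachedCloudFileUrl(string url, DateTimeOffset expiresAt)
    {
        Url = url;
        ExpiresAt = expiresAt;
    }

    public string Url { get; private set; }
    public DateTimeOffset ExpiresAt { get; private set; }

    public bool NeedsRefresh(IClock clock) => clock.UtcNow >= ExpiresAt - RefreshMargin;

    public void Update(string url, DateTimeOffset expiresAt)
    {
        Url = url;
        ExpiresAt = expiresAt;
    }
}
```
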
… GetColumnsExtended (apache#2894)

## PR Description

### Description
This PR fixes an issue with foreign key handling in the
`GetColumnsExtended` method by refactoring the cross-reference lookup
process. It extracts the foreign key lookup logic into a dedicated
method and improves the handling of empty result sets.

### Changes
- Created a new `GetCrossReferenceAsForeignTableAsync` method to
encapsulate the logic for retrieving cross-reference data where the
current table is treated as a foreign table
- Added constants for primary and foreign key field names and prefixes
to improve maintainability
- Implemented proper handling for empty result sets in
`GetColumnsExtendedAsync` by creating a helper method
`CreateEmptyExtendedColumnsResult`
- Extended the Databricks driver to override the new
`GetCrossReferenceAsForeignTableAsync` method to maintain consistency
with the empty result handling logic
- Added TODOs for future logging improvements

### Motivation
The existing implementation had several issues:
1. The foreign key lookup logic was embedded directly in the
`GetColumnsExtendedAsync` method, making it difficult to maintain and
extend; it also did not go through the overridden function in the
`DatabricksStatement` child class.
2. Empty result sets were not handled properly, potentially causing
errors when no columns were returned
3. The code lacked consistency in how it handled relationship data

### Testing
The changes have been tested with both empty and populated tables to
ensure proper handling of all scenarios. The refactored code maintains
the same functionality while improving code organization and error
handling.

### Related Issues
This PR is related to the ongoing improvements in primary and foreign
key metadata handling in the ADBC drivers.
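
A minimal sketch of the extracted lookup described above, with the cross-reference call pulled into an overridable method so a derived statement (e.g. a Databricks-specific one) can keep empty-result handling consistent (types and signatures are simplified and illustrative, not the driver's actual API):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class StatementSketch
{
    // Overridable: treats the current table as the *foreign* side of the relationship.
    protected virtual Task<IReadOnlyList<ForeignKeyRow>> GetCrossReferenceAsForeignTableAsync(
        string? catalog, string? schema, string table, CancellationToken cancellationToken)
    {
        // Base behavior: no cross-reference information available.
        return Task.FromResult<IReadOnlyList<ForeignKeyRow>>(Array.Empty<ForeignKeyRow>());
    }

    public async Task<IReadOnlyList<ForeignKeyRow>> GetColumnsExtendedAsync(
        string? catalog, string? schema, string table, CancellationToken cancellationToken)
    {
        // The extended-columns query delegates to the overridable lookup instead
        // of embedding the foreign key logic inline.
        var foreignKeys = await GetCrossReferenceAsForeignTableAsync(
            catalog, schema, table, cancellationToken).ConfigureAwait(false);

        // Empty result sets are legal and must not be treated as an error.
        return foreignKeys;
    }
}

// Illustrative row type; the real driver returns Arrow data, not POCOs.
public sealed record ForeignKeyRow(string PkTableName, string FkColumnName);
```
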
…g tests (apache#2898)

This PR builds on
apache@317c9c9,
which re-fetches expired CloudFetch URLs, enabling longer CloudFetch
streaming reads.

This PR enables the long-running status poller test for CloudFetch
reading.

Tested locally.
…o 1.14.1 in /go/adbc (apache#2915)

Bumps
[github.com/snowflakedb/gosnowflake](https://github.com/snowflakedb/gosnowflake)
from 1.14.0 to 1.14.1.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…pache#2911)

Bumps [org.junit:junit-bom](https://github.com/junit-team/junit5) from
5.12.2 to 5.13.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…9.4 in /java (apache#2908)

Bumps `dep.org.checkerframework.version` from 3.49.3 to 3.49.4.
Updates `org.checkerframework:checker-qual` from 3.49.3 to 3.49.4

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
….0 in /go/adbc (apache#2914)

Bumps
[cloud.google.com/go/bigquery](https://github.com/googleapis/google-cloud-go)
from 1.68.0 to 1.69.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…44.4 to 2.44.5 in /java (apache#2912)

Bumps
[com.diffplug.spotless:spotless-maven-plugin](https://github.com/diffplug/spotless)
from 2.44.4 to 2.44.5.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…/java (apache#2910)

Bumps [org.postgresql:postgresql](https://github.com/pgjdbc/pgjdbc) from
42.7.5 to 42.7.6.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…31.1 in /java (apache#2909)

Bumps
[com.google.protobuf:protobuf-java](https://github.com/protocolbuffers/protobuf)
from 4.31.0 to 4.31.1.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… /go/adbc (apache#2913)

Bumps
[google.golang.org/api](https://github.com/googleapis/google-api-go-client)
from 0.234.0 to 0.235.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…che#2896)

This PR adds support for edge cases that the ODBC driver supported. It
also allows users to specify a default schema without a default catalog
(relying on the default catalog set by the backend server).

### Default schema fallback
If the Spark protocol version is too low, the initial namespace will not
be respected. This PR sets `USE <schema>` as a backup immediately after
`TOpenSessionResp`. `SET CATALOG <catalog>` is not implemented, since it
was introduced at the same time as InitialNamespace.

Additionally, the schema can now be provided in the OpenSessionReq
without a catalog.

This is relevant for setting default schemas for DBR < 10.4: it means
that we can now provide a default schema for pre-Unity Catalog clusters.

#### Testing
To test this, add the following to DATABRICKS_TEST_CONFIG_FILE:
```
// no catalog
"db_schema": "default_schema",
```
Then, using a cluster running DBR < 10.4, run
`OlderDBRVersion_ShouldSetSchemaViaUseStatement`.
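
A minimal sketch of the fallback described above, assuming a hypothetical helper invoked right after the session opens (the protocol-version check and statement execution are simplified and are not the driver's actual API):

```csharp
static class DefaultSchemaFallback
{
    // Sketch: when the server is too old to honor the initial namespace, build a
    // "USE <schema>" statement to run right after TOpenSessionResp. "SET CATALOG"
    // is deliberately not attempted, since servers that old do not support it either.
    public static string? BuildUseSchemaStatement(string? defaultSchema, bool serverSupportsInitialNamespace)
    {
        if (serverSupportsInitialNamespace || string.IsNullOrEmpty(defaultSchema))
            return null;

        // Quote with backticks so schema names with special characters survive.
        return $"USE `{defaultSchema.Replace("`", "``")}`";
    }
}
```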

### Default catalog legacy compatibility
If a default catalog or default schema is provided in the
`TOpenSessionResp`, we want subsequent getTables calls to use the
default namespace. This lets Power BI act as if it is pre-Unity Catalog,
since it will automatically make all getTables requests with the default
catalog, which is often `hive_metastore`.

If no schema is provided in the OpenSessionReq, only the default catalog
will be returned. This means that whether metadata queries have a
default schema depends on whether "db_schema" is provided.

So, in Power BI, when only a default schema is provided but not a
default catalog, it will automatically restrict itself to the default
catalog.

#### Testing

Added `MetadataQuery_ShouldUseResponseNamespace`
Fixed only comments and messages.
There are no code changes. 

This PR is intended to fix typos in `c/`, but I also fixed
`go/adbc/drivermgr/arrow-adbc/adbc.h` because it needs to stay in sync
with `adbc.h` in `c/`.

Closes apache#2923
…etadata command (apache#2920)

# Add support for escaping underscores in metadata queries

## Why
For the current Power BI ODBC connection we escape all underscores.
I thought about changing the connector side to always pass in the
escaped name, but that is not feasible: later we will introduce SQL-text
based metadata queries such as DESCRIBE TABLE `xxx` AS JSON, where we
would expect xxx not to be escaped, since it will be treated as text.
The best approach is therefore to introduce a new parameter in the
metadata API that specifies whether the client wants underscores
escaped; the driver can then behave differently depending on the calling
method (Thrift API or SQL text).

## Description
This PR adds support for escaping underscores in metadata query
parameters through a new parameter
`adbc.get_metadata.escape_underscore`. When enabled, underscores in
catalog, schema, table, and column names will be treated as literal
characters rather than SQL wildcards.
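
A minimal sketch of the escaping behavior, assuming a helper along the lines of the `EscapeUnderscoreInName` method listed in the changes below (the escape character and pattern semantics in the real driver may differ):

```csharp
static class MetadataNameEscaping
{
    // Sketch: when the option is enabled, treat "_" as a literal character in
    // LIKE-style metadata patterns by prefixing it with a backslash.
    public static string? EscapeUnderscoreInName(string? name, bool escapeUnderscore)
    {
        if (!escapeUnderscore || string.IsNullOrEmpty(name))
            return name;

        return name.Replace("_", "\\_");
    }
}
```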


## Changes
- Added new parameter `EscapeUnderscore` to control underscore escaping
behavior
- Added `EscapeUnderscoreInName` helper method to handle underscore
escaping
- Updated all metadata query methods to use the escaping functionality:
  - GetCatalogsAsync
  - GetSchemasAsync
  - GetTablesAsync
  - GetColumnsAsync
  - GetPrimaryKeysAsync
  - GetCrossReferenceAsync
  - GetCrossReferenceAsForeignTableAsync

## Testing
- Added test case in `CanGetColumnsExtended` to verify the escaping
functionality
- Verified that null values are handled correctly
- Verified that escaping only occurs when the flag is enabled

## Usage
To enable underscore escaping, set the parameter:
```csharp
statement.SetOption(ApacheParameters.EscapeUnderscore, "true");
```

## Impact
This change allows users to query metadata for objects that contain
underscores in their names without the underscore being interpreted as a
SQL wildcard character.
…pache#2925)

Increases the DefaultTemporarilyUnavailableRetryTimeout from 500s to
900s to match ODBC/JDBC driver behavior.
Fixed only comments.
There are no code changes. 

Closes apache#2927
…he#2926)

# Improve metadata query handling in C# drivers

## Summary
This PR makes several improvements to metadata query handling in the C#
drivers:

1. Removes underscore escaping for exact match queries (GetPrimaryKeys,
GetCrossReference) in the Hive driver
2. Fixes cross-reference handling in the Databricks driver to check both
primary and foreign catalogs
3. Updates tests to handle different integer types (Int16/Int32) in
FK_INDEX column
4. Adds implementation for foreign key table creation in Databricks
tests
5. Adds special character handling in table names for Databricks tests
to ensure proper escaping

## Motivation
The current implementation incorrectly escapes underscores in exact
match queries, which can lead to incorrect results when querying
metadata for tables with underscores in their names. Additionally, the
Databricks driver was not properly handling cross-references when one
catalog was valid but the other was invalid.

The added special character handling in tests ensures that our drivers
correctly handle table names with special characters, which is important
for real-world scenarios.

## Testing
- Updated existing tests to handle the different integer types that may
be returned
- Added implementation for foreign key table creation in Databricks
tests
- Enhanced test coverage by including special characters in table names
- Passing existing PrimaryKey/ForeignKey tests.
…atabase (apache#2921)

### Changes
This PR makes changes to be consistent with ODBC behavior.


#### ODBC behavior:
- Does not attach the default schema (from OpenSessionResp) to
subsequent queries. This PR changes the driver to match ODBC here, just
to not break anything.
- If catalog == "SPARK", do not set the catalog in the initial
namespace; let the server return the default catalog.

#### ODBC does the following when `EnableMultipleCatalogsSupport=0`
- Do not set catalog in the initial namespace.
- Do not use the OpenSessionResp catalog for subsequent queries
(including metadata queries)


#### ODBC Driver Behavior Testing

| Connection String Catalog/Schema | EnableMultipleCatalogs=False | EnableMultipleCatalogs=True |
|---|---|---|
| Only default catalog = x | OpenSession(null, “default”)<br>GetColumns(null, null, null, null) | OpenSession(x, “default”)<br>GetColumns(x, null, null, null) |
| Only default schema = x | OpenSession(null, x)<br>GetColumns(null, null, null, null) | OpenSession(null, x)<br>GetColumns(hive_metastore, null, null, null) |
| Both catalog + schema x, y | OpenSession(null, y)<br>GetColumns(null, null, null, null) | OpenSession(x, y)<br>GetColumns(x, null, null, null) |
| Only default catalog “SPARK”, schema = null | OpenSessionReq(null, “default”)<br>GetColumns(null, null, null, null) | OpenSessionReq(null, “default”)<br>GetColumns(hive_metastore, null, null, null) |
| Both with default catalog “SPARK”, schema = x | OpenSessionReq(null, x)<br>GetColumns(null, null, null, null) | OpenSessionReq(null, x)<br>GetColumns(hive_metastore, null, null, null) |


### Testing
Adds an extensive grid test to validate different edge cases during
creation



### TODO
- Make initial_namespace.schema "default" as default behavior
- CatalogName + Foreign CatalogName
- Add testing for old dbr (dbr 7.3, 9.1)
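
A minimal sketch of the initial-namespace resolution implied by the ODBC behavior grid above (names and structure are illustrative; the real driver resolves this during connection setup):

```csharp
using System;

static class InitialNamespaceResolution
{
    // Sketch: decide which catalog/schema go into OpenSession, mirroring the
    // ODBC behavior table above.
    public static (string? Catalog, string? Schema) Resolve(
        string? catalog, string? schema, bool enableMultipleCatalogs)
    {
        // "SPARK" is a legacy alias: never send it as the initial catalog.
        if (string.Equals(catalog, "SPARK", StringComparison.OrdinalIgnoreCase))
            catalog = null;

        // With multiple catalogs disabled, the catalog is never set in the
        // initial namespace; the server's default catalog is used instead.
        if (!enableMultipleCatalogs)
            catalog = null;

        return (catalog, schema);
    }
}
```
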
…amp precision to microseconds (apache#2917)

Introduces a new setting to set the maximum timestamp precision to
Microsecond. Setting this value will convert the default Nanosecond
value to Microsecond to avoid the overflow that occurs when a date is
before the year 1678 or after 2262.

Provides a fix for apache#2811 by
creating a workaround that can be set by the caller.

---------

Co-authored-by: David Coe <>
…river (apache#2825)

Includes some improvements to ease instrumentation, plus
instrumentation for the Snowflake driver.

## Example

```golang
func (base *ImplementsAdbcOTelTracing) MyFunction(ctx context.Context) (nRows int64, err error) {
    nRows = -1

    ctx, span := utils.StartSpan(ctx, "MyFunction", st)
    defer func() {
        // Optionally set attributes before exiting the function
        span.SetAttributes(semconv.DBResponseReturnedRowsKey.Int64(nRows))
        // MUST call this to ensure the span completes
        utils.EndSpan(span, err)
    }()

    // ... do the actual work here, setting nRows and err ...
    return nRows, err
}
```
The driver manager was leaking array streams in the case that you
executed a query, *did not fetch the result*, then did another
operation. In this case, because the stream was never imported, it would
silently leak. Avoid this by explicitly/directly releasing any extant
array stream before another driver operation.
No change for production code.
Some test code has been changed, but the changes are unlikely to affect
the tests.

Closes apache#2932
Performed the following updates:
- Updated Azure.Identity from 1.13.2 to 1.14.0 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/BigQuery/Apache.Arrow.Adbc.Tests.Drivers.BigQuery.csproj
- Updated BenchmarkDotNet from 0.14.0 to 0.15.1 in
/csharp/Benchmarks/Benchmarks.csproj, /csharp/Directory.Build.props,
/csharp/Directory.Build.targets
- Updated DuckDB.NET.Bindings.Full from 1.2.1 to 1.3.0 in
/csharp/Benchmarks/Benchmarks.csproj, /csharp/Directory.Build.props,
/csharp/Directory.Build.targets,
/csharp/test/Apache.Arrow.Adbc.Tests/Apache.Arrow.Adbc.Tests.csproj
- Updated Microsoft.NET.Test.Sdk from 17.12.0 to 17.14.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Apache.Arrow.Adbc.Tests/Apache.Arrow.Adbc.Tests.csproj
- Updated Microsoft.NET.Test.Sdk from 17.12.0 to 17.14.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/Apache/Apache.Arrow.Adbc.Tests.Drivers.Apache.csproj
- Updated Microsoft.NET.Test.Sdk from 17.12.0 to 17.14.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/BigQuery/Apache.Arrow.Adbc.Tests.Drivers.BigQuery.csproj
- Updated Microsoft.NET.Test.Sdk from 17.12.0 to 17.14.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/Databricks/Apache.Arrow.Adbc.Tests.Drivers.Databricks.csproj
- Updated Microsoft.NET.Test.Sdk from 17.12.0 to 17.14.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/FlightSql/Apache.Arrow.Adbc.Tests.Drivers.FlightSql.csproj
- Updated Microsoft.NET.Test.Sdk from 17.12.0 to 17.14.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/Interop/FlightSql/Apache.Arrow.Adbc.Tests.Drivers.Interop.FlightSql.csproj
- Updated Microsoft.NET.Test.Sdk from 17.12.0 to 17.14.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/Interop/Snowflake/Apache.Arrow.Adbc.Tests.Drivers.Interop.Snowflake.csproj
- Updated xunit.runner.visualstudio from 3.1.0 to 3.1.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Apache.Arrow.Adbc.Tests/Apache.Arrow.Adbc.Tests.csproj
- Updated xunit.runner.visualstudio from 3.1.0 to 3.1.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/Apache/Apache.Arrow.Adbc.Tests.Drivers.Apache.csproj
- Updated xunit.runner.visualstudio from 3.1.0 to 3.1.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/BigQuery/Apache.Arrow.Adbc.Tests.Drivers.BigQuery.csproj
- Updated xunit.runner.visualstudio from 3.1.0 to 3.1.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/Databricks/Apache.Arrow.Adbc.Tests.Drivers.Databricks.csproj
- Updated xunit.runner.visualstudio from 3.1.0 to 3.1.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/FlightSql/Apache.Arrow.Adbc.Tests.Drivers.FlightSql.csproj
- Updated xunit.runner.visualstudio from 3.1.0 to 3.1.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/Interop/FlightSql/Apache.Arrow.Adbc.Tests.Drivers.Interop.FlightSql.csproj
- Updated xunit.runner.visualstudio from 3.1.0 to 3.1.1 in
/csharp/Directory.Build.props, /csharp/Directory.Build.targets,
/csharp/test/Drivers/Interop/Snowflake/Apache.Arrow.Adbc.Tests.Drivers.Interop.Snowflake.csproj

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
amoeba and others added 27 commits August 4, 2025 17:49
Clarifies the relation between this document and the authoritative
definition of the ADBC API.
…lt true (apache#3232)

## Motivation

In PR apache#3171, the `RunAsync` option
in `TExecuteStatementReq` was added and exposed via the connection
parameter `adbc.databricks.enable_run_async_thrift`, but it is not
enabled by default. It is turned on by default in other Databricks
drivers, so we should turn it on by default in ADBC as well.

## Change
- Set `DatabricksConnection:_runAsyncInThrift` default value to `true`

## Test
- PBI PQTest with all the test cases
Bumps [ruby/setup-ruby](https://github.com/ruby/setup-ruby) from 1.253.0
to 1.254.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…3236)

Bumps
[google-github-actions/auth](https://github.com/google-github-actions/auth)
from 2.1.11 to 2.1.12.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [docker/login-action](https://github.com/docker/login-action) from
3.4.0 to 3.5.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… /go/adbc (apache#3233)

Bumps
[google.golang.org/api](https://github.com/googleapis/google-api-go-client)
from 0.243.0 to 0.244.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…adbc (apache#3235)

Bumps [modernc.org/sqlite](https://gitlab.com/cznic/sqlite) from 1.38.1
to 1.38.2.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…e validation for Spark, Impala & Hive (apache#3224)

Co-authored-by: Sudhir Emmadi <emmadisudhir@microsoft.com>
…_driver_manager package (apache#3197)

Part of apache#3106

Removes the `driver_manager` feature of adbc_core and adds a new
adbc_driver_manager package instead.

Crates that depended on the `driver_manager` feature, such as
adbc_snowflake, will need to be updated to include adbc_driver_manager
as a dependency.

---------

Co-authored-by: David Li <li.davidm96@gmail.com>
Improves the existing go pkgsite by adding a README.

This is an alternative to
apache#3199.
As per the
[docs](https://arrow.apache.org/adbc/main/format/driver_manifests.html#manifest-structure)
the Driver manifest allows overriding the `entrypoint` via the
`Driver.entrypoint` key. Rust follows this properly, but C++ checks for
a top-level key named `entrypoint` instead of following the docs. This
PR fixes this so that the C++ driver manager correctly looks for
`Driver.entrypoint`.
…d fields (apache#3240)

Co-authored-by: Xuliang (Harry) Sun <32334165+xuliangs@users.noreply.github.com>
…ge (apache#3244)

Current error messages do not contain details for what occurred, only a
message like:

`Cannot execute <ReadChunkWithRetries>b__0 after 5 tries`

This adds the Message of the last exception that occurred as well.
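
A minimal sketch of the improved message, assuming a hypothetical retry helper (the real driver's retry wrapper and exception types are not shown in this description):

```csharp
using System;
using System.Threading.Tasks;

static class RetrySketch
{
    public static async Task<T> ExecuteWithRetriesAsync<T>(Func<Task<T>> action, string name, int maxTries)
    {
        Exception? last = null;
        for (int attempt = 1; attempt <= maxTries; attempt++)
        {
            try
            {
                return await action().ConfigureAwait(false);
            }
            catch (Exception ex)
            {
                last = ex;
            }
        }

        // Include the last failure's message so the error is actionable,
        // instead of only "Cannot execute <name> after N tries".
        throw new InvalidOperationException(
            $"Cannot execute {name} after {maxTries} tries: {last?.Message}", last);
    }
}
```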

Co-authored-by: David Coe <>
)

Modifies the behavior of GetSearchPaths so macOS doesn't follow other
Unix-likes but instead uses the more conventional `/Library/Application
Support/ADBC`. `/etc/` isn't really a thing on macOS.

Also updates the driver manifest docs to call this new behavior out.

Closes apache#3247.
… and StatusPoller to Stop/Dispose Appropriately (apache#3217)

### Motivation
The following cases are not properly stopping or disposing the status
poller:
1. The DatabricksCompositeReader is explicitly disposed by the user
2. The CloudFetchReader is done returning results
3. An edge-case terminal operation status is reached (timedout_state,
unknown_state)

In addition:
- When DatabricksOperationStatusPoller.Dispose() is called, it may
cancel the GetOperationStatusRequest in the client. If the input buffer
has data and cancellation is triggered, the TCLI client is left with
unconsumed/unsent data in the buffer, breaking subsequent requests
(fixed in this PR).

### Fixes

The DatabricksOperationStatusPoller logic is now managed by
DatabricksCompositeReader (moved out of BaseDatabricksReader) so that it
handles all cases where null results (indicating completion) are
returned.

Disposing DatabricksCompositeReader now appropriately disposes the
activeReader and statusPoller.
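
A minimal sketch of the ownership and disposal relationship described above (field and type names follow the description, but the shape is illustrative, not the actual driver code):

```csharp
using System;

// Sketch: the composite reader owns both the active reader and the status
// poller, so disposing it stops polling and releases the reader in one place.
public sealed class DatabricksCompositeReaderSketch : IDisposable
{
    private IDisposable? _activeReader;
    private IDisposable? _statusPoller;
    private bool _disposed;

    public DatabricksCompositeReaderSketch(IDisposable activeReader, IDisposable statusPoller)
    {
        _activeReader = activeReader;
        _statusPoller = statusPoller;
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        // Stop polling first so no GetOperationStatus request is in flight
        // while the underlying reader is being torn down.
        _statusPoller?.Dispose();
        _statusPoller = null;

        _activeReader?.Dispose();
        _activeReader = null;
    }
}
```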


#### TODO
Follow-up PR: when the statement is disposed, it should also dispose the
reader (the poller is currently stopped when the operation handle is set
to null, but this should also happen explicitly).

Need to add some unit testing (follow-up PR:
apache#3243)
…ache#3251)

This fixes an occasional issue with the SetUp block of the
DriverManifest fixture. At least on my macOS system, if a previous test
run fails before running TearDown, SetUp would fail on this assertion.
This changes the SetUp behavior so that it re-uses the temporary
directory.
Some refactors to make it easier to unit test. (follow-up pr:
apache#3255)

1. Move reader-related logic to Databricks/Reader
2. TracingStatement extends ITracingStatement - this lets us mock
IHiveServer2Statement more easily
3. IOperationStatusPoller
It seems that CI has started to fail as a result of macos-latest
changing from macos 14 to 15.

```log
 [ 65%] Building CXX object driver/sqlite/CMakeFiles/adbc_driver_sqlite_objlib.dir/sqlite.cc.o
/Users/runner/work/arrow-adbc/arrow-adbc/c/driver/sqlite/sqlite.cc:718:16: error: use of undeclared identifier 'sqlite3_load_extension'
  718 |       int rc = sqlite3_load_extension(conn_, extension_path_.c_str(),
      |                ^
1 error generated.
make[2]: *** [driver/sqlite/CMakeFiles/adbc_driver_sqlite_objlib.dir/sqlite.cc.o] Error 1
make[1]: *** [driver/sqlite/CMakeFiles/adbc_driver_sqlite_objlib.dir/all] Error 2
make: *** [all] Error 2
```

I don't know why this error is occurring, but it looks like it can be
avoided by making a change like apache#1259.
…e#3252)

Replicates the change in apache#3250
to the Rust Driver Manager. Follow-on to apache#3247

Modifies the behavior of GetSearchPaths so macOS doesn't follow other
Unix-likes but instead uses the more conventional /Library/Application
Support/ADBC. /etc/ isn't really a thing on macOS.

Tested manually by debugging the test with and without
`/Library/Application Support/ADBC` existing and verifying that the
right branch gets hit. I'm not too worried about exercising this in CI,
but we could.
Bumps [ruby/setup-ruby](https://github.com/ruby/setup-ruby) from 1.254.0
to 1.255.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…adbc (apache#3271)

Bumps [golang.org/x/tools](https://github.com/golang/tools) from 0.35.0
to 0.36.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… /java (apache#3275)

Bumps [com.uber.nullaway:nullaway](https://github.com/uber/NullAway)
from 0.12.7 to 0.12.8.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… in /go/adbc (apache#3277)

Bumps google.golang.org/protobuf from 1.36.6 to 1.36.7.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…java (apache#3268)

Bumps [org.assertj:assertj-core](https://github.com/assertj/assertj)
from 3.27.3 to 3.27.4.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… /go/adbc (apache#3274)

Bumps
[google.golang.org/api](https://github.com/googleapis/google-api-go-client)
from 0.244.0 to 0.246.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The flagged code in `docs/source/ext/adbc_misc.py`:

```python
packages = []
with path.open() as source:
    for line in source:
        if "img.shields.io" in line:
```

Code scanning / CodeQL check failure (High, documentation): Incomplete
URL substring sanitization. The string `img.shields.io` may be at an
arbitrary position in the sanitized URL.

Copilot Autofix suggestion:

To fix the problem, we should avoid using a substring check to identify lines containing `img.shields.io`. Instead, we should parse the line to extract any URLs, then use `urllib.parse` to check whether any of those URLs has a hostname of `img.shields.io`. This ensures that only actual references to the intended domain are processed, and not lines where the substring appears in an unintended context.

How to fix:

- For each line, extract all URLs (e.g., using a regex to find markdown links or direct URLs).
- For each URL, parse it with `urllib.parse.urlparse` and check whether the hostname is exactly `img.shields.io`.
- Only process lines where such a URL is found.

What to change:

- In the `_driver_status` function, replace the substring check with logic that extracts URLs and checks their hostnames.
- Add imports for `re` (for the regex) and `urllib.parse` (for URL parsing).

Suggested changeset for `docs/source/ext/adbc_misc.py` (apply locally
with `git apply`):

```diff
diff --git a/docs/source/ext/adbc_misc.py b/docs/source/ext/adbc_misc.py
--- a/docs/source/ext/adbc_misc.py
+++ b/docs/source/ext/adbc_misc.py
@@ -23,6 +23,8 @@
 import itertools
 import typing
 from pathlib import Path
+import re
+import urllib.parse
 
 import docutils
 import sphinx
@@ -77,7 +79,15 @@
     packages = []
     with path.open() as source:
         for line in source:
-            if "img.shields.io" in line:
+            # Find all URLs in the line (e.g., markdown links or direct URLs)
+            urls = re.findall(r'(https?://[^\s\)]+)', line)
+            found_img_shields = False
+            for url in urls:
+                parsed = urllib.parse.urlparse(url)
+                if parsed.hostname and parsed.hostname.lower() == "img.shields.io":
+                    found_img_shields = True
+                    break
+            if found_img_shields:
                 before, _, after = line.partition("img.shields.io")
                 tag = before[before.index("![") + 2 : before.index("]")].strip()
                 key, _, value = tag.partition(": ")
```