Skip to content

Conversation

@majin1102
Copy link
Contributor

Close #5806

@github-actions github-actions bot added the enhancement New feature or request label Jan 26, 2026
@github-actions
Copy link
Contributor

Code Review

Summary

This PR adds support for DataFusion URL table queries against Lance datasets (e.g., SELECT * FROM 'path/to/data.lance').

Potential Issues

P1: URL detection may be too simplistic

The .lance suffix check in LanceUrlTableFactory::try_new uses url.ends_with(".lance"):

if !url.ends_with(".lance") {
    return Ok(None);
}

This could have edge cases:

  • URLs with query parameters: s3://bucket/data.lance?versionId=123 would fail
  • URLs with trailing slashes: s3://bucket/data.lance/ would fail

Consider using URL parsing or path extraction before checking the extension, or document these limitations.

P1: Missing license header line

url_factory.rs only has // SPDX-License-Identifier: Apache-2.0 but is missing the second line // SPDX-FileCopyrightText: Copyright The Lance Authors that other files in the crate have.

Minor Observations

  • The test coverage is reasonable for the happy path but could benefit from a test case with remote URLs (S3/GCS) in integration tests.
  • The MultiUrlTableFactory design is clean and allows for extensibility.

Overall the implementation follows the existing patterns in the codebase well.

@codecov
Copy link

codecov bot commented Jan 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support lance url table in datafusion

1 participant