Skip to content

Conversation

@jackye1995
Copy link
Contributor

This PR adds the following main patches:

The later 2 required a few additional PRs that built up to those, so also added to this patch PR since it would be too complicated to only cherry-pick those 2.

BubbleCal and others added 2 commits January 20, 2026 10:27
We made some changes that cause the upper bound lose, tighten it to make
it 15x faster
…e-format#5736)

We have a performance issue when `LANCE_MEM_POOL_SIZE`: when this is set
(to something different than the default), the current cache always
misses here:

https://github.com/lance-format/lance/blob/445dd5bfe7b0111fd8e146163393e825e5995679/rust/lance-datafusion/src/exec.rs#L398-L410

- Replace static `LazyLock` session contexts with LRU cache (size 4)
- Cache key uses resolved configuration values (after env var lookup)
- Fixes cache misses when `LANCE_MEM_POOL_SIZE` env var is set

- [x] `cargo test -p lance-datafusion` passes
- [x] `cargo clippy -p lance-datafusion` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions github-actions bot added the bug Something isn't working label Jan 20, 2026
@github-actions
Copy link
Contributor

PR Review: Patch for 1.0.3

This patch PR bundles several important fixes for the release/v1.0 branch. After reviewing the key changes, here is my assessment:

Summary

The PR consolidates fixes from:

Review Findings

No P0/P1 issues identified. The implementation looks solid overall. A few observations:

Session Context Cache (lance-datafusion/src/exec.rs)

  • The LRU cache implementation correctly uses resolved configuration values as cache keys (mem_pool_size, max_temp_directory_size, use_spilling, target_partition)
  • Thread-safe with proper mutex handling and poison recovery (unwrap_or_else(|e| e.into_inner()))
  • Tests verify both cache hits and LRU eviction behavior

WAND Block Score Upper Bound (lance-index/src/scalar/inverted/wand.rs)

  • The check_block_max() function properly iterates and sums block max scores
  • Performance improvement by tightening the upper bound calculation during WAND search

StorageOptionsAccessor (lance-io/src/object_store/storage_options.rs)

  • Clean separation of static vs dynamic credential sources
  • Proper expiration tracking with configurable refresh offset
  • Thread-safe with try-lock mechanism to avoid refresh contention
  • Good test coverage including concurrent access scenarios

Credential Vending (lance-namespace-impls/src/credentials/)

  • Credential caching with TTL based on half the credential lifetime (capped at 30 min)
  • Sensitive data properly redacted in Debug implementations
  • Cache key includes identity hash to ensure per-user isolation

Minor Observations (Not blocking)

  1. The credentials/cache.rs only hashes the first 64 chars of auth tokens to avoid memory issues - this is a reasonable tradeoff but worth documenting in case of token collision concerns in specific deployment scenarios.

  2. The PR is large (15k+ additions) due to bundling multiple features. The linked original PRs should be referenced for detailed commit-level review.

Recommendation: Approve for merge to release/v1.0.3

@jackye1995 jackye1995 force-pushed the patch-103 branch 2 times, most recently from 771f5e0 to 0e03e35 Compare January 20, 2026 21:54
@codecov
Copy link

codecov bot commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 92.92035% with 8 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance-datafusion/src/exec.rs 93.87% 3 Missing and 3 partials ⚠️
rust/lance-index/src/scalar/inverted/wand.rs 86.66% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

Applies fixes for:
1. cargo-deny: add ignores for RUSTSEC-2025-0141 (bincode) and
   RUSTSEC-2026-0002 (lru)
2. torch.jit.script deprecation: migrate to torch.compile, add MSVC
   setup for Windows, add filterwarnings for PyTorch inductor
3. Java CI out of disk space: use warp-ubuntu-latest-x64-4x runner

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
jackye1995 and others added 4 commits January 20, 2026 21:05
…rmat#5457)

Related to refactorings in the namespace spec:
lancedb/sophon#4783 and
lance-format/lance-namespace#278

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…t#5566)

This PR introduces the credentials vending feature to the namespace
impl, allowing us to vend credentials if we run directory namespace, or
run it as backend for rest namespace. This would allow us to fully test
the credentials vending code path end to end.

The actual vending logic mainly consults the same feature implemented in
Apache Polaris. The support covers aws, gcp and azure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ance-format#5611)

Combined patches for upgrading lance-namespace.

Changes from 0.4.0 (lance-format#5568):
1. Introduced full error handling spec, update rust interface to
implement the spec, and also dir and rest implementations in rust,
python, java based on it
2. Fixed that `create_empty_table` is deprecated and `declare_table` is
introduced.
3. Added `deregister_table` support for dir namespace (without manifest)

Changes from 0.4.5 (lance-format#5611):
1. Use `Default::default()` for constructing request and response models
2. Leverage newly added identity to cache vended credentials
3. Support newly added `load_detailed_metadata` and `vend_credentials`
flags in requests

Also made ToSnafuLocation trait public to fix compilation on release branch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This feature is similar to lancedb HeaderProvider, but implemented in a
more generic way.

In lance-namespace 0.4.5, we introduced per-request context which is a
free-form map that can be passed in, and processed by different
implementations differently. Based on that, we add a
`DynamicContextProvider`.

Then specifically for `RestNamespace`, we define that any context
starting with `headers.` will be translated to a request level header.

Because of the requirement to dynamically inject per-request headers, we
also moved to use raw reqwest client now for all HTTP requests in
RestNamespace.

Adapted for release branch by using Python::with_gil instead of
Python::attach (pyo3 0.25 vs 0.26).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@jackye1995 jackye1995 force-pushed the patch-103 branch 3 times, most recently from 545a7ec to 5b559ce Compare January 21, 2026 06:41
@github-actions github-actions bot added the java label Jan 21, 2026
@jackye1995
Copy link
Contributor Author

All tests passing now except for known flaky tests. Merging

@jackye1995 jackye1995 merged commit 61d3e9d into lance-format:release/v1.0 Jan 21, 2026
29 of 33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working java python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants