-
Notifications
You must be signed in to change notification settings - Fork 539
feat: support credentials vending in directory namespace #5566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Code Review: Credential Vending for Directory NamespaceSummaryThis PR adds credential vending support for AWS, GCP, and Azure cloud storage in the directory namespace. The implementation follows patterns similar to Apache Polaris credential vending. FeedbackP1: GCP Credential Vending - Expiration Time CalculationIn let expires_at_millis = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.expect("time went backwards")
.as_millis() as u64
+ self.config.duration_millis;However, GCP access tokens typically have a fixed 1-hour lifetime regardless of what the caller requests. This can mislead clients into thinking credentials are valid longer than they actually are. Consider:
P1: GCP Permission Level Has No EffectThe GCP vendor accepts a
This could be confusing for users. Consider either:
P1: S3 Policy Prefix Pattern May Be IncorrectIn if prefix.is_empty() {
"*".to_string()
} else {
format!("{}/*", prefix_trimmed)
}This pattern P0: Azure - Unused
|
0186a07 to
172ae60
Compare
PR Review: Credentials Vending for Directory NamespaceSummaryThis PR adds credential vending support for cloud storage (AWS S3, GCP GCS, Azure Blob) in the directory namespace. The implementation follows a similar pattern to Apache Polaris credential vending. P0: Critical Issues1. Missing The Python and Java documentation shows credential vendor properties but doesn't mention the In >>> # With AWS credential vending (requires credential-vendor-aws feature)
>>> ns = lance.namespace.DirectoryNamespace(
... root="s3://my-bucket/data",
... credential_vendor="aws", # This property doesn't exist
... credential_vendor_aws_role_arn="arn:aws:iam::123456789012:role/MyRole",
... credential_duration_millis="3600000"
... )Should be: >>> ns = lance.namespace.DirectoryNamespace(
... root="s3://my-bucket/data",
... credential_vendor_enabled="true", # Required
... credential_vendor_aws_role_arn="arn:aws:iam::123456789012:role/MyRole",
... )P1: Important Issues2. Incomplete comment in In /// For example,
/// It is arguable that if a user has write permission,
///
/// Here these permission sets are more for
This looks like unfinished documentation that should be completed or removed. 3. GCP credential vending doesn't scope to location The GCP implementation in async fn vend_credentials(&self, table_location: &str) -> Result<VendedCredentials> {
let (_bucket, _prefix) = Self::parse_gcs_uri(table_location)?; // Unused!
let (access_token, expires_at_millis) = self.get_token().await?;
...
}Unlike AWS (which uses scoped IAM policies) and Azure (which scopes to container), GCP tokens are not scoped to the specific bucket/prefix. This should be documented as a limitation, or implement Credential Access Boundaries for proper scoping. 4. Missing integration tests The PR has unit tests for URI parsing and policy generation, but lacks integration tests for actual credential vending. Given this is security-critical functionality, integration tests that verify the vended credentials work correctly (even mocked) would provide important coverage. Minor Observations (not blocking)
Recommendation: Address P0 (documentation) and P1 items #2 (incomplete comment) before merging. Item #3 (GCP scoping) can be addressed as a follow-up with proper documentation of the limitation. 🤖 Generated with Claude Code |
172ae60 to
7ed86ed
Compare
Code Review: Credentials Vending FeatureSummaryThis PR introduces credential vending functionality for AWS (STS AssumeRole), GCP (OAuth2 tokens), and Azure (SAS tokens) to the directory namespace implementation. The design follows Apache Polaris patterns and uses feature flags for optional dependencies. P0 - Critical Issues1. GCP credential expiration is synthetic, not actual The GCP implementation calculates // Current code adds config duration blindly
let expires_at_millis = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.expect("time went backwards")
.as_millis() as u64
+ self.config.duration_millis;Recommendation: The P1 - Important Issues2. Unused configuration fields in GCP vendor The // These fields are configured but never read
pub service_account: Option<String>,
pub credentials_file: Option<String>,Recommendation: Either implement service account impersonation / custom credentials file support, or remove these unused fields and document that only default credentials are supported. 3. Azure SAS token expiry calculation relies on requested duration, not key validity The code calculates // key_end_time may be less than end_time
let key_end_time = if end_time > max_key_end {
max_key_end // The actual key constraint
} else {
end_time
};
// ... but expires_at_millis uses end_time unconditionally
let expires_at_millis = (end_time.unix_timestamp() * 1000 + ...) as u64;Recommendation: Use 4. AWS IAM policy missing The read policy includes Recommendation: Add Minor Suggestions (non-blocking)
Overall this is a well-structured implementation. Addressing the expiration calculation issues (P0/P1) will prevent credential failures in production use cases. |
PR Review: Credentials Vending FeatureThis PR introduces credential vending functionality for directory namespace, allowing temporary cloud credentials to be generated for table access. The implementation follows Apache Polaris's approach. SummaryThe PR adds:
P0/P1 Issues1. [P1] GCP vendor does not implement permission-scoped access (gcp.rs:125-140) Unlike AWS (which generates scoped IAM policies) and Azure (which sets SAS permissions), the GCP implementation just returns an OAuth2 token without any scope restriction. The // GCP OAuth2 tokens don't support fine-grained permissions,
// so the permission is for API consistency.Consider either:
2. [P1] Missing validation for duration bounds (aws.rs, azure.rs) AWS STS AssumeRole has duration limits (900-43200 seconds), but let duration_secs = self.config.duration_millis.div_ceil(1000) as i32;Values outside the valid range will cause runtime errors from AWS. Consider clamping or validating during config construction. 3. [P1] Incomplete comment in VendedPermission (credentials.rs:76-82) The /// For example,
/// It is arguable that if a user has write permission,
///
/// Here these permission sets are more for
These appear to be unfinished thoughts that should be completed or removed. Minor Observations (Not Blocking)
TestingThe PR includes good unit test coverage for:
Integration tests are properly marked as Overall this is a solid implementation of credential vending. The P1 issues should be addressed before merging. |
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Code Review for PR #5566: Credentials Vending in Directory NamespaceThank you for this comprehensive addition of credential vending support. The implementation follows the Apache Polaris approach well. Here are a few concerns to address: P0: Security Concern - Credentials Loggingaws.rs:267, gcp.rs, azure.rs: The
impl std::fmt::Debug for VendedCredentials {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("VendedCredentials")
.field("storage_options", &"[REDACTED]")
.field("expires_at_millis", &self.expires_at_millis)
.finish()
}
}P1: Configuration Inconsistency - Python/Java vs Rust Property NamesThe Python/Java documentation uses underscores (
Please align the property naming convention across all bindings. P1: Missing Test Coverage for Error CasesThe unit tests cover success paths well, but could benefit from:
Minor Suggestions (Non-blocking)
Overall, this is a solid implementation. The P0 security issue should be addressed before merging. |
Code ReviewThis PR introduces credential vending for directory namespace, supporting AWS, GCP, and Azure. The implementation follows Apache Polaris patterns. Here are the key findings: P0 Issues (Must Fix)
P1 Issues (Should Fix)
Suggestions (Non-blocking)
Overall, this is a solid implementation. The main concerns are around the GCP token expiration accuracy and the incomplete documentation. |
PR Review: Credentials Vending in Directory NamespaceSummaryThis PR introduces credential vending for AWS (STS AssumeRole), GCP (OAuth2 with CAB), and Azure (SAS tokens) in the directory namespace implementation, similar to Apache Polaris. P0 Issues1. Thread Safety Issue: if let Some(ref creds_file) = config.credentials_file {
std::env::set_var("GOOGLE_APPLICATION_CREDENTIALS", creds_file);
}Setting environment variables at runtime is not thread-safe in Rust and can cause undefined behavior in multi-threaded contexts. This is especially dangerous in a server context where multiple credential vendors may be created concurrently. Recommendation: Pass the credentials file path directly to the google-cloud-auth library if it supports it, or document that this configuration option should only be set before any concurrent operations begin. Alternatively, load credentials from the file directly rather than relying on the environment variable. P1 Issues2. AWS ListBucket Condition May Be Overly Permissive (aws.rs:163-174) "Condition": {
"StringLike": {
"s3:prefix": if prefix.is_empty() {
"*".to_string()
} else {
format!("{}/*", prefix_trimmed)
}
}
}When 3. Missing Test Coverage for Core Functionality
Minor Observations (Non-blocking)
Questions for Maintainers
|
Code Review: feat: support credentials vending in directory namespaceSummaryThis PR introduces credential vending for AWS, GCP, and Azure cloud storage providers in the DirectoryNamespace. The implementation follows patterns similar to Apache Polaris and provides temporary, scoped credentials for table access. P0 Issues (Must Fix)1. Missing The GCP module uses # Current:
google-cloud-auth = { version = "0.18", optional = true }
# Missing:
serde = { workspace = true, optional = true } # needs to be enabled by credential-vendor-gcpThis will cause a compile error when building with 2. GCP The PR adds P1 Issues (Should Fix)1. Documentation inconsistency: In However, this property is not implemented in 2. Potential credential caching concern The
Consider adding credential caching with TTL-based refresh (similar to how AWS SDK handles credentials internally). 3. Missing test coverage for Azure implementation AWS has integration tests (marked
Suggestions (Non-blocking)
Overall, this is a well-structured implementation with good separation between cloud providers. The feature gating approach is clean and the API design follows the builder pattern used elsewhere in the codebase. |
Code Review: PR #5566 - Credentials Vending for Directory NamespaceSummaryThis PR adds credential vending support to the directory namespace implementation, enabling temporary scoped credential generation for AWS S3, GCP Cloud Storage, and Azure Blob Storage. P0 Issues (Must Fix)1. Missing Test Coverage for Critical Paths The credential vending logic lacks integration tests. While there are unit tests for policy building and parsing, there are no tests for the actual Recommended: Add at least mock-based integration tests for credential vending flows. P1 Issues (Should Fix)2. Potential Silent Failures in Provider Detection ( The #[cfg(feature = "credential-vendor-aws")]
"aws" => create_aws_vendor(properties).await, // Returns Ok(None) if role_arn missingRecommended: Consider logging a warning when 3. AWS Policy Prefix Condition May Be Overly Permissive ( The S3 ListBucket condition uses a prefix that allows listing with a trailing wildcard: "s3:prefix": format!("{}/*", prefix_trimmed)This works but the Recommended: Change to 4. GCP Access Boundary CEL Expression Path Matching ( The CEL expression for prefix restriction may not correctly handle paths without trailing slashes: resource.name.startsWith('projects/_/buckets/{}/objects/{}')If 5. Optional Feature Dependencies Not Documented in Cargo.toml
Minor Observations (Informational)
Security Considerations
Overall the security model is sound. The main concern is ensuring misconfigurations are visible to users. |
PR Review: feat: support credentials vending in directory namespaceSummaryThis PR adds credential vending support for AWS, GCP, and Azure to the directory namespace implementation. The feature allows vending temporary, scoped credentials based on table location URI schemes. ReviewOverall: Well-structured implementation with good documentation and test coverage. A few items to address: P1 Issues
P0 IssuesNone identified. Suggestions (non-blocking)
|
| use std::collections::HashMap; | ||
| use std::sync::Arc; | ||
|
|
||
| use async_trait::async_trait; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use reqsign instead of all SDKs from aws, azure and gcp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me try
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like reqsign currently cannot perform scoped credentials generation, so cannot really use it at this moment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, will take a look
583ca0c to
7205392
Compare
Code Review: Credentials Vending FeatureP0 - Must Fix1. Azure API Typo - Potential Future Breakage The code calls 2. Credential Information in Logs P1 - Should Fix3. Azure Container-Level Scoping Limitation Azure SAS tokens are scoped to containers, not paths. This means requesting credentials for 4. Missing Input Validation
Consider adding basic validation and returning errors for invalid configurations rather than silently accepting them. 5. No Credential Caching
Azure delegation keys (valid up to 7 days) are particularly expensive to regenerate on each call. Test Coverage Gaps
Positive Observations
VerdictThe implementation is solid overall with good security practices. Recommended to address P0 issues before merge, with P1 items as follow-up work. |
…t#5566) This PR introduces the credentials vending feature to the namespace impl, allowing us to vend credentials if we run directory namespace, or run it as backend for rest namespace. This would allow us to fully test the credentials vending code path end to end. The actual vending logic mainly consults the same feature implemented in Apache Polaris. The support covers aws, gcp and azure.
…t#5566) This PR introduces the credentials vending feature to the namespace impl, allowing us to vend credentials if we run directory namespace, or run it as backend for rest namespace. This would allow us to fully test the credentials vending code path end to end. The actual vending logic mainly consults the same feature implemented in Apache Polaris. The support covers aws, gcp and azure.
…t#5566) This PR introduces the credentials vending feature to the namespace impl, allowing us to vend credentials if we run directory namespace, or run it as backend for rest namespace. This would allow us to fully test the credentials vending code path end to end. The actual vending logic mainly consults the same feature implemented in Apache Polaris. The support covers aws, gcp and azure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This PR introduces the credentials vending feature to the namespace impl, allowing us to vend credentials if we run directory namespace, or run it as backend for rest namespace. This would allow us to fully test the credentials vending code path end to end. The actual vending logic mainly consults the same feature implemented in Apache Polaris. The support covers aws, gcp and azure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This PR introduces the credentials vending feature to the namespace impl, allowing us to vend credentials if we run directory namespace, or run it as backend for rest namespace. This would allow us to fully test the credentials vending code path end to end.
The actual vending logic mainly consults the same feature implemented in Apache Polaris. The support covers aws, gcp and azure.