@Xuanwo
Collaborator

@Xuanwo Xuanwo commented Dec 29, 2025

This PR adds support for altering a column from nullable to non-nullable.


Parts of this PR were drafted with assistance from Codex (with gpt-5.2) and fully reviewed and edited by me. I take full responsibility for all changes.

@github-actions
Contributor

Code Review

P1: Bug - Nested column validation will fail

In validate_no_nulls_before_making_non_nullable, the code uses batch.column_by_name(path) where path can be a nested path like "b.c". However, RecordBatch::column_by_name only searches top-level column names, not nested paths.

When you call scanner.project(&["b.c"]), the resulting batch schema depends on how Lance handles nested projections - the column won't be accessible via batch.column_by_name("b.c").

Suggestion: Consider using the batch's first (and only) column directly since the projection contains exactly one column:

let col = batch.column(0);

Or alternatively, extract the leaf field name from the path:

let leaf_name = path.rsplit('.').next().unwrap_or(path);
let col = batch.column_by_name(leaf_name)...
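
To illustrate the lookup problem described above, here is a minimal sketch using toy types. `Column`, `Batch`, and `column_by_path` are hypothetical stand-ins written for this example, not Arrow's or Lance's API, and the sketch assumes a projected nested field stays wrapped in its parent struct column. Under that assumption a top-level lookup by "b.c" (or by the leaf name "c") finds nothing, while walking the path one segment at a time reaches the leaf.

```rust
/// Toy stand-in for a columnar array; NOT Arrow's or Lance's API.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    /// A leaf column of optional 32-bit integers (None models a null).
    Int(Vec<Option<i32>>),
    /// A struct column holding named child columns.
    Struct(Vec<(String, Column)>),
}

/// Toy stand-in for a record batch: an ordered list of named top-level columns.
struct Batch {
    columns: Vec<(String, Column)>,
}

impl Batch {
    /// Top-level-only lookup: the name is compared verbatim against
    /// top-level column names, so a dotted path never matches a nested field.
    fn column_by_name(&self, name: &str) -> Option<&Column> {
        self.columns
            .iter()
            .find(|(n, _)| n.as_str() == name)
            .map(|(_, c)| c)
    }

    /// Hypothetical helper: resolve a dotted path such as "b.c" by walking
    /// struct children one segment at a time.
    fn column_by_path(&self, path: &str) -> Option<&Column> {
        let mut segments = path.split('.');
        let mut current = self.column_by_name(segments.next()?)?;
        for segment in segments {
            match current {
                Column::Struct(children) => {
                    current = children
                        .iter()
                        .find(|(n, _)| n.as_str() == segment)
                        .map(|(_, c)| c)?;
                }
                // A leaf has no children, so the path cannot continue.
                Column::Int(_) => return None,
            }
        }
        Some(current)
    }
}
```

In real code the same walk would have to go through whatever struct-array type the batch actually contains, and the null check would also need to decide how to treat rows where a parent struct is itself null.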

Recommendation: Add a test for nested columns

The current tests only cover top-level columns. Given that alter_columns explicitly supports nested paths (as seen in test_rename_columns at line 1324), please add a test case for making a nested nullable column non-nullable.


Otherwise the implementation looks correct - the validation approach of scanning all data to check for nulls before allowing the schema change is sound.
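
The scan-then-change approach can be sketched roughly as follows. This is a toy model under stated assumptions: `Field` and `make_non_nullable` are hypothetical names for illustration, batches are modeled as plain vectors of `Option<i64>`, and none of this is the PR's actual code.

```rust
/// Toy field descriptor; NOT Lance's or Arrow's `Field` type.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Hypothetical helper: scan every batch of a column and return the updated
/// field only if no null (None) value is found anywhere.
fn make_non_nullable(field: &Field, batches: &[Vec<Option<i64>>]) -> Result<Field, String> {
    // Count nulls across all batches before touching the schema.
    let null_count: usize = batches
        .iter()
        .map(|batch| batch.iter().filter(|v| v.is_none()).count())
        .sum();

    if null_count > 0 {
        return Err(format!(
            "column '{}' contains {} null value(s); cannot make it non-nullable",
            field.name, null_count
        ));
    }

    // No nulls found: it is safe to flip the flag.
    Ok(Field {
        name: field.name.clone(),
        nullable: false,
    })
}
```

The point of the design is ordering: the data is checked in full first, and the schema is changed only when the check passes, so a failed validation leaves the original field untouched.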

@wjones127
Contributor

+1 on the review comment. Would like to have this supported in nested columns.

@wjones127 wjones127 self-assigned this Dec 30, 2025
Contributor

@wjones127 wjones127 left a comment

Add a test for nested columns for this feature.

@Xuanwo Xuanwo requested a review from wjones127 January 4, 2026 11:09
Contributor

@wjones127 wjones127 left a comment

Looks good!

Comment on lines 589 to 590
// TODO: in the future, we could check the values of the column to see if
// they are all non-null and thus the column could be made non-nullable.
Contributor

Might want to remove this TODO before merging though, as it looks to be complete.

@wjones127 wjones127 force-pushed the xuanwo/duckdb-set-not-null branch from 42ea514 to b3f4d7a Compare January 26, 2026 16:59
@codecov

codecov bot commented Jan 26, 2026

Codecov Report

❌ Patch coverage is 52.77778% with 17 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
rust/lance/src/dataset/schema_evolution.rs | 52.77% | 13 Missing and 4 partials ⚠️

@wjones127 wjones127 merged commit 4d8c51c into main Jan 26, 2026
27 of 28 checks passed
@wjones127 wjones127 deleted the xuanwo/duckdb-set-not-null branch January 26, 2026 17:59
Labels

enhancement New feature or request
