14 changes: 14 additions & 0 deletions datafusion/common/src/dfschema.rs
@@ -297,6 +297,20 @@ impl DFSchema {

/// Modify this schema by appending the fields from the supplied schema, ignoring any
/// duplicate fields.
///
/// ## Merge Precedence
///
/// **Schema-level metadata**: Metadata from both schemas is merged.
/// If both schemas have the same metadata key, the value from the `other_schema` parameter takes precedence.
///
/// **Field-level merging**: Only non-duplicate fields are added. This means that the
/// `self` fields will always take precedence over the `other_schema` fields.
/// Duplicate field detection is based on:
/// - For qualified fields: both qualifier and field name must match
/// - For unqualified fields: only field name needs to match
///
/// Note that precedence differs between field merging and metadata merging:
/// fields from `self` win, while metadata values from `other_schema` win.
pub fn merge(&mut self, other_schema: &DFSchema) {
if other_schema.inner.fields.is_empty() {
return;
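The precedence rules documented in this hunk can be modelled without DataFusion. The following is a minimal sketch only: `ToySchema`, its `new` helper, and its `merge` method are hypothetical stand-ins, not the `DFSchema` implementation. It assumes that duplicate detection is keyed on the qualifier of the incoming field, as the doc comment suggests, and it mirrors the early return visible in the diff.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `DFSchema`: (qualifier, name) pairs plus metadata.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ToySchema {
    pub fields: Vec<(Option<String>, String)>,
    pub metadata: BTreeMap<String, String>,
}

impl ToySchema {
    /// Illustration-only constructor from borrowed data.
    pub fn new(fields: &[(Option<&str>, &str)], metadata: &[(&str, &str)]) -> Self {
        Self {
            fields: fields
                .iter()
                .map(|(q, n)| (q.map(str::to_string), n.to_string()))
                .collect(),
            metadata: metadata
                .iter()
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .collect(),
        }
    }

    pub fn merge(&mut self, other: &ToySchema) {
        // Mirrors the early return in the diff: when `other` has no fields,
        // nothing is merged, metadata included.
        if other.fields.is_empty() {
            return;
        }
        for (qualifier, name) in &other.fields {
            let duplicate = match qualifier {
                // Qualified incoming field: qualifier and name must both match.
                Some(q) => self
                    .fields
                    .iter()
                    .any(|(sq, sn)| sq.as_deref() == Some(q.as_str()) && sn == name),
                // Unqualified incoming field: a name match alone is enough.
                None => self.fields.iter().any(|(_, sn)| sn == name),
            };
            // Fields already present in `self` are kept; only new ones are appended.
            if !duplicate {
                self.fields.push((qualifier.clone(), name.clone()));
            }
        }
        // Metadata: entries from `other` overwrite entries in `self`.
        self.metadata
            .extend(other.metadata.iter().map(|(k, v)| (k.clone(), v.clone())));
    }
}
```

In this model, merging a schema holding `t1.id` and an unqualified `name` with one holding `t1.id`, `t2.id`, `name`, and `age` appends only `t2.id` and `age`, while a metadata key present on both sides ends up with the value from the second schema.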
45 changes: 44 additions & 1 deletion datafusion/expr/src/expr.rs
@@ -469,7 +469,50 @@ impl FieldMetadata {
}

/// Merges two optional `FieldMetadata` instances, overwriting any existing
- /// keys in `m` with keys from `n` if present
+ /// keys in `m` with keys from `n` if present.
///
/// This function is commonly used in alias operations, particularly for literals
/// with metadata. When creating an alias expression, the metadata from the original
/// expression (such as a literal) is combined with any metadata specified on the alias.
Comment on lines 471 to +476 (Contributor, Author):

This structure is intended for use with the aliasing of literals.

Since it has the same concept of field metadata merge, I felt it was useful to add these docs in the same PR.

///
/// # Arguments
///
/// * `m` - The first metadata (typically from the original expression like a literal)
/// * `n` - The second metadata (typically from the alias definition)
///
/// # Merge Strategy
///
/// - If both metadata instances exist, they are merged with `n` taking precedence
/// - Keys from `n` will overwrite keys from `m` if they have the same name
/// - If only one metadata instance exists, it is returned unchanged
/// - If neither exists, `None` is returned
///
/// # Example usage
/// ```rust
/// use datafusion_expr::expr::FieldMetadata;
/// use std::collections::BTreeMap;
///
/// // Create metadata for a literal expression
/// let literal_metadata = Some(FieldMetadata::from(BTreeMap::from([
/// ("source".to_string(), "constant".to_string()),
/// ("type".to_string(), "int".to_string()),
/// ])));
///
/// // Create metadata for an alias
/// let alias_metadata = Some(FieldMetadata::from(BTreeMap::from([
/// ("description".to_string(), "answer".to_string()),
/// ("source".to_string(), "user".to_string()), // This will override literal's "source"
/// ])));
///
/// // Merge the metadata
/// let merged = FieldMetadata::merge_options(
/// literal_metadata.as_ref(),
/// alias_metadata.as_ref(),
/// );
///
/// // Result contains: {"source": "user", "type": "int", "description": "answer"}
/// assert!(merged.is_some());
/// ```
pub fn merge_options(
m: Option<&FieldMetadata>,
n: Option<&FieldMetadata>,
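The four cases in the "Merge Strategy" list can be sketched with plain standard-library types. This is an illustrative model rather than the DataFusion source: `Meta`, `meta`, and `merge_options_sketch` are hypothetical names, and a `BTreeMap<String, String>` is assumed as a stand-in for `FieldMetadata`.

```rust
use std::collections::BTreeMap;

/// Hypothetical alias standing in for `FieldMetadata`.
pub type Meta = BTreeMap<String, String>;

/// Illustration-only helper that builds a `Meta` from string pairs.
pub fn meta(pairs: &[(&str, &str)]) -> Meta {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Models the documented strategy: `n` overrides `m` on shared keys.
pub fn merge_options_sketch(m: Option<&Meta>, n: Option<&Meta>) -> Option<Meta> {
    match (m, n) {
        // Both present: start from `m`, then let `n` overwrite shared keys.
        (Some(m), Some(n)) => {
            let mut merged = m.clone();
            merged.extend(n.iter().map(|(k, v)| (k.clone(), v.clone())));
            Some(merged)
        }
        // Exactly one present: return it unchanged.
        (Some(only), None) | (None, Some(only)) => Some(only.clone()),
        // Neither present: nothing to return.
        (None, None) => None,
    }
}
```

Calling `merge_options_sketch` with the literal and alias maps from the doc example yields a map whose `source` key holds the alias value, alongside the untouched `type` and `description` keys.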
3 changes: 3 additions & 0 deletions datafusion/expr/src/utils.rs
@@ -1225,6 +1225,9 @@ pub fn only_or_err<T>(slice: &[T]) -> Result<&T> {
}

/// Merges the input schemas into a single schema.
///
/// This function merges schemas from multiple logical plan inputs using [`DFSchema::merge`].
/// Refer to that documentation for details on precedence and metadata handling.
pub fn merge_schema(inputs: &[&LogicalPlan]) -> DFSchema {
if inputs.len() == 1 {
inputs[0].schema().as_ref().clone()
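Because `merge_schema` delegates to `DFSchema::merge`, the per-pair precedence carries over to any number of inputs. The sketch below illustrates that consequence. It assumes the inputs are folded left to right into an initially empty schema, which this hunk does not show. `ToySchema` and `merge_schema_sketch` are hypothetical, and fields are reduced to bare names for brevity.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified schema: unqualified field names plus metadata.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ToySchema {
    pub fields: Vec<String>,
    pub metadata: BTreeMap<String, String>,
}

impl ToySchema {
    /// Same precedence as documented for `DFSchema::merge`:
    /// existing fields win, incoming metadata wins.
    pub fn merge(&mut self, other: &ToySchema) {
        if other.fields.is_empty() {
            return;
        }
        for name in &other.fields {
            if !self.fields.contains(name) {
                self.fields.push(name.clone());
            }
        }
        self.metadata.extend(other.metadata.clone());
    }
}

/// Assumed shape of the multi-input merge: a left-to-right fold.
pub fn merge_schema_sketch(inputs: &[&ToySchema]) -> ToySchema {
    // Single input: return a copy, as in the branch visible in the diff.
    if inputs.len() == 1 {
        return inputs[0].clone();
    }
    inputs.iter().fold(ToySchema::default(), |mut acc, schema| {
        acc.merge(schema);
        acc
    })
}
```

Under that assumption, a field name that appears in several inputs is taken from the earliest one, whereas a metadata key that appears in several inputs ends up with the value from the latest one.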