Description
Describe the bug
Attempting to apply an alias to a count_all() aggregation results in an Internal Error with the following message:
called `Result::unwrap()` on an `Err` value: Internal("Invalid aggregate expression 'Alias(Alias { expr: AggregateFunction(AggregateFunction { func: AggregateUDF { inner: Count { name: \"count\", signature: Signature { type_signature: OneOf([VariadicAny, Nullary]), volatility: Immutable } } }, params: AggregateFunctionParams { args: [Literal(Int64(1), None)], distinct: false, filter: None, order_by: None, null_treatment: None } }), relation: None, name: \"count(*)\", metadata: None })'")
Presumably this is because count_all() already applies an alias of count(*). It currently appears possible to work around this restriction by using count(lit(1)) in place of count_all().
To Reproduce
Tested using DataFusion 48.0.1, found in DataFusion 47.0.0
#[cfg(test)]
mod tests {
    use arrow::{
        array::StringArray,
        datatypes::{DataType, Field, Schema},
        record_batch::RecordBatch,
    };
    use datafusion::{functions_aggregate::count::count_all, prelude::SessionContext};
    use std::sync::Arc;

    #[tokio::test]
    async fn alias_count_all() {
        // Create a simple RecordBatch with a single row
        let schema = Schema::new(vec![Field::new("id", DataType::Utf8, false)]);
        let id_array = StringArray::from(vec!["test_id_1"]);
        let record_batch =
            RecordBatch::try_new(Arc::new(schema), vec![Arc::new(id_array)]).unwrap();

        // Create DataFusion context and DataFrame from the record batch
        let ctx = SessionContext::new();
        let df = ctx.read_batch(record_batch).unwrap();

        // Add count_all() aggregation and alias it as "TOTAL_COUNT"
        let df_with_count = df
            .aggregate(vec![], vec![count_all().alias("TOTAL_COUNT")])
            .unwrap();

        // Verify the column exists and is aliased as expected
        let results = df_with_count.collect().await.unwrap();
        assert_eq!(results.len(), 1);
        let batch = &results[0];
        let schema = batch.schema();
        let field = schema.field(0);
        assert_eq!(
            field.name(),
            "TOTAL_COUNT",
            "Column should be aliased to TOTAL_COUNT"
        );
    }
}
Expected behavior
Aliasing count_all() aggregations should result in the user-supplied alias, not a runtime error.
Additional context
No response