Description
What is the problem the feature request solves?
Note: This issue was generated with AI assistance. The specification details have been extracted from Spark documentation and may need verification.
Comet does not currently support the Spark minutes_of_time function, causing queries using this function to fall back to Spark's JVM execution instead of running natively on DataFusion.
The MinutesOfTime expression extracts the minute component from a time-based value. This expression is implemented as a RuntimeReplaceable that delegates to the DateTimeUtils.getMinutesOfTime method at runtime. It returns an integer representing the minutes portion (0-59) of the input time value.
Supporting this expression would allow more Spark workloads to benefit from Comet's native acceleration.
Describe the potential solution
Spark Specification
Syntax:
minute(time_expr)
Arguments:
| Argument | Type | Description |
|---|---|---|
| child | AnyTimeType | The time expression from which to extract the minute component |
Return Type: IntegerType - Returns an integer value representing the minute component (0-59).
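As a rough illustration of the arithmetic behind the 0-59 result, here is a minimal Rust sketch. It assumes a TimeType value is encoded as nanoseconds since midnight in a 64-bit integer, which should be verified against the Spark source. The name `minute_of_time` is hypothetical and is not an existing Comet or DataFusion function.

```rust
/// Nanoseconds in one minute and in one hour.
const NANOS_PER_MINUTE: i64 = 60 * 1_000_000_000;
const NANOS_PER_HOUR: i64 = 60 * NANOS_PER_MINUTE;

/// Returns the minute-of-hour (0-59) for a time of day.
///
/// Assumption: `nanos_of_day` is the number of nanoseconds since midnight,
/// in the range 0..86_400_000_000_000. This encoding is assumed here, not
/// taken from the issue text.
pub fn minute_of_time(nanos_of_day: i64) -> i32 {
    // Drop the whole hours, then count the whole minutes that remain.
    ((nanos_of_day % NANOS_PER_HOUR) / NANOS_PER_MINUTE) as i32
}
```

Under that assumption, TIME '14:35:20' is 52,520 seconds after midnight, so `minute_of_time(52_520_000_000_000)` evaluates to 35.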
Supported Data Types:
The expression accepts any time-based data type through the AnyTimeType constraint:
- TimeType
- TimestampType
- TimestampNTZType
Edge Cases:
- Null handling: Returns null when the input time expression is null
- Invalid time values: Behavior depends on the underlying DateTimeUtils.getMinutesOfTime implementation
- Timezone considerations: For timestamp types, the minute extraction may be affected by timezone settings
- Leap seconds: Standard minute extraction logic applies; leap seconds are handled by the underlying time utilities
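To make the timezone point concrete, the sketch below extracts the minute from a timestamp stored as microseconds since the Unix epoch in UTC, after applying a fixed UTC offset. The fixed offset is a simplification: a real implementation would need to resolve the session time zone, including DST rules, from a time zone database. The function name `minute_of_timestamp` is hypothetical.

```rust
const MICROS_PER_SECOND: i64 = 1_000_000;
const SECONDS_PER_MINUTE: i64 = 60;
const SECONDS_PER_HOUR: i64 = 3_600;

/// Returns the local minute-of-hour (0-59) for a UTC timestamp.
///
/// Assumptions: the timestamp is microseconds since 1970-01-01T00:00:00Z,
/// and the zone is modeled as a fixed offset in seconds (no DST handling).
/// Pass an offset of 0 for a timestamp without time zone.
pub fn minute_of_timestamp(micros_since_epoch: i64, utc_offset_seconds: i32) -> i32 {
    // Floor division keeps pre-1970 (negative) timestamps on the right second.
    let utc_seconds = micros_since_epoch.div_euclid(MICROS_PER_SECOND);
    let local_seconds = utc_seconds + i64::from(utc_offset_seconds);
    // rem_euclid always yields a non-negative second-of-hour.
    let second_of_hour = local_seconds.rem_euclid(SECONDS_PER_HOUR);
    (second_of_hour / SECONDS_PER_MINUTE) as i32
}
```

With these definitions, 00:10:00 UTC yields minute 10 at offset 0 but minute 40 at an offset of +05:30 (19,800 seconds), which is why offsets that are not whole hours matter for this expression.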
Examples:
-- Extract minute from current timestamp
SELECT minute(current_timestamp()) AS current_minute;
-- Extract minute from time literal
SELECT minute(TIME '14:35:20') AS time_minute;
-- Extract minute from timestamp column
SELECT minute(created_at) AS creation_minute FROM events;
// DataFrame API usage
import org.apache.spark.sql.functions._
// Extract minute from timestamp column
df.select(minute(col("timestamp_col")).alias("minute_value"))
// Using with current timestamp
df.select(minute(current_timestamp()).alias("current_minute"))
Implementation Approach
See the Comet guide on adding new expressions for detailed instructions.
- Scala Serde: Add expression handler in spark/src/main/scala/org/apache/comet/serde/
- Register: Add to appropriate map in QueryPlanSerde.scala
- Protobuf: Add message type in native/proto/src/proto/expr.proto if needed
- Rust: Implement in native/spark-expr/src/ (check if DataFusion has built-in support first)
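Before wiring anything into Arrow or DataFusion, the per-row logic for the Rust step can be prototyped with the standard library alone. The sketch below is only a stand-in: a slice of `Option<i64>` models a nullable column, the nanoseconds-since-midnight encoding of TimeType is an assumption, and `minutes_of_time_column` is a hypothetical name rather than an existing Comet function.

```rust
const NANOS_PER_MINUTE: i64 = 60 * 1_000_000_000;
const NANOS_PER_HOUR: i64 = 60 * NANOS_PER_MINUTE;

/// Applies minute extraction to a nullable column of time values.
///
/// Assumption: each non-null value is nanoseconds since midnight.
/// `None` stands in for a null slot and is propagated unchanged,
/// matching the "null in, null out" edge case listed above.
pub fn minutes_of_time_column(nanos_of_day: &[Option<i64>]) -> Vec<Option<i32>> {
    nanos_of_day
        .iter()
        .map(|value| value.map(|n| ((n % NANOS_PER_HOUR) / NANOS_PER_MINUTE) as i32))
        .collect()
}
```

A real implementation would operate on Arrow arrays and reuse their null bitmap, but the expected values per row should match this reference, which makes it usable as a test oracle.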
Additional context
Difficulty: Medium
Spark Expression Class: org.apache.spark.sql.catalyst.expressions.MinutesOfTime
Related:
- HourOfTime - Extract hour component from time values
- SecondsOfTime - Extract seconds component from time values
- DateTimeUtils - Underlying utility class for time operations
- TimeExpression - Base trait for time-related expressions
This issue was auto-generated from Spark reference documentation.