[Feature] Support Spark expression: minutes_of_time #3127

@andygrove

Description

What is the problem the feature request solves?

Note: This issue was generated with AI assistance. The specification details have been extracted from Spark documentation and may need verification.

Comet does not currently support the Spark minutes_of_time function, causing queries using this function to fall back to Spark's JVM execution instead of running natively on DataFusion.

The MinutesOfTime expression extracts the minute component from a time-based value. This expression is implemented as a RuntimeReplaceable that delegates to the DateTimeUtils.getMinutesOfTime method at runtime. It returns an integer representing the minutes portion (0-59) of the input time value.
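To make the extraction concrete, here is a minimal Rust sketch of the arithmetic. It assumes the time value is encoded as nanoseconds since midnight; that encoding, and the helper names `minutes_of_time` and `time_to_nanos`, are illustrative assumptions and should be verified against Spark's DateTimeUtils before being relied on.

```rust
/// Nanoseconds in one minute and in one hour.
const NANOS_PER_MINUTE: i64 = 60 * 1_000_000_000;
const NANOS_PER_HOUR: i64 = 60 * NANOS_PER_MINUTE;

/// Hypothetical helper: returns the minute-of-hour (0-59) for a time of day
/// given as nanoseconds since midnight (an assumed encoding).
fn minutes_of_time(nanos_since_midnight: i64) -> i32 {
    // Drop the whole hours, then count the whole minutes that remain.
    // rem_euclid keeps the result in 0-59 even for unexpected negative input.
    (nanos_since_midnight.rem_euclid(NANOS_PER_HOUR) / NANOS_PER_MINUTE) as i32
}

/// Hypothetical helper for building inputs from hour, minute, and second.
fn time_to_nanos(hour: i64, minute: i64, second: i64) -> i64 {
    (hour * 3_600 + minute * 60 + second) * 1_000_000_000
}
```

Under these assumptions, `minutes_of_time(time_to_nanos(14, 35, 20))` evaluates to 35, matching the minute of the TIME '14:35:20' literal.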

Supporting this expression would allow more Spark workloads to benefit from Comet's native acceleration.

Describe the potential solution

Spark Specification

Syntax:

minute(time_expr)

Arguments:

Argument | Type        | Description
child    | AnyTimeType | The time expression from which to extract the minute component

Return Type: IntegerType - Returns an integer value representing the minute component (0-59).

Supported Data Types:
The expression accepts any time-based data type through the AnyTimeType constraint:

  • TimeType
  • TimestampType
  • TimestampNTZType

Edge Cases:

  • Null handling: Returns null when the input time expression is null
  • Invalid time values: Behavior depends on the underlying DateTimeUtils.getMinutesOfTime implementation
  • Timezone considerations: For timestamp types, the minute is extracted in the session time zone, so the result can differ between zones whose UTC offsets are not whole hours (for example +05:30)
  • Leap seconds: Standard minute extraction logic applies; leap seconds are handled by the underlying time utilities
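The null-handling and timezone cases above can be illustrated with a small Rust sketch. It assumes timestamps are stored as microseconds since the Unix epoch in UTC and models the time zone as a fixed UTC offset in seconds, which is a simplification (real zones have daylight-saving transitions). The function name `minute_at_offset` is hypothetical.

```rust
const MICROS_PER_SECOND: i64 = 1_000_000;
const SECONDS_PER_MINUTE: i64 = 60;
const SECONDS_PER_HOUR: i64 = 3_600;

/// Hypothetical helper: minute-of-hour for a timestamp stored as microseconds
/// since the Unix epoch (UTC), viewed at a fixed UTC offset in seconds.
/// `None` models a SQL NULL input and is passed through unchanged.
fn minute_at_offset(micros_utc: Option<i64>, offset_seconds: i64) -> Option<i32> {
    micros_utc.map(|micros| {
        // Floor division keeps pre-1970 (negative) timestamps correct.
        let local_seconds = micros.div_euclid(MICROS_PER_SECOND) + offset_seconds;
        // Seconds into the current local hour, converted to whole minutes.
        (local_seconds.rem_euclid(SECONDS_PER_HOUR) / SECONDS_PER_MINUTE) as i32
    })
}
```

For the instant 1970-01-01 14:35:20 UTC (52,520 seconds after the epoch), this sketch yields minute 35 at offset 0 and minute 5 at +05:30 (19,800 seconds), since the local time there is 20:05:20. A native implementation would need to reproduce the same zone-dependent result that Spark returns.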

Examples:

-- Extract minute from current timestamp
SELECT minute(current_timestamp()) AS current_minute;

-- Extract minute from time literal
SELECT minute(TIME '14:35:20') AS time_minute;

-- Extract minute from timestamp column
SELECT minute(created_at) AS creation_minute FROM events;

// DataFrame API usage
import org.apache.spark.sql.functions._

// Extract minute from timestamp column
df.select(minute(col("timestamp_col")).alias("minute_value"))

// Using with current timestamp
df.select(minute(current_timestamp()).alias("current_minute"))

Implementation Approach

See the Comet guide on adding new expressions for detailed instructions.

  1. Scala Serde: Add expression handler in spark/src/main/scala/org/apache/comet/serde/
  2. Register: Add to appropriate map in QueryPlanSerde.scala
  3. Protobuf: Add message type in native/proto/src/proto/expr.proto if needed
  4. Rust: Implement in native/spark-expr/src/ (check if DataFusion has built-in support first)
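As a rough illustration of step 4, the sketch below shows the shape a column-level kernel might take. It is dependency-free on purpose: a real Comet implementation would operate on Arrow arrays rather than slices, and may not need custom code at all if DataFusion already covers the case. The name `minutes_of_time_column` and the nanoseconds-since-midnight encoding are assumptions.

```rust
const NANOS_PER_MINUTE: i64 = 60 * 1_000_000_000;
const NANOS_PER_HOUR: i64 = 60 * NANOS_PER_MINUTE;

/// Hypothetical column kernel: maps each time value (assumed to be
/// nanoseconds since midnight) to its minute-of-hour, preserving nulls.
/// `None` stands in for a null slot in the input column.
fn minutes_of_time_column(times: &[Option<i64>]) -> Vec<Option<i32>> {
    times
        .iter()
        .map(|slot| {
            // Null in, null out; otherwise drop whole hours and count minutes.
            slot.map(|nanos| (nanos.rem_euclid(NANOS_PER_HOUR) / NANOS_PER_MINUTE) as i32)
        })
        .collect()
}
```

Keeping the per-value arithmetic separate from the null handling mirrors how the expression is specified: the output is null exactly where the input is null, and an integer in 0-59 everywhere else.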

Additional context

Difficulty: Medium
Spark Expression Class: org.apache.spark.sql.catalyst.expressions.MinutesOfTime

Related:

  • HourOfTime - Extract hour component from time values
  • SecondsOfTime - Extract seconds component from time values
  • DateTimeUtils - Underlying utility class for time operations
  • TimeExpression - Base trait for time-related expressions

This issue was auto-generated from Spark reference documentation.
