
[Feature] Support Spark expression: years #3131

@andygrove

Description


What is the problem the feature request solves?

Note: This issue was generated with AI assistance. The specification details have been extracted from Spark documentation and may need verification.

Comet does not currently support the Spark years function, causing queries using this function to fall back to Spark's JVM execution instead of running natively on DataFusion.

The Years expression is a v2 partition transform that extracts the year component from date/timestamp values for partitioning purposes. It converts temporal data into integer year values, enabling efficient time-based partitioning strategies in Spark SQL tables.

Supporting this expression would allow more Spark workloads to benefit from Comet's native acceleration.

Describe the potential solution

Spark Specification

Syntax:

-- SQL: partition transform in DDL
YEARS(column_name)

// DataFrame API (catalyst expression)
Years(col("date_column"))

Arguments:

  • child (Expression): the input expression, typically a date or timestamp column

Return Type: IntegerType - Returns the year as an integer value.

Supported Data Types:

  • DateType
  • TimestampType
  • TimestampNTZType (timestamp without timezone)
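Spark stores DateType values internally as an i32 count of days since the Unix epoch, so for date inputs the expression reduces to a days-to-year conversion. The sketch below uses Howard Hinnant's civil-from-days algorithm to illustrate the required semantics; the function name is illustrative and this is not Comet's actual code.

```rust
// Spark stores DateType as an i32 count of days since the Unix epoch
// (1970-01-01); year extraction is then a pure days-to-year conversion.
// Sketch of that conversion (Hinnant's civil-from-days algorithm),
// shown for illustration only.
fn year_from_epoch_days(days: i32) -> i32 {
    let z = days as i64 + 719_468; // shift epoch from 1970-01-01 to 0000-03-01
    let era = (if z >= 0 { z } else { z - 146_096 }) / 146_097;
    let doe = z - era * 146_097;                       // day of era [0, 146096]
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let y = yoe + era * 400;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, Mar-1-based
    let mp = (5 * doy + 2) / 153;                      // month index, March = 0
    let m = if mp < 10 { mp + 3 } else { mp - 9 };     // civil month 1..12
    (if m <= 2 { y + 1 } else { y }) as i32
}
```

This proleptic Gregorian conversion matches java.time, which is also the calendar Spark uses for dates since 3.0, so pre-epoch (negative) day counts round correctly.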

Edge Cases:

  • Null handling: Returns null when the input expression is null
  • Invalid dates: Follows Spark's standard date parsing and validation rules
  • Year boundaries: Correctly handles leap years and year transitions
  • Timezone effects: For timestamp inputs, the year extraction respects the session timezone setting
  • Historical dates: Supports dates across the full range supported by Spark's date types
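The timezone edge case is worth spelling out: Spark stores TimestampType as i64 microseconds since the Unix epoch (UTC), so year extraction must first shift into the session timezone and then floor to whole days. A minimal sketch, assuming a fixed offset (real code needs a timezone database to handle DST and historical offsets, and the names here are illustrative):

```rust
// Spark TimestampType is i64 microseconds since the Unix epoch (UTC).
// The fixed tz_offset_micros parameter is an illustrative simplification
// of the session-timezone adjustment.
const MICROS_PER_DAY: i64 = 86_400_000_000;

fn epoch_days_from_micros(micros: i64, tz_offset_micros: i64) -> i32 {
    let local = micros + tz_offset_micros;
    // div_euclid floors, so pre-epoch timestamps land on the correct day
    local.div_euclid(MICROS_PER_DAY) as i32
}
```

The resulting day count then feeds the same days-to-year conversion used for DateType. Null inputs need no special handling at this level: the kernel is only applied to non-null values, so nulls propagate as required.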

Examples:

-- Creating a table partitioned by years (partition transforms require a
-- v2 catalog that supports them, e.g. Iceberg)
CREATE TABLE events (
  id BIGINT,
  event_time TIMESTAMP,
  data STRING
) USING ICEBERG
PARTITIONED BY (YEARS(event_time))

-- Query that benefits from partition pruning
SELECT * FROM events 
WHERE event_time >= '2023-01-01' AND event_time < '2024-01-01'
// DataFrame API usage in partition transforms
import org.apache.spark.sql.functions.{col, years}

// DataFrameWriterV2: create a table partitioned by the years transform.
// Under the hood this produces the catalyst expression
// org.apache.spark.sql.catalyst.expressions.Years that Comet must handle.
df.writeTo("catalog.db.events")
  .partitionedBy(years(col("event_time")))
  .create()

Implementation Approach

See the Comet guide on adding new expressions for detailed instructions.

  1. Scala Serde: Add expression handler in spark/src/main/scala/org/apache/comet/serde/
  2. Register: Add to appropriate map in QueryPlanSerde.scala
  3. Protobuf: Add message type in native/proto/src/proto/expr.proto if needed
  4. Rust: Implement in native/spark-expr/src/ (check if DataFusion has built-in support first)
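If step 3 does turn out to require a new message, it could look something like the sketch below; the message name and field number are assumptions for illustration, not Comet's actual expr.proto schema.

```proto
// Illustrative sketch only; not Comet's actual schema.
message Years {
  Expr child = 1;  // date or timestamp input
}
```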

Additional context

Difficulty: Medium
Spark Expression Class: org.apache.spark.sql.catalyst.expressions.Years

Related:

  • Months - Monthly partition transform
  • Days - Daily partition transform
  • Hours - Hourly partition transform
  • Bucket - Hash-based partition transform
  • PartitionTransformExpression - Base class for partition transforms

This issue was auto-generated from Spark reference documentation.
