fix: timestamp logical type fixes #17899

linliu-code · 2026-01-15T20:06:07Z

Change Logs

This PR is created based on the two PRs:

The following description is from #14161, which summarized the issue.

PR #9743 added more schema evolution functionality and schema processing. However, it used the InternalSchema system for operations such as fixing null ordering, reordering columns, and adding columns. At the time, InternalSchema had only a single Timestamp type, which was assumed to be micros when converting back to Avro. As a result, if the schema provider had any millis columns, the processed schema ended up with those columns as micros.
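
The round trip described above can be illustrated with a small sketch. This is a simplified, hypothetical model: `to_internal`, `to_avro`, and `round_trip` are made-up names for illustration and are not Hudi's InternalSchema API. It only shows how collapsing both precisions into one internal type loses the millis label.

```python
# Hypothetical, simplified model of the schema round trip; not Hudi code.

def to_internal(avro_type):
    """Collapse an Avro-style type into a simplified internal type name."""
    if isinstance(avro_type, dict):
        if avro_type.get("logicalType") in ("timestamp-millis", "timestamp-micros"):
            # A single internal Timestamp type: the precision is dropped here.
            return "timestamp"
        return avro_type["type"]
    return avro_type


def to_avro(internal_type):
    """Convert back to an Avro-style type, assuming micros for timestamps."""
    if internal_type == "timestamp":
        return {"type": "long", "logicalType": "timestamp-micros"}
    return internal_type


def round_trip(fields):
    """Run every field through the internal representation and back."""
    return {name: to_avro(to_internal(t)) for name, t in fields.items()}


provider_schema = {
    "id": "string",
    "created_at": {"type": "long", "logicalType": "timestamp-millis"},
    "updated_at": {"type": "long", "logicalType": "timestamp-micros"},
}

processed = round_trip(provider_schema)
# The millis column comes back labeled as micros.
print(processed["created_at"]["logicalType"])
```

Only the label changes in this model; the stored values are untouched, which is why the affected data stays in millis.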

PR #13711, which updated column stats with better support for logical types, fixed the schema issues as well as additional issues with the handling and conversion of timestamps during ingestion.

This PR aims to add functionality to the Spark and Hive readers and writers to automatically repair affected tables.
After switching to the 1.1 binary, the affected columns will undergo evolution from timestamp-micros to timestamp-millis. Normally this is a lossy evolution that is not supported, but it is safe here because the data is actually still timestamp-millis; it is only mislabeled as micros in the Parquet and table schemas.
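
To make the mislabeling concrete, here is a minimal sketch of how the same stored long decodes under each label. The `decode` helper and the sample value are assumptions for illustration only, not part of this PR.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode(raw, unit):
    """Interpret a raw epoch-based long under the given unit label."""
    if unit == "millis":
        return EPOCH + timedelta(milliseconds=raw)
    if unit == "micros":
        return EPOCH + timedelta(microseconds=raw)
    raise ValueError(f"unknown unit: {unit}")


# A value that was written as epoch milliseconds.
raw = 1_700_000_000_000

as_millis = decode(raw, "millis")  # the intended timestamp
as_micros = decode(raw, "micros")  # what a reader sees under the wrong label

print(as_millis.isoformat())  # 2023-11-14T22:13:20+00:00
print(as_micros.isoformat())  # 1970-01-20T16:13:20+00:00
```

Relabeling the column as millis does not touch the stored longs, so no precision is lost, unlike a real micros-to-millis conversion, which would have to divide by 1000.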

Impact

When reading from a Hudi table using the Spark or Hive reader, if the table schema has a column as millis but the data schema has it as micros, we assume that the column is affected and read it as a millis value instead of a micros value. This correction is also applied to all readers used by the default write paths. As a table is rewritten, its Parquet files will be corrected. A table's latest snapshot can be fixed immediately by writing one commit with the 1.1 binary and then clustering the entire table.
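
The read-side rule above can be sketched as a small decision function. The names here (`effective_read_type`, `MILLIS`, `MICROS`) are hypothetical and do not correspond to the actual reader code in this PR; the sketch only captures the stated condition.

```python
MILLIS = "timestamp-millis"
MICROS = "timestamp-micros"


def effective_read_type(table_type, file_type):
    """Pick the logical type a reader should use for one timestamp column.

    table_type: logical type of the column in the table schema
    file_type:  logical type of the column in the data file schema
    """
    if table_type == MILLIS and file_type == MICROS:
        # Assumed to be an affected column: the file is labeled micros,
        # but the values are really millis, so read them as millis.
        return MILLIS
    # Otherwise trust the label in the file.
    return file_type


print(effective_read_type(MILLIS, MICROS))  # timestamp-millis
print(effective_read_type(MICROS, MICROS))  # timestamp-micros
print(effective_read_type(MILLIS, MILLIS))  # timestamp-millis
```

The rule is one-directional in this sketch: a micros table schema over micros files is left alone, since nothing in that combination indicates mislabeling.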

Risk level (write none, low medium or high below)

High. Extensive testing was done and functional tests were added.

Documentation Update

#14100

Contributor's checklist

Read through contributor's guide
Enough context is provided in the sections above
Adequate tests were added if applicable

hudi-bot · 2026-01-28T05:01:31Z

CI report:

81c4d2b Azure: FAILURE

Bot commands

@hudi-bot supports the following commands:

@hudi-bot run azure: re-run the last Azure build

Fix timestamp_millis issue

baca3e9

linliu-code changed the base branch from master to release-0.14.2 January 15, 2026 20:08

linliu-code marked this pull request as ready for review January 15, 2026 20:09

github-actions bot added the size:XL PR with lines of changes > 1000 label Jan 15, 2026

Solve more compiling errors

112d3f6

linliu-code force-pushed the 0.14.2-logical-type-fix branch from 60ac38c to 112d3f6 Compare January 15, 2026 21:17

linliu-code force-pushed the 0.14.2-logical-type-fix branch from 2c2938b to 8984cd6 Compare January 22, 2026 22:34

Fix some bugs

b132c9b

linliu-code force-pushed the 0.14.2-logical-type-fix branch from 8984cd6 to b132c9b Compare January 22, 2026 22:50

linliu-code added 2 commits January 26, 2026 10:53

Fix incremental queries for auto repair

b0cc6a2

Fix data skipping support

81c4d2b
