Invert usingDataSourceExec test helper to usingLegacyNativeCometScan #3309

@andygrove

Summary

With the introduction of native_datafusion in auto scan mode (PR #3307), several test helpers that check the scan implementation config are broken when running in auto mode. The root cause is that helpers like usingDataSourceExec check if the config string is literally native_datafusion or native_iceberg_compat, but in auto mode the config reads as "auto" even though it resolves to native_datafusion at plan time.
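The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual Comet helper: it assumes the helper receives the raw scan implementation string, and the object name and constant names are hypothetical stand-ins for whatever the test utilities really use.

```scala
object LiteralScanCheck {
  // Hypothetical stand-ins for the scan implementation config values
  // named in this issue; the real constants may be defined elsewhere.
  val ScanAuto = "auto"
  val ScanNativeDataFusion = "native_datafusion"
  val ScanNativeIcebergCompat = "native_iceberg_compat"

  // Mirrors the behaviour described in the summary: the check compares
  // the raw config string and knows nothing about plan-time resolution.
  def usingDataSourceExec(scanImpl: String): Boolean =
    scanImpl == ScanNativeDataFusion || scanImpl == ScanNativeIcebergCompat
}
```

With this shape, `LiteralScanCheck.usingDataSourceExec(LiteralScanCheck.ScanAuto)` returns false, so a test guarded by the helper takes the legacy branch even though the scan resolves to native_datafusion at plan time.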

Failing Tests (in auto mode)

  • "schema evolution" (ParquetReadSuite.scala:1256) — expects SparkException but native_datafusion handles type widening gracefully
  • "row group skipping doesn't overflow when reading into larger type" (ParquetReadSuite.scala:1523) — same issue

Proposed Fix

Since native_comet is deprecated and the default path is now DataSource-based (via auto), invert the check:

  • Rename usingDataSourceExec to usingLegacyNativeCometScan, which returns true only when the config is explicitly native_comet
  • Flip all ~40 call sites accordingly
  • Update usingDataSourceExecWithIncompatTypes similarly
  • Fix the explicit SCAN_NATIVE_DATAFUSION check in the schema evolution test

This avoids needing to enumerate all non-legacy modes and is forward-compatible with future scan implementations.
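The inverted check could look roughly like the sketch below. It is an illustration under assumptions rather than the proposed patch: the helper is assumed to take the raw scan implementation string, and `InvertedScanCheck`, `ScanNativeComet`, and `describeScan` are hypothetical names introduced only for this example.

```scala
object InvertedScanCheck {
  // Assumed config value for the deprecated scan; the real constant may differ.
  val ScanNativeComet = "native_comet"

  // True only when the legacy scan is requested explicitly.
  // "auto" and any future scan implementation fall through to false.
  def usingLegacyNativeCometScan(scanImpl: String): Boolean =
    scanImpl == ScanNativeComet

  // Hypothetical call site after the flip: a branch formerly guarded by
  // !usingDataSourceExec(...) is now guarded by the legacy check directly.
  def describeScan(scanImpl: String): String =
    if (usingLegacyNativeCometScan(scanImpl)) "legacy native_comet scan"
    else "DataSource-based scan"
}
```

Because the helper matches a single deprecated value, "auto", native_datafusion, and native_iceberg_compat all land in the same non-legacy branch without being listed individually.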
