cast StructType to StringType test fails if default scan is auto #2175

@andygrove

Describe the bug

While working on #2172, I found that the following test fails when the scan implementation is left at its default (auto):

- cast StructType to StringType *** FAILED *** (428 milliseconds)
  Results do not match for query:
  Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
  Timezone Env:

  == Parsed Logical Plan ==
  Project [cast(struct(_9, _9#428065, _10, _10#428066, _11, _11#428067L, _12, _12#428068) as string) AS CAST(struct(_9, _10, _11, _12) AS STRING)#428125]
  +- SubqueryAlias tbl
     +- View (`tbl`, [_1#428057,_2#428058,_3#428059,_4#428060,_5#428061L,_6#428062,_7#428063,_8#428064,_9#428065,_10#428066,_11#428067L,_12#428068,_13#428069,_14#428070,_15#428071,_16#428072,_17#428073,_18#428074,_19#428075,_20#428076,_21#428077,_id#428078])
        +- Relation [_1#428057,_2#428058,_3#428059,_4#428060,_5#428061L,_6#428062,_7#428063,_8#428064,_9#428065,_10#428066,_11#428067L,_12#428068,_13#428069,_14#428070,_15#428071,_16#428072,_17#428073,_18#428074,_19#428075,_20#428076,_21#428077,_id#428078] parquet

  == Analyzed Logical Plan ==
  CAST(struct(_9, _10, _11, _12) AS STRING): string
  Project [cast(struct(_9, _9#428065, _10, _10#428066, _11, _11#428067L, _12, _12#428068) as string) AS CAST(struct(_9, _10, _11, _12) AS STRING)#428125]
  +- SubqueryAlias tbl
     +- View (`tbl`, [_1#428057,_2#428058,_3#428059,_4#428060,_5#428061L,_6#428062,_7#428063,_8#428064,_9#428065,_10#428066,_11#428067L,_12#428068,_13#428069,_14#428070,_15#428071,_16#428072,_17#428073,_18#428074,_19#428075,_20#428076,_21#428077,_id#428078])
        +- Relation [_1#428057,_2#428058,_3#428059,_4#428060,_5#428061L,_6#428062,_7#428063,_8#428064,_9#428065,_10#428066,_11#428067L,_12#428068,_13#428069,_14#428070,_15#428071,_16#428072,_17#428073,_18#428074,_19#428075,_20#428076,_21#428077,_id#428078] parquet

  == Optimized Logical Plan ==
  Project [cast(struct(_9, _9#428065, _10, _10#428066, _11, _11#428067L, _12, _12#428068) as string) AS CAST(struct(_9, _10, _11, _12) AS STRING)#428125]
  +- Relation [_1#428057,_2#428058,_3#428059,_4#428060,_5#428061L,_6#428062,_7#428063,_8#428064,_9#428065,_10#428066,_11#428067L,_12#428068,_13#428069,_14#428070,_15#428071,_16#428072,_17#428073,_18#428074,_19#428075,_20#428076,_21#428077,_id#428078] parquet

  == Physical Plan ==
  *(1) CometColumnarToRow
  +- CometProject [CAST(struct(_9, _10, _11, _12) AS STRING)#428125], [cast(struct(_9, _9#428065, _10, _10#428066, _11, _11#428067L, _12, _12#428068) as string) AS CAST(struct(_9, _10, _11, _12) AS STRING)#428125]
     +- CometScan [native_iceberg_compat] parquet [_9#428065,_10#428066,_11#428067L,_12#428068] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/home/andy/git/apache/datafusion-comet/spark/target/tmp/spark-f1e..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<_9:smallint,_10:int,_11:bigint,_12:decimal(20,0)>

  == Results ==

  == Results ==
  !== Correct Answer - 10000 ==                               == Spark Answer - 10000 ==
   struct<CAST(struct(_9, _10, _11, _12) AS STRING):string>   struct<CAST(struct(_9, _10, _11, _12) AS STRING):string>
  ![{-1, -1, 4294967295, 18446744073709551615}]               [{0, 0, 0, 0}]
  ![{-1, -1, 4294967295, 18446744073709551615}]               [{0, 0, 0, 0}]
  ![{-1, -1, 4294967295, 18446744073709551615}]               [{0, 0, 0, 0}]
  ![{-1, -1, 4294967295, 18446744073709551615}]               [{0, 0, 0, 0}]
  ![{-1, -1, 4294967295, 18446744073709551615}]               [{0, 0, 0, 0}]
  ![{-1, -1, 4294967295, 18446744073709551615}]               [{0, 0, 0, 0}]
  ![{-1, -1, 4294967295, 18446744073709551615}]               [{0, 0, 0, 0}]
  ![{-1, -1, 4294967295, 18446744073709551615}]               [{0, 0, 0, 0}]
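
One possible reading of the expected row, offered as an interpretation and not something confirmed in this issue: 4294967295 and 18446744073709551615 are 2^32 - 1 and 2^64 - 1, which is what -1 becomes when its bits are reinterpreted as an unsigned 32-bit or 64-bit integer. That would be consistent with _11 and _12 being unsigned columns widened to bigint and decimal(20,0), although the Parquet-level types are not shown in the plan. The Python sketch below only checks that arithmetic; `as_unsigned` is a hypothetical helper and the column-to-width mapping is an assumption.

```python
# Values copied from the "Correct Answer" column of the failing test output.
expected = {"_9": -1, "_10": -1, "_11": 4294967295, "_12": 18446744073709551615}


def as_unsigned(value: int, bits: int) -> int:
    """Reinterpret a two's-complement integer as an unsigned integer of the given width."""
    return value & ((1 << bits) - 1)


# Assumption (not stated in the issue): _11 and _12 hold -1 stored in
# 32-bit and 64-bit unsigned columns respectively.
reinterpreted = {
    "_11": as_unsigned(-1, 32),
    "_12": as_unsigned(-1, 64),
}

# True when the reinterpreted values equal the expected row's values.
matches = all(reinterpreted[name] == expected[name] for name in reinterpreted)
print(reinterpreted, matches)
```

If this reading holds, the failing path returns 0 for every field where the expected output contains non-zero values, which may help narrow the investigation to how the selected scan decodes these columns.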

Steps to reproduce

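Update CometTestBase to remove the config that sets the scan implementation to native_comet, then run the "cast StructType to StringType" test. With the scan left at its default (auto), the physical plan above shows CometScan using native_iceberg_compat.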

Expected behavior

No response

Additional context

No response

Labels

bug (Something isn't working)
