Description
The new tests TestInsertTable."Test Insert Into with subset of columns" and "Test Insert Into with subset of columns on Parquet table" fail on Spark 3.4 due to the validation introduced in HoodieSpark34CatalystPlanUtils in https://github.com/apache/hudi/pull/11568. Before that change, INSERT INTO with a subset of columns worked.
{code:scala}
override def unapplyInsertIntoStatement(plan: LogicalPlan): Option[(LogicalPlan, Seq[String], Map[String, Option[String]], LogicalPlan, Boolean, Boolean)] = {
  plan match {
    case insert: InsertIntoStatement =>
      // apache/spark#36077
      // First: since that PR, Spark 3.4 supports default values for INSERT INTO and regenerates the
      // user-specified columns itself, so there is no need to handle them on the Hudi side.
      // Second: that PR appends the Hudi meta fields with default values, which has a bug; it looks
      // like it was fixed in Spark 3.5 (apache/spark#41262), so users who want to specify columns
      // must disable the default-column feature.
      if (SQLConf.get.enableDefaultColumns) {
        if (insert.userSpecifiedCols.nonEmpty) {
          throw new AnalysisException("hudi not support specified cols when enable default columns, " +
            "please disable 'spark.sql.defaultColumn.enabled'")
        }
        Some((insert.table, Seq.empty, insert.partitionSpec, insert.query, insert.overwrite, insert.ifPartitionNotExists))
      } else {
        Some((insert.table, insert.userSpecifiedCols, insert.partitionSpec, insert.query, insert.overwrite, insert.ifPartitionNotExists))
      }
    case _ =>
      None
  }
} {code}
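For context, here is a minimal repro sketch (the table name, columns, and values are hypothetical; it assumes a Spark 3.4 session with the Hudi Spark bundle and Hudi catalog configured). Because spark.sql.defaultColumn.enabled defaults to true on Spark 3.4, the check above throws as soon as an INSERT INTO statement lists any columns at all:
{code:scala}
// Hypothetical table; Spark 3.4 session with the Hudi Spark bundle assumed.
spark.sql(
  """CREATE TABLE h0 (id INT, name STRING, price DOUBLE, ts LONG)
    |USING hudi
    |TBLPROPERTIES (primaryKey = 'id', preCombineField = 'ts')""".stripMargin)

// Used to work; now fails on Spark 3.4 with
// AnalysisException: "hudi not support specified cols when enable default columns, ..."
// because userSpecifiedCols is non-empty and the default-column feature is on by default.
spark.sql("INSERT INTO h0 (id, name, ts) VALUES (1, 'a1', 1000)")

// Workaround suggested by the error message: disable the default-column feature.
spark.sql("SET spark.sql.defaultColumn.enabled=false")
spark.sql("INSERT INTO h0 (id, name, ts) VALUES (2, 'a2', 1000)")
{code}
Note that the workaround disables the default-column feature for the whole session, which is why the validation itself needs fixing rather than being worked around.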
JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-8911
- Type: Bug
- Fix version(s): 1.1.0