
[SPARK-45265][SQL] Support Hive 4.0 metastore #48823

Closed
wants to merge 11 commits

Conversation

yaooqinn
Member

What changes were proposed in this pull request?

This PR continues the work from #43064 and #45801 to support Hive Metastore Server 4.0. CHAR/VARCHAR type partition filter pushdown is not included in this PR, as it requires further investment.
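For context, here is a minimal sketch of how an application would point Spark at a Hive 4.0 metastore once this lands, using the existing `spark.sql.hive.metastore.version` and `spark.sql.hive.metastore.jars` configs. The `"4.0.0"` version string and the `maven` jar-resolution mode below are illustrative assumptions, not taken from this PR:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: the metastore version string and jar-resolution mode are assumptions.
val spark = SparkSession.builder()
  .appName("hive-4.0-metastore-check")
  .config("spark.sql.hive.metastore.version", "4.0.0")
  .config("spark.sql.hive.metastore.jars", "maven")
  .enableHiveSupport()
  .getOrCreate()

// Any metastore round trip exercises the version-specific client, e.g. listing databases.
spark.sql("SHOW DATABASES").show()
```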

Why are the changes needed?

Enhances Spark's support for multiple Hive Metastore Server versions.

Does this PR introduce any user-facing change?

no

How was this patch tested?

The HiveClient*Suites pass with Hive 4.0.

Was this patch authored or co-authored using generative AI tooling?

no

@dongjoon-hyun
Member

👍🏻 Thank you so much for taking over this, @yaooqinn.

@github-actions github-actions bot added the INFRA label Nov 13, 2024
@github-actions github-actions bot removed the INFRA label Nov 13, 2024
@github-actions github-actions bot added the BUILD label Nov 14, 2024
@yaooqinn
Member Author

All GitHub Actions checks are passing now. Please take a look. @LuciferYang @dongjoon-hyun @HyukjinKwon @cloud-fan

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-45265][SQL] Support Hive 4 [SPARK-45265][SQL] Support Hive 4.0 Nov 14, 2024
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-45265][SQL] Support Hive 4.0 [SPARK-45265][SQL] Support Hive 4.0 metastore Nov 14, 2024
Member

@dongjoon-hyun dongjoon-hyun left a comment


+1, LGTM. Thank you, @yaooqinn.

I believe we can merge this and test more before the Apache Spark 4.0.0 release.

@yaooqinn
Member Author

Thank you @dongjoon-hyun @LuciferYang

@yaooqinn yaooqinn deleted the SPARK-45265 branch November 15, 2024 05:50
@Madhukar525722

Hi @yaooqinn, thanks for adding HMS 4 compatibility. I tried it and found a behaviour difference, reported in https://issues.apache.org/jira/browse/SPARK-50461. Could you please guide me on how I should approach fixing it?
cc @dongjoon-hyun

Thanks

state.err = new PrintStream(outputBuffer, true, UTF_8.name())
val clz = state.getClass.getField("out").getType.asInstanceOf[Class[_ <: PrintStream]]
val ctor = clz.getConstructor(classOf[OutputStream], classOf[Boolean], classOf[String])
state.getClass.getField("out").set(state, ctor.newInstance(outputBuffer, true, UTF_8.name()))


Build fails with: "the result type of an implicit conversion must be more specific than Object"
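
For reference, a hedged sketch of one way around that error, assuming it comes from the compiler having to implicitly box the `true` literal to `AnyRef` for the reflective varargs call, and assuming `state` is Hive's `SessionState` with a public `out` field as in the snippet above (illustrative only, not the code that was merged):

```scala
import java.io.{OutputStream, PrintStream}
import java.nio.charset.StandardCharsets.UTF_8

// Illustrative helper: rebuilds the session's `out` PrintStream reflectively,
// boxing the autoFlush flag up front so no implicit AnyRef conversion is needed.
def resetOut(state: AnyRef, outputBuffer: OutputStream): Unit = {
  val outField = state.getClass.getField("out")
  val clz = outField.getType.asInstanceOf[Class[_ <: PrintStream]]
  val ctor = clz.getConstructor(classOf[OutputStream], classOf[Boolean], classOf[String])
  outField.set(state, ctor.newInstance(outputBuffer, java.lang.Boolean.TRUE, UTF_8.name()))
}
```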

@@ -1030,7 +1030,7 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
     }
     val metaStoreParts = partsWithLocation
       .map(p => p.copy(spec = toMetaStorePartitionSpec(p.spec)))
-    client.createPartitions(db, table, metaStoreParts, ignoreIfExists)
+    client.createPartitions(tableMeta, metaStoreParts, ignoreIfExists)


why?
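
For what it's worth, a hedged reading of this hunk (the trait below is reconstructed from the diff alone and is not copied from the actual `HiveClient` interface): the call site now hands over the full `CatalogTable` instead of just the `(db, table)` name pair, which would give the client access to the rest of the table metadata rather than having to look it up by name.

```scala
import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}

// Reconstructed from the hunk above; names and exact signatures are assumptions.
trait PartitionCreatingClient {
  // Old shape: the table is identified by name only.
  def createPartitions(
      db: String,
      table: String,
      parts: Seq[CatalogTablePartition],
      ignoreIfExists: Boolean): Unit

  // New shape implied by the diff: the whole table metadata travels with the call.
  def createPartitions(
      table: CatalogTable,
      parts: Seq[CatalogTablePartition],
      ignoreIfExists: Boolean): Unit
}
```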
