Skip to content

Conversation

@flyrain
Copy link
Contributor

@flyrain flyrain commented Oct 12, 2025

No description provided.

@github-project-automation github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Oct 12, 2025
@flyrain flyrain changed the title Add Hive federation option in CLI CLI: Add Hive federation option Oct 12, 2025
uri=self.catalog_uri,
authentication_parameters=auth_params,
service_identity=service_identity,
warehouse=self.hadoop_warehouse
Copy link
Contributor Author

@flyrain flyrain Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird to use hadoop_warehouse. We might use the name warehouse, so that it can be applied to both Hadoop and Hive federation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts, suggestions? @HonahX @eric-maynard @MonkeyCanCode

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sry for the late response. I was away for couple of days. I don't have a strong preference over this naming. Maybe ask @eric-maynard as he initially added it in eb6b6ad.

Copy link
Contributor

@eric-maynard eric-maynard Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tentative convention I had in mind was to prefix each argument with the federation type it's specific to -- namely, ICEBERG_REMOTE_CATALOG_NAME is specific to iceberg federation type, and HADOOP_WAREHOUSE is specific to HADOOP. I thought it might be unclear what just REMOTE_CATALOG_NAME or WAREHOUSE meant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be clear I'm supportive of re-using these flags across federations types (e.g. across Hive / Hadoop) and indeed I think if we ever flesh out federation in the way that I envisioned at this time we would need to. There would be a way to federate to a Hive catalog both for Iceberg and non-Iceberg tables, and these would surely share arguments.

My only hesitation would be that the CLI 's method of handling arguments is a bit brittle and if we re-use them we should just make sure the parsing and the --help display behave the way we expect.

@flyrain flyrain requested review from HonahX and Copilot October 13, 2025 03:52
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds Hive federation support to the CLI by introducing a new "hive" connection type for external catalogs. This allows users to federate with Hive catalogs alongside the existing Iceberg REST and Hadoop options.

  • Added "hive" as a supported catalog connection type in CLI constants and commands
  • Updated CLI documentation to reflect the new hive option
  • Added validation and configuration logic for Hive connections

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
site/content/in-dev/unreleased/command-line-interface.md Updated documentation to include "hive" in the list of supported catalog connection types
client/python/test/test_cli_parsing.py Added test cases for Hive federation scenarios with both implicit and OAuth authentication
client/python/cli/constants.py Added HIVE enum value to CatalogConnectionType
client/python/cli/command/catalogs.py Added HiveConnectionConfigInfo import, validation logic, and configuration building for Hive connections

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Nov 13, 2025
f" and {Argument.to_flag_name(Arguments.CATALOG_URI)}")
elif self.catalog_connection_type == CatalogConnectionType.HIVE.value:
if not self.hadoop_warehouse or not self.catalog_uri:
raise Exception(f"Missing required argument for connection type 'HIVE':"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: --help was updated to list iceberg-rest, hadoop, hive so these logs (and the tests) should follow that same convention

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants