Spark 3: Consider providing better support for path-based tables #1306

@aokolnychyi

Description

In Spark 3, support for path-based tables is limited. In particular, I don't see a way to create a Hadoop table at a given location through Spark. Users have to fall back to the Iceberg API for that, since Spark uses HadoopTables only for reading.
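For reference, the API-based workaround looks roughly like this (a minimal Scala sketch; the schema, partition spec, and HDFS path here are made up for illustration):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.iceberg.{PartitionSpec, Schema}
import org.apache.iceberg.hadoop.HadoopTables
import org.apache.iceberg.types.Types

// Hypothetical schema and location, for illustration only.
val schema = new Schema(
  Types.NestedField.required(1, "id", Types.LongType.get()),
  Types.NestedField.optional(2, "data", Types.StringType.get()))
val spec = PartitionSpec.builderFor(schema).identity("data").build()

// HadoopTables writes table metadata directly under the given path,
// without registering anything in a metastore.
val tables = new HadoopTables(new Configuration())
val table = tables.create(schema, spec, "hdfs://nn:8020/path/to/table")
```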

I see a lot of use cases with no metastore where tables are persisted at a location, usually on HDFS. While we can leverage HadoopCatalog for such cases, it has its own drawbacks: it needs list operations to find a table and, more importantly, it requires a special directory layout. The latter point matters because we cannot use HadoopCatalog for path-based tables that were migrated to Iceberg. I want Iceberg to support migration of path-based as well as metastore-based tables through SQL extensions.

I'd consider adding support to our Spark catalogs for creating/loading a table using a table path as an identifier. Under the hood, this would use HadoopTables.

For example: CREATE TABLE `path/to/table` USING iceberg, or SELECT * FROM `path/to/table` WHERE pred.
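For comparison, reads against a path already go through HadoopTables via the DataFrame API, so the proposal would mainly bring SQL up to parity (a minimal sketch, assuming a running SparkSession and a hypothetical table path):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("iceberg-path-read").getOrCreate()

// Reading a path-based table already works today: as noted above,
// the Iceberg source uses HadoopTables when given a path instead of
// a catalog identifier.
val df = spark.read.format("iceberg").load("hdfs://nn:8020/path/to/table")
df.filter("id > 100").show()
```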
