@ericl
Contributor

@ericl ericl commented Oct 26, 2016

What changes were proposed in this pull request?

To reduce the number of components in SQL named *Catalog, rename *FileCatalog to *FileIndex. A FileIndex is responsible for returning the list of partitions / files to scan given a filtering expression.

TableFileCatalog => CatalogFileIndex
FileCatalog => FileIndex
ListingFileCatalog => InMemoryFileIndex
MetadataLogFileCatalog => MetadataLogFileIndex
PrunedTableFileCatalog => PrunedInMemoryFileIndex
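As a rough illustration of the responsibility described above (returning the partitions / files to scan for a given filter), here is a minimal, self-contained sketch. It is not Spark's actual `FileIndex` API: `SimpleFileIndex`, `SimpleInMemoryFileIndex`, `PartitionDir`, `FileEntry`, the paths, and the use of a plain Scala predicate in place of a Catalyst filtering expression are all assumptions made for the example.

```scala
// Hypothetical, simplified model; names and signatures are NOT Spark's real API.
final case class FileEntry(path: String, sizeInBytes: Long)
final case class PartitionDir(values: Map[String, String], files: Seq[FileEntry])

trait SimpleFileIndex {
  /** Root paths covered by this index. */
  def rootPaths: Seq[String]

  /** Partitions (with their files) whose partition values satisfy the filter. */
  def listFiles(filter: Map[String, String] => Boolean): Seq[PartitionDir]
}

/** Keeps the whole listing in memory and filters it on request. */
final class SimpleInMemoryFileIndex(
    val rootPaths: Seq[String],
    partitions: Seq[PartitionDir]) extends SimpleFileIndex {

  override def listFiles(filter: Map[String, String] => Boolean): Seq[PartitionDir] =
    partitions.filter(p => filter(p.values))
}

object FileIndexDemo {
  // Made-up sample data: two date partitions with one file each.
  val index = new SimpleInMemoryFileIndex(
    Seq("/data/events"),
    Seq(
      PartitionDir(
        Map("date" -> "2016-10-26"),
        Seq(FileEntry("/data/events/date=2016-10-26/part-0", 128L))),
      PartitionDir(
        Map("date" -> "2016-10-27"),
        Seq(FileEntry("/data/events/date=2016-10-27/part-0", 256L)))
    )
  )

  /** Paths of the files that a scan restricted to one date would read. */
  def filesFor(date: String): Seq[String] =
    index.listFiles(_.get("date").contains(date)).flatMap(_.files.map(_.path))
}
```

Calling `FileIndexDemo.filesFor("2016-10-27")` returns only the file under the matching partition directory, which is the "list plus filter" behavior the new name is meant to convey.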

cc @yhuai @marmbrus

How was this patch tested?

N/A

@SparkQA

SparkQA commented Oct 26, 2016

Test build #67544 has finished for PR 15634 at commit 0776537.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

This is okay, but note that "Provider" is equally overloaded in the Data Source API.

@ericl
Contributor Author

ericl commented Oct 26, 2016

Hmm, any other suggestions?

@rxin
Contributor

rxin commented Oct 26, 2016

FileLister? FileListing?

@ericl
Contributor Author

ericl commented Oct 27, 2016

How about FileIndex, since these classes are responsible for both listing and filtering functionality?

@ericl
Contributor Author

ericl commented Oct 27, 2016

Concretely, I propose renaming

TableFileCatalog => MetastoreFileIndex

ListingFileCatalog => InMemoryFileIndex

MetadataLogFileCatalog => MetadataLogFileIndex

PrunedTableFileCatalog => PrunedInMemoryFileIndex

I think this would make the differences between these classes clearer. Previously, the names were only loosely tied to their behavior.
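To make the distinction between these roles concrete, here is one way they might relate, sketched under assumptions: an index backed by an external catalog defers partition filtering to that catalog and hands back an in-memory index holding only the surviving partitions. Every name below (`PartitionCatalog`, `CatalogIndex`, `InMemoryIndex`, `PrunedInMemoryIndex`, `Part`) and every signature is invented for illustration and should not be read as the classes in this patch.

```scala
// Illustrative model only; names echo the discussion, signatures are invented.
final case class Part(values: Map[String, String], files: Seq[String])

/** Stand-in for an external catalog that can filter partitions itself. */
trait PartitionCatalog {
  def partitionsMatching(pred: Map[String, String] => Boolean): Seq[Part]
}

/** Holds an already-listed set of partitions in memory. */
class InMemoryIndex(val partitions: Seq[Part]) {
  def allFiles: Seq[String] = partitions.flatMap(_.files)
}

/** An in-memory index containing only the partitions that survived pruning. */
final class PrunedInMemoryIndex(pruned: Seq[Part]) extends InMemoryIndex(pruned)

/** Defers listing to the catalog and materializes only matching partitions. */
final class CatalogIndex(catalog: PartitionCatalog) {
  def filterPartitions(pred: Map[String, String] => Boolean): InMemoryIndex =
    new PrunedInMemoryIndex(catalog.partitionsMatching(pred))
}

object CatalogIndexDemo {
  // Made-up table layout with two year partitions.
  private val all = Seq(
    Part(Map("year" -> "2015"), Seq("t/year=2015/a.parquet")),
    Part(Map("year" -> "2016"), Seq("t/year=2016/b.parquet", "t/year=2016/c.parquet"))
  )

  val catalog: PartitionCatalog = new PartitionCatalog {
    def partitionsMatching(pred: Map[String, String] => Boolean): Seq[Part] =
      all.filter(p => pred(p.values))
  }

  val index = new CatalogIndex(catalog)

  /** Files a scan would read after pruning to a single year. */
  def filesForYear(year: String): Seq[String] =
    index.filterPartitions(_.get("year").contains(year)).allFiles
}
```

In this sketch the catalog-backed index never lists every partition up front; only the pruned result is held in memory, which is the kind of behavioral difference the more specific names are meant to signal.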

@rxin
Contributor

rxin commented Oct 28, 2016

FileIndex sounds good to me. I wouldn't call it "Metastore" though, since that is a Hive-specific term. I'd call it Catalog.

@ericl ericl force-pushed the rename-file-provider branch from 0776537 to b6654f1 October 28, 2016 18:37
@SparkQA

SparkQA commented Oct 28, 2016

Test build #67717 has finished for PR 15634 at commit b6654f1.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ericl ericl changed the title from "[SPARK-18103] [SQL] Rename *FileCatalog to *FileProvider" to "[SPARK-18103] [SQL] Rename *FileCatalog to *FileIndex" Oct 28, 2016
@SparkQA

SparkQA commented Oct 28, 2016

Test build #67718 has finished for PR 15634 at commit ec6d4ee.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Oct 29, 2016

LGTM pending tests.

@SparkQA

SparkQA commented Oct 29, 2016

Test build #67747 has finished for PR 15634 at commit d04fe77.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Oct 30, 2016

Merging in master.

@asfgit asfgit closed this in 90d3b91 Oct 30, 2016
robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016
## What changes were proposed in this pull request?

To reduce the number of components in SQL named *Catalog, rename *FileCatalog to *FileIndex. A FileIndex is responsible for returning the list of partitions / files to scan given a filtering expression.

```
TableFileCatalog => CatalogFileIndex
FileCatalog => FileIndex
ListingFileCatalog => InMemoryFileIndex
MetadataLogFileCatalog => MetadataLogFileIndex
PrunedTableFileCatalog => PrunedInMemoryFileIndex
```

cc yhuai marmbrus

## How was this patch tested?

N/A

Author: Eric Liang <ekl@databricks.com>
Author: Eric Liang <ekhliang@gmail.com>

Closes apache#15634 from ericl/rename-file-provider.
ghost pushed a commit to dbtsai/spark that referenced this pull request Nov 1, 2016
…to `MetadataLogFileIndex`

## What changes were proposed in this pull request?

This is a follow-up to apache#15634.

## How was this patch tested?

N/A

Author: Liwei Lin <lwlin7@gmail.com>

Closes apache#15712 from lw-lin/18103.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…to `MetadataLogFileIndex`
