AvroParquetInputFormat should use a parameterized type #1658

`AvroParquetInputFormat` currently extends `ParquetInputFormat<IndexedRecord>`, which works for regular MapReduce jobs. However, Spark's `hadoopRDD` and [`newAPIHadoopRDD`](https://people.apache.org/~pwendell/spark-1.1.0-rc3-docs/api/java/org/apache/spark/SparkContext.html#newAPIHadoopRDD(org.apache.hadoop.conf.Configuration, java.lang.Class, java.lang.Class, java.lang.Class)) methods (correctly) create an RDD whose types are taken from the InputFormat. As a result, the RDD's value type is always `IndexedRecord` rather than the concrete record type being read.

`AvroParquetInputFormat` should instead be declared as `AvroParquetInputFormat<T extends IndexedRecord> extends ParquetInputFormat<T>`.
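To illustrate the difference, here is a minimal, self-contained sketch. It does not use the real Parquet, Avro, or Spark classes: `IndexedRecord`, `ParquetInputFormat`, and both input-format variants are simplified stand-ins, `User` is a hypothetical record type, and `read` is an assumed helper that only mimics how an API can take its element type from the input format's type parameter (the real `newAPIHadoopRDD` takes `Class` arguments instead).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ParameterizedInputFormatSketch {

    // Simplified stand-in for Avro's IndexedRecord (assumption, not the real interface).
    public interface IndexedRecord {
        Object get(int i);
    }

    // Hypothetical concrete record type, e.g. what an Avro-generated class might look like.
    public static final class User implements IndexedRecord {
        public final String name;
        public User(String name) { this.name = name; }
        @Override public Object get(int i) { return name; }
    }

    // Simplified stand-in for ParquetInputFormat<T>.
    public static class ParquetInputFormat<T> {}

    // Current shape: the value type is fixed to IndexedRecord.
    public static class FixedAvroParquetInputFormat
            extends ParquetInputFormat<IndexedRecord> {}

    // Proposed shape: the value type is a parameter bounded by IndexedRecord.
    public static class AvroParquetInputFormat<T extends IndexedRecord>
            extends ParquetInputFormat<T> {}

    // Assumed helper: the element type V of the result comes from the format's type parameter.
    public static <V> List<V> read(ParquetInputFormat<V> format, List<? extends V> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        List<User> stored = Collections.singletonList(new User("ada"));

        // Fixed variant: callers only see IndexedRecord and must cast.
        List<IndexedRecord> generic = read(new FixedAvroParquetInputFormat(), stored);
        User viaCast = (User) generic.get(0);

        // Parameterized variant: the concrete type is carried through.
        List<User> typed = read(new AvroParquetInputFormat<User>(), stored);
        User direct = typed.get(0);

        System.out.println(viaCast.name + " " + direct.name);
    }
}
```

With the fixed variant the caller has to cast each element back to `User`; with the parameterized variant the compiler carries the concrete type through, which is the behavior this issue asks for.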

**Reporter**: [Ryan Blue](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=rdblue) / @rdblue

<sub>**Note**: *This issue was originally created as [PARQUET-132](https://issues.apache.org/jira/browse/PARQUET-132). Please see the [migration documentation](https://issues.apache.org/jira/browse/PARQUET-2502) for further details.*</sub>
