This repository was archived by the owner on Dec 20, 2018. It is now read-only.

Specifying a read schema with spark-avro #96

@mwho

It would be nice to have an option to supply a read schema (in lieu of the schema embedded in each file) when reading Avro files via spark-avro.

For example, the Python Avro API allows the following:
reader = DataFileReader(data, DatumReader(readers_schema=schema))

The scenario is this: I have many .avro files, possibly with different schemas (due to schema evolution), and I would like to use a single "master" schema to ingest all of them into a single Spark DataFrame.
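
To make the request concrete, here is a rough pure-Python sketch of the behaviour I have in mind. It is not spark-avro or Avro library code, only a simplified model of how Avro resolves a writer's record against a reader schema, limited to flat records. The `Event` schema, its field names, the sample records, and the helper `project_onto_reader_schema` are all made up for illustration. Type promotion, aliases, and nested records are left out.

```python
import json

# Hypothetical "master" reader schema; the record and field names are
# illustrative assumptions, not taken from any real dataset.
MASTER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "country", "type": ["null", "string"], "default": null}
  ]
}
""")


def project_onto_reader_schema(record, reader_schema):
    """Simplified model of Avro schema resolution for flat records.

    - Fields present only in the written record are dropped.
    - Fields present only in the reader schema take the reader's default.
    - A reader field with neither a value nor a default is an error.
    """
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError(f"no value or default for reader field {name!r}")
    return resolved


# Records as they might decode from files written under two schema versions.
v1_records = [{"id": 1, "name": "a"}]  # older files: no "country" field
v2_records = [{"id": 2, "name": "b", "country": "NZ", "debug": True}]  # newer files: extra field

# Every record ends up with exactly the master schema's fields.
unified = [
    project_onto_reader_schema(r, MASTER_SCHEMA)
    for r in v1_records + v2_records
]
```

With a read-schema option, all files would come back with the same set of columns regardless of which schema version wrote them, so they could be loaded into one DataFrame without per-file handling.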
