ISSUE #96: Specifying a read schema #109
I'm splitting the build changes into a separate PR, #112, so you may have to rebase this. |
faa69db to
5a789c4
|
ok, rebased. |
|
We are looking forward to this change, as it would enable Spark DataFrame schema evolution using Avro. It would also cover my scenario of reading Parquet files defined with an Avro schema:
val avroSchema = new Schema.Parser().parse(new File("avsc/user.avsc"))
val df = sqlContext.read.schema(avroSchema).format("parquet").load(path) |
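As a rough illustration of what reading with a caller-supplied schema implies, here is a minimal sketch in plain Scala with no Spark or Avro dependency. The object `ReadSchemaSketch`, the helper `project`, and the sample field names are hypothetical and are not part of spark-avro; the sketch only assumes that fields requested by the read schema but absent from the data come back as null.

```scala
// Hypothetical helper, not part of spark-avro: project a decoded record onto
// the field names requested by a read schema.
object ReadSchemaSketch {
  // Fields present in the record keep their value; fields the record does not
  // contain become null instead of raising an error.
  def project(record: Map[String, Any], readFields: Seq[String]): Seq[Any] =
    readFields.map(name => record.getOrElse(name, null))

  def main(args: Array[String]): Unit = {
    // A record written with an older schema that has no "email" field.
    val oldRecord = Map[String, Any]("name" -> "alice", "age" -> 30)
    // The newer read schema asks for one extra column.
    val row = project(oldRecord, Seq("name", "age", "email"))
    println(row)
  }
}
```

Whether a missing field should become null or should fail the read is a design choice; a stricter variant could return an error for any requested field without a default.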
|
When will this PR be merged, or has it been abandoned? Thanks. |
|
Can you please add a test (including a scenario with unknown columns)? Thanks for working on this! |
|
@vlyubin I will fix the code style and add a test. |
5a789c4 to
1558a75
|
@bottleimp I just made a PR regarding the "try-catch" code. Hope it helps! |
replace try-exception in buildScan
|
What's the status of this PR? It is a really important feature for those of us who need schema evolution and who have run into the stack overflow problem when reading recursive schemas! |
|
@pauldwolfe, you may check #95 to see if it meets your need. |
|
This looks like the best solution to this issue, and reading multiple avros with evolved schemas is very important to me. What is the consensus here? |
|
Just following up to say that this PR has met my schema evolution needs very well, but I'm resistant to utilizing a forked repo in production. Shall we rebase? |
|
@jamesmatanle, I'll do a rebase today and check what I can do next. |
|
With #155, my PR is outdated, so I think I will just close it. |

Addressing issue #96.
My solution is, err, a bit violent: all unknown columns are set to null instead of throwing an exception. I hope someone can improve it. Right now it works for me. Here's the test example: