Commit
[SPARK-47125][SQL] Return null if Univocity never triggers parsing
### What changes were proposed in this pull request?

This PR guards against a `null` return value from `tokenizer.getContext`. It is similar to apache#28029. In the univocity library, `getContext` can apparently return `null` if `beginParsing` has not been invoked (https://github.com/uniVocity/univocity-parsers/blob/master/src/main/java/com/univocity/parsers/common/AbstractParser.java#L53). This can happen when `parseLine` is never called at https://github.com/apache/spark/blob/e081f06ea401a2b6b8c214a36126583d35eaf55f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala#L300, because `parseLine` is what invokes `beginParsing`.

### Why are the changes needed?

To fix a bug.

### Does this PR introduce _any_ user-facing change?

Yes. In a very rare case, when `CsvToStructs` is used as the sole predicate against an empty row, it might trigger an NPE. This PR fixes it.

### How was this patch tested?

Manually tested; a test case will be added in a separate PR. We should backport this to all branches.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#45210 from HyukjinKwon/SPARK-47125.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit a87015e)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
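The failure mode in this commit message can be illustrated with a minimal, self-contained Java sketch. Everything here is an assumption made for illustration: `LazyParser`, `Context`, and `currentInput` are hypothetical stand-ins, not univocity's or Spark's real classes, and the actual change lives in `UnivocityParser.scala`. The sketch only shows the general pattern of a context that stays `null` until the first `parseLine` call, and a guard that returns `null` instead of dereferencing it.

```java
public class NullContextSketch {

    /** Hypothetical stand-in for a parsing context; not univocity's real class. */
    static final class Context {
        private final String content;

        Context(String content) {
            this.content = content;
        }

        String currentParsedContent() {
            return content;
        }
    }

    /** Hypothetical parser whose context is only created once parsing starts. */
    static final class LazyParser {
        private Context context; // remains null until parseLine is called

        String[] parseLine(String line) {
            // Mimics a parser that initializes its context on first use.
            context = new Context(line);
            return line.split(",");
        }

        Context getContext() {
            return context;
        }
    }

    /** Null-safe accessor: returns null rather than throwing if parsing never began. */
    static String currentInput(LazyParser parser) {
        Context ctx = parser.getContext();
        if (ctx == null) {
            return null;
        }
        return ctx.currentParsedContent();
    }

    public static void main(String[] args) {
        LazyParser parser = new LazyParser();
        // parseLine has not run yet, so the guard returns null instead of an NPE.
        System.out.println(currentInput(parser));
        parser.parseLine("a,b,c");
        System.out.println(currentInput(parser));
    }
}
```

Without the `ctx == null` check, the first call in `main` would throw a `NullPointerException`, which is the kind of rare failure the commit message describes.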