@HyukjinKwon
Member

What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-13899

This PR makes the CSV data source produce InternalRow instead of Row.

Basically, this resembles the JSON data source and uses the same code for casting.
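
To illustrate what producing internal values rather than external ones can look like, here is a minimal sketch that does not depend on Spark. It assumes the usual Catalyst conventions (dates as days since the epoch, timestamps as microseconds since the epoch, strings as UTF-8 bytes). The names `CsvInternalSketch`, `Kind`, `castToInternal`, and `parseLine` are hypothetical and are not taken from this patch.

```scala
import java.nio.charset.StandardCharsets
import java.sql.{Date, Timestamp}

object CsvInternalSketch {
  // Hypothetical stand-ins for Spark's DataType hierarchy.
  sealed trait Kind
  case object IntKind extends Kind
  case object DateKind extends Kind
  case object TimestampKind extends Kind
  case object StringKind extends Kind

  // Convert one CSV token straight to an internal-style value,
  // skipping the external java.sql.* representation in the result.
  def castToInternal(token: String, kind: Kind): Any = kind match {
    case IntKind => token.toInt
    case DateKind =>
      // Days since 1970-01-01.
      Date.valueOf(token).toLocalDate.toEpochDay.toInt
    case TimestampKind =>
      // Microseconds since the epoch: whole seconds plus the fractional part.
      val ts = Timestamp.valueOf(token)
      Math.floorDiv(ts.getTime, 1000L) * 1000000L + ts.getNanos / 1000L
    case StringKind =>
      // UTF-8 bytes stand in for an internal string type.
      token.getBytes(StandardCharsets.UTF_8)
  }

  // Convert all tokens of one parsed line according to a schema.
  def parseLine(tokens: Seq[String], schema: Seq[Kind]): Array[Any] =
    tokens.zip(schema).map { case (t, k) => castToInternal(t, k) }.toArray
}
```

The point of the sketch is only that each token is converted once, directly into the representation the engine works with, so no second conversion from external row objects is needed.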

How was this patch tested?

Unit tests were run within the IDE, and code style was checked with ./dev/run_tests.

@HyukjinKwon
Member Author

cc @rxin @falaki

@SparkQA

SparkQA commented Mar 15, 2016

Test build #53179 has finished for PR 11717 at commit 11dcc3d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 15, 2016

Test build #53180 has finished for PR 11717 at commit 798eceb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

This PR would allow TimestampType to be inferred more flexibly (e.g. for values including T and GMT) rather than relying only on Timestamp.valueOf().
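
As a rough sketch of the difference: `Timestamp.valueOf()` only accepts the JDBC escape form `yyyy-[m]m-[d]d hh:mm:ss[.f...]`, so a value with a `T` separator or a `GMT` zone marker is rejected. The fallback chain below uses `java.time` purely for illustration. It is not the code path this patch uses, and `FlexibleTimestampSketch` and `parse` are hypothetical names.

```scala
import java.sql.Timestamp
import java.time.{LocalDateTime, OffsetDateTime}
import scala.util.Try

object FlexibleTimestampSketch {
  // Try the strict JDBC form first, then ISO 8601 with and without an offset.
  def parse(s: String): Option[Timestamp] = {
    val t = s.trim
    // Assumption for this sketch: a trailing "GMT" means UTC, and "GMT+01:00"
    // is treated as the plain offset "+01:00".
    val normalized =
      if (t.endsWith("GMT")) t.stripSuffix("GMT") + "Z" else t.replace("GMT", "")
    Try(Timestamp.valueOf(normalized))
      .orElse(Try(Timestamp.from(OffsetDateTime.parse(normalized).toInstant)))
      .orElse(Try(Timestamp.valueOf(LocalDateTime.parse(normalized))))
      .toOption
  }
}
```

With this sketch, `2015-08-20 14:57:00`, `2015-08-20T14:57:00`, and `2000-01-01T00:00GMT+01:00` all parse, while an arbitrary string yields `None`, which a type-inference step could treat as "not a timestamp".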

@SparkQA

SparkQA commented Mar 15, 2016

Test build #53189 has finished for PR 11717 at commit 06bc187.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

case dt: DecimalType =>
  val value = new BigDecimal(datum.replaceAll(",", ""))
  Decimal(value, dt.precision, dt.scale)
// TODO(hossein): would be good to support other common timestamp formats
Contributor

remove this todo?
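
For context, the decimal branch in the excerpt under review strips thousands separators before building the value. A standalone sketch of that idea, using only `java.math.BigDecimal` instead of Spark's `Decimal`, might look like the following. `DecimalTokenSketch` and `parseDecimal` are hypothetical names, and returning `None` on precision overflow is a choice made for this sketch, not a description of Spark's behavior.

```scala
import java.math.{BigDecimal, RoundingMode}

object DecimalTokenSketch {
  // Parse a token such as "1,234.56" into a decimal with the given scale.
  // Throws NumberFormatException if the token is not numeric at all.
  def parseDecimal(datum: String, precision: Int, scale: Int): Option[BigDecimal] = {
    // Remove thousands separators, as in the excerpt above.
    val raw = new BigDecimal(datum.replaceAll(",", ""))
    // Round to the target scale (rounding mode is an assumption of this sketch).
    val value = raw.setScale(scale, RoundingMode.HALF_UP)
    if (value.precision() <= precision) Some(value) else None
  }
}
```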

@rxin
Contributor

rxin commented Mar 16, 2016

Thanks - I'm going to merge this. There is a tiny comment. Can you remove that comment in one of your other PRs?

asfgit closed this in 9202479 on Mar 16, 2016

@HyukjinKwon
Member Author

@rxin Does that mean closing #11550 and not supporting a custom date format?

roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
… data source

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#11717 from HyukjinKwon/SPARK-13899.

HyukjinKwon deleted the SPARK-13899 branch on October 1, 2016 06:42