Closed
1 change: 1 addition & 0 deletions R/pkg/R/DataFrame.R
@@ -2635,6 +2635,7 @@ setMethod("write.df",
write <- callJMethod(df@sdf, "write")
write <- callJMethod(write, "format", source)
write <- callJMethod(write, "mode", jmode)
+write <- callJMethod(write, "options", options)
write <- callJMethod(write, "save", path)
})

12 changes: 11 additions & 1 deletion R/pkg/inst/tests/testthat/test_sparkSQL.R
@@ -208,7 +208,7 @@ test_that("create DataFrame from RDD", {
unsetHiveContext()
})

-test_that("read csv as DataFrame", {
+test_that("read/write csv as DataFrame", {
csvPath <- tempfile(pattern = "sparkr-test", fileext = ".csv")
mockLinesCsv <- c("year,make,model,comment,blank",
"\"2012\",\"Tesla\",\"S\",\"No comment\",",
@@ -243,7 +243,17 @@ test_that("read csv as DataFrame", {
expect_equal(count(withoutna2), 3)
expect_equal(count(where(withoutna2, withoutna2$make == "Dummy")), 0)

+# writing csv file
+csvPath2 <- tempfile(pattern = "csvtest2", fileext = ".csv")
+write.df(df2, path = csvPath2, "csv", header = "true")
+df3 <- read.df(csvPath2, "csv", header = "true")
Contributor

We could do this, but I was thinking we could also check whether R's read.csv is able to read the file back correctly with headers?

Member Author

@felixcheung Sep 8, 2016

read.csv needs the full path to the part file to work:

> read.csv(file = csvPath2)
Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
  no lines available in input

> read.csv(file = paste0(csvPath2, "/", "part-r-00000-bf045be1-500f-4e77-8957-b6d256166ca7.csv"))
  year  make       model                            comment blank
1 2012 Tesla           S                         No comment Empty
2 1997  Ford        E350 Go get one now they are going fast Empty
3 2015 Chevy        Volt
4   NA Dummy Placeholder

Contributor

write.df interprets the path as a directory. It then writes a part-0000 file, or a sequence of such files, inside that directory.

Member Author

Right, it seems read.csv doesn't work with wildcards.
Testing a fix.
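
For reference, a minimal base-R sketch of the point made in this thread, with no Spark involved. The output directory and the `part-00000-example.csv` file name are simulated assumptions standing in for what write.df produces; the idea is only that read.csv needs a concrete file, so the part file is resolved with list.files first.

```r
# Simulate the layout described in this thread: the output "path" is a
# directory that holds one or more part files. The part file name used
# here is hypothetical; Spark generates its own names.
outDir <- tempfile(pattern = "csvtest", fileext = ".csv")
dir.create(outDir)

expected <- data.frame(year = c(2012L, 1997L),
                       make = c("Tesla", "Ford"),
                       stringsAsFactors = FALSE)
write.csv(expected, file.path(outDir, "part-00000-example.csv"),
          row.names = FALSE)

# read.csv() takes a single file, so list the part files in the directory
# and read the first one.
partFiles <- list.files(outDir, pattern = "^part", full.names = TRUE)
actual <- read.csv(partFiles[[1]], stringsAsFactors = FALSE)

# unlink() removes a directory only when recursive = TRUE is passed.
unlink(outDir, recursive = TRUE)
```

With more than one part file, each element of `partFiles` would need to be read and the results combined; this sketch assumes a single part file.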

+expect_equal(nrow(df3), nrow(df2))
+expect_equal(colnames(df3), colnames(df2))
+csv <- read.csv(file = list.files(csvPath2, pattern = "^part", full.names = T)[[1]])
+expect_equal(colnames(df3), colnames(csv))

unlink(csvPath)
+unlink(csvPath2)
})

test_that("convert NAs to null type in DataFrames", {