Set names for import_list() #164

Merged: 9 commits, Jan 31, 2018

Contributor

@ruaridhw ruaridhw commented Oct 8, 2017

@leeper, some thoughts on the enhancement you suggested in #162: it may make more sense to use the same logic for cases 3 and 4 (see below), and/or to drop case 2?

This PR makes the list returned by import_list() inherit its names from the underlying data structure:

  1. xls and xlsx retain the sheet names
  2. html retains each table's "class" attribute if it exists, otherwise the name is blank
  3. zip retains the file names with extension
  4. vectors of files retain the file names without extension

Fixes #162

Unknown added 4 commits October 7, 2017 20:13
@ruaridhw
Contributor Author

ruaridhw commented Oct 8, 2017

The AppVeyor build failed due to a suspected r-devel issue:

  • installing source package 'rio' ...
    ** R
    ** inst
    ** preparing package for lazy loading
    Error in inDL(x, as.logical(local), as.logical(now), ...) :
    Checking LENGTH(allocVector(INTSXP,2)) [145284976] is 2 ... failed. Please forward this message to maintainer('data.table').
    ERROR: lazy loading failed for package 'rio'
  • removing 'C:/projects/rio/rio.Rcheck/rio'

Seems to build and check fine locally otherwise:

  • using R version 3.4.1 (2017-06-30)
  • using platform: x86_64-apple-darwin15.6.0 (64-bit)
  • using session charset: UTF-8
  • using options ‘--no-manual --as-cran’
  • checking for file ‘rio/DESCRIPTION’ ... OK
  • checking extension type ... Package
  • this is package ‘rio’ version ‘0.5.7’
    ...
  • DONE
    Status: OK
    R CMD check results
    0 errors | 0 warnings | 0 notes
    R CMD check succeeded

@leeper leeper self-assigned this Oct 9, 2017
@codecov-io

codecov-io commented Oct 9, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@18d138a).
The diff coverage is 80%.

@@            Coverage Diff            @@
##             master     #164   +/-   ##
=========================================
  Coverage          ?   85.42%           
=========================================
  Files             ?       18           
  Lines             ?     1036           
  Branches          ?        0           
=========================================
  Hits              ?      885           
  Misses            ?      151           
  Partials          ?        0

Impacted Files     Coverage Δ
R/import_list.R    91.11% <80%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 18d138a...8db6f27.

R/import_list.R Outdated
@@ -48,6 +48,7 @@ function(file,
 setclass <- NULL
 }
 if (length(file) > 1) {
+names(file) <- gsub(paste0("\\.", tools::file_ext(file[1]), "$"), "", file, ignore.case = TRUE)
Contributor

I'm not sure I understand what's happening here. Can you explain?

Contributor Author

It's what the first comment in this PR is trying to explain; unfortunately, that comment didn't attach itself to the precise line of code.

It also relates to Case 4 in the PR description.
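
To make the reviewed line easier to follow, here is a minimal base-R sketch of what it does in isolation. The file names are hypothetical and chosen so that every file shares one extension, which is the assumption the line relies on.

```r
# Hypothetical input: several files that all share the same extension
file <- c("mtcars.csv", "iris.csv", "USArrests.CSV")

# Build a regex from the extension of the FIRST file only, e.g. "\\.csv$"
pattern <- paste0("\\.", tools::file_ext(file[1]), "$")

# Remove that extension from every element, ignoring case
stripped <- gsub(pattern, "", file, ignore.case = TRUE)

# Use the stripped names as names of the file vector
names(file) <- stripped
names(file)
#> [1] "mtcars"    "iris"      "USArrests"
```

Because the pattern comes from file[1] alone, any file with a different extension would keep its extension in the resulting names.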

Contributor Author

@ruaridhw ruaridhw Oct 11, 2017

I've since realised that assuming all files have the same extension somewhat defeats the purpose of ingesting multiple disparate datasets of differing file types at the same time.

file <- c("mtcars.csv", "iris.csvy", "flights.xlsx", "USArrests.csv")
my_list <- import_list(file, rbind = FALSE)
names(my_list)
#> [1] "mtcars"  "iris.csvy"  "flights.xlsx"  "USArrests"

If it's going to strip file extensions it probably makes more sense to lapply(file, tools::file_ext) and strip each respective one so that we end up with

exts <- paste0("\\.", lapply(file, tools::file_ext), "$")
names(file) <- sapply(seq_along(file), function(x) gsub(exts[x], "", file[x]))
names(file)
#> [1] "mtcars"  "iris"  "flights"  "USArrests"
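
Since tools::file_ext() is vectorised, the same per-file stripping could also be sketched without lapply()/sapply() and without building regexes. This is only an illustration, not the code in this PR; the extra "README" entry is a hypothetical addition to show the no-extension case.

```r
# File names from the example above, plus a hypothetical one without extension
file <- c("mtcars.csv", "iris.csvy", "flights.xlsx", "USArrests.csv", "README")

# One extension per file ("" when there is none)
exts <- tools::file_ext(file)

# Drop the trailing ".<ext>" by character count; leave extension-less names alone
stripped <- ifelse(
  nzchar(exts),
  substr(file, 1, nchar(file) - nchar(exts) - 1),
  file
)
stripped
#> [1] "mtcars"    "iris"      "flights"   "USArrests" "README"
```

Counting characters instead of using gsub() avoids having to escape anything in the extension, at the cost of being a little less obvious to read.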

Also, on second thoughts, it doesn't really seem appropriate to treat zip files differently, so Case 3 should now be the same as Case 4:

  3. zip retains the file names without extension

Implemented in 258a50c

@leeper
Contributor

leeper commented Oct 10, 2017

Can you add yourself to DESCRIPTION as a contributor?

R/import_list.R Outdated
 if (length(file) > 1) {
-names(file) <- gsub(paste0("\\.", tools::file_ext(file[1]), "$"), "", file, ignore.case = TRUE)
+names(file) <- strip_exts(file)

@HughParsonage HughParsonage Oct 20, 2017

There's a function tools::file_path_sans_ext that might be useful (possibly in combination with basename()).
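
A minimal sketch of that suggestion, using hypothetical paths; it is not necessarily the code that was eventually merged:

```r
# Hypothetical paths, some with directories and one with a double extension
paths <- c("data/mtcars.csv", "data/raw/iris.csvy", "flights.xlsx", "archive.tar.gz")

# basename() drops the directory part; file_path_sans_ext() drops the last extension
nms <- tools::file_path_sans_ext(basename(paths))
nms
#> [1] "mtcars"      "iris"        "flights"     "archive.tar"
```

Only the final extension is removed by default, so "archive.tar.gz" becomes "archive.tar"; tools::file_path_sans_ext() also has a compression argument for handling suffixes such as .gz.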

Contributor Author

Great call, thanks @HughParsonage.

@leeper ready for merge, I think

@billdenney
Contributor

I was just looking for the functionality of maintaining sheet names for xls/xlsx files with import_list, and I found this PR. My fingers are crossed that it will be integrated soon. (If I can help with the integration, please let me know.)

@ruaridhw
Contributor Author

ruaridhw commented Dec 15, 2017

The PR is finished and has been ready to go, pending review.

@billdenney, you can install this pull request (or any other, for that matter) as follows:

library(devtools)
install_github("leeper/rio", ref = github_pull(164))

@leeper leeper merged commit f461eb2 into gesistsa:master Jan 31, 2018
@leeper
Contributor

leeper commented Jan 31, 2018

Thanks for this!

5 participants