sub-folder structure #10

Closed
MonkmanMH opened this issue Aug 3, 2017 · 7 comments


@MonkmanMH
Contributor

Add sub-folders, as per the guidelines in the recently published paper "Good Enough Practices in Scientific Computing" -- Summary of Practices, 4b - 4e

  • doc
  • data
  • src (for scripts, e.g. the 01_load.R files that are already created via bcgovr)
  • bin
@boshek
Contributor

boshek commented Aug 3, 2017

This is a tough one. I do like the idea of mirroring that paper. Just to clarify, are you saying we replace the files created here with the above structure?

The thing I struggle with here is a "one-size-fits-all" approach to file structure within an analysis directory. Would an argument with a few options of folder/file combinations be merited? Or is it better to keep things simple, so that users who don't like the default just delete it and replace it with their own files/folders?

@stephhazlitt
Member

I lean towards one default & users can manually edit as they like -- I think multiple options might add complexity for a similar outcome (the options still won't meet everyone's styles/needs). For the EnvReportBC workflow, there is a preference for having the core analysis scripts in the root directory (see thread here for a similar discussion).

@boshek
Contributor

boshek commented Aug 10, 2017

I think I will wait for @ateucher to comment on this, but I wonder if a blank slate might be best here. That is, no .R files are automatically produced. Then in the documentation we could discuss some analysis structure options? Or perhaps the files themselves are generated by another function?

@ateucher
Contributor

ateucher commented Aug 14, 2017

I feel like a good solution would be a simple default (as it is now), then allow a user to set an option for an alternate setup. I would see this being supplied as a character vector of files/paths, e.g.:

c("doc/", "data/", "bin/", "results/", "src/01_load.R", "src/02_clean.R", 
"src/03_analysis.R", "src/04_output.R", "src/runall.R")

This could be an additional argument (dir_struct or something) to analysis_skeleton, but there are already a lot of arguments to that function.
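To make the idea concrete, here is a minimal sketch of how a character vector like that could be turned into folders and empty files. The helper name `create_structure` and the rule "entries ending in `/` are folders, everything else is a file" are assumptions for illustration only; this is not the code in bcgovr.

```r
# Hypothetical helper (not part of bcgovr): build folders and empty files
# from a character vector of relative paths.
# Assumption: entries ending in "/" are folders, all other entries are files.
create_structure <- function(base, dir_struct) {
  is_dir <- grepl("/$", dir_struct)

  # Drop the trailing slash so dir.create() behaves the same on all platforms
  dirs  <- file.path(base, sub("/$", "", dir_struct[is_dir]))
  files <- file.path(base, dir_struct[!is_dir])

  # Create the requested folders plus any parent folders the files need
  for (d in unique(c(dirs, dirname(files)))) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }

  # Create the (empty) files
  file.create(files)

  invisible(c(dirs, files))
}
```

Something like `create_structure("my_analysis", c("doc/", "data/", "src/01_load.R"))` would then create `doc/`, `data/`, and `src/` under `my_analysis`, with an empty `01_load.R` inside `src/`.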

Alternatively (or additionally) a user could set a global option of:

options("bcgovr_dir_struct" = c("doc/", "data/", "bin/", "results/", 
                                "src/01_load.R", "src/02_clean.R", 
                                "src/03_analysis.R", "src/04_output.R", 
                                "src/runall.R"))

Internally the function would look for that option, and if it exists, use that structure instead of the default one. A user could set that option in their .Rprofile so it is available every time they open R.

Thoughts?

@boshek
Contributor

boshek commented Aug 14, 2017

+1 for the .Rprofile option. That is a great idea. So no new argument is needed, but the option is documented somewhere (likely here for now) so that folks know it exists. This is a great idea @ateucher, as I agree there is a bit of argument overload in analysis_skeleton().

@ateucher
Contributor

On second thought I think only setting it via an option is a bad idea, as it's pretty opaque and probably an 'anti-pattern' or 'user-hostile' 😉. I think there should be an argument, which by default checks for that option. If the bcgovr_dir_struct option isn't set and the user doesn't supply a different structure to the argument, then it uses the default structure.
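A rough sketch of that lookup order (explicit argument first, then the option, then the built-in default) might look like the following. The function name `resolve_dir_struct` is hypothetical, the option name is the one used in this comment, and the default vector is copied from the default output posted in this thread; none of this is taken from the bcgovr source.

```r
# Default structure, as listed in the default analysis_skeleton() output
default_struct <- c("R/", "out/", "graphics/", "data/", "01_load.R",
                    "02_clean.R", "03_analysis.R", "04_output.R",
                    "internal.R", "run_all.R")

# Hypothetical stand-in for the argument handling (not bcgovr's actual code).
# The argument defaults to the option; getOption() returns NULL when the
# option is unset, in which case we fall back to the default structure.
resolve_dir_struct <- function(dir_struct = getOption("bcgovr_dir_struct")) {
  if (is.null(dir_struct)) default_struct else dir_struct
}
```

With this shape, a user who never sets the option gets the default, a user who sets it in their .Rprofile gets their own structure everywhere, and an explicit argument still overrides both for a single call.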

@ateucher
Contributor

ateucher commented Aug 23, 2017

Ok, this is what I have (in the struct-options branch):

# devtools::install_github("bcgov/bcgovr", ref = "struct-options")
library(bcgovr)

## Default
analysis_skeleton(path = "c:/_dev/bcgovr_test")
#> Creating new analysis in c:/_dev/bcgovr_test
#> Adding folders and files to c:/_dev/bcgovr_test: R/, out/, graphics/, data/, 01_load.R, 02_clean.R, 03_analysis.R, 04_output.R, internal.R, run_all.R
#> Adding file c:/_dev/bcgovr_test/CONTRIBUTING.md
#> Adding file c:/_dev/bcgovr_test/CODE_OF_CONDUCT.md
#> Adding file c:/_dev/bcgovr_test/README.md
#> Adding file c:/_dev/bcgovr_test/README.rmd
#> Adding file c:/_dev/bcgovr_test/bcgovr_test.Rproj
#> Adding file c:/_dev/bcgovr_test/LICENSE
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/01_load.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/02_clean.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/03_analysis.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/04_output.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/internal.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/run_all.R

## Specify it by argument:
analysis_skeleton(path = "c:/_dev/bcgovr_test2", 
                  dir_struct = c("doc/", "data/", "bin/", "results/", 
                                 "src/01_load.R", "src/02_clean.R", 
                                 "src/03_analysis.R", "src/04_output.R", 
                                 "src/runall.R"))
#> Creating new analysis in c:/_dev/bcgovr_test2
#> Adding folders and files to c:/_dev/bcgovr_test2: doc/, data/, bin/, results/, src/01_load.R, src/02_clean.R, src/03_analysis.R, src/04_output.R, src/runall.R
#> Adding file c:/_dev/bcgovr_test2/CONTRIBUTING.md
#> Adding file c:/_dev/bcgovr_test2/CODE_OF_CONDUCT.md
#> Adding file c:/_dev/bcgovr_test2/README.md
#> Adding file c:/_dev/bcgovr_test2/README.rmd
#> Adding file c:/_dev/bcgovr_test2/bcgovr_test2.Rproj
#> Adding file c:/_dev/bcgovr_test2/LICENSE
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/01_load.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/02_clean.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/03_analysis.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/04_output.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/runall.R

## Set it via options (could put this in your .Rprofile)
options("bcgovr.dir.struct" = c("my_awesome_functions/", "do_it_all.R"))

analysis_skeleton(path = "c:/_dev/bcgovr_test3")
#> Creating new analysis in c:/_dev/bcgovr_test3
#> Adding folders and files to c:/_dev/bcgovr_test3: my_awesome_functions/, do_it_all.R
#> Adding file c:/_dev/bcgovr_test3/CONTRIBUTING.md
#> Adding file c:/_dev/bcgovr_test3/CODE_OF_CONDUCT.md
#> Adding file c:/_dev/bcgovr_test3/README.md
#> Adding file c:/_dev/bcgovr_test3/README.rmd
#> Adding file c:/_dev/bcgovr_test3/bcgovr_test3.Rproj
#> Adding file c:/_dev/bcgovr_test3/LICENSE
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test3/do_it_all.R
