-
Notifications
You must be signed in to change notification settings - Fork 2
/
make.Rmd
25 lines (13 loc) · 1.31 KB
/
make.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# Data Generation Workflow with "Make"
*"What gets done stays done"*
This is just a proposal to consider using automated build tools like `make` for data prepration steps.
## Motivation
We strive to organize our code that cleans data, generated new data, or generates figures into sequential steps, automating as much or all of this as necessary. Writing scripts for downloading, moving files, etc not only helps
As we work we may alter any one of the steps along the way and want to re-run the sequence, but some of the steps may take non-significant time (5min < t < 60min). The challenge is to only run those steps affected by our change, without having to re-run those that are not. In addition some steps may be non sequeuntial, and can be run seperately or as an aside.
This is not uncommon for any software 'build' process. If we consider the production of our cleaned (L1) and derived (L2) data (ref EDI framework) analagous to a software "build," there are many tools to help with that.
The assumption here is that our goal is to automate data processes as much as possible, but even
For example, the GNU Make utility written in 1976 in Bell Labs.
## Usage
Using Make: [A minimal tutorial on make from kbroman](https://kbroman.org/minimal_make/)
Using Drake (Make for R):
https://github.com/ropensci/drake