[ideas] Support targets writing directly to the store #1281
If I understand correctly, I think you may want a custom target storage format. A custom format would allow you to deal with objects in the target store via custom read/write methods. The target command itself would be the input path or object, without a system call/subprocess, the |
I think this is a great idea. I have personally felt this pain quite keenly when constructing pipelines that call external programs that make use of file input/output (as is quite common in bioinformatics pipelines). If we could abstract away file paths, that would be awesome. I think a concrete example may help illustrate: https://github.com/joelnitta/targets_vcf_example/blob/main/_targets.R As you can see, I depend heavily on |
Description
There are a number of cases where it makes sense to work with files on disk in Targets. For example, you might want to run some command line tool that uses file paths as input and output. Targets somewhat supports this via `format = "file"`, but this doesn't feel particularly idiomatic because: … _store and a user defined path where we put our files

To resolve these problems, I think it would be great if there was an officially sanctioned way to generate a target by writing directly to the store. For example, let's say that `tar_output_path()` returned the path to the current target, and also marked the result of the current target as irrelevant (the return value from `system2` here doesn't need to be saved):

This example is similar to the following, where we explicitly capture the stdout as a string and let targets save it to the store.
However, letting our subprocess write directly to the filesystem will be vastly more performant and memory efficient because everything can be streamed:
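A minimal sketch of the two approaches contrasted above, assuming a Unix-like system with `echo` on the `PATH`. Note that `tar_output_path()` is only the function proposed in this post and does not exist in targets; the stand-in defined below is hypothetical and simply returns a temporary file path so the snippet can run outside a pipeline.

```r
# Hypothetical stand-in for the proposed tar_output_path(). targets does not
# provide this function; here it just hands back a temporary file path.
tar_output_path <- function() tempfile(fileext = ".txt")

# Proposed style: the subprocess streams its stdout straight into the file,
# so the output never has to be held in R's memory.
out_path <- tar_output_path()
status <- system2("echo", "hello", stdout = out_path)

# Capture style: stdout = TRUE collects the output as a character vector,
# which the target would return for targets to serialise into the store.
captured <- system2("echo", "hello", stdout = TRUE)
```

Both calls produce the same content; the difference is that the first never materialises the output as an R object, which is what matters for large outputs.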
I'm aware of the existence of `tar_path_target`, but it's not clear to me what will happen if I write to this file inside the current target. I suspect the return value of the target will be saved to this file afterwards, overwriting my output. Therefore, I think we still need a new function for this purpose.
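For comparison, the existing `format = "file"` route mentioned in the description might look roughly like the sketch below. It assumes the targets package is installed and a Unix-like system with `echo` available; the target name `greeting_file` and the `output/greeting.txt` location are made up for illustration.

```r
# _targets.R
library(targets)

pipeline <- list(
  tar_target(
    greeting_file,
    {
      # Hypothetical user-chosen location, outside the targets store.
      path <- file.path("output", "greeting.txt")
      dir.create(dirname(path), showWarnings = FALSE, recursive = TRUE)
      # The external tool writes the file itself.
      system2("echo", "hello", stdout = path)
      # A format = "file" target must return the path(s) it wrote,
      # so that targets can track the file for changes.
      path
    },
    format = "file"
  )
)

# The last value of _targets.R must be the list of targets.
pipeline
```

Here the command has to choose the path, create the directory, and return the path by hand, which is the bookkeeping that a function like the proposed `tar_output_path()` would take over.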