Return messages for each vetting condition failed #85

Open
franknarf1 opened this issue Nov 23, 2017 · 1 comment

@franknarf1
Contributor

franknarf1 commented Nov 23, 2017

I am vetting a vector and am interested in seeing a "full report" when it fails:

library(vetr)
library(data.table)

date8_toIDate = function(x) as.IDate(as.character(x), format = "%Y%m%d")

DATE8 = vet_token( INT && !is.na(date8_toIDate(.)) )

x = c(20010102L, 20010101L, 20010100L)
y = replace(x, 2, NA)

vet(DATE8, x)
# [1] "`!is.na(date8_toIDate(x))` is not all TRUE (contains non-TRUE values)"
vet(DATE8, y)
# [1] "`y` should not contain NAs, but does"

So the format is an 8-digit number representing a date. For the vector y, I want to see both that it has NAs and also that its non-NAs fail my conversion test, like...

multivet(DATE8, y)
# [1] "`y` should not contain NAs, but does"    
# [1] "`!is.na(date8_toIDate(y))` is not all TRUE (contains non-TRUE values)"

Background. You could argue that I can fix y's NAs; rerun; and then catch the other condition. The problem is that I am not passing these interactively. Instead, someone else has a non-R script for pulling from various databases and manipulating data in an attempt to meet my documented vetting conditions. Running their input-generating program is time-consuming, so I'd like them to get a full report of input problems whenever any are present so they can fix them all at once.

This may be outside of what you had in mind for the package, but it looks somewhat close to the already-included functionality.

EDIT: Thinking about this more, the behavior I'm asking for may be ill-defined. I guess the conditions in a token would need to be traversed in a fixed order. So in AA && BB && CC, if AA fails on some elements, BB would be examined only on the elements where AA passed; and if BB fails on some of those, CC would again be tested only on the elements where BB passed.
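
The traversal rule in the edit above can be sketched in base R, independently of vetr. This is only an illustration of the idea, not something vetr provides: `check_sequentially` and the condition names are hypothetical, and base `as.Date` stands in for `data.table::as.IDate` so the snippet has no dependencies.

```r
# Hypothetical helper, not part of vetr: apply element-wise conditions in
# order, testing each one only on the elements that passed all earlier ones.
check_sequentially <- function(x, conds) {
  passing <- rep(TRUE, length(x))   # elements that passed every condition so far
  failures <- list()
  for (nm in names(conds)) {
    ok <- rep(TRUE, length(x))
    ok[passing] <- conds[[nm]](x[passing])  # skip elements that already failed
    ok[is.na(ok)] <- FALSE                  # treat an NA result as a failure
    if (!all(ok)) failures[[nm]] <- which(!ok)
    passing <- passing & ok
  }
  failures  # named list: condition name -> positions that failed it
}

# Assumed stand-ins for the two parts of the DATE8 token
conds <- list(
  not_na     = function(v) !is.na(v),
  valid_date = function(v) !is.na(as.Date(as.character(v), format = "%Y%m%d"))
)

y <- c(20010102L, NA, 20010100L)
res <- check_sequentially(y, conds)
```

Here the NA at position 2 is reported under the first condition only, and the date check runs on the remaining two elements, so position 3 is reported separately instead of being hidden by the earlier failure.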

@brodieG
Owner

brodieG commented Nov 23, 2017

Right now vetr bails out as soon as it determines that an object cannot possibly meet the vetting token, which for && combinations can be as early as the first failing token. This is partly for speed and partly for implementation simplicity. I can consider a mode that evaluates all tokens and tracks all failures, but it will probably be a long time before I get to it.

A sub-optimal workaround would be to make a separate vet call for each token, e.g.:

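# Vet each token on its own; every call returns TRUE or a failure message
vet.res <- list(
  vet(INT, y, stop=FALSE),
  vet(!is.na(date8_toIDate(.)), y, stop=FALSE)
)
# Flag the calls that did not return TRUE
vet.fail <- !vapply(vet.res, isTRUE, logical(1))
# Return all failure messages, or TRUE if every token passed
if(any(vet.fail)) vet.res[vet.fail] else TRUE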

Am I understanding what you are after correctly?
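
For reuse across several inputs, the last two lines of the workaround could be folded into a small helper. The sketch below is an assumption for illustration, not part of vetr: `collect_failures` is a hypothetical name, and it relies only on each result being either TRUE or a character vector of messages, as in the `vet` output shown earlier in the thread. The mock results stand in for real `vet` calls so the snippet runs without vetr installed.

```r
# Hypothetical helper: reduce a list of vet() results (each TRUE or a
# character vector of messages) to TRUE or a flat vector of all messages.
collect_failures <- function(results) {
  failed <- !vapply(results, isTRUE, logical(1))
  if (any(failed)) unlist(results[failed], use.names = FALSE) else TRUE
}

# Stand-in results; in practice these would come from individual vet() calls
mock <- list(
  "`y` should not contain NAs, but does",
  TRUE
)
report <- collect_failures(mock)
```

With vetr loaded, the intended usage would be along the lines of `collect_failures(list(vet(INT, y), vet(!is.na(date8_toIDate(.)), y)))`. Note that this evaluates every token on the full vector, so it does not implement the "only where the previous condition passed" rule from the edit above.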
