Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation improvements #149

Merged
merged 11 commits into from
Feb 10, 2022
2 changes: 2 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ S3method(str,R7_union)
export("@")
export("method<-")
export("prop<-")
export("props<-")
export("r7_data<-")
export(R7_object)
export(as_class)
Expand All @@ -37,6 +38,7 @@ export(r7_data)
export(s3_class)
export(valid_eventually)
export(valid_implicitly)
export(validate)
importFrom(stats,setNames)
importFrom(utils,getFromNamespace)
importFrom(utils,getS3method)
Expand Down
4 changes: 4 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

## Jan 2020

* New `props<-` for setting multiple properties simultaneously and validating
afterwards (#149).
* Validation now happens recursively, and validates types before validating
the object (#149)
* Classes (base types, S3, S4, and R7) are handled consistently wherever they
are used. Strings now only refer to base types. New explicit `s3_class()` for
referring to S3 classes (#134).
Expand Down
18 changes: 14 additions & 4 deletions R/class.R
Original file line number Diff line number Diff line change
Expand Up @@ -50,10 +50,20 @@ is_class <- function(x) inherits(x, "R7_class")
#'
#' @param constructor The constructor function. This is optional, unless
#' you want to control which properties can be set on constructor.
#' @param validator A function used to determine whether or not an object
#' is valid. This is called automatically after construction, and
#' whenever any property is set. It should return `NULL` if the object is
#' valid, and otherwise return a character vector of problems.
#' @param validator A function taking a single argument, the object to validate.
#'
#' The job of a validator is to determine whether the object is valid,
#' i.e. if the current property values form an allowed combination. The
#' types of the properties are always automatically validator so the job of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

always automatically validator

typo here, validator->valid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, thanks.

#' the validator is to verify that the value of individual properties is
#' ok (i.e. maybe a property should have length 1, or should always be
#' positive), or that the combination of values of multiple properties is ok.
#' It is called after construction and whenever any property is set.
#'
#' The validator should return `NULL` if the object is valid. If not, it
#' should return a character vector where each element describes a single
#' problem. It's generally helpful to report as many problems at once
#' as possible.
#' @param properties A list specifying the properties (data) that
#' every object of the class will possess. Each property can either be
#' a named string (specifying the class), or a call to [new_property()],
Expand Down
1 change: 1 addition & 0 deletions R/data.R
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
#' and all properties.
#'
#' @inheritParams prop
#' @param value Object used to replace the underlying data.
#' @export
#' @examples
#' text <- new_class("text", parent = "character")
Expand Down
24 changes: 1 addition & 23 deletions R/object.R
Original file line number Diff line number Diff line change
Expand Up @@ -26,38 +26,16 @@ new_object <- function(.data = NULL, ...) {
}

if (!is.null(.data)) {
# TODO eliminate this special case
if (obj_cls@name %in% names(base_types)) {
if (!inherits(.data, obj_cls@name)) {
stop(sprintf(
"`.data` must be %s not %s",
class_desc(as_class(obj_cls@name)),
obj_desc(.data)
))
}
} else {
if (!class_inherits(.data, obj_cls@parent)) {
stop(sprintf(
"`.data` must be %s not %s",
class_desc(obj_cls@parent),
obj_desc(.data)
))
}
}

object <- .data
} else {
object <- obj_cls@parent@constructor()
}
attr(object, ".should_validate") <- FALSE

class(object) <- "R7_object"
object_class(object) <- obj_cls
for (nme in nms) {
prop(object, nme) <- args[[nme]]
prop(object, nme, check = FALSE) <- args[[nme]]
}

attr(object, ".should_validate") <- NULL
validate(object)

object
Expand Down
46 changes: 37 additions & 9 deletions R/property.R
Original file line number Diff line number Diff line change
Expand Up @@ -80,14 +80,16 @@ str.R7_property <- function(object, ..., nest.lev = 0) {
#' rather than throwing an error.
#' - `prop<-` and `@<-` set a new value for the given property.
#' - `props()` returns a list of all properties
#' - `props<-` sets multiple properties at once, validating after all are set
hadley marked this conversation as resolved.
Show resolved Hide resolved
#' - `prop_names()` returns the names of the properties
#' - `prop_exists(x, "prop")` returns `TRUE` iif `x` has property `prop`.
#'
#' @param object An object from a R7 class
#' @param name The name of the parameter as a character. Partial matching
#' is not performed.
#' @param value A replacement value for the parameter. The object is
#' automatically checked for validity after the replacement is done.
#' @param value For `prop<-`, a replacement value for the property;
#' for `props<-`, a named list of values. The object is automatically
#' checked for validity after the replacement is done.
#' @export
#' @examples
#' horse <- new_class("horse", properties = list(
Expand All @@ -108,7 +110,7 @@ prop <- function(object, name) {
if (!inherits(object, "R7_object")) {
hadley marked this conversation as resolved.
Show resolved Hide resolved
stop("`object` is not an <R7_object>")
} else if (!prop_exists(object, name)) {
stop(sprintf("Can't find property %s@%s", obj_desc(object), name))
stop(prop_error_unknown(object, name))
} else {
prop_val(object, name)
}
Expand Down Expand Up @@ -172,6 +174,19 @@ props <- function(object) {
}
}

#' @rdname prop
#' @export
`props<-` <- function(object, value) {
stopifnot(is.list(value))

for (name in names(value)) {
prop(object, name, check = FALSE) <- value[[name]]
}
validate(object)

object
}

#' @rdname prop
#' @export
prop_exists <- function(object, name) {
Expand All @@ -188,29 +203,42 @@ prop_exists <- function(object, name) {

function(object, name, check = TRUE, value) {
prop <- prop_obj(object, name)
if (is.null(prop)) {
stop(prop_error_unknown(object, name))
}

if (!is.null(prop$setter) && !identical(setter_property, name)) {
setter_property <<- name
on.exit(setter_property <<- NULL, add = TRUE)
object <- prop$setter(object, value)
} else {
if (isTRUE(check) && !class_inherits(value, prop$class)) {
stop(sprintf("%s@%s must be of class %s, not %s",
obj_desc(object), name,
class_desc(prop$class),
obj_desc(value)
), call. = FALSE)
stop(prop_error_type(object, name, prop$class, value), call. = FALSE)
Comment on lines 215 to +216
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you extract out a validate_property() from validate_properties() and then use that here? Then if we add more validation checks we don't have to keep these in sync

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that too. It's just pretty marginal since it's only used in two places and the logic is v simple.

}
attr(object, name) <- value
}

if (isTRUE(check)) {
validate(object)
validate(object, properties = FALSE)
}

invisible(object)
}
})

prop_error_unknown <- function(object, prop_name) {
sprintf("Can't find property %s@%s", obj_desc(object), prop_name)
}

prop_error_type <- function(object, prop_name, expected, actual) {
hadley marked this conversation as resolved.
Show resolved Hide resolved
sprintf("%s@%s must be of class %s, not %s",
obj_desc(object),
prop_name,
class_desc(expected),
obj_desc(actual)
)
}

#' @rdname prop
#' @usage object@name
#' @export
Expand Down
118 changes: 99 additions & 19 deletions R/valid.R
Original file line number Diff line number Diff line change
@@ -1,41 +1,118 @@
#' Validation of R7 objects
#' Validate an R7 object
#'
#' [validate()] calls the validation of an R7 object. This is done
#' automatically when creating new objects (at the end of [new_object]) and
#' when setting any property.
#' @description
#' `validate()` ensures that an R7 object is valid by calling the `validator`
#' provided in [new_class()]. This is done automatically when constructing new
#' objects and when modifying properties.
#'
#' [valid_eventually()] disables validation of properties, runs a function on
#' the object, then validates the object.
#' `valid_eventually()` disables validation, modifies the object, then
#' revalidates. This is useful when a sequence of operations would otherwise
#' lead an object to be temporarily invalid, or when repeated property
#' modification causes a performance bottleneck because the validator is
#' relatively expensive.
#'
#' [valid_implicitly()] does the same but does not validate the object at the end.
#'
#' [valid_implicitly()] should only be used rarely in performance critical code
#' where you are certain a sequence of operations cannot produce an invalid
#' object.
#' `valid_implicitly()` does the same but does not validate the object at the
#' end. It should only be used rarely, and in performance critical code where
#' you are certain a sequence of operations cannot produce an invalid object.
#' @param object An R7 object
#' @param fun A function to call on the object before validation.
validate <- function(object) {
#' @param properties If `TRUE`, the default, checks property types before
#' executing the validator.
#' @export
#' @examples
#' # A range class might validate that the start is less than the end
#' Range <- new_class("Range",
#' properties = list(start = "double", end = "double"),
#' validator = function(object) {
#' if (object@start >= object@end) "start must be smaller than end"
#' }
#' )
#' # You can't construct an invalid object:
#' try(Range(1, 1))
#'
#' # And you can't create an invalid object with @<-
#' r <- Range(1, 2)
#' try(r@end <- 1)
#'
#' # But what if you want to move a range to the right?
#' rightwards <- function(r, x) {
#' r@start <- r@start + x
#' r@end <- r@end + x
#' r
#' }
#' # This function doesn't work because it creates a temporarily invalid state
#' try(rightwards(r, 10))
#'
#' # This is the perfect use case for valid_eventually():
#' rightwards <- function(r, x) {
#' valid_eventually(r, function(object) {
#' object@start <- object@start + x
#' object@end <- object@end + x
#' object
#' })
#' }
#' rightwards(r, 10)
#'
#' # Alternatively, you can set multiple properties at once using props<-
#' rightwards <- function(r, x) {
hadley marked this conversation as resolved.
Show resolved Hide resolved
#' props(r) <- list(start = r@start + x, end = r@end + x)
#' r
#' }
#' rightwards(r, 20)
validate <- function(object, properties = TRUE) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were designing this function, this is where I'd put ... to force named arguments, because validate(x, FALSE) is essentially meaningless to me without the argument name.

Plus this feels like a function which has a decent chance that we will add more arguments to it, and that would free us from caring about argument order.

Not sure how we feel about this principle for R7 though, since it will be tied closely to base R

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agreed with both points. But since this is going in base R (which doesn't use that pattern), and properties is mostly their for internal use, I'll leave as is.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an unneeded ... to the argument list also comes with a price. I would never want to pay that just to force users..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit formal arguments also improve code clarity. Users are free to follow whatever calling convention they like.

if (!is.null(attr(object, ".should_validate"))) {
return(invisible(object))
}

obj_class <- object_class(object)
class <- object_class(object)

validator <- prop_safely(obj_class, "validator")
if (is.null(validator)) {
return(invisible(object))
# First, check property types - if these are incorrect, the validator
# is likely to return spurious errors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the errors in the object validator are likely to be spurious, should we just go ahead and error immediately if any properties are invalid? Maybe it could have a header of Invalid <obj> properties:?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see you've tried to do something like this by avoiding the object validator with the length(errors) == 0 branch. Hmm, I think I still prefer an immediate error - but only because I would mention in the header that it was the properties that were invalid (I don't feel strongly about this if you want to keep it as is)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good idea.

if (properties) {
errors <- validate_properties(object, class)
if (length(errors) > 0) {
bullets <- paste0("- ", errors, collapse = "\n")
msg <- sprintf("%s object properties are invalid:\n%s", obj_desc(object), bullets)
stop(msg, call. = FALSE)
}
}

errors <- validator(object)
# Next, recursively validate the object
errors <- character()
while(!is.null(class) && is_class(class)) {
errors <- c(errors, class@validator(object))
class <- prop_safely(class, "parent")
}

# If needed, report errors
if (length(errors) > 0) {
msg <- sprintf("Invalid <%s> object:\n%s", obj_class@name, paste0("- ", errors, collapse = "\n"))
bullets <- paste0("- ", errors, collapse = "\n")
msg <- sprintf("%s object is invalid:\n%s", obj_desc(object), bullets)
stop(msg, call. = FALSE)
}

invisible(object)
}

validate_properties <- function(object, class) {
errors <- character()

for (prop in class@properties) {
# Only validate static properties
if (!is.null(prop$getter) || !is.null(prop$setter)) {
next
}

value <- prop(object, prop$name)
if (!class_inherits(value, prop$class)) {
errors <- c(errors, prop_error_type(object, prop$name, prop$class, value))
}
prop
}

errors
}

#' @rdname validate
#' @export
valid_eventually <- function(object, fun) {
Expand All @@ -44,6 +121,8 @@ valid_eventually <- function(object, fun) {
out <- fun(object)
attr(out, ".should_validate") <- old
validate(out)

out
hadley marked this conversation as resolved.
Show resolved Hide resolved
}

#' @rdname validate
Expand All @@ -53,5 +132,6 @@ valid_implicitly <- function(object, fun) {
attr(object, ".should_validate") <- FALSE
out <- fun(object)
attr(out, ".should_validate") <- old
invisible(out)

out
}
13 changes: 11 additions & 2 deletions R/zzz.R
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,16 @@ R7_object <- new_class(
)

new_base_class <- function(name) {
R7_class(name = name, constructor = function(.data) new_object(.data))
new_class(
name = name,
constructor = function(.data) new_object(.data),
validator = function(object) {
data <- unclass(object)
if (!name %in% .class2(data)) {
sprintf("Underlying data must be <%s> not %s", name, obj_desc(data))
}
}
)
}

# Define simple base types with constructors. See .onLoad() for more
Expand Down Expand Up @@ -47,7 +56,7 @@ R7_generic <- new_class(

R7_method <- new_class(
name = "R7_method",
properties = list(generic = R7_generic, signature = "list", fun = "function"),
properties = list(generic = R7_generic, signature = "list"),
parent = "function",
constructor = function(generic, signature, fun) {
if (is.character(signature)) {
Expand Down
Loading