From d722f50b4d537de564c4258ed341f5b84467278e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dragos=20Moldovan-Gr=C3=BCnfeld?=
Date: Sat, 27 Nov 2021 15:41:54 +0000
Subject: [PATCH] ARROW-13886 [R] Expand documentation for decimal()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

First attempt at this. Will be followed by:
* opening Jira tickets to:
  * deprecate `decimal()`
  * implement `decimal128()` and `decimal256()`
* expand unit tests both for data types and `Arrays`

Tickets generated:
* implement `decimal128()` - https://issues.apache.org/jira/browse/ARROW-14843
* implement `decimal256()` - https://issues.apache.org/jira/browse/ARROW-14844
* improve messaging around `Decimal128Type` & `Decimal256Type` in the C++ code - https://issues.apache.org/jira/browse/ARROW-14842

Closes #11758 from dragosmg/ARROW-13886_decimal_docs

Authored-by: Dragos Moldovan-Grünfeld
Signed-off-by: Nic Crane
---
 r/R/type.R         | 23 +++++++++++++++++++++--
 r/man/data-type.Rd | 23 +++++++++++++++++++++--
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/r/R/type.R b/r/R/type.R
index afa9a094af15f..ac3dcf3e95f84 100644
--- a/r/R/type.R
+++ b/r/R/type.R
@@ -181,14 +181,33 @@ NestedType <- R6Class("NestedType", inherit = DataType)
 #' `bit64::integer64` object) by setting `options(arrow.int64_downcast =
 #' FALSE)`.
 #'
+#' `decimal()` creates a `decimal128` type. Arrow decimals are fixed-point
+#' decimal numbers encoded as a scalar integer. The `precision` is the number of
+#' significant digits that the decimal type can represent; the `scale` is the
+#' number of digits after the decimal point. For example, the number 1234.567
+#' has a precision of 7 and a scale of 3. Note that `scale` can be negative.
+#'
+#' As an example, `decimal(7, 3)` can exactly represent the numbers 1234.567 and
+#' -1234.567 (encoded internally as the 128-bit integers 1234567 and -1234567,
+#' respectively), but neither 12345.67 nor 123.4567.
+#'
+#' `decimal(5, -3)` can exactly represent the number 12345000 (encoded
+#' internally as the 128-bit integer 12345), but neither 123450000 nor 1234500.
+#' The `scale` can be thought of as an argument that controls rounding. When
+#' negative, `scale` causes the number to be expressed using scientific notation
+#' and power of 10.
+#'
 #' @param unit For time/timestamp types, the time unit. `time32()` can take
 #' either "s" or "ms", while `time64()` can be "us" or "ns". `timestamp()` can
 #' take any of those four values.
 #' @param timezone For `timestamp()`, an optional time zone string.
 #' @param byte_width byte width for `FixedSizeBinary` type.
 #' @param list_size list size for `FixedSizeList` type.
-#' @param precision For `decimal()`, precision
-#' @param scale For `decimal()`, scale
+#' @param precision For `decimal()`, the number of significant digits
+#' the arrow `decimal` type can represent. The maximum precision for
+#' `decimal()` is 38 significant digits.
+#' @param scale For `decimal()`, the number of digits after the decimal
+#' point. It can be negative.
 #' @param type For `list_of()`, a data type to make a list-of-type
 #' @param ... For `struct()`, a named list of types to define the struct columns
 #'
diff --git a/r/man/data-type.Rd b/r/man/data-type.Rd
index a063189757334..2b2313571b26f 100644
--- a/r/man/data-type.Rd
+++ b/r/man/data-type.Rd
@@ -110,9 +110,12 @@ take any of those four values.}
 
 \item{timezone}{For \code{timestamp()}, an optional time zone string.}
 
-\item{precision}{For \code{decimal()}, precision}
+\item{precision}{For \code{decimal()}, the number of significant digits
+the arrow \code{decimal} type can represent. The maximum precision for
+\code{decimal()} is 38 significant digits.}
 
-\item{scale}{For \code{decimal()}, scale}
+\item{scale}{For \code{decimal()}, the number of digits after the decimal
+point. It can be negative.}
 
 \item{...}{For \code{struct()}, a named list of types to define the struct columns}
 
@@ -149,6 +152,22 @@ are translated to R objects, \code{uint32} and \code{uint64} are converted to \c
 ("numeric") and \code{int64} is converted to \code{bit64::integer64}. For \code{int64} types,
 this conversion can be disabled (so that \code{int64} always yields a \code{bit64::integer64} object) by
 setting \code{options(arrow.int64_downcast = FALSE)}.
+
+\code{decimal()} creates a \code{decimal128} type. Arrow decimals are fixed-point
+decimal numbers encoded as a scalar integer. The \code{precision} is the number of
+significant digits that the decimal type can represent; the \code{scale} is the
+number of digits after the decimal point. For example, the number 1234.567
+has a precision of 7 and a scale of 3. Note that \code{scale} can be negative.
+
+As an example, \code{decimal(7, 3)} can exactly represent the numbers 1234.567 and
+-1234.567 (encoded internally as the 128-bit integers 1234567 and -1234567,
+respectively), but neither 12345.67 nor 123.4567.
+
+\code{decimal(5, -3)} can exactly represent the number 12345000 (encoded
+internally as the 128-bit integer 12345), but neither 123450000 nor 1234500.
+The \code{scale} can be thought of as an argument that controls rounding. When
+negative, \code{scale} causes the number to be expressed using scientific notation
+and power of 10.
 }
 \examples{
 \dontshow{if (arrow_available()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}