From d722f50b4d537de564c4258ed341f5b84467278e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dragos=20Moldovan-Gr=C3=BCnfeld?= <dragos.mold@gmail.com>
Date: Sat, 27 Nov 2021 15:41:54 +0000
Subject: [PATCH] ARROW-13886 [R] Expand documentation for decimal()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

First attempt at this. Will be followed by:
* opening Jira tickets to:
    * deprecate `decimal()`
    * implement `decimal128()` and `decimal256()`
* expanding unit tests for both data types and `Arrays`

Tickets generated:
* implement `decimal128()` - https://issues.apache.org/jira/browse/ARROW-14843
* implement `decimal256()` - https://issues.apache.org/jira/browse/ARROW-14844
* improve messaging around `Decimal128Type` & `Decimal256Type` in the C++ code - https://issues.apache.org/jira/browse/ARROW-14842

Closes #11758 from dragosmg/ARROW-13886_decimal_docs

Authored-by: Dragos Moldovan-Grünfeld <dragos.mold@gmail.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
---
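Reviewer note (not part of the commit): the precision/scale wording added
below can be sanity-checked with a small language-neutral model. This is only
a sketch in Python's standard library, not arrow code; the helper name
`unscaled_integer` is hypothetical, and it assumes a value is representable
exactly when value * 10^scale is an integer with at most `precision` digits.

```python
from decimal import Decimal
from typing import Optional


def unscaled_integer(value: str, precision: int, scale: int) -> Optional[int]:
    """Model of decimal(precision, scale): return the integer that would
    encode `value`, or None if `value` is not exactly representable.

    Hypothetical helper for illustration; it is not part of arrow.
    """
    # Shift the decimal point by `scale` places (exact, no binary floats).
    shifted = Decimal(value).scaleb(scale)
    if shifted != shifted.to_integral_value():
        return None  # needs more digits after the point than `scale` allows
    unscaled = int(shifted)
    if len(str(abs(unscaled))) > precision:
        return None  # needs more significant digits than `precision` allows
    return unscaled


# The examples from the documentation text in this patch:
print(unscaled_integer("1234.567", 7, 3))    # 1234567
print(unscaled_integer("-1234.567", 7, 3))   # -1234567
print(unscaled_integer("12345.67", 7, 3))    # None
print(unscaled_integer("12345000", 5, -3))   # 12345
print(unscaled_integer("1234500", 5, -3))    # None
```

Values are passed as strings so that the check is done on exact decimal
digits rather than on binary floating-point approximations.
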
 r/R/type.R         | 23 +++++++++++++++++++++--
 r/man/data-type.Rd | 23 +++++++++++++++++++++--
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/r/R/type.R b/r/R/type.R
index afa9a094af15f..ac3dcf3e95f84 100644
--- a/r/R/type.R
+++ b/r/R/type.R
@@ -181,14 +181,33 @@ NestedType <- R6Class("NestedType", inherit = DataType)
 #' `bit64::integer64` object) by setting `options(arrow.int64_downcast =
 #' FALSE)`.
 #'
+#' `decimal()` creates a `decimal128` type. Arrow decimals are fixed-point
+#' decimal numbers encoded as a scalar integer. The `precision` is the number of
+#' significant digits that the decimal type can represent; the `scale` is the
+#' number of digits after the decimal point. For example, the number 1234.567
+#' has a precision of 7 and a scale of 3. Note that `scale` can be negative.
+#'
+#' As an example, `decimal(7, 3)` can exactly represent the numbers 1234.567 and
+#' -1234.567 (encoded internally as the 128-bit integers 1234567 and -1234567,
+#' respectively), but neither 12345.67 nor 123.4567.
+#'
+#' `decimal(5, -3)` can exactly represent the number 12345000 (encoded
+#' internally as the 128-bit integer 12345), but neither 123450000 nor 1234500.
+#' The `scale` can be thought of as an argument that controls rounding. When
+#' negative, `scale` moves the decimal point to the right: the stored integer
+#' is multiplied by a power of 10 (by 1000 for a `scale` of -3).
+#'
 #' @param unit For time/timestamp types, the time unit. `time32()` can take
 #' either "s" or "ms", while `time64()` can be "us" or "ns". `timestamp()` can
 #' take any of those four values.
 #' @param timezone For `timestamp()`, an optional time zone string.
 #' @param byte_width byte width for `FixedSizeBinary` type.
 #' @param list_size list size for `FixedSizeList` type.
-#' @param precision For `decimal()`, precision
-#' @param scale For `decimal()`, scale
+#' @param precision For `decimal()`, the number of significant digits
+#'    the Arrow `decimal` type can represent. The maximum precision for
+#'    `decimal()` is 38 significant digits.
+#' @param scale For `decimal()`, the number of digits after the decimal
+#'    point. It can be negative.
 #' @param type For `list_of()`, a data type to make a list-of-type
 #' @param ... For `struct()`, a named list of types to define the struct columns
 #'
diff --git a/r/man/data-type.Rd b/r/man/data-type.Rd
index a063189757334..2b2313571b26f 100644
--- a/r/man/data-type.Rd
+++ b/r/man/data-type.Rd
@@ -110,9 +110,12 @@ take any of those four values.}
 
 \item{timezone}{For \code{timestamp()}, an optional time zone string.}
 
-\item{precision}{For \code{decimal()}, precision}
+\item{precision}{For \code{decimal()}, the number of significant digits
+the Arrow \code{decimal} type can represent. The maximum precision for
+\code{decimal()} is 38 significant digits.}
 
-\item{scale}{For \code{decimal()}, scale}
+\item{scale}{For \code{decimal()}, the number of digits after the decimal
+point. It can be negative.}
 
 \item{...}{For \code{struct()}, a named list of types to define the struct columns}
 
@@ -149,6 +152,22 @@ are translated to R objects, \code{uint32} and \code{uint64} are converted to \c
 ("numeric") and \code{int64} is converted to \code{bit64::integer64}. For \code{int64}
 types, this conversion can be disabled (so that \code{int64} always yields a
 \code{bit64::integer64} object) by setting \code{options(arrow.int64_downcast = FALSE)}.
+
+\code{decimal()} creates a \code{decimal128} type. Arrow decimals are fixed-point
+decimal numbers encoded as a scalar integer. The \code{precision} is the number of
+significant digits that the decimal type can represent; the \code{scale} is the
+number of digits after the decimal point. For example, the number 1234.567
+has a precision of 7 and a scale of 3. Note that \code{scale} can be negative.
+
+As an example, \code{decimal(7, 3)} can exactly represent the numbers 1234.567 and
+-1234.567 (encoded internally as the 128-bit integers 1234567 and -1234567,
+respectively), but neither 12345.67 nor 123.4567.
+
+\code{decimal(5, -3)} can exactly represent the number 12345000 (encoded
+internally as the 128-bit integer 12345), but neither 123450000 nor 1234500.
+The \code{scale} can be thought of as an argument that controls rounding. When
+negative, \code{scale} moves the decimal point to the right: the stored integer
+is multiplied by a power of 10 (by 1000 for a \code{scale} of -3).
 }
 \examples{
 \dontshow{if (arrow_available()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}