Addition of function force_sorted() #123

Merged: 11 commits, Mar 21, 2024
Changes from 10 commits
1 change: 1 addition & 0 deletions .gitignore
@@ -2,3 +2,4 @@ docs
local_data
*.o
*.so
*.dll
1 change: 1 addition & 0 deletions NAMESPACE
@@ -21,6 +21,7 @@ export(estimateBaselineConvexHull)
export(estimateBaselineMedian)
export(estimateBaselineSnip)
export(estimateBaselineTopHat)
export(force_sorted)
export(formatRt)
export(getImputeMargin)
export(gnps)
5 changes: 5 additions & 0 deletions NEWS.md
@@ -1,5 +1,10 @@
# MsCoreUtils 1.15

## MsCoreUtils 1.15.5

- Add function `force_sorted()` to adjust a numeric vector to ensure
increasing/sorted values.

## MsCoreUtils 1.15.4

- Fix partial argument match (see issue #125).
72 changes: 72 additions & 0 deletions R/force_sorted.R
@@ -0,0 +1,72 @@
#' @title Forcing a numeric vector into a monotonously increasing sequence.
#'
#' @description
#' This function performs interpolation on the non-increasing parts of a
#' numeric input vector to ensure its values are monotonously increasing.
#' If the values are non-increasing at the end of the vector, these values will
#' be replaced by a sequence of numeric values, starting from the last
#' increasing value in the input vector, and increasing by a very small value,
#' which can be defined with parameter `by`.
#'
#' @param x `numeric` vector.
#'
#' @param by `numeric(1)` value that will determine the monotonous increase in
#' case the values at the end of the vector are non-increasing and
#' therefore interpolation would not be possible. Defaults
#' to `by = .Machine$double.eps` which is the smallest positive
#' floating-point number x such that 1 + x != 1.
#'
#' @return A vector with continuously increasing values.
#'
#' @note
#' NA values will not be replaced and be returned as-is.
#'
#' @examples
#' x <- c(NA, NA, NA, 1.2, 1.1, 1.14, 1.2, 1.3, NA, 1.04, 1.4, 1.6, NA, NA)
#' sorted_vec <- force_sorted(x)
#' is.unsorted(x, na.rm = TRUE)
#'
#' ## Vector non increasing at the end
#' x <- c(1, 2, 1.5, 2)
#' sorted_rtime <- force_sorted(x, by = 0.1)
#' is.unsorted(x, na.rm = TRUE)
#'
#' ## We can see the values were not interpolated but rather replaced by the
#' ## last increasing value `2` and increasing by 0.1.
#' sorted_vec
Review comment (Member): I think this should be sorted_rtime. Also could you use another name, like x2 or y, to keep it general. Here, I guess rtime refers to retention time, but this isn't mentioned in the man page (and there's no need to), so it is a bit of an obscure name.

#'
#' @export
#'
#' @rdname force_sorted
force_sorted <- function(x, by = .Machine$double.eps) {
# Select only the non-NA values
if (!is.numeric(x) && !is.integer(x))
stop("'x' needs to be numeric or integer")
nna_idx <- which(!is.na(x))
vec_temp <- x[nna_idx]

while (any(diff(vec_temp) < 0)) {
idx <- which.max(diff(vec_temp) < 0)
# Find next biggest value
next_idx <- which(vec_temp > vec_temp[idx])[1L]

if (is.na(next_idx)) {
l <- idx:length(vec_temp)
vec_temp[l] <- seq(vec_temp[idx], by = by,
length.out = length(l))
warning("Found decreasing values at the end of the vector. ",
"Interpolation is not possible in this region. Instead, ",
"replacing these values with a sequence that starts from ",
"the last increasing value and increments by ", by,
Review comment (Member): There are two spaces after by.

". See help for more details")
break
lgatto marked this conversation as resolved.
}
# Interpolation
idx_range <- idx:next_idx
vec_temp[idx_range] <- seq(vec_temp[idx], vec_temp[next_idx],
length.out = length(idx_range))
}
x[nna_idx] <- vec_temp
x
}
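
As a rough illustration of the loop above, the following sketch uses a condensed copy of `force_sorted()` from this diff (the warning text is shortened, and the copy exists only so the snippet runs without MsCoreUtils installed). It then applies the function to a shortened version of the first test vector. The variable name `res` is hypothetical.

```r
# Condensed copy of force_sorted() as shown in this diff.
force_sorted <- function(x, by = .Machine$double.eps) {
    if (!is.numeric(x) && !is.integer(x))
        stop("'x' needs to be numeric or integer")
    # Work only on the non-NA values; NA positions are restored at the end
    nna_idx <- which(!is.na(x))
    vec_temp <- x[nna_idx]
    while (any(diff(vec_temp) < 0)) {
        # Position just before the first decrease
        idx <- which.max(diff(vec_temp) < 0)
        # First value strictly larger than the value at idx
        next_idx <- which(vec_temp > vec_temp[idx])[1L]
        if (is.na(next_idx)) {
            # No larger value follows: fall back to a fixed-step sequence
            l <- idx:length(vec_temp)
            vec_temp[l] <- seq(vec_temp[idx], by = by,
                               length.out = length(l))
            warning("Found decreasing values at the end of the vector.")
            break
        }
        # Linear interpolation between the two anchor values
        idx_range <- idx:next_idx
        vec_temp[idx_range] <- seq(vec_temp[idx], vec_temp[next_idx],
                                   length.out = length(idx_range))
    }
    x[nna_idx] <- vec_temp
    x
}

x <- c(NA, 1.2, 1.1, 1.14, 1.2, 1.3, 1.1, 1.04, 1.4, 1.6, NA)
res <- force_sorted(x)

# Check the *result*, not the input: the input is unsorted, the result is not
is.unsorted(x, na.rm = TRUE)
is.unsorted(res, na.rm = TRUE)

# NA positions are left untouched
identical(is.na(res), is.na(x))
```

The loop runs twice here: the first pass interpolates between 1.2 and 1.3, and the second between 1.3 and 1.4. No warning is raised because a larger value (1.6) still follows the last decreasing stretch.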

46 changes: 46 additions & 0 deletions man/force_sorted.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/gnps.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

20 changes: 20 additions & 0 deletions tests/testthat/test_force_sorted.R
@@ -0,0 +1,20 @@
test_that("forceSorting works", {
vec <- c(NA, NA, NA, 1.2, 1.1, 1.14, 1.2, 1.3, 1.1, 1.04, 1.4, 1.6, NA, NA)
# Expected result after interpolation
sorted <- c(NA, NA, NA, 1.2, 1.225, 1.25, 1.275, 1.3, 1.333, 1.367,
1.4, 1.6, NA, NA)
result <- force_sorted(vec)
expect_equal(result, sorted, tolerance = 0.001)

# Test with decreasing values at the end
vec <- c(NA, NA, NA, 1.2, 1.1, 1.14, 1.2, 1.3, 1.4, 1.04, 1.2, 1.04, NA)
expect_warning(result <- force_sorted(vec, by = 0.000001), "replacing")
sorted <- c(NA, NA, NA, 1.2, 1.225, 1.25, 1.275, 1.3, 1.4, 1.400001,
1.400002, 1.400003, NA)
expect_equal(result, sorted)

# Test with sorted values
vec <- c(NA, NA, NA, 1.2, 1.3, 1.42, 1.46, 1.49, 1.498, 1.5, 1.6, 1.66, NA)
result <- force_sorted(vec)
expect_equal(vec, result)
})
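
The second test case passes `by = 0.000001` instead of relying on the default `by = .Machine$double.eps`. One possible reason to prefer an explicit step, offered as an observation about floating-point arithmetic and not as a statement of the authors' intent, is that `.Machine$double.eps` is the spacing of doubles just above 1, so adding it to larger values can be rounded away. The sketch below uses only base R; the start values 1.4, 2 and 100 are arbitrary illustrations.

```r
eps <- .Machine$double.eps

# Between 1 and 2 the default step is resolvable: every value is distinct
d_small <- diff(seq(1.4, by = eps, length.out = 4))
all(d_small > 0)

# At 2 the spacing between doubles is 2 * eps, so some steps collapse to ties
d_two <- diff(seq(2, by = eps, length.out = 4))
any(d_two == 0)

# At 100 the spacing is far larger than eps, so no step registers at all
d_large <- diff(seq(100, by = eps, length.out = 4))
all(d_large == 0)
```

Ties still pass `is.unsorted()` with its default `strictly = FALSE`, so the replaced tail remains sorted; it is just not strictly increasing for larger magnitudes unless a larger `by` is supplied.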