Try to make time series automatically stationary #417

Closed
spsanderson opened this issue Jan 28, 2023 · 1 comment
Labels
enhancement New feature or request function

Comments

@spsanderson

Steven Paul Sanderson II, MPH, it would be totally cool if I could upload my CSV into R and read it in as df, where column A is the date, column B is the target, and columns C to ZZ are potential features. Then I run a function over columns B to ZZ, and it tells me which features are non-stationary. Then I call another function or loop to make the identified non-stationary columns stationary, and the output is df.stationary.
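The requested workflow could look roughly like the sketch below. It is only an illustration under stated assumptions: `unit_root_p_value()` and `stationarize_columns()` are hypothetical names, not healthyR.ts functions; the data frame is simulated rather than read from a CSV; and base R's `stats::PP.test()` (Phillips-Perron) stands in for the ADF test so the example runs without extra packages. Non-stationary columns are simply differenced once.

```r
# Hypothetical helper: p-value of a unit-root test for one numeric vector.
# stats::PP.test() is a base-R stand-in; to use the ADF test instead, pass
# p_fun = function(x) healthyR.ts::ts_adf_test(x)$p_value below.
unit_root_p_value <- function(x) {
  stats::PP.test(x)$p.value
}

# Hypothetical helper: flag columns where the test fails to reject a unit
# root, then replace each flagged column with its first difference.
stationarize_columns <- function(df, cols, p_fun = unit_root_p_value,
                                 alpha = 0.05) {
  p_values <- vapply(df[cols], p_fun, numeric(1))
  non_stationary <- cols[p_values >= alpha]

  out <- df
  for (col in non_stationary) {
    # diff() drops one observation, so pad with NA to keep rows aligned
    out[[col]] <- c(NA, diff(df[[col]]))
  }

  list(
    df_stationary = out,
    non_stationary = non_stationary,
    p_values = p_values
  )
}

# Simulated data in place of a real CSV (date, target, features)
set.seed(123)
n <- 200
df <- data.frame(
  date = seq(as.Date("2020-01-01"), by = "day", length.out = n),
  target = cumsum(rnorm(n)),    # random walk
  feature_a = rnorm(n),         # white noise
  feature_b = cumsum(rnorm(n))  # random walk
)

res <- stationarize_columns(df, cols = c("target", "feature_a", "feature_b"))
print(res$non_stationary)
print(head(res$df_stationary))
```

Padding with `NA` keeps the date column aligned with the differenced features; dropping the first row afterwards would work just as well.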

@spsanderson spsanderson self-assigned this Jan 28, 2023
@spsanderson spsanderson added the enhancement New feature or request label Jan 28, 2023
@spsanderson spsanderson added this to the healthyR.ts 0.2.8 milestone Jan 30, 2023
@spsanderson spsanderson removed this from the healthyR.ts 0.2.8 milestone Apr 1, 2023
@spsanderson spsanderson added this to the healthyR.ts 0.2.9 milestone Apr 20, 2023
@spsanderson spsanderson removed this from the healthyR.ts 0.2.9 milestone Aug 21, 2023

spsanderson commented Oct 5, 2023

Function (uses ts_adf_test())

#' Automatically Stationarize Time Series Data
#'
#' @family Utility
#' @author Steven P. Sanderson II, MPH
#' @description This function attempts to make a given time series stationary by
#' applying transformations such as a logarithmic transformation or differencing.
#' If the time series is already stationary, it returns the original time series.
#'
#' @details
#' If the input time series is non-stationary (determined by the Augmented Dickey-Fuller test),
#' this function will try to make it stationary by applying a series of transformations:
#' 1. It checks if the time series is already stationary using the Augmented Dickey-Fuller test.
#' 2. If not stationary, it attempts a logarithmic transformation.
#' 3. If the logarithmic transformation doesn't work, it applies differencing.
#'
#' @param .time_series A time series object to be made stationary.
#'
#' @examples
#' # Example 1: Using the AirPassengers dataset
#' data(AirPassengers)
#' auto_stationarize(AirPassengers)
#'
#' # Example 2: Using the BJsales dataset
#' data(BJsales)
#' auto_stationarize(BJsales)
#'
#' @return
#' If the time series is already stationary, it returns the original time series.
#' If the logarithmic transformation makes it stationary, it returns the
#' log-transformed time series.
#' Otherwise, differencing is applied and it returns a list with two elements:
#' - stationary_ts: The differenced time series.
#' - ndiffs: The order of differencing applied to make it stationary.
#'
#' @name auto_stationarize
NULL 

#' @export
#' @rdname auto_stationarize
auto_stationarize <- function(.time_series) {
  # Variables
  time_series <- .time_series

  # Check if the time series is already stationary
  if (ts_adf_test(time_series)$p_value < 0.05) {
    cat("The time series is already stationary via ts_adf_test().\n")
    return(time_series)
  } else {
    cat("The time series is not stationary. Attempting to make it stationary...\n")
  }
  
  # Transformation (e.g., logarithmic); log() is only defined for positive values
  if (all(time_series > 0) && ts_adf_test(log(time_series))$p_value < 0.05) {
    cat("Logarithmic transformation made the time series stationary.\n")
    return(log(time_series))
  }
  
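  # Differencing: let forecast::ndiffs() estimate the order, but difference at
  # least once, since the ADF test above did not reject a unit root
  diff_order <- max(1, forecast::ndiffs(time_series))
  # Pass the order as `differences`; the second positional argument is `lag`
  stationary_ts <- diff(time_series, differences = diff_order)
  cat("Differencing of order", diff_order, "made the time series stationary.\n")

  # Return
  return(
    list(
      stationary_ts = stationary_ts,
      ndiffs = diff_order
    )
  )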
}
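
A note on the differencing step: in base R's `diff()`, the second positional argument is `lag`, not `differences`, so an order returned by `forecast::ndiffs()` has to be passed by name to be treated as a differencing order. The small sketch below, using a made-up vector, shows how the two calls differ.

```r
x <- c(1, 4, 9, 16, 25, 36)  # squares, i.e. a quadratic trend

# Second positional argument is `lag`: computes x[t] - x[t - 2]
lagged <- diff(x, 2)

# `differences = 2` applies first differencing twice, which removes
# the quadratic trend and leaves a constant series
twice <- diff(x, differences = 2)

print(lagged)
print(twice)
```

For an order of 1 the two calls coincide, which is why the distinction only shows up when a series needs to be differenced more than once.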

Example

> auto_stationarize(AirPassengers)
The time series is already stationary via ts_adf_test().
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1949 112 118 132 129 121 135 148 148 136 119 104 118
1950 115 126 141 135 125 149 170 170 158 133 114 140
1951 145 150 178 163 172 178 199 199 184 162 146 166
1952 171 180 193 181 183 218 230 242 209 191 172 194
1953 196 196 236 235 229 243 264 272 237 211 180 201
1954 204 188 235 227 234 264 302 293 259 229 203 229
1955 242 233 267 269 270 315 364 347 312 274 237 278
1956 284 277 317 313 318 374 413 405 355 306 271 306
1957 315 301 356 348 355 422 465 467 404 347 305 336
1958 340 318 362 348 363 435 491 505 404 359 310 337
1959 360 342 406 396 420 472 548 559 463 407 362 405
1960 417 391 419 461 472 535 622 606 508 461 390 432
> auto_stationarize(BJsales)
The time series is not stationary. Attempting to make it stationary...
Differencing of order 1 made the time series stationary.
$stationary_ts
Time Series:
Start = 2 
End = 150 
Frequency = 1 
  [1] -0.6 -0.1 -0.5  0.1  1.2 -1.6  1.4  0.3  0.9  0.4 -0.1  0.0  2.0  1.4  2.2  3.4  0.0 -0.7
 [19] -1.0  0.7  3.7  0.5  1.4  3.6  1.1  0.7  3.3 -1.0  1.0 -2.1  0.6 -1.5 -1.4  0.7  0.5 -1.7
 [37] -1.1 -0.1 -2.7  0.3  0.6  0.8  0.0  1.0  1.0  4.2  2.0 -2.7 -1.5 -0.7 -1.3 -1.7 -1.1 -0.1
 [55] -1.7 -1.8  1.6  0.7 -1.0 -1.5 -0.7  1.7 -0.2  0.4 -1.8  0.8  0.7 -2.0 -0.3 -0.6  1.3 -1.4
 [73] -0.3 -0.9  0.0  0.0  1.8  1.3  0.9 -0.3  2.3  0.5  2.2  1.3  1.9  1.5  4.5  1.7  4.8  2.5
 [91]  1.4  3.5  3.2  1.5  0.7  0.3  1.4 -0.1  0.2  1.6 -0.4  0.9  0.6  1.0 -2.5 -1.4  1.2  1.6
[109]  0.3  2.3  0.7  1.3  1.2 -0.2  1.4  3.0 -0.4  1.3 -0.9  1.2 -0.8 -1.0 -0.8 -0.1 -1.5  0.3
[127]  0.2 -0.5 -0.1  0.3  1.3 -1.1 -0.1 -0.5  0.3 -0.7  0.7 -0.5  0.6 -0.3  0.2  2.1  1.5  1.8
[145]  0.4 -0.5 -1.0  0.4  0.5

$ndiffs
[1] 1
