Alvaro Joao edited this page Nov 23, 2015 · 2 revisions

Welcome!

This is a really simple example. Time series data are normally continuous, chronologically ordered observations taken at a constant interval between samples. Because the order of the samples carries information, the data cannot be split at random, and cross-validation procedures that shuffle the samples do not apply.

In some cases the only data available for building a mathematical model to predict the next values is the time series itself. That is why we need to retrieve the lags (past values) and use them as the inputs of the model.
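
As a minimal sketch of what "lags as inputs" means, the snippet below turns a series into input rows and targets. The function name `make_lagged_rows` and the sample values are hypothetical and chosen only for illustration.

```python
def make_lagged_rows(series, lags):
    """Pair each value with the past values found at the given lags."""
    max_lag = max(lags)
    inputs, targets = [], []
    # Start at max_lag so every requested lag exists for the first row.
    for t in range(max_lag, len(series)):
        inputs.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return inputs, targets


# Hypothetical series, for illustration only.
series = [10, 12, 11, 13, 15, 14, 16]
X, y = make_lagged_rows(series, lags=[1, 2])

# The first usable sample is t = 2: its inputs are the values at t - 1 and t - 2.
print(X[0], y[0])  # [12, 10] 11
```

Each row loses nothing but the first `max(lags)` samples, which have no complete history.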

The approach by steps:

  1. Find the most relevant lags of the series using the acf (autocorrelation) function. Normalization is usually applied.
  2. Extract those lags and make them inputs for the model by adding each lag as a column (attribute) of the samples.
  3. Split the data, respecting the chronological order.
  4. Set the size of the initial window (training data) and the prediction horizon.
  5. Create the model, then test it on the test data.
  6. Evaluate the model, usually with MSE, RMSE, and MAPE.
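
The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the method this page has in mind: the toy series is hypothetical, `acf` here is a hand-written sample autocorrelation rather than any library's function, only the single strongest lag is used, normalization is skipped, and a one-variable least-squares line stands in for whatever model you would actually train.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation for lags 1..max_lag (hand-written, not a library call)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    return {
        k: sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / var
        for k in range(1, max_lag + 1)
    }


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def error_metrics(actual, predicted):
    """Return (MSE, RMSE, MAPE in percent); MAPE assumes no actual value is zero."""
    n = len(actual)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    mape = 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n
    return mse, math.sqrt(mse), mape


# Hypothetical toy series: the pattern 10, 12, 14, 12 repeated ten times.
series = [12.0 + d for d in (-2, 0, 2, 0)] * 10

# Step 1: pick the lag with the largest absolute autocorrelation.
correlations = acf(series, max_lag=8)
best_lag = max(correlations, key=lambda k: abs(correlations[k]))

# Step 2: the lagged value becomes the input column, the current value the target.
inputs = series[:-best_lag]
targets = series[best_lag:]

# Steps 3 and 4: chronological split; the initial window is the first 28 rows,
# and each test prediction looks one step ahead (horizon = 1).
window = 28
train_x, test_x = inputs[:window], inputs[window:]
train_y, test_y = targets[:window], targets[window:]

# Step 5: fit on the training window, then predict the test rows.
intercept, slope = fit_line(train_x, train_y)
predictions = [intercept + slope * x for x in test_x]

# Step 6: measure the model.
mse, rmse, mape = error_metrics(test_y, predictions)
print(best_lag, mse, rmse, mape)
```

The split never shuffles: every test row comes after every training row, which is what "respecting the sequence" requires. The toy series is perfectly periodic, so the fit is far cleaner than anything you should expect from real data.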