# Building a time series meta forecasting model with Prophet and LSTM

PyData 2021 Lightning Talk

## Challenge

Mobile telecommunications operators face congestion issues in their networks due to increased usage and environmental factors. As cellular base station congestion negatively impacts customer experience, it can also reduce revenue and increase subscriber churn. This notebook demonstrates how time series forecasting can predict network congestion, so that operators are better equipped to manage it proactively.

## Why Meta Forecasting?

The first question to address when modelling time series data is how to choose the ‘best model’ among a variety of candidates. Do we adopt statistical methods, or pure machine learning models such as tree-based algorithms or deep learning techniques? Depending on the underlying mechanism of the model and on the training data, different models often learn different features, and hence each model views the data from a different perspective.

In general, statistical techniques are adequate for a purely autoregressive problem (i.e. when the future is related only to the past), while machine learning and deep learning models suit more complex situations, where it is also possible to combine a large number of data sources. Just as in ensemble learning methods like random forests, we can achieve higher-precision forecasts by combining the power of diverse models.

## Model Selection

In this notebook, we present a technique we call 'meta forecasting', which aims to combine the ability of an additive regression model to learn from experience with the generalization power of deep learning techniques. We build a metamodel using Prophet and a long short-term memory (LSTM) network. It is similar to ensemble learning, but with a slight variation that we'll cover later on.

We selected Prophet because it provides a stable forecast and is designed to deal with country-specific public holidays, missing observations and large outliers. It can also cope with time series that undergo trend changes, such as those caused by a product launch or, in the telecoms case, by an operator upgrading infrastructure or changing a cell's configuration. In such cases, we can manually supply changepoints to feed additional information into the model. These effects might not be well captured by other approaches, including LSTM, making Prophet an ideal choice for the first model.
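
As an illustration, a Prophet model along these lines might be configured as follows. This is a minimal sketch, not the talk's exact configuration: the changepoint dates, holiday country and Fourier order are placeholders.

```python
# Minimal sketch of a Prophet setup along the lines described above.
# Changepoint dates, country and Fourier order are illustrative only.
from prophet import Prophet  # older installs: `from fbprophet import Prophet`

m = Prophet(
    changepoints=["2021-03-15", "2021-06-01"],  # hypothetical infrastructure upgrade dates
    weekly_seasonality=False,                   # replaced by the custom term below
)
m.add_country_holidays(country_name="IE")       # built-in country-specific holiday effects
m.add_seasonality(name="weekly", period=7, fourier_order=10)  # Fourier order tuned by hand
m.fit(df)  # df needs Prophet's expected columns: 'ds' (timestamps) and 'y' (values)
```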

We chose an LSTM as the second model due to its ability to forecast over longer time horizons and its automatic feature extraction. Furthermore, the gates inside an LSTM boost its capability to capture nonlinear relationships. When modelling time series, some factors have a nonlinear impact on the values we are trying to forecast; by using an LSTM, the model can learn the nonlinear relationships present in the data, leading to better forecasts.

## Workflow and Data Processing

1. Fit a Prophet model on our training data.
2. Extract what Prophet has learned and use it to improve the training process of an LSTM model (sketched below).
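
In code, step 2 amounts to reading back Prophet's in-sample fitted values, which then serve as the LSTM's first training signal. A sketch, continuing from the fitted model `m` above (variable names are illustrative):

```python
# Extract Prophet's in-sample fitted values: a smoothed, outlier-free
# reconstruction of the training series.
fitted = m.predict(df[["ds"]])          # predict on the training timestamps
smoothed_y = fitted["yhat"].to_numpy()  # the "augmented" training signal
raw_y = df["y"].to_numpy()              # the original signal, used in stage two
```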

## Model Training

- We fit a Prophet model on our raw time series. We add custom seasonalities to the model and tune the Fourier order to make its predictions as accurate as possible.
- We then use the fitted Prophet model to improve our LSTM training.
- At this point Prophet has learned the seasonalities present in the data, corrected anomalous trends, learned the impact of holidays, and reconstructed a time series devoid of outliers.
- All of this information is stored in the fitted values. They are a smoothed version of the original data, produced by the model during training, and we can view them as an augmented version of the original training set.
- We start by feeding our LSTM autoencoder the fitted values produced by Prophet, carrying out a multi-step-ahead forecast projecting 148 hours into the future.
- We then conclude training on the raw data (see the sketch after this list). With the LSTM, we can also combine external data sources, for example weather conditions, if we think they might affect the values we are trying to forecast.
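
Putting the bullets above together, the two-stage schedule might look like the following sketch. The windowing helper, the input window length of 336 hours and the epoch counts are assumptions for illustration; `build_model` is a hypothetical stand-in for the seq2seq network shown under Results.

```python
import numpy as np

H = 148  # forecast horizon in hours, as above

def make_windows(series, n_in, n_out=H):
    """Slice a 1-D series into (input window, output window) pairs."""
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i : i + n_in])
        y.append(series[i + n_in : i + n_in + n_out])
    return np.array(X)[..., None], np.array(y)[..., None]  # add channel dim

model = build_model(n_in=336, n_out=H)  # hypothetical builder, see Results

# Stage 1: train on Prophet's fitted values (the smoothed series).
X_s, y_s = make_windows(smoothed_y, n_in=336)
model.fit(X_s, y_s, epochs=30, batch_size=64)

# Stage 2: conclude training on the raw series (optionally with external
# features such as weather appended as extra input channels).
X_r, y_r = make_windows(raw_y, n_in=336)
model.fit(X_r, y_r, epochs=10, batch_size=64)
```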

The idea behind this uncommon approach is that the neural network can learn from two different, but similar, data sources and so perform better on our test data. One caveat is that, when performing such two-stage training, we have to be mindful of the catastrophic forgetting problem. Catastrophic forgetting affects many models and algorithms: when trained on one task and then on a second task, many machine learning models “forget” how to perform the first. To avoid this, the structure of the entire network has to be properly tuned for the two-stage schedule to yield a performance benefit.
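
One common mitigation, offered here as an assumption rather than the talk's exact recipe, is to fine-tune in the second stage with a much smaller learning rate, so that the stage-two updates refine rather than overwrite what was learned in stage one:

```python
from tensorflow.keras.optimizers import Adam

# Recompile with a reduced learning rate before the raw-data stage.
# Recompiling resets optimizer state but keeps the learned weights.
model.compile(optimizer=Adam(learning_rate=1e-4), loss="mse")
model.fit(X_r, y_r, epochs=10, batch_size=64)
```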

## Results

At its core, the network is very simple: a seq2seq LSTM layer that predicts the values of a time series 'n' steps into the future. The training procedure is tuned with keras-hypetune. On our data sets, the meta forecasting model achieved a 13.36% improvement in RMSE compared to a vanilla LSTM.
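
A sketch of such a network and its tuning follows, assuming kerashypetune's `KerasGridSearch` interface; the search space, window length and validation split are placeholders, and `H` is the 148-hour horizon defined earlier.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.optimizers import Adam
from kerashypetune import KerasGridSearch

def get_model(param, n_in=336, n_out=H):
    """Encoder-decoder (seq2seq) LSTM: n_in past steps -> n_out future steps."""
    model = Sequential([
        LSTM(param["units"], input_shape=(n_in, 1)),  # encoder
        RepeatVector(n_out),                          # repeat state across the horizon
        LSTM(param["units"], return_sequences=True),  # decoder
        TimeDistributed(Dense(1)),                    # one value per future step
    ])
    model.compile(optimizer=Adam(param["lr"]), loss="mse")
    return model

param_grid = {
    "units": [32, 64, 128],  # placeholder search space
    "lr": [1e-3, 1e-4],
    "epochs": 30,
    "batch_size": 64,
}

kgs = KerasGridSearch(get_model, param_grid, monitor="val_loss", greater_is_better=False)
kgs.search(X_s, y_s, validation_data=(X_val, y_val))  # X_val/y_val: a held-out split (not shown)
```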
