
Memory limit for massive amount of timeseries #11

Open
MalteFlender opened this issue Dec 26, 2018 · 2 comments


@MalteFlender

It seems to me that, at the moment, the RAM of the machine I'm using is the limiting factor for the amount of timeseries data the system can be trained on. If I want to train on, e.g., 16 GB of timeseries data, I need at least 16 GB of RAM.

Is there a way around this? Perhaps the system could be trained in smaller batches, or via some kind of iterator. I'm trying to train it on a large amount of data obtained from a database, where I receive the timeseries in small chunks.
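The chunked-retrieval pattern described above can be sketched as a generator that streams rows from the database a fixed number at a time, so only one chunk is ever resident in RAM. This is a minimal illustration, not the package's actual data-loading code; the table and column names (`series`, `id`, `value`) are hypothetical.

```python
import sqlite3

def iter_chunks(conn, chunk_size=1000):
    """Yield rows from the (hypothetical) `series` table in fixed-size chunks."""
    cur = conn.execute("SELECT id, value FROM series ORDER BY id")
    while True:
        rows = cur.fetchmany(chunk_size)  # only this chunk is held in memory
        if not rows:
            break
        yield rows

# Demo with an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE series (id INTEGER, value REAL)")
conn.executemany("INSERT INTO series VALUES (?, ?)",
                 [(i, float(i)) for i in range(2500)])

sizes = [len(chunk) for chunk in iter_chunks(conn, chunk_size=1000)]
print(sizes)  # [1000, 1000, 500]
```

Each chunk can be processed and discarded before the next one is fetched, keeping peak memory proportional to the chunk size rather than the dataset size.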

@pmontman (Collaborator)

pmontman commented Feb 8, 2019

Thank you. Do you mean that you get some kind of error when running on your data, or is it just extremely slow? There is a known problem when running the code in parallel with large amounts of data, specifically when calculating the forecasts. You can try smaller batches for that part; the part that relies on xgboost should be able to handle relatively larger datasets.
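The "smaller batches for the forecasting part" suggestion can be sketched as follows: split the collection of series into batches and compute forecasts batch by batch, so only one batch's worth of forecasts is built up at a time. This is an illustrative sketch, not the project's code; `naive_forecast` is a stand-in for the real forecasting step.

```python
def naive_forecast(series, h=3):
    # Placeholder forecaster: repeat the last observation h times.
    return [series[-1]] * h

def forecast_in_batches(all_series, batch_size=100, h=3):
    """Compute forecasts for a list of series, one batch at a time."""
    results = []
    for start in range(0, len(all_series), batch_size):
        batch = all_series[start:start + batch_size]
        # Only this batch is processed (and could be parallelized) at once.
        results.extend(naive_forecast(s, h) for s in batch)
    return results

series_list = [[1.0, 2.0, float(i)] for i in range(250)]
fcasts = forecast_in_batches(series_list, batch_size=100)
print(len(fcasts))  # 250
```

The point is that any per-batch intermediate structures (e.g., worker copies when parallelizing) are bounded by the batch size instead of the full dataset.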

The parallelization problem in the forecasting part will be fixed soon.

@MalteFlender (Author)

Currently I'm not using the system (I'm planning to).
Since the training is done in a single step, it seems there is no way to split it into parts and thereby process training sets larger than my available RAM, since that's where the data has to be stored.
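For the general point being discussed (training in one step vs. incrementally), the alternative would be a training step that can be updated chunk by chunk, so a streaming pass never needs the whole dataset in RAM. As a hedged stand-in for any such chunk-updatable step (this is not the package's training loop), here is Welford-style online updating of a running mean over chunks:

```python
def update_mean(mean, count, chunk):
    """Fold one chunk of values into a running mean (Welford-style update)."""
    for x in chunk:
        count += 1
        mean += (x - mean) / count
    return mean, count

mean, count = 0.0, 0
# Chunks arrive one at a time, e.g., from a database cursor.
for chunk in ([1.0, 2.0], [3.0], [4.0, 5.0]):
    mean, count = update_mean(mean, count, chunk)

print(mean, count)  # 3.0 5
```

A model whose fitting step admits this kind of incremental update (as gradient-boosting rounds over successive data portions can, in principle) would lift the constraint that the full training set fit in memory; whether the package exposes such an interface is exactly what this thread is asking.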

@MalteFlender MalteFlender reopened this Feb 19, 2019