The benchmark was run on the FIFA dataset. You can download it with curl:

curl http://www.philippe-fournier-viger.com/spmf/datasets/FIFA.txt --output FIFA.dat
The model was trained on 20,450 sequences with an average length of 34 and an alphabet of 2,990 elements.
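For reference, here is a minimal sketch that recomputes these statistics from the downloaded file. It assumes the standard SPMF sequence format (items are integers, -1 separates itemsets, -2 terminates a sequence); adjust the path to wherever you saved FIFA.dat.

    # Sketch: recompute the dataset statistics above from FIFA.dat.
    # Assumes the SPMF sequence format (-1 separates itemsets, -2 ends a line).
    sequences = []
    with open("FIFA.dat") as f:
        for line in f:
            items = [int(tok) for tok in line.split() if tok not in ("-1", "-2")]
            if items:
                sequences.append(items)

    alphabet = {item for seq in sequences for item in seq}
    avg_len = sum(len(seq) for seq in sequences) / len(sequences)
    print(f"{len(sequences)} sequences, average length {avg_len:.1f}, "
          f"alphabet of {len(alphabet)} elements")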
The benchmark machine was a PC with 8 GB of RAM, 8 cores, and an Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz.
The threshold_query used was 1.
With FIFA.dat in the data folder, you can run the benchmark from the benchmark folder:

python benchmark.py
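For illustration, here is a minimal sketch of the kind of timing loop such a benchmark could use. The train/predict method names and the threshold_query constructor argument shown in the usage comment are assumptions based on this section, not the exact code of benchmark.py.

    import time

    def run_benchmark(model, sequences):
        # Train once, then time one prediction per sequence and report the
        # average latency. The train/predict names are assumptions, not the
        # library's verified API.
        model.train(sequences)
        start = time.perf_counter()
        for seq in sequences:
            model.predict(seq[:-1])  # predict the last item from its prefix
        elapsed = time.perf_counter() - start
        return elapsed, 1000 * elapsed / len(sequences)

    # Hypothetical usage, assuming a Subseq class that accepts threshold_query:
    # from subseq import Subseq
    # total_s, ms_per_pred = run_benchmark(Subseq(threshold_query=1), sequences)
    # print(f"{total_s / 60:.1f} min total, {ms_per_pred:.0f} ms per prediction")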
Subseq predicted the entire dataset in approximately 14 minutes, an average of about 41 ms per prediction (roughly 840,000 ms over 20,450 predictions).
This model is noticeably slower than CPT, mainly because Subseq performs a large amount of full-text searching.