This is a C++ library based on the FPmax* algorithm for mining maximal frequent itemsets by Gösta Grahne and Jianfei Zhu [1,2].
This project aims to make the FPmax* algorithm available as a shared library so you can call it directly from your code without needing file writing/reading and system calls. Most of the original code (which is available from the FIMI repository) is preserved.
[1] G. Grahne and J. Zhu, "Efficiently Using Prefix-trees in Mining Frequent Itemsets", Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, 2003.
[2] G. Grahne and J. Zhu, "Reducing the Main Memory Consumptions of FPmax* and FPclose", Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, 2004.
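For readers new to the problem, the following is a minimal brute-force sketch of what "maximal frequent itemset" means: an itemset is frequent if at least minsup transactions contain it, and maximal if no frequent proper superset exists. It is not the FPmax* algorithm and does not use this library; the names Itemset, support, and maximal_frequent_itemsets are illustrative only, and minsup is assumed here to be an absolute transaction count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Itemset = std::set<int>;  // illustrative type, not the library's FrequentItemset

// Number of transactions that contain every item of `candidate`.
std::size_t support(const std::vector<Itemset>& db, const Itemset& candidate) {
    std::size_t count = 0;
    for (const Itemset& t : db) {
        // std::set iterates in sorted order, which std::includes requires.
        if (std::includes(t.begin(), t.end(), candidate.begin(), candidate.end())) {
            ++count;
        }
    }
    return count;
}

// Brute force: enumerate every non-empty subset of the item universe, keep the
// frequent ones, then drop any that has a frequent proper superset.
// Exponential in the number of distinct items, so only suitable for toy data.
std::vector<Itemset> maximal_frequent_itemsets(const std::vector<Itemset>& db,
                                               std::size_t minsup) {
    std::set<int> universe;
    for (const Itemset& t : db) universe.insert(t.begin(), t.end());
    const std::vector<int> items(universe.begin(), universe.end());
    assert(items.size() < 32);  // keep the bitmask enumeration well defined

    std::vector<Itemset> frequent;
    for (unsigned long mask = 1; mask < (1UL << items.size()); ++mask) {
        Itemset candidate;
        for (std::size_t i = 0; i < items.size(); ++i) {
            if (mask & (1UL << i)) candidate.insert(items[i]);
        }
        if (support(db, candidate) >= minsup) frequent.push_back(candidate);
    }

    std::vector<Itemset> maximal;
    for (const Itemset& a : frequent) {
        bool has_frequent_superset = false;
        for (const Itemset& b : frequent) {
            if (b.size() > a.size() &&
                std::includes(b.begin(), b.end(), a.begin(), a.end())) {
                has_frequent_superset = true;
                break;
            }
        }
        if (!has_frequent_superset) maximal.push_back(a);
    }
    return maximal;
}
```

With the four transactions {1,2,3}, {1,2}, {2,3}, {1,2,3} and minsup 3, the frequent itemsets are {1}, {2}, {3}, {1,2}, and {2,3}, and only {1,2} and {2,3} are maximal.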
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make lib
This generates the library file (.so on Linux or .dll on Windows) in the build directory.
To test calling the shared library from C++ code:
make lib_test
ctest -R lib --verbose
Include the fpmax.h header and call the fpmax function, which has two versions.
If you wish to handle input/output the same way you would with the original FPmax* executable (using files), use the first version:
void fpmax(char const * in, char const * out, unsigned int minsup)
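The file-based call could be wired up roughly as follows. This is a sketch under assumptions: the input is assumed to use the FIMI text format (one transaction per line, items as space-separated integers), write_transactions and mine_from_file are hypothetical helpers, and WITH_FPMAX_LIB is a hypothetical macro you would define yourself when building against the library. Only the fpmax signature comes from this README.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: write transactions one per line, items separated by a
// single space. Assumption: this matches the input format the original
// FPmax* executable reads.
bool write_transactions(const std::string& path,
                        const std::vector<std::vector<int>>& transactions) {
    std::ofstream out(path);
    if (!out) return false;
    for (const std::vector<int>& t : transactions) {
        for (std::size_t i = 0; i < t.size(); ++i) {
            if (i > 0) out << ' ';
            out << t[i];
        }
        out << '\n';
    }
    out.flush();
    return static_cast<bool>(out);
}

#ifdef WITH_FPMAX_LIB  // hypothetical macro, defined only when linking the library
#include "fpmax.h"

// Hypothetical wrapper around the first version of fpmax: reads transactions
// from `in` and writes the mined itemsets to `out`.
void mine_from_file(const std::string& in, const std::string& out,
                    unsigned int minsup) {
    fpmax(in.c_str(), out.c_str(), minsup);
}
#endif
```

Without WITH_FPMAX_LIB the block compiles on its own and only the file-writing helper is available, which makes it easy to check the input file before involving the library.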
Otherwise, you can handle input/output in-memory:
- Create a Dataset object (see data.h for the definition).
- Call the second version of the function:
FISet* fpmax(Dataset* dataset, unsigned int minsup, unsigned int nlargest=0)
- Get the return value, a pointer to an FISet object, which is a set of FrequentItemset objects (see fitemset.h for their definitions).
To compile, link, and run your program, you will need the compiled shared library and the following headers: buffer.h, data.h, fitemset.h, fp_node.h, fp_tree.h, fpmax.h, and fsout.h.
For usage examples, see the code in the test folder or the following projects:
- MDM-HGS-CVRP (a solver for vehicle routing problems)
- MR-MS-ILS (a solver for facility location problems)