Feature request: sequential pattern mining #1195

Open
ghost opened this issue Apr 26, 2021 · 0 comments

ghost commented Apr 26, 2021

Sequential pattern mining (SPM) is a general data mining technique for finding patterns in sequences. For example, if you have text like

ABCD
CBCD
BECD

then it can find BCD as a frequent closed sequence with a support of 3.

It helps to prepare the data first. For logs, for example, split each line into words, give each distinct word and each special character its own id, run the algorithm on the resulting id arrays, and map the ids back to words in the results. This saves CPU and memory; without it the mining would use far more resources. Different types of data may need different preparation. SPM can also be applied to objects with several properties, and the algorithms can check multiple properties rather than a single one, so for JSON input (as opposed to plain text) this could be an extra feature.
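To make the example and the preparation step concrete, here is a minimal Python sketch. It is only an illustration: `encode`, `is_subsequence` and `closed_frequent` are hypothetical names, the tokenizer regex is an assumption about what "words and special characters" means, and the miner is a brute-force enumeration (exponential in sequence length), not one of the real SPM algorithms.

```python
import re
from itertools import combinations


def encode(lines):
    """Split each line into word / special-character tokens and give
    every distinct token an integer id (hypothetical helper)."""
    vocab = {}
    encoded = []
    for line in lines:
        tokens = re.findall(r"\w+|[^\w\s]", line)
        encoded.append([vocab.setdefault(tok, len(vocab)) for tok in tokens])
    id_to_token = {i: tok for tok, i in vocab.items()}
    return encoded, id_to_token


def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def closed_frequent(database, min_support):
    """Brute force: enumerate every subsequence, keep the frequent ones,
    then drop any pattern that has a longer super-pattern with equal support."""
    candidates = set()
    for seq in database:
        for n in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), n):
                candidates.add(tuple(seq[i] for i in idx))
    frequent = {}
    for p in candidates:
        s = sum(is_subsequence(p, seq) for seq in database)
        if s >= min_support:
            frequent[p] = s
    return {
        p: s for p, s in frequent.items()
        if not any(len(q) > len(p) and t == s and is_subsequence(p, q)
                   for q, t in frequent.items())
    }


# The three sequences from the example above.
letters = closed_frequent(["ABCD", "CBCD", "BECD"], min_support=3)
print(letters)  # {('B', 'C', 'D'): 3}

# Made-up log lines: encode to ids, mine, then map ids back to tokens.
logs = [
    "user alice login failed: timeout",
    "user bob login failed: denied",
    "user carol login failed: timeout",
]
encoded, id_to_token = encode(logs)
patterns = closed_frequent(encoded, min_support=3)
decoded = {" ".join(id_to_token[i] for i in p): s for p, s in patterns.items()}
print(decoded)  # {'user login failed :': 3}
```

The miner only ever compares small integers, which is the point of the id mapping; a real implementation would replace `closed_frequent` with a proper closed-sequence algorithm.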

I don't think there is any alternative tool for this. There are countless algorithms, and I don't know which would be best, but probably not the older ones like GSP or SPADE. The mining may need to run in a separate thread, otherwise it might freeze the window.
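As a rough sketch of the separate-thread idea, the mining call can be handed to a worker and its result delivered through a callback, so the caller is not blocked while it runs. This uses Python's standard `concurrent.futures`; `count_support` is a hypothetical stand-in for the real mining call, `mine_in_background` is an illustrative name, and how this maps onto the project's own UI toolkit is an assumption. For heavy CPU-bound mining in CPython, a process pool may be a better fit than a thread.

```python
from concurrent.futures import ThreadPoolExecutor


def count_support(pattern, database):
    """Stand-in for an expensive mining call: count the sequences
    that contain pattern as an in-order subsequence."""
    def contains(seq):
        it = iter(seq)
        return all(item in it for item in pattern)
    return sum(contains(seq) for seq in database)


def mine_in_background(executor, pattern, database, on_done):
    """Submit the work to the executor and call on_done(result) when it finishes."""
    future = executor.submit(count_support, pattern, database)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


results = []
with ThreadPoolExecutor(max_workers=1) as executor:
    future = mine_in_background(
        executor, "BCD", ["ABCD", "CBCD", "BECD"], results.append
    )
    # The calling thread is free to keep handling events here.

# Leaving the with-block waits for the worker to finish.
print(results)  # [3]
```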
