MUT is a series of processing pipelines and frameworks for ingesting URLs, extracting features, and performing classification.
The Jupyter notebook x0-parsing.ipynb contains the logic for parsing the different types of lists available online and combining them into one large list, conglom-labeled.csv, at the end. The parsing logic is complete, but more lists could be added in the future. I hope you will contribute to this work if you are interested!
The datasets folder contains pre-processed lists and labeled datasets. The conglom-labeled.csv file is a concatenation of all the lists, and easylist-ads-labeled.csv, easylist-tracking-labeled.csv, malicious-phish-labeled.csv, malware-labeled.csv, and yoyo-labeled.csv are the individual lists.
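
As a rough illustration of how the individual lists relate to conglom-labeled.csv, here is a minimal sketch of concatenating several labeled CSV files into one. It is not the code from x0-parsing.ipynb. The function name concat_labeled_lists is hypothetical, and the sketch assumes that every input file starts with the same header row, which may not match the actual files.

```python
import csv


def concat_labeled_lists(paths, out_path):
    """Concatenate CSV files that share one header into out_path.

    Returns the number of data rows written. Assumes (hypothetically)
    that all input files have an identical header row.
    """
    fieldnames = None
    writer = None
    total = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out_file:
        for path in paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.DictReader(in_file)
                if reader.fieldnames is None:
                    continue  # empty file, nothing to copy
                if writer is None:
                    # The first non-empty file defines the output header.
                    fieldnames = list(reader.fieldnames)
                    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
                    writer.writeheader()
                elif list(reader.fieldnames) != fieldnames:
                    raise ValueError(
                        f"{path} has a different header: {reader.fieldnames}"
                    )
                for row in reader:
                    writer.writerow(row)
                    total += 1
    return total
```

To try it, pass the paths of the individual lists in the datasets folder together with an output path of your choice. Writing to a new file name rather than conglom-labeled.csv avoids overwriting the provided dataset.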