GitHub - XuyangAbert/CDSC-AL

Getting Started

CDSC-AL: A Clustering-based Data Stream Classification framework using Active Learning

The file "Supplemental Results.pdf" contains the comparison with semi-supervised methods using 5%, 15%, and 20% labeled data, as well as the comparison between supervised methods and the CDSC-AL method using 5%, 15%, and 20% labeled data.

Example Usage

Two Python scripts are provided, each with different settings for the benchmark data streams:

main_final_draft.py arranges the data streams so that they exhibit abrupt concept drift. Run it on:

Synthetic-1, Synthetic-2, Sea, and Shuttle

main_final_draft4.py simulates data streams with gradual concept drift. Run it on:

KDD Cup 99, Forest Covtype, Gas Sensor Drift, MNIST, and CIFAR-10
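To make the difference between the two drift settings concrete, here is a minimal sketch of one common way to build each kind of stream from two concepts. It is not taken from main_final_draft.py or main_final_draft4.py; the function names, the `width` parameter, and the linear mixing schedule are all assumptions made for illustration.

```python
import numpy as np


def abrupt_stream(concept_a, concept_b):
    """Abrupt drift: the stream switches from concept A to B at a single point."""
    return np.concatenate([concept_a, concept_b], axis=0)


def gradual_stream(concept_a, concept_b, width, seed=0):
    """Gradual drift: mix A and B inside a transition window (hypothetical scheme).

    The last width // 2 samples of A and the first width // 2 samples of B are
    interleaved; the chance of drawing from B rises linearly across the window.
    """
    rng = np.random.default_rng(seed)
    half = width // 2
    if half == 0 or half > min(len(concept_a), len(concept_b)):
        raise ValueError("width must be at least 2 and fit inside both concepts")

    head, pool_a = concept_a[:-half], list(concept_a[-half:])
    pool_b, tail = list(concept_b[:half]), concept_b[half:]

    window = []
    total = 2 * half
    for i in range(total):
        take_b = rng.random() < (i + 1) / (total + 1)
        # Fall back to the other pool once one of them is exhausted.
        if take_b and not pool_b:
            take_b = False
        elif not take_b and not pool_a:
            take_b = True
        window.append(pool_b.pop(0) if take_b else pool_a.pop(0))

    return np.concatenate([head, np.array(window), tail], axis=0)
```

Both functions keep every sample exactly once, so the two streams differ only in the order in which the second concept appears.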

The two synthetic datasets (Synthetic-1 and Synthetic-2) were generated by the authors and are therefore included here. The remaining seven datasets can be found at the following links:

https://archive.ics.uci.edu/ml/index.php
http://users.rowan.edu/~polikar/nse.html

To run main_final_draft.py or main_final_draft4.py on a different dataset, change the dataset name on line 17 of the script.

On line 11, the global variable label_ratio sets the proportion of labeled data in each incoming data chunk.
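As a rough illustration of what a labeled-data proportion means for a single chunk, the sketch below picks which samples would receive labels. The helper name `select_labeled_indices` is hypothetical, and uniform random selection is only a stand-in: the framework itself uses active learning to choose samples, and that selection logic is not reproduced here.

```python
import numpy as np


def select_labeled_indices(chunk_size, label_ratio, seed=None):
    """Return sorted indices of the samples in a chunk that get labels.

    Assumption: uniform random sampling replaces the framework's
    active-learning query strategy, purely for illustration.
    """
    if not 0.0 <= label_ratio <= 1.0:
        raise ValueError("label_ratio must lie in [0, 1]")
    rng = np.random.default_rng(seed)
    n_labeled = int(round(label_ratio * chunk_size))
    return np.sort(rng.choice(chunk_size, size=n_labeled, replace=False))
```

With a chunk of 1000 samples and label_ratio = 0.05, this labels 50 samples and leaves the other 950 unlabeled.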

Two different evaluation metrics are used:

BAcc1Hist: a vector of balanced classification accuracy values over the entire data stream
F1Hist: a vector of macro-averaged F1-score values over the entire data stream
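The two metrics can be sketched with scikit-learn as follows. This assumes one value is recorded per data chunk, which the text above does not state explicitly; `evaluate_chunks` is a hypothetical helper and not a function from the repository.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, f1_score


def evaluate_chunks(chunks):
    """Compute balanced accuracy and macro-averaged F1 for each chunk.

    `chunks` is an iterable of (y_true, y_pred) pairs, one pair per chunk
    (an assumed layout, chosen for illustration).
    """
    bacc_hist, f1_hist = [], []
    for y_true, y_pred in chunks:
        bacc_hist.append(balanced_accuracy_score(y_true, y_pred))
        f1_hist.append(f1_score(y_true, y_pred, average="macro"))
    return np.array(bacc_hist), np.array(f1_hist)
```

Balanced accuracy averages the per-class recall and macro F1 averages the per-class F1-score, so both weight every class equally regardless of how many samples it has. That property matters for imbalanced streams.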

Dependencies:

Numpy
Pandas
Scikit-learn
Scipy
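Before running either script, a quick check like the one below can confirm that the listed packages are importable. The helper `missing_dependencies` is hypothetical; the import names (numpy, pandas, sklearn, scipy) are the standard module names for these packages.

```python
from importlib.util import find_spec

# Display name -> importable module name.
REQUIRED = {
    "Numpy": "numpy",
    "Pandas": "pandas",
    "Scikit-learn": "sklearn",
    "Scipy": "scipy",
}


def missing_dependencies(required=REQUIRED):
    """Return the display names of packages that cannot be imported."""
    return [name for name, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```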

Citation Format

If you use this project, please cite the following article:

Yan, Xuyang and Homaifar, Abdollah and Sarkar, Mrinmoy and Girma, Abenezer and Tunstel, Edward. "A Clustering-based framework for Classifying Data Streams." In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI2021).

