Active learning toolbox

This repo contains query strategies and utilities for active learning, as well as a widget for dataset annotation in the Jupyter IDE. It is tightly integrated with the libact Python library.

Example of active learning annotation of MNIST dataset with the Jupyter widget.

Active learning

Active learning (AL) is an interactive approach to simultaneously building a labeled dataset and training a machine learning model. AL algorithm:

1. A relatively large unlabeled dataset is gathered.
2. A domain expert labels a few positive examples in the dataset.
3. A classifier is trained on the labeled samples.
4. The classifier is applied to the rest of the corpus.
5. A few of the most "useful" examples are selected (e.g., those expected to increase classification performance the most).
6. The expert labels the selected examples, and they are added to the training set.
7. Go to step 3.

The procedure repeats until the performance of the classifier stops improving or the expert is bored.
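
The loop above can be sketched in a few lines of plain Python. This is a minimal, dependency-free illustration and not the toolbox's or libact's API: the `NearestCentroid` classifier, the `oracle` function standing in for the human expert, and the margin-based "usefulness" score are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


class NearestCentroid:
    """Toy classifier: predicts the class whose centroid is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            pts = [x for x, lab in zip(X, y) if lab == label]
            self.centroids[label] = (
                sum(p[0] for p in pts) / len(pts),
                sum(p[1] for p in pts) / len(pts),
            )
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))

    def margin(self, x):
        """Gap between the two smallest centroid distances (small = uncertain)."""
        d = sorted(sq_dist(x, c) for c in self.centroids.values())
        return d[1] - d[0]


def oracle(x):
    """Stand-in for the domain expert (hypothetical labeling rule)."""
    return int(x[0] + x[1] > 0)


def active_learning_loop(pool, seed_idx, n_rounds=5, batch_size=3):
    # The expert labels a few seed examples.
    labeled = {i: oracle(pool[i]) for i in seed_idx}
    for _ in range(n_rounds):
        # Train a classifier on the labeled samples.
        idx = list(labeled)
        clf = NearestCentroid().fit([pool[i] for i in idx], [labeled[i] for i in idx])
        # Apply it to the rest of the pool and rank examples by uncertainty.
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        unlabeled.sort(key=lambda i: clf.margin(pool[i]))
        # The expert labels the most uncertain examples; add them to the training set.
        for i in unlabeled[:batch_size]:
            labeled[i] = oracle(pool[i])
    # Retrain once more so the returned model has seen every label.
    idx = list(labeled)
    clf = NearestCentroid().fit([pool[i] for i in idx], [labeled[i] for i in idx])
    return clf, labeled


rng = random.Random(0)
pool = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
# Seed with one example of each class so both centroids exist (an assumption of this toy).
seed_idx = [next(i for i, p in enumerate(pool) if oracle(p) == c) for c in (0, 1)]
clf, labeled = active_learning_loop(pool, seed_idx)
accuracy = sum(clf.predict(p) == oracle(p) for p in pool) / len(pool)
print(f"labeled {len(labeled)} of {len(pool)} points, pool accuracy {accuracy:.2f}")
```

In practice the fixed `n_rounds` would be replaced by the stopping rule described above (performance plateau or expert fatigue), and the oracle by the annotation widget.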

Requirements

Python 3.6 (the package has not been tested with earlier versions)
numpy (1.12.1)
pandas (0.20.1)
sklearn (0.18.1)
scipy (0.19.0)
Pillow (4.2.1)
Jupyter (4.3.0)
LibAct from the fork (pip install git+https://github.com/windj007/libact)
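
To compare an existing environment against the versions listed above, something like the following sketch may help. It assumes Python 3.8 or newer (for `importlib.metadata`, which is newer than the Python 3.6 listed above), and it assumes the distribution name `scikit-learn` for sklearn; the `REQUIRED` table and the `installed_versions` helper are hypothetical names introduced for illustration.

```python
from importlib import metadata

# Versions copied from the requirements list; keys are distribution names
# ("scikit-learn" is assumed to be the distribution that provides sklearn).
REQUIRED = {
    "numpy": "1.12.1",
    "pandas": "0.20.1",
    "scikit-learn": "0.18.1",
    "scipy": "0.19.0",
    "Pillow": "4.2.1",
    "jupyter": "4.3.0",
}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(REQUIRED)
for name, listed in REQUIRED.items():
    print(f"{name}: listed {listed}, installed {versions[name] or 'not found'}")
```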

Installation

Enabling widgets in Jupyter IDE

Jupyter widgets are not enabled by default. To install and activate them, run the following:

pip install ipywidgets
jupyter nbextension enable --py --sys-prefix widgetsnbextension

For further details, please refer to the jupyter-widgets repo.

Installing the library and the widget

To install the library and the widget, run the following on the command line with root privileges:

pip install git+https://github.com/IINemo/active_learning_toolbox

Usage

See an example for MNIST dataset annotation and an example for 20 newsgroups annotation.

If you have Docker installed, you can test the examples with the windj007/jupyter-keras-tools image:

cd `<package dir>`/examples
docker run -ti --rm -v `pwd`:/notebook -p 8888:8888 windj007/jupyter-keras-tools

Then open http://localhost:8888 in a browser (this opens the Jupyter IDE) and open an example notebook.

Cite

If you use the active learning toolbox in academic work, please cite the following paper (to be published):

BibTeX:

@inproceedings{suvorovshelmanov2017ainl,
    title={Active Learning with Adaptive Density Weighted Sampling for Information Extraction from Scientific Papers},
    author={Roman Suvorov and Artem Shelmanov and Ivan Smirnov},
    booktitle={Proceedings of AINL: Artificial Intelligence and Natural Language Conference},
    publisher = {Springer, Communications in Computer and Information Science},
    year={2017}
}

Russian GOST:

Suvorov R., Shelmanov A., Smirnov I. Active learning with adaptive density weighted sampling for information extraction from scientific papers // Proceedings of AINL: Artificial Intelligence and Natural Language Conference. — Springer, Communications in Computer and Information Science, 2017.
