An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.
The preferred way to install AllenNLP is via pip.
TL;DR: just run pip install allennlp in your Python 3.6 environment and you're good to go!
If you need pointers on setting up a Python 3.6 environment, see below.
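If you are unsure which interpreter an environment actually uses, a quick check along these lines may help before installing. This is a minimal sketch: `is_supported_python` is a hypothetical helper, not part of AllenNLP, and it assumes that any interpreter at version 3.6 or newer is acceptable.

```python
import sys


def is_supported_python(version_info=sys.version_info, required=(3, 6)):
    """Return True if the interpreter is at least version `required`.

    Hypothetical helper for illustration only; the (3, 6) minimum is an
    assumption based on the Python 3.6 requirement stated in this README.
    """
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    # Print the running interpreter's version and whether it passes the check.
    print("Python", sys.version.split()[0], "supported:", is_supported_python())
```

Run it with the interpreter from the environment you plan to install into, so that the version it reports is the one pip will use.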
Just want AllenNLP models running as a service via Docker? Run:
docker run --rm -p 8000:8000 allennlp/allennlp:v0.4.1 python -m allennlp.run serve
Conda can be used to set up a virtual environment with the version of Python required for AllenNLP, in which you can sandbox its dependencies. If you already have a Python 3.6 environment you want to use, you can skip to the 'installing via pip' section.
- Create a Conda environment with Python 3.6:
conda create -n allennlp python=3.6
- Activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to use AllenNLP:
source activate allennlp
- Install AllenNLP:
pip install allennlp
That's it! You're now ready to build and train AllenNLP models.
AllenNLP installs a script when you install the Python package, meaning you can run allennlp commands just by typing allennlp into a terminal.
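To confirm that the script is reachable from the current environment, something like the following could be used. It is a minimal sketch using only the standard library; `find_allennlp_script` is a hypothetical name introduced here for illustration.

```python
import shutil


def find_allennlp_script(name="allennlp"):
    """Return the full path of the named console script, or None if it is not on PATH.

    Hypothetical helper; it only looks the command up and never runs it.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_allennlp_script()
    if path is None:
        print("allennlp not found on PATH; is the right environment active?")
    else:
        print("allennlp found at", path)
```

If the script is not found, the most likely cause is that the environment where AllenNLP was installed is not the active one in this terminal.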
pip currently installs PyTorch for CUDA 8 only (or no GPU). If you require a newer version, please visit http://pytorch.org/ and install the relevant PyTorch binary.
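To see which PyTorch build ended up in your environment, a check like this one may be useful. It is a sketch rather than an official diagnostic: `describe_pytorch` is a hypothetical helper, and it returns None when PyTorch is not installed instead of raising.

```python
def describe_pytorch():
    """Return a dict describing the installed PyTorch build, or None if PyTorch is missing.

    Hypothetical helper for illustration; it only reads version metadata.
    """
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch_version": torch.__version__,
        # CUDA version the binary was built against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # Whether a usable GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_pytorch())
```

A build can be compiled against CUDA and still report that no GPU is available, for example when the driver is missing, which is why the two values are reported separately.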
Once you have installed Docker, just run
docker run -it -p 8000:8000 --rm allennlp/allennlp:v0.4.0
to get an environment that will run on either the CPU or GPU.
Now you can do any of the following:
- Run a model on example sentences with python -m allennlp.run predict.
- Start a web service to host our models with python -m allennlp.run serve.
- Interactively code against AllenNLP from the Python interpreter with python.
Using Docker installs AllenNLP from source, for development. Consequently, the allennlp command-line tool is not installed and you will have to use the corresponding python commands (see above).
Built on PyTorch, AllenNLP makes it easy to design and evaluate new deep learning models for nearly any NLP problem, along with the infrastructure to easily run them in the cloud or on your laptop. AllenNLP was designed with the following principles:
- Hyper-modular and lightweight. Use the parts which you like seamlessly with PyTorch.
- Extensively tested and easy to extend. Test coverage is above 90% and the example models provide a template for contributions.
- Take padding and masking seriously, making it easy to implement correct models without the pain.
- Experiment friendly. Run reproducible experiments from a JSON specification with comprehensive logging.
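To illustrate why padding and masking matter, here is a small pure-Python sketch. It is not AllenNLP's API: `pad_and_mask` and `masked_mean` are hypothetical helpers that show the general idea of padding variable-length sequences to a common length and then ignoring the padded positions when computing a statistic.

```python
def pad_and_mask(sequences, padding_value=0):
    """Pad every sequence to the longest length and build a matching 0/1 mask.

    Hypothetical helper: mask entries are 1 for real items and 0 for padding.
    """
    max_len = max((len(s) for s in sequences), default=0)
    padded = [list(s) + [padding_value] * (max_len - len(s)) for s in sequences]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return padded, mask


def masked_mean(values, mask):
    """Average only the positions where the mask is 1 (0.0 if none are)."""
    total = sum(v * m for v, m in zip(values, mask))
    count = sum(mask)
    return total / count if count else 0.0


if __name__ == "__main__":
    padded, mask = pad_and_mask([[5, 3, 9], [4]])
    print(padded)  # [[5, 3, 9], [4, 0, 0]]
    print(mask)    # [[1, 1, 1], [1, 0, 0]]
    # The masked mean of the second row uses only the real item.
    print(masked_mean(padded[1], mask[1]))  # 4.0
```

Without the mask, the mean of the second row would be 4/3 rather than 4.0, because the two padding zeros would be counted as data. Errors of this kind are easy to introduce silently in batched models.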
AllenNLP includes reference implementations of high-quality models for Semantic Role Labelling, Question Answering (BiDAF), Entailment (decomposable attention), and more.
AllenNLP is built and maintained by the Allen Institute for Artificial Intelligence, in close collaboration with researchers at the University of Washington and elsewhere. With a dedicated team of best-in-field researchers and software engineers, the AllenNLP project is uniquely positioned to provide state-of-the-art models with high-quality engineering.
allennlp | an open-source NLP research library, built on PyTorch |
allennlp.commands | functionality for a CLI and web service |
allennlp.data | a data processing module for loading datasets and encoding strings as integers for representation in matrices |
allennlp.models | a collection of state-of-the-art models |
allennlp.modules | a collection of PyTorch modules for use with text |
allennlp.nn | tensor utility functions, such as initializers and activation functions |
allennlp.service | a web server to serve our demo and API |
allennlp.training | functionality for training models |
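As a rough illustration of what "encoding strings as integers" means, the sketch below builds a token-to-index mapping and uses it to encode a sentence. It is a simplified stand-in, not the allennlp.data API: `build_vocab`, `encode`, whitespace tokenization, and the `<pad>` and `<unk>` tokens are all assumptions made for this example.

```python
def build_vocab(sentences, pad="<pad>", unk="<unk>"):
    """Map each whitespace-separated token to an integer index.

    Hypothetical helper: index 0 is reserved for padding, index 1 for
    unknown tokens, and the rest follow order of first appearance.
    """
    vocab = {pad: 0, unk: 1}
    for sentence in sentences:
        for token in sentence.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(sentence, vocab, unk="<unk>"):
    """Convert a sentence to a list of indices, mapping unseen tokens to `unk`."""
    return [vocab.get(token, vocab[unk]) for token in sentence.split()]


if __name__ == "__main__":
    vocab = build_vocab(["the cat sat", "the dog"])
    print(vocab)
    print(encode("the bird sat", vocab))  # [2, 1, 4]
```

Lists of indices like these can then be padded to a common length and stacked into a matrix, one row per sentence.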
If you want to make changes to the AllenNLP library itself (or use bleeding-edge code that hasn't been released to PyPI), you'll need to install the library from GitHub and manually install the requirements:
- First, clone the repo:
git clone https://github.com/allenai/allennlp.git
- Change your directory to where you cloned the files:
cd allennlp
- Install the required dependencies:
INSTALL_TEST_REQUIREMENTS="true" ./scripts/install_requirements.sh
- Visit http://pytorch.org/ and install the relevant PyTorch package.
You should now be able to test your installation with ./scripts/verify.py. Congratulations!
A third option is to run AllenNLP via Docker. Docker provides a container with everything set up to run AllenNLP, whether you will leverage a GPU or just run on a CPU. Docker provides more isolation and consistency, and also makes it easy to distribute your environment to a compute cluster.
It is easy to run a pre-built Docker development environment. AllenNLP is configured with Docker Cloud to build a new image on every update to the master branch. To download the latest release from Docker Hub, just run:
docker pull allennlp/allennlp:v0.4.0
For various reasons you may need to create your own AllenNLP Docker image. The same image can be used either with a CPU or a GPU.
First, follow the instructions above for setting up a development environment. Then run the following command (it will take some time, as it completely builds the environment needed to run AllenNLP):
docker build --tag allennlp/allennlp .
You should now be able to see this image listed by running docker images allennlp.
REPOSITORY TAG IMAGE ID CREATED SIZE
allennlp/allennlp latest b66aee6cb593 5 minutes ago 2.38GB
You can run the image with docker run --rm -it allennlp/allennlp. The --rm flag removes the container on exit, and the -it flags make the session interactive so you can use the bash shell the Docker image starts.
You can test your installation by running ./scripts/verify.py.
AllenNLP is an open-source project backed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering. To learn more about who specifically contributed to this codebase, see our contributors page.