extra-model

Code to run the Extra algorithm for unsupervised topic/aspect extraction on English texts.

Quick start

IMPORTANT:

When running Extra inside a Docker container, make sure that the Docker process has enough resources. For example, on Mac/Windows it should have at least 8 GB of RAM available to it. Read More about RAM Requirements
The GitHub repo does not come with GloVe embeddings. See the Downloading Embeddings section for how to download the required embeddings.
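
As a rough way to check the memory requirement before building, the sketch below compares the memory reported by the Docker daemon with the 8 GB figure. The function names are hypothetical, and the decimal threshold is an assumption: the value Docker reports is usually a little below the configured limit, so treat a near miss as a hint rather than a failure.

```shell
# Minimum memory from the note above, in bytes (assumption: decimal 8 GB).
MIN_BYTES=8000000000

# Succeeds when the given byte count meets the minimum.
has_enough_memory() {
    [ "$1" -ge "$MIN_BYTES" ]
}

# Prints the total memory visible to the Docker daemon, in bytes.
docker_memory_bytes() {
    docker info --format '{{.MemTotal}}'
}

# Example (requires a running Docker daemon):
#   has_enough_memory "$(docker_memory_bytes)" || echo "Give Docker more RAM"
```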

Using docker-compose

This is the preferred way to run extra-model. Instructions for running extra-model using the CLI or as a Python package are in the official documentation.

First, build the image:

docker-compose build

Then run the following command to make sure that extra-model was installed correctly:

docker-compose run test

Downloading Embeddings

The next step is to download the embeddings (this project uses GloVe embeddings from Stanford).

To download the required embeddings, run the following command:

docker-compose run --rm setup

The embeddings will be downloaded, unzipped and formatted into a space-efficient format. Files will be saved in the embeddings/ directory in the root of the project directory. If the process fails, it can be safely restarted. If you want to restart the process with new files, delete all files except README.md in the embeddings/ directory.
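
The reset described above (delete everything except README.md) might be scripted as follows. This is a minimal sketch: `clean_embeddings` is a hypothetical helper name, and it relies on the `-mindepth` and `-delete` options available in GNU and BSD `find`.

```shell
# Removes every file under the embeddings directory except README.md.
# The directory defaults to ./embeddings, relative to the project root.
clean_embeddings() {
    dir="${1:-embeddings}"
    find "$dir" -mindepth 1 -type f ! -name 'README.md' -delete
}

# Example, followed by a fresh download:
#   clean_embeddings embeddings
#   docker-compose run --rm setup
```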

[Optional] Run `docker-compose build` again

After you've downloaded the embeddings, you may want to run docker-compose build again. This will build an image with embeddings already present inside the image.

The tradeoff here is that the image will be much bigger, but you won't spend ~2 minutes each time you run extra-model waiting for embeddings to be mounted into the container. On the other hand, building an image with embeddings in the context will increase build time from ~3 minutes to ~10 minutes.
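
To make the time side of this tradeoff concrete, the short calculation below uses only the approximate figures quoted above and ignores image size. It estimates how many runs it takes before the saved mount time covers the longer build.

```shell
# Approximate figures quoted above, in minutes.
build_without=3
build_with=10
mount_wait_per_run=2

# Extra build time paid once when embeddings are baked into the image.
extra_build=$((build_with - build_without))

# Ceiling division: smallest number of runs whose saved wait covers the extra build.
break_even_runs=$(( (extra_build + mount_wait_per_run - 1) / mount_wait_per_run ))
echo "$break_even_runs"
```

With these numbers the rebuild pays for itself after about four runs.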

Run `extra-model`

Finally, running extra-model is as simple as:

docker-compose run extra-model /package/tests/resources/100_comments.csv

NOTE: when using this approach, the input file must be mounted inside the container. By default, everything in the extra-model folder is mounted to the /package/ folder. This can be changed in docker-compose.yaml.
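
To run on your own data under the default mount, the path passed to extra-model has to be the in-container path. The sketch below shows that mapping; `container_path` is a hypothetical helper, and `data/my_comments.csv` in the example is a placeholder for a file you have placed inside the project folder.

```shell
# Maps a path relative to the project root to its location inside the
# container, assuming the default mount of the project folder at /package/.
container_path() {
    printf '/package/%s\n' "${1#./}"
}

# Example (placeholder file name):
#   docker-compose run extra-model "$(container_path data/my_comments.csv)"
```

If the mount has been changed in docker-compose.yaml, the /package/ prefix no longer applies.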

This will produce a result.csv file in the /io/ folder (the default setting).

There are multiple flags you can set to change the inputs and outputs of extra-model. You can find them by running:

docker-compose run extra-model --help

Learn more

Our official documentation is the best place to continue learning about extra-model:

Explanation of inputs/outputs
Step-by-step workflow of what happens inside of extra-model
Examples of how extra-model can be used in downstream applications
Detailed explanation of how to run extra-model using different interfaces (via docker-compose, via CLI, as a Python package).

Authors

extra-model was written by mbalyasin@wayfair.com and mmozer@wayfair.com.
