Please refer to the project report.
- Install dependencies
pip install -r requirements.txt
python -m spacy download en_core_web_sm
- Copy datasets to the repository manually
- 250k.docs.jsonl (sample of 250k docs)
- mag5.docs.jsonl (full dataset with 5 million docs)
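Before launching the app, it can help to confirm that the dataset files are in place and parse as JSON Lines. The following is a minimal sketch, not part of the project: the helper names `peek_jsonl` and `check_datasets` are hypothetical, and it assumes the files were copied to the repository root, which this README does not specify. It makes no assumption about the record schema and only lists the field names of the first record.

```python
import json
from pathlib import Path

# File names come from the list above; their location (repository root) is an assumption.
DATASETS = ["250k.docs.jsonl", "mag5.docs.jsonl"]


def peek_jsonl(path):
    """Return the first non-empty record of a JSON Lines file, or None if there is none."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                return json.loads(line)
    return None


def check_datasets(root="."):
    """Map each expected dataset name to True/False depending on whether it exists under root."""
    return {name: (Path(root) / name).is_file() for name in DATASETS}


if __name__ == "__main__":
    for name, present in check_datasets().items():
        if not present:
            print(f"missing: {name}")
            continue
        record = peek_jsonl(name)
        # Only report field names; the actual schema is not documented here.
        fields = sorted(record) if isinstance(record, dict) else type(record).__name__
        print(f"found: {name}, first record fields: {fields}")
```

Reading only the first record keeps the check fast even for the full 5 million document file, since the rest of the file is never loaded.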
- Run the app:
streamlit run app.py
Or run individual Streamlit pages:
- Initial studies:
streamlit run explore.py
- Dataframe (feature) selection:
streamlit run main.py
- Experiment selection:
streamlit run experiment_selection.py