Repository for the May 2023 hackathon, specifically for the task of improving search.
- pyenv - to manage your local python versions like a boss, easy to install via pyenv-installer (check common build problems if issues arise)
- pipenv - to manage dependencies in virtual environments with joy
- Clone the repo
git clone git@github.com:iptch/kolihack.git
- Install Python dependencies via pipenv (check the prerequisites first)
cd kolihack
pipenv install
- Activate the virtual environment
pipenv shell
- Copy the file `content.csv`, which you downloaded from the Kaggle competition website, into the `data` folder. You should end up with a file called `content.csv` inside the `data` folder at the root of this repo. Note: You can also get it from the ipt oss4good GDrive if you don't have a Kaggle account and don't want to register.
- Note: `data/content.csv` is ignored by git and will not be checked into the repo.
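
To confirm the file ended up in the right place, a small check like the following can help. This is only a sketch: it assumes you run it from the repository root, and the helper name `check_data_file` is made up for illustration and is not part of this repo.

```python
from pathlib import Path

# Assumption: the script is run from the repository root,
# so the CSV is expected at data/content.csv.
DATA_FILE = Path("data") / "content.csv"


def check_data_file(path: Path = DATA_FILE) -> bool:
    """Return True if the CSV exists and is non-empty; print a hint otherwise."""
    if not path.is_file():
        print(f"Missing {path}: copy content.csv into the data folder first.")
        return False
    size = path.stat().st_size
    if size == 0:
        print(f"{path} is empty: the download may have failed.")
        return False
    print(f"Found {path} ({size} bytes).")
    return True


if __name__ == "__main__":
    check_data_file()
```

The function only looks at the file system; it does not validate the CSV contents.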
There is a `notebooks` folder holding all the Jupyter notebooks. To run them, activate the pipenv environment with `pipenv shell`, then start JupyterLab with
jupyter lab
General Python code (e.g. `io.py` for data loading and storing) should preferably live in `kolihack` so that we don't clutter our notebooks.
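
As an illustration of what such a shared module could contain, here is a minimal sketch of loading and storing helpers. The function names `load_rows` and `store_rows` are hypothetical and not taken from this repo, and the sketch uses only the standard-library `csv` module since the project's actual dependencies are not listed here.

```python
import csv
from pathlib import Path


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def store_rows(rows, path):
    """Write a non-empty list of dicts to a CSV file.

    The keys of the first row are used as the header.
    """
    if not rows:
        raise ValueError("nothing to store: rows is empty")
    target = Path(path)
    # Create the parent folder if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

A notebook would then only need a one-line import and a call such as `load_rows("data/content.csv")`, assuming the helpers are importable from where the notebook runs. All values come back as strings, so any type conversion stays explicit in the calling code.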