Demo: https://supplementary.herokuapp.com/

Analyze and visualize results of all DTU students.

Project setup:

  • Install Python 3.7.6

  • Create a virtual environment in the root directory

    python3 -m venv venv
  • Activate the virtual environment:

    source venv/bin/activate

    Optional: autoenv can automate this step and the export of other environment variables. It runs the commands in a .env file every time we cd into the directory. It needs to be installed globally: first exit the virtual environment in the terminal, then install autoenv and add a .env file:

    $ deactivate
    $ pip install autoenv==1.0.0
    $ touch .env
  • Install required packages:

    $ pip install -r requirements.txt
  • Sync /data from Dropbox

    This step downloads:

    • Raw PDF result files
    • Caches that speed up parsing of the PDFs during development
    • The updated parsed data

    Steps:

    1. Export SUPPLEMENTARY_DROPBOX_TOKEN=<token> in your environment.

    2. From the root dir of your clone, run the following command.

      python src/python/dropbox_updown.py data data --yes

    This will clone all the data from the Dropbox storage. The above command needs to be run every time you make changes to the /data folder. Please note that we don't track the contents of this folder with git.

    NOTE: Use the --yes option with caution, as it will no longer prompt before syncing any modified or deleted files/dirs. On first use it is safe and will create the necessary dirs/files for you.
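Before running the sync, it can help to confirm that both prerequisites from the steps above are in place. The following is a minimal sketch and not part of this repository: `preflight_problems` is a hypothetical helper that only checks the environment variable and the script path named above.

```python
import os
from pathlib import Path

TOKEN_VAR = "SUPPLEMENTARY_DROPBOX_TOKEN"
SYNC_SCRIPT = Path("src/python/dropbox_updown.py")


def preflight_problems(root=".", environ=None):
    """Return a list of problems; an empty list means the sync can be attempted.

    Hypothetical helper for illustration, not part of this repository.
    """
    environ = os.environ if environ is None else environ
    problems = []
    # Step 1: the token must be exported and non-blank.
    if not environ.get(TOKEN_VAR, "").strip():
        problems.append(f"{TOKEN_VAR} is not set; ask the maintainers for a token")
    # Step 2: the command is meant to be run from the root of the clone.
    if not (Path(root) / SYNC_SCRIPT).is_file():
        problems.append(f"{SYNC_SCRIPT} not found; run from the root of your clone")
    return problems
```

Calling `preflight_problems()` from the root of the clone and printing the returned list is enough to see what is missing before the sync command touches any files.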

Dropbox Usage

We store the raw and parsed data, along with other shareable things such as caches, in a Dropbox space. You will need to ask the maintainers for SUPPLEMENTARY_DROPBOX_TOKEN to get started.

>>> import dropbox
>>> dbx = dropbox.Dropbox('<token>')

List files/folders in a directory.

>>> [entry.name for entry in dbx.files_list_folder('').entries]
['pdf', 'data']

>>> [entry.name for entry in dbx.files_list_folder('/data').entries]
['dtu_results', 'parsed_data.json']

>>> len([entry.name for entry in dbx.files_list_folder('/data/dtu_results').entries])
1319
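A single `files_list_folder` call may not return every entry of a large folder; the result carries a `has_more` flag and a `cursor` for fetching the rest with `files_list_folder_continue`. The sketch below assumes a `dbx` client created as above; `list_all_names` is a hypothetical helper name, not something defined in this repository.

```python
def list_all_names(dbx, path=""):
    """Collect entry names under `path`, following pagination cursors.

    Hypothetical helper for illustration; `dbx` is a dropbox.Dropbox client.
    """
    result = dbx.files_list_folder(path)
    names = [entry.name for entry in result.entries]
    # Keep fetching pages until the API reports there is nothing left.
    while result.has_more:
        result = dbx.files_list_folder_continue(result.cursor)
        names.extend(entry.name for entry in result.entries)
    return names


# Possible usage, reading the token from the environment instead of pasting it:
#   import os, dropbox
#   dbx = dropbox.Dropbox(os.environ["SUPPLEMENTARY_DROPBOX_TOKEN"])
#   len(list_all_names(dbx, "/data/dtu_results"))
```

Passing the client in as an argument keeps the helper independent of how the token is obtained.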

Tips:

  • Use ipython3 instead of python