Sync Master & Staging (#984)
* implemented course vectorizers under searches/utils

* implemented course vectorizers under searches/utils

* implemented relevance scoring

* implemented relevance scoring

* add get_relevant_search() method returning course objs

* add get_relevant_search() method returning course objs

* implemented stemming / ngram

* implemented stemming / ngram

* test case added

* stemming/acronym/weighting implemented

* evaluation functions implemented

* evaluation functions implemented

* objectified search utility functions

* objectified search utility functions

* Course clustering with LDA

* Course clustering with LDA

* optimized advance search

* optimized advance search

* refactoring codes

* refactoring codes

* rebased from staging

* rebased from staging

* satisfied linting

* satisfied linting

* ingested/digested jhu and modified search flow

* temp

* rebased from staging origin, still need vectorized search migration

* implemented course vectorizers under searches/utils

* implemented course vectorizers under searches/utils

* implemented relevance scoring

* implemented relevance scoring

* add get_relevant_search() method returning course objs

* add get_relevant_search() method returning course objs

* implemented stemming / ngram

* implemented stemming / ngram

* test case added

* stemming/acronym/weighting implemented

* stemming/acronym/weighting implemented

* evaluation functions implemented

* evaluation functions implemented

* objectified search utility functions

* objectified search utility functions

* Course clustering with LDA

* Course clustering with LDA

* optimized advance search

* optimized advance search

* refactoring codes

* refactoring codes

* rebased from staging

* rebased from staging

* satisfied linting

* satisfied linting

* ingested/digested jhu and modified search flow

* temp

* rebased from staging origin, still need vectorized search migration

* do not store count vectorizer obj

* removed pickle file

* revised unit test

* revised unit test

* prevented from crashing when ingesting

* rebased from master

* removed non-using part of the code for now

* modified requirements and cv loader

* updated upon review

* fixed to render searcher obj at app initialization

* resolved migration issue

* rebased and print exception

* modified scoring / strip() query

* resolved dependency issue

* handle duplicates

* changed to use static english dictionary count vectorizer for search.

* added docstrings

* modify license description, formatting and searches.rst description

* merge migrations

* fix tests for vectorized search

* expand on documentation

* cleanup webpack

* cleanup scripts

* fix update_semesterfield importing in migrations

* cleanup root

* update package.json

* provide cause of internal errors by setting debug to true

* fix tests

* remove documentation legacy

* move dict out of root

* umich parser pep8 compliance

* salisbury pep8

* JHU pep8

JHU course evals access db directly and are not pep8 compliant.
80 char limit is not enforced.

* UMD pep8 compliance and refactor

* gw instructor list limit

* vandy pep8 compliance and refactor, INCOMPLETE

* Validator pep8 compliance

Almost full compliance. Still debating whether the 79 char line limit is worth
relaxing to a 99 char limit.

* Style compliance for Queens parser

* style compliance for digestor, incomplete

* working on refactor

* Continuing with refactor and pep8 compliance

:plane: to Israel!

* working on dictionary term/semester filtering

* fixed parenthesis syntax error in ingestor

* removed ingest_offerings and changed to ingest_meeting

* removed ingest_offering

* Ingestor refactor

Made detailed docstrings,
created internal _get function and removed getchain
Changed ALL_KEYS to _ALL_KEYS

* Cleaned ingestor cleandict and renamed to clean

+ general refactor of validator but not complete yet

* method reordering in gw parser

* add pep8 newline

* Various refactors

- filter years and terms command line arg and filtering

* spaces instead of tabs in data pipeline json schemas

* removed unnecessary import

* fixed issue created by new years and terms filtering scheme

* style update for commands argparser file

* working on conforming jhu evals to data pipeline

* fixed pep8 line overflow error in hacky way

* changed wording in comment

* Added eval to ingestion command

NOTE: digest command is temporarily broken.

- removed skip_shallow_duplicates from ingest cmd args
-- NOTE: this could very well cause unforeseen issues with peoplesoft courses hiding/overridden classes!
- created --type cmd arg for ingestion
-- had to hack in a way to use new_{course, textbook, eval}_parsers, to be changed on deprecation of school_mappers.py
- changed hide_progress_bar to progress_bar (logic flip)

* hopkins evals to pipeline

* Added Vandy/Hopkins evals to data pipeline

NOTE: digestion command is still broken due to argparse errors

- Added evals to ingestor + validator
- tweaked ingestion cmd args some more
- pep8 compliance

* removed extra lines from supervisord config file

* tabs to spaces in logger

* created new exceptions modules, moved some functionality to utils module

* function out complex resolutions

* moved and documented external_utils to utils

* removed invalid reference to internal_utils

* some library reference refactors

* Created JSONWriter in logger but not yet used in code base

* JHU ingestion starts without error

* Refactor Tracker

- Separated Viewer and Tracker by file and made better distinction.
- used @property to clean up Tracker class
- Factored out Counter and TimeDistribution into their own Viewer classes
- Create Bus-like protocol so Viewers can filter Tracker broadcasts

* track status and better docstrings

* cleaned up validation schema loading

* load_schemas once only in Validator

* Refactor and cleanup validator

* changed parser_library to library

* The big move 🏠 🚚💨 🏠

- moved all parsing functionality into its own module
- created global variable to hold parsing directory in settings.py
- updated school mappers
- BROKEN: uoft was not moved and must be dealt with later, digestion command is still untested

* delete deprecated/moved/redone files b/c git is dumb

* moved makeschool to parsing module

* Trimmed names of files and parsers ✂️

- changed <school>_<ptype>.py to <ptype>.py
- updated school_mappers to handle new format
- no longer need each type of parser in school directory if unused
- some pep8 things

* tabs to spaces in parsing commands

* Refactored pipeline commands

- use format strings to handle defaults in args
- changed format of progress bar and cleaned up
- started refactor of digestor

* ❗❗ remove parsing files from scripts again... 😡

* Added vectorizer back to digest command

- would like to move this to its own command in due course.

* partial refactor of digestor + bugfix in digest diff

* uncomment .eslintrc.js

* added dummy VALID_SCHOOLS to school_mappers to pass tests

* added uoft to VALID_SCHOOLS even though parser is not active in order to pass tests

* Sort'em 🔤

This should help with conflicts and duplications in these files

* License and registration please 👮

Added Open GPL3 license to parsing directory files with # comment style.

* Fix npm run watch

* add/update testing/contrib docs (#979)

* added everything back (#980)

* Remove client secrets from core code (#981)

* move hashids to local_settings.py

* move all secrets to sensitive.py

* add encrypted secrets as environment variables for travis

* Correct global vars indentation in travis.yml

* use environment variables via travisci web client instead

* handle secret key before imports

* provide dev credentials

* use SemesterlyTest Facebook App credentials

* add docs for secrets

* uncomment out try catch for error printing on get_secret

* Parsing/docs edit (#983)

* Added DataUpdate object to parsing models

- Created DataUpdate object to track last updated status per schools per semester
- Modified SchoolList view and commented out assertNone in associated test
- Used JSONStreamWriter to write ingestor and save meta data for digestor
to read and be able to update DataUpdate objects per semester
- General cleanup as well
- Created Hoarder as tracker viewer to accumulate term, dept, instr data through parse

* removed Updates completely and updated tests to use DataUpdate

* doc and docstring edits
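
The search bullets in this message mention count vectorizers, stemming/n-grams, and relevance scoring. As a rough illustration only, and not the code this commit adds, the general idea can be sketched in plain Python: turn each course description and the query into term-count vectors and rank courses by cosine similarity. The names `stem`, `vectorize`, `cosine`, and `rank_courses`, and the crude suffix-stripping stemmer, are hypothetical stand-ins.

```python
import math
import re
from collections import Counter


def stem(token):
    # Crude suffix stripping; a stand-in for a real stemmer (assumption).
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def vectorize(text, ngram_max=2):
    # Lowercase, tokenize, stem, then count unigrams and word n-grams.
    tokens = [stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]
    counts = Counter(tokens)
    for n in range(2, ngram_max + 1):
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_courses(query, courses):
    # courses maps a course name to its description (hypothetical shape).
    q = vectorize(query.strip())
    scored = [(cosine(q, vectorize(desc)), name) for name, desc in courses.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]
```

Calling `rank_courses("graph algorithm", {...})` returns the names of courses whose descriptions share stemmed terms with the query, best match first; courses with no overlap are dropped.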
noahpresler authored Aug 5, 2017
1 parent 1d8fe40 commit 6d0a62b
Showing 393 changed files with 69,984 additions and 12,439 deletions.
84 changes: 44 additions & 40 deletions .gitignore
@@ -1,52 +1,56 @@
semesterly/local_settings.py
*.pyc
env/
venv/
assets/*
.coverage
deploy.sh
dev_env/
prod_env/
dev_requirements.txt
docs/_build
docs/build/
.DS_Store
env/
github-request-body
.idea/
*.log
*.log
logfile.txt
node_modules
deploy.sh
.DS_Store
timetable/courses_json/*.json
TextbookParserOutput.txt
workfile.html
semes/
docs/_build
npm-debug.log.*
openssl.cnf
parse_errors.log
static/js/gulp/*
static/css/gulp/*
logfile.txt
parsing/**/data/*.json
parsing/**/logs/*.json
parsing/**/logs/*.log
*.pid
prod_env/
*.pyc
*.recommended.model
recommended.model
scripts/parser_library/ex_school/data/*
sections
semes/
semesterly_backup/
semesterly/local_settings.py
semesterly/sensitive.py
semesterly-venv/
github-request-body
static/robots.txt
static/sitemap.txt
sections
openssl.cnf
ssl.crt
ssl.key
semesterly_backup/
recommended.model
timetable.features
*.recommended.model
*.timetable.features
docs/build/
scripts/parser_library/ex_school/data/*
*.log
scripts/**/data/*.json
scripts/*/data/*.json
scripts/*/logs/*.json
*.swp
.idea/
.coverage
assets/*
static/bundles/*
webpack-stats.json
*.pid
static/css/gulp/*
static/js/gulp/*
static/robots.txt
static/sitemap.txt
*.swp
*/test_failures/
.vscode/
npm-debug.log.*
Vagrantfile
TextbookParserOutput.txt
timetable/courses_json/*.json
*.timetable.features
timetable.features
.vagrant/
Vagrantfile
venv/
.vscode/
webpack-stats.json
workfile.html
semesterly/secrets.py
semesterly/secret_key.py
backup.sh
client_secret.json
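
The newly ignored `semesterly/sensitive.py` ties in with the commit message's "move all secrets to sensitive.py" and its mention of `get_secret`. The diff for that helper is not shown here, so the following is only a minimal sketch of how such a lookup could work, assuming secrets come from environment variables first (as on Travis) and from an untracked local mapping otherwise. The signature, the `fallback` and `environ` parameters, and `MissingSecretError` are hypothetical.

```python
import os


class MissingSecretError(KeyError):
    """Raised when a secret is in neither the environment nor the fallback."""


def get_secret(name, fallback=None, environ=None):
    # Hypothetical helper: prefer an environment variable (e.g. one set in
    # the CI web client), then fall back to a dict loaded from an untracked
    # local file such as sensitive.py.
    environ = os.environ if environ is None else environ
    fallback = fallback or {}
    if name in environ:
        return environ[name]
    try:
        return fallback[name]
    except KeyError:
        raise MissingSecretError("secret %r is not configured" % name)
```

Raising a dedicated error instead of returning `None` makes a missing credential fail loudly at startup rather than later in a request.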
50 changes: 24 additions & 26 deletions .travis.yml
@@ -3,43 +3,41 @@ sudo: required
dist: trusty
language: node_js
node_js:
- "7.1.0"
env:
- 7.1.0
env:
matrix:
- NODE_ENV=production

branches:
only:
- master
- staging

- master
- staging
cache:
apt: true
directories:
- node_modules
- $HOME/.npm
- $HOME/.cache/pip

- node_modules
- $HOME/.npm
- $HOME/.cache/pip
install:
- npm -g install webpack
- npm -g install chromedriver
- npm -g install jest babel-jest
- npm install
- pip install --user -r requirements.txt
- npm -g install webpack
- npm -g install chromedriver
- npm -g install jest babel-jest
- npm install
- pip install --user -r requirements.txt
addons:
apt:
sources:
- google-chrome
- google-chrome
packages:
- google-chrome-stable
- google-chrome-stable
hosts:
- jhu.sem.ly
- jhu.sem.ly
before_script:
- "export DISPLAY=:99.0"
- "sh -e /etc/init.d/xvfb start"
- sleep 3 # give xvfb some time to start
- export DISPLAY=:99.0
- sh -e /etc/init.d/xvfb start
- sleep 3
script:
- npm run build
- npm run lint
- npm run test
- python manage.py test
- make html -C docs
- npm run build
- npm run lint
- npm run test
- python manage.py test
- make html -C docs
25 changes: 11 additions & 14 deletions agreement/__init__.py
@@ -1,14 +1,11 @@
"""
Copyright (C) 2017 Semester.ly Technologies, LLC
Semester.ly is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Semester.ly is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
"""

# Copyright (C) 2017 Semester.ly Technologies, LLC
#
# Semester.ly is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Semester.ly is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
24 changes: 11 additions & 13 deletions agreement/admin.py
@@ -1,16 +1,14 @@
"""
Copyright (C) 2017 Semester.ly Technologies, LLC
Semester.ly is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Semester.ly is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
"""
# Copyright (C) 2017 Semester.ly Technologies, LLC
#
# Semester.ly is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Semester.ly is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

from django.contrib import admin

24 changes: 11 additions & 13 deletions agreement/apps.py
@@ -1,16 +1,14 @@
"""
Copyright (C) 2017 Semester.ly Technologies, LLC
Semester.ly is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Semester.ly is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
"""
# Copyright (C) 2017 Semester.ly Technologies, LLC
#
# Semester.ly is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Semester.ly is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

from __future__ import unicode_literals

19 changes: 19 additions & 0 deletions agreement/migrations/0003_auto_20170707_0757.py
@@ -0,0 +1,19 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.9.2 on 2017-07-07 12:57
from __future__ import unicode_literals

from django.db import migrations


class Migration(migrations.Migration):

dependencies = [
('agreement', '0002_auto_20170520_1927'),
]

operations = [
migrations.AlterModelOptions(
name='agreement',
options={'get_latest_by': 'last_updated'},
),
]
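
In Django, the `get_latest_by` model option names the field that `latest()` and `earliest()` use by default. The snippet below is a Django-free sketch of that behaviour, not the project's model: the `Agreement` tuple, its `version` field, and the `latest` function are hypothetical; only the `last_updated` field name comes from the migration above.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for the model; only last_updated appears in the diff.
Agreement = namedtuple("Agreement", ["version", "last_updated"])


def latest(rows, get_latest_by="last_updated"):
    # Mirrors the default ordering field: pick the row with the
    # greatest value of the configured field.
    return max(rows, key=lambda row: getattr(row, get_latest_by))


rows = [
    Agreement("a", datetime(2017, 5, 20)),
    Agreement("b", datetime(2017, 7, 7)),
    Agreement("c", datetime(2017, 6, 15)),
]
newest = latest(rows)
```

Here `newest` is the row dated 2017-07-07, the one with the greatest `last_updated`.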
16 changes: 16 additions & 0 deletions agreement/migrations/0004_merge.py
@@ -0,0 +1,16 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.9.2 on 2017-07-22 18:04
from __future__ import unicode_literals

from django.db import migrations


class Migration(migrations.Migration):

dependencies = [
('agreement', '0003_auto_20170615_1828'),
('agreement', '0003_auto_20170707_0757'),
]

operations = [
]
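
A merge migration with empty `operations` exists only to join two branches of the migration graph back into a single leaf. A small sketch of that idea in plain Python follows; `leaf_migrations` is a hypothetical helper, and the parent of `0003_auto_20170615_1828` (and the empty dependency list for `0002`) are assumptions, since the diff only shows the parent of `0003_auto_20170707_0757`.

```python
def leaf_migrations(dependencies):
    # dependencies maps each migration name to the names it depends on.
    depended_on = {dep for deps in dependencies.values() for dep in deps}
    return sorted(name for name in dependencies if name not in depended_on)


graph = {
    # Earlier history omitted for brevity (assumption).
    "0002_auto_20170520_1927": [],
    # Assumed parent; not shown in this diff.
    "0003_auto_20170615_1828": ["0002_auto_20170520_1927"],
    "0003_auto_20170707_0757": ["0002_auto_20170520_1927"],
}
before = leaf_migrations(graph)  # two leaves: the branches conflict

graph["0004_merge"] = ["0003_auto_20170615_1828", "0003_auto_20170707_0757"]
after = leaf_migrations(graph)  # one leaf: the branches are joined
```

With two leaves in one app, Django refuses to migrate until a merge migration like `0004_merge` depends on both.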
24 changes: 11 additions & 13 deletions agreement/models.py
@@ -1,16 +1,14 @@
"""
Copyright (C) 2017 Semester.ly Technologies, LLC
Semester.ly is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Semester.ly is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
"""
# Copyright (C) 2017 Semester.ly Technologies, LLC
#
# Semester.ly is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Semester.ly is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

from __future__ import unicode_literals

24 changes: 11 additions & 13 deletions agreement/tests.py
@@ -1,16 +1,14 @@
"""
Copyright (C) 2017 Semester.ly Technologies, LLC
Semester.ly is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Semester.ly is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
"""
# Copyright (C) 2017 Semester.ly Technologies, LLC
#
# Semester.ly is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Semester.ly is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

from django.test import TestCase

24 changes: 11 additions & 13 deletions agreement/urls.py
@@ -1,16 +1,14 @@
"""
Copyright (C) 2017 Semester.ly Technologies, LLC
Semester.ly is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Semester.ly is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
"""
# Copyright (C) 2017 Semester.ly Technologies, LLC
#
# Semester.ly is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Semester.ly is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

from django.conf.urls import patterns, url
from django.contrib import admin
25 changes: 11 additions & 14 deletions agreement/views.py
@@ -1,14 +1,11 @@
"""
Copyright (C) 2017 Semester.ly Technologies, LLC
Semester.ly is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Semester.ly is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
"""

# Copyright (C) 2017 Semester.ly Technologies, LLC
#
# Semester.ly is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Semester.ly is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.