[πŸ† Winner, Azure Prize at Stanford TreeHacks] readAR -- 🌲 TreeHacks 2020 Swift App and ML/NLP backend

shahjaidev/readAR_Treehacks_2020

Stanford TreeHacks 2020

Built with @jackyzha0, @SaiG18 and @SophieMBerger

readAR - backend

DevPost

This repository contains the implementation of readAR's backend: a BERT model that performs word sense disambiguation, served through a Flask API, as well as our image processing pipeline (/API-azure-pipeline).

Word sense disambiguation (WSD) is the problem of determining which "sense" (meaning) of a word is intended in the context of a sentence. This model is fine-tuned on the SemEval-2007 dataset and achieves a 76.6% F1 score on the test dataset (semcor.xml). This is comparable to the current SOTA, which achieves 81.2% F1 on the same dataset. The work here builds upon this paper:

Wiedemann, G., Remus, S., Chawla, A., Biemann, C. (2019): Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings. Proceedings of KONVENS 2019, Erlangen, Germany.
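As a rough illustration of the idea in the cited paper, which classifies a word's sense by nearest neighbours over contextualized embeddings, here is a minimal sketch. The 3-dimensional vectors, the sense labels and the function names are made up for illustration; they are not taken from this repository, where real embeddings would come from BERT.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_sense(query, labelled, k=3):
    """Return the majority sense among the k labelled vectors closest to query."""
    ranked = sorted(labelled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: stand-ins for embeddings of "bank" in labelled sentences.
TRAINING = [
    ([0.9, 0.1, 0.0], "bank%river"),
    ([0.8, 0.2, 0.1], "bank%river"),
    ([0.1, 0.9, 0.2], "bank%finance"),
    ([0.0, 0.8, 0.3], "bank%finance"),
]

# Stand-in for the embedding of "bank" in a new sentence.
print(knn_sense([0.85, 0.15, 0.05], TRAINING))  # prints "bank%river"
```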

Tech Stack

Running the example

The entire working model can be found as a Docker image. You can download it and run it as follows:

  1. docker pull jzhao2k19/bert-wsd:latest
  2. docker run -p 5000:5000 jzhao2k19/bert-wsd:latest

How to train a new model

Run chmod +x train.sh to make the script executable, then run ./train.sh to begin training. Retraining the embeddings takes approximately 2 hours and generates a new set of weights in a file called BERT_semcor.pickle.

Serving the model (without Docker)

Run python server.py to start a Flask server on localhost:5000. Send a POST request to /api/wsd with form data containing a sentence and a target word. First spin-up takes ~10s and subsequent requests take around 500ms. The server needs a trained model, so train one beforehand. To run the service without training, see Running the example, which uses Docker.
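For calling the local server from Python rather than curl, a minimal sketch using only the standard library could look like the following. The field names sentence and word and the "def" response key come from the examples in this README. The helper names are hypothetical, and sending URL-encoded form data (instead of the multipart body that curl --form produces) assumes the server reads fields through Flask's request.form.

```python
import json
import urllib.parse
import urllib.request

WSD_URL = "http://localhost:5000/api/wsd"  # local server started by server.py


def build_wsd_request(sentence, word, url=WSD_URL):
    """Build a POST request carrying the sentence and target word as form data."""
    fields = {"sentence": sentence, "word": word}
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def disambiguate(sentence, word, url=WSD_URL, timeout=30):
    """Send the request and return the definition from the JSON response."""
    request = build_wsd_request(sentence, word, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["def"]


# Example call (needs a running server):
# disambiguate("I stand on the river bank", "bank")
```

The timeout is deliberately generous because the first request after start-up is slow.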

Service URLs

(may not work after the hackathon)

  • WSD-model: 140.238.147.73:5000/api/wsd
  • img-pipeline: 140.238.147.73:8080/api
  • quiz-generation: 140.238.147.73:8081/api?q=some+query
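The quiz-generation endpoint takes its query in the q parameter, so free text has to be URL-encoded before it is appended. A small sketch of building that URL is shown here; the helper name quiz_url and the http:// scheme are assumptions, and the response format is not documented here, so the sketch stops at constructing the URL.

```python
from urllib.parse import urlencode

HOST = "140.238.147.73"  # hackathon host listed above; may no longer be reachable


def quiz_url(query, host=HOST, port=8081):
    """Return the quiz-generation URL with the query safely encoded."""
    return f"http://{host}:{port}/api?{urlencode({'q': query})}"


print(quiz_url("some query"))  # prints http://140.238.147.73:8081/api?q=some+query
```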

Word Sense Disambiguation (WSD)

Examples

1 -- physics definition of work

curl --location --request POST '140.238.147.73:5000/api/wsd' \
--form 'sentence=How much work is done to lift a 3kg object 2 meters' \
--form 'word=work'
{
    "def": "(physics) a manifestation of energy; the transfer of energy from one physical system to another expressed as the product of a force and the distance through which it moves a body in the direction of that force"
}

1 -- an occupational definition of work

curl --location --request POST '140.238.147.73:5000/api/wsd' \
--form 'sentence=What do you do for work?' \
--form 'word=work'
{
    "def": "the occupation for which you are paid"
}

2 -- a river bank

curl --location --request POST '140.238.147.73:5000/api/wsd' \
--form 'sentence=I stand on the river bank' \
--form 'word=bank'
{
    "def": "sloping land (especially the slope beside a body of water)"
}

2 -- a financial institution

curl --location --request POST '140.238.147.73:5000/api/wsd' \
--form 'sentence=I need to deposit money at the bank tomorrow' \
--form 'word=bank'
{
    "def": "a financial institution that accepts deposits and channels the money into lending activities"
}
