We try to detect sentence boundaries using deep learning. Created as part of the "Practical Applications of Multimedia Retrieval" seminar at the Hasso-Plattner-Institute, Potsdam, Germany.
We build a Python-based demo using Caffe.
##### Prerequisites
- Clone this repository
- Install Python 2.7 and the packages listed in `requirements.txt`:

  ```sh
  pip install -r requirements.txt
  ```
- Use the NLTK downloader to download the `averaged_perceptron_tagger` and `punkt` models:

  ```sh
  python -m nltk.downloader averaged_perceptron_tagger punkt
  ```
- Set up Caffe, as described here.
- Add the repository's `python` directory to your `PYTHONPATH` (a quick import check is shown after this list):

  ```sh
  export PYTHONPATH=/path/to/sentence-boundary-detection-nn/python:$PYTHONPATH
  ```
- Download the Google word vectors (`GoogleNews-vectors-negative300.bin.gz`) from here, or use this URL directly, and extract the result into the `sentence-boundary-detection-nn/python/demo_data` directory (an example extraction command is shown after this list).
- Paste your trained models into a demo data folder, for example `sentence-boundary-detection-nn/python/demo_data`, with the following structure (an example layout is sketched after this list):
  - `lexical_models`: contains all pretrained models you want to use, each in a separate directory. Each model needs a
    - `.ini`
    - `.caffemodel`
    - `net.prototxt` file.
  - `text_data`: contains all text files that can be used as prediction input.
  - `audio_models`: contains all pretrained audio models, each in a separate directory. Each model needs the same files as described for the lexical models.
  - `audio_examples`: contains all audio files that should be available during the demo, each in a separate directory containing the ctm, energy, and pitch files.
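Once Caffe is set up and the `PYTHONPATH` is extended, a quick way to check the environment is to run the imports directly; this is a minimal sketch, assuming Caffe's Python bindings (pycaffe) were built:

```sh
# Both imports should succeed without errors if caffe and nltk are installed
# and visible on the PYTHONPATH.
python -c "import caffe, nltk; print('environment ok')"
```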
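The word vector archive is a plain gzip file; a minimal extraction sketch, assuming the archive was downloaded into the `demo_data` directory:

```sh
cd sentence-boundary-detection-nn/python/demo_data
# Decompresses the archive in place, yielding GoogleNews-vectors-negative300.bin
# (several gigabytes uncompressed).
gunzip GoogleNews-vectors-negative300.bin.gz
```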
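For orientation, a `demo_data` folder following the structure above could look like this; all directory and file names below the top level are hypothetical examples:

```
demo_data/
├── GoogleNews-vectors-negative300.bin
├── lexical_models/
│   └── example_model/
│       ├── example.ini
│       ├── example.caffemodel
│       └── net.prototxt
├── text_data/
│   └── example_input.txt
├── audio_models/
│   └── example_audio_model/
│       ├── example.ini
│       ├── example.caffemodel
│       └── net.prototxt
└── audio_examples/
    └── example_talk/
        ├── example_talk.ctm
        ├── example_talk.energy
        └── example_talk.pitch
```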
##### Start up
Change into the repository directory and execute the following command. This should work right out of the box, unless you are using a custom `demo_data` folder:

```sh
python web_demo/web.py
```
Optionally, you can specify the locations of the word vectors and the demo data; otherwise, default values are used. For further information, execute:

```sh
python web_demo/web.py -h
```