NLP - RNN for Name Origins Prediction

Overview

This repo analyzes the data set found at 'https://download.pytorch.org/tutorial/data.zip' and uses a simple machine-learning NLP method to predict the origins of various names. The model is built with the Keras API and consists of a character-level embedding layer followed by a bidirectional LSTM layer and two Dense layers.
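As a rough illustration of the architecture described above, here is a minimal sketch in Keras. It is not the code from nlp_model.py: the function name `build_model`, the layer sizes, the optimizer, the loss, and the example values for `vocab_size`, `max_len` and `n_classes` are all assumptions chosen for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(vocab_size, max_len, n_classes, embed_dim=32, lstm_units=64):
    """Character-level embedding -> bidirectional LSTM -> two Dense layers.

    All sizes here are illustrative assumptions, not the repo's settings.
    """
    model = keras.Sequential([
        # Each name arrives as a fixed-length sequence of character indices.
        keras.Input(shape=(max_len,), dtype="int32"),
        # Index 0 is assumed to be padding, hence vocab_size + 1 and mask_zero.
        layers.Embedding(vocab_size + 1, embed_dim, mask_zero=True),
        # Read the characters left-to-right and right-to-left.
        layers.Bidirectional(layers.LSTM(lstm_units)),
        layers.Dense(64, activation="relu"),
        # One probability per name origin.
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer-encoded labels
        metrics=["accuracy"],
    )
    return model


# Hypothetical values: 60 distinct characters, names padded to 20, 18 origins.
model = build_model(vocab_size=60, max_len=20, n_classes=18)
model.summary()
```

At prediction time, the index of the largest softmax output would be mapped back to an origin name through a lookup dictionary like the one data_prep.py is described as creating.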

File Description

The files included cover the data download, the data processing, the model building and the prediction against random data, broken down as follows:

1. data_org.py:       Sets the file paths and calls download.py.
2. download.py:       Checks whether the raw data or zip file exists. If required, it downloads and unzips the data and stores it locally. Finally, it deletes the zip file.
3. encode_func.py:    The standard sklearn encoding function.
4. data_prep.py:      Preps the data for the model and creates a dictionary for defining the model's output.
5. data_print.py:     Simple output showing the key data processing steps performed.
6. nlp_model.py:      Builds the NLP model, trains it on the processed data and stores the model.
7. call.py:           Loads the stored model, then uses it to predict the origin of a random family name.
8. .gitignore:        Files not shared.
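
The download check (step 2) and the data preparation (step 4) can be sketched roughly as follows. This uses only the Python standard library and is not the repo's code: the helper names `ensure_data`, `build_vocab`, `encode_name` and `label_lookup` are hypothetical, and the plain-Python index encoding stands in for the sklearn encoder mentioned in step 3. It also assumes the archive unpacks into a top-level data/ folder.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

DATA_URL = "https://download.pytorch.org/tutorial/data.zip"


def ensure_data(data_dir, zip_path, url=DATA_URL):
    """Download and unzip the data only if it is not already on disk."""
    data_dir, zip_path = Path(data_dir), Path(zip_path)
    if data_dir.exists():
        return data_dir                       # raw data already present
    if not zip_path.exists():
        urlretrieve(url, zip_path)            # fetch the archive
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir.parent)   # assumed to create data_dir
    zip_path.unlink()                         # delete the zip afterwards
    return data_dir


def build_vocab(names):
    """Map each character to an index >= 1; index 0 is kept for padding."""
    chars = sorted({ch for name in names for ch in name})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode_name(name, char_to_idx, max_len):
    """Turn a name into a fixed-length list of character indices."""
    # Characters missing from the vocabulary fall back to 0, like padding.
    ids = [char_to_idx.get(ch, 0) for ch in name[:max_len]]
    return ids + [0] * (max_len - len(ids))


def label_lookup(origins):
    """Dictionary from class index to origin name, for decoding predictions."""
    return {i: origin for i, origin in enumerate(sorted(set(origins)))}
```

For example, `ensure_data("data", "data.zip")` would leave an existing data/ folder untouched and only download when both the folder and the zip file are missing.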

AdamLevitt/NLP_Origins_Repo
