This project takes an input file and returns two output files.
The first output file is phonetically identical to the input, but every word is misspelled.
The second output file is the input with every word that has a phonetically identical twin replaced by that twin.
If it's the first time you're running it, run:
./telephone.py input_file init
to make sure the proper files are installed. Any other execution of the script after the initial run can be formatted as:
./telephone.py input_file
Output will be input_file.bad for the misspelled file and input_file.kinda_bad for the similar words file.
Four modules alongside telephone.py make up this project: mod.py, misspeller.py, unmisspeller.py, and dependencies.py.
- dependencies.py is similar to a requirements.txt; however, all of the installation is done automatically.
- mod.py takes a function and maps it to every token. It then writes an output file with no more than 80 characters per line.
- misspell.py takes a word and returns its phonemes combined into one string.
- unmisspell.py reverses the CMU Dictionary to take a combined list of phonemes and return a word instead.
- unmisspell.py contains a Misspell object, which converts a word to phonemes and then converts those phonemes back to a word. The original word is returned if no other word has identical phonemes.
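The phoneme round trip described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: it uses a tiny hand-written table (TOY_PRONUNCIATIONS) in place of the CMU Dictionary, and the function names phoneme_key, build_reverse_index, and phonetic_twin are hypothetical.

```python
import re
from collections import defaultdict

# Toy stand-in for the CMU Pronouncing Dictionary (assumption: the real
# project loads the full dictionary). Phonemes carry stress digits, as
# they do in CMU Dict entries.
TOY_PRONUNCIATIONS = {
    "there": ["DH", "EH1", "R"],
    "their": ["DH", "EH1", "R"],
    "sea": ["S", "IY1"],
    "see": ["S", "IY1"],
    "cat": ["K", "AE1", "T"],
}


def phoneme_key(word):
    """Return the word's phonemes joined into one string, or None if unknown."""
    phonemes = TOY_PRONUNCIATIONS.get(word.lower())
    if phonemes is None:
        return None
    # Strip stress digits (for example "EH1" -> "EH") with a regular expression.
    return " ".join(re.sub(r"\d", "", p) for p in phonemes)


def build_reverse_index():
    """Map each phoneme string back to every word pronounced that way."""
    index = defaultdict(list)
    for word in TOY_PRONUNCIATIONS:
        index[phoneme_key(word)].append(word)
    return index


def phonetic_twin(word, index):
    """Return a different word with identical phonemes, else the original word."""
    key = phoneme_key(word)
    if key is None:
        return word
    for candidate in index[key]:
        if candidate != word.lower():
            return candidate
    return word
```

With this toy table, phonetic_twin("there", build_reverse_index()) gives "their", while a word with no twin (such as "cat") or an unknown word comes back unchanged.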
- Does what it was intended to do. See above.
- The project uses regular expressions to clean some of the phoneme data and word processing data.
- Word tokenization is incorporated to break down text.
- NLTK packages are used.
- If the project were extended, it would use word vectors together with the Misspeller and UnMisspeller to create puns.
- Everything is modular to allow for parts to be added or removed.
- An object-oriented approach keeps the code clean.
- dependency.py ensures the code will run on any machine with Python 3.
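The token mapping and 80-character line limit mentioned earlier could look something like the sketch below. It is an assumption-laden illustration rather than the contents of mod.py: the names map_tokens and wrap_lines are hypothetical, and a simple regular expression stands in for whatever tokenizer the project actually uses.

```python
import re
import textwrap


def map_tokens(text, func):
    """Apply func to every word token, leaving punctuation and spacing alone."""
    # Assumption: a word is a run of letters and apostrophes. The project
    # itself may tokenize differently (for example with NLTK).
    return re.sub(r"[A-Za-z']+", lambda match: func(match.group(0)), text)


def wrap_lines(text, width=80):
    """Re-flow text so that no output line is longer than width characters."""
    return textwrap.fill(text, width=width)
```

Any word-level function can be passed in, so map_tokens("Hi, there!", str.upper) yields "HI, THERE!", and the result can then be handed to wrap_lines before it is written to the output file.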
The effort spent on this work went to the following tasks:
- Understanding CMU Dict
- Figuring out the word processor
- Structuring the Misspeller, Word_Modifier, and UnMisspeller components to fit together.
- Creating the UnMisspeller
- Debugging