CMUDict encoded in IPA
The file is tab-separated and can be found in cmudict.ipa
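
A minimal sketch of reading the file in Python. It assumes each line holds a word, a tab, and the IPA pronunciation, and that the file is UTF-8 encoded; neither is confirmed here. The names parse_line and load_entries are hypothetical. Because a word can appear on several lines, each word maps to a list.

```python
from collections import defaultdict


def parse_line(line):
    """Split one line into (word, ipa); return None if there is no tab."""
    word, sep, ipa = line.rstrip("\r\n").partition("\t")
    if not sep:
        return None
    return word, ipa


def load_entries(lines):
    """Group pronunciations by word from an iterable of lines."""
    entries = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip blank or malformed lines
        word, ipa = parsed
        entries[word].append(ipa)
    return dict(entries)


# Usage (assumed column layout and encoding):
# with open("cmudict.ipa", encoding="utf-8") as f:
#     entries = load_entries(f)
```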
Notes:
- Parentheses deleted for words with multiple pronunciations
- Emphasis (stress marking) deleted
- Split into 10% dev, 10% test, and 80% train data sets in datasets/
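
The notes above can be sketched roughly as follows. This is not the script that produced the files. It assumes the upstream CMUDict marks alternate pronunciations with a suffix such as READ(1) and stress with a trailing digit 0, 1, or 2 on vowels. The shuffle, the seed, and all function names are hypothetical.

```python
import random
import re

# Assumed upstream variant marker, e.g. "READ(1)".
_VARIANT = re.compile(r"\(\d+\)$")


def strip_variant(word):
    """Drop a trailing "(n)" marker from a word."""
    return _VARIANT.sub("", word)


def strip_stress(phones):
    """Drop trailing stress digits from ARPAbet symbols."""
    return [p.rstrip("012") for p in phones]


def split_entries(entries, dev_pct=10, test_pct=10, seed=0):
    """Shuffle with a fixed seed and split into dev, test, and train."""
    items = list(entries)
    random.Random(seed).shuffle(items)
    n_dev = len(items) * dev_pct // 100
    n_test = len(items) * test_pct // 100
    return {
        "dev": items[:n_dev],
        "test": items[n_dev:n_dev + n_test],
        "train": items[n_dev + n_test:],  # the remaining 80% by default
    }
```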
The mappings can be modified in arpa-ipa.map. They are taken from Wikipedia: https://en.wikipedia.org/wiki/Arpabet
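
A minimal sketch of applying such a mapping. The layout of arpa-ipa.map is not described here, so this assumes one whitespace-separated pair per line (ARPAbet symbol, then IPA). The function names and the sep parameter are hypothetical. The four sample pairs follow the Wikipedia table.

```python
def parse_map(lines):
    """Build an ARPAbet-to-IPA dict from "SYMBOL IPA" lines (assumed layout)."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        mapping[parts[0]] = parts[1]
    return mapping


def to_ipa(phones, mapping, sep=""):
    """Convert stress-free ARPAbet symbols to an IPA string.

    Raises KeyError if a symbol is missing from the mapping.
    """
    return sep.join(mapping[p] for p in phones)


# Usage (assumed file layout and encoding):
# with open("arpa-ipa.map", encoding="utf-8") as f:
#     mapping = parse_map(f)
sample = parse_map(["HH h", "AH ʌ", "L l", "OW oʊ"])
print(to_ipa(["HH", "AH", "L", "OW"], sample))
```

Editing a line in the map and rerunning the conversion would change every pronunciation that uses that symbol.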