Skip to content

Simple script to find homophones given a list of Japanese words

Notifications You must be signed in to change notification settings

catemlitten/homophone_finder

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Homophone Finder

This is a simple script with takes in a raw text file of Japanese words and finds homophones.

A sample list was included with this repo, a simple frequency list of 1000 words taken from Offbeat Band here. You can use any list you like, but the format will need to match the one provided, ie:

さく
貰う
真実
ゲーム
団

If it is not you will need to do some data cleaning beforehand.

Notes

In order to reduce ambiguity, words comprised only of kana in the list are removed by the script using a simple character check. Based on the listing here the bulk of kanji in everyday use fall between 4e00 and 9faf.

About

Simple script to find homophones given a list of Japanese words

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages