Make user phrase scores fairer via rewriting #119

Merged
zonble merged 3 commits into master from dev/fairer-user-phrase-scores on Feb 20, 2024

Conversation

lukhnos
Collaborator

@lukhnos lukhnos commented Feb 18, 2024

This fixes #118. To avoid single-syllable user unigrams dominating the grid walk when there are competing multi-syllable unigrams, we assign a fairer score to such user unigrams instead of the default value of 0.
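
A minimal sketch of one way such a score rewrite could look, assuming scores are log-probabilities (so 0 is the highest possible value) and that syllables in a key are joined by a `-` separator. The `Unigram` struct, `mergeUnigrams`, `kSeparator`, and the choice of boosting just above the top built-in score are illustrative assumptions, not necessarily the code merged in this PR.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the language model's unigram type.
struct Unigram {
  std::string value;
  double score;
};

// Assumed syllable separator inside a reading key.
constexpr char kSeparator = '-';

// Puts user unigrams in front of the built-in ones. For single-syllable keys,
// user scores are rewritten to sit just above the best built-in score instead
// of keeping the default of 0.
std::vector<Unigram> mergeUnigrams(const std::string& key,
                                   const std::vector<Unigram>& userUnigrams,
                                   std::vector<Unigram> allUnigrams) {
  bool isKeyMultiSyllable = key.find(kSeparator) != std::string::npos;
  if (isKeyMultiSyllable || allUnigrams.empty()) {
    // Nothing to be fair against: keep the user scores unchanged.
    allUnigrams.insert(allUnigrams.begin(), userUnigrams.begin(),
                       userUnigrams.end());
    return allUnigrams;
  }
  if (userUnigrams.empty()) {
    return allUnigrams;
  }

  // Highest score among the built-in unigrams for this key.
  double topScore =
      std::max_element(allUnigrams.begin(), allUnigrams.end(),
                       [](const Unigram& a, const Unigram& b) {
                         return a.score < b.score;
                       })
          ->score;

  // Assumed boost: small enough to win for this key only.
  constexpr double kEpsilon = 1e-9;
  std::vector<Unigram> rewritten;
  for (const auto& unigram : userUnigrams) {
    rewritten.push_back({unigram.value, topScore + kEpsilon});
  }
  allUnigrams.insert(allUnigrams.begin(), rewritten.begin(), rewritten.end());
  return allUnigrams;
}
```

With this shape, a single-syllable user unigram still outranks the built-in candidates for the same key, but it no longer carries a score of 0 that would outweigh every competing multi-syllable path in the walk.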

@lukhnos lukhnos force-pushed the dev/fairer-user-phrase-scores branch from 6a1e9ad to 3f9392c February 19, 2024 00:16
@lukhnos lukhnos requested a review from zonble February 19, 2024 00:16
allUnigrams.insert(allUnigrams.begin(), userUnigrams.begin(), userUnigrams.end());
// This relies on the fact that we always use the default separator.
bool isKeyMultiSyllable = key.find(Formosa::Gramambular2::ReadingGrid::kDefaultSeparator) != std::string::npos;
if (isKeyMultiSyllable || allUnigrams.empty()) {
Collaborator

I think we need a comment explaining why multi-syllable phrases should always be placed at the front without rescoring. I could not understand the purpose from reading the code.

Collaborator Author

Clarified the motivation in 4d9db9c. PTAL.

Member

@lukhnos @zonble +CC @mjhsieh Just an FYI: a convention is to set any new record's score to -99 instead of 0. (I think I mentioned that in passing a long time ago, but it wasn't that critical at the time anyway.)

Member

Or, even better, introduce a new feature that supports -inf in both the file and the code.
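
As a rough illustration of what accepting -inf could involve on the code side, here is a sketch of a score-field parser. The helper `parseScore`, the constant `kDefaultNewRecordScore`, and the fallback behavior for empty or malformed fields are hypothetical and not part of this PR; the sketch only relies on `std::strtod` accepting `inf` spellings.

```cpp
#include <cmath>
#include <cstdlib>
#include <string>

// Hypothetical fallback, following the -99 convention mentioned above.
constexpr double kDefaultNewRecordScore = -99.0;

// Hypothetical helper: parses a score field and accepts "-inf".
// Empty or malformed fields fall back to the default score.
double parseScore(const std::string& field) {
  if (field.empty()) {
    return kDefaultNewRecordScore;
  }
  char* end = nullptr;
  double score = std::strtod(field.c_str(), &end);
  // Reject fields where nothing was parsed or trailing characters remain.
  if (end == field.c_str() || *end != '\0') {
    return kDefaultNewRecordScore;
  }
  // A NaN score cannot be ordered against others, so treat it as malformed.
  if (std::isnan(score)) {
    return kDefaultNewRecordScore;
  }
  return score;
}
```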

@zonble
Collaborator

zonble commented Feb 20, 2024

Looks good :)

@zonble zonble merged commit 66749e8 into master Feb 20, 2024
6 checks passed
@lukhnos lukhnos deleted the dev/fairer-user-phrase-scores branch February 20, 2024 18:40
lukhnos added a commit to openvanilla/McBopomofo that referenced this pull request Feb 20, 2024
Development

Successfully merging this pull request may close these issues.

Phrase suggestions issue with user-defined pronunciations
3 participants