Change prefix logic to check for following vowel #50

Merged
merged 2 commits into from
Apr 22, 2024

2 changes: 1 addition & 1 deletion README.md
@@ -157,7 +157,7 @@ flowchart TD
A --> J
```

-*Note (1)*: Prefixes considered are "ab", "ob", "ad", "per", "sub", "in", and "con".
+*Note (1)*: Prefixes considered are "ab", "ob", "ad", "per", "sub", "in", "con", and "co". A prefix is split off only when it is followed by a vowel; otherwise, the general rules of consonant placement apply to the prefix just as they do to the rest of the word. For example, the word "perviam" is syllabified "per-vi-am": dividing "rv" across two syllables follows the general rule (the first consonant joins the preceding syllable and the second joins the following syllable). The word "periurem", however, is syllabified "per-iu-rem". Here the general rule would attach the "r" to the following syllable, but because "per" is a prefix followed by a vowel, the "r" stays in the first syllable.

*Note (2)*: Written "i"s and "y"s may be semivowels and written "u"s may be semi-vowels or consonants.
"I"s are semivowels:
3 changes: 2 additions & 1 deletion tests/word_syllabification_tests.csv
@@ -90,4 +90,5 @@ adincresco,ad-in-cre-sco,
compressans,com-pres-sans,
principem,prin-ci-pem,
redemptor,re-demp-tor,
-imperator,im-pe-ra-tor
+imperator,im-pe-ra-tor
+coegerunt,co-e-ge-runt
11 changes: 7 additions & 4 deletions volpiano_display_utilities/latin_word_syllabification.py
@@ -57,7 +57,7 @@

# Prefix groups are groups of characters that serve as common prefixes. For details,
# see README.
-_PREFIX_GROUPS: set = {"ab", "ob", "ad", "per", "sub", "in", "con"}
+_PREFIX_GROUPS: set = {"ab", "ob", "ad", "per", "sub", "in", "con", "co"}

_VOWELS: set = {"a", "e", "i", "o", "u", "y"}
_VOWELS_AEOU: set = {"a", "e", "o", "u"}
@@ -96,7 +96,8 @@ def split_word_by_syl_bounds(word: str, syl_bounds: List[int]) -> List[str]:

 def _get_prefixes(word: str) -> str:
     """
-    Returns the prefix of a word, if it has one.
+    Returns the prefix of a word, if it has one that is followed by a vowel.
+    For details on prefixes, see README.

     word [str]: word to check for prefix

@@ -105,9 +106,11 @@ def _get_prefixes(word: str) -> str:
     """
     for prefix in _PREFIX_GROUPS:
         # If the word is itself one of the prefixes (eg. "in" can
-        # be a word or a prefix), doen't return a prefix
+        # be a word or a prefix), don't return a prefix
         if word.startswith(prefix) and (word != prefix):
-            return prefix
+            prefix_length = len(prefix)
+            if word[prefix_length] in _VOWELS:
+                return prefix
     return ""
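
To try the new prefix check outside the package, here is a minimal standalone sketch. It copies the prefix and vowel sets from the diff above. The public names `PREFIX_GROUPS`, `VOWELS`, and `get_prefix` are illustrative stand-ins for the private ones and are not part of the library's API.

```python
from typing import Set

# Values copied from the diff; the names are hypothetical stand-ins
# for the module's private _PREFIX_GROUPS and _VOWELS.
PREFIX_GROUPS: Set[str] = {"ab", "ob", "ad", "per", "sub", "in", "con", "co"}
VOWELS: Set[str] = {"a", "e", "i", "o", "u", "y"}


def get_prefix(word: str) -> str:
    """Return the prefix of `word` if it is followed by a vowel, else ""."""
    for prefix in PREFIX_GROUPS:
        # A word that is itself a prefix (e.g. "in") has nothing to split off.
        if word.startswith(prefix) and word != prefix:
            # Safe index: the guard above means the word is longer than the prefix.
            if word[len(prefix)] in VOWELS:
                return prefix
    return ""


if __name__ == "__main__":
    for w in ("periurem", "perviam", "coegerunt", "in"):
        print(w, repr(get_prefix(w)))
```

With this check, "periurem" and "coegerunt" yield the prefixes "per" and "co", while "perviam" and the bare word "in" yield no prefix, which matches the README note and the new test row.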

