Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle sometime consonant group "nc" #43

Merged
merged 9 commits into from
Feb 22, 2024
Merged

Conversation

dchiller
Copy link
Collaborator

@dchiller dchiller commented Feb 13, 2024

Closes #42.

It turns out that "nc" is only sometimes a consonant group. It is treated as a group in sequences of three or more consonants ("sanctus" -> "sanc-tus") but not when on its own ("principem" -> "prin-ci-pem")

@dchiller dchiller changed the title Fix syllabification of "principem" Handle sometime consonant group "nc" Feb 13, 2024
@dchiller dchiller marked this pull request as ready for review February 13, 2024 17:00
@annamorphism
Copy link

annamorphism commented Feb 13, 2024

this sounds right--I suppose basically it's nasalizing the vowel rather than acting as a true consonant, so sanctus=sãctus=sãc-tus but pri˜-ci-pem per the usual syllable division rules. Are there other groups that work this way (re-demp-tor but im-pe-ra-tor?), and are they already accounted for?

@jacobdgm
Copy link
Collaborator

checking this locally,

>>> from volpiano_display_utilities.cantus_text_syllabification import syllabify_text
>>> syllabify_text("redemptor")
[SyllabifiedSection: [['re-', 'demp-', 'tor']]]
>>> syllabify_text("imperator")
[SyllabifiedSection: [['i-', 'mpe-', 'ra-', 'tor']]]

"Imperator" doesn't conform with @annamorphism's syllabification. Suggesting you address this too before merging

@jacobdgm
Copy link
Collaborator

Is this a liquid-consonant–plus–consonant thing? Are there l<consonant> clusters that should be syllabified l-<consonant>, or similarly for r<consonant>?

@dchiller
Copy link
Collaborator Author

redemptor vs imperator is a good catch...thanks @annamorphism and @jacobdgm for checking.

@dchiller
Copy link
Collaborator Author

Is this a liquid-consonant–plus–consonant thing? Are there l<consonant> clusters that should be syllabified l-<consonant>, or similarly for r<consonant>?

There's nothing currently in the logic that works with liquid consonant + consonant...I'm also not immediately thinking of any examples. Do you have one?

@dchiller
Copy link
Collaborator Author

I'm also not immediately thinking of any examples.

Ah...so liquid consonant + h I have, but in this case we already don't treat the 'h' as a consonant.

@dchiller
Copy link
Collaborator Author

dchiller commented Feb 14, 2024

@annamorphism I'm just noticing one other previous decision I made that I don't know if I stand by...

In the following cases, do you prefer column A or column B or mixed?

Word A B
iusticie iu-sti-ci-e ius-ti-ci-e
fuisti fui-sti fuis-ti
xpistus xpi-stus xpis-tus
maiestatis ma-ie-sta-tis ma-ies-ta-tis

@annamorphism
Copy link

I just got out my old linguistics textbook (!) which gives a universal rule: syllables should, all else being equal, divide into open syllables and syllables with onsets (aka "onsets are greedy"). So column A, which creates open syllables followed by the "st" onset, is preferable to column B, which creates closed syllables instead. (Fuisti should also probably be three syllables, but that's a separate issue.)

@dchiller
Copy link
Collaborator Author

So column A, which creates open syllables followed by the "st" onset, is preferable to column B, which creates closed syllables instead.

Ok, thank you! That is what intuitively makes sense to me now...I'm not sure why I did it the column B way earlier. So will add to this PR a fix for that.

Fuisti should also probably be three syllables, but that's a separate issue.

The good news is that this was a typo here and that's already the way it is handled!

@dchiller dchiller requested a review from jacobdgm February 16, 2024 14:54
Copy link
Collaborator

@jacobdgm jacobdgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be an "approve" but for the lack of type annotations; I would, though, appreciate changing the couple of readability things I noted.

volpiano_display_utilities/latin_word_syllabification.py Outdated Show resolved Hide resolved
volpiano_display_utilities/latin_word_syllabification.py Outdated Show resolved Hide resolved
volpiano_display_utilities/latin_word_syllabification.py Outdated Show resolved Hide resolved
@dchiller dchiller requested a review from jacobdgm February 16, 2024 21:15
Copy link
Collaborator

@jacobdgm jacobdgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I only added a "type annotations?" comment on num_ltrs_btw_vow_grps, but I had hoped it would be understood to apply to the other variables being declared in this set of changes. num_consonants, syl_bound, and split_case could all usefully be annotated.

@dchiller
Copy link
Collaborator Author

Yes, I was confused by your request. I thought you meant it generally, but then couldn't find anywhere else that I would want to annotate (I added it to the definition of num_ltrs_btw_vow_grps because that's the one where the best case could be made I think).

I just added another one to num_consonants...I could again see one there because it is based off of a defined variable a few lines up, although my natural inclination is that this is uneccessary.

The other two cases you mention are clearly initialized as their appropriate type, so an annotation seems superfluous.

@jacobdgm
Copy link
Collaborator

The other two cases you mention are clearly initialized as their appropriate type, so an annotation seems superfluous.

Is the plan that Volpiano Display Utilities be fully mypy compliant? I think I had assumed that this was the plan from the beginning, in which case I don't think that any of the variables can be left unannotated.

Without reference to mypy, I think it's worth annotating all the variables, even the obvious ones, at least within a given function: If I see an unannotated variable in a context where the other variables are annotated, I'm likely to think, "oh, this isn't annotated, so the variable must have been initialized somewhere else", and then go looking for where it was defined. I can see that syl_bound is an integer at the moment it's defined, and I'm pretty confident that it wasn't initialized earlier by scrolling up and finding a docstring a few lines above, but I can't be confident that some string, say, hasn't been assigned to it somewhere later in the function.

@dchiller
Copy link
Collaborator Author

dchiller commented Feb 21, 2024

Is the plan that Volpiano Display Utilities be fully mypy compliant? I think I had assumed that this was the plan from the beginning, in which case I don't think that any of the variables can be left unannotated.

It is currently (and was without any of the variable annotations recently added starting with 70bc74e) mypy compliant.

I can't be confident that some string, say, hasn't been assigned to it somewhere later in the function.

I'm not sure I understand this. How would a type annotation ensure that a string hasn't been assigned to it later in the function? Or are you saying if there was a type annotation you would know that a string assignment later in the function (if such an assignment occurred) is wrong.

@dchiller
Copy link
Collaborator Author

dchiller commented Feb 21, 2024

I will go ahead and add them...I don't think it is "wrong" if they are there. This is, I think, just different point of personal preference in terms of code readability and clarity (I used mypy to catch lots of errors while actually writing this, but that just requires mypy compliance to be useful). For those purposes, if there is something unclear in the case of those two variables, a comment or better name would be more useful than a type hint. If it clarifies things for you, though, then it's all good.

Copy link
Collaborator

@jacobdgm jacobdgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or are you saying if there was a type annotation you would know that a string assignment later in the function (if such an assignment occurred) is wrong.

Yes, this.

It is currently (and was without any of the variable annotations recently added starting with 70bc74e) mypy compliant.

Okay, so this is probably my not fully understanding how mypy behaves. My understanding is that there are several levels of strictness; maybe having return annotations for all the functions is sufficient for a less-strict setting, whereas all variables would need to be annotated on a stricter setting.

I will go ahead and add them...I don't think it is "wrong" if they are there. This is, I think, just different point of personal preference in terms of code readability and clarity.... For those purposes, if there is something unclear in the case of those two variables, a comment or better name would be more useful than a type hint. If it clarifies things for you, though, then it's all good.

Fair enough.

@dchiller
Copy link
Collaborator Author

Yes, this.

Makes sense. Mypy will alert you to this without the annotation, but if you are looking at it mypy-less, I see how this is useful.

maybe having return annotations for all the functions is sufficient for a less-strict setting, whereas all variables would need to be annotated on a stricter setting.

There are definitely varying levels of strictness (I've started using --strict but haven't taken the time to check whether there are even further "strictness" levels). Again, I don't know that typing all variables would ever be necessary, but there may be some setting that does want that for some reason.

@dchiller dchiller merged commit 9471c12 into main Feb 22, 2024
1 check passed
@dchiller dchiller deleted the i-42-principem-syllabification branch February 22, 2024 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

improper syllabification of "principem"
3 participants