Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return boolean from alignment process indicating need for manual correction of encoding #37

Merged
merged 4 commits into from
Jan 18, 2024

Conversation

dchiller
Copy link
Collaborator

@dchiller dchiller commented Jan 17, 2024

The align_text_and_volpiano function now returns an alignment and a boolean flag that indicates whether the volpiano and text are properly encoded. A return value of True is meant to indicate the need for manual correction of the encoding. This return value is now tested in volpiano syllabification and alignment tests.

As an additional test, every chant on Cantus Database with a melody and a manuscript spelling full text encoded (n = 48,639) was aligned. Only 1 chant failed to align completely and a total of 18,220 chants aligned with manual correction suggested. The chant that failed was 254921, which seems now to be a bit of an edge case for which I don't think we have a convention (although maybe we make what @annamorphism has edited it to be the convention, see #38).

Fixes #35.

Modifies a number of functions to return an indicator of whether
text and volpiano encoding was proper, even in cases where an alignment
can be made.
@dchiller dchiller force-pushed the i-35-catch-errors-boolean branch from d8e0a56 to 8c52e26 Compare January 17, 2024 15:32
@dchiller dchiller marked this pull request as ready for review January 17, 2024 16:25
Copy link
Collaborator

@jacobdgm jacobdgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine as-is, but please consider the single comment I left.

Comment on lines 114 to 117
if vol_syls[-1][-3:] != "---" or vol_syls[-1][-4] == "-":
final_syl_of_word = vol_syls[-1]
if final_syl_of_word[-3:] != "---" or len(final_syl_of_word) < 4:
volpiano_improperly_spaced = True
elif vol_syls[-1][-4] == "-":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the purpose of this change - maybe I'm thrown off by the commit message - if len(vol_syls) < 4, we would have gotten an IndexError rather than a ValueError when we checked vol_syls[-1][-4] wouldn't we have?

For the elif at the bottom here, could we not have elif final_syl_of_word[-4] ... instead of elif vol_syls[-1][-4] ...?

A simpler alternative, which I understand to be more pythonic, would be to wrap if vol_syls[-1][-3:] != "---" or vol_syls[-1][-4] == "-": in a try: ... except IndexError: # last syllable has too few characters ...

But I digress - if this is just a typo in the commit message, that's fine - don't try to fix the commit message. Switching to a try/except is up to you, whichever approach you prefer is fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, a few things happened here:

  1. I just was careless in my commit message and wrote ValueError rather than IndexError...oops!
  2. I switched to defining the final_syl_of_word variable rather than using vol_syls[-1] everywhere because it seemed clearer...and then forgot to replace every instance of vol_syls[-1]...also oops!
  3. The try/except approach does seem like it makes more sense generally. I think I'll just make that switch here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so....standby for some fixes.

@jacobdgm
Copy link
Collaborator

Actually, if you don't implement the try/except, please do switch to elif final_syl_of_word[-4].

@dchiller dchiller force-pushed the i-35-catch-errors-boolean branch from 9970ad3 to 9d11541 Compare January 18, 2024 15:55
@dchiller dchiller merged commit 2612f0f into main Jan 18, 2024
1 check passed
@dchiller dchiller deleted the i-35-catch-errors-boolean branch January 18, 2024 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Catch errors and return a boolean telling whether or not errors occurred in alignment
2 participants