Add script for creating drafts with alternate postprocessing configurations #750

Merged (1 commit) on Jun 6, 2025

Collaborator

@isaac091 isaac091 commented Jun 5, 2025

Towards #746


@isaac091 isaac091 requested review from benjaminking and ddaspit June 5, 2025 23:39
Collaborator

@benjaminking benjaminking left a comment

This looks good. At some point, we'll have to discuss what adding quotation denormalization to this will look like. NLLB sometimes doesn't translate all of the quotation marks from the source, so we can't go back to the source to figure out what they originally were. We could re-apply quotation normalization, but there are combinations of source and target quote conventions where normalization->denormalization->normalization will be different from the original. Anyway, not super important right now.

:lgtm:

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @ddaspit)
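
To illustrate the round-trip concern raised in the review comment above, here is a minimal toy sketch of one way normalization -> denormalization -> normalization could diverge from the original. Everything in it is an assumption made for illustration: the `NORMALIZE` table, the `normalize` and `denormalize` helpers, and the two-level target convention are hypothetical and are not the project's actual implementation. The toy denormalizer also assumes well-nested quotes and text without apostrophes.

```python
# Toy model only: hypothetical helpers, not the project's real quotation handling.

# Assumed normalization table: every typographic mark maps to a straight quote.
NORMALIZE = {
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u00ab": '"', "\u00bb": '"',   # guillemets
}


def normalize(text: str) -> str:
    """Replace typographic quotation marks with straight ones."""
    return "".join(NORMALIZE.get(ch, ch) for ch in text)


def denormalize(text: str, convention: list[tuple[str, str]]) -> str:
    """Replace straight quotes with the (open, close) pair for their nesting depth.

    A straight mark closes a quote if it matches the most recently opened
    mark; otherwise it opens a new, deeper level.
    """
    out = []
    open_marks = []  # straight marks that are currently open
    for ch in text:
        if ch in ('"', "'"):
            if open_marks and open_marks[-1] == ch:
                open_marks.pop()
                out.append(convention[len(open_marks)][1])
            else:
                out.append(convention[len(open_marks)][0])
                open_marks.append(ch)
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical target convention: guillemets at level 1, curly doubles at level 2.
TARGET = [("\u00ab", "\u00bb"), ("\u201c", "\u201d")]

original = 'He said, "She said, \'Go.\'"'
denormalized = denormalize(original, TARGET)
round_trip = normalize(denormalized)

print(original)
print(denormalized)
print(round_trip)
```

In this toy setup the inner single quotes become curly doubles in the target convention, and those normalize back to straight doubles, so the level distinction present in the original normalized text is lost.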

@isaac091
Collaborator Author

isaac091 commented Jun 6, 2025

The intention is that this script is only used with unedited drafts (i.e. straight from the translate pipeline) and their original sources. This is because the ScriptureRefs of the source and draft have to match up exactly (with the exception of \rem elements) so that alignment can be done for marker placement. So at least in the way that I'm thinking about this script right now, it would have access to the quotation convention of the source and assume that the quotations in the draft are either normalized or have the same convention as the source.

But yes, we will need to have that conversation.
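
The precondition described in the comment above (source and draft references matching exactly, apart from \rem elements) could be checked with something like the sketch below. The `Segment` class, its fields, and `refs_align` are hypothetical names invented for this example; they do not reflect how the script or the ScriptureRef type actually represents references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for one referenced piece of a USFM file."""
    ref: str     # reference string, e.g. "MAT 1:1" (format assumed)
    marker: str  # USFM marker that introduced the segment, e.g. "v" or "rem"
    text: str


def refs_align(source: list[Segment], draft: list[Segment]) -> bool:
    """Return True if both files have the same references in the same order,
    ignoring segments introduced by a \\rem marker."""

    def refs(segments: list[Segment]) -> list[str]:
        return [s.ref for s in segments if s.marker != "rem"]

    return refs(source) == refs(draft)


source = [
    Segment("MAT 1:1", "v", "source verse one"),
    Segment("MAT 1:2", "v", "source verse two"),
]
draft = [
    Segment("MAT 1:0", "rem", "a remark present only in the draft"),
    Segment("MAT 1:1", "v", "draft verse one"),
    Segment("MAT 1:2", "v", "draft verse two"),
]
edited_draft = draft[:-1]  # a verse was removed, so alignment is no longer safe

print(refs_align(source, draft))
print(refs_align(source, edited_draft))
```

A check of this kind would let the script fail early on an edited draft instead of producing misaligned marker placement.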

@isaac091 isaac091 merged commit 9b17dec into master Jun 6, 2025
1 check passed
@isaac091 isaac091 deleted the postprocess_script branch June 6, 2025 23:04