Don't translate Scripture references in non-verse text #573

Open
johnml1135 opened this issue Dec 12, 2024 · 2 comments
@johnml1135
Collaborator

There are often Scripture references at the beginning or end of non-verse text. Currently we try to translate these and make a mess of things. We should be able to do better. Here are some options:

Best proposal:

  • Strip out all references when building training/pretranslation data
  • Re-insert them when putting into USFM

Complications:

  • How do we know something is a Scripture reference? Can we truly make a regex to capture all references? What about multiple languages?
  • Re-insertion would require reworking the USFM updater - which may add some complication (but not too much, because there will be no ranges to deal with in non-verse Scripture text).
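
A minimal sketch of the strip/re-insert idea above, not a proposed implementation. It assumes English-style references only (for example `Gen 1:1` or `(Matt. 5:3-12)`), and the pattern and the names `strip_refs` and `reinsert_refs` are hypothetical. It also shows the first complication directly: book names in other languages or scripts would not match.

```python
import re
from typing import NamedTuple

# Hypothetical pattern: English-style references only, e.g. "Gen 1:1",
# "1 Cor 13:4-7", "(Matt. 5:3-12)". Other languages need other patterns.
_REF = r"\(?(?:[1-3]\s?)?[A-Z][a-z]+\.?\s\d+[:.]\d+(?:[-\u2013]\d+)?\)?"
_LEADING = re.compile(rf"^\s*({_REF})\s*")
_TRAILING = re.compile(rf"\s*({_REF})\s*$")


class Stripped(NamedTuple):
    leading: str   # reference found at the start, or ""
    body: str      # text that should actually be translated
    trailing: str  # reference found at the end, or ""


def strip_refs(text: str) -> Stripped:
    """Split a leading and/or trailing reference off non-verse text."""
    leading = trailing = ""
    m = _LEADING.match(text)
    if m:
        leading, text = m.group(1), text[m.end():]
    m = _TRAILING.search(text)
    if m:
        trailing, text = m.group(1), text[:m.start()]
    return Stripped(leading, text.strip(), trailing)


def reinsert_refs(parts: Stripped, translated_body: str) -> str:
    """Put the untouched references back around the translated body."""
    pieces = (parts.leading, translated_body, parts.trailing)
    return " ".join(p for p in pieces if p)
```

Only `body` would go to the translation engine; the references pass through unchanged. Note that this sketch normalizes the spacing around a reference to a single space, which the real USFM updater might not want.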
@github-project-automation github-project-automation bot moved this to 🆕 New in Serval Dec 12, 2024
@johnml1135 johnml1135 self-assigned this Dec 12, 2024
@ddaspit
Contributor

ddaspit commented Jan 3, 2025

We should just be more careful about what markers we translate and what markers we don't translate.

@johnml1135
Collaborator Author

Yes - we could try some funny things with regexes - but I think just having "translate these tags" and "don't translate these tags" lists is the best way to get to the 95%.
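
A rough sketch of what the "translate these tags" / "don't translate these tags" approach could look like. The marker lists here are hypothetical and would need review; `\r`, `\mr`, and `\sr` are the USFM paragraph markers for parallel-passage and section reference ranges, so they are the obvious candidates for the "don't translate" side. The function name `should_translate` is also an assumption, not existing Serval code.

```python
import re

# Hypothetical lists, for illustration only.
TRANSLATE_MARKERS = {"mt", "mt1", "ms", "s", "s1", "s2", "ip", "d"}
DO_NOT_TRANSLATE_MARKERS = {"id", "rem", "r", "mr", "sr"}

# Matches the paragraph marker at the start of a USFM line, e.g. "\s1" or "\r".
_MARKER = re.compile(r"^\\([a-z]+\d*)\b")


def should_translate(usfm_line: str) -> bool:
    """Return True only if the line starts with an allow-listed marker.

    Unknown markers are skipped, so new or unusual markers fail safe
    (left untranslated) rather than being garbled.
    """
    m = _MARKER.match(usfm_line.strip())
    if m is None:
        return False
    marker = m.group(1)
    return marker in TRANSLATE_MARKERS and marker not in DO_NOT_TRANSLATE_MARKERS
```

With these lists, `\s1 The Beatitudes` would be translated and `\r (Luke 6:20-23)` would be left alone. This only covers references that have their own marker; a reference embedded inside a translatable paragraph would still get through, which is presumably the remaining 5%.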
