-
-
Notifications
You must be signed in to change notification settings - Fork 2
Denormalize quotation marks #721
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I still need to add a couple blurbs to the swagger docs. |
|
I apologize for what are inevitably going to be merge conflicts. @pmachapman @ddaspit - I'll plan on letting Peter get his other PR in and then I'll rebase. But it would still be helpful to review now to confirm that, otherwise, the approach is good. I don't like how complicated the |
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 12 of 12 files at r1, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @Enkidu93)
src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 321 at r1 (raw file):
new(sourceQuoteConvention, targetQuoteConvention); UsfmParser.Parse(usfm, quotationMarkDenormalizationFirstPass);
Could this pass be run before running the updater in GetUsfmAsync?
src/Serval/src/Serval.Translation/Contracts/PretranslationQuotationMarkBehavior.cs line 3 at r1 (raw file):
namespace Serval.Translation.Contracts; public enum PretranslationQuotationMarkBehavior
I am toying with the idea of naming this more generically. The enum could be called PretranslationNormalizationBehavior or just NormalizationBehavior (this enum will only be used on pretranslation endpoints so Pretranslation is probably redundant). The values could be Normalized and Denormalized.
Enkidu93
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 12 files reviewed, 2 unresolved discussions (waiting on @ddaspit)
src/Serval/src/Serval.Translation/Contracts/PretranslationQuotationMarkBehavior.cs line 3 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I am toying with the idea of naming this more generically. The enum could be called
PretranslationNormalizationBehavioror justNormalizationBehavior(this enum will only be used on pretranslation endpoints soPretranslationis probably redundant). The values could beNormalizedandDenormalized.
Done
src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 321 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
Could this pass be run before running the updater in
GetUsfmAsync?
I'm not sure what you mean. It needs to run on the USFM with the pretranslations inserted because that's the text that it is confirming can be denormalized, and since it's not implemented as an IUpdateBlockHandler, I can't pass it during the update itself. I don't like how complex this class is getting, but short-term, I'm not sure what else can be done. Any thoughts?
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #721 +/- ##
==========================================
+ Coverage 66.27% 66.29% +0.02%
==========================================
Files 368 368
Lines 19588 19664 +76
Branches 2505 2511 +6
==========================================
+ Hits 12981 13036 +55
- Misses 5695 5713 +18
- Partials 912 915 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 10 of 10 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93)
src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 321 at r1 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
I'm not sure what you mean. It needs to run on the USFM with the pretranslations inserted because that's the text that it is confirming can be denormalized, and since it's not implemented as an
IUpdateBlockHandler, I can't pass it during the update itself. I don't like how complex this class is getting, but short-term, I'm not sure what else can be done. Any thoughts?
That answers my question. It would be nice to encapsulate the logic in DenormalizeQuotationMarks into a convenience function in Machine.py. Or we could implement some kind of parser pipeline class. We don't need to do that right now.
src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 291 at r2 (raw file):
remarks.AddRange(denormalizationRemarks); } var remarkUpdater = new UpdateUsfmParserHandler(remarks: remarks);
Would it be possible to get rid of this final parser pass? Maybe add the disclaimer and marker placement remark in the first updater pass and add the quotation denormalization remark in the denormalization pass.
Enkidu93
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)
src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 321 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
That answers my question. It would be nice to encapsulate the logic in
DenormalizeQuotationMarksinto a convenience function in Machine.py. Or we could implement some kind of parser pipeline class. We don't need to do that right now.
I agree. OK, I can spin off an issue for later.
src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 291 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
Would it be possible to get rid of this final parser pass? Maybe add the disclaimer and marker placement remark in the first updater pass and add the quotation denormalization remark in the denormalization pass.
I had that originally. The issue is that it affects the order of the remarks. If we were to do that, the "This is an AI Draft" remark would not be first; rather, the quote denormalization remark would. I could pass them all in during quotation denormalization or else during the update, but 1) the logic gets a little uglier and 2) the DenormalizeQuotationMarks function would be doing double-duty. We could also change the remark-adding behavior to add after any existing remarks, but that would require changes to Machine. What do you think?
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r3, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
086cea1 to
640732c
Compare
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 10 of 10 files at r4, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
Fixes #671, #693, and #664.
Relies on sillsdev/machine#316.
This change is