-
-
Notifications
You must be signed in to change notification settings - Fork 2
Paragraph Marker Placement #703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2fe71a0 to
74cbc62
Compare
Enkidu93
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 15 files reviewed, 2 unresolved discussions (waiting on @ddaspit)
src/Serval/test/Serval.Translation.Tests/Services/PretranslationServiceTests.cs line 145 at r1 (raw file):
\c 1 \v 1 Chapter 1 \p , verse 1. Translated new paragraph
@isaac091 , I would have expected this to place the marker in the correct spot since it's a pretty simple case. Could you double-check my inputs and see if I've done something wrong?
src/Serval/src/Serval.Translation/Contracts/PretranslationUsfmMarkerBehavior.cs line 5 at r1 (raw file):
public enum PretranslationUsfmMarkerBehavior { PushToEnd,
I'm open to suggestions on this naming. This seems a little clumsy, but a lot of alternatives I could think of were very wordy. Maybe naming the overall enum PretranslationIntraverseUsfmMarkerBehavior would help clarify too?
Enkidu93
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 17 files reviewed, 1 unresolved discussion (waiting on @ddaspit)
src/Serval/test/Serval.Translation.Tests/Services/PretranslationServiceTests.cs line 145 at r1 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
@isaac091 , I would have expected this to place the marker in the correct spot since it's a pretty simple case. Could you double-check my inputs and see if I've done something wrong?
We figured it out. Thank you, Isaac!
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #703 +/- ##
==========================================
+ Coverage 66.10% 66.15% +0.05%
==========================================
Files 360 360
Lines 19111 19252 +141
Branches 2461 2473 +12
==========================================
+ Hits 12633 12737 +104
- Misses 5577 5610 +33
- Partials 901 905 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 15 files at r1, 3 of 3 files at r2, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @Enkidu93)
src/Serval/src/Serval.Translation/Contracts/PretranslationUsfmMarkerBehavior.cs line 5 at r1 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
I'm open to suggestions on this naming. This seems a little clumsy, but a lot of alternatives I could think of were very wordy. Maybe naming the overall enum
PretranslationIntraverseUsfmMarkerBehaviorwould help clarify too?
What about Preserve, Strip, and PreserveEnd/End? Or if we want to maintain better compatibility with the current behavior: Preserve, Strip, and PreservePosition/Position?
src/Echo/src/EchoEngine/TranslationEngineServiceV1.cs line 121 at r2 (raw file):
Refs = { row.Refs.Select(r => r.ToString()) }, Translation = row.SourceSegment, SourceTokens = { row.SourceSegment.Split() },
It would probably be simpler to call Split once.
src/Serval/src/Serval.Grpc/Protos/serval/translation/v1/platform.proto line 6 at r2 (raw file):
import "google/protobuf/empty.proto"; import "Protos/serval/translation/v1/engine.proto";
If you specify
<Protobuf Include="**\*.proto" ProtoRoot="Protos" />in Serval.Grpc.csproj, then you will be able to import like this
import "serval/translation/v1/engine.proto";which I think is much nicer.
Also, you should move the AlignedWordPair message to a common.proto file.
src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):
if (buildOptionsObject is not null) { buildOptionsObject["align_pretranslations"] = true;
Can we change the default in Machine.py?
src/Machine/src/Serval.Machine.Shared/Consumers/TranslationInsertPretranslationsConsumer.cs line 60 at r2 (raw file):
} private class PretranslationConverter : JsonConverter<Pretranslation>
Why is this custom converter needed?
Enkidu93
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ddaspit)
src/Echo/src/EchoEngine/TranslationEngineServiceV1.cs line 121 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
It would probably be simpler to call
Splitonce.
Done.
src/Machine/src/Serval.Machine.Shared/Consumers/TranslationInsertPretranslationsConsumer.cs line 60 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
Why is this custom converter needed?
It's similar to the corresponding class in WordAlignment. In order to parse the alignment strings as aligned word pairs, you need a custom converter. In the WordAlignment PR, I had tried to specify a converter that only converted the alignment string itself but couldn't get it to work properly, so we fell back on doing this - similar case here. If you'd like me to revisit that option, I can try again. Regardless, we'll need some kind of custom converter.
src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
Can we change the default in Machine.py?
Yes, I can do that. I wasn't sure if that's something we'd want to be the default in machine.py. I guess it depends on whether you imagine the machine.py code will ever get called by another client. It seems odd/outside the nature of a translation job to run alignments, but if we're thinking of those scripts as tailored to Serval, then I think changing the default makes sense.
src/Serval/src/Serval.Grpc/Protos/serval/translation/v1/platform.proto line 6 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
If you specify
<Protobuf Include="**\*.proto" ProtoRoot="Protos" />in
Serval.Grpc.csproj, then you will be able to import like thisimport "serval/translation/v1/engine.proto";which I think is much nicer.
Also, you should move the
AlignedWordPairmessage to acommon.protofile.
Oo, that is nicer! Done. I made separate translation and word_alignment common.protos in keeping with the models.
src/Serval/src/Serval.Translation/Contracts/PretranslationUsfmMarkerBehavior.cs line 5 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
What about
Preserve,Strip, andPreserveEnd/End? Or if we want to maintain better compatibility with the current behavior:Preserve,Strip, andPreservePosition/Position?
OK. In order to keep each option a verb for consistency, I think Preserve/Strip/PreservePosition might be best (or maybe Place?). Preserve is maybe a little under-descriptive 🤔, but I think PreserveEnd is a bit clunky/misleading. Of course, we can clarify in swagger, but it'd be nice for developers and clients alike if the names were semi-clear 😆.
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 13 of 13 files at r3, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
Yes, I can do that. I wasn't sure if that's something we'd want to be the default in machine.py. I guess it depends on whether you imagine the machine.py code will ever get called by another client. It seems odd/outside the nature of a translation job to run alignments, but if we're thinking of those scripts as tailored to Serval, then I think changing the default makes sense.
We should remove this code, once we change the default in Machine.py.
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93)
src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
We should remove this code, once we change the default in Machine.py.
Actually, it would be easier to remove it now.
Enkidu93
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 21 of 23 files reviewed, all discussions resolved (waiting on @ddaspit)
src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
Actually, it would be easier to remove it now.
Done.
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
Working E2E marker placement
e254fb3 to
6d152e5
Compare
I'll need to add a few tests & update the Machine version once this PR goes through.
Fixes #578
Fixes #663
Fixes #699
This change is