Conversation

@Enkidu93
Collaborator

@Enkidu93 Enkidu93 commented Jun 6, 2025

I'll need to add a few tests & update the Machine version once this PR goes through.

Fixes #578
Fixes #663
Fixes #699


@Enkidu93 Enkidu93 requested a review from ddaspit June 6, 2025 15:55
@Enkidu93 Enkidu93 force-pushed the pass_alignments_up_with_pretranslations branch from 2fe71a0 to 74cbc62 Compare June 9, 2025 15:30
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Reviewable status: 0 of 15 files reviewed, 2 unresolved discussions (waiting on @ddaspit)


src/Serval/test/Serval.Translation.Tests/Services/PretranslationServiceTests.cs line 145 at r1 (raw file):

\c 1
\v 1 Chapter 1
\p , verse 1. Translated new paragraph

@isaac091, I would have expected this to place the marker in the correct spot since it's a pretty simple case. Could you double-check my inputs and see if I've done something wrong?


src/Serval/src/Serval.Translation/Contracts/PretranslationUsfmMarkerBehavior.cs line 5 at r1 (raw file):

public enum PretranslationUsfmMarkerBehavior
{
    PushToEnd,

I'm open to suggestions on this naming. This seems a little clumsy, but a lot of alternatives I could think of were very wordy. Maybe naming the overall enum PretranslationIntraverseUsfmMarkerBehavior would help clarify too?

Collaborator Author

@Enkidu93 Enkidu93 left a comment

Reviewable status: 0 of 17 files reviewed, 1 unresolved discussion (waiting on @ddaspit)


src/Serval/test/Serval.Translation.Tests/Services/PretranslationServiceTests.cs line 145 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

@isaac091, I would have expected this to place the marker in the correct spot since it's a pretty simple case. Could you double-check my inputs and see if I've done something wrong?

We figured it out. Thank you, Isaac!

@codecov-commenter

codecov-commenter commented Jun 10, 2025

Codecov Report

Attention: Patch coverage is 77.63975% with 36 lines in your changes missing coverage. Please review.

Project coverage is 66.15%. Comparing base (be848f2) to head (6d152e5).

Files with missing lines | Patch % | Lines
...sumers/TranslationInsertPretranslationsConsumer.cs | 63.75% | 26 Missing and 3 partials ⚠️
...anslation/Services/TranslationPlatformServiceV1.cs | 30.00% | 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #703      +/-   ##
==========================================
+ Coverage   66.10%   66.15%   +0.05%     
==========================================
  Files         360      360              
  Lines       19111    19252     +141     
  Branches     2461     2473      +12     
==========================================
+ Hits        12633    12737     +104     
- Misses       5577     5610      +33     
- Partials      901      905       +4     

Contributor

@ddaspit ddaspit left a comment

Reviewed 14 of 15 files at r1, 3 of 3 files at r2, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @Enkidu93)


src/Serval/src/Serval.Translation/Contracts/PretranslationUsfmMarkerBehavior.cs line 5 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

I'm open to suggestions on this naming. This seems a little clumsy, but a lot of alternatives I could think of were very wordy. Maybe naming the overall enum PretranslationIntraverseUsfmMarkerBehavior would help clarify too?

What about Preserve, Strip, and PreserveEnd/End? Or if we want to maintain better compatibility with the current behavior: Preserve, Strip, and PreservePosition/Position?


src/Echo/src/EchoEngine/TranslationEngineServiceV1.cs line 121 at r2 (raw file):

                                    Refs = { row.Refs.Select(r => r.ToString()) },
                                    Translation = row.SourceSegment,
                                    SourceTokens = { row.SourceSegment.Split() },

It would probably be simpler to call Split once.
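
For instance, a minimal sketch of splitting once and reusing the result. The variable names and the sample segment are illustrative assumptions, not the PR's actual code; it only relies on the Echo engine returning the source segment as the translation, as in the snippet above.

```csharp
using System;

// Hypothetical stand-in for row.SourceSegment in the snippet above.
string sourceSegment = "In the beginning";

// Split on whitespace a single time instead of once per field.
string[] tokens = sourceSegment.Split();

// Because Echo returns the source text unchanged, the same token array
// can back both the source tokens and the translation tokens.
string[] sourceTokens = tokens;
string[] translationTokens = tokens;

Console.WriteLine(string.Join("|", sourceTokens));
Console.WriteLine(string.Join("|", translationTokens));
```

This keeps the tokenization in one place, so the two token lists cannot drift apart if the splitting logic changes later.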


src/Serval/src/Serval.Grpc/Protos/serval/translation/v1/platform.proto line 6 at r2 (raw file):

import "google/protobuf/empty.proto";
import "Protos/serval/translation/v1/engine.proto";

If you specify

<Protobuf Include="**\*.proto" ProtoRoot="Protos" />

in Serval.Grpc.csproj, then you will be able to import like this

import "serval/translation/v1/engine.proto";

which I think is much nicer.

Also, you should move the AlignedWordPair message to a common.proto file.


src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):

                if (buildOptionsObject is not null)
                {
                    buildOptionsObject["align_pretranslations"] = true;

Can we change the default in Machine.py?


src/Machine/src/Serval.Machine.Shared/Consumers/TranslationInsertPretranslationsConsumer.cs line 60 at r2 (raw file):

    }

    private class PretranslationConverter : JsonConverter<Pretranslation>

Why is this custom converter needed?

Collaborator Author

@Enkidu93 Enkidu93 left a comment

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ddaspit)


src/Echo/src/EchoEngine/TranslationEngineServiceV1.cs line 121 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It would probably be simpler to call Split once.

Done.


src/Machine/src/Serval.Machine.Shared/Consumers/TranslationInsertPretranslationsConsumer.cs line 60 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Why is this custom converter needed?

It's similar to the corresponding class in WordAlignment. To parse the alignment strings as aligned word pairs, you need a custom converter. In the WordAlignment PR, I tried to specify a converter that converted only the alignment string itself but couldn't get it to work properly, so we fell back on converting the whole object; it's a similar case here. If you'd like me to revisit that option, I can try again. Regardless, we'll need some kind of custom converter.
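
To illustrate the kind of parsing a converter like this has to do, here is a minimal sketch. The `ParseAlignment` helper and the space-separated `source-target` index format (e.g. `0-0 1-2`) are assumptions made for this example; they are not taken from the PR or from the actual Machine serialization format.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: turns an alignment string such as "0-0 1-2"
// into a list of (source index, target index) pairs.
static IReadOnlyList<(int SourceIndex, int TargetIndex)> ParseAlignment(string alignment)
{
    return alignment
        .Split(' ', StringSplitOptions.RemoveEmptyEntries)
        .Select(pair =>
        {
            // Each entry is assumed to look like "<source>-<target>".
            string[] parts = pair.Split('-');
            return (SourceIndex: int.Parse(parts[0]), TargetIndex: int.Parse(parts[1]));
        })
        .ToList();
}

var pairs = ParseAlignment("0-0 1-2 2-1");
foreach (var (sourceIndex, targetIndex) in pairs)
    Console.WriteLine($"source {sourceIndex} -> target {targetIndex}");
```

A custom `JsonConverter` would call logic of this shape while reading the alignment property, which is why some converter is needed whether it wraps the whole object or only that one field.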


src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Can we change the default in Machine.py?

Yes, I can do that. I wasn't sure if that's something we'd want to be the default in machine.py. I guess it depends on whether you imagine the machine.py code will ever get called by another client. It seems odd/outside the nature of a translation job to run alignments, but if we're thinking of those scripts as tailored to Serval, then I think changing the default makes sense.


src/Serval/src/Serval.Grpc/Protos/serval/translation/v1/platform.proto line 6 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

If you specify

<Protobuf Include="**\*.proto" ProtoRoot="Protos" />

in Serval.Grpc.csproj, then you will be able to import like this

import "serval/translation/v1/engine.proto";

which I think is much nicer.

Also, you should move the AlignedWordPair message to a common.proto file.

Oo, that is nicer! Done. I made separate translation and word_alignment common.protos in keeping with the models.


src/Serval/src/Serval.Translation/Contracts/PretranslationUsfmMarkerBehavior.cs line 5 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

What about Preserve, Strip, and PreserveEnd/End? Or if we want to maintain better compatibility with the current behavior: Preserve, Strip, and PreservePosition/Position?

OK. In order to keep each option a verb for consistency, I think Preserve/Strip/PreservePosition might be best (or maybe Place?). Preserve is maybe a little under-descriptive 🤔, but I think PreserveEnd is a bit clunky/misleading. Of course, we can clarify in swagger, but it'd be nice for developers and clients alike if the names were semi-clear 😆.

Contributor

@ddaspit ddaspit left a comment

:lgtm:

Reviewed 13 of 13 files at r3, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)


src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Yes, I can do that. I wasn't sure if that's something we'd want to be the default in machine.py. I guess it depends on whether you imagine the machine.py code will ever get called by another client. It seems odd/outside the nature of a translation job to run alignments, but if we're thinking of those scripts as tailored to Serval, then I think changing the default makes sense.

We should remove this code once we change the default in Machine.py.

Contributor

@ddaspit ddaspit left a comment

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93)


src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We should remove this code once we change the default in Machine.py.

Actually, it would be easier to remove it now.

Collaborator Author

@Enkidu93 Enkidu93 left a comment

Reviewable status: 21 of 23 files reviewed, all discussions resolved (waiting on @ddaspit)


src/Machine/src/Serval.Machine.Shared/Services/NmtClearMLBuildJobFactory.cs line 37 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Actually, it would be easier to remove it now.

Done.

Contributor

@ddaspit ddaspit left a comment

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)

@Enkidu93 Enkidu93 force-pushed the pass_alignments_up_with_pretranslations branch from e254fb3 to 6d152e5 Compare June 19, 2025 23:16
@Enkidu93 Enkidu93 merged commit 1cf4662 into main Jun 19, 2025
2 of 3 checks passed
@Enkidu93 Enkidu93 deleted the pass_alignments_up_with_pretranslations branch June 19, 2025 23:42

4 participants