Conversation

@Enkidu93
Collaborator

@Enkidu93 Enkidu93 commented Oct 13, 2025

Fixes #318

Partially addresses sillsdev/serval#768


@Enkidu93 Enkidu93 requested a review from ddaspit October 13, 2025 14:54
@Enkidu93
Collaborator Author

Still need to add tests.

@Enkidu93 Enkidu93 force-pushed the validate_usfm_versification branch from 7ff60f6 to b8a9089 Compare October 15, 2025 13:14
@codecov-commenter
codecov-commenter commented Oct 15, 2025

Codecov Report

❌ Patch coverage is 74.81752% with 69 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.49%. Comparing base (53b745a) to head (90b5707).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ....Machine/Corpora/UsfmVersificationErrorDetector.cs | 73.54% | 38 Missing and 3 partials ⚠️ |
| ...a/ParatextProjectVersificationErrorDetectorBase.cs | 75.00% | 8 Missing ⚠️ |
| ...L.Machine/Corpora/ZipParatextProjectFileHandler.cs | 57.89% | 7 Missing and 1 partial ⚠️ |
| ....Machine/Corpora/FileParatextProjectFileHandler.cs | 85.00% | 3 Missing ⚠️ |
| ...chine/Corpora/ParatextProjectSettingsParserBase.cs | 72.72% | 1 Missing and 2 partials ⚠️ |
| ...a/FileParatextProjectVersificationErrorDetector.cs | 0.00% | 2 Missing ⚠️ |
| ....Machine/Corpora/ParatextProjectTermsParserBase.cs | 88.88% | 0 Missing and 1 partial ⚠️ |
| ...L.Machine/Corpora/ZipParatextProjectTextUpdater.cs | 0.00% | 1 Missing ⚠️ |
| ...ra/ZipParatextProjectVersificationErrorDetector.cs | 0.00% | 1 Missing ⚠️ |
| ...lysis/ZipParatextProjectQuoteConventionDetector.cs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #353      +/-   ##
==========================================
+ Coverage   72.38%   72.49%   +0.10%     
==========================================
  Files         417      422       +5     
  Lines       35632    35791     +159     
  Branches     4929     4949      +20     
==========================================
+ Hits        25793    25947     +154     
- Misses       8744     8748       +4     
- Partials     1095     1096       +1     

@Enkidu93
Collaborator Author

Tests have been added. I'm really excited to get this in 🎉.

Contributor

@ddaspit ddaspit left a comment

@ddaspit reviewed 16 of 22 files at r1, 2 of 3 files at r2, 7 of 10 files at r3, 3 of 4 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93)


src/SIL.Machine/Corpora/UsfmVersificationMismatchDetector.cs line 108 at r4 (raw file):

                if (
                    Type == UsfmVersificationMismatchType.MissingVerseSegment
                    && VerseRef.TryParse(

Why are you using TryParse here and not the constructor?


src/SIL.Machine/Corpora/IParatextProjectFileHandler.cs line 5 at r1 (raw file):

namespace SIL.Machine.Corpora
{
    public interface IParatextProjectFileHandler

Shouldn't we also use this interface in ParatextProjectSettingsParserBase?


src/SIL.Machine/Corpora/ParatextProjectVersificationMismatchDetector.cs line 8 at r1 (raw file):

namespace SIL.Machine.Corpora
{
    public abstract class ParatextProjectVersificationMismatchDetector

By convention, ABCs are suffixed with Base.


src/SIL.Machine/Corpora/ParatextProjectTermsParserBase.cs line 299 at r1 (raw file):

        }

        private Stream Open(string fileName) => _paratextProjectFileHandler.Open(fileName);

This indirection is unnecessary. I would just call _paratextProjectFileHandler directly.

Collaborator Author

@Enkidu93 Enkidu93 left a comment

Reviewable status: 9 of 32 files reviewed, 4 unresolved discussions (waiting on @ddaspit)


src/SIL.Machine/Corpora/IParatextProjectFileHandler.cs line 5 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Shouldn't we also use this interface in ParatextProjectSettingsParserBase?

We could. I pushed a change with this update. It felt a little awkward to me since 1) they're doing slightly different things and 2) we end up having to match the right classes in all the constructors, as well as pass both an IParatextProjectFileHandler and a class which itself has an IParatextProjectFileHandler. There might be a better way, but this is better than it was before!


src/SIL.Machine/Corpora/ParatextProjectTermsParserBase.cs line 299 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This indirection is unnecessary. I would just call _paratextProjectFileHandler directly.

OK, done.


src/SIL.Machine/Corpora/UsfmVersificationMismatchDetector.cs line 108 at r4 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Why are you using TryParse here and not the constructor?

I don't trust the constructor and would rather not throw an exception (see sillsdev/serval#796). My thinking was that if it fails to parse the verse ref string, that probably indicates that there is an issue at that ref and we could still indicate where the issue is. I guess this thinking ought to extend to

VerseRef defaultVerseRef = new VerseRef(_bookNum, _expectedChapter, _expectedVerse);

as well then.
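
The reasoning in this reply (prefer a non-throwing parse so that a malformed reference can still be reported as the location of the problem) could be sketched roughly as follows. This is only an illustration: `SimpleVerseRef` and `ErrorLocation.Describe` are hypothetical stand-ins invented for the example, not the SIL.Scripture `VerseRef` type or anything from this PR.

```csharp
using System;

// Hypothetical stand-in for a verse reference type (NOT the SIL.Scripture VerseRef API).
public readonly struct SimpleVerseRef
{
    public string Book { get; }
    public int Chapter { get; }
    public int Verse { get; }

    // Constructor-style validation: invalid input throws.
    public SimpleVerseRef(string book, int chapter, int verse)
    {
        if (string.IsNullOrWhiteSpace(book))
            throw new ArgumentException("Book must not be empty.", nameof(book));
        if (chapter < 1)
            throw new ArgumentOutOfRangeException(nameof(chapter));
        if (verse < 0)
            throw new ArgumentOutOfRangeException(nameof(verse));
        Book = book;
        Chapter = chapter;
        Verse = verse;
    }

    // Try-pattern: failure is reported through the return value, never by throwing.
    // Expects text shaped like "MAT 1:2".
    public static bool TryParse(string text, out SimpleVerseRef result)
    {
        result = default;
        if (string.IsNullOrWhiteSpace(text))
            return false;
        string[] parts = text.Trim().Split(' ');
        if (parts.Length != 2)
            return false;
        string[] chapterVerse = parts[1].Split(':');
        if (chapterVerse.Length != 2)
            return false;
        if (!int.TryParse(chapterVerse[0], out int chapter) || chapter < 1)
            return false;
        if (!int.TryParse(chapterVerse[1], out int verse) || verse < 0)
            return false;
        result = new SimpleVerseRef(parts[0], chapter, verse);
        return true;
    }
}

public static class ErrorLocation
{
    // Hypothetical helper: still says where the problem is when the reference is malformed.
    public static string Describe(string rawRef)
    {
        return SimpleVerseRef.TryParse(rawRef, out SimpleVerseRef parsed)
            ? $"{parsed.Book} {parsed.Chapter}:{parsed.Verse}"
            : $"unparseable reference '{rawRef}'";
    }
}
```

With this shape, a caller that is itself reporting errors never has to wrap reference handling in a try/catch; the unparseable text simply becomes part of the report.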


src/SIL.Machine/Corpora/ParatextProjectVersificationMismatchDetector.cs line 8 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

By convention, ABCs are suffixed with Base.

Oh right - thank you - done.

Contributor

@ddaspit ddaspit left a comment

@ddaspit reviewed 23 of 23 files at r6, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @Enkidu93)


src/SIL.Machine/Corpora/ParatextProjectTermsParserBase.cs line 299 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

OK, done.

I think you forgot to remove these methods.


src/SIL.Machine/Corpora/UsfmVersificationMismatchDetector.cs line 108 at r4 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

I don't trust the constructor and would rather not throw an exception (see sillsdev/serval#796). My thinking was that if it fails to parse the verse ref string, that probably indicates that there is an issue at that ref and we could still indicate where the issue is. I guess this thinking ought to extend to

VerseRef defaultVerseRef = new VerseRef(_bookNum, _expectedChapter, _expectedVerse);

as well then.

I would add a comment to indicate that we don't want to throw an exception here, so we aren't using the constructor.


src/SIL.Machine/Corpora/FileParatextProjectTextUpdater.cs line 8 at r6 (raw file):

            : base(
                new FileParatextProjectFileHandler(projectDir),
                new FileParatextProjectSettingsParser(projectDir).Parse()

It might be nice to add a static Parse method to FileParatextProjectSettingsParser, i.e.

FileParatextProjectSettingsParser.Parse(projectDir)
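
A static convenience overload of this kind might look roughly like the sketch below. `DemoSettings` and `DemoSettingsParser` are hypothetical names invented for the illustration, and the "parsing" is reduced to reading the directory name; the real parser's members and behaviour are not shown in this thread.

```csharp
using System;
using System.IO;

// Hypothetical settings type, standing in for the real project settings class.
public sealed class DemoSettings
{
    public string Name { get; }

    public DemoSettings(string name)
    {
        Name = name;
    }
}

// Hypothetical parser, standing in for a file-based settings parser.
public sealed class DemoSettingsParser
{
    private readonly string _projectDir;

    public DemoSettingsParser(string projectDir)
    {
        _projectDir = projectDir ?? throw new ArgumentNullException(nameof(projectDir));
    }

    // Instance method: the existing "construct, then parse" path.
    public DemoSettings Parse()
    {
        // Placeholder logic: use the directory name as the project name.
        string trimmed = _projectDir.TrimEnd('/', '\\');
        return new DemoSettings(Path.GetFileName(trimmed));
    }

    // Static convenience overload: hides the temporary parser instance from callers.
    public static DemoSettings Parse(string projectDir)
    {
        return new DemoSettingsParser(projectDir).Parse();
    }
}
```

A call site such as a `base(...)` constructor argument can then read `DemoSettingsParser.Parse(projectDir)` instead of `new DemoSettingsParser(projectDir).Parse()`.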

src/SIL.Machine/Corpora/IParatextProjectFileHandler.cs line 12 at r6 (raw file):

        UsfmStylesheet CreateStylesheet(string fileName);

        // ParatextProjectSettings GetSettings();

Don't forget to remove this commented-out line of code.

Collaborator Author

@Enkidu93 Enkidu93 left a comment

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ddaspit)


src/SIL.Machine/Corpora/FileParatextProjectTextUpdater.cs line 8 at r6 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It might be nice to add a static Parse method to FileParatextProjectSettingsParser, i.e.

FileParatextProjectSettingsParser.Parse(projectDir)

OK, I added one for the zip implementation too.


src/SIL.Machine/Corpora/IParatextProjectFileHandler.cs line 12 at r6 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Don't forget to remove this commented-out line of code.

Yes, thank you! Done.


src/SIL.Machine/Corpora/ParatextProjectTermsParserBase.cs line 299 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I think you forgot to remove these methods.

Done.


src/SIL.Machine/Corpora/UsfmVersificationMismatchDetector.cs line 108 at r4 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I would add a comment to indicate that we don't want to throw an exception here, so we aren't using the constructor.

Done. Let me know if you'd prefer different wording.


src/SIL.Machine/Corpora/UsfmVersificationMismatchDetector.cs line 18 at r6 (raw file):

    }

    public class UsfmVersificationMismatch

What do you think of these class names, @ddaspit? Would it be better to use a more generic name like UsfmVersificationError since we're now also covering invalid verses that aren't necessarily 'mismatches'?

Contributor

@ddaspit ddaspit left a comment

@ddaspit reviewed 12 of 12 files at r7, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93)


src/SIL.Machine/Corpora/UsfmVersificationMismatchDetector.cs line 18 at r6 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

What do you think of these class names, @ddaspit? Would it be better to use a more generic name like UsfmVersificationError since we're now also covering invalid verses that aren't necessarily 'mismatches'?

Yes, I think a more generic name would be better.

Collaborator Author

@Enkidu93 Enkidu93 left a comment

Reviewable status: 25 of 32 files reviewed, 1 unresolved discussion (waiting on @ddaspit)


src/SIL.Machine/Corpora/UsfmVersificationMismatchDetector.cs line 18 at r6 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Yes, I think a more generic name would be better.

Cool, done.

Contributor

@ddaspit ddaspit left a comment

:lgtm:

@ddaspit reviewed 7 of 7 files at r8, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)

@Enkidu93 Enkidu93 merged commit 4545875 into master Nov 6, 2025
4 checks passed
@Enkidu93 Enkidu93 deleted the validate_usfm_versification branch November 6, 2025 14:36
Enkidu93 added a commit to sillsdev/machine.py that referenced this pull request Nov 17, 2025
Enkidu93 added a commit to sillsdev/machine.py that referenced this pull request Nov 19, 2025
Fix unused imports

Fix import sorting

Address reviewer comments

Add parameter types

Use isort
Enkidu93 added a commit to sillsdev/machine.py that referenced this pull request Nov 19, 2025
Development

Successfully merging this pull request may close these issues.

Refactor Zip/File/Memory Paratext project classes to use a file handling interface

4 participants