Port block handler #306
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #306 +/- ##
==========================================
+ Coverage 70.50% 70.75% +0.25%
==========================================
Files 389 390 +1
Lines 32386 32721 +335
Branches 4545 4605 +60
==========================================
+ Hits 22833 23153 +320
- Misses 8503 8510 +7
- Partials 1050 1058 +8
Alright - this is ready for review 😁
isaac091
left a comment
Reviewable status: 0 of 2 files reviewed, all discussions resolved (waiting on @ddaspit)
@ddaspit This is ready for review
8d53798 to
5495c05
Compare
Fixes #303
ddaspit
left a comment
Reviewed 10 of 10 files at r3, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93)
src/SIL.Machine/Corpora/PlaceMarkersUsfmUpdateBlockHandler.cs line 61 at r4 (raw file):
// Paragraph markers at the end of the block should stay there
// Section headers should be ignored but re-inserted in the same position relative to other paragraph markers
List<UsfmUpdateBlockElement> endElements = new List<UsfmUpdateBlockElement>();
You can use var for all of these variable declarations, since the type is explicit on the right hand side.
src/SIL.Machine/Corpora/PlaceMarkersUsfmUpdateBlockHandler.cs line 91 at r4 (raw file):
|| ( element.Type == UsfmUpdateBlockElementType.Text && element.Tokens[0].ToUsfm().Trim().Count() == 0
You should use Length instead of the LINQ Count function.
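As a minimal sketch of the difference, assuming a hypothetical `usfmText` string standing in for the result of `element.Tokens[0].ToUsfm()` (the value is illustrative, not taken from the PR): both checks give the same answer, but `Count()` goes through a LINQ extension method over the characters, while `Length` reads a property on the string.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for element.Tokens[0].ToUsfm(); not taken from the PR.
string usfmText = "   ";

// LINQ: treats the trimmed string as an IEnumerable<char> and counts its items.
bool emptyViaCount = usfmText.Trim().Count() == 0;

// Property: reads the length stored on the trimmed string directly.
bool emptyViaLength = usfmText.Trim().Length == 0;

Console.WriteLine(emptyViaCount == emptyViaLength); // True
```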
tests/SIL.Machine.Tests/Corpora/PlaceMarkersUsfmUpdateBlockHandlerTests.cs line 10 at r4 (raw file):
public class PlaceMarkersUsfmUpdateBlockHandlerTests
{
    private LatinWordTokenizer Tokenizer { get; set; }
This can probably just be a static read-only field.
src/SIL.Machine/Corpora/UpdateUsfmParserHandler.cs line 535 at r4 (raw file):
}
List<UsfmToken> tokens = updateBlock.GetTokens();
paraElems.Reverse();
It is a small thing, but I would just use the LINQ Reverse function in the foreach loop instead of the in-place Reverse method on List. It avoids an extra iteration of the list. You can call the LINQ reverse like this Enumerable.Reverse(paraElems).
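A minimal sketch of the suggested pattern, using a hypothetical list of strings in place of the real `paraElems` elements. The static call form is needed because `paraElems.Reverse()` on a `List<T>` binds to the in-place instance method, which returns `void`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data; in the PR, paraElems holds update block elements.
var paraElems = new List<string> { "p", "q1", "q2" };
var visited = new List<string>();

// Enumerable.Reverse yields the items back to front and leaves the list untouched,
// unlike List<T>.Reverse(), which reorders the list in place.
foreach (string elem in Enumerable.Reverse(paraElems))
{
    visited.Add(elem);
}

Console.WriteLine(string.Join(",", visited));   // q2,q1,p
Console.WriteLine(string.Join(",", paraElems)); // p,q1,q2
```

Because the source list keeps its original order, later code that reads `paraElems` does not need to account for an earlier in-place reversal.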
ddaspit
left a comment
@Enkidu93 What kind of optimizations are you talking about?
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93)
Enkidu93
left a comment
I'm seeing lots of Insert(index), RemoveAt, and IndexOf+Substring calls, as well as what looks like iterating through collections more than once. These make me suspect that we could improve the performance.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @ddaspit)
src/SIL.Machine/Corpora/PlaceMarkersUsfmUpdateBlockHandler.cs line 61 at r4 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
You can use var for all of these variable declarations, since the type is explicit on the right hand side.
Done. I thought we were using the Type name = new() syntax and avoiding var?
src/SIL.Machine/Corpora/PlaceMarkersUsfmUpdateBlockHandler.cs line 91 at r4 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
You should use Length instead of the LINQ Count function.
Done.
src/SIL.Machine/Corpora/UpdateUsfmParserHandler.cs line 535 at r4 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
It is a small thing, but I would just use the LINQ Reverse function in the foreach loop instead of the in-place Reverse method on List. It avoids an extra iteration of the list. You can call the LINQ reverse like this Enumerable.Reverse(paraElems).
Oooo, thank you! I only didn't do this because I didn't know it existed and was being stupid 😆. That makes sense since you can call Select().Reverse(). Done.
tests/SIL.Machine.Tests/Corpora/PlaceMarkersUsfmUpdateBlockHandlerTests.cs line 10 at r4 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
This can probably just be a static read-only field.
Done.
ddaspit
left a comment
If there is any low hanging fruit, I would go ahead and do it. I will approve it now if you decide to hold off on the optimizations.
Reviewed 4 of 4 files at r5, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
src/SIL.Machine/Corpora/PlaceMarkersUsfmUpdateBlockHandler.cs line 61 at r4 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
Done. I thought we were using the Type name = new() syntax and avoiding var?
I wasn't sure if we could use that syntax in this assembly, since it targets .NET Standard 2.0, which is locked to an older version of C#.
Enkidu93
left a comment
After looking through it again, I can definitely see room for optimizations but it would require enough reworking (which should then probably be back-ported to machine.py) that I've just spun off an issue for now in the interest of meeting the quarter deadline.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
src/SIL.Machine/Corpora/PlaceMarkersUsfmUpdateBlockHandler.cs line 61 at r4 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I wasn't sure if we could use that syntax in this assembly, since it targets .NET Standard 2.0, which is locked to an older version of C#.
Yeah 👍. I wasn't sure whether you preferred that we avoid var when we can't use the new() syntax.
@ddaspit, when porting this, is speed a concern? I've tried to follow machine.py as closely as possible with few exceptions, but there's definitely room for optimization. Is this something we should think about now or just something we should revisit later on?