Better parsing for the words- and docsfile #1695
One question I have: I did not find a way to convert the result of absl::StrSplit to a std::ranges range or view, and therefore resorted to using another cppcoro generator. I've seen the idea of avoiding these generators, but am I right that the new iterators can only be used by writing classes that implement them?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##           master    #1695    +/-   ##
=========================================
  Coverage   89.86%   89.87%
=========================================
  Files         389      390      +1
  Lines       37308    37339     +31
  Branches     4204     4205      +1
=========================================
+ Hits        33527    33557     +30
+ Misses       2485     2483      -2
- Partials     1296     1299      +3
This reverts commit c365935.
Thank you very much, this absolutely goes in the right direction.
I have some initial comments on cleaning up; let me know if you need further advice.
…sts in WordsAndDocsFileParserTest.cpp. Renamed methods in WordsAndDocsFileLineCreator.h to reduce ambiguity. Incorporated requested small changes of PR.
Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>
Only small suggestions.
Also have a look at the SonarCloud issues.
ASSERT_EQ(std::get<0>(testLine), std::get<0>(expectedResult.at(i)));
ASSERT_EQ(std::get<1>(testLine), std::get<1>(expectedResult.at(i)));
ASSERT_EQ(std::get<2>(testLine), std::get<2>(expectedResult.at(i)));
ASSERT_EQ(std::get<3>(testLine), std::get<3>(expectedResult.at(i)));
That is much better with the helper functions. (There are even cleaner ways with better error messages in GoogleTest, but this refactoring is nice because now all improvements can be applied locally!)
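One way to shorten field-by-field checks, sketched here in plain C++ under the assumption that both sides are `std::tuple`s of the same type, is to compare the tuples as a whole. The alias `Line` and the helper `firstMismatch` are invented for this illustration and are not taken from the PR. In a GoogleTest test the same idea would amount to a single `ASSERT_EQ` on the two tuples.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical line type with four fields, mirroring the std::get<0..3>
// accesses in the test. The element types are assumptions.
using Line = std::tuple<std::string, bool, std::size_t, int>;

// Return the index of the first line that differs from its expectation, or
// std::nullopt if both vectors are equal. std::tuple's operator== compares
// all fields at once, so no per-field std::get is needed.
inline std::optional<std::size_t> firstMismatch(
    const std::vector<Line>& actual, const std::vector<Line>& expected) {
  const std::size_t common = std::min(actual.size(), expected.size());
  for (std::size_t i = 0; i < common; ++i) {
    if (actual[i] != expected[i]) {
      return i;
    }
  }
  // A length difference counts as a mismatch at the first missing index.
  if (actual.size() != expected.size()) {
    return common;
  }
  return std::nullopt;
}
```

Returning the index keeps the failure local: a test can report which line differs instead of stopping at the first differing field.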
Also a very small suggestion.
…d be outsourced in further refactorings
Conformance check passed ✅ No test result changes.
Quality Gate passed
One possible solution to the current coverage problem is to start a file IndexImplHelpers.h and a corresponding .cpp file, move the helper methods there, and test them separately. This would lead to even more references being passed to the functions. Currently I am unsure whether to do this or not. Alternatively, there may be another way to reduce the nesting at the positions where the helper functions are now used.
Only a very small thing, otherwise this now looks much cleaner.
@@ -53,8 +53,7 @@ cppcoro::generator<WordsFileLine> IndexImpl::wordsInTextRecords(
 std::string_view textView = text;
 textView = textView.substr(0, textView.rfind('"'));
 textView.remove_prefix(1);
-auto normalizedWords = tokenizeAndNormalizeText(textView, localeManager);
-for (auto word : normalizedWords) {
+for (auto word : tokenizeAndNormalizeText(textView, localeManager)) {
 WordsFileLine wordLine{word, false, contextId, 1};
Do we benefit from a std::move(word) here?
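Whether a `std::move` pays off depends on the types involved, which this excerpt does not show. The sketch below assumes that the loop variable is a `std::string` held by value and that the first member of the line struct is also a `std::string`; under those assumptions `std::move(word)` hands the character buffer over instead of copying it. `Line`, `makeLineByCopy` and `makeLineByMove` are invented names for illustration, not the PR's definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the line struct; the member names are assumptions.
struct Line {
  std::string word;
  bool isEntity;
  std::size_t contextId;
  std::size_t score;
};

// Copies `word` into the struct: the characters are duplicated, and the
// local `word` is destroyed unused at the end of the function.
inline Line makeLineByCopy(std::string word, std::size_t contextId) {
  return Line{word, false, contextId, 1};
}

// Moves `word` into the struct: the struct member takes over the contents
// of the local variable, which is not needed afterwards anyway.
inline Line makeLineByMove(std::string word, std::size_t contextId) {
  return Line{std::move(word), false, contextId, 1};
}
```

If the loop variable were a `std::string_view` or another trivially copyable type, the move would be equivalent to a copy and would bring no benefit.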
Separate PR to further improve the parsing during the text index building.