-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix lazy list #617
Fix lazy list #617
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to c133be6 in 2 minutes and 12 seconds
More details
- Looked at
889
lines of code in16
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. src/main/scala/dpla/ingestion3/harvesters/file/FileHarvester.scala:86
- Draft comment:
Consider adding exception handling for IOExceptions that may occur when reading entries from the ZipInputStream or TarInputStream to prevent potential crashes. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The code does have some error handling with Try, but not for all IO operations. However, these methods return LazyList which means exceptions would propagate to the caller naturally. The caller (handleFile method) already takes a Try[Int] return type, suggesting there's likely error handling at that level. Without seeing the full error handling strategy across the codebase, suggesting local exception handling here could be redundant or conflict with the existing approach.
I might be missing critical error cases that could crash the application. The existing error handling might not be sufficient for all failure scenarios.
While error handling is important, suggesting local exception handling without understanding the broader error handling strategy could lead to redundant or conflicting approaches. The existing Try pattern suggests errors are handled at a higher level.
Delete the comment as it makes assumptions about the error handling strategy without full context, and the code already shows signs of deliberate error handling patterns.
2. src/main/scala/dpla/ingestion3/harvesters/file/NorthwestHeritageFileHarvester.scala:42
- Draft comment:
The match statement in getInputStream is redundant since files are already filtered by zipFilter. Consider simplifying this method. - Reason this comment was not posted:
Confidence changes required:50%
The getInputStream method in multiple files (e.g., NorthwestHeritageFileHarvester, OaiFileHarvester, etc.) uses a match statement to check the file extension. However, since the files are already filtered by the zipFilter, this match statement is redundant and can be simplified.
Workflow ID: wflow_2jJ4uqp5qqtm3ml2
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Quality Gate failedFailed conditions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 7f024b3 in 1 minute and 4 seconds
More details
- Looked at
987
lines of code in17
files - Skipped
0
files when reviewing. - Skipped posting
4
drafted comments based on config settings.
1. src/main/scala/dpla/ingestion3/harvesters/oai/LocalOaiHarvester.scala:4
- Draft comment:
Consider movingParsedResult
to a more common package if it's used across different harvesters, not just file-based ones. - Reason this comment was not posted:
Confidence changes required:50%
ThewriteOut
method inLocalHarvester
is being used in multiple places, but theParsedResult
class is imported fromfile
package. This could lead to confusion or tight coupling between unrelated components.
2. src/main/scala/dpla/ingestion3/harvesters/file/CommunityWebsHarvester.scala:100
- Draft comment:
Consider movingParsedResult
to a more common package if it's used across different harvesters, not just file-based ones. - Reason this comment was not posted:
Confidence changes required:50%
ThewriteOut
method inLocalHarvester
is being used in multiple places, but theParsedResult
class is imported fromfile
package. This could lead to confusion or tight coupling between unrelated components.
3. src/main/scala/dpla/ingestion3/harvesters/file/DlgFileHarvester.scala:105
- Draft comment:
Consider movingParsedResult
to a more common package if it's used across different harvesters, not just file-based ones. - Reason this comment was not posted:
Confidence changes required:50%
ThewriteOut
method inLocalHarvester
is being used in multiple places, but theParsedResult
class is imported fromfile
package. This could lead to confusion or tight coupling between unrelated components.
4. src/main/scala/dpla/ingestion3/harvesters/file/DplaJsonlFileHarvester.scala:107
- Draft comment:
Consider movingParsedResult
to a more common package if it's used across different harvesters, not just file-based ones. - Reason this comment was not posted:
Confidence changes required:50%
ThewriteOut
method inLocalHarvester
is being used in multiple places, but theParsedResult
class is imported fromfile
package. This could lead to confusion or tight coupling between unrelated components.
Workflow ID: wflow_nh918qknGCNMeouE
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Important
Centralizes
iter
function for compressed file iteration and addstxtFilter
for.txt
files, removing redundant code across multiple harvesters.iter
function forZipInputStream
andTarInputStream
inFileHarvester.scala
.iter
implementations fromCommunityWebsHarvester.scala
,DlgFileHarvester.scala
,DplaJsonlFileHarvester.scala
, and 10 other files.txtFilter
toFileFilters.scala
for filtering.txt
files.SiFileHarvester.scala
to useFileFilters.txtFilter
instead of a custom filter class.This description was created by for 7f024b3. It will automatically update as commits are pushed.