Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for LazyList for all the harvesters. #616

Closed
wants to merge 3 commits into from
Closed

Conversation

mdellabitta
Copy link
Contributor

@mdellabitta mdellabitta commented Jan 2, 2025

Important

Refactors iteration in harvester classes to use foreach, centralizes file filtering, and removes redundant code.

  • Iteration Refactor:
    • Replaces for loops with foreach in iter() method calls in CommunityWebsHarvester.scala, DplaJsonlFileHarvester.scala, and FlFileHarvester.scala.
  • File Filtering:
    • Adds txtFilter to FileFilters.scala.
    • Removes TxtFileFilter class from SiFileHarvester.scala and replaces its usage with FileFilters.txtFilter.
  • Miscellaneous:
    • Removes unnecessary braces in getInputStream() in OaiFileHarvester.scala, VaFileHarvester.scala, and VtFileHarvester.scala.

This description was created by Ellipsis for c133be6. It will automatically update as commits are pushed.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 51a47a1 in 1 minute and 31 seconds

More details
  • Looked at 354 lines of code in 14 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 drafted comments based on config settings.
1. src/main/scala/dpla/ingestion3/harvesters/file/CommunityWebsHarvester.scala:141
  • Draft comment:
    The change from for to foreach is appropriate for side-effect operations, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, consider using map instead.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from for to foreach is appropriate for side-effect operations, but the yield keyword was removed without replacing it with a proper return mechanism. This could lead to unintended behavior if the result of the operation was expected to be used later.
2. src/main/scala/dpla/ingestion3/harvesters/file/DplaJsonlFileHarvester.scala:150
  • Draft comment:
    The change from for to foreach is appropriate for side-effect operations, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, consider using map instead.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from for to foreach is appropriate for side-effect operations, but the yield keyword was removed without replacing it with a proper return mechanism. This could lead to unintended behavior if the result of the operation was expected to be used later.
3. src/main/scala/dpla/ingestion3/harvesters/file/FlFileHarvester.scala:149
  • Draft comment:
    The change from for to foreach is appropriate for side-effect operations, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, consider using map instead.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from for to foreach is appropriate for side-effect operations, but the yield keyword was removed without replacing it with a proper return mechanism. This could lead to unintended behavior if the result of the operation was expected to be used later.
4. src/main/scala/dpla/ingestion3/harvesters/file/HathiFileHarvester.scala:132
  • Draft comment:
    The change from for to foreach is appropriate for side-effect operations, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, consider using map instead.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from for to foreach is appropriate for side-effect operations, but the yield keyword was removed without replacing it with a proper return mechanism. This could lead to unintended behavior if the result of the operation was expected to be used later.
5. src/main/scala/dpla/ingestion3/harvesters/file/HeartlandFileHarvester.scala:148
  • Draft comment:
    The change from for to foreach is appropriate for side-effect operations, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, consider using map instead.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from for to foreach is appropriate for side-effect operations, but the yield keyword was removed without replacing it with a proper return mechanism. This could lead to unintended behavior if the result of the operation was expected to be used later.
6. src/main/scala/dpla/ingestion3/harvesters/file/NYPLFileHarvester.scala:158
  • Draft comment:
    The change from for to foreach is appropriate for side-effect operations, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, consider using map instead.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from for to foreach is appropriate for side-effect operations, but the yield keyword was removed without replacing it with a proper return mechanism. This could lead to unintended behavior if the result of the operation was expected to be used later.
7. src/main/scala/dpla/ingestion3/harvesters/file/NaraDeltaHarvester.scala:259
  • Draft comment:
    The change from for to map is appropriate for transforming collections, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, this change is correct.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from for to map is appropriate for transforming collections, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, this change is correct.
8. src/main/scala/dpla/ingestion3/harvesters/file/NaraFileHarvester.scala:313
  • Draft comment:
    The change from for to map is appropriate for transforming collections, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, this change is correct.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from for to map is appropriate for transforming collections, but ensure that the removal of yield does not affect any expected return values. If the result was intended to be used, this change is correct.

Workflow ID: wflow_h3Huea5QxscQRW40


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 9dd348c in 19 seconds

More details
  • Looked at 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. src/main/scala/dpla/ingestion3/harvesters/file/VtFileHarvester.scala:118
  • Draft comment:
    The change from _ to () in case Success(count) => () is a good practice, making it explicit that no action is taken on success.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The change from _ to () in the case Success(count) => () is a stylistic improvement, making it clear that no action is taken on success. This is consistent with Scala best practices.

Workflow ID: wflow_WieXstgudBiaQRiT


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

sonarqubecloud bot commented Jan 2, 2025

Quality Gate Failed Quality Gate failed

Failed conditions
35.7% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on c133be6 in 1 minute and 35 seconds

More details
  • Looked at 738 lines of code in 14 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_SONbfPquyjo9ogrS


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

LazyList.empty
case Some(entry) =>
val result =
if (entry.isDirectory || entry.getName.contains("._"))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a filter to exclude non-XML files in the iter method for ZipInputStream to ensure consistency across harvesters.

Suggested change
if (entry.isDirectory || entry.getName.contains("._"))
if (entry.isDirectory || entry.getName.contains("._") || !entry.getName.endsWith(".xml"))

@mdellabitta mdellabitta closed this Jan 2, 2025
@mdellabitta mdellabitta reopened this Jan 2, 2025
@mdellabitta mdellabitta closed this Jan 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant