ignore-misidentified-collection-entries #600

Merged 1 commit on Jul 19, 2022

@alishaevn commented Jul 19, 2022

issue

ran into this issue on pals after updating to v4.1.0, while running this rake task for all exporters that export from a collection.

sentry error: https://sentry.notch8.com/sentry/pals/issues/142068/

demo

before #597, child collections were exported as CsvEntry records instead of CsvCollectionEntry records. as a result, when one of those importers/exporters is run again, the old (misidentified) entry still exists alongside a newly created one with the same identifier. the newer entry is at index 0, so calling uniq on the identifier value (as seen below) keeps the entry at index 0 and drops the older duplicate.

[screenshot: Screen Shot 2022-07-18 at 8 37 59 PM]
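The behavior described above relies on Ruby's `Array#uniq` keeping the first element for each block value. A minimal sketch of that, using a hypothetical `Entry` struct and made-up ids and identifiers as stand-ins for the real Bulkrax entry records (none of these values come from the pals data):

```ruby
# Hypothetical stand-in for a Bulkrax entry; only the fields needed for the demo.
Entry = Struct.new(:id, :type, :identifier)

entries = [
  Entry.new(14, 'CsvCollectionEntry', 'child-collection-1'), # newer, correct class (index 0)
  Entry.new(1,  'CsvEntry',           'child-collection-1'), # older, misidentified duplicate
  Entry.new(2,  'CsvEntry',           'work-1')
]

# uniq with a block keeps the first element seen for each identifier,
# so the older duplicate at index 1 is dropped.
deduped = entries.uniq(&:identifier)

deduped.map(&:type) # => ["CsvCollectionEntry", "CsvEntry"]
deduped.map(&:id)   # => [14, 2]
```

Note that this only picks the newer entry because it happens to come first in the list; if the order were reversed, the misidentified entry would be the one kept.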

expected behavior

  • ignore misidentified collection entries when writing files from the csv parser

@@ -324,7 +324,7 @@ def retrieve_cloud_files(files)
     def write_files
       require 'open-uri'
       folder_count = 0
-      sorted_entries = sort_entries(importerexporter.entries)
+      sorted_entries = sort_entries(importerexporter.entries.uniq(&:identifier))

before #597, child collections were exported as CsvEntry records instead of CsvCollectionEntry records. as a result, when one of those importers/exporters is run again, the old entry still exists alongside a newly created one with the same identifier. we want to keep only the newest one.

irb(main):148:0> current_record_ids.count
=> 13

irb(main):149:0> Bulkrax::Exporter.find(59).entries.count
=> 14

Bulkrax::Exporter.find(59).entries.uniq(&:identifier).count
=> 13
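To see which identifiers account for a count mismatch like the one above, one option is to group entries by identifier and keep the groups with more than one member. This is only a sketch with a hypothetical `EntryStub` struct and invented identifiers that mimic the 14 vs. 13 counts; in a Rails console the same `group_by` call could presumably be run on the exporter's loaded entries, but that is an assumption, not something shown in this PR.

```ruby
# Hypothetical stand-in for a Bulkrax entry; the data below is invented
# to mirror the counts in the console session (14 entries, 13 identifiers).
EntryStub = Struct.new(:id, :type, :identifier)

# One newer, correctly typed entry at index 0 ...
entries = [EntryStub.new(0, 'CsvCollectionEntry', 'record-1')]
# ... plus 13 older entries, one of which shares its identifier.
entries += (1..13).map { |n| EntryStub.new(n, 'CsvEntry', "record-#{n}") }

entries.count                    # => 14
entries.uniq(&:identifier).count # => 13

# Identifiers that appear on more than one entry.
duplicates = entries
  .group_by(&:identifier)
  .select { |_identifier, group| group.size > 1 }

duplicates.keys                    # => ["record-1"]
duplicates['record-1'].map(&:type) # => ["CsvCollectionEntry", "CsvEntry"]
```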

@alishaevn added the "patch-ver for release notes" and "bug-fix" labels on Jul 19, 2022
@alishaevn changed the title from "ignore misidentified collection entries when writing files from the c…" to "ignore-misidentified-collection-entries" on Jul 19, 2022
@alishaevn merged commit 7b4dcd2 into main on Jul 19, 2022
@alishaevn linked an issue on Aug 5, 2022 that may be closed by this pull request
@alishaevn deleted the ignore-misidentified-collection-entries branch on December 1, 2022 20:05
Linked issue that may be closed by this pull request: split downloads by size