-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensuring XML parser sets factory class then parses #705
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
bkiahstroud
approved these changes
Jan 12, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like rubocop is failing. Otherwise, approved!
This commit for changing the `Bulkrax::XmlEntry` is the sibling to the changes made to `Bulkrax::OaiEntry` (see SHA 35970ce). The salient information is pulled as a quote from that commit: > Prior to this commit, as we were reading through the record's metadata > when we were processing `shape` we would use the default > `factory_class` to check if `shape` was a valid field. Then we'd use > the value of `model` to derive the `factory_class`. And when we then > processed `smell` we'd be using the newly derived `factory_class` from > the `model` node. > > With this commit, we first establish the `factory_class` from the > `model` node. Then we again process the all the nodes to parse the > metadata. This ensures that both `shape` and `smell` are parsed > against the expected `factory_class`. In addition, I have refactored the spec to remove global state pollution of the parser config. I have also renamed and documented a method to be more descriptive; the previous method name of `xml_elements` was opaque, as it returned an Array of strings (and not an array of XML elements). Further steps could be to extract the now "ubiquitous" `Avacado` class into a spec fixture. We may want to revisit the XML parser fundamental logic; namely we only parse nodes that are explicitly declared with in the `from`. This is a bit different than other parsers, in that they will make assumptions about each encountered column (in the case of CSV) or node (in the case of OAI). tl;dr - Here there be dragons. Closes #702 Related to: - #703 - https://github.com/scientist-softserv/adventist-dl/issues/188 - https://github.com/scientist-softserv/adventist-dl/issues/187 - https://github.com/scientist-softserv/adventist-dl/issues/186 [1]:
jeremyf
added a commit
that referenced
this pull request
Jan 23, 2023
Prior to this commit, if we had configured bulkrax such that we were allowing for our model import to come from multiple fields (e.g. `from: ['model', 'work_type']`) and the CSV had `work_type` as the column name, we would get an error as follows: ``` NoMethodError: undefined method `method_defined?' for nil:NilClass # ./app/models/concerns/bulkrax/has_matchers.rb:128:in `field_supported?' # ./app/models/concerns/bulkrax/has_matchers.rb:157:in `multiple?' # ./app/models/concerns/bulkrax/has_matchers.rb:34:in `block in add_metadata' # ./app/models/concerns/bulkrax/has_matchers.rb:31:in `each' # ./app/models/concerns/bulkrax/has_matchers.rb:31:in `add_metadata' # ./app/models/bulkrax/csv_entry.rb:77:in `block in add_ingested_metadata' # ./app/models/bulkrax/csv_entry.rb:75:in `each' # ./app/models/bulkrax/csv_entry.rb:75:in `add_ingested_metadata' # ./app/models/bulkrax/csv_entry.rb:43:in `build_metadata' ``` Tracing this line it is because the `factory_class` is nil. And if the model column were to be a later column, any columns that occurred before we established the factory class may have been skipped. With this commit, we're echoing the functionality introduced in #705 and #703. Related to: - notch8/britishlibrary#211 - #705 - #703 Closes #718
jeremyf
added a commit
that referenced
this pull request
Jan 24, 2023
* Ensuring CSV sets factory class before metadata parsing Prior to this commit, if we had configured bulkrax such that we were allowing for our model import to come from multiple fields (e.g. `from: ['model', 'work_type']`) and the CSV had `work_type` as the column name, we would get an error as follows: ``` NoMethodError: undefined method `method_defined?' for nil:NilClass # ./app/models/concerns/bulkrax/has_matchers.rb:128:in `field_supported?' # ./app/models/concerns/bulkrax/has_matchers.rb:157:in `multiple?' # ./app/models/concerns/bulkrax/has_matchers.rb:34:in `block in add_metadata' # ./app/models/concerns/bulkrax/has_matchers.rb:31:in `each' # ./app/models/concerns/bulkrax/has_matchers.rb:31:in `add_metadata' # ./app/models/bulkrax/csv_entry.rb:77:in `block in add_ingested_metadata' # ./app/models/bulkrax/csv_entry.rb:75:in `each' # ./app/models/bulkrax/csv_entry.rb:75:in `add_ingested_metadata' # ./app/models/bulkrax/csv_entry.rb:43:in `build_metadata' ``` Tracing this line it is because the `factory_class` is nil. And if the model column were to be a later column, any columns that occurred before we established the factory class may have been skipped. With this commit, we're echoing the functionality introduced in #705 and #703. Related to: - notch8/britishlibrary#211 - #705 - #703 Closes #718 * Update spec/fixtures/csv/factory_class_test.csv Co-authored-by: Benjamin Kiah Stroud <32469930+bkiahstroud@users.noreply.github.com> Co-authored-by: Benjamin Kiah Stroud <32469930+bkiahstroud@users.noreply.github.com>
jeremyf
added a commit
that referenced
this pull request
Feb 3, 2023
With this commit, we're introducing `Bulkrax::EntrySpecHelper.entry_for`, a public api method for downstream Bulkrax spec support. **Release Notes** The newly provided `Bulkrax::EntrySpecHelper.entry_for` method is intended to provide a consistent mechanism for testing your entry parser logic without needing to jump through the entire import cycle for that entry. To use, will need to explicitly require it in your test suite: ```ruby require 'bulkrax/entry_spec_helper' RSpec.describe MyParser do let(:entry) { Bulkrax::EntrySpecHelper.entry_for(...) } end ``` Each of the entry types have slightly different parameter requirements for instantiation. The best source to see those is in Bulkrax's `./spec/bulkrax/entry_spec_helper_spec.rb` file Closes: #714 Related to: - notch8/adventist-dl#246 - #719 - #705
jeremyf
added a commit
that referenced
this pull request
Feb 3, 2023
With this commit, we're introducing `Bulkrax::EntrySpecHelper.entry_for`, a public api method for downstream Bulkrax spec support. **Release Notes** The newly provided `Bulkrax::EntrySpecHelper.entry_for` method is intended to provide a consistent mechanism for testing your entry parser logic without needing to jump through the entire import cycle for that entry. To use, will need to explicitly require it in your test suite: ```ruby require 'bulkrax/entry_spec_helper' RSpec.describe MyParser do let(:entry) { Bulkrax::EntrySpecHelper.entry_for(...) } end ``` Each of the entry types have slightly different parameter requirements for instantiation. The best source to see those is in Bulkrax's `./spec/bulkrax/entry_spec_helper_spec.rb` file Closes: #714 Related to: - notch8/adventist-dl#246 - #719 - #705
jeremyf
added a commit
that referenced
this pull request
Feb 6, 2023
With this commit, we're introducing `Bulkrax::EntrySpecHelper.entry_for`, a public api method for downstream Bulkrax spec support. **Release Notes** The newly provided `Bulkrax::EntrySpecHelper.entry_for` method is intended to provide a consistent mechanism for testing your entry parser logic without needing to jump through the entire import cycle for that entry. To use, will need to explicitly require it in your test suite: ```ruby require 'bulkrax/entry_spec_helper' RSpec.describe MyParser do let(:entry) { Bulkrax::EntrySpecHelper.entry_for(...) } end ``` Each of the entry types have slightly different parameter requirements for instantiation. The best source to see those is in Bulkrax's `./spec/bulkrax/entry_spec_helper_spec.rb` file Closes: #714 Related to: - notch8/adventist-dl#246 - #719 - #705
jeremyf
added a commit
to notch8/britishlibrary
that referenced
this pull request
Mar 14, 2023
Prior to this commit, with Bulkrax 5 there was a deprecation notice regarding `xml_elements`. With this commit, we're favoring `field_mapping_from_values_for_xml_element_names`. Related to: samvera/bulkrax#705
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This commit for changing the
Bulkrax::XmlEntry
is the sibling to the changes made toBulkrax::OaiEntry
(see SHA 35970ce). The salient information is pulled as a quote from that commit:In addition, I have refactored the spec to remove global state pollution of the parser config.
I have also renamed and documented a method to be more descriptive; the previous method name of
xml_elements
was opaque, as it returned an Array of strings (and not an array of XML elements).Further steps could be to extract the now "ubiquitous"
Avacado
class into a spec fixture.We may want to revisit the XML parser fundamental logic; namely we only parse nodes that are explicitly declared with in the
from
. This is a bit different than other parsers, in that they will make assumptions about each encountered column (in the case of CSV) or node (in the case of OAI). tl;dr - Here there be dragons.Closes notch8/adventist_knapsack#210
Related to: