Adding Bulkrax::EntrySpecHelper
With this commit, we're introducing
`Bulkrax::EntrySpecHelper.entry_for`, a public API method for downstream
Bulkrax spec support.

**Release Notes**

The newly provided `Bulkrax::EntrySpecHelper.entry_for` method is
intended to provide a consistent mechanism for testing your entry parser
logic without needing to jump through the entire import cycle for that
entry.

To use it, you will need to explicitly require it in your test suite:

```ruby
require 'bulkrax/entry_spec_helper'

RSpec.describe MyParser do
  let(:entry) { Bulkrax::EntrySpecHelper.entry_for(...) }
end
```

Each of the entry types has slightly different parameter requirements
for instantiation.  The best place to see those is Bulkrax's
`./spec/bulkrax/entry_spec_helper_spec.rb` file.

Closes: #714

Related to:

- scientist-softserv/adventist-dl#246
- #719
- #705
jeremyf committed Feb 3, 2023
1 parent 5fff58c commit 36855e9
Showing 4 changed files with 295 additions and 0 deletions.
9 changes: 9 additions & 0 deletions app/models/bulkrax/oai_entry.rb
@@ -9,6 +9,15 @@ class OaiEntry < Entry

delegate :record, to: :raw_record

# @api private
#
# Included to assist in testing; namely so that you can copy down an OAI entry, store it locally,
# and then manually construct an {OAI::GetRecordResponse}.
#
# @see Bulkrax::EntrySpecHelper.oai_entry_for
attr_writer :raw_record

# @return [OAI::GetRecordResponse]
def raw_record
@raw_record ||= client.get_record(identifier: identifier, metadata_prefix: parser.parser_fields['metadata_prefix'])
end
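
The `attr_writer` added in this hunk pairs with the memoized `raw_record` reader: once a value has been assigned, `||=` short-circuits and the client is never consulted. A minimal, self-contained sketch of that pattern follows. `FakeClient` and `RemoteBackedEntry` are hypothetical stand-ins introduced for illustration; they are not Bulkrax classes.

```ruby
# Hypothetical stand-in for a network client; it counts how often it is called.
class FakeClient
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def get_record(identifier:)
    @calls += 1
    "remote record for #{identifier}"
  end
end

# Hypothetical stand-in for an entry whose record is normally fetched remotely.
class RemoteBackedEntry
  # Lets a test seed the record so that the reader below never hits the client.
  attr_writer :raw_record

  def initialize(identifier:, client:)
    @identifier = identifier
    @client = client
  end

  # Memoized reader: falls through to the client only when nothing was assigned.
  def raw_record
    @raw_record ||= @client.get_record(identifier: @identifier)
  end
end

client = FakeClient.new

seeded = RemoteBackedEntry.new(identifier: "867-5309", client: client)
seeded.raw_record = "locally stored record"
seeded.raw_record # returns the seeded value without calling the client

unseeded = RemoteBackedEntry.new(identifier: "867-5309", client: client)
unseeded.raw_record # nothing was seeded, so this call goes to the client
```

This is why a locally stored copy of a record can stand in for a live fetch in a spec.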
153 changes: 153 additions & 0 deletions lib/bulkrax/entry_spec_helper.rb
@@ -0,0 +1,153 @@
# frozen_string_literal: true

require 'oai'
require 'xml/libxml'

module Bulkrax
##
# The purpose of this module is to provide some testing facilities for those that include the
# Bulkrax gem in their application.
#
# This module came about through a desire to expose a quick means of vetting the accuracy of the
# different parsers.
module EntrySpecHelper
##
# @api public
# @since v5.0.1
#
# The purpose of this method is to encapsulate the logic of creating the appropriate Bulkrax::Entry
# object based on the given data, identifier, and parser_class_name.
#
# From that entry, you should be able to test how {Bulkrax::Entry#build_metadata} populates the
# {Bulkrax::Entry#parsed_metadata} variable. Other uses may emerge.
#
# @param data [Object] the data that we use to populate the raw metadata. Due to implementation
# details of each entry, the data will be of different formats.
#
# @param identifier [String, Integer] The identifier of the entry. This might also be found in
# the metadata of the entry, but for instantiation purposes we need this value.
# @param parser_class_name [String] The name of the parser class you're wanting to test.
# @param options [Hash<Symbol,Object>] these are to be passed along into the instantiation of
# the various classes. See implementation details.
#
# @return [Bulkrax::Entry]
def self.entry_for(data:, identifier:, parser_class_name:, **options)
importer = importer_for(parser_class_name: parser_class_name, **options)

# Using an instance of the entry_class to dispatch to the different build methods below.
entry_for_dispatch = importer.parser.entry_class.new

# Using the {is_a?} test we get the benefit of inspecting an object's inheritance path
# (e.g. ancestry). The logic, as implemented, also provides a mechanism for folks in their
# applications to add their own `build_<symbol>_entry_for` method, registered via
# {entry_class_to_symbol_map=}. I suspect that isn't likely, but given the wide variety of
# underlying needs I could see it happening, and I want to encourage patterned thinking to
# fold that different build method into this structure.
key = entry_class_to_symbol_map.keys.detect { |class_name| entry_for_dispatch.is_a?(class_name.constantize) }

# Yes, we'll raise an error if we didn't find a corresponding key. And that's okay.
symbol = entry_class_to_symbol_map.fetch(key)

send("build_#{symbol}_entry_for", importer: importer, identifier: identifier, data: data, **options)
end

DEFAULT_ENTRY_CLASS_TO_SYMBOL_MAP = {
'Bulkrax::OaiEntry' => :oai,
'Bulkrax::XmlEntry' => :xml,
'Bulkrax::CsvEntry' => :csv
}.freeze

# Present implementations of entry classes tend to inherit from the below listed class names.
# We're not looking to register all descendants of the {Bulkrax::Entry} class, but instead find
# the ancestor where there is significant deviation.
def self.entry_class_to_symbol_map
@entry_class_to_symbol_map || DEFAULT_ENTRY_CLASS_TO_SYMBOL_MAP
end

def self.entry_class_to_symbol_map=(value)
@entry_class_to_symbol_map = value
end

def self.importer_for(parser_class_name:, parser_fields: {}, **options)
# Ideally, we could pass in the field_mapping. However, there is logic that ignores the
# parser's field_mapping and directly asks for Bulkrax's field_mapping (e.g. model_mapping
# method).
Rails.logger.warn("You passed :importer_field_mappings as an option. This may not fully work as desired.") if options.key?(:importer_field_mappings)
Bulkrax::Importer.new(
name: options.fetch(:importer_name, "Test importer for identifier"),
admin_set_id: options.fetch(:importer_admin_set_id, "admin_set/default"),
user: options.fetch(:importer_user, User.new(email: "hello@world.com")),
limit: options.fetch(:importer_limits, 1),
parser_klass: parser_class_name,
field_mapping: options.fetch(:importer_field_mappings) { Bulkrax.field_mappings.fetch(parser_class_name) },
parser_fields: parser_fields
)
end
private_class_method :importer_for

##
# @api private
#
# @param data [Hash<Symbol,String>] we're expecting a hash with keys that are symbols and then
# values that are strings.
#
# @return [Bulkrax::CsvEntry]
#
# @note As a foible of this implementation, you'll need to pass along a CSV (e.g. via the
# 'import_file_path' parser field) to establish the columns that you'll parse (i.e. the first row).
def self.build_csv_entry_for(importer:, data:, identifier:, **_options)
importer.parser.entry_class.new(
importerexporter: importer,
identifier: identifier,
raw_metadata: data
)
end

##
# @api private
#
# @param data [String] we're expecting a string that is well-formed XML for OAI parsing.
#
# @return [Bulkrax::OaiEntry]
def self.build_oai_entry_for(importer:, data:, identifier:, **options)
# The raw record assumes we take the XML data, parse it and then send that to the
# OAI::GetRecordResponse object.
doc = XML::Parser.string(data)
raw_record = OAI::GetRecordResponse.new(doc.parse)

raw_metadata = {
importer.parser.source_identifier.to_s => identifier,
"data" => data,
"collections" => options.fetch(:raw_metadata_collections, []),
"children" => options.fetch(:raw_metadata_children, [])
}

importer.parser.entry_class.new(
raw_record: raw_record,
importerexporter: importer,
identifier: identifier,
raw_metadata: raw_metadata
)
end

##
# @api private
#
# @param data [String] we're expecting a string that is well-formed XML.
#
# @return [Bulkrax::XmlEntry]
def self.build_xml_entry_for(importer:, data:, identifier:, **options)
raw_metadata = {
importer.parser.source_identifier.to_s => identifier,
"data" => data,
"collections" => options.fetch(:raw_metadata_collections, []),
"children" => options.fetch(:raw_metadata_children, [])
}

importer.parser.entry_class.new(
importerexporter: importer,
identifier: identifier,
raw_metadata: raw_metadata
)
end
end
end
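
The dispatch in `entry_for` can be hard to picture from the flattened listing, so here is a minimal, self-contained sketch of the same idea: map ancestor class names to symbols, find the first registered ancestor with `is_a?`, then `send` to a build method named after the symbol. Every class, module, and method name below is hypothetical and exists only for illustration; `Object.const_get` stands in for ActiveSupport's `constantize` so the sketch runs in plain Ruby.

```ruby
# Hypothetical entry hierarchy; these are stand-ins, not Bulkrax classes.
class BaseEntry; end
class CsvLikeEntry < BaseEntry; end
class XmlLikeEntry < BaseEntry; end
class CustomCsvEntry < CsvLikeEntry; end # an application-specific subclass

module DispatchSketch
  # Maps an ancestor class name to the symbol used to pick a build method.
  CLASS_TO_SYMBOL = {
    "CsvLikeEntry" => :csv,
    "XmlLikeEntry" => :xml
  }.freeze

  def self.build_for(entry_class:, data:)
    instance = entry_class.new
    # is_a? walks the ancestry, so subclasses match their registered ancestor.
    key = CLASS_TO_SYMBOL.keys.detect { |name| instance.is_a?(Object.const_get(name)) }
    # fetch raises KeyError when no registered ancestor matched (key is nil).
    symbol = CLASS_TO_SYMBOL.fetch(key)
    send("build_#{symbol}", data: data)
  end

  def self.build_csv(data:)
    [:csv, data]
  end

  def self.build_xml(data:)
    [:xml, data]
  end
end

# The subclass is not registered itself, yet it is routed to the :csv builder.
DispatchSketch.build_for(entry_class: CustomCsvEntry, data: { title: "If You Want to Go Far" })
```

Registering only the ancestors where behavior diverges keeps the map small while still covering application-specific subclasses.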
132 changes: 132 additions & 0 deletions spec/bulkrax/entry_spec_helper_spec.rb
@@ -0,0 +1,132 @@
# frozen_string_literal: true

require 'rails_helper'
require 'bulkrax/entry_spec_helper'
require 'byebug'

RSpec.describe Bulkrax::EntrySpecHelper do
describe '.entry_for' do
let(:identifier) { "867-5309" }
let(:options) { {} }
subject(:entry) { described_class.entry_for(identifier: identifier, data: data, parser_class_name: parser_class_name, **options) }

context 'for parser_class_name: "Bulkrax::CsvParser"' do
let(:parser_class_name) { "Bulkrax::CsvParser" }
let(:import_file_path) { 'spec/fixtures/csv/good.csv' }
let(:options) do
{
parser_fields: {
# Columns are: model,source_identifier,title,parents_column
'import_file_path' => import_file_path
}
}
end

let(:data) { { model: "Work", source_identifier: identifier, title: "If You Want to Go Far" } }

it { is_expected.to be_a(Bulkrax::CsvEntry) }

it "parses metadata" do
entry.build_metadata

expect(entry.factory_class).to eq(Work)
{
"title" => ["If You Want to Go Far"],
"admin_set_id" => "admin_set/default",
"source" => [identifier]
}.each do |key, value|
expect(entry.parsed_metadata.fetch(key)).to eq(value)
end
end
end

context 'for parser_class_name: "Bulkrax::OaiDcParser"' do
let(:parser_class_name) { "Bulkrax::OaiDcParser" }
let(:options) do
{
parser_fields: {
"metadata_prefix" => 'oai_fcrepo',
"base_url" => "http://oai.samvera.org/OAI-script",
"thumbnail_url" => ''
}
}
end
let(:data) do
%(<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2023-02-01T20:41:11Z</responseDate>
<request verb="GetRecord">#{options.fetch(:parser_fields).fetch('base_url')}?identifier=#{identifier}&amp;metadataPrefix=#{options.fetch(:parser_fields).fetch('metadata_prefix')}&amp;verb=GetRecord</request>
<GetRecord>
<record>
<header>
<identifier>#{identifier}</identifier>
<datestamp>2022-12-15T05:09:20Z</datestamp>
<setSpec>adl:book</setSpec>
</header>
<metadata>
<oai_fcrepo>
<title>If You Want to Go Far</title>
<resource_type>Article</resource_type>
</oai_fcrepo>
</metadata>
</record>
</GetRecord>
</OAI-PMH>)
end

it { is_expected.to be_a(Bulkrax::OaiDcEntry) }

it "parses metadata" do
allow(Collection).to receive(:where).and_return([])
entry.build_metadata

expect(entry.factory_class).to eq(Work)
{
"title" => ["If You Want to Go Far"],
"admin_set_id" => "admin_set/default",
"source" => [identifier]
}.each do |key, value|
expect(entry.parsed_metadata.fetch(key)).to eq(value)
end
end
end

context 'for parser_class_name: "Bulkrax::XmlParser"' do
let(:parser_class_name) { "Bulkrax::XmlParser" }
let(:data) do
%(<metadata>
<title>If You Want to Go Far</title>
<resource_type>Article</resource_type>
</metadata>)
end

it { is_expected.to be_a(Bulkrax::XmlEntry) }

around do |spec|
# Because of the implementation of XML parsing, we don't have defaults. We set them here
initial_value = Bulkrax.field_mappings[parser_class_name]
Bulkrax.field_mappings[parser_class_name] = {
'title' => { from: ['title'] },
'single_object' => { from: ['resource_type'] },
'source' => { from: ['identifier'], source_identifier: true }
}
spec.run
Bulkrax.field_mappings[parser_class_name] = initial_value
end

it "parses metadata" do
entry.build_metadata

expect(entry.factory_class).to eq(Work)
{
"title" => ["If You Want to Go Far"],
"admin_set_id" => "admin_set/default",
"single_object" => "Article",
"source" => [identifier]
}.each do |key, value|
expect(entry.parsed_metadata.fetch(key)).to eq(value)
end
end
end
end
end
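
The `around` hook in the XML context above swaps a global field mapping in and puts the prior value back after the example runs. A variation on that save-and-restore idea, sketched in plain Ruby with `ensure` so the restore also happens when the block raises, might look like the following. `SketchConfig` and `with_field_mapping` are hypothetical names introduced here; they are not part of Bulkrax or RSpec.

```ruby
# Hypothetical global configuration; a stand-in for an application-wide mapping registry.
module SketchConfig
  @field_mappings = { "SomeParser" => { "title" => { from: ["title"] } } }

  class << self
    attr_reader :field_mappings
  end
end

# Temporarily replaces one entry of the global mapping, yields, and always restores
# the prior value, even when the block raises. As in the spec above, a key that was
# absent beforehand ends up present with a nil value rather than being deleted.
def with_field_mapping(parser_class_name, mapping)
  initial_value = SketchConfig.field_mappings[parser_class_name]
  SketchConfig.field_mappings[parser_class_name] = mapping
  yield
ensure
  SketchConfig.field_mappings[parser_class_name] = initial_value
end

with_field_mapping("SomeParser", { "title" => { from: ["TitleLargerEntity"] } }) do
  SketchConfig.field_mappings["SomeParser"] # the temporary mapping is visible here
end
SketchConfig.field_mappings["SomeParser"] # the original mapping is back in place
```

Scoping the change to a block keeps a mutation of shared state from leaking into later examples.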
1 change: 1 addition & 0 deletions spec/parsers/bulkrax/xml_parser_spec.rb
@@ -10,6 +10,7 @@ module Bulkrax
let(:entry) { FactoryBot.create(:bulkrax_entry, importerexporter: importer) }

before do
# NOTE: this will update the field mappings for all subsequent runs of all of the specs.
Bulkrax.field_mappings['Bulkrax::XmlParser'] = {
'title' => { from: ['TitleLargerEntity'] },
'abstract' => { from: ['Abstract'] },
