This repository has been archived by the owner on Oct 24, 2024. It is now read-only.

Capturing raw metadata for OAI parsing of works #211

Merged
merged 1 commit into from
Dec 21, 2022

Conversation

jeremyf
Contributor

@jeremyf jeremyf commented Dec 21, 2022

Prior to this commit, we didn't capture the raw metadata of works parsed via OAI-PMH.

From the Bulkrax change:

With this change, we now capture that information.

Below is an example showing that we need to use the record's `metadata` as a string.

```ruby
gem "oai"
require 'oai'
client = OAI::Client.new(
  "http://oai.adventistdigitallibrary.org/OAI-script",
  headers: { from: "jeremy@scientist.com" },
  parser: 'libxml')

opts = {
  metadata_prefix: "oai_adl",
  set: "adl:issue"
}

records = client.list_records(opts)

records.each_with_index do |r, i|
  puts "Working on record ##{i+1}"
  puts "Metadata:\n#{r.metadata.to_s}"
end
```
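
For context on why the explicit `to_s` matters: a record's `metadata` is an XML node object, not a String, so it has to be serialized before it can be stored as raw text. The following is a minimal offline sketch, not the PR's code. It assumes a REXML-style node (the example above uses the `'libxml'` parser, whose node class differs), and the XML payload and variable names are invented for illustration.

```ruby
require 'rexml/document'

# Invented payload standing in for the <metadata> element of an OAI-PMH record.
xml = '<metadata><title>Example issue</title></metadata>'

# Assumption: the parsed record exposes its metadata as an XML element node.
node = REXML::Document.new(xml).root

# The node is an object with its own API, not a String.
puts node.class
puts node.is_a?(String)

# Serializing with to_s yields the raw XML as a String, suitable for storage.
raw_metadata = node.to_s
puts raw_metadata.class
puts raw_metadata
```

Storing the serialized string rather than the node keeps the captured value independent of whichever XML parser the client was configured with.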

Related to: samvera/bulkrax#694

# whose ancestors include this ref.
# rubocop:disable Metrics/LineLength
gem 'bulkrax', "~> 4.4", git: "https://github.com/samvera-labs/bulkrax.git", ref: "d36cf3606545ea30da8d8082f1b67b96d9aaf8c1"
gem 'bulkrax', "~> 4.4", git: "https://github.com/samvera-labs/bulkrax.git", ref: "d1d0eca5963fcd190955ab75bcf9d0285afa2bb2"

I don't think I've seen refs like this before. Does this point to a Bulkrax commit or something?

Contributor Author

This does point to a specific commit on the main branch.

Oh cool! Ty :)

@jeremyf jeremyf merged commit 3848f83 into main Dec 21, 2022
@jeremyf jeremyf deleted the jeremyf---adding-raw-metadata-capture-for-oai branch December 21, 2022 22:09
jeremyf added a commit that referenced this pull request Dec 22, 2022
> Moving OAI metadata capture to entry processing
>
> In writing samvera/bulkrax#694 I introduced an OAI bug.  There
> are two OAI fetch cycles.
>
> The first is the list of records. These are fetched via the
> `list_identifiers` method. That method returns header information
> without metadata. Then, we submit one job per header. At that point
> we perform the second fetch of the record. This includes the full
> metadata.
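
The two fetch cycles described in the commit message can be sketched roughly as follows. This is an illustrative, offline approximation rather than Bulkrax's implementation: `FakeOaiClient`, `Header`, `Record`, and `process_entry` are hypothetical stand-ins, and the real OAI client's response objects are shaped differently. The point is only that the header listing carries no metadata, so the raw metadata has to be captured during the per-entry fetch.

```ruby
# Hypothetical stand-ins so the sketch runs without a network connection.
Header = Struct.new(:identifier)
Record = Struct.new(:header, :metadata)

class FakeOaiClient
  # records: Hash of identifier => metadata XML string (invented sample data)
  def initialize(records)
    @records = records
  end

  # First fetch cycle: returns headers only, with no metadata attached.
  def list_identifiers
    @records.keys.map { |id| Header.new(id) }
  end

  # Second fetch cycle: returns the full record, including its metadata.
  def get_record(identifier:)
    Record.new(Header.new(identifier), @records.fetch(identifier))
  end
end

# Hypothetical per-header job: fetch the full record, then capture its
# metadata as a string while processing the entry.
def process_entry(client, identifier)
  record = client.get_record(identifier: identifier)
  { identifier: identifier, raw_metadata: record.metadata.to_s }
end

client = FakeOaiClient.new({
  "oai:example:1" => "<metadata><title>First</title></metadata>",
  "oai:example:2" => "<metadata><title>Second</title></metadata>"
})

headers = client.list_identifiers
entries = headers.map { |header| process_entry(client, header.identifier) }

entries.each { |entry| puts "#{entry[:identifier]}: #{entry[:raw_metadata]}" }
```

Capturing the metadata at the listing step would fail in this shape, because a `Header` has no `metadata` to read. That is why the capture belongs in entry processing.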

Related to:

- samvera/bulkrax#697
- samvera/bulkrax#694
- #211