Add file language to the manifest (upload spreadsheet) #1382

Merged
merged 2 commits into from
Nov 13, 2023
16 changes: 9 additions & 7 deletions app/lib/pre_assembly/file_manifest.rb
@@ -9,7 +9,13 @@ class FileManifest
attr_reader :csv_filename, :staging_location

# the valid roles a file can have, if you specify a "role" column and the value is not one of these, it will be ignored
VALID_ROLES = %w[transcription annotations derivative master].freeze
VALID_ROLES = %w[
annotations
caption
derivative
master
transcription
].freeze

# the required columns that must exist in the file manifest
REQUIRED_COLUMNS = %w[druid filename resource_label sequence publish shelve preserve resource_type].freeze
@@ -79,7 +85,8 @@ def file_properties_from_row(row)
externalIdentifier: FileIdentifierGenerator.generate,
filename: row[:filename],
label: row[:file_label].presence || row[:filename],
use: role(row),
languageTag: row[:file_language],
use: VALID_ROLES.include?(row[:role]) ? row[:role] : nil, # filter out unexpected role values
administrative: administrative(row),
hasMessageDigests: md5_digest(row),
hasMimeType: row[:mimetype].presence,
@@ -105,11 +112,6 @@ def administrative(row)
{ sdrPreserve: preserve, publish:, shelve: }
end

# @return [String] the role for the file (if a valid role value, otherwise nil)
def role(row)
row[:role] if VALID_ROLES.include?(row[:role])
end

def md5_digest(row)
container_path = File.join(staging_location, row[:druid])
# look for a checksum file named the same as this file
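The change to `file_properties_from_row` above can be sketched in isolation. This is a minimal, plain-Ruby approximation and not the project's code: `presence` here is a hypothetical stand-in for ActiveSupport's method, `file_properties_sketch` is an invented name, the role list is passed as a default argument rather than read from the `VALID_ROLES` constant, and the sample filenames and the `en-US` value are made up for illustration.

```ruby
# Hypothetical stand-in for ActiveSupport's `presence`: nil for nil/blank strings.
def presence(value)
  value.nil? || value.strip.empty? ? nil : value
end

# Sketch of the new mapping: languageTag is copied from the file_language column,
# and role values outside the allowed list are replaced with nil.
# The default list mirrors the updated VALID_ROLES in the diff.
def file_properties_sketch(row, valid_roles: %w[annotations caption derivative master transcription])
  {
    filename: row[:filename],
    label: presence(row[:file_label]) || row[:filename],
    languageTag: row[:file_language],
    use: valid_roles.include?(row[:role]) ? row[:role] : nil
  }
end

# Illustrative rows (not taken from the fixtures).
caption = file_properties_sketch({ filename: 'video_1.vtt', role: 'caption', file_language: 'en-US' })
bogus   = file_properties_sketch({ filename: 'img_2.tif', role: 'bogus_role', file_language: nil })
```

In this sketch `caption[:use]` keeps the newly allowed `caption` role, while `bogus[:use]` is `nil` because `bogus_role` is not in the list. Whether the real code validates the language value at all is not shown in the diff.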
10 changes: 3 additions & 7 deletions app/lib/pre_assembly/from_staging_location/structural_builder.rb
@@ -26,7 +26,7 @@ def initialize(filesets:, cocina_dro:, all_files_public:, reading_order: nil)
# rubocop:disable Metrics/AbcSize
# rubocop:disable Metrics/MethodLength
def build
# a counter to use when creating auto-labels for resources, with incremenets for each type
# a counter to use when creating auto-labels for resources, with increments for each type
resource_type_counters = Hash.new(0)

# rubocop:disable Metrics/BlockLength
@@ -47,9 +47,9 @@ def build
hasMessageDigests: message_digests(fileset_file),
hasMimeType: fileset_file.mimetype,
administrative: administrative(fileset_file),
access: file_access
access: file_access,
use: fileset_file.file_attributes[:role]
}
file_attributes[:use] = fileset_file.file_attributes[:role] if role?(fileset_file)

Cocina::Models::File.new(file_attributes)
end
@@ -110,10 +110,6 @@ def administrative(fileset_file)
{ sdrPreserve: preserve, publish:, shelve: }
end

def role?(fileset_file)
fileset_file.file_attributes.key? :role
end

def message_digests(fileset_file)
fileset_file.provider_md5 ? [{ type: 'md5', digest: fileset_file.provider_md5 }] : []
end
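The `structural_builder.rb` change above drops the `role?` guard and always sets `use:`. A small sketch of the behavioral difference, using plain hashes in place of the real fileset objects (the method names and the `'placeholder'` access value are hypothetical). It assumes, as the PR appears to, that `Cocina::Models::File` treats a `nil` `use` the same as an absent one; that is not verified here.

```ruby
# Old shape: the :use key is only added when a :role key exists.
def old_attributes(file_attributes)
  attrs = { access: 'placeholder' }
  attrs[:use] = file_attributes[:role] if file_attributes.key?(:role)
  attrs
end

# New shape: :use is always present; Hash#[] returns nil for a missing :role.
def new_attributes(file_attributes)
  { access: 'placeholder', use: file_attributes[:role] }
end

without_role_old = old_attributes({})
without_role_new = new_attributes({})
with_role_old    = old_attributes({ role: 'transcription' })
with_role_new    = new_attributes({ role: 'transcription' })
```

When a role is present the two shapes produce equal hashes. Without a role, the old shape omits `:use` entirely while the new one carries `use: nil`.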
@@ -1,3 +1,3 @@
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role
bb000kk0000,page_0001.pdf,page 1,1,yes,yes,yes,page,
bb000kk0000,page_0001.xml,page 1,1,yes,yes,yes,page,transcription
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
bb000kk0000,page_0001.pdf,page 1,1,yes,yes,yes,page,,
bb000kk0000,page_0001.xml,page 1,1,yes,yes,yes,page,transcription,
10 changes: 5 additions & 5 deletions spec/fixtures/book-file-manifest-extra-file/file_manifest.csv
@@ -1,5 +1,5 @@
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role
bb000kk0000,page_0000.jpg,page 1,1,no,yes,no,page,
bb000kk0000,page_0001.jpg,page 1,1,no,yes,no,page,
bb000kk0000,page_0001.pdf,page 1,1,yes,yes,yes,page,
bb000kk0000,page_0001.xml,page 1,1,yes,yes,yes,page,transcription
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
bb000kk0000,page_0000.jpg,page 1,1,no,yes,no,page,,
bb000kk0000,page_0001.jpg,page 1,1,no,yes,no,page,,
bb000kk0000,page_0001.pdf,page 1,1,yes,yes,yes,page,,
bb000kk0000,page_0001.xml,page 1,1,yes,yes,yes,page,transcription,
20 changes: 10 additions & 10 deletions spec/fixtures/book-file-manifest/file_manifest.csv
@@ -1,10 +1,10 @@
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role
bb000kk0000,page_0001.jpg,page 1,1,no,yes,no,page,
bb000kk0000,page_0001.pdf,page 1,1,yes,yes,yes,page,
bb000kk0000,page_0001.xml,page 1,1,yes,yes,yes,page,transcription
bb000kk0000,page_0002.jpg,page 2,2,no,yes,no,page,
bb000kk0000,page_0002.pdf,page 2,2,yes,yes,yes,page,
bb000kk0000,page_0002.xml,page 2,2,yes,yes,yes,page,transcription
bb000kk0000,page_0003.jpg,page 3,3,no,yes,no,page,
bb000kk0000,page_0003.pdf,page 3,3,yes,yes,yes,page,
bb000kk0000,page_0003.xml,page 3,3,yes,yes,yes,page,transcription
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
bb000kk0000,page_0001.jpg,page 1,1,no,yes,no,page,,
bb000kk0000,page_0001.pdf,page 1,1,yes,yes,yes,page,,
bb000kk0000,page_0001.xml,page 1,1,yes,yes,yes,page,transcription,
bb000kk0000,page_0002.jpg,page 2,2,no,yes,no,page,,
bb000kk0000,page_0002.pdf,page 2,2,yes,yes,yes,page,,
bb000kk0000,page_0002.xml,page 2,2,yes,yes,yes,page,transcription,
bb000kk0000,page_0003.jpg,page 3,3,no,yes,no,page,,
bb000kk0000,page_0003.pdf,page 3,3,yes,yes,yes,page,,
bb000kk0000,page_0003.xml,page 3,3,yes,yes,yes,page,transcription,
@@ -1,8 +1,8 @@
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role
mm111mm2222,test1.txt,page 1,1,no,yes,no,page,
mm111mm2222,config/test.yml,file 2,1,yes,yes,yes,file,
mm111mm2222,config/settings/test.yml,file 3,1,yes,yes,yes,page,transcription
mm111mm2222,config/settings/test1.yml,file 4,1,yes,yes,yes,page,transcription
mm111mm2222,config/settings/test2.yml,file 5,1,yes,yes,yes,page,transcription
mm111mm2222,config/images/image.jpg,page 2,2,yes,yes,yes,page,page
mm111mm2222,config/images/subdir/image.jpg,file 6,2,yes,yes,yes,page,transcription
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
mm111mm2222,test1.txt,page 1,1,no,yes,no,page,,
mm111mm2222,config/test.yml,file 2,1,yes,yes,yes,file,,
mm111mm2222,config/settings/test.yml,file 3,1,yes,yes,yes,page,transcription,
mm111mm2222,config/settings/test1.yml,file 4,1,yes,yes,yes,page,transcription,
mm111mm2222,config/settings/test2.yml,file 5,1,yes,yes,yes,page,transcription,
mm111mm2222,config/images/image.jpg,page 2,2,yes,yes,yes,page,page,
mm111mm2222,config/images/subdir/image.jpg,file 6,2,yes,yes,yes,page,transcription,
4 changes: 2 additions & 2 deletions spec/fixtures/media_missing/file_manifest.csv
@@ -1,2 +1,2 @@
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role
sn000dd0000,test.csv,Test,1,yes,yes,yes,page,,
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
sn000dd0000,test.csv,Test,1,yes,yes,yes,page,,,
40 changes: 19 additions & 21 deletions spec/fixtures/media_missing/file_manifest_blank_rows.csv
@@ -1,23 +1,21 @@
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role
aa111aa1111,aa111aa1111_001_a_pm.wav,"Tape 1, Side A",1,no,yes,no,media,
aa111aa1111,aa111aa1111_001_a_sh.wav,,,no,yes,no,,
aa111aa1111,aa111aa1111_001_a_sl.mp3,,,yes,yes,yes,,
aa111aa1111,aa111aa1111_001_img_1.jpg,,,yes,yes,yes,,

aa111aa1111,aa111aa1111_001_b_pm.wav,"Tape 1, Side B",2,yes,yes,yes,file,
aa111aa1111,aa111aa1111_001_b_sh.wav,,,yes,yes,no,,
aa111aa1111,aa111aa1111_001_b_sl.mp3,,,yes,yes,yes,,
aa111aa1111,aa111aa1111_001_img_2.jpg,,,yes,yes,yes,,
aa111aa1111,aa111aa1111.pdf,Transcript,3,yes,yes,yes,file,
bb222bb2222,bb222bb2222_002_a_pm.wav,"Tape 1, Side A",1,no,yes,no,media,

bb222bb2222,bb222bb2222_002_a_sh.wav,,,no,yes,no,,
bb222bb2222,bb222bb2222_002_a_sl.mp3,,,yes,yes,yes,,
bb222bb2222,bb222bb2222_002_img_1.jpg,,,yes,yes,yes,,
bb222bb2222,bb222bb2222_002_b_pm.wav,"Tape 1, Side B",2,no,yes,no,media,
bb222bb2222,bb222bb2222_002_b_sh.wav,,,yes,yes,no,,
bb222bb2222,bb222bb2222_002_b_sl.mp3,,,yes,yes,yes,,
bb222bb2222,bb222bb2222_002_img_2.jpg,,,yes,yes,yes,,
bb222bb2222,bb222bb2222.pdf,Transcript,3,yes,yes,yes,file,transcription
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
aa111aa1111,aa111aa1111_001_a_pm.wav,"Tape 1, Side A",1,no,yes,no,media,,
aa111aa1111,aa111aa1111_001_a_sh.wav,,,no,yes,no,,,
aa111aa1111,aa111aa1111_001_a_sl.mp3,,,yes,yes,yes,,,
aa111aa1111,aa111aa1111_001_img_1.jpg,,,yes,yes,yes,,,

aa111aa1111,aa111aa1111_001_b_pm.wav,"Tape 1, Side B",2,yes,yes,yes,file,,
aa111aa1111,aa111aa1111_001_b_sh.wav,,,yes,yes,no,,,
aa111aa1111,aa111aa1111_001_b_sl.mp3,,,yes,yes,yes,,,
aa111aa1111,aa111aa1111_001_img_2.jpg,,,yes,yes,yes,,,
aa111aa1111,aa111aa1111.pdf,Transcript,3,yes,yes,yes,file,,
bb222bb2222,bb222bb2222_002_a_pm.wav,"Tape 1, Side A",1,no,yes,no,media,,

bb222bb2222,bb222bb2222_002_a_sh.wav,,,no,yes,no,,,
bb222bb2222,bb222bb2222_002_a_sl.mp3,,,yes,yes,yes,,,
bb222bb2222,bb222bb2222_002_img_1.jpg,,,yes,yes,yes,,,
bb222bb2222,bb222bb2222_002_b_pm.wav,"Tape 1, Side B",2,no,yes,no,media,,
bb222bb2222,bb222bb2222_002_b_sh.wav,,,yes,yes,no,,,
bb222bb2222,bb222bb2222_002_b_sl.mp3,,,yes,yes,yes,,,
bb222bb2222,bb222bb2222_002_img_2.jpg,,,yes,yes,yes,,,
bb222bb2222,bb222bb2222.pdf,Transcript,3,yes,yes,yes,file,transcription,
38 changes: 19 additions & 19 deletions spec/fixtures/media_missing/file_manifest_invalid.csv
@@ -1,19 +1,19 @@
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role
aa111aa1111,aa111aa1111_001_a_pm.wav,"Tape 1, Side A",1,no,yes,no,media,
aa111aa1111,aa111aa1111_001_a_sh.wav,,,no,yes,no,,
aa111aa1111,aa111aa1111_001_a_sl.mp3,,,yes,yes,yes,,
aa111aa1111,aa111aa1111_001_img_1.jpg,,,yes,yes,yes,,
aa111aa1111,aa111aa1111_001_b_pm.wav,"Tape 1, Side B",2,yes,yes,yes,file,
aa111aa1111,aa111aa1111_001_b_sh.wav,,,no,no,no,,
aa111aa1111,aa111aa1111_001_b_sl.mp3,,,yes,yes,yes,,
aa111aa1111,aa111aa1111_001_img_2.jpg,,,yes,yes,yes,,
aa111aa1111,aa111aa1111.pdf,Transcript,3,yes,yes,yes,file,
bb222bb2222,bb222bb2222_002_a_pm.wav,"Tape 1, Side A",1,no,yes,no,media,
bb222bb2222,bb222bb2222_002_a_sh.wav,,,no,yes,no,,
bb222bb2222,bb222bb2222_002_a_sl.mp3,,,yes,yes,yes,,
bb222bb2222,bb222bb2222_002_img_1.jpg,,,yes,yes,yes,,
bb222bb2222,bb222bb2222_002_b_pm.wav,"Tape 1, Side B",2,no,yes,no,media,
bb222bb2222,bb222bb2222_002_b_sh.wav,,,no,no,no,,
bb222bb2222,bb222bb2222_002_b_sl.mp3,,,yes,yes,yes,,
bb222bb2222,bb222bb2222_002_img_2.jpg,,,yes,yes,yes,,
bb222bb2222,bb222bb2222.pdf,Transcript,3,yes,yes,yes,file,transcription
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
aa111aa1111,aa111aa1111_001_a_pm.wav,"Tape 1, Side A",1,no,yes,no,media,,
aa111aa1111,aa111aa1111_001_a_sh.wav,,,no,yes,no,,,
aa111aa1111,aa111aa1111_001_a_sl.mp3,,,yes,yes,yes,,,
aa111aa1111,aa111aa1111_001_img_1.jpg,,,yes,yes,yes,,,
aa111aa1111,aa111aa1111_001_b_pm.wav,"Tape 1, Side B",2,yes,yes,yes,file,,
aa111aa1111,aa111aa1111_001_b_sh.wav,,,no,no,no,,,
aa111aa1111,aa111aa1111_001_b_sl.mp3,,,yes,yes,yes,,,
aa111aa1111,aa111aa1111_001_img_2.jpg,,,yes,yes,yes,,,
aa111aa1111,aa111aa1111.pdf,Transcript,3,yes,yes,yes,file,,
bb222bb2222,bb222bb2222_002_a_pm.wav,"Tape 1, Side A",1,no,yes,no,media,,
bb222bb2222,bb222bb2222_002_a_sh.wav,,,no,yes,no,,,
bb222bb2222,bb222bb2222_002_a_sl.mp3,,,yes,yes,yes,,,
bb222bb2222,bb222bb2222_002_img_1.jpg,,,yes,yes,yes,,,
bb222bb2222,bb222bb2222_002_b_pm.wav,"Tape 1, Side B",2,no,yes,no,media,,
bb222bb2222,bb222bb2222_002_b_sh.wav,,,no,no,no,,,
bb222bb2222,bb222bb2222_002_b_sl.mp3,,,yes,yes,yes,,,
bb222bb2222,bb222bb2222_002_img_2.jpg,,,yes,yes,yes,,,
bb222bb2222,bb222bb2222.pdf,Transcript,3,yes,yes,yes,file,transcription,
6 changes: 3 additions & 3 deletions spec/fixtures/media_missing/file_manifest_missing_columns.csv
@@ -1,3 +1,3 @@
druid,filename,label,sequence,resource_type,role
aa111aa1111,aa111aa1111_001_a_pm.wav,"Tape 1, Side A",1,media,
aa111aa1111,aa111aa1111_001_a_sh.wav,,,,
druid,filename,label,sequence,resource_type,role,file_language
aa111aa1111,aa111aa1111_001_a_pm.wav,"Tape 1, Side A",1,media,,
aa111aa1111,aa111aa1111_001_a_sh.wav,,,,,
@@ -1,3 +1,3 @@
object,filename,label,sequence,resource_type,role
aa111aa1111,aa111aa1111_001_a_pm.wav,"Tape 1, Side A",1,media,
aa111aa1111,aa111aa1111_001_a_sh.wav,,,,
object,filename,label,sequence,resource_type,role,file_language
aa111aa1111,aa111aa1111_001_a_pm.wav,"Tape 1, Side A",1,media,,
aa111aa1111,aa111aa1111_001_a_sh.wav,,,,,
2 changes: 1 addition & 1 deletion spec/fixtures/media_missing/file_manifest_no_rows.csv
@@ -1 +1 @@
object,filename,label,sequence,publish,preserve,shelve,resource_type,role
object,filename,label,sequence,publish,preserve,shelve,resource_type,role,file_language
20 changes: 10 additions & 10 deletions spec/fixtures/media_video_test/file_manifest.csv
@@ -1,10 +1,10 @@
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role
vd000bj0000,vd000bj0000_video_1.mp4,Video file 1,1,yes,yes,yes,video,,
vd000bj0000,vd000bj0000_video_1.mpeg,Video file 1,1,no,yes,no,video,,
vd000bj0000,vd000bj0000_video_1_thumb.jp2,Video file 1,1,yes,yes,yes,image,thumb,
vd000bj0000,vd000bj0000_video_2.mp4,Video file 2,2,yes,yes,yes,video,,
vd000bj0000,vd000bj0000_video_2.mpeg,Video file 2,2,no,yes,no,video,,
vd000bj0000,vd000bj0000_video_2_thumb.jp2,Video file 2,2,yes,yes,yes,image,thumb,
vd000bj0000,vd000bj0000_video_img_1.tif,Image of media (1 of 2),3,no,yes,no,image,,
vd000bj0000,vd000bj0000_video_img_2.tif,Image of media (2 of 2),4,no,yes,no,image,bogus_role
vd000bj0000,vd000bj0000_video_log.txt,Disc log file,5,no,yes,no,file,transcription
druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
vd000bj0000,vd000bj0000_video_1.mp4,Video file 1,1,yes,yes,yes,video,,,
vd000bj0000,vd000bj0000_video_1.mpeg,Video file 1,1,no,yes,no,video,,,
vd000bj0000,vd000bj0000_video_1_thumb.jp2,Video file 1,1,yes,yes,yes,image,thumb,,
vd000bj0000,vd000bj0000_video_2.mp4,Video file 2,2,yes,yes,yes,video,,,
vd000bj0000,vd000bj0000_video_2.mpeg,Video file 2,2,no,yes,no,video,,,
vd000bj0000,vd000bj0000_video_2_thumb.jp2,Video file 2,2,yes,yes,yes,image,thumb,,
vd000bj0000,vd000bj0000_video_img_1.tif,Image of media (1 of 2),3,no,yes,no,image,,,
vd000bj0000,vd000bj0000_video_img_2.tif,Image of media (2 of 2),4,no,yes,no,image,bogus_role,
vd000bj0000,vd000bj0000_video_log.txt,Disc log file,5,no,yes,no,file,transcription,
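For reference, a manifest with the new `file_language` column can be read with Ruby's standard `csv` library roughly as follows. This is a sketch under assumptions: the diff does not show how the application actually parses the file, so the `header_converters: :symbol` option is a guess at how the `row[:file_language]` lookups are made possible. The rows are adapted from the fixture above, with trailing commas trimmed to match the ten header columns, and the `en` value on the last row is invented for illustration (every fixture row in this change leaves the column empty).

```ruby
require 'csv'

# Three rows adapted from the media_video_test fixture; 'en' is a made-up value.
manifest = <<~CSV
  druid,filename,resource_label,sequence,publish,preserve,shelve,resource_type,role,file_language
  vd000bj0000,vd000bj0000_video_1.mp4,Video file 1,1,yes,yes,yes,video,,
  vd000bj0000,vd000bj0000_video_img_2.tif,Image of media (2 of 2),4,no,yes,no,image,bogus_role,
  vd000bj0000,vd000bj0000_video_log.txt,Disc log file,5,no,yes,no,file,transcription,en
CSV

# Parse with symbolized headers so columns can be read as row[:file_language].
rows = CSV.parse(manifest, headers: true, header_converters: :symbol).map(&:to_h)

# Empty unquoted cells come back as nil, so a blank file_language yields nil.
languages = rows.map { |row| row[:file_language] }
roles     = rows.map { |row| row[:role] }
```

The parser returns `bogus_role` unchanged. Filtering it out is left to the `VALID_ROLES` check when the file properties are built.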