XML2CSV tool editing for migration #913

Open
rtilla1 opened this issue Aug 30, 2018 · 1 comment
Labels
Subject: Migration Concerning migration from Islandora 7 to Islandora 2.x.x

Comments

@rtilla1

rtilla1 commented Aug 30, 2018

Step 2 of "Investigate using OpenRefine as part of the migration process" is to transform MODS with the xml2csv tool into very specific columns, with well-documented delimiters between compound or complex contents.

Cara is working on pulling simple personal names (with one namePart), complex personal names (with multiple nameParts), simple corporate names, and complex corporate names into their own columns, with appropriate delimiters between the names and their roles and name/role pairs.

Each institution will need to customize parts of the xml2csv tool, but work should be done to make sure any data created by the basic Islandora 7.x forms are carried over.

@carakey

carakey commented Aug 30, 2018

The latest version has multiple columns for Names as follows: Personal, Corporate, Conference, Family, and No Type (mapping to the type attribute on the name element) -- each of the above with a Simple column (which maps to either un-typed namePart or displayForm) and Compound column (which maps to nameParts with type attributes).

Overkill? Maybe.
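
To illustrate the column split described above, here is a rough Python sketch of how one MODS name element might be routed to a column. It is not the xml2csv code itself. The column labels and the `name_column` helper are hypothetical, and treating a name as Compound as soon as any namePart carries a type attribute is an assumption, since mixed cases are not covered above.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical column labels; the real xml2csv headers may differ.
TYPE_LABELS = {
    "personal": "Personal",
    "corporate": "Corporate",
    "conference": "Conference",
    "family": "Family",
}


def name_column(name_el):
    """Return (type label, "Simple" or "Compound") for one mods:name element."""
    # A missing or unrecognized type attribute falls into the "No Type" group.
    type_label = TYPE_LABELS.get(name_el.get("type"), "No Type")
    parts = name_el.findall("mods:namePart", MODS_NS)
    # Assumption: any typed namePart makes the whole name Compound.
    has_typed_part = any(p.get("type") for p in parts)
    return type_label, "Compound" if has_typed_part else "Simple"


sample = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <name type="personal">
    <namePart type="family">Andō</namePart>
    <namePart type="given">Hiroshige</namePart>
    <namePart type="date">1797-1858</namePart>
  </name>
  <name type="corporate">
    <namePart>University of Wyoming</namePart>
    <namePart>Junior Class</namePart>
  </name>
  <name>
    <displayForm>Anonymous</displayForm>
  </name>
</mods>
"""

root = ET.fromstring(sample)
columns = [name_column(n) for n in root.findall("mods:name", MODS_NS)]
```

With this sample, the three names land in the Personal/Compound, Corporate/Simple, and No Type/Simple columns respectively.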

Delimiting is as follows (and could easily be changed):

  • In both simple and compound fields, each discrete name element is separated by a double pipe ||.
  • In both simple and compound fields, each name is followed by role information in double brackets, in the form [[marcrelator=code/text]].
    • If either code or text term is not present then the value of the missing term is given as "NULL".
    • If neither code nor text term is present, or if the authority is other than marcrelator, then [[marcrelator=ctb/Contributor]] is supplied.
  • In simple fields, multiple namePart or displayForm subelements within a name element are separated with a period, e.g. "University of Wyoming. Junior Class"
  • In compound fields, each namePart that has a type attribute is followed by type information in double brackets, e.g. "Andō [[type=family]]Hiroshige [[type=given]]1797-1858 [[type=date]]"
    • The multiple nameParts in compound fields do not have additional delimiters
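
For anyone consuming these columns downstream (in OpenRefine or a migration script), the delimiters above could be unpacked along these lines. This is only a sketch under assumptions: the function names are made up for illustration, the role marker is assumed to come at the end of each name entry, and the regex accepts both `marcrelator` and `marcrelators` as the prefix. If the delimiters change, the constants at the top would need to change with them.

```python
import re

NAME_SEP = "||"
# Role marker, e.g. [[marcrelator=ctb/Contributor]]; "NULL" marks a missing term.
ROLE_RE = re.compile(r"\[\[marcrelators?=([^/\]]*)/([^\]]*)\]\]")
# Typed namePart, e.g. "Hiroshige [[type=given]]".
TYPED_PART_RE = re.compile(r"(.*?)\s*\[\[type=([^\]]+)\]\]")


def split_role(entry):
    """Separate one name entry from its role marker, if it has one."""
    m = ROLE_RE.search(entry)
    if m is None:
        return entry.strip(), None
    code, text = (None if v == "NULL" else v for v in m.groups())
    return entry[:m.start()].strip(), {"code": code, "text": text}


def parse_simple(cell):
    """Parse a Simple column cell into a list of (name, role) pairs."""
    return [split_role(e) for e in cell.split(NAME_SEP) if e.strip()]


def parse_compound(cell):
    """Parse a Compound column cell into a list of (typed parts, role) pairs."""
    result = []
    for entry in cell.split(NAME_SEP):
        if not entry.strip():
            continue
        name, role = split_role(entry)
        # Each typed part becomes a (type, value) pair, in document order.
        parts = [(t, v.strip()) for v, t in TYPED_PART_RE.findall(name)]
        result.append((parts, role))
    return result


simple = parse_simple(
    "University of Wyoming. Junior Class [[marcrelator=ctb/Contributor]]"
    "||Smith, Jane [[marcrelator=aut/NULL]]"
)
compound = parse_compound(
    "Andō [[type=family]]Hiroshige [[type=given]]1797-1858 [[type=date]]"
    "[[marcrelator=art/Artist]]"
)
```

The sample cells are invented for the example (only the University of Wyoming and Andō Hiroshige strings come from the examples above). The "NULL" placeholder is turned back into `None` so that a missing code or text term is easy to test for.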

@kstapelfeldt kstapelfeldt added Subject: Migration Concerning migration from Islandora 7 to Islandora 2.x.x and removed migration labels Sep 25, 2021