-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
880 alternate script handling #7652
Conversation
5fc32b2
to
3a51ad8
Compare
RESOLVED @cclauss I'm not sure about these Ruff PLC1901 messages, e.g.
I don't want to change those blindly as |
3a51ad8
to
e14a259
Compare
openlibrary/catalog/marc/tests/test_data/bin_input/880_alternate_script.mrc
Show resolved
Hide resolved
225755c
to
e14a259
Compare
@mekarpeles @hornc This PR shouldn't have been closed when the other PR mentioning it was merged. |
for more information, see https://pre-commit.ci
current edition_name (880 no longer present, as desired)
This record is UTF-8 encoded, leader$9 should be 'a'
e14a259
to
23317e2
Compare
8d262f0
to
f8878e4
Compare
628964c
to
ceb26ff
Compare
openlibrary/catalog/marc/tests/test_data/bin_expect/880_Nihon_no_chasho.json
Outdated
Show resolved
Hide resolved
Co-authored-by: Christian Clauss <cclauss@me.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. I'm sorry I don't have time to do the review justice, but I tried to comment on a few things that caught my eye. I'll try to circle back later and do a more thorough review of the new test cases (which look awesome!)
openlibrary/catalog/marc/tests/test_data/xml_expect/soilsurveyrepor00statgoog.json
Show resolved
Hide resolved
openlibrary/catalog/marc/tests/test_data/bin_expect/880_table_of_contents.json
Outdated
Show resolved
Hide resolved
openlibrary/catalog/marc/tests/test_data/bin_expect/880_arabic_french_many_linkages.json
Outdated
Show resolved
Hide resolved
openlibrary/catalog/marc/tests/test_data/bin_expect/ithaca_two_856u.json
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Placate mypy...
Co-authored-by: Christian Clauss <cclauss@me.com>
Co-authored-by: Christian Clauss <cclauss@me.com>
Co-authored-by: Christian Clauss <cclauss@me.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@tfmorris, thanks for your reviews so far. I've tried to address everything, and split of separate pieces into their own issues to be worked on as follow ups. Is there anything else I should address immediately of this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to dig up some additional test data to cover s.l., etc but I think it's fine to merge as is and handle that later. It's a great improvement and I really appreciate all the work.
Here are a couple of test cases for the latinized publish notation, in case they're still useful |
Closes #7264
title
, and transliterations toother_titles
(justification: edition record is for the item as printed -- the alternate script is likely what is printed on the book, this is the most unambiguous identifier)alternate_names
, with the transliteration asname
(justification: the author record is an abstraction representing an individual who may have used multiple names and forms over time, which stands independently of how their name was presented on a specific book. We need to pick an identifier, and using the catalog language is not inappropriate)series
after stripping punctuationMarcBinary
andMarcXml
classesedition_name
taken from the example in Don't import MARC 250$6 as part of edition name #7617To add:
MARC XML support for 880 linkages -- use existing XML test input: nybc200247_marc.xml
Table-Of-Contents 880s example from Don't import MARC 250$6 as part of edition name #7617 : https://openlibrary.org/show-records/marc_miami_univ_ohio/allbibs0193.out:11791288:963
880
fields are missing -- I'd expect Cyrillic script in the record somewhere. There isn't any. For the purposes of this test case, we are only checking that the string880-04
or similar does not appear in the generated TOC, so it 'works' for that purpose.Examples from #7264
Technical
Testing
Screenshot
Stakeholders
@tfmorris