Change Assign Placenames robot to update cocina directly. #729

Merged
edsu merged 2 commits into main from t705-assign_placenames
Jan 25, 2024

Conversation

justinlittman
Contributor

closes #705

Why was this change made? 🤔

Get rid of MODS.

How was this change tested? 🤨

⚡ ⚠ If this change involves consuming from other services or writing to shared file systems, test that GIS accessioning works properly in [stage|qa] environment, in addition to specs. ⚡

Unit

@justinlittman justinlittman marked this pull request as draft January 24, 2024 22:32
@justinlittman
Contributor Author

Note that this gets rid of the code that creates LC subject headings. It seems like this code would almost never be exercised, as there are very few possible cases when the existing subject is found in the gazetteer but the subject doesn't match the LC Subject.

Also note that a dynamic gazetteer lookup has been proposed to replace the static data; see #732.

@justinlittman justinlittman marked this pull request as ready for review January 25, 2024 12:54
Contributor

@edsu edsu left a comment

Can you please say more about how you determined that "there are very few possible cases when the existing subject is found in the gazetteer but the subject doesn't match the LC Subject."?

@justinlittman
Contributor Author

@registry.entries.select {|e| e[1] && e[1][:loc_keyword] && e[0] != e[1][:loc_keyword]}.each {|e| puts "#{e[0]} => #{e[1][:loc_keyword]}"}
Alībāg (India) => Alibag (India)
Bangalore (India) => Bangalore (India)
Bhākra Dam => Bhakra Dam (India)
Bournemouth => Bournemouth (England)
Bromsgrove (England : District) => Bromsgrove (England : District)
Ciudad Juárez (Mexico) => Ciudad Juarez (Mexico)
Cochin (India) => Cochin (India)
Hubli (India) => Hubli (India)
Kandahar (Afghanistan) => Kandahār (Afghanistan)
Kunduz (Afghanistan) => Kundūz (Afghanistan)
Sīwan (India) => Siwān (Saran, India)
Suisun (Calif.) => Suisun City (Calif.)

^^ These are the only cases in which the existing subject would not match the LC subject, causing the LC subject to be added. (And I suspect that most of these are typos.)
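For anyone who wants to repeat this check outside a console session, here is a minimal sketch of the same filter. It assumes the registry behaves like a hash of placename to attributes with a `:loc_keyword` key, which is inferred from the one-liner above rather than from the Gazetteer source. `REGISTRY`, `loc_mismatches`, `fold`, and `diacritics_only?` are hypothetical names, and the sample pairs are copied from the console output, plus two illustrative entries that the filter should skip. The diacritics check is an added convenience for sorting spelling-only differences from real ones; it is not part of the robot.

```ruby
# Hypothetical stand-in for the gazetteer registry (placename => attributes).
# The first four pairs come from the console output in this thread; the hash
# layout and the last two entries are assumptions for illustration only.
REGISTRY = {
  'Ciudad Juárez (Mexico)' => { loc_keyword: 'Ciudad Juarez (Mexico)' },
  'Kandahar (Afghanistan)' => { loc_keyword: 'Kandahār (Afghanistan)' },
  'Suisun (Calif.)' => { loc_keyword: 'Suisun City (Calif.)' },
  'Bournemouth' => { loc_keyword: 'Bournemouth (England)' },
  'Example Matching Place' => { loc_keyword: 'Example Matching Place' },
  'Example Place Without Data' => nil
}.freeze

# Return placename => LC keyword for entries whose LC keyword is present
# and differs from the placename (the same condition as the one-liner).
def loc_mismatches(registry)
  registry.select do |placename, attrs|
    attrs && attrs[:loc_keyword] && placename != attrs[:loc_keyword]
  end.transform_values { |attrs| attrs[:loc_keyword] }
end

# Decompose to NFD, drop combining marks, and trim surrounding whitespace.
def fold(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '').strip
end

# True when the two strings differ only by diacritics or outer whitespace.
def diacritics_only?(placename, loc_keyword)
  fold(placename) == fold(loc_keyword)
end

loc_mismatches(REGISTRY).each do |placename, loc_keyword|
  kind = diacritics_only?(placename, loc_keyword) ? 'diacritics only' : 'different heading'
  puts "#{placename} => #{loc_keyword} (#{kind})"
end
```

Splitting the mismatches this way makes it easier to see which entries are spelling variants of the same heading and which point at a genuinely different LC heading.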

@justinlittman justinlittman requested a review from edsu January 25, 2024 16:41
Contributor

@edsu edsu left a comment

If you are removing the LC functionality, can the Gazetteer.find_loc_* methods and specs be removed? Also, can the CSV be modified to no longer include the LC-specific information that isn't used?

@justinlittman
Contributor Author

@edsu I removed the loc methods from the gazetteer and added a comment about the legacy data. I chose not to change the data to avoid messing it up, especially given that we'll probably be replacing it later in the WC.

@justinlittman justinlittman requested a review from edsu January 25, 2024 18:02
@edsu edsu merged commit 928a6cf into main Jan 25, 2024
5 checks passed
@edsu edsu deleted the t705-assign_placenames branch January 25, 2024 19:01
Successfully merging this pull request may close these issues.

Remove MODS from Assign Placename robot
2 participants