Code Table Request - new identification attribute = verbatim determiner #6414

Closed
1 of 5 tasks
Jegelewicz opened this issue Jun 14, 2023 · 17 comments
Labels
Function-Agents Function-CodeTables Priority-High (Needed for work) High because this is causing a delay in important collection work..

Comments

@Jegelewicz
Member

Jegelewicz commented Jun 14, 2023

Instructions

This is a template to facilitate communication with the Arctos Code Table Committee. Submit a separate request for each relevant value. This form is appropriate for exploring how data may best be stored, for adding vocabulary, or for updating existing definitions.

Reviewing documentation before proceeding will result in a more enjoyable experience.


Initial Request

Goal

Describe what you're trying to accomplish. This is the only necessary step to start this process. The Committee is available to assist with all other steps. Please clearly indicate any uncertainty or desired guidance if you proceed beyond this step.

Clean up low data agents in determinations by moving them to a verbatim field. See also #6411

Proposed Value

Proposed new value. This should be clear and compatible with similar values in the relevant table and across Arctos.

verbatim determiner

Proposed Definition

Clear, complete, non-collection-type-specific functional definition of the value. Avoid discipline-specific terminology if possible, include parenthetically if unavoidable.

Verbatim determiner accepts any string value that describes the agent who made the identification determination. This attribute should be used when there is little to no information about a determiner instead of creating a low-information agent (no dates, relationships, or addresses are known for the agent).

Attribute data type

If the request is for an attribute, what values will be allowed?
free-text, categorical, or number+units depending upon the attribute (TBA)

free-text

Attribute controlled values

If the values are categorical (to be controlled by a code table), add a link to the appropriate code table. If a new table or set of values is needed, please elaborate.

Attribute units

If numerical values should be accompanied by units, provide a link to the appropriate units table.

Context

Describe why this new value is necessary and existing values are not.

There currently is no way to associate a verbatim agent with a determination. As there may be multiple determinations per catalog record, this should be closely associated with the determination rather than just the catalog record (verbatim agent attribute).
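The difference between the two placements can be sketched in a few lines of Python. This is only an illustration: the class and field names (`Identification`, `CatalogRecord`, `attributes`) and the sample values are hypothetical and are not the Arctos schema; only the attribute names "verbatim determiner" and "verbatim agent" come from this request.

```python
from dataclasses import dataclass, field


@dataclass
class Identification:
    """One determination on a catalog record (hypothetical model)."""
    scientific_name: str
    # Identification-level attributes, e.g. {"verbatim determiner": "J. Doe"}
    attributes: dict = field(default_factory=dict)


@dataclass
class CatalogRecord:
    """A catalog record that may carry several identifications."""
    guid: str
    identifications: list = field(default_factory=list)
    # Record-level attributes, e.g. the existing "verbatim agent"
    attributes: dict = field(default_factory=dict)


def determiners(record):
    """Return (scientific_name, verbatim determiner or None) for each identification."""
    return [
        (ident.scientific_name, ident.attributes.get("verbatim determiner"))
        for ident in record.identifications
    ]


# Illustrative data only: two identifications, one with a verbatim determiner.
record = CatalogRecord(
    guid="EXAMPLE:Inv:1",
    identifications=[
        Identification("Taxon A", {"verbatim determiner": "J. Doe"}),
        Identification("Taxon B"),
    ],
    # A record-level value cannot say which identification it refers to.
    attributes={"verbatim agent": "J. Doe"},
)
```

With the value stored on the identification, `determiners(record)` can report who determined which name; the record-level "verbatim agent" alone cannot.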

Table

Code Tables are listed at http://arctos.database.museum/info/ctDocumentation.cfm. Link to the specific table or value. This may involve multiple tables and will control datatype for Attributes. OtherID requests require BaseURL (and example) or explanation. Please ask for assistance if unsure.

ctidentification_attribute_type

Collection type

Some code tables contain collection-type-specific values. collection_cde may be found from https://arctos.database.museum/home.cfm

N/A

Priority

Please describe the urgency and/or choose a priority-label to the right. You should expect a response within two working days, and may utilize Arctos Contacts if you feel response is lacking.

Example Data

Requests with clarifying sample data are generally much easier to understand and prioritize. Please attach or link to any representative data, in any form or format, which might help clarify the request.

Available for Public View

Most data are by default publicly available. Describe any necessary access restrictions.

Yes

Discussion: Please reach out to anyone who might be affected by this change. Leave a comment or add this to the Committee agenda if you believe more focused conversation is necessary.

@ArctosDB/arctos-code-table-administrators

Approval

All of the following must be checked before this may proceed.

The How-To Document should be followed. Pay particular attention to terminology (with emphasis on consistency) and documentation (with emphasis on functionality). No person should act in multiple roles; the submitter cannot also serve as a Code Table Administrator, for example.

  • Code Table Administrator[1] - check and initial, comment, or thumbs-up to indicate that the request complies with the how-to documentation and has your approval
  • Code Table Administrator[2] - check and initial, comment, or thumbs-up to indicate that the request complies with the how-to documentation and has your approval
  • DBA - The request is functionally acceptable. The term is not a functional duplicate, and is compatible with existing data and code.
  • DBA - Appropriate code or handlers are in place as necessary. (ID_References, Media Relationships, Encumbrances, etc. require particular attention)

Rejection

If you believe this request should not proceed, explain why here. Suggest any changes that would make the change acceptable, alternate (usually existing) paths to the same goals, etc.

  1. Can a suitable solution be found here? If not, proceed to (2)
  2. Can a suitable solution be found by Code Table Committee discussion? If not, proceed to (3)
  3. Take the discussion to a monthly Arctos Working Group meeting for final resolution.

Implementation

Once all of the Approval Checklist is appropriately checked and there are no Rejection comments, or in special circumstances by decree of the Arctos Working Group, the change may be made.

Review everything one last time. Ensure the How-To has been followed. Ensure all checks have been made by appropriate personnel.

Make changes as described above. Ensure the URL of this Issue is included in the definition.

Close this Issue.

DO NOT modify Arctos Authorities in any way before all points in this Issue have been fully addressed; data loss may result.

Special Exemptions

In very specific cases and by prior approval of The Committee, the approval process may be skipped, and implementation requirements may be slightly altered. Please note here if you are proceeding under one of these use cases.

  1. Adding an existing term to additional collection types may proceed immediately and without discussion, but doing so may also subject users to future cleanup efforts. If time allows, please review the term and definition as part of this step.
  2. The Committee may grant special access on particular tables to particular users. This should be exercised with great caution only after several smooth test cases, and generally limited to "taxonomy-like" data such as International Commission on Stratigraphy terminology.
@Jegelewicz Jegelewicz added Priority-High (Needed for work) High because this is causing a delay in important collection work.. Function-CodeTables labels Jun 14, 2023
@Jegelewicz Jegelewicz added this to the Needs Discussion milestone Jun 14, 2023
@Nicole-Ridgwell-NMMNHS

Will these values show up when searching agents with "include verbatim agent" set to yes? Can these be made into agents at a later date in the same way that the regular verbatim agent attribute can?

@Jegelewicz
Member Author

Will these values show up when searching agents with "include verbatim agent" set to yes?

That should be added to the proposal and to all future "verbatim agent" things.

Can these be made into agents at a later date in the same way that the regular verbatim agent attribute can?

Yes but we will need some documentation on adding determiners to existing identifications or perhaps a new tool specifically for this purpose.

@Jegelewicz
Member Author

@dustymc I'd like your thoughts on this and @ArctosDB/arctos-code-table-administrators we need one more check.

@campmlc

campmlc commented Aug 1, 2023

Happy to check if no objections.

@dustymc
Contributor

dustymc commented Aug 1, 2023

thoughts

Torn.

  1. This is clearly a way to associate a low-information agent with a specific identification, yay everybody.
  2. Low-information agents are low information, the specificity doesn't add much, better to keep them all in one container where they're easier to search and compare and integrate, 'identification' should just be a method of the existing record attribute.

I have little idea how to balance those things. One of my checkmarks would require a fair bit of development (which also adds complexity and costs CPU, if someone's tracking impacts on sustainability) so I'm not preemptively checking anything, but at the moment I can't see any reason to oppose this if someone has a compelling use case. (And I think "demonstrable, if perhaps not outright compelling, use case" should be added to the template and evaluation - we seem to spend a lot of time working on things that never get used lately, and this looks like it might be one of those.)

@Jegelewicz
Member Author

demonstrable

How many identifications include "verbatim determiner" or something like that in the id remark?

@dustymc
Contributor

dustymc commented Aug 1, 2023

in the id remark?

Some. 14620 (two tenths of a percent of all IDs) contain "determiner", but most of them are some flavor of WE DON'T HAVE ANYTHING TO SAY SO WE'RE GOING TO SAY THAT HERE!1!! Would elevating those data to their own compartment DO anything? I suspect not: they're useful for trying to understand things after you've already found the record where they are, and they're not really useful for anything else no matter where they are. I'm still just trying to understand, but not really getting compelling vibes yet...

@Jegelewicz
Member Author

About 12K of those are me in ALMNH:Inv, and they could definitely be verbatim agents (for example, verbatim determiner: DIANNE GRIMM in https://arctos.database.museum/guid/ALMNH:Inv:10000), but there are any number of ways any of us may have phrased it: "id by", "det by", and on and on. I suspect there are more if we dig.
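As a rough illustration of what "digging" might look like, here is a minimal Python sketch that pulls a determiner string out of an identification remark. The prefix list in `DETERMINER_PATTERN` and the function name `extract_verbatim_determiner` are assumptions for illustration; real remarks will use phrasings this pattern does not cover, and nothing here reflects existing Arctos code.

```python
import re

# Hypothetical prefixes based on the phrasings mentioned above;
# an actual cleanup would need to survey the real remarks first.
DETERMINER_PATTERN = re.compile(
    r"\b(?:verbatim determiner|id(?:entified)? by|det(?:ermined)? by)"
    r"\s*:?\s*(?P<name>[^;\n]+)",
    re.IGNORECASE,
)


def extract_verbatim_determiner(remark):
    """Return the text following a known prefix, up to ';' or end of line, else None."""
    match = DETERMINER_PATTERN.search(remark)
    if match is None:
        return None
    return match.group("name").strip()


# Illustrative remarks only.
samples = [
    "verbatim determiner: DIANNE GRIMM",
    "Det by J. Doe; label faded",
    "no determiner information",
]
for remark in samples:
    print(repr(remark), "->", extract_verbatim_determiner(remark))
```

A pattern like this would only be a first pass for finding candidates; each match would still need review before being moved to a verbatim attribute.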

@Jegelewicz
Member Author

But really the best use case is the agents that are determiners now but are low quality. How many names would that be?

@dustymc
Contributor

dustymc commented Aug 2, 2023

4649 - but I'm no longer sure I'm keeping up. We should be planning to verbatimize those (and everything else that's only carrying strings); here we should only be deciding HOW to verbatimize them. (And in case I'm not being clear, I think I'll support just about anything, I'm just trying to understand - and document - the right balance of cohesion and usability and specificity and whatever nobody's thinking of yet.)

@Jegelewicz
Member Author

I am trying to decide this

  1. This is clearly a way to associate a low-information agent with a specific identification, yay everybody.
  2. Low-information agents are low information, the specificity doesn't add much, better to keep them all in one container where they're easier to search and compare and integrate, 'identification' should just be a method of the existing record attribute.

And I think that given the potential for exponential growth of identifications on individual records, keeping the "verbatim" determiners associated with the related identification makes more sense than dumping them in an attribute even farther away. If we do 2, then we need some way of associating the attribute with the identification to which it belongs so that some day, when the name is "agentified", we can place the determiner with the correct identification.
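To show what option 2 would need, here is a minimal Python sketch of a record-level attribute that carries a pointer back to one identification. Everything in it is hypothetical: the `identification_id` link, the `agentify` function, and the class names are illustrative assumptions, not existing Arctos structures.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Identification:
    identification_id: int
    determiner_agent_ids: list = field(default_factory=list)


@dataclass
class RecordAttribute:
    attribute_type: str
    value: str
    # Hypothetical link from a record-level attribute to one identification.
    identification_id: Optional[int] = None


def agentify(attribute, identifications, agent_id):
    """Attach agent_id as determiner of the identification the attribute points to."""
    if attribute.identification_id is None:
        # Without the link there is no way to choose among several identifications.
        raise ValueError("attribute is not linked to an identification")
    for ident in identifications:
        if ident.identification_id == attribute.identification_id:
            ident.determiner_agent_ids.append(agent_id)
            return ident
    raise LookupError("linked identification not found on this record")


# Illustrative data only.
idents = [Identification(1), Identification(2)]
linked = RecordAttribute("verbatim agent", "J. Doe", identification_id=2)
agentify(linked, idents, agent_id=42)
```

An attribute stored directly on the identification would not need the extra link or the failure branch, which is the argument for option 1.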

@dustymc
Contributor

dustymc commented Aug 2, 2023

potential for exponential growth of identifications on individual records

Yea, (2) would come with some implicit assumptions (eg, verbatim identifier is earliest ID) and they'd probably all fall apart if someone's tracking multiple identifications involving low-information agents pre-Arctos in some "importable" format. I'm sure there's one of those out there but we haven't met them yet.

That's basically my plea for evidence in a nutshell - I can imagine all sorts of things, but I'd rather not add unnecessary complexity based on theoreticals. "As simple as possible, as complex as necessary" should always be kept in mind. We now have the structure to do something a bit more complex than dumping all verbatim agents into one 'bin,' if we must.

@ebraker @Nicole-Ridgwell-NMMNHS thoughts?

@Jegelewicz
Member Author

Also tagging @genevieve-anderegg since she thumbs-upped this

@genevieve-anderegg

Having "verbatim determiner" as an option in the drop down for Identification Attributes (currently just "nature of identification" and "identification confidence") makes sense to me!

@ebraker
Contributor

ebraker commented Aug 2, 2023

I'm with Dusty - this sounds like a lot of complexity for something that may not be widely needed (?)

@Jegelewicz
Member Author

(2) would come with some implicit assumptions (eg, verbatim identifier is earliest ID)

Why should that be the assumption?

@Jegelewicz
Member Author

Now that the restrictions on agents have been relaxed, I don't see anyone clamoring for this. Closing.


6 participants