Populate given and family names from Person names added in node add/edit form #1561

Closed
mjordan opened this issue Jul 18, 2020 · 14 comments

Comments

@mjordan
Contributor

mjordan commented Jul 18, 2020

You can configure the Linked Agent field to allow users to add names on the fly, within the node add/edit form (which would likely be a common configuration). Note that in this screenshot I have renamed this field from its default of "Linked agent" to "Person".

[Screenshot: node add/edit form with the Linked Agent field renamed to "Person"]

However, when new names are added this way, the (required) given and family name parts of the name are not populated:

[Screenshot: new Person term with the given and family name parts left empty]

To populate them, we could use the Name module's parsing rules:

[Screenshot: the Name module's parsing rules]

@seth-shaw-unlv any thoughts?

mjordan added the UX label Jul 18, 2020
mjordan changed the title from "Populate given and family names from new Person names" to "Populate given and family names from Person names added in node add/edit form" Jul 18, 2020
@seth-shaw-unlv
Contributor

Originally I had it the other way around. You would populate those fields in the name and then use the auto_entitylabel module to populate the "Name" field (required by Drupal). However, this meant making assumptions about names, so I decided not to pursue it.

The reverse suffers from the same issue: if I enter a value in the "Name" field, how comfortable can I be parsing it out into the preferred name field's parts?

In the end, perhaps we should simply not require "Person Preferred Name" and allow repositories to make their own decisions on this point.

@seth-shaw-unlv
Contributor

Thoughts, @rosiel or @rtilla1?

@rosiel
Member

rosiel commented Jul 24, 2020

My thoughts are that the use of the Drupal Name module is more modelling than I'm comfortable with. It does not reflect the state of any of our existing data - porting our data into it would be time-consuming and tedious, and potentially fraught. I'm not against either:

  • embedded entity form (create names on the fly with a full-on taxonomy entity form embedded in the node form)
  • not having the name parts at all, just make the taxonomy term a string

What I am strongly against is encouraging a situation where you cannot decide, at the point of data entry for an individual name, how it should be parsed. (I'm not saying we should explicitly ban this; some places may want it. But I think it is bad practice and should not be in any public default we ship.)

Creating things that 'work for 99% of the cases we have' is a good approach in tech generally, unless that 1% are (or represent) people. When that 1% are people (as in this case) and when those people for whom this does not work are disproportionately non-white (as in this case), then this is a case of building tech that builds in an implicit assumption of all-people-who-matter-are-like-me-which-is-to-say-a-white-person. There is a name for this assumption. That name is (hold on, I'm not accusing you, me, or the builders of this module of any hate crime) racism.

To put my time where my mouth is, PR incoming to remove 'Name' from Controlled Access Terms.

If we want this (or something like it) for valid use cases, such as:

  • i want to be able to display a list of names sorted the way a phone book would
  • i can't think of any others right now. Maybe 'i want to also be able to display a name in natural order'

then we need to figure out a way to do this that does not exclude names that don't fit what we expect. I think that means requiring double-entry. (the user entering the data must populate the same data in different formats for different purposes). Encouraging a situation where the parsing happens automatically and doesn't take user input on how the parsing happens would be a bad thing to do, in my opinion.
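As a rough illustration of what double-entry could look like as data (a minimal Python sketch, not Drupal code; `PersonName`, `display_name`, `sort_name`, and `phone_book_order` are hypothetical names, not fields that exist in Controlled Access Terms):

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PersonName:
    # Both values are typed in by the person doing data entry.
    # Nothing here is derived by parsing one from the other.
    display_name: str  # natural order, exactly as entered
    sort_name: str     # phone-book order, also entered by hand


def phone_book_order(names: Iterable[PersonName]) -> List[PersonName]:
    """Sort on the user-supplied sort form, ignoring case."""
    return sorted(names, key=lambda n: n.sort_name.casefold())


if __name__ == "__main__":
    people = [
        PersonName(display_name="Mao Zedong", sort_name="Mao Zedong"),
        PersonName(display_name="Ada Lovelace", sort_name="Lovelace, Ada"),
    ]
    for person in phone_book_order(people):
        print(person.sort_name, "|", person.display_name)
```

The cost is the extra typing; the benefit is that no algorithm decides which part of a name is the family name.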

I think it's probably worth having a discussion on 'what are our actual needs' and 'how do other systems handle this?' before 'how can we improve the name module or how we use it so that we don't make assumptions?' It's great that 'hey, we could solve this problem' but... (hah this jpg autocompletes in my browser) we shouldn't be so preoccupied with whether we could that we don't stop to think if we should.

[Image: dr-ian-malcolm-with-quote]

@mjordan
Contributor Author

mjordan commented Jul 24, 2020

@rosiel thank you for pointing out the racist bias implicit in automating the population of the given and family names as I described it above.

I didn't open this post because I was blindly preoccupied with "whether we could" do this (nice meme, and I do admit to being that way in some instances); I opened it because SFU has the following use cases for wanting names split out where appropriate in our IR:

  • compliance with OpenAIRE OAI-PMH harvesting
  • the ability to reliably generate structured citations (APA, Chicago, etc.) from object metadata
  • (this one is only potential at this point) better interoperability with campus systems that also break out names

With these use cases in mind but not explicitly stated, I opened this issue to suggest a way to avoid the double population that you point out. I'd like to offer a possible user-facing solution to this problem: we provide the person inputting the name in the node add/edit form a way to preview in place how the automatic parsing rules would parse it, so they could decide if those rules did not parse the name appropriately and take action (for example, manually correct the autopopulated family and given name fields). This ability doesn't remove the bias but it does expose it to the user entering the name and allow them to take action.
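To make the preview idea concrete, here is a minimal sketch in Python (not Drupal or Name module code). The parsing rule in `propose_parts` is a deliberately naive stand-in that I am assuming for illustration, not the Name module's actual rules, and `NameParts`, `propose_parts`, and `resolve_parts` are hypothetical names:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameParts:
    given: str
    family: str


def propose_parts(full_name: str) -> NameParts:
    """Hypothetical naive rule, biased toward Western name order.

    'Family, Given' if a comma is present; otherwise the last
    whitespace-separated token is treated as the family name.
    """
    name = full_name.strip()
    if "," in name:
        family, given = (part.strip() for part in name.split(",", 1))
        return NameParts(given=given, family=family)
    tokens = name.split()
    if len(tokens) < 2:
        # Empty or single-token name: leave both parts blank
        # so the user fills them in rather than the rule guessing.
        return NameParts(given="", family="")
    return NameParts(given=" ".join(tokens[:-1]), family=tokens[-1])


def resolve_parts(full_name: str, override: Optional[NameParts] = None) -> NameParts:
    """Return the user's correction if one was supplied, else the proposal."""
    return override if override is not None else propose_parts(full_name)


if __name__ == "__main__":
    # The proposal is shown to the user before anything is saved.
    print("Proposed:", propose_parts("Mao Zedong"))
    # The rule put the family name in the wrong slot, so the user corrects it.
    corrected = NameParts(given="Zedong", family="Mao")
    print("Saved:", resolve_parts("Mao Zedong", override=corrected))
```

The proposal is only ever a suggestion in this sketch: what gets stored is whatever the user confirmed or corrected.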

I like @seth-shaw-unlv's suggestion that we make preferred personal name optional. I personally (as in this is my personal opinion) think that removing the Name module from the default Islandora is throwing the baby out with the bathwater, unless we can't figure out a way to remove the racial bias you point out. Don't let one person's opinion stop you from opening a PR to remove the module if you feel that's the best way to address the racial bias inherent in the automatic name parsing rules.

@rosiel
Member

rosiel commented Jul 24, 2020

I'd like to offer a possible user-facing solution to this problem: we provide the person inputting the name in the node add/edit form a way to preview in place how the automatic parsing rules would parse it, so they could decide if those rules did not parse the name appropriately and take action (for example, manually correct the autopopulated family and given name fields).

Awesome. Yes!

However, the situation right now is the following:

  • Islandora Defaults is what gets considered "canonical". It depends on Controlled Access Terms. Controlled Access Terms has a mandatory Name field.
  • most people migrating into Islandora have people names in some format. This format is unlikely to be exactly what Name expects ("given", "middle", "last", "suffix", etc.). This is not to say that in no cases will it be. But we should expect a large amount of uncontrolled data.
  • if people are encouraged to migrate from "whatever they have" into the Name module, then that will be encouraging the use of some algorithm. The algorithm will, almost guaranteed, prioritize some name formats over others.
  • Not editing the data when migrating is the "least harm" method, both to metadata integrity, and personal dignity.
  • If people already have data in this granular format, or if they have the time and want to edit their data into granularity, we can absolutely let them! But as that will take some work, it seems like they're the ones who also will have the ability/time to set up a 'Name' field as they want it.

I propose we consider the 'use of the name field' as a ... recipe? Template? thing-in-a-clearinghouse ... that is to say, we recognize that this can solve a problem that many of us will have, by adding complexity that you may consider worth your time. There are several of these brewing in my head - complex titles is another. Can we create community documentation around how to set up these complex situations, and make it as easy as possible for people to learn how to use it and what pitfalls they may encounter?

The MIG (@rtilla1 ) has been wondering about how we can come up with/create/maintain/share this kind of documentation.

@rosiel
Member

rosiel commented Jul 24, 2020

All that to say, I made a PR to remove Name from CAT. [edited] In my opinion, removing Name is less likely to encourage harm than making it non-mandatory. I'm not sure why the Travis status isn't updating; both builds passed.

@seth-shaw-unlv
Contributor

@rosiel, instead of removing it altogether, I would rather the name module-based field configs be moved to the config/optional/ directory. That way, if someone doesn't have the name module installed it won't appear, but if they do it will pop in.

Controlled Access Terms is also used by the ArchivesSpace integration module and the name module (mostly) matches the data model provided by ArchivesSpace. I would rather we not remove this field completely.

@rosiel
Member

rosiel commented Jul 24, 2020

OK cool. In the PR I explain that doing config changes like that has to be done manually, file-by-file, and takes me a very long time. Do you have any suggestions on how to speed up doing this?

@seth-shaw-unlv
Contributor

I would probably re-do the PR. Start from a fresh copy and move the preferred and alternate configs to the optional directory. I would then add a new alternate name field with a different machine name than the existing one (so we don't have a collision between the string-based field and the name-module-based one) and export it into the module's config/install directory. Then I would update the form/display configs to use the new field and export them.
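The "move the configs to the optional directory" step could be scripted roughly like this. This is a sketch, assuming the standard config/install and config/optional layout inside the module; `move_to_optional` and the glob pattern in the usage example are hypothetical, and the real config file names should be checked in the module before running anything like it:

```python
import shutil
from pathlib import Path
from typing import List


def move_to_optional(module_dir: Path, pattern: str) -> List[str]:
    """Move config files matching `pattern` from config/install to config/optional.

    Returns the moved file names so the result can be reviewed before committing.
    """
    install_dir = module_dir / "config" / "install"
    optional_dir = module_dir / "config" / "optional"
    optional_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() collects the matches first, so moving files does not
    # disturb the directory listing while it is being iterated.
    for src in sorted(install_dir.glob(pattern)):
        shutil.move(str(src), str(optional_dir / src.name))
        moved.append(src.name)
    return moved


if __name__ == "__main__":
    # Hypothetical pattern; substitute the real machine name of the field.
    for name in move_to_optional(Path("."), "*field_example_name*.yml"):
        print("moved", name)
```

This only covers the file move. Adding the new string-based field and re-exporting the form/display configs would still be done through Drupal's own config export.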

@seth-shaw-unlv
Contributor

I have to feed the family lunch right now, but I might be able to take a few minutes to do it afterward, if you would like.

@seth-shaw-unlv
Contributor

So, @mjordan, since we removed the Name module this issue is moot and can be closed, correct?

@mjordan
Contributor Author

mjordan commented Sep 25, 2020

@seth-shaw-unlv yup as long as @rosiel is cool with doing so.

@mjordan
Contributor Author

mjordan commented Sep 29, 2020

@rosiel ?

@rosiel
Member

rosiel commented Oct 6, 2020

Yes to closing, thanks.


3 participants