Multiple values being entered into metadata fields that expect only one value (e.g. authors and keywords) #4035

Closed
jggautier opened this issue Aug 1, 2017 · 2 comments
Labels: Feature: Metadata; User Role: Depositor (creates datasets, uploads data, etc.); UX & UI: Design (this issue needs input on the design of the UI and from the product owner)

Comments

@jggautier
Contributor

This has been brought up in other channels, but I haven't seen it reported in a GitHub issue.

Some dataset depositors enter a string of keywords, separated by semicolons or commas, into a single keyword metadata field. I suspect that most of the time those keywords are copied from the associated articles' keywords and pasted into the dataset keyword field. Depositors doing this don't realize that to enter multiple terms they should click the + sign to add another set of keyword fields (and of course they might expect the metadata field to parse the string, since they've seen other applications do it).

screen_shot_2017-08-01_at_1_42_00_pm

This is a problem because the fields don't split the strings on the common semicolon or comma delimiters to treat each keyword as a separate term; instead, the whole string becomes one term, which hurts discoverability:

screen_shot_2017-08-01_at_12_25_07_pm
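
For illustration, here is a minimal sketch of the kind of parsing depositors may be expecting. This is not Dataverse code, and the function name `split_keywords` is hypothetical. It splits on semicolons only, on the assumption that a comma can be part of a legitimate single term and so is not a safe delimiter.

```python
def split_keywords(raw: str) -> list[str]:
    """Split a pasted keyword string on semicolons.

    Hypothetical helper, not part of Dataverse. Whitespace around each
    term is trimmed and empty terms (e.g. from a trailing ';') are dropped.
    Commas are left alone because they may belong to a single term.
    """
    parts = (part.strip() for part in raw.split(";"))
    return [part for part in parts if part]


# A pasted string becomes three separate terms.
print(split_keywords("climate change; sea level rise;  coastal erosion;"))

# A single term that happens to contain commas is kept whole.
print(split_keywords("Firm Objectives, Organization, and Behavior"))
```

Whether splitting should happen automatically on save, or only be offered as a suggestion in the UI, is a design question this sketch does not answer.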

To get a sense of how often this is done in Harvard Dataverse and how much of an issue it is, we could query Harvard Dataverse for all of the keywords of non-harvested datasets, then see how many contain semi-colons and/or commas.

I think we could also do the same query for harvested datasets in Harvard Dataverse, which might help indicate the size of the problem with the way keyword metadata is being harvested.

Both harvested and non-harvested datasets have keywords like this. Last November, Leonid sent me the results of a query that included harvested and non-harvested datasets with keyword metadata. Of the approximately 20,000 unique datasets in those results, about 2,500 (12 percent) have keywords containing one or more semicolons. (I'm not including keywords that have only commas, because a greater portion of those are likely to be single-term keywords that happen to contain a comma, e.g. "Firm Objectives, Organization, and Behavior" from JEL codes or lots of terms from LCSH.)
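
A count like the one above could be reproduced over any export of keyword values with something like the following sketch. The function name `summarize_delimiters` and the sample values are invented for illustration; they are not data from Harvard Dataverse. The sketch counts individual keyword values rather than unique datasets, so its numbers would not be directly comparable to the figures quoted above.

```python
from typing import Dict, Iterable


def summarize_delimiters(keywords: Iterable[str]) -> Dict[str, int]:
    """Count keyword values by the delimiters they contain.

    Hypothetical helper. 'with_semicolon' counts values containing at
    least one semicolon; 'comma_only' counts values that contain a comma
    but no semicolon, which are more likely to be legitimate single terms.
    """
    counts = {"total": 0, "with_semicolon": 0, "comma_only": 0}
    for value in keywords:
        counts["total"] += 1
        if ";" in value:
            counts["with_semicolon"] += 1
        elif "," in value:
            counts["comma_only"] += 1
    return counts


# Invented sample values, for illustration only.
sample = [
    "climate change; sea level rise; coastal erosion",
    "Firm Objectives, Organization, and Behavior",
    "elections",
    "survey data, panel; longitudinal",
]
print(summarize_delimiters(sample))
```

Running the same count separately for harvested and non-harvested datasets would give the comparison suggested earlier in this comment.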

@jggautier jggautier added Feature: Metadata UX & UI: Design This issue needs input on the design of the UI and from the product owner labels Aug 1, 2017
@pdurbin
Member

pdurbin commented Dec 1, 2021

@jggautier made this point elsewhere, but we see a similar problem in #377, where authors want to enter multiple affiliations. https://dataverse.csuc.cat/dataset.xhtml?persistentId=doi:10.34810/data31&version=2.0 is an example, and here are screenshots from the metadata tab and the facet:

Screen Shot 2021-12-01 at 10 14 34 AM

Screen Shot 2021-12-01 at 10 15 35 AM

@jggautier jggautier changed the title from "Multiple dataset keywords being entered into one keyword field" to "Multiple values being entered into metadata fields that expect only one value (e.g. authors and keywords)" Feb 11, 2022
@pdurbin pdurbin added Type: Bug a defect User Role: Depositor Creates datasets, uploads data, etc. labels Nov 16, 2023
@jggautier jggautier removed the Type: Bug a defect label May 15, 2024
@cmbz

cmbz commented Aug 20, 2024

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

@cmbz cmbz closed this as completed Aug 20, 2024