add subfields to authors in dataset_description.json #602

Closed
CPernet opened this issue Sep 9, 2020 · 12 comments

Comments

@CPernet
Collaborator

CPernet commented Sep 9, 2020

Following a suggestion by @debruine in psych-DS, we could update the Authors field of dataset_description.json (in the modality-agnostic files) to remove ambiguities. (Not sure what this entails for the validator, as we should allow both types of entries.)

{
  "Name": "The mother of all experiments",
  "BIDSVersion": "1.4.0",
  "DatasetType": "raw",
  "License": "CC0",
  "Authors": [
    {
      "@type": "Person",
      "givenName": "Paul",
      "familyName": "Broca",
      "identifier": "https://orcid.org/0000-0000-0000-0001"
    },
    {
      "@type": "Person",
      "givenName": "Carl",
      "familyName": "Wernicke",
      "identifier": "https://orcid.org/0000-0000-0000-0002"
    }
  ],
  "Acknowledgements": "Special thanks to Korbinian Brodmann for help in formatting this dataset in BIDS. We thank Alan Lloyd Hodgkin and Andrew Huxley for helpful comments and discussions about the experiment and manuscript; Hermann Ludwig Helmholtz for administrative support; and Claudius Galenus for providing data for the medial-to-lateral index analysis.",
  "HowToAcknowledge": "Please cite this paper: https://www.ncbi.nlm.nih.gov/pubmed/001012092119281",
  "Funding": [
    "National Institute of Neuroscience Grant F378236MFH1",
    "National Institute of Neuroscience Grant 5RMZ0023106"
  ],
  "EthicsApprovals": [
    "Army Human Research Protections Office (Protocol ARL-20098-10051, ARL 12-040, and ARL 12-041)"
  ],
  "ReferencesAndLinks": [
    "https://www.ncbi.nlm.nih.gov/pubmed/001012092119281",
    "Alzheimer A., & Kraepelin, E. (2015). Neural correlates of presenile dementia in humans. Journal of Neuroscientific Data, 2, 234001. http://doi.org/1920.8/jndata.2015.7"
  ],
  "DatasetDOI": "10.0.2.3/dfjj.10",
  "HEDVersion": "7.1.1"
}

tagging @effigies @sappelhoff @robertoostenveld

@sappelhoff
Member

sappelhoff commented Sep 9, 2020

I assume that the proposal only entails modifying the Authors field? I didn't spot anything else that differs from what we currently support in BIDS.

Re: the Authors field: that looks like an interesting direction to me.

Currently, we say the following according to this part of the spec:

Authors: OPTIONAL. List of individuals who contributed to the creation/curation of the dataset.

Note that this is a little ambiguous and we could be a lot clearer in terms of what exact datatypes are expected. See this issue, where we want to start improving this state: #533

Looking at the validator schema however, we see that an "array of strings" is expected as input: see link to validator code

Now for the present proposal:

  1. we would HAVE to keep allowing "array of strings" for backward compatibility
  2. but we could add a second way to specify authors: "array of objects"
    1. where each "object" MUST have (and only have) the fields X, Y, Z (to be specified)

that wouldn't be a technical problem.
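
As a rough illustration of the two accepted forms, here is a minimal Python sketch of such a check. It is not the validator's actual code. The required keys are an assumption taken from the example at the top of this issue, since the real set (X, Y, Z) is still to be specified, and rejecting a mix of strings and objects is likewise an assumption. The name `check_authors` is hypothetical.

```python
from typing import Any

# Assumed required keys, borrowed from the example at the top of this issue;
# the actual set ("X, Y, Z") has not been decided.
REQUIRED_KEYS = {"@type", "givenName", "familyName", "identifier"}


def check_authors(authors: Any) -> list[str]:
    """Return a list of problems; an empty list means the value is acceptable."""
    if not isinstance(authors, list):
        return ["Authors must be an array"]
    if all(isinstance(a, str) for a in authors):
        return []  # legacy form: array of strings (kept for backward compatibility)
    if not all(isinstance(a, dict) for a in authors):
        # Assumption: a mixed array is rejected rather than tolerated.
        return ["Authors must be all strings or all objects, not a mix"]
    problems = []
    for i, author in enumerate(authors):
        keys = set(author)
        if keys != REQUIRED_KEYS:  # "MUST have (and only have)" the agreed fields
            missing = sorted(REQUIRED_KEYS - keys)
            extra = sorted(keys - REQUIRED_KEYS)
            problems.append(f"Authors[{i}]: missing {missing}, unexpected {extra}")
    return problems
```

Calling `check_authors(["Paul Broca", "Carl Wernicke"])` returns an empty list, as does an array of objects carrying exactly the assumed keys.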

Overall I think this looks cool, but it'd need a bit more tweaking (specify what other @type values you want to allow, why the @ symbol is needed, what kinds of "identifiers" would be permissible, etc.)

Let's hear what others have to say.


PS: using @ + type also made me accidentally tag https://github.com/type ... sorry 🙂

@effigies
Collaborator

effigies commented Sep 9, 2020

cc @nellh It would be good to have an OpenNeuro perspective on this.

@satra
Collaborator

satra commented Sep 9, 2020

if anyone is interested here is the current version of the dataset contributor model we are using in DANDI.

https://github.com/dandi/dandi-cli/blob/c20d7888391c9abe6fdffbc53ea4e20f054bbde2/dandi/models.py#L510

which adds/overwrites the common model.
https://github.com/dandi/dandi-cli/blob/c20d7888391c9abe6fdffbc53ea4e20f054bbde2/dandi/models.py#L444

specifically this uses a field called contributor which can accept either a Person or an Organization as an object with specific roles assigned to these people.

this is not BIDS compatible, but should be compatible/translatable with datacite, which i believe is what openneuro uses for DOIs. although i don't know what pieces of metadata are transformed into the datacite model.

at present it seems that openneuro DOIs provide some basic mapping to creator for all authors:

$ curl https://ez.datacite.org/id/doi:10.18112/openneuro.ds003105.v1.0.1
success: doi:10.18112/openneuro.ds003105.v1.0.1
_target: https://openneuro.org/datasets/ds003105/versions/1.0.1
datacite: <?xml version="1.0" encoding="UTF-8"?>%0A<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">%0A  <identifier identifierType="DOI">10.18112/OPENNEURO.DS003105.V1.0.1</identifier>%0A  <creators>%0A    <creator>%0A      <creatorName>Kelly Payette</creatorName>%0A    </creator>%0A    <creator>%0A      <creatorName>Andras Jakab</creatorName>%0A    </creator>%0A  </creators>%0A  <titles>%0A    <title xml:lang="en-us">Fetal Tissue Annotation Challenge FeTA Dataset</title>%0A  </titles>%0A  <publisher>Openneuro</publisher>%0A  <publicationYear>2020</publicationYear>%0A  <resourceType resourceTypeGeneral="Dataset">fMRI</resourceType>%0A</resource>
_profile: datacite
_datacenter: SUL.OPENNEURO
_export: yes
_created: 1598878722
_updated: 1598878724
_status: public
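
For illustration only, here is a sketch of how structured author entries might map onto richer DataCite creator elements than the name-only ones in the output above. This is not what OpenNeuro does. The function name `creators_element` is hypothetical, and treating every `identifier` as an ORCID is an assumption. The element names (`givenName`, `familyName`, `nameIdentifier`) are, to my knowledge, part of the DataCite kernel-4 schema. The XML namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def creators_element(authors):
    """Build a DataCite-style <creators> element from BIDS Authors entries."""
    creators = ET.Element("creators")
    for author in authors:
        creator = ET.SubElement(creators, "creator")
        name = ET.SubElement(creator, "creatorName")
        if isinstance(author, str):
            name.text = author  # legacy form: free-text name only
            continue
        # DataCite recommends writing personal names as "family, given".
        name.text = f"{author['familyName']}, {author['givenName']}"
        ET.SubElement(creator, "givenName").text = author["givenName"]
        ET.SubElement(creator, "familyName").text = author["familyName"]
        if "identifier" in author:
            # Assumption: the identifier is an ORCID URL, as in the example above.
            ident = ET.SubElement(
                creator,
                "nameIdentifier",
                nameIdentifierScheme="ORCID",
                schemeURI="https://orcid.org",
            )
            ident.text = author["identifier"]
    return creators
```

`ET.tostring(creators_element(authors), encoding="unicode")` gives the serialized fragment. A plain-string author yields only a `creatorName`, which matches what the DOI record above contains today.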

@ericearl
Collaborator

I know this is an old thread, but I was re-introduced to it today by @agt24. I think it would be great to NOT change the dataset_description.json's "Authors" field, but instead accept EITHER:

  1. The "Authors" field as-is in the dataset_description.json ; OR
  2. A CITATION.cff file at the root level of the dataset in lieu of the dataset_description.json "Authors" field, since CITATION.cff is a widely accepted standard for documenting details about authors.

Further reading:

@ericearl
Collaborator

To clarify why I said "in lieu of", I meant if a CITATION.cff file is present, then the Authors field should not be included so there is no confusion or conflict of Authors.
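
The either/or rule could be checked along these lines. This is a minimal sketch, assuming that CITATION.cff sits at the dataset root and that only the presence of the "Authors" key matters. The function name `author_source` and its return labels are hypothetical, and the CFF contents are not parsed.

```python
import json
from pathlib import Path


def author_source(dataset_root):
    """Report where author information comes from for a dataset directory."""
    root = Path(dataset_root)
    has_cff = (root / "CITATION.cff").is_file()
    desc_path = root / "dataset_description.json"
    if desc_path.is_file():
        description = json.loads(desc_path.read_text(encoding="utf-8"))
    else:
        description = {}
    has_authors = "Authors" in description
    if has_cff and has_authors:
        return "conflict"  # both present: the case the proposal wants to rule out
    if has_cff:
        return "CITATION.cff"
    if has_authors:
        return "dataset_description.json"
    return "none"  # Authors is OPTIONAL, so a dataset may have neither
```

A validator could turn the "conflict" result into an error or a warning. Which of the two is a design decision the thread leaves open.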

@CPernet
Collaborator Author

CPernet commented Jun 20, 2023

"lieu" is totally good ('au lieu de') -- already so many French people involved, we don't mind :-D
+1 as a backward-compatible solution

@effigies
Collaborator

@ericearl That's a great idea! +1 from me.

@Remi-Gau
Collaborator

See also this issue: #901

@ericearl
Collaborator

Thank you for your rich historical memory @Remi-Gau! I'll go put a supporting comment there.

@effigies
Collaborator

@Remi-Gau It just sounds better when Eric says it...

@Remi-Gau
Collaborator

Also it sounds better with 2 years of experience about citation.cff. 😏

@effigies
Collaborator

Let's consolidate discussion in #901. If there are points brought up here that need to be copied, please feel free to reproduce them there.

@effigies closed this as not planned Jun 20, 2023