Crossref Published dates represented inconsistently across work types #6

Open
gbilder opened this issue Oct 14, 2024 · 1 comment

gbilder commented Oct 14, 2024

Published date parsing seems inconsistent across different Crossref content types:

For example, JournalArticles always return the Published date as a YYYY-MM-DD string.

However, other content types also seem to include trailing time info. Here are examples:

  • 10.6019/pxd046288 (Database)
  • 10.1007/10390457_56 (BookChapter)
  • 10.7717/peerj.3953/table-1 (Component)
  • DOI:10.15760/honors.1246 (Dissertation)
  • 10.1109/vlsic.2004.1346548 (ProceedingsArticle)

In each case the published date seems to include trailing time info as well.

For example, with the last item in the above list:

commonmeta convert --from crossref "10.1109/vlsic.2004.1346548"

results in:

{
  "id": "https://doi.org/10.1109/vlsic.2004.1346548",
  "type": "ProceedingsArticle",
  "container": {
    "type": "Proceedings",
    "title": "2004 Symposium on VLSI Circuits. Digest of Technical Papers (IEEE Cat. No.04CH37525)"
  },
  "contributors": [
    {
      "type": "Person",
      "givenName": "J.W.",
      "familyName": "Lee",
      "contributorRoles": [
        "Author"
      ]
    },
    {
      "type": "Person",
      "familyName": "Daihyun Lim",
      "contributorRoles": [
        "Author"
      ]
    },
    {
      "type": "Person",
      "givenName": "B.",
      "familyName": "Gassend",
      "contributorRoles": [
        "Author"
      ]
    },
    {
      "type": "Person",
      "givenName": "G.E.",
      "familyName": "Suh",
      "contributorRoles": [
        "Author"
      ]
    },
    {
      "type": "Person",
      "givenName": "M.",
      "familyName": "van Dijk",
      "contributorRoles": [
        "Author"
      ]
    },
    {
      "type": "Person",
      "givenName": "S.",
      "familyName": "Devadas",
      "contributorRoles": [
        "Author"
      ]
    }
  ],
  "date": {
    "published": "2004-10-26T09:47:24Z"
  },
  "identifiers": [
    {
      "identifier": "https://doi.org/10.1109/vlsic.2004.1346548",
      "identifierType": "DOI"
    }
  ],
  "license": {},
  "provider": "Crossref",
  "publisher": {
    "id": "https://api.crossref.org/members/263",
    "name": "Widerkehr and Associates"
  },
  "references": [
    {
      "key": "ref4",
      "id": "https://doi.org/10.1145/586110.586132"
    },
    {
      "key": "ref3",
      "id": "https://doi.org/10.1109/9780470544365"
    },
    {
      "key": "ref6",
      "title": "IC Identification Circuit Using Device Mismatch",
      "publicationYear": "2000"
    },
    {
      "key": "ref5",
      "id": "https://doi.org/10.1109/csac.2002.1176287"
    },
    {
      "key": "ref2",
      "title": "Identification and Authentication of Integrated Circuits",
      "publicationYear": "2003"
    },
    {
      "key": "ref1",
      "publicationYear": "2001"
    }
  ],
  "titles": [
    {
      "title": "A technique to build a secret key in integrated circuits for identification and authentication applications"
    }
  ],
  "url": "http://ieeexplore.ieee.org/document/1346548/"
}

What is very odd is that the date in the JSON above ("2004-10-26T09:47:24Z") seems to be extracted from this value in the Crossref record:

2004-10-26T09:47:24Z

This is the date the record was created, not the date the item was published.

http://api.crossref.org/works/10.1109/vlsic.2004.1346548/transform/application/vnd.crossref.unixsd+xml
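For comparison, the Crossref REST API reports publication dates as `date-parts` arrays (for example `[[2004, 6]]`), which may hold only a year or a year and month. The sketch below shows one way such an array could be turned into a partial date string without borrowing the record-creation timestamp. `FormatDateParts` is a hypothetical helper written for this illustration; it is not taken from commonmeta, and the sample values are made up rather than the real published date of the DOI above.

```go
package main

import "fmt"

// FormatDateParts is a hypothetical helper (not part of commonmeta).
// It turns one Crossref-style date-parts entry, e.g. []int{2004, 6},
// into a partial ISO 8601 date string at the same granularity.
func FormatDateParts(parts []int) (string, error) {
	switch len(parts) {
	case 1: // year only
		return fmt.Sprintf("%04d", parts[0]), nil
	case 2: // year and month
		return fmt.Sprintf("%04d-%02d", parts[0], parts[1]), nil
	case 3: // full date
		return fmt.Sprintf("%04d-%02d-%02d", parts[0], parts[1], parts[2]), nil
	default:
		return "", fmt.Errorf("expected 1 to 3 date parts, got %d", len(parts))
	}
}

func main() {
	// Illustrative inputs only.
	for _, p := range [][]int{{1965}, {2016, 6}, {2024, 12, 1}} {
		s, err := FormatDateParts(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s)
	}
}
```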

gbilder commented Oct 15, 2024

The issue is simply one of granularity—some dates (e.g., "published") can be partial, and the original metadata doesn't include a time component. However, it would be good if all dates (partial or otherwise) were represented consistently, even in string form. So, for example:

  • "2024-12-01" -> "2024-12-01 00:00:00 +0000 UTC"
  • "2016-06" -> "2016-06-01 00:00:00 +0000 UTC"
  • "1965" -> "1965-01-01 00:00:00 +0000 UTC"
  • "2017-11-07T08:18:33Z" -> "2017-11-07 08:18:33 +0000 UTC"
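The mapping above can be sketched in Go with the standard `time` package. This is a minimal illustration under stated assumptions, not commonmeta's actual implementation: `NormalizeDate` and the `layouts` list are hypothetical names, and only the four granularities listed above are handled. Missing month and day default to `01` because that is how `time.Parse` fills absent fields.

```go
package main

import (
	"fmt"
	"time"
)

// layouts lists the granularities from the examples above,
// most specific first. (Hypothetical; extend as needed.)
var layouts = []string{
	time.RFC3339, // "2017-11-07T08:18:33Z"
	"2006-01-02", // "2024-12-01"
	"2006-01",    // "2016-06"
	"2006",       // "1965"
}

// NormalizeDate is a hypothetical helper that parses a full or
// partial date and returns it in Go's default time.Time string
// form, in UTC.
func NormalizeDate(s string) (string, error) {
	for _, layout := range layouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t.UTC().String(), nil
		}
	}
	return "", fmt.Errorf("unrecognized date format: %q", s)
}

func main() {
	inputs := []string{"2024-12-01", "2016-06", "1965", "2017-11-07T08:18:33Z"}
	for _, s := range inputs {
		out, err := NormalizeDate(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %q\n", s, out)
	}
}
```

One trade-off of this approach is that the normalized string no longer records the original granularity: "1965" and "1965-01-01" become indistinguishable.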
