Automatically create Location entry in Thoth when a Dissemination workflow succeeds #545

Closed
rhigman opened this issue Feb 5, 2024 · 3 comments · Fixed by thoth-pub/thoth-dissemination#30

@rhigman
Member

rhigman commented Feb 5, 2024

On successful dissemination, add a new Location entry to the relevant Publication in Thoth, recording the URL(s) of the newly-created directory entry and/or copy of the content.

This should be added to the existing Internet Archive and Figshare workflows. Crossref is not relevant, as DOIs are already present in Thoth at the time of submission. For Internet Archive, the Platform type INTERNET_ARCHIVE is now available; for institutional repositories such as Figshare, OTHER will need to be used.
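
A minimal sketch of the workflow-to-platform mapping this implies. Only the platform types INTERNET_ARCHIVE and OTHER come from this issue; the workflow identifiers (`internetarchive`, `figshare`, `crossref`) and the function name are hypothetical and would need to match whatever the disseminator actually uses.

```python
from typing import Optional

# Hypothetical workflow identifiers mapped to Thoth Location platform types.
LOCATION_PLATFORMS = {
    "internetarchive": "INTERNET_ARCHIVE",
    "figshare": "OTHER",  # no dedicated platform type available yet
}

# Workflows that create no Location (DOIs are already present in Thoth).
SKIPPED_WORKFLOWS = {"crossref"}


def location_platform_for(workflow: str) -> Optional[str]:
    """Return the Location platform type for a workflow, or None to skip it."""
    if workflow in SKIPPED_WORKFLOWS:
        return None
    if workflow not in LOCATION_PLATFORMS:
        # Fail loudly rather than silently recording the wrong platform.
        raise ValueError(f"No Location platform configured for {workflow!r}")
    return LOCATION_PLATFORMS[workflow]
```

Raising on an unknown workflow is a design choice for this sketch: it forces each new dissemination platform to be mapped explicitly when it is implemented.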

This would also make it possible to replace the current logic for checking whether a Work already exists on the target platform: instead, the check could look at whether a Location with the relevant Platform exists in Thoth.
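
The Location-based check could look something like the following sketch. It assumes the publication has already been fetched from Thoth as a dictionary with a `locations` list whose entries carry a `locationPlatform` field; that shape is an assumption for illustration, not a confirmed response format.

```python
def has_location_on_platform(publication: dict, platform: str) -> bool:
    """Return True if any Location on the publication uses the given platform."""
    return any(
        location.get("locationPlatform") == platform
        for location in publication.get("locations", [])
    )
```

A platform type shared by several targets, such as OTHER, cannot be told apart by this check alone, so those workflows would probably need an additional test (for example on the landing page host).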

This can then be extended to new dissemination platforms when they are implemented. Ease of implementation may depend on individual platforms' workflows; Internet Archive returns the relevant URL to the dissemination script immediately on creation, but e.g. FTP-based workflows are unlikely to be as neat.

@rhigman
Member Author

rhigman commented Feb 6, 2024

As part of this work, add something like PUBLISHER_WEBSITE to the set of location platform types.

@ja573
Member

ja573 commented Feb 15, 2024

> As part of this work, add something like PUBLISHER_WEBSITE to the set of location platform types.

Tracked separately now under #561

@rhigman
Member Author

rhigman commented Apr 19, 2024

  • For each platform, determine the appropriate landingPage and fullTextUrl to record in Thoth
    • Internet Archive: landingPage = archive.org/details/[workId], fullTextUrl = archive.org/details/[workId]/[filename].pdf
    • Figshare: the same links, Handles, etc. exist in both figshare.com (API) and repository.lboro.ac.uk (UI) versions; should this be treated as a "Figshare" upload or a "Loughborough repository" upload (repositories may migrate platforms)?
    • CUL: tbd
    • Zenodo (or do under Disseminate to Zenodo #542)
    • OAPEN: works don't acquire these URLs until some hours/days after dissemination. Currently handled manually. Any alternative? Split out as separate task?
    • (Crossref: not relevant here)
  • Extend disseminator to (retrieve and) pass back landingPage and fullTextUrl when they have been assigned via successful archiving
    • this will need to be on a per-publication basis as we sometimes disseminate more than one format, so publicationId will also need to be passed back, or at least publicationType
  • Add script which takes publicationId, locationPlatform, landingPage and fullTextUrl and writes location to Thoth
    • locationPlatform could be supplied directly or derived from inputs to disseminator
    • publicationId could be passed back directly as above, or obtained from Thoth via e.g. workId + publicationType
  • Extend GitHub Actions to take output from each disseminator run and pass it to new script
    • for dissemination of multiple formats, should the script be called multiple times, or should it handle multiple locations itself?
  • Determine whether any new locationPlatforms need to be added to Thoth
    • e.g. FIGSHARE - or as above, should it be e.g. LBORO_REPO?
    • Any way of marking/"locking" these locations as created by Thoth Dissemination Service/part of Thoth Archiving Network?
    • Is it still appropriate to permit only one location per locationPlatform for all of these? (e.g. users might independently upload copies to additional Figshare repositories, etc - not necessarily sensible but shouldn't go unrecorded)
  • Catchup run: ensure that works disseminated prior to implementation of this feature all have appropriate locations created
    • Could a similar mechanism be used (on a regular, automatic basis) to handle OAPEN, as above?
  • Add an appropriate set of Thoth credentials as repository secret (or organisation secret - would require permissions)
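
The data flow in the task list above (disseminator passes back per-publication URLs, a workflow step forwards them, a script writes them to Thoth) can be sketched as follows. The field names `publicationId`, `locationPlatform`, `landingPage` and `fullTextUrl` and the Internet Archive URL pattern come from the list; the `https://` scheme, the class and function names, and the choice of JSON as the hand-off format are assumptions. The call that actually writes to Thoth is deliberately left out, since it depends on the Thoth client and credentials.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class LocationRecord:
    """One Location to write to Thoth, for a single publication."""
    publicationId: str
    locationPlatform: str
    landingPage: str
    fullTextUrl: str


def internet_archive_record(publication_id: str, work_id: str,
                            filename: str) -> LocationRecord:
    """Build the Internet Archive URLs using the pattern from the list above."""
    landing_page = f"https://archive.org/details/{work_id}"  # scheme assumed
    return LocationRecord(
        publicationId=publication_id,
        locationPlatform="INTERNET_ARCHIVE",
        landingPage=landing_page,
        fullTextUrl=f"{landing_page}/{filename}.pdf",
    )


def dump_records(records: List[LocationRecord]) -> str:
    """Serialise records as one JSON string, e.g. for a workflow step output."""
    return json.dumps([asdict(record) for record in records])


def load_records(payload: str) -> List[LocationRecord]:
    """Parse JSON from dump_records; missing or unknown fields raise TypeError."""
    return [LocationRecord(**item) for item in json.loads(payload)]
```

Passing a list rather than a single record is one possible answer to the multiple-formats question: the disseminator emits one record per publication, and the receiving script loops over them in a single invocation. Calling the script once per record would work equally well with the same record shape.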
