DOIs and lesson retirement / updating policy #1682

Closed
mdlincoln opened this issue Feb 25, 2020 · 24 comments

@mdlincoln
Contributor

Branching off of #1370

Committing to use DOIs has implications for how we update published lessons. In this ticket I want to propose a stricter policy around updating and retiring/replacing lessons in response.

From a technical standpoint, implementing DOIs is relatively straightforward once we have established a partnership with a DOI provider (I will open a separate ticket about the technical aspects). It means that we register a DOI through the provider's interface, submitting lesson metadata (title, authors, abstract, date published, etc.) and then provide a URL to the provider that we promise will stay online persistently. Anyone going to the DOI should then be redirected to our live URL.
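To make the registration step concrete, here is a minimal sketch of what a DOI record and its resolution might look like. It is a toy model, not any provider's actual API: the `DoiRecord` and `DoiRegistry` names, the DOI string, and the metadata values are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DoiRecord:
    """Metadata submitted to a DOI provider, plus the URL we promise to keep online."""
    doi: str
    title: str
    authors: tuple[str, ...]
    abstract: str
    date_published: date
    target_url: str


class DoiRegistry:
    """Toy stand-in for a DOI provider: maps each DOI to exactly one live URL."""

    def __init__(self) -> None:
        self._records: dict[str, DoiRecord] = {}

    def register(self, record: DoiRecord) -> None:
        # A DOI identifies one object, so registering it twice is an error.
        if record.doi in self._records:
            raise ValueError(f"DOI already registered: {record.doi}")
        self._records[record.doi] = record

    def resolve(self, doi: str) -> str:
        """Return the URL a reader following the DOI would be redirected to."""
        return self._records[doi].target_url


registry = DoiRegistry()
registry.register(
    DoiRecord(
        doi="10.0000/example.0001",          # hypothetical DOI
        title="How to use SPARQL",           # placeholder title
        authors=("A. N. Author",),           # placeholder author
        abstract="Placeholder abstract.",
        date_published=date(2020, 5, 1),     # placeholder date
        target_url="https://programminghistorian.org/en/lessons/how-to-use-sparql",
    )
)
resolved_url = registry.resolve("10.0000/example.0001")
print(resolved_url)
```

The point of the sketch is the one-to-one mapping: the DOI itself carries no content, so everything depends on the target URL staying online and its content staying substantially the same.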

Speaking with our scholarly communications librarian and our data management librarian, the general consensus is that the content of the page that a DOI points to should not change in major, substantive ways. Formatting and typo corrections are acceptable - but significant editorial changes to a journal-article-like document (which is how we advertise our lessons) should result in a new webpage with its own URL and a new DOI. For us, that would mean making a new markdown file with a new lesson title and slug (and then possibly retiring the "old" lesson, and adding a link to the new lesson).

We need to decide where we are going to draw the line between insubstantial edits, and substantial changes to content. Some cases are pretty clear, in my opinion:

  • Replacing an old broken external link with a link to the Internet Archive version (again, in my opinion) is an insubstantial edit, that shouldn't mean needing to create a whole new page.
  • However, under this new proposed policy, I would argue that changes like the shift from Python 2 to Python 3 would have resulted in brand new lesson pages with new URLs, and retiring the old lessons.
  • What about Lavin tfidf update #1664, where the author is asking to do a post-publication update? It's a fairly minor addition, nothing like wholly updating code and lesson structure. But you'd never be able to do that in a regular journal, so where do we want to draw the line?

I think this is an outgrowth of the identity confusion that Programming Historian has had for a long time - are we a website of lessons? or a scholarly journal? And committing to DOIs for our journal-article-like lessons means committing more to the scholarly journal side of things. If we're doing that, then we should be much stricter about post-publication edits, and much more ready to retire & replace broken lessons.

n.b. replacement lessons may or may not have enough changes to merit a completely new peer review! The Python 3 updates were substantial enough that I think they would have merited a new DOI, but not big enough to mean a whole new round of peer review! Whereas, were I to write a new SPARQL lesson, it would be so changed that it would probably merit a new round of peer review. So that's another question you need to decide on.

TL;DR: DOIs mean being stricter about lesson updates

  1. What updates (typos, broken links, other...?) are small enough to do without creating a new page+DOI?
  2. What updates are big enough to merit a new page+DOI?
  3. What updates are big enough to merit a whole new lesson proposal & peer-review cycle?
@mdlincoln mdlincoln self-assigned this Feb 25, 2020
@mdlincoln mdlincoln added this to the Plan S milestone Feb 25, 2020
@acrymble

I don't think "retirement" is conducive to being a scholarly journal. There is "published" and "retracted" but no one else that I know of uses this idea of retirement.

@svmelton
Contributor

To add to @mdlincoln's good questions above, I wonder if we should designate someone to be in charge of managing lesson updates. This has come up recently in PH EN conversations, and @acrymble and I just had an email exchange about potentially adding that as a role. With DOIs in the mix, it seems like we might need this role more acutely. (I know we've discussed it as a ME function—but to be honest, it's a lot of extra work!)

@mdlincoln
Contributor Author

Our "retirement" mechanism would be the way to manage creating versions @acrymble - rename it if the project team thinks it needs a re-name.

And @svmelton that may be a good idea - but I'd argue that the management should actually be made simpler by being much stricter in saying "No." to updates of a certain type.

@acrymble

Whatever the criteria, it should be (in my view) on the side of preserving effort. If 200 hours has gone into producing a lesson and 1 hour and 7 lines can fix it, that's worth it, in my view. It's too easy to say "broken", without remembering how much energy went into creating something in the first place.

I think we can put forth an argument that the DOI is linked to a document with a purpose. As long as the document still serves that purpose, it can stay.

But I agree with @mdlincoln that we should say no more frequently to requests to update things for the sake of wanting to.

@drjwbaker
Member

drjwbaker commented Feb 25, 2020

A few brief thoughts:

  • we need to involve our DOI provider a little in determining what changes create a new DOI because a) as librarians they are interested in the challenge, and b) they are paying per new DOI.
  • agree that we need to say no more often.
  • we need to change some of our ~"help us fix stuff!" language (e.g. https://programminghistorian.org/en/contribute#provide-feedback-or-report-problems) to be more realistic about what we'll do with that help (e.g. we might reject a PR that was a substantive change and would therefore mint a new DOI version because we didn't think the content of the PR warranted a new DOI version, even though we were grateful for the idea). I guess this is an expansion of the "say no more often" thought.

@drjwbaker
Member

Also, huge thanks to @mdlincoln for taking the time to kick this off.

@drjwbaker
Member

Final thing: can I propose we assemble some test cases from past edits and compile them as reference examples of New DOI and No New DOI, include our justifications against the DOI guidelines, and use that as a public/working document to guide (but not constrain, because there will always be weird edge cases) our future decisions? Again, I suspect our DOI partner would be interested in seeing this and helping us improve it.

@acrymble

Re suggestion for updating text asking for updates, the following pages may be relevant:

https://programminghistorian.org/en/contribute#provide-feedback-or-report-problems
https://programminghistorian.org/en/feedback

(and obviously the other language equivalents).

Thanks @mdlincoln for initiating this discussion.

@mdlincoln
Contributor Author

mdlincoln commented Mar 2, 2020

> Whatever the criteria, it should be (in my view) on the side of preserving effort. If 200 hours has gone into producing a lesson and 1 hour and 7 lines can fix it, that's worth it, in my view. It's too easy to say "broken", without remembering how much energy went into creating something in the first place.

Agreed @acrymble, that's why what I'd be proposing is like this:

  1. A lesson is reviewed and published at /en/lessons/how-to-use-sparql, and a DOI is registered for it.
  2. Some time later, substantive-enough changes are required in the code that it would cross the threshold of needing to be a new version. This could even just entail 7 lines.
  3. We make a new copy of the lesson markdown to /en/lessons/how-to-use-sparql-2 and register a new DOI for it.
  4. We move /en/lessons/how-to-use-sparql to /en/lessons/retired/how-to-use-sparql
  5. We add a note to /en/lessons/retired/how-to-use-sparql that there's a more up-to-date version with corrected code available at /en/lessons/how-to-use-sparql-2.
  6. We add a note to /en/lessons/how-to-use-sparql-2 that points to the earlier version.
  7. This could be repeated again if we needed /en/lessons/how-to-use-sparql-3

This ensures that both the original as well as the new lesson keep their valid URLs and DOIs. It doesn't require a whole new peer-review process - we're just making a new copy of the lesson and updating small but significant portions of it. It ensures that all the versions remain available online, but only the most recent one is displayed in the lessons index. It ensures all the versions are linked to each other, so readers can see context.
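The steps above can be sketched as a small planning function. This is only an illustration under stated assumptions: `next_version_slug`, `plan_new_version`, and `VersionPlan` are hypothetical names, the note wording is a placeholder, and nothing here touches the real site repository.

```python
import re
from dataclasses import dataclass

LESSONS_PREFIX = "/en/lessons/"
RETIRED_PREFIX = "/en/lessons/retired/"


def next_version_slug(slug: str) -> str:
    """Return the slug for the next version: foo -> foo-2 -> foo-3.

    Caveat: a first-version slug that already ends in "-<digits>" would be
    misread as a versioned slug, so a real implementation would need an
    explicit version field rather than this naming heuristic.
    """
    match = re.fullmatch(r"(.+)-(\d+)", slug)
    if match:
        return f"{match.group(1)}-{int(match.group(2)) + 1}"
    return f"{slug}-2"


@dataclass(frozen=True)
class VersionPlan:
    new_url: str          # where the updated copy is published (gets a new DOI)
    retired_url: str      # where the old copy moves (keeps its existing DOI)
    note_on_retired: str  # note added to the old version
    note_on_new: str      # note added to the new version


def plan_new_version(slug: str) -> VersionPlan:
    new_url = LESSONS_PREFIX + next_version_slug(slug)
    retired_url = RETIRED_PREFIX + slug
    return VersionPlan(
        new_url=new_url,
        retired_url=retired_url,
        note_on_retired=f"A more up-to-date version of this lesson is available at {new_url}.",
        note_on_new=f"An earlier version of this lesson is available at {retired_url}.",
    )


plan = plan_new_version("how-to-use-sparql")
print(plan.new_url)
print(plan.retired_url)
```

Because the old lesson moves to a new path, its existing DOI would also need to be pointed at the retired URL so that it keeps resolving.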

I think this could require changing our default "lesson retired" text. Currently, it does make it sound like the only reason lessons are retired is due to massively-out-of-date software. That may still happen - in which case, we'd note that in the retirement notice at the top of the lesson. But we might rewrite it to say this lesson is no longer the most up-to-date version - click here to see the newer version.

Does that clarify things?

@spapastamkou
Contributor

Maybe it is already considered, but I thought of Zenodo's DOIs versioning with one global DOI for all versions of a digital object and specific DOIs for each version as described here.

@mdlincoln
Contributor Author

Zenodo doesn't actually provide DOIs for arbitrary URLs; we're going with @drjwbaker's supplier instead. Also, the DOIs themselves don't have versioning, other than using a string that looks like a version number. The actual content system (Zenodo, or Programming Historian) is the one that is responsible for linking together versions of docs.
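As a rough illustration of what "the content system links the versions" could mean in practice, the sketch below treats each DOI as an opaque string and stores the version chain alongside the lesson. The `LessonVersion` structure, the `superseded_by` field, and the DOI values are assumptions made for this example, not an existing Programming Historian data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LessonVersion:
    url: str
    doi: str                             # opaque identifier; carries no version semantics
    superseded_by: Optional[str] = None  # URL of the newer version, if any


def latest_version(start_url: str, lessons: dict[str, LessonVersion]) -> LessonVersion:
    """Follow superseded_by links from any version to the most recent one."""
    seen: set[str] = set()
    current = lessons[start_url]
    while current.superseded_by is not None:
        if current.url in seen:
            raise ValueError(f"Version links form a cycle at {current.url}")
        seen.add(current.url)
        current = lessons[current.superseded_by]
    return current


# Hypothetical DOIs: the two strings are unrelated as far as the registrar is concerned.
lessons = {
    "/en/lessons/retired/how-to-use-sparql": LessonVersion(
        url="/en/lessons/retired/how-to-use-sparql",
        doi="10.0000/example.0001",
        superseded_by="/en/lessons/how-to-use-sparql-2",
    ),
    "/en/lessons/how-to-use-sparql-2": LessonVersion(
        url="/en/lessons/how-to-use-sparql-2",
        doi="10.0000/example.0002",
    ),
}

newest = latest_version("/en/lessons/retired/how-to-use-sparql", lessons)
print(newest.url, newest.doi)
```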

@acrymble

acrymble commented Mar 6, 2020

I just noticed that our "Goals" on patreon include keeping lessons up to date, if we meet the $200 per month target. We can still change that because we haven't met the target. But I raise it because we'll have to update that text if we make changes to our plans.

@drjwbaker
Member

> I just noticed that our "Goals" on patreon include keeping lessons up to date, if we meet the $200 per month target. We can still change that because we haven't met the target. But I raise it because we'll have to update that text if we make changes to our plans.

That was always a slightly vague goal (that is, where is the specific spend associated with that goal). My suggestion is rather than change the goal we make it actionable by doing something like the following:

  • we identify a lesson needs to change
  • we identify that the change will trigger a new version (based on the criteria we have yet to lay out)
  • at that point, we get a copy editor to retro copy edit the lesson (this is the spend associated with the goal), before we make the change and mint the new DOI.

The point is, if we are fixing something such that we mint a new DOI, we may as well pour a bit of copy editing love on the lesson at the same time.

Good idea?

@drjwbaker
Member

drjwbaker commented Mar 27, 2020

In light of the meeting with our DOI partner #1683 (comment) we need to produce some text that:

a) states what we believe to be an edit that doesn't trigger a new DOI; b) states what we believe to be an edit that will trigger a new DOI; c) outlines our process for contacting our provider regarding new DOIs for revised publications.

Integrating thoughts above I propose the following. @mdlincoln @svmelton @spapastamkou @rivaquiroga: could you please take a look and indicate whether or not you are happy with that.

Note for Managing Editors: per #1683 (comment) this isn't going into our Editorial Workflows just yet, it is more for our agreement with our DOI provider (so they are happy we are not going to cause them lots of trouble!).


Programming Historian DOI policy (edited 30/03/2020)

  1. From FIXME 2020, every article published by the Programming Historian will be assigned a DOI. The assigned DOI will resolve to the published URL at https://programminghistorian.org (e.g. https://programminghistorian.org/fr/lecons/analyse-corpus-antconc)
  2. DOIs are unique identifiers for objects, meaning that the content of the object is expected to be largely consistent over time. If we wish to publish a new object that significantly updates the content of the former object, that means creating a new DOI.
  3. We determine that not all changes to articles require creating a new document and registering a new DOI. Where changes to an article are requested or suggested, the decision on whether or not to make changes rests with the relevant Managing Editor. The Managing Editor must decide if the changes are 'Minor' or 'Major' (whilst many situations will be unique, examples are provided below for guidance). In most cases 'Major' revisions are discouraged, and may suggest the article should be retired and a new article proposed. For 'Minor' changes, edits will be made in the usual way and no new DOI will be requested. For 'Major' changes, Managing Editors will:
    • Make a new copy of the article with a sequentially increasing numerical suffix in the URL (e.g. /en/lessons/how-to-use-sparql-2, /en/lessons/how-to-use-sparql-3, et cetera)
    • If the article has not been copyedited (this will apply to most lessons published before Spring 2020), pass it through the Copyediting Process.
    • When the new version of the article is ready, move the original version of the article to the retired folder (e.g. /en/lessons/how-to-use-sparql to /en/lessons/retired/how-to-use-sparql)
    • Add a note to the retired version that there is a new version.
    • Add a note to the new version that points to the previous version.
    • Contact our DOI provider to notify them of both the new version of the article (so they can register a new DOI) and the revised URL for the previous version of the article.
  4. In cases where articles have to be retired, we will contact our DOI provider with the revised URL for the article.

Examples of 'Minor' Changes

  • Replacing a broken external link with a link to the Internet Archive version.
  • Correcting errors in spelling or of fact.
  • Revising formatting.
  • Changes to article metadata (e.g. difficulty level).

Examples of 'Major' Changes

  • Revisions to the code base of or processes described in an article required as a result of substantial software changes (e.g. Python 2 to Python 3) or new recommended versions.
  • Changes to the arrangement or order of an article.
  • Replacement of datasets used in an article.
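For illustration only, the 'Minor' and 'Major' examples above could be encoded as a lookup table. This sketch assumes a hypothetical `ChangeKind` enumeration covering just the listed examples; under the policy the decision rests with the Managing Editor, so a table like this could at most serve as a starting point.

```python
from collections.abc import Iterable
from enum import Enum, auto


class Severity(Enum):
    MINOR = auto()  # edit in place, no new DOI
    MAJOR = auto()  # new copy of the article, new URL, new DOI


class ChangeKind(Enum):
    BROKEN_LINK_TO_ARCHIVE = auto()
    SPELLING_OR_FACT_CORRECTION = auto()
    FORMATTING = auto()
    METADATA = auto()
    CODE_OR_PROCESS_REVISION = auto()
    REORDERING = auto()
    DATASET_REPLACEMENT = auto()


# Mirrors the example lists in the policy text; real cases may not fit neatly.
SEVERITY_BY_KIND = {
    ChangeKind.BROKEN_LINK_TO_ARCHIVE: Severity.MINOR,
    ChangeKind.SPELLING_OR_FACT_CORRECTION: Severity.MINOR,
    ChangeKind.FORMATTING: Severity.MINOR,
    ChangeKind.METADATA: Severity.MINOR,
    ChangeKind.CODE_OR_PROCESS_REVISION: Severity.MAJOR,
    ChangeKind.REORDERING: Severity.MAJOR,
    ChangeKind.DATASET_REPLACEMENT: Severity.MAJOR,
}


def needs_new_doi(changes: Iterable[ChangeKind]) -> bool:
    """A set of changes needs a new document and DOI if any one of them is 'Major'."""
    return any(SEVERITY_BY_KIND[change] is Severity.MAJOR for change in changes)


typo_fix = needs_new_doi([ChangeKind.SPELLING_OR_FACT_CORRECTION, ChangeKind.FORMATTING])
python3_port = needs_new_doi([ChangeKind.FORMATTING, ChangeKind.CODE_OR_PROCESS_REVISION])
print(typo_fix, python3_port)
```

Treating the most severe change as decisive matches the policy's framing: one 'Major' change is enough to require a new document, however many 'Minor' ones accompany it.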

@mdlincoln
Contributor Author

mdlincoln commented Mar 29, 2020

Thanks very much @drjwbaker for putting this together - it's very well done.

As I underlined in our call, we might want to clarify in this doc (both for Sussex, but also for our internal benefit) that when we talk about "major changes", what we are talking about from a publishing standpoint is creating an entirely new document that has a new URL as well as a new DOI - it's not changing a DOI for an existing document.

Editorially, we may think of it as versioning and that's just fine, but from the perspective of the DOI registrar, a new version is just a new document, nothing more and nothing less. Both internally and externally, we need to move to the mindset that when we perform "major changes", we are publishing a new document that contains updated content of an existing document, a subtle but important distinction from the concept of "updating an existing document."

Might rephrase:

DOIs are unique identifiers for objects, meaning that if the object changes, the DOI may need to change.

to

DOIs are unique identifiers for objects, meaning that the content of the object is expected to be largely consistent over time. If we wish to publish a new object that significantly updates the content of the former object, that means creating a new DOI.

It's not as elegantly worded, but it is more accurate.

@mdlincoln
Contributor Author

Likewise,

We determine that not all changes to articles require registering a new DOI.

would instead be

We determine that not all changes to articles require creating a new document and registering a new DOI.

@drjwbaker
Member

Thanks @mdlincoln. Edited.

@mdlincoln
Contributor Author

@drjwbaker following up - we need to store this documentation for editorial teams to consult when making decisions about handling lesson changes. I think adding it to the wiki makes sense? Could you put the text up on a page? (or remind me of the link to the doc you made so I can do it?)

@drjwbaker
Member

Thanks for the prod @mdlincoln :)

MEs: @spapastamkou @svmelton @rivaquiroga: where would you like the DOI policy (below) to go, given that it impacts editorial workflows for revising articles? Wiki?


The Programming Historian Digital Object Identifier Policy (April 2020)

  1. From May 2020, every article published by the Programming Historian will be assigned (or be in the process of being assigned) a DOI. The assigned DOI will resolve to the published URL at https://programminghistorian.org (e.g. https://programminghistorian.org/fr/lecons/analyse-corpus-antconc)
  2. DOIs are unique identifiers for objects, meaning that the content of the object is expected to be largely consistent over time. If we wish to publish a new object that significantly updates the content of the former object, that means creating a new DOI.
  3. We determine that not all changes to articles require creating a new document and registering a new DOI. Where changes to an article are requested or suggested, the decision on whether or not to make changes rests with the relevant Managing Editor. The Managing Editor must decide if the changes are 'Minor' or 'Major' (whilst many situations will be unique, examples are provided below for guidance). In most cases 'Major' revisions are discouraged, and may suggest the article should be retired and a new article proposed. For 'Minor' changes, edits will be made in the usual way and no new DOI will be requested. For 'Major' changes, Managing Editors will:
    • Make a new copy of the article with a sequentially increasing numerical suffix in the URL (e.g. /en/lessons/how-to-use-sparql-2, /en/lessons/how-to-use-sparql-3, et cetera)
    • If the article has not been copyedited (this will apply to most lessons published before Spring 2020), pass it through the Copyediting Process.
    • When the new version of the article is ready, move the original version of the article to the retired folder (e.g. /en/lessons/how-to-use-sparql to /en/lessons/retired/how-to-use-sparql)
    • Add a note to the retired version that there is a new version.
    • Add a note to the new version that points to the previous version.
    • Contact our DOI provider to notify them of both the new version of the article (so they can register a new DOI) and the revised URL for the previous version of the article.
  4. In cases where articles have to be retired, we will contact our DOI provider with the revised URL for the article.

Examples of 'Minor' Changes

  • Replacing a broken external link with a link to the Internet Archive version.
  • Correcting errors in spelling or of fact.
  • Revising formatting.
  • Changes to article metadata (e.g. difficulty level).

Examples of 'Major' Changes

  • Revisions to the code base of or processes described in an article required as a result of substantial software changes (e.g. Python 2 to Python 3) or new recommended versions.
  • Changes to the arrangement or order of an article.
  • Replacement of datasets used in an article.

@drjwbaker
Member

Further to closing #1684 (comment) MEs (@spapastamkou @svmelton @rivaquiroga) I need to know where you want the policy on minor vs major changes (above). Wiki? Once I have your agreement, I'll do the work, and close the issue.

@rivaquiroga
Member

I think the Wiki is the best place

@drjwbaker
Member

Okay. I'll get it on there for now, and - unless I hear otherwise - close this issue sometime next week.

@drjwbaker
Member

Policy up at https://github.com/programminghistorian/jekyll/wiki/The-Programming-Historian-Digital-Object-Identifier-Policy-(April-2020)

In short, MEs: changes to articles may now require the creation of a new .md file with a new DOI, as we cannot make major changes to an article that has a DOI. You still have the power to decide what to do if/when a change to an article is suggested.

@drjwbaker
Member

Closing now as I've not heard any dissent :)
