Could perma.cc help PH keep weblinks sustainable? #2030
Matt Lincoln also helpfully sent this article on Robustifying Links To Combat Reference Rot https://journal.code4lib.org/articles/15509 (not tagging Matt so that we don't bug him but also wanna give him kudos). Definitely think we should discuss this all at our next tech team meeting, which @hawc2 you're welcome to attend |
This ticket needs someone assigned to it. Otherwise it will stay open forever. @hawc2 are you planning on taking this forward? |
Yeah, I just assigned it to myself, and the plan was @ZoeLeBlanc will bring it up at the meeting this Wednesday. There are some organizational decisions to make, but this seems like a pretty viable and sustainable option, if we can get a sponsor library on board |
On perma.cc: having had a look, the following people are at institutions that are already partners, though I note that in some cases it may be specific (law) libraries that may not provide support to all faculty.
Signing up is free for academic libraries, so I've asked Sussex as well. My instinct is that if we want to move to perma.cc we need a number of us at institutions where our libraries have signed up. So there are two actions here:
|
I'll reach out to Princeton, but I was planning to see about getting UIUC to join IPP anyway, so I will ask them about this too! |
Just to clarify @hawc2 & @drjwbaker is perma.cc free if a library sponsors us? Or does the library already need to be a member and then we just use it with their account? Mostly just wondering how much this costs the sponsoring library. Thanks! |
My read is that it is free for academic libraries to join, and then any faculty can use the account for any purpose. But my scan may be wrong! It may just be worth starting by asking your library about their perma.cc membership and how you can use it. |
I'm double-checking with a colleague, and can email perma.cc, but my
understanding is that a library, say Sussex Library, could become a member
for free, and register PH as a journal with the library and perma.cc. Then
we could add Organizational Users who work in the journal.
It seems like each of these Organizational Users wouldn't also have to work
at a member library of perma.cc, but I'm double checking on that, as it is
vague in the documentation. If it is true that each journal editor would
need a perma.cc account through their library, I wonder if it would only be
necessary for those in charge of doing the perma.cc links to get library
perma.cc access, so we could keep this to a subset of our editorial team.
This page of the User Guide is helpful, as is the PDF at the bottom for
academic journals: https://perma.cc/docs/libraries
I'll look into Temple Law Library, but like at a lot of academic libraries
in the U.S. at least, the Law Library is a separate organization from the
rest of our Libraries in some strange ways, so it's possible I won't be
able to access their account with perma.cc
…On Wed, 24 Mar 2021 at 12:22, James Baker ***@***.***> wrote:
My read is that it is free for academic libraries to join, and then any
faculty can use the account for any purpose. But my scan may be wrong! It
may just be worth starting by asking your library about their perma.cc
membership and how you can use it.
|
That's funny @hawc2 - I didn't realize it was a common thing. But the UVA Law Library is separate institutionally from the rest of our Library, and I similarly would not have access to their account. |
My ID card doesn't even get me into the Law School/Library building! Pretty
sure it's the only building on campus for which that's the case. It is
strange.
But maybe our libraries will be more interested in being members with
perma.cc if our Law libraries already are?
…On Wed, 24 Mar 2021 at 13:46, Brandon Walsh ***@***.***> wrote:
That's funny @hawc2 <https://github.com/hawc2> - I didn't realize it was
a common thing. But the UVA Law Library is separate institutionally from
the rest of our Library, and I similarly would not have access to their
account.
|
Yeah, I guess law libraries at US universities might be separate things, but thought I'd ping you all anyway just in case :) |
Good news regarding perma.cc.
It's possible we could do it with any of our libraries, and that they don't need to be Institutional Partners of PH to use perma.cc for the journal. I've followed up with perma.cc's support team to ask about long-term sustainability in terms of what happens if the relevant staff at the hosting institution were to leave either PH or their academic institution. |
Update on migrating between institutions, from perma.cc support: "If you'd like to migrate an org from one registrar to another, you would just need to send in that request to the perma team and get permission from both the existing registrar and the intended registrar." |
@hawc2 Great digging! Will you reply on our behalf via Temple Law Library (ideally using programminghistorian@gmail.com, though I appreciate you probably don't have access - but you can have it)? Or would you like me to? (if I can via your library) |
@drjwbaker Do you mean I should set up an Org account for PH through Temple's account? If I can have access to the gmail account, I'm happy to begin a separate conversation directly with perma.cc user support about the various options we're considering for using their service for the journal. |
Okay. I'll email you the gmail details. If you could do it now(ish) I can be sure to approve the login when the big WARNING sign flashes up on my phone :) (google authentication has caused problems before when sharing access) |
@hawc2 How are you getting on with this? Need a hand? |
Temple's Law Library has held up actually creating a Programming Historian
account but I'm starting to experiment with our department's blog. I'll try
to move things forward on my end and follow up. I could definitely use help
evaluating how this would be best tested and used for PH. It's going to be
quite time consuming
…On Fri, 30 Apr 2021 at 03:09, James Baker ***@***.***> wrote:
@hawc2 <https://github.com/hawc2> How are you getting on with this? Need
a hand?
|
Okay. Too time consuming to be worth it? I guess what we are suggesting here is: a) all future articles use perma.cc for links; b) when link rot occurs in published articles, perma.cc is used to fix links (that is, we aren't going to go through and make perma.cc links for all published articles). Right? |
I got the PH account set up through Temple Law Library now. I can request
any PH editor to be added - I just need names and emails. I sent a separate
email to you James to discuss.
Agreed our goals are a) and b) first and foremost, but I'm not sure if b)
is something you can do retroactively in some cases? Doing it for all
published articles might be a good long term goal, but we should see about
a) and b) before assessing whether anything more would be worth the time.
Do we have any sense or data on how much link rot currently affects PH
tutorials?
…On Tue, 4 May 2021 at 03:46, James Baker ***@***.***> wrote:
Okay.
Too time consuming to be worth it? I guess what we are suggesting here is
a) all future articles use perma.cc for link b) when link rot occurs in
published articles, perma.cc is used to fix links (that is, we aren't going
to go through and make perma.cc links for all published articles)
Right?
|
Thanks for getting this set up Alex 👏🏽 ! There's no set number on how often this happens, but I do think it's easily once every month or so that we find a broken link for various reasons. I agree that focusing on future and currently breaking links is the right direction and that we can over time move all lessons to using perma.cc. I think an additional next step is writing up documentation for editors to use perma.cc. Right now our tech documentation is long and not broken up easily by topic, so I would recommend potentially starting a new page for fixing broken links and we can work on archiving the existing instructions. Let me know if you need help with this Alex and thanks again for taking the lead on this 🙌🏽 |
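To put a rough number on how often links break, a small checker could be run over the URLs in a lesson. The following is only a minimal sketch using the Python standard library: the function names `http_status` and `broken_links` are hypothetical, it is not part of any existing PH tooling, and it assumes that a HEAD request is a good enough probe (some servers reject HEAD, so results would need a manual second look).

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (OSError, ValueError):
        # DNS failure, timeout, refused connection, or a malformed URL.
        return 0


def broken_links(
    urls: Iterable[str],
    get_status: Callable[[str], int] = http_status,
) -> Dict[str, int]:
    """Map each URL whose status is outside the 200-399 range to that status."""
    failures = {}
    for url in urls:
        status = get_status(url)
        if not 200 <= status < 400:
            failures[url] = status
    return failures
```

Passing the status function in as a parameter keeps the logic testable without network access; running it on a schedule and logging the output would give a simple record of link rot over time.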
Two thoughts (that came up in an email thread with @hawc2 ):
|
And thanks @ZoeLeBlanc for contributing. I'm aware that it is often @programminghistorian/technical-team members who resolve issues with broken links. |
Update on providing access to perma.cc: all PH members can now access our perma.cc account through our programminghistorian@gmail.com account. @drjwbaker has the account access info. Agree with @ZoeLeBlanc that we should create documentation this summer for using perma.cc. For now, we'll plan to test it out on specific broken links? I'm happy to help lead the effort but will need some onboarding on how we're handling the problem currently. It makes sense to me to integrate this with @rivaquiroga's Lesson Maintenance Workflow. |
Is this making progress, @hawc2? (And do we know what steps would look like progress?) |
Now that we have general access to use it, could we have a meeting to discuss how to proceed, both with testing and implementation? I am still learning the ropes of some PH processes, so I'm not sure who should be involved and what are the most efficient ways to integrate perma.cc into our workflows. It shouldn't be a hard tool to use, but as @acrymble mentioned, there are some complex decisions to consider, and it will be very time-consuming to remediate old lessons. |
It feels like the aim is to get it into the author/editor guidelines as our preferred implementation of URLs where we do not expect the content at those URLs to change / be usefully dynamic (as @acrymble notes). A route to implementation might be to test this with a live article submission (perhaps one you edit?) but that decision is better made by a Managing Editor than me. Perhaps we can add this as a discussion point for our next Project Team Call: @mariajoafana will this be in July? |
Hello @hawc2. I'd like to be part of this conversation! |
Per our team meeting discussion on July 28 #2159, @Anisa-ProgHist and I will test out perma.cc for a PH lesson using the one I just finished editing, currently under copyedit stage with Anisa, issue #325 in ph-submissions: https://programminghistorian.github.io/ph-submissions/lessons/clustering-with-scikit-learn-in-python. As we finalize this lesson for publication, we'll try to develop some basic standards for use of perma.cc to deal with link 'rot' and 'growth' for further editors. We'll also track how long the process takes us. While the copyediting stage makes most sense for integrating perma.cc, decisions still need to be made about who will do this labor regularly going forward. |
Thinking about citations, and wondering whether it would be useful to include both the original URL and the perma.cc URL in our bibliographies/footnotes. e.g., http://ceur-ws.org/Vol-2253/paper22.pdf archived at https://perma.cc/--- Looking at the recently published lesson Detecting Text Reuse with Passim, I notice that the citation format used doesn't expose the original URL, rather embeds it within the word 'Link'. Greta Franzini, Maria Moritz, Marco Büchler, Marco Passarotti. Using and evaluating TRACER for an Index fontium computatus of the Summa contra Gentiles of Thomas Aquinas. In Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). (2018). Link Going forward, I feel that especially when a link isn't archived at perma.cc, it is useful if we can expose original URLs (this may include considering a system for truncating excessively long URLs/those which include queries) because URLs give readers information about sources. Q: Are we still aiming to use the Chicago Manual of Style format as our template? |
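One way the "original URL plus perma.cc URL" citation idea could be made concrete is a small formatting helper. This is a minimal sketch under stated assumptions: the function names, the 60-character display limit, the "(archived at ...)" wording, and the placeholder Perma link in the usage note are all hypothetical, not an agreed PH or Chicago-style convention.

```python
from typing import Optional


def shorten_url(url: str, max_len: int = 60) -> str:
    """Return url unchanged if it fits, otherwise cut it and append '...'."""
    if len(url) <= max_len:
        return url
    return url[: max_len - 3] + "..."


def cite_with_archive(
    reference: str,
    original_url: str,
    perma_url: Optional[str] = None,
    max_len: int = 60,
) -> str:
    """Build a citation that exposes the original URL and, if present, the archived one."""
    shown = shorten_url(original_url, max_len)
    if perma_url:
        return f"{reference} {shown} (archived at {perma_url})."
    return f"{reference} {shown}."
```

For example, `cite_with_archive("Franzini et al. (2018).", "http://ceur-ws.org/Vol-2253/paper22.pdf", "https://perma.cc/XXXX-XXXX")` keeps the source domain visible to readers; the shortened form is meant for display only, so the full URL would still need to be kept as the link target.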
Also, this guide looks useful: https://guides.law.stanford.edu/c.php?g=588091&p=4063422 It shows how it is possible to 'batch create' links, and organise links within folders. Both these features will be useful to us. PDFs can also be archived. This could be useful for an example such as that given above (http://ceur-ws.org/Vol-2253/paper22.pdf) of conference proceedings which don't have a DOI. |
I suspect that Submission #348 is an unusual case, but it does raise some interesting challenges. It included several links to interactive games which are currently playable on the live web. Perma.cc cannot effectively render this kind of complex content, so upon following the link I think readers would be dissatisfied. However, readers could choose to either click through to ‘See the Screenshot View’ to see a page that looks like the original webpage, or click through to ‘View the Live Page’ from where they will be able to get started playing the game(s) for as long as it/they exist(s) on the web. In case anyone following this thread is interested, those instances are as follows:
Interestingly,
Links to YouTube playlists are also problematic. The page ‘looks’ right, but each individual video has a unique URL (in fact, they have multiple URLs, depending upon whether the Playlist is played through start to finish, or if a video is selected individually)
|
Thanks Anisa, this is just what we were hoping to test. The YouTube issue
is more surprising than itch.io games. I can reach out to perma.cc to hear
their perspective on the problem of archiving dynamic content and emulation
systems. I also might consult a couple scholars who work in archiving
digital games.
This article suggests webrecorder may succeed in some cases where perma.cc
has not:
https://blogs.bl.uk/webarchive/2019/03/archiving-interactive-fiction.html.
Want to test it out?
The ability to move to the live link through perma.cc makes it still a
viable option. But once we've identified any other discrepancies with
specific lesson types, we should have a discussion with the managing
editors about whether perma.cc's rendering of dynamic pages is too
cumbersome for readers for us to use it in those cases.
…On Sun, Sep 5, 2021, 9:10 AM Anisa Hawes ***@***.***> wrote:
I suspect that Submission #348
<programminghistorian/ph-submissions#348> is an
unusual case, but it does raise some interesting challenges.
It included several links to the interactive games which are currently
playable on the live web. Perma.cc cannot effectively render this kind of
complex content, so upon following the link I think readers would be
dissatisfied. However, readers could choose to either click through the to
‘See the Screenshot View’ to see a page that *looks like* the original
webpage, or click through to ‘View the Live Page’ from where they will be
able to get started playing the game(s) for as long as it/they exist(s) on
the web.
In case anyone following this thread is interested, those instances are as
follows:
- Para. 81 + Line 406 *Depression Quest*
<http://www.depressionquest.com/> where the Perma page generated would
look/function like this <https://perma.cc/T54Z-6FYN>
- Para. 123 a more fleshed out version of this game
<https://gkirilloff.itch.io/first-day-in-the-office> where the Perma
page generated would look/function like this
<https://perma.cc/SQ2R-85A8>
- Line 405 *A Witch’s Word*
<https://rainbowstarbird.itch.io/a-witchs-word> where the Perma page
generated would look/function like this <https://perma.cc/YLG6-28JH>
- Line 407 *Queers in Love at the End of the World*
<https://w.itch.io/end-of-the-world> where the Perma page generated
would look/function like this <https://perma.cc/8AVK-T8X7>
- Line 409 *The Uncle Who Works for Nintendo*
<https://ztul.itch.io/the-uncle-who-works-for-nintendo> where the
Perma page generated would look/function like this
<https://perma.cc/BQ39-QGUF>
Interestingly,
- Line 408 *September 7th, 2020*
<https://caitkirby.com/downloads/Fall%202020.html> is an example of
where perma.cc has achieved a successful capture, which you can interact
with here <https://perma.cc/GP6X-RARD> so I have included this.
Links to YouTube playlists are also problematic. The page ‘looks’ right,
but each individual video has a unique URL (in fact, they have multiple
URLs, depending upon whether the Playlist is played through start to
finish, or if a video is selected individually)
- Line 415 *Dan Cox Youtube Twine Tutorials*
<https://www.youtube.com/playlist?list=PLlXuD3kyVEr7bucZtQPpOZHjbUuGKaf2V>
where the Perma page generated would look/function like this
<https://perma.cc/G4KS-Y9QY>
|
Ah! Yes! I almost included in my previous comment that, when I am not at PH, I am a freelance web archivist and I use Webrecorder daily! It is my tool of choice: brilliantly powerful. Definitely capable of capturing these interactive games - I have tested it to archive several similarly complex sites/artefacts in the past. I know the web archivists at the British Library very well, including those involved in the Collecting Interactive Digital Narratives project, and those who launched the research that became the Emerging Formats initiative. Capturing individual YouTube videos via their canonical URLs works well, and it is also possible to capture YT embeds on other websites, but Playlists pose particular challenges because of the number of URLs associated with each individual video (can be 10 or more). I would be happy to share some examples and more information. |
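Since individual videos capture well via their canonical URLs, one possible preparatory step is to reduce the many playlist-context URLs of a video to a single form before archiving. The sketch below is an illustration only: the function name is hypothetical, it assumes the `watch?v=<id>` form is the canonical one, and it handles only the common `youtube.com/watch` and `youtu.be` patterns.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def canonical_youtube_url(url: str) -> Optional[str]:
    """Return the plain watch URL for a video link, or None if no video ID is found."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "youtu.be":
        # Short links carry the video ID in the path.
        video_id = parts.path.lstrip("/")
    elif host in YOUTUBE_HOSTS:
        # Watch links carry it in the 'v' query parameter; 'list' and 'index' are dropped.
        video_id = parse_qs(parts.query).get("v", [""])[0]
    else:
        return None
    if not video_id:
        return None
    return f"https://www.youtube.com/watch?v={video_id}"
```

A bare playlist URL has no `v` parameter, so the function returns `None` for it; each video in the playlist would have to be listed and archived separately.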
The developers of Webrecorder are among my direct contacts, and I'd be delighted to chat with them about our use case ✨ |
Per @anisa-hawes and @hawc2's introduction at #2223: given the labour involved in using perma.cc, is there a case with future new articles for a) encouraging authors to only include essential links, and b) discouraging authors from pointing to complicated links (e.g. YouTube playlists)? Both of these can be justified under our sustainability criteria https://programminghistorian.org/en/reviewer-guidelines#sustainability |
@anisa-hawes how much additional time would you say perma.cc linking added to the copyedit stage? Given that was your first time, how much faster do you think it could become? @drjwbaker the perma.cc process definitely made apparent a number of ways we could clarify guidelines for authors/editors on when to use links and what kind. Reducing links overall isn't a bad idea, and we could ask people to avoid some kinds of unnecessary links to dynamic sites. But I think the jury is still out on our ability to preserve interactive media like games, so I think we should investigate further first |
Here is a brief summary of what I said (although did not express as clearly as I would have liked) at today's Project Team Meeting:
|
Yes, I think this is something we could consider... In one of the two lessons I read, I found that the author had doubled up on links multiple times, rather than defining it/providing a link upon first mention only. Elsewhere, in that lesson I found myself suggesting additional links to define technical terms. I wonder how typical these two lessons were in terms of the number of links they included? |
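To see how typical that doubling-up is, repeated links could be counted directly from a lesson's Markdown source. This is a minimal sketch, not an existing PH script: the function name is hypothetical, the regular expression only matches inline links of the form `[text](http...)`, and it will mis-handle URLs that themselves contain parentheses.

```python
import re
from collections import Counter
from typing import Dict

# Inline Markdown links only: [label](http://...) or [label](https://...)
LINK_RE = re.compile(r"\[[^\]]*\]\((https?://[^\s)]+)\)")


def repeated_links(markdown: str) -> Dict[str, int]:
    """Return each URL linked more than once, with the number of times it appears."""
    counts = Counter(LINK_RE.findall(markdown))
    return {url: n for url, n in counts.items() if n > 1}
```

Running this over a few published lessons would give a rough baseline for how many links are duplicates and therefore candidates for linking on first mention only.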
Thanks for the summary @anisa-hawes. I think..
..is ultimately the key positive. So long as we have an infrastructure where updates fail because a link on another part of the site has gone down, perma.cc has the advantage of reducing our exposure to that, thus gradually making working with the site much easier. |
Personally, I think some authors use links in our articles as they would in a blog rather than a journal, because we've always encouraged it, and we are now seeing the downside as links break and cause work. Now, I don't want to encourage the inflexibility of journal policies towards links/urls, but I think we could advise more parsimonious use of links and/or a use of links that is clearly justifiable/justified. |
I estimate that it added another couple of hours to copyediting, but it felt worthwhile for the reasons explained above. But, you are right to observe that the process can be speeded up as I become more familiar with the workflow. I'm not certain how often authors link out to YouTube Playlists / individual videos or exceptionally complex content (e.g. the interactive narratives), but I think it's good if we have a workflow in place for if they do – because this content isn't robust. Indeed, the author of the interactive narratives commented on their instability. |
That's an interesting thought, @drjwbaker. Thank you! |
In another recent Issue, we were talking about updates to the research/investigacion/recherche/pesquisa pages. I note that links on these pages break frequently. Perhaps these are good candidates for perma.cc overhauls too! |
I am currently finalising a draft of revised Editorial Guidelines (to be tested in an Onboarding pilot study with the English team this autumn) which include detailed steps for the Copyediting phase of the workflow. My draft integrates step-by-step instructions for link archiving using perma.cc, but recognises that it doesn't have to be the same person who undertakes both tasks. For example, I could perform the link archiving task across all four languages. Going forwards, I think we could consider integrating use of Webrecorder tools to stabilise (and ensure sustainable access to) the kinds of complex online content (interactives, video, 3D models, etc.) we are likely to encounter more frequently in the future. I've added this as an idea for one of our Longer-term Goals within our shared planning document. Following this successful pilot study, I am closing this Issue. |
I came across perma.cc today and was wondering if it could be useful for the Programming Historian to ensure its weblinks are more 'permanent.'
After chatting with @walshbr and @ZoeLeBlanc I'm opening this issue so we could do some research and see if we should be using it instead of or in conjunction with the web archive