Fallback to archive.org URLs for failed downloads of FOSS packages #2284
This is awesome! I've wanted this feature for ages, but never quite had the time to get back into C#!
/// Return an archive.org URL for this download, or null if it's not there.
/// The filenames look a lot like the filenames in Net.Cache, but don't be fooled!
/// Here it's the first 8 characters of the SHA1 of the DOWNLOADED FILE, not the URL!
/// </summary>
Yeah, the hash of that file will remain consistent. The url may not.
Produces a filename based on the first 8 digits of the SHA1 hash,
the 'identifier', and the 'version' in the metadata if the
download_hash exists. Returns '0' if there is no download hash
or if the file has a content type other than zip/gz/tar/tar.gz.
There are some tests that ensure the correct filenames are generated.
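For readers comparing the two 8-character prefixes, here is a minimal sketch of the archive-side naming described in the docstring above. The class and method names are hypothetical, and the exact ordering, the `-` separators, and the `.zip` extension are assumptions for illustration; only the three ingredients (first 8 characters of the file's SHA1, identifier, version) come from this thread.

```csharp
using System;

// Hypothetical helper, not part of CKAN or NetKAN-bot.
public static class ArchiveNaming
{
    // Builds "<first 8 chars of file SHA1>-<identifier>-<version>.zip".
    // The hash is the SHA1 of the DOWNLOADED FILE, not of the URL.
    // Returns null when no usable hash is available (the bot's docstring
    // returns '0' in that case; null is used here to match the C# comment).
    public static string ArchiveFileName(string sha1OfFile, string identifier, string version)
    {
        if (string.IsNullOrEmpty(sha1OfFile) || sha1OfFile.Length < 8)
        {
            return null;
        }
        // Assumed layout: prefix, identifier, and version joined by hyphens.
        return $"{sha1OfFile.Substring(0, 8)}-{identifier}-{version}.zip";
    }
}
```

Because the prefix depends only on the file contents, it stays the same if the mod's download URL moves, which is the property discussed in this thread.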
Yup, either way makes enough sense to me. I just wanted to note it explicitly since we have two different 8-digit hexadecimal filename prefixes floating around, and it's not easy to tell that they're (supposed to be) different at a glance.
Yeah, hashing the URL actually makes sense for the cache. If the URL changes, you probably do want to re-download.
: null;
}
}

/// <summary>
Some tests of the filename generation could be useful, but I wouldn't treat this as a blocker.
Background
SpaceDock broke today, which is fun.
VITAS reports that it looks like a denial of service attack.
The NetKAN bot has been uploading all permissively-licensed mods to https://archive.org/details/kspckanmods for quite a while now. Many or most of the SpaceDock downloads that are now failing could in principle fall back to archive.org URLs. Such a feature could also help us deal with GitHub's download throttling or SpaceDock's certificate expirations.
Changes
Now if the primary download fails for a permissively licensed mod, we try to find it on archive.org as a fallback.
Internally this involves defining a fallbackUrl property in DownloadTarget and NetAsyncDownloaderDownloadPart, both populated from a new CkanModule.InternetArchiveDownload property, which itself checks a new License.Redistributable property based on a new copy of the list from NetKAN-bot. Then if a download fails, we check whether it has a fallback URL; if it does and we haven't already tried it, we try it. If the fallback fails as well, then we continue with the normal steps for a failed download.
Fixes #1682.