Why github downloads suck

rofl0r edited this page Nov 29, 2021 · 9 revisions

Abstract

This document lists reasons why you should manually draft proper release tarballs rather than rely on the automatic "download-from-tag" feature of github. Luckily, github offers a feature called "draft a release" which lets you attach a manually created tarball to a tag. Used like this, only the criticism about HTTPS-only downloads (see below) applies.

No choice of compression algorithm

Github tag downloads are only available as .zip (which is for windows people) and .tar.gz. Usually one would want the download as .tar.xz, which can compress up to 50% better and so has a considerable impact on download times and harddisk usage.
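
To get a feeling for the difference on your own sources, the two formats can be compared with nothing but the Python standard library. This is only a rough sketch: `compressed_sizes` is a made-up helper name, the numbers depend heavily on the input, and the "up to 50%" figure above is not something this snippet proves.

```python
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` uncompressed, as gzip and as xz."""
    return {
        "raw": len(data),
        "gz": len(gzip.compress(data, compresslevel=9)),
        "xz": len(lzma.compress(data, preset=9)),
    }


# Hypothetical usage: feed it the bytes of an uncompressed tarball, e.g.
#   with open("foo-1.0.4.tar", "rb") as f:
#       print(compressed_sizes(f.read()))
```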

Without guarantee of integrity

Github downloads are created on the fly and come without any guarantee that the checksum or filesize will stay the same once the github servers are updated, for example to a newer version of gzip. Any such change breaks automated download scripts which do checksum verification.
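
The effect is easy to reproduce locally. The sketch below compresses the same payload twice with different gzip settings, as a stand-in for "the server changed its gzip". The payload and helper names are invented for illustration, and the `mtime` argument of `gzip.compress` needs Python 3.8 or newer.

```python
import gzip
import hashlib


def sha256_hex(blob: bytes) -> str:
    """Hex SHA-256 of a byte string, as a download script would pin it."""
    return hashlib.sha256(blob).hexdigest()


def gzip_variants(payload: bytes) -> tuple:
    """Compress the same payload twice with different gzip settings."""
    fast = gzip.compress(payload, compresslevel=1, mtime=0)
    best = gzip.compress(payload, compresslevel=9, mtime=0)
    return fast, best


# Made-up stand-in for the contents of a source tarball.
payload = b"".join(b"line %d of some source file\n" % i for i in range(5000))
fast, best = gzip_variants(payload)

# The unpacked content is identical ...
same_content = gzip.decompress(fast) == gzip.decompress(best)
# ... but the checksum of the compressed file is expected to differ.
same_checksum = sha256_hex(fast) == sha256_hex(best)
print(same_content, same_checksum)
```

A script that pins the checksum of the .tar.gz itself fails in the second case even though nothing in the source code changed.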

autoconf-dilemma

many projects use autoconf, which is designed such that the developer of the source code package creates a configure script as part of the release workflow. since configure scripts are huge, generated, and change often, they are usually not checked into the source code repo. as a result, lazy maintainers leave the onus of creating the portable configure script to the user, who has to have the right versions of autoconf, automake, libtool, and all the m4 macros of all the used libraries. usage of "autogen.sh" puts a lot of burden and dependencies on the consumer, which can easily be avoided if the portable configure script is created as part of a proper "make dist" workflow, which results in a designated release tarball hosted on some other service than github.

missing submodules

if git submodules are used, their sources are not added to the github tarball. as such, the tarball is basically worthless, since the user needs to have a git client installed anyway to check out the missing submodules.
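
a build script can at least detect this situation early. the sketch below assumes that the extracted tree still contains the tracked .gitmodules file and that the submodule directories are absent or empty; `missing_submodules` is a hypothetical helper, and the regex is a simplification of git's config syntax, not a full parser.

```python
import os
import re


def missing_submodules(tree: str) -> list:
    """List submodule paths from .gitmodules that are absent or empty in `tree`."""
    gitmodules = os.path.join(tree, ".gitmodules")
    if not os.path.isfile(gitmodules):
        return []
    with open(gitmodules, encoding="utf-8") as f:
        # Simplified parsing: pick up every "path = ..." line.
        paths = re.findall(r"^\s*path\s*=\s*(.+?)\s*$", f.read(), flags=re.MULTILINE)
    missing = []
    for path in paths:
        full = os.path.join(tree, path)
        if not os.path.isdir(full) or not os.listdir(full):
            missing.append(path)
    return missing
```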

naming scheme

github downloads usually derive the filename from the tag name. if the tag name is "v1.0.4", the resulting tarball will be named "v1.0.4.tar.gz". bad luck if you need to download from two github projects that both use the same version number: one project will overwrite the tarball of the other. additionally, it's hard to say what "v1.0.4.tar.gz" actually is. if you happen to find such a tarball on your harddisk, you'll need to use tar tf to look at its contents (or even unpack it and look into some files) to remember the project it originates from.
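
a download script can work around the clash by picking its own local filename. the following sketch assumes a URL layout of the form https://github.com/OWNER/REPO/archive/.../TAG.tar.gz, which is an assumption about github's current scheme and may change; `local_name` and the example URL are made up for illustration.

```python
from urllib.parse import urlparse


def local_name(url: str) -> str:
    """Derive a 'repo-version.tar.gz' style filename from a tag download URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # Assumed layout: [owner, repo, "archive", ..., "<tag>.tar.gz"]
    repo = parts[1]
    basename = parts[-1]
    # Drop a leading "v" only when it is followed by a digit (v1.0.4 -> 1.0.4).
    if basename[:1] == "v" and basename[1:2].isdigit():
        basename = basename[1:]
    return f"{repo}-{basename}"


print(local_name("https://github.com/example/foo/archive/v1.0.4.tar.gz"))
```

with this, two projects that both tag "v1.0.4" end up in differently named files.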

mismatch of tarball name and extracted directory

usually it's considered good practice if a tarball named foo-1.0.4.tar.gz extracts to a directory named "foo-1.0.4". this is done right in about 90% of all packages used by sabotage and makes it easier to deal with from automated scripts. with github, however, the tarball will be named "v1.0.4.tar.gz" and the extracted directory "foo-1.0.4".
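
whether a given tarball follows that convention can be checked without unpacking it. this is a minimal sketch using Python's tarfile module; the helper names are hypothetical, and archives whose members start with "./" are not handled.

```python
import os
import tarfile


def top_level_dirs(tarball_path: str) -> set:
    """Return the set of first path components of all members in the tarball."""
    with tarfile.open(tarball_path) as tar:
        return {name.split("/", 1)[0] for name in tar.getnames() if name}


def expected_dir(tarball_name: str) -> str:
    """Strip a known tarball suffix: foo-1.0.4.tar.gz -> foo-1.0.4."""
    for suffix in (".tar.gz", ".tar.xz", ".tar.bz2", ".tgz"):
        if tarball_name.endswith(suffix):
            return tarball_name[: -len(suffix)]
    return tarball_name


def name_matches_dir(tarball_path: str) -> bool:
    """True if the tarball extracts to exactly one directory named after it."""
    wanted = expected_dir(os.path.basename(tarball_path))
    return top_level_dirs(tarball_path) == {wanted}
```

for a file saved as "v1.0.4.tar.gz" that contains "foo-1.0.4/", `name_matches_dir` returns False, which is the mismatch described above.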

HTTPS only

HTTPS is a nice feature to have in addition to regular HTTP downloads. HTTPS-only, though, is an unnecessary restriction and makes it impossible to download with older clients or embedded devices (think NAS, routers, home-theater boxes etc.), which typically ship a busybox wget that doesn't implement support for HTTPS. Additionally, HTTPS downloads are slower due to crypto overhead, and in case of high packet loss it's almost impossible to establish a TLS connection to begin with.

Unstable urls

As history has shown, Microsoft Github may change their URL naming scheme at any point, so your download is no longer available at the same URL (if at all).