Fetch docs from remote repos via git clone instead of individually requesting every file from GitHub #3951
Labels: good first issue
`GitHubRepoFetcher` is super unkind to GitHub's HTTP API. We're basically crawling every remote repo listed in `data/repos.yml` and issuing an HTTP request for every file in its `docs/` directory (including subdirectories) on startup.

This makes for a miserable developer experience when trying to preview changes to documentation. Startup takes many minutes and often fails because we hit rate limits (sometimes even when using an API token!). The worst part is that the tests take forever to run and depend on thousands of network requests all succeeding, to endpoints outside our control.
It'd be simpler, faster and more reliable to just clone the remote repos and read the `.md` docs from the local filesystem. We wouldn't even have to download the whole of each repo: it's possible to fetch just the `docs/` directory at the head of the default branch by using `clone --filter` with `sparse-checkout`, as sketched below.

This would also let us ditch our homegrown cache mechanism, because the files will just stick around when developing locally, and we can use the built-in cache in GitHub Actions.