Fetch docs from remote repos via git clone instead of individually requesting every file from GitHub #3951
Labels: good first issue
`GitHubRepoFetcher` is super unkind to GitHub's HTTP API. We're basically crawling every remote repo listed in `data/repos.yml` and issuing an HTTP request for every file in its `docs/` directory (including subdirectories) on startup.

This makes for a miserable developer experience when trying to preview changes to documentation. Startup takes many minutes and often fails because we hit rate limits (sometimes even when using an API token!). The worst part is that the tests take forever to run and depend on thousands of network requests all succeeding, to endpoints outside our control.
It'd be simpler, faster and more reliable to just clone the remote repos and read the `.md` docs from the local filesystem. We wouldn't even have to download the whole of each repo: it's possible to fetch just the `docs/` directory at the head of the default branch by using `clone --filter` with `sparse-checkout`, as sketched below.

This would also let us ditch our homegrown cache mechanism, because the files will just stick around when developing locally, and we can use the built-in cache in GitHub Actions.