Please consider moving gh-pages to another repository and/or reduce the size of the branch #2048

Open
vlsi opened this issue Oct 7, 2019 · 22 comments

Comments

@vlsi
Contributor

vlsi commented Oct 7, 2019

Cloning https://github.com/junit-team/junit5 takes a long time. It downloads 500+ MiB, and the resulting repository is full of PDF files.

Here are the top consumers:

hash          bytes    path
fffea6e6616e  3812203  docs/5.5.1/user-guide/index.pdf
d3f697d9c019  3812203  docs/5.5.0/user-guide/index.pdf
30d099b3af05  3812203  docs/5.5.2/user-guide/index.pdf
6267c2a59eb1  3812011  docs/snapshot/user-guide/index.pdf
...

and so on.
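
A listing like the one above can be produced from any full clone. The following is a minimal sketch, not the command the author used: it assumes `git` is on the PATH, and the helper names (`parse_blob_sizes`, `largest_blobs`, `scan_repository`) are hypothetical. It ranks blobs by size from the output of `git rev-list --objects --all` piped through `git cat-file --batch-check`.

```python
import subprocess

BATCH_FORMAT = "%(objecttype) %(objectname) %(objectsize) %(rest)"


def parse_blob_sizes(lines):
    """Parse `git cat-file --batch-check` lines into (size, hash, path) tuples."""
    blobs = []
    for line in lines:
        parts = line.rstrip("\n").split(" ", 3)
        # Keep only blobs; commits and trees are skipped.
        if len(parts) == 4 and parts[0] == "blob":
            _, sha, size, path = parts
            blobs.append((int(size), sha, path))
    return blobs


def largest_blobs(lines, limit=10):
    """Return the `limit` largest blobs, biggest first."""
    return sorted(parse_blob_sizes(lines), reverse=True)[:limit]


def scan_repository(repo_dir, limit=10):
    """Run git in `repo_dir` (assumed to be a full clone) and rank its blobs."""
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    sizes = subprocess.run(
        ["git", "cat-file", f"--batch-check={BATCH_FORMAT}"],
        cwd=repo_dir, check=True, capture_output=True, text=True, input=objects,
    ).stdout
    return largest_blobs(sizes.splitlines(), limit)
```

Calling `scan_repository(".")` inside a clone would return tuples of size, hash, and path. The parsing is kept separate from the git invocation so it can be checked without a repository.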

This affects both the regular development experience (everybody is used to just running git clone $url) and GitHub Actions CI, where the "checkout action" takes 1 minute.

Note: the GitHub Action could probably be improved to skip the gh-pages branch; however, the issue for humans would still be there.

  1. Are PDFs required? Could they be pushed somewhere else? Do all the snapshots need to be stored in the main repository?

  2. Could you please consider using noTimestamp for the Javadoc? It avoids printing timestamps and thus reduces the changes in the HTML files.
    See "Use notimestamp for JavaDoc and notimestamp+noversionstamp for GroovyDoc" (gradle/gradle#8619).

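Sample:

// Assumption: this runs inside a `tasks { ... }` block of a Gradle Kotlin DSL build script
withType<Javadoc>().configureEach {
    (options as StandardJavadocDocletOptions).apply {
        // Omit the generation timestamp so regenerated HTML changes less between builds
        noTimestamp.value = true
    }
}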
@vlsi
Contributor Author

vlsi commented Oct 7, 2019

Adding fetch-depth: 50 might help (to a degree); however, it looks like the current GitHub Actions checkout always fetches all the branches in the repository: actions/checkout#22 :(
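
For local clones, a similar effect can be approximated by hand. The sketch below only builds the command line. The function name `shallow_clone_command` and the default branch name `main` are assumptions for illustration, while `--depth`, `--single-branch`, and `--branch` are standard `git clone` options.

```python
def shallow_clone_command(url, branch="main", depth=50):
    """Build a `git clone` argv that fetches one branch with truncated history."""
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    return [
        "git", "clone",
        "--depth", str(depth),   # keep only the most recent commits
        "--single-branch",       # do not fetch other branches such as gh-pages
        "--branch", branch,      # assumed branch name; adjust to the repository
        url,
    ]


if __name__ == "__main__":
    print(" ".join(shallow_clone_command("https://github.com/junit-team/junit5")))
```

The returned list can be passed to `subprocess.run` as is. Because only one branch is requested, the history of gh-pages is never downloaded.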

AFAIK https://github.com/meetup/express-checkout does not help (it does not understand pull requests)

@marcphilipp
Member

I agree that we should improve the current situation. 🙂

We'll have to be careful not to break existing URLs, i.e. junit.org/junit5/... should continue to work.

Does anyone know what would happen if we moved the gh-pages branch to a new junit5 subdirectory in https://github.com/junit-team/junit-team.github.io?

@vlsi
Contributor Author

vlsi commented Oct 8, 2019

Just wondering: do you need all the Git history for gh-pages?
What if gh-pages were truncated (e.g. to keep only the 10 most recent commits)?

@marcphilipp
Member

I think that would be ok. Or we could keep the manual commits and squash all others?

@marcphilipp
Member

FWIW we already set noTimestamp:

@vlsi
Contributor Author

vlsi commented Oct 8, 2019

> Or we could keep the manual commits and squash all others?

It could make sense. For instance, if the latest commit is automatic, then amend it + force-push-with-lease.
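
One way to read this suggestion, sketched under assumptions: the publishing job looks at the message of the current tip of gh-pages and amends that commit when it was produced by automation. The message `AUTO_MESSAGE` and the function `publish_commands` are hypothetical, and the changes are assumed to be staged already. `git commit --amend` and `git push --force-with-lease` are real git options.

```python
AUTO_MESSAGE = "Publish snapshot documentation"  # hypothetical automation message


def publish_commands(tip_message, auto_message=AUTO_MESSAGE):
    """Return the git commands a docs-publishing job would run.

    If the branch tip is itself an automatic commit, replace it instead of
    stacking a new commit on top, so history does not grow on every build.
    Changes are assumed to be staged before these commands run.
    """
    if tip_message.strip() == auto_message:
        return [
            ["git", "commit", "--amend", "--no-edit"],
            ["git", "push", "--force-with-lease", "origin", "gh-pages"],
        ]
    return [
        ["git", "commit", "--message", auto_message],
        ["git", "push", "origin", "gh-pages"],
    ]
```

With this rule a manual commit is never rewritten. Only a run of automatic commits collapses into one.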

@marcphilipp marcphilipp modified the milestones: 5.6 M1, 5.6 M2 Oct 10, 2019
@marcphilipp
Member

I created a script based on one I found on StackOverflow that squashes subsequent commits with the same message. The total number of commits on the gh-pages branch would go down from 3347 to 331.
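
The script itself is not shown in the thread. As an illustration of the idea only, the sketch below groups a chronological list of `(hash, message)` pairs into runs of identical consecutive messages and keeps the newest commit of each run, which is the commit whose tree a squashed commit would carry. `squash_plan` and the sample history are hypothetical, and the real script may differ.

```python
from itertools import groupby


def squash_plan(commits):
    """Collapse consecutive commits that share a message.

    `commits` is an iterable of (hash, message) pairs, oldest first.
    Returns one (hash, message, squashed_count) tuple per run, where `hash`
    is the newest commit of the run.
    """
    plan = []
    for message, run in groupby(commits, key=lambda commit: commit[1]):
        run = list(run)
        newest_hash = run[-1][0]
        plan.append((newest_hash, message, len(run)))
    return plan


# Made-up history for illustration; not taken from the junit5 repository.
history = [
    ("a1", "Publish snapshot docs"),
    ("a2", "Publish snapshot docs"),
    ("b1", "Fix broken link"),
    ("c1", "Publish snapshot docs"),
]
for entry in squash_plan(history):
    print(entry)
```

Only adjacent commits are merged, so a manual commit between two automatic ones keeps its place in the history.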

@junit-team/junit-lambda Are you ok with me force-pushing that to gh-pages? Everyone who previously checked out the branch would have to do a hard reset in their clone/fork.

@sormuras
Member

👍

Pruning old or even all versions of committed PDF files would shrink the repository size even further.

@marcphilipp
Member

@vlsi
Contributor Author

vlsi commented Dec 1, 2019

Are you going to shrink the PDFs there as well?

@marcphilipp
Member

@vlsi How would you shrink them?

@vlsi
Contributor Author

vlsi commented Dec 1, 2019

I assume nobody really needs historical snapshot PDFs.
So you could keep just the latest version of the file.
Or you could refrain from constantly pushing PDFs, as they do not diff well (i.e. Git can't compress PDF files effectively via delta encoding).

@marcphilipp
Member

I guess we could build them (in order to be sure they actually build) but not publish them for snapshots.

@vlsi
Contributor Author

vlsi commented Dec 1, 2019

That would work.
GitHub is not infinite, and it would be great to use the resources in a fair manner.

@marcphilipp
Member

I did that as a first step in order to stop its growth: cf19ffd

@marcphilipp marcphilipp modified the milestones: 5.6 M2, 5.7 M1 Jan 2, 2020
@marcphilipp marcphilipp modified the milestones: 5.7 M1, 5.7 Backlog Apr 5, 2020
@stale

stale bot commented May 13, 2021

This issue has been automatically marked as stale because it has not had recent activity. Given the limited bandwidth of the team, it will be automatically closed if no further activity occurs. Thank you for your contribution.

@stale stale bot added the status: stale label May 13, 2021
@stale

stale bot commented Jun 3, 2021

This issue has been automatically closed due to inactivity. If you have a good use case for this feature, please feel free to reopen the issue.

@stale stale bot closed this as completed Jun 3, 2021
@marcphilipp
Member

I think we should still do this.

@marcphilipp marcphilipp reopened this Jun 4, 2021
@stale stale bot removed the status: stale label Jun 4, 2021
@vlsi
Contributor Author

vlsi commented Jun 4, 2021

FYI, actions/checkout no longer fetches all branches and tags, so it is fast.

@marcphilipp marcphilipp removed this from the 5.8 Backlog milestone Jun 19, 2021
@stale

stale bot commented Jun 19, 2022

This issue has been automatically marked as stale because it has not had recent activity. Given the limited bandwidth of the team, it will be automatically closed if no further activity occurs. Thank you for your contribution.

@stale stale bot added the status: stale label Jun 19, 2022
@stale

stale bot commented Jul 11, 2022

This issue has been automatically closed due to inactivity. If you have a good use case for this feature, please feel free to reopen the issue.

@stale stale bot closed this as completed Jul 11, 2022
@vlsi
Contributor Author

vlsi commented Jul 11, 2022

I believe it is still relevant.
