Use external hosting for javadocs (link from README/wiki) to reduce Git repo size #3440

Closed
drekbour opened this issue Apr 2, 2022 · 31 comments


@drekbour
Contributor

drekbour commented Apr 2, 2022

I can't figure out why the javadocs are saved into the source-tree. I really can't understand why the full record of historic javadocs is stored there. It is about 60x the size of the /src tree!

I presume this is so GH hosts these but isn't there a better way?

$ du -s .[^.]* * | sort -nr
653988	docs
110124	.git
11828	src
164	release-notes
76	.mvn
44	.github
32	attic
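As a rough check of the "about 60x" figure, the ratios can be worked out from the listing above. This is only a sketch: it assumes the numbers are 1 KiB blocks (the GNU `du` default) and copies the three largest entries by hand.

```python
# Sizes copied from the `du -s` listing above (assumed to be 1 KiB blocks).
sizes_kib = {
    "docs": 653988,
    ".git": 110124,
    "src": 11828,
}


def ratio(numerator: str, denominator: str) -> float:
    """Return how many times larger one directory is than another."""
    return sizes_kib[numerator] / sizes_kib[denominator]


print(f"docs is {ratio('docs', 'src'):.1f}x the size of src")    # docs is 55.3x the size of src
print(f"docs is {ratio('docs', '.git'):.1f}x the size of .git")  # docs is 5.9x the size of .git
```

So the checked-out docs tree is roughly 55x the source tree, and about 6x the packed Git history.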
@drekbour drekbour added the to-evaluate Issue that has been received but not yet evaluated label Apr 2, 2022
@cowtowncoder
Member

Yes, Javadocs are hosted for new minor releases, to be linked to from project Wiki.

I am all ears for a better system, if this is an actual problem? (disk space is not super expensive these days)

@drekbour
Contributor Author

drekbour commented Apr 3, 2022

No maintainer experience to offer but I've used readthedocs.io many times

I'm betting you haven't seen this https://www.javadoc.io/doc/com.fasterxml.jackson.core/jackson-databind/latest/index.html

javadoc hosting for open source projects hosted on Central Maven
free, CDN enabled, new versions auto-detected within 24 hours
Supports Java, Scala, Groovy... any language that generates a -javadoc.jar

@cowtowncoder
Member

Ok, I think we might already have a link from README to external Javadocs; those that are based on Javadoc Maven bundles from Maven Central. So that could significantly simplify changes.

But I think one thing that would allow us to stop adding new javadocs (existing ones probably need to be kept at least for a while in case someone is linking to them?) would be simply changing the links in the Wikis to point to external Javadoc providers for specific versions.

I think this would be a great "new contributor" task to check.

@cowtowncoder cowtowncoder added good first issue Issue that seems easy to resolve and is likely a good candidate for contributors new to project and removed to-evaluate Issue that has been received but not yet evaluated labels Apr 4, 2022
@cowtowncoder cowtowncoder changed the title Better hosting for javadocs. Git repo is bloated Use external hosting for javadocs (link from README/wiki) to reduce Git repo size Apr 4, 2022
@drekbour
Contributor Author

drekbour commented Apr 4, 2022

kept at least for a while in case someone is linking to them

Unsure who would be linking to GH javadocs but I wouldn't encourage it by doing anything other than purging them. No one will thank you for keeping their Medium article hotlinked (or ever change those links).

My own experience is that, with modern IDEs fully automating -sources and -javadocs download, the only time I use externally hosted docs for anything is answering SO questions :)

@drekbour
Contributor Author

To keep this one vaguely moving - would you be against me going through each FasterXML/* repo: deleting anything generated in ./docs (and the config that persists them there), then replacing it with a single link to the hosting?
As before, the existing published artifacts are sufficient for javadoc to be auto-hosted here with no further effort:
https://www.javadoc.io/doc/com.fasterxml.jackson.module

@cowtowncoder
Member

Yes; if you could first replace links on Wiki:

https://github.com/FasterXML/jackson-databind/wiki

that'd be a good step (I think you have access, if not LMK).

And I guess simple redirect pages for docs/javadoc/*/index.html would be the other part.
With that I'd be happy & same could be done for other repos too. Plus I'd maintain Wiki going forward.

Help much appreciated @drekbour !

@drekbour
Contributor Author

👍 Updated wiki for jackson-core, jackson-databind
👎 Don't have wiki access to jackson-datatype-jdk8, jackson-dataformats-text, jackson-dataformats-binary (and probably any other module/datatype/dataformat etc) to update those.

I note that, because javadoc.io has a drop down covering all versions, the new links are generic and need no maintenance. This leads me to think they could be added to the README.md?

@cowtowncoder
Member

cowtowncoder commented Jan 26, 2023

Thank you @drekbour! README already actually has the Javadocs badge under "Status".
I'll see what is needed for other repos: perhaps it requires being contributor (having had a PR merged)?

How about annotations' wiki? Ah already done too, great! :)

cowtowncoder pushed a commit to FasterXML/jackson-core that referenced this issue Jan 26, 2023
cowtowncoder pushed a commit to FasterXML/jackson-core that referenced this issue Jan 26, 2023
cowtowncoder pushed a commit to FasterXML/jackson-annotations that referenced this issue Jan 26, 2023
cowtowncoder pushed a commit to FasterXML/jackson-annotations that referenced this issue Jan 26, 2023
cowtowncoder pushed a commit that referenced this issue Jan 26, 2023
@cowtowncoder
Member

Removed docs/javadoc from 2.14 branch onward for:

  • jackson-annotations
  • jackson-core
  • jackson-databind

@cowtowncoder cowtowncoder added 2.15 and removed good first issue Issue that seems easy to resolve and is likely a good candidate for contributors new to project labels Jan 27, 2023
@cowtowncoder
Member

@drekbour I changed access settings so you should be able to change wikis for:

  • jackson-modules-base
  • jackson-modules-java8
  • jackson-dataformats-binary
  • jackson-dataformats-text
  • jackson-dataformat-xml

LMK which other ones you'd want to target.

@sbrannen

sbrannen commented Jan 29, 2023

FYI: these changes break the builds of some downstream projects.

For example, the Spring Framework 6.0.x build was broken by this.

I have not investigated which further builds are broken, but I imagine there could be many.

The reason these changes break builds is that some projects configure Jackson for external Javadoc links. For example, in the Spring Framework builds we were configuring the following external links.

When the javadoc task in our build executed, it failed to retrieve the package-list files with errors similar to the following.

error: Error fetching URL: https://fasterxml.github.io/jackson-core/javadoc/2.10/ (java.io.FileNotFoundException: https://fasterxml.github.io/jackson-core/javadoc/2.10/package-list)

Navigating up that directory structure led me to https://fasterxml.github.io/jackson-core/ which states:

jackson-core

Empty!

/docs/ used to contain Javadocs definitions, but since they can be found from:

http://www.javadoc.io/doc/com.fasterxml.jackson.core/jackson-core

are no longer stored in this repo

That's how I eventually found this GitHub issue.

To fix Spring's builds, I got things working again by using http://www.javadoc.io.

However, as a benefit to the Jackson community it would be great if you could introduce redirects from URLs such as https://fasterxml.github.io/jackson-core/javadoc/2.10/package-list to https://www.javadoc.io/doc/com.fasterxml.jackson.core/jackson-core/2.10.0/package-list.

FWIW, I noticed that you mentioned adding redirects for index.html files in #3440 (comment). So perhaps the package-list files were just an oversight. 😉
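The URL mapping requested above could be sketched roughly as follows. This is not something GitHub Pages can apply by itself; it only illustrates the rewrite rule. The function name `to_javadoc_io` is hypothetical, and the sketch assumes that a `major.minor` directory maps to the `.0` patch release and that the artifact lives under the `com.fasterxml.jackson.core` group id, as in the example URLs in this comment.

```python
import re
from typing import Optional

# Old GitHub Pages layout: /<artifact>/javadoc/<major.minor>/<path>
_OLD_URL = re.compile(
    r"^https?://fasterxml\.github\.io/"
    r"(?P<artifact>[^/]+)/javadoc/(?P<version>\d+\.\d+)/(?P<path>.*)$"
)

# Assumption: jackson-core, jackson-databind and jackson-annotations share this group id.
GROUP_ID = "com.fasterxml.jackson.core"


def to_javadoc_io(old_url: str) -> Optional[str]:
    """Map an old fasterxml.github.io Javadoc URL to a javadoc.io URL.

    Returns None when the URL does not follow the old layout.
    Assumption: directory "2.10" corresponds to release "2.10.0".
    """
    match = _OLD_URL.match(old_url)
    if match is None:
        return None
    artifact = match["artifact"]
    version = match["version"] + ".0"
    path = match["path"]
    return f"https://www.javadoc.io/doc/{GROUP_ID}/{artifact}/{version}/{path}"


print(to_javadoc_io("https://fasterxml.github.io/jackson-core/javadoc/2.10/package-list"))
```

For the `package-list` URL from the error message, this yields the javadoc.io URL given in the comment above.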

sbrannen added a commit to spring-projects/spring-framework that referenced this issue Jan 29, 2023
The Jackson project no longer publishes Javadoc at
https://fasterxml.github.io which breaks the `javadoc` and `api` build
tasks due to their dependency on that web site for external Javadoc links.

As a workaround, we now reference Jackson's Javadoc via
https://www.javadoc.io.

See FasterXML/jackson-databind#3440
Closes gh-29894
sbrannen added a commit to spring-projects/spring-framework that referenced this issue Jan 29, 2023
The Jackson project no longer publishes Javadoc at
https://fasterxml.github.io which breaks the `javadoc` and `api` build
tasks due to their dependency on that web site for external Javadoc links.

As a workaround, we now reference Jackson's Javadoc via
https://www.javadoc.io.

See FasterXML/jackson-databind#3440
Closes gh-29895
@cowtowncoder
Member

cowtowncoder commented Jan 30, 2023

Thank you for bringing this to our attention @sbrannen.

Ugh. My intention was not to break things in this way, and in hindsight I should have asked about possible downsides on the user/dev mailing list.

I would need help in figuring out a good way to resolve things here: undoing removals is a possibility, which would mean doing something like:

  1. Retaining javadocs for specific subset of versions
  2. Stopping publishing of javadocs after 2.14 (publishing is actually manual operation after Maven Release plugin)

But if a redirect works, that's better; I assume this:

https://blog.hubspot.com/website/html-redirect

would do the trick?

Although not sure about redirecting index.html vs package-list.
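A meta-refresh redirect page of the kind described in that article could be generated along these lines. This is a minimal sketch; the helper name `redirect_page` and the target URL are illustrative assumptions. Note that this style of redirect only takes effect when a browser renders the HTML, so it would likely cover index.html but not package-list, which the javadoc tool reads as a plain text file.

```python
from html import escape


def redirect_page(target_url: str) -> str:
    """Build a static HTML page that sends browsers to target_url."""
    safe = escape(target_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        # "0" means redirect immediately, without a delay.
        f'  <meta http-equiv="refresh" content="0; url={safe}">\n'
        f'  <link rel="canonical" href="{safe}">\n'
        "</head>\n"
        "<body>\n"
        # Fallback link for clients that ignore the meta refresh.
        f'  <p>The Javadoc has moved to <a href="{safe}">{safe}</a>.</p>\n'
        "</body>\n"
        "</html>\n"
    )


# Hypothetical target; the real per-version URL would need to be checked.
page = redirect_page(
    "https://www.javadoc.io/doc/com.fasterxml.jackson.core/jackson-databind/latest/index.html"
)
print(page)
```

The output would be saved as docs/javadoc/<version>/index.html, one file per version directory.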

@pjfanning
Member

I'm not sure if HTML redirects using meta tags support wildcards.

One approach that does is .htaccess files. Reasonable write-up: https://www.seoptimer.com/blog/wildcard-redirect/

@cowtowncoder
Member

@pjfanning That sounds like a good solution where feasible. Not sure it is doable here since we rely on GitHub Pages/in-repo docs reference. But then again we can't be the first project to hit issues like this...

@joca-bt

joca-bt commented Jan 30, 2023

This also broke our pipelines and I imagine lots of other people's pipelines. The older URL no longer exists even for older versions. It would have made more sense to keep the URL around for some time.

@pjfanning
Member

@joca-bt broken pipelines are easy to fix - e.g. spring-projects/spring-framework@40d2466

We may put back the old pages but first, we want to see if we can use redirects instead.

Unfortunately, if you need an immediate fix, then you are stuck with changing the URL in your own build files like the spring commit above.

@joca-bt

joca-bt commented Jan 30, 2023

But given there was no announcement anywhere about this, you are making users' lives more difficult for literally no reason. The old website is just broken, as it shows 404. At least we could add a banner saying "The javadoc has moved to ..." so users don't need to guess.

@pjfanning
Member

It's not my repo. I didn't make the change.

The change wasn't made to ruin your day. There is an announcement on https://fasterxml.github.io/jackson-databind/

It just wasn't considered that users link to the subpages in their builds. But to reiterate, the javadocs are accessible at https://www.javadoc.io/doc/com.fasterxml.jackson.core/jackson-databind

@cowtowncoder
Member

cowtowncoder commented Jan 30, 2023

@joca-bt This was not intentional: I was not sure there was actual usage. But yes, this definitely could have gone better. At this point I am looking for the best solution to achieve what we want (reduced size of repo download, less maintenance) while also avoiding breakage for existing users.

PR would be good, or proposal for revert, followed by selective removal: for example, retain versions 2.10.x and later, do not publish version 2.15 and later.

@cowtowncoder
Member

Reading through earlier comments, it sounds like package-list is useful/necessary; I can return these I think, as the first step.

Second, assuming index.html of the main level would be useful I can probably add redirects for those as well.
These are needed for jackson-core, jackson-annotations and jackson-databind.

@cowtowncoder
Member

@sbrannen As per #3769, I added docs/javadoc/2.14/package-list (etc) back in this, jackson-core and jackson-annotations repositories (the only ones where removal was done). Does this help on its own, or would more be needed for specific build systems in question?

@sbrannen

sbrannen commented Feb 1, 2023

Hi @cowtowncoder,

Thanks for investigating ways to alleviate the issues.

When I originally reported the issue about broken builds, I was only focusing on getting builds to work going forward; however, that is only part of the overall set of issues.

It's actually considerably more involved. Let me see if I can highlight the issues I'm aware of.

Restoring package-list files will get builds (that cross reference Jackson javadoc) to pass, but it will result in broken links within the generated Javadoc.

For example, if I revert those changes I made to Spring's build (i.e., switch back to using external Javadoc links like https://fasterxml.github.io/jackson-core/javadoc/2.10/), then the Spring api Gradle task will succeed, and the build will technically pass. However, if I view the generated HTML and click on a link to a cross-referenced Jackson type, I'll encounter a 404 error.

Concrete examples:

  • Spring's published 6.0.4 Javadoc behaves the same as my locally reverted changes that find the package-list file you restored in commit 1776259. The reason is that the javadoc tool uses the contents of the package-list file combined with the URL used to download the package-list file to generate links (based on convention -- without checking for the existence of any such linked page on the Internet) to external types referenced in Javadoc tags (@see, etc.). So Spring's generated Javadoc HTML now references Jackson Javadoc pages that don't exist. To see this in action, click on the 6.0.4 link above and then click on any link to ObjectMapper.
  • In a similar vein, all existing published Spring Javadoc versions now contain broken links to Jackson types. That's true for 6.0.4, 5.3.25, 5.0.0.RELEASE, etc., etc.
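The convention-based linking described in the first bullet can be illustrated with a simplified model. This is not the javadoc tool's actual code: the function name `external_link` is hypothetical, and the sketch ignores nested classes and Java modules (where newer Javadoc uses an element-list file and adds a module path segment).

```python
from typing import Optional


def external_link(base_url: str, package_list: str, qualified_name: str) -> Optional[str]:
    """Build the link that would be emitted for an external top-level type.

    base_url is the URL the package-list was downloaded from. The link is
    formed purely by convention; nothing checks that the page exists.
    """
    packages = {line.strip() for line in package_list.splitlines() if line.strip()}
    package, _, simple_name = qualified_name.rpartition(".")
    if package not in packages:
        # Package not documented at base_url, so no link is generated.
        return None
    return f"{base_url.rstrip('/')}/{package.replace('.', '/')}/{simple_name}.html"


# Excerpt of a package-list file: one package name per line.
restored_package_list = "com.fasterxml.jackson.databind\n"

print(external_link(
    "https://fasterxml.github.io/jackson-databind/javadoc/2.10/",
    restored_package_list,
    "com.fasterxml.jackson.databind.ObjectMapper",
))
```

With only package-list restored, the link is generated successfully but points at an ObjectMapper.html page that no longer exists on fasterxml.github.io, which is the 404 described above.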

The latter bullet point is applicable not only to the published Javadoc of numerous (thousands of?) libraries/projects around the world, but it also applies to any published blog, tweet, Stack Overflow answer, etc. in which somebody included a link to a specific version of Jackson's Javadoc.

Just to be clear, Spring's 6.0.5-SNAPSHOT Javadoc does not have broken links to Jackson because the package-list and cross-referenced documentation both exist on the same
https://www.javadoc.io/doc/com.fasterxml.jackson.core/jackson-databind/2.14.1/ web site.

I hope that clarifies the scope of the issues, but to summarize:

  • Completely removing all Javadoc content from https://fasterxml.github.io breaks builds and breaks all existing links to specific Javadoc pages in Jackson libraries.
  • Restoring only package-list files will allow builds to "pass", but links will be broken. So it's not really a solution.

I see the following as possible options.

  1. Restore all Javadoc content at https://fasterxml.github.io
  2. For Jackson 2.15 onward, do not publish Javadoc to https://fasterxml.github.io but do inform users that Javadoc can be found at https://www.javadoc.io
  3. Point # 1 serves as a quick fix for all known issues, but if you still wish to remove old content you could set up pattern-based redirects for all existing content and then remove the old content once you've verified that the redirects cover all known use cases.

Looking forward to hearing what you decide.

Cheers,

Sam

@pjfanning
Member

3 is great but so far no solution that works has been found.

It might be possible to do 1 but change the branch that is used - GitHub pages config would need to be changed to use the separate branch. This approach keeps the master and 2.x branches small.

@sbrannen

sbrannen commented Feb 1, 2023

3 is great but so far no solution that works has been found.

If you believe this answer to be the authoritative answer, then the answer is that it is (currently) impossible to configure a 301 redirect with GitHub Pages.

However, it does appear to be possible by setting up CNAME redirection for a custom domain, but I'm not sure if that's feasible/appropriate/possible for Jackson.

@pjfanning
Member

@cowtowncoder if it's ok with you, I can create a ghpages branch based off the 2.14 branch as it was prior to the recent docs changes. See my previous comment. Changing the GitHub repo settings to use the ghpages branch is a simple change in the Settings tab.

@cowtowncoder
Member

Hmmh. It's too bad I moved out of gh-pages earlier, but I am not opposed to going back there I suppose.

So +1 for that @pjfanning .

@cowtowncoder
Member

Thank you @sbrannen for a very thorough explanation why the "quick patch" won't be enough.

I think @pjfanning's idea of going back to gh-pages makes sense, so let's plan on doing that.
I think copies of docs/javadoc can be found from tag jackson-databind-2.14.1 (etc).

And we can still leave project Wiki links pointing to javadoc.io, stop publishing new Javadocs with 2.15.
Or, if we want to give grace period, still do 2.15 and stop right after; it's not a big deal to do that (but it is one more step in release process that is nice to get rid of).

@pjfanning
Member

pjfanning commented Feb 1, 2023

I've overwritten gh-pages for jackson-core and jackson-databind so the javadocs for those 2 projects are back.

also now:

@cowtowncoder
Member

Excellent @pjfanning, thank you for doing this.

@vlsi

vlsi commented Nov 28, 2023

I am all ears for a better system, if this is an actual problem? (disk space is not super expensive these days)

It makes "global search" harder when navigating the project in IDE.
It makes clone/update slower.

WDYT of having a separate repository (e.g. fasterxml-javadoc.github.io/...) that would host the generated javadocs?

@cowtowncoder
Member

We have moved to not publishing Javadocs within the repo any more; links now point to javadoc.io, as in README.md here:

https://www.javadoc.io/doc/com.fasterxml.jackson.core/jackson-databind

where Javadocs are probably extracted from Maven Central, to which our release process publishes them.

So I will close this as done.

Individual repositories still mostly have their old Javadocs which we probably do not want to remove.
Just don't want to push more to Git repos.


6 participants