@barsh404error barsh404error commented Dec 30, 2025

Thanks a lot to @christolis for helping me with this pull request.
Added two utility methods, isLinkBroken and replaceDeadLinks.

- isLinkBroken(String url) checks link availability using a HEAD request.
I used a HEAD request instead of a GET request to check availability without downloading the response body, which reduces bandwidth and improves performance.

- replaceDeadLinks(String text, String replacement) replaces unreachable/broken links asynchronously.

This change does not alter the behavior of existing code.

Part of #1276; implements the mentioned utility but doesn't apply it.
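
A minimal sketch of the HEAD-based check described above, assuming the JDK's java.net.http.HttpClient (Java 11+). The class name HeadCheckSketch, the client parameter, and the 10-second timeout are illustrative assumptions, not the PR's actual code; the status rule mirrors the `status < 200 || status >= 400` check quoted in the review.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;

final class HeadCheckSketch {

    // Anything outside 2xx/3xx counts as broken (same rule as the reviewed diff).
    static boolean isBrokenStatus(int status) {
        return status < 200 || status >= 400;
    }

    // Sends a HEAD request so that no response body has to be downloaded.
    // Note: URI.create throws IllegalArgumentException for malformed input.
    static CompletableFuture<Boolean> isLinkBroken(HttpClient client, String url) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .timeout(Duration.ofSeconds(10)) // assumed value
                .build();

        return client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                .thenApply(response -> isBrokenStatus(response.statusCode()))
                // Connection errors, timeouts and similar failures count as broken.
                .exceptionally(ignored -> true);
    }
}
```

Usage would look like `HeadCheckSketch.isLinkBroken(HttpClient.newHttpClient(), "https://example.com").join()`.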

@barsh404error barsh404error requested a review from a team as a code owner December 30, 2025 10:52

CLAassistant commented Dec 30, 2025

CLA assistant check
All committers have signed the CLA.

@barsh404error
Author

@tj-wazei I've addressed your requests.
I've updated the implementation to avoid mutating shared state in async callbacks; it now collects the results first and then performs the replacements sequentially.
I also reuse a single HttpClient and added a GET fallback for when HEAD is not supported.
Let me know if anything else should be adjusted ^^
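
A rough sketch of what a shared client with a HEAD-then-GET fallback could look like. This is an illustration under assumptions, not the code from this PR: the names FallbackCheckSketch, withFallback and looksBroken are hypothetical, and the 5-second connect timeout is an arbitrary choice.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class FallbackCheckSketch {

    // One client for all checks instead of creating a new one per request.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5)) // assumed value
            .build();

    static CompletableFuture<Boolean> isLinkBroken(String url) {
        URI uri = URI.create(url);
        HttpRequest head = HttpRequest.newBuilder(uri)
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
        HttpRequest get = HttpRequest.newBuilder(uri).GET().build();

        // Some servers reject HEAD, so only a failed HEAD triggers the GET.
        return withFallback(looksBroken(head), () -> looksBroken(get));
    }

    // Runs the fallback check only if the first check reported "broken".
    static CompletableFuture<Boolean> withFallback(
            CompletableFuture<Boolean> first,
            Supplier<CompletableFuture<Boolean>> fallback) {
        return first.thenCompose(broken -> {
            if (!broken) {
                return CompletableFuture.completedFuture(false);
            }
            return fallback.get();
        });
    }

    private static CompletableFuture<Boolean> looksBroken(HttpRequest request) {
        return CLIENT.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                .thenApply(response -> response.statusCode() < 200 || response.statusCode() >= 400)
                .exceptionally(ignored -> true);
    }
}
```

Passing the fallback as a Supplier keeps the GET request from being sent at all when the HEAD request already succeeded.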

@barsh404error barsh404error requested a review from tj-wazei January 2, 2026 10:22
@barsh404error barsh404error changed the title from "Add utilities to detect and replace broken links." to "Add utilities to detect and replace broken links. Fixed #1276" on Jan 3, 2026
@christolis christolis changed the title from "Add utilities to detect and replace broken links. Fixed #1276" to "Add utilities to detect and replace broken links." on Jan 4, 2026
@Zabuzard Zabuzard self-requested a review January 4, 2026 15:37
Member

@Zabuzard Zabuzard left a comment

The added methods are utility methods, and for those it is crucial that they have proper Javadoc explaining what they do, with examples.

The isLinkBroken method needs to explain what it means for a link to be broken and how it behaves in edge cases, for example if the given text isn't a URL and so on.

The other method needs to explain in more detail what the two parameters are and how they work, maybe giving a concrete example in the Javadoc.

@barsh404error
Author

Yes, I've resolved that: I wrote Javadoc for isLinkBroken and replaceDeadLinks.
I explained what they do step by step, although I'm not sure whether I wrote too much.

@barsh404error barsh404error requested a review from Zabuzard January 6, 2026 16:30
tj-wazei previously approved these changes Jan 6, 2026
Contributor

@tj-wazei tj-wazei left a comment

LGTM

int status = response.statusCode();
return status < 200 || status >= 400;
})
.exceptionally(ignored -> true)
Member

the idiomatic name for something you ignore is _


.toList();

return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0]))
Member

toArray(new Foo[0]) is an anti-pattern. Use a method reference instead: toArray(Foo[]::new)
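
For illustration, a small sketch of the suggested form applied to the CompletableFuture.allOf call from the diff. Collection.toArray(IntFunction) exists since Java 11; the class and method names (AllOfSketch, allResults) are made up for this example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class AllOfSketch {

    // Waits for every future and returns their results in list order.
    static <T> CompletableFuture<List<T>> allResults(List<CompletableFuture<T>> futures) {
        // Array constructor reference instead of toArray(new CompletableFuture[0]).
        return CompletableFuture.allOf(futures.toArray(CompletableFuture[]::new))
                // All futures are complete here, so join() does not block.
                .thenApply(ignored -> futures.stream()
                        .map(CompletableFuture::join)
                        .toList());
    }
}
```

Both forms produce the same array; the difference is only in how the intent reads.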

.distinct()
.map(link -> isLinkBroken(link)
.thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null))

Member

empty line in the middle of the flow

Comment on lines +23 to +24
private static final Set<LinkFilter> DEFAULT_FILTERS =
Set.of(LinkFilter.SUPPRESSED, LinkFilter.NON_HTTP_SCHEME);
Member

improve name. what does this filter, what does it mean, what is it used for?

* <p>
* A link is considered broken if:
* <ul>
* <li>The URL is invalid or malformed</li>
Member

this is incorrect, isn't it? did you test it? URI.create(url) throws on malformed URLs, doesn't it? so your documentation for that edge case is wrong then.
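
To make the edge case concrete: URI.create wraps URISyntaxException in an IllegalArgumentException, so a malformed string throws before any request is sent. Below is one possible way to make the documented behavior ("invalid or malformed URLs are broken") actually hold. It is a sketch under assumptions: UriValidationSketch, parseHttpUri and the injected check function are hypothetical names, not part of the PR.

```java
import java.net.URI;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class UriValidationSketch {

    // Returns the parsed URI only if it is a syntactically valid http(s) URL with a host.
    static Optional<URI> parseHttpUri(String url) {
        try {
            URI uri = URI.create(url); // throws IllegalArgumentException on malformed input
            String scheme = uri.getScheme();
            boolean isHttp = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return isHttp && uri.getHost() != null ? Optional.of(uri) : Optional.empty();
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    // Malformed input is reported as broken instead of throwing;
    // the actual network check is passed in so the sketch stays self-contained.
    static CompletableFuture<Boolean> isLinkBroken(
            String url, Function<URI, CompletableFuture<Boolean>> check) {
        return parseHttpUri(url)
                .map(check)
                .orElseGet(() -> CompletableFuture.completedFuture(true));
    }
}
```

The alternative is to keep the throwing behavior and document it with an @throws tag; either way the Javadoc and the code should agree.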

*/

public static CompletableFuture<Boolean> isLinkBroken(String url) {
HttpRequest headRequest = HttpRequest.newBuilder(URI.create(url))
Member

improve name: headRequest documents the "how", not the "what/why". a better name would be checkLinkHeadRequest

List<CompletableFuture<String>> deadLinkFutures = links.stream()
.distinct()
.map(link -> isLinkBroken(link)
.thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null))
Member

isBroken can't be null, so just isBroken ? link : null works and is easier to read

Comment on lines 150 to 153
List<CompletableFuture<String>> deadLinkFutures = links.stream()
.distinct()
.map(link -> isLinkBroken(link)
.thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null))
Member

better: instead of map(foo -> ...thenApply(...)), use .map(foo -> ...).filter(...). then you also don't have all these null items in ur list, polluting it

return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0]))
.thenApply(ignored -> deadLinkFutures.stream()
.map(CompletableFuture::join)
.filter(Objects::nonNull)
Member

.filter(Objects::nonNull) that one isn't needed anymore with the above fix

.toList();

return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0]))
.thenApply(ignored -> deadLinkFutures.stream()
Member

(_ instead of ignored)

.thenApply(deadLinks -> {
String result = text;
for (String deadLink : deadLinks) {
result = result.replace(deadLink, replacement);
Member

performance trap. have a quick check if StringBuilder provides replace. if not, u can keep it, it's not that big of a deal in this case, i guess.
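
On the StringBuilder question: StringBuilder does have replace(int start, int end, String str), but it is index-based and has no overload that takes a target string, so it cannot be swapped in one-to-one for String.replace. A sketch of how the loop could be written with it follows; BuilderReplaceSketch and replaceEach are illustrative names, and whether this is measurably faster for short chat messages has not been tested here.

```java
import java.util.List;

final class BuilderReplaceSketch {

    // Replaces every occurrence of each target in a single mutable buffer,
    // instead of allocating a new String per String.replace call.
    static String replaceEach(String text, List<String> targets, String replacement) {
        StringBuilder result = new StringBuilder(text);
        for (String target : targets) {
            if (target.isEmpty()) {
                continue; // indexOf("") always matches; skip to avoid looping forever
            }
            int index = result.indexOf(target);
            while (index >= 0) {
                result.replace(index, index + target.length(), replacement);
                // Continue after the inserted text so the replacement is never rescanned.
                index = result.indexOf(target, index + replacement.length());
            }
        }
        return result.toString();
    }
}
```

For a handful of links per message, the plain String.replace loop from the diff is likely fine, as the comment above already suggests.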

@barsh404error
Author

I focused primarily on improving the Javadocs as requested, @Zabuzard.
While doing so, I slightly refactored the internal stream to use Optional instead of null, to better match the documented behavior and avoid null handling.
No external behavior was changed.

Comment on lines +81 to +82
* The method first performs an HTTP {@code HEAD} request and falls back to an HTTP {@code GET}
* request if the {@code HEAD} request indicates a failure.
Member

implementation detail like that goes into a comment inside the method, not in the javadoc 👍

Comment on lines +148 to +149
* @param text the input text containing URLs (must not be {@code null})
* @param replacement the string to replace broken links with (must not be {@code null})
Member

u don't need to write that, as ur params are all (implicitly) annotated with @NonNull already :)

.map(link -> isLinkBroken(link)
.thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null))

.thenApply(isBroken -> isBroken ? Optional.of(link) : Optional.<String>empty()))
Member

u misunderstood me. it's better if u simply filter these out and skip them here instead of later:
use .filter(...) so they are not even part of the list
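
One possible reading of this suggestion, sketched under assumptions: since the result of each check is only known once its future completes, the filter has to run on the joined results, and pairing each link with its result avoids both null and Optional placeholders. The names DeadLinkFilterSketch, LinkCheck and findDeadLinks are hypothetical, and the link list and check function are passed in as parameters to keep the example self-contained.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class DeadLinkFilterSketch {

    // Pairs a link with the outcome of its check, so nothing needs a null marker.
    record LinkCheck(String link, boolean broken) {}

    static CompletableFuture<List<String>> findDeadLinks(
            List<String> links, Function<String, CompletableFuture<Boolean>> isLinkBroken) {
        List<CompletableFuture<LinkCheck>> checks = links.stream()
                .distinct()
                .map(link -> isLinkBroken.apply(link)
                        .thenApply(broken -> new LinkCheck(link, broken)))
                .toList();

        return CompletableFuture.allOf(checks.toArray(CompletableFuture[]::new))
                .thenApply(ignored -> checks.stream()
                        .map(CompletableFuture::join)
                        .filter(LinkCheck::broken) // healthy links never enter the list
                        .map(LinkCheck::link)
                        .toList());
    }

    // Replacements run sequentially after all checks are done; no shared state is mutated.
    static CompletableFuture<String> replaceDeadLinks(
            String text, String replacement, List<String> links,
            Function<String, CompletableFuture<Boolean>> isLinkBroken) {
        return findDeadLinks(links, isLinkBroken).thenApply(deadLinks -> {
            String result = text;
            for (String deadLink : deadLinks) {
                result = result.replace(deadLink, replacement);
            }
            return result;
        });
    }
}
```

With this shape, the later .filter(Objects::nonNull) step has nothing left to remove.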
