-
-
Notifications
You must be signed in to change notification settings - Fork 105
Add utilities to detect and replace broken links. #1366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
|
@tj-wazei I've added your request and fixed them |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
Zabuzard
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The added methods are utility methods and for those it is crucial that they have a proper Javadoc explaining what they do and give examples.
The isLinkBroken method needs to explain what it means for a link to be broken and how it behaves in edge case, for example if the given text isnt a url and so on.
The other method needs to explain in more detail what the two parameters are and how they work, maybe giving a concrete example in the javadoc.
Yes, i got that solved made javadoc for |
tj-wazei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| int status = response.statusCode(); | ||
| return status < 200 || status >= 400; | ||
| }) | ||
| .exceptionally(ignored -> true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idiomatic name for something you ignore is _
|
|
||
| .toList(); | ||
|
|
||
| return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toArray(new Foo[0]) is an anti-pattern. use method references instead toArray(Foo[]::new)
| .distinct() | ||
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty line in the middle of the flow
| private static final Set<LinkFilter> DEFAULT_FILTERS = | ||
| Set.of(LinkFilter.SUPPRESSED, LinkFilter.NON_HTTP_SCHEME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
improve name. what does this filter, what does it mean, what is it used for?
| * <p> | ||
| * A link is considered broken if: | ||
| * <ul> | ||
| * <li>The URL is invalid or malformed</li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is incorrect, isnt it? did you test it? URI.create(url) throws on malformed URLs, doesnt it. so your documentation for that edge case is wrong then.
| */ | ||
|
|
||
| public static CompletableFuture<Boolean> isLinkBroken(String url) { | ||
| HttpRequest headRequest = HttpRequest.newBuilder(URI.create(url)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
improve name headRequest documents the "how" not the "what/why". a better name would be checkLinkHeadRequest
| List<CompletableFuture<String>> deadLinkFutures = links.stream() | ||
| .distinct() | ||
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isBroken cant be null, so just isBroken ? link : null works and is easier to read
| List<CompletableFuture<String>> deadLinkFutures = links.stream() | ||
| .distinct() | ||
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better: instead of map(foo -> ...thenApply(...)), use .map(foo -> ...).filter(...) then you also dont have all these null items in ur list, polluting it
| return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0])) | ||
| .thenApply(ignored -> deadLinkFutures.stream() | ||
| .map(CompletableFuture::join) | ||
| .filter(Objects::nonNull) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.filter(Objects::nonNull) that one isnt needed anymore with the above fix
| .toList(); | ||
|
|
||
| return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0])) | ||
| .thenApply(ignored -> deadLinkFutures.stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(_ instead of ignored)
| .thenApply(deadLinks -> { | ||
| String result = text; | ||
| for (String deadLink : deadLinks) { | ||
| result = result.replace(deadLink, replacement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
performance trap. have a quick check if StringBuilder provides replace. if not, u can keep it, its not that big of a deal in this case, i guess.
|
I focused primarily on improving the Javadocs as requested @Zabuzard |
| * The method first performs an HTTP {@code HEAD} request and falls back to an HTTP {@code GET} | ||
| * request if the {@code HEAD} request indicates a failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implementation detail like that goes into a comment inside the method, not in the javadoc 👍
| * @param text the input text containing URLs (must not be {@code null}) | ||
| * @param replacement the string to replace broken links with (must not be {@code null}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u dont need to write that as ur params are all (implicitly) annotated with @NonNull already :)
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) | ||
|
|
||
| .thenApply(isBroken -> isBroken ? Optional.of(link) : Optional.<String>empty())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u misunderstood me. its better if u simply filter out these, skip them here instead of later.
.filter(...) so they are not even part of the list
Thanks alot to @christolis for helping me out on making this pull request.
Added two utulity methods
isLinkBrokenandreplaceDeadLinks-
isLinkBroken(String url) checks the link availability using a HEAD requestI used HEAD request instead of GET request to check link availability without downloading the response body, reducing bandwidth and improving the performance.
-
replaceDeadLinks(String text, String replacement) replaces unreachable/broken links asynchronously.This change does not have any behavior changes to the existing code.
Part of #1276, implements the mentioned utility but doesnt apply it.