Add utilities to detect and replace broken links.#1366
Add utilities to detect and replace broken links.#1366barsh404error wants to merge 9 commits intoTogether-Java:developfrom
Conversation
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Show resolved
Hide resolved
|
@tj-wazei I've added your request and fixed them |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
Zabuzard
left a comment
There was a problem hiding this comment.
The added methods are utility methods and for those it is crucial that they have a proper Javadoc explaining what they do and give examples.
The isLinkBroken method needs to explain what it means for a link to be broken and how it behaves in edge case, for example if the given text isnt a url and so on.
The other method needs to explain in more detail what the two parameters are and how they work, maybe giving a concrete example in the javadoc.
Yes, i got that solved made javadoc for |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
| List<CompletableFuture<String>> deadLinkFutures = links.stream() | ||
| .distinct() | ||
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) |
There was a problem hiding this comment.
better: instead of map(foo -> ...thenApply(...)), use .map(foo -> ...).filter(...) then you also dont have all these null items in ur list, polluting it
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
|
I focused primarily on improving the Javadocs as requested @Zabuzard |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
|
Thanks for the detailed review @Zabuzard I went through the comments one by one and addressed them explicitly: Javadoc & behavior clarification Expanded Javadoc for isLinkBroken to clearly define what “broken” means: HTTP request failure HTTP status codes outside the 200–399 range Clarified that: Status 200 is considered valid even with an empty body Response body content is not inspected Invalid URL formats result in an IllegalArgumentException HEAD / GET request logic HEAD is used as an initial, cheaper check If HEAD indicates failure, a GET request is used as a fallback to handle servers that don’t implement HEAD correctly Any exception during either request is treated as a broken link Naming improvements Renamed request variables to describe intent rather than HTTP mechanics (e.g. link status validation rather than fallback flow) Asynchronous behavior replaceDeadLinks performs all link checks asynchronously CompletableFuture.allOf(...) is used only as a synchronization point; no blocking is introduced thenApply(_ -> …) is intentionally used to ignore the unused result and avoid misleading variable names replaceDeadLinks behavior Added a concrete usage example to the Javadoc showing input → output transformation Only links detected as broken are replaced Working links and non-URL content remain unchanged Duplicate links are checked once and replaced consistently Edge cases Empty input or no detected links returns the original text Empty response bodies are not treated as broken Exceptions during HTTP checks are handled defensively and result in “broken” sorry for stopping to answer those comments one by one it was annoying and frustrating lol |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
| List<CompletableFuture<String>> deadLinkFutures = links.stream() | ||
| .distinct() | ||
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
1d41d50 to
d556a33
Compare
Updated isLinkBroken() to only treat 4xx/5xx status codes as broken. Previously 3xx redirects were incorrectly marked as broken links also improved javadoc clarity throughout LinkDetection class
|
hello, @Zabuzard in this commit i focused mostly on the javadocs and updated |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
- Rename replaceDeadLinks to replaceBrokenLinks for consistency - Use Optional instead of null values in stream processing - Add convenience overload for extractLinks with default filters - Update javadocs to be more generic and future-proof - Move implementation details from javadoc to inline comments - Replace 'ignored' lambda params with '_' Resolves the review comments from @Zabuzard
Thanks alot to @christolis for helping me out on making this pull request.
Added two utulity methods
isLinkBrokenandreplaceDeadLinks-
isLinkBroken(String url) checks the link availability using a HEAD requestI used HEAD request instead of GET request to check link availability without downloading the response body, reducing bandwidth and improving the performance.
-
replaceDeadLinks(String text, String replacement) replaces unreachable/broken links asynchronously.This change does not have any behavior changes to the existing code.
Part of #1276, implements the mentioned utility but doesnt apply it.