Zimcheck internal URL checking seems to ignore URLencoding AND HTML entities #378
Comments
@veloman-yunkan @mgautierfr I'm very surprised to discover such a hairy bug so late. Please confirm and possibly fix (should be complicated) ASAP. This bug was actually discovered by hardening the testing around MWoffliner. For the rest it seems to work, and I'm glad to merge and release in |
One of the most important features of zimcheck seems to be really buggy and weak. The internal URL check, i.e. verifying that URLs in the HTML point to real entries in the ZIM, seems to just take the `href` value from the HTML and search for it, as is, in the archive. This means an error will be wrongly reported if the `href` is URL-encoded, or if it contains HTML entities (e.g. the ones for `"` or `'`).
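To illustrate what I would expect instead, here is a rough Python sketch (not zimcheck's actual code, which is C++). It assumes the `href` has to be entity-decoded first, then stripped of its query string and fragment, then percent-decoded, before being compared with entry paths. The names `normalize_href` and `find_dead_links` are made up for this example, a plain set of paths stands in for the archive, and resolving relative paths against the current entry is left out.

```python
from html import unescape
from urllib.parse import unquote


def normalize_href(href: str) -> str:
    """Turn a raw href attribute value into a path comparable to an entry path."""
    # 1. Resolve HTML entities (&amp;, &quot;, &#39;, ...) left in the raw attribute.
    decoded = unescape(href)
    # 2. Drop the fragment and the query string; they are not part of the entry path.
    path = decoded.split("#", 1)[0].split("?", 1)[0]
    # 3. Resolve percent-encoding (%20, %27, %C3%A9, ...).
    return unquote(path)


def find_dead_links(hrefs, entry_paths):
    """Return the hrefs whose normalized form is not a known entry path.

    `entry_paths` is a hypothetical stand-in for a lookup in the ZIM archive.
    """
    return [h for h in hrefs if normalize_href(h) not in entry_paths]


if __name__ == "__main__":
    entries = {"O'Brien", "A B"}
    # Only "Missing" should be reported; the first two differ only by encoding.
    print(find_dead_links(["O&#39;Brien", "A%20B", "Missing"], entries))
```

With a check like this, only links that are really absent from the archive would be flagged; links that differ from the entry path only by encoding would pass.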
The last scenario is the one that happens with this ZIM:
wikipedia_en_canada_2023-10.zim.zip
I got the error: