Handling of protocol relative external urls in zimcheck #307

Merged: 3 commits, Jul 5, 2022

Commits on Jul 5, 2022

  1. Enhanced test data demonstrates a bug in zimcheck

    zimcheck treats protocol-relative absolute URLs as relative URLs. This
    bug is demonstrated by enhanced test data (which makes the zimcheck test
    fail). Bugfix coming in the next commit.
    
    The test data was enhanced in the following ways:
    
    - Added external links in main.html. The protocol-relative variant
      (//openzim.org) is the one that breaks the tests. The mismatch
      in the test output demonstrates that the absolute link //openzim.org
      is treated as an internal link:
    
    ```
    [ RUN      ] zimcheck.internal_url_check_poorzimfile
    .../test/zimcheck-test.cpp:249: Failure
    Expected equality of these values:
      expected_stdout
        Which is: ...
      std::string(zimcheck_output)
        Which is: ...
    With diff:
    @@ +7,23 @@
     (A/non_existent.html) were not found in article dangling_link.html
       Found 1 empty links in article: empty_link.html
    +  The following links:
    +- //openzim.org
    +(openzim.org) were not found in article empty_link.html
    +  The following links:
    +- //openzim.org
    +(openzim.org) were not found in article external_image_http.html
    +  The following links:
    +- //openzim.org
    +(openzim.org) were not found in article external_image_https.html
    +  The following links:
    +- //a.io/pic.png
    +(a.io/pic.png) were not found in article external_image_protocol_relative.html
    +  The following links:
    +- //openzim.org
    +(openzim.org) were not found in article main.html
       ../../oops.html is out of bounds. Article: outofbounds_link.html
    +  The following links:
    +- //openzim.org
    +(openzim.org) were not found in article outofbounds_link.html
     [INFO] Overall Test Status: Fail
     [INFO] Total time taken by zimcheck: <3 seconds.\n
    
    Test context:
     zimcheck -u data/zimfiles/poor.zim
    
    [  FAILED  ] zimcheck.internal_url_check_poorzimfile
    ```
    
    - Added an image with inline data. This change only increases the
      functional coverage of zimcheck testing.
    
    - In poor.zim, added three variants (http, https and protocol-relative)
      of absolute references to an external image. The mismatch in the test
      output demonstrates that the protocol-relative variant is not reported:
    
    ```
    [ RUN      ] zimcheck.external_url_check_poorzimfile
    .../test/zimcheck-test.cpp:249: Failure
    Expected equality of these values:
      expected_stdout
        Which is: ...
      std::string(zimcheck_output)
        Which is: ...
    With diff:
    @@ -5,5 @@
       http://a.io/pic.png is an external dependence in article external_image_http.html
       https://a.io/pic.png is an external dependence in article external_image_https.html
    -  //a.io/pic.png is an external dependence in article external_image_protocol_relative.html
     [INFO] Overall Test Status: Fail
     [INFO] Total time taken by zimcheck: <3 seconds.\n
    
    Test context:
     zimcheck -x data/zimfiles/poor.zim
    
    [  FAILED  ] zimcheck.external_url_check_poorzimfile (1 ms)
    ```
    
    Also:
     - In create_test_zimfiles script, replaced the obsolete option -f
       of zimwriterfs with a new option -I.
    veloman-yunkan committed Jul 5, 2022 (bcc51f4)
  2. Added UriKind::PROTOCOL_RELATIVE

    This change fixes the bug demonstrated by the previous commit.
    veloman-yunkan committed Jul 5, 2022 (81a4093)
  3. Commit 3c5c32e
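
The idea behind the second commit (a separate URI kind for protocol-relative links) can be illustrated with a minimal sketch. This is not the code from the PR: apart from the name `UriKind::PROTOCOL_RELATIVE`, which comes from the commit title, every enumerator, function name and rule below is a hypothetical simplification. It assumes that a link starting with `//` must be recognised before the generic "no scheme, so relative" fallback, so that the internal URL check (`-u`) skips it and the external dependency check (`-x`) reports it.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical, simplified classification. Only PROTOCOL_RELATIVE is a name
// taken from the commit title; the other enumerators are illustrative.
enum class UriKind {
    ABSOLUTE_WITH_SCHEME,  // e.g. http://a.io/pic.png
    PROTOCOL_RELATIVE,     // e.g. //a.io/pic.png
    DATA,                  // inline data, e.g. data:image/png;base64,...
    RELATIVE               // e.g. A/page.html or ../../oops.html
};

// Returns the lower-cased scheme of `uri`, or an empty string if the text
// before the first ':' is not a syntactically valid scheme
// (a letter followed by letters, digits, '+', '-' or '.').
std::string schemeOf(const std::string& uri)
{
    const std::size_t colon = uri.find(':');
    if (colon == std::string::npos || colon == 0)
        return "";
    if (!std::isalpha(static_cast<unsigned char>(uri[0])))
        return "";

    std::string scheme;
    for (std::size_t i = 0; i < colon; ++i) {
        const unsigned char c = static_cast<unsigned char>(uri[i]);
        if (!std::isalnum(c) && c != '+' && c != '-' && c != '.')
            return "";
        scheme += static_cast<char>(std::tolower(c));
    }
    return scheme;
}

UriKind classifyUri(const std::string& uri)
{
    // Must be tested first: "//host/path" has no scheme, so without this
    // branch it would fall through to RELATIVE (the bug shown by commit 1).
    if (uri.compare(0, 2, "//") == 0)
        return UriKind::PROTOCOL_RELATIVE;

    const std::string scheme = schemeOf(uri);
    if (scheme.empty())
        return UriKind::RELATIVE;
    if (scheme == "data")
        return UriKind::DATA;
    return UriKind::ABSOLUTE_WITH_SCHEME;
}

// True when the link has to be resolved inside the ZIM file,
// i.e. it is a candidate for the internal URL check.
bool isInternalLink(UriKind kind)
{
    return kind == UriKind::RELATIVE;
}

// True when the link points outside the archive,
// i.e. it is a candidate for the external dependency check.
bool pointsOutsideArchive(UriKind kind)
{
    return kind == UriKind::ABSOLUTE_WITH_SCHEME
        || kind == UriKind::PROTOCOL_RELATIVE;
}
```

With this ordering, `//openzim.org` is no longer a candidate for the "were not found in article" messages, and `//a.io/pic.png` becomes eligible for the "is an external dependence" report alongside its http and https variants.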