Skip to content

linkcheck only emits warnings on HTTP redirects if linkcheck_allowed_redirects is non-empty #13439

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jameslamb opened this issue Mar 19, 2025 · 10 comments

Comments

@jameslamb
Copy link

jameslamb commented Mar 19, 2025

Describe the bug

The docs for led me to believe that HTTP redirects would result in warnings from the linkcheck builder, which I could then turn into a non-0 exit status (and therefore CI failures) via --fail-on-warning.

linkcheck_allowed_redirects

A dictionary that maps a pattern of the source URI to a pattern of the canonical URI

The linkcheck builder will emit a warning when it finds redirected links that don’t meet the rules above.
It can be useful to detect unexpected redirects when using the fail-on-warnings mode.

(my emphasis)

ref: https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-linkcheck_allowed_redirects

I found that if linkcheck_allowed_redirects is empty or not specific in conf.py, HTTP redirects are reported in log output but not emitted as warnings.

As a result, redirects could silently make it through the linkcheck check. To work around this, I'm defining a placeholder like this to get the behavior "fail on any redirects".

linkcheck_allowed_redirects = {"abc": "def"}

How to Reproduce

With sphinx 8.2.3, installed from conda-forge, ran the quickstart.

mkdir testproject
cd ./testproject

sphinx-quickstart \
    --no-sep \
    --project testproject \
    --author james \
    --release v0.1.0 \
    --language en docs

Added a link that's guaranteed to redirect into index.rst.

echo 'See `this link <https://httpbin.org/redirect/1>`_ for details' >> ./docs/index.rst

(you can test this yourself with curl -i https://httpbin.org/redirect/1)

Saw linkcheck report the redirect in logs, but return a 0 exit status.

python -m sphinx -b linkcheck --fail-on-warning ./docs/ ./linkcheck_output
# build succeeded.

echo $?
# 0

Including a screenshot instead of text only because I think that the output color indicates something about the type of log message sphinx is omitting (a difference I don't understand, sorry).

Image

Added a nonsense linkcheck_allowed_redirects.

echo 'linkcheck_allowed_redirects = {"abc": "def"}' >> ./docs/conf.py

And tried again. This time, saw the redirect reported as a warning (in red this time), an a non-0 exit status.

python -m sphinx -b linkcheck --fail-on-warning ./docs/ ./linkcheck_output
# build finished with problems, 1 warning (with warnings treated as errors).

echo $?
# 1
Image

Environment Information

output of 'python -m sphinx --bug-report' (click me)
Platform:              darwin; (macOS-14.4.1-arm64-arm-64bit)
Python version:        3.12.8 | packaged by conda-forge | (main, Dec  5 2024, 14:19:53) [Clang 18.1.8 ])
Python implementation: CPython
Sphinx version:        8.2.3
Docutils version:      0.21.2
Jinja2 version:        3.1.6
Pygments version:      2.19.1

Sphinx extensions

None

Additional context

I'm reporting this as a "bug" because it seems based on the docs that this is unintentional. But if it is intentional, I'd be happy to submit a PR proposing a clarification to those docs explaining how to achieve "fail on any redirects".

I did look for other relevant issues and didn't find any specifically about this. Linking a few somewhat-related:

Thanks for your time and consideration.

@AA-Turner
Copy link
Member

Thank you for the detailed report James (nice to see a fellow economist!). The relevant code is:

if self.config.linkcheck_allowed_redirects:
msg = f'redirect {res_uri} - {redirection}'
logger.warning(msg, location=(result.docname, result.lineno))
else:
colour = turquoise if result.code == 307 else purple
msg = colour('redirect ') + res_uri + colour(f' - {redirection}')
logger.info(msg)

This behaviour was originally introduced in #9234 (#6525). The PR seems to have first sought to introduce warn_redirects, but then changed to the allowed_redirects idea instead.

cc @jayaddison, what do you think about this? My immediate reaction is that changing to fail on any redirects by default would be more annoying than helpful, but equally needing to specify a non-empty dict isn't helpful.

A simple 'fix' would be to allow setting linkcheck_allowed_redirects = {} to mean no redirects are allowed. We could also introduce a Boolean {fail,warn}_redirects option?

A

@jameslamb
Copy link
Author

Thanks very much for the quick, thoughtful, and informative response!

My immediate reaction is that changing to fail on any redirects by default would be more annoying than helpful

A simple 'fix' would be to allow setting linkcheck_allowed_redirects = {} to mean no redirects are allowed

From my perspective as a (new!) user of the linkcheck builder (thanks to @agriyakhetarpal teaching me about it in jameslamb/pydistcheck#293 (comment)), I'd be ok with a warning being unconditionally emitted for every redirect not matching a rule in linkcheck_allowed_redirects.

It wouldn't be that annoying to me:

  • those warnings would only be treated as errors if --fail-on-warning is passed, which is not the default (so I'm opting in to those errors)
  • I could achieve --fail-on-warning for all warnings EXCEPT redirects with a single entry like linkcheck_allowed_redirects = {r".*": r".*"} (no need to introduce more configuration options)

That said, I understand that changing this behavior at this stage could be disruptive for folks, and I know how stressful it is to have someone newly show up in your repo asking for breaking changes 😅 . Any mechanism that reserved a special value or provided new configuration options to achieve "warn on all redirects" would be an improvement over needing to provide a placeholder like {"abc": "def"}.

@jayaddison
Copy link
Contributor

Thanks both! A couple of questions I infer from this are:

  1. Should redirects be considered as warnings, by default?
  2. Regardless of the default, what is the linkcheck_allowed_redirects setting to allow all redirects?

If we can answer those, I think we can choose an appropriate default behaviour and config value.

My personal opinions are:

  1. No - redirects, although not ideal, are not necessarily an indication of a problem. In fact, they often indicate that a website has gone to the effort of maintaining functionality at existing URLs. Some projects might want to consider any kind of redirect as a warnable condition (e.g. to keep all hyperlinks direct-to-latest-source), but I agree with @AA-Turner that it's probably quite noisy to do so by default.
  2. Naively and intuitively I would expect both linkcheck_allowed_redirects = None and linkcheck_allowed_redirects = {} to mean "don't allow any redirects", based on reading the Python expression as if it were English language. However, the linkcheck_allowed_redirects = {r".*": r".*"} configuration -- although that would work, seems inelegant. So I'm wondering whether there'd be another (preferably intuitive) sentinel value we could use to allow wildcard/any redirects not to warn.

@jayaddison
Copy link
Contributor

Note / slightly off-topic: I do think including text output related to bugreports is generally preferable to providing screenshots; it can be more accessible, for example, and is easier to search.

@jameslamb
Copy link
Author

jameslamb commented Mar 20, 2025

I do think including text output related to bugreports is generally preferable to providing screenshots

I agree. Like I mentioned, here I included screenshots specifically because sphinx is using the color of output to convey some information. I did include what I thought were the relevant, likely-to-be-searched log lines in plaintext as comments in the code snippets.

Here are the full logs:

without linkcheck_allowed_redirects defined (click me)
Running Sphinx v8.2.3
loading translations [en]... done
making output directory... done
building [mo]: targets for 0 po files that are out of date
writing output... 
building [linkcheck]: targets for 1 source files that are out of date
updating environment: [new config] 1 added, 0 changed, 0 removed
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
copying assets... 
copying assets: done
writing output... [100%] index

(           index: line    9) ok        https://www.sphinx-doc.org/en/master/usage/restructuredtext/index.html
(           index: line   18) redirect  https://httpbin.org/redirect/1 - with Found to https://httpbin.org/get
build succeeded.

Look for any errors in the above output or in linkcheck_output/output.txt
with linkcheck_allowed_redirects set and non-empty (click me)
Running Sphinx v8.2.3
loading translations [en]... done
loading pickled environment... The configuration has changed (1 option: 'linkcheck_allowed_redirects')
done
building [mo]: targets for 0 po files that are out of date
writing output... 
building [linkcheck]: targets for 1 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
reading sources... 
looking for now-outdated files... none found
preparing documents... done
copying assets... 
copying assets: done
writing output... [100%] index

(           index: line    9) ok        https://www.sphinx-doc.org/en/master/usage/restructuredtext/index.html
(           index: line   18) /private/tmp/testproject/docs/index.rst:18: WARNING: redirect  https://httpbin.org/redirect/1 - with Found to https://httpbin.org/get
build finished with problems, 1 warning (with warnings treated as errors).

@jayaddison
Copy link
Contributor

Thanks @jameslamb!

@AA-Turner would it be outlandish to suggest using typing.Any (I know, not really a type...) as a sentinel value for linkcheck_allowed_redirects, meaning wildcard/any?

(from some very unscientific testing using Py3.13, typing.Any can be pickled and unpickled and identity-compares equally to a freshly-import'd Any using is after unpickling)

@AA-Turner
Copy link
Member

AA-Turner commented Mar 26, 2025

Sorry for the slight delay here. I've opened #13452 as a sketch solution. Notably, the only uses of linkcheck_allowed_redirects = {} on GitHub are intended to warn on all redirects. What do you think @jameslamb / @jayaddison?

A

@jameslamb
Copy link
Author

Thanks very much! I can't say much about the specific implementation because I'm not that familiar with Sphinx's internals, but I'd be very happy with this interface change.

Having to opt in to all-redirects-are-warnings by setting linkcheck_allowed_redirects = {} seems fair to me.

@jayaddison
Copy link
Contributor

@AA-Turner @jameslamb FYI: I'm likely to open a follow-up pull request for #13452, because we have encountered a regression for pydata-based projects (see #13462).

The logic I currently have in mind is:

  • linkcheck_allowed_redirects:{}: default warn and log redirects without following them
  • linkcheck_allowed_redirects:None: follow all redirects, do not warn about them
  • linkcheck_allowed_redirects:{...}: unmodified

I have a concern that although we are focusing on the edge cases (the empty dict and None cases above), maybe our documentation doesn't reflect reality when redirect restrictions are configured.

@jayaddison
Copy link
Contributor

A modification to this plan:

The logic I currently have in mind is:

  • linkcheck_allowed_redirects:{}: default warn and log redirects without following them

  • linkcheck_allowed_redirects:None: follow all redirects, do not warn about them

  • linkcheck_allowed_redirects:{...}: unmodified

Because we would prefer to disallow None as a value for linkcheck_allowed_redirects, I think the second case has to be modified to our our sentinel object value instead.

The ergonomics of importing _sentinel_lar to assign it as a value within conf.py aren't particularly elegant, and I wonder if we should instead provide some kind of public constant to declare that value (cc @AA-Turner). I was tempted to suggest making it the default so that users don't need to be aware of _sentinel_lar at all -- but I think that would be a larger breaking change-in-default-behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants