Provide default User-Agent in SimpleAsyncHTTPClient #2702
Comments
What should the default user agent added here be? Is "Mozilla/5.0 (compatible; tornado)" enough?
I think that is reasonable. More common for non-browsers, though, is something like "Tornado/6.0" (the curl command-line tool uses "curl/7.67.0", and python3 urllib.request.urlopen uses "Python-urllib/3.7").
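To see the "product/version" convention mentioned above in practice, here is a small sketch using only the standard library. It reads the default header that urllib.request attaches to an opener; the variable names are illustrative, and the exact version suffix depends on the running interpreter.

```python
import sys
import urllib.request

# build_opener() returns an OpenerDirector whose addheaders list
# carries the library's default User-agent header.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# The value has the form "Python-urllib/<major>.<minor>".
print(default_headers["User-agent"])
```

A default like "Tornado/6.0" would follow the same name-slash-version shape.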
The User-Agent format is "Tornado/{Tornado_Version}". If self.request.user_agent isn't set and self.request.headers has no User-Agent in its keys, the default User-Agent is added. Fixes: tornadoweb#2702
How about that, @ploxiln?
Sorry, I should have pointed to #2749, where there is already commentary from the maintainer on this (but no response from the author; maybe your PR can replace it).
PR #2749 will break the build because it doesn't check for headers that were previously set. This PR fixes that.
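The rule discussed in this thread (leave a previously set User-Agent header alone, otherwise add a default) can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name, the TORNADO_VERSION constant, and the plain dict standing in for the request headers are all hypothetical, and the choice to let an explicit user_agent override an existing header is an assumption.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the library's version string.
TORNADO_VERSION = "6.0"


def apply_default_user_agent(
    headers: Dict[str, str],
    user_agent: Optional[str] = None,
    version: str = TORNADO_VERSION,
) -> Dict[str, str]:
    """Return a copy of headers with a User-Agent chosen by the rule above."""
    result = dict(headers)
    # Header names are case-insensitive, so compare lowercased keys.
    existing = [name for name in result if name.lower() == "user-agent"]
    if user_agent:
        # Assumption: an explicit user_agent setting wins over any header.
        for name in existing:
            del result[name]
        result["User-Agent"] = user_agent
    elif not existing:
        # Nothing was provided, so fall back to "Tornado/<version>".
        result["User-Agent"] = "Tornado/{}".format(version)
    return result


print(apply_default_user_agent({}))
print(apply_default_user_agent({"user-agent": "custom/1.0"}))
print(apply_default_user_agent({}, user_agent="mybot/2.1"))
```

Checking the existing headers before adding the default is the step the comment above says was missing from #2749.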
From https://tools.ietf.org/html/rfc7231:
CurlAsyncHTTPClient provides default User-Agent string:
tornado/tornado/curl_httpclient.py
Lines 380 to 383 in c92b883
While SimpleAsyncHTTPClient writes User-Agent only when it's provided by the user:
tornado/tornado/simple_httpclient.py
Lines 393 to 396 in c92b883
Could you consider adding default User-Agent string in SimpleAsyncHTTPClient?