Allow users to manipulate request/response data before dumping #263

Open: wants to merge 1 commit into base: master
2 changes: 2 additions & 0 deletions AUTHORS.rst
@@ -53,3 +53,5 @@ Patches and Suggestions
- Ryan Ashley <rashley-iqt>

- Sam Bull (@greatestape)

- Morten Lied Johansen <mortenjo@ifi.uio.no>
27 changes: 27 additions & 0 deletions docs/dumputils.rst
@@ -10,8 +10,35 @@ you may need to look to find it. In :mod:`requests_toolbelt.utils.dump` there
are two functions that will return a :class:`bytearray` with the information
retrieved from a response object.

Sanitizing information before dumping
-------------------------------------

When debugging, it is often useful to dump the request or response to a
debug log, where it can be inspected. However, the request or response may
contain sensitive data that should not be stored in a log file on disk.

To solve this, you can supply a :class:`Sanitizer`, which can manipulate the
body or the headers before they are dumped. By default, nothing is sanitized.
For convenience, :class:`HeaderSanitizer` is provided; it redacts the values
of headers that are commonly considered sensitive (see
:attr:`HeaderSanitizer.SENSITIVE_HEADERS`).

You can implement any sanitizing you need by subclassing :class:`Sanitizer`
and passing an instance to :func:`dump_all` or :func:`dump_response`.
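
As an illustration of the subclassing approach described above, here is a minimal, self-contained sketch. It does not import the toolbelt: the ``Sanitizer`` base below is a stand-in that only reproduces the four hooks the dump functions call in this change, and ``JsonFieldSanitizer`` is a hypothetical subclass, not part of the patch. It redacts selected keys in JSON bodies and passes anything else through unchanged.

```python
import json


class Sanitizer(object):
    """Stand-in for the base class proposed in this change (assumption)."""

    def request_headers(self, headers):
        return headers

    def request_body(self, body):
        return body

    def response_headers(self, headers):
        return headers

    def response_body(self, body):
        return body


class JsonFieldSanitizer(Sanitizer):
    """Hypothetical subclass: redact selected keys in JSON bodies."""

    REDACTED = "********************"

    def __init__(self, fields=("password",)):
        self.fields = set(fields)

    def _redact(self, body):
        try:
            payload = json.loads(body.decode("utf-8"))
        except (UnicodeDecodeError, ValueError):
            # Not JSON: leave the body untouched rather than guess.
            return body
        if not isinstance(payload, dict):
            return body
        for key in payload:
            if key.lower() in self.fields:
                payload[key] = self.REDACTED
        return json.dumps(payload, sort_keys=True).encode("utf-8")

    def request_body(self, body):
        return self._redact(body)

    def response_body(self, body):
        return self._redact(body)
```

With the change applied, an instance of such a class would be passed as the ``sanitizer`` argument of :func:`dump_all` or :func:`dump_response`.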

Public members
--------------

.. autofunction::
    requests_toolbelt.utils.dump.dump_all

.. autofunction::
    requests_toolbelt.utils.dump.dump_response

.. autoclass::
    requests_toolbelt.utils.dump.Sanitizer
    :members:

.. autoclass::
    requests_toolbelt.utils.dump.HeaderSanitizer
178 changes: 164 additions & 14 deletions requests_toolbelt/utils/dump.py
@@ -4,7 +4,7 @@
from requests import compat


__all__ = ('dump_response', 'dump_all')
__all__ = ['dump_response', 'dump_all', 'Sanitizer', 'NoopSanitizer',
'HeaderSanitizer']

HTTP_VERSIONS = {
9: b'0.9',
@@ -16,6 +16,123 @@
['request', 'response'])


class Sanitizer(object):
    CLEANSED_SUBSTITUTE = "********************"

    def _sanitize_headers(self, headers):
        sanitized_headers = headers.copy()
        for name in headers:
            if self.should_sanitize_header(name):
                sanitized_headers[name] = self.CLEANSED_SUBSTITUTE
            if self.should_strip_header(name):
                del sanitized_headers[name]
        return sanitized_headers

    def request_headers(self, headers):
        """Sanitize the request headers

        :param headers: The request headers
        :type headers: :class:`requests.structures.CaseInsensitiveDict`
        :return: A new headers object
        :rtype: :class:`requests.structures.CaseInsensitiveDict`
        """
        return self._sanitize_headers(headers)

    def request_body(self, body):
        """Sanitize a request body

        :param body: The body of the request
        :type body: `bytes`
        :return: The value to dump for the body
        :rtype: `bytes`
        """
        raise NotImplementedError

    def response_headers(self, headers):
        """Sanitize the response headers

        The headers passed in are not modified; values are redacted or
        removed in a copy.

        :param headers: The response headers (``response.raw.headers``)
        :return: A new headers object of the same type
        """
        return self._sanitize_headers(headers)

    def response_body(self, body):
        """Sanitize a response body

        :param body: The body of the response
        :type body: `bytes`
        :return: The value to dump for the body
        :rtype: `bytes`
        """
        raise NotImplementedError

    def should_sanitize_header(self, name):
        """Return True if the value of header ``name`` should be redacted"""
        raise NotImplementedError

    def should_strip_header(self, name):
        """Return True if header ``name`` should be removed from the dump"""
        raise NotImplementedError


class NoopSanitizer(Sanitizer):
    """Performs no sanitization"""

    def should_sanitize_header(self, name):
        return False

    def should_strip_header(self, name):
        return False

    def request_body(self, body):
        return body

    def response_body(self, body):
        return body


class HeaderSanitizer(NoopSanitizer):
    """Redact the values of headers considered sensitive

    This will check all headers in both request and response against a set of
    sensitive headers (see :attr:`HeaderSanitizer.SENSITIVE_HEADERS`), and
    redact the values to protect sensitive data.

    """

    # List of sensitive headers copied from:
    # https://github.com/google/har-sanitizer
    SENSITIVE_HEADERS = {
        "state",
        "shdf",
        "usg",
        "password",
        "email",
        "code",
        "code-verifier",
        "client-secret",
        "client-id",
        "token",
        "access-token",
        "authenticity-token",
        "id-token",
        "appid",
        "challenge",
        "facetid",
        "assertion",
        "fcparams",
        "serverdata",
        "authorization",
        "auth",
        "x-client-data",
        "samlrequest",
        "samlresponse"
    }

    def should_sanitize_header(self, name):
        return name.lower().replace('_', '-') in self.SENSITIVE_HEADERS
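
The name check in `should_sanitize_header` can be tried in isolation. The sketch below is a rough, self-contained illustration: `is_sensitive` is a hypothetical helper, and the set holds only three of the names from `SENSITIVE_HEADERS`. It shows that matching ignores case, treats `_` and `-` as equivalent, and is an exact match on the whole header name.

```python
# Subset of the names listed in SENSITIVE_HEADERS, for illustration only.
SENSITIVE_HEADERS = {"authorization", "access-token", "client-secret"}


def is_sensitive(name):
    """Mirror the normalization used by HeaderSanitizer.should_sanitize_header."""
    return name.lower().replace("_", "-") in SENSITIVE_HEADERS


checks = {
    "Authorization": is_sensitive("Authorization"),
    "Access_Token": is_sensitive("Access_Token"),
    "X-Access-Token": is_sensitive("X-Access-Token"),
    "Content-Type": is_sensitive("Content-Type"),
}
```

Because the match is exact, a header such as `X-Access-Token` is not redacted, and neither is `Cookie`, which does not appear in the list above. Covering such headers would take a subclass with a different `should_sanitize_header`.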


class PrefixSettings(_PrefixSettings):
def __new__(cls, request, response):
request = _coerce_to_bytes(request)
@@ -54,9 +171,12 @@ def _build_request_path(url, proxy_info):
return request_path, uri


def _dump_request_data(request, prefixes, bytearr, proxy_info=None):
def _dump_request_data(request, prefixes, bytearr, proxy_info=None,
sanitizer=None):
if proxy_info is None:
proxy_info = {}
if sanitizer is None:
sanitizer = NoopSanitizer()

prefix = prefixes.request
method = _coerce_to_bytes(proxy_info.pop('method', request.method))
@@ -70,21 +190,27 @@ def _dump_request_data(request, prefixes, bytearr, proxy_info=None):
host_header = _coerce_to_bytes(headers.pop('Host', uri.netloc))
bytearr.extend(prefix + b'Host: ' + host_header + b'\r\n')

for name, value in headers.items():
sanitized_headers = sanitizer.request_headers(headers)
for name, value in sanitized_headers.items():
bytearr.extend(prefix + _format_header(name, value))

bytearr.extend(prefix + b'\r\n')
if request.body:
if isinstance(request.body, compat.basestring):
bytearr.extend(prefix + _coerce_to_bytes(request.body))
body = _coerce_to_bytes(request.body)
body = sanitizer.request_body(body)
bytearr.extend(prefix + body)
else:
# In the event that the body is a file-like object, let's not try
# to read everything into memory.
bytearr.extend(b'<< Request body is not a string-like type >>')
bytearr.extend(b'\r\n')


def _dump_response_data(response, prefixes, bytearr):
def _dump_response_data(response, prefixes, bytearr, sanitizer=None):
if sanitizer is None:
sanitizer = NoopSanitizer()

prefix = prefixes.response
# Let's interact almost entirely with urllib3's response
raw = response.raw
@@ -97,14 +223,15 @@ def _dump_response_data(response, prefixes, bytearr):
str(raw.status).encode('ascii') + b' ' +
_coerce_to_bytes(response.reason) + b'\r\n')

headers = raw.headers
for name in headers.keys():
for value in headers.getlist(name):
sanitized_headers = sanitizer.response_headers(raw.headers)
for name in sanitized_headers.keys():
for value in sanitized_headers.getlist(name):
bytearr.extend(prefix + _format_header(name, value))

bytearr.extend(prefix + b'\r\n')

bytearr.extend(response.content)
body = sanitizer.response_body(response.content)
bytearr.extend(body)


def _coerce_to_bytes(data):
@@ -115,12 +242,17 @@ def _coerce_to_bytes(data):


def dump_response(response, request_prefix=b'< ', response_prefix=b'> ',
data_array=None):
data_array=None, sanitizer=None):
"""Dump a single request-response cycle's information.

This will take a response object and dump only the data that requests can
see for that single request-response cycle.

If the optional ``sanitizer`` parameter is used, it should be an object that
implements the same interface as :class:`Sanitizer`. One possible
implementation is :class:`HeaderSanitizer`, which will redact sensitive
headers.

Example::

import requests
@@ -142,29 +274,40 @@ def dump_response(response, request_prefix=b'< ', response_prefix=b'> ',
:param data_array: (*optional*)
Bytearray to which we append the request-response cycle data
:type data_array: :class:`bytearray`
:param sanitizer: (*optional*)
Object used to sanitize headers and bodies before they are dumped.
Defaults to :class:`NoopSanitizer`, which changes nothing.
:type sanitizer: :class:`Sanitizer`
:returns: Formatted bytes of request and response information.
:rtype: :class:`bytearray`
"""
data = data_array if data_array is not None else bytearray()
prefixes = PrefixSettings(request_prefix, response_prefix)
if sanitizer is None:
sanitizer = NoopSanitizer()

if not hasattr(response, 'request'):
raise ValueError('Response has no associated request')

proxy_info = _get_proxy_information(response)
_dump_request_data(response.request, prefixes, data,
proxy_info=proxy_info)
_dump_response_data(response, prefixes, data)
proxy_info=proxy_info, sanitizer=sanitizer)
_dump_response_data(response, prefixes, data, sanitizer)
return data


def dump_all(response, request_prefix=b'< ', response_prefix=b'> '):
def dump_all(response, request_prefix=b'< ', response_prefix=b'> ',
sanitizer=None):
"""Dump all requests and responses including redirects.

This takes the response returned by requests and will dump all
request-response pairs in the redirect history in order followed by the
final request-response.

If the optional ``sanitizer`` parameter is used, it should be an object that
implements the same interface as :class:`Sanitizer`. One possible
implementation is :class:`HeaderSanitizer`, which will redact sensitive
headers.

Example::

import requests
@@ -183,15 +326,22 @@ def dump_all(response, request_prefix=b'< ', response_prefix=b'> '):
:param response_prefix: (*optional*)
Bytes to prefix each line of the response data
:type response_prefix: :class:`bytes`
:param sanitizer: (*optional*)
Object used to sanitize headers and bodies before they are dumped.
Defaults to :class:`NoopSanitizer`, which changes nothing.
:type sanitizer: :class:`Sanitizer`
:returns: Formatted bytes of request and response information.
:rtype: :class:`bytearray`
"""
if sanitizer is None:
sanitizer = NoopSanitizer()

data = bytearray()

history = list(response.history[:])
history.append(response)

for response in history:
dump_response(response, request_prefix, response_prefix, data)
dump_response(response, request_prefix, response_prefix, data,
sanitizer)

return data
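
The copy-then-edit behaviour of `_sanitize_headers` in this change can be sketched on its own. The function below is a hypothetical, standalone restatement that uses a plain `dict` and two predicate callables in place of the `should_sanitize_header` and `should_strip_header` methods; it is meant only to show that the caller's headers are left untouched.

```python
def sanitize_headers(headers, should_redact, should_strip,
                     substitute="********************"):
    """Return a sanitized copy of ``headers`` (hypothetical helper)."""
    sanitized = headers.copy()
    # Iterate over the original so deleting from the copy is safe.
    for name in headers:
        if should_redact(name):
            sanitized[name] = substitute
        if should_strip(name):
            del sanitized[name]
    return sanitized


original = {"Authorization": "Bearer abc", "X-Debug": "1", "Accept": "*/*"}
cleaned = sanitize_headers(
    original,
    should_redact=lambda name: name == "Authorization",
    should_strip=lambda name: name == "X-Debug",
)
```

Here `cleaned` carries the redacted `Authorization` value and no `X-Debug` entry, while `original` keeps all three headers with their real values, so the request and response objects being dumped are not altered.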