Moxy is a script to easily configure mitmproxy as a tool for app development and testing, in particular to insert mock responses, and/or modify existing responses and/or requests based on various criteria.
The MITM stands for man-in-the-middle, i.e., the tool goes in between the client and server, and can capture, replace, and/or modify requests and responses between them. The configuration is JSON, i.e., no coding is required to use this tool. Among other things, it can do:
- mocking responses for endpoints not yet implemented on the backend
- reproducing specific flows, including error responses; one-shot, multi-stage, state machine, cyclic, and randomly selected actions are supported
- matching specific responses from the real backend and modifying their contents (e.g., insert new data into an existing response)
- logging occurrences of specific requests or responses
- introducing random errors to simulate unreliable conditions
Install mitmproxy through pip
or Homebrew:
pip3 install mitmproxy
or
brew install mitmproxy
Run it once to generate the certificate:
mitmproxy
Press Q to quit.
There should now be a certificate in ~/.mitmproxy
. Add it as a trusted
root certificate on your development device. See the
documentation for adding certificates on different platforms.
For iOS simulators:
git clone https://github.com/ADVTOOLS/ADVTrustStore.git
python ADVTrustStore/iosCertTrustManager.py -a ~/.mitmproxy/mitmproxy-ca-cert.pem
On many devices you can also go to mitm.it
while connected through mitmproxy
and install the certificate from there.
You can run mitmproxy
as the interactive console, or mitmdump
that
just executes the script and logs things to the terminal. The script is
specified using the command-line argument -s moxy.py
, and the configuration
file for the script is set with --set mock=config.json
.
The proxy can operate in several different modes, which are detailed below. As a short guide:
- if you can configure the server URL in the app, reverse proxy mode is by far the simplest to set up and doesn't interfere with other traffic
- if you are running the app on an actual device (e.g., a phone), or on a virtual device that supports proxy configuration on a system level, use regular proxy mode
- otherwise you may need to set up transparent proxy mode
Reverse proxy mode means that the client connects to the address of the machine running the proxy, which then forwards the requests to a specified server. This means that the client must be modified to use the proxy address/URL, but otherwise this is easy to set up without affecting other network traffic on the device.
Reverse proxy mode can be run as follows:
mitmdump -s moxy.py --set mock=config.json -m reverse:https://api.server.com
or
./moxyserver config.json https://api.server.com
Here config.json
is the name of your configuration file (see below), and
https://api.server.com
is the real backend's address. The client is then
configured to use https://localhost:8080/
(or your computer's IP address
if not running on the same device) as the server.
Regular and transparent proxy modes allow passing all traffic through the proxy without changing the original URLs, i.e., the client app does not need to be reconfigured (and indeed, you don't even need its source code).
The default, regular proxy, mode requires proxy support and configuration in the app or OS, and typically this means that all HTTP requests will go through the proxy, not just the ones we are interested in.
mitmproxy -s moxy.py --set mock=config.json
or
./moxy config.json
As with reverse proxy mode, config.json
is your configuration file. There
is no server address specified, since it is sent by the client when operating
through a proxy.
The transparent proxy mode requires setting up the device or a router to redirect its HTTP/HTTPS traffic through the proxy. The benefit of this is that it doesn't need any specific proxy configuration (or even support) on the client. Also, in contrast to regular proxy mode, the redirection can be done selectively (e.g., by IP address), using packet filter / firewall rules, so that not all traffic is redirected. However, setting it up is fairly complicated.
The transparent proxy mode is run as follows (but other configuration is required for traffic to be redirected into the proxy):
mitmproxy -s moxy.py --set mock=config.json -m transparent --showhost
Some clients may require a specific certificate instead of any trusted certificate for that domain – this is known as certificate pinning – which you must somehow overcome. This tool is meant for development, so it is assumed here that as the developer you are either able to disable certificate pinning in your own app, or whitelist the certificate used by mitmproxy.
The configuration of this tool is done in JSON, which contains handlers for requests and/or responses by path/endpoint. The JSON file is provided as an argument when starting the proxy. It is automatically reloaded when its modification time changes, i.e., you can edit it while the proxy is running without interrupting operations.
The configuration file is a JSON file containing a single dictionary (map)
object. The two main top-level keys are request
and response
, which
contain the path-specific handlers for requests and responses,
respectively, e.g.:
{
"request":{
"/file":{
"respond":{
"content": "./config/example.json"
}
},
"*":{
"host": "api.server.com"
}
},
"response":{
"/nonexistent/path/foo":{
"status": 404,
"replace":{
"status": 200,
"content": "./config/example.json"
}
},
"*":{
"host": [
".server.com",
"foo.otherserver.com"
]
}
}
}
The included example configuration contains many more examples of things that are possible with this tool.
Both request
and response
dictionaries have paths as keys. The path
keys can be any mix of the following types:
- exact string match without query or fragments, e.g.
/
,/foo/bar
- a regular expression match denoted by a tilde prefix
~
(which not part of the expression itself), evaluated in order and only in case there is no exact match, and including query and fragments, e.g.,~html$
,~^/v1/
- the all-cases mix-in
*
, which is applied as the base configuration for that section (i.e., its contents are added to the any other handler unless explicitly overridden therein)
The path handler values are either:
- a dictionary containing matching and action clauses, or
- an array of such dictionaries
In case the path handler is an array, its elements are evaluated in order until the first match for the request or response is found. In particular, note that even if there are multiple matching handlers with different actions, only the first match is used.
A path handler dictionary may contain a mix of keys for further matching
(e.g., host
matches that path only a specific set of hosts) and for
actions to take (e.g., pass
will pass the request without further
action).
The typical use of the *
path handler is to globally specify the host
or hosts to be matched, such as in transparent proxy mode where the same
paths might exist on multiple servers. The *
is not considered a match
by itself, i.e., if it is the only match, the path is not handled at all.
For example, the following would return the status 418 only for the path
/foo
(which would get it from the *
since it is not overridden by
respond
in /foo
):
"request":{
"/foo":{ },
"*":{ "respond":{ "status": 418 } }
}
However, the match-all regular expression ~
can be used to force all
otherwise unmatched requests or responses to be processed. The following
would return the status 418 for all paths except /foo
which, as an
exact match, prevents regular expression paths (including ~
) from being
evaluated for that path:
"request":{
"/foo":{ },
"~":{ "respond":{ "status": 418 } }
}
It is possible to define both *
and ~
, in which case the contents of
*
apply to ~
unless overridden therein.
A request handler is triggered when the client makes a request. This occurs before anything is sent to the server, so a request handler is good for:
- adding new endpoints that don't exist on backend
- replacing existing endpoints without triggering backend logic
- simulating error responses
- reproducing specific flows regardless of backend state
- redirecting requests
A response handler is triggered when the remote server responds to a request. This occurs after the request has already been processed by the backend, so a request handler is good for:
- matching based on specific data returned by backend
- matching based on HTTP status code (e.g., intercepting errors)
- merging additional data into the backend response
- replacing or deleting specific parts of backend data
In short, if you can determine the desired action from the client request alone and don't need the real backend to process the request, use a request handler. If you need the backend response (status code and/or data), use a response handler. If you need to modify the outgoing request but also need the backend response, use both.
For both request and response handlers, the following keys are available for matching:
scheme
– the URL scheme (http
orhttps
)host
– the server hostname (e.g.,api.server.com
)method
– the HTTP method (e.g.,GET
)path
– the path string, including query and fragments (e.g.,/search?q=foo
)query
– the query parsed into a dictionary (e.g.,{ "q": "foo" }
)request
– the request body contentheaders
– the HTTP headers as a dictionary (e.g.,{ "User-Agent": "…" }
)require
– required values for global variables, which can be set by the theset
action (e.g.,{ "myflag": "1" }
)
For response handlers, the following additional keys are available:
status
– the HTTP status code, e.g., 200content
– the content sent by the servererror
– true/false according to the status code (400+ is an error)
Note that headers
dictionary for response handlers is actually a combination
of the request and response headers, with response headers taking priority in
case of overlap. So the following would be valid for a
response handler headers
condition, with Content-Type
referring to the
response type:
"headers":{
"User-Agent": "~Apple",
"Content-Type": "~^text/html"
}
Matching may be done either as single value or an array of such values. In case of an array, it suffices that any element matches, for example the following matches any of the three hostnames:
"*":{
"host":[
"api.server.com",
".beta.server.com",
"api.staging.com"
]
}
The matching of objects, i.e., content
and query
, is done as a subset
match. For example, the following matches any JSON object returned by the
server where the key foo
has the value bar
and the array arr
contains
at least one element that has the key id
with value 1
, regardless of
any other keys and elements in any object:
"content":{
"foo": "bar",
"arr":[ { "id": 1 } ]
}
To require the presence of a key regardless of its value, the regular
expression string ~
may be used, e.g., { "foo": "~" }
will match if
the key foo
is present, no matter the type or contents of its value.
Note that keys must be exact matches, and do not support regular expression strings.
For host
in particular, beginning the hostname with a dot .
causes it
to match any hostname that ends with the part following the dot, e.g.,
.server.com
matches all of api.server.com
, foo.bar.server.com
, and
server.com
. Likewise ending the hostname with a dot matches any hostname
that has a matching prefix, e.g., api.
matches both api.server.com
and
api.foo.com
(but not dev.api.server.com
).
Most strings in the configuration can also be used as regular expression by
beginning them with a tilde ~
, e.g., ~^foo(|ba[rz])$
matches foo
,
foobar
and foobaz
.
Matching is allowed anywhere in the string, so use the special characters
^
and $
to denote the beginning and end, respectively.
An empty regular expression, expressed by only the plain tilde ~
, matches
everything. In content
matching this is useful to match a dictionary key
for existence regardless of its value.
Paths as keys to request
and response
can also be regular expressions.
If there is no exact match for a given path, then the regular expression
paths of that section (request
or response
) will be evaluated in the
order they appear in the JSON file. Beware that not all tools will preserve
the order of JSON dictionary keys, so the JSON file should be edited
manually if the order of evaluation matters.
Another way to force a deterministic order that survives JSON
transformations is to specify an array of cases for the match-all ~
path, and test the path again inside each element:
"~":[
{
"path": "~^/v1/pages/"
},
{
"path": "~^/v1/"
}
]
A number of "stateful" path handlers are available:
once
– the contents are evaluated only once for that pathcount
– a dictionary of handlers with specific counts as strings,even
,odd
,~
, and/or*
as keys, whereby all matching handlers are merged together such that the more specific ones take precedence (the id for each count is normally the path, but thecount
dictionary may contain anid
key to override this)cycle
– an array of handlers that are cycled through in sequence, wrapping around (the id for each cycle is normally the path, butcycle-id
can be specified alongsidecycle
in case there are multiple alternate cycles for the same path, or the same cycle is to be used with multiple paths)random
– an array of handlers from which one is chosen at random each time it is evaluatedstate
– a dictionary containing the keyvariable
with the name of the variable (settable with theset
action) as value, and handlers for different cases with the value of that variable as key
These handlers allow simulating flows of responses on the same endpoint, e.g., a sequence of events, a one-time or randomly occurring error, etc. Note that stateful handlers can not add any further matching, because they are evaluated only after a match has already been found.
For example, the following request handler passes any request through to the remote server three out of four times at random, but produces a specific error code with a one in four probability:
"~":{
"random":[
{ "pass": true }, { "pass": true }, { "pass": true },
{
"respond": {
"status": 500,
"content": "<h1>500 - Random Error</h1>"
}
}
]
}
These handlers can be nested arbitrarily. Note that if you are have multiple
alternate count
or multiple cycle
handlers for the same path, they will
default to sharing the count or index, since it defaults to being identified
by the path. Meanwhile, for regular expression path handlers use the actual
path as the default identifier, and will not by default share the count/index
across different paths matching the same regular expression. You can specify
id
inside the count
dictionary, or cycle-id
alongside cycle
, to
override this.
There is a global dictionary of variables that may be set with the set
action, tested for with the require
condition, and handled case-by-case
with state
. All variables default to the empty string.
For example, a pair of /get
and /set
request handlers:
"/set":[
{
"query": { "flag": "1" },
"respond": "<body><h1>Flag set</h1></body>",
"set":{ "myflag": "1" }
},
{
"query": { "flag": "0" },
"respond": "<body><h1>Flag cleared</h1></body>",
"set":{ "myflag": "0" }
},
{
"query": { "flag": "toggle" },
"require": { "myflag": "1" },
"set":{ "myflag": "0" },
"respond": "<body><h1>Flag cleared (toggle)</h1></body>"
},
{
"query": { "flag": "toggle" },
"set":{ "myflag": "1" },
"respond": "<body><h1>Flag set (toggle)</h1></body>"
},
{
"respond":{
"status": 400,
"content": "<body><h1>400</h1><p>Usage: <code>/set?flag=(0|1|toggle)</code></p></body>"
}
}
],
"/get":{
"state":{
"variable": "myflag",
"1":{ "respond": "<body><h1>The flag is set</h1></body>" },
"0":{ "respond": "<body><h1>The flag is cleared</h1></body>" },
"":{ "respond": "<body><h1>The flag is undefined</h1></body>" }
}
}
The following actions are available in both response and request handlers:
set
– a dictionary, sets the variables given as keys to the given values, which can be tested for in therequire
matcher andstate
handler (note thatset
takes effect immediately after matching, even ifpass
causes other handlers to be skipped)pass
– skips any further actions and passes the request or response throughlog
– logs the contents of the request or responseterminate
– terminates the entire moxy/mitmproxy process when matched (e.g., for ending automated tests gracefully)
The above actions are handled in the order listed here (i.e., pass
will
cause log
and/or terminate
to not take effect).
The following actions are available only in request handlers:
respond
– respond with the specified response (status
,content
,type
,headers
) instead of requesting it from the remote servermodify
– a modifier dictionary containing replacement keys for the request (the same values are allowed as in request matching)
The following actions are available only in response handlers:
replace
– replaces the response with the contents of the replace dictionary (similar to therespond
dictionary of request handlers)modify
– a modifier dictionary or an array of modifiers, applied in order (see below)
Request handlers can be used to mock responses without ever going through the remote server. This allows simulating events and endpoints that do do not exist or would be hard to reproduce on backend.
The respond
key on a request handler causes a response to be sent, and
likewise the replace
key on a response handler causes the original
response to be replaced with the specified one.
The keys for constructing a response are:
status
– the HTTP status code (defaults to 200 for request handlers and to the original code for response handlers)content
– the content either as raw string, JSON object, or a string containing a local filenametype
– a shortcut for theContent-Type
header (in request handlers often inferred automatically, e.g.,application/json
for JSON objects and files with the extension.json
)headers
– a dictionary of headers
If the respond
or replace
is in itself a string, that string is
interpreted as though it were the value of content
inside a dictionary.
Examples of request handlers:
{
"request":{
"/string":{
"respond":{
"type": "text/html",
"content": "<body><h1>HTML</h1></body>"
}
},
"/file":{
"respond":{
"content": "./config/example.json"
}
},
"/object":{
"respond":{
"content": {
"embedded":{
"json": [ "object" ]
}
}
}
}
}
}
It is possible to modify the request or response content by using a replace
or modify
key in a response handler, or a content
key inside a request
modify
dictionary. The format of the latter is the same as the modify
for response
.
To replace a response entirely, specify the new response inside
the replace
dictionary.
To modify content, the value of modify
(response) or modify["content"]
(request) is either a dictionary or an array of such dictionaries,
processed in order, with the following keys each:
replace
– perform selective replacement of content (in contrast to the aforementioned top levelreplace
which replaces the entire response)delete
– selectively delete contentmerge
– merge (add/insert/override) content
The modify
replace
can be of the following formats:
- a string containing a local file name, in which case the file is read as JSON, then processed as below
- a dictionary: the response content is interpreted as a dictionary,
and merged with
replace
non-recursively such that any colliding keys are taken fromreplace
- a string of the format
/re/sub
where/
is an arbitrary separator character,re
is a regular expression, andsub
is the substitute used for every occurrence ofre
in the content string (note that this can break JSON format) - an array with two strings as elements: the first string is treated as a regular expression, and all occurrences of it in the content are substituted by the second string
The regular expression substitution strings of the last two cases may contain
group references, e.g., |ba[rz]|Ba\\1
would uppercase the b
in both
bar
and baz
, but not in bat
.
Note that response handlers can have a top-level replace
, i.e., not nested
inside modify
, which is different in that it replaces the response entirely.
The modify
delete
can be any nested JSON object. Any matching
key, element, or value is deleted from content. The match need not be
exact, i.e., as for content matching, it suffices to be a subset of
content
. The empty dictionary {}
can be used to match any value,
e.g., the following deletes the key foo
regardless of its value:
"delete":{
"foo": {}
}
The modify
merge
can be any nested JSON object. It is merged
recursively with the content such that any new keys and values are
inserted and any matching existing keys are replaced on the
innermost level of nesting.
If any dictionary or array member value nested inside merge
is a string
beginning with a dot .
and ending with .json
, the corresponding file is
loaded and substituted for the string before merging.
For example:
"modify":[
"|old regex|REPLACEMENT STRING",
{
"merge":{
"some_new_key": "inserted item",
"some_existing_key": "replaced item",
"array":[
{ "id": 42, "note": "added element" }
]
}
},
{
"delete":{
"some_existing_key": {},
"array":[ { "id": 1 } ],
"array_to_delete_entirely": {}
}
},
{
"replace":{
"other_array": [
{ "note": "only element in array after replacing" }
]
}
}
]
As a special case, if an existing value in the content is an array,
it is possible to target specific elements of the array by using a
dictionary with a where
object in place of the array in merge
.
The where
dictionary may contain the following keys:
where
– an object that is matched against elements of the array; any modifications are done only on matching elementsreplace
orcontent
– a replacement object that overwrites any elements in the array matching thewhere
objectmerge
– an object that is merged (only) with elements in the array matching thewhere
objectinsert
– a string, eitherbefore
orafter
, which causes the resulting element to be inserted before or after the matched element, rather than replacing it (typically paired withcontent
for the new object)move
– a string, eitherhead
ortail
, which causes any elements matchingwhere
to be moved to the head (beginning) or the tail (end) of the array, respectivelyforall
– a boolean, iftrue
(the default), the modifications are applied to all elements matchingwhere
, otherwise only to the first matchnegated
– a boolean, iftrue
causes thewhere
condition to be negated, i.e., all of the above applies to elements not matchingwhere
delete
– a boolean, iftrue
causes any elements matchingwhere
to be deleted from the array (this is largely superfluous due to thedelete
operation, but when combined withforall
and/ornegated
, it adds new possibilities)
For example:
"modify":[
{
"merge":{
"array":{
"where": { "id": 1 },
"merge": { "note": "modified element" }
}
}
},
{
"merge":{
"array":{
"where": { "id": 2 },
"replace": { "id": 2, "note": "replaced element" },
"move": "tail"
}
}
},
{
"merge":{
"array":{
"where":{ "id": 42 },
"move": "head"
}
}
}
]
Sometimes merge
is desired to combine values on the top level(s) of a
dictionary, but a nested dictionary or array should still be replaced entirely.
While this can be done as a multi-step modification (first delete the old
values and then merge with the new), there is a shorthand available:
if a value anywhere inside a merge
is a dictionary containing only the key
replace_with
, the value corresponding to the dictionary position is
replaced entirely with the value of that key.
For example:
"modify":{
"merge":{
"top_level": {
"nested_array": {
"replace_with": {
"new_value": [ "new_array" ]
}
}
}
}
}
Similarly to replace_with
, a merge
dictionary may contain only the key
replace_in
, in which case the value of that key is treated as a regular
expression substitution similarly to replace
and applied to the corresponding
value in the content.
For example:
"modify":{
"merge":{
"top_level": {
"string_key": {
"replace_in": [ "$", " text added to the end" ]
}
}
}
}
Similarly to replace
, the replace_in
replacement treats the value as an
string, so when applied to JSON dictionary or array values, it is possible to
produce invalid JSON as the output.
- Kimmo Kulovesi – original design and implementation
- Scott Lyttle – the name "Moxy"
- Max Kovalev – fix for obsolete
wrap
function call