Arbitrary WebAPI JS instrumentation #642

Merged
merged 113 commits into from
Jul 8, 2020
113 commits
cd40e32
Add mdn-browser-compat-data
birdsarah May 9, 2020
a2531b2
js_instrument_modules as list
birdsarah May 9, 2020
b7403d0
Add mdn-browser-compat
birdsarah May 9, 2020
d211a1c
Pass a list of instrumentingFunctions
birdsarah May 9, 2020
7db500b
Script to generate api data
birdsarah May 9, 2020
5046246
Working give or take
birdsarah May 9, 2020
2a97262
Small naming cleanup
birdsarah May 12, 2020
946ec07
Handle non-configurable properties
birdsarah May 12, 2020
7fd8842
Lint
birdsarah May 12, 2020
ba6ab10
Add aspirational API
birdsarah May 13, 2020
eba6c96
Merge branch 'master' into js-instrumentation
birdsarah May 18, 2020
5276454
Begin migration to new JSInstrumentationRequest interface.
birdsarah May 18, 2020
3d72db9
Continue making progress
birdsarah May 18, 2020
a9f3d93
Begin implementing jsModuleRequest validation.
birdsarah May 19, 2020
9139ebe
Big cleanout- js-instrumentation work moving to python.
birdsarah May 19, 2020
ce3b80e
Continue update to python js-instrumentation
birdsarah May 19, 2020
cdd2f0d
Lint
birdsarah May 19, 2020
de83eac
noqa on wip jsinstrumentation file
birdsarah May 19, 2020
82720a2
Begin updating existing js instrument tests.
birdsarah May 19, 2020
d9266c6
Merge branch 'master' into js-instrumentation
birdsarah May 19, 2020
0226f8d
Merge branch 'issue-443' into js-instrumentation
birdsarah May 19, 2020
68f8077
Small cleanups
birdsarah May 19, 2020
bae565b
Fix naming in calling instrumentJS
birdsarah May 19, 2020
7557bc2
No display mode native for testing
birdsarah May 19, 2020
9d8ca65
Restore py test file to orig.
birdsarah May 19, 2020
8e40b55
Support null propertiesToInstrument
birdsarah May 19, 2020
6559a4d
Re-work instrumentObject tests
birdsarah May 19, 2020
0c0ca21
Clean-up text in test page.
birdsarah May 19, 2020
ab6d3ce
Add default to getLogSettings function
birdsarah May 19, 2020
87ccdab
Don't re-assign logSettings.propertiesToInstrument
birdsarah May 19, 2020
dc014f7
Revert "Don't re-assign logSettings.propertiesToInstrument"
birdsarah May 19, 2020
3421c86
Better assign propertiesToInstrument
birdsarah May 19, 2020
54c8f34
Small cleanup
birdsarah May 21, 2020
073e0a5
Make new logSettings object
birdsarah May 21, 2020
394ca24
Prettify
birdsarah May 21, 2020
1222a78
Small clean
birdsarah May 21, 2020
bce7227
-- BREAK -- JS Rework complete
birdsarah May 21, 2020
2a0142d
Write-out mdn compat data to js_instrumentation .py
birdsarah May 21, 2020
c8e4d9d
Dry out js test code
birdsarah May 21, 2020
f64e439
Consolidate JS tests
birdsarah May 21, 2020
549fc68
Finish missing renames, and add test js via browser_params
birdsarah May 21, 2020
9c393d1
pep8
birdsarah May 21, 2020
952908a
New files and failing tests.
birdsarah May 21, 2020
cd7bf5e
Add a json schema for js_instrument_modules
birdsarah May 21, 2020
cd4b0f7
Latest py tests
birdsarah May 21, 2020
f512fb8
Flake8
birdsarah May 21, 2020
a99c783
Ongoing progress.
birdsarah May 21, 2020
8d20b25
More code, more tests.
birdsarah May 21, 2020
3a149e9
flake
birdsarah May 21, 2020
9ff8757
Rename mdn file
birdsarah May 21, 2020
93e7335
Add latest tests - just implement fingerprinting.json
birdsarah May 21, 2020
5f234a3
flake8
birdsarah May 21, 2020
055f6ea
Add fingerprinting.json (incomplete)
birdsarah May 21, 2020
10e56c6
Correct logSettings property name
birdsarah May 21, 2020
f13e2f9
Restore create_xpi as function
birdsarah May 21, 2020
c840fbc
Make explicit option for logging to console
birdsarah May 21, 2020
b848c44
Process browser_params in task manager
birdsarah May 21, 2020
974b521
Start being able to pass browser_params to selenium
birdsarah May 21, 2020
f0904ac
Revert "Make explicit option for logging to console"
birdsarah May 21, 2020
abf9b1c
Get manual_test working with browser_params
birdsarah May 21, 2020
2c250a7
More robust test for simple fingerprinting output
birdsarah May 21, 2020
4c5ebe5
Add timing information when testing
birdsarah May 21, 2020
aeb168b
Make recheck really fast.
birdsarah May 21, 2020
49dea91
Handle all inputs properly
birdsarah May 21, 2020
295e2ec
Debug with all window params instrumented
birdsarah May 21, 2020
0f4874a
Load xpi we just built
birdsarah May 21, 2020
c1b1f5d
Check for ff version support
birdsarah May 22, 2020
a1d9723
Save a bunch of properties
birdsarah May 22, 2020
3e89866
Relax constraints on what we can instrument.
birdsarah May 22, 2020
1a68eaa
Correct stringifying
birdsarah May 22, 2020
dc31a83
Better name example params, fix some bugs, sample a_f
birdsarah May 22, 2020
4a3f3b3
flake8
birdsarah May 22, 2020
f2bb845
Move example browser_params file out of harms way
birdsarah May 22, 2020
57ed0e3
Add failing test for regression I introduced.
birdsarah May 22, 2020
b9a6369
Fix for regression.
birdsarah May 22, 2020
bfbe19b
Add simple mimeTypes and plugins to fingerprinting.
birdsarah May 22, 2020
17830d4
Lint JS
birdsarah May 22, 2020
90e3e1f
Merge branch 'master' into js-instrumentation
birdsarah May 26, 2020
771049b
Merge branch 'master' into js-instrumentation
birdsarah Jun 23, 2020
514c682
Rm mdn_browser_comat stuff no longer needed
birdsarah Jun 24, 2020
55ac279
Remove example_browser_params
birdsarah Jun 24, 2020
2bd0dca
Load JS_INSTRUMENT_MODULES from JSON string
birdsarah Jun 24, 2020
8f593c8
Rename JS_INSTRUMENT_MODULES to JS_INSTRUMENT_SETTINGS
birdsarah Jun 24, 2020
8fee522
Fixes #28 - Instrument all window.navigator properties.
birdsarah Jun 24, 2020
c600ae6
Finish removing unused mdn-compat pieces.
birdsarah Jun 24, 2020
941c3cc
EventID as a shadow variable
birdsarah Jun 24, 2020
dd5cb98
Flake8
birdsarah Jun 24, 2020
98b4e37
Remove $ prefix and rename
birdsarah Jun 24, 2020
101c41a
Rename jsInstrumentationRequests->jsInstrumentationSettings
birdsarah Jun 24, 2020
ac808fc
TS Lint
birdsarah Jun 24, 2020
6756eaf
Remove use of "request".
birdsarah Jun 25, 2020
520c782
Convert assertions to ValueErrors
birdsarah Jun 25, 2020
8ac03dc
Rename file/folder and fingerprinting -> collection_fingerprinting
birdsarah Jun 25, 2020
bceda0e
Clean-up naming in schema
birdsarah Jun 25, 2020
8c385e0
Add processing of json schema to documentation
birdsarah Jun 25, 2020
5b7e653
Rename js_instrumentation again and ref schema location
birdsarah Jun 25, 2020
8eb4edb
Pass JSON not a js string
birdsarah Jun 25, 2020
fd844b3
Do copying to xpi in npm postbuild step
birdsarah Jun 25, 2020
2e8eaac
Fix import in manual_test
birdsarah Jun 26, 2020
60bae1a
Revert "Pass JSON not a js string"
birdsarah Jun 26, 2020
7072504
Add titles to schema pieces
birdsarah Jun 26, 2020
a21d40d
Merge branch 'master' into js-instrumentation
birdsarah Jun 26, 2020
5a3230f
Add docs for js_instrument_settings
birdsarah Jun 26, 2020
389f4a2
Bit more README cleanup
birdsarah Jun 26, 2020
7a97b86
Merge branch 'master' into js-instrumentation
birdsarah Jun 29, 2020
e9a7dc8
Update README.md
birdsarah Jul 2, 2020
12141c5
Move updating schema docs section
birdsarah Jul 8, 2020
76b3712
Make the single-key dictionary clearer
birdsarah Jul 8, 2020
7dcebfc
Remove versions from npm package files
birdsarah Jul 8, 2020
2a35910
Clean up instrument_existing_window_property.html and js
birdsarah Jul 8, 2020
9559ec9
Fix pyside instrumentation test, add more clarificaiton to README
birdsarah Jul 8, 2020
e2ef603
Use example.com and example.org as localDomains
birdsarah Jul 8, 2020
95465a8
context-manage open, and flake8
birdsarah Jul 8, 2020
74 changes: 66 additions & 8 deletions README.md
@@ -34,7 +34,8 @@ Table of Contents <!-- omit in toc -->
* [Debugging the platform](#debugging-the-platform)
* [Managing requirements](#managing-requirements)
* [Running tests](#running-tests)
* [Mac OSX](#mac-osx-limited-support-for-developers)
* [Mac OSX](#mac-osx)
* [Updating schema docs](#updating-schema-docs)
* [Troubleshooting](#troubleshooting)
* [Docker Deployment for OpenWPM](#docker-deployment-for-openwpm)
* [Building the Docker Container](#building-the-docker-container)
@@ -80,9 +81,9 @@ After running the install script, activate your conda environment by running:
### Developer instructions

Dev dependencies are installed by using the main `environment.yaml` (which
is used by `./install.sh` script.
is used by `./install.sh` script).

You can install pre-commit hooks install the hooks by running `pre-commit install` to
You can install pre-commit hooks install the hooks by running `pre-commit install` to
lint all the changes before you make a commit.

### Troubleshooting
@@ -173,8 +174,22 @@ available [below](#output-format).
with the exception of images.
See: [Bug 634073](https://bugzilla.mozilla.org/show_bug.cgi?id=634073).
* Javascript Calls
* Records all method calls (with arguments) and property accesses for APIs
of potential fingerprinting interest:
* Records all method calls (with arguments) and property accesses for configured APIs
* Set `browser_params['js_instrument'] = True`
* Configure `browser_params['js_instrument_settings']` to desired settings.
* Data is saved to the `javascript` table.
* The full specification for `js_instrument_settings` is defined by a JSON schema.
Details of that schema are available in [docs/schemas/README.md](docs/schemas/README.md).
In summary, you pass a list of JS objects to be instrumented, along with details about how
each object should be instrumented. The `js_instrument_settings` you pass to `browser_params`
are validated on the Python side against the JSON schema before the crawl starts running.
* A number of shortcuts are available to make writing `js_instrument_settings` less
cumbersome than spelling out the full schema. These shortcuts are converted to a full
specification by the `clean_js_instrumentation_settings` method in
[automation/js_instrumentation.py](automation/js_instrumentation.py).
* The first shortcut is the fingerprinting collection, specified by
`collection_fingerprinting`. This was the default prior to v0.11.0. It contains a collection
of APIs of potential fingerprinting interest:
* HTML5 Canvas
* HTML5 WebRTC
* HTML5 Audio
@@ -184,8 +199,43 @@ available [below](#output-format).
and `window.name` access.
* Navigator properties (e.g. `appCodeName`, `oscpu`, `userAgent`, ...)
* Window properties (via `window.screen`)
* Set `browser_params['js_instrument'] = True`
* Data is saved to the `javascript` table.
* `collection_fingerprinting` is the default if `js_instrument` is `True`.
* The fingerprinting collection is specified by the JSON file
[fingerprinting.json](automation/js_instrumentation_collections/fingerprinting.json).
This file is also a nice reference example for specifying your own APIs using the other
shortcuts.
* Shortcuts:
* Specifying just a string will instrument
the whole API with the [default log settings](docs/schemas/js_instrument_settings-settings-objects-properties-log-settings.md)
* A bare string can name a [Web API](https://developer.mozilla.org/en-US/docs/Web/API)
such as `XMLHttpRequest`, or an instance on `window`, e.g. `window.document`.
* Alternatively, you can specify a single-key dictionary that maps an API name to the properties / settings you'd
like to use. The key of this dictionary can be an instance on `window` or a Web API.
The value of this dictionary can be:
* A list - this is a shortcut for `propertiesToInstrument` (see [log settings](docs/schemas/js_instrument_settings-settings-objects-properties-log-settings.md))
* A dictionary - with non default log settings. Items missing from this dictionary
will be filled in with the default log settings.
* Here are some examples:
```
// Collections
"collection_fingerprinting",
// APIs, with or without settings details
"Storage",
"XMLHttpRequest",
{"XMLHttpRequest": {"excludedProperties": ["send"]}},
// APIs with shortcut to propertiesToInstrument
{"Prop1": ["hi"], "Prop2": ["hi2"]},
{"XMLHttpRequest": ["send"]},
// Specific instances on window
{"window.document": ["cookie", "referrer"]},
{"window": ["name", "localStorage", "sessionStorage"]}
```
* Note that only the properties of the key / string are instrumented. That is, to instrument the
`window.fetch` function you must specify `{"window": ["fetch"]}`. If you specify just `window.fetch`, the
instrumentation will try to instrument sub-properties of `window.fetch` (which won't work, as `fetch` is a
function). As another example, to instrument `window.document.cookie`, you must use `{"window.document": ["cookie"]}`.
In cases such as `fetch`, where JavaScript code can use the alias `fetch` instead of `window.fetch`,
the instrumentation `{"window": ["fetch"]}` will pick up calls to both `fetch()` and `window.fetch()`.
* Response body content
* Saves all files encountered during the crawl to a `LevelDB`
database de-duplicated by the md5 hash of the content.
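
The `js_instrument_settings` shortcut rules described in this hunk can be sketched in Python. This is an illustrative sketch only: `expand_shortcut` and `ASSUMED_DEFAULTS` are hypothetical names that are not part of OpenWPM, and the default values are placeholders (the real defaults come from the JSON schema, and the real conversion is done by `clean_js_instrumentation_settings`). Collections such as `collection_fingerprinting` are deliberately not handled here.

```python
from copy import deepcopy

# Placeholder defaults, for illustration only. The real default log
# settings are defined by the JSON schema, not by this dictionary.
ASSUMED_DEFAULTS = {
    "propertiesToInstrument": [],
    "excludedProperties": [],
    "logCallStack": False,
    "recursive": False,
}


def expand_shortcut(entry, defaults=None):
    """Hypothetical helper: expand one shortcut into (api_name, log_settings)."""
    defaults = ASSUMED_DEFAULTS if defaults is None else defaults
    if isinstance(entry, str):
        if entry.startswith("collection_"):
            raise ValueError("collections are not handled in this sketch")
        # A bare string: instrument the whole API with default log settings.
        return entry, deepcopy(defaults)
    if isinstance(entry, dict) and len(entry) == 1:
        name, value = next(iter(entry.items()))
        settings = deepcopy(defaults)
        if isinstance(value, list):
            # A list is a shortcut for propertiesToInstrument.
            settings["propertiesToInstrument"] = list(value)
        elif isinstance(value, dict):
            # A dictionary overrides defaults; missing items keep defaults.
            settings.update(value)
        else:
            raise ValueError(f"unsupported value for {name!r}: {value!r}")
        return name, settings
    raise ValueError(f"unsupported shortcut: {entry!r}")


# Stand-in for the browser_params dictionary referenced in the README.
browser_params = {
    "js_instrument": True,
    "js_instrument_settings": [
        "Storage",
        {"XMLHttpRequest": {"excludedProperties": ["send"]}},
        {"window.document": ["cookie", "referrer"]},
        {"window": ["fetch"]},  # covers both fetch() and window.fetch()
    ],
}

expanded = [expand_shortcut(e) for e in browser_params["js_instrument_settings"]]
for name, settings in expanded:
    print(name, settings)
```

Keeping each entry as either a string or a single-key dictionary means the API name is always unambiguous, which is why the sketch rejects dictionaries with more than one key.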
@@ -537,7 +587,7 @@ in the test directory to run all tests:
$ cd test
$ py.test -vv

See the [pytest docs](https://docs.pytest.org/en/latest/) for more information on selecting
See the [pytest docs](https://docs.pytest.org/en/latest/) for more information on selecting
specific tests and various pytest options.

### Mac OSX
@@ -552,6 +602,14 @@ Running Firefox with xvfb on OSX is untested and will require the user to instal
an X11 server. We suggest [XQuartz](https://www.xquartz.org/). This setup has not
been tested, we welcome feedback as to whether this is working.

### Updating schema docs

In the rare instance that you need to regenerate the schema docs
(after updating or adding files in the `schemas` folder), run `npm install`
from the OpenWPM top level. Then run `npm run render_schema_docs`. This will update the
`docs/schemas` folder. You may want to clean out the `docs/schemas` folder before doing this
in case files have been renamed.


Troubleshooting
---------------
22 changes: 20 additions & 2 deletions automation/Extension/firefox/feature.js/index.js
@@ -20,7 +20,25 @@ async function main() {
navigation_instrument:true,
cookie_instrument:true,
js_instrument:true,
js_instrument_modules:"fingerprinting",
js_instrument_settings: `
[
{
object: window.CanvasRenderingContext2D.prototype,
instrumentedName: "CanvasRenderingContext2D",
logSettings: {
propertiesToInstrument: [],
nonExistingPropertiesToInstrument: [],
excludedProperties: [],
logCallStack: false,
logFunctionsAsStrings: false,
logFunctionGets: false,
preventSets: false,
recursive: false,
depth: 5,
}
},
]`,
http_instrument:true,
callstack_instrument:true,
save_content:false,
@@ -51,7 +69,7 @@ async function main() {
loggingDB.logDebug("Javascript instrumentation enabled");
let jsInstrument = new JavascriptInstrument(loggingDB);
jsInstrument.run(config['crawl_id']);
await jsInstrument.registerContentScript(config['testing'], config['js_instrument_modules']);
await jsInstrument.registerContentScript(config['testing'], config['js_instrument_settings']);
}

if (config['http_instrument']) {
20 changes: 17 additions & 3 deletions automation/Extension/firefox/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions automation/Extension/firefox/package.json
@@ -1,7 +1,6 @@
{
"name": "OpenWPM",
"description": "OpenWPM Client extension",
"version": "1.0.0",
"author": "Mozilla",
"dependencies": {
"openwpm-webext-instrumentation": "../webext-instrumentation"
@@ -35,11 +34,12 @@
"private": true,
"repository": {
"type": "git",
"url": "https://github.com/mozilla/openwpm-firefox-webext"
"url": "git+https://github.com/mozilla/OpenWPM.git"
},
"scripts": {
"prebuild": "cd ../webext-instrumentation && npm run build && cd - && webpack",
"postinstall": "cd ../webext-instrumentation && npm install",
"postbuild": "cp dist/openwpm-1.0.zip openwpm.xpi",
"build": "web-ext build",
"eslint": "eslint . --ext jsm,js,json",
"lint": "npm-run-all lint:*",