Commit

Renaming automation module to openwpm (openwpm#793)
ankushduacodes authored Nov 14, 2020
1 parent fe64e20 commit cca85b0
Showing 136 changed files with 92 additions and 93 deletions.
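A rename of this kind is mostly mechanical, so it can be approximated with a small script. The following is a minimal sketch, not the method the author used: `rename_line` is a hypothetical helper that handles only the two most common patterns in this commit, Python import statements and `automation/` path prefixes. Bare occurrences such as `known_first_party = automation,...` or `--cov=../automation` would still need manual edits.

```python
import re

# Hypothetical helper, not part of OpenWPM.
# Matches "from automation", "import automation", and relative forms
# such as "from ..automation" at the start of a line.
_IMPORT_RE = re.compile(r"^(\s*(?:from|import)\s+\.*)automation\b")
# Matches the old package name when used as a path prefix.
_PATH_RE = re.compile(r"\bautomation/")


def rename_line(line: str) -> str:
    """Rewrite old-package references in one line of text."""
    line = _IMPORT_RE.sub(r"\g<1>openwpm", line)
    return _PATH_RE.sub("openwpm/", line)


if __name__ == "__main__":
    print(rename_line("from automation import CommandSequence, TaskManager"))
    print(rename_line("pushd automation/Extension/firefox"))
```

Prose that uses the word "automation" in its ordinary sense is left untouched, because the word is only rewritten when it appears in an import statement or is followed by a slash.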
12 changes: 6 additions & 6 deletions .dockerignore
Original file line number Diff line number Diff line change
Expand Up @@ -18,11 +18,11 @@ firefox-bin
venv

# npm packages
automation/Extension/firefox/node_modules
automation/Extension/webext-instrumentation/node_modules
openwpm/Extension/firefox/node_modules
openwpm/Extension/webext-instrumentation/node_modules

# built extension artifacts
automation/Extension/firefox/dist
automation/Extension/firefox/openwpm.xpi
automation/Extension/firefox/src/content.js
automation/Extension/firefox/src/feature.js
openwpm/Extension/firefox/dist
openwpm/Extension/firefox/openwpm.xpi
openwpm/Extension/firefox/src/content.js
openwpm/Extension/firefox/src/feature.js
8 changes: 4 additions & 4 deletions .gitignore
Expand Up @@ -90,7 +90,7 @@ docs/_build/
node_modules

# built extension artifacts
automation/Extension/firefox/dist
automation/Extension/firefox/openwpm.xpi
automation/Extension/firefox/src/content.js
automation/Extension/firefox/src/feature.js
openwpm/Extension/firefox/dist
openwpm/Extension/firefox/openwpm.xpi
openwpm/Extension/firefox/src/content.js
openwpm/Extension/firefox/src/feature.js
10 changes: 5 additions & 5 deletions README.md
Expand Up @@ -82,8 +82,8 @@ Quick Start

Once installed, it is very easy to run a quick test of OpenWPM. Check out
`demo.py` for an example. This will use the default setting specified in
`automation/default_manager_params.json` and
`automation/default_browser_params.json`, with the exception of the changes
`openwpm/default_manager_params.json` and
`openwpm/default_browser_params.json`, with the exception of the changes
specified in `demo.py`.

More information on the instrumentation and configuration parameters is given
Expand Down Expand Up @@ -126,7 +126,7 @@ Troubleshooting
2. In older versions of firefox (pre 74) the setting to enable extensions was called
`extensions.legacy.enabled`. If you need to work with earlier firefox, update the
setting name `extensions.experiments.enabled` in
`automation/DeployBrowsers/configure_firefox.py`.
`openwpm/DeployBrowsers/configure_firefox.py`.

3. Make sure your conda environment is activated (`conda activate openwpm`). You can see
your environments and the active one by running `conda env list`; the active environment
Expand Down Expand Up @@ -193,7 +193,7 @@ bodies are saved in a LevelDB database named `content.ldb`, and are keyed by
the hash of the content. In addition, the browser commands that dump page
source and save screenshots save them in the `sources` and `screenshots`
subdirectories of the main output directory. The SQLite schema is
specified by: `automation/DataAggregator/schema.sql`. You can specify additional tables
specified by: `openwpm/DataAggregator/schema.sql`. You can specify additional tables
inline by sending a `create_table` message to the data aggregator.

#### Parquet on Amazon S3
Expand All @@ -213,7 +213,7 @@ location.
**NOTE:** The schemas should be kept in sync with the exception of
output-specific columns (e.g., `instance_id` in the S3 output). You can compare
the two schemas by running
`diff -y automation/DataAggregator/schema.sql automation/DataAggregator/parquet_schema.py`.
`diff -y openwpm/DataAggregator/schema.sql openwpm/DataAggregator/parquet_schema.py`.
Docker Deployment for OpenWPM
-----------------------------
Expand Down
4 changes: 2 additions & 2 deletions crawler.py
Expand Up @@ -10,8 +10,8 @@
import boto3
import sentry_sdk

from automation import CommandSequence, MPLogger, TaskManager
from automation.utilities import rediswq
from openwpm import CommandSequence, MPLogger, TaskManager
from openwpm.utilities import rediswq
from test.utilities import LocalS3Session, local_s3_bucket

# Configuration via environment variables
Expand Down
2 changes: 1 addition & 1 deletion demo.py
@@ -1,4 +1,4 @@
from automation import CommandSequence, TaskManager
from openwpm import CommandSequence, TaskManager

# The list of sites that we wish to crawl
NUM_BROWSERS = 1
Expand Down
6 changes: 3 additions & 3 deletions docs/CONTRIBUTING.md
Expand Up @@ -35,12 +35,12 @@ the project as well as the one you plan to change fundamentally.

### Editing instrumentation

The instrumentation extension is included in `/automation/Extension/firefox/`.
The instrumentation extension is included in `/openwpm/Extension/firefox/`.
The instrumentation itself (used by the above extension) is included in
`/automation/Extension/webext-instrumentation/`.
`/openwpm/Extension/webext-instrumentation/`.
Any edits within these directories will require the extension to be re-built to produce
a new `openwpm.xpi` with your updates. You can use `./scripts/build-extension.sh` to do this,
or you can run `npm run build` from `automation/Extension/firefox/`.
or you can run `npm run build` from `openwpm/Extension/firefox/`.

### Debugging the platform

Expand Down
12 changes: 6 additions & 6 deletions docs/Configuration.md
Expand Up @@ -3,14 +3,14 @@
The browser and platform can be configured by two separate dictionaries. The
platform configuration options can be set in `manager_params`, while the
browser configuration options can be set in `browser_params`. The default
settings are given in `automation/default_manager_params.json` and
`automation/default_browser_params.json`.
settings are given in `openwpm/default_manager_params.json` and
`openwpm/default_browser_params.json`.

To load the default configuration parameter dictionaries we provide a helper
function `TaskManager::load_default_params`. For example:

```python
from automation import TaskManager
from openwpm import TaskManager
manager_params, browser_params = TaskManager.load_default_params(num_browsers=5)
```

Expand Down Expand Up @@ -58,7 +58,7 @@ of configuration dictionaries.
* `testing`
* A platform wide flag that can be used to only run certain functionality
while testing. For example, the Javascript instrumentation
[exposes its instrumentation function](https://github.com/citp/OpenWPM/blob/91751831647c37b769f0039d99d0a164384c76ae/automation/Extension/firefox/data/content.js#L447-L449)
[exposes its instrumentation function](https://github.com/citp/OpenWPM/blob/91751831647c37b769f0039d99d0a164384c76ae/openwpm/Extension/firefox/data/content.js#L447-L449)
on the page script global to allow test scripts to instrument objects
on-the-fly. Depending on where you would like to add test functionality,
you may need to propagate the flag.
Expand Down Expand Up @@ -166,7 +166,7 @@ To activate a given instrument set `browser_params[i][instrument_name] = True`
* A number of shortcuts are available to make writing `js_instrument_settings` less
cumbersome than spelling out the full schema. These shortcuts are converted to a full
specification by the `clean_js_instrumentation_settings` method in
[automation/js_instrumentation.py](../automation/js_instrumentation.py).
[openwpm/js_instrumentation.py](../openwpm/js_instrumentation.py).
* The first shortcut is the fingerprinting collection, specified by
`collection_fingerprinting`. This was the default prior to v0.11.0. It contains a collection
of APIs of potential fingerprinting interest:
Expand All @@ -181,7 +181,7 @@ To activate a given instrument set `browser_params[i][instrument_name] = True`
* Window properties (via `window.screen`)
* `collection_fingerprinting` is the default if `js_instrument` is `True`.
* The fingerprinting collection is specified by the json file
[fingerprinting.json](../automation/js_instrumentation_collections/fingerprinting.json).
[fingerprinting.json](../openwpm/js_instrumentation_collections/fingerprinting.json).
This file is also a nice reference example for specifying your own APIs using the other
shortcuts.
* Shortcuts:
Expand Down
12 changes: 6 additions & 6 deletions docs/Platform-Architecture.md
Expand Up @@ -6,7 +6,7 @@ The user-facing component of the OpenWPM platform is the Task Manager. The Task

## Instantiating a Task Manager

All automation code is contained within the `automation` folder; the Task Manager code is contained in `automation/TaskManager.py`.
All automation code is contained within the `openwpm` folder; the Task Manager code is contained in `openwpm/TaskManager.py`.

Task Managers can be instantiated in the following way:

Expand Down Expand Up @@ -68,9 +68,9 @@ with https://github.com/mozilla/OpenWPM/issues/743.

## Overview

Contained in `automation/BrowserManager.py`, Browser Managers provide a wrapper around the drivers used to automate full browser instances. In particular, we opted to use [Selenium](http://docs.seleniumhq.org/) to drive full browser instances as bot detection frameworks can more easily detect lightweight alternatives such as PhantomJS.
Contained in `openwpm/BrowserManager.py`, Browser Managers provide a wrapper around the drivers used to automate full browser instances. In particular, we opted to use [Selenium](http://docs.seleniumhq.org/) to drive full browser instances as bot detection frameworks can more easily detect lightweight alternatives such as PhantomJS.

Browser Managers receive commands from the Task Manager, which they then pass to the command executor (located in `automation/Commands/command_executor.py`), which receives a command object and converts it into web driver actions. Browser Managers also receive browser parameters which they use to instantiate the Selenium web driver using one of the browser initialization functions contained in `automation/DeployBrowsers`.
Browser Managers receive commands from the Task Manager, which they then pass to the command executor (located in `openwpm/Commands/command_executor.py`), which receives a command object and converts it into web driver actions. Browser Managers also receive browser parameters which they use to instantiate the Selenium web driver using one of the browser initialization functions contained in `openwpm/DeployBrowsers`.

The Browser class, contained in the same file, is the Task Manager's wrapper around Browser Managers, which allow it to cleanly kill and restart Browser Managers as necessary.

Expand All @@ -82,7 +82,7 @@ Throughout the course of a measurement, the Browser Managers' commands (along wi

# The WebExtension

All of our data collection happens in the OpenWPM WebExtension, which can be found under [automation/Extension](../automation/Extension).
All of our data collection happens in the OpenWPM WebExtension, which can be found under [openwpm/Extension](../openwpm/Extension).
The Extension makes heavy use of privileged APIs and can only be installed on unbranded or custom builds of Firefox with add-on security disabled.

The currently supported instruments can be found in [Configuration.md](Configuration.md#Instruments)
Expand All @@ -92,8 +92,8 @@ The currently supported instruments can be found in [Configuration.md](Configura

## Overview

One of the Data Aggregators, contained in `automation/DataAggregator`, gets spawned in a separate process and receives data from the WebExtension and the platform alike. As previously mentioned, we support both local and remote data saving.
The most useful feature of the Data Aggregator is the fact that it is isolated from the other processes through a network socket interface (see `automation/SocketInterface.py`).
One of the Data Aggregators, contained in `openwpm/DataAggregator`, gets spawned in a separate process and receives data from the WebExtension and the platform alike. As previously mentioned, we support both local and remote data saving.
The most useful feature of the Data Aggregator is the fact that it is isolated from the other processes through a network socket interface (see `openwpm/SocketInterface.py`).

## Data Logged

Expand Down
4 changes: 2 additions & 2 deletions docs/Release-Checklist.md
Expand Up @@ -7,8 +7,8 @@ We aim to release a new version of OpenWPM with each new Firefox release (~1 rel
2. Find the commit hash for the Firefox release version you'd like to upgrade to.
3. Update the `TAG` variable in [`scripts/install-firefox.sh`](https://github.com/mozilla/OpenWPM/blob/5ffde00ecd5ecaa9105b74935490e5e267596eb7/scripts/install-firefox.sh#L12) to that hash and the comment to the new tag name.
2. Update extension dependencies.
1. Run `npm update` in `automation/Extension/firefox`.
2. Run `npm update` in `automation/Extension/webext-instrumentation`.
1. Run `npm update` in `openwpm/Extension/firefox`.
2. Run `npm update` in `openwpm/Extension/webext-instrumentation`.
3. Update python and system dependencies by following the ["managing requirements" instructions](https://github.com/mozilla/OpenWPM#managing-requirements).
4. Increment the version number in [VERSION](https://github.com/mozilla/OpenWPM/blob/master/VERSION)
5. Add a summary of changes since the last version to [CHANGELOG](https://github.com/mozilla/OpenWPM/blob/master/CHANGELOG.md)
Expand Down
10 changes: 5 additions & 5 deletions docs/Using_OpenWPM.md
Expand Up @@ -20,7 +20,7 @@ Suppose we want to add a top-level command to cause the browser to jiggle the mo

To add a new command you need to modify the following four files:

1. Define all required parameters in a type in `automation/Commands/Types.py`
1. Define all required parameters in a type in `openwpm/Commands/Types.py`
In our case this looks like this:
```python
class JiggleCommand(BaseCommand):
Expand All @@ -31,9 +31,9 @@ To add a new command you need to modify the following four files:
return "JiggleCommand({})".format(self.num_jiggles)
```

2. Define the behaviour of our new command in `*_commands.py` in `automation/Commands/`,
2. Define the behaviour of our new command in `*_commands.py` in `openwpm/Commands/`,
e.g. `browser_commands.py`.
Feel free to add a new module within `automation/Commands/` for your own custom commands
Feel free to add a new module within `openwpm/Commands/` for your own custom commands
In our case this looks like this:
```python
from selenium.webdriver.common.action_chains import ActionChains
Expand All @@ -48,7 +48,7 @@ To add a new command you need to modify the following four files:
```

3. Make our function be called when the command_sequence reaches our Command, by adding it to the
`execute_command` function in `automation/Commands/command_executer.py`
`execute_command` function in `openwpm/Commands/command_executer.py`
In our case this looks like this:
```python
elif type(command) is JiggleCommand:
Expand All @@ -57,7 +57,7 @@ To add a new command you need to modify the following four files:
number_jiggles=self.num_jiggles)
```

4. Lastly we change ```automation/CommandSequence.py``` by adding a `jiggle_mouse` method to the `CommandSequence`
4. Lastly we change ```openwpm/CommandSequence.py``` by adding a `jiggle_mouse` method to the `CommandSequence`
so we can add our command to the commands list
In our case this looks like this:
```python
Expand Down
File renamed without changes.
Expand Up @@ -31,15 +31,15 @@ The instrumentation leverages the available [JavaScript APIs for WebExtensions](
- Response body content
- Cookie Access (Experimental)

More specifically, all packets sent by the instrumentation conform to [these interfaces](https://github.com/mozilla/OpenWPM/tree/master/automation/Extension/webext-instrumentation/src/schema.ts).
More specifically, all packets sent by the instrumentation conform to [these interfaces](https://github.com/mozilla/OpenWPM/tree/master/openwpm/Extension/webext-instrumentation/src/schema.ts).

## Usage

The instrumentation is designed to invoke a `dataReceiver` object whenever a packet or log entry is available.

Pending proper documentation, the best way to see how this library is used is to check how the instrumentation is incorporated into the following extensions:

* https://github.com/mozilla/OpenWPM/tree/master/automation/Extension/firefox
* https://github.com/mozilla/OpenWPM/tree/master/openwpm/Extension/firefox
* https://github.com/mozilla/jestr-pioneer-shield-study

## Npm publishing
Expand Down
Expand Up @@ -7,7 +7,7 @@
"typings": "build/main/index.d.ts",
"module": "build/module/index.js",
"repository": "https://github.com/mozilla/OpenWPM",
"homepage": "https://github.com/mozilla/OpenWPM/tree/master/automation/Extension/webext-instrumentation",
"homepage": "https://github.com/mozilla/OpenWPM/tree/master/openwpm/Extension/webext-instrumentation",
"keywords": [],
"scripts": {
"info": "npm-scripts-info",
Expand Down
File renamed without changes.
1 change: 0 additions & 1 deletion automation/TaskManager.py → openwpm/TaskManager.py
Expand Up @@ -243,7 +243,6 @@ def _manager_watchdog(self) -> None:
# Check for browsers or displays that were not closed correctly
# 300 second buffer to avoid killing freshly launched browsers
# TODO This buffer should correspond to the maximum spawn timeout
# TODO This buffer should correspond to the maximum spawn timeout
if self.manager_params[PROCESS_WATCHDOG]:
geckodriver_pids: Set[int] = set()
display_pids: Set[int] = set()
Expand Down
File renamed without changes.
4 changes: 2 additions & 2 deletions scripts/build-extension.sh
Expand Up @@ -9,12 +9,12 @@ set -e
eval "$(conda shell.bash hook)"
conda activate openwpm

pushd automation/Extension/firefox
pushd openwpm/Extension/firefox
npm install
pushd ../webext-instrumentation
npm install
popd
npm run build
popd

echo "Success: automation/Extension/firefox/openwpm.xpi has been built"
echo "Success: openwpm/Extension/firefox/openwpm.xpi has been built"
4 changes: 2 additions & 2 deletions scripts/travis.sh
@@ -1,10 +1,10 @@
#!/bin/bash
if [[ "$TESTS" == "webextension" ]]; then
cd automation/Extension/webext-instrumentation;
cd openwpm/Extension/webext-instrumentation;
npm test;
else
cd test;
python -m pytest --cov=../automation --cov-report=xml $TESTS -s -v --durations=10;
python -m pytest --cov=../openwpm --cov-report=xml $TESTS -s -v --durations=10;
exit_code=$?;
if [[ "$exit_code" -ne 0 ]]; then
exit $exit_code;
Expand Down
4 changes: 2 additions & 2 deletions setup.cfg
@@ -1,6 +1,6 @@
[tool:isort]
profile = black
known_future_library = future
known_first_party = automation,openwpmtest,test
known_first_party = openwpm,openwpmtest,test
default_section = THIRDPARTY
skip = venv,automation/Extension,firefox-bin
skip = venv,openwpm/Extension,firefox-bin
2 changes: 1 addition & 1 deletion test/conftest.py
Expand Up @@ -8,7 +8,7 @@
EXTENSION_DIR = os.path.join(
os.path.dirname(os.path.realpath(__file__)),
"..",
"automation",
"openwpm",
"Extension",
"firefox",
)
Expand Down
12 changes: 6 additions & 6 deletions test/manual_test.py
Expand Up @@ -8,25 +8,25 @@
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

from automation import js_instrumentation as jsi
from automation.DeployBrowsers import configure_firefox
from automation.TaskManager import load_default_params
from automation.utilities.platform_utils import get_firefox_binary_path
from openwpm import js_instrumentation as jsi
from openwpm.DeployBrowsers import configure_firefox
from openwpm.TaskManager import load_default_params
from openwpm.utilities.platform_utils import get_firefox_binary_path

from .conftest import create_xpi
from .utilities import BASE_TEST_URL, start_server

# import commonly used modules and utilities so they can be easily accessed
# in the interactive session
from automation.Commands.utils import webdriver_utils as wd_util # noqa isort:skip
from openwpm.Commands.utils import webdriver_utils as wd_util # noqa isort:skip
import domain_utils as du # noqa isort:skip
from selenium.webdriver.common.keys import Keys # noqa isort:skip
from selenium.common.exceptions import * # noqa isort:skip

OPENWPM_LOG_PREFIX = "console.log: openwpm: "
INSERT_PREFIX = "Array"
BASE_DIR = dirname(dirname(realpath(__file__)))
EXT_PATH = join(BASE_DIR, "automation", "Extension", "firefox")
EXT_PATH = join(BASE_DIR, "openwpm", "Extension", "firefox")


class Logger:
Expand Down
2 changes: 1 addition & 1 deletion test/openwpm_jstest.py
@@ -1,6 +1,6 @@
import re

from ..automation.utilities import db_utils
from ..openwpm.utilities import db_utils
from .openwpmtest import OpenWPMTest


Expand Down
2 changes: 1 addition & 1 deletion test/openwpmtest.py
Expand Up @@ -3,7 +3,7 @@

import pytest

from ..automation import TaskManager
from ..openwpm import TaskManager
from . import utilities


Expand Down
4 changes: 2 additions & 2 deletions test/test_callback.py
@@ -1,8 +1,8 @@
from functools import partial
from typing import List

from ..automation.CommandSequence import CommandSequence
from ..automation.TaskManager import TaskManager
from ..openwpm.CommandSequence import CommandSequence
from ..openwpm.TaskManager import TaskManager
from .openwpmtest import OpenWPMTest
from .utilities import BASE_TEST_URL

Expand Down
6 changes: 3 additions & 3 deletions test/test_callstack_instrument.py
@@ -1,6 +1,6 @@
from ..automation import TaskManager
from ..automation.utilities import db_utils
from ..automation.utilities.platform_utils import parse_http_stack_trace_str
from ..openwpm import TaskManager
from ..openwpm.utilities import db_utils
from ..openwpm.utilities.platform_utils import parse_http_stack_trace_str
from . import utilities
from .openwpmtest import OpenWPMTest

Expand Down
4 changes: 2 additions & 2 deletions test/test_crawl.py
Expand Up @@ -4,8 +4,8 @@
import domain_utils as du
import pytest

from ..automation import TaskManager
from ..automation.utilities import db_utils
from ..openwpm import TaskManager
from ..openwpm.utilities import db_utils
from .openwpmtest import OpenWPMTest

TEST_SITES = [
Expand Down
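Because the rename spans imports, documentation, shell scripts, and configuration, a quick scan for leftover references to the old name may help when reviewing a change like this. The sketch below is an assumption-laden illustration, not part of this commit: `find_leftovers` is a hypothetical function that flags only import statements and `automation/` path prefixes, so it will miss bare mentions such as `"automation",` in a `join(...)` call.

```python
import re
from typing import List, Tuple

# Hypothetical checker, not part of OpenWPM.
# Flags old-style imports (including relative ones) and old path prefixes.
LEFTOVER_RE = re.compile(
    r"(?:^\s*(?:from|import)\s+\.*automation\b)|(?:\bautomation/)"
)


def find_leftovers(text: str) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for each line that still uses the old name."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if LEFTOVER_RE.search(line):
            hits.append((lineno, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "from openwpm import TaskManager\n"
        "from automation.utilities import rediswq\n"
        "cd automation/Extension/firefox\n"
    )
    for lineno, line in find_leftovers(sample):
        print(f"{lineno}: {line}")
```

Running it over each changed file's contents after the rename should, under these assumptions, return an empty list for files that were fully migrated.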