Commit bd7c0cd

Merge pull request #21 from dev-dull/15-documentation
Add documentation, minor fixes for issues found while writing docs
2 parents 0d97899 + 2a15253 commit bd7c0cd

5 files changed

+181
-10
lines changed

README.md

Lines changed: 169 additions & 2 deletions
@@ -1,2 +1,169 @@
1-
# PyXIE
2-
A lightweight tracking pixel service written in Python
1+
## PyXIE
2+
### About
3+
A lightweight [Tracking Pixel](https://en.wikipedia.org/wiki/Tracking_Pixel?wprov=srpw1_0) service written in Python.
4+
5+
## Installation
6+
### Quickstart using Docker
7+
#### Pull the image from Dockerhub
8+
```bash
9+
user@shell> docker pull devdull/pyxie:latest
10+
latest: Pulling from devdull/pyxie
11+
12+
> snip <
13+
14+
Status: Downloaded newer image for devdull/pyxie:latest
15+
docker.io/devdull/pyxie:latest
16+
```
17+
18+
#### Create a directory to store PyXIE's data
19+
```bash
20+
user@shell> mkdir data
21+
```
22+
23+
#### Create your configuration file
24+
When running PyXIE as a Docker image, it is recommended to set the `DATABASE_FILE` value in `config.yaml` to ensure that data is persisted between container restarts. Below is a minimal example.
25+
26+
`config.yaml`:
27+
```yaml
28+
DATABASE_FILE: /app/data/uadb.json
29+
API_KEYS:
30+
- your-api-key-here
31+
- a-different-api-key-here
32+
- Another API key with spaces and a comma, but this might be hard to use later.
33+
```
34+
35+
#### Run the image, mounting the data path and configuration file:
36+
```bash
37+
user@shell> docker run -d --mount type=bind,src="./config.yaml",dst="/app/config.yaml" --mount type=bind,src="./data",dst="/app/data" -p 5000:5000 devdull/pyxie:latest
38+
```
39+
40+
#### Test the instance
41+
```bash
42+
user@shell> curl -X POST -H 'X-Api-Key: your-api-key-here' -d 'id=foo' 'http://localhost:5000/register'
43+
Success
44+
user@shell> ls -l data/ # Confirm the data file exists in the bound directory
45+
total 8
46+
-rw-r--r-- 1 user staff 2043 Jul 8 11:57 uadb.json
47+
```
48+
49+
#### Stuff the average user can ignore
50+
The service inside the container runs under Gunicorn. To configure the bind IP and port, set the environment variables `LISTEN_IP` and `LISTEN_PORT`. These should not be confused with the identically named configuration items used by Flask, which are defined in `config.yaml`.
51+
52+
### Manual install using Flask (or Gunicorn)
53+
#### Install the app requirements
54+
```bash
55+
user@shell> python3 -m venv .venv
56+
user@shell> source .venv/bin/activate
57+
user@shell> pip3 install -r requirements.txt
58+
```
59+
60+
You should now be able to start PyXIE using Flask with the command `python3 pyxie.py` (listens on `127.0.0.1:5000`) or using Gunicorn with `gunicorn pyxie:pyxie` (listens on `127.0.0.1:8000` by default).
61+
62+
## Usage
63+
### Configuration
64+
Below is a minimal configuration file that lists API keys. These keys should be long and difficult to guess.
65+
66+
`config.yaml`:
67+
```yaml
68+
API_KEYS:
69+
- your-api-key-here
70+
- a-different-api-key-here
71+
- Another API key with spaces and a comma, but this might be hard to use later.
72+
```
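
One way to follow the advice that keys be long and difficult to guess is to generate them from a cryptographically secure random source. The sketch below uses Python's standard `secrets` module; the helper name `generate_api_key` and the 32-byte default are illustrative assumptions, not part of PyXIE.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random URL-safe string to use as an API_KEYS entry.

    Hypothetical helper for illustration; PyXIE does not ship it.
    """
    # token_urlsafe draws from the OS CSPRNG and Base64-encodes the bytes,
    # so the result contains only letters, digits, '-' and '_'.
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print one key to paste under API_KEYS in config.yaml.
    print(generate_api_key())
```

Keys produced this way contain no spaces or commas, which keeps them easy to pass in an `X-Api-Key` header.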
73+
74+
Below is a complete list of user configurable settings:
75+
|Configuration item|Default value|Details|
76+
|---|---|---|
77+
|`LISTEN_IP`|`127.0.0.1`|The IP address to listen on when running with Flask (omit for Docker, Gunicorn)|
78+
|`LISTEN_PORT`|`5000`|The port number to listen on when running with Flask (omit for Docker, Gunicorn)|
79+
|`API_KEYS`|`[]` (empty list)|A list of API keys that should be considered valid by PyXIE|
80+
|`LOG_LEVEL`|`WARNING`|The logging level. Valid values are `CRITICAL`, `ERROR`, `WARNING`, `INFO`, and `DEBUG`|
81+
|`DATABASE_FILE`|`uadb.json`|The file that stores all pixel tracking data|
82+
|`RRD_MAX_SIZE`|`10000`|Planned to be deprecated! The maximum number of records to keep for each `id`|
83+
84+
### Register a new `id`
85+
The purpose of an `id` is to let you differentiate between the places where a tracking pixel has been embedded. For example, you would want one `id` for tracking whether a user opened an email and a different `id` for a pixel embedded in a specific webpage.
86+
87+
Make a `POST` request to the `/register` endpoint with your new `id` as a form parameter, and send an API key from your configuration as the value of an `X-Api-Key` header. If successful, you should get a "Success" message and a status code of `201`.
88+
89+
Here is an example that registers an `id` of `testing` for the service when it is running locally:
90+
```bash
91+
user@shell> curl -Ss -X POST -H 'X-Api-Key: your-api-key-here' -d 'id=testing' 'http://127.0.0.1:5000/register'
92+
Success
93+
```
94+
95+
If no `Success` message appears, nothing was registered. Double-check your API key, your URL, and your port number.
96+
97+
Using your registered `id` as a `GET` parameter, you should now be able to navigate to the tracking pixel in your browser. For the `id` of `testing` registered in the call above, the URL to the tracking pixel would be `http://127.0.0.1:5000/?id=testing`. Requesting an unregistered `id` results in a "Not Found" message and a `404` status code.
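
The same registration call can be made from Python. This is a minimal sketch using only the standard library; the function names `build_register_request` and `register` are hypothetical, and the base URL is assumed to be the local instance used in the `curl` example.

```python
import urllib.parse
import urllib.request


def build_register_request(base_url: str, api_key: str, pixel_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers `pixel_id`."""
    # The id travels in the form-encoded body, mirroring `curl -d 'id=...'`.
    body = urllib.parse.urlencode({"id": pixel_id}).encode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/register",
        data=body,
        headers={"X-Api-Key": api_key},
        method="POST",
    )


def register(base_url: str, api_key: str, pixel_id: str) -> int:
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a bad
    API key or a failed registration surfaces as an exception.
    """
    request = build_register_request(base_url, api_key, pixel_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

With the service running locally, `register("http://127.0.0.1:5000", "your-api-key-here", "testing")` should return `201` if the README's description of the endpoint holds.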
98+
99+
### Embed your tracking pixel
100+
How you embed your pixel will depend on the document format, but here's an example for an HTML page:
101+
```html
102+
<img src="http://127.0.0.1:5000/?id=testing" width="1" height="1" />
103+
```
104+
105+
Because the image is a transparent PNG a single pixel in size, it is unlikely to interfere significantly with the formatting of any website, but placing it at the bottom of a page should minimize any potential formatting issues. Specifying the width and height (as in the example, or using CSS) should reduce the likelihood of a visible broken image icon on your page should PyXIE go offline or the `id` be unregistered.
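
If pages or emails are generated from Python, the tag can be built programmatically so that the `id` is always URL-encoded and the attribute value is HTML-escaped. A small sketch; `pixel_img_tag` is a hypothetical helper, not something PyXIE provides.

```python
import html
import urllib.parse


def pixel_img_tag(base_url: str, pixel_id: str) -> str:
    """Return an <img> tag pointing at the tracking pixel for `pixel_id`."""
    # URL-encode the id so spaces or '&' in it cannot break the query string.
    query = urllib.parse.urlencode({"id": pixel_id})
    src = f"{base_url.rstrip('/')}/?{query}"
    # Escape the URL for use inside a double-quoted HTML attribute.
    return f'<img src="{html.escape(src, quote=True)}" width="1" height="1" />'


if __name__ == "__main__":
    print(pixel_img_tag("http://127.0.0.1:5000", "testing"))
```

For the `id` of `testing`, this produces the same tag as the hand-written HTML example.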
106+
107+
### View or collect stats
108+
Statistics are viewable only by those who have a valid API key, and can be accessed using the `/stats` endpoint. When successful, you should get valid JSON back as well as a status code of `200`.
109+
110+
For example:
111+
```bash
112+
user@shell> curl -Ss -H 'X-Api-Key: your-api-key-here' 'http://127.0.0.1:5000/stats' | jq
113+
{
114+
"browser_family_counts": {
115+
"foo": {
116+
"192.168.1.99": {
117+
"Firefox": 1,
118+
"curl": 1
119+
}
120+
},
121+
"testing": {
122+
"127.0.0.1": {
123+
"Firefox": 3
124+
}
125+
}
126+
},
127+
"os_family_counts": {
128+
"foo": {
129+
"192.168.1.99": {
130+
"Mac OS X": 1,
131+
"Unknown": 1
132+
}
133+
},
134+
"testing": {
135+
"127.0.0.1": {
136+
"Mac OS X": 3
137+
}
138+
}
139+
},
140+
"referrer_counts": {
141+
"foo": {
142+
"192.168.1.99": {
143+
"Unknown": 2
144+
}
145+
},
146+
"testing": {
147+
"127.0.0.1": {
148+
"Unknown": 3
149+
}
150+
}
151+
}
152+
}
153+
```
154+
155+
The data is structured in the following format (examples are taken from the first block of the output above):
156+
- Name of the data (e.g. `browser_family_counts`)
157+
- an `id` you registered (e.g. `foo`)
158+
- The IP address of the individual who viewed the tracking pixel (e.g. `192.168.1.99`)
159+
- The value of the viewer data and the number of times that value has been seen (`Firefox` has been seen `1` time and `curl` has been seen `1` time)
160+
161+
To put all of that together: one or more users at the IP address `192.168.1.99` saw a tracking pixel with an `id` of `foo`, once with a "browser family" of `Firefox` and once with `curl`.
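
Because the nesting is always category, then `id`, then IP address, then value, the JSON is straightforward to aggregate. The sketch below totals views and distinct IP addresses per `id`; the function names are hypothetical, and `SAMPLE` is copied from the `browser_family_counts` block of the example output.

```python
def views_per_id(stats: dict, category: str = "browser_family_counts") -> dict:
    """Sum every counter under each id for one category of the /stats JSON."""
    totals = {}
    for pixel_id, by_ip in stats.get(category, {}).items():
        # by_ip maps an IP address to a {value: count} dictionary.
        totals[pixel_id] = sum(sum(counts.values()) for counts in by_ip.values())
    return totals


def unique_ips_per_id(stats: dict, category: str = "browser_family_counts") -> dict:
    """Count the distinct IP addresses that requested each id."""
    return {pixel_id: len(by_ip) for pixel_id, by_ip in stats.get(category, {}).items()}


SAMPLE = {
    "browser_family_counts": {
        "foo": {"192.168.1.99": {"Firefox": 1, "curl": 1}},
        "testing": {"127.0.0.1": {"Firefox": 3}},
    }
}

print(views_per_id(SAMPLE))       # {'foo': 2, 'testing': 3}
print(unique_ips_per_id(SAMPLE))  # {'foo': 1, 'testing': 1}
```

Each view increments one counter in every category, so summing a single category is enough to count views.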
162+
163+
### Unregister an `id`
164+
Note that unregistering an `id` is destructive and all data for that `id` will be lost. If you wish to retain the data, make a copy of your data file (e.g. `uadb.json`) first. If successful, you should get a "Success" message and a status code of `204`.
165+
166+
```bash
167+
user@shell> curl -Ss -X DELETE -H 'X-Api-Key: your-api-key-here' 'http://127.0.0.1:5000/unregister?id=testing'
168+
Success
169+
```
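
Since unregistering is destructive, a script can copy the data file before issuing the `DELETE`. The following is a sketch under assumptions: both helper names are hypothetical, the backup naming scheme is arbitrary, and the data file is assumed to be readable from where the script runs.

```python
import shutil
import time
import urllib.parse
import urllib.request
from pathlib import Path


def backup_data_file(data_file: str) -> Path:
    """Copy the data file next to itself with a timestamped .bak suffix."""
    source = Path(data_file)
    backup = source.with_name(f"{source.name}.{time.strftime('%Y%m%d-%H%M%S')}.bak")
    shutil.copy2(source, backup)  # copy2 also preserves modification times
    return backup


def build_unregister_request(base_url: str, api_key: str, pixel_id: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for `pixel_id`."""
    # Unlike /register, the id is passed in the query string here.
    query = urllib.parse.urlencode({"id": pixel_id})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/unregister?{query}",
        headers={"X-Api-Key": api_key},
        method="DELETE",
    )
```

A typical sequence would call `backup_data_file("data/uadb.json")` first and then pass the built request to `urllib.request.urlopen`.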

constfig.py

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,8 @@ def __init__(self):
         self.LISTEN_PORT = 5000
         self.API_KEYS = []
         self.LOG_LEVEL = "WARNING"
+        self.DATABASE_FILE = "uadb.json"
+        self.RRD_MAX_SIZE = 10000 # Maximum number of records in the database
 
         # Load user config (override defaults above)
         self.load_config()

ddb.py

Lines changed: 3 additions & 3 deletions
@@ -150,7 +150,7 @@ def _get_id(self):
         return request.args.get("id")
 
     def register(self):
-        id = self._get_id()
+        id = request.form.get("id")
         if id in self:
             raise KeyError(f"ID {id} already registered")
         super().__setitem__(id, _DDB(max_size=self._max_size))
@@ -181,13 +181,13 @@ def _cleanup(self):
         for v in self.values():
             v._cleanup()
 
-    def dump(self, filename="uadb.json"):
+    def dump(self, filename=C.DATABASE_FILE):
         with open(filename, "w") as fout:
             json.dump(self, fout, indent=2)
             fout.flush()
             fout.truncate()
 
-    def load(self, filename="uadb.json"):
+    def load(self, filename=C.DATABASE_FILE):
         try:
             with open(filename, "r") as fin:
                 data = json.load(fin)

pyxie.py

Lines changed: 6 additions & 4 deletions
@@ -10,9 +10,7 @@
 
 def _validate_api_key():
     api_key = request.headers.get(C.HTTP_HEADER_X_API_KEY)
-    if api_key in C.API_KEYS:
-        return True
-    return False
+    return api_key in C.API_KEYS
 
 
 @pyxie.route("/register", methods=[C.HTTP_METHOD_POST])
@@ -52,7 +50,11 @@ def metrics():
 
 @pyxie.route("/", methods=[C.HTTP_METHOD_GET])
 def root():
-    _data()
+    try:
+        _data()
+    except KeyError as e:
+        return "Not Found", 404
+
     return Response(C.ONE_BY_ONE, mimetype=C.HTTP_MIME_TYPE_PNG)
 
 

run.sh

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ if [ -z "$LISTEN_IP" ]; then
 fi
 
 if [ -z "$LISTEN_PORT" ]; then
-    export LISTEN_PORT=8000
+    export LISTEN_PORT=5000 # Set to 5000 to match Flask's default and avoid confusion in the docs
 fi
 
 gunicorn --bind $LISTEN_IP:$LISTEN_PORT pyxie:pyxie
