feat(replays): Add user-agent parsing to replays processor #1420

Merged 57 commits on Aug 29, 2022
Changes from 34 commits
Commits
8a7094e
Add relay-replays package
cmanallen Aug 16, 2022
519e0e8
If replays are enabled parse replay-events and update user-agent meta…
cmanallen Aug 16, 2022
1b295d9
Remove comment
cmanallen Aug 16, 2022
d86143a
Add relay-replays as a dependency of relay-server
cmanallen Aug 16, 2022
cafbea5
Remove unused import
cmanallen Aug 16, 2022
833c733
Add additional payload fields
cmanallen Aug 16, 2022
ea78c94
Update parser test to supply vector of bytes
cmanallen Aug 16, 2022
bbd6dbd
Add comment to remind of faulty behavior
cmanallen Aug 16, 2022
c8eb9af
Provide event_id
cmanallen Aug 16, 2022
c2e7192
Only process replays if they match the schema
cmanallen Aug 16, 2022
59cac2e
Increase precision to 64 bits
cmanallen Aug 17, 2022
33bfeed
Assert appropriate payload modification were made
cmanallen Aug 17, 2022
f12726b
Add integration test degradation notice
cmanallen Aug 17, 2022
966b1f3
Remove rust coverage
cmanallen Aug 17, 2022
809228e
Merge branch 'master' into replays-parse-user-agent
cmanallen Aug 17, 2022
5c1c555
Update changelog
cmanallen Aug 17, 2022
e1051aa
Remove user-agent borrow
cmanallen Aug 17, 2022
30254cc
Fix linter errors
cmanallen Aug 17, 2022
82b37e5
Remove comment
cmanallen Aug 17, 2022
1e97484
Up timeout
cmanallen Aug 18, 2022
7bea292
Update user-agent docstring syntax
cmanallen Aug 18, 2022
f09d943
Mark some required fields as optional
cmanallen Aug 18, 2022
869756d
Add module docstring
cmanallen Aug 18, 2022
9507837
Update docstring
cmanallen Aug 18, 2022
92318bb
Rename replay parsing function and simplify body
cmanallen Aug 18, 2022
d92c657
Simplify assertion behavior
cmanallen Aug 18, 2022
3c32a03
Use Device type from uaparser module
cmanallen Aug 18, 2022
921b082
Allow sdk key to be optional
cmanallen Aug 18, 2022
c511540
Update mock data name to reflect payload origin
cmanallen Aug 18, 2022
c24e85a
Merge branch 'master' into replays-parse-user-agent
cmanallen Aug 18, 2022
9fda019
Merge branch 'replays-parse-user-agent' into replays-parse-client-ip-…
cmanallen Aug 18, 2022
603565d
Set user's ip-address if none was found
cmanallen Aug 19, 2022
7b32b6c
Merge branch 'replays-parse-client-ip-address' into replays-parse-use…
cmanallen Aug 19, 2022
436e9a0
Allow null replay_start_timestamp
cmanallen Aug 19, 2022
3713598
Update relay-replays/src/lib.rs
cmanallen Aug 19, 2022
9970906
Update relay-replays/src/lib.rs
cmanallen Aug 19, 2022
20cb0ed
Permit vectors to default to empty list
cmanallen Aug 19, 2022
4f5c846
Merge branch 'replays-parse-user-agent' of https://github.com/getsent…
cmanallen Aug 19, 2022
6204a26
Allow compiler to infer type
cmanallen Aug 19, 2022
8ebc850
Fix import order
cmanallen Aug 19, 2022
a095bf2
Merge branch 'replays-parse-user-agent' of https://github.com/getsent…
cmanallen Aug 19, 2022
28d9cd3
Log out failed deserialization
cmanallen Aug 19, 2022
9c5c51f
Return false
cmanallen Aug 19, 2022
03cd143
Make ReplayInput private
cmanallen Aug 19, 2022
c2b6fe5
Allow requests struct to default
cmanallen Aug 19, 2022
ac2a3c1
Make ReplayInput methods private
cmanallen Aug 19, 2022
4d529cb
Add docstring to normalize_replay_event function
cmanallen Aug 19, 2022
0f7e8b1
Add default for user struct and update ip-address setter with simplif…
cmanallen Aug 19, 2022
d7105fb
Merge branch 'master' into replays-parse-user-agent
cmanallen Aug 19, 2022
31dae59
Merge branch 'master' into replays-parse-user-agent
cmanallen Aug 29, 2022
1ddb848
Set unreachable and use * operator instead of to_string
cmanallen Aug 29, 2022
474c6c9
Merge branch 'replays-parse-user-agent' of https://github.com/getsent…
cmanallen Aug 29, 2022
edac361
Update cargo.lock
cmanallen Aug 29, 2022
a5382d2
Use outer-doc comment syntax
cmanallen Aug 29, 2022
92d1ae5
Flatten test cases
cmanallen Aug 29, 2022
51740d9
Access user_agent functions and types through the module
cmanallen Aug 29, 2022
2e458e6
Merge branch 'master' into replays-parse-user-agent
cmanallen Aug 29, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -16,6 +16,7 @@
**Features**:

- Make Redis connection pool configurable. ([#1418](https://github.com/getsentry/relay/pull/1418))
- Add user-agent parsing to replays processor. ([#1420](https://github.com/getsentry/relay/pull/1420))

## 22.8.0

17 changes: 17 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions py/CHANGELOG.md
@@ -3,6 +3,7 @@
## Unreleased

- Add `transaction_info` to event payloads, including the transaction's source and internal original transaction name. ([#1330](https://github.com/getsentry/relay/pull/1330))
- Add user-agent parsing to replays processor. ([#1420](https://github.com/getsentry/relay/pull/1420))

## 0.8.13

6 changes: 6 additions & 0 deletions relay-general/src/user_agent.rs
@@ -1,4 +1,10 @@
//! Utility functions for working with user agents.
//!
//! NOTICE:
//!
//! Adding user-agent parsing to your module incurs a latency penalty in the test suite, which
//! can cause some integration tests to fail. To fix this, add a timeout to your consumer.

use lazy_static::lazy_static;
use uaparser::{Parser, UserAgentParser};
24 changes: 24 additions & 0 deletions relay-replays/Cargo.toml
@@ -0,0 +1,24 @@
[package]
name = "relay-replays"
authors = ["Sentry <oss@sentry.io>"]
description = "Replays functionality for Relay"
homepage = "https://getsentry.github.io/relay/"
repository = "https://github.com/getsentry/relay"
version = "22.7.0"
edition = "2018"
license-file = "../LICENSE"
publish = false

[dependencies]
relay-common = { path = "../relay-common" }
relay-general = { path = "../relay-general" }
relay-log = { path = "../relay-log" }
serde = { version = "1.0.114", features = ["derive"] }
serde_json = "1.0.55"
relay-filter = { path = "../relay-filter" }
rand = "0.6.5"
rand_pcg = "0.1.2"
unicase = "2.6.0"

[dev-dependencies]
insta = { version = "1.1.0", features = ["ron"] }
240 changes: 240 additions & 0 deletions relay-replays/src/lib.rs
@@ -0,0 +1,240 @@
//! Replay processing and normalization module.
//!
//! Replays are multi-part values sent from Sentry integrations spanning arbitrary time-periods.
//! They are ingested incrementally and are available for viewing after the first two segments
//! (segment 0 and 1) have completed ingestion.
//!
//! # Protocol
//!
//! Relay expects a JSON object with some mandatory metadata. Environment and user metadata
//! are usually sent in addition to this minimal payload.
//!
//! ```json
//! {
//!     "type": "replay_event",
//!     "replay_id": "d2132d31b39445f1938d7e21b6bf0ec4",
//!     "event_id": "63c5b0f895441a94340183c5f1e74cd4",
//!     "segment_id": 0,
//!     "timestamp": 1597976392.6542819,
//!     "replay_start_timestamp": 1597976392.6542819,
//!     "urls": ["https://sentry.io"],
//!     "error_ids": ["d2132d31b39445f1938d7e21b6bf0ec4"],
//!     "trace_ids": ["63c5b0f895441a94340183c5f1e74cd4"],
//!     "requests": {
//!         "headers": {"User-Agent": "Mozilla/5.0..."}
//!     }
//! }
//! ```
use relay_general::user_agent::{parse_device, parse_os, parse_user_agent, Device};
use serde::{Deserialize, Serialize};
use serde_json::Error;
use std::collections::HashMap;
use std::fmt::Write;
use std::net::IpAddr;

/// Parses a replay event, adds user-agent metadata and (when the payload has none) the
/// detected user IP address, then serializes the result back to JSON bytes.
pub fn normalize_replay_event(
    replay_bytes: &[u8],
    detected_ip_address: Option<IpAddr>,
) -> Result<Vec<u8>, Error> {
    let mut replay_input: ReplayInput = serde_json::from_slice(replay_bytes)?;

    // Set user-agent metadata.
    replay_input.set_user_agent_meta();

    // Set the user's IP address if one was detected. An address already present in the
    // payload is left untouched.
    if let Some(ip_address) = detected_ip_address {
        replay_input.set_user_ip_address(ip_address);
    }

    serde_json::to_vec(&replay_input)
}

#[derive(Debug, Deserialize, Serialize)]
pub struct ReplayInput {
    #[serde(rename = "type")]
    ty: String,
    replay_id: String,
    event_id: String,
    segment_id: u8,
    timestamp: f64,
    replay_start_timestamp: Option<f64>,
    urls: Vec<String>,
    error_ids: Vec<String>,
    trace_ids: Vec<String>,
    contexts: Option<Contexts>,
    dist: Option<String>,
    platform: Option<String>,
    environment: Option<String>,
    release: Option<String>,
    tags: Option<HashMap<String, String>>,
    user: Option<User>,
    sdk: Option<VersionedMeta>,
    requests: Requests,
}

impl ReplayInput {
    pub fn set_user_agent_meta(&mut self) {
        let user_agent = &self.requests.headers.user_agent;

        let device = parse_device(user_agent);

        let ua = parse_user_agent(user_agent);
        let browser_struct = VersionedMeta {
            name: ua.family,
            version: get_version(&ua.major, &ua.minor, &ua.patch),
        };

        let os = parse_os(user_agent);
        let os_struct = VersionedMeta {
            name: os.family,
            version: get_version(&os.major, &os.minor, &os.patch),
        };

        self.contexts = Some(Contexts {
            device: Some(device),
            browser: Some(browser_struct),
            os: Some(os_struct),
        });
    }

    pub fn set_user_ip_address(&mut self, ip_address: IpAddr) {
        match &self.user {
            Some(user) => {
                // User was found but no ip-address exists on the object.
                if user.ip_address.is_none() {
                    self.user = Some(User {
                        id: user.id.to_owned(),
                        username: user.username.to_owned(),
                        email: user.email.to_owned(),
                        ip_address: Some(ip_address.to_string()),
                    });
                }
            }
            None => {
                // Anonymous user-data provided.
                self.user = Some(User {
                    id: None,
                    username: None,
                    email: None,
                    ip_address: Some(ip_address.to_string()),
                });
            }
        }
    }
}

fn get_version(
    major: &Option<String>,
    minor: &Option<String>,
    patch: &Option<String>,
) -> Option<String> {
    let mut version = major.clone()?;

    if let Some(minor) = minor {
        write!(version, ".{}", minor).ok();
        if let Some(patch) = patch {
            write!(version, ".{}", patch).ok();
        }
    }

    Some(version)
}

#[derive(Debug, Deserialize, Serialize)]
struct Contexts {
    browser: Option<VersionedMeta>,
    device: Option<Device>,
    os: Option<VersionedMeta>,
}

#[derive(Debug, Deserialize, Serialize)]
struct VersionedMeta {
    name: String,
    version: Option<String>,
}

#[derive(Debug, Deserialize, Serialize)]
struct User {
    id: Option<String>,
    username: Option<String>,
    email: Option<String>,
    ip_address: Option<String>,
}

#[derive(Debug, Deserialize, Serialize)]
struct Requests {
    url: Option<String>,
    headers: Headers,
}

#[derive(Debug, Deserialize, Serialize)]
struct Headers {
    #[serde(rename = "User-Agent")]
    user_agent: String,
}

#[cfg(test)]
mod tests {
    use std::net::{IpAddr, Ipv4Addr};

    use crate::ReplayInput;

    #[test]
    fn test_set_ip_address() {
        let ip_address = IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1));

        // The payload already has a user IP address, so it must not be overwritten.
        let payload = include_bytes!("../tests/fixtures/replay.json");
        let mut replay_input: ReplayInput = serde_json::from_slice(payload).unwrap();
        replay_input.set_user_ip_address(ip_address);
        assert_eq!(replay_input.user.unwrap().ip_address.unwrap(), "192.168.11.12");

        // The payload has no user object, so one is created with the detected address.
        let payload = include_bytes!("../tests/fixtures/replay_missing_user.json");
        let mut replay_input: ReplayInput = serde_json::from_slice(payload).unwrap();
        replay_input.set_user_ip_address(ip_address);
        assert_eq!(replay_input.user.unwrap().ip_address.unwrap(), "127.0.0.1");

        // The payload has a user without an IP address, so the detected address is set.
        let payload = include_bytes!("../tests/fixtures/replay_missing_user_ip_address.json");
        let mut replay_input: ReplayInput = serde_json::from_slice(payload).unwrap();
        replay_input.set_user_ip_address(ip_address);
        assert_eq!(replay_input.user.unwrap().ip_address.unwrap(), "127.0.0.1");
    }

    #[test]
    fn test_set_user_agent_meta() {
        let payload = include_bytes!("../tests/fixtures/replay.json");
        let mut replay_input: ReplayInput = serde_json::from_slice(payload).unwrap();
        replay_input.set_user_agent_meta();

        let contexts = replay_input.contexts.expect("contexts should be set");

        let browser = contexts.browser.expect("browser context should be set");
        assert_eq!(browser.name, "Safari");
        assert_eq!(browser.version.unwrap(), "15.5");

        let os = contexts.os.expect("os context should be set");
        assert_eq!(os.name, "Mac OS X");
        assert_eq!(os.version.unwrap(), "10.15.7");

        let device = contexts.device.expect("device context should be set");
        assert_eq!(device.family, "Mac");
        assert_eq!(device.brand.unwrap(), "Apple");
        assert_eq!(device.model.unwrap(), "Mac");
    }
}
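As a review aid, the version-joining rule in `get_version` above can be checked in isolation: a missing major component yields no version, and a patch component is only kept when a minor component is present. The following is a minimal, std-only sketch of that rule, not the code from the diff; `join_version` is a hypothetical name, and it takes `Option<&str>` instead of the `&Option<String>` arguments used in `lib.rs`.

```rust
/// Joins optional version components into a dotted string.
///
/// Hypothetical stand-in for `get_version`; the name and the `Option<&str>`
/// parameters are assumptions made for this sketch.
fn join_version(major: Option<&str>, minor: Option<&str>, patch: Option<&str>) -> Option<String> {
    // Without a major component there is no version to report.
    let mut version = major?.to_owned();

    // The patch component is only appended when a minor component exists.
    if let Some(minor) = minor {
        version.push('.');
        version.push_str(minor);
        if let Some(patch) = patch {
            version.push('.');
            version.push_str(patch);
        }
    }

    Some(version)
}
```

Called with `Some("15"), Some("5"), None` this returns `Some("15.5")`; with a major and a patch but no minor, the patch is dropped and only the major is returned.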
49 changes: 49 additions & 0 deletions relay-replays/tests/fixtures/replay.json
@@ -0,0 +1,49 @@
{
    "type": "replay_event",
    "replay_id": "d2132d31b39445f1938d7e21b6bf0ec4",
    "event_id": "123",
    "segment_id": 0,
    "timestamp": 1597977777.6189718,
    "replay_start_timestamp": 1597976392.6542819,
    "urls": [
        "sentry.io"
    ],
    "error_ids": [
        "1",
        "2"
    ],
    "trace_ids": [
        "3",
        "4"
    ],
    "dist": "1.12",
    "platform": "Python",
    "environment": "production",
    "release": "version@1.3",
    "tags": {
        "transaction": "/organizations/:orgId/performance/:eventSlug/"
    },
    "sdk": {
        "name": "name",
        "version": "version"
    },
    "user": {
        "id": "123",
        "username": "user",
        "email": "user@site.com",
        "ip_address": "192.168.11.12"
    },
    "requests": {
        "url": null,
        "headers": {
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
        }
    },
    "contexts": {
        "trace": {
            "trace_id": "4C79F60C11214EB38604F4AE0781BFB2",
            "span_id": "FA90FDEAD5F74052",
            "type": "trace"
        }
    }
}
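The fixture above already carries `user.ip_address`, which is why `test_set_ip_address` expects `192.168.11.12` rather than the detected address. A minimal, std-only sketch of that fallback rule follows. It is an illustration under assumptions, not the code from the diff: it assumes `User` derives `Default` (the commit list mentions adding a default for the user struct, but the exact shape is a guess), and it uses a hypothetical free function on `Option<User>` instead of the method on `ReplayInput`.

```rust
use std::net::IpAddr;

// Simplified copy of the `User` struct from the diff; `Default` is an assumption.
#[derive(Debug, Default, Clone, PartialEq)]
struct User {
    id: Option<String>,
    username: Option<String>,
    email: Option<String>,
    ip_address: Option<String>,
}

/// Fills in `ip_address` only when the payload did not supply one.
/// A missing user is replaced by an otherwise empty (anonymous) user.
fn set_user_ip_address(user: &mut Option<User>, detected: IpAddr) {
    // Create an empty user if the payload had none, then borrow it mutably.
    let user = user.get_or_insert_with(User::default);

    // An address supplied by the client always wins over the detected one.
    if user.ip_address.is_none() {
        user.ip_address = Some(detected.to_string());
    }
}
```

Mutating the existing user in place avoids rebuilding the struct field by field, so the other fields (`id`, `username`, `email`) are preserved without copying them.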