Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pii): Adopt new selectors #818

Merged
merged 10 commits into from
Nov 2, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
- Emit event errors and normalization errors for unknown breadcrumb keys. ([#824](https://github.com/getsentry/relay/pull/824))
- Normalize `breadcrumb.ty` into `breadcrumb.type` for broken Python SDK versions. ([#824](https://github.com/getsentry/relay/pull/824))
- Add the client SDK interface for unreal crashes and set the name to `unreal.crashreporter`. ([#828](https://github.com/getsentry/relay/pull/828))
- Fine-tune the selectors for minidump PII scrubbing. ([#818](https://github.com/getsentry/relay/pull/818))

## 20.10.1

Expand Down
11 changes: 10 additions & 1 deletion relay-general/src/pii/attachments.rs
Original file line number Diff line number Diff line change
Expand Up @@ -369,7 +369,7 @@ impl<'a> PiiAttachmentsProcessor<'a> {
) -> ProcessingState<'s> {
self.root_state.enter_borrowed(
filename,
Some(Cow::Owned(FieldAttrs::new().pii(Pii::Maybe))),
Some(Cow::Owned(FieldAttrs::new().pii(Pii::True))),
Some(value_type),
)
}
Expand All @@ -383,9 +383,18 @@ impl<'a> PiiAttachmentsProcessor<'a> {
state: &ProcessingState<'_>,
encodings: ScrubEncodings,
) -> bool {
let pii = state.attrs().pii;
if pii == Pii::False {
return false;
}

let mut changed = false;

for (selector, rules) in &self.compiled_config.applications {
if pii == Pii::Maybe && !selector.is_specific() {
continue;
}

if state.path().matches_selector(&selector) {
for rule in rules {
// Note:
Expand Down
212 changes: 207 additions & 5 deletions relay-general/src/pii/minidumps.rs
Original file line number Diff line number Diff line change
Expand Up @@ -256,39 +256,61 @@ impl PiiAttachmentsProcessor<'_> {
let mut changed = false;

for item in items {
// IMPORTANT: Minidump sections are always classified as Pii:Maybe. This avoids to
// accidentally scrub stack memory with highly generic selectors. TODO: Update the PII
// system with a better approach.
let attrs = Cow::Owned(FieldAttrs::new().pii(Pii::Maybe));

match item {
MinidumpItem::StackMemory(range) => {
// IMPORTANT: The stack is PII::Maybe to avoid accidentally scrubbing it
// with highly generic selectors.
let slice = data
.get_mut(range)
.ok_or(ScrubMinidumpError::InvalidAddress)?;

// Backwards-compatible visit
let attrs = Cow::Owned(FieldAttrs::new().pii(Pii::Maybe));
let state =
file_state.enter_static("", Some(attrs), Some(ValueType::StackMemory));
changed |= self.scrub_bytes(slice, &state, ScrubEncodings::All);

// Documented visit
let attrs = Cow::Owned(FieldAttrs::new().pii(Pii::Maybe));
let state = file_state.enter_static(
"stack_memory",
Some(attrs),
Some(ValueType::Binary),
);
changed |= self.scrub_bytes(slice, &state, ScrubEncodings::All);
}
MinidumpItem::NonStackMemory(range) => {
// Backwards-compatible visit.
let slice = data
.get_mut(range)
.ok_or(ScrubMinidumpError::InvalidAddress)?;
let attrs = Cow::Owned(FieldAttrs::new().pii(Pii::Maybe));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to reinstantiate this I believe? Shouldn't you be able to pass down a Cow::Borrowed for the individual visits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't pass this to multiple calls and it also is not clonable. And since in the future we'll want some to be set to Pii::Yes de-duplicating this seems more cognitive overhead than it saves. That was the logic anyway, I don't feel strongly so could change this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FieldAttrs is Copy

let state =
file_state.enter_static("", Some(attrs), Some(ValueType::HeapMemory));
changed |= self.scrub_bytes(slice, &state, ScrubEncodings::All);

// Documented visit.
let attrs = Cow::Owned(FieldAttrs::new().pii(Pii::Maybe));
let state = file_state.enter_static(
"heap_memory",
Some(attrs),
Some(ValueType::Binary),
);
changed |= self.scrub_bytes(slice, &state, ScrubEncodings::All);
}
MinidumpItem::LinuxEnviron(range) | MinidumpItem::LinuxCmdLine(range) => {
let slice = data
.get_mut(range)
.ok_or(ScrubMinidumpError::InvalidAddress)?;
let attrs = Cow::Owned(FieldAttrs::new().pii(Pii::Maybe));
let state = file_state.enter_static("", Some(attrs), Some(ValueType::Binary));
changed |= self.scrub_bytes(slice, &state, ScrubEncodings::All);
}
MinidumpItem::CodeModuleName(range) => {
let slice = data
.get_mut(range)
.ok_or(ScrubMinidumpError::InvalidAddress)?;
let attrs = Cow::Owned(FieldAttrs::new().pii(Pii::Maybe));
// Mirrors decisions made on NativeImagePath type
let state =
file_state.enter_static("code_file", Some(attrs), Some(ValueType::String));
Expand All @@ -299,6 +321,7 @@ impl PiiAttachmentsProcessor<'_> {
let slice = data
.get_mut(range)
.ok_or(ScrubMinidumpError::InvalidAddress)?;
let attrs = Cow::Owned(FieldAttrs::new().pii(Pii::Maybe));
// Mirrors decisions made on NativeImagePath type
let state =
file_state.enter_static("debug_file", Some(attrs), Some(ValueType::String));
Expand Down Expand Up @@ -377,6 +400,33 @@ mod tests {
iter.next(); // remove main module
iter.cloned().collect()
}

/// Returns the raw stack memory regions.
fn stacks<'slf>(&'slf self, which: Which) -> Vec<&'slf [u8]> {
let dump: &'slf Minidump<&'static [u8]> = match which {
Which::Original => &self.orig_dump,
Which::Scrubbed => &self.scrubbed_dump,
};

let thread_list: MinidumpThreadList = dump.get_stream().unwrap();
let stack_rvas: Vec<RVA> = thread_list
.threads
.iter()
.map(|t| t.raw.stack.memory.rva)
.collect();

// These bytes are kept alive by our struct itself, so returning them with the
// lifetime of our struct is fine. The lifetimes on the Minidump::MemoryRegions
// iterator are currenty wrong and assumes we keep a reference to the
// MinidumpMemoryList, hence we need to transmute this. See
// https://github.com/luser/rust-minidump/pull/111
let mem_list: MinidumpMemoryList<'slf> = dump.get_stream().unwrap();
mem_list
.iter()
.filter(|mem| stack_rvas.contains(&mem.desc.memory.rva))
.map(|mem| unsafe { std::mem::transmute(mem.bytes) })
.collect()
}
}

#[test]
Expand Down Expand Up @@ -542,4 +592,156 @@ mod tests {
);
}
}

#[test]
fn test_stack_scrubbing_backwards_compatible_selector() {
// Some users already use this bare selector, that's all we care about for backwards
// compatibility.
let scrubber = TestScrubber::new(
"linux.dmp",
include_bytes!("../../../tests/fixtures/linux.dmp"),
serde_json::json!(
{
"applications": {
"$stack_memory": ["@anything:mask"],
}
}
),
);
for stack in scrubber.stacks(Which::Scrubbed) {
assert!(stack.iter().all(|b| *b == b'*'));
}
}

#[test]
fn test_stack_scrubbing_path_item_selector() {
let scrubber = TestScrubber::new(
"linux.dmp",
include_bytes!("../../../tests/fixtures/linux.dmp"),
serde_json::json!(
{
"applications": {
"$minidump.stack_memory": ["@anything:mask"],
}
}
),
);
for stack in scrubber.stacks(Which::Scrubbed) {
assert!(stack.iter().all(|b| *b == b'*'));
}
}

#[test]
#[should_panic]
fn test_stack_scrubbing_valuetype_selector() {
// This should work, but is known to fail currently because the selector logic never
// considers a selector containing $binary as specific.
let scrubber = TestScrubber::new(
"linux.dmp",
include_bytes!("../../../tests/fixtures/linux.dmp"),
serde_json::json!(
{
"applications": {
"$minidump.$binary": ["@anything:mask"],
}
}
),
);
for stack in scrubber.stacks(Which::Scrubbed) {
assert!(stack.iter().all(|b| *b == b'*'));
}
}

#[test]
fn test_stack_scrubbing_valuetype_not_fully_qualified() {
// Not fully qualified valuetype should not touch the stack
let scrubber = TestScrubber::new(
"linux.dmp",
include_bytes!("../../../tests/fixtures/linux.dmp"),
serde_json::json!(
{
"applications": {
"$binary": ["@anything:mask"],
}
}
),
);
for (scrubbed_stack, original_stack) in scrubber
.stacks(Which::Scrubbed)
.iter()
.zip(scrubber.stacks(Which::Original).iter())
{
assert_eq!(scrubbed_stack, original_stack);
}
}

#[test]
#[should_panic]
fn test_stack_scrubbing_wildcard() {
// Wildcard should not touch the stack. However currently wildcards are considered
// specific selectors so they do. This is a known issue.
let scrubber = TestScrubber::new(
"linux.dmp",
include_bytes!("../../../tests/fixtures/linux.dmp"),
serde_json::json!(
{
"applications": {
"$minidump.*": ["@anything:mask"],
}
}
),
);
for (scrubbed_stack, original_stack) in scrubber
.stacks(Which::Scrubbed)
.iter()
.zip(scrubber.stacks(Which::Original).iter())
{
assert_eq!(scrubbed_stack, original_stack);
}
}

#[test]
fn test_stack_scrubbing_deep_wildcard() {
// Wildcard should not touch the stack
let scrubber = TestScrubber::new(
"linux.dmp",
include_bytes!("../../../tests/fixtures/linux.dmp"),
serde_json::json!(
{
"applications": {
"$attachments.**": ["@anything:mask"],
}
}
),
);
for (scrubbed_stack, original_stack) in scrubber
.stacks(Which::Scrubbed)
.iter()
.zip(scrubber.stacks(Which::Original).iter())
{
assert_eq!(scrubbed_stack, original_stack);
}
}

#[test]
fn test_stack_scrubbing_binary_not_stack() {
let scrubber = TestScrubber::new(
"linux.dmp",
include_bytes!("../../../tests/fixtures/linux.dmp"),
serde_json::json!(
{
"applications": {
"$binary && !stack_memory": ["@anything:mask"],
}
}
),
);
for (scrubbed_stack, original_stack) in scrubber
.stacks(Which::Scrubbed)
.iter()
.zip(scrubber.stacks(Which::Original).iter())
{
assert_eq!(scrubbed_stack, original_stack);
}
}
}
Loading