Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util/log: automatic log redaction #48051

Merged
merged 1 commit into from
Jun 8, 2020
Merged

Conversation

knz
Copy link
Contributor

@knz knz commented Apr 26, 2020

Fixes #46031

See "log redaction" RFC from #48076 for details.

General concept:

  • logs stored on disk contain markers around data items potentially
    containing PII or confidential data (“unsafe data”).

  • debug zip, debug merge-log, and the Logs and LogFiles RPCs
    edit the log entries upon request to replace all the unsafe data by
    ‹×›.

See also the release notes below for a summary of UX.

Example redacted output:

I200426 20:06:45.558765 1 cli/start.go:671 ⋮ GEOS initialized at ‹×›
I200426 20:06:45.558772 1 cli/start.go:676 ⋮ starting cockroach node
I200426 20:06:45.562212 75 storage/rocksdb.go:606 ⋮ opening rocksdb instance at ‹×›
I200426 20:06:45.574608 75 server/server.go:750 ⋮ [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I200426 20:06:45.574782 75 storage/rocksdb.go:606 ⋮ opening rocksdb instance at ‹×›
I200426 20:06:45.581451 75 server/config.go:576 ⋮ [n?] ‹×› storage engine‹×› initialized
I200426 20:06:45.581459 75 server/config.go:579 ⋮ [n?] ‹×›
I200426 20:06:45.581466 75 server/config.go:579 ⋮ [n?] ‹×›
I200426 20:06:45.602204 75 server/server.go:1270 ⋮ [n?] **** add additional nodes by specifying --join=‹×›
I200426 20:06:45.602218 75 server/init.go:166 ⋮ [n?] no stores bootstrapped and --join flag specified, awaiting init command or join with an already initialized node.
I200426 20:06:45.602234 75 server/init.go:186 ⋮ [n?] **** cluster ‹×› has been created
I200426 20:06:47.676567 117 gossip/gossip.go:1538 ⋮ [n1] node has connected to cluster via gossip
I200426 20:06:47.676620 117 kv/kvserver/stores.go:255 ⋮ [n1] wrote ‹×› node addresses to persistent storage
I200508 12:34:37.238123 1346 kv/kvserver/replica_consistency.go:255 ⋮ [n1,consistencyChecker,s2,r‹×›] triggering stats recomputation to resolve delta of ‹×›

This output was produced from the following raw log:

I200426 20:06:45.558765 1 cli/start.go:671 ⋮ GEOS initialized at ‹/data/home/kena/src/go/src/github.com/cockroachdb/cockroach/lib/libgeos_c.so›
I200426 20:06:45.558772 1 cli/start.go:676 ⋮ starting cockroach node
I200426 20:06:45.562212 75 storage/rocksdb.go:606 ⋮ opening rocksdb instance at ‹"/data/home/kena/src/go/src/github.com/cockroachdb/cockroach/cockroach-data/cockroach-temp302681397"›
I200426 20:06:45.574608 75 server/server.go:750 ⋮ [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I200426 20:06:45.574782 75 storage/rocksdb.go:606 ⋮ opening rocksdb instance at ‹"/data/home/kena/src/go/src/github.com/cockroachdb/cockroach/cockroach-data"›
I200426 20:06:45.581451 75 server/config.go:576 ⋮ [n?] ‹1› storage engine‹› initialized
I200426 20:06:45.581459 75 server/config.go:579 ⋮ [n?] ‹RocksDB cache size: 128 MiB›
I200426 20:06:45.581466 75 server/config.go:579 ⋮ [n?] ‹store 0: RocksDB, max size 0 B, max open file limit 1878853›
I200426 20:06:45.602204 75 server/server.go:1270 ⋮ [n?] **** add additional nodes by specifying --join=‹kenax:26257›
I200426 20:06:45.602218 75 server/init.go:166 ⋮ [n?] no stores bootstrapped and --join flag specified, awaiting init command or join with an already initialized node.
I200426 20:06:45.602234 75 server/init.go:186 ⋮ [n?] **** cluster ‹2a7d34b5-a2aa-469b-8efd-ef7deb62ca73› has been created
I200426 20:06:47.676567 117 gossip/gossip.go:1538 ⋮ [n1] node has connected to cluster via gossip
I200426 20:06:47.676620 117 kv/kvserver/stores.go:255 ⋮ [n1] wrote ‹0› node addresses to persistent storage
I200508 12:34:37.238123 1346 kv/kvserver/replica_consistency.go:255 ⋮ [n1,consistencyChecker,s2,r‹4/1:/System{/tsd-tse}›] triggering stats recomputation to resolve delta of ‹{ContainsEstimates:1438 LastUpdateNanos:1588941276231756380 IntentAge:0 GCBytesAge:0 LiveBytes:-35159 LiveCount:-692 KeyBytes:-33660 KeyCount:-692 ValBytes:-1499 ValCount:-692 IntentBytes:0 IntentCount:0 SysBytes:0 SysCount:0}›

Release note (cli change): Certain kinds of advanced troubleshooting
information, for example internal details from the Go runtime, are now
output to a separate log file in the configured log directory. This
file is named <program>-stderr.xxx.log.

Release note (cli change): The server CLI commands (start,
start-single-node, demo) now support a new flag
--redactable-logs, which introduces markers in generated log entries
to facilitate redaction of sensitive information by cockroach debug zip and other CockroachDB APIs. This flag is currently not enabled by
default, but is recommended for all deployments and will be enabled by
default in a later version.

This feature is experimental.

Note: the flag is advertised (in --help) by cockroach start and
cockroach start-single-node, where log files are expected as a
matter of course; it is also supported, yet hidden (from --help),
by every other CLI command because every command can also produce log
entries when passed --log-dir or --logtostderr.

Release note (cli change): The commands cockroach debug zip and
cockroach debug merge-logs support a new flag --redact-logs. When
specified, it causes log messages that potentially contain
confidential data or PII to be redacted away.

This feature is experimental.

Note 1: At the time of this writing, this redaction is extremely
aggressive and may hinder the ability of Cockroach Labs to investigate
issues from log files, especially if --redactable-logs is not
enabled server-side.

Note 2: Other potentially-confidential data may still be retrieved by
cockroach debug zip; the new flag only applies to retrieved log
files inside the output zip file.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@blathers-crl
Copy link

blathers-crl bot commented Apr 26, 2020

❌ The GitHub CI (Cockroach) build has failed on 2c94f0ff.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Apr 27, 2020

❌ The GitHub CI (Cockroach) build has failed on b1b979a8.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@petermattis
Copy link
Collaborator

I haven't looked at the change in detail, but the redacted output looks like it should be redacting keys from log tags which it is not currently doing:

I200426 20:06:49.615435 524 kv/kvserver/replica_consistency.go:255  [n1,consistencyChecker,s1,r4/1:/System{/tsd-tse}] triggering stats recomputation to resolve delta of <REDACTED>

Specifically, the /System{/tsd-tse} part of the log tag can contain table keys. Is this something you're planning to address in a subsequent PR?

@knz
Copy link
Contributor Author

knz commented Apr 27, 2020

Specifically, the /System{/tsd-tse} part of the log tag can contain table keys. Is this something you're planning to address in a subsequent PR?

I already called it out in the commit message as "left over work" that needs to be addressed in this PR before it even merges.

I put the PR out to stimulate discussion this week about the approach. I'm not keen on advancing in this direction if the discussion steers me somewhere else.

@bdarnell
Copy link
Contributor

The markers are selected to be no-op on standard unix terminals.

But not in less, emacs, or vim (where they appear as ^N and ^O).

logs need preprocessing to be greppable

What do you mean? I just made a test file and it works for me. But losing easy greppability seems like a significant downside.

I'm not a fan of digging up obscure characters that seem to be unused (and invisible!) in practice and giving them new meanings. This seems likely to bite us somehow. (For example, as part of a convoluted security vulnerability, inserting ^O characters into someone's database might be a way to defeat their log redactions).

We have occasional requests for changes to our log format (e.g. #44755), which is non-standard (and kind of annoying even when you're used to it - that leading IWEF character was a bad choice). Maybe it's time to introduce a second log stream in a structured format (which would make pre-processing effectively mandatory) from which the redacted and unredacted logs could be generated. We could continue logging the current format for a time as well.

@knz
Copy link
Contributor Author

knz commented Apr 27, 2020

I'm preparing a RFC, this will be a better place for a discussion.

Misc feedback:

But not in less, emacs, or vim (where they appear as ^N and ^O).

less -r works. For emacs/vim, that's easy to solve with editor configurations.

logs need preprocessing to be greppable
What do you mean?

Most of the text is still greppable (that was a big motivation for this particular approach!!), but you need to be careful about text that straddles a marker boundary.

(For example, as part of a convoluted security vulnerability, inserting ^O characters into someone's database might be a way to defeat their log redactions).

I've thought about this, they will be escaped appropriately.

Maybe it's time to introduce a second log stream in a structured format which would make pre-processing effectively mandatory

That is indeed planned, but I consider this to be orthogonal. Moreover, the mandatory pre-processing does not bring us in a better place than the proposal here.

@knz
Copy link
Contributor Author

knz commented Apr 27, 2020

RFC at #48076

@blathers-crl
Copy link

blathers-crl bot commented May 8, 2020

❌ The GitHub CI (Cockroach) build has failed on 7bcacbbf.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented May 8, 2020

❌ The GitHub CI (Cockroach) build has failed on d7c7a62b.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented May 8, 2020

❌ The GitHub CI (Cockroach) build has failed on 344b88d3.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented May 8, 2020

❌ The GitHub CI (Cockroach) build has failed on b23eedb6.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@knz knz force-pushed the 20200426-redact-logs branch 2 times, most recently from 9bd8827 to 49c8707 Compare May 24, 2020 20:35
@knz knz marked this pull request as ready for review May 24, 2020 20:35
@knz knz requested a review from a team May 24, 2020 20:35
@knz knz requested a review from a team as a code owner May 24, 2020 20:35
@knz knz requested review from dt and tbg and removed request for a team May 24, 2020 20:35
@knz knz added the X-blathers-silence Avoid blathers notifications label Jun 5, 2020
@knz knz force-pushed the 20200426-redact-logs branch 3 times, most recently from a051c4b to 8320e32 Compare June 5, 2020 22:20
@knz
Copy link
Contributor Author

knz commented Jun 5, 2020

the stderr redirection stuff works fine. I'll wrap this up soon.

@knz knz force-pushed the 20200426-redact-logs branch 2 times, most recently from 32bf468 to f19398c Compare June 6, 2020 12:48
craig bot pushed a commit that referenced this pull request Jun 6, 2020
49937: cli/interactive_tests: misc improvements r=rohany a=knz

Found while working on #48051

Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
@knz knz force-pushed the 20200426-redact-logs branch 2 times, most recently from c394aba to acda248 Compare June 7, 2020 16:47
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have implemented the SafeFormatter mechanism we discussed earlier and applied it in a few places in roachpb.
For example here:
https://github.com/cockroachdb/cockroach/pull/48051/files#diff-8daf13806c92216f1e390e1411fdf6cbR295-R322

It nicely makes the code more compact, and allows us to share code between the String() and SafeFormat() methods.

Feel free to have a look and provide further feedback. Otherwise I plan to merge this by EOD tomorrow.

Some additional notes for follow-up work:

  • this PR leaves the new feature disabled by default, until I expose the results to a few co-workers and gather a go/no-go decision on flipping the switch.
  • currently the errors package does not know about the SafeFormatter mechanism and still relies on the (now-obsolete) SafeMessager interface. The result is that the amount of details potentially reported to Sentry decreases with this PR. I plan to reconcile this in a separate PR.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @dt, @knz, and @petermattis)

@knz knz force-pushed the 20200426-redact-logs branch 6 times, most recently from 4eadc41 to ecaa0bc Compare June 8, 2020 18:39
See "log redaction" RFC from cockroachdb#48076 for details.

General concept:

- logs stored on disk contain markers around data items potentially
  containing PII or confidential data (“unsafe data”).

- `debug zip`, `debug merge-log`, and the `Logs` and `LogFiles` RPCs
  edit the log entries upon request to replace all the unsafe data by
  `‹×›`.

See also the release notes below for a summary of UX.

Example redacted output:

```
I200426 20:06:45.558765 1 cli/start.go:671 ⋮ GEOS initialized at ‹×›
I200426 20:06:45.558772 1 cli/start.go:676 ⋮ starting cockroach node
I200426 20:06:45.562212 75 storage/rocksdb.go:606 ⋮ opening rocksdb instance at ‹×›
I200426 20:06:45.574608 75 server/server.go:750 ⋮ [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I200426 20:06:45.574782 75 storage/rocksdb.go:606 ⋮ opening rocksdb instance at ‹×›
I200426 20:06:45.581451 75 server/config.go:576 ⋮ [n?] ‹×› storage engine‹×› initialized
I200426 20:06:45.581459 75 server/config.go:579 ⋮ [n?] ‹×›
I200426 20:06:45.581466 75 server/config.go:579 ⋮ [n?] ‹×›
I200426 20:06:45.602204 75 server/server.go:1270 ⋮ [n?] **** add additional nodes by specifying --join=‹×›
I200426 20:06:45.602218 75 server/init.go:166 ⋮ [n?] no stores bootstrapped and --join flag specified, awaiting init command or join with an already initialized node.
I200426 20:06:45.602234 75 server/init.go:186 ⋮ [n?] **** cluster ‹×› has been created
I200426 20:06:47.676567 117 gossip/gossip.go:1538 ⋮ [n1] node has connected to cluster via gossip
I200426 20:06:47.676620 117 kv/kvserver/stores.go:255 ⋮ [n1] wrote ‹×› node addresses to persistent storage
I200508 12:34:37.238123 1346 kv/kvserver/replica_consistency.go:255 ⋮ [n1,consistencyChecker,s2,r‹×›] triggering stats recomputation to resolve delta of ‹×›
```

This output was produced from the following raw log:

```
I200426 20:06:45.558765 1 cli/start.go:671 ⋮ GEOS initialized at ‹/data/home/kena/src/go/src/github.com/cockroachdb/cockroach/lib/libgeos_c.so›
I200426 20:06:45.558772 1 cli/start.go:676 ⋮ starting cockroach node
I200426 20:06:45.562212 75 storage/rocksdb.go:606 ⋮ opening rocksdb instance at ‹"/data/home/kena/src/go/src/github.com/cockroachdb/cockroach/cockroach-data/cockroach-temp302681397"›
I200426 20:06:45.574608 75 server/server.go:750 ⋮ [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I200426 20:06:45.574782 75 storage/rocksdb.go:606 ⋮ opening rocksdb instance at ‹"/data/home/kena/src/go/src/github.com/cockroachdb/cockroach/cockroach-data"›
I200426 20:06:45.581451 75 server/config.go:576 ⋮ [n?] ‹1› storage engine‹› initialized
I200426 20:06:45.581459 75 server/config.go:579 ⋮ [n?] ‹RocksDB cache size: 128 MiB›
I200426 20:06:45.581466 75 server/config.go:579 ⋮ [n?] ‹store 0: RocksDB, max size 0 B, max open file limit 1878853›
I200426 20:06:45.602204 75 server/server.go:1270 ⋮ [n?] **** add additional nodes by specifying --join=‹kenax:26257›
I200426 20:06:45.602218 75 server/init.go:166 ⋮ [n?] no stores bootstrapped and --join flag specified, awaiting init command or join with an already initialized node.
I200426 20:06:45.602234 75 server/init.go:186 ⋮ [n?] **** cluster ‹2a7d34b5-a2aa-469b-8efd-ef7deb62ca73› has been created
I200426 20:06:47.676567 117 gossip/gossip.go:1538 ⋮ [n1] node has connected to cluster via gossip
I200426 20:06:47.676620 117 kv/kvserver/stores.go:255 ⋮ [n1] wrote ‹0› node addresses to persistent storage
I200508 12:34:37.238123 1346 kv/kvserver/replica_consistency.go:255 ⋮ [n1,consistencyChecker,s2,r‹4/1:/System{/tsd-tse}›] triggering stats recomputation to resolve delta of ‹{ContainsEstimates:1438 LastUpdateNanos:1588941276231756380 IntentAge:0 GCBytesAge:0 LiveBytes:-35159 LiveCount:-692 KeyBytes:-33660 KeyCount:-692 ValBytes:-1499 ValCount:-692 IntentBytes:0 IntentCount:0 SysBytes:0 SysCount:0}›
```

Release note (cli change): Certain kinds of advanced troubleshooting
information, for example internal details from the Go runtime, are now
output to a separate log file in the configured log directory.  This
file is named `<program>-stderr.xxx.log`.

Release note (cli change): The server CLI commands (`start`,
`start-single-node`, `demo`) now support a new flag
`--redactable-logs`, which introduces markers in generated log entries
to facilitate redaction of sensitive information by `cockroach debug
zip` and other CockroachDB APIs. This flag is currently not enabled by
default, but is recommended for all deployments and will be enabled by
default in a later version.

This feature is experimental.

Note: the flag is *advertised* (in `--help`) by `cockroach start` and
`cockroach start-single-node`, where log files are expected as a
matter of course; it is also supported, yet *hidden* (from `--help`),
by every other CLI command because every command can also produce log
entries when passed `--log-dir` or `--logtostderr`.

Release note (cli change): The commands `cockroach debug zip` and
`cockroach debug merge-logs` support a new flag `--redact-logs`. When
specified, it causes log messages that potentially contain
confidential data or PII to be redacted away.

This feature is experimental.

Note 1: At the time of this writing, this redaction is extremely
aggressive and may hinder the ability of Cockroach Labs to investigate
issues from log files, especially if `--redactable-logs` is not
enabled server-side.

Note 2: Other potentially-confidential data may still be retrieved by
`cockroach debug zip`; the new flag only applies to retrieved log
files inside the output zip file.
@knz
Copy link
Contributor Author

knz commented Jun 8, 2020

I'll process any further feedback in followup PRs.

bors r=tbg,bdarnell

@craig
Copy link
Contributor

craig bot commented Jun 8, 2020

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
X-blathers-silence Avoid blathers notifications
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cli: support debug zip --anonymize
5 participants