Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let VMDBLogger.log_hashes filter out the sensitive value. #135

Merged
merged 1 commit into from
May 9, 2017

Conversation

lfu
Copy link
Member

@lfu lfu commented Apr 25, 2017

VMDBLogger.log_hashes would log the sensitive value as FILTERED.

Blocks #manageiq/pull/14878.

@miq-bot assign @gmcculloug
@miq-bot add_label fine/yes, enhancement

@gmcculloug
Copy link
Member

@Fryguy @jrafanie @bdunne
Looking for feedback on the enhancements to this logging method as we are looking to use it in other areas. Initially in PR ManageIQ/manageiq#14878 but thinking this could also replace the dump_obj calls for most of the providers during provisioning. (The VMware provision has some specific logging requirements, but others could switch to use this common method.)

I think the biggest change here is that the password lines would still be logged, so you could see the key exists, but the value would be marked as "[FILTERED]". Additionally any encrypted values would also be filtered out.

logger.send(level, " #{l}") unless filter.any? { |f| l.include?(f) }
if key = filter.detect { |f| l.include?(f) }
l.gsub!(/#{key}.*: (.+)/) { |m| m.gsub!($1, "[FILTERED]") }
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it is changing behavior in an interesting way.

Before this, the entire line which contained a key in options[:filter] was not logged, but now the line has to match a regex and the key name is logged. Why make that change in addition to the filtering for MiqPassword::REGEXP?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see @gmcculloug mentioned this:

I think the biggest change here is that the password lines would still be logged, so you could see the key exists,

@@ -108,9 +108,13 @@ def self.log_hashes(logger, h, options = {})
filter = Array(options[:filter]).flatten.compact.map(&:to_s) << "password"
filter.uniq!

YAML.dump(h).split("\n").each do |l|
values = YAML.dump(h).gsub(MiqPassword::REGEXP) { |m| m.gsub($2, "FILTERED") }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is "FILTERED" and the other is "[FILTERED]" (with square brackets)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The substitution here ends up looking like v2:{FILTERED} where the braces come from the encrypted string. Adding [...] into the mix seemed useless.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh weird, so if we remove the capture groups from that regex then this will probably break. Couldn't we just not even print the v2 part? Is that really that important?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is based on an existing regex constant from MiqPassword. Would just need to create a new constant there that captures the entire string. Wasn't sure if it was worth it initially.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would still work with the existing regex:

irb(main):002:0> "stuff...v2:{asbvawerfab}..morestuff".gsub(MiqPassword::REGEXP, "[FILTERED]")
=> "stuff...[FILTERED]..morestuff"

next if l[0...3] == '---'
logger.send(level, " #{l}") unless filter.any? { |f| l.include?(f) }
if key = filter.detect { |f| l.include?(f) }
l.gsub!(/#{key}.*: (.+)/) { |m| m.gsub!($1, "[FILTERED]") }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing string manipulation here, why not just dup the hash passed in and alter the value before doing the YAML.dump? That seems like it would be much more reliable than this regex dance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless of course you're intentionally trying to match keys like "password-but-not-really-a-password"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method was doing a YAML.dump before which means we do not have to be concerned with the depth of embedded objects.

If we stick with the YAML string I think we could do a gsub! for any filters over the whole string outside of the .each loop. Also, would .each_line work here instead of .split("\n")?

Copy link
Member

@carbonin carbonin Apr 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this would match password_for_important_things: s3cr3t, but not root_password: s3cr3t

If we want it to match both we should test for that.

EDIT:

Nevermind, just tested this:

irb(main):008:0> l = "root_password: s3cret"
=> "root_password: s3cret"
irb(main):009:0> l.gsub!(/password.*: (.+)/) { |m| m.gsub!($1, "[FILTERED]") }
=> "root_password: [FILTERED]"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmcculloug .each_line would preserve the \n at the end of each line which would print out an extra blank line for each message in the log.

Copy link
Member

@carbonin carbonin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things after talking to @gmcculloug

next if l[0...3] == '---'
logger.send(level, " #{l}") unless filter.any? { |f| l.include?(f) }
if key = filter.detect { |f| l.include?(f) }
l.gsub!(/#{key}.*: (.+)/) { |m| m.gsub!($1, "[FILTERED]") }
Copy link
Member

@carbonin carbonin Apr 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this would match password_for_important_things: s3cr3t, but not root_password: s3cr3t

If we want it to match both we should test for that.

EDIT:

Nevermind, just tested this:

irb(main):008:0> l = "root_password: s3cret"
=> "root_password: s3cret"
irb(main):009:0> l.gsub!(/password.*: (.+)/) { |m| m.gsub!($1, "[FILTERED]") }
=> "root_password: [FILTERED]"

expect(buffer.read).to_not include("pa$$w0rd")
message = buffer.read
expect(message).to_not include("pa$$w0rd")
expect(message).to include("FILTERED")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change these specs to all compare against the expected message value directly?

That would make what log_hashes is doing a whole lot more clear.

if key = filter.detect { |f| l.include?(f) }
l.gsub!(/#{key}.*: (.+)/) { |m| m.gsub!($1, "[FILTERED]") }
end
logger.send(level, " #{l}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this isn't changed in this PR, but what is the value of writing a bunch of individual log lines rather than a single multi-line message?

@lfu lfu force-pushed the log_hashes branch 4 times, most recently from bc76743 to 80a59c3 Compare April 27, 2017 13:36
@Fryguy
Copy link
Member

Fryguy commented May 1, 2017

My concern is that previously, this was intentionally a pretty blunt object. We filtered line by line and did a line.include? with the filter key, so although you might get false positives, it would kill the line. Now, the regex introduces a potential for something to slip through. I'm wondering if it's cleaner to still just do a line.include?

@lfu
Copy link
Member Author

lfu commented May 3, 2017

@Fryguy Updated.

@Fryguy
Copy link
Member

Fryguy commented May 3, 2017

I'm good with this, but I don't know why the bot is not kicking a rubocop check here.

next if l[0...3] == '---'
logger.send(level, " #{l}") unless filter.any? { |f| l.include?(f) }
if key = filter.detect { |f| l.include?(f) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lfu Can you fix the bot warning...to solve it, just use parens around your if conditional...

if (key = filter.detect { |f| l.include?(f) })

@miq-bot
Copy link
Member

miq-bot commented May 8, 2017

Checked commit lfu@d5ae4de with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
2 files checked, 0 offenses detected
Everything looks fine. ⭐

@Fryguy Fryguy merged commit 799ba56 into ManageIQ:master May 9, 2017
@Fryguy Fryguy added this to the Sprint 61 Ending May 22, 2017 milestone May 9, 2017
simaishi pushed a commit that referenced this pull request Jun 6, 2017
Let VMDBLogger.log_hashes filter out the sensitive value.
(cherry picked from commit 799ba56)

https://bugzilla.redhat.com/show_bug.cgi?id=1458434
@simaishi
Copy link
Contributor

simaishi commented Jun 6, 2017

Fine backport details:

$ git log -1
commit 8f6cd84d1290af555639ac279d408d9ca605b524
Author: Jason Frey <fryguy9@gmail.com>
Date:   Tue May 9 11:08:13 2017 -0400

    Merge pull request #135 from lfu/log_hashes
    
    Let VMDBLogger.log_hashes filter out the sensitive value.
    (cherry picked from commit 799ba5629dc9159a35e61f4d7730af704f874b03)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1458434

@lfu lfu deleted the log_hashes branch September 29, 2018 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants