Port parser filter from fluent-plugin-parser. fix #1189 #1191

Merged
repeatedly merged 19 commits into master from port-parser-filter
Nov 10, 2016

Member

@repeatedly repeatedly commented Aug 29, 2016

Changes from original code:

  • Use v0.14 namespace
  • Use v0.14 API (Now working)
  • Remove warnings such as unused variables

@repeatedly
Member Author

@tagomoris We can't set estimate_current_event via the parser helper API before configure. Do you know a workaround for this?

@repeatedly
Member Author

The remaining task is replacing filter_stream with filter_with_time.

@repeatedly
Member Author

I didn't set estimate_current_event = false, but the tests passed. Hmm...

Fluent::Test.setup
@tag = 'test'
@default_time = Time.parse('2010-05-04 03:02:01 UTC')
Timecop.freeze(@default_time)
Contributor

Should we restore Time with Timecop.return in #teardown?

Member Author

right

@tagomoris tagomoris added the v0.14 label Sep 1, 2016
self
end

def filter_stream(tag, es)
Contributor

This is just a comment. Feel free to ignore it.
I think that we can use #filter_with_time instead of #filter_stream here.
Should adapting to the filter pipeline be the next task?

config_param :inject_key_prefix, :string, default: nil
config_param :replace_invalid_sequence, :bool, default: false
config_param :hash_value_field, :string, default: nil
config_param :suppress_parse_error_log, :bool, default: false
Member

There is no need for this option; use the error stream instead.

Member Author

It seems good. I will rewrite it.

The default filter_stream implementation catches exceptions and
routes them to the error stream,
so the warning tests now check the log instead.
The drawback is that the parser tries to parse the time field even if
users don't need the parsed time.
If this is critical, we should revive the time_parse option.
Several tests are run with both the new and compat configurations
to check backward compatibility.
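
The behaviour these commit messages rely on can be sketched in plain Ruby. This is a toy model under stated assumptions, not Fluentd's code: ToyRouter, ToyFilter, and the record layout are hypothetical names. Only the idea comes from the messages above: a per-event filter_with_time call whose exceptions are caught by the stream-level loop and handed to an error stream.

```ruby
# Toy model only: ToyRouter and ToyFilter are hypothetical stand-ins,
# not Fluentd classes.
class ToyRouter
  attr_reader :errors

  def initialize
    @errors = []
  end

  # Collects failed events instead of raising to the caller.
  def emit_error_event(tag, time, record, error)
    @errors << [tag, time, record, error]
  end
end

class ToyFilter
  def initialize(router)
    @router = router
  end

  # Per-event hook: returns [time, record], or raises on bad input.
  def filter_with_time(tag, time, record)
    raise ArgumentError, "data does not exist" unless record.key?("message")
    [time, record.merge("parsed" => true)]
  end

  # Stream-level loop: one failing event does not stop the others.
  def filter_stream(tag, events)
    events.each_with_object([]) do |(time, record), out|
      begin
        out << filter_with_time(tag, time, record)
      rescue => e
        @router.emit_error_event(tag, time, record, e)
      end
    end
  end
end

router = ToyRouter.new
events = [[1, { "message" => "ok" }], [2, { "other" => "x" }]]
passed = ToyFilter.new(router).filter_stream("test", events)
```

In this sketch the second event ends up in router.errors rather than aborting the stream, which is why a test would inspect the error stream or the log instead of expecting an exception.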
@repeatedly
Member Author

Rebased and applied the review comments.

@repeatedly
Member Author

@tagomoris Could you check this patch again?

Member

@tagomoris tagomoris left a comment

Added review comments.

config_param :replace_invalid_sequence, :bool, default: false
config_param :hash_value_field, :string, default: nil

config_section :parse do
Member

What's this blank config_section for?

r = handle_parsed(tag, record, t, values)
return t, r
else
router.emit_error_event(tag, time, record, Fluent::Plugin::Parser::ParserError.new("pattern not match with data '#{raw_value}'"))
Member

Are there any cases where the parser returns no values without raising ParserError?

Member Author

I'm not sure, but we can't guarantee that no existing parser returns nil for time and values.
So keeping this behaviour is better for users.
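
A minimal sketch of the case being discussed, in plain Ruby. The kv_parser lambda and parse_or_report are hypothetical names, and whether any real parser plugin yields nil for both arguments is exactly the open question in this thread; the sketch only shows why the else branch is kept as a safety net.

```ruby
# Hypothetical parser: yields (nil, nil) instead of raising when the
# input does not match. Real parser plugins may or may not do this.
kv_parser = lambda do |text, &block|
  pairs = text.scan(/(\w+)=(\w+)/)
  if pairs.empty?
    block.call(nil, nil)
  else
    block.call(Time.at(0), pairs.to_h)
  end
end

# Returns [:ok, time, values] or [:error, message].
def parse_or_report(parser, raw_value)
  parser.call(raw_value) do |time, values|
    if time && values
      return [:ok, time, values]
    else
      # No exception was raised, but there is nothing usable either.
      return [:error, "pattern not match with data '#{raw_value}'"]
    end
  end
end

ok = parse_or_report(kv_parser, 'col1=foo col2=bar')
bad = parse_or_report(kv_parser, 'foo bar')
```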

router.emit_error_event(tag, time, record, e)
return FAILED_RESULT
rescue ArgumentError => e
if @replace_invalid_sequence
Member

I think it's better to write raise unless @replace_invalid_sequence to reduce nesting.

Member Author

good point. I will rewrite these lines.
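
The suggestion above is a guard clause. A minimal sketch of the two shapes, assuming a hypothetical check_encoding helper that stands in for the real parse call (none of these method names are from the PR):

```ruby
# Hypothetical stand-in for the parse step: rejects invalid byte sequences.
def check_encoding(text)
  raise ArgumentError, "invalid byte sequence" unless text.valid_encoding?
  text
end

# Nested form: the recovery path sits inside an if/else.
def parse_nested(raw, replace_invalid_sequence)
  begin
    check_encoding(raw)
  rescue ArgumentError
    if replace_invalid_sequence
      check_encoding(raw.scrub("?"))
    else
      raise
    end
  end
end

# Guard-clause form: re-raise early, then continue at one nesting level.
def parse_guarded(raw, replace_invalid_sequence)
  check_encoding(raw)
rescue ArgumentError
  raise unless replace_invalid_sequence
  check_encoding(raw.scrub("?"))
end

guarded = parse_guarded("a\xffz", true)
nested = parse_nested("a\xffz", true)
```

Both forms behave the same; the guarded one simply removes a level of indentation from the recovery path.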

r
end

def replace_invalid_byte(string)
Member

We can use String#scrub now.

Member Author

right.
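
For reference, a short illustration of String#scrub from the Ruby standard library (available since Ruby 2.1). The sample string and variable names are made up for this example.

```ruby
# A UTF-8 string containing one invalid byte (0xFF).
raw = "key=\xffvalue".dup.force_encoding("UTF-8")
was_valid = raw.valid_encoding?

# Without arguments, invalid bytes become U+FFFD (the replacement character).
default_scrubbed = raw.scrub

# With an argument, invalid bytes become the given replacement string.
custom_scrubbed = raw.scrub("?")
```

scrub returns a new string and leaves raw untouched, so a hand-written replace_invalid_byte helper is no longer needed for this purpose.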

raise
end
replaced_string = replace_invalid_byte(raw_value)
@parser.parse(replaced_string) do |t, values|
Member

We should use retry with the scrubbed string to deduplicate the code.
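
One way the retry idea could look, as a sketch only. strict_parse is a hypothetical stand-in for @parser.parse, and parse_with_retry is not the method that was merged; the point is that the parse call appears once and the rescue clause only swaps in the scrubbed string.

```ruby
# Hypothetical stand-in for a parser that rejects invalid UTF-8.
def strict_parse(text)
  raise ArgumentError, "invalid byte sequence in UTF-8" unless text.valid_encoding?
  yield text.split("=", 2)
end

def parse_with_retry(raw_value, replace_invalid_sequence: true)
  value = raw_value
  begin
    # The only parse call: used for the first attempt and for the retry.
    strict_parse(value) { |key, val| return { key => val } }
  rescue ArgumentError
    raise unless replace_invalid_sequence
    # If the string is already valid, scrubbing cannot help; avoid looping.
    raise if value.valid_encoding?
    value = value.scrub("?")
    retry
  end
end

record = parse_with_retry("k=\xffv")
```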

end
end

def test_nothing_raised
Member

ditto.

INVALID_MESSAGE = 'foo bar'
VALID_MESSAGE = 'col1=foo col2=bar'

def test_parser_error_warning
Member

There are no assertions about the warning.

@d = create_driver(CONFIG_DISABELED_SUPPRESS_PARSE_ERROR_LOG)
end

def test_raise_exception
Member

ditto.

end
end

def test_nothing_raised
Member

ditto.

</parse>
]

class EnabledSuppressParseErrorLogTest < self
Member

The migrated plugin code doesn't have the suppress_parse_error_log option.
This test code is outdated, right?

@tagomoris
Member

tagomoris commented Nov 7, 2016

The parser plugin helper can instantiate two or more parsers, so we can add a "secondary" parser for values that don't match the first parser. It would be nice to have as a replacement for fluent-plugin-multi-format-parser.

<parse>
  @type json
</parse>
<parse 1>
  @type regexp
  expression ...
</parse>
<parse 2>
  @type regexp
  expression ...
</parse>
# snip
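
The fallback behaviour proposed here can be sketched in plain Ruby. This is an illustration of the idea only, not the parser helper API: the parsers array, its lambdas, and parse_with_fallback are hypothetical, and the regular expression is a placeholder for the elided expression values above.

```ruby
require "json"

# Each entry is a hypothetical parser: returns a Hash on success, nil otherwise.
parsers = [
  lambda do |text|
    begin
      parsed = JSON.parse(text)
      parsed.is_a?(Hash) ? parsed : nil
    rescue JSON::ParserError
      nil
    end
  end,
  lambda do |text|
    # Placeholder pattern; the real expressions are not given in the thread.
    m = /\A(?<key>\w+)=(?<value>\w+)\z/.match(text)
    m && m.named_captures
  end
]

# Tries parsers in order and returns the first successful result.
def parse_with_fallback(parsers, text)
  parsers.each do |parser|
    record = parser.call(text)
    return record if record
  end
  nil
end

from_json = parse_with_fallback(parsers, '{"a":1}')
from_regexp = parse_with_fallback(parsers, 'k=v')
unmatched = parse_with_fallback(parsers, 'foo bar')
```

A nil result here corresponds to the case where no configured parser matches, which the filter would presumably still route to the error stream.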

@tagomoris tagomoris removed their assignment Nov 9, 2016
@repeatedly
Member Author

Applied reviews.

@repeatedly
Member Author

@tagomoris Secondary parsers are a good idea.
That's a TODO for after this PR is merged.

@repeatedly
Member Author

This patch is not the cause of test failure.

@repeatedly
Member Author

@tagomoris Please check again.

invalid_utf8 = "\xff".force_encoding('UTF-8')

d = create_driver(CONFIG_NOT_REPLACE)
d.run(shutdown: false) do
Member

Is there any reason to specify shutdown: false?

flexmock(d.instance.router).should_receive(:emit_error_event).
with(String, Integer, Hash, ArgumentError.new("data does not exist")).once
assert_nothing_raised {
d.run(shutdown: false) do
Member

ditto

@tagomoris
Member

I commented on just one point (twice). The rest looks good to me.

@repeatedly
Member Author

Applied all review comments. I will merge this PR once the tests pass.

@repeatedly repeatedly merged commit 7b0c40b into master Nov 10, 2016
@repeatedly repeatedly deleted the port-parser-filter branch November 10, 2016 18:31
3 participants