Port parser filter from fluent-plugin-parser. fix #1189 #1191

repeatedly · 2016-08-29T10:47:58Z

Changes from original code:

- Use v0.14 namespace
- Use v0.14 API (now working)
- Remove warnings such as unused variables

repeatedly · 2016-08-30T03:16:10Z

@tagomoris We can't set estimate_current_event via the parser helper API before configure. Do you know of a workaround for this?

repeatedly · 2016-08-31T05:19:05Z

The remaining task is replacing filter_stream with filter_with_time.

repeatedly · 2016-08-31T05:21:19Z

I didn't set estimate_current_event = false, but the tests passed. Hmm...

cosmo0920 · 2016-09-01T08:04:41Z

test/plugin/test_filter_parser.rb

+    Fluent::Test.setup
+    @tag = 'test'
+    @default_time = Time.parse('2010-05-04 03:02:01 UTC')
+    Timecop.freeze(@default_time)


Should we restore Time with Timecop.return in #teardown?

cosmo0920 · 2016-09-02T09:36:11Z

lib/fluent/plugin/filter_parser.rb

+      self
+    end
+
+    def filter_stream(tag, es)


This is just a comment. Feel free to ignore it.
I think that we can use #filter_with_time instead of #filter_stream here.
Should adapting to the filter pipeline be the next task?

tagomoris · 2016-09-13T04:23:31Z

lib/fluent/plugin/filter_parser.rb

+    config_param :inject_key_prefix, :string, default: nil
+    config_param :replace_invalid_sequence, :bool, default: false
+    config_param :hash_value_field, :string, default: nil
+    config_param :suppress_parse_error_log, :bool, default: false


There is no need for this option; use the error stream instead.

repeatedly · 2016-09-23

That seems good. I will rewrite it.

repeatedly · 2016-11-04T08:01:56Z

Rebased and applied the review comments.

repeatedly · 2016-11-04T11:39:47Z

@tagomoris Could you check this patch again?

tagomoris

Added review comments.

tagomoris · 2016-11-07T08:46:45Z

lib/fluent/plugin/filter_parser.rb

+    config_param :replace_invalid_sequence, :bool, default: false
+    config_param :hash_value_field, :string, default: nil
+
+    config_section :parse do


What's this blank config_section for?

tagomoris · 2016-11-07T08:54:02Z

lib/fluent/plugin/filter_parser.rb

+            r = handle_parsed(tag, record, t, values)
+            return t, r
+          else
+            router.emit_error_event(tag, time, record, Fluent::Plugin::Parser::ParserError.new("pattern not match with data '#{raw_value}'"))


Are there any cases where the parser returns no values without raising ParserError?

repeatedly · 2016-11-10

I'm not sure, but we can't say that no existing parser returns nil for time and values.
So keeping this behaviour is better for users.
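
A minimal sketch of the case under discussion, assuming a parser that yields nil for both time and values when the text does not match. `parse_key_value` and `parse_or_report` are hypothetical stand-ins, not the PR's code, and the `errors` array stands in for the error stream.

```ruby
# Hypothetical parser: yields (time, values) on a match and (nil, nil) otherwise,
# without raising any exception.
def parse_key_value(text)
  pairs = text.scan(/(\w+)=(\S+)/)
  if pairs.empty?
    yield nil, nil
  else
    yield Time.now.to_i, pairs.to_h
  end
end

# Caller keeps the nil check so that a silent mismatch is still reported.
def parse_or_report(text, errors)
  parse_key_value(text) do |time, values|
    if time && values
      return [time, values]
    else
      errors << "pattern not match with data '#{text}'"
      return nil
    end
  end
end

errors = []
parse_or_report('col1=foo col2=bar', errors) # returns [time, {"col1"=>"foo", "col2"=>"bar"}]
parse_or_report('foo bar', errors)           # returns nil and records one error
```

Without the else branch, a parser that signals a mismatch this way would drop the event with no trace.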

tagomoris · 2016-11-07T08:55:16Z

lib/fluent/plugin/filter_parser.rb

+        router.emit_error_event(tag, time, record, e)
+        return FAILED_RESULT
+      rescue ArgumentError => e
+        if @replace_invalid_sequence


I think it's better to write raise unless @replace_invalid_sequence to reduce nesting.

repeatedly · 2016-11-10

Good point. I will rewrite these lines.

tagomoris · 2016-11-07T08:57:36Z

lib/fluent/plugin/filter_parser.rb

+      r
+    end
+
+    def replace_invalid_byte(string)


We can use String#scrub now.
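
For illustration, a small sketch of what this swap could look like. The round-trip `encode` helper below is an assumption about what a hand-written routine of this kind typically does, not the code from this PR, and the `'?'` replacement character is a hypothetical choice.

```ruby
# Assumed shape of a hand-written routine: transcode through another
# encoding so that invalid bytes are replaced along the way.
def replace_invalid_byte_by_encode(string)
  string
    .encode(Encoding::UTF_16BE, Encoding::UTF_8,
            invalid: :replace, undef: :replace, replace: '?')
    .encode(Encoding::UTF_8)
end

# String#scrub (available since Ruby 2.1) does the same job in one call.
def replace_invalid_byte_by_scrub(string)
  string.scrub('?')
end

broken = "col1=fo\xffo"
broken.valid_encoding?                 # false
replace_invalid_byte_by_scrub(broken)  # "col1=fo?o"
```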

tagomoris · 2016-11-07T08:58:55Z

lib/fluent/plugin/filter_parser.rb

+            raise
+          end
+          replaced_string = replace_invalid_byte(raw_value)
+          @parser.parse(replaced_string) do |t, values|


We should use retry with the scrubbed string to deduplicate the code.
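
A rough sketch of the retry idea, combined with the guard clause suggested earlier in this review. `parse_line` is a hypothetical stand-in for the real parser, and the `scrubbed` flag is an assumption added here so that the retry can only happen once.

```ruby
# Hypothetical stand-in for a parser: raises ArgumentError on invalid bytes,
# mimicking what regexp-based parsing does with a broken string.
def parse_line(line)
  raise ArgumentError, 'invalid byte sequence' unless line.valid_encoding?
  line.scan(/(\w+)=(\S+)/).to_h
end

def parse_with_retry(raw_value, replace_invalid_sequence:)
  scrubbed = false
  begin
    parse_line(raw_value)            # single call site for both attempts
  rescue ArgumentError
    raise unless replace_invalid_sequence
    raise if scrubbed                # already retried once; give up
    raw_value = raw_value.scrub('?')
    scrubbed = true
    retry                            # re-runs the begin block with the scrubbed string
  end
end

parse_with_retry("col1=fo\xffo col2=bar", replace_invalid_sequence: true)
# returns {"col1"=>"fo?o", "col2"=>"bar"}
```

Because `retry` jumps back to the start of the `begin` block, the parse call and its result handling are written only once.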

tagomoris · 2016-11-07T09:23:21Z

test/plugin/test_filter_parser.rb

+      end
+    end
+
+    def test_nothing_raised


tagomoris · 2016-11-07T09:24:00Z

test/plugin/test_filter_parser.rb

+  INVALID_MESSAGE = 'foo bar'
+  VALID_MESSAGE   = 'col1=foo col2=bar'
+
+  def test_parser_error_warning


There are no assertions about the warning.

tagomoris · 2016-11-07T09:24:18Z

test/plugin/test_filter_parser.rb

+      @d = create_driver(CONFIG_DISABELED_SUPPRESS_PARSE_ERROR_LOG)
+    end
+
+    def test_raise_exception


tagomoris · 2016-11-07T09:24:24Z

test/plugin/test_filter_parser.rb

+      end
+    end
+
+    def test_nothing_raised


tagomoris · 2016-11-07T09:26:42Z

test/plugin/test_filter_parser.rb

+    </parse>
+  ]
+
+  class EnabledSuppressParseErrorLogTest < self


The migrated plugin code doesn't have the suppress_parse_error_log option.
This test code is outdated, right?

tagomoris · 2016-11-07T09:29:12Z

The parser plugin helper can instantiate two or more parsers, so we can add "secondary" parsers for values that don't match the first parser. It would be a nice-to-have that could replace fluent-plugin-multi-format-parser.

<parse>
  @type json
</parse>
<parse 1>
  @type regexp
  expression ...
</parse>
<parse 2>
  @type regexp
  expression ...
</parse>
# snip
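
The fallback behaviour this configuration hints at could be sketched in plain Ruby as follows. This is not the parser helper's API; `json_parser`, `regexp_parser`, and `parse_with_fallback` are hypothetical names used only to show the "try each parser in order" idea.

```ruby
require 'json'

# Each parser is a lambda that returns a Hash on success, or nil when
# the text does not match.
def json_parser
  lambda do |text|
    begin
      parsed = JSON.parse(text)
      parsed.is_a?(Hash) ? parsed : nil
    rescue JSON::ParserError
      nil
    end
  end
end

def regexp_parser(pattern)
  lambda do |text|
    match = pattern.match(text)
    match && match.named_captures
  end
end

# Try the primary parser first, then each secondary parser in order.
def parse_with_fallback(text, parsers)
  parsers.each do |parser|
    record = parser.call(text)
    return record if record
  end
  nil
end

parsers = [json_parser, regexp_parser(/(?<key>\w+)=(?<value>\w+)/)]
parse_with_fallback('{"a":1}', parsers)   # returns {"a"=>1}
parse_with_fallback('col1=foo', parsers)  # returns {"key"=>"col1", "value"=>"foo"}
```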

repeatedly · 2016-11-10T01:03:36Z

Applied reviews.

repeatedly · 2016-11-10T01:05:05Z

@tagomoris Secondary parsers are a good idea.
It's a TODO for after this PR is merged.

repeatedly · 2016-11-10T01:14:36Z

This patch is not the cause of test failure.

repeatedly · 2016-11-10T09:27:16Z

@tagomoris Please check again.

tagomoris · 2016-11-10T09:54:02Z

test/plugin/test_filter_parser.rb

+    invalid_utf8 = "\xff".force_encoding('UTF-8')
+
+    d = create_driver(CONFIG_NOT_REPLACE)
+    d.run(shutdown: false) do


Is there any reason to specify shutdown: false?

tagomoris · 2016-11-10T09:54:56Z

test/plugin/test_filter_parser.rb

+    flexmock(d.instance.router).should_receive(:emit_error_event).
+      with(String, Integer, Hash, ArgumentError.new("data does not exist")).once
+    assert_nothing_raised {
+      d.run(shutdown: false) do


tagomoris · 2016-11-10T09:57:00Z

I commented on just one point (twice). Everything else looks good to me.

repeatedly · 2016-11-10T17:22:15Z

Applied all review comments. I will merge this PR once the tests pass.

repeatedly force-pushed the port-parser-filter branch from 8824827 to ebd5cf5 August 29, 2016 10:49

cosmo0920 reviewed Sep 1, 2016

tagomoris added the v0.14 label Sep 1, 2016

cosmo0920 reviewed Sep 2, 2016

repeatedly mentioned this pull request Sep 5, 2016

Port parser filter to v0.12 #1203

Merged

tagomoris reviewed Sep 13, 2016

okkez mentioned this pull request Oct 13, 2016

Need some examples with @type forward fluent/fluent-plugin-grok-parser#21

Closed

okkez mentioned this pull request Oct 28, 2016

Any example on how to use the grok parser in a filter? fluent/fluent-plugin-grok-parser#23

Closed

repeatedly added 9 commits November 4, 2016 11:26

Port parser filter from fluent-plugin-parser. fix #1189

96664ac

Apply v0.14 style. Use plugin helpers.

0f14fa6

Add missing parser parameters to compat_parameters

a11d4fb

Call Timecop.return in teardown to restore Time

a0c8f7e

Use filter_with_time API

b166787

The default filter_stream implementation catches exceptions and routes them to the error stream, so the warning tests were changed to check the logs instead.
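
The behaviour described here can be pictured with a simplified stand-in. `FilterSketch` is hypothetical and is not Fluentd's actual implementation; the `error_events` array stands in for the error stream, and events are plain `[time, record]` pairs.

```ruby
# Hypothetical base class: the stream loop rescues per-event failures.
class FilterSketch
  attr_reader :error_events

  def initialize
    @error_events = [] # stands in for the error stream
  end

  # Subclasses override this per-event hook and return [time, record].
  def filter_with_time(tag, time, record)
    raise NotImplementedError
  end

  # A failing event is routed to the error list; the rest are still processed.
  def filter_stream(tag, events)
    events.each_with_object([]) do |(time, record), passed|
      begin
        result = filter_with_time(tag, time, record)
        passed << result if result
      rescue StandardError => e
        @error_events << [tag, time, record, e]
      end
    end
  end
end

# Example filter: raises KeyError when the 'message' key is missing.
class MessageUpcaseSketch < FilterSketch
  def filter_with_time(tag, time, record)
    [time, record.merge('message' => record.fetch('message').upcase)]
  end
end

filter = MessageUpcaseSketch.new
filter.filter_stream('test', [[1, { 'message' => 'ok' }], [2, { 'other' => 'x' }]])
# returns [[1, {"message"=>"OK"}]]; the second event lands in filter.error_events
```

With this shape, a test can no longer expect the exception to propagate out of the filter, which is why such tests move to inspecting logs or the error stream.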

Remove duplicated compat parameters

9359eea

Use error stream instead of warning logs

8e8b92d

Replace time_parse with reserve_time

c993be5

The drawback is that the parser tries to parse the time field even if users don't need the parsed time. If this is critical, we should revive the time_parse option.

Update test to use new configuration format

5ffcb87

Several tests are run with both the new and compat configurations to check backward compatibility.

repeatedly force-pushed the port-parser-filter branch from 030dfd8 to 5ffcb87 November 4, 2016 08:01

repeatedly assigned tagomoris Nov 4, 2016

tagomoris requested changes Nov 7, 2016

tagomoris assigned repeatedly Nov 9, 2016

tagomoris removed their assignment Nov 9, 2016

repeatedly added 8 commits November 10, 2016 09:06

Use String#scrub instead of own routine

2c82887

Remove useless config_section definition

4aef481

Improve invalid string handling to reduce duplicated code

33cac26

Recent update removes ValuesParser from v0.14 parsers

d08a34d

Use default_tag for driver run

5cb93a4

Remove deleted paramters tests

cb3dee6

Rename suppress warning test to unmached pattern

163dc5c

Improve tests

71d700c

- Use proper assertion
- Add data to EventTime and Integer time tests
- Extract some assertions for better test cases

Forget to update test to use event_time instead of to_i

82953c5

tagomoris reviewed Nov 10, 2016

tagomoris approved these changes Nov 10, 2016

Remove useless shutdown option from run

f64bb77

repeatedly merged commit 7b0c40b into master Nov 10, 2016

repeatedly deleted the port-parser-filter branch November 10, 2016 18:31
