Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement record path filter. #386

Merged
merged 1 commit into from
Sep 1, 2021
Merged

Conversation

blackwinter
Copy link
Member

@blackwinter blackwinter commented Aug 30, 2021

Split event stream into records based on entity path.

Related to #382 and org.metafacture.xml.XmlElementSplitter.

Resolves #385.

@blackwinter blackwinter requested a review from fsteeg August 30, 2021 15:41
@blackwinter blackwinter linked an issue Aug 30, 2021 that may be closed by this pull request
@blackwinter blackwinter marked this pull request as ready for review August 30, 2021 15:47
@blackwinter
Copy link
Member Author

Proof of concept; more test cases to follow.

@blackwinter
Copy link
Member Author

Open questions:

  • Should it be allowed to filter for literals as well? (Emitting the literal directly)
  • What should happen with records that don't match the filter condition? Should the record start/end events still be emitted?

@blackwinter
Copy link
Member Author

blackwinter commented Aug 30, 2021

It would be nice if we could extract the TestHelpers into a common package (e.g. metafacture-framework) so we can reuse it here as well as in org.metafacture.metamorph. The "problem" is the Mockito dependency. Any ideas?

@fsteeg
Copy link
Member

fsteeg commented Aug 31, 2021

Proof of concept; more test cases to follow.

Great, looks good, many thanks!

Should it be allowed to filter for literals as well? (Emitting the literal directly)

Not sure what you mean, something like literal-to-object / LiteralToObject (emits literal values as objects)?

What should happen with records that don't match the filter condition? Should the record start/end events still be emitted?

Hm, they would always be empty right? I think the records should be skipped entirely. I think it makes sense: if we say e.g. 'records are in the data field', and there is no data field, we should not get any records.

It would be nice if we could extract the TestHelpers into a common package (e.g. metafacture-framework)

Sounds good, in metafacture-framework/src/test/java/org/metafacture/framework/helpers?

The "problem" is the Mockito dependency. Any ideas?

I don't understand what you mean with the problem... metafacture-framework already depends on Mockito if that's what you mean.

@blackwinter
Copy link
Member Author

Thanks for having a look at this.

Not sure what you mean, something like literal-to-object / LiteralToObject (emits literal values as objects)?

No, not really. I meant, what should happen here?

    @Test
    public void shouldProcessLiteralPath() {
        assertFilter(
                i -> {
                    i.setPath("lit");

                    i.startRecord("1");
                    i.literal("lit", "record 1");
                    i.endRecord();
                    i.startRecord("2");
                    i.literal("lit", "record 2");
                    i.endRecord();
                },
                o -> {
                    o.get().startRecord("1");
                    o.get().literal("lit", "record 1"); // ?
                    o.get().endRecord();
                    o.get().startRecord("2");
                    o.get().literal("lit", "record 2"); // ?
                    o.get().endRecord();
                }
        );
    }

This would behave differently, because with an entity filter path, the entity element itself would not be emitted.

Hm, they would always be empty right? I think the records should be skipped entirely. I think it makes sense: if we say e.g. 'records are in the data field', and there is no data field, we should not get any records.

OK, I'll adapt it accordingly.

Sounds good, in metafacture-framework/src/test/java/org/metafacture/framework/helpers?

Well, not exactly. I thought it would have to go in main instead of test. Would it be OK for you to provide a helper class to other modules from the test directory? Note that metafacture-framework is an api dependency of metafacture-mangling. So that would have to change, too.

I don't understand what you mean with the problem... metafacture-framework already depends on Mockito if that's what you mean.

Yes, it's currently a testImplementation dependency. I assumed it would have to be changed to implementation.

@blackwinter blackwinter changed the title Implement record path filter. (#385) Implement record path filter. Aug 31, 2021
Split event stream into records based on entity path.

Related to #382 and `org.metafacture.xml.XmlElementSplitter`.

Resolves #385.
@blackwinter blackwinter force-pushed the 385-filter-records-by-path branch from dca31a1 to ca9a107 Compare August 31, 2021 17:24
@blackwinter
Copy link
Member Author

Updated implementation:

  • Emitting record per matching entity; skipping empty records.
  • Fixed emitting multiple entities.
  • Added setEntitySeparator.
  • Added setRecordIdFormat.

Should be functionally complete now.

Copy link
Member

@fsteeg fsteeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, looks good, thanks.

Skipping if the path matches only a literal seems good to me. And we can extract the test helper class in the future, if we have a good solution for that. So I think we can merge, but leaving that to you (as we usually do in our internal team process).

@blackwinter blackwinter merged commit 696b351 into master Sep 1, 2021
@blackwinter blackwinter deleted the 385-filter-records-by-path branch September 1, 2021 11:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Split up event stream into records
2 participants