-
Notifications
You must be signed in to change notification settings - Fork 3k
Core: Add metrics reporter for serializable table #7144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core: Add metrics reporter for serializable table #7144
Conversation
882b279 to
6742851
Compare
|
PTAL @nastra |
| } | ||
|
|
||
| @Override | ||
| public MetricsReporter metricsReporter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a test that makes sure the metrics reporter is preserved with the right properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL
6742851 to
a08be15
Compare
| Assert.assertEquals("History must match", expected.history(), actual.history()); | ||
| } | ||
|
|
||
| public static void assertSerializedMetricsReporter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to add this here, this can be just part of the test method itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| Assert.assertEquals( | ||
| "metrics reporter from serializableTable should equals the one from origin table", | ||
| expected, | ||
| actual); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than testing actual equality on the objects, it should be enough to make sure that the properties are the same, hence the TestMetricsReporter wouldn't need to implement hasCode/equals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| import org.junit.After; | ||
| import org.junit.Rule; | ||
| import org.junit.Test; | ||
| import org.junit.rules.TemporaryFolder; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please take a look at https://iceberg.apache.org/contribute/#testing. Basically new test classes should be JUnit5 rather than JUnit4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL
| throw new CommitFailedException("Injected failure"); | ||
| } | ||
| Integer version = VERSIONS.get(tableName); | ||
| String metadataLocation = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are these changes necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the method lazyTable in SerializableTable will load metadata file, while the TestTable has not write metadata file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think those changes are not necessary. The test passed fine for me without having those changes
| public static void assertSerializedMetricsReporter( | ||
| MetricsReporter expected, MetricsReporter actual) { | ||
| Assert.assertNotNull("metrics reporter from serializableTable should not be null", actual); | ||
| Assert.assertTrue( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be simplified to Assertions.assertThat(actual).isNotNull().isInstanceOf(...) and for the properties: Assertions.assertThat(actual.properties).isEqualTo(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks
a08be15 to
69356e7
Compare
| public void testSerializableTableWithMetricsReporter() | ||
| throws IOException, ClassNotFoundException { | ||
| Map<String, String> properties = Maps.newHashMap(); | ||
| properties.put( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can be simplified to Map<String, String> properties = ImmutableMap.of(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
|
||
| private static final SortOrder SORT_ORDER = SortOrder.builderFor(SCHEMA).asc("id").build(); | ||
|
|
||
| @TempDir public File temp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this doesn't have to be public to work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| */ | ||
| default MetricsReporter metricsReporter() { | ||
| throw new UnsupportedOperationException( | ||
| "metricsReporter is not supported by " + getClass().getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "metricsReporter is not supported by " + getClass().getName()); | |
| "Accessing metrics reporter is not supported"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| } | ||
| } | ||
|
|
||
| public static class TestMetricsReporter implements MetricsReporter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also move this to the respective test class? I don't think it's necessary to put this into TestHelpers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| properties.put( | ||
| CatalogProperties.METRICS_REPORTER_IMPL, TestHelpers.TestMetricsReporter.class.getName()); | ||
| MetricsReporter reporter = CatalogUtil.loadMetricsReporter(properties); | ||
| Table table = TestTables.create(temp, "tbl_A", SCHEMA, SPEC, SORT_ORDER, 2, reporter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see where we're testing that the reporter itself has the correct properties.
Shouldn't this do
Map<String, String> reporterProperties = ImmutableMap.of("a", "1", "b", "2");
reporter.initialize(reporterProperties);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reporter.initialize(reporterProperties) is called in the method CatalogUtil.loadMetricsReporter(properties)
| if (lazyMetricsReporter == null) { | ||
| synchronized (this) { | ||
| if (lazyMetricsReporter == null) { | ||
| lazyMetricsReporter = CatalogUtil.loadMetricsReporter(this.metricsReporterProperties); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metrics reporter is configured by catalog properties, not table properties. I add the interface method MetricsReporter.properties() to return the properties which used to load and initialize this MetricsReporter, therefor the SerializableTable can use this properties to load and initialize MetricsReporter agagin
| @Test | ||
| public void testSerializableTableWithMetricsReporter() | ||
| throws IOException, ClassNotFoundException { | ||
| Map<String, String> properties = Maps.newHashMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those are the properties of the table, so you'd have to pass them to the table. The metrics reporter then has separate properties
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metrics reporter is configured by catalog properties, not table properties. Here this properties is taken as properties of catalog
69356e7 to
9838834
Compare
| if (lazyMetricsReporter == null) { | ||
| synchronized (this) { | ||
| if (lazyMetricsReporter == null) { | ||
| lazyMetricsReporter = CatalogUtil.loadMetricsReporter(this.metricsReporterProperties); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metrics reporter is configured by catalog properties, not table properties. I add the interface method MetricsReporter.properties() to return the properties which used to load and initialize this MetricsReporter, therefor the SerializableTable can use this properties to load and initialize MetricsReporter agagin
|
PTAL @nastra |
nastra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did take another look and I don't think we need the changes in TestTables, because the tests will pass without those.
9838834 to
2ac204b
Compare
you're right, it not necessary. I used to compare the metadata between origin table and serializable table, then it will be needed. I have removed the changes, PTAL @nastra |
nastra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @wangyinsheng.
@aokolnychyi could you also take a look please?
|
PTAL @aokolnychyi |
1 similar comment
|
PTAL @aokolnychyi |
|
could you please take a look this pr, it has last for weeks @aokolnychyi |
|
@nastra Could you please approval the CI ? And is there someone else can review this pr? |
|
@aokolnychyi @danielcweeks could you take a look please ? thanks |
|
@wangyinsheng to fix CI failures I think you need to add the below code to and then also this to |
08afaff to
4ea3cc0
Compare
4ea3cc0 to
f831280
Compare
| this.metricsReporterProperties = | ||
| table instanceof BaseTable | ||
| ? SerializableMap.copyOf(table.metricsReporter().properties()) | ||
| : null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should rather do #7144 (comment) than setting this to null here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed the comment when I try to fix the CI failures, I wll change it later.
|
@wangyinsheng I talked to @rdblue yesterday and we concluded that we don't want to carry over the catalog properties just for the metrics reporter. The better approach would be to make |
|
@nastra I am confused why should't carry over the catalog properties for metrics reporter? We already support to config metrics report through catalog properites. And metrics reporter may contains non-serializable object, |
|
please take a look @nastra @rdblue @aokolnychyi @danielcweeks |
|
please take a look @nastra @rdblue @aokolnychyi @danielcweeks |
|
@wangyinsheng as mentioned previously, we still would like to make the |
|
@nastra, where are we with the fix for serialization here? |
@rdblue I haven't heard back from @wangyinsheng after my comment above , so I've opened #7370 a while ago for initial experimentation and then updated it to fix the serialization issue. |
@nastra @rdblue Sorry,I am working on other things for a while. But I am still keep my opinion, I think the |
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
|
This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time. |
The metrics reporter will be lost when a table be sent to other nodes in a cluster through serializable table, This mr rebuild the metrics reporter when the table is deserialized