-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Kernel][Metrics][PR#3] Metrics report JSON serializer and LoggingMetricsReporter for the default engine #3904
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
730c8d6
Metrics serializer + LoggingMetricsReporter
allisonport-db f48e0b5
Respond to comments
allisonport-db 1ca4059
Update for new changes in base PR
allisonport-db c424af8
respond to comment
allisonport-db fcc0379
Update for merge
allisonport-db File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
54 changes: 54 additions & 0 deletions
54
...l/kernel-api/src/main/java/io/delta/kernel/internal/metrics/MetricsReportSerializers.java
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
/* | ||
* Copyright (2024) The Delta Lake Project Authors. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package io.delta.kernel.internal.metrics; | ||
|
||
import com.fasterxml.jackson.core.JsonProcessingException; | ||
import com.fasterxml.jackson.databind.ObjectMapper; | ||
import com.fasterxml.jackson.databind.module.SimpleModule; | ||
import com.fasterxml.jackson.databind.ser.std.ToStringSerializer; | ||
import com.fasterxml.jackson.datatype.jdk8.Jdk8Module; | ||
import io.delta.kernel.metrics.MetricsReport; | ||
import io.delta.kernel.metrics.SnapshotReport; | ||
|
||
/** Defines JSON serializers for {@link MetricsReport} types */ | ||
public final class MetricsReportSerializers { | ||
|
||
///////////////// | ||
// Public APIs // | ||
///////////////// | ||
|
||
/** | ||
* Serializes a {@link SnapshotReport} to a JSON string | ||
* | ||
* @throws JsonProcessingException | ||
*/ | ||
public static String serializeSnapshotReport(SnapshotReport snapshotReport) | ||
throws JsonProcessingException { | ||
return OBJECT_MAPPER.writeValueAsString(snapshotReport); | ||
} | ||
|
||
///////////////////////////////// | ||
// Private fields and methods // | ||
//////////////////////////////// | ||
|
||
private static final ObjectMapper OBJECT_MAPPER = | ||
new ObjectMapper() | ||
.registerModule(new Jdk8Module()) // To support Optional | ||
.registerModule( // Serialize Exception using toString() | ||
new SimpleModule().addSerializer(Exception.class, new ToStringSerializer())); | ||
|
||
private MetricsReportSerializers() {} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
97 changes: 97 additions & 0 deletions
97
...el-api/src/test/scala/io/delta/kernel/internal/metrics/MetricsReportSerializerSuite.scala
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
/* | ||
* Copyright (2024) The Delta Lake Project Authors. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package io.delta.kernel.internal.metrics | ||
|
||
import java.util.Optional | ||
|
||
import io.delta.kernel.metrics.SnapshotReport | ||
import org.scalatest.funsuite.AnyFunSuite | ||
|
||
class MetricsReportSerializerSuite extends AnyFunSuite { | ||
|
||
private def optionToString[T](option: Optional[T]): String = { | ||
if (option.isPresent) { | ||
if (option.get().isInstanceOf[String]) { | ||
s""""${option.get()}"""" // For string objects wrap with quotes | ||
} else { | ||
option.get().toString | ||
} | ||
} else { | ||
"null" | ||
} | ||
} | ||
|
||
private def testSnapshotReport(snapshotReport: SnapshotReport): Unit = { | ||
val timestampToVersionResolutionDuration = optionToString( | ||
snapshotReport.getSnapshotMetrics().getTimestampToVersionResolutionDurationNs()) | ||
val loadProtocolAndMetadataDuration = | ||
snapshotReport.getSnapshotMetrics().getLoadInitialDeltaActionsDurationNs() | ||
val exception: Optional[String] = snapshotReport.getException().map(_.toString) | ||
val expectedJson = | ||
s""" | ||
|{"tablePath":"${snapshotReport.getTablePath()}", | ||
|"operationType":"Snapshot", | ||
|"reportUUID":"${snapshotReport.getReportUUID()}", | ||
|"exception":${optionToString(exception)}, | ||
|"version":${optionToString(snapshotReport.getVersion())}, | ||
|"providedTimestamp":${optionToString(snapshotReport.getProvidedTimestamp())}, | ||
|"snapshotMetrics":{ | ||
|"timestampToVersionResolutionDurationNs":${timestampToVersionResolutionDuration}, | ||
|"loadInitialDeltaActionsDurationNs":${loadProtocolAndMetadataDuration} | ||
|} | ||
|} | ||
|""".stripMargin.replaceAll("\n", "") | ||
assert(expectedJson == MetricsReportSerializers.serializeSnapshotReport(snapshotReport)) | ||
} | ||
|
||
test("SnapshotReport serializer") { | ||
val snapshotContext1 = SnapshotQueryContext.forTimestampSnapshot("/table/path", 0) | ||
snapshotContext1.getSnapshotMetrics.timestampToVersionResolutionTimer.record(10) | ||
snapshotContext1.getSnapshotMetrics.loadInitialDeltaActionsTimer.record(1000) | ||
snapshotContext1.setVersion(1) | ||
val exception = new RuntimeException("something something failed") | ||
|
||
val snapshotReport1 = SnapshotReportImpl.forError( | ||
snapshotContext1, | ||
exception | ||
) | ||
|
||
// Manually check expected JSON | ||
val expectedJson = | ||
s""" | ||
|{"tablePath":"/table/path", | ||
|"operationType":"Snapshot", | ||
|"reportUUID":"${snapshotReport1.getReportUUID()}", | ||
|"exception":"$exception", | ||
|"version":1, | ||
|"providedTimestamp":0, | ||
|"snapshotMetrics":{ | ||
|"timestampToVersionResolutionDurationNs":10, | ||
|"loadInitialDeltaActionsDurationNs":1000 | ||
|} | ||
|} | ||
|""".stripMargin.replaceAll("\n", "") | ||
assert(expectedJson == MetricsReportSerializers.serializeSnapshotReport(snapshotReport1)) | ||
|
||
// Check with test function | ||
testSnapshotReport(snapshotReport1) | ||
|
||
// Empty options for all possible fields (version, providedTimestamp and exception) | ||
val snapshotContext2 = SnapshotQueryContext.forLatestSnapshot("/table/path") | ||
val snapshotReport2 = SnapshotReportImpl.forSuccess(snapshotContext2) | ||
testSnapshotReport(snapshotReport2) | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
51 changes: 51 additions & 0 deletions
51
...kernel-defaults/src/main/java/io/delta/kernel/defaults/engine/LoggingMetricsReporter.java
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
/* | ||
* Copyright (2024) The Delta Lake Project Authors. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package io.delta.kernel.defaults.engine; | ||
|
||
import com.fasterxml.jackson.core.JsonProcessingException; | ||
import io.delta.kernel.engine.MetricsReporter; | ||
import io.delta.kernel.internal.metrics.MetricsReportSerializers; | ||
import io.delta.kernel.metrics.MetricsReport; | ||
import io.delta.kernel.metrics.SnapshotReport; | ||
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
/** | ||
* An implementation of {@link MetricsReporter} that logs the reports (as JSON) to Log4J at the info | ||
* level. | ||
*/ | ||
public class LoggingMetricsReporter implements MetricsReporter { | ||
|
||
private static final Logger logger = LoggerFactory.getLogger(LoggingMetricsReporter.class); | ||
|
||
@Override | ||
public void report(MetricsReport report) { | ||
try { | ||
if (report instanceof SnapshotReport) { | ||
logger.info( | ||
"SnapshotReport = {}", | ||
MetricsReportSerializers.serializeSnapshotReport((SnapshotReport) report)); | ||
} else { | ||
logger.info( | ||
"{} = [{} does not support serializing this type of MetricReport]", | ||
report.getClass(), | ||
this.getClass()); | ||
} | ||
} catch (JsonProcessingException e) { | ||
logger.info("Encountered exception while serializing report {}: {}", report, e); | ||
} | ||
} | ||
} |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we would have to do this for every type of report, right?
Any way to avoid that? (Since they would all just call
OBJECT_MAPPER.writeValueAsString(inputVariable)
Could we just take in a DeltaOperationReport?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I wasn't sure about this because I also think there's a use-case where you are only serializing a specific report type. maybe we can have both
serializeSnapshotReport
andserializeDeltaOperationReport
But then I need to make
DeltaOperationReport
serializable (which currently I did not). What do you think?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every report we create will implement DeltaOperationReport, right?
And every report we create we will by default log in with our default engine, right?
Then it seems fair that DeltaOperationReport be serializable 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I have some thoughts that I haven't fully fleshed out.
Every operation-type report we create will implement DeltaOperationReport, possibly not every report. But yes we probably plan to log all report types in our default engine (I can't think of a reason not to currently). But that would argue that we should make
MetricsReport
serializable....If we add a new report type i.e.
XXReport extends DeltaOperationReport
but don't make it serializable, it will still be serialized but will be missing additional information in XXReport.Do we ever expect an report that only extends DeltaOperationReport? Maybe we just want to report that an operation occurred or success vs failure?
Need to verify that if both DeltaOperationReport and SnapshotReport are serializable jackson will use the lowest ancestor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this is something we can flesh out later? These are all internal APIs and for now, I think, making sure the main report types are serializable is enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup! SGTM!