-
Notifications
You must be signed in to change notification settings - Fork 3k
Core: Extract REST metrics reporter into its own class #7339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/rest/RESTMetricsReporter.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/rest/RESTMetricsReporter.java
Outdated
Show resolved
Hide resolved
| } | ||
| } | ||
|
|
||
| private TableIdentifier identifierWithoutCatalog(String tableNameWithCatalog) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The table name in reports always carries the catalog name as the first part. I might have missed it, but I haven't found anything in the codebase that would give us the TableIdentifier from the fully-qualified table name (that also carries the catalog's name). Should this functionality maybe live somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like a bad idea to parse this. Can we add the identifier to the ScanReport instead? Then we could use the identifier that was loaded. Maybe we should add it to BaseTable as well, like the Python implementation does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree here. It would be great if the actual table would carry a TableIdentifier, then we could just use that. I'll open a separate issue to improve that.
Can we add the identifier to the ScanReport instead?
- for a ScanReport we take the table name from the table directly:
- for a CommitReport we take it indirectly via CreateSnapshotEvent
So I think the first step would be to add TableIdentifier to Table and then use that TableIdentifier for ScanReport/CommitReport.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #7419
61558e2 to
35aa55a
Compare
57284c5 to
75ecf6f
Compare
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
| public class RESTMetricsReporter implements MetricsReporter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
making RESTMetricsReporter fully Serializable will be handled in a separate PR as it currently also requires #7144
75ecf6f to
c48fd88
Compare
8cae49f to
335206a
Compare
335206a to
72148d7
Compare
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
| class RESTMetricsReporter implements MetricsReporter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Serializable? Or is this just to produce a separate one as a minimal refactor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/rest/RESTMetricsReporter.java
Outdated
Show resolved
Hide resolved
| // different method calls | ||
| if (!"v1/oauth/tokens".equals(path)) { | ||
| if ("v1/config".equals(path)) { | ||
| if ("v1/config".equals(path) || path.endsWith("/metrics")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is because now we only instantiate a single metrics reporter and use the headers from the catalog's AuthSession.
In case that is a concern, we could instantiate a new reporter wherever we currently pass in the reporter and then use the headers from session::headers.
In fact that approach might be better as we wouldn't need to derive the TableIdentifier from the metrics report.
12b5f79 to
a29d3b2
Compare
a29d3b2 to
e047145
Compare
| } | ||
| } catch (Exception e) { | ||
| LOG.warn("Failed to report metrics to REST endpoint for table {}", tableIdentifier, e); | ||
| private MetricsReporter metricsReporter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than wrapping the combined reporter in a lambda for the purpose of reporting, we just return the combined reporter here.
depends on #7337