Adds some additional timings. #110
Conversation
(Force-pushed from ad17aff to e431af9.)
this.filterBuilder = bardConfigResources.getFilterBuilder();

MetricDictionary metricDictionary = bardConfigResources
        .getMetricDictionary()
        .getScope(Collections.singletonList(tableName));
This change was necessary to make `metricDictionary` effectively final for use in lambdas (also, the previous approach was confusing).
How so? Couldn't you just mark the existing local variable `final` and make the compiler happy?
Because we were originally assigning the `metricDictionary` variable at the very beginning of the method, but didn't do anything with it until we reassigned it to the above value just before actually using it. Since we were reassigning it, it was no longer effectively final, so I just got rid of the unused initialization.
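For readers less familiar with the rule in play here, a minimal standalone sketch (generic types, not Fili's classes) of why reassignment breaks lambda capture:

```java
import java.util.List;

public class EffectivelyFinalSketch {
    public static void main(String[] args) {
        List<String> metrics = List.of("pageViews", "clicks");

        // A local variable that is reassigned is no longer "effectively final",
        // so the lambda below would not compile if these two lines were used:
        // String prefix = "unused";   // initial value never used
        // prefix = "scoped_";         // reassignment removes effective finality
        // metrics.forEach(m -> System.out.println(prefix + m));  // compile error

        // Assigning exactly once keeps the variable effectively final and capturable:
        String prefix = "scoped_";
        metrics.forEach(m -> System.out.println(prefix + m));
    }
}
```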
Ahh, I see, this is just a straight refactor, no actual changes. Got it.
Note that I tried to add timings just to the steps of building the `DataApiRequest`.
lock.readLock().lock();
RequestLog.stopTiming("LuceneReadLock");
We shouldn't time this. A read-lock should be fast enough that taking the timings will actually be slower than the lock itself. If we really want to get a sense of how much time this takes, we should run a profiler on this code under an actual workload.
lock.readLock().lock();
RequestLog.stopTiming("QueryingLuceneForPage" + currentPage + "ReadLock");
Don't time read-locks
@@ -587,6 +591,7 @@ private Query getFilterQuery(Set<ApiFilter> filters, int perPage) {
    throw new RuntimeException(e);
Can't comment in the right place, but there are effectively two steps in this search operation:
- Get the Lucene results for the right page
- Hydrate the dimension rows

I would wrap a timer around each of those high-level steps.
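A rough sketch of the timer placement being suggested; the helper method names here are illustrative, not the actual `LuceneSearchProvider` code:

```java
// Step 1: query Lucene for the requested page of results.
RequestLog.startTiming("QueryingLucene");
SearchResults hits;  // hypothetical result type
try {
    hits = queryLuceneForPage(filterQuery, currentPage, perPage);  // hypothetical helper
} finally {
    RequestLog.stopTiming("QueryingLucene");
}

// Step 2: hydrate the dimension rows for those results.
RequestLog.startTiming("HydratingDimensionRows");
try {
    return hydrateDimensionRows(hits);  // hypothetical helper
} finally {
    RequestLog.stopTiming("HydratingDimensionRows");
}
```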
😆 I changed where I stopped the timer for querying Lucene to capture just the code that queries Lucene, but forgot to add a timer for hydrating the rows.
@@ -42,7 +43,10 @@ public Filter buildFilters(Map<Dimension, Set<ApiFilter>> filterMap) throws Dime

List<Filter> dimensionFilters = new ArrayList<>(filterMap.size());
for (Map.Entry<Dimension, Set<ApiFilter>> entry : filterMap.entrySet()) {
    String filterTimerName = "Building" + entry.getKey().getApiName() + "Filters";
    RequestLog.startTiming(filterTimerName);
This is going to make a ton of timers and massively fill the request log. Don't do this per-element; do it at the outer layer.
If you really want per-element timings, use the raw Metrics timings framework (i.e. not the RequestLog), or write down a LogInfo object that includes the count of how many filters are being built.
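For reference, per-element timing with the raw Dropwizard Metrics framework (rather than the RequestLog) looks roughly like this; the registry wiring and timer name are assumptions, not Fili code:

```java
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

public class FilterTimingSketch {
    private static final MetricRegistry REGISTRY = new MetricRegistry();  // assumed shared registry

    public void buildOneDimensionFilter() {
        Timer timer = REGISTRY.timer("druidFilterBuilder.buildFilters");
        Timer.Context context = timer.time();
        try {
            // build the filter for one dimension here
        } finally {
            context.stop();  // records elapsed time in a Metrics histogram, not the request log
        }
    }
}
```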
FYI, we already have a `LogInfo` object that includes a count of how many filters are being built (the `Filter` LogBlock).
Note that the `Filter` log block counts the number of API filters, not the Druid query filters.

> Inner log block recording information related to the API filters of a query.

~ Class-level Javadoc from the `Filter` log block
Ah, good catch. I'll look into enriching that class then to include both the API filters and the Druid filters.
It would be better to add another block, rather than combine them, I think. Especially as we look to the possibility of breaking the Druid-specific aspects into its own module...
try {
    RequestLog.startTiming("QueryingLuceneForPage" + currentPage);
If there are many pages, that could add a lot of bloat to the request log. I would move this timing up a level. You can calculate the per-page timings using the pagination information that I think should already be in the RequestLog.
entityBody = writer.writeValueAsString(druidQuery);
} catch (JsonProcessingException e) {
    throw new IllegalStateException(e);
} finally {
    RequestLog.stopTiming("DruidQuerySerialization");
This should include the sequence number (see the timing we do below). Otherwise, when stitching requests back together, we may have trouble.
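A sketch of what a sequence-aware timer name could look like; how the sequence number is obtained here is an assumption, not the actual RequestLog API:

```java
// Hypothetical: 'sequenceNumber' identifies which Druid query of this request is being serialized.
String timerName = "DruidQuerySerialization_" + sequenceNumber;
RequestLog.startTiming(timerName);
try {
    entityBody = writer.writeValueAsString(druidQuery);
} catch (JsonProcessingException e) {
    throw new IllegalStateException(e);
} finally {
    RequestLog.stopTiming(timerName);
}
```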
 *
 * @return The value returned by the generator
 */
private <T> T timeGenerator(Producer<T> generator, String timerName) {
While this solution is cute, it uglies up the calls everywhere, making the code much harder to understand. Instead, if we want to do this, we should use annotations on the "generator" methods. Take a look at the `@Timed` annotation in the Metrics library to see how they are doing it. Having a similar annotation for the `RequestLog` may be exactly what we're looking for.
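For reference, the Metrics `@Timed` annotation mentioned above is applied roughly like this; note that some integration (e.g. metrics-jersey2 for resource methods, or the AspectJ module) still has to be wired in for the annotation to actually record anything, and the names below are illustrative:

```java
import com.codahale.metrics.annotation.Timed;

public class DimensionSearchResource {
    // With an instrumented dispatcher, each call to this method is timed
    // under a Metrics Timer named after the annotation.
    @Timed(name = "searchDimensionRows")
    public String search(String query) {
        return "results for " + query;  // placeholder body
    }
}
```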
Why can't Java annotations be like Python decorators? Why???
Based on what I've found so far, it seems like in order to do what you want, we need to dive into AspectJ or Spring's aspect-oriented programming. While that stuff seems like it has merit, I really don't have time to wrap my head around a whole new paradigm of OOP, or to figure out how to then use Spring/AspectJ to implement it.
So I'm just going to brute-force wrap all the timed invocations in a try-finally block. If someone else wants to take the time to grok aspect-oriented programming in the context of AspectJ or Spring AOP, and create proper annotations, they are welcome to.
Fair enough. I dug in a bit deeper too, and it seems you are indeed right. That said, we should open an issue to revisit this, since it would help the RequestLog piece immensely to be annotation-driven rather than code-driven.
Also, we should make sure the timings are internal to the methods that are getting timed, since callers don't really need this complexity, and the method is better suited to handling it. (Note: I've not looked at your updates yet, so this may already be the case.)
I agree 100% that making the RequestLog annotation-based would be a Good Thing(TM).
I haven't actually pushed the timings into the methods. I'll do that now.
@@ -51,7 +51,12 @@ public void invoke(JsonNode rootNode) {
HttpErrorCallback error = response.getErrorCallback(druidQuery);
FailureCallback failure = response.getFailureCallback(druidQuery);

druidWebService.postDruidQuery(context, success, error, failure, druidQuery);
try {
    RequestLog.startTiming("PostingDruidQuery");
This timing should be moved into the actual druid client (and it may already be there). Also, I don't think it'll time what you want, since it's only going to time how long the request takes to send, not send and come back. (yay async!)
I intended to time how long it takes to send (hence the name `postingDruidQuery`); it's an attempt to give us some idea of how much of the time we spend sending the Druid query is spent serializing it.
Like I said, this should move into the methods doing the sending, rather than the caller. Also, the timer needs to take into account the sequence number of the query, I think, in order for log merging to work correctly.
This sounds like something tricky enough to justify a DruidServlet and a test case to make sure that all the phases get tested.
Annotations don't require an AOP framework at all. They are much simpler than that. Look at how the `Timed` annotation from Metrics does what it does. It should be pretty much what we want to do.
I did look into how `@Timed` does what it does. Furthermore, all the examples I found for writing custom annotations seem to assume you have your own main method where you can scan files for annotations and do something, rather than doing something special on method invocation.
Just a few more things 😄
@@ -562,7 +564,7 @@ private Query getFilterQuery(Set<ApiFilter> filters, int perPage) {
        throw new PageNotFoundException(requestedPageNumber, perPage, 0);
    }
}

Keep this whitespace
#111: issue to make the RequestLog annotation-driven.
(Force-pushed from aaa3fb6 to c1c0577.)
I know. I died a little with every line I wrote. My higher-order timing function is still an option ;)
What's the status on this? Is the job error being pursued?
@michael-mclawhorn The status is "Oh crap, the quarter's almost over and I have other things that need to get done!"
(Force-pushed from 8d5c155 to 0f4ddc1.)
@michael-mclawhorn The status is now "I addressed the things."
👍 Seems reasonable.
👍
@@ -37,7 +37,8 @@ class AsyncInvalidApiRequest extends AsyncFunctionalSpec {

@Override
Map<String, Closure<Void>> getResultAssertions() {
    [ data: { assert GroovyTestUtils.compareJson(it.readEntity(String), EXPECTED_ERROR_MESSAGE) } ]
    [ data: { assert GroovyTestUtils.compareErrorPayload(it.readEntity(String), EXPECTED_ERROR_MESSAGE) } ]
What impact does this change have?
Thanks to a previous change, error messages now have a `requestId`, which is a unique id. `compareErrorPayload` ignores the `requestId` when performing a comparison. I have no idea how this test got through those changes unchanged, but it started failing when I rebased my code onto master.
import com.fasterxml.jackson.annotation.JsonAutoDetect;

import com.yahoo.bard.webservice.druid.model.filter.Filter;
Move up.
filterTypeCounter.put(
        currentFilter.getClass().getSimpleName(),
        filterTypeCounter.getOrDefault(currentFilter.getClass().getSimpleName(), 0L) + 1L
);
Can be replaced with `Map::merge`:

filterTypeCounter.merge(currentFilter.getClass().getSimpleName(), 1L, (old, increment) -> old + increment);
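As a side note, assuming `filterTypeCounter` is a `Map<String, Long>`, the merge function can also be the `Long::sum` method reference:

```java
filterTypeCounter.merge(currentFilter.getClass().getSimpleName(), 1L, Long::sum);
```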
filterStack.addAll(((MultiClauseFilter) currentFilter).getFields());
} else if (currentFilter instanceof NotFilter) {
    filterStack.add(((NotFilter) currentFilter).getField());
}
This smells a little like extensibility pain... It would be great if Java made it easy to ask if things had various methods, but about the best we can do is add interfaces for specific methods. To that end, do we want to make `HasFields` and `HasField` interfaces, apply those interfaces to the `MultiClauseFilter` and `NotFilter` (respectively), and then update this conditional to depend on those interfaces instead of these more concrete classes?
Specifically, it's the `NotFilter` dependency that drew my attention, since the `MultiClauseFilter` is abstract already and (roughly) intended to serve the same purpose as the interface would.
If we are going to have that kind of interface, I'd much rather just have `HasFields` (`Not` can just return a singleton set). Though I'd prefer a better name than `HasFields`; I find that kind of convention hideously ugly in Java. Perhaps something like `ComplexFilter`? i.e. a filter that's built out of other filters?
To me, `Complex` isn't specific enough. `HasFilters` is nice, in a way, because it fits (roughly) the bean convention of `getFilters`/`setFilters` translating to the `filters` property present in Groovy and C#, for example.
That said, I'm happy with `ComplexFilter` meaning "has a `getFilters` method". In part, I think we're fighting the generality of Druid's filters and their decision to use "field" to mean "another filter".
Checkstyle needs a fix... otherwise I think this is good to squash into logical commits and merge.
(Force-pushed from f41958c to 5cb3a13.)
- A map is logged that describes the structure of the filter being sent to Druid. For each filter type, it includes a count of the number of instances of that filter.
(Force-pushed from 5cb3a13 to a0b798f.)
- In order to better aid performance optimization, some additional timing has been added to the request processing workflow.
- Based on past experience, most slow queries spend the majority of their time building a `DataApiRequest`, so timings have been added around all the major steps in building a `DataApiRequest`.
- Furthermore, it's expected that the bulk of the time spent building a `DataApiRequest` is spent resolving filters, so some additional timings have been added to the `LuceneSearchProvider`, the provider most commonly used for large dimensions.
- It's also possible that a lot of the request workflow is spent serializing large Druid queries, so some timings have been added around serializing Druid queries.