Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support passing X-Request-ID to Druid #68

Merged
merged 1 commit into from
Oct 17, 2016

Conversation

DennisMcWherter
Copy link
Contributor

@DennisMcWherter DennisMcWherter commented Oct 14, 2016

A note from offline discussion, HTTP headers are case insensitive. See RFC 7230 for more information.

@@ -92,6 +94,7 @@ private RequestLog(RequestLog rl) {
times = new LinkedHashMap<>(rl.times);
threadIds = new LinkedHashSet<>(rl.threadIds);
MDC.put(ID_KEY, logId);
idPrefix = "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be set to the idPrefix on the passed-in RequestLog.

if (current.info == null) {
current.init();
}
current.idPrefix = (idPrefix == null) ? "" : idPrefix;
Copy link
Contributor

@archolewa archolewa Oct 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't enough. If you look at line 157, you'll see that we are putting the id into MDC. That MDC is what allows us to put the id into every log line generated by a request.

As a result, if we want the prefix to also show up in every log line (we do), then we need to make sure the logId and the idPrefix are added to MDC.

So at the minimum, in the last line in this method, we should add the following:

MDC.put(ID_KEY, getId())

Or manually do the string concatentation if calling the getter in the setter gives you heebie-jeebies.

Although in all honesty, I feel like that opens the door to subtle bugs, if init gets called later and overwrites MDC with the prefix.

Instead, I think that setIdPrefix should prepend the id to logId:

current.logId = idPrefix + current.logId;
MDC.put(ID_KEY, current.logId);

That way, if init gets called again at some other random point, we don't have to worry about the prefix being dropped.

@Override
Class<?>[] getResourceClasses() {
[ DataServlet.class
, BardLoggingFilter.class ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this on one line. We don't generally wrap unless we have to.

Map<String, List<String>> getQueryParams() {
[
"metrics": ["width","depth"],
"dateTime": ["2014-06-02%2F2014-06-30"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The date doesn't have to be URL-encoded.

/NotABlocker

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy/pasta'd :) can unencode if we like


@Override
boolean compareResult(String result, String expectedResult, JsonSortStrategy sortStrategy = JsonSortStrategy.SORT_MAPS) {
if (!testedDruidQuery) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this and why is it necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the underlying BaseSpec runs 2 tests... One of which we don't want to perform this validation on. Similarly, both tests rely on this compareResult method, so we return true the second time it's run 🤔

Copy link
Contributor

@archolewa archolewa Oct 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. We use it compare both the API and the Druid response. How about, instead of using a boolean, we only do the comparison if the expected result contains the context field?

The API response doesn't contain any context mapping.

This also ensures that the test will work even if the tests were run out of order (I don't think Spock makes any guarantee about test order unless you use a special annotation).

}

@Override
Response makeAbstractRequest(Closure queryParams=this.&getQueryParams) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would appreciate a comment indicating that the only difference between this version of makeAbstractRequest and the parent version is that we are adding the X-Request-ID to the header. Otherwise, the two implementations are so similar that people may think this implementation is unnecessary, delete it, and then wonder why the test started failing.

Though a better approach may be to add a hook to makeAbstractRequest. Basically, add a method getAdditionalHeaders that returns headers to add to the request before making the call. We can then have BaseDataServletComponentSpec::makeAbstractRequest add the headers from getAdditionalHeaders to the request just before making the call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the latter solution!

@archolewa
Copy link
Contributor

archolewa commented Oct 14, 2016

One other note: If you could remember to add PR labels when you open the PR (Reviewable and Need 2 Reviews in particular), it would be much appreciated. The labels make it a lot easier for us to determine the sttaus of the various PR's at a glance.

Thanks for the PR!

@DennisMcWherter DennisMcWherter force-pushed the xRequestId branch 2 times, most recently from efae917 to 4667c29 Compare October 14, 2016 20:32
Copy link
Contributor

@archolewa archolewa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Assuming you address my small remaining comment.


class DruidQueryIdSpec extends BaseDataServletComponentSpec {
def prefixId = "abcdef"
static def testedDruidQuery = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field is no longer necessary, and can be removed.

Copy link
Collaborator

@cdeszaq cdeszaq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog entry needed, as well as a bunch of in-line thoughts.

@@ -199,13 +199,13 @@ public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOEx
* @param fieldName The name of the node to be emitted.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Javadoc update too?

@@ -83,6 +84,7 @@
@Override
public void filter(ContainerRequestContext request) throws IOException {

RequestLog.addIdPrefix(request.getHeaders().getFirst(X_REQUEST_ID_HEADER));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't just accept everything that comes in on this header. We should strip this down to some set of allowed characters, and also truncate down to a max length.

Applies to the other places we grab it as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are allowed chars for this header? Are we going to use something like heroku's definition? https://devcenter.heroku.com/articles/http-request-id

I'll go with that for now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that'll work, though it would be good to add periods and underscores (. and _)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I would say that min-length doesn't matter, and 200 chars for max length seems sane. Stripping any unallowed character (and then truncating) is probably the most permissive approach too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think trying to be "smart" about this value is beneficial since you're going to be looking for some sort of exact match on this string. It likely won't be useful if we try to fix it up.

RequestLog current = RLOG.get();
String newId = idPrefix + getId();
current.logId = newId;
MDC.put(ID_KEY, newId);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this will update the current.logId field (and the MDC value), I don't think it will update the UUID that is part of the RequestLog payload. I think that the LogInfo.uuid field (on the object stored in current.info) needs to be updated as well in order for that to happen.

I think the reason this was missed is because we're holding that UUID in 2 places current.logId (we read from that all over), and in the LogInfo.uuid field (current.info). It would be great if we could get rid of the current.logId field and only use the value from the LogInfo object. This refactor isn't required for this PR, but it would be great if we could pull it in sooner, rather than later.

As part of the change around where the logId is accessed from, we should update getId to pull from the single source of truth as well.

Also, since this is the closest I can get to it, it would be cool (if we do this refactoring) to move the MDC.put call in the init method up much higher, so that there's less of a gap between getting a UUID and setting it into MCD.

@@ -401,6 +401,18 @@ public static String getId() {
}

/**
* Set id prefix.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's no longer setting, but is instead appending (so subsequent calls will keep appending), this JavaDoc should make that more clear.

@@ -71,6 +72,10 @@ abstract class BaseDataServletComponentSpec extends Specification {
populatePhysicalTableAvailability()
}

MultivaluedHashMap<String, String> getAdditionalHeaders() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what these headers are for without tracing how this is used. Perhaps we should expand the method name to something like getAdditionalApiRequestHeaders? A JavaDoc would also help, but not critical if the method name is clearer.

import javax.ws.rs.core.MultivaluedHashMap


class DruidQueryIdSpec extends BaseDataServletComponentSpec {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have this name reflect something about looking for the request ID header to be connected to the druid query id. Perhaps something like RequestIdPrefixesDruidQueryIdSpec?

JsonSlurper slurper = new JsonSlurper(sortStrategy)
def parsedJson = slurper.parseText(result)
if (parsedJson.context != null) {
testedDruidQuery = true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable isn't a thing I don't think...

if (parsedJson.context != null) {
testedDruidQuery = true
return parsedJson.context.queryId.startsWith(prefixId)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this all can be simplified using Groovy's safe navigation operator:

def parsedQuery = new JsonSlurper(sortStrategy).parseText(result)
assert parsedQuery?.context?.queryId?.startsWith(prefixId)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This didn't actually work. This fails in the case where we want this function to act as a noop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I see. you can't just assert in all cases, since not all cases have that condition being true. Makes sense.

@DennisMcWherter DennisMcWherter force-pushed the xRequestId branch 2 times, most recently from 4e146bd to 9eb90e4 Compare October 17, 2016 15:33
Copy link
Collaborator

@cdeszaq cdeszaq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking quite good, just need a test for the rules around the request ID and we'll be good for this one.

(That test can be a Spec just against the BardLoggingFilter, and Groovy can call private methods, so it's easy to test this method.)

*
* @param requestId Request id to add as queryId prefix to druid
*/
private void appendRequestId(String requestId) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have a test for this, validating the rules.

@cdeszaq
Copy link
Collaborator

cdeszaq commented Oct 17, 2016

👍

@cdeszaq cdeszaq merged commit 2c04321 into yahoo:master Oct 17, 2016
@DennisMcWherter DennisMcWherter deleted the xRequestId branch October 17, 2016 17:44
* @param requestId Request id to add as queryId prefix to druid
*/
private void appendRequestId(String requestId) {
if (isValidRequestId(requestId)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this line. We are ignoring a requestId if it is a valid request id? Wouldn't we want to ignore the request id if it isn't valid?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I think the behavior is correct (yay testing) the name is just wrong. @DeathByTape, could you open a quick PR to correct this?

*/
private boolean isValidRequestId(String requestId) {
return requestId == null || requestId.isEmpty() || requestId.length() > 200
|| !VALID_REQUEST_ID.matcher(requestId).matches();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this method is actually returning the opposite of what it claims. It's returning true if the requestId is not valid, and false if it is valid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants