Return request ID in HTTP response headers #10854
and include this information in the HTTP response headers
-        self._processing_finished_time = None
+        self._processing_finished_time: Optional[float] = None

         # what time we finished sending the response to the client (or the connection
         # dropped)
-        self.finish_time = None
+        self.finish_time: Optional[float] = None
Not sure why mypy started being unhappy, but it was interpreting these as being of type NoneType
throughout the program, causing problems later.
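A minimal sketch of the annotation pattern discussed above, assuming a hypothetical `RequestTimings` class rather than Synapse's real request class. It only illustrates why the explicit `Optional[float]` helps a type checker; it is not the code from this PR.

```python
import time
from typing import Optional


class RequestTimings:
    """Hypothetical stand-in for the request object discussed above."""

    def __init__(self) -> None:
        # With a bare `= None`, a checker may infer the attribute's type from
        # the initial value alone. The explicit Optional[float] states that a
        # float is expected to be assigned later.
        self.finish_time: Optional[float] = None

    def mark_finished(self) -> None:
        # Record the wall-clock time at which the response finished.
        self.finish_time = time.time()
```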
"generic_" wasn't useful
I don't think this is true.
It is only used if opentracing is enabled (and I'm not 100% sure what is included as the header).
I don't know if Patrick's comment above invalidates this PR or not, but I took a look.
Not entirely sure about the potential for log pollution; compare POST-424242 vs worker-event_persister18/POST-424242, especially when the worker ID is redundant/implied in the log file itself (as far as I know?).
@@ -0,0 +1 @@
+Include a request id in Synapse's HTTP responses to aid debugging.
-Include a request id in Synapse's HTTP responses to aid debugging.
+Include a request ID in Synapse's HTTP responses to aid debugging.
-    def get_request_id(self):
-        return "%s-%i" % (self.get_method(), self.request_seq)
+    def get_request_id(self) -> str:
+        return f"{self._instance_name}/{self.get_method()}-{self.request_seq}"
Is this the same thing that winds up in the logs?
I can see this perhaps making things a bit noisier, but not sure.
yes, this is what goes in the logs. I'm not enthusiastic about adding this noise to them.
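To make the noise concern concrete, here is a small standalone sketch of the two ID formats from the diff above. The function names are hypothetical; only the format strings come from the PR.

```python
def legacy_request_id(method: str, request_seq: int) -> str:
    # Format before this PR: method plus a per-process sequence number.
    return "%s-%i" % (method, request_seq)


def prefixed_request_id(instance_name: str, method: str, request_seq: int) -> str:
    # Format proposed in this PR: the instance name is prepended so the ID
    # is distinguishable across workers.
    return f"{instance_name}/{method}-{request_seq}"
```

With the values quoted earlier in the thread, the first yields `POST-424242` and the second `worker-event_persister18/POST-424242`, so every log line carrying the ID grows by the length of the instance name plus one character.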
# TODO can we avoid the cast? Maybe take finish_time as an explicit float param?
response_send_time = (
    cast(float, self.finish_time) - self._processing_finished_time
I think this is slightly less evil.
-# TODO can we avoid the cast? Maybe take finish_time as an explicit float param?
-response_send_time = (
-    cast(float, self.finish_time) - self._processing_finished_time
+assert self.finish_time is not None
+response_send_time = (
+    self.finish_time - self._processing_finished_time
+1. Or raise an explicit Exception.
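For comparison, a rough sketch of the three options mentioned in this thread (cast, assert, explicit exception), written as free functions with hypothetical names rather than as the method in the diff. The choice of `RuntimeError` is an assumption for illustration.

```python
from typing import Optional, cast


def send_time_with_cast(finish_time: Optional[float], processing_finished: float) -> float:
    # cast() only informs the type checker; nothing is checked at runtime, so
    # a None here surfaces as a TypeError from the subtraction.
    return cast(float, finish_time) - processing_finished


def send_time_with_assert(finish_time: Optional[float], processing_finished: float) -> float:
    # The assert narrows Optional[float] to float for the checker and fails
    # early at runtime (unless Python runs with -O, which strips asserts).
    assert finish_time is not None
    return finish_time - processing_finished


def send_time_with_raise(finish_time: Optional[float], processing_finished: float) -> float:
    # An explicit exception also narrows the type and is never stripped.
    if finish_time is None:
        raise RuntimeError("finish_time has not been set")
    return finish_time - processing_finished
```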
@@ -188,7 +188,7 @@ def test_with_request_context(self):
         ]
         self.assertCountEqual(log.keys(), expected_log_keys)
         self.assertEqual(log["log"], "Hello there, wally!")
-        self.assertTrue(log["request"].startswith("POST-"))
+        self.assertIn("POST-", log["request"])
or vs:
-self.assertIn("POST-", log["request"])
+self.assertTrue(log["request"].startswith("worker-test/POST-"))
It sounds like it does.
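A small standalone illustration of the difference between the two assertions suggested above. The request strings are made-up sample values, not output from Synapse's test suite.

```python
import unittest


class RequestIdAssertionExample(unittest.TestCase):
    def test_prefix_checks(self) -> None:
        request = "worker-test/POST-1"  # hypothetical sample value

        # Loose check: passes wherever "POST-" appears in the string.
        self.assertIn("POST-", request)

        # Strict check: also pins down the instance-name prefix.
        self.assertTrue(request.startswith("worker-test/POST-"))

        # The loose check would accept an unexpected prefix as well.
        self.assertIn("POST-", "something-else/POST-1")
        self.assertFalse("something-else/POST-1".startswith("worker-test/POST-"))
```

The strict form documents the expected format, at the cost of coupling the test to the instance name used in the test setup.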
What I want here is to be able to quickly backtrack from a client HTTP response to the corresponding server-side processing. I've found myself wanting this when investigating sytests. A sytest run makes thousands of requests and it's a pain to have to cross-reference timestamps, never mind working out which worker's log file contains the relevant logs. I want a unique string I can bang into I think this is useful to have, even in the absence of full opentracing telemetry. It might be that
I think this is a fair use-case. I personally feel somewhat inclined to say that the request ID in the logs themselves shouldn't make mention of the worker ID; but the header should. If you make the header format I think thoughts from someone on the team who does a bit more of that might be useful. |
I'm afraid I'm a little unconvinced that we want to be adding this header as well as the existing In order for
That still leaves you with the problem of figuring out which worker handled a particular request. A few thoughts here:
@@ -377,6 +377,7 @@ def _listen_http(self, listener_config: ListenerConfig):
                 self.version_string,
                 max_request_body_size=max_request_body_size(self.config),
                 reactor=self.get_reactor(),
+                instance_name=f"worker-{site_tag}",
I don't think site_tag is quite what we want, is it? Typically it's just the port number.
Absolutely, this sounds ideal. And to be explicit, I'd include this header in the response even if opentracing was disabled.
That's fine by me, assuming the trace IDs are unique across workers (guessing they're a uuid of some kind?).
I'm going to raise a new issue to summarise the discussion and close this. Many thanks all.
We already have something like this when opentracing is enabled, see #10199. But that's only across federation.
When investigating test failures it's really useful to cross-reference HTTP responses with synapse logs. I propose exposing the request id to facilitate that. I've tried to change the request ids to be unique across all workers now, by including an instance name.
See also matrix-org/sytest#1144.
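A rough sketch of the workflow this description has in mind: attach the request ID to the response headers, then use it to pull the matching lines out of a log. Everything here is hypothetical, including the header name `X-Example-Request-Id` (the discussion does not settle on one) and the plain substring match against log lines.

```python
from typing import Dict, Iterable, List

# Hypothetical header name, chosen only for this illustration.
REQUEST_ID_HEADER = "X-Example-Request-Id"


def add_request_id_header(headers: Dict[str, str], request_id: str) -> Dict[str, str]:
    # Return a copy of the response headers with the request ID attached.
    updated = dict(headers)
    updated[REQUEST_ID_HEADER] = request_id
    return updated


def find_log_lines(log_lines: Iterable[str], request_id: str) -> List[str]:
    # Keep only the log lines that mention the given request ID.
    return [line for line in log_lines if request_id in line]
```

Given a response carrying the header, a test harness could pass its value to `find_log_lines` instead of cross-referencing timestamps by hand.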