This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Improve logging and opentracing for to-device message handling #14598

Merged: 9 commits, Dec 6, 2022
44 changes: 37 additions & 7 deletions synapse/storage/databases/main/deviceinbox.py
@@ -26,8 +26,15 @@
    cast,
)

from synapse.api.constants import EventContentFields
from synapse.logging import issue9533_logger
from synapse.logging.opentracing import log_kv, set_tag, trace
from synapse.logging.opentracing import (
    SynapseTags,
    log_kv,
    set_tag,
    start_active_span,
    trace,
)
from synapse.replication.tcp.streams import ToDeviceStream
from synapse.storage._base import SQLBaseStore, db_to_json
from synapse.storage.database import (
@@ -678,12 +685,35 @@ def add_messages_txn(
],
)

            if remote_messages_by_destination:
                issue9533_logger.debug(
                    "Queued outgoing to-device messages with stream_id %i for %s",
                    stream_id,
                    list(remote_messages_by_destination.keys()),
                )
            for destination, edu in remote_messages_by_destination.items():
                if issue9533_logger.isEnabledFor(logging.DEBUG):
                    issue9533_logger.debug(
                        "Queued outgoing to-device messages with "
                        "stream_id %i, EDU message_id %s, type %s for %s: %s",
                        stream_id,
                        edu["message_id"],
                        edu["type"],
                        destination,
                        [
                            f"{user_id}/{device_id} (msgid "
                            f"{msg.get(EventContentFields.TO_DEVICE_MSGID)})"
                            for (user_id, messages_by_device) in edu["messages"].items()
                            for (device_id, msg) in messages_by_device.items()
                        ],
                    )

                for (user_id, messages_by_device) in edu["messages"].items():
                    for (device_id, msg) in messages_by_device.items():
                        with start_active_span("store_outgoing_to_device_message"):
                            set_tag(SynapseTags.TO_DEVICE_SENDER, edu["sender"])
                            set_tag(SynapseTags.TO_DEVICE_EDU_ID, edu["message_id"])
                            set_tag(SynapseTags.TO_DEVICE_TYPE, edu["type"])
                            set_tag(SynapseTags.TO_DEVICE_RECIPIENT, user_id)
                            set_tag(SynapseTags.TO_DEVICE_RECIPIENT_DEVICE, device_id)
                            set_tag(
                                SynapseTags.TO_DEVICE_MSGID,
                                msg.get(EventContentFields.TO_DEVICE_MSGID),
                            )
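The diff guards the debug call with `isEnabledFor` because the list comprehension summarising each queued message is evaluated eagerly, before `debug()` can decide to discard the record. A standalone sketch of that pattern (toy logger, counter, and message shape for illustration — not Synapse's actual types):

```python
import logging

logger = logging.getLogger("issue9533_demo")
logger.setLevel(logging.INFO)  # DEBUG disabled, as on a typical production homeserver

calls = {"built": 0}

def summarise(messages_by_device):
    # Stands in for the expensive list comprehension in the diff:
    # formatting every queued message just to produce one log line.
    calls["built"] += 1
    return [
        f"{device_id} (msgid {msg.get('msgid')})"
        for device_id, msg in messages_by_device.items()
    ]

messages = {"DEVICE_A": {"msgid": "abc"}}

# Guarded call: summarise() never runs while DEBUG is disabled.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("Queued to-device messages: %s", summarise(messages))

assert calls["built"] == 0
```

Without the guard, `summarise(messages)` would run on every call even when the log line is thrown away; with it, the formatting cost is only paid when DEBUG logging is actually on.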
Member:
These are quite a lot of tags. I don't suppose you have any insight into whether this is expensive or not? I don't really know enough about Elasticsearch to tell whether this is absolutely fine or whether we need to be careful.

Member Author:
hrm, well, good question.

There are a few things to think about here.

Firstly, load on the Synapse side.

  • If the active span isn't being sampled, set_tag is basically a no-op. I don't love that we'll be making all these function calls, but hopefully it's not too bad.
  • If the span is being sampled, making the tag is relatively inexpensive. All the tags are aggregated and sent over to the server once the span completes.
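The two bullets above can be modelled with a toy span (an illustrative sketch of the sampling argument only, not Synapse's or opentracing's real implementation):

```python
# Toy model: tags are buffered on the span and shipped in one batch at
# finish(); on an unsampled span, set_tag does essentially nothing.
class Span:
    def __init__(self, sampled: bool):
        self.sampled = sampled
        self.tags = {}        # tags buffered locally on the span
        self.reported = None  # what eventually reaches the collector

    def set_tag(self, key, value):
        if not self.sampled:
            return  # unsampled span: just a function call and a return
        self.tags[key] = value

    def finish(self):
        # All buffered tags go to the server in one batch when the span
        # completes, so each individual set_tag call stays cheap.
        if self.sampled:
            self.reported = dict(self.tags)

unsampled = Span(sampled=False)
unsampled.set_tag("to_device.msgid", "abc")  # effectively a no-op
unsampled.finish()
assert unsampled.reported is None

sampled = Span(sampled=True)
sampled.set_tag("to_device.msgid", "abc")
sampled.finish()
assert sampled.reported == {"to_device.msgid": "abc"}
```

Either way, the per-tag cost on the Synapse side is a dict write at most, deferred network traffic at worst.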

On the server side, I don't really know, but my understanding is that each tag is entered into an index which references a "document", where the document is a single span (which in turn references the trace it belongs to). So the index entry is just a (key, value) pair where the value is a UUID. In other words, no, I don't think adding extra tags is expensive on the server side.

Member:

Thanks!


        async with self._device_inbox_id_gen.get_next() as stream_id:
            now_ms = self._clock.time_msec()