This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

AssertionError: get_device_updates_by_remote returned too many EDUs causing federation problems #11719

Closed
richvdh opened this issue Jan 10, 2022 · 3 comments · Fixed by #11730
Labels
S-Major: Major functionality / product severely impaired, no satisfactory workaround.
T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
X-Regression: Something broke which worked on a previous release.
X-Release-Blocker: Must be resolved before making a release.

Comments

@richvdh
Member

richvdh commented Jan 10, 2022

We're seeing quite a lot of these, and they seem to be interrupting federation traffic:

2022-01-10 10:03:58,307 - synapse.federation.sender.per_destination_queue - 364 - ERROR - federation_transaction_transmission_loop-9933549 - TX [....] Failed to send transaction
Capture point (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/synapse/src/synapse/app/federation_sender.py", line 22, in <module>
    start(sys.argv[1:])
  File "/home/synapse/src/synapse/app/generic_worker.py", line 505, in start
    _base.start_worker_reactor("synapse-generic-worker", config)
  File "/home/synapse/src/synapse/app/_base.py", line 126, in start_worker_reactor
    run_command=run_command,
  File "/home/synapse/src/synapse/app/_base.py", line 178, in start_reactor
    run()
  File "/home/synapse/src/synapse/app/_base.py", line 162, in run
    run_command()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/base.py", line 1318, in run
    self.mainLoop()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/base.py", line 1328, in mainLoop
    reactorBaseSelf.runUntilCurrent()
  File "/home/synapse/src/synapse/metrics/__init__.py", line 645, in f
    ret = func(*args, **kwargs)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/base.py", line 967, in runUntilCurrent
    f(*a, **kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 662, in callback
    self._startRunCallbacks(result)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 764, in _startRunCallbacks
    self._runCallbacks()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 859, in _runCallbacks
    current.result, *args, **kwargs
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 1751, in gotResult
    current_context.run(_inlineCallbacks, r, gen, status)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 1661, in _inlineCallbacks
    result = current_context.run(gen.send, result)
  File "/home/synapse/src/synapse/metrics/background_process_metrics.py", line 242, in run
    return await func(*args, **kwargs)
Traceback (most recent call last):
  File "/home/synapse/src/synapse/federation/sender/per_destination_queue.py", line 278, in _transaction_transmission_loop
    async with _TransactionQueueManager(self) as (
  File "/home/synapse/src/synapse/federation/sender/per_destination_queue.py", line 631, in __aenter__
    limit
  File "/home/synapse/src/synapse/federation/sender/per_destination_queue.py", line 563, in _get_device_update_edus
    assert len(edus) <= limit, "get_device_updates_by_remote returned too many EDUs"
AssertionError: get_device_updates_by_remote returned too many EDUs
@richvdh
Member Author

richvdh commented Jan 10, 2022

Looks to be new in Synapse v1.50.0rc1: https://sentry.matrix.org/sentry/synapse-matrixorg/issues/239101

richvdh added the X-Regression, X-Release-Blocker, S-Major, and T-Defect labels Jan 10, 2022
@squahtx
Contributor

squahtx commented Jan 10, 2022

This may be due to #10520, which increased the size of edus by one per user.
edus comes from DeviceWorkerStore.get_device_updates_by_remote and, given its non-trivial implementation, I don't believe the assert is guaranteed to hold, even before that change.
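
To illustrate the failure mode described above, here is a minimal sketch of how a length assertion can break when a query applies its limit to rows but then appends one extra EDU per user. This is not Synapse's implementation: get_updates_with_extras, check_limit, the EDU type strings, and the sample user and device IDs are all hypothetical names chosen for the example.

```python
from typing import Dict, List, Tuple


def get_updates_with_extras(
    rows: List[Tuple[str, str]], limit: int
) -> List[Dict[str, str]]:
    """Hypothetical stand-in for a store query (not Synapse's real code).

    Applies `limit` to the input rows, then appends one additional EDU
    per distinct user, so the result can be longer than `limit`.
    """
    limited = rows[:limit]
    edus = [
        {"type": "device_update", "user_id": user_id, "device_id": device_id}
        for user_id, device_id in limited
    ]
    # The extra per-user EDUs are added after the limit has been applied.
    for user_id in sorted({user_id for user_id, _ in limited}):
        edus.append({"type": "extra_per_user", "user_id": user_id})
    return edus


def check_limit(edus: List[Dict[str, str]], limit: int) -> None:
    """Same shape of check as the assertion in the traceback."""
    assert len(edus) <= limit, "returned too many EDUs"


rows = [
    ("@a:example.org", "D1"),
    ("@b:example.org", "D2"),
    ("@a:example.org", "D3"),
]
limit = 3
edus = get_updates_with_extras(rows, limit)

try:
    check_limit(edus, limit)
    assertion_failed = False
except AssertionError:
    assertion_failed = True

print(len(edus), assertion_failed)
```

With three rows and two distinct users, the sketch produces five EDUs against a limit of three, so the check fails. Under these assumptions, the caller cannot rely on the limit unless the query counts the extra entries against it.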

reivilibre self-assigned this Jan 11, 2022
@richvdh
Member Author

richvdh commented Jan 11, 2022

Yes, that does look odd. I guess the first thing to do here is to figure out how it used to work.
