This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

populate_stats_process_rooms failing with an unknown room #14800

Closed
clokep opened this issue Jan 9, 2023 · 3 comments · Fixed by #14873
Labels
A-Background-Updates: Filling in database columns, making the database eventually up-to-date
O-Occasional: Affects or can be seen by some users regularly or most users rarely
S-Major: Major functionality / product severely impaired, no satisfactory workaround.
T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@clokep
Member

clokep commented Jan 9, 2023

As part of #14643 we are re-populating room and user stats. This is currently failing with an error:

Room !XXX for event $YYY is unknown
Stack trace
2023-01-06 17:45:04,582 - synapse.storage.background_updates - 431 - INFO - background_updates-0 - Starting update batch on background update 'populate_stats_process_rooms'
2023-01-06 17:45:04,725 - synapse.storage.background_updates - 302 - ERROR - background_updates-0 - Error doing update
Capture point (most recent call last):
  File "venv/lib/python3.8-pyston2.3/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "venv/lib/python3.8-pyston2.3/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "src/synapse/app/homeserver.py", line 386, in <module>
    main()
  File "src/synapse/app/homeserver.py", line 382, in main
    run(hs)
  File "src/synapse/app/homeserver.py", line 361, in run
    _base.start_reactor(
  File "src/synapse/app/_base.py", line 191, in start_reactor
    run()
  File "src/synapse/app/_base.py", line 173, in run
    run_command()
  File "src/synapse/app/_base.py", line 148, in <lambda>
    run_command: Callable[[], None] = lambda: reactor.run(),
  File "venv/site-packages/twisted/internet/base.py", line 1318, in run
    self.mainLoop()
  File "venv/site-packages/twisted/internet/base.py", line 1328, in mainLoop
    reactorBaseSelf.runUntilCurrent()
  File "venv/site-packages/twisted/internet/base.py", line 967, in runUntilCurrent
    f(*a, **kw)
  File "src/synapse/storage/databases/main/events_worker.py", line 1185, in fire
    d.callback(row_dict)
  File "venv/site-packages/twisted/internet/defer.py", line 696, in callback
    self._startRunCallbacks(result)
  File "venv/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "venv/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "venv/site-packages/twisted/internet/defer.py", line 1792, in gotResult
    _inlineCallbacks(r, gen, status, context)
  File "venv/site-packages/twisted/internet/defer.py", line 1775, in _inlineCallbacks
    status.deferred.errback()
  File "venv/site-packages/twisted/internet/defer.py", line 735, in errback
    self._startRunCallbacks(fail)
  File "venv/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "venv/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "venv/site-packages/twisted/internet/defer.py", line 735, in errback
    self._startRunCallbacks(fail)
  File "venv/site-packages/twisted/internet/defer.py", line 798, in _startRunCallbacks
    self._runCallbacks()
  File "venv/site-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "venv/site-packages/twisted/internet/defer.py", line 1792, in gotResult
    _inlineCallbacks(r, gen, status, context)
  File "venv/site-packages/twisted/internet/defer.py", line 1693, in _inlineCallbacks
    result = context.run(
  File "venv/site-packages/twisted/python/failure.py", line 518, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
Traceback (most recent call last):
  File "src/synapse/storage/background_updates.py", line 294, in run_background_updates
    result = await self.do_next_background_update(sleep)
  File "src/synapse/storage/background_updates.py", line 424, in do_next_background_update
    await self._do_background_update(desired_duration_ms)
  File "src/synapse/storage/background_updates.py", line 467, in _do_background_update
    items_updated = await update_handler(progress, batch_size)
  File "src/synapse/storage/databases/main/stats.py", line 206, in _populate_stats_process_rooms
    await self._calculate_and_set_initial_state_for_room(room_id)
  File "src/synapse/storage/databases/main/stats.py", line 557, in _calculate_and_set_initial_state_for_room
    state_event_map = await self.get_events(event_ids, get_prev_content=False)  # type: ignore[attr-defined]
  File "src/synapse/storage/databases/main/events_worker.py", line 536, in get_events
    events = await self.get_events_as_list(
  File "src/synapse/logging/opentracing.py", line 896, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "src/synapse/logging/opentracing.py", line 896, in _wrapper
    return await func(*args, **kwargs)  # type: ignore[misc]
  File "src/synapse/storage/databases/main/events_worker.py", line 586, in get_events_as_list
    event_entry_map = await self.get_unredacted_events_from_cache_or_db(
  File "src/synapse/storage/databases/main/events_worker.py", line 818, in get_unredacted_events_from_cache_or_db
    missing_events: Dict[str, EventCacheEntry] = await delay_cancellation(
  File "venv/site-packages/twisted/internet/defer.py", line 1697, in _inlineCallbacks
    result = context.run(gen.send, result)
  File "src/synapse/storage/databases/main/events_worker.py", line 804, in get_missing_events_from_cache_or_db
    raise e
  File "src/synapse/storage/databases/main/events_worker.py", line 797, in get_missing_events_from_cache_or_db
    db_missing_events = await self._get_events_from_db(
  File "src/synapse/storage/databases/main/events_worker.py", line 1297, in _get_events_from_db
    raise Exception(
Exception: Room !XXX for event $YYY is unknown

This exception is raised at:

raise Exception(
    "Room %s for event %s is unknown" % (d["room_id"], event_id)
)

It happens when attempting to fetch a non-membership event from a room whose room version is unknown (room_version IS NULL in the rooms table).
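
The condition described above can be illustrated with a minimal sketch. This is not the Synapse code itself: the function name `check_room_known` and the dictionary layout of `row` are hypothetical, and only the error message is taken from the traceback in this issue.

```python
from typing import Optional

MEMBERSHIP_EVENT_TYPE = "m.room.member"


def check_room_known(row: dict, event_id: str, room_version: Optional[str]) -> None:
    """Raise if a non-membership event belongs to a room with no known version.

    `row` stands in for an event row with "room_id" and "type" keys
    (an assumed layout, for illustration only).
    """
    if room_version is not None:
        return  # The room version is known, so there is nothing to check.
    if row["type"] == MEMBERSHIP_EVENT_TYPE:
        return  # Membership events are tolerated without a room version.
    raise Exception(
        "Room %s for event %s is unknown" % (row["room_id"], event_id)
    )
```

With `room_version=None`, a row of type `m.room.create` raises the error seen above, while a row of type `m.room.member` passes.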

Some history:

@clokep
Member Author

clokep commented Jan 9, 2023

To see how many rooms might be affected I ran:

SELECT room_id FROM (
    SELECT room_id, COUNT(*) FROM current_state_events
    WHERE
        room_id IN (
            SELECT room_id FROM rooms WHERE room_version IS NULL
        )
        AND type != 'm.room.member'
    GROUP BY room_id
) AS cse
WHERE cse.count > 0;

This only results in two rooms, including the room it is bailing on.
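
For anyone wanting to try the query outside a real deployment, here is a small sketch that runs an equivalent query against toy tables using Python's `sqlite3`. The table and column names match those used in the query above; the room IDs, the cut-down schema, and the `affected_rooms` helper are made up for illustration. The count is aliased as `n` because SQLite does not name an unaliased `COUNT(*)` column `count`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Toy schema: only the columns the query needs.
    CREATE TABLE rooms (room_id TEXT PRIMARY KEY, room_version TEXT);
    CREATE TABLE current_state_events (event_id TEXT, room_id TEXT, type TEXT);

    INSERT INTO rooms VALUES
        ('!ok:example.org', '9'),
        ('!invite:example.org', NULL),
        ('!broken:example.org', NULL);
    INSERT INTO current_state_events VALUES
        ('$1', '!ok:example.org', 'm.room.create'),
        ('$2', '!invite:example.org', 'm.room.member'),
        ('$3', '!broken:example.org', 'm.room.create'),
        ('$4', '!broken:example.org', 'm.room.member');
    """
)


def affected_rooms(conn: sqlite3.Connection) -> list:
    """Rooms with no known version that still have non-membership state."""
    rows = conn.execute(
        """
        SELECT room_id FROM (
            SELECT room_id, COUNT(*) AS n FROM current_state_events
            WHERE
                room_id IN (
                    SELECT room_id FROM rooms WHERE room_version IS NULL
                )
                AND type != 'm.room.member'
            GROUP BY room_id
        ) AS cse
        WHERE cse.n > 0
        ORDER BY room_id
        """
    ).fetchall()
    return [row[0] for row in rows]


print(affected_rooms(conn))
```

Only `!broken:example.org` is reported: the room with a known version is excluded by the subquery, and the room with only membership state is excluded by the type filter.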

@DMRobertson added the S-Major, T-Defect, A-Background-Updates, and O-Occasional labels Jan 9, 2023
@clokep
Member Author

clokep commented Jan 9, 2023

This only results in two rooms, including the room it is bailing on.

The second room has an m.room.create event in current_state_events and state_events, but that event doesn't exist in the events or event_json tables.

It actually seems to have no events at all in the events / event_json tables, which makes me think this was a failed purge?
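
A check for this kind of dangling state can be sketched as a LEFT JOIN from current_state_events to events. The snippet below runs against toy `sqlite3` tables; the cut-down schema, the sample IDs, and the `dangling_state_events` helper are assumptions for illustration, not something that was run against the affected database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Toy schema: only the columns needed for the join.
    CREATE TABLE events (event_id TEXT PRIMARY KEY, room_id TEXT);
    CREATE TABLE current_state_events (event_id TEXT, room_id TEXT, type TEXT);

    INSERT INTO events VALUES ('$1', '!ok:example.org');
    INSERT INTO current_state_events VALUES
        ('$1', '!ok:example.org', 'm.room.create'),
        ('$2', '!purged:example.org', 'm.room.create');
    """
)


def dangling_state_events(conn: sqlite3.Connection) -> list:
    """Current state entries whose event has no row in the events table."""
    rows = conn.execute(
        """
        SELECT cse.room_id, cse.event_id
        FROM current_state_events AS cse
        LEFT JOIN events AS e ON e.event_id = cse.event_id
        WHERE e.event_id IS NULL
        ORDER BY cse.room_id, cse.event_id
        """
    ).fetchall()
    return [tuple(row) for row in rows]


print(dangling_state_events(conn))
```

Here only the state entry for `!purged:example.org` is returned, since its event is missing from the toy events table.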

@clokep clokep self-assigned this Jan 9, 2023
@clokep
Member Author

clokep commented Jan 19, 2023

@erikjohnston and I discussed this and we think that these two rooms are so broken that we should see if there's a way for stats processing to skip them -- they're not usable anyway.
