Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Stuck in "empty" #matrix-dev:matrix.org room #1933

Closed
andrewgdunn opened this issue Feb 21, 2017 · 17 comments
Closed

Stuck in "empty" #matrix-dev:matrix.org room #1933

andrewgdunn opened this issue Feb 21, 2017 · 17 comments

Comments

@andrewgdunn
Copy link

I've federated my homeserver about two weeks ago and one of my initial tests was to join #matrix-dev:matrix.org, which worked fine. Somewhere along the line the room turned into "Empty Room" on my room list, with an identifier of !XqBunHwQIXUiqCaoxq:matrix.org.

Now it seems I am unable to join #matrix-dev:matrix.org, and I cannot leave/forget the "Empty Room", the error message being M_UNKNOWN: Room '!XqBunHwQIXUiqCaoxq:matrix.org' does not exist.

this may be user error, I am just not sure of where to report it.

@ara4n
Copy link
Member

ara4n commented Feb 22, 2017 via email

@andrewgdunn
Copy link
Author

Sure thing, I've reached out via matrix to you directly.

@ara4n
Copy link
Member

ara4n commented Feb 22, 2017 via email

@andrewgdunn
Copy link
Author

user error on my end, looks like the Arch packager is responsible and was using journalctl.

@ara4n
Copy link
Member

ara4n commented Feb 22, 2017

logs have been extracted from the depths of systemd - @richvdh i have them on my laptop if you want to peruse tomorrow.

@ara4n
Copy link
Member

ara4n commented Feb 22, 2017

have conferred; it sounds like this a state reset caused by problems when rejecting invalid events over federation. this is tightly entangled with #1760, which @richvdh is currently blitzing.

@ara4n
Copy link
Member

ara4n commented Feb 22, 2017

tbp this is #1935

@richvdh
Copy link
Member

richvdh commented Feb 23, 2017

Hum; having looked at storrgie's logs, and thought about this a little more, I'm not entirely sure it is #1935.

In this case the problem seems to have happened between 2017-02-04 15:42:13, when storrgie joined #matrix-dev and immediately left again (as seen on matrix.org), and Feb 18 06:40:03 when his client sent another join request on the room, which is rejected.

At some point in that period storrgie's HS has gone from thinking he was not a member of the room, to (at least in part) thinking he was.

@storrgie: would you mind running a query for me to help diagnose:

select * from event_forward_extremities where room_id='!XqBunHwQIXUiqCaoxq:matrix.org';

@richvdh
Copy link
Member

richvdh commented Feb 23, 2017

Investigation of storrgie's db shows that his homeserver has recorded a very limited amount of state for matrix-dev, and in particular doesn't include his leave event:

synapse=# select stream_ordering, event_id, received_ts, type, sender, state_group, prev_event_id from events e left join event_to_state_groups esg using (event_id) left join event_edges using (event_id)  where e.room_id='!XqBunHwQIXUiqCaoxq:matrix.org' order by stream_ordering desc limit 50;
 stream_ordering |              event_id              |  received_ts  |        type         |        sender         | state_group |           prev_event_id            
-----------------+------------------------------------+---------------+---------------------+-----------------------+-------------+------------------------------------
          213479 | $1487813202327BLTuv:sw1v.org       | 1487813203133 | m.room.member       | @richvdh:sw1v.org     |             | $1487808645292bHAcq:sw1v.org
          137042 | $14862229331WOUdC:epiphyte.network | 1486811207473 | m.room.member       | @agd:epiphyte.network |      124423 | $148622151325FsoQT:intelfx.name
          137041 | $145191668846IYNfJ:jki.re          | 1486811207433 | m.room.power_levels | @erikj:jki.re         |             | $14482710929PBzzW:jki.re
          137041 | $145191668846IYNfJ:jki.re          | 1486811207433 | m.room.power_levels | @erikj:jki.re         |             | $145191487043lSIsJ:jki.re
          137040 | $144850062087elIJo:jki.re          | 1486811207410 | m.room.member       | @erikj:jki.re         |             | $142316567653uFnap:jki.re
          137040 | $144850062087elIJo:jki.re          | 1486811207410 | m.room.member       | @erikj:jki.re         |             | $1448485645255678aTNwL:matrix.org
          137039 | $14482710929PBzzW:jki.re           | 1486811207395 | m.room.power_levels | @erikj:jki.re         |             | $142375629822gFxyM:jki.re
          137039 | $14482710929PBzzW:jki.re           | 1486811207395 | m.room.power_levels | @erikj:jki.re         |             | $144827073254999BSjlM:matrix.org
          137038 | $142375629822gFxyM:jki.re          | 1486811207369 | m.room.power_levels | @erikj:jki.re         |             | $14229713842147qGCCG:matrix.org
          137038 | $142375629822gFxyM:jki.re          | 1486811207369 | m.room.power_levels | @erikj:jki.re         |             | $142375628821IwXEC:jki.re
          137037 | $142316567653uFnap:jki.re          | 1486811207351 | m.room.member       | @erikj:jki.re         |             | $14231442822Ljmbj:jki.re
          137037 | $142316567653uFnap:jki.re          | 1486811207351 | m.room.member       | @erikj:jki.re         |             | $14231646891368oygLm:matrix.org
          137036 | $14229713842147qGCCG:matrix.org    | 1486811207335 | m.room.power_levels | @erikj:matrix.org     |             | $14229713842146ZAjhF:matrix.org
          137036 | $14229713842147qGCCG:matrix.org    | 1486811207335 | m.room.power_levels | @erikj:matrix.org     |             | $14229713842146ZAjhF:matrix.org
          137035 | $14229713842146ZAjhF:matrix.org    | 1486811207316 | m.room.power_levels | @erikj:matrix.org     |             | $14229712692144ZqdYT:matrix.org
          137035 | $14229713842146ZAjhF:matrix.org    | 1486811207316 | m.room.power_levels | @erikj:matrix.org     |             | $14229713082145GCJpk:matrix.org
          137034 | $1416420930650ZDaIU:matrix.org     | 1486811207012 | m.room.member       | @erikj:matrix.org     |             | $1416420930634UfvTn:matrix.org
          137033 | $1416420925146WUNCs:jki.re         | 1486811206997 | m.room.power_levels | @erikj:jki.re         |             | $1416420923842GbLym:jki.re
          137033 | $1416420925146WUNCs:jki.re         | 1486811206997 | m.room.power_levels | @erikj:jki.re         |             | $1416420925128XcoKo:matrix.org
          137032 | $1416420923842GbLym:jki.re         | 1486811206983 | m.room.power_levels | @erikj:jki.re         |             | $1416420915228Byipk:matrix.org
          137032 | $1416420923842GbLym:jki.re         | 1486811206983 | m.room.power_levels | @erikj:jki.re         |             | $1416420923824Hdnta:matrix.org
          137031 | $1416420915228Byipk:matrix.org     | 1486811206967 | m.room.power_levels | @matthew:matrix.org   |             | $1416420914880oFczF:jki.re
          137031 | $1416420915228Byipk:matrix.org     | 1486811206967 | m.room.power_levels | @matthew:matrix.org   |             | $1416420915213BpjVc:matrix.org
          137030 | $1416420914880oFczF:jki.re         | 1486811206954 | m.room.power_levels | @erikj:jki.re         |             | $1416420914858UvPkK:jki.re
          137030 | $1416420914880oFczF:jki.re         | 1486811206954 | m.room.power_levels | @erikj:jki.re         |             | $1416420914858UvPkK:jki.re
          137029 | $1416420914858UvPkK:jki.re         | 1486811206940 | m.room.power_levels | @erikj:jki.re         |             | $1416420913198wPxtf:matrix.org
          137029 | $1416420914858UvPkK:jki.re         | 1486811206940 | m.room.power_levels | @erikj:jki.re         |             | $1416420914837zZuOf:matrix.org
          137028 | $1416420914590avyez:jki.re         | 1486811206924 | m.room.member       | @erikj:jki.re         |             | $1416420910692FTaSj:jki.re
          137028 | $1416420914590avyez:jki.re         | 1486811206924 | m.room.member       | @erikj:jki.re         |             | $1416420914572iFzgC:matrix.org
          137027 | $1416420913198wPxtf:matrix.org     | 1486811206911 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420910562CZaRR:matrix.org
          137027 | $1416420913198wPxtf:matrix.org     | 1486811206911 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420913185tdhzb:matrix.org
          137026 | $1416420911042txvso:matrix.org     | 1486811206895 | m.room.member       | @matthew:matrix.org   |             | $1416420911033LHKDh:perlsite.co.uk
          137025 | $1416420910562CZaRR:matrix.org     | 1486811206882 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420910554DEyOl:matrix.org
          137025 | $1416420910562CZaRR:matrix.org     | 1486811206882 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420910554DEyOl:matrix.org
          137024 | $1416420910554DEyOl:matrix.org     | 1486811206867 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420910545FDbdr:matrix.org
          137024 | $1416420910554DEyOl:matrix.org     | 1486811206867 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420910545FDbdr:matrix.org
          137023 | $1416420910545FDbdr:matrix.org     | 1486811206854 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420910533PjyPo:matrix.org
          137023 | $1416420910545FDbdr:matrix.org     | 1486811206854 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420910538EkGDl:matrix.org
          137022 | $1416420910538EkGDl:matrix.org     | 1486811206840 | m.room.join_rules   | @LeoNerd:matrix.org   |             | $1416420910533PjyPo:matrix.org
          137021 | $1416420910533PjyPo:matrix.org     | 1486811206826 | m.room.power_levels | @LeoNerd:matrix.org   |             | $1416420910529mzqgm:matrix.org
          137020 | $1416420910529mzqgm:matrix.org     | 1486811206811 | m.room.member       | @LeoNerd:matrix.org   |             | $1416420910522mQfXQ:matrix.org
          137019 | $1416420910522mQfXQ:matrix.org     | 1486811206796 | m.room.create       | @LeoNerd:matrix.org   |             | 
(42 rows)

@richvdh
Copy link
Member

richvdh commented Feb 23, 2017

So it looks like storrgie's db had got itself into a particular mess, which I don't think will be a common case. I think what happened is:

  • on 4 Feb, he joined and then left #matrix-dev using a sqlite db
  • then he blew away his db and restarted with a psql one
  • on 11 Feb, he received partial state for #matrix-dev. Possibly he tried to rejoin? But his HS got confused about there being events which claimed to come from his HS, but which it knew nothing about.

It remains a bit unclear what happened on 11 Feb to end up in such an inconsistent state.

The solution in this case was:

  1. DELETE FROM current_state_events WHERE room_id='!XqBunHwQIXUiqCaoxq:matrix.org';
    DELETE FROM event_forward_extremities WHERE room_id='!XqBunHwQIXUiqCaoxq:matrix.org';
    
  2. restart synapse
  3. restart clients
  4. rejoin #matrix-dev

I have opened #1940 to track the more general reports of people being spuriously rejoined to rooms.

@andrewgdunn
Copy link
Author

Very likely my fault due to this being my evaluation/deployment order of events:

  • set up homeserver using self signed certificates on sqlite, without federation enabled
  • test connectivity from riot.im/app
  • enable federation
  • test connectivity with #matrix-dev:matrix.org
  • delete sqlite database, delete certifictes
  • issue LE certificates, instance postgres
  • start back up
  • use for several weeks before I reported this ticket

I did not migrate the sqlite to postgres, I just deleted and started over. This was stupid of me. I know that before (on sqlite), and after (on postgres) I was using the same MXID. This likely compounded the issue.

@andrewgdunn
Copy link
Author

I would not be offended if you closed this @richvdh, it's mainly up for people to be informed when they do silly things during their initial deployment.

@richvdh
Copy link
Member

richvdh commented Feb 23, 2017

thanks. There are still some outstanding questions about why the server behaved the way it did when you rejoined. In particular, why it accepted your join event but apparently not your leave event (or anything thereafter). I'll keep it open for now so that if we get something similar in future we might be able to get some more info.

@rubo77
Copy link
Contributor

rubo77 commented Mar 3, 2017

I had the same problem, but I had no entry in event_forward_extremities, only one in current_state_events I only had to delete the current state for only my user. No need to restart the synapse server:

psql --dbname="synapse" --command="delete FROM current_state_events WHERE room_id like '%XqBunHwQIXUiqCaoxq:matrix.org' and state_key like '@rubo77:matrix.ec%';"

After that I could rejoin matrix-dev

@richvdh
Copy link
Member

richvdh commented Apr 20, 2017

@rubo77 sounds like a different problem

@rubo77
Copy link
Contributor

rubo77 commented Oct 1, 2018

Same problem here: #3001

@richvdh
Copy link
Member

richvdh commented Oct 1, 2018

This bug was meant to be quite specific to storrgie's problems. I'm not sure that keeping it open any longer has any value, since we're unlikely to go back and investigate that particular sequence of events.

A lot of the underlying problems would be solved by fixing #3422.

@richvdh richvdh closed this as completed Oct 1, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants