Filter out auth chain queries that don't exist #16552
Auth chain chain_ids and sequence numbers can never be zero, but a zero can appear in the query here. Stop that from showing up.
This reverts commit 98f253e.
I agree that chain IDs are never 0. They come from a DB-backed sequence:
synapse/synapse/storage/databases/main/events_worker.py
Lines 293 to 300 in 12ca87f
self.event_chain_id_gen = build_sequence_generator(
    db_conn,
    database.engine,
    get_chain_id_txn,
    "event_auth_chain_id",
    table="event_auth_chains",
    id_column="chain_id",
)

CREATE SEQUENCE IF NOT EXISTS event_auth_chain_id;
which per https://www.postgresql.org/docs/16/sql-createsequence.html#id-1.9.3.81.6 should start at 1 and increase by 1.
Therefore querying for events in chain 0 will never return anything.
Oh, your point is that the sequence_number in the chains is never 0?
It looks like these are also seeded at 1 and grow by 1:
synapse/synapse/storage/databases/main/events.py
Lines 941 to 950 in e9069c9
# We found a chain ID/sequence number candidate, check its
# not already taken.
proposed_new_id = existing_chain_id[0]
proposed_new_seq = existing_chain_id[1] + 1
if chain_to_max_seq_no[proposed_new_id] < proposed_new_seq:
    new_chain_tuple = (
        proposed_new_id,
        proposed_new_seq,
    )
  for chain_id, seq_no in event_chains.items():
-     chains[chain_id] = max(seq_no - 1, chains.get(chain_id, 0))
+     max_sequence_result = max(seq_no - 1, chains.get(chain_id, 0))
+     if max_sequence_result > 0:
I wonder if comparing to != 0 would be clearer?
- if max_sequence_result > 0:
+ if max_sequence_result != 0:
  for chain_id, seq_no in event_chains.items():
-     chains[chain_id] = max(seq_no - 1, chains.get(chain_id, 0))
+     max_sequence_result = max(seq_no - 1, chains.get(chain_id, 0))
Do we understand when max_sequence_result is 0?
I think I've convinced myself that sequence numbers are strictly positive, and chains is a map to sequence numbers. Therefore we would need to have seq_no == 1 and chain_id not in chains. Is there any particular meaning to this situation?
I guess this corresponds to a brand-new chain that isn't the target of some other chain. Which sounds like the kind of thing that only happens at the start of a room? (cc @erikjohnston: is any of this sane?)
This is basically handling the case where an initial event is the first thing in the chain (and nothing else references it). This is actually pretty common, as we start a new chain whenever we see a new type / state key pair.
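To make the zero case concrete, here is a minimal sketch of the original loop run in isolation. The helper name reachable_before_fix and the sample chain IDs are made up for illustration; only the loop body is taken from the diff.

```python
from typing import Dict


def reachable_before_fix(
    event_chains: Dict[int, int], chains: Dict[int, int]
) -> Dict[int, int]:
    """Apply the pre-fix loop body from the diff to a copy of `chains`."""
    result = dict(chains)
    for chain_id, seq_no in event_chains.items():
        result[chain_id] = max(seq_no - 1, result.get(chain_id, 0))
    return result


# Hypothetical data: chain 7 is brand new (its only event has seq_no == 1)
# and no other chain links to it; chain 3 already has four events.
event_chains = {7: 1, 3: 4}
chains_from_links: Dict[int, int] = {}

before = reachable_before_fix(event_chains, chains_from_links)
print(before)  # {7: 0, 3: 3}
```

Chain 7 ends up with a max sequence number of 0, which is the value that cannot match any stored row.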
-     chains[chain_id] = max(seq_no - 1, chains.get(chain_id, 0))
+     max_sequence_result = max(seq_no - 1, chains.get(chain_id, 0))
+     if max_sequence_result > 0:
+         chains[chain_id] = max_sequence_result
With your change, is it possible that chains can be empty after this loop? If so, will the query below fall over if we pass in an empty chains.items() to execute_values?
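A small sketch of the scenario being asked about: with the filter in place, chains can indeed come out empty when every requested event is the first in its chain. Whether the real query path tolerates an empty input is not established here. run_query below is a hypothetical stand-in, not Synapse's API, and only shows one way an explicit guard could look.

```python
from typing import Dict, Iterable, List, Tuple


def filter_chains(event_chains: Dict[int, int]) -> Dict[int, int]:
    """Loop from the diff with the `> 0` filter, starting from an empty dict."""
    chains: Dict[int, int] = {}
    for chain_id, seq_no in event_chains.items():
        max_sequence_result = max(seq_no - 1, chains.get(chain_id, 0))
        if max_sequence_result > 0:
            chains[chain_id] = max_sequence_result
    return chains


def run_query(rows: Iterable[Tuple[int, int]]) -> List[str]:
    """Hypothetical stand-in for the database call (not Synapse's API)."""
    materialised = list(rows)
    if not materialised:
        # Explicit guard: nothing to look up, so skip the query entirely.
        return []
    return [f"lookup chain {cid} up to seq {seq}" for cid, seq in materialised]


# Every event is the first in its chain, so nothing survives the filter.
chains = filter_chains({7: 1, 9: 1})
print(chains)                     # {}
print(run_query(chains.items()))  # []
```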
I think this is probably fine, but I was careful to double-check this since event auth chain logic is important to keep watertight. I would like @erikjohnston to double-check though.
  for chain_id, seq_no in event_chains.items():
-     chains[chain_id] = max(seq_no - 1, chains.get(chain_id, 0))
+     max_sequence_result = max(seq_no - 1, chains.get(chain_id, 0))
I think I'd replace this change with something like:
for chain_id, seq_no in event_chains.items():
    # Check if the initial event is the first item in the chain. If so, then
    # there is nothing new to add from this chain.
    if seq_no == 1:
        continue
    chains[chain_id] = max(seq_no - 1, chains.get(chain_id, 0))
Maybe?
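As a sanity check, the two variants can be compared side by side. This is only a sketch: the function names and sample data are invented, and it assumes any values already present in chains are strictly positive, as argued earlier in the review.

```python
from typing import Dict


def with_positive_filter(
    event_chains: Dict[int, int], chains: Dict[int, int]
) -> Dict[int, int]:
    """Variant from the PR: compute the max, store it only if it is > 0."""
    result = dict(chains)
    for chain_id, seq_no in event_chains.items():
        max_sequence_result = max(seq_no - 1, result.get(chain_id, 0))
        if max_sequence_result > 0:
            result[chain_id] = max_sequence_result
    return result


def with_early_continue(
    event_chains: Dict[int, int], chains: Dict[int, int]
) -> Dict[int, int]:
    """Suggested variant: skip chains whose initial event has seq_no == 1."""
    result = dict(chains)
    for chain_id, seq_no in event_chains.items():
        if seq_no == 1:
            continue
        result[chain_id] = max(seq_no - 1, result.get(chain_id, 0))
    return result


# Hypothetical data: chain 7 is fresh and unlinked, chain 5 is fresh but
# already reachable through a link, chain 3 is reachable further via a link.
event_chains = {7: 1, 3: 4, 5: 1}
chains_from_links = {5: 2, 3: 6}

a = with_positive_filter(event_chains, chains_from_links)
b = with_early_continue(event_chains, chains_from_links)
print(a == b, a)  # True {5: 2, 3: 6}
```

When seq_no == 1 the max() can only reproduce whatever is already stored for that chain (or 0), so skipping the iteration leaves the dict in the same state.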
Not sure if that is clearer?
Works for me
Replaced logic in 94e22c2
Would you like me to lose that comment above the condition?
Thanks! Yeah, I think that comment is now redundant
Removed in 3eaeaff
…ll be the first item in that chain anyways
❤️
Short version:
Long version:
In _get_auth_chain_ids_using_cover_index_txn():
synapse/synapse/storage/databases/main/event_federation.py
Lines 300 to 304 in c14a7de
Seems to have a small logic problem, where the max() can evaluate to 0. The chains dict is passed into the following SQL block:
synapse/synapse/storage/databases/main/event_federation.py
Lines 314 to 325 in c14a7de
where it inhabits the max_seq variable. Inside the event_auth_chains database table, that column can never be 0.
Filter out that 0 before it gets to the SQL.
The temporary metric attached produces results that appear like this (over less than 24 hours):
The metric commit isn't particularly efficient and has no long term relevance, so it will not be included in the final render.
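To illustrate why a max_seq of 0 is wasted work, here is a self-contained sketch using an in-memory SQLite database. The table is reduced to three columns, and the query shape (sequence_number <= max_seq) is an assumption, since the referenced SQL lines are not reproduced above. The event IDs are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for event_auth_chains; the real table is not shown here.
conn.execute(
    "CREATE TABLE event_auth_chains ("
    "event_id TEXT, chain_id INTEGER, sequence_number INTEGER)"
)
conn.executemany(
    "INSERT INTO event_auth_chains VALUES (?, ?, ?)",
    [("$a", 7, 1), ("$b", 3, 1), ("$c", 3, 2), ("$d", 3, 3)],
)

# Assumed query shape: fetch every event in a chain up to max_seq.
SQL = (
    "SELECT event_id FROM event_auth_chains "
    "WHERE chain_id = ? AND sequence_number <= ? "
    "ORDER BY sequence_number"
)


def lookup(chain_id: int, max_seq: int) -> list:
    """Return event IDs in `chain_id` with sequence_number <= `max_seq`."""
    return [row[0] for row in conn.execute(SQL, (chain_id, max_seq))]


# Sequence numbers start at 1, so max_seq == 0 can never match a row.
print(lookup(7, 0))  # []
print(lookup(3, 2))  # ['$b', '$c']
```

Because stored sequence numbers are at least 1, the lookup with max_seq 0 always returns nothing, so dropping it before the query changes no results.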
Pull Request Checklist
(run the linters)
Signed-off-by: Jason Little realtyem@gmail.com