-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Consistently exclude users from directory #10914
Conversation
To do this: - move required helpers to the handlers test dir and pull them in via inheritance (ew) Additionally, we no longer rebuild the user dir the the user dir handlers tests. Instead, tests register their own users and that creates the required user_dir entries.
I think this now fails because I've changed the user directory code to require that users are not deactivated. I think this test was relying on some magic to sort of create a user based on the test case attribute "user_id". I don't think it was doing it properly by actually registering a user proper.
This makes mypy happier
83c4b6d
to
ff89fb8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to unify these! Seems good to me.
if self.get_if_app_services_interested_in_user(user): | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure what to think about this.
I have some appservice-generated users in my homeserver's search directory. I can't confirm or deny whether I've done something wrong to wind up with that happening, but thought it was worth mentioning in case this should trigger the question of 'do we really intend to exclude AS users from the directory?'.
Both yes and no make sense here. No, on a lot of networks (IRC probably? Discord when using a guild-bot.), the users aren't contactable.
But yes, on some networks, users are contactable and there's nothing inherently wrong with letting you discover them when trying to start a DM ...
I would guess that the 'No' cases outweigh the 'Yes' cases, so I think what you're doing is sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like most of the code currently excludes AS users from the directory, but there are enough bugs that they leak through.
At least we're making it consistent here. There might be scope for a future change where we make this configurable per-appservice, but let's leave it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling to follow what's moving where in the unit tests, and why the changes you're making are safe - but I'm happy to take your word that it's still testing the right sort of thing. (For future reference: it might be preferable to split this into two PRs, where first of all we refactor all the tests, and secondly we clean up the code, might be preferable, as a way of demonstrating that we're not fundamentally changing behaviour).
A few thoughts and suggestions.
"""Certain classes of local user are omitted from the user directory. | ||
Is this user one of them? | ||
""" | ||
if self.hs.is_mine_id(user): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please can we do an early return rather than a big nested if statement?
is_deactivated = await self.store.get_user_deactivated_status(user_id) | ||
|
||
if not (is_support or is_deactivated): | ||
if not await self.store.is_excluded_from_user_dir(user_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the double-negative here a bit mind-bending. Perhaps we should instead have an should_include_in_user_dir
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad it's not just me. I was erring towards preserving the original logic below as much as possible, but think it'll be better this way round.
@@ -357,12 +352,8 @@ async def _handle_new_user( | |||
# First, if they're our user then we need to update for every user | |||
if self.is_mine_id(user_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
think this is redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we'll check to see if they're local within is_excluded_from_user_dir
? I agree...
But. I think putting the localness check into is_excluded_from_user_dir
makes things more confusing. To a first approximation, it's true that we'll never exclude a remote user from the user directory. With one exception: users_who_share_private_rooms
. This contains triples (local_user_id, any_user_id, room_id)
where the two users both belong to the given private room. Only local users live in the first entry of those tuples. (I think this is an optimisation?) To maintain that there's extra logic which needs to ensure we don't put remote users into that column.
I think it's going to be clearer if we have a function should_include_local_user_in_dir
and make the local-vs-remote logic explicit.
await self.update_profile_in_user_dir( | ||
user_id, profile.display_name, profile.avatar_url | ||
) | ||
if not await self.is_excluded_from_user_dir(user_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's probably fine, but for the record: what's the logic behind adding this here, and at line 353, where there was no check before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consistency with _handle_deltas
in the handlers code.
Rebuilding the user directory from scratch (here) should have the same outcome as starting with an empty directory and building it by listening for membership events (_handle_deltas
). Before my changes, the latter excluded support and deactivated users, so I think the former should too.
if self.get_if_app_services_interested_in_user(user): | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wonder if we should do this test first, since it doesn't require a database hit and should be pretty quick.
if self.get_if_app_services_interested_in_user(user): | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like most of the code currently excludes AS users from the directory, but there are enough bugs that they leak through.
At least we're making it consistent here. There might be scope for a future change where we make this configurable per-appservice, but let's leave it for now.
self.store, "remove_from_user_dir", return_value=defer.succeed(None) | ||
) as mock: | ||
self.get_success(self.handler.handle_local_user_deactivated(s_user_id)) | ||
self.get_success(mock.not_called()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this do what it's supposed to? Mock
doesn't seem to have a not_called
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. Has this been broken for a while then!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing to assert_not_called
makes the test fail!
My reading: the test is trying to ensure that handle_local_user_deactivated
doesn't try to delete a user from the db who would never have been in the directory. But I think handle_local_user_deactivated
has never had any logic in it, so I think this test has been incorrectly passing forever.
We could make handle_local_user_deactivated
call is_excluded_from_user_directory
, but that involves a SELECT from the db anyway to see if they're deactivated or a support user. I'm guessing that's the same order of magnitude of delay as doing a no-op deletion? And so I think the current behaviour is fine.
I guess it's worth having this kind of test here though, to make sure that synapse doesn't fall over when we deactivate a support user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's worth having this kind of test here though, to make sure that synapse doesn't fall over when we deactivate a support user?
Additional: maybe test_user_directory isn't the right place for such a test.
self.store, "remove_from_user_dir", return_value=defer.succeed(None) | ||
) as mock: | ||
self.get_success(self.handler.handle_local_user_deactivated(r_user_id)) | ||
self.get_success(mock.called_once_with(r_user_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... or a called_once_with
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the original author was missing an assert_
from the method names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this is concerning (and it was very possibly me, though I can't remember). Is there any way we can do this that will properly result in a failure if a typo is made?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different typing stubs for Mock
maybe---removing its __getattr__
?
Edit: or otherwise we could make get_success throw if it's called with or encounters a Mock
object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as of python 3.7, you can call mock.seal
(https://docs.python.org/3/library/unittest.mock.html#sealing-mocks). But that doesn't really help you now, and I think it just replaces one thing to forget with another thing to forget.
We exclude three kinds of local users from the user_directory tables. At present we don't consistently exclude all three in the same places. This commit introduces a new function to gather those exclusion conditions together. Because we have to handle local and remote users in different ways, I've made that function only consider the case of remote users. It's the callers responsibility to make the local versus remote distinction clear and correct. A test fixup is required. The test now hits a path which makes db queries against the users table. The expected rows were missing, because we were using a dummy user that hadn't actually been registered. Broken-off from #10914.
We exclude three kinds of local users from the user_directory tables. At present we don't consistently exclude all three in the same places. This commit introduces a new function to gather those exclusion conditions together. Because we have to handle local and remote users in different ways, I've made that function only consider the case of remote users. It's the callers responsibility to make the local versus remote distinction clear and correct. A test fixup is required. The test now hits a path which makes db queries against the users table. The expected rows were missing, because we were using a dummy user that hadn't actually been registered. Broken-off from #10914. Changes By my reading this makes these changes: Incremental updates: * When an app service user registers or changes their profile, they will _not_ be added to the user directory. (Previously only support users were excluded). This is consistent with the logic that rebuilds the user directory. See also [the discussion here](#10914 (comment)). * When a deactivated user joins a room, they will _not_ be added to the user directory. Previously they were. (This probably never happens, but the previous source code allows it.) * When a room changes from public to private or vice versa, any deactivated or support users will _not_ be added to the directory. Previously they were. Rebuild: * When rebuilding the directory, exclude support and disabled users from room sharing tables. Previously only appservice users were excluded. * Exclude all three categories of local users when rebuilding the directory. Previously `_populate_user_directory_process_users` didn't do any exclusion.
We exclude three kinds of local users from the user_directory tables. At present we don't consistently exclude all three in the same places. This commit introduces a new function to gather those exclusion conditions together. Because we have to handle local and remote users in different ways, I've made that function only consider the case of remote users. It's the callers responsibility to make the local versus remote distinction clear and correct. A test fixup is required. The test now hits a path which makes db queries against the users table. The expected rows were missing, because we were using a dummy user that hadn't actually been registered. Broken-off from #10914. ---- By my reading this makes these changes: * When an app service user registers or changes their profile, they will _not_ be added to the user directory. (Previously only support and deactivated users were excluded). This is consistent with the logic that rebuilds the user directory. See also [the discussion here](#10914 (comment)). * When rebuilding the directory, exclude support and disabled users from room sharing tables. Previously only appservice users were excluded. * Exclude all three categories of local users when rebuilding the directory. Previously `_populate_user_directory_process_users` didn't do any exclusion.
* Introduce `should_include_local_users_in_dir` We exclude three kinds of local users from the user_directory tables. At present we don't consistently exclude all three in the same places. This commit introduces a new function to gather those exclusion conditions together. Because we have to handle local and remote users in different ways, I've made that function only consider the case of remote users. It's the caller's responsibility to make the local versus remote distinction clear and correct. A test fixup is required. The test now hits a path which makes db queries against the users table. The expected rows were missing, because we were using a dummy user that hadn't actually been registered. We also add new test cases to covert the exclusion logic. ---- By my reading this makes these changes: * When an app service user registers or changes their profile, they will _not_ be added to the user directory. (Previously only support and deactivated users were excluded). This is consistent with the logic that rebuilds the user directory. See also [the discussion here](#10914 (comment)). * When rebuilding the directory, exclude support and disabled users from room sharing tables. Previously only appservice users were excluded. * Exclude all three categories of local users when rebuilding the directory. Previously `_populate_user_directory_process_users` didn't do any exclusion. Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Gather the logic which determines if we should exclude a user from the directory into one place. I think this helps to make the code a lot more followable!
I think of this primarily as a tidy-up. But it does have a perceptible change. With this patch, rebuilding the directory will not include support and deactivated users, but it did before AFAICS.