people auto complete: Send user ID for users having same full name. #4094

agrawal-d · 2020-05-07T11:49:35Z

Previously, our @-mention autocomplete had a bug, wherein it
failed to correctly mention the correct user if more than one user
had the same full name.

Now, if more than one users have the same full name, we now send
the user ID as well to ensure that the correct user is
@-mentioned.

agrawal-d · 2020-05-07T11:50:36Z

@chrisbobbe, @gnprice, @ray-kraesig this PR is ready for a review.

chrisbobbe · 2020-05-07T23:04:56Z

src/autocomplete/PeopleAutocomplete.js

@@ -32,6 +32,12 @@ class PeopleAutocomplete extends PureComponent<Props> {
    const { users, onAutocomplete } = this.props;
    const user = users.find(x => x.email === email);
    if (user) {
+      // If another user with the same full name is found, we send the
+      // user ID as well, to ensure the mentioned user is uniquely identified.
+      if (users.find(x => x.full_name === user.full_name && x.user_id !== user.user_id)) {


I expect this will look different when #3339 is resolved, but it looks good to me as a fix before then.

agrawal-d · 2020-05-08T13:15:51Z

@chrisbobbe I've pushed some changes.

This version uses binary search and is faster than the linear one.
On CZO, the linear scan takes ~18ms or more.
This version takes 1ms or less.

chrisbobbe · 2020-05-14T23:17:38Z

Hmm, interesting! I actually didn't have binary search in mind when I commented at #4094 (comment).

Instead, I was thinking that we wouldn't be working with arrays here anymore, and instead something like an Immutable.Map (now much more possible with #4047 (comment) merged).

At #3339, @gnprice said:

Instead, we should make these lookups through efficient, indexed data structures. A JS object works great. A Map works equally great, and can be a bit cleaner in the code.

However! The natural thing to do for #3339, to get users using an "efficient, indexed data structure" is, I think to do an Immutable.Map where keys are user IDs — not full names, which is what we care about here. So unless we have a separate index for full names, we're kind of worse off than we are with an array; we'd be awkwardly (and still linearly) traversing all entries in the user ID-keyed Map.

One solution, as you've shown here, is to use binary search on an array. (One step that I think is missing currently is to ensure users is sorted by full_name before passing it to binarySearch.) Better than linear, it's true. But a lookup in an Immutable.Map, as I understand it, is near-constant.

@gnprice, as someone who used to do research on algorithms, will know what to do here. But the one thing that hasn't come up in my discussions with him is the need to maintain indexes on things besides a canonical, unique ID.

agrawal-d · 2020-05-15T16:58:01Z

Thanks for taking a look @chrisbobbe

Instead, I was thinking that we wouldn't be working with arrays here anymore, and instead something like an Immutable.Map (now much more possible with #4047 (comment) merged).

Yep, I did consider maps before. But JavaScript does not support multiMaps, so if we attempt to hash with full_name as keys, the last value will be overwritten. So we can't use a Map in this use case, as you explained above. That is why I implemented a binary search.

And if we hash with user_id as keys, the map iteration will be O(n). So I thought of using binary search, as users sorted by full_name is already available, thus saving the O(nlogn) overhead of sorting it.

One step that I think is missing currently is to ensure users is sorted by full_name before passing it to binarySearch

users is sorted:

zulip-mobile/src/autocomplete/PeopleAutocomplete.js

Line 101 in 09c5b66

users: getSortedUsers(state),

rk-for-zulip

@agrawal-d wrote:

Yep, I did consider maps before. But JavaScript does not support multiMaps, so if we attempt to hash with full_name as keys, the last value will be overwritten.

Disregarding possible variations in their definition of "insertion order" – which we're not concerned with – Multimap<K, V> is isomorphic to Map<K, Array<V>>.

But, as you note, we already have a sorted list.

The usual way to do this, I think, would be lower_bound and upper_bound functions, taking a candidate value k and returning the index to the first key not less than k (for lower_bound) or greater than k (for upper_bound). (These functions can be implemented almost identically to binary search, except that they should treat equality as the same as greater-than or less-than (respectively) when dispatching, and they should always return something.)

If you have those, const lo = foo.lower_bound(...), hi = foo.upper_bound(...) gives exactly the range [lo, hi) of values with keys equivalent to k.

However, we don't happen to have those handy, and they're not really necessary here. You can instead just do a binary search to find any matching user, and then check if either the preceding or following users' name also matches.

rk-for-zulip · 2020-05-18T22:42:13Z

src/utils/__tests__/binarySearch-test.js

+  const comparator = (element: number) => {
+    if (element < toFind) {
+      return -1;
+    } else if (element > toFind) {
+      return 1;
+    }
+    return 0;
+  };


Contextually this is equivalent to element - toFind.

rk-for-zulip · 2020-05-18T22:50:29Z

src/utils/binarySearch.js

+ */
+const binarySearch = <T>(
+  array: Array<T>,
+  comparator: (elem: T) => number,


I'd prefer this to be identical to what you'd pass to sort, accepting the value to be found as another parameter.

This is technically less flexible (for example, as written, you could pass a comparator that would find and accept any value starting with "M"), so I could understand supplying an auxiliary function to build the search-comparator from a sort-comparator and a value instead.

(EDIT: This was written before I saw the next commit.)

rk-for-zulip · 2020-05-18T23:04:58Z

src/autocomplete/PeopleAutocomplete.js

+      const idx = binarySearch(users, elem => {
+        if (elem.full_name === user.full_name && elem.user_id === user.user_id) {
+          return 1;
+        }
+        return elem.full_name.toLowerCase().localeCompare(user.full_name.toLowerCase());
+      });


This won't work: the given comparator doesn't describe a consistent ordering.

Consider user = { full_name: "John Doe", user_id: 4, ... }, and a users array matching the following:

users =~ [ { full_name: "A", user_id: 0 }, // -1 { full_name: "B", user_id: 1 }, // -1 { full_name: "C", user_id: 2 }, // -1 { full_name: "D", user_id: 3 }, // -1 { full_name: "John Doe", user_id: 4 }, // 1 { full_name: "John Doe", user_id: 5 }, // 0 { full_name: "X", user_id: 6 }, // 1 { full_name: "Y", user_id: 7 }, // 1 { full_name: "Z", user_id: 8 }, // 1 ];

(Trailing comments show the expected return value of the comparator for each element.)

binary_search will check users[4] first, which is the user in question; but the comparator will return 1, so it will limit further searches to the range [0, 3], and therefore never find { full_name: "John Doe", user_id: 5 }.

agrawal-d · 2020-05-26T06:18:24Z

@rk-for-zulip thanks for the review. I've made some changes to fix the issues you pointed out. Can you take a look again?

gnprice · 2020-05-26T23:27:39Z

Seeing this add an implementaion of binary search, my first reaction is that binary search is notoriously tricky to implement completely correctly. (Perhaps I'll go make a thread in #learning about that story -- it might be fun to read about. edit: here)

So I'm inclined to look for other solutions for that reason.

From an algorithmic perspective, one thing I notice here is that this is one of several places that we do a linear pass through this same array. If you look at getAutocompleteSuggestion, which is called in the render method of this component:

  const allAutocompleteOptions = getUsersAndWildcards(users);
  const startWith = filterUserStartWith(allAutocompleteOptions, filter, ownEmail);
  const initials = filterUserByInitials(allAutocompleteOptions, filter, ownEmail);
  const contains = filterUserThatContains(allAutocompleteOptions, filter, ownEmail);
  const matchesEmail = filterUserMatchesEmail(users, filter, ownEmail);

Each of those five lines represents a linear pass through all this data. The first one, in fact, actually allocates a copy:

export const getUsersAndWildcards = (users: User[]) => [
  { ...NULL_USER, full_name: 'all', email: '(Notify everyone)' },
  { ...NULL_USER, full_name: 'everyone', email: '(Notify everyone)' },
  ...users,
];

which tends to be especially expensive.

So all of those together are going to do much more work than we'd do by a linear search here. In algorithmic terms, that means the most we could gain by doing something fancier is a constant factor -- O(n) + O(log n) is still O(n), just like O(n) + O(n) is.

And actually the situation is sharper than that -- we're doing all that work every time the user types another letter, to search again with the new query. This code, on the other hand, only runs once at the end when the user chooses an item from the list.

So, let's just not do anything fancy here for now. 🙂

I do find, empirically on czo, that the people autocomplete can be pretty slow to respond with results. That would be good to improve, and it might involve using some more-clever algorithms and/or data structures in getAutocompleteSuggestion and its helpers. Happy to talk more about that on #mobile-team, if you're interested.

agrawal-d · 2020-05-27T09:50:38Z

@gnprice Thanks for the review, I've reverted it back to the simple version. Can you take a look again?

CC @chrisbobbe

chrisbobbe · 2020-05-29T21:49:01Z

Great, thanks! This looks good to me, though it just occurred to me that it would be good to refer to an authoritative source on the mention syntax, in the code comment you've added. I believe the get_mention_syntax function in static/js/people.js is what we want here.

agrawal-d · 2020-06-01T06:38:55Z

@chrisbobbe I've updated the commit message.

Previously, our @-mention autocomplete had a bug, wherein it failed to correctly mention the specific user if more than one users had the same full name. Now, if more than one users have the same full name, we send the user ID as well to ensure that the correct user is @-mentioned. The user ID is not sent for every @-mention to ensure the behaviour is consistent with the web app, i.e., we send it only in case of ambiguity. See the `get_mention_syntax` function in `static/js/people.js` in the webapp source for the origin of this syntax.

chrisbobbe · 2020-07-02T19:33:37Z

Merged, thanks, @agrawal-d! I also made a small tweak: adding the reference to the web app in a code comment, in addition to the commit message.

agrawal-d requested review from chrisbobbe, gnprice and rk-for-zulip May 7, 2020 11:49

agrawal-d force-pushed the mentions branch 2 times, most recently from fcf4f0b to 5827769 Compare May 7, 2020 11:55

agrawal-d added the a-compose/send Compose box, autocomplete, camera/upload, outbox, sending label May 7, 2020

chrisbobbe reviewed May 7, 2020

View reviewed changes

agrawal-d force-pushed the mentions branch from 5827769 to 9da8b4b Compare May 8, 2020 13:13

agrawal-d requested a review from chrisbobbe May 8, 2020 13:31

agrawal-d force-pushed the mentions branch from 9da8b4b to 09c5b66 Compare May 8, 2020 15:05

rk-for-zulip reviewed May 18, 2020

View reviewed changes

agrawal-d force-pushed the mentions branch 2 times, most recently from 0873981 to d9c6ddb Compare May 26, 2020 06:14

agrawal-d requested a review from rk-for-zulip May 26, 2020 06:17

agrawal-d force-pushed the mentions branch from d9c6ddb to 8fdc5e1 Compare May 27, 2020 09:49

agrawal-d force-pushed the mentions branch from 8fdc5e1 to 30279aa Compare June 1, 2020 06:38

chrisbobbe force-pushed the mentions branch from 30279aa to ebcaae2 Compare July 2, 2020 19:32

chrisbobbe merged commit ebcaae2 into zulip:master Jul 2, 2020

chrisbobbe mentioned this pull request Apr 21, 2021

Create extended user mention syntax when needed. #3000

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

people auto complete: Send user ID for users having same full name. #4094

people auto complete: Send user ID for users having same full name. #4094

agrawal-d commented May 7, 2020 •

edited

Loading

agrawal-d commented May 7, 2020

chrisbobbe May 7, 2020

agrawal-d commented May 8, 2020 •

edited

Loading

chrisbobbe commented May 14, 2020 •

edited

Loading

agrawal-d commented May 15, 2020 •

edited

Loading

rk-for-zulip left a comment

rk-for-zulip May 18, 2020

agrawal-d May 26, 2020

rk-for-zulip May 18, 2020

rk-for-zulip May 18, 2020

agrawal-d May 26, 2020

agrawal-d commented May 26, 2020

gnprice commented May 26, 2020 •

edited

Loading

agrawal-d commented May 27, 2020

chrisbobbe commented May 29, 2020 •

edited

Loading

agrawal-d commented Jun 1, 2020

chrisbobbe commented Jul 2, 2020

people auto complete: Send user ID for users having same full name. #4094

people auto complete: Send user ID for users having same full name. #4094

Conversation

agrawal-d commented May 7, 2020 • edited Loading

agrawal-d commented May 7, 2020

chrisbobbe May 7, 2020

Choose a reason for hiding this comment

agrawal-d commented May 8, 2020 • edited Loading

chrisbobbe commented May 14, 2020 • edited Loading

agrawal-d commented May 15, 2020 • edited Loading

rk-for-zulip left a comment

Choose a reason for hiding this comment

rk-for-zulip May 18, 2020

Choose a reason for hiding this comment

agrawal-d May 26, 2020

Choose a reason for hiding this comment

rk-for-zulip May 18, 2020

Choose a reason for hiding this comment

rk-for-zulip May 18, 2020

Choose a reason for hiding this comment

agrawal-d May 26, 2020

Choose a reason for hiding this comment

agrawal-d commented May 26, 2020

gnprice commented May 26, 2020 • edited Loading

agrawal-d commented May 27, 2020

chrisbobbe commented May 29, 2020 • edited Loading

agrawal-d commented Jun 1, 2020

chrisbobbe commented Jul 2, 2020

agrawal-d commented May 7, 2020 •

edited

Loading

agrawal-d commented May 8, 2020 •

edited

Loading

chrisbobbe commented May 14, 2020 •

edited

Loading

agrawal-d commented May 15, 2020 •

edited

Loading

gnprice commented May 26, 2020 •

edited

Loading

chrisbobbe commented May 29, 2020 •

edited

Loading