-
-
Notifications
You must be signed in to change notification settings - Fork 651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
people auto complete: Send user ID for users having same full name. #4094
Conversation
@chrisbobbe, @gnprice, @ray-kraesig this PR is ready for a review. |
fcf4f0b
to
5827769
Compare
@@ -32,6 +32,12 @@ class PeopleAutocomplete extends PureComponent<Props> { | |||
const { users, onAutocomplete } = this.props; | |||
const user = users.find(x => x.email === email); | |||
if (user) { | |||
// If another user with the same full name is found, we send the | |||
// user ID as well, to ensure the mentioned user is uniquely identified. | |||
if (users.find(x => x.full_name === user.full_name && x.user_id !== user.user_id)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect this will look different when #3339 is resolved, but it looks good to me as a fix before then.
@chrisbobbe I've pushed some changes. This version uses binary search and is faster than the linear one. |
Hmm, interesting! I actually didn't have binary search in mind when I commented at #4094 (comment). Instead, I was thinking that we wouldn't be working with arrays here anymore, and instead something like an
However! The natural thing to do for #3339, to get One solution, as you've shown here, is to use binary search on an array. (One step that I think is missing currently is to ensure @gnprice, as someone who used to do research on algorithms, will know what to do here. But the one thing that hasn't come up in my discussions with him is the need to maintain indexes on things besides a canonical, unique ID. |
Thanks for taking a look @chrisbobbe
Yep, I did consider maps before. But JavaScript does not support multiMaps, so if we attempt to hash with And if we hash with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agrawal-d wrote:
Yep, I did consider maps before. But JavaScript does not support multiMaps, so if we attempt to hash with
full_name
as keys, the last value will be overwritten.
Disregarding possible variations in their definition of "insertion order" – which we're not concerned with – Multimap<K, V>
is isomorphic to Map<K, Array<V>>
.
But, as you note, we already have a sorted list.
The usual way to do this, I think, would be lower_bound
and upper_bound
functions, taking a candidate value k and returning the index to the first key not less than k (for lower_bound
) or greater than k (for upper_bound
). (These functions can be implemented almost identically to binary search, except that they should treat equality as the same as greater-than or less-than (respectively) when dispatching, and they should always return something.)
If you have those, const lo = foo.lower_bound(...), hi = foo.upper_bound(...)
gives exactly the range [lo, hi)
of values with keys equivalent to k.
However, we don't happen to have those handy, and they're not really necessary here. You can instead just do a binary search to find any matching user, and then check if either the preceding or following users' name also matches.
const comparator = (element: number) => { | ||
if (element < toFind) { | ||
return -1; | ||
} else if (element > toFind) { | ||
return 1; | ||
} | ||
return 0; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Contextually this is equivalent to element - toFind
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
src/utils/binarySearch.js
Outdated
*/ | ||
const binarySearch = <T>( | ||
array: Array<T>, | ||
comparator: (elem: T) => number, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer this to be identical to what you'd pass to sort
, accepting the value to be found as another parameter.
This is technically less flexible (for example, as written, you could pass a comparator that would find and accept any value starting with "M"), so I could understand supplying an auxiliary function to build the search-comparator from a sort-comparator and a value instead.
(EDIT: This was written before I saw the next commit.)
const idx = binarySearch(users, elem => { | ||
if (elem.full_name === user.full_name && elem.user_id === user.user_id) { | ||
return 1; | ||
} | ||
return elem.full_name.toLowerCase().localeCompare(user.full_name.toLowerCase()); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work: the given comparator doesn't describe a consistent ordering.
Consider user = { full_name: "John Doe", user_id: 4, ... }
, and a users
array matching the following:
users =~ [
{ full_name: "A", user_id: 0 }, // -1
{ full_name: "B", user_id: 1 }, // -1
{ full_name: "C", user_id: 2 }, // -1
{ full_name: "D", user_id: 3 }, // -1
{ full_name: "John Doe", user_id: 4 }, // 1
{ full_name: "John Doe", user_id: 5 }, // 0
{ full_name: "X", user_id: 6 }, // 1
{ full_name: "Y", user_id: 7 }, // 1
{ full_name: "Z", user_id: 8 }, // 1
];
(Trailing comments show the expected return value of the comparator for each element.)
binary_search
will check users[4]
first, which is the user in question; but the comparator will return 1
, so it will limit further searches to the range [0, 3]
, and therefore never find { full_name: "John Doe", user_id: 5 }
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed now.
0873981
to
d9c6ddb
Compare
@rk-for-zulip thanks for the review. I've made some changes to fix the issues you pointed out. Can you take a look again? |
Seeing this add an implementaion of binary search, my first reaction is that binary search is notoriously tricky to implement completely correctly. (Perhaps I'll go make a thread in #learning about that story -- it might be fun to read about. edit: here) So I'm inclined to look for other solutions for that reason. From an algorithmic perspective, one thing I notice here is that this is one of several places that we do a linear pass through this same array. If you look at const allAutocompleteOptions = getUsersAndWildcards(users);
const startWith = filterUserStartWith(allAutocompleteOptions, filter, ownEmail);
const initials = filterUserByInitials(allAutocompleteOptions, filter, ownEmail);
const contains = filterUserThatContains(allAutocompleteOptions, filter, ownEmail);
const matchesEmail = filterUserMatchesEmail(users, filter, ownEmail); Each of those five lines represents a linear pass through all this data. The first one, in fact, actually allocates a copy: export const getUsersAndWildcards = (users: User[]) => [
{ ...NULL_USER, full_name: 'all', email: '(Notify everyone)' },
{ ...NULL_USER, full_name: 'everyone', email: '(Notify everyone)' },
...users,
]; which tends to be especially expensive. So all of those together are going to do much more work than we'd do by a linear search here. In algorithmic terms, that means the most we could gain by doing something fancier is a constant factor -- O(n) + O(log n) is still O(n), just like O(n) + O(n) is. And actually the situation is sharper than that -- we're doing all that work every time the user types another letter, to search again with the new query. This code, on the other hand, only runs once at the end when the user chooses an item from the list. So, let's just not do anything fancy here for now. 🙂 I do find, empirically on czo, that the people autocomplete can be pretty slow to respond with results. That would be good to improve, and it might involve using some more-clever algorithms and/or data structures in |
@gnprice Thanks for the review, I've reverted it back to the simple version. Can you take a look again? CC @chrisbobbe |
Great, thanks! This looks good to me, though it just occurred to me that it would be good to refer to an authoritative source on the mention syntax, in the code comment you've added. I believe the |
@chrisbobbe I've updated the commit message. |
Previously, our @-mention autocomplete had a bug, wherein it failed to correctly mention the specific user if more than one users had the same full name. Now, if more than one users have the same full name, we send the user ID as well to ensure that the correct user is @-mentioned. The user ID is not sent for every @-mention to ensure the behaviour is consistent with the web app, i.e., we send it only in case of ambiguity. See the `get_mention_syntax` function in `static/js/people.js` in the webapp source for the origin of this syntax.
Merged, thanks, @agrawal-d! I also made a small tweak: adding the reference to the web app in a code comment, in addition to the commit message. |
Previously, our @-mention autocomplete had a bug, wherein it
failed to correctly mention the correct user if more than one user
had the same full name.
Now, if more than one users have the same full name, we now send
the user ID as well to ensure that the correct user is
@-mentioned.