pr includes:
everything works correctly but it's all a bit jank:
i'm computing dot products one by one on typed arrays (which are much faster than normal arrays, but still). this could easily be vectorized in numpy (pack the embeddings into an embedding matrix, then multiply that matrix by the current embedding to get all the similarities at once), but js doesn't have a good library for it.
i'll probably make a little python thing to compute these, and it'll probably be useful if there's any other data-related stuff we want to do in the future.
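until that exists, here's a rough sketch of the same matrix layout in plain js. it isn't real vectorization (still scalar loops), it just packs everything into one flat `Float32Array` and normalizes rows once up front, so each query is a single matrix-vector pass. `packEmbeddings` and `cosineSimilarities` are hypothetical names, not functions from this pr, and it assumes every embedding has the same length `dim`.

```javascript
// Hypothetical helpers, not part of this PR. Assumes all embeddings share length `dim`.

// Pack embeddings into one row-major Float32Array, scaling each row to unit length
// so a dot product against a normalized query equals the cosine similarity.
function packEmbeddings(embeddings, dim) {
  const matrix = new Float32Array(embeddings.length * dim);
  embeddings.forEach((embedding, row) => {
    let norm = 0;
    for (let j = 0; j < dim; j++) norm += embedding[j] * embedding[j];
    norm = Math.sqrt(norm) || 1; // an all-zero row stays all zeros
    for (let j = 0; j < dim; j++) matrix[row * dim + j] = embedding[j] / norm;
  });
  return matrix;
}

// Matrix-vector product: returns one cosine similarity per packed row.
function cosineSimilarities(matrix, query, dim) {
  let queryNorm = 0;
  for (let j = 0; j < dim; j++) queryNorm += query[j] * query[j];
  queryNorm = Math.sqrt(queryNorm) || 1;

  const rows = matrix.length / dim;
  const sims = new Float32Array(rows);
  for (let i = 0; i < rows; i++) {
    const offset = i * dim;
    let dot = 0;
    for (let j = 0; j < dim; j++) dot += matrix[offset + j] * query[j];
    sims[i] = dot / queryNorm;
  }
  return sims;
}
```

usage would be something like `cosineSimilarities(packEmbeddings(all, dim), current, dim)`, with the packed matrix built once and reused across queries.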
i should be implementing a top-k selection rather than sorting the entire array of similarities. it would be much faster (the algorithm i'm imagining would be O(n log k), the dead simple one is just three linear scans of the array), but at the moment the dominant cost is computing the cosine similarities.
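a minimal sketch of the O(n log k) version, assuming the similarities are already in an array or typed array. it keeps a min-heap of at most k indices, so the weakest current candidate sits at the root and gets evicted whenever a better score shows up. `topK` is a hypothetical name, not something in this pr.

```javascript
// Hypothetical helper, not part of this PR.
// Returns the indices of the k largest values in `sims`, highest first.
function topK(sims, k) {
  const heap = []; // min-heap of indices, ordered by sims[index]
  const less = (a, b) => sims[a] < sims[b];
  const swap = (i, j) => {
    const tmp = heap[i];
    heap[i] = heap[j];
    heap[j] = tmp;
  };
  const siftUp = (i) => {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (!less(heap[i], heap[parent])) break;
      swap(i, parent);
      i = parent;
    }
  };
  const siftDown = (i) => {
    for (;;) {
      const left = 2 * i + 1;
      const right = left + 1;
      let smallest = i;
      if (left < heap.length && less(heap[left], heap[smallest])) smallest = left;
      if (right < heap.length && less(heap[right], heap[smallest])) smallest = right;
      if (smallest === i) break;
      swap(i, smallest);
      i = smallest;
    }
  };

  for (let i = 0; i < sims.length; i++) {
    if (heap.length < k) {
      // Heap not full yet: always admit the candidate.
      heap.push(i);
      siftUp(heap.length - 1);
    } else if (k > 0 && sims[i] > sims[heap[0]]) {
      // Beats the weakest kept candidate: replace the root and restore heap order.
      heap[0] = i;
      siftDown(0);
    }
  }
  // Only k items remain, so this final sort is O(k log k).
  return heap.sort((a, b) => sims[b] - sims[a]);
}
```

this returns indices rather than scores so the caller can map back to whatever each embedding belongs to.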
not entirely sure what the best ui is for displaying these within updately's design system. would need some help on this.