Skip to content

Conversation

oluckyman
Copy link
Contributor

Inputs.search does not work with Cyrillic queries due to the problem with the word boundary assertion (\b). It supports only latin symbols:

image

This PR introduces negative lookbehind (?<!\S) instead of \b to detect a word
boundary.

Use negative lookbehind `(?<!\S)` instead of `\b` to detect word
boundary. Because `\b` does not support non-latin symbols.
@mootari
Copy link
Member

mootari commented Feb 25, 2022

Safari does not support negative look behind.

Also add unit test and snapshot for queries with Cyrillic
@Fil
Copy link
Collaborator

Fil commented Feb 26, 2022

(string).match(/[\p{L}-]+/ug) looks like a powerful solution (found at https://stackoverflow.com/questions/150033/regular-expression-to-match-non-ascii-characters).

(We don't want to only match words separated by spaces.)

@oluckyman oluckyman changed the title Support Cyrillic letters in search Support non-ascii characters in search Feb 28, 2022
@mbostock mbostock merged commit 3e2c79e into observablehq:main Apr 15, 2022
@tophtucker
Copy link
Contributor

Thanks for fixing this a whole, um, sixteen months ago, @oluckyman. Sorry for the ridiculous delay, but the fix is finally deployed to the site; you can see test cases for the new release here: https://observablehq.com/d/4303b3734fb950d8.

There's not really any good reason for the delay, we were just focused elsewhere. We're improved our internal processes for upgrading these libraries so more people can do it (like me!) and it shouldn't be so slow / such a bottleneck in the future. 😬 🙈 😅 🙏

@mootari
Copy link
Member

mootari commented Aug 23, 2025

Looks like the "-" got included in the character class by accident? #277

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants