Consider using a search library that handles typo tolerance #764

Open
dguo opened this issue Aug 10, 2018 · 1 comment
Labels
A-Search Area: Search

Comments

@dguo
Contributor

dguo commented Aug 10, 2018

Hello. I recently finished reading the Rust book, and when I played around with the search feature, I realized that it doesn't handle typos: a misspelled query just returns no results. I read through #51, and I didn't see any mention of Fuse.js, which is like Elasticlunr but is designed to handle fuzzy searching and typos.

Would the mdBook owners be open to possibly switching to it? There might be disadvantages too, like a noticeably slower search or subjectively worse results, so it would take some investigation/experimentation.
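To make "handling typos" concrete, here is a minimal sketch of typo-tolerant term lookup using Levenshtein edit distance. This is not how Fuse.js or Elasticlunr work internally, and `levenshtein`, `fuzzy_lookup`, and `max_typos` are hypothetical names made up for illustration. It only shows why a query like "ownrship" could still match "ownership".

```rust
/// Levenshtein edit distance between two strings, counted in Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        // Turning i + 1 chars of `a` into an empty string takes i + 1 deletions.
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Hypothetical helper: keep the indexed terms within `max_typos` edits of the query.
/// A linear scan like this is only an illustration; a real index would need
/// something smarter to stay fast on a large book.
fn fuzzy_lookup<'a>(terms: &[&'a str], query: &str, max_typos: usize) -> Vec<&'a str> {
    terms
        .iter()
        .copied()
        .filter(|term| levenshtein(term, query) <= max_typos)
        .collect()
}
```

With `max_typos` set to 1, looking up "ownrship" among the terms "ownership", "borrowing", and "lifetimes" would return only "ownership". An exact-match index returns nothing for the same query, which is the behavior described above.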

@mattico
Contributor

mattico commented Aug 13, 2018

I would be open to changing the search library if it would improve the user experience. Some things to keep in mind:

  • Offline index generation: It's important that the search index can be generated at book build time, especially for large books. This (more or less) requires a Rust implementation of the indexer.
  • Internationalization: While mdBook doesn't currently have built-in support for different languages, I want to add this soon. A search library should support using different stemmers for different languages, or generally work well on non-English text.
  • Index size: Since the search index is downloaded (and hopefully cached) on each page load, it is important to keep its size down. The current elasticlunr implementation is not ideal for large books, and I'd hope a new implementation could be better. I'd assume libraries which use a bitfield index rather than a text trie would have smaller indexes.
  • Search result teasers: We currently add a teaser from the text surrounding a search result, which is quite helpful for determining if a result is relevant. A new search implementation should retain this functionality.
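As a rough illustration of the teaser requirement, the sketch below cuts a window of text around the first occurrence of a search term. This is not mdBook's actual teaser code; the `teaser` function and its `context` parameter are hypothetical, and the match is a plain case-sensitive substring search.

```rust
/// Hypothetical helper: return the first occurrence of `term` in `text` together
/// with up to `context` characters on either side, or `None` if `term` is absent.
/// Matching is case-sensitive and done on the raw text (no stemming).
fn teaser(text: &str, term: &str, context: usize) -> Option<String> {
    let start = text.find(term)?;
    let end = start + term.len();

    // Walk backwards `context` characters, staying on UTF-8 char boundaries.
    let from = text[..start]
        .char_indices()
        .rev()
        .take(context)
        .last()
        .map_or(start, |(i, _)| i);

    // Walk forwards `context` characters, or stop at the end of the text.
    let to = text[end..]
        .char_indices()
        .nth(context)
        .map_or(text.len(), |(i, _)| end + i);

    Some(text[from..to].to_string())
}
```

For example, `teaser("The borrow checker enforces ownership rules.", "checker", 4)` yields `Some("row checker enf")`. Whatever library is used, it would need to expose match positions (or enough information to recover them) for this kind of excerpt to be possible.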

Feel free to ask me if you have any questions!

@ehuss ehuss added the A-Search Area: Search label May 17, 2019
3 participants