Skip to content

wolfgarbe/LinSpell

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LinSpell

Fast Spelling correction & Approximate string search

The LinSpell spelling correction algorithm does not require edit candidate generation or specialized data structures like BK-tree or Norvig's algorithm. In most cases LinSpell is faster and requires less memory compared to BK-tree or Norvig's algorithm. LinSpell is language and character set independent.


Copyright (C) 2017 Wolf Garbe
Version: 1.0
Author: Wolf Garbe <wolf.garbe@seekstorm.com>
Maintainer: Wolf Garbe <wolf.garbe@seekstorm.com>
URL: https://github.com/wolfgarbe/linspell
Description:
https://seekstorm.com/blog/symspell-vs-bk-tree/
License:
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License, 
version 3.0 (LGPL-3.0) as published by the Free Software Foundation.
http://www.opensource.org/licenses/LGPL-3.0

Usage

single word + Enter: Display spelling suggestions
Enter without input: Terminate the program

Performance

Benchmark
Benchmark 1

Applications

  • Query correction (10–15% of queries contain misspelled terms),
  • Chatbots,
  • OCR post-processing,
  • Automated proofreading.

Frequency dictionary

The word frequency list was created by intersecting the two lists mentioned below. By reciprocally filtering only those words which appear in both lists are used. Additional filters were applied and the resulting list truncated to ≈ 80,000 most frequent words.

Blog Posts: Algorithm, Benchmarks, Applications

SymSpell vs. BK-tree: 100x faster fuzzy string search & spell checking


LinSpell is contributed by SeekStorm - the high performance Search as a Service & search API