forked from tsaost/nilsimsa

This package is a Go implementation of Nilsimsa, an anti-spam focused locality-sensitive hashing algorithm.

chikamim/nilsimsa

 
 

nilsimsa

Package nilsimsa is a Go implementation of Nilsimsa, ported from code.google.com/p/py-nilsimsa, but it follows the conventions established by the md5 package in the standard library.
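For readers unfamiliar with the md5 package conventions mentioned above, the following is a minimal sketch using only the standard library's crypto/md5. It shows the two usual entry points: a one-shot Sum over a byte slice, and an incremental hash.Hash obtained from New. The helper names oneShot and streaming are hypothetical, and whether nilsimsa exposes matching New and Sum functions is an assumption here; this README only shows HexSum and DiffHexScore.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
)

// oneShot hashes a complete byte slice with the package-level Sum function.
// (Hypothetical helper name, for illustration only.)
func oneShot(data []byte) string {
	sum := md5.Sum(data) // returns a [md5.Size]byte array
	return hex.EncodeToString(sum[:])
}

// streaming feeds the input piece by piece through the hash.Hash interface.
// (Hypothetical helper name, for illustration only.)
func streaming(parts ...string) string {
	h := md5.New()
	for _, p := range parts {
		io.WriteString(h, p) // hash.Hash writes never return an error
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Both calls hash the same bytes, so they print the same digest.
	fmt.Println(oneShot([]byte("Hello nilsimsa!")))
	fmt.Println(streaming("Hello ", "nilsimsa!"))
}
```

The one-shot form suits data already held in memory, while the streaming form suits input that arrives in chunks.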

The Java implementation at https://github.com/weblyzard/nilsimsa/blob/master/src/main/java/com/weblyzard/lib/string/nilsimsa/Nilsimsa.java was also consulted.

For a discussion of using hashes to score string similarity, see http://stackoverflow.com/questions/4323977/string-similarity-score-hash

Copyright 2015 Sheng-Te Tsao. All rights reserved. Use of this source code is governed by the same BSD-style license that is used by the Go standard library.

Examples

fmt.Println(nilsimsa.HexSum([]byte("Hello nilsimsa!")))
// Output: 436119240183882801210e002e1cb00122a20d11b4268ab8001a51190c08084b

h1 := nilsimsa.HexSum([]byte("Hello nilsimsa!"))
h2 := nilsimsa.HexSum([]byte("Hello world!"))
h3 := nilsimsa.HexSum([]byte("Nobody Expects the Spanish Inquisition"))

fmt.Println(nilsimsa.DiffHexScore(h1, h1))
fmt.Println(nilsimsa.DiffHexScore(h1, h2))
fmt.Println(nilsimsa.DiffHexScore(h1, h3))
// Output:
// 1
// 0.4094488188976378
// 0.14960629921259844
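Nilsimsa digests are 256 bits long (64 hex characters), and similarity between two digests is generally judged by how many bit positions differ. The sketch below counts those differing bits using only the standard library. The function name hammingHex is hypothetical and is not part of this package; how DiffHexScore turns such a count into the scores shown above is not stated in this README, so no mapping is assumed here.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"math/bits"
)

// hammingHex returns the number of differing bits between two
// hex-encoded digests of equal length.
// (Hypothetical helper; not part of the nilsimsa package.)
func hammingHex(a, b string) (int, error) {
	x, err := hex.DecodeString(a)
	if err != nil {
		return 0, err
	}
	y, err := hex.DecodeString(b)
	if err != nil {
		return 0, err
	}
	if len(x) != len(y) {
		return 0, errors.New("digests differ in length")
	}
	diff := 0
	for i := range x {
		// XOR leaves a 1 in every position where the two bytes disagree.
		diff += bits.OnesCount8(x[i] ^ y[i])
	}
	return diff, nil
}

func main() {
	// Digest of "Hello nilsimsa!" taken from the first example above.
	h1 := "436119240183882801210e002e1cb00122a20d11b4268ab8001a51190c08084b"

	d, err := hammingHex(h1, h1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d) // identical digests differ in 0 bits
}
```

A smaller count means the two inputs are more alike; identical digests always give zero.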
