Cardinality estimation using HyperLogLog algorithm

The HyperLogLog algorithm estimates the cardinality of a data set (i.e. the number of distinct elements in it) without storing the elements themselves, as a naive unique count implementation would have to. Achieving a high degree of accuracy with a low memory footprint depends on choosing a good hash algorithm.
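
To make the idea concrete, here is a minimal, self-contained sketch of a HyperLogLog estimator. It is an illustration only and not this package's API: the names `hash32` and `createHyperLogLog`, the use of SHA-1 as the hash, and the default of 4096 registers are all assumptions made for the example. It also omits the large-range correction described in the paper.

```javascript
'use strict';
const crypto = require('crypto');

// Assumption: a 32-bit hash taken from the first 4 bytes of a SHA-1 digest.
// Any hash that spreads its output bits uniformly would serve.
function hash32(value) {
  return crypto.createHash('sha1').update(String(value)).digest().readUInt32BE(0);
}

// Hypothetical factory; p is the number of index bits (intended range 7..16).
function createHyperLogLog(p = 12) {
  const m = 1 << p;                        // number of registers
  const registers = new Uint8Array(m);     // one small counter per register
  const alpha = 0.7213 / (1 + 1.079 / m);  // bias constant, valid for m >= 128

  return {
    add(value) {
      const h = hash32(value);
      const index = h >>> (32 - p);        // first p bits choose a register
      const rest = (h << p) >>> 0;         // remaining bits, left-aligned
      // Rank: 1-based position of the leftmost 1-bit in the remaining bits.
      const rank = rest === 0 ? 32 - p + 1 : Math.clz32(rest) + 1;
      if (rank > registers[index]) registers[index] = rank;
    },
    count() {
      let sum = 0;
      let zeros = 0;
      for (const r of registers) {
        sum += Math.pow(2, -r);
        if (r === 0) zeros++;
      }
      let estimate = alpha * m * m / sum;  // normalized harmonic mean
      if (estimate <= 2.5 * m && zeros > 0) {
        // Small-range correction: fall back to linear counting.
        estimate = m * Math.log(m / zeros);
      }
      return Math.round(estimate);
    },
  };
}

// Usage: 10000 distinct values, each added twice.
const counter = createHyperLogLog();
for (let round = 0; round < 2; round++) {
  for (let i = 0; i < 10000; i++) counter.add('user-' + i);
}
console.log(counter.count()); // an estimate of the 10000 distinct values, not an exact count
```

The memory used is the 4096 one-byte registers regardless of how many elements are added, and adding a duplicate never changes a register, which is why repeated values do not inflate the estimate.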

Installation

npm install cardinality

Usage

Extending

Recognizing that other people might not use the algorithm exactly as I do, I have tried to preserve the integrity of the core algorithm while letting users extend many pieces of the implementation. In particular, the hash algorithm and the storage mechanism are designed to be replaced in a modular fashion.
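
As a rough illustration of what replaceable hash and storage pieces can look like, the sketch below passes both into a counter as options. This is not this package's actual interface: `createCounter`, `fnv1a`, `memoryStorage`, and the `precision`, `hash` and `storage` option names are hypothetical, chosen only to show the pattern.

```javascript
'use strict';

// Hypothetical hash plug-in: 32-bit FNV-1a over the string form of the value.
function fnv1a(value) {
  const s = String(value);
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Hypothetical storage plug-in: registers held in memory.
function memoryStorage(size) {
  const registers = new Uint8Array(size);
  return {
    get: (i) => registers[i],
    set: (i, v) => { registers[i] = v; },
  };
}

// Hypothetical core: it relies only on hash(value) returning a 32-bit integer
// and on storage(size) returning an object with get(i) and set(i, v).
function createCounter({ precision = 10, hash = fnv1a, storage = memoryStorage } = {}) {
  const m = 1 << precision;
  const store = storage(m);
  return {
    add(value) {
      const h = hash(value) >>> 0;
      const index = h >>> (32 - precision);
      const rest = (h << precision) >>> 0;
      const rank = rest === 0 ? 32 - precision + 1 : Math.clz32(rest) + 1;
      if (rank > store.get(index)) store.set(index, rank);
    },
    count() {
      let sum = 0;
      let zeros = 0;
      for (let i = 0; i < m; i++) {
        const r = store.get(i);
        sum += Math.pow(2, -r);
        if (r === 0) zeros++;
      }
      const raw = (0.7213 / (1 + 1.079 / m)) * m * m / sum;
      return Math.round(raw <= 2.5 * m && zeros > 0 ? m * Math.log(m / zeros) : raw);
    },
  };
}

// Swapping both pieces without touching the core: a constant hash
// (for demonstration only) and a storage wrapper that records writes.
const writes = [];
const counter = createCounter({
  hash: () => 0x80000000,
  storage: (size) => {
    const inner = memoryStorage(size);
    return { get: inner.get, set(i, v) { writes.push([i, v]); inner.set(i, v); } };
  },
});
counter.add('a');
counter.add('b');
```

Because the core only depends on those two small contracts, a storage object backed by a database or a stronger hash function could be substituted in the same way.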

Known extensions:

Credits

Many tech bloggers and scalability evangelists have been writing about HyperLogLog and related ideas recently; however, this work is principally derived from the following:

[The GitHub repo of reference PHP and JavaScript implementations of the LogLog and HyperLogLog algorithms by Vadim Semenov](http://github.com/sedictor/loglog), from which this repository was originally forked.
The paper by Philippe Flajolet, Éric Fusy, Olivier Gandouet and Frédéric Meunier entitled "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm", available at http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf as well as at blob/master/HyperLogLog.pdf for your reference.
(For future work) [A description of a minor HyperLogLog variation which provides for sliding windows of estimation](http://hal.archives-ouvertes.fr/docs/00/46/53/13/PDF/sliding_HyperLogLog.pdf)
