justinormont/HashingTrick.js

HashingTrick.js

Implements feature hashing, also known as the hashing trick, a fast and space-efficient way of vectorizing features. Converts tokenized strings into a sparse feature vector.
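
To make the idea concrete, here is a minimal sketch of feature hashing in plain JavaScript. It is not this library's implementation: the `fnv1a` and `hashFeatures` names are hypothetical, and FNV-1a is an assumed stand-in for whatever hash function HashingTrick.js actually uses. Each token is hashed, the hash is reduced to an index in a vector of 2^bits slots, and the count at that index is incremented.

```javascript
// Illustrative sketch only; not the HashingTrick.js implementation.
// FNV-1a (32-bit) over UTF-16 code units, assumed here as a stand-in hash.
function fnv1a(str) {
  var hash = 0x811c9dc5; // FNV offset basis
  for (var i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, keeping 32 bits
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Hypothetical helper: maps tokens to a sparse vector with 2^bits possible indices.
function hashFeatures(tokens, bits) {
  var size = Math.pow(2, bits);
  var sparse = {};
  tokens.forEach(function (token) {
    var index = fnv1a(token) % size; // fold the hash into the vector range
    sparse[index] = (sparse[index] || 0) + 1; // colliding tokens share a slot
  });
  return sparse;
}

console.log(hashFeatures(['hello', 'world', 'hello'], 18));
```

Because distinct tokens can land on the same index, a larger `bits` value trades memory for fewer collisions.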

Installation

  npm install hashingtrick.js --save

Usage

  var featureHashing = require('hashingtrick.js');
  var featureHasher = featureHashing.newFeatureHasher(18); // Feature vector will be 2^18 elements

  var stringsToHash = ['hello', 'world'];
  
  // Hash each token (for example, a word or an n-gram)
  stringsToHash.forEach(function(str){ featureHasher.add(str); });
  
  console.log('sparseFeatureVector =', featureHasher.sparseFeatureVector());
  
  console.log('\n');
  console.log('Stats:');
  console.log('sparseLength =', featureHasher.sparseLength());
  console.log('length =', featureHasher.length());
  console.log('fillRatio =', featureHasher.fillRatio());
  console.log('collisions =', featureHasher.collisions());
  console.log('collisionRatio =', featureHasher.collisionRatio());
  console.log('valueCount =', featureHasher.valueCount());
  
  // -- Output --
  // sparseFeatureVector = { '13799': 1, '247186': 1 }
  //
  // Stats:
  // sparseLength = 2
  // length = 262144
  // fillRatio = 131072
  // collisions = 0
  // collisionRatio = 0
  // valueCount = 2
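
Many numeric libraries expect a dense array rather than an index-to-value object. The sketch below shows one way to expand the sparse output above into a dense typed array. The `toDense` helper is hypothetical and not part of HashingTrick.js; the sparse object and length are copied from the sample output.

```javascript
// Hypothetical helper, not part of HashingTrick.js.
// Expands an { index: value } object into a dense Float32Array of the given length.
function toDense(sparse, length) {
  var dense = new Float32Array(length); // all slots start at 0
  Object.keys(sparse).forEach(function (key) {
    dense[Number(key)] = sparse[key];
  });
  return dense;
}

// Values taken from the sample output above.
var sparse = { '13799': 1, '247186': 1 };
var dense = toDense(sparse, 262144);

console.log(dense.length, dense[13799], dense[247186]);
```

The sparse form is usually the better choice for storage, since only the non-zero slots take up space.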

Tests

  npm test

Release History

  • 1.0.0 Initial release
