Asynchronous MD5 calculator, employing WebAssembly for larger files

briantbutton/md5-wasm

MD5-WASM

MD5-WASM is a fast asynchronous md5 calculator, optimized for large files.  It returns a Promise which resolves to the md5 string.  WebAssembly is seamlessly applied to calculate values for files above a certain size threshold.

Highlights

● 30x faster than the most popular md5 utility 
● Server-side (NodeJS) or client-side (browser) 
● Non-blocking, uses Promise syntax 

Raison d'être  

Faster and non-blocking

Our md5 hashing was initially performed using this simple and popular utility: https://www.npmjs.com/package/md5 (called "MD5" herein). However, MD5 is synchronous, so it blocks code execution, and it is slow, impractically so for video files. (On our low-powered server platform, we clock it at about 1 second per megabyte.)

30x faster?

On larger files, yes.  Here are the benchmarks, comparing MD5 to MD5-WASM, run on our (slow) production server platform using NodeJS (v10.18.1) on Ubuntu. 

                   ELAPSED MILLISECONDS        MEGABYTES PER SECOND
                   MD5         MD5-WASM         MD5        MD5-WASM
0.2 Mbytes          260            90          1.05            2.2           
0.3 Mbytes          390           300          0.98            1.4            
0.5 Mbytes          520           360          0.98            1.4            
  1 Mbytes        1,000           170          1.00            8.5          
  2 Mbytes        2,000           240          1.00            8.5          
  4 Mbytes        4,000           330          1.00           12
  8 Mbytes        7,600           400          1.05           20
 12 Mbytes       12,400           490          0.96           24
 24 Mbytes       23,600           700          1.02           34
 37 Mbytes       38,500           990          0.96           37

On our benchmark system, MD5-WASM spends about 150 ms on WebAssembly instantiation. After that, the relative performance gap between the two keeps growing, passing 30x on the largest files tested (24 and 37 Mbytes).
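The throughput and speedup figures can be re-derived from the elapsed times in the table. The following is a minimal sketch of that arithmetic; the helper names `megabytesPerSecond` and `speedup` are illustrative and not part of MD5-WASM.

```javascript
// Throughput in megabytes per second from a size and an elapsed time in ms.
function megabytesPerSecond(megabytes, elapsedMs) {
  return megabytes / (elapsedMs / 1000);
}

// How many times faster the candidate is than the baseline.
function speedup(baselineMs, candidateMs) {
  return baselineMs / candidateMs;
}

// Values copied from the 37 Mbyte row of the benchmark table above.
const row = { megabytes: 37, md5Ms: 38500, md5WasmMs: 990 };

console.log("MD5      MB/s:", megabytesPerSecond(row.megabytes, row.md5Ms).toFixed(2));
console.log("MD5-WASM MB/s:", megabytesPerSecond(row.megabytes, row.md5WasmMs).toFixed(1));
console.log("speedup:", speedup(row.md5Ms, row.md5WasmMs).toFixed(1) + "x");
```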

Why the huge improvement?

A 3x to 5x improvement from WebAssembly would not be surprising, but 30x definitely is. For md5 calculation, WebAssembly holds one other big advantage: any JavaScript implementation does a lot of number format conversion during md5 calculation, while a WebAssembly implementation need not.

JavaScript natively stores numbers as 64-bit floating point values, but all bitwise operations are performed on 32-bit integers. Since calculating a checksum is just scads of bitwise operations, JavaScript implementations spend more time converting between number formats than they do on the checksum itself.
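The conversion described above can be seen directly in plain JavaScript. This short sketch is not taken from the MD5-WASM source. It shows how bitwise operators truncate a number to 32 bits, along with two helpers of the kind a pure-JavaScript md5 typically needs; `rotateLeft32` and `add32` are names made up for this example.

```javascript
// Numbers are 64-bit floats, so values above 2^32 are representable...
const big = 2 ** 32 + 5;

// ...but any bitwise operator first converts its operands to 32-bit integers.
const truncated = big | 0;       // only the low 32 bits survive
const signed = 0xffffffff | 0;   // interpreted as a signed 32-bit integer
const unsigned = signed >>> 0;   // >>> 0 reinterprets the bits as unsigned

// 32-bit left rotation (valid for 1 <= n <= 31), forced back to unsigned.
function rotateLeft32(x, n) {
  return ((x << n) | (x >>> (32 - n))) >>> 0;
}

// Addition modulo 2^32: the float sum is truncated back to 32 bits.
function add32(a, b) {
  return (a + b) >>> 0;
}

console.log(truncated, signed, unsigned);
console.log(rotateLeft32(0x80000000, 1), add32(0xffffffff, 1));
```

Every `| 0` or `>>> 0` here is a point where the engine may have to move between the floating point and integer representations, which is the overhead a WebAssembly `i32` avoids.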

Is there a downside?

You need do nothing different to accommodate WebAssembly: MD5-WASM loads in a browser or Node environment just like a pure JavaScript utility would. Unlike MD5, MD5-WASM does not accept string input; you must convert a string to a Buffer, ArrayBuffer or Uint8Array before passing it to MD5-WASM. There is no synchronous version; you must use a promise instead of a simple blocking function call.

JavaScript Calls And Parameters
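One way to convert a string before hashing is to encode it as UTF-8 bytes with the standard `TextEncoder`. This is a sketch under stated assumptions. `toBytes` and `md5OfString` are hypothetical helper names, and the hashing function is passed in as `hashFn` so the example does not depend on how MD5-WASM was loaded. The global `TextEncoder` is assumed; on older Node versions it may need to be taken from `require("util")`.

```javascript
// Encode a string as UTF-8 bytes; the result is a Uint8Array.
function toBytes(text) {
  return new TextEncoder().encode(text);
}

// hashFn is expected to be md5WASM (or any function with the same contract:
// takes a Buffer, ArrayBuffer or Uint8Array and returns a Promise).
function md5OfString(hashFn, text) {
  return hashFn(toBytes(text));
}

// Non-ASCII characters occupy more than one byte in UTF-8.
console.log(toBytes("héllo").length);
```

With the library loaded, a call would look like `md5OfString(md5WASM, "some text").then(hash => console.log(hash))`. Note that the hash depends on the chosen encoding, so both sides of a comparison should use the same one.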

Usage example

let data  = contentsOfAFile();                        // Get the data any which way you can

// 'data' must be a Buffer, ArrayBuffer or Uint8Array
md5WASM(data)                                         // Our function
    .then( hash => console.log(hash) )
    .catch( err => console.log(err) )

Loading MD5-WASM

At less than 32K, the code file does not justify minification.  It is all-inclusive and has no external dependencies. 

HTML tag

<script type="text/javascript" src="path/md5-wasm.js"></script>

You will find the function at window.md5WASM.

In NodeJS

const md5WASM = require("md5-wasm");
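In Node, the data usually comes from a file. The sketch below reads a file into a Buffer with `fs.promises.readFile` and hands it to the hashing function. `hashFile` is a hypothetical helper, not part of MD5-WASM, and the hashing function is taken as a parameter so the snippet runs without the package installed.

```javascript
const fs = require("fs");

// Read the whole file into a Buffer, then hash it.
// hashFn is expected to be the function returned by require("md5-wasm").
async function hashFile(hashFn, filePath) {
  const data = await fs.promises.readFile(filePath);
  return hashFn(data);
}
```

A call would then look like `hashFile(md5WASM, "path/to/video.mp4").then(hash => console.log(hash)).catch(err => console.log(err))`. The whole file is held in memory while it is hashed, which is worth keeping in mind for very large files.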

Problems, questions

Please open an issue at the GitHub repo.
