This repository contains implementations of PoxHash in Header-Only C, Go, Python, Rust, Nim and JS. If you wish to know more about PoxHash read the specifications.
All the implementations are in functional paradigm. They all rely on zero 3rd party libraries and calls the respective standard library of the language is minimal at best.
To use the code just download the file into your working directory and import the functions mentioned in the interfaces section.
Download to your working directory:
wget https://raw.githubusercontent.com/Chubek/PoxHash/master/c/libpoxh.h -qO $(pwd)/libpoxh.h
It should be mentioned that the C implementation is header-only and does not need object linking.
This assumes libpoxh.h
is in the same level as you *.c
file.
#ifndef __POX_H_INC // include guards are not necessary, but recommended
#define __POX_H_INC
#include "libpoxh.h";
#endif
#ifndef __STDINT_H_INC
#define __STDINT_H_INC
#include <stdint.h> // we need this for uint8_t
#endif
Download to your working directory:
wget https://raw.githubusercontent.com/Chubek/PoxHash/master/go/libpoxh.go -qO $(pwd)/libpoxh.go
Make sure your directory structure looks like this:
project_dir/
libpoxh/
libpoxh.go
+ main.go
Assuming your project name is projectname
:
import (
"fmt" // just for the sake of demonstration
"projectname/libpoxh"
)
Download to your working directory:
wget https://raw.githubusercontent.com/Chubek/PoxHash/master/js/libpoxh.js -qO $(pwd)/libpoxh.js
Running JavaScript is simple. Just download the file into your directory and import it using CommonJS import syntax.
const libpox = require("libpoxh");
Download to your working directory:
wget https://raw.githubusercontent.com/Chubek/PoxHash/master/nim/libpoxh.nim -qO $(pwd)/libpoxh.nim
After downloading, import it like this:
form libpoxh import PoxHash, PoxDigest
Download to your working directory:
wget https://raw.githubusercontent.com/Chubek/PoxHash/master/python/libpoxh.python -qO $(pwd)/libpoxh.py
It really depends on how you are doing imports in your project. But assuming you are not doing relative imports, and your file is on the same level as the lixpoxh.py
:
from libpoxh import pox_hash, PoxDigest
First cd
into your project folder, where Cargo.toml
is, and:
Download to your working directory:
wget https://raw.githubusercontent.com/Chubek/PoxHash/master/rust/libpoxh.ts -qO $(pwd)/src/libpoxh.rs
use libpoxh::{pox_hash, PoxDigest};
Then you can use cargo run
and cargo build
normally.
After downloading the file, compile it into an rlib
file using:
rustc libpoxh.rs -O --crate-type lib --crate-name libpoxh -o libpoxrust.rlib
and when you import:
extern crate libpoxh;
use libpoxh::{pox_hash, PoxDigest};
And compile with --extern
flag to link:
rustc myapplication.rs -O --extern libpoxh=libpoxrust.rlib -o myapplicationbinary
All the implementations provide a simple function that will accept the message in unsigned byte form. They are, as follows:
Language | Function | Accepts | Returns |
---|---|---|---|
C | pox_hash | uint8_t* | struct PoxDigest (poxdigest_t) |
Go | PoxHash | []uint8 | type struct PoxDigest |
JS | poxHash | Uint8Array | PoxDigest typefunc |
Nim | PoxHash | seq[uint8] | object PoxDigest |
Python | pox_hash | array.array('B') | class PoxDigest |
Rust | pox_hash | &Vec | struct PoxDigest |
All the variables are a variation of an unsigned 8-bit integer array.
For static languages, it is obvious, but for Python, you need to convert a regular bytearray into a uint8 array like this:
def to_uint8_array(message: bytearray) -> any:
from array import array
array('B', message)
And for JavaScript:
const toUint8Array = (message) => {
return new Uint8Array(message);
};
If you wish to hash a character string, strings in Python and JS strings are not representations of the underlying bytes. You need to encode them first. Be very careful about encoding, as different codecs have different corresponding bytes for each 'raw' Python/JS glyph.
To endcode an string to UTF-8 in JavaScript:
const stringToByteArray = (str) => {
let utf8Encode = new TextEncoder();
byteArray = utf8Encode.encode(str);
return byteArray;
};
And in python you can just do myMessage.encode('utf8')
.
Rust, Go and Nim character strings can be easily converted to byte rrays and from there, unsigned bytearrays. In C, the char type is the same as the byte type. C has no special type for graphemes. Whereas in Rust and Nim the char
type and in Go the rune
type represents a character, in C they're one and the same.
To convert a string into bytes:
- Go:
signedByteArray := []byte(myStr)
- Rust converts strings to unsigned bytes by default:
let signed_byte_array = my_str.as_bytes().to_vec();
- Nim:
var signedByteArray = toOpenArrayByte(myStr)
Use the following ways to convert data in int8, byte or glyph (char) type into uint8 form in static languages:
- C:
const uint8_t unsigned_array = (uint8_t *)signed_array;
- Go:
unsignedArray := []uint8(signedArray)
- Nim:
import sequtils
var unsignedArray = map(signedArray, proc(x: byte): uint8 = cast[uint8](x))
- Rust:
let unsigned_array: Vec<u8> = signed_array.into_iter().map(|b| b as u8).collect();
// if you have a string just do this
let unsigned_array = myString.as_bytes().to_vec();
- Go:
unsignedArray := []uint8(myStr)
- Nim:
import sequtils
var unsignedArray = map(myStr, proc(x: char): uint8 = cast[uint8](x))
All these functions return an structured object, be it a struct, a class, or anything else that is a structured object, called PoxDigest
. Across the implementations, this structured object has 11 properties (also called 'fields') that are identical in content and name (except in Go, where they only slightly differ). These properties are digests in various forms, decimal or non-decimal, decimal in several bit widths and non-decimal in several bases. Let's go through them.
Note: In Python and JS, if the input is of the wrong type, the function will return None
and null
respectively. It won't throw any exceptions.
In Go | In Others | Decenary |
---|---|---|
Sexdigest | sexdigest | Non-Decimal |
Vigdigest | vigdiest | Non-Decimal |
Hexdigest | hexdigest | Non-Decimal |
Tetdigest | tetdigest | Non-Decimal |
Duodigest | duodigest | Non-Decimal |
Octdigest | octdigest | Non-Decimal |
Sendigest | sendigest | Non-Decimal |
Bindigest | bindigest | Non-Decimal |
Bytes | bytes | Decimal |
Words | words | Decimal |
Doubles | doubles | Decimal |
Quad | quad | Decimal |
Language | Type |
---|---|
C | char[digest_size + 1], null-terminated |
Go | string |
JavaScript | string |
Nim | string |
Python | str |
Rust | String |
Refer to Non-Decimal Digests section for more info
Language | Type |
---|---|
C | uint8_t[8] |
Go | [8]uint8 |
JS | Uint8Array[8] |
Nim | array[8, uint8] |
Python | array.array('B', 8) |
Rust | [u8; 8] |
Language | Type |
---|---|
C | uint16_t[4] |
Go | [4]uint16 |
JS | Uint16Array[4] |
Nim | array[4, uint16] |
Python | array.array('H', 4) |
Rust | [u16; 4] |
Language | Type |
---|---|
C | uint32_t[2] |
Go | [2]uint32 |
JS | Uint32Array[2] |
Nim | array[4, uint32] |
Python | array.array('I', 2) |
Rust | [u32; 2] |
Language | Type |
---|---|
C | uint64_t |
Go | uint64 |
JS | BigInt.asUintN(64) |
Nim | uint64 |
Python | array.array('Q', 1) |
Rust | u64 |
Refer to Decimal Digests section for more info.
The PoxDigest
type in Rust implements the following traits:
std::clone::Clone;
std::default::Default;
std::fmt::Display;
std::fmt::Debug;
std::fmt::Binary;
std::fmt::LowerHex;
std::fmt::UpperHex;
std::fmt::Octal;
std::fmt::Pointer;
As such, the following operatons are allowed for PoxDigest
:
let mut default_digest = libpoxh::PoxDigest::default();
let mut digest_vec = vec![libpoxh::PoxDigest::default(); SIZE];
let digest_clone = default_digest.clone();
println!("Digests Debug:\n{:?}", digest_clone);
println!("Digests Display:\n{}", digest_clone);
println!("Bindgest:\n{:b}", digest_clone);
println!("Hexdigest in UPPERCASE:\n{:X}", digest_clone);
println!("Hexdigest in lowercase:\n{:x}", digest_clone);
println!("Octdigest:\n{:o}", digest_clone);
println!("Pointer to Original: {:p}", default_digest);
println!("Pointer to Clone: {:p}", digest_clone);
This type has the predetermined string formatting function Format
implemented. As such, for example, the following:
fmt.Printf("%b", myPoxDigestInstance)
hexUpperPox = fmt.Sprintf("%X", myPoxDigestInstance)
Will print out PoxDigest.Bindigest
and PoxDigest.HexDiest
(in uppercase) repectively.
The following table indicates all the possible string formatting flags and their corresponding output:
Flag | Output |
---|---|
b | Bindigest |
X | Hexdigest, uppercase |
x | Hexdigest, lowercase |
o | Octdigest |
s | Sendigest |
Y | Vigdigest, uppercase |
y | Vigdigest, lowercase |
G | Sexdigest, uppercase |
g | Sexdigest, lowercase |
U | Tetdigest, uppercase |
u | Tetdigest, lowercase |
d | Duodigest |
B | Bytees |
W | Words |
D | Doubles |
Q | Quad |
F | All decimal digests |
N | All non-decimal digests |
A | Everything |
* | EUF unless predefiend |
This class implements the following Dunder methods:
__dict__
__repr__
__str__
__format__
__str__
and __repr__
have different outputs. The latter just prints the result of the custom __dict__
function as idented JSON string. But __str__
calls to __format__
to create an elegant output.
__format__
operates on custom flags. A group of flags correspond to one digest. They are as follows.
Digest | __format__ flags |
---|---|
Sexdigest | g , G , sex , SEX |
Vigdiest | v , V , vig , VIG |
Hexdigest (lowwecase) | h , hex |
Hexdigest (uppercase) | H , HEX |
Tetdigest (lowercase) | t , tet |
Tetdigest (uppercase) | T , TET |
Duodigest | d , D , duo , DUO |
Octdigest | o , O , oct , OCT |
Sendgiest | s , S , sen , SEN |
Bindgiest | b , B , bin , BIN |
Bytes | 8 , u8 , Bytes |
Words | 4 , u16 , Words |
Doubles | 2 , u32 , Doubles |
Quad | 1 , u64 , Quad |
Each flag can be used in severl ways, for example f"{digest:G}"
or format(digest, 'G')
.
The type alias poxdigest_t
is defined as:
typedef poxdigest_t struct PoxDigest
This is the type that the functiion pox_hash
returns, and it must be used at all times when dealing with PoxHash digests.
Invariably, in all of the implementations, the only smybols exprted as the hasher function, and the structured type containing the digests. No other symbol has been exported from these files. The libpoxh.h
file mostly relies on macros rather than functions, so there's little symbol save for the constants, all of which are static
meaning they are private to the header file. In Go all the symbols start with lowercase letters, save for the two exported functions. JavaScript uses module.exports
--- and Nim uses the *
notation to export the two symbols. Rust uses pub
to export the only two symbols defined within the module (other code are in submodules) and Python uses the __
prefix before all variables, making them private to the file.
I will now explain the different forms of digest contained within PoxDigest
objects.
There are 8 non-decimal digests in the digest object. They are all created by the usual formula of converting a decimal number to undecenary bases, that is, determining a set of digits that must be equal to the ordinal value of the base. So for base 16, we ned 16 digits, for base 20, we need 20 and so on. Usually, people select the same Indo-Persian numerals for digits ranging from 0 to 9, and if the base is larger than 10, we choose an alphabet letter for the remainder. After we determine these digits we divide the decimal number by that base and choose the digit based on the remainder and we do this until the decimal is 0. We then reverse the sequence and that will be our number in that base.
For example, to convert decimal number 2
to binary we divide it by 2, getting 1 and our reminder will be 0
. So our first digit is 0. Now we divide the result of the previous division, 1, by 2 and we get 0 so we must stop. But befoe that we select the resi;t pf 1 mod 2
that is 1
as our second digit. We have [0, 1]
but now we reverse it and we get [1, 0]
. That is the binary value for 2.
Note: One in binary is 0b01 and two is 0b10. But in our calculations we can decide to go Big-Endian so they will be reversed. That is why as specs say, we always use Litttle-Endian in PoxHash.
We thusly use this formula to convert our 4 word factors to undecenary digests. The final digest must be paddd by the equivalent of digit 0 in the base, and the final size of the digest will be number of digits of 65535 (maximum unsigned word) in that base, tims 4. So, for example, 65535 in hexadecimal is 0xFFFF so we have 4 digits, if we pad each factored in hex with 0, then we end up with a string of length 16.
Let's go through properties of each of them.
Let's take a look at the base number of each undecenary digest, the number of digits of 65535 in it, and at the end the length of the final digest which as stated before is the aforementioned maximum digits times 4.
PoxDigest Field | Base Num | Base Name | 65535 Len | Total Len |
---|---|---|---|---|
sexdigest | 60 | Sexagesimal | 3 | 12 |
vigdigest | 20 | Vigesimal | 4 | 16 |
hexdigest | 16 | Hexadecimal | 4 | 16 |
tetdigest | 14 | Tetradecimal | 5 | 20 |
duodigest | 12 | Duodecimal | 5 | 20 |
octdigest | 8 | Octal | 6 | 24 |
sendigest | 6 | Senary | 7 | 28 |
bindigest | 2 | Binary | 16 | 64 |
It is not gauranteed that the 4 final unsigned word digests will translate to completely unique-looking non-binary digests for messages that are very similar. For example uint8 array [2, 129]'s hexdigest may look unique, but its sexdigest may not.
Each of these digests have digits that may or may not be exclusive to PoxHash standard specified in SPEC.md. For Hexadecimal we use the usual notation, but for other larger-than-10 bases we use our own character-digit standard. They are as follows.
n | d |
---|---|
0 | '0' |
1 | '1' |
n | d |
---|---|
0 | '0' |
1 | '1' |
2 | '2' |
3 | '3' |
4 | '4 |
5 | '5' |
n | d |
---|---|
0 | '0' |
1 | '1' |
2 | '2' |
3 | '3' |
4 | '4 |
5 | '5' |
6 | '6' |
7 | '7' |
n | d | n | d | n | d |
---|---|---|---|---|---|
0 | '0' | 1 | '1' | 2 | '2' |
3 | '3' | 4 | '4' | 5 | '5' |
6 | '6' | 7 | '7' | 8 | '8' |
9 | '9' | 10 | '*' | 11 | '#' |
n | d | n | d | n | d | n | d | n | d |
---|---|---|---|---|---|---|---|---|---|
0 | '0' | 1 | '1' | 2 | '2' | 3 | '3' | 4 | '4' |
7 | '7' | 8 | '8' | 9 | '9' | 10 | 'T' | 11 | 'E' |
n | d | n | d | n | d | n | d |
---|---|---|---|---|---|---|---|
0 | '0' | 1 | '1' | 2 | '2' | 3 | '3' |
4 | '4' | 5 | '5' | 6 | '6' | 7 | '7' |
8 | '8' | 9 | '9' | 10 | 'A' | 11 | 'B' |
12 | 'C' | 13 | 'D' | 14 | 'E' | 15 | 'F' |
n | d | n | d | n | d | n | d | n | d |
---|---|---|---|---|---|---|---|---|---|
0 | 'A' | 1 | 'B' | 2 | 'C' | 3 | 'D' | 4 | 'E' |
5 | 'F' | 6 | 'G' | 7 | 'H' | 8 | 'I' | 9 | 'J' |
10 | '@' | 11 | '^' | 12 | '&' | 13 | '*' | 14 | '$' |
15 | '+' | 16 | '!' | 17 | ';' | 18 | ':' | 19 | '~' |
n | d | n | d | n | d | n | d | n | d |
---|---|---|---|---|---|---|---|---|---|
0 | '0' | 1 | '1' | 2 | '2' | 3 | '3' | 4 | '4' |
5 | '5' | 6 | '6' | 7 | '7' | 8 | '8' | 9 | '9' |
10 | 'A' | 11 | 'B' | 12 | 'C' | 13 | 'D' | 14 | 'E' |
15 | 'F' | 16 | 'G' | 17 | 'H' | 18 | 'I' | 19 | 'J' |
20 | 'K' | 21 | 'L' | 22 | 'M' | 23 | 'N' | 24 | 'O' |
25 | 'P' | 26 | 'Q' | 27 | 'R' | 28 | 'S' | 29 | 'T' |
30 | 'U' | 31 | 'V' | 32 | 'W' | 33 | 'X' | 34 | 'Y' |
35 | 'Z' | 36 | 'a' | 37 | 'b' | 38 | 'c' | 39 | 'd' |
40 | 'e' | 41 | 'f' | 42 | 'g' | 43 | 'h' | 44 | 'i' |
45 | 'j' | 46 | 'k' | 47 | 'l' | 48 | 'm' | 49 | 'n' |
50 | 'o' | 51 | 'p' | 52 | 'q' | 53 | 'r' | 54 | 's' |
55 | 't' | 56 | 'u' | 57 | 'v' | 58 | 'w' | 59 | 'x' |
The main decimal digest we have is the word digest, which are 4 unsigned 16-bit integers which the hashing operation has been carried on. The bytes are those 4 words decomposed into a uint8 array of size 8. Doubles are the reverse of that, every couple of uint16s have been merged into one uint32. Quad takes those 4 words and merges them into one uint64.
Below we will provide examples for each language. Importinh has been ommitted, see importing section above for more info. The examples also ignore the execution context (function or top-level).
uint8_t message[] = "SmallPox";
poxdiest_t digest = pox_hash(message);
if (digest.hexdigest != "06BD634EC1FCA83A") {
exit(1);
}
exit(0);
message := []byte("SmallPox")
var digest PoxDigest = PoxHash([]uint8(message))
if digest.Hexdigest != "06BD634EC1FCA83A" {
os.Exit(1)
}
os.Exit(0)
const utf8Encode = new TextEncoder();
const message = utf8Encode.encode("SmallPox");
const digest = poxHash(message);
if (digest.hexdigest != "06BD634EC1FCA83A") {
process.exit(1);
}
process.exit(0);
var
message: seq[uint8] = map("SmallPox", proc(x: char): uint8 = cast[uint8](x))
digest: PoxDigest
digest = PoxHash(message)
if digest.hexdigest != "06BD634EC1FCA83A":
quit(1)
quit(0)
from array import array
message = array('B', "SmallPox".encode())
digest = pox_hash(message)
if digest.hexdigest != "06BD634EC1FCA83A":
exit(1)
exit(0)
let message = String::from("SmallPox").as_bytes().to_vec();
let digest = pox_hash(&message);
if digest.hexdigest != "06BD634EC1FCA83A" {
std::process::exit(1);
}
std::process::exit(0);
- Specifications of PoxHash -> SPEC.md
- Using the example bash runners -> RUNNERS.md
- Rolling out your own implementation -> CONTRIBUTE.md