SPDX-FileCopyrightText | SPDX-License-Identifier | title | author | footer | description | keywords | color | class | style | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
© 2024 Menacit AB <foss@menacit.se> |
CC-BY-SA-4.0 |
Practical cryptography course: Hashing introduction |
Joel Rangsmo <joel@menacit.se> |
© Course authors (CC BY-SA 4.0) |
An introduction to hashing and its cryptographic uses |
|
#ffffff |
|
section.center {
text-align: center;
}
table strong {
color: #d63030;
}
table em {
color: #2ce172;
}
|
Before we dig into cryptographic hashing, let's talk about check digits and checksums.
Check digits provide input error detection.
Used for credit card numbers, Bitcoin addresses, patient identifiers, social security numbers...
The Luhn algorithm is a common solution.
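To make the check digit idea concrete, here is a minimal Python sketch of a Luhn check. The function name `luhn_is_valid` and the sample number (a commonly cited test value) are picked for illustration and are not part of the course material. Real systems usually add length and prefix rules on top.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check."""
    # Keep ASCII digits only, so spaces and dashes in the input are ignored.
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit and double every second one.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_is_valid("79927398713"))  # True
print(luhn_is_valid("79927398710"))  # False: last digit mistyped
```

A single mistyped digit always changes the sum modulo 10, which is why this kind of check catches typical input errors.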
Checksums allow data integrity verification.
Detect if information/files have been corrupted during network transfer or disk storage.
The same input data should always result in the same output checksum.
CRCs (cyclic redundancy checks) are commonly used.
$ cksum /etc/passwd
1530034959 1930 /etc/passwd
$ echo "Polar bears are cool" | cksum
3234477472 21
$ echo "Polar bears are cool" | cksum
3234477472 21
$ echo "Polar beers are cool" | cksum
3688108819 21
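The same experiment can be sketched in Python with `zlib.crc32`. Treat this as a stand-in rather than a reimplementation: zlib's CRC-32 variant differs from the CRC used by `cksum`, so the numbers will not match the output above. The behaviour is the same, though: identical input gives an identical checksum, and a one-letter change gives a different one. The helper name `crc32_of` is made up for this example.

```python
import zlib


def crc32_of(text: str) -> int:
    """CRC-32 (zlib variant) of the UTF-8 encoded text."""
    return zlib.crc32(text.encode("utf-8"))


# The trailing newline mirrors what `echo` sends into the pipe.
first = crc32_of("Polar bears are cool\n")
again = crc32_of("Polar bears are cool\n")
typo = crc32_of("Polar beers are cool\n")

print(first == again)  # True: same input, same checksum
print(first == typo)   # False: one changed letter, different checksum
```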
Different input values may produce the same output checksum (a "collision").
This is unlikely to be a problem, unless someone actively tries to find collisions.
Deliberately crafted collisions can be used to attack data authenticity controls.
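How easy is it to find a collision on purpose? The sketch below searches for two different inputs with the same checksum. To keep it fast, it assumes a deliberately weak 16-bit checksum (the low 16 bits of zlib's CRC-32). The names `tiny_checksum` and `find_collision` and the `message-N` inputs are invented for this example.

```python
import zlib
from itertools import count


def tiny_checksum(data: bytes) -> int:
    """Deliberately weak checksum: the low 16 bits of CRC-32."""
    return zlib.crc32(data) & 0xFFFF


def find_collision() -> tuple[bytes, bytes]:
    """Try distinct inputs until two of them share a checksum."""
    seen: dict[int, bytes] = {}
    for i in count():
        candidate = f"message-{i}".encode()
        value = tiny_checksum(candidate)
        if value in seen:
            return seen[value], candidate
        seen[value] = candidate


first, second = find_collision()
print(first, second, tiny_checksum(first))
```

With only 65536 possible outputs, the search must succeed within 65537 attempts, and usually does so after a few hundred. Larger checksums take more work, but the approach is the same.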
Cryptographic hashing to the rescue!
Cryptographic hash functions are like checksums, but designed so that finding a collision is infeasible in practice.
The output hash shouldn't be predictable without fully computing it.
The output hash will be the same size regardless of whether the input data is 1 kB or 10 TB.
Sometimes called "one-way encryption".
$ echo "Polar bears are cool" | sha256sum
09c123f289f05677dbfa38dad697ae86ab2f3ef25c8935cfc8cd68a59f2f4d0a
$ echo "Polar beers are cool" | sha256sum
f170488bc43c691d3b9055567952d05d1cd43fbebd54c2098a0d5d7685d2eaa1
$ head --bytes 5G /dev/zero | sha256sum
7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5
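For comparison, a minimal Python sketch using `hashlib` from the standard library. It illustrates the two properties shown above: the digest length is fixed regardless of input size, and a one-letter change gives a different digest. The helper name `sha256_hex` is made up for this example, and the one-megabyte zero buffer is a small stand-in for the 5G stream.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


# The trailing newline mirrors what `echo` sends into the pipe.
short = sha256_hex(b"Polar bears are cool\n")
typo = sha256_hex(b"Polar beers are cool\n")
large = sha256_hex(b"\x00" * 1_000_000)

print(len(short), len(typo), len(large))  # 64 64 64
print(short == typo)                      # False
```

Each digest is 64 hex characters, that is 256 bits, no matter how much data went in.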
- Data integrity checking
- Password storage
- Authentication
- Fingerprinting
- Pseudo-random number generators
- Append-only databases/ledgers
- Proof of Knowledge
- Proof of Work
....
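One item from the list above, Proof of Work, can be sketched in a few lines: search for a number (a "nonce") that makes the hash start with a given number of zeros. This is a toy under stated assumptions, not how any particular system does it. The function names, the `:` separator and the difficulty of three hex digits are all invented for this example.

```python
import hashlib
from itertools import count


def digest_for(message: bytes, nonce: int) -> str:
    """SHA-256 hex digest of the message combined with a nonce."""
    return hashlib.sha256(message + b":" + str(nonce).encode()).hexdigest()


def find_nonce(message: bytes, zero_hex_digits: int = 3) -> int:
    """Brute-force a nonce whose digest starts with the required zeros."""
    target_prefix = "0" * zero_hex_digits
    for nonce in count():
        if digest_for(message, nonce).startswith(target_prefix):
            return nonce


def is_valid(message: bytes, nonce: int, zero_hex_digits: int = 3) -> bool:
    """Checking a claimed nonce takes a single hash computation."""
    return digest_for(message, nonce).startswith("0" * zero_hex_digits)


nonce = find_nonce(b"Polar bears are cool")
print(nonce, digest_for(b"Polar bears are cool", nonce))
```

Finding the nonce takes thousands of attempts on average because the output can't be predicted without computing it, while verifying the result is cheap.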
Like other cryptographic methods, hashing algorithms have a best-before date.
Check out "SHAttered", the first public SHA-1 collision (2017): two different PDF files with the same SHA-1 hash.
$ shasum shattered-1.pdf
38762cf7f55934b34d179ae6a4c80cadccbb7f0a
$ shasum shattered-2.pdf
38762cf7f55934b34d179ae6a4c80cadccbb7f0a
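A quick way to see that the two files really differ is to hash them with a newer algorithm as well. The sketch below computes SHA-1 and SHA-256 side by side. The function name `file_digests` and the chunk size are choices made for this example, and the loop simply skips the files if they are not in the current directory.

```python
import hashlib
from pathlib import Path


def file_digests(path: str, chunk_size: int = 1024 * 1024) -> dict[str, str]:
    """Return hex SHA-1 and SHA-256 digests of a file, read in chunks."""
    sha1 = hashlib.sha1()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            sha1.update(chunk)
            sha256.update(chunk)
    return {"sha1": sha1.hexdigest(), "sha256": sha256.hexdigest()}


# Only runs against the SHAttered PDFs if they have been downloaded.
for name in ("shattered-1.pdf", "shattered-2.pdf"):
    if Path(name).exists():
        print(name, file_digests(name))
```

For the two SHAttered PDFs, the `sha1` values should be identical while the `sha256` values differ.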
If you want to learn more about collision techniques and play with them, have a look at: github.com/corkami/collisions.