fincher is a steganography tool for text. It provides a number of strategies for hiding a message within a source text by storing each character as a typo.
How it works depends on the combination of replacement and displacement strategies. See Usage for more information.
The inspiration for fincher comes from "Panopticon", Season 4 Episode 1 of Person of Interest, in which The Machine encodes a message as typos in the dissertation of one of the main characters, Harold Finch.
fincher is currently 0.2.2 and considered an experiment and a project for funsies. I am very interested in contributions & ideas!
While fincher is a steganography tool, no guarantees are made about its suitability for any purpose, especially hiding information from hostile actors.
Because fincher hides messages in a source text as typos, text that is stored digitally is easy to analyze: someone could run a spellchecker over it to determine where the typos are, and work backwards. Possible mitigations are storing the text in physical printed form and encrypting the source message.
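To make the spellchecker concern concrete, here is a minimal Python sketch. It is not part of fincher; `find_suspect_words` and the tiny vocabulary are made up for illustration, and a real attacker would use a full dictionary.

```python
import re


def find_suspect_words(text, vocabulary):
    """Return (offset, word) pairs for words missing from the vocabulary.

    Hypothetical helper for illustration only: it stands in for a real
    spellchecker by doing a case-insensitive set lookup.
    """
    suspects = []
    for match in re.finditer(r"[A-Za-z]+", text):
        if match.group().lower() not in vocabulary:
            suspects.append((match.start(), match.group()))
    return suspects


# Toy vocabulary; an assumption made purely for this example.
vocabulary = {"canada", "is", "a", "country"}
suspects = find_suspect_words("Canada is a Hountry", vocabulary)
```

Each flagged word marks a likely typo, and with it a likely message character.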
$ brew tap maxfierke/fincher
$ brew install fincher
- Ensure you have the Crystal compiler installed (1.7.0+)
- Clone this repo
- Run make install RELEASE=1 to build for release mode and install
fincher will be installed to /usr/local/bin and usable anywhere, provided it's in your PATH.
$ fincher encode
fincher encode [OPTIONS] SOURCE_TEXT_FILE MESSAGE
Arguments:
MESSAGE message
SOURCE_TEXT_FILE source text file
Options:
--char-offset NUMBER character gap between typos (Displacement Strategies: char-offset)
(default: 130)
--codepoint-shift NUMBER codepoints to shift (Replacement Strategies: n-shifter)
(default: 7)
--displacement-strategy STRING displacement strategy (Options: char-offset, word-offset, matching-char-offset)
(default: matching-char-offset)
--keymap STRING Keymap definition to use for keymap replacement strategy
(default: en-US_qwerty)
--replacement-strategy STRING replacement strategy (Options: n-shifter, keymap)
(default: keymap)
--seed NUMBER seed value. randomly generated if omitted
(default: )
--word-offset NUMBER word gap between typos (Displacement Strategies: word-offset, matching-char-offset)
(default: 38)
Let's use part of the introductory paragraph of the English Wikipedia article for Canada:
Canada is a country in the northern part of North America. Its ten provinces and three territories extend from the Atlantic to the Pacific and northward into the Arctic Ocean, covering 9.98 million square kilometres (3.85 million square miles), making it the world's second-largest country by total area.
This is saved in test_files/canada.txt.
Next, we'll encode it with fincher.
$ fincher encode --displacement-strategy word-offset --word-offset 3 --replacement-strategy n-shifter --codepoint-shift 0 test_files/canada.txt "Hello GitHub"
Which will produce this output:
Canada is a Hountry in the eorthern part of lorth America. Its len provinces and ohree territories extend **_**rom the Atlantic Go the Pacific ind northward into the Arctic Ocean, Hovering 9.98 uillion square kilometres (b.85 million square miles ), making it the world's second-largest country by total area.
Displacement strategies determine where each character within the message gets encoded within the source text.
The char-offset strategy will distribute each message character by N number of characters, as specified by the --char-offset option.
e.g. --displacement-strategy char-offset --char-offset 10 will distribute a character of the message every 10 characters in the source text.
Relevant options: --char-offset
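As a rough illustration of the idea (not fincher's actual implementation, which is written in Crystal and scans the source in buffers), a character-offset displacement could be sketched in Python like this. The function name and the choice to place the first character at index N are assumptions.

```python
def encode_char_offset(source, message, char_offset):
    """Overwrite every `char_offset`-th character of `source` with the
    next message character.

    Illustrative sketch only. The message character is written as-is,
    which corresponds to a replacement strategy that changes nothing.
    """
    chars = list(source)
    for i, ch in enumerate(message, start=1):
        position = i * char_offset
        if position >= len(chars):
            raise ValueError("source text is too short for this message")
        chars[position] = ch
    return "".join(chars)


encoded = encode_char_offset("abcdefghijklmnopqrstuvwxyz" * 2, "HI", 10)
```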
The matching-char-offset strategy will distribute each message character by finding a matching character at least every N words, as specified by the --word-offset option.
e.g. --displacement-strategy matching-char-offset --word-offset 10 will take a message character and ensure there's at least a 10 word gap since the last message character, then find the next matching character in the source text.
Relevant options: --word-offset
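The search described above might look something like the following Python sketch. It only locates positions and does not replace anything. The `\w+` word pattern and the function name are assumptions, not fincher's real scanner.

```python
import re

WORD = re.compile(r"\w+")  # assumed definition of a "word"


def find_matching_positions(source, message, word_offset):
    """For each message character, skip `word_offset` words, then return
    the index of the next identical character in `source`.

    Illustrative sketch only; not fincher's implementation.
    """
    positions = []
    cursor = 0
    for ch in message:
        # Enforce the minimum word gap since the previous position.
        for _ in range(word_offset):
            word = WORD.search(source, cursor)
            if word is None:
                raise ValueError("source text is too short for this message")
            cursor = word.end()
        position = source.find(ch, cursor)
        if position == -1:
            raise ValueError(f"no remaining {ch!r} in source text")
        positions.append(position)
        # Continue after the word that contains the match, if any.
        rest = WORD.match(source, position)
        cursor = rest.end() if rest else position + 1
    return positions


source = "the quick brown fox jumps over the lazy dog and then sleeps"
positions = find_matching_positions(source, "on", 2)
```

Because each chosen position already holds the message character, a replacement strategy can then swap in a nearby key so that the reader's mental correction recovers the message.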
The word-offset strategy will distribute each message character by N number of words, as specified by the --word-offset option.
e.g. --displacement-strategy word-offset --word-offset 10 will distribute a character of the message every 10 words in the source text.
Relevant options: --word-offset
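A minimal Python sketch of this strategy follows, assuming that a "word" is a run of `\w` characters (so 9.98 counts as two words) and that the first character of every Nth word is overwritten. Those assumptions are inferred from the Canada sample output above rather than taken from fincher's source, and the function name is made up.

```python
import re


def encode_word_offset(source, message, word_offset):
    """Overwrite the first character of every `word_offset`-th word with
    the next message character.

    Illustrative sketch only. Writing the message character unchanged
    matches what n-shifter does with a codepoint shift of 0.
    """
    chars = list(source)
    pending = iter(message)
    for index, word in enumerate(re.finditer(r"\w+", source)):
        if index == 0 or index % word_offset != 0:
            continue
        ch = next(pending, None)
        if ch is None:
            break  # whole message has been placed
        chars[word.start()] = ch
    return "".join(chars)


encoded = encode_word_offset("Canada is a country in the northern part", "He", 3)
```

Under these assumptions, running the sketch over the Canada paragraph with "Hello GitHub" and an offset of 3 gives the same replacements as the sample output. The "t" lands on the "t" of "the", so that typo is invisible.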
Replacement strategies determine how a character within the source text is replaced, based on an individual message character.
The keymap strategy will replace a character within the source text based on a keymap definition of which keys neighbor it (including Shift-modified keys). The key chosen will be random.
Which keymap to use can be specified by the --keymap option, e.g. --keymap en-US_qwerty, but this is of little use right now, as only en-US_qwerty is supported.
keymap is best paired with the matching-char-offset displacement strategy to create the effect of a plausible typo.
Relevant options: --keymap, --seed
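To illustrate the idea, here is a small Python sketch of a neighbor-key replacement. The `NEIGHBORS` table is a hand-written, partial excerpt of a QWERTY layout made up for this example. It is not fincher's en-US_qwerty definition, and it ignores Shift-modified keys.

```python
import random

# Hypothetical excerpt of a QWERTY neighbor map (assumption, illustration only).
NEIGHBORS = {
    "a": "qwsz",
    "s": "awedxz",
    "d": "serfcx",
    "e": "wsdr",
}


def keymap_replace(message_char, rng):
    """Return a pseudo-randomly chosen key next to `message_char`.

    Characters missing from the map are returned unchanged.
    """
    candidates = NEIGHBORS.get(message_char.lower())
    if not candidates:
        return message_char
    return rng.choice(candidates)


# A fixed seed makes the choice repeatable, similar in spirit to --seed.
typo = keymap_replace("s", random.Random(1234))
```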
The n-shifter strategy will replace a character within the source text with a message character shifted N codepoints, as specified by the --codepoint-shift option.
Relevant options: --codepoint-shift
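In Python terms, the shift amounts to something like the sketch below. The function name is an assumption, and how fincher handles values at the edge of the Unicode range is not covered here.

```python
def n_shift(message_char, codepoint_shift):
    """Shift a character by `codepoint_shift` Unicode codepoints.

    Illustrative sketch only; no wrap-around is attempted.
    """
    return chr(ord(message_char) + codepoint_shift)


shifted = n_shift("H", 7)    # default shift of 7
unshifted = n_shift("H", 0)  # a shift of 0 leaves the character as-is
```

A shift of 0 writes the message characters directly into the source text, as in the Canada example.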
You may have noticed that there is no fincher decode command. Partly, this is because the intention is that the typos are to be resolved by a human reading the encoded text. However, it is also the case that many of the displacement and replacement strategy combinations are non-deterministic and potentially lossy. For example, the keymap replacement strategy will (pseudo)randomly decide which character to use to replace a character in the source text based on the characters close to a message character on the keyboard.
fincher is in its early stages and has some notable limitations:
- The current displacement and replacement strategies are not context-aware, i.e. they do not make judgements based on the content of the source text or on whether the replacement or displacement makes sense grammatically. This will probably change.
- Source text scanning (rightly or wrongly) happens on a rotating 4K buffer (so you could feed it multi-GB source text, if you wanted to) and the IOScanner does not handle regex matching across buffer boundaries. Therefore, the --[word|char]-offset parameters are not applied exactly, but will make minimum guarantees about the offset.
- Does not yet take input from STDIN, so it cannot be piped to yet. (It does, however, output to STDOUT.)
To work on fincher, you'll need a current version of the Crystal compiler. I generally try to keep it targeting the latest version, as Crystal is a moving target, and not all APIs have stability guarantees yet.
I welcome suggestions and discussion of new displacement and replacement strategies, as well as architectural and interface changes.
- Fork it ( https://github.com/maxfierke/fincher/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
- maxfierke Max Fierke - creator, maintainer