Small block forensics (SBF) aims to determine whether any content from a small dataset of known content exists on a large target drive.
This project approximates the SBF technique. It takes two directories as input (a target directory and a known content directory) and uses randomized small-block sampling to check whether any file from the known content directory is present in the target directory. For a visual intro to small block forensics, see this PDF deck.
View a video explanation of the project here: demo.mp4
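To give a feel for why random sampling works, here is a minimal sketch of the standard sampling-without-replacement estimate: if a target holds `total_blocks` blocks, of which `matching_blocks` belong to known content, the chance that `samples` randomly chosen blocks all miss is a product of shrinking fractions. The function name and parameters are illustrative assumptions and are not part of this project's code.

```python
def miss_probability(total_blocks, matching_blocks, samples):
    """Probability that `samples` distinct random blocks all miss the known content.

    Hypothetical helper for illustration only (not part of this project).
    Blocks are drawn without replacement, so each draw shrinks the pool.
    """
    if not 0 <= matching_blocks <= total_blocks:
        raise ValueError("matching_blocks must be between 0 and total_blocks")
    if not 0 <= samples <= total_blocks:
        raise ValueError("samples must be between 0 and total_blocks")

    non_matching = total_blocks - matching_blocks
    if samples > non_matching:
        # More samples than non-matching blocks: at least one hit is certain.
        return 0.0

    probability = 1.0
    for i in range(samples):
        # Chance that draw i also lands on a non-matching block.
        probability *= (non_matching - i) / (total_blocks - i)
    return probability


if __name__ == "__main__":
    # Example: 1,000,000 blocks on the target, 1,000 of them known content.
    for n in (100, 1_000, 5_000):
        print(n, miss_probability(1_000_000, 1_000, n))
```

The miss probability falls quickly as the sample count grows, which is why a small random sample can be enough to detect known content on a large target.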
- Install pipenv
pip install pipenv
- Activate the venv
pipenv shell
- Install dependencies
pipenv install
- Start the backend server
python -m small_blk_forensics.backend.server
- Run the example client
python client_example.py
Run SBF on a known content directory and target directory
python cmd_interface.py gen_hash_random \
--output_sql ./examples/out/known_content_hashes.sqlite \
--target_directory ./examples/target_folder \
--known_content_directory ./examples/known_dataset \
--block_size 4
Generate a SQLite DB containing hashes of all the blocks within a source directory
python cmd_interface.py gen_hash \
--output_sql ./examples/out/known_content_hashes.sqlite \
--known_content_directory ./examples/known_dataset \
--block_size 4
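As a rough illustration of what this step involves, the sketch below splits every file in a directory into fixed-size blocks, hashes each block, and stores the hashes in SQLite. It is not the project's implementation: the table name `block_hashes`, its columns, the use of SHA-256, and treating `block_size` as a byte count are all assumptions made for the example.

```python
import hashlib
import sqlite3
from pathlib import Path


def iter_blocks(path, block_size):
    """Yield (byte_offset, block_bytes) for consecutive blocks of a file."""
    with open(path, "rb") as fh:
        offset = 0
        while True:
            block = fh.read(block_size)
            if not block:
                break
            # The final block may be shorter than block_size; it is hashed as-is.
            yield offset, block
            offset += len(block)


def build_hash_db(known_dir, db_path, block_size):
    """Hash every block under known_dir into a SQLite DB; return the block count.

    Hypothetical schema and hash choice (SHA-256), for illustration only.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS block_hashes "
            "(hash TEXT, file_path TEXT, byte_offset INTEGER)"
        )
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_block_hash ON block_hashes (hash)"
        )
        count = 0
        for path in sorted(Path(known_dir).rglob("*")):
            if not path.is_file():
                continue
            rows = [
                (hashlib.sha256(block).hexdigest(), str(path), offset)
                for offset, block in iter_blocks(path, block_size)
            ]
            conn.executemany("INSERT INTO block_hashes VALUES (?, ?, ?)", rows)
            count += len(rows)
        conn.commit()
        return count
    finally:
        conn.close()
```

Indexing the `hash` column keeps later lookups fast, since each sampled target block needs only a single indexed query.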
Run SBF on a target directory using a pre-generated SQLite DB of known content hashes
python cmd_interface.py hash_random \
--input_sql ./examples/out/known_content_hashes.sqlite \
--target_directory ./examples/target_folder \
--block_size 4
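The lookup side can be sketched in a similar way: pick random block-aligned offsets in a target file, hash each sampled block, and check whether that hash exists in the pre-generated DB. This is only a sketch under stated assumptions, not the project's code. It assumes a hypothetical table `block_hashes` with a `hash` column holding SHA-256 hex digests, and it treats `block_size` as a byte count.

```python
import hashlib
import os
import random
import sqlite3


def sample_target(target_path, db_path, block_size, num_samples, seed=None):
    """Return byte offsets of sampled blocks whose hash appears in the DB.

    Hypothetical helper for illustration; assumes a `block_hashes` table
    with a `hash` column of SHA-256 hex digests.
    """
    rng = random.Random(seed)
    # Only whole, block-aligned blocks are sampled; a trailing partial block is ignored.
    total_blocks = os.path.getsize(target_path) // block_size
    if total_blocks == 0:
        return []

    # Sample block indices without replacement so no block is read twice.
    picks = rng.sample(range(total_blocks), min(num_samples, total_blocks))

    hits = []
    conn = sqlite3.connect(db_path)
    try:
        with open(target_path, "rb") as fh:
            for index in sorted(picks):
                fh.seek(index * block_size)
                digest = hashlib.sha256(fh.read(block_size)).hexdigest()
                row = conn.execute(
                    "SELECT 1 FROM block_hashes WHERE hash = ? LIMIT 1",
                    (digest,),
                ).fetchone()
                if row is not None:
                    hits.append(index * block_size)
    finally:
        conn.close()
    return hits
```

A match only counts when the known content sits on a block boundary in the target, which is the usual assumption behind block-aligned sampling.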
Running black, isort, flake8 and mypy:
pipenv install --dev
make format