Releases: simsong/bulk_extractor
Release 2.1.1
v2.1.0: Merge pull request #450 from simsong/rel-2.10
January 22, 2024
RELEASE NOTES
The digital forensics tool bulk_extractor version 2.1.0 is now available for general use.
Release download point:
https://github.com/simsong/bulk_extractor/releases
GIT repository:
https://github.com/simsong/bulk_extractor
I am pleased to announce the general availability of bulk_extractor version 2.1. This is the first release of bulk_extractor version 2 that is recommended for general use.
Bulk_extractor 2 is a major rewrite of bulk_extractor. Version 2 significantly improves on the performance and portability of version 1. The rewrite started in 2016 and was largely completed by January 2021.
Details of the rewrite, including a detailed report of the performance improvements and lessons learned, can be found in "Sharpening Your Tools: Updating bulk_extractor for the 2020s," by Simson Garfinkel and Jon Stewart, Communications of the ACM, August 2023.
Bulk_extractor version 2.1 is the first stable version of bulk_extractor version 2 that is recommended for general use. It corrects a problem with the string search scanner that caused bulk_extractor to hang on open-ended regular expressions such as [a-z]*@company.com specified with the -F flag. With version 2.1, we have replaced the C++17 regex compiler with Google's RE2 regex compiler, which avoids backtracking. As a result, these open-ended regular expressions no longer hang.
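To give a feel for why an open-ended pattern such as [a-z]*@company.com can be costly for a backtracking matcher, here is a toy model in C++. It is only a sketch: it is not code from bulk_extractor, from the C++ standard library, or from RE2, and all names in it (SearchResult, backtracking_search, linear_search) are hypothetical. It treats "@company.com" as a literal suffix and counts units of work for a naive backtracking search versus a single left-to-right pass. These notes do not say that this is the exact failure mode seen in version 2.0; the sketch only shows how repeated backing-off can turn a simple search quadratic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy model only. Names are hypothetical; this is not bulk_extractor or RE2 code.
struct SearchResult {
    std::size_t match_start;  // std::string::npos when nothing matched
    std::size_t steps;        // units of work: letters consumed plus suffix comparisons
};

static bool is_lower(char c) { return c >= 'a' && c <= 'z'; }

// Naive backtracking search for [a-z]* followed by a literal suffix.
SearchResult backtracking_search(const std::string& text, const std::string& suffix) {
    assert(!suffix.empty());
    std::size_t steps = 0;
    for (std::size_t s = 0; s <= text.size(); ++s) {
        // Greedy part: consume as many lowercase letters as possible.
        std::size_t e = s;
        while (e < text.size() && is_lower(text[e])) { ++e; ++steps; }
        // Backtrack: give back one letter at a time and retry the suffix.
        for (std::size_t k = e + 1; k-- > s;) {
            ++steps;
            if (text.compare(k, suffix.size(), suffix) == 0) return {s, steps};
        }
    }
    return {std::string::npos, steps};
}

// Single left-to-right pass that remembers where the current run of
// lowercase letters began, so no position is revisited.
SearchResult linear_search(const std::string& text, const std::string& suffix) {
    assert(!suffix.empty());
    std::size_t steps = 0;
    std::size_t run_start = 0;
    for (std::size_t i = 0; i <= text.size(); ++i) {
        ++steps;
        if (text.compare(i, suffix.size(), suffix) == 0) return {run_start, steps};
        if (i < text.size() && !is_lower(text[i])) run_start = i + 1;
    }
    return {std::string::npos, steps};
}
```

On a buffer of 2,000 lowercase letters that contains no "@", the backtracking model performs 2,001 × 2,001 = 4,004,001 steps, while the single pass performs 2,001. Both functions report the same match position when a match exists, for example index 9 in "contact: alice@company.com".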
2.0/2.1 Improvements over Version 1:
- BE2 is significantly faster on multi-core systems than BE1.
Release 2.1 Limitations
- BEViewer is not included in this release. Although it works with Version 2, it is not yet officially supported.
- scan_outlook and scan_hiberfile are now disabled by default because they did not have unit tests. These scanners can be re-enabled by specifying -eoutlook and -ehiberfile on the command line.
- scan_aes no longer scans for 192-bit AES keys by default, although this behavior can be re-enabled.
Known bugs:
- The RAR decompressor does not reliably decompress all RAR files and only supports RAR v1, v2, and v3.
- The RAR scanner will not reliably name carved RAR file components that contain UTF-8 characters in their name.
You can help
We are looking for help to implement the following algorithms:
- WkdmDecompress - http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/iokit/Kernel/WKdmDecompress.c
- xz, 7zip, and LZMA/LZMA2 decompression
- lzo decompression
- BZIP2 decompression
- CAB decompression
- Scanning for the start of BitLocker protected volumes.
- NTFS decompression
- Better handling of MIME encoding
- Process more data with -e xor and look for CCN hits. Most will be false positives.
- Demonstration of bulk_extractor running on a grid (how fast can it run?)
- Python Bridge - run multiple copies of python to let scanners be written in python
- scan_pipe - runs every sbuf through an external program.
bulk_extractor 2.0.6
Minor packaging updates.
bulk_extractor 2.0.3
Version 2.0.3 is released. However, please note:
- There appears to be a hang in the multi-threaded logic on some systems. This is under review.
- Carving for IPv6 packets is not yet 100% reliable.
- There are compiler warnings when compiling on MacOS 13.3.1
bulk_extractor V2.0.0 RELEASE
Release 2.0.0 of bulk_extractor, a high-performance digital forensics tool that works like a "find evidence" button, pulling actionable intelligence out of disk images, files, memory dumps, network traffic, and just about anything else.
Note: we recommend using the attached bulk_extractor-2.0.0.tar.gz file, which is a proper release, rather than cloning the repo and all of the sub-repos and then using automake to create the configure script.
bulk_extractor V2.0.0 beta 3
bulk_extractor is a high-performance digital forensics tool that scans a disk image, a file, or a directory of files and extracts information such as email addresses, JPEGs, and JSON snippets without parsing the file system or file system structures. It is written in C++ and is highly parallelized.
This beta:
- Adds additional regression tests.
- Fixes bugs reported in betas 1 and 2.
Please report bugs to https://github.com/simsong/bulk_extractor/issues
bulk_extractor V2.0.0 beta 2
bulk_extractor is a high-performance C++ program that scans a disk image, a file, or a directory of files and extracts information such as email addresses, JPEGs, and JSON snippets without parsing the file system or file system structures.
This beta:
- Addresses packaging concerns and adds additional regression tests.
- Fixes handling of E01 files.
Download from: bulk_extractor-2.0.0-beta2.tar.gz
bulk_extractor V2.0.0 beta 1
bulk_extractor is a high-performance digital forensics tool that finds data including JPEG images, email addresses, social security numbers, and other kinds of "known formats" in files and on raw disk partitions, even if the data are compressed, BASE64 encoded, or transformed using other well-known algorithms.
After six years, we have a new release of bulk_extractor! This version now requires C++17, includes a substantial test suite with significant code coverage, and is designed for systems with high numbers of CPU cores. It has been tested on Ubuntu, MacOS, and Fedora.
Bulk_Extractor Release 1.5.3
Release 1.5.3 corrects minor bugs that were found in version 1.5.0, and represents a significant improvement over release 1.4.0.
Official 1.4.0 release.
The official 1.4.0 release. Reasonably well tested.