ioutils provides very fast implementations for file traversal and reading/writing algorithms.
- Very fast file read/write algorithms.
- Very fast file traversal algorithms.
- A small set of high-performance file-related functions and classes.
- The fast-find command is a drop-in replacement for GNU find. It can be 2x faster than both the find and fd commands.
- The fast-locate command is a drop-in replacement for GNU locate. It is 10x faster than the GNU locate command.
What is the difference between ioutils and other similar open source projects such as GNU findutils and fd?
- ioutils is written as a library, so it can easily be reused in other projects. For example, ioutils is used by fastgrep and codesearch.
- Core algorithms are templatized using policy-based design, so they stay flexible and reusable without sacrificing performance, i.e., all classes are generated at compile time.
- All commands support regular expression pattern matching syntax by default.
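To illustrate the policy-based design mentioned above, here is a minimal sketch. It is not taken from the ioutils sources: `AcceptAll`, `AcceptMatching`, and `PathFilter` are hypothetical names, and `std::regex` stands in for whichever regex engine the library actually uses. The point is only that the algorithm is written once and specialized per policy at compile time.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical policy (not an ioutils class): accept every path.
struct AcceptAll {
    bool accept(const std::string &) const { return true; }
};

// Hypothetical policy: accept paths that match a regular expression.
// std::regex is an assumption made for this sketch only.
class AcceptMatching {
  public:
    explicit AcceptMatching(const std::string &pattern) : re_(pattern) {}
    bool accept(const std::string &path) const { return std::regex_search(path, re_); }

  private:
    std::regex re_;
};

// The filtering loop is written once. The policy is a template parameter, so
// each PathFilter<Policy> is a distinct class generated at compile time and
// the call to accept() needs no virtual dispatch.
template <typename Policy> class PathFilter {
  public:
    explicit PathFilter(Policy policy = Policy()) : policy_(std::move(policy)) {}

    std::vector<std::string> run(const std::vector<std::string> &paths) const {
        std::vector<std::string> kept;
        for (const auto &path : paths) {
            if (policy_.accept(path)) kept.push_back(path);
        }
        return kept;
    }

  private:
    Policy policy_;
};
```

Swapping `PathFilter<AcceptAll>` for `PathFilter<AcceptMatching>` changes the behavior without touching the loop, which is the kind of reuse the design aims for.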
All precompiled binaries for Linux and macOS can be downloaded from this GitHub repository.
hungptit@hungptit ~/working/ioutils/command $ ./fast-find ../src/
../src/fdwriter.hpp
../src/reader.hpp
../src/fast-locate.hpp
../src/utilities.hpp
../src/read_policies.hpp
../src/search.hpp
../src/linestats.hpp
../src/search_policies.hpp
../src/filedb.hpp
../src/filesystem.hpp
../src/tbb_reader.hpp
../src/threaded_reader.hpp
../src/boost_memmap.hpp
../src/ioutils.hpp
../src/search_regex.hpp
../src/temporary_dir.hpp
../src/regex_policies.hpp
hungptit@hungptit ~/working/ioutils/command $ ./fast-find ../src/ -e "search\wre.*"
../src/search_regex.hpp
Note that fast-find can accept multiple paths. Below is an example that searches for a source file in both the boost and Linux kernel source trees:
hungptit@hungptit ~/working/ioutils/benchmark $ time fast-find -e '/\w+options.c(p)*$' ../../3p/src/boost/ /usr/src/linux-4.17.1-gentoo/
/usr/src/linux-4.17.1-gentoo/net/ipv4/ip_options.c
/usr/src/linux-4.17.1-gentoo/drivers/net/bonding/bond_options.c
/usr/src/linux-4.17.1-gentoo/tools/perf/trace/beauty/waitid_options.c
../../3p/src/boost/libs/program_options/src/positional_options.cpp
real 0m0.187s
user 0m0.064s
sys 0m0.121s
Note: fast-find supports case-insensitive matching via the ignore-case flag.
hdang ~/w/i/commands> fast-find ../ --maxdepth 1 -e '[.]cpp$'
../benchmark/read_data.cpp
../benchmark/fast-find.cpp
../benchmark/file_read.cpp
../benchmark/fileio.cpp
../benchmark/filesystem.cpp
../benchmark/test_read.cpp
../benchmark/locate_benchmark.cpp
../unittests/tempdir.cpp
../unittests/read_data.cpp
../unittests/deserializers.cpp
../unittests/utilities.cpp
../unittests/read_stdin.cpp
../unittests/search.cpp
../unittests/writer.cpp
../commands/fast-updatedb.cpp
../commands/mwc.cpp
../commands/fast-find.cpp
../commands/fast-locate.cpp
../commands/linestats.cpp
fast-find also lets users search for files that do not match a given pattern by turning on the inverse-match flag. Below is an example:
hdang@dev115 ~/w/i/command> ./fast-find . -e '(o|bin|cmake|make|txt|internal|includecache|tmp|out)$|cache|CMakeFiles' -u
./linestats.cpp
./linestats
./compile_commands.json
./Makefile
./fast-find
./fast-updatedb
./fast-find.cpp
./fast-locate
./fast-locate.cpp
./fast-updatedb.cpp
./.database
./foo.bi
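The inverse-match behavior shown above can be sketched in a few lines. This is an assumption-laden illustration, not fast-find's code: `select_paths` is a hypothetical helper and `std::regex` replaces the real regex engine. A path is kept when its match result differs from the `invert` flag.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of ioutils): keep the paths that match
// `pattern`, or, when `invert` is true, the paths that do NOT match it.
std::vector<std::string> select_paths(const std::vector<std::string> &paths,
                                      const std::string &pattern, bool invert) {
    const std::regex re(pattern);
    std::vector<std::string> kept;
    for (const auto &path : paths) {
        const bool matched = std::regex_search(path, re);
        if (matched != invert) kept.push_back(path); // flip the decision when inverting
    }
    return kept;
}
```

With the pattern from the example, a path such as `./linestats.cpp` or `./Makefile` is kept when `invert` is true, while `./linestats.o` is dropped because it ends in `o`.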
Before using the fast-locate command, we need to build the file information database for the folders we are interested in. The command below builds the file information database for the boost, hyperscan, tbb, rocksdb, and seastar packages.
fast-updatedb boost/ hyperscan/ tbb/ rocksdb/ seastar/ -v
Assuming we have already built the file information database with the fast-updatedb command, we can use fast-locate to look for files that match a desired pattern.
This example searches for files with h and hh extensions whose names contain future:
fast-locate '/\wfuture.(h|hh)$'
Or we can display all files in our database by running fast-locate without a pattern:
fast-locate
- Where are the benchmark results? This page has performance benchmark results and analysis for the GNU find, fd, and fast-find commands.
- How can I download precompiled binaries? Portable binaries for fast-find, fast-updatedb, and fast-locate can be found here.
- Is the output of fast-find the same as that of GNU find?
We tried our best to make fast-find's output identical to that of GNU find so that it can be used as a drop-in replacement. Below are the correctness test results obtained with our verification script on a large folder containing 377250 files and folders. You can run the script on a folder of your choice to verify that fast-find works correctly in most situations. Note that neither fast-find nor fd outputs the input folder path itself, which is the single differing line in each diff below.
MacOS:unittests hdang$ ./verify.sh ~/working
Find all files using GNU find
Find all files using fast-find
Find all files using fd
==== Verify the output of fast-find ====
1d0
< /Users/hdang/working
==== Verify the output of fd ====
1d0
< /Users/hdang/working
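The comparison behind this report can be expressed as a set difference between two sorted listings. The verification script itself is not shown here, so the sketch below only assumes that it sorts and diffs the outputs; `missing_lines` is a hypothetical helper, not something shipped with ioutils.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: return the lines that appear in `reference`
// (e.g. the GNU find listing) but not in `candidate` (e.g. the fast-find listing).
std::vector<std::string> missing_lines(std::vector<std::string> reference,
                                       std::vector<std::string> candidate) {
    // std::set_difference requires both ranges to be sorted.
    std::sort(reference.begin(), reference.end());
    std::sort(candidate.begin(), candidate.end());
    std::vector<std::string> missing;
    std::set_difference(reference.begin(), reference.end(), candidate.begin(),
                        candidate.end(), std::back_inserter(missing));
    return missing;
}
```

Applied to listings like the ones above, the only missing line would be the input folder itself, and calling the helper with the arguments swapped would return an empty list.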