sp

sp (Search and Print) is a basic grep/ripgrep-like tool for finding patterns or words in files.

The main idea behind sp was to create a CLI whose feature set lies somewhere between plain substring matching and a full regex search. In its current state, sp is best seen as a limited extension of substring search.

Options

USAGE:
    sp [OPTIONS] <PATTERN> <PATH>

ARGS:
    <PATTERN>    A pattern used for matching
    <PATH>       A file to search

OPTIONS:
    -c, --count              Suppress normal output and show number of matching lines
    -e, --ends-with          Only show matches containing fields ending with PATTERN
    -h, --help               Prints help information
    -i, --ignore-case        Case insensitive search
    -m, --max-count <NUM>    Limit number of shown matches
    -n, --no-line-number     Do not show line number which is enabled by default
    -s, --starts-with        Only show matches containing fields starting with PATTERN
    -V, --version            Prints version information
    -w, --words              Whole words search (i.e. non-word characters are stripped)

Fields are strings separated by contiguous whitespace (as defined by Unicode).
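To make the notion of a field concrete, here is a minimal sketch of how --starts-with and --ends-with style matching could work. It is not taken from sp's source: the function names are hypothetical, and it assumes that str::split_whitespace from Rust's standard library (which splits on Unicode whitespace) matches the field definition above.

```rust
/// Hypothetical helper: true if any whitespace-separated field of `line`
/// starts with `pattern`. Not sp's actual implementation.
fn any_field_starts_with(line: &str, pattern: &str) -> bool {
    // split_whitespace treats runs of Unicode whitespace as one separator.
    line.split_whitespace().any(|field| field.starts_with(pattern))
}

/// Hypothetical helper: true if any whitespace-separated field of `line`
/// ends with `pattern`.
fn any_field_ends_with(line: &str, pattern: &str) -> bool {
    line.split_whitespace().any(|field| field.ends_with(pattern))
}

fn main() {
    // A tab and an EM SPACE (U+2003) both count as field separators.
    let line = "foo\tbarbaz\u{2003}qux";
    println!("starts with 'bar': {}", any_field_starts_with(line, "bar"));
    println!("ends with 'baz':   {}", any_field_ends_with(line, "baz"));
    println!("starts with 'baz': {}", any_field_starts_with(line, "baz"));
}
```

Under this reading, a line matches as soon as a single field satisfies the condition, which is what "matches containing fields starting with PATTERN" suggests.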

Building

This is a Rust project, so first make sure that Rust is installed on your machine.

To build this project:

$ git clone https://github.com/streof/sp
$ cd sp
$ cargo build --release
$ ./target/release/sp --version
sp 0.1.2

Running tests

sp includes unit and integration tests. Run them as follows:

$ cargo test

Considerations

The main goal of this project was to reimplement a small number of grep/ripgrep-like features. Performance was not a strict requirement, although in my opinion CLIs should at least be perceived as fast by their users. Performance is a trade-off, and for sp it depends on things like:

memory allocation
cpu utilization
heuristics
algorithm implementation (e.g. searching, encoding/decoding)
number of system calls

Here are some thoughts from the exploration phase:

A simple way to reduce memory allocation is to use an I/O buffer. Rust's standard library provides, for example, the very convenient lines API, but also the lower-level read_line and read_until methods. Initially, this project used read_line, but then I read this reddit thread where linereader was mentioned. I ended up using bstr, which offers a good balance between a rich, ergonomic API and performance (see this commit).
Counts in sp rely on a very naive implementation that does not take advantage of modern CPU capabilities (see, for example, bytecount).
The current matching algorithm relies on high-level APIs exposed by bstr. However, I also ran some simple benchmarks, which suggested that switching to twoway would give a significant performance boost (>2x speedup). Rust uses the Two-Way algorithm for things like pattern matching, although its implementation differs from the one provided by the twoway crate.
In some cases the number of read syscalls used by sp is significantly higher than when using ripgrep.
Ripgrep uses encoding_rs for fast encoding/decoding.
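The buffer-reuse and naive-counting points above can be illustrated with the standard library alone. The sketch below is not sp's code (sp uses bstr). It assumes a plain byte-substring match and uses hypothetical function names. It reads each line with read_until into a single reused Vec, so no new allocation is needed per line once the buffer has grown to fit the longest line.

```rust
use std::io::{self, BufRead, Cursor};

/// Naive byte-substring check (illustrative only; no SIMD, no Two-Way).
fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    // Guard the empty needle first: slice::windows(0) would panic.
    needle.is_empty() || haystack.windows(needle.len()).any(|w| w == needle)
}

/// Counts the lines of `reader` that contain `pattern`, reusing one buffer.
fn count_matching_lines<R: BufRead>(mut reader: R, pattern: &[u8]) -> io::Result<usize> {
    let mut buf = Vec::new();
    let mut count = 0;
    loop {
        // clear() keeps the allocated capacity, so the buffer is reused.
        buf.clear();
        if reader.read_until(b'\n', &mut buf)? == 0 {
            break; // end of input
        }
        if contains(&buf, pattern) {
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // An in-memory reader stands in for a buffered file here.
    let input = Cursor::new("alpha\nbeta\nalphabet\n");
    let matches = count_matching_lines(input, b"alpha")?;
    println!("matching lines: {matches}");
    Ok(())
}
```

For a real file, the reader would typically be a std::io::BufReader wrapped around std::fs::File. The BufReader's capacity determines how many read syscalls are issued, which is one of the trade-offs mentioned above.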
