feat(simd): use simd-json for deserialization #216

Open: wants to merge 6 commits into base: main
9 changes: 9 additions & 0 deletions .cargo/config
@@ -0,0 +1,9 @@
[build]
rustflags = ["-C", "target-cpu=native"]

[target.wasm32-unknown-unknown]
rustflags = ["-C", "target-feature=+simd128"]

[target.wasm32-wasi]
rustflags = ["-C", "target-feature=+simd128"]

186 changes: 160 additions & 26 deletions Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -24,3 +24,4 @@ codegen-units = 1
lto = true
opt-level = 'z'
panic = 'abort'

2 changes: 1 addition & 1 deletion README.md
@@ -13,7 +13,7 @@ Pronounce it as **jackal** 🐺.

## 📜 Philosophy

- ⚡Be fast
- ⚡Be fast (**SIMD** + **parallelism**)
- 🪶 Stay lightweight
- 🎮 Keep its features as simple as possible
- 🧠 Avoid redundancy
4 changes: 2 additions & 2 deletions crates/jql/Cargo.toml
@@ -16,8 +16,8 @@ anyhow = "1.0.70"
clap = { version = "4.2.4", features = ["derive"] }
colored_json = { version = "3.1.0" }
jql-runner = { path = "../jql-runner", version = "6.0.6" }
-serde = "1.0.160"
-serde_stacker = "0.1.8"
+mimalloc = "0.1.36"
serde_json = { workspace = true }
+simd-json = "0.9.0"

Didn't see it mentioned in the PR; how much did this improve performance?

Owner Author

That's the missing piece, not tested yet :)

Owner Author

@yamafaktory Apr 25, 2023

I made a small hyperfine shell script comparing the current binary from main and a release build from this branch.

There's definitely some improvement. Ideally this should be tested against a way bigger JSON file.

Benchmark 1: cat /home/yamafaktory/dev/jql/assets/github-repositories.json | jql '|>{"name","full_name"}' > /dev/null
  Time (mean ± σ):     597.2 ms ±  12.1 ms    [User: 880.7 ms, System: 165.1 ms]
  Range (min … max):   569.9 ms … 629.4 ms    100 runs
 
Benchmark 2: cat /home/yamafaktory/dev/jql/assets/github-repositories.json | /home/yamafaktory/dev/jql/target/release/jql '|>{"name","full_name"}' > /dev/null
  Time (mean ± σ):     555.8 ms ±  13.9 ms    [User: 1001.4 ms, System: 130.3 ms]
  Range (min … max):   531.7 ms … 648.2 ms    100 runs
 
Summary
  'cat /home/yamafaktory/dev/jql/assets/github-repositories.json | /home/yamafaktory/dev/jql/target/release/jql '|>{"name","full_name"}' > /dev/null' ran
    1.07 ± 0.03 times faster than 'cat /home/yamafaktory/dev/jql/assets/github-repositories.json | jql '|>{"name","full_name"}' > /dev/null'
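
As a sanity check on the summary line, the reported `1.07 ± 0.03` can be reproduced from the two means and standard deviations above. The sketch below is only an illustration: the `Timing` struct and `speedup` function are hypothetical names, and the uncertainty assumes the two runs are independent and uses first-order error propagation for a quotient, which happens to agree with the figure printed here.

```rust
/// Mean and standard deviation of a benchmark's wall-clock time, in milliseconds.
/// (Hypothetical helper type, not part of jql or hyperfine.)
#[derive(Debug, Clone, Copy)]
struct Timing {
    mean_ms: f64,
    stddev_ms: f64,
}

/// Returns `(ratio, uncertainty)` describing how many times faster
/// `candidate` is than `baseline`.
///
/// Assumption: the two measurements are independent, so the relative
/// errors add in quadrature (first-order propagation for a quotient).
fn speedup(baseline: Timing, candidate: Timing) -> (f64, f64) {
    let ratio = baseline.mean_ms / candidate.mean_ms;
    let rel_baseline = baseline.stddev_ms / baseline.mean_ms;
    let rel_candidate = candidate.stddev_ms / candidate.mean_ms;
    let uncertainty = ratio * (rel_baseline.powi(2) + rel_candidate.powi(2)).sqrt();
    (ratio, uncertainty)
}
```

Calling `speedup` with `Timing { mean_ms: 597.2, stddev_ms: 12.1 }` as the baseline (the installed `jql`) and `Timing { mean_ms: 555.8, stddev_ms: 13.9 }` as the candidate (the release build from this branch) gives a ratio that rounds to 1.07 and an uncertainty that rounds to 0.03. The gain is therefore roughly two standard deviations above parity on this input, which fits the remark that a much larger JSON file would make a better test.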

tokio = { version = "1.27.0", features = ["fs", "io-std", "io-util", "macros", "rt-multi-thread"] }

29 changes: 14 additions & 15 deletions crates/jql/src/main.rs
@@ -25,10 +25,10 @@ use colored_json::{
PrettyFormatter,
};
use jql_runner::runner;
+use mimalloc::MiMalloc;
use panic::use_custom_panic_hook;
-use serde::Deserialize;
use serde_json::Value;
-use serde_stacker::Deserializer;
+use simd_json::serde;
use tokio::{
fs::File,
io::{
@@ -41,8 +41,11 @@ use tokio::{
},
};

+#[global_allocator]
+static GLOBAL: MiMalloc = MiMalloc;
+
/// Reads a file from `path`.
-async fn read_file(path: impl AsRef<Path>) -> Result<String> {
+async fn read_file(path: impl AsRef<Path>) -> Result<Vec<u8>> {
let display_path = path.as_ref().display();
let mut file = File::open(&path)
.await
@@ -53,7 +56,7 @@ async fn read_file(path: impl AsRef<Path>) -> Result<String> {
.await
.with_context(|| format!("Failed to read from file {display_path}"))?;

-Ok(String::from_utf8_lossy(&contents).into_owned())
+Ok(contents)
}

/// Renders the output or the error and exits.
@@ -68,27 +71,23 @@ fn render(result: Result<String>) {
}

/// Processes the JSON content based on the arguments.
-async fn process_json(json: &str, args: &Args) -> Result<String> {
+async fn process_json(json: &[u8], args: &Args) -> Result<String> {
if args.validate {
-return serde_json::from_str::<Value>(json).map_or_else(
+return serde::from_slice::<Value>(&mut json.to_vec()).map_or_else(
|_| Err(anyhow!("Invalid JSON file or content")),
|_| Ok("Valid JSON file or content".to_string()),
);
}

let query = match args.query_from_file.as_deref() {
-Some(path) => read_file(path).await?,
+Some(path) => String::from_utf8_lossy(&read_file(path).await?).into_owned(),
// We can safely unwrap since clap is taking care of the validation.
None => args.query.as_deref().unwrap().to_string(),
};

-let mut deserializer = serde_json::Deserializer::from_str(json);
-
-deserializer.disable_recursion_limit();
-
-let deserializer = Deserializer::new(&mut deserializer);
-let value: Value = Value::deserialize(deserializer)
+let value = serde::from_slice::<Value>(&mut json.to_vec())
.with_context(|| format!("Failed to deserialize the JSON data"))?;

let result: Value = runner::raw(&query, &value)?;

if args.inline {
@@ -132,7 +131,7 @@ async fn main() -> Result<()> {
.await
.with_context(|| format!("Failed to read stream"))?
{
-render(process_json(&line, &args).await);
+render(process_json(line.as_bytes(), &args).await);

stdout
.flush()
@@ -157,7 +156,7 @@ async fn main() -> Result<()> {
let lines = String::from_utf8(buffer)
.with_context(|| format!("Failed to convert piped content from stdin"))?;

-render(process_json(&lines, &args).await);
+render(process_json(lines.as_bytes(), &args).await);

Ok(())
}
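
One behavioural detail in this diff is that `read_file` no longer runs its contents through `String::from_utf8_lossy`; it returns the raw bytes, and only the query-file path re-applies the lossy conversion at the call site. The standalone sketch below uses only the standard library to show what that conversion does to invalid input. `decode_lossy` and `decode_strict` are hypothetical helper names, and how simd-json itself reacts to invalid bytes is not shown here.

```rust
/// Decodes bytes the way the previous `read_file` did:
/// every invalid UTF-8 sequence is replaced with U+FFFD.
/// (Hypothetical helper, for illustration only.)
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Decodes bytes strictly: invalid UTF-8 is reported as an error
/// instead of being silently replaced.
/// (Hypothetical helper, for illustration only.)
fn decode_strict(bytes: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(bytes)
}
```

For valid input such as `br#"{"name":"jql"}"#` both helpers return the same text. For a buffer containing a stray `0xFF` byte, `decode_lossy` yields a string containing the replacement character, while `decode_strict` returns an error. With raw bytes handed straight to the deserializer, the second kind of input is no longer rewritten before parsing.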