Releases: rsonquery/rsonpath
v0.8.2
[0.8.2] - 2023-09-23
Performance
- Improved handling of the root-only query `$`. (#160)
  - Full nodes result when asking for root: 2 times throughput increase.
  - Indices/count result when asking for root: basically unboundedly faster,
    no longer looks at the entire document.
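To illustrate why the indices/count case does not need the whole document: for the query `$` the only possible match is the root, so finding the first non-whitespace byte is enough. The sketch below is an assumption about the idea, not the engine's actual code; `root_start` and `root_count` are hypothetical names.

```rust
/// Index of the first byte of the root value, if the document has one.
/// Stops at the first non-whitespace byte instead of scanning the whole input.
fn root_start(input: &[u8]) -> Option<usize> {
    input
        .iter()
        .position(|&b| !matches!(b, b' ' | b'\t' | b'\n' | b'\r'))
}

/// Number of matches for the root-only query `$`:
/// one if a root value exists, zero for an empty or whitespace-only input.
fn root_count(input: &[u8]) -> u64 {
    u64::from(root_start(input).is_some())
}
```

The cost depends only on the amount of leading whitespace, not on the document size.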
Documentation
- Clarified the `approximate_spans` guarantees.
  - Now documentation mentions that the returned `MatchSpan`s can potentially
    have their end indices farther than one would expect the input to logically end,
    due to internal padding.
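A caller that slices the original input with an approximate span may want to clamp the end index first. A minimal sketch, assuming the span is available as plain `start` and exclusive `approx_end` byte offsets; `span_bytes` is a hypothetical helper, not part of the library:

```rust
/// Returns the bytes covered by an approximate span, clamping the end
/// to the real input length in case it points into internal padding.
fn span_bytes(input: &[u8], start: usize, approx_end: usize) -> &[u8] {
    let end = approx_end.min(input.len());
    // Guard against a start past the clamped end, so slicing cannot panic.
    let start = start.min(end);
    &input[start..end]
}
```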
Bug fixes
- Fixed handling of the root-only query `$` on atomic documents. (#160)
  - Previously only object and array roots were supported.
- Fixed a bug when head-skipping to a single-byte key would panic. (#281)
  - This was detected by fuzzing!
  - The queries `$..["{"]` and `$..["["]` would panic
    on inputs starting with the bytes `{"` or `["`, respectively.
- Fixed a bug where disabling the `simd` feature would not actually
  disable SIMD acceleration.
Reliability
- Made the ClusterFuzzLite batch workflow automatically create an issue
on failure to make sure the maintainers are notified.
v0.8.1
[0.8.1] - 2023-09-20
Features
- [breaking] Refactored the [`Match`]/[`MatchSpan`] types.
  - [`Match`] now takes 32 bytes, down from 40.
  - All fields are now private, accessible via associated functions.
  - Added the `len` function to [`MatchSpan`].
- Added `approximate_spans` result mode. (#242)
  - Engine can return an approximate span of the match,
    where "approximate" means the start index is correct,
    but the end index might include trailing whitespace after the match.
  - This mode is much faster than full `matches`, close to the performance
    of `count`, especially for large result sets.
  - This is a library-only feature.
- Library exposes a new optional feature, `arbitrary`.
  - When enabled, includes `arbitrary` as a dependency and provides an `Arbitrary`
    impl for `JsonPathQuery`, `JsonString`, and `NonNegativeArrayIndex`.
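Since an approximate span has a correct start but an end that may run over trailing whitespace, a caller can recover the exact end cheaply by walking backwards. A minimal sketch, assuming plain byte offsets with an exclusive end and `start <= input.len()`; `tighten_span` is a hypothetical helper, not a library function:

```rust
/// Shrinks an approximate span so that it no longer includes
/// trailing JSON whitespace. Offsets are bytes; the end is exclusive.
fn tighten_span(input: &[u8], start: usize, approx_end: usize) -> (usize, usize) {
    // Never look past the input, and never let the end fall before the start.
    let mut end = approx_end.min(input.len()).max(start);
    while end > start && matches!(input[end - 1], b' ' | b'\t' | b'\n' | b'\r') {
        end -= 1;
    }
    (start, end)
}
```

The extra work is proportional to the amount of trailing whitespace only.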
Bug fixes
- Fixed a bug when memmem acceleration would fail for empty keys.
  - This was detected by fuzzing! The query `$..[""]` would panic
    on certain inputs due to invalid indexing.
- Fixed a panic when parsing invalid queries with wide UTF-8 characters.
  - This was detected by fuzzing! Parsing a query with invalid syntax
    caused by a longer-than-byte UTF-8 character would panic when
    the error handler tried to resume parsing from the next byte
    instead of respecting char boundaries.
- Fixed a panic caused by node results in invalid JSON documents.
  - This was detected by fuzzing! Invalid JSON documents could
    cause the NodeRecorder to panic if the apparent match span
    was of length 1.
- Fixed erroneous match span end reporting. (#247)
  - Fixed a bug where `MatchSpan` values given by the engine were
    almost always invalid.
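The wide-character parser panic above comes down to a general Rust rule: slicing a `&str` at an index that is not a char boundary panics. A minimal sketch of resuming after an error while respecting boundaries, using only `str::is_char_boundary`; the helper names are hypothetical and this is not the parser's actual code:

```rust
/// Returns the smallest char boundary in `s` that is strictly greater than `idx`,
/// capped at `s.len()`.
fn next_char_boundary(s: &str, idx: usize) -> usize {
    let mut next = idx + 1;
    while next < s.len() && !s.is_char_boundary(next) {
        next += 1;
    }
    next.min(s.len())
}

/// The rest of the query after the character that starts at `error_at`.
/// Slicing at `error_at + 1` directly would panic for multi-byte characters.
fn resume_after(s: &str, error_at: usize) -> &str {
    &s[next_char_boundary(s, error_at)..]
}
```

For the query text `$.ł.a`, where `ł` occupies bytes 2 and 3, resuming after an error at byte 2 continues from byte 4 rather than from the middle of the character.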
Reliability
- Fuzzing integration with libfuzzer and ClusterFuzzLite.
  - `cargo-fuzz` can be used
    to fuzz the project with libfuzzer. Currently we have three fuzzing targets,
    one for stressing the query parser, one for stressing the engine with arbitrary
    bytes, and one stressing the engine with structure-aware queries and JSONs.
  - Fuzzing is now enabled on every PR. Using ClusterFuzzLite
    we will fuzz the project every day on a cron schedule
    to establish a corpus.
- Added correctness tests for match spans reporting (#247)
Dependencies
- Bump clap from 4.4.2 to 4.4.4.
- Bump vergen from 8.2.4 to 8.2.5.
v0.8.0
[0.8.0] - 2023-09-10
Features
- Portable binaries. (#231)
  - SIMD capabilities are now discovered at runtime,
    allowing us to distribute one binary per target.
  - Requirements for SIMD are now more granular,
    allowing weaker CPUs to still get some of the acceleration:
    - Base SIMD is either SSE2, SSSE3, or AVX2.
    - Structural classification works on SSSE3 and above.
    - Quote classification works if `pclmulqdq` is available.
    - Depth classification works if `popcnt` is available.
  - To counteract the increased binary size debug info is no longer
    included in distributed binaries.
  - Codegen for distributed binaries is improved with fat LTO and setting
    codegen units to 1.
  - SIMD capabilities are listed with `rq --version`.
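The granular requirements listed above can be sketched with the standard library's runtime detection macro. This is only an illustration: the type and function names are hypothetical, and requiring some base SIMD level for quote and depth classification is an assumption, not something the notes state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BaseSimd { NoSimd, Sse2, Ssse3, Avx2 }

/// CPU features relevant to the rules above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CpuFlags { sse2: bool, ssse3: bool, avx2: bool, pclmulqdq: bool, popcnt: bool }

/// Which accelerated components may be enabled (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SimdPlan { base: BaseSimd, structural: bool, quotes: bool, depth: bool }

/// Pure decision logic, kept separate from detection so it can be tested.
fn plan(flags: CpuFlags) -> SimdPlan {
    let base = if flags.avx2 {
        BaseSimd::Avx2
    } else if flags.ssse3 {
        BaseSimd::Ssse3
    } else if flags.sse2 {
        BaseSimd::Sse2
    } else {
        BaseSimd::NoSimd
    };
    // Assumption: the optional classifiers also need some base SIMD level.
    let any = base != BaseSimd::NoSimd;
    SimdPlan {
        base,
        structural: matches!(base, BaseSimd::Ssse3 | BaseSimd::Avx2),
        quotes: any && flags.pclmulqdq,
        depth: any && flags.popcnt,
    }
}

/// Runtime detection on x86 targets.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detect_flags() -> CpuFlags {
    CpuFlags {
        sse2: std::is_x86_feature_detected!("sse2"),
        ssse3: std::is_x86_feature_detected!("ssse3"),
        avx2: std::is_x86_feature_detected!("avx2"),
        pclmulqdq: std::is_x86_feature_detected!("pclmulqdq"),
        popcnt: std::is_x86_feature_detected!("popcnt"),
    }
}

/// Other architectures report no x86 features in this sketch.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detect_flags() -> CpuFlags {
    CpuFlags { sse2: false, ssse3: false, avx2: false, pclmulqdq: false, popcnt: false }
}
```

Calling `plan(detect_flags())` once at startup yields a single value that a portable binary could consult when choosing code paths.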
Reliability
- Change clippy to auguwu/clippy-action
  - The "official" action has not been maintained for 3 years.
    This one is actively maintained (thanks Noel!).
v0.7.1
v0.7.0
Thanks to @darrenboulton for the SSSE3 SIMD support and modularity. This is a huge step towards better SIMD support for `rq`. 🎉
[0.7.0] - 2023-09-02
Features
- Added 32-bit and SSSE3 SIMD support.
- Refactored all SIMD code to enable modularity and more target feature types.
- Building for x86 now chooses one of four SIMD implementations:
  - AVX2 64-bit
  - AVX2 32-bit
  - SSSE3 64-bit
  - SSSE3 32-bit
- These are also now distributed as separate binaries.
Reliability
- Fine-grained action permissions.
  - Actions now use explicit, lowest possible permissions for all jobs.
- Add SLSA3 provenance to the release pipeline.
  - Future releases will include cryptographically signed provenance for all binaries.
    See: https://slsa.dev/spec/v1.0/about
- StepSecurity Apply security best practices.
  - All CI uses hash-pinned dependencies now.
  - Run the OSSF Scorecard check on each PR.
  - Add Dependency review.
- Removed test-codegen deps from `Cargo.lock`
  - By removing the codegen crate from the workspace their deps
    are now separated and don't pollute the lock of the actual end product.
- `cargo-deny` now runs with the CI to keep tabs on our deps.
  - Configured to reject Medium+ CVEs and non-compatible licenses.
Dependencies
- Bump clap from 4.3.19 to 4.4.2.
- Bump log from 0.4.19 to 0.4.20.
- Bump thiserror from 1.0.44 to 1.0.47.
- Bump trycmd from 0.14.16 to 0.14.17.
- Removed `memchr` as a dependency.
  - It was no longer needed after the custom `memmem` classifier
    introduced in v0.6.0.
- Removed `replace_with` as a dependency.
  - That code path was refactored earlier, dep was now unused.
Documentation
- Added the OpenSSF badge.
- We will be trying to achieve the Passing level before v1.0.0.
- Added the scorecard badge.
v0.6.1
We have a book! Find it on GitHub Pages.
[0.6.1] - 2023-08-07
Features
- [breaking] Remove the `unique-members` feature.
  - This clutters the API more than anything.
    If supporting duplicate keys is required in the future,
    it can be easily added as a `const` config option,
    not a compilation feature.
- Add the `--json` CLI option for passing JSONs inline.
Reliability
- Added snapshot tests for `rq` using `trycmd`.
  - This is another layer of E2E tests, makes sure documentation examples
    in the book are correct, and that our `--help` and `--version` outputs
    remain consistent.
Documentation
- We have a book!
  - The first part is a usage guide for `rq`, and contains a short
    JSONPath reference.
  - Other parts will follow, with a plan to finalize at least the library
    usage guide before 1.0.0.
v0.6.0
`rq` is officially useful! 🎉
Thanks to @charles-paperman for the `memmem` contribution without which #56 would be a big perf regression!
[0.6.0] - 2023-08-02
Features
- [breaking] Full match result mode. (#56)
  This includes a revamp of all the internals that would be too long to describe in the log.
  In short:
  - `memmem` was rewritten to a custom implementation (courtesy of @charles-paperman)
  - Each of the result modes has a separate `Recorder` that takes care of producing the results
  - The results are written to a `Sink`, provided by the user; this might be a `Vec`, the stdout,
    or some other `io::Write` implementation.
  - Matches contain the full byte span of the value matched.
  - A lot of `Input` and classifier APIs have massive breaking changes to accommodate this.
- [breaking] Removed the Recursive engine.
  - The Recursive implementation has outlived its usefulness.
    Over time it became a near-duplicate of Main,
    which was manifested by a need to implement
    the same features twice with the exact same code
    and to refactor/fix bugs with exact same code changes
    but in two different files. We will focus efforts on the Main engine.
    The `--engine` CLI option was disabled, as there is only one engine now.
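The idea of a user-provided sink that can be a `Vec` or any `io::Write` can be sketched as follows. The trait `ResultSink`, the wrapper `LineSink`, and the function `record_all` are hypothetical names invented for this illustration; the library's real `Sink` and `Recorder` definitions may look different.

```rust
use std::convert::Infallible;
use std::fmt::Display;
use std::io::{self, Write};

/// A destination for results (hypothetical shape of the concept).
trait ResultSink<T> {
    type Error;
    fn add(&mut self, item: T) -> Result<(), Self::Error>;
}

/// Collecting into a vector can never fail.
impl<T> ResultSink<T> for Vec<T> {
    type Error = Infallible;
    fn add(&mut self, item: T) -> Result<(), Infallible> {
        self.push(item);
        Ok(())
    }
}

/// Adapter that prints each result on its own line to any writer,
/// for example `io::stdout()` or an in-memory buffer.
struct LineSink<W: Write>(W);

impl<W: Write, T: Display> ResultSink<T> for LineSink<W> {
    type Error = io::Error;
    fn add(&mut self, item: T) -> io::Result<()> {
        writeln!(self.0, "{}", item)
    }
}

/// Stand-in for a recorder: pushes every match start into the sink.
fn record_all<S: ResultSink<usize>>(sink: &mut S, starts: &[usize]) -> Result<(), S::Error> {
    for &start in starts {
        sink.add(start)?;
    }
    Ok(())
}
```

With this shape the producer is written once against the trait, and the caller decides whether results are kept in memory or streamed out.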
Reliability
- QoL improvement by separating the test gen crate.
  - This removes the confusing `gen-tests` feature from lib,
    reduces its build dependencies, should improve
    build times.
Dependencies
- Bump clap from 4.3.10 to 4.3.19.
- Bump colored (dependency of simple_logger) from 2.0.0 to 2.0.4.
- This removes a transitive dependency on atty with a CVE.
- Bump rustflags from 0.1.3 to 0.1.4.
- Bump smallvec from 1.10.0 to 1.11.0.
- Bump thiserror from 1.0.40 to 1.0.44.
v0.5.1
Thanks to @firewall2142, @azozello, and @Pasifaee for their contributions, which I have been long overdue to merge!
[0.5.1] - 2023-07-03
Features
- Consistent index result output. (#161)
  - The `--result bytes` mode now consistently reports the first byte of the value it matched. This can be used to extract the actual value from the JSON by parsing from the reported byte.
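Extracting a value from a reported byte offset can be sketched with a small scanner. This is a minimal illustration using only the standard library and assuming the input is valid JSON; `value_at` and `string_end` are hypothetical helpers, not part of `rq` or the library.

```rust
/// Returns the bytes of the JSON value whose first byte is at `start`.
/// Assumes valid JSON; returns `None` if the input ends unexpectedly.
fn value_at(input: &[u8], start: usize) -> Option<&[u8]> {
    let end = match *input.get(start)? {
        b'"' => string_end(input, start)?,
        b'{' | b'[' => {
            // Track nesting depth, skipping over strings so that
            // brackets inside them are not counted.
            let mut depth = 0usize;
            let mut i = start;
            loop {
                match *input.get(i)? {
                    b'"' => {
                        i = string_end(input, i)?;
                        continue;
                    }
                    b'{' | b'[' => depth += 1,
                    b'}' | b']' => {
                        depth -= 1;
                        if depth == 0 {
                            break i + 1;
                        }
                    }
                    _ => {}
                }
                i += 1;
            }
        }
        _ => {
            // Number, `true`, `false`, or `null`: ends at the next delimiter.
            let rest = &input[start..];
            let len = rest
                .iter()
                .position(|&b| matches!(b, b',' | b'}' | b']' | b' ' | b'\t' | b'\n' | b'\r'))
                .unwrap_or(rest.len());
            start + len
        }
    };
    Some(&input[start..end])
}

/// Index one past the closing quote of the string that opens at `open`.
fn string_end(input: &[u8], open: usize) -> Option<usize> {
    let mut i = open + 1;
    loop {
        match *input.get(i)? {
            b'\\' => i += 2, // skip the escaped character
            b'"' => return Some(i + 1),
            _ => i += 1,
        }
    }
}
```

In practice, handing `&input[start..]` to a streaming JSON parser achieves the same thing; the sketch only shows that the start offset alone is sufficient.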
Bug Fixes
- Remove SHA from `--version` on crates.io. (#157)
  - The commit SHA part was incorrect, and there seems to be no way to get it when the crate is in the registry.
Library
- [breaking] Remove `tail-skip` and `head-skip` features.
  - These are now non-optional and integrated into the engines.
Reliability
- Generate strings in classifier tests. (#173, #20)
  - Improve classifier correctness tests by including quoted strings with escapes
    in the generated proptest cases.
- More tests for wildcard compilation.
  - Added more cases for compiling the NFA and minimizing
    for queries with wildcards.
- Automated declarative end-to-end engine tests. (#134)
  - Engine tests were rewritten to use declarative TOML configurations
    for ease of creating new tests, maintenance, and debugging.
    Test coverage was increased, since compressed variants of inputs are
    automatically generated and tested, and we now test all combinations
    of input-engine-result types.
Dependencies
- Bump addr2line from v0.19.0 to v0.20.0
- Bump anstyle from v1.0.0 to v1.0.1
- Bump anstyle-parse from v0.2.0 to v0.2.1
- Bump backtrace from v0.3.67 to v0.3.68
- Bump clap from 4.3.4 to 4.3.10.
- Bump gimli from v0.27.2 to v0.27.3
- Bump hashbrown from v0.12.3 to v0.14.0
- Bump indexmap from v1.9.3 to v2.0.0
- Bump is-terminal from v0.4.7 to v0.4.8
- Bump libc from v0.2.146 to v0.2.147
- Bump memmap2 from 0.7.0 to 0.7.1.
- Bump miniz_oxide from v0.6.2 to v0.7.1
- Bump object from v0.30.4 to v0.31.1
- Bump proc-macro2 from v1.0.60 to v1.0.63
- Bump quote from v1.0.28 to v1.0.29
- Bump serde_spanned from v0.6.2 to v0.6.3
- Bump syn from v2.0.18 to v2.0.22
- Bump toml from v0.7.4 to v0.7.5
- Bump toml_datetime from v0.6.2 to v0.6.3
- Bump toml_edit from v0.19.10 to v0.19.11
- Bump vergen from v8.2.1 to v8.2.3
- Bump windows-targets from v0.48.0 to v0.48.1
Documentation
- Rearrange readme to put usage first.
- Update bug report issue form.
- Changed the issue form to be more streamlined and use more polite language.
- Add MSRV to README.
v0.5.0
[0.5.0] - 2023-06-14
Thanks for the contributions to @zwerdlds, who delivered the index selector support for this release!
Features
- Rename bin to `rq` and lib to `rsonpath`. The crate names remain the same, just the output artifacts are different.
- Parser support for array index selector. (#60)
  - Parser now recognizes the array index selector with positive index values conforming to the I-JSON specification.
- Index selector engine support. (#132)
  - The automaton transition model has been changed to incorporate index-labelled transitions.
  - Both engines now support queries with the index selector.
- New `Input` API. (#23)
  - A more abstract API to access the underlying byte stream replacing the reliance of the engines on a direct `&[u8]` slice access, to allow adding buffered input streams (#23) in the future. Two types were added, `OwnedBytes` and `BorrowedBytes`, to support the current easy scenario of having the bytes already in memory.
- Add long version to CLI.
- Mmap support. (#23)
  - Added `MmapInput` which maps a file into memory on unix and windows.
- The CLI app now automatically decides which input to use, favoring mmap in most cases. This can be overridden with `--force-input`.
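The motivation for the input abstraction can be sketched as a trait that hides where the bytes live. Everything below is hypothetical and much simpler than the real `Input` API; `ByteInput`, `Owned`, and `Borrowed` are illustrative names only.

```rust
/// Minimal byte-access abstraction (hypothetical, for illustration).
trait ByteInput {
    fn len(&self) -> usize;
    fn byte_at(&self, idx: usize) -> Option<u8>;
}

/// Input that owns its buffer.
struct Owned(Vec<u8>);

/// Input that borrows a buffer held elsewhere.
struct Borrowed<'a>(&'a [u8]);

impl ByteInput for Owned {
    fn len(&self) -> usize { self.0.len() }
    fn byte_at(&self, idx: usize) -> Option<u8> { self.0.get(idx).copied() }
}

impl ByteInput for Borrowed<'_> {
    fn len(&self) -> usize { self.0.len() }
    fn byte_at(&self, idx: usize) -> Option<u8> { self.0.get(idx).copied() }
}

/// Code written against the trait works with any input kind,
/// including a future buffered or memory-mapped one.
fn count_byte<I: ByteInput>(input: &I, needle: u8) -> usize {
    (0..input.len())
        .filter(|&i| input.byte_at(i) == Some(needle))
        .count()
}
```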
Library
- Rename `Label` to `JsonString`. (#139, #131)
  - `query::Label` is now `query::JsonString`
  - The `unique-labels` feature is now `unique-members`
  - `EngineError::MalformedLabelQuotes` renamed to `EngineError::MalformedStringQuotes`
Reliability
- Proptests for parsing array indices queries. (#51)
Dependencies
- Bump clap from 4.1.11 to 4.3.4.
- Bump log from 0.4.17 to 0.4.19.
- Bump proptest from 1.1.0 to 1.2.0.
- Bump simple_logger from 4.1.0 to 4.2.0.
v0.4.0
[0.4.0] - 2023-04-20
Features
- Wildcard descendant support.
  - You can now use the `..*`/`..[*]` selector that selects all nodes in the document it acts upon.
- Switch `Structural` to `BracketType`. (#10)
  - The `Opening` and `Closing` variants now differentiate between curly
    and square brackets with a value of the `BracketType` enum.
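The `BracketType` change can be pictured roughly like this. The names `Structural`, `Opening`, `Closing`, and `BracketType` come from the notes above; the remaining variants, the index payload, and the `classify` function are assumptions made for the sake of a compilable sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BracketType {
    Square,
    Curly,
}

/// A structural character with its byte index (payload shape is assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Structural {
    Opening(BracketType, usize),
    Closing(BracketType, usize),
    Colon(usize),
    Comma(usize),
}

/// Scalar classification of a single byte (hypothetical helper).
fn classify(byte: u8, idx: usize) -> Option<Structural> {
    match byte {
        b'[' => Some(Structural::Opening(BracketType::Square, idx)),
        b'{' => Some(Structural::Opening(BracketType::Curly, idx)),
        b']' => Some(Structural::Closing(BracketType::Square, idx)),
        b'}' => Some(Structural::Closing(BracketType::Curly, idx)),
        b':' => Some(Structural::Colon(idx)),
        b',' => Some(Structural::Comma(idx)),
        _ => None,
    }
}
```

Carrying the bracket type in the variant lets consumers tell objects from arrays without looking the byte up in the input again.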
Bug fixes
- Fix parser incorrectly escaping labels.
  - Queries like `$['\'']` would cause a parsing error, even though they were valid (match a child with key equal to "`'`").
  - The `\u` escape sequence is no longer recognized, since without UTF-8 handling it was meaningless.
    See (#117).
- Empty query array behavior.
  - Running the query `$` on a document `[]` was giving zero results. Now correctly matches the root array.
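To make the escaping fix concrete, here is a minimal sketch of decoding the text between the quotes of a single-quoted member name such as the one in `$['\'']`. The set of accepted escapes (`\'` and `\\` only) and the choice to reject everything else, including `\u`, are assumptions for illustration; the real parser's rules may differ.

```rust
/// Decodes the contents of a single-quoted member name.
/// Returns `None` on a dangling backslash or an unsupported escape.
fn unescape_single_quoted(raw: &str) -> Option<String> {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // A backslash must be followed by a character we know how to decode.
        match chars.next()? {
            '\'' => out.push('\''),
            '\\' => out.push('\\'),
            _ => return None,
        }
    }
    Some(out)
}
```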
Documentation
- The grammar in top-level documentation now matches the implementation.
Reliability
- Added proptests for query parsing.
  - Currently checks that correct queries are parsed correctly.
    We still need tests for error conditions (see #51).