Releases: rsonquery/rsonpath

v0.8.2

23 Sep 13:50
8301eda

[0.8.2] - 2023-09-23

Performance

  • Improved handling of the root-only query $. (#160)
    • Full nodes result when asking for root: 2 times throughput increase.
    • Indices/count result when asking for root: the engine no longer looks at
      the entire document, so the speedup grows with the document's size.

Documentation

  • Clarified the approximate_spans guarantees.
    • The documentation now mentions that the returned MatchSpans can have
      end indices farther than where one would expect the input to logically end,
      due to internal padding.
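
A caller that needs tight spans could post-process the approximate ones. The sketch below is only an illustration: `tighten_span` is a hypothetical helper, not part of rsonpath, and it assumes the span is available as a plain pair of byte indices. It clamps the end to the real input length (to undo padding) and then trims trailing JSON whitespace.

```rust
/// Hypothetical helper, not part of rsonpath: clamps an approximate span
/// to the real input and trims trailing JSON whitespace from its end.
fn tighten_span(input: &[u8], start: usize, end: usize) -> (usize, usize) {
    // The reported end may lie past the input because of internal padding.
    let mut end = end.min(input.len());
    // Keep the span well-formed even for degenerate arguments.
    let start = start.min(end);
    // Walk the end back over whitespace that follows the matched value.
    while end > start && matches!(input[end - 1], b' ' | b'\t' | b'\n' | b'\r') {
        end -= 1;
    }
    (start, end)
}
```

For instance, calling it on the input `42 \n` with a reported span of 0..128 would yield 0..2, which covers exactly the bytes of `42`.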

Bug fixes

  • Fixed handling of the root-only query $ on atomic documents. (#160)
    • Previously only object and array roots were supported.
  • Fixed a bug where head-skipping to a single-byte key would panic. (#281)
    • This was detected by fuzzing!
    • The queries $..["{"] and $..["["] would panic
      on inputs starting with the bytes {" or [", respectively.
  • Fixed a bug where disabling the simd feature would not actually
    disable SIMD acceleration.

Reliability

  • Made the ClusterFuzzLite batch workflow automatically create an issue
    on failure to make sure the maintainers are notified.

v0.8.1

20 Sep 16:14
221efe5

[0.8.1] - 2023-09-20

Features

  • [breaking] Refactored the [Match]/[MatchSpan] types.
    • [Match] now takes 32 bytes, down from 40.
    • All fields are now private, accessible via associated functions.
    • Added the len function to [MatchSpan].
  • Added approximate_spans result mode. (#242)
    • Engine can return an approximate span of the match,
      where "approximate" means the start index is correct,
      but the end index might include trailing whitespace after the match.
    • This mode is much faster than full matches, close to the performance
      of count, especially for large result sets.
    • This is a library-only feature.
  • Library exposes a new optional feature, arbitrary.
    • When enabled, includes arbitrary
      as a dependency and provides an Arbitrary impl for JsonPathQuery,
      JsonString, and NonNegativeArrayIndex.

Bug fixes

  • Fixed a bug where memmem acceleration would fail for empty keys.
    • This was detected by fuzzing! The query $..[""] would panic
      on certain inputs due to invalid indexing.
  • Fixed a panic when parsing invalid queries with wide UTF-8 characters.
    • This was detected by fuzzing! Parsing a query with invalid syntax
      caused by a longer-than-byte UTF-8 character would panic when
      the error handler tried to resume parsing from the next byte
      instead of respecting char boundaries.
  • Fixed a panic caused by node results in invalid JSON documents.
    • This was detected by fuzzing! Invalid JSON documents could
      cause the NodeRecorder to panic if the apparent match span
      was of length 1.
  • Fixed erroneous match span end reporting. (#247)
    • Fixed a bug where MatchSpan values given by the engine were
      almost always invalid.
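
The UTF-8 fix above comes down to resuming error recovery at a char boundary rather than at the next byte. A minimal sketch of that idea using only the standard library follows; `next_char_boundary` is a hypothetical name and this is not the parser's actual code.

```rust
/// Hypothetical helper: returns the first char boundary strictly after `pos`,
/// capped at the length of `src`. Slicing `src` at the result never panics.
fn next_char_boundary(src: &str, pos: usize) -> usize {
    let mut next = pos + 1;
    // Skip continuation bytes of a multi-byte UTF-8 character.
    while next < src.len() && !src.is_char_boundary(next) {
        next += 1;
    }
    next.min(src.len())
}
```

With the query text `$.ł[`, where `ł` occupies bytes 2 and 3, resuming after position 2 gives index 4, so `&src[4..]` is a valid slice, whereas `&src[3..]` would panic.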

Reliability

  • Fuzzing integration with libfuzzer and ClusterFuzzLite.
    • cargo-fuzz can be used
      to fuzz the project with libfuzzer. Currently we have three fuzzing targets,
      one for stressing the query parser, one for stressing the engine with arbitrary
      bytes, and one stressing the engine with structure-aware queries and JSONs.
    • Fuzzing is now enabled on every PR. Using ClusterFuzzLite
      we will fuzz the project every day on a cron schedule
      to establish a corpus.
  • Added correctness tests for match spans reporting. (#247)

Dependencies

  • Bump clap from 4.4.2 to 4.4.4.
  • Bump vergen from 8.2.4 to 8.2.5.

v0.8.0

10 Sep 21:37
e9487be

[0.8.0] - 2023-09-10

Features

  • Portable binaries. (#231)
    • SIMD capabilities are now discovered at runtime,
      allowing us to distribute one binary per target.
    • Requirements for SIMD are now more granular,
      allowing weaker CPUs to still get some of the acceleration:
      • Base SIMD is either SSE2, SSSE3, or AVX2.
      • Structural classification works on SSSE3 and above.
      • Quote classification works if pclmulqdq is available.
      • Depth classification works if popcnt is available.
    • To counteract the increased binary size, debug info is no longer
      included in distributed binaries.
    • Codegen for distributed binaries is improved with fat LTO and setting
      codegen units to 1.
    • SIMD capabilities are listed with rq --version.
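
Runtime discovery of the capabilities listed above can be sketched with the standard library's `is_x86_feature_detected!` macro. This is an illustration only: `detected_simd_features` is a hypothetical function, and rsonpath's own detection and dispatch logic may be organized differently.

```rust
/// Hypothetical helper: lists which of the CPU features named in these
/// notes are available on the machine running the binary.
fn detected_simd_features() -> Vec<&'static str> {
    #[allow(unused_mut)] // stays empty (and unmodified) on non-x86 targets
    let mut found = Vec::new();
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Each check happens at runtime, so one binary can serve many CPUs.
        if is_x86_feature_detected!("sse2") {
            found.push("sse2");
        }
        if is_x86_feature_detected!("ssse3") {
            found.push("ssse3");
        }
        if is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
        if is_x86_feature_detected!("pclmulqdq") {
            found.push("pclmulqdq");
        }
        if is_x86_feature_detected!("popcnt") {
            found.push("popcnt");
        }
    }
    found
}
```

A dispatcher could then pick the widest implementation whose required features all appear in the returned list and fall back to scalar code otherwise.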

Reliability

  • Change clippy to auguwu/clippy-action.
    • The "official" action has not been maintained for 3 years.
      This one is actively maintained (thanks Noel!).

v0.7.1

09 Sep 23:53
51129e4

[0.7.1] - 2023-09-09

Bug Fixes

  • Panic when head-skipping block boundary. (#249)
    • Fixed an issue where head-skipping acceleration in nodes result mode would
      panic in very specific input circumstances, or if the input had really long JSON keys.

Dependencies

  • Bump thiserror from 1.0.47 to 1.0.48.

v0.7.0

02 Sep 19:26
5e6d505

Thanks to @darrenboulton for the SSSE3 SIMD support and modularity. This is a huge step towards better SIMD support for rq. 🎉

[0.7.0] - 2023-09-02

Features

  • Added 32-bit and SSSE3 SIMD support.
    • Refactored all SIMD code to enable modularity and more target feature types.
    • Building for x86 now chooses one of four SIMD implementations:
      • AVX2 64-bit
      • AVX2 32-bit
      • SSSE3 64-bit
      • SSSE3 32-bit
    • These are also now distributed as separate binaries.

Reliability

  • Fine-grained action permissions.
    • Actions now use explicit, lowest possible permissions for all jobs.
  • Add SLSA3 provenance to the release pipeline.
  • Apply StepSecurity security best practices.
    • All CI uses hash-pinned dependencies now.
    • Run the OSSF Scorecard check on each PR.
    • Add Dependency review.
  • Removed test-codegen deps from Cargo.lock.
    • By removing the codegen crate from the workspace, its deps
      are now separated and don't pollute the lock of the actual end product.
  • cargo-deny now runs with the CI to keep tabs on our deps.
    • Configured to reject Medium+ CVEs and non-compatible licenses.

Dependencies

  • Bump clap from 4.3.19 to 4.4.2.
  • Bump log from 0.4.19 to 0.4.20.
  • Bump thiserror from 1.0.44 to 1.0.47.
  • Bump trycmd from 0.14.16 to 0.14.17.
  • Removed memchr as a dependency.
    • It was no longer needed after the custom memmem classifier
      introduced in v0.6.0.
  • Removed replace_with as a dependency.
    • That code path was refactored earlier, dep was now unused.

Documentation

  • Added the OpenSSF badge.
    • We will be trying to achieve the Passing level before v1.0.0.
  • Added the scorecard badge.

v0.6.1

07 Aug 17:25
8187708

We have a book! Find it on GitHub Pages.

[0.6.1] - 2023-08-07

Features

  • [breaking] Remove the unique-members feature.
    • This clutters the API more than anything.
      If supporting duplicate keys is required in the future,
      it can be easily added as a const config option,
      not a compilation feature.
  • Add the --json CLI option for passing JSONs inline.

Reliability

  • Added snapshot tests for rq using trycmd.
    • This is another layer of E2E tests, makes sure documentation examples
      in the book are correct, and that our --help and --version outputs
      remain consistent.

Documentation

  • We have a book!
    • The first part is a usage guide for rq, and contains a short
      JSONPath reference.
    • Other parts will follow, with a plan to finalize at least the library
      usage guide before 1.0.0.

v0.6.0

02 Aug 22:01
768481a

rq is officially useful! 🎉

Thanks to @charles-paperman for the memmem contribution, without which #56 would be a big perf regression!

[0.6.0] - 2023-08-02

Features

  • [breaking] Full match result mode. (#56)
    This includes a revamp of all the internals that would be too long to describe in the log.
    In short:

    • memmem was rewritten to a custom implementation (courtesy of @charles-paperman)
    • Each of the result modes has a separate Recorder that takes care of producing the results
    • The results are written to a Sink, provided by the user; this might be a Vec, the stdout,
      or some other io::Write implementation.
    • Matches contain the full byte span of the value matched.
    • A lot of Input and classifier APIs have massive breaking changes to accommodate this.
  • [breaking] Removed the Recursive engine.

    • The Recursive implementation has outlived its usefulness.
      Over time it became a near-duplicate of Main,
      which was manifested by a need to implement
      the same features twice with the exact same code
      and to refactor/fix bugs with the exact same code changes
      but in two different files. We will focus efforts on the Main engine.
      The --engine CLI option was disabled, as there is only one engine now.
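
The Sink idea from the full match result mode can be illustrated with plain `std::io::Write`. The sketch below is an assumption-laden stand-in, not rsonpath's actual Recorder or Sink API: `write_matches` is a hypothetical function and matches are modeled as simple `(start, end)` byte spans.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a recorder: writes the bytes of every matched
/// span of `input` to the user-provided `sink`, one match per line.
fn write_matches<W: Write>(
    input: &[u8],
    spans: &[(usize, usize)],
    sink: &mut W,
) -> io::Result<()> {
    for &(start, end) in spans {
        // A full match is the whole byte span of the matched value.
        sink.write_all(&input[start..end])?;
        sink.write_all(b"\n")?;
    }
    Ok(())
}
```

Because the sink is generic, the same function can collect results into a `Vec<u8>` in tests and stream them to `io::stdout().lock()` in a command-line tool.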

Reliability

  • QoL improvement: separate test generation crate.
    • This removes the confusing gen-tests feature from lib,
      reduces its build dependencies, should improve
      build times.

Dependencies

  • Bump clap from 4.3.10 to 4.3.19.
  • Bump colored (dependency of simple_logger) from 2.0.0 to 2.0.4.
    • This removes a transitive dependency on atty with a CVE.
  • Bump rustflags from 0.1.3 to 0.1.4.
  • Bump smallvec from 1.10.0 to 1.11.0.
  • Bump thiserror from 1.0.40 to 1.0.44.

v0.5.1

03 Jul 15:38
b0cfba0

Thanks to @firewall2142, @azozello, and @Pasifaee for their contributions, which I've been long overdue to merge!

[0.5.1] - 2023-07-03

Features

  • Consistent index result output. (#161)
    • The --result bytes mode now consistently reports the first byte of the value it matched. This can be used to extract the actual value from the JSON by parsing from the reported byte.
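
To show how a reported first byte can be turned back into a value, here is a minimal standard-library sketch that scans from that byte to the end of the value. `value_end` and `string_end` are hypothetical helpers, not part of rq or rsonpath, and they assume the document is well-formed JSON; a real consumer would more likely hand the tail of the document to a JSON parser.

```rust
/// Hypothetical helper: given the index of the first byte of a JSON value,
/// returns the index one past its last byte. Assumes well-formed JSON.
fn value_end(json: &[u8], start: usize) -> usize {
    match json[start] {
        b'"' => string_end(json, start),
        b'{' | b'[' => {
            let mut depth = 0usize;
            let mut i = start;
            while i < json.len() {
                match json[i] {
                    // Skip strings whole so brackets inside them are ignored.
                    b'"' => {
                        i = string_end(json, i);
                        continue;
                    }
                    b'{' | b'[' => depth += 1,
                    b'}' | b']' => {
                        depth -= 1;
                        if depth == 0 {
                            return i + 1;
                        }
                    }
                    _ => {}
                }
                i += 1;
            }
            json.len()
        }
        // Numbers, true, false, null: run until a delimiter or whitespace.
        _ => {
            let mut i = start;
            while i < json.len()
                && !matches!(json[i], b',' | b'}' | b']' | b' ' | b'\t' | b'\n' | b'\r')
            {
                i += 1;
            }
            i
        }
    }
}

/// Returns the index one past the closing quote of the string at `start`.
fn string_end(json: &[u8], start: usize) -> usize {
    let mut i = start + 1;
    while i < json.len() {
        match json[i] {
            b'\\' => i += 2, // skip the escaped character
            b'"' => return i + 1,
            _ => i += 1,
        }
    }
    json.len()
}
```

Given a reported offset `n`, the matched value is then `&json[n..value_end(json, n)]`.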

Bug Fixes

  • Remove SHA from --version on crates.io. (#157)
    • The commit SHA part was incorrect, and there seems to be no way to get it when the crate is in the registry.

Library

  • [breaking] Remove tail-skip and head-skip features.
    • These are now non-optional and integrated into the engines.

Reliability

  • Generate strings in classifier tests. (#173, #20)

    • Improve classifier correctness tests by including quoted strings with escapes
      in the generated proptest cases.
  • More tests for wildcard compilation.

    • Added more cases for compiling the NFA and minimizing
      for queries with wildcards.
  • Automated declarative end-to-end engine tests. (#134)

    • Engine tests were rewritten to use declarative TOML configurations
      to make creating, maintaining, and debugging tests easier.
      Test coverage was increased, since compressed variants of inputs are
      automatically generated and tested, and we now test all combinations
      of input-engine-result types.

Dependencies

  • Bump addr2line from v0.19.0 to v0.20.0
  • Bump anstyle from v1.0.0 to v1.0.1
  • Bump anstyle-parse from v0.2.0 to v0.2.1
  • Bump backtrace from v0.3.67 to v0.3.68
  • Bump clap from 4.3.4 to 4.3.10.
  • Bump gimli from v0.27.2 to v0.27.3
  • Bump hashbrown from v0.12.3 to v0.14.0
  • Bump indexmap from v1.9.3 to v2.0.0
  • Bump is-terminal from v0.4.7 to v0.4.8
  • Bump libc from v0.2.146 to v0.2.147
  • Bump memmap2 from 0.7.0 to 0.7.1.
  • Bump miniz_oxide from v0.6.2 to v0.7.1
  • Bump object from v0.30.4 to v0.31.1
  • Bump proc-macro2 from v1.0.60 to v1.0.63
  • Bump quote from v1.0.28 to v1.0.29
  • Bump serde_spanned from v0.6.2 to v0.6.3
  • Bump syn from v2.0.18 to v2.0.22
  • Bump toml from v0.7.4 to v0.7.5
  • Bump toml_datetime from v0.6.2 to v0.6.3
  • Bump toml_edit from v0.19.10 to v0.19.11
  • Bump vergen from v8.2.1 to v8.2.3
  • Bump windows-targets from v0.48.0 to v0.48.1

Documentation

  • Rearrange readme to put usage first.
  • Update bug report issue form.
    • Changed the issue form to be more streamlined and use more polite language.
  • Add MSRV to README.

v0.5.0

14 Jun 22:58
c1d3cbf

[0.5.0] - 2023-06-14

Thanks to @zwerdlds for the contributions, which delivered the index selector support for this release!

Features

  • Rename bin to rq and lib to rsonpath. The crate names remain the same, just the output artifacts are different.
  • Parser support for array index selector. (#60)
    • Parser now recognizes the array index selector with positive index values conforming to the I-JSON specification.
  • Index selector engine support. (#132)
    • The automaton transition model has been changed to incorporate index-labelled transitions.
    • Both engines now support queries with the index selector.
  • New Input API. (#23)
    • A more abstract API to access the underlying byte stream replacing the reliance of the engines on a direct &[u8] slice access, to allow adding buffered input streams (#23) in the future. Two types were added, OwnedBytes and BorrowedBytes, to support the current easy scenario of having the bytes already in memory.
  • Add long version to CLI.
  • Mmap support. (#23)
    • Added MmapInput which maps a file into memory on unix and windows.
  • The CLI app now automatically decides which input to use, favoring mmap in most cases. This can be overridden with --force-input.
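
The I-JSON constraint on array indices mentioned above can be sketched as a bounds check. I-JSON (RFC 7493) recommends keeping integers within the range exactly representable as IEEE 754 doubles, so the sketch assumes the upper bound is 2^53 - 1 and that indices are non-negative decimal integers. `parse_index` and `MAX_INDEX` are hypothetical names, not rsonpath's parser.

```rust
/// Largest index assumed valid here: 2^53 - 1, the biggest integer that
/// I-JSON treats as exactly representable.
const MAX_INDEX: u64 = (1 << 53) - 1;

/// Hypothetical validator: parses a non-negative array index and rejects
/// values outside the assumed I-JSON range.
fn parse_index(text: &str) -> Option<u64> {
    // Accept ASCII digits only, so signed forms like "+1" or "-1" are rejected.
    if text.is_empty() || !text.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // `parse` fails on overflow, which also counts as out of range.
    let value: u64 = text.parse().ok()?;
    (value <= MAX_INDEX).then_some(value)
}
```

Under these assumptions `9007199254740991` is accepted while `9007199254740992` is rejected.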

Library

  • Rename Label to JsonString. (#139, #131)
    • query::Label is now query::JsonString.
    • The unique-labels feature is now unique-members.
    • EngineError::MalformedLabelQuotes renamed to EngineError::MalformedStringQuotes.

Reliability

  • Proptests for parsing array indices queries. (#51)

Dependencies

  • Bump clap from 4.1.11 to 4.3.4.
  • Bump log from 0.4.17 to 0.4.19.
  • Bump proptest from 1.1.0 to 1.2.0.
  • Bump simple_logger from 4.1.0 to 4.2.0.

v0.4.0

20 Apr 23:16
381e346

[0.4.0] - 2023-04-20

Features

  • Wildcard descendant support.

    • You can now use the ..*/..[*] selector that selects all nodes in the document it acts upon.
  • Switch Structural to BracketType. (#10)

    • The Opening and Closing variants now differentiate between curly
      and square brackets with a value of the BracketType enum.

Bug fixes

  • Fix parser incorrectly escaping labels.

    • Queries like $['\''] would cause a parsing error, even though they were valid (match a child with key equal to "'").
    • The \u escape sequence is no longer recognized, since without UTF-8 handling it was meaningless.
      See (#117).
  • Empty query array behavior.

    • Running the query $ on a document [] was giving zero results. Now correctly matches the root array.

Documentation

  • The grammar in top-level documentation now matches the implementation.

Reliability

  • Added proptests for query parsing.
    • Currently checks that correct queries are parsed correctly.
      We still need tests for error conditions (see #51).