samyron (Contributor) commented Nov 16, 2025

This PR optimizes json_string_unescape.

Two commits:

  1. Use ARM Neon to scan for \. While scanning, copy the current chunk to the output.
  2. Add a fast path when unescaping a single character.
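To make the two steps above concrete, here is a minimal Ruby sketch of the same control flow. It is not the PR's code: the actual change is C using NEON intrinsics, and here `String#index` stands in for the vectorized backslash search. The names `SKETCH_SIMPLE_ESCAPES` and `unescape_sketch` are hypothetical, the "fast path" shown is only one plausible reading of the second commit, and surrogate pairs in `\u` escapes are deliberately not handled.

```ruby
# Illustrative sketch only; the PR itself is C with NEON intrinsics.
# All names here are hypothetical and chosen for this example.
SKETCH_SIMPLE_ESCAPES = {
  '"' => '"', "\\" => "\\", "/" => "/",
  "b" => "\b", "f" => "\f", "n" => "\n", "r" => "\r", "t" => "\t"
}.freeze

# `raw` is the body of a JSON string, without the surrounding quotes.
def unescape_sketch(raw)
  out = +""
  pos = 0
  # String#index stands in for the SIMD search for the next backslash.
  while (bs = raw.index("\\", pos))
    out << raw[pos...bs]                  # copy the escape-free chunk unchanged
    code = raw[bs + 1]
    if (char = SKETCH_SIMPLE_ESCAPES[code])
      out << char                         # fast path: escape decodes to one character
      pos = bs + 2
    elsif code == "u"
      hex = raw[bs + 2, 4]
      raise ArgumentError, "bad \\u escape at offset #{bs}" unless hex&.match?(/\A\h{4}\z/)
      out << [hex.hex].pack("U")          # assumption: no surrogate-pair handling
      pos = bs + 6
    else
      raise ArgumentError, "invalid escape at offset #{bs}"
    end
  end
  out << raw[pos..]                       # copy whatever follows the last escape
end
```

For example, `unescape_sketch('a\\nb')` returns a three-character string with a real newline in the middle. The point of the structure is that long runs without a backslash are copied in bulk rather than byte by byte.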

If this PR is accepted, I will follow up with an SSE2 implementation.

Benchmarks

Run on a MacBook Air M1.

twitterescaped.json is from simdjson-data.

== Parsing activitypub.json (58160 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     1.103k i/100ms
Calculating -------------------------------------
               after     11.143k (± 0.8%) i/s   (89.74 μs/i) -     56.253k in   5.048516s

Comparison:
              before:    10366.8 i/s
               after:    11143.2 i/s - 1.07x  faster

== Parsing twitterescaped.json (562408 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after    73.000 i/100ms
Calculating -------------------------------------
               after    737.341 (± 0.9%) i/s    (1.36 ms/i) -      3.723k in   5.049667s

Comparison:
              before:      712.1 i/s
               after:      737.3 i/s - 1.04x  faster

I should note that the fast path for unescaping a single character accounts for about 1% of the speed increase in activitypub.json. It's pretty minor.

byroot force-pushed the sm/string-unescape-neon branch from 864ef5b to 9535e8a on November 22, 2025 08:46
byroot force-pushed the sm/string-unescape-neon branch from 9535e8a to b42e968 on November 22, 2025 13:25

byroot (Member) commented Nov 22, 2025

Sorry for the delay, I just started a new job and have been busy.

This PR is interesting, but reviewing it gave me another idea: #902

We already find the \ characters during parsing, so we could record their positions and pass them to the decoder. Of course there is a space tradeoff, but that's the idea.
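A rough Ruby sketch of that idea, under stated assumptions: the scanner that looks for the closing quote also remembers the offset of each backslash, and the decoder then jumps straight to those offsets instead of searching again. This is not the code from #902; `scan_string_sketch`, `decode_with_positions_sketch` and `POSITIONS_SKETCH_ESCAPES` are hypothetical names, and only the simple two-byte escapes are handled (a `\u` escape raises `KeyError` here).

```ruby
# Illustrative sketch only; names and structure are hypothetical.
POSITIONS_SKETCH_ESCAPES = {
  '"' => '"', "\\" => "\\", "/" => "/",
  "b" => "\b", "f" => "\f", "n" => "\n", "r" => "\r", "t" => "\t"
}.freeze

# Scan a string body starting just after the opening quote.
# Returns [index_of_closing_quote, backslash_offsets].
def scan_string_sketch(src, start)
  positions = []
  i = start
  while i < src.length
    case src[i]
    when '"'
      return [i, positions]
    when "\\"
      positions << i      # the space tradeoff: one stored offset per escape
      i += 2              # skip the escaped character so \" does not end the string
    else
      i += 1
    end
  end
  raise ArgumentError, "unterminated string"
end

# Decode src[start...stop] using the recorded offsets, with no second search.
def decode_with_positions_sketch(src, start, stop, positions)
  out = +""
  pos = start
  positions.each do |bs|
    out << src[pos...bs]                               # bulk copy up to the escape
    out << POSITIONS_SKETCH_ESCAPES.fetch(src[bs + 1]) # simple escapes only
    pos = bs + 2
  end
  out << src[pos...stop]
end
```

A caller would run `stop, positions = scan_string_sketch(src, 1)` on a document starting with a quote and then pass both values to `decode_with_positions_sketch(src, 1, stop, positions)`. When `positions` is empty, the decoder reduces to a single slice copy.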

With a handcrafted benchmark:

benchmark_parsing "some_unescape", JSON.dump([((" "*100) + "\n")*15])
benchmark_parsing "more_unescape", JSON.dump([((" "*100) + "\n")*30])
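For orientation, the two payloads can be inspected directly. The snippet below only rebuilds the same documents with the standard `json` library and prints their size and escape count; `benchmark_parsing` itself is the repository's benchmark helper and is not reproduced here. The variable names are just for this example.

```ruby
require "json"

# Same payloads as in the benchmark calls above.
some = JSON.dump([((" " * 100) + "\n") * 15])
more = JSON.dump([((" " * 100) + "\n") * 30])

# Each repetition is 100 spaces plus one newline, which is serialized as
# the two bytes \n, so every 102 bytes contain exactly one escape.
puts some.bytesize          # 1534
puts some.scan("\\").size   # 15
puts more.bytesize          # 3064
puts more.scan("\\").size   # 30
```

So both documents are a single long string with sparse, evenly spaced escapes.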

My PR performs significantly better:

== Parsing some_unescape (1534 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after   252.233k i/100ms
Calculating -------------------------------------
               after      2.619M (± 0.6%) i/s  (381.76 ns/i) -     13.116M in   5.007434s

Comparison:
              before:  3159184.8 i/s
               after:  2619427.0 i/s - 1.21x  slower


== Parsing more_unescape (3064 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after   153.779k i/100ms
Calculating -------------------------------------
               after      1.579M (± 0.7%) i/s  (633.23 ns/i) -      7.997M in   5.063882s

Comparison:
              before:  1796951.2 i/s
               after:  1579212.5 i/s - 1.14x  slower

(after is your branch, before is #902).

So I think I'll go with #902.
