
Add some optimizations to utf8 decoding #1948

Merged (6 commits) on Jul 12, 2020

Conversation

@johnynek (Contributor) commented Jul 9, 2020

This does four things:

  1. adds Chunk.Queue.startsWith so we can be a bit more precise when checking for the utf8 byte order mark
  2. is miserly with allocations in internal methods, using null instead of Option (avoids allocating a Some on every loop iteration).
  3. uses a Builder in an internal method to avoid returning a tuple and reversing a list.
  4. avoids fold on an internal loop and just writes out the while loop (see the sketch after this list).
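For concreteness, here is a minimal, self-contained sketch of the builder-plus-while-loop shape from items 3 and 4; the names and the trivial decoding below are illustrative assumptions, not the PR's actual code:

// Illustrative only: a List builder (item 3) means no tuple return and no
// final reverse, and an explicit while loop (item 4) replaces a fold.
def decodeChunks(chunks: List[Array[Byte]]): List[String] = {
  val out = List.newBuilder[String]
  val it = chunks.iterator
  while (it.hasNext) {
    val bytes = it.next()
    out += new String(bytes, java.nio.charset.StandardCharsets.UTF_8)
  }
  out.result()
}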


@non left a comment


This looks good. Hopefully you got some great performance improvements from fewer allocations and less memory pressure?

I had a few small suggestions but none of them really affect the correctness of the PR, so take them or leave them.

if (c == counter)
  res = 0
else
  res = counter + 1


I'm pretty sure res is already 0, so I think you just want if (c != counter) res = counter + 1, don't you?

else
  res = counter + 1
// exit the loop
idx = 0

If bs.size is 1 then what would happen? It seems like minIdx would be set to 0 (0.max(-2)) and idx would also be set to 0 (1-1). In that case, idx = 0 won't break you out of the loop. I think idx = -1 is a safer way to do this (although I would just use return myself).

Oh I see, you're counting on the decrement operator that occurs later. That makes sense, although IMO using -1 or return here might be a bit more reassuring to the reader. (That said, there's no bug so maybe it's not a big deal.)
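
To make the control flow concrete, here is a hypothetical reconstruction of the loop shape being discussed (not the PR's exact code): when bs.length is 1, minIdx is 0 and idx starts at 0, so the idx = 0 assignment alone would not end the loop; it is the decrement that follows which pushes idx below minIdx.

// Hypothetical sketch only, reconstructed from the review comments above.
// continuationBytes is assumed to return -1 for bytes that don't start a
// multi-byte sequence and the expected number of continuation bytes otherwise.
def lastIncompleteBytesSketch(bs: Array[Byte], continuationBytes: Byte => Int): Int = {
  val minIdx = 0.max(bs.length - 3)
  var idx = bs.length - 1
  var counter = 0
  var res = 0
  while (minIdx <= idx) {
    val c = continuationBytes(bs(idx))
    if (c >= 0) {
      if (c != counter) res = counter + 1
      idx = 0 // exit relies on the idx -= 1 below running before the loop check
    }
    idx -= 1
    counter += 1
  }
  res
}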

  new String(allBytes.take(splitAt).toArray, utf8Charset) :: output,
  Chunk.bytes(allBytes.drop(splitAt))
)
if (splitAt == allBytes.size) {

It's a small nit but I'd use length instead of size when working with arrays:

scala> val arr = Array(1,2,3)
val arr: Array[Int] = Array(1, 2, 3)

scala> u.reify { arr.size }
val res1: reflect.runtime.universe.Expr[Int] = Expr[Int](Predef.intArrayOps(arr).size)

scala> u.reify { arr.length }
val res2: reflect.runtime.universe.Expr[Int] = Expr[Int](arr.length)

@johnynek (Contributor, Author)

Benchmark results:

on main:
[info] Benchmark                  (asciiStringSize)   Mode  Cnt       Score       Error  Units
[info] TextBenchmark.asciiDecode                128  thrpt    6  195334.730 ± 18599.880  ops/s
[info] TextBenchmark.asciiDecode               1024  thrpt    6  153746.264 ± 61576.197  ops/s
[info] TextBenchmark.asciiDecode               4096  thrpt    6  118761.218 ±  4695.784  ops/s

on this branch:
[info] Benchmark                  (asciiStringSize)   Mode  Cnt       Score      Error  Units
[info] TextBenchmark.asciiDecode                128  thrpt    6  218236.471 ± 1145.261  ops/s
[info] TextBenchmark.asciiDecode               1024  thrpt    6  192552.031 ± 1689.683  ops/s
[info] TextBenchmark.asciiDecode               4096  thrpt    6  146657.477 ±  589.035  ops/s

So this is 11% to 25% faster depending on the size, for this simple benchmark.
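
For readers unfamiliar with the output format, a JMH throughput benchmark producing a table like the one above has roughly the following shape; this is only a hypothetical illustration, not fs2's actual TextBenchmark:

// Hypothetical JMH benchmark shape, for illustration only.
import org.openjdk.jmh.annotations._

@State(Scope.Thread)
class AsciiDecodeBench {
  @Param(Array("128", "1024", "4096")) // shows up as (asciiStringSize) above
  var asciiStringSize: Int = _

  var bytes: Array[Byte] = _

  @Setup
  def setup(): Unit = bytes = Array.fill(asciiStringSize)('a'.toByte)

  @Benchmark // reported in ops/s when run in throughput (thrpt) mode
  def asciiDecode: String =
    new String(bytes, java.nio.charset.StandardCharsets.US_ASCII)
}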

@mpilquist mpilquist merged commit 150d846 into typelevel:main Jul 12, 2020
@mpilquist mpilquist added this to the 2.4.3 milestone Aug 18, 2020