
feat: use Uint8Array for ids #786

Closed

Conversation


@Flarna Flarna commented Feb 14, 2020

Which problem is this PR solving?

closes #698

Short description of the changes

Use Uint8Array for traceId/spanId to avoid the conversion from buffer to
hex string during creation and from hex string to buffer during export.

The OpenTelemetry protocol also uses a byte array in protobuf.

There are still conversions needed for propagation via HTTP headers, but
overall the number of conversions should be reduced.

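For illustration, a minimal sketch of the hex/byte conversions involved (helper names are hypothetical, not code from this PR):

```typescript
// Hypothetical helpers showing the conversions this PR tries to minimize:
// ids kept as Uint8Array only need the hex form at propagation boundaries.
function idToHex(id: Uint8Array): string {
  return Array.from(id, byte => byte.toString(16).padStart(2, '0')).join('');
}

function idFromHex(hex: string): Uint8Array {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.substring(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}
```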
@Flarna Flarna changed the title from "feat: use Uint8Array for ids" to "WIP feat: use Uint8Array for ids" on Feb 14, 2020
@OlivierAlbertini

Can we get a quick benchmark? Something like an ab test would be acceptable to me.


Flarna commented Feb 14, 2020

> Can we get a quick benchmark? Something like an ab test would be acceptable to me.

Are there instructions somewhere on how to run the OTel benchmarks?


dyladan commented Feb 14, 2020

> Can we get a quick benchmark? Something like an ab test would be acceptable to me.

> Are there instructions somewhere on how to run the OTel benchmarks?

You can just run `npm run bench` in the root of the project.


codecov-io commented Feb 14, 2020

Codecov Report

Merging #786 into master will increase coverage by 1.96%.
The diff coverage is 93.48%.

@@            Coverage Diff             @@
##           master     #786      +/-   ##
==========================================
+ Coverage   90.67%   92.64%   +1.96%     
==========================================
  Files         236      243       +7     
  Lines       11045    10816     -229     
  Branches     1024     1064      +40     
==========================================
+ Hits        10015    10020       +5     
+ Misses       1030      796     -234
| Impacted Files | Coverage Δ |
| --- | --- |
| ...s/opentelemetry-metrics/test/MeterProvider.test.ts | 100% <ø> (ø) ⬆️ |
| ...elemetry-tracing/src/export/SimpleSpanProcessor.ts | 100% <ø> (ø) ⬆️ |
| ...telemetry-tracing/src/export/BatchSpanProcessor.ts | 100% <ø> (ø) ⬆️ |
| ...opentelemetry-node/test/NodeTracerProvider.test.ts | 100% <ø> (ø) |
| ...ackages/opentelemetry-api/src/metrics/NoopMeter.ts | 72.09% <ø> (-9.04%) ⬇️ |
| packages/opentelemetry-api/src/metrics/Metric.ts | 100% <ø> (ø) ⬆️ |
| ...kages/opentelemetry-metrics/test/mocks/Exporter.ts | 50% <ø> (-50%) ⬇️ |
| ...s/opentelemetry-web/test/WebTracerProvider.test.ts | 0% <0%> (ø) |
| ...ges/opentelemetry-exporter-jaeger/src/transform.ts | 100% <100%> (ø) ⬆️ |
| packages/opentelemetry-metrics/src/export/types.ts | 100% <100%> (ø) ⬆️ |

... and 193 more


Flarna commented Feb 14, 2020

Results of the benchmarks:

IDs as Uint8Array

Beginning NoopTracerProvider Benchmark...
  5 tests completed.

  #startSpan                     x 1,092,254,357 ops/sec ±1.06% (20 runs sampled)
  #startSpan:parent              x 1,098,253,393 ops/sec ±1.24% (20 runs sampled)
  #startSpan with attribute      x 1,088,682,868 ops/sec ±1.94% (20 runs sampled)
  #startSpan with 30 attributes  x    11,387,274 ops/sec ±1.49% (20 runs sampled)
  #startSpan with 100 attributes x     3,721,347 ops/sec ±0.20% (20 runs sampled)

Beginning BasicTracerProvider Benchmark...
  5 tests completed.

  #startSpan                     x 209,398 ops/sec ±1.49% (20 runs sampled)
  #startSpan:parent              x 115,787 ops/sec ±1.21% (20 runs sampled)
  #startSpan with attribute      x 207,648 ops/sec ±1.53% (20 runs sampled)
  #startSpan with 30 attributes  x  64,260 ops/sec ±0.43% (20 runs sampled)
  #startSpan with 100 attributes x   5,902 ops/sec ±1.33% (20 runs sampled)

Beginning BasicTracerProvider with SimpleSpanProcessor Benchmark...
  5 tests completed.

  #startSpan                     x 167,168 ops/sec ±3.13% (20 runs sampled)
  #startSpan:parent              x  97,077 ops/sec ±3.16% (20 runs sampled)
  #startSpan with attribute      x 143,967 ops/sec ±16.37% (20 runs sampled)
  #startSpan with 30 attributes  x  52,678 ops/sec ±1.60% (20 runs sampled)
  #startSpan with 100 attributes x   5,680 ops/sec ±1.82% (20 runs sampled)

Beginning BasicTracerProvider with BatchSpanProcessor Benchmark...
  5 tests completed.

  #startSpan                     x 164,040 ops/sec ±3.30% (20 runs sampled)
  #startSpan:parent              x  93,457 ops/sec ±1.19% (20 runs sampled)
  #startSpan with attribute      x 124,553 ops/sec ±15.51% (20 runs sampled)
  #startSpan with 30 attributes  x  49,418 ops/sec ±2.17% (20 runs sampled)
  #startSpan with 100 attributes x   5,572 ops/sec ±1.12% (20 runs sampled)


Beginning B3Format Benchmark...
  2 tests completed.

  #Inject  x 1,188,475 ops/sec ±0.51% (100 runs sampled)
  #Extract x 1,263,576 ops/sec ±2.83% (100 runs sampled)

Beginning HttpTraceContext Benchmark...
  2 tests completed.

  #Inject  x 1,448,021 ops/sec ±0.59% (100 runs sampled)
  #Extract x 1,064,664 ops/sec ±0.44% (100 runs sampled)

IDs as hex strings

Beginning NoopTracerProvider Benchmark...
  5 tests completed.

  #startSpan                     x 1,059,885,282 ops/sec ±2.87% (20 runs sampled)
  #startSpan:parent              x 1,040,236,747 ops/sec ±2.38% (20 runs sampled)
  #startSpan with attribute      x 1,054,605,044 ops/sec ±2.23% (20 runs sampled)
  #startSpan with 30 attributes  x    11,243,262 ops/sec ±1.37% (20 runs sampled)
  #startSpan with 100 attributes x     3,674,193 ops/sec ±0.79% (20 runs sampled)

Beginning BasicTracerProvider Benchmark...
  5 tests completed.

  #startSpan                     x 146,849 ops/sec ±1.65% (20 runs sampled)
  #startSpan:parent              x  97,562 ops/sec ±0.57% (20 runs sampled)
  #startSpan with attribute      x 148,188 ops/sec ±0.45% (20 runs sampled)
  #startSpan with 30 attributes  x  54,051 ops/sec ±0.30% (20 runs sampled)
  #startSpan with 100 attributes x   5,793 ops/sec ±1.01% (20 runs sampled)

Beginning BasicTracerProvider with SimpleSpanProcessor Benchmark...
  5 tests completed.

  #startSpan                     x 131,322 ops/sec ±3.49% (20 runs sampled)
  #startSpan:parent              x  87,738 ops/sec ±2.40% (20 runs sampled)
  #startSpan with attribute      x 135,653 ops/sec ±1.92% (20 runs sampled)
  #startSpan with 30 attributes  x  43,870 ops/sec ±14.04% (20 runs sampled)
  #startSpan with 100 attributes x   5,693 ops/sec ±1.21% (20 runs sampled)

Beginning BasicTracerProvider with BatchSpanProcessor Benchmark...
  5 tests completed.

  #startSpan                     x 132,999 ops/sec ±2.03% (20 runs sampled)
  #startSpan:parent              x  85,832 ops/sec ±2.31% (20 runs sampled)
  #startSpan with attribute      x 131,084 ops/sec ±2.65% (20 runs sampled)
  #startSpan with 30 attributes  x  47,688 ops/sec ±1.20% (20 runs sampled)
  #startSpan with 100 attributes x   5,545 ops/sec ±2.47% (20 runs sampled)


Beginning B3Format Benchmark...
  2 tests completed.

  #Inject  x 5,770,521 ops/sec ±0.33% (100 runs sampled)
  #Extract x 5,599,810 ops/sec ±0.48% (100 runs sampled)

Beginning HttpTraceContext Benchmark...
  2 tests completed.

  #Inject  x 15,769,315 ops/sec ±1.15% (100 runs sampled)
  #Extract x  3,545,804 ops/sec ±0.39% (100 runs sampled)

I don't know if this is enough for a conclusion. As expected, span creation and binary propagation are faster, but text propagation is slower. An end-to-end setup using text tagging and a few internal spans with e.g. the Jaeger exporter would be better.

@Flarna Flarna changed the title from "WIP feat: use Uint8Array for ids" to "feat: use Uint8Array for ids" on Feb 14, 2020
@OlivierAlbertini

Thx @Flarna


dyladan commented Feb 14, 2020

Summary of benchmark:

  • Minor speed improvement in span creation
  • B3 Inject 5.7M/sec to 1B/sec (~600x)
  • B3 Extract 5.7M to 1.3M (~0.25x)
  • TC Inject 15.8M to 900K (~0.057x)
  • TC Extract 3.5M to 1M (~0.3x)

The speed improvement in span creation is fairly minor and seems fine, but the really interesting results are the propagators. The B3 inject has a seemingly impossible speedup, but everything else suffered quite a bit, with the trace context injection showing the biggest drop.


dyladan commented Feb 14, 2020

All the conversions between Buffer and Uint8Array here https://github.com/open-telemetry/opentelemetry-js/pull/786/files#diff-c818867edf5b3960f94f5c579ff06043 are unfortunate, as in most cases a Buffer can be used as a Uint8Array anyway. I wonder if we would be better off adding platform-specific wrapper classes with the methods we use (toHex, fromHex, equality), so that the platform-specific code could use whatever platform-specific object (Node: Buffer, web: Uint8Array) makes the most sense for it.
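A rough sketch of what such a wrapper could look like (interface and class names are hypothetical, not proposed API):

```typescript
// Hypothetical id wrapper: each platform backs it with whatever byte
// container is cheapest there (Node: Buffer, web: Uint8Array).
interface SpanIdLike {
  toHex(): string;
  equals(other: SpanIdLike): boolean;
}

// Possible Node implementation backed by Buffer (sketch only).
class NodeSpanId implements SpanIdLike {
  constructor(private readonly bytes: Buffer) {}

  static fromHex(hex: string): NodeSpanId {
    return new NodeSpanId(Buffer.from(hex, 'hex'));
  }

  toHex(): string {
    return this.bytes.toString('hex');
  }

  equals(other: SpanIdLike): boolean {
    return this.toHex() === other.toHex();
  }
}
```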


Flarna commented Feb 15, 2020

I updated the benchmark results above. I noticed that I had forgotten to adapt the benchmarks, therefore the measurements for the propagators were garbage.
It still looks like the improvement is quite low. Most likely the main overhead of OTel is in other areas than string/buffer conversions.


Flarna commented Feb 17, 2020

I think we should also be careful in interpreting the benchmarks, as they show ops/sec but real use cases don't perform these ops independently, e.g. you will always have at least one span created for an inject and one more for an extract.
Maybe they should display time/op instead, which would allow summing up.

Examples:
#startSpan:parent improved from 11.4 µs/op to 10.3 µs/op
#inject degraded from 0.063 µs/op to 0.69 µs/op
#extract degraded from 0.28 µs/op to 0.94 µs/op

Summing this up (2 spans + inject + extract): 23.14 µs => 22.23 µs

But I'm still not sure if the added complexity is worth it.
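For reference, a small sketch of how these per-op timings follow from the ops/sec figures above (numbers taken from the benchmark output; the scenario sum mirrors the one in this comment):

```typescript
// Convert ops/sec into microseconds per operation.
const usPerOp = (opsPerSec: number): number => 1e6 / opsPerSec;

// #startSpan:parent with SimpleSpanProcessor: hex strings vs. Uint8Array ids.
const spanHex = usPerOp(87_738); // ~11.4 µs/op
const spanU8 = usPerOp(97_077);  // ~10.3 µs/op

// "2 spans + inject + extract" scenario (HttpTraceContext inject/extract).
const totalHex = 2 * spanHex + usPerOp(15_769_315) + usPerOp(3_545_804); // ~23.14 µs
const totalU8 = 2 * spanU8 + usPerOp(1_448_021) + usPerOp(1_064_664);    // ~22.23 µs
console.log(totalHex.toFixed(2), totalU8.toFixed(2));
```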


obecny commented Feb 27, 2020

Guys, with regards to your tests I completely agree with comparing the total sum of CPU usage, taking into consideration the creation of a span and then exporting it, in the most typical scenario. Otherwise it will be hard to understand which of these operations is "more valuable" / "more used" and which of them will have the bigger impact on total CPU usage.


obecny commented Feb 27, 2020

Are those tests run only for Node or also for different browsers?


dyladan commented Feb 27, 2020

I agree using total timings makes sense.

| operation | speed improvement (µs) |
| --- | --- |
| span creation (root or non-injected outgoing) | 1.1 |
| incoming request (span + extract) | 0.44 |
| outgoing request (span + inject) | 0.473 |

| scenario | speed improvement (µs) |
| --- | --- |
| 1 incoming + 1 outgoing | 0.913 |
| 1 incoming + 10 outgoing | 5.17 |
| 1 incoming + 1 db (no inject) | 1.54 |
| 1 incoming + 10 db | 11.44 |

Overall this seems like a micro-optimization that may not be worth the effort of maintaining a more complex implementation. It may also end up being slower in cases where ids are logged many times. It is easy to envision a use-case where span context is logged on every log message. In this case you must pay the encoding cost each time you log (ignoring the possibility of saving and re-using the encoded version, which is yet more complexity).
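A sketch of the logging pattern being described, assuming a hypothetical log helper and Uint8Array-based ids (not actual project code):

```typescript
// With Uint8Array ids, every log line pays the hex-encoding cost again
// unless the encoded form is cached somewhere.
interface SpanContextLike {
  traceId: Uint8Array;
  spanId: Uint8Array;
}

function logWithContext(message: string, ctx: SpanContextLike): void {
  const traceId = Buffer.from(ctx.traceId).toString('hex'); // re-encoded per call
  const spanId = Buffer.from(ctx.spanId).toString('hex');
  console.log(`trace_id=${traceId} span_id=${spanId} ${message}`);
}
```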


obecny commented Feb 27, 2020

What if we add some caching when converting, so that each id is converted only once, no matter how many times you convert it?
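One possible shape for such a cache, sketched with a WeakMap keyed by the id object (hypothetical, not proposed code):

```typescript
// Hypothetical conversion cache: each Uint8Array id is hex-encoded at most
// once; later conversions are served from the WeakMap.
const hexCache = new WeakMap<Uint8Array, string>();

function toHexCached(id: Uint8Array): string {
  let hex = hexCache.get(id);
  if (hex === undefined) {
    hex = Buffer.from(id).toString('hex');
    hexCache.set(id, hex);
  }
  return hex;
}
```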


dyladan commented Feb 27, 2020

I think conversion caching would add additional complexity that we may not want to maintain for such a small performance gain.

Maybe @Flarna can provide some stats as to the mean and median numbers of outgoing span operations for an incoming request?


Flarna commented Feb 28, 2020

> Are those tests run only for Node or also for different browsers?

I used only Node.js (I think one of the latest 12.x versions, on Windows). No idea how to run OTel and its benchmarks in a browser...
But it would be interesting to get some numbers, as at least some conversion functions have different implementations in the browser and in Node.

> It may also end up being slower in cases where ids are logged many times. It is easy to envision a use-case where span context is logged on every log message.

Well, if people start logging span context that often, the conversion cost is most likely the wrong area to start improving...

> What if we add some caching when converting, so that each id is converted only once, no matter how many times you convert it?

I thought about this too. But it would require encapsulating this in some class with getters/setters, which adds overhead again. And it's no longer that easy to just create a SpanContext from literals. See also #698 (comment).

> Maybe @Flarna can provide some stats as to the mean and median numbers of outgoing span operations for an incoming request?

Sorry, no numbers regarding this. Incoming and outgoing spans usually have several attributes on them (e.g. HTTP semantic conventions, ...), therefore the relative gain for them will be small.
The biggest relative gain will be on internal spans with no attributes, but I would not assume that they are created that frequently.


In general I was expecting more gain from this change. I have to admit that using Uint8Array instead of string is more error prone. I have not yet closed this PR in order to give some more time to discuss it and to let people try some other use cases to get more insight into the impact.
I think we should decide on this before reaching beta as it is an incompatible API change.


Flarna commented Feb 28, 2020

I had a chat with @dyladan today and we found that a significant part of the improvements seen here (at least for Node.js) come just from the fact that randomTraceId() has been changed from two calls to randomBytes to one call requesting 16 bytes.
Or in other words, replacing randomSpanId() + randomSpanId() by crypto.randomBytes(TRACE_ID_BYTES).toString('hex') is most likely the better way to go.
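Roughly, the change being described (a sketch; the byte-length constants are assumed, not the exact project code):

```typescript
import { randomBytes } from 'crypto';

const TRACE_ID_BYTES = 16; // assumed trace id length
const SPAN_ID_BYTES = 8;   // assumed span id length

// Before (roughly): a trace id built from two separate random span ids.
function randomSpanId(): string {
  return randomBytes(SPAN_ID_BYTES).toString('hex');
}
function randomTraceIdOld(): string {
  return randomSpanId() + randomSpanId();
}

// After: a single randomBytes call for the full 16 bytes.
function randomTraceIdNew(): string {
  return randomBytes(TRACE_ID_BYTES).toString('hex');
}
```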

It seems that at least V8 is significantly better at working with strings than with Buffer/Uint8Array. I think I had a similar experience a while ago in a different context. I assume the reason is that strings are much more widely used and have been around for longer, so much more optimization work has been done for them.


dyladan commented Feb 28, 2020

Closing in favor of #824

@dyladan dyladan closed this Feb 28, 2020
@Flarna Flarna deleted the ids-as-uint8array branch February 28, 2020 21:36
pichlermarc pushed a commit to dynatrace-oss-contrib/opentelemetry-js that referenced this pull request Dec 15, 2023