@hrodmn
Contributor

@hrodmn hrodmn commented Jun 12, 2025

From #1167

This adds some convenience functions in titiler.core.telemetry for adding OpenTelemetry traces to titiler applications. These features are only activated if the opentelemetry dependencies are installed.

When enabled in a titiler application, the traces can be browsed in a UI like Jaeger:
[image: screenshot of the trace UI]

Factory class instrumentation: the BaseTilerFactory has a new method add_telemetry() which will add a trace to each registered endpoint in the factory class.
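As a rough illustration of the wrap-every-registered-endpoint idea, here is a toy sketch. It is not the code from this PR: ToyFactory, register, and the spans list are hypothetical stand-ins, and timings are appended to a list instead of being emitted as OpenTelemetry spans.

```python
import functools
import time
from typing import Any, Callable, Dict, List, Tuple


class ToyFactory:
    """Hypothetical stand-in for a tiler factory: a name -> endpoint registry."""

    def __init__(self) -> None:
        self.endpoints: Dict[str, Callable[..., Any]] = {}
        # Stand-in for exported spans: (endpoint name, duration in seconds).
        self.spans: List[Tuple[str, float]] = []

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Register a function as an endpoint under the given name."""

        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            self.endpoints[name] = func
            return func

        return decorator

    def add_telemetry(self) -> None:
        """Replace every registered endpoint with a timed wrapper."""
        for name, func in list(self.endpoints.items()):
            self.endpoints[name] = self._traced(name, func)

    def _traced(self, name: str, func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the call even when the endpoint raises.
                self.spans.append((name, time.perf_counter() - start))

        return wrapper


factory = ToyFactory()


@factory.register("tile")
def tile(z: int, x: int, y: int) -> str:
    return f"{z}/{x}/{y}"


factory.add_telemetry()
result = factory.endpoints["tile"](1, 2, 3)
```

Because the wrapping happens in one place, endpoint functions stay untouched, which is the appeal of doing this at the factory level.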

Endpoint instrumentation: endpoint methods in all of the tiler factory classes in titiler.core, titiler.mosaic, and titiler.xarray now use operation_tracer context managers to track the performance of specific operations within an endpoint function.
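A minimal sketch of what such a context manager could look like, assuming only the public opentelemetry-api calls trace.get_tracer and start_as_current_span. The body below is a guess at the pattern, not the implementation in titiler.core.telemetry, and the tracer name "titiler-sketch" is made up. When OpenTelemetry is not installed it falls back to a no-op, matching the "only activated if installed" behaviour described above.

```python
from contextlib import contextmanager
from typing import Any, Iterator, Optional

try:
    from opentelemetry import trace

    # Hypothetical tracer name, chosen for this sketch only.
    tracer = trace.get_tracer("titiler-sketch")
except ImportError:
    tracer = None


@contextmanager
def operation_tracer(name: str) -> Iterator[Optional[Any]]:
    """Wrap a block of code in a span, or do nothing without OpenTelemetry."""
    if tracer is None:
        # No-op path: the wrapped block still runs, nothing is recorded.
        yield None
        return
    with tracer.start_as_current_span(name) as span:
        yield span


def read_tile_sketch() -> int:
    """Stand-in endpoint body with two traced operations."""
    with operation_tracer("open_dataset"):
        dataset = [1, 2, 3]
    with operation_tracer("read_tile"):
        total = sum(dataset)
    return total


value = read_tile_sketch()
```

Each with block shows up as its own child span under the request span, so the time spent opening the dataset and reading the tile can be compared in the trace timeline.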

To keep things clean I opted to skip adding additional attributes to the traces, but we might want to revisit that in the future!

TODO

  • add telemetry.py
  • decorate factory classes
  • add tests
  • update docs

@github-actions github-actions bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'TiTiler performance Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.30.

Benchmark suite | Current: df331c6 | Previous: 995e833 | Ratio
WGS1984Quad longest_transaction | 0.07 s | 0.05 s | 1.40

This comment was automatically generated by workflow using github-action-benchmark.

@vincentsarago
Member

Thanks @hrodmn this looks super promising 🙏

I've added a couple of comments (mostly on naming).

My main concern is the number of lines the tracing will add to the factory, mostly due to the set_attribute blocks, which we should try to avoid if the information can be derived from another place (e.g. by looking at the query parameters, like for the dataset path and tile matrix set).

If we move forward, we would need to add trace to all other factory endpoints 😬

@hrodmn
Contributor Author

hrodmn commented Jun 13, 2025

My main concern is the number of lines the tracing will add to the factory, mostly due to the set_attribute blocks, which we should try to avoid if the information can be derived from another place (e.g. by looking at the query parameters, like for the dataset path and tile matrix set).

Yes, I agree about not wanting to add so many lines just for tracing. I will work on ways to make attribute setting more succinct. User input parameters should be easy to absorb automatically but anything else would need to be added manually. Maybe we can leave ImageData introspection out for now and add it to rio-tiler if users want it.
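One way the "absorb user input automatically" idea could work is to turn the request's query string into a flat attribute mapping and attach it to the span in a single call. This is only a sketch: the query_attributes helper, the "query." prefix, and the example URL are hypothetical, and the standard library's parse_qsl stands in for whatever the web framework provides.

```python
from typing import Dict, List, Union
from urllib.parse import parse_qsl

AttributeValue = Union[str, List[str]]


def query_attributes(query_string: str, prefix: str = "query.") -> Dict[str, AttributeValue]:
    """Map query parameters to span-style attributes; repeated keys become lists."""
    attributes: Dict[str, AttributeValue] = {}
    for key, value in parse_qsl(query_string, keep_blank_values=True):
        name = f"{prefix}{key}"
        if name in attributes:
            existing = attributes[name]
            previous = existing if isinstance(existing, list) else [existing]
            attributes[name] = previous + [value]
        else:
            attributes[name] = value
    return attributes


# Hypothetical request: a dataset path plus a repeated band index parameter.
attrs = query_attributes("url=s3://bucket/cog.tif&bidx=1&bidx=2")
```

With OpenTelemetry installed, a mapping like this could be passed to span.set_attributes(attrs), which would replace a block of per-parameter set_attribute calls in each endpoint.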

If we move forward, we would need to add trace to all other factory endpoints 😬

Thanks to the add_telemetry method we will get basic tracing capability for any factory that is based on BaseFactory for free! If we want any of the more detailed within-function traces (like open_dataset, read_tile, etc) that will require changes in every method definition where we need more detailed info.

@hrodmn
Contributor Author

hrodmn commented Jun 13, 2025

We could decide to skip setting attributes completely and just add the operation_tracer context manager in places where we want more detailed profiling or performance evaluation.

We can still get a trace timeline like this without exporting all of the extra attributes:
[image: screenshot of a trace timeline]

@hrodmn hrodmn force-pushed the feat/open-telemetry-hr branch from 4a86484 to 8b8a9b8 June 16, 2025 11:48
@hrodmn hrodmn marked this pull request as ready for review June 16, 2025 11:50
@hrodmn hrodmn changed the title from "[WIP]: add basic telemetry decorators and tracing to tile method" to "add basic telemetry decorators and tracing to tile method" Jun 16, 2025
@hrodmn hrodmn changed the title from "add basic telemetry decorators and tracing to tile method" to "add OpenTelemetry instrumentation to titiler.core" Jun 16, 2025
@hrodmn hrodmn requested a review from vincentsarago June 16, 2025 11:58
@hrodmn
Contributor Author

hrodmn commented Jun 16, 2025

@vincentsarago thanks for your feedback on these changes! I have implemented your naming suggestions and trimmed out the verbose attribute tracing lines.

I did some research on the expected performance hit for including these traces and I don't think it will be significant. Using the BatchSpanProcessor reduces overhead for an application because it collects traces in batches and sends them to the collector endpoint in a separate thread. It is also conventional for an application to only collect traces for a sample (e.g. 2%) of requests, which would limit the impact of enabling telemetry if there is any measurable performance reduction.
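For reference, a batched and sampled setup with the OpenTelemetry Python SDK could look roughly like the sketch below. It assumes the opentelemetry-sdk package is installed; the configure_tracing function, the "titiler-demo" service name, and the 0.02 default are illustrative choices, and ConsoleSpanExporter stands in for whatever exporter a real deployment would point at its collector.

```python
def configure_tracing(sample_ratio: float = 0.02):
    """Set up batched, sampled tracing; requires the opentelemetry-sdk package."""
    # Imports are local so this module still loads without OpenTelemetry installed.
    from opentelemetry import trace
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
    from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

    provider = TracerProvider(
        # Hypothetical service name for this sketch.
        resource=Resource.create({"service.name": "titiler-demo"}),
        # Sample roughly sample_ratio of new traces; follow the parent's decision otherwise.
        sampler=ParentBased(root=TraceIdRatioBased(sample_ratio)),
    )
    # Spans are queued and exported in batches from a background thread.
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    return provider
```

Calling configure_tracing() once at application startup would be enough; with the default ratio, about 2% of requests that start a new trace would be recorded.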

@geospatial-jeff
Contributor

I did some research on the expected performance hit for including these traces and I don't think it will be significant.

Historically OTEL has had some bad performance issues but most of that has been worked out over the last few years 👍

@hrodmn hrodmn marked this pull request as draft June 16, 2025 14:21
@hrodmn hrodmn changed the title from "add OpenTelemetry instrumentation to titiler.core" to "[WIP] add OpenTelemetry instrumentation to titiler.core" Jun 16, 2025
@hrodmn hrodmn force-pushed the feat/open-telemetry-hr branch 2 times, most recently from e1d2a27 to 5786e36 June 23, 2025 15:57
@hrodmn hrodmn force-pushed the feat/open-telemetry-hr branch from a07dfeb to 15b75f4 June 24, 2025 11:30
@hrodmn hrodmn marked this pull request as ready for review June 24, 2025 11:33
@hrodmn hrodmn force-pushed the feat/open-telemetry-hr branch from 8ca0685 to ed338c0 July 8, 2025 11:02
@vincentsarago vincentsarago self-requested a review July 8, 2025 13:13
Contributor

@geospatial-jeff geospatial-jeff left a comment

🚀

@hrodmn hrodmn merged commit bb4de6b into main Jul 8, 2025
10 checks passed
@hrodmn hrodmn deleted the feat/open-telemetry-hr branch July 8, 2025 14:48

4 participants