JacksonWeber

Description

This pull request introduces experimental support for detecting synthetic traffic (such as bots and test monitors) in the OpenTelemetry Requests instrumentation by analyzing the User-Agent header. It adds logic to classify requests as originating from bots or test systems, sets a corresponding span attribute, and includes comprehensive tests to validate this behavior.

The most important changes are:

Synthetic User Agent Detection and Span Attributes:

  • Added a function _detect_synthetic_user_agent in __init__.py to analyze the User-Agent header and classify requests as either "bot" or "test" traffic, prioritizing test patterns over bot patterns.
  • Set the user_agent.synthetic.type span attribute when a synthetic user agent is detected, using new constants for attribute name and possible values. [1] [2]
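
The detection described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the pattern lists (`TEST_PATTERNS`, `BOT_PATTERNS`) and the helper names `detect_synthetic_user_agent` and `synthetic_span_attributes` are assumptions made for the example, and the attribute is collected into a plain dict rather than set on a real span.

```python
from typing import Dict, Optional

# Attribute name and values as described in this PR.
ATTR_USER_AGENT_SYNTHETIC_TYPE = "user_agent.synthetic.type"
SYNTHETIC_TYPE_BOT = "bot"
SYNTHETIC_TYPE_TEST = "test"

# Hypothetical pattern lists, chosen only for illustration.
TEST_PATTERNS = ("alwayson",)
BOT_PATTERNS = ("googlebot", "bingbot")


def detect_synthetic_user_agent(user_agent: Optional[str]) -> Optional[str]:
    """Return "test", "bot", or None for a User-Agent header value."""
    if not user_agent:
        return None
    lowered = user_agent.lower()  # case-insensitive substring matching
    # Test patterns are checked first so they take priority over bot patterns.
    if any(pattern in lowered for pattern in TEST_PATTERNS):
        return SYNTHETIC_TYPE_TEST
    if any(pattern in lowered for pattern in BOT_PATTERNS):
        return SYNTHETIC_TYPE_BOT
    return None


def synthetic_span_attributes(user_agent: Optional[str]) -> Dict[str, str]:
    """Build the attributes an instrumentation could add to its span."""
    synthetic_type = detect_synthetic_user_agent(user_agent)
    if synthetic_type is None:
        return {}
    return {ATTR_USER_AGENT_SYNTHETIC_TYPE: synthetic_type}
```

In an instrumentation, the returned dict would be merged into the span attributes before the span is started; ordinary browser user agents produce an empty dict, so no attribute is set for them.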

Semantic Conventions:

  • Introduced a new module semconv.py defining experimental semantic convention constants for synthetic user agent detection, including ATTR_USER_AGENT_SYNTHETIC_TYPE, USER_AGENT_SYNTHETIC_TYPE_VALUE_BOT, and USER_AGENT_SYNTHETIC_TYPE_VALUE_TEST.
  • Updated imports in __init__.py to use these new constants from semconv.py.

Testing:

  • Added a comprehensive test suite in test_user_agent_synthetic.py to verify detection logic for various user agent scenarios, including bots, test agents, normal browsers, case insensitivity, substring matches, and pattern priority.

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Tested via unit tests included in test_user_agent_synthetic.py

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@JacksonWeber JacksonWeber requested a review from a team as a code owner August 4, 2025 23:48
@JacksonWeber JacksonWeber requested a review from rads-1996 August 21, 2025 18:05
Contributor

@xrmx xrmx left a comment

Since there's nothing requests-specific here, I think this should go into opentelemetry-util-http instead. That said, I'm not sure we should ship our own experimental semantic conventions. I don't see any PR in the semantic-conventions repo adding this, so could you please elaborate a bit on what your plan is?

@xrmx xrmx moved this to Reviewed PR that needs fixing in @xrmx's Python PR digest Aug 22, 2025
@xrmx
Contributor

xrmx commented Aug 23, 2025

> Since there's nothing requests-specific here, I think this should go into opentelemetry-util-http instead. That said, I'm not sure we should ship our own experimental semantic conventions. I don't see any PR in the semantic-conventions repo adding this, so could you please elaborate a bit on what your plan is?

I had seen open-telemetry/semantic-conventions#1523 in the semconv repo before writing that comment, but I only read the PR description, which uses an old attribute name, and not the title, which has the updated name that matches this PR. Since the attribute has already been in the semconv package for a while (https://github.com/open-telemetry/opentelemetry-python/blob/05343a5c8848f5f55a69100a0becf61766b33051/opentelemetry-semantic-conventions/src/opentelemetry/semconv/_incubating/attributes/user_agent_attributes.py#L41), we should just import it from there rather than open-coding it.

@JacksonWeber
Author

> Since there's nothing requests-specific here, I think this should go into opentelemetry-util-http instead. That said, I'm not sure we should ship our own experimental semantic conventions. I don't see any PR in the semantic-conventions repo adding this, so could you please elaborate a bit on what your plan is?

  1. This PR only adds this functionality to requests; however, I'm happy to move these changes over to opentelemetry-util-http if that would be more appropriate.
  2. Addressing this on your other comment.

@JacksonWeber
Author

> Since there's nothing requests-specific here, I think this should go into opentelemetry-util-http instead. That said, I'm not sure we should ship our own experimental semantic conventions. I don't see any PR in the semantic-conventions repo adding this, so could you please elaborate a bit on what your plan is?

> I had seen open-telemetry/semantic-conventions#1523 in the semconv repo before writing that comment, but I only read the PR description, which uses an old attribute name, and not the title, which has the updated name that matches this PR. Since the attribute has already been in the semconv package for a while (https://github.com/open-telemetry/opentelemetry-python/blob/05343a5c8848f5f55a69100a0becf61766b33051/opentelemetry-semantic-conventions/src/opentelemetry/semconv/_incubating/attributes/user_agent_attributes.py#L41), we should just import it from there rather than open-coding it.

Thanks for pointing this out. The guidance in OTel JS works a bit differently (they ask contributors to hard-code these kinds of experimental semantic conventions to avoid breaking customers who use the old experimental attributes). I'll update these imports.
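
For illustration, the import change discussed here might look roughly like the sketch below. The module path comes from the link above; the names `USER_AGENT_SYNTHETIC_TYPE` and `UserAgentSyntheticTypeValues` are assumed to be what that module exports and should be checked against the installed opentelemetry-semantic-conventions version. The hard-coded fallback is there only so the sketch runs without the package; it mirrors the hard-coding approach mentioned for OTel JS and is not what the review asks for.

```python
try:
    # Preferred: reuse the incubating constants instead of redefining them.
    # The names below are assumed to be exported by this module.
    from opentelemetry.semconv._incubating.attributes.user_agent_attributes import (
        USER_AGENT_SYNTHETIC_TYPE,
        UserAgentSyntheticTypeValues,
    )

    SYNTHETIC_TYPE_BOT = UserAgentSyntheticTypeValues.BOT.value
    SYNTHETIC_TYPE_TEST = UserAgentSyntheticTypeValues.TEST.value
except ImportError:
    # Fallback for this sketch only: hard-coded copies of the same values,
    # used when the package (or these names) is unavailable.
    USER_AGENT_SYNTHETIC_TYPE = "user_agent.synthetic.type"
    SYNTHETIC_TYPE_BOT = "bot"
    SYNTHETIC_TYPE_TEST = "test"
```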

@JacksonWeber JacksonWeber requested a review from xrmx August 25, 2025 18:33