Celery task producer (celery.apply) span in APM triggers Datadog service inference into creating a new service for the service's own hostname #11491

Closed
patrys opened this issue Nov 21, 2024 · 2 comments · Fixed by #10750

patrys commented Nov 21, 2024

We've enabled service inference, and our service list is now filled with spurious services named after every pod in every k8s service that publishes tasks. After running for just an hour, we already have 150 of those.

Each of the fake services reports only a single source of data, celery.apply. Viewing the traces for celery.apply confirms that celery.hostname appears to be converted into peer.hostname, even though this span does not make a client connection anywhere.

Under each celery.apply span we see the expected sqs.sendmessage span for the actual delivery of the task. That second span has the correct peer tags and is paired with the expected queue service.

Agents are deployed using Helm chart version 3.73.0
Code is traced using dd-trace-py version 2.11.2

@patrys patrys changed the title Celery task producer (celery.apply) in APM triggers Datadog service inference into creating a new service for the service's own hostname Celery task producer (celery.apply) span in APM triggers Datadog service inference into creating a new service for the service's own hostname Nov 21, 2024

patrys commented Nov 22, 2024

After some investigation: the signal handler calls set_tags_from_context(span, kwargs["headers"]), and the accompanying comment suggests that this is specifically to set celery.hostname. The hostname in question is the pod name of the task producer.

Then the trace_after_publish signal handler reads celery.hostname and uses its value to set out.host, which is wrong: the hostname identifies the origin of the task, not its target.

out.host is then transformed (by the agent, I assume) to peer.hostname, which is expected behavior, but because of the above, the value is incorrect.
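
The chain described above can be sketched in a few lines of plain Python. This is only an illustration, not dd-trace-py's code: the `*_sketch` functions, the `hostname` header key, and the pod name are assumptions, and a span is modeled as a plain dict of tags.

```python
# Hypothetical stand-ins for the two signal handlers; a span is just a dict of tags.

def set_tags_from_context_sketch(tags, context):
    """Copy every non-empty context entry into the tags under a 'celery.' prefix."""
    for key, value in context.items():
        if value is None or value == "":
            continue
        tags["celery." + key] = value


def after_publish_sketch(tags):
    """Reproduce the reported behavior: out.host is taken from celery.hostname."""
    host = tags.get("celery.hostname")
    if host is not None:
        tags["out.host"] = host


# Assumed example values: the producer's pod name travels in the task headers.
producer_pod = "web-7c9f8d-abcde"
headers = {"hostname": producer_pod, "id": "1234"}

span_tags = {}
set_tags_from_context_sketch(span_tags, headers)
after_publish_sketch(span_tags)

# out.host now names the producer itself, so anything that derives
# peer.hostname from out.host would treat the pod as a remote peer.
print(span_tags["out.host"] == producer_pod)
```

With one such tag value per pod, a rule that turns each distinct peer hostname into a service would yield one inferred service per producer pod, which matches what we see.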

wconti27 added a commit that referenced this issue Nov 26, 2024
… 2.17] (#11540)

Backport b9573be from #10750 to 2.17.

## Motivation

Change `out.host` tags to point towards the celery broker, instead of
the local celery hostname. Fixes service-representation issues.

Fixes [11491](#11491)

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: Zachary Groves <32471391+ZStriker19@users.noreply.github.com>
Co-authored-by: William Conti <58711692+wconti27@users.noreply.github.com>
Co-authored-by: William Conti <william.conti@datadoghq.com>
@wconti27
Contributor

Hello, this behavior has been fixed in our latest patch release, 2.17.2. out.host will now point to the celery broker host.
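
As a rough illustration of what "point to the celery broker host" means, the sketch below derives a host from a broker URL with the standard library. How the actual fix obtains the broker host is not shown in this thread; the `broker_host_sketch` helper and the example URLs are assumptions made for illustration only.

```python
from urllib.parse import urlsplit


def broker_host_sketch(broker_url):
    """Return the host part of a broker URL, or None if the URL carries no host."""
    return urlsplit(broker_url).hostname


def tag_out_host_sketch(tags, broker_url):
    """Set out.host from the broker URL; leave the tag unset when no host is known."""
    host = broker_host_sketch(broker_url)
    if host:
        tags["out.host"] = host
    return tags


# Assumed example URLs, not taken from the issue.
redis_tags = tag_out_host_sketch({}, "redis://broker.internal:6379/0")
amqp_tags = tag_out_host_sketch({}, "amqp://user:secret@rabbit.internal:5672//")
sqs_tags = tag_out_host_sketch({}, "sqs://")  # no host in the URL

print(redis_tags, amqp_tags, sqs_tags)
```

The point of the design is that the tag describes the destination of the published message rather than the process that published it, and is simply left unset when the destination host cannot be determined.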
