Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(telemetry): ensure instrumentation telemetry is compatible with python 3.12 #6859

Merged
merged 42 commits into from
Sep 8, 2023

Conversation

mabdinur
Copy link
Contributor

@mabdinur mabdinur commented Sep 8, 2023

Motivation

Description

  • Start telemetry worker thread as early as possible.
  • Delays sending all telemetry events until app-started is queued.
  • Refactors tests to align with this new logic.

Risk

  • Telemetry events (metrics/logs/integrations) are queued as early as possible but these events are only sent when the trace agent writer is started. This may result in a memory leak if high cardinality telemetry metrics and logs are added in the future. This is not a concern right now.

Checklist

  • Change(s) are motivated and described in the PR description.
  • Testing strategy is described if automated tests are not included in the PR.
  • Risk is outlined (performance impact, potential for breakage, maintainability, etc).
  • Change is maintainable (easy to change, telemetry, documentation).
  • Library release note guidelines are followed. If no release note is required, add label changelog/no-changelog.
  • Documentation is included (in-code, generated user docs, public corp docs).
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Title is accurate.
  • No unnecessary changes are introduced.
  • Description motivates each change.
  • Avoids breaking API changes unless absolutely necessary.
  • Testing strategy adequately addresses listed risk(s).
  • Change is maintainable (easy to change, telemetry, documentation).
  • Release note makes sense to a user of the library.
  • Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment.
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy
  • If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

Yun-Kim and others added 30 commits August 31, 2023 14:14
Includes:
- pinning images and package dependencies in CircleCI
- removing stale `master` branch
- dropping pylons framework test
- updating versioning documentation
- drop Python < 3.7 as supported on setup.py
- removing pylons/boto from CI suitespec
- release note for 2.0
Includes:
- pinning images and package dependencies in CircleCI
- removing stale `master` branch
- dropping pylons framework test
- updating versioning documentation
- drop Python < 3.7 as supported on setup.py
- removing pylons/boto from CI suitespec
- release note for 2.0
This PR adds a note to upgrade to 2.x in `upgrading.rst` and removes all
deprecated items slated for removal in 2.0.0. This includes:

- `DD_GEVENT_PATCH_ALL`: no special configuration is now necessary to
make `ddtrace-run` work with gevent.
- `DD_AWS_TAG_ALL_PARAMS`: the boto/botocore/aiobotocore integrations no
longer collect all API parameters by default.
- `DD_REMOTECONFIG_POLL_SECONDS`: replaced by
`DD_REMOTE_CONFIG_POLL_INTERVAL_SECONDS`
- ASM deprecated constants including ``APPSEC_ENABLED``,
``APPSEC_JSON``, ``APPSEC_EVENT_RULE_VERSION``,
``APPSEC_EVENT_RULE_ERRORS``,
``APPSEC_EVENT_RULE_LOADED``, ``APPSEC_EVENT_RULE_ERROR_COUNT``,
``APPSEC_WAF_DURATION``, ``APPSEC_WAF_DURATION_EXT``,
``APPSEC_WAF_TIMEOUTS``, ``APPSEC_WAF_VERSION``,
``APPSEC_ORIGIN_VALUE``, ``APPSEC_BLOCKED``,
``IAST_JSON``, ``IAST_ENABLED``, ``IAST_CONTEXT_KEY``. These constants
were meant for private use only and should not affect existing code.
- ``ddtrace.contrib.grpc.constants.GRPC_PORT_KEY``: replaced by
`ddtrace.ext.net.TARGET_PORT`
- ``ddtrace.ext.cassandra.ROW_COUNT``, ``ddtrace.ext.mongo.ROW_COUNT``,
``ddtrace.ext.sql.ROW_COUNT``: replaced by `ddtrace.ext.db.ROWCOUNT`
- `ddtrace.filters.TraceCiVisibilityFilter`: removed as this was for
private use only and does not affect existing code.
- `ddtrace.contrib.starlette.get_resource` and
`ddtrace.contrib.starlette.span_modifier` and
`ddtrace.contrib.fastapi.span_modifier`: the fastapi and starlette
integrations now provide the full route and not just mounted route for
sub-applications by default.
- `ddtrace.contrib.starlette.config['aggregate_resources']` and
`ddtrace.contrib.fastapi.config['aggregate_resources']`: the starlette
and fastapi integrations no longer have the option to aggregate
resources as this occurs by default now.
- `DD_TRACE_OBFUSCATION_QUERY_STRING_PATTERN`: replaced by
`DD_TRACE_OBFUSCATION_QUERY_STRING_REGEXP`.

Additionally, the `pep562` dependency and references to it have been
removed as it is no longer needed after dropping support for Python <
3.7.

Note that `DD_CALL_BASIC_CONFIG` and `DD_LOG_FORMAT` are removed in
#6612 which includes a subtle change in functionality.

## Checklist

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [x] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
@mabdinur mabdinur marked this pull request as ready for review September 8, 2023 02:06
@mabdinur mabdinur requested a review from a team as a code owner September 8, 2023 02:06
@majorgreys
Copy link
Contributor

@mabdinur great work! Can you add a release note for this?

@mabdinur mabdinur changed the title fix(telemetry): ensure instrumentation telemetry is compatible with python 3.12 chore(telemetry): ensure instrumentation telemetry is compatible with python 3.12 Sep 8, 2023
@mabdinur
Copy link
Contributor Author

mabdinur commented Sep 8, 2023

Member

I updated the PR scope from fix to chore. Telemetry is not a customer facing feature. I don't think a release note would be useful or actionable.

@mabdinur mabdinur added the changelog/no-changelog A changelog entry is not required for this PR. label Sep 8, 2023
@majorgreys majorgreys requested review from a team as code owners September 8, 2023 15:21
@majorgreys majorgreys requested review from jbertran and removed request for a team September 8, 2023 15:21
@pr-commenter
Copy link

pr-commenter bot commented Sep 8, 2023

Benchmarks

Benchmark execution time: 2023-09-08 16:28:38

Comparing candidate commit 6c86ea3 in PR branch munir/fix-telemetry-2.x with baseline commit 44a87f8 in branch yunkim/add-py-312.

Found 4 performance improvements and 7 performance regressions! Performance is the same for 93 metrics, 0 unstable metrics.

scenario:otelspan-start

  • 🟥 max_rss_usage [+1.032MB; +1.206MB] or [+2.201%; +2.572%]

scenario:otelspan-start-finish

  • 🟥 max_rss_usage [+1.095MB; +1.245MB] or [+3.877%; +4.409%]

scenario:otelspan-start-finish-telemetry

  • 🟩 max_rss_usage [-1060.018KB; -897.870KB] or [-3.568%; -3.022%]

scenario:sethttpmeta-all-disabled

  • 🟩 max_rss_usage [-879.697KB; -732.079KB] or [-3.070%; -2.555%]

scenario:sethttpmeta-obfuscation-disabled

  • 🟥 max_rss_usage [+1.061MB; +1.223MB] or [+3.854%; +4.441%]

scenario:sethttpmeta-obfuscation-regular-case-explicit-query

  • 🟥 max_rss_usage [+600.965KB; +768.328KB] or [+2.142%; +2.739%]

scenario:sethttpmeta-obfuscation-regular-case-implicit-query

  • 🟩 max_rss_usage [-809.374KB; -645.115KB] or [-2.810%; -2.240%]

scenario:sethttpmeta-obfuscation-send-querystring-disabled

  • 🟥 max_rss_usage [+672.031KB; +843.079KB] or [+2.391%; +3.000%]

scenario:sethttpmeta-obfuscation-worst-case-implicit-query

  • 🟥 max_rss_usage [+678.697KB; +842.557KB] or [+2.414%; +2.996%]

scenario:span-start-finish

  • 🟥 max_rss_usage [+913.006KB; +1073.554KB] or [+3.330%; +3.915%]

scenario:span-start-finish-telemetry

  • 🟩 max_rss_usage [-1.258MB; -1.104MB] or [-4.359%; -3.826%]

@mabdinur mabdinur merged commit dad4d9d into yunkim/add-py-312 Sep 8, 2023
138 of 141 checks passed
@mabdinur mabdinur deleted the munir/fix-telemetry-2.x branch September 8, 2023 16:30
mabdinur added a commit that referenced this pull request Sep 11, 2023
… python 3.12 (#6859)

- Make the instrumentation telemetry client compatible with python3.12:
python/cpython#104826

 - Start telemetry worker thread as early as possible.
 - Delays sending all telemetry events until app-started is queued.
 - Refactors tests to align with this new logic.

- Telemetry events (metrics/logs/integrations) are queued as early as
possible but these events are only sent when the trace agent writer is
started. This **may** result in a memory leak if high cardinality
telemetry metrics and logs are added in the future. This is not a
concern right now.

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [x] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Yun Kim <yun.kim@datadoghq.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
Co-authored-by: Gabriele N. Tornetta <gabriele.tornetta@datadoghq.com>
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
Co-authored-by: Emmett Butler <emmett.butler321@gmail.com>
Co-authored-by: ZStriker19 <zach.groves@datadoghq.com>
majorgreys added a commit that referenced this pull request Sep 13, 2023
… python 3.12 (#6859)

## Motivation

- Make the instrumentation telemetry client compatible with python3.12:
python/cpython#104826

## Description

 - Start telemetry worker thread as early as possible.
 - Delays sending all telemetry events until app-started is queued.
 - Refactors tests to align with this new logic.
 

## Risk 

- Telemetry events (metrics/logs/integrations) are queued as early as
possible but these events are only sent when the trace agent writer is
started. This **may** result in a memory leak if high cardinality
telemetry metrics and logs are added in the future. This is not a
concern right now.

## Checklist

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [x] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Yun Kim <yun.kim@datadoghq.com>
Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
Co-authored-by: Gabriele N. Tornetta <gabriele.tornetta@datadoghq.com>
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
Co-authored-by: Emmett Butler <emmett.butler321@gmail.com>
Co-authored-by: ZStriker19 <zach.groves@datadoghq.com>
@mabdinur mabdinur mentioned this pull request Sep 25, 2023
18 tasks
mabdinur added a commit that referenced this pull request Sep 26, 2023
## Description

This PR ensures the instrumentation telemetry client is compatible with
python3.12's threading module by:
- Removing `telemetry_writer.add_count_metric` from
`SpanAggregator.shutdown`. This ensures telemetry events are not queued
on tracer shutdown.
- Ensuring the telemetry writer thread is only started once per
application.
- Ensures the telemetry writer thread is disabled when an application is
shutdown.

## Motivation

The following change failed to support the telemetry client in
python3.12: #6859. This PR
will hopefully fix this 🤞.

### Reproduction for python3.12 runtime errors
```
docker run --rm -it python:3.12.0rc3 bash
root@a1f3c1d307ec:/# pip install ddtrace==2.0.0rc2
root@a1f3c1d307ec:/# python -c "import ddtrace; _ = ddtrace.tracer.trace('foo'); raise Exception"
```

### Output
```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Exception
Exception ignored in atexit callback: <bound method Tracer._atexit of <ddtrace.tracer.Tracer object at 0xffffbd967260>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 293, in _atexit
    self.shutdown(timeout=self.SHUTDOWN_TIMEOUT)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 1024, in shutdown
    processor.shutdown(timeout)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 270, in shutdown
    self._queue_span_count_metrics("spans_created", "integration_name", None)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 291, in _queue_span_count_metrics
    telemetry_writer.add_count_metric(
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 514, in add_count_metric
    if self.status == ServiceStatus.RUNNING or self.enable():
                                               ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 218, in enable
    self.start()
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/service.py", line 58, in start
    self._start_service(*args, **kwargs)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/periodic.py", line 135, in _start_service
    self._worker.start()
  File "/usr/local/lib/python3.12/threading.py", line 971, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't create new thread at interpreter shutdown
```

## Risk

This PR reverts an optimization that ensured telemetry span creation
metrics were queued in batches of 100. Without this optimization we can
expect a 2-5% increase to span creation and span finish.


## Checklist

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [x] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.
github-actions bot pushed a commit that referenced this pull request Sep 26, 2023
## Description

This PR ensures the instrumentation telemetry client is compatible with
python3.12's threading module by:
- Removing `telemetry_writer.add_count_metric` from
`SpanAggregator.shutdown`. This ensures telemetry events are not queued
on tracer shutdown.
- Ensuring the telemetry writer thread is only started once per
application.
- Ensures the telemetry writer thread is disabled when an application is
shutdown.

## Motivation

The following change failed to support the telemetry client in
python3.12: #6859. This PR
will hopefully fix this 🤞.

### Reproduction for python3.12 runtime errors
```
docker run --rm -it python:3.12.0rc3 bash
root@a1f3c1d307ec:/# pip install ddtrace==2.0.0rc2
root@a1f3c1d307ec:/# python -c "import ddtrace; _ = ddtrace.tracer.trace('foo'); raise Exception"
```

### Output
```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Exception
Exception ignored in atexit callback: <bound method Tracer._atexit of <ddtrace.tracer.Tracer object at 0xffffbd967260>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 293, in _atexit
    self.shutdown(timeout=self.SHUTDOWN_TIMEOUT)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 1024, in shutdown
    processor.shutdown(timeout)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 270, in shutdown
    self._queue_span_count_metrics("spans_created", "integration_name", None)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 291, in _queue_span_count_metrics
    telemetry_writer.add_count_metric(
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 514, in add_count_metric
    if self.status == ServiceStatus.RUNNING or self.enable():
                                               ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 218, in enable
    self.start()
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/service.py", line 58, in start
    self._start_service(*args, **kwargs)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/periodic.py", line 135, in _start_service
    self._worker.start()
  File "/usr/local/lib/python3.12/threading.py", line 971, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't create new thread at interpreter shutdown
```

## Risk

This PR reverts an optimization that ensured telemetry span creation
metrics were queued in batches of 100. Without this optimization we can
expect a 2-5% increase to span creation and span finish.

## Checklist

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [x] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

(cherry picked from commit 065784b)
mabdinur added a commit that referenced this pull request Sep 26, 2023
Backport 065784b from #7043 to 2.0.

## Description

This PR ensures the instrumentation telemetry client is compatible with
python3.12's threading module by:
- Ensuring the telemetry writer thread is only started once per
application.
- Ensures the telemetry writer thread is disabled when an application is
shutdown.
- Ensures telemetry metrics are queued `SpanAggregator.shutdown` without
restarting the telemetry writer thread.

## Motivation

The following change failed to support the telemetry client in
python3.12: #6859. This PR
will hopefully fix this 🤞.

### Reproduction for python3.12 runtime errors
```
docker run --rm -it python:3.12.0rc3 bash
root@a1f3c1d307ec:/# pip install ddtrace==2.0.0rc2
root@a1f3c1d307ec:/# python -c "import ddtrace; _ = ddtrace.tracer.trace('foo'); raise Exception"
```

### Output
```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Exception
Exception ignored in atexit callback: <bound method Tracer._atexit of <ddtrace.tracer.Tracer object at 0xffffbd967260>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 293, in _atexit
    self.shutdown(timeout=self.SHUTDOWN_TIMEOUT)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 1024, in shutdown
    processor.shutdown(timeout)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 270, in shutdown
    self._queue_span_count_metrics("spans_created", "integration_name", None)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 291, in _queue_span_count_metrics
    telemetry_writer.add_count_metric(
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 514, in add_count_metric
    if self.status == ServiceStatus.RUNNING or self.enable():
                                               ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 218, in enable
    self.start()
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/service.py", line 58, in start
    self._start_service(*args, **kwargs)
  File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/periodic.py", line 135, in _start_service
    self._worker.start()
  File "/usr/local/lib/python3.12/threading.py", line 971, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't create new thread at interpreter shutdown
```

## Risk

~~This PR reverts an optimization that ensured telemetry span creation
metrics were queued in batches of 100. Without this optimization we can
expect a 5-10% increase to span creation and span finish.~~


## Checklist

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [x] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

Co-authored-by: Munir Abdinur <munir.abdinur@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/no-changelog A changelog entry is not required for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants