chore(internal): symbol inference (#6920)
We introduce logic to infer the symbols exported by a module. This can be used by any product that requires symbol information to provide features such as expression auto-completion. Enablement of symbol inference and upload is controlled via a remote configuration (RC) signal, so RC is an essential internal requirement for this feature to work as intended.

For performance reasons, symbol inference starts from module objects, after they have been imported. This way we skip a second parsing step and reuse the result of the original operation performed by CPython itself when the module was first loaded. Because the main consumer of this feature is Dynamic Instrumentation (DI), the information retrieved from each module reflects DI's ability to discover "instrumentable" symbols at runtime. Some finer details (such as nested functions/classes) might therefore be missing at this stage.

There is also special handling for scenarios where child processes are spawned via fork to parallelise the workload. Because each child process would perform the same symbol discovery, the backend would receive multiple uploads of duplicate information. To avoid wasting resources and bandwidth, we added logic to ensure that uploads happen from the parent process and _at most_ one forked child process. The risk of this approach is that, if fork is used for anything other than the worker child model, the symbol picture reconstructed on the backend side might be incomplete. However, we are betting that this scenario is very rare compared to the more common forked child process model.

## Checklist

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included in the PR.
- [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`.
- [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [ ] Title is accurate.
- [ ] No unnecessary changes are introduced.
- [ ] Description motivates each change.
- [ ] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary.
- [ ] Testing strategy adequately addresses listed risk(s).
- [ ] Change is maintainable (easy to change, telemetry, documentation).
- [ ] Release note makes sense to a user of the library.
- [ ] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment.
- [ ] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [ ] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`.
- [ ] This PR doesn't touch any of that.
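To make the "start from module objects" idea in the description concrete, here is a minimal sketch of walking an already-imported module without re-parsing its source. It is not the implementation added by this commit: `infer_symbols` and the throwaway `demo` module are hypothetical names, and the sketch relies only on the standard `inspect` and `types` modules.

```python
import inspect
import types


def infer_symbols(module):
    """Return sorted (kind, name) pairs for symbols defined in `module`.

    Hypothetical helper: it inspects the live module object only, so
    nested functions and classes are not discovered.
    """
    symbols = []
    for name, obj in vars(module).items():
        # Skip anything that was merely imported into this module.
        if getattr(obj, "__module__", None) != module.__name__:
            continue
        if inspect.isfunction(obj):
            symbols.append(("function", name))
        elif inspect.isclass(obj):
            symbols.append(("class", name))
            # Collect plain functions defined directly in the class body.
            for attr, member in vars(obj).items():
                if inspect.isfunction(member):
                    symbols.append(("method", f"{name}.{attr}"))
    return sorted(symbols)


# Build a throwaway module in memory so the example is self-contained.
demo = types.ModuleType("demo")
exec(
    "def top(): pass\n"
    "class Greeter:\n"
    "    def greet(self): pass\n",
    demo.__dict__,
)

print(infer_symbols(demo))
# [('class', 'Greeter'), ('function', 'top'), ('method', 'Greeter.greet')]
```

Because the walk only looks at module-level and class-level attributes, it has the same kind of blind spot the description mentions for nested functions and classes.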
Showing 15 changed files with 1,035 additions and 25 deletions.
@@ -0,0 +1,20 @@
from ddtrace.internal import forksafe
from ddtrace.internal.remoteconfig.worker import remoteconfig_poller
from ddtrace.internal.symbol_db.remoteconfig import SymbolDatabaseAdapter
from ddtrace.settings.symbol_db import config as symdb_config


def bootstrap():
    if symdb_config._force:
        # Force the upload of symbols, ignoring RCM instructions.
        from ddtrace.internal.symbol_db.symbols import SymbolDatabaseUploader

        SymbolDatabaseUploader.install()
    else:
        # Start the RCM subscriber to determine if and when to upload symbols.
        remoteconfig_poller.register("LIVE_DEBUGGING_SYMBOL_DB", SymbolDatabaseAdapter())

        @forksafe.register
        def _():
            remoteconfig_poller.unregister("LIVE_DEBUGGING_SYMBOL_DB")
            remoteconfig_poller.register("LIVE_DEBUGGING_SYMBOL_DB", SymbolDatabaseAdapter())
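The fork hook at the end of `bootstrap` replaces the subscriber in each child process so that the child does not reuse the parent's adapter. The sketch below illustrates the same pattern with the standard library's `os.register_at_fork` as a stand-in for `ddtrace.internal.forksafe`, whose internals are not shown in this diff. `Poller`, `make_subscriber`, and `reset_subscription` are hypothetical names introduced only for illustration.

```python
import os


class Poller:
    """Hypothetical stand-in for a remote-config poller."""

    def __init__(self):
        self.subscribers = {}

    def register(self, product, subscriber):
        self.subscribers[product] = subscriber

    def unregister(self, product):
        # Removing an unknown product is a no-op.
        self.subscribers.pop(product, None)


PRODUCT = "LIVE_DEBUGGING_SYMBOL_DB"
poller = Poller()


def make_subscriber():
    # A fresh object per call, so a child never shares state with its parent.
    return {"pid": os.getpid()}


def reset_subscription():
    # Drop the inherited subscriber and register a new one for this process.
    poller.unregister(PRODUCT)
    poller.register(PRODUCT, make_subscriber())


poller.register(PRODUCT, make_subscriber())

# os.register_at_fork exists only on platforms that support fork.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=reset_subscription)
```

After a fork, the child would run `reset_subscription` automatically; calling it by hand in a single process has the same effect on the registry and is enough to check the behaviour without forking.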
@@ -0,0 +1,70 @@
import os

from ddtrace.internal.forksafe import has_forked
from ddtrace.internal.logger import get_logger
from ddtrace.internal.remoteconfig._connectors import PublisherSubscriberConnector
from ddtrace.internal.remoteconfig._publishers import RemoteConfigPublisher
from ddtrace.internal.remoteconfig._pubsub import PubSub
from ddtrace.internal.remoteconfig._subscribers import RemoteConfigSubscriber
from ddtrace.internal.remoteconfig.worker import remoteconfig_poller
from ddtrace.internal.runtime import get_ancestor_runtime_id
from ddtrace.internal.symbol_db.symbols import SymbolDatabaseUploader


log = get_logger(__name__)


def _rc_callback(data, test_tracer=None):
    if get_ancestor_runtime_id() is not None and has_forked():
        log.debug("[PID %d] SymDB: Disabling Symbol DB in forked process", os.getpid())
        # We assume that forking is being used for spawning child worker
        # processes. Therefore, we avoid uploading the same symbols from each
        # child process. We restrict the enablement of Symbol DB to just the
        # parent process and the first fork child.
        remoteconfig_poller.unregister("LIVE_DEBUGGING_SYMBOL_DB")

        if SymbolDatabaseUploader.is_installed():
            SymbolDatabaseUploader.uninstall()

        return

    for metadata, config in zip(data["metadata"], data["config"]):
        if metadata is None:
            continue

        if not isinstance(config, dict):
            continue

        upload_symbols = config.get("upload_symbols")
        if upload_symbols is None:
            continue

        if upload_symbols:
            log.debug("[PID %d] SymDB: Symbol DB RCM enablement signal received", os.getpid())
            if not SymbolDatabaseUploader.is_installed():
                try:
                    SymbolDatabaseUploader.install()
                    log.debug("[PID %d] SymDB: Symbol DB uploader installed", os.getpid())
                except Exception:
                    log.error("[PID %d] SymDB: Failed to install Symbol DB uploader", os.getpid(), exc_info=True)
                    remoteconfig_poller.unregister("LIVE_DEBUGGING_SYMBOL_DB")
        else:
            log.debug("[PID %d] SymDB: Symbol DB RCM shutdown signal received", os.getpid())
            if SymbolDatabaseUploader.is_installed():
                try:
                    SymbolDatabaseUploader.uninstall()
                    log.debug("[PID %d] SymDB: Symbol DB uploader uninstalled", os.getpid())
                except Exception:
                    log.error("[PID %d] SymDB: Failed to uninstall Symbol DB uploader", os.getpid(), exc_info=True)
                    remoteconfig_poller.unregister("LIVE_DEBUGGING_SYMBOL_DB")
        break


class SymbolDatabaseAdapter(PubSub):
    __publisher_class__ = RemoteConfigPublisher
    __subscriber_class__ = RemoteConfigSubscriber
    __shared_data__ = PublisherSubscriberConnector()

    def __init__(self):
        self._publisher = self.__publisher_class__(self.__shared_data__)
        self._subscriber = self.__subscriber_class__(self.__shared_data__, _rc_callback, "LIVE_DEBUGGING_SYMBOL_DB")
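The payload loop in `_rc_callback` acts on the first entry that carries a usable `upload_symbols` value and ignores the rest. A side-effect-free sketch of that selection logic may make the control flow easier to follow. `decide_action` is a hypothetical helper, the payload contents in the usage example are made up, and the fork check and the actual install/uninstall calls are deliberately left out.

```python
def decide_action(data):
    """Return "install", "uninstall", or None for an RC-style payload.

    Hypothetical helper mirroring the loop in `_rc_callback`: entries
    without metadata, with a non-dict config, or without an
    `upload_symbols` key are skipped, and the first remaining entry wins.
    """
    for metadata, config in zip(data["metadata"], data["config"]):
        if metadata is None or not isinstance(config, dict):
            continue
        upload_symbols = config.get("upload_symbols")
        if upload_symbols is None:
            continue
        return "install" if upload_symbols else "uninstall"
    return None


# Made-up payload: the first entry has no metadata and the second has
# no `upload_symbols` key, so the third entry decides the outcome.
payload = {
    "metadata": [None, {"id": "a"}, {"id": "b"}],
    "config": [{"upload_symbols": True}, {"other": 1}, {"upload_symbols": False}],
}
print(decide_action(payload))  # uninstall
```

Keeping the decision separate from the side effects is only a convenience for illustration; the callback in the diff interleaves the two and also unregisters the subscriber when installation or uninstallation fails.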