fix: clean up profilers for discarded transactions #3154
I don't think this approach is future-proof. I fear that it will be fragile. Someone can easily forget to call discardProfilerForTracer when changing the SentryTracer. Having one place to call discardProfilerForTracer would be better.
@armcknight, I'm a bit confused about why we need to keep references to tracers here
We only use the |
Sources/Sentry/SentryTracer.m
@@ -425,10 +425,16 @@ - (void)finishInternal
{
    [self cancelDeadlineTimer];
    if (self.isFinished) {
#if SENTRY_TARGET_PROFILING_SUPPORTED
        discardProfilerForTracer(self);
h: Instead of having to ensure calling discardProfilerForTracer in all the correct places, it would be great to have one single place in the tracer. Maybe here
sentry-cocoa/Sources/Sentry/SentryTracer.m
Lines 442 to 450 in 4386045
    [self.delegate tracerDidFinish:self];
    if (self.finishCallback) {
        self.finishCallback(self);
        // The callback will only be executed once. No need to keep the reference and we avoid
        // potential retain cycles.
        self.finishCallback = nil;
    }
or here
sentry-cocoa/Sources/Sentry/SentryTracer.m
Line 504 in 4386045
    [self captureTransactionWithProfile:transaction];
would be sufficient, if we also do the following: To ensure that we don't end up with infinite memory growth, we could store timestamps in SentryProfiledTracerConcurrency.m for when a traceID or a profiler was added. We only have to keep them around until transactions time out, which is 450s or something similar. Whenever we add a new tracer-to-profile mapping in SentryProfiledTracerConcurrency, we could check whether there are old profiles in the dictionary and, if there are, remove them. This algorithm would be our insurance, and we also wouldn't need weak references. WDYT, @armcknight?
Hmm, I also just noticed that the 500s limit is only a duration limit enforced after computation. Automatic tracers actually also have a 30 second timeout. But this is only for automatic tracers, not manually created ones. If there is no time limit for these, then we can't simply track timestamps.
(Note: I hid a couple other comments that won't matter if we can't do this.)
I think this will be fine, yes. At the point where there is any memory growth due to unended transactions, that's a logic error by the SDK consumer, not the SDK itself.
I think just adding the call to discardProfilerForTracer in SentryTracer.dealloc will be sufficient. We can't call that before/after this
sentry-cocoa/Sources/Sentry/SentryTracer.m
Line 442 in 4386045
    [self.delegate tracerDidFinish:self];
because then the profiler will be gone by the time we try to get its data later in finishInternal at the call to captureTransactionWithProfile (which eventually goes on to remove the tracking for the tracer, that's the happy path).
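To illustrate the ordering concern, here is a small Swift sketch in which `deinit` plays the role of `dealloc`. The types `ProfilerBookkeeping` and `Tracer` and their methods are hypothetical stand-ins, not the SDK's classes, and a profile is just a string. The happy path reads the profile data and removes the tracking in one step; deallocation only acts as a fallback for tracers that were never finished.

```swift
import Foundation

/// Hypothetical bookkeeping, loosely modeled on the tracer-to-profiler mapping.
final class ProfilerBookkeeping {
    private var profilersByTracerID: [String: String] = [:]
    private let lock = NSLock()

    func track(profiler: String, forTracerID id: String) {
        lock.lock(); defer { lock.unlock() }
        profilersByTracerID[id] = profiler
    }

    /// Removes and returns the mapping; calling it again for the same ID is harmless.
    @discardableResult
    func discardProfiler(forTracerID id: String) -> String? {
        lock.lock(); defer { lock.unlock() }
        return profilersByTracerID.removeValue(forKey: id)
    }

    var trackedCount: Int {
        lock.lock(); defer { lock.unlock() }
        return profilersByTracerID.count
    }
}

final class Tracer {
    let id: String
    private let bookkeeping: ProfilerBookkeeping
    private(set) var capturedProfile: String?

    init(id: String, bookkeeping: ProfilerBookkeeping) {
        self.id = id
        self.bookkeeping = bookkeeping
        bookkeeping.track(profiler: "profile-for-\(id)", forTracerID: id)
    }

    /// Happy path: take the profile data, which also removes the tracking.
    func finish() {
        capturedProfile = bookkeeping.discardProfiler(forTracerID: id)
    }

    /// Insurance: if finish() was never called, drop the mapping on deallocation.
    deinit {
        bookkeeping.discardProfiler(forTracerID: id)
    }
}

func startAndAbandonTracer(in bookkeeping: ProfilerBookkeeping) {
    _ = Tracer(id: "abandoned", bookkeeping: bookkeeping) // never finished, released at once
}

let bookkeeping = ProfilerBookkeeping()
startAndAbandonTracer(in: bookkeeping)
let remainingAfterAbandon = bookkeeping.trackedCount

let finished = Tracer(id: "finished", bookkeeping: bookkeeping)
finished.finish()
```

Discarding the profiler before `finish()` reads it would leave `capturedProfile` empty, which is the failure mode described above.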
I added several new tests, please check them out!
> At the point where there is any memory growth due to unended transactions, that's a logic error by the SDK consumer, not the SDK itself.
With the previous solution, it wouldn't be a logic error by our users. Doing the following and never calling finish for whatever reason is a valid use case.
let transaction1 = SentrySDK.startTransaction(name: "name", operation: "op")
let transaction2 = SentrySDK.startTransaction(name: "name", operation: "op")
With the previous solution, the SDK would start a profile for the transaction and keep the profile in memory forever. ARC would deallocate transaction1 and transaction2 at some point.
Anyway, with the current solution, it shouldn't be a problem.
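The scenario can be made concrete with a small model of the earlier behavior. Everything below is a hypothetical sketch (`LeakyBookkeeping` and `Transaction` are invented names, and a byte array stands in for profile data); it only shows that a mapping removed solely by `finish()` outlives transactions that ARC has already deallocated.

```swift
import Foundation

/// Hypothetical model of the earlier bookkeeping: entries are removed only by finish().
final class LeakyBookkeeping {
    private(set) var profilesByTraceID: [String: [UInt8]] = [:]

    func startProfile(forTraceID id: String) {
        // Stand-in for real profile data.
        profilesByTraceID[id] = [UInt8](repeating: 0, count: 1024)
    }

    func finishProfile(forTraceID id: String) {
        profilesByTraceID[id] = nil
    }
}

final class Transaction {
    let traceID: String
    private let bookkeeping: LeakyBookkeeping

    init(traceID: String, bookkeeping: LeakyBookkeeping) {
        self.traceID = traceID
        self.bookkeeping = bookkeeping
        bookkeeping.startProfile(forTraceID: traceID)
    }

    func finish() {
        bookkeeping.finishProfile(forTraceID: traceID)
    }
    // No deinit cleanup in this model.
}

let leaky = LeakyBookkeeping()
weak var weakTransaction: Transaction?

do {
    // Started but never finished, and no reference kept beyond this scope.
    let transaction1 = Transaction(traceID: "trace-1", bookkeeping: leaky)
    let transaction2 = Transaction(traceID: "trace-2", bookkeeping: leaky)
    weakTransaction = transaction1
    _ = transaction2
}

// ARC has released the transactions, but their profiles are still tracked.
let transactionWasDeallocated = (weakTransaction == nil)
let profilesStillTracked = leaky.profilesByTraceID.count
```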
I guess by logic error I just meant, if they start tons of endless transactions and keep refs to them, but also expect our memory to not grow accordingly.
Correct, it's essentially relying on |
Performance metrics 🚀
Startup times
Revision | Plain | With Sentry | Diff |
---|---|---|---|
dbc67d2 | 1239.49 ms | 1248.88 ms | 9.39 ms |
b385962 | 1195.85 ms | 1221.63 ms | 25.78 ms |
b6ba04e | 1230.48 ms | 1253.20 ms | 22.72 ms |
06548c0 | 1262.80 ms | 1275.00 ms | 12.20 ms |
407ff99 | 1225.49 ms | 1232.88 ms | 7.39 ms |
e2abb0d | 1235.08 ms | 1257.00 ms | 21.92 ms |
7bc3c0d | 1261.16 ms | 1278.38 ms | 17.22 ms |
8f397a7 | 1224.66 ms | 1236.48 ms | 11.82 ms |
2405ba5 | 1248.37 ms | 1259.30 ms | 10.93 ms |
257c2a9 | 1239.52 ms | 1251.08 ms | 11.56 ms |
App size
Revision | Plain | With Sentry | Diff |
---|---|---|---|
dbc67d2 | 20.76 KiB | 427.74 KiB | 406.98 KiB |
b385962 | 20.76 KiB | 399.69 KiB | 378.93 KiB |
b6ba04e | 20.76 KiB | 414.44 KiB | 393.68 KiB |
06548c0 | 20.76 KiB | 427.36 KiB | 406.59 KiB |
407ff99 | 20.76 KiB | 427.87 KiB | 407.10 KiB |
e2abb0d | 20.76 KiB | 434.72 KiB | 413.96 KiB |
7bc3c0d | 20.76 KiB | 427.36 KiB | 406.59 KiB |
8f397a7 | 20.76 KiB | 420.55 KiB | 399.79 KiB |
2405ba5 | 20.76 KiB | 435.23 KiB | 414.47 KiB |
257c2a9 | 20.76 KiB | 401.36 KiB | 380.60 KiB |
Previous results on branch: armcknight/fix/profiling-transaction-bookkeeping-cleanup
Startup times
Revision | Plain | With Sentry | Diff |
---|---|---|---|
d71e189 | 1240.52 ms | 1254.35 ms | 13.83 ms |
3c9bbdd | 1229.72 ms | 1249.90 ms | 20.18 ms |
App size
Revision | Plain | With Sentry | Diff |
---|---|---|---|
d71e189 | 22.84 KiB | 403.87 KiB | 381.02 KiB |
3c9bbdd | 22.84 KiB | 403.37 KiB | 380.53 KiB |
We are getting close.
Sources/Sentry/SentryTracer.m
@@ -425,10 +425,16 @@ - (void)finishInternal
{
    [self cancelDeadlineTimer];
    if (self.isFinished) {
#if SENTRY_TARGET_PROFILING_SUPPORTED
        discardProfilerForTracer(self);
Manual transactions don't have a duration limit. Users could keep it alive for hours. So my suggestion won't work.
I thought about it and I think we only have one problem to solve with the current solution: When a user starts a transaction, never calls finish, and doesn't keep a reference to the transaction, SentryProfiledTracerConcurrency will keep a reference to the profile and the traceID forever. The dealloc of the SentryTracer could call discardProfilerForTracer. dealloc would also be our insurance if we forget to call discardProfilerForTracer somewhere.
Furthermore, I think we only need to call discardProfilerForTracer below or above the following line
sentry-cocoa/Sources/Sentry/SentryTracer.m
Line 442 in 4386045
    [self.delegate tracerDidFinish:self];
We use this callback also here
sentry-cocoa/Sources/Sentry/SentryPerformanceTracker.m
Lines 233 to 238 in 4386045
- (void)tracerDidFinish:(SentryTracer *)tracer
{
    @synchronized(self.spans) {
        [self.spans removeObjectForKey:tracer.spanId];
    }
}
We don't need to consider transactions that never finish in the SentryPerformanceTracker, as it only keeps track of automatic transactions and they have a timeout.
Doing those two suggested changes would do the job, I believe.
Please add some tests after doing the changes here.
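As a rough Swift illustration of the delegate hook referenced above: a tracer notifies its delegate exactly once when it finishes, and the tracker removes its own entry under a lock. The names `SketchTracer`, `SketchPerformanceTracker`, and `TracerFinishDelegate` are hypothetical and only mirror the shape of the quoted Objective-C snippets.

```swift
import Foundation

/// Hypothetical delegate protocol mirroring the tracerDidFinish: callback.
protocol TracerFinishDelegate: AnyObject {
    func tracerDidFinish(_ tracer: SketchTracer)
}

final class SketchTracer {
    let spanID: String
    weak var delegate: TracerFinishDelegate?
    var finishCallback: ((SketchTracer) -> Void)?
    private(set) var isFinished = false

    init(spanID: String) {
        self.spanID = spanID
    }

    func finish() {
        guard !isFinished else { return }
        isFinished = true
        // Single notification point: each observer cleans up its own state here.
        delegate?.tracerDidFinish(self)
        if let callback = finishCallback {
            callback(self)
            // The callback runs once; dropping it avoids potential retain cycles.
            finishCallback = nil
        }
    }
}

final class SketchPerformanceTracker: TracerFinishDelegate {
    private var spans: [String: SketchTracer] = [:]
    private let lock = NSLock()

    func startTracer(spanID: String) -> SketchTracer {
        let tracer = SketchTracer(spanID: spanID)
        tracer.delegate = self
        lock.lock(); defer { lock.unlock() }
        spans[spanID] = tracer
        return tracer
    }

    func tracerDidFinish(_ tracer: SketchTracer) {
        lock.lock(); defer { lock.unlock() }
        spans.removeValue(forKey: tracer.spanID)
    }

    var activeSpanCount: Int {
        lock.lock(); defer { lock.unlock() }
        return spans.count
    }
}

let tracker = SketchPerformanceTracker()
let tracer = tracker.startTracer(spanID: "span-1")
var callbackRuns = 0
tracer.finishCallback = { _ in callbackRuns += 1 }
let activeBeforeFinish = tracker.activeSpanCount
tracer.finish()
tracer.finish() // the second call is ignored
```

Because the tracker holds its tracers strongly until they finish, this pattern on its own does nothing for tracers that never finish.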
Codecov Report
Additional details and impacted files
@@            Coverage Diff            @@
##            main    #3154     +/-   ##
=========================================
+ Coverage   89.172%  89.197%  +0.025%
=========================================
  Files          502      502
  Lines        53916    54062     +146
  Branches     19344    19405      +61
=========================================
+ Hits         48078    48222     +144
  Misses        4988     4988
- Partials       850      852       +2
... and 14 files with indirect coverage changes
LGTM, I think we're just missing a couple more tests to ensure we've covered all edge cases. After adding these we can merge. Approving to unblock the PR.
CI is still unhappy and we're missing a changelog entry.
 * @warning Must be called from a synchronized context.
 */
void
_unsafe_cleanUpProfiler(SentryProfiler *profiler, NSString *tracerKey)
m: The prefix unsafe raises a few questions for me at first glance. I thought this function was somewhat dangerous 😄. Actually, it's just not thread-safe. What about renaming it to cleanUpProfiler_non_thread_safe?
- _unsafe_cleanUpProfiler(SentryProfiler *profiler, NSString *tracerKey)
+ _cleanUpProfiler_non_thread_safe(SentryProfiler *profiler, NSString *tracerKey)
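The naming question is easier to judge next to the calling pattern it documents. The Swift sketch below is an assumption about the general shape only, with hypothetical names (`ProfilerStore`, `cleanUpProfiler_nonThreadSafe`): a public method takes the lock, and a private helper whose name states that it does no locking performs the mutation.

```swift
import Foundation

/// Hypothetical store mapping a profiler ID to the tracer keys that still use it.
final class ProfilerStore {
    private let lock = NSLock()
    private var tracerKeysByProfiler: [String: Set<String>] = [:]

    func track(profiler: String, tracerKey: String) {
        lock.lock(); defer { lock.unlock() }
        tracerKeysByProfiler[profiler, default: []].insert(tracerKey)
    }

    /// Thread-safe entry point: takes the lock, then delegates to the helper.
    func discard(profiler: String, tracerKey: String) {
        lock.lock(); defer { lock.unlock() }
        cleanUpProfiler_nonThreadSafe(profiler: profiler, tracerKey: tracerKey)
    }

    /// - Warning: Must be called while `lock` is held. NSLock is not recursive,
    ///   so this helper must not try to take the lock itself.
    private func cleanUpProfiler_nonThreadSafe(profiler: String, tracerKey: String) {
        guard var keys = tracerKeysByProfiler[profiler] else { return }
        keys.remove(tracerKey)
        // Drop the profiler entry entirely once no tracer refers to it.
        tracerKeysByProfiler[profiler] = keys.isEmpty ? nil : keys
    }

    var profilerCount: Int {
        lock.lock(); defer { lock.unlock() }
        return tracerKeysByProfiler.count
    }
}

let store = ProfilerStore()
store.track(profiler: "profiler-A", tracerKey: "tracer-1")
store.track(profiler: "profiler-A", tracerKey: "tracer-2")
store.discard(profiler: "profiler-A", tracerKey: "tracer-1")
let countAfterFirstDiscard = store.profilerCount
store.discard(profiler: "profiler-A", tracerKey: "tracer-2")
```

A name that spells out "non thread safe" tells callers what precondition they must satisfy, whereas "unsafe" leaves the kind of danger open.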
LGTM
📜 Description
💡 Motivation and Context
For the feedback at #3135 (comment)
💚 How did you test it?
📝 Checklist
You have to check all boxes before merging:
sendDefaultPII is enabled.
🔮 Next steps