fix: clean up profilers for discarded transactions #3154

armcknight · 2023-07-14T23:41:17Z

📜 Description

💡 Motivation and Context

For the feedback at #3135 (comment)

💚 How did you test it?

📝 Checklist

You have to check all boxes before merging:

I reviewed the submitted code.
I added tests to verify the changes.
No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
I updated the docs if needed.
Review from the native team if needed.
No breaking change or entry added to the changelog.
No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

philipphofmann

I don't think this approach is future-proof. I fear that it will be fragile. Someone can easily forget to call discardProfilerForTracer when changing the SentryTracer. Having one place to call discardProfilerForTracer would be better.

Sources/Sentry/Profiling/SentryProfiledTracerConcurrency.mm

philipphofmann · 2023-07-17T14:36:55Z

@armcknight, I'm a bit confused about why we need to keep references to tracers here

sentry-cocoa/Sources/Sentry/Profiling/SentryProfiledTracerConcurrency.mm

Lines 18 to 26 in 0f5d163

    
    /**
     * a mapping of profilers to the tracers that started them that are still in-flight and will need to
     * query them for their profiling data when they finish. this helps resolve the incongruity between
     * the different timeout durations between tracers (500s) and profilers (30s), where a transaction
     * may start a profiler that then times out, and then a new transaction starts a new profiler, and
     * we must keep the aborted one around until its associated transaction finishes.
     */
    static NSMutableDictionary</* SentryProfiler.profileId */ NSString *,
        NSMutableSet<SentryTracer *> *> *_gProfilersToTracers;

We only use the traceId in SentryProfiledTracerConcurrency, and we rely on the count to resetProfilingTimestamps for the framesTracker and to stop the profiler.

philipphofmann · 2023-07-17T16:27:38Z

Sources/Sentry/SentryTracer.m

@@ -425,10 +425,16 @@ - (void)finishInternal
 {
    [self cancelDeadlineTimer];
    if (self.isFinished) {
+#if SENTRY_TARGET_PROFILING_SUPPORTED
+        discardProfilerForTracer(self);


h: Instead of having to ensure calling discardProfilerForTracer in all the correct places, it would be great to have one single place in the tracer. Maybe here

sentry-cocoa/Sources/Sentry/SentryTracer.m

Lines 442 to 450 in 4386045

    [self.delegate tracerDidFinish:self];

    if (self.finishCallback) {
        self.finishCallback(self);

        // The callback will only be executed once. No need to keep the reference and we avoid
        // potential retain cycles.
        self.finishCallback = nil;
    }

or here

sentry-cocoa/Sources/Sentry/SentryTracer.m

Line 504 in 4386045

[self captureTransactionWithProfile:transaction];

would be sufficient, if we also do the following: To ensure that we don't end up with infinite memory growth, we could store timestamps in SentryProfiledTracerConcurrency.mm whenever a traceID or a profiler is added. We only have to keep them around until transactions time out, which is 450s or something similar. Whenever we add a new tracer-to-profile mapping in SentryProfiledTracerConcurrency, we could check whether there are old profiles in the dictionary and, if there are, remove them. This algorithm would be our insurance, and we also wouldn't need weak references. WDYT, @armcknight?

Hmm, I also just noticed that the 500s limit is only a duration limit enforced after computation. Automatic tracers actually also have a 30 second timeout. But this is only for automatic tracers, not manually created ones. If there is no time limit for these, then we can't simply track timestamps.

(Note: I hid a couple other comments that won't matter if we can't do this.)

I think this will be fine, yes. At the point where there is any memory growth due to unended transactions, that's a logic error by the SDK consumer, not the SDK itself.

I think just adding the call to discardProfilerForTracer in SentryTracer.dealloc will be sufficient. We can't call that before/after this

sentry-cocoa/Sources/Sentry/SentryTracer.m

Line 442 in 4386045

[self.delegate tracerDidFinish:self];

because then the profiler will be gone by the time we try to get its data later in finishInternal at the call to captureTransactionWithProfile (which eventually goes on to remove the tracking for the tracer, that's the happy path).

I added several new tests, please check them out!

At the point where there is any memory growth due to unended transactions, that's a logic error by the SDK consumer, not the SDK itself.

With the previous solution, it wouldn't be a logic error by our users. Doing the following and never calling finish for whatever reason is a valid use case.

    let transaction1 = SentrySDK.startTransaction(name: "name", operation: "op")
    let transaction2 = SentrySDK.startTransaction(name: "name", operation: "op")

With the previous solution, the SDK would start a profile for the transaction and keep the profile in memory forever. ARC would deallocate transaction1 and transaction2 at some point.

Anyway, with the current solution, it shouldn't be a problem.

I guess by logic error I just meant, if they start tons of endless transactions and keep refs to them, but also expect our memory to not grow accordingly.

armcknight · 2023-07-17T23:23:26Z

@armcknight, I'm a bit confused about why we need to keep references to tracers here

sentry-cocoa/Sources/Sentry/Profiling/SentryProfiledTracerConcurrency.mm

Lines 18 to 26 in 0f5d163

    /**
     * a mapping of profilers to the tracers that started them that are still in-flight and will need to
     * query them for their profiling data when they finish. this helps resolve the incongruity between
     * the different timeout durations between tracers (500s) and profilers (30s), where a transaction
     * may start a profiler that then times out, and then a new transaction starts a new profiler, and
     * we must keep the aborted one around until its associated transaction finishes.
     */
    static NSMutableDictionary</* SentryProfiler.profileId */ NSString *,
        NSMutableSet<SentryTracer *> *> *_gProfilersToTracers;

We only use the traceId in SentryProfiledTracerConcurrency, and we rely on the count to resetProfilingTimestamps for the framesTracker and to stop the profiler.

Correct, it's essentially relying on count. So we can just keep a counter per profiler to know when to stop it. Can't remember now why I was keeping the tracers around 🤷🏻 See 1467d20
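To illustrate the counter idea, here is a minimal C++ sketch of bookkeeping that stores only a count of in-flight tracers per profiler rather than a set of tracer references. It is not the SDK's implementation: `ProfilerBookkeeping`, `trackTracer`, `untrackTracer`, and `trackedProfilerCount` are hypothetical names chosen for this example, and the profiler is represented only by its ID string.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the bookkeeping discussed above: instead of a set
// of tracer references per profiler, keep only a count of in-flight tracers.
class ProfilerBookkeeping {
public:
    // Record that one more tracer depends on the profiler with this ID.
    void trackTracer(const std::string &profilerId) {
        std::lock_guard<std::mutex> lock(mutex_);
        ++tracerCounts_[profilerId];
    }

    // Record that a tracer finished or was discarded. Returns true when it was
    // the last one, meaning the caller may now stop and release the profiler.
    bool untrackTracer(const std::string &profilerId) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = tracerCounts_.find(profilerId);
        if (it == tracerCounts_.end()) {
            return false; // unknown or already cleaned up; nothing to do
        }
        if (--it->second > 0) {
            return false; // other tracers still need this profiler
        }
        tracerCounts_.erase(it);
        return true;
    }

    // Number of profilers that still have at least one in-flight tracer.
    std::size_t trackedProfilerCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return tracerCounts_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::string, unsigned> tracerCounts_;
};
```

Because no tracer objects are stored, the bookkeeping cannot extend a tracer's lifetime; the trade-off is that every tracked tracer must eventually be untracked exactly once, which is the concern the rest of this thread deals with.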


github-actions · 2023-07-18T00:35:56Z

Performance metrics 🚀

	Plain	With Sentry	Diff
Startup time	1207.50 ms	1237.92 ms	30.42 ms
Size	22.84 KiB	403.24 KiB	380.39 KiB

Baseline results on branch: main

Startup times

Revision	Plain	With Sentry	Diff
`dbc67d2`	1239.49 ms	1248.88 ms	9.39 ms
`b385962`	1195.85 ms	1221.63 ms	25.78 ms
`b6ba04e`	1230.48 ms	1253.20 ms	22.72 ms
`06548c0`	1262.80 ms	1275.00 ms	12.20 ms
`407ff99`	1225.49 ms	1232.88 ms	7.39 ms
`e2abb0d`	1235.08 ms	1257.00 ms	21.92 ms
`7bc3c0d`	1261.16 ms	1278.38 ms	17.22 ms
`8f397a7`	1224.66 ms	1236.48 ms	11.82 ms
`2405ba5`	1248.37 ms	1259.30 ms	10.93 ms
`257c2a9`	1239.52 ms	1251.08 ms	11.56 ms

App size

Revision	Plain	With Sentry	Diff
`dbc67d2`	20.76 KiB	427.74 KiB	406.98 KiB
`b385962`	20.76 KiB	399.69 KiB	378.93 KiB
`b6ba04e`	20.76 KiB	414.44 KiB	393.68 KiB
`06548c0`	20.76 KiB	427.36 KiB	406.59 KiB
`407ff99`	20.76 KiB	427.87 KiB	407.10 KiB
`e2abb0d`	20.76 KiB	434.72 KiB	413.96 KiB
`7bc3c0d`	20.76 KiB	427.36 KiB	406.59 KiB
`8f397a7`	20.76 KiB	420.55 KiB	399.79 KiB
`2405ba5`	20.76 KiB	435.23 KiB	414.47 KiB
`257c2a9`	20.76 KiB	401.36 KiB	380.60 KiB

Previous results on branch: armcknight/fix/profiling-transaction-bookkeeping-cleanup

Startup times

Revision	Plain	With Sentry	Diff
`d71e189`	1240.52 ms	1254.35 ms	13.83 ms
`3c9bbdd`	1229.72 ms	1249.90 ms	20.18 ms

App size

Revision	Plain	With Sentry	Diff
`d71e189`	22.84 KiB	403.87 KiB	381.02 KiB
`3c9bbdd`	22.84 KiB	403.37 KiB	380.53 KiB

philipphofmann

We are getting close.

Sources/Sentry/Profiling/SentryProfiledTracerConcurrency.mm

Sources/Sentry/include/SentryProfiledTracerConcurrency.h

philipphofmann · 2023-07-18T08:14:48Z

Sources/Sentry/SentryTracer.m

@@ -425,10 +425,16 @@ - (void)finishInternal
 {
    [self cancelDeadlineTimer];
    if (self.isFinished) {
+#if SENTRY_TARGET_PROFILING_SUPPORTED
+        discardProfilerForTracer(self);


Manual transactions don't have a duration limit. Users could keep it alive for hours. So my suggestion won't work.

I thought about it, and I think we only have one problem to solve with the current solution: when a user starts a transaction, never calls finish, and doesn't keep a reference to the transaction, SentryProfiledTracerConcurrency will keep a reference to the profile and the traceID forever. The dealloc of the SentryTracer could call discardProfilerForTracer. dealloc would also be our insurance if we forget to call discardProfilerForTracer somewhere.

Furthermore, I think we only need to call discardProfilerForTracer below or above the following line

sentry-cocoa/Sources/Sentry/SentryTracer.m

Line 442 in 4386045

[self.delegate tracerDidFinish:self];

We use this callback also here

sentry-cocoa/Sources/Sentry/SentryPerformanceTracker.m

Lines 233 to 238 in 4386045

    - (void)tracerDidFinish:(SentryTracer *)tracer
    {
        @synchronized(self.spans) {
            [self.spans removeObjectForKey:tracer.spanId];
        }
    }

We don't need to consider transactions that never finish in the SentryPerformanceTracker as it only keeps track of automatic transactions and they have a timeout.

Doing those two suggested changes would do the job, I believe.

Please add some tests after doing the changes here.
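The "dealloc as insurance" idea can be sketched in C++, where a destructor plays the role of dealloc. This is only an illustration under stated assumptions: `TracerRegistry` and `ToyTracer` are hypothetical types, not SDK classes, and the registry tracks plain string keys instead of profilers. The property worth noting is that discarding is idempotent, so the normal finish path and the destructor can both call it safely.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical registry standing in for the SDK's profiler bookkeeping.
class TracerRegistry {
public:
    void track(const std::string &tracerKey) {
        std::lock_guard<std::mutex> lock(mutex_);
        keys_.insert(tracerKey);
    }

    // Idempotent: discarding a key that was already removed is a no-op.
    void discard(const std::string &tracerKey) {
        std::lock_guard<std::mutex> lock(mutex_);
        keys_.erase(tracerKey);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return keys_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_set<std::string> keys_;
};

// Hypothetical tracer that registers itself on creation.
class ToyTracer {
public:
    ToyTracer(TracerRegistry &registry, std::string key)
        : registry_(registry), key_(std::move(key)) {
        registry_.track(key_);
    }

    ToyTracer(const ToyTracer &) = delete;
    ToyTracer &operator=(const ToyTracer &) = delete;

    // Happy path: finishing removes the bookkeeping entry.
    void finish() { registry_.discard(key_); }

    // Insurance: if finish() was never called, the destructor still cleans up.
    ~ToyTracer() { registry_.discard(key_); }

private:
    TracerRegistry &registry_;
    std::string key_;
};
```

A tracer that goes out of scope without `finish()` leaves the registry empty, and a tracer that was finished normally triggers a second, harmless discard when it is destroyed.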


codecov · 2023-07-19T00:31:03Z

Codecov Report

Merging #3154 (0529a52) into main (5b14b06) will increase coverage by 0.025%.
The diff coverage is 98.750%.

Additional details and impacted files

@@              Coverage Diff              @@
##              main     #3154       +/-   ##
=============================================
+ Coverage   89.172%   89.197%   +0.025%     
=============================================
  Files          502       502               
  Lines        53916     54062      +146     
  Branches     19344     19405       +61     
=============================================
+ Hits         48078     48222      +144     
  Misses        4988      4988               
- Partials       850       852        +2

Impacted Files	Coverage Δ
...entry/Profiling/SentryProfiledTracerConcurrency.mm	`94.736% <94.444%> (-1.874%)`	⬇️
Sources/Sentry/SentryProfiler.mm	`80.065% <100.000%> (+0.398%)`	⬆️
Sources/Sentry/SentryTracer.m	`96.526% <100.000%> (-0.159%)`	⬇️
...SentryProfilerTests/SentryProfilerSwiftTests.swift	`97.530% <100.000%> (+0.420%)`	⬆️

... and 14 files with indirect coverage changes

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5b14b06...0529a52. Read the comment docs.

philipphofmann

LGTM. I think we're just missing a couple more tests to ensure we've covered all edge cases. After adding these, we can merge. Approving to unblock the PR.

CI is still unhappy, and we're missing a changelog entry.

philipphofmann · 2023-07-19T10:08:43Z

Sources/Sentry/Profiling/SentryProfiledTracerConcurrency.mm

+ * @warning Must be called from a synchronized context.
+ */
+void
+_unsafe_cleanUpProfiler(SentryProfiler *profiler, NSString *tracerKey)


m: The prefix unsafe raises a few questions for me at first glance. I thought this function was somewhat dangerous 😄. Actually, it's just not thread-safe. What about renaming it to cleanUpProfiler_non_thread_safe?

Suggested change

-_unsafe_cleanUpProfiler(SentryProfiler *profiler, NSString *tracerKey)
+_cleanUpProfiler_non_thread_safe(SentryProfiler *profiler, NSString *tracerKey)
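The naming question is about a common locking pattern: public entry points take the lock, and a private helper that assumes the lock is already held does the actual work. A minimal C++ sketch of that pattern follows. `LockedProfilerTable` and its members are hypothetical names for illustration only, and a `std::mutex` stands in for whatever synchronization the real code uses.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical table mapping tracer keys to profiler IDs.
class LockedProfilerTable {
public:
    void add(const std::string &tracerKey, const std::string &profilerId) {
        std::lock_guard<std::mutex> lock(mutex_);
        profilersByTracer_[tracerKey] = profilerId;
    }

    // Thread-safe entry point: takes the lock, then delegates to the helper.
    bool discard(const std::string &tracerKey) {
        std::lock_guard<std::mutex> lock(mutex_);
        return cleanUp_non_thread_safe(tracerKey);
    }

    // Thread-safe: removes every entry under a single lock acquisition by
    // reusing the same helper. Returns the number of entries removed.
    std::size_t discardAll() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t removed = 0;
        while (!profilersByTracer_.empty()) {
            // Copy the key first, since erasing invalidates the iterator.
            const std::string key = profilersByTracer_.begin()->first;
            if (cleanUp_non_thread_safe(key)) {
                ++removed;
            }
        }
        return removed;
    }

private:
    // Not thread-safe: the caller must already hold mutex_.
    bool cleanUp_non_thread_safe(const std::string &tracerKey) {
        return profilersByTracer_.erase(tracerKey) > 0;
    }

    std::mutex mutex_;
    std::unordered_map<std::string, std::string> profilersByTracer_;
};
```

Spelling out the thread-safety contract in the helper's name makes a call from an unlocked context stand out in review, which a generic prefix such as unsafe does not do.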

Sources/Sentry/SentryProfiler.mm

Tests/SentryProfilerTests/SentryProfilerSwiftTests.swift


philipphofmann

LGTM

CHANGELOG.md


armcknight added 2 commits July 14, 2023 13:48

rename SentryTracerConcurrency -> SentryProfiledTracerConcurrency

cd6d521

clean up profilers for discarded transactions

352df6a

armcknight requested review from philipphofmann and brustolin as code owners July 14, 2023 23:41

armcknight changed the base branch from main to armcknight/ref/rename-SentryTracerConcurrency July 14, 2023 23:41

This was referenced Jul 14, 2023

ref: use weak references to store tracers and profilers #3155

Closed

fix: profiler timeout scheduling and data preservation #3135

Merged

also stop a running profiler

0f5d163

philipphofmann reviewed Jul 17, 2023

Sources/Sentry/Profiling/SentryProfiledTracerConcurrency.mm (outdated)

philipphofmann reviewed Jul 17, 2023

Base automatically changed from armcknight/ref/rename-SentryTracerConcurrency to main July 17, 2023 22:59

armcknight added 2 commits July 17, 2023 15:29

only store the count of tracers in bookkeeping, not a set of references

1467d20

Merge remote-tracking branch 'origin/main' into armcknight/fix/profiling-transaction-bookkeeping-cleanup

b04c3ae

philipphofmann reviewed Jul 18, 2023

armcknight added 8 commits July 18, 2023 11:49

Merge remote-tracking branch 'origin/main' into armcknight/fix/profiling-transaction-bookkeeping-cleanup

e63cddf

just call discard profiler from SentryTracer.dealloc

ea7a894

stub tests to write

d53dce1

fixup! stub tests to write

45eed31

Merge branch 'main' into armcknight/fix/profiling-transaction-bookkeeping-cleanup

2e02b79

Merge branch 'main' into armcknight/fix/profiling-transaction-bookkeeping-cleanup

b070216

fixup! Merge branch 'main' into armcknight/fix/profiling-transaction-bookkeeping-cleanup

a36628e

fixup! fixup! Merge branch 'main' into armcknight/fix/profiling-transaction-bookkeeping-cleanup

5c2add3

stronger checks

e5092c0

philipphofmann approved these changes Jul 19, 2023

armcknight added 2 commits July 19, 2023 16:02

add more tests

e452f98

fix tvos build

8c2ade9

armcknight added 2 commits July 20, 2023 17:06

dont assert presence of profiler in discard; it may have already been finished normally by tracer

097e834

changelog

95dcedb

philipphofmann approved these changes Jul 21, 2023

CHANGELOG.md (outdated)

armcknight added 2 commits July 21, 2023 11:18

Merge branch 'main' into armcknight/fix/profiling-transaction-bookkeeping-cleanup

99ea0bf

fix changelog and test failure

0529a52

armcknight merged commit ed68562 into main Jul 21, 2023

armcknight deleted the armcknight/fix/profiling-transaction-bookkeeping-cleanup branch July 21, 2023 19:52
