-
Notifications
You must be signed in to change notification settings - Fork 287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reporter: don't expire actively used executables #247
Changes from all commits
c9ac552
1a8f02d
15529cb
bf83645
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,6 +29,10 @@ import ( | |
"go.opentelemetry.io/ebpf-profiler/libpf/xsync" | ||
) | ||
|
||
const ( | ||
executableCacheLifetime = 1 * time.Hour | ||
) | ||
|
||
// Assert that we implement the full Reporter interface. | ||
var _ Reporter = (*OTLPReporter)(nil) | ||
|
||
|
@@ -147,7 +151,7 @@ func NewOTLP(cfg *Config) (*OTLPReporter, error) { | |
if err != nil { | ||
return nil, err | ||
} | ||
executables.SetLifetime(1 * time.Hour) // Allow GC to clean stale items. | ||
executables.SetLifetime(executableCacheLifetime) // Allow GC to clean stale items. | ||
|
||
frames, err := lru.NewSynced[libpf.FileID, | ||
*xsync.RWMutex[map[libpf.AddressOrLineno]sourceInfo]]( | ||
|
@@ -574,7 +578,9 @@ func (r *OTLPReporter) getProfile() (profile *profiles.Profile, startTS, endTS u | |
fileIDtoMapping[traceInfo.files[i]] = idx | ||
locationMappingIndex = idx | ||
|
||
execInfo, exists := r.executables.Get(traceInfo.files[i]) | ||
// Ensure that actively used executables do not expire. | ||
execInfo, exists := r.executables.GetAndRefresh(traceInfo.files[i], | ||
executableCacheLifetime) | ||
Comment on lines
-577
to
+583
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the difference here? Does not There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
FreeLRU does not actively purge expired items. For the reporter caches, we recently added a ticker to call to |
||
|
||
// Next step: Select a proper default value, | ||
// if the name of the executable is not known yet. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we also increase the size of this cache as 4096 seems small given we still haven't solved the core issue here, which is that items may be dropped from this cache (whether through time-based expiration or due to the LRU being full) and there's no control or guarantee as to when they'll be re-inserted.
I'd set the cache size to 16384 until we really solve the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong opinion on the cache size (4096 seems quite large to me).
Ideally, we should have metrics for expiration and eviction, then get numbers from production systems.
We can also make the cache sizes and lifetimes configurable.
And we can create an LRU wrapper that automatically resizes the LRUs, as suggested at #248 (comment)
Maybe better continue at #244 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also not forget that (currently), the reporter implementation in this repository is just for demo/example purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong opinion on my side either - went ahead and bumped it to 16384 in bf83645