Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(check_golang_profiler_changes): acknowledge new golang profiler changes #102

Merged
merged 1 commit into from
Aug 15, 2024

Conversation

github-actions[bot]
Copy link
Contributor

@github-actions github-actions bot commented May 14, 2024

This PR is created by godeltaprof/compat/cmd/check_golang_profiler_changes/main.go to notify src/runtime/mprof.go or src/runtime/pprof in golang are updated.
Please take look at the golang commits and update godeltaprof accordingly if needed.
Merge the PR to acknowledge golang runtime changes and state no further actions needed for godeltaprof.

src/runtime/mprof.go
last known 2141315251da47745c8f649c01e598a19bd68897
current 1416cebdd54b4d96f6095db18002ae3f90f1f4c0

commit 1416cebdd54b4d96f6095db18002ae3f90f1f4c0
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Thu Aug 8 15:24:15 2024 -0700

    runtime: record all sampled mutex profile events
    
    The block and mutex profiles have slightly different behaviors when a
    sampled event has a negative (or zero) duration. The block profile
    enforces a minimum duration for each event of "1" in the cputicks unit.
    It does so by clamping the duration to 1 if it was originally reported
    as being smaller. The mutex profile for app-level contention enforces a
    minimum duration of 0 in a similar way: by reporting any negative values
    as 0 instead.
    
    The mutex profile for runtime-internal contention had a different
    behavior: to enforce a minimum event duration of "1" by dropping any
    non-conforming samples.
    
    Stop dropping samples, and use the same minimum (0) that's in place for
    the other mutex profile events.
    
    Fixes #64253
    Fixes #68453
    Fixes #68781
    
    Change-Id: I4c5d23a2675501226eef5b9bc1ada2efc1a55b9e
    Reviewed-on: https://go-review.googlesource.com/c/go/+/604355
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>

commit fc5073bc155545dde4856cccdfcbb31880d1eb66
Author: David Chase <drchase@google.com>
Date:   Tue Jul 23 11:43:23 2024 -0400

    runtime,internal: move runtime/internal/sys to internal/runtime/sys
    
    Cleanup and friction reduction
    
    For #65355.
    
    Change-Id: Ia14c9dc584a529a35b97801dd3e95b9acc99a511
    Reviewed-on: https://go-review.googlesource.com/c/go/+/600436
    Reviewed-by: Keith Randall <khr@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Keith Randall <khr@golang.org>

commit 87abb4afb63918d3e5ee5d7ebd196386fa22e368
Author: Nick Ripley <nick.ripley@datadoghq.com>
Date:   Tue Jul 16 07:15:22 2024 -0400

    runtime: avoid multiple records with identical stacks from MutexProfile
    
    When using frame pointer unwinding, we defer frame skipping and inline
    expansion for call stacks until profile reporting time. We can end up
    with records which have different stacks if no frames are skipped, but
    identical stacks once skipping is taken into account. Returning multiple
    records with the same stack (but different values) has broken programs
    which rely on the records already being fully aggregated by call stack
    when returned from runtime.MutexProfile. This CL addresses the problem
    by handling skipping at recording time. We do full inline expansion to
    correctly skip the desired number of frames when recording the call
    stack, and then handle the rest of inline expansion when reporting the
    profile.
    
    The regression test in this CL is adapted from the reproducer in
    https://github.com/grafana/pyroscope-go/issues/103, authored by Tolya
    Korniltsev.
    
    Fixes #67548
    
    This reapplies CL 595966.
    The original version of this CL introduced a bounds error in
    MutexProfile and failed to correctly expand inlined frames from that
    call. This CL applies the original CL, fixing the bounds error and
    adding a test for the output of MutexProfile to ensure the frames are
    expanded properly.
    
    Change-Id: I5ef8aafb9f88152a704034065c0742ba767c4dbb
    Reviewed-on: https://go-review.googlesource.com/c/go/+/598515
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>
    Reviewed-by: Cherry Mui <cherryyz@google.com>

commit 6948b4df8c5ec9f249960ea60dba6349d1975c0e
Author: Cherry Mui <cherryyz@google.com>
Date:   Mon Jul 15 19:49:44 2024 +0000

    Revert "runtime: avoid multiple records with identical stacks from MutexProfile"
    
    This reverts CL 595966.
    
    Reason for revert: This CL contains a bug. See the comment in https://go-review.googlesource.com/c/go/+/595966/8#message-57f4c1f9570b5fe912e06f4ae3b52817962533c0
    
    Change-Id: I48030907ded173ae20a8965bf1b41a713dd06059
    Reviewed-on: https://go-review.googlesource.com/c/go/+/598219
    Reviewed-by: Than McIntosh <thanm@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>

commit 183a40db6def2780478053b3cd14f3f5c355d999
Author: Nick Ripley <nick.ripley@datadoghq.com>
Date:   Tue Jul 2 12:53:36 2024 -0400

    runtime: avoid multiple records with identical stacks from MutexProfile
    
    When using frame pointer unwinding, we defer frame skipping and inline
    expansion for call stacks until profile reporting time. We can end up
    with records which have different stacks if no frames are skipped, but
    identical stacks once skipping is taken into account. Returning multiple
    records with the same stack (but different values) has broken programs
    which rely on the records already being fully aggregated by call stack
    when returned from runtime.MutexProfile. This CL addresses the problem
    by handling skipping at recording time. We do full inline expansion to
    correctly skip the desired number of frames when recording the call
    stack, and then handle the rest of inline expansion when reporting the
    profile.
    
    The regression test in this CL is adapted from the reproducer in
    https://github.com/grafana/pyroscope-go/issues/103, authored by Tolya
    Korniltsev.
    
    Fixes #67548
    
    Co-Authored-By: Tolya Korniltsev <korniltsev.anatoly@gmail.com>
    Change-Id: I6a42ce612377f235b2c8c0cec9ba8e9331224b84
    Reviewed-on: https://go-review.googlesource.com/c/go/+/595966
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>
    Reviewed-by: Cherry Mui <cherryyz@google.com>
    Auto-Submit: Carlos Amedee <carlos@golang.org>
    Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>

commit 82c14346d89ec0eeca114f9ca0e88516b2cda454
Author: Cherry Mui <cherryyz@google.com>
Date:   Wed Jul 3 14:14:34 2024 -0400

    cmd/link: don't disable memory profiling when pprof.WriteHeapProfile is used
    
    We have an optimization that if the memory profile is not consumed
    anywhere, we set the memory profiling rate to 0 to disable the
    "background" low-rate profiling. We detect whether the memory
    profile is used by checking whether the runtime.MemProfile function
    is reachable at link time. Previously, all APIs that access the
    memory profile go through runtime.MemProfile. But the code was
    refactored in CL 572396, and now the legacy entry point
    WriteHeapProfile uses pprof_memProfileInternal without going
    through runtime.MemProfile. In fact, even with the recommended
    runtime/pprof.Profile API (pprof.Lookup or pprof.Profiles),
    runtime.MemProfile is only (happen to be) reachable through
    countHeap.
    
    Change the linker to check runtime.memProfileInternal instead,
    which is on all code paths that retrieve the memory profile. Add
    a test case for WriteHeapProfile, so we cover all entry points.
    
    Fixes #68136.
    
    Change-Id: I075c8d45c95c81825a1822f032e23107aea4303c
    Reviewed-on: https://go-review.googlesource.com/c/go/+/596538
    Reviewed-by: Than McIntosh <thanm@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

commit 9114c51521c641ff6c33648da03774dc3eb86b79
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Wed May 29 16:43:38 2024 +0000

    Revert "runtime: prepare for extensions to waiting M list"
    
    This reverts commit be0b569caa0eab1a7f30edf64e550bbf5f6ff235 (CL 585635).
    
    Reason for revert: This is part of a patch series that changed the
    handling of contended lock2/unlock2 calls, reducing the maximum
    throughput of contended runtime.mutex values, and causing a performance
    regression on applications where that is (or became) the bottleneck.
    
    Updates #66999
    Updates #67585
    
    Change-Id: I7843ccaecbd273b7ceacfa0f420dd993b4b15a0a
    Reviewed-on: https://go-review.googlesource.com/c/go/+/589117
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Than McIntosh <thanm@google.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>

commit afbbc2894b4af076d7f171fd6e83541d35b7b5a3
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Wed May 29 16:42:23 2024 +0000

    Revert "runtime: double-link list of waiting Ms"
    
    This reverts commit d881ed6384ae58154d99682f1e20160c64e7c3c2 (CL 585637).
    
    Reason for revert: This is part of a patch series that changed the
    handling of contended lock2/unlock2 calls, reducing the maximum
    throughput of contended runtime.mutex values, and causing a performance
    regression on applications where that is (or became) the bottleneck.
    
    Updates #66999
    Updates #67585
    
    Change-Id: I70d8d0b74f73be95c43d664f584e8d98519aba26
    Reviewed-on: https://go-review.googlesource.com/c/go/+/589116
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Than McIntosh <thanm@google.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>

commit 5dead59add1f11e98de155860aa86175e893ac01
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Wed May 29 16:41:51 2024 +0000

    Revert "runtime: profile mutex contention during unlock"
    
    This reverts commit ba1c5b2c4573e10f3b5f0e0f25a27f17fba67eb0 (CL 585638).
    
    Reason for revert: This is part of a patch series that changed the
    handling of contended lock2/unlock2 calls, reducing the maximum
    throughput of contended runtime.mutex values, and causing a performance
    regression on applications where that is (or became) the bottleneck.
    
    Updates #66999
    Updates #67585
    
    Change-Id: Ibeec5d8deb17e87966cf352fefc7efe2267839d6
    Reviewed-on: https://go-review.googlesource.com/c/go/+/589115
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>
    Reviewed-by: Than McIntosh <thanm@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

commit ca7d300509626e2071f3f5babc2e9c121d806fec
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Wed May 29 16:41:10 2024 +0000

    Revert "runtime: remove GODEBUG=runtimecontentionstacks"
    
    This reverts commit 87e930f7289136fad1310d4b63dd4127e409bac5 (CL 585639)
    
    Reason for revert: This is part of a patch series that changed the
    handling of contended lock2/unlock2 calls, reducing the maximum
    throughput of contended runtime.mutex values, and causing a performance
    regression on applications where that is (or became) the bottleneck.
    
    Updates #66999
    Updates #67585
    
    Change-Id: I1e286d2a16d16e4af202cd5dc04b2d9c4ee71b32
    Reviewed-on: https://go-review.googlesource.com/c/go/+/589097
    Reviewed-by: Than McIntosh <thanm@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>

commit 1be701a344d1f1819dc08d78259684de1da6f923
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Wed May 29 16:36:36 2024 +0000

    Revert "runtime: split mutex profile clocks"
    
    This reverts commit 8ab131fb1256a4a795c610e145c022e22e2d1567 (CL 586796)
    
    Reason for revert: This is part of a patch series that changed the
    handling of contended lock2/unlock2 calls, reducing the maximum
    throughput of contended runtime.mutex values, and causing a performance
    regression on applications where that is (or became) the bottleneck.
    
    Updates #66999
    Updates #67585
    
    Change-Id: I54711691e86e072081482102019d168292b5150a
    Reviewed-on: https://go-review.googlesource.com/c/go/+/589095
    Reviewed-by: Michael Pratt <mpratt@google.com>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Than McIntosh <thanm@google.com>

commit 8ab131fb1256a4a795c610e145c022e22e2d1567
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Tue May 21 23:17:13 2024 -0700

    runtime: split mutex profile clocks
    
    Mutex contention measurements work with two clocks: nanotime for use in
    runtime/metrics, and cputicks for the runtime/pprof profile. They're
    subject to different sampling rates: the runtime/metrics view is always
    enabled, but the profile is adjustable and is turned off by default.
    They have different levels of overhead: it can take as little as one
    instruction to read cputicks while nanotime calls are more elaborate
    (although some platforms implement cputicks as a nanotime call). The use
    of the timestamps is also different: the profile's view needs to attach
    the delay in some Ms' lock2 calls to another M's unlock2 call stack, but
    the metric's view is only an int64.
    
    Treat them differently. Don't bother threading the nanotime clock
    through to the unlock2 call, measure and report it directly within
    lock2. Sample nanotime at a constant gTrackingPeriod.
    
    Don't consult any clocks unless the mutex is actually contended.
    
    Continue liberal use of cputicks for now.
    
    For #66999
    
    Change-Id: I1c2085ea0e695bfa90c30fadedc99ced9eb1f69e
    Reviewed-on: https://go-review.googlesource.com/c/go/+/586796
    TryBot-Result: Gopher Robot <gobot@golang.org>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Run-TryBot: Rhys Hiltner <rhys.hiltner@gmail.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>

commit 87e930f7289136fad1310d4b63dd4127e409bac5
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Tue May 14 14:44:37 2024 -0700

    runtime: remove GODEBUG=runtimecontentionstacks
    
    Go 1.22 promised to remove the setting in a future release once the
    semantics of runtime-internal lock contention matched that of
    sync.Mutex. That work is done, remove the setting.
    
    For #66999
    
    Change-Id: I3c4894148385adf2756d8754e44d7317305ad758
    Reviewed-on: https://go-review.googlesource.com/c/go/+/585639
    Reviewed-by: Carlos Amedee <carlos@golang.org>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>

commit ba1c5b2c4573e10f3b5f0e0f25a27f17fba67eb0
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Tue May 14 12:32:14 2024 -0700

    runtime: profile mutex contention during unlock
    
    When an M's use of a lock causes delays in other Ms, capture the stack
    of the unlock call that caused the delay. This makes the meaning of the
    mutex profile for runtime-internal mutexes match the behavior for
    sync.Mutex: the profile points to the end of the critical section that
    is responsible for delaying other work.
    
    Fixes #66999
    
    Change-Id: I4abc4a1df00a48765d29c07776481a1cbd539ff8
    Reviewed-on: https://go-review.googlesource.com/c/go/+/585638
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>
    Reviewed-by: Michael Pratt <mpratt@google.com>

commit d881ed6384ae58154d99682f1e20160c64e7c3c2
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Mon May 13 13:00:52 2024 -0700

    runtime: double-link list of waiting Ms
    
    When an M unlocks a contended mutex, it needs to consult a list of the
    Ms that had to wait during its critical section. This allows the M to
    attribute the appropriate amount of blame to the unlocking call stack.
    
    Mirroring the implementation for users' sync.Mutex contention (via
    sudog), we can (in a future commit) use the time that the head and tail
    of the wait list started waiting, and the number of waiters, to estimate
    the sum of the Ms' delays.
    
    When an M acquires the mutex, it needs to remove itself from the list of
    waiters. Since the futex-based lock implementation leaves the OS in
    control of the order of M wakeups, we need to be prepared for quickly
    (constant time) removing any M from the list.
    
    First, have each M add itself to a singly-linked wait list when it finds
    that its lock call will need to sleep. This case is safe against
    live-lock, since any delay to one M adding itself to the list would be
    due to another M making durable progress.
    
    Second, have the M that holds the lock (either right before releasing,
    or right after acquiring) update metadata on the list of waiting Ms to
    double-link the list and maintain a tail pointer and waiter count. That
    work is amortized-constant: we'll avoid contended locks becoming
    proportionally more contended and undergoing performance collapse.
    
    For #66999
    
    Change-Id: If75cdea915afb59ccec47294e0b52c466aac8736
    Reviewed-on: https://go-review.googlesource.com/c/go/+/585637
    Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>

commit be0b569caa0eab1a7f30edf64e550bbf5f6ff235
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Mon May 13 12:23:58 2024 -0700

    runtime: prepare for extensions to waiting M list
    
    Move the nextwaitm field into a small struct, in preparation for
    additional metadata to track how long Ms need to wait for locks.
    
    For #66999
    
    Change-Id: Ib40e43c15cde22f7e35922641107973d99439ecd
    Reviewed-on: https://go-review.googlesource.com/c/go/+/585635
    Reviewed-by: Michael Pratt <mpratt@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>

commit 66cc2b7ca760d62294584d6680df65892cf7a8cf
Author: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Date:   Sat Apr 27 13:41:05 2024 +0200

    runtime: make profstackdepth a GODEBUG option
    
    Allow users to decrease the profiling stack depth back to 32 in case
    they experience any problems with the new default of 128.
    
    Users may also use this option to increase the depth up to 1024.
    
    Change-Id: Ieaab2513024915a223239278dd97a6e161dde1cf
    Reviewed-on: https://go-review.googlesource.com/c/go/+/581917
    Reviewed-by: Austin Clements <austin@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Michael Knyszek <mknyszek@google.com>

commit 1b9dc3e178be578cf1d8c06fe371283a58bdd93f
Author: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Date:   Fri May 17 15:07:07 2024 +0200

    runtime: increase profiling stack depth to 128
    
    The current stack depth limit for alloc, mutex, block, threadcreate and
    goroutine profiles of 32 frequently leads to truncated stack traces in
    production applications. Increase the limit to 128 which is the same
    size used by the execution tracer.
    
    Create internal/profilerecord to define variants of the runtime's
    StackRecord, MemProfileRecord and BlockProfileRecord types that can hold
    arbitrarily big stack traces. Implement internal profiling APIs based on
    these new types and use them for creating protobuf profiles and to act
    as shims for the public profiling APIs using the old types.
    
    This will lead to an increase in memory usage for applications that
    use the impacted profile types and have stack traces exceeding the
    current limit of 32. Those applications will also experience a slight
    increase in CPU usage, but this will hopefully soon be mitigated via CL
    540476 and 533258 which introduce frame pointer unwinding for the
    relevant profile types.
    
    For #43669.
    
    Change-Id: Ie53762e65d0f6295f5d4c7d3c87172d5a052164e
    Reviewed-on: https://go-review.googlesource.com/c/go/+/572396
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Austin Clements <austin@google.com>
    Reviewed-by: Cherry Mui <cherryyz@google.com>

commit 47187a4f4f226c4b9e0e920c5ad1ec9ce83bdc35
Author: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Date:   Sun May 19 15:21:53 2024 +0200

    runtime: fix profile stack trace depth regression
    
    Previously it was possible for mutex and block profile stack traces to
    contain up to 32 frames in Stack0 or the resulting pprof profiles.
    CL 533258 changed this behavior by using some of the space to
    record skipped frames that are discarded when performing delayed inline
    expansion. This has lowered the effective maximum stack size from 32 to
    27 (the max skip value is 5), which can be seen as a small regression.
    
    Add TestProfilerStackDepth to demonstrate the issue and protect all
    profile types from similar regressions in the future. Fix the issue by
    increasing the internal maxStack limit to take the maxSkip value into
    account. Assert that the maxSkip value is never exceeded when recording
    mutex and block profile stack traces.
    
    Three alternative solutions to the problem were considered and
    discarded:
    
    1) Revert CL 533258 and give up on frame pointer unwinding. This seems
       unappealing as we would lose the performance benefits of frame
       pointer unwinding.
    2) Discard skipped frames when recording the initial stack trace. This
       would require eager inline expansion for up to maxSkip frames and
       partially negate the performance benefits of frame pointer
       unwinding.
    3) Accept and document the new behavior. This would simplify the
       implementation, but seems more confusing from a user perspective. It
       also complicates the creation of test cases that make assertions
       about the maximum profiling stack depth.
    
    The execution tracer still has the same issue due to CL 463835. This
    should be addressed in a follow-up CL.
    
    Co-authored-by: Nick Ripley <nick.ripley@datadoghq.com>
    Change-Id: Ibf4dbf08a5166c9cb32470068c69f58bc5f98d2c
    Reviewed-on: https://go-review.googlesource.com/c/go/+/586657
    Reviewed-by: Austin Clements <austin@google.com>
    Reviewed-by: Cherry Mui <cherryyz@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

commit f4494522dc067bc930dc73b91e3ef931ce4583da
Author: Nick Ripley <nick.ripley@datadoghq.com>
Date:   Fri Oct 6 13:02:40 2023 -0400

    runtime: use frame pointer unwinding for block and mutex profilers
    
    Use frame pointer unwinding, where supported, to collect call stacks for
    the block, and mutex profilers. This method of collecting call stacks is
    typically an order of magnitude faster than callers/tracebackPCs. The
    marginal benefit for these profile types is likely small compared to
    using frame pointer unwinding for the execution tracer. However, the
    block profiler can have noticeable overhead unless the sampling rate is
    very high. Additionally, using frame pointer unwinding in more places
    helps ensure more testing/support, which benefits systems like the
    execution tracer which rely on frame pointer unwinding to be practical
    to use.
    
    Change-Id: I4b36c90cd2df844645fd275a41b247352d635727
    Reviewed-on: https://go-review.googlesource.com/c/go/+/533258
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Cherry Mui <cherryyz@google.com>
    Auto-Submit: Cherry Mui <cherryyz@google.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>

src/runtime/pprof
last known 07496729c58dd58316b01c8bb81aa21a925ae006
current 0b0dfcd5404ce86d6c818d78bdb6348ded459e96

commit 0b0dfcd5404ce86d6c818d78bdb6348ded459e96
Author: apocelipes <seve3r@outlook.com>
Date:   Wed Jul 24 10:24:06 2024 +0000

    runtime: use slices and maps to clean up tests
    
    Replace reflect.DeepEqual with slices.Equal/maps.Equal, which is
    much faster.
    
    Also remove some unecessary helper functions.
    
    Change-Id: I3e4fa2938fed1598278c9e556cd4fa3b9ed3ad6d
    GitHub-Last-Rev: 69bb43fc6e5c4a4a7d028528fe00b43db784464e
    GitHub-Pull-Request: golang/go#67603
    Reviewed-on: https://go-review.googlesource.com/c/go/+/587815
    Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Auto-Submit: Ian Lance Taylor <iant@google.com>
    Reviewed-by: Ian Lance Taylor <iant@google.com>

commit 87abb4afb63918d3e5ee5d7ebd196386fa22e368
Author: Nick Ripley <nick.ripley@datadoghq.com>
Date:   Tue Jul 16 07:15:22 2024 -0400

    runtime: avoid multiple records with identical stacks from MutexProfile
    
    When using frame pointer unwinding, we defer frame skipping and inline
    expansion for call stacks until profile reporting time. We can end up
    with records which have different stacks if no frames are skipped, but
    identical stacks once skipping is taken into account. Returning multiple
    records with the same stack (but different values) has broken programs
    which rely on the records already being fully aggregated by call stack
    when returned from runtime.MutexProfile. This CL addresses the problem
    by handling skipping at recording time. We do full inline expansion to
    correctly skip the desired number of frames when recording the call
    stack, and then handle the rest of inline expansion when reporting the
    profile.
    
    The regression test in this CL is adapted from the reproducer in
    https://github.com/grafana/pyroscope-go/issues/103, authored by Tolya
    Korniltsev.
    
    Fixes #67548
    
    This reapplies CL 595966.
    The original version of this CL introduced a bounds error in
    MutexProfile and failed to correctly expand inlined frames from that
    call. This CL applies the original CL, fixing the bounds error and
    adding a test for the output of MutexProfile to ensure the frames are
    expanded properly.
    
    Change-Id: I5ef8aafb9f88152a704034065c0742ba767c4dbb
    Reviewed-on: https://go-review.googlesource.com/c/go/+/598515
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>
    Reviewed-by: Cherry Mui <cherryyz@google.com>

commit 6948b4df8c5ec9f249960ea60dba6349d1975c0e
Author: Cherry Mui <cherryyz@google.com>
Date:   Mon Jul 15 19:49:44 2024 +0000

    Revert "runtime: avoid multiple records with identical stacks from MutexProfile"
    
    This reverts CL 595966.
    
    Reason for revert: This CL contains a bug. See the comment in https://go-review.googlesource.com/c/go/+/595966/8#message-57f4c1f9570b5fe912e06f4ae3b52817962533c0
    
    Change-Id: I48030907ded173ae20a8965bf1b41a713dd06059
    Reviewed-on: https://go-review.googlesource.com/c/go/+/598219
    Reviewed-by: Than McIntosh <thanm@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>

commit 183a40db6def2780478053b3cd14f3f5c355d999
Author: Nick Ripley <nick.ripley@datadoghq.com>
Date:   Tue Jul 2 12:53:36 2024 -0400

    runtime: avoid multiple records with identical stacks from MutexProfile
    
    When using frame pointer unwinding, we defer frame skipping and inline
    expansion for call stacks until profile reporting time. We can end up
    with records which have different stacks if no frames are skipped, but
    identical stacks once skipping is taken into account. Returning multiple
    records with the same stack (but different values) has broken programs
    which rely on the records already being fully aggregated by call stack
    when returned from runtime.MutexProfile. This CL addresses the problem
    by handling skipping at recording time. We do full inline expansion to
    correctly skip the desired number of frames when recording the call
    stack, and then handle the rest of inline expansion when reporting the
    profile.
    
    The regression test in this CL is adapted from the reproducer in
    https://github.com/grafana/pyroscope-go/issues/103, authored by Tolya
    Korniltsev.
    
    Fixes #67548
    
    Co-Authored-By: Tolya Korniltsev <korniltsev.anatoly@gmail.com>
    Change-Id: I6a42ce612377f235b2c8c0cec9ba8e9331224b84
    Reviewed-on: https://go-review.googlesource.com/c/go/+/595966
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Carlos Amedee <carlos@golang.org>
    Reviewed-by: Cherry Mui <cherryyz@google.com>
    Auto-Submit: Carlos Amedee <carlos@golang.org>
    Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>

commit 82c14346d89ec0eeca114f9ca0e88516b2cda454
Author: Cherry Mui <cherryyz@google.com>
Date:   Wed Jul 3 14:14:34 2024 -0400

    cmd/link: don't disable memory profiling when pprof.WriteHeapProfile is used
    
    We have an optimization that if the memory profile is not consumed
    anywhere, we set the memory profiling rate to 0 to disable the
    "background" low-rate profiling. We detect whether the memory
    profile is used by checking whether the runtime.MemProfile function
    is reachable at link time. Previously, all APIs that access the
    memory profile go through runtime.MemProfile. But the code was
    refactored in CL 572396, and now the legacy entry point
    WriteHeapProfile uses pprof_memProfileInternal without going
    through runtime.MemProfile. In fact, even with the recommended
    runtime/pprof.Profile API (pprof.Lookup or pprof.Profiles),
    runtime.MemProfile is only (happen to be) reachable through
    countHeap.
    
    Change the linker to check runtime.memProfileInternal instead,
    which is on all code paths that retrieve the memory profile. Add
    a test case for WriteHeapProfile, so we cover all entry points.
    
    Fixes #68136.
    
    Change-Id: I075c8d45c95c81825a1822f032e23107aea4303c
    Reviewed-on: https://go-review.googlesource.com/c/go/+/596538
    Reviewed-by: Than McIntosh <thanm@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

commit ca7d300509626e2071f3f5babc2e9c121d806fec
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Wed May 29 16:41:10 2024 +0000

    Revert "runtime: remove GODEBUG=runtimecontentionstacks"
    
    This reverts commit 87e930f7289136fad1310d4b63dd4127e409bac5 (CL 585639)
    
    Reason for revert: This is part of a patch series that changed the
    handling of contended lock2/unlock2 calls, reducing the maximum
    throughput of contended runtime.mutex values, and causing a performance
    regression on applications where that is (or became) the bottleneck.
    
    Updates #66999
    Updates #67585
    
    Change-Id: I1e286d2a16d16e4af202cd5dc04b2d9c4ee71b32
    Reviewed-on: https://go-review.googlesource.com/c/go/+/589097
    Reviewed-by: Than McIntosh <thanm@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>

commit 9a8995b8b6a08d5fe01122771f962b36336f8aec
Author: Russ Cox <rsc@golang.org>
Date:   Wed May 22 17:09:02 2024 -0400

    all: document legacy //go:linkname for modules with ≥100 dependents
    
    For #67401.
    
    Change-Id: I015408a3f437c1733d97160ef2fb5da6d4efcc5c
    Reviewed-on: https://go-review.googlesource.com/c/go/+/587598
    Reviewed-by: Cherry Mui <cherryyz@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Auto-Submit: Russ Cox <rsc@golang.org>

commit bf91eb3a8bb057a620f3823e4d6b74a529c0a44d
Author: Alan Donovan <adonovan@google.com>
Date:   Thu May 23 10:47:36 2024 -0400

    std: fix calls to Printf(s) with non-constant s
    
    In all cases the intent was not to interpret s as a format string.
    In one case (go/types), this was a latent bug in production.
    (These were uncovered by a new check in vet's printf analyzer.)
    
    Updates #60529
    
    Change-Id: I3e17af7e589be9aec1580783a1b1011c52ec494b
    Reviewed-on: https://go-review.googlesource.com/c/go/+/587855
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Russ Cox <rsc@golang.org>

commit b0b1d42db32a992150dd26681d3bda222e108303
Author: Ian Lance Taylor <iant@golang.org>
Date:   Wed May 22 13:38:40 2024 -0700

    all: change from sort functions to slices functions where feasible
    
    Doing this because the slices functions are slightly faster and
    slightly easier to use. It also removes one dependency layer.
    
    This CL does not change packages that are used during bootstrap,
    as the bootstrap compiler does not have the required slices functions.
    It does not change the go/scanner package because the ErrorList
    Len, Swap, and Less methods are part of the Go 1 API.
    
    Change-Id: If52899be791c829198e11d2408727720b91ebe8a
    Reviewed-on: https://go-review.googlesource.com/c/go/+/587655
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Ian Lance Taylor <iant@google.com>
    Auto-Submit: Ian Lance Taylor <iant@google.com>
    Commit-Queue: Ian Lance Taylor <iant@google.com>
    Reviewed-by: Damien Neil <dneil@google.com>

commit 41884dcd05546ced3634496d931d0b005fc8c2e6
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Wed May 22 09:57:14 2024 -0700

    runtime/pprof: ignore runtime-internal samples in test
    
    Tests of the mutex profile focus on sync.Mutex, which is easy to
    control. But since those tests still use the runtime, and contention on
    internal runtime.mutex values is now also part of the mutex profile, we
    have to filter out those samples before examining the profile. Otherwise
    the test may be confused by stray contention on sched.lock (or other
    runtime-internal locks) as a natural consequence of using goroutines.
    
    Fixes #67563
    
    Change-Id: I066a24674d8b719dbeca4a5c0f76b53bc07498c1
    Reviewed-on: https://go-review.googlesource.com/c/go/+/586957
    Reviewed-by: Cherry Mui <cherryyz@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>

commit 87e930f7289136fad1310d4b63dd4127e409bac5
Author: Rhys Hiltner <rhys.hiltner@gmail.com>
Date:   Tue May 14 14:44:37 2024 -0700

    runtime: remove GODEBUG=runtimecontentionstacks
    
    Go 1.22 promised to remove the setting in a future release once the
    semantics of runtime-internal lock contention matched that of
    sync.Mutex. That work is done, remove the setting.
    
    For #66999
    
    Change-Id: I3c4894148385adf2756d8754e44d7317305ad758
    Reviewed-on: https://go-review.googlesource.com/c/go/+/585639
    Reviewed-by: Carlos Amedee <carlos@golang.org>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
    Reviewed-by: Michael Pratt <mpratt@google.com>

commit 1b9dc3e178be578cf1d8c06fe371283a58bdd93f
Author: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Date:   Fri May 17 15:07:07 2024 +0200

    runtime: increase profiling stack depth to 128
    
    The current stack depth limit for alloc, mutex, block, threadcreate and
    goroutine profiles of 32 frequently leads to truncated stack traces in
    production applications. Increase the limit to 128 which is the same
    size used by the execution tracer.
    
    Create internal/profilerecord to define variants of the runtime's
    StackRecord, MemProfileRecord and BlockProfileRecord types that can hold
    arbitrarily big stack traces. Implement internal profiling APIs based on
    these new types and use them for creating protobuf profiles and to act
    as shims for the public profiling APIs using the old types.
    
    This will lead to an increase in memory usage for applications that
    use the impacted profile types and have stack traces exceeding the
    current limit of 32. Those applications will also experience a slight
    increase in CPU usage, but this will hopefully soon be mitigated via CL
    540476 and 533258 which introduce frame pointer unwinding for the
    relevant profile types.
    
    For #43669.
    
    Change-Id: Ie53762e65d0f6295f5d4c7d3c87172d5a052164e
    Reviewed-on: https://go-review.googlesource.com/c/go/+/572396
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Austin Clements <austin@google.com>
    Reviewed-by: Cherry Mui <cherryyz@google.com>

commit 47187a4f4f226c4b9e0e920c5ad1ec9ce83bdc35
Author: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Date:   Sun May 19 15:21:53 2024 +0200

    runtime: fix profile stack trace depth regression
    
    Previously it was possible for mutex and block profile stack traces to
    contain up to 32 frames in Stack0 or the resulting pprof profiles.
    CL 533258 changed this behavior by using some of the space to
    record skipped frames that are discarded when performing delayed inline
    expansion. This has lowered the effective maximum stack size from 32 to
    27 (the max skip value is 5), which can be seen as a small regression.
    
    Add TestProfilerStackDepth to demonstrate the issue and protect all
    profile types from similar regressions in the future. Fix the issue by
    increasing the internal maxStack limit to take the maxSkip value into
    account. Assert that the maxSkip value is never exceeded when recording
    mutex and block profile stack traces.
    
    Three alternative solutions to the problem were considered and
    discarded:
    
    1) Revert CL 533258 and give up on frame pointer unwinding. This seems
       unappealing as we would lose the performance benefits of frame
       pointer unwinding.
    2) Discard skipped frames when recording the initial stack trace. This
       would require eager inline expansion for up to maxSkip frames and
       partially negate the performance benefits of frame pointer
       unwinding.
    3) Accept and document the new behavior. This would simplify the
       implementation, but seems more confusing from a user perspective. It
       also complicates the creation of test cases that make assertions
       about the maximum profiling stack depth.
    
    The execution tracer still has the same issue due to CL 463835. This
    should be addressed in a follow-up CL.
    
    Co-authored-by: Nick Ripley <nick.ripley@datadoghq.com>
    Change-Id: Ibf4dbf08a5166c9cb32470068c69f58bc5f98d2c
    Reviewed-on: https://go-review.googlesource.com/c/go/+/586657
    Reviewed-by: Austin Clements <austin@google.com>
    Reviewed-by: Cherry Mui <cherryyz@google.com>
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch 7 times, most recently from bc39ac8 to 93d5771 Compare May 21, 2024 04:27
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch 4 times, most recently from 0e0ec7f to b497846 Compare May 25, 2024 04:26
@github-actions github-actions bot requested a review from a team as a code owner May 25, 2024 04:26
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch 8 times, most recently from e70d662 to e791cdd Compare June 2, 2024 04:27
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch 7 times, most recently from e91934d to d4fb06b Compare June 9, 2024 04:27
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch from d4fb06b to d895f8e Compare June 10, 2024 04:29
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch 7 times, most recently from 3dc0d23 to 42c31fe Compare July 29, 2024 04:27
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch 7 times, most recently from bd05fe7 to 30a307c Compare August 5, 2024 04:28
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch 8 times, most recently from 52bb5e0 to fcfa0b0 Compare August 13, 2024 04:27
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch from fcfa0b0 to 0169ec7 Compare August 14, 2024 04:28
@github-actions github-actions bot force-pushed the check_golang_profiler_changes_1715660795 branch from 0169ec7 to c4848be Compare August 15, 2024 04:28
@korniltsev korniltsev enabled auto-merge (squash) August 15, 2024 10:59
@korniltsev korniltsev merged commit 278dd8c into main Aug 15, 2024
1 check passed
@korniltsev korniltsev deleted the check_golang_profiler_changes_1715660795 branch August 15, 2024 11:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant