Description
Go version
tip
What did you do?
Run this benchmark:
func CallWrite(w io.Writer, x byte) {
	// The backing array for this slice currently escapes
	// because it is used as an interface method call arg.
	b := make([]byte, 0, 64)
	b = append(b, x)
	w.Write(b) // Interface method call.
}

func BenchmarkCallWrite(b *testing.B) {
	g := &GoodWriter{}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		CallWrite(g, 0)
	}
}
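The benchmark refers to GoodWriter, whose definition is not shown here. The following is a minimal sketch of a self-contained program that runs the same benchmark outside of `go test`. The GoodWriter body and the `//go:noinline` directive are assumptions made for illustration, not taken from the report; the directive is there so the call is not inlined and statically devirtualized at the call site.

```go
package main

import (
	"fmt"
	"io"
	"testing"
)

// GoodWriter is an assumed stand-in for the writer used in the benchmark:
// its Write method only counts bytes and never stores or leaks p.
type GoodWriter struct{ n int }

func (w *GoodWriter) Write(p []byte) (int, error) {
	w.n += len(p)
	return len(p), nil
}

// CallWrite mirrors the function from the report. The noinline directive is
// an assumption added for this sketch.
//
//go:noinline
func CallWrite(w io.Writer, x byte) {
	b := make([]byte, 0, 64)
	b = append(b, x)
	w.Write(b) // Interface method call.
}

func main() {
	g := &GoodWriter{}
	// testing.Benchmark runs a benchmark function without the go test command.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			CallWrite(g, 0)
		}
	})
	fmt.Printf("%d B/op, %d allocs/op\n", res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```

The numbers printed depend on the toolchain in use, so none are shown here.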
What did you see happen?
Interface method arguments are marked as escaping. In our example, the byte slice is heap allocated because it is used as the argument to w.Write:
BenchmarkCallWrite 31895619 47.36 ns/op 64 B/op 1 allocs/op
What did you expect to see?
It would be nice to see 0 allocations.
Suggestion
I suspect that in many cases we can teach escape analysis to heap allocate an interface method parameter (the byte slice b in our example) only conditionally, depending on which concrete implementation is passed in behind the interface at execution time.
Today, escape analysis cannot generally reach different conclusions for distinct control flow paths, nor set up a single value to be conditionally stack or heap allocated, but we could teach it to do those things in certain situations, including for the type assertions that are automatically introduced at compile time by PGO-based devirtualization. The suggested approach doesn't rely solely on PGO, but people who are concerned about performance really should be using PGO, not least because PGO has some inlining superpowers that are helpful for avoiding allocations. (Today, PGO devirtualization doesn't avoid the allocation we are discussing.)
This is to help target cases where the concrete type behind an interface is not statically known. (That is, the compiler today is able to recognize in cases like w := &GoodWriter{}; w.Write(b) that w must statically have a specific concrete type, but that does not help when the concrete type for w is not statically known, such as when w is passed in as a parameter as in our example above.)
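To make the distinction concrete, here is a small sketch that contrasts the two situations. StaticWrite, DynamicWrite, and the GoodWriter body are hypothetical names and definitions chosen for illustration; the `//go:noinline` directives are likewise an assumption, added so that the caller cannot inline DynamicWrite and discover the concrete type that way.

```go
package main

import (
	"fmt"
	"io"
	"testing"
)

// GoodWriter is an assumed writer that does not retain its argument.
type GoodWriter struct{ n int }

func (w *GoodWriter) Write(p []byte) (int, error) {
	w.n += len(p)
	return len(p), nil
}

// StaticWrite: the concrete type of w is visible inside the function.
//
//go:noinline
func StaticWrite(x byte) int {
	w := &GoodWriter{}
	b := make([]byte, 0, 64)
	b = append(b, x)
	n, _ := w.Write(b) // direct call on a known concrete type
	return n
}

// DynamicWrite: the concrete type behind w is only known at execution time.
//
//go:noinline
func DynamicWrite(w io.Writer, x byte) int {
	b := make([]byte, 0, 64)
	b = append(b, x)
	n, _ := w.Write(b) // interface method call
	return n
}

func main() {
	g := &GoodWriter{}
	static := testing.AllocsPerRun(1000, func() { StaticWrite(0) })
	dynamic := testing.AllocsPerRun(1000, func() { DynamicWrite(g, 0) })
	fmt.Printf("static: %v allocs/call, dynamic: %v allocs/call\n", static, dynamic)
}
```

Building with `go build -gcflags=-m` prints the escape analysis decisions for each `make`, which is another way to compare the two functions.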
I have a basic working version, and I plan to send a CL. It still needs cleanup, more tests, I need to look at a broader set of results, etc., and it has a couple of shortcuts that I am in the middle of removing, but I am cautiously hopeful it can be made to work robustly.
Benchmark results for the example from the playground link above using that CL with PGO enabled:
            │   1.24.out   │               new.out               │
            │    sec/op    │   sec/op     vs base                │
CallWrite-4   48.850n ± 2%   4.385n ± 1%  -91.02% (p=0.000 n=20)

            │  1.24.out  │              new.out               │
            │    B/op    │   B/op     vs base                 │
CallWrite-4   64.00 ± 0%   0.00 ± 0%  -100.00% (p=0.000 n=20)

            │  1.24.out  │               new.out               │
            │ allocs/op  │ allocs/op   vs base                 │
CallWrite-4   1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=20)
Any feedback welcome, especially of the flavor "This will never work for reasons X and Y" (which would save everyone time 😅).
Discussion
When compiling CallWrite, the compiler does not know the concrete implementation hiding behind the interface w. For example, CallWrite might be invoked with a well-behaved concrete implementation that does not retain or leak the w.Write byte slice argument (like GoodWriter in our example above), or it might be invoked with something like LeakingWriter:
// LeakingWriter leaks its Write argument to the heap.
type LeakingWriter struct{ a, b int }

func (w *LeakingWriter) Write(p []byte) (n int, err error) {
	global = p
	return len(p), nil
}

var global []byte
io.Writer happens to document that implementations must not retain the byte slice, but the compiler doesn't read documentation, and this type of allocation also affects other interfaces that don't document similar prohibitions.
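The following sketch shows why the compiler has to be conservative here: with LeakingWriter behind the interface, the slice created inside CallWrite stays reachable through the global after CallWrite returns, so its backing array cannot live in CallWrite's stack frame. The program reuses the definitions from this report; only the `main` function is added for illustration.

```go
package main

import (
	"fmt"
	"io"
)

var global []byte

// LeakingWriter leaks its Write argument to the heap.
type LeakingWriter struct{ a, b int }

func (w *LeakingWriter) Write(p []byte) (n int, err error) {
	global = p
	return len(p), nil
}

func CallWrite(w io.Writer, x byte) {
	b := make([]byte, 0, 64)
	b = append(b, x)
	w.Write(b) // Interface method call.
}

func main() {
	CallWrite(&LeakingWriter{}, 42)
	// The slice built inside CallWrite is still reachable here.
	fmt.Println(global[0], len(global), cap(global)) // prints "42 1 64"
}
```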
If we collect a profile that is able to observe that CallWrite is frequently called with GoodWriter, the compiler can use PGO to conditionally devirtualize w in most cases, and effectively rewrite the IR at compile time to something akin to:
func OneTypeAssert(w io.Writer, x byte) {
	b := make([]byte, 0, 64)
	b = append(b, x)
	// PGO-based rewrite of w.Write(b)
	tmpw, ok := w.(*GoodWriter)
	if ok {
		tmpw.Write(b) // concrete method call
	} else {
		w.Write(b) // interface method call
	}
}
However, the byte slice must still be heap allocated, given that the function can be called with something other than GoodWriter.
We might be tempted to help things along by manually adding a second type assertion around the make:
func TwoTypeAsserts(w io.Writer, x byte) {
	// Human attempt to help.
	var b []byte
	_, ok := w.(*GoodWriter)
	if ok {
		b = make([]byte, 0, 64)
	} else {
		b = make([]byte, 0, 64)
	}
	b = append(b, x)
	// PGO-based rewrite of w.Write(b)
	tmpw, ok := w.(*GoodWriter)
	if ok {
		tmpw.Write(b) // concrete method call
	} else {
		w.Write(b) // interface method call
	}
}
However, that still heap allocates with today's escape analysis, which constructs a directed data-flow graph that is insensitive to control flow branches.
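The branch insensitivity can be seen in a much smaller example that involves no interfaces at all. In the sketch below, maybeLeak and sink are hypothetical names introduced for illustration: the slice is stored in a global on only one path, yet a flow-insensitive analysis has to treat the single `make` as escaping on every path.

```go
package main

import (
	"fmt"
	"testing"
)

var sink []byte

// maybeLeak stores b in a global only when leak is true. The noinline
// directive is an assumption added so the call is analyzed as written.
//
//go:noinline
func maybeLeak(leak bool, x byte) byte {
	b := make([]byte, 0, 64)
	b = append(b, x)
	if leak {
		sink = b // b escapes on this path only
	}
	return b[0]
}

func main() {
	// Measure the path on which b is never stored anywhere.
	allocs := testing.AllocsPerRun(1000, func() { maybeLeak(false, 1) })
	fmt.Printf("allocs with leak=false: %v\n", allocs)
}
```

With the escape analysis described above, the non-leaking path would be expected to pay for the allocation as well, because the decision is made once per allocation site rather than once per path.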
WIP CL
Our WIP CL teaches escape analysis to recognize type assertions like those created by PGO devirtualization (or by a human), track what happens in the concrete call case (e.g., does the concrete call also cause b to escape), propagate the interface method argument use back to locations like the make in our original example, and if warranted, rewrite a single potentially allocating location (like a single make) to instead have two locations protected by appropriate type assertions in the IR.
The net result is that the slice backing array at one location has been proven safe to place on the stack, while the one at the other location is heap allocated; which of the two is used depends on the concrete type passed in behind the interface at execution time.
In other words, a human can write the original three-line CallWrite from above, and the compiler can conclude it is worthwhile to transform it to the TwoTypeAsserts form and do the work to make the heap allocation conditional.
In the WIP implementation, some current limitations include:
- The new tracking information does not yet flow across function boundaries for interprocedural analysis, though I have hopefully set things up to be able to tackle that piece soon.
- For simplicity, the interface (w in our example) must currently be passed in as a function parameter. I think this restriction could be relaxed, though a tricky case is if w is a field on another struct. (I'll briefly comment more on that below).
- The interface variable (e.g., w) must not be reassigned in view of the allocating location or interface method call. This might be a reasonable restriction for a first cut. (OTOH, the potentially allocating value can be reassigned, as shown in our examples above).
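As one reading of the last two limitations, the sketch below puts the three code shapes side by side. ParamWrite, Logger, FieldWrite, ReassignedWrite, and the GoodWriter body are all hypothetical names and definitions chosen for illustration, and the comments reflect an interpretation of the list above rather than verified behavior of the WIP CL.

```go
package main

import (
	"fmt"
	"io"
)

// GoodWriter is an assumed writer that does not retain its argument.
type GoodWriter struct{ n int }

func (w *GoodWriter) Write(p []byte) (int, error) {
	w.n += len(p)
	return len(p), nil
}

// ParamWrite: the interface is a function parameter and is never reassigned,
// which is the shape the list above describes as currently handled.
func ParamWrite(w io.Writer, x byte) {
	b := make([]byte, 0, 64)
	b = append(b, x)
	w.Write(b)
}

// Logger holds the interface in a struct field, the case called tricky above.
type Logger struct{ w io.Writer }

func (l *Logger) FieldWrite(x byte) {
	b := make([]byte, 0, 64)
	b = append(b, x)
	l.w.Write(b)
}

// ReassignedWrite reassigns the interface variable between the allocating
// location and the interface method call.
func ReassignedWrite(w, fallback io.Writer, x byte) {
	b := make([]byte, 0, 64)
	if w == nil {
		w = fallback
	}
	b = append(b, x)
	w.Write(b)
}

func main() {
	g := &GoodWriter{}
	ParamWrite(g, 1)
	(&Logger{w: g}).FieldWrite(2)
	ReassignedWrite(nil, g, 3)
	fmt.Println(g.n) // prints "3": one byte written by each of the three calls
}
```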