cmd/compile: improve escape analysis of make([]T, n) where n is non-constant #20533
Comments
Implementing alloca wouldn't be impossible now that we have frame pointers everywhere. It's still tricky, though, because stack space is limited.
Please do not allocate big chunks of data on the stack. Otherwise, issues similar to #18625 and #19817 will appear. Additionally, it would be great to have the stack size profiler described in #20010 in order to detect stack (ab)users if optimizations similar to this one go into Go :) Probably it would be better to add a new type of memory -
@RLH and I have been talking about doing this sort of thing for a while. There are several cases where escape analysis can in principle determine that an object's extent does not exceed its scope but it is still forced to heap-allocate it. It wouldn't necessarily require a special heap. We've been talking about adding an explicit
@ezrosent and I were recently discussing a similar idea - that you could prove that a given pointer to a heap-allocated object is the only pointer to it, and thus even if it gets sent across a channel or in some other way escapes its scope, it could still be deterministically freed. The notion of such a special heap or an explicit
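For illustration, here is a minimal sketch (invented for this discussion, not code from the issue) of the pattern being described: the buffer escapes its allocating function by being sent on a channel, yet at every moment there is only one live reference to it, so an ownership-style analysis could in principle free it deterministically.

```go
package main

import "fmt"

// produce allocates a buffer and hands it off on a channel. The slice
// escapes produce's scope, so today it is heap-allocated; but the sender
// never touches it again after the send, so conceptually ownership
// transfers to the receiver and there is only ever one live reference.
func produce(ch chan<- []byte) {
	buf := make([]byte, 64)
	buf[0] = 42
	ch <- buf // sole reference handed to the receiver
}

func main() {
	ch := make(chan []byte)
	go produce(ch)
	buf := <-ch
	fmt.Println(buf[0]) // 42
	// buf goes out of scope here; the last reference is dropped, so an
	// ownership-aware runtime could free it deterministically.
}
```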
Structs now inspect the value before each key, because yielding of the key must of course be skipped if the value is to be skipped.

And yet, we're not done here, and that test is commented out for a reason! This is more complicated than for e.g. stdlib json.Marshal -- we have to emit length information at the beginning of an object. And this, in turn, is capital-H Hard. Emitting the correct length information *up front* will require significantly more code changes, and they're a tad controversial. We have to inspect *all* the fields to see if they're going to be skipped. And strangely, I think we're going to have to do that twice. Checking for the fields to skip must happen at the top, that much is clear; but then to remember which ones we already know will be skipped would require O(n) memory in the length of the struct... which would imply a heap allocation to track! (Worrying about heap allocs is not news in the refmt project because of our stepfunc design, but it's interesting to note we'd be in trouble anyway: Go actually always lets runtime-sized slice creation escape to heap: golang/go#20533 .)

So. An O(2n) runtime is going to be a better trade than slipping from constant to O(n) memory. Hrmph. Anyway, that bit will be in the next commits.

Signed-off-by: Eric Myhre <hash@exultant.us>
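To make the trade-off concrete, here is a rough sketch (not refmt's actual code; `omitEmpty` and `emitField` are hypothetical stand-ins for its skip rule and token emitter) of the two-pass approach: count the emitted fields first so the length can go out up front, then re-check the skip rule on the second pass instead of remembering the results, keeping memory constant at the cost of walking the fields twice.

```go
package main

import (
	"fmt"
	"reflect"
)

// omitEmpty is a stand-in skip rule: drop zero-valued fields.
func omitEmpty(v reflect.Value) bool { return v.IsZero() }

// emitField is a stand-in for whatever token emission the encoder does.
func emitField(name string, v reflect.Value) {
	fmt.Printf("  %s: %v\n", name, v.Interface())
}

func encodeStruct(x interface{}) {
	v := reflect.ValueOf(x)
	t := v.Type()

	// Pass 1: count the fields that will actually be emitted, so the
	// length can be written before any key/value pair.
	n := 0
	for i := 0; i < t.NumField(); i++ {
		if !omitEmpty(v.Field(i)) {
			n++
		}
	}
	fmt.Printf("map of length %d:\n", n)

	// Pass 2: emit the surviving pairs, re-checking the skip rule rather
	// than remembering pass-1 results (which would cost O(n) memory).
	for i := 0; i < t.NumField(); i++ {
		f := v.Field(i)
		if omitEmpty(f) {
			continue
		}
		emitField(t.Field(i).Name, f)
	}
}

func main() {
	encodeStruct(struct {
		A string
		B int
		C string
	}{A: "x", B: 7}) // C is zero and gets skipped; length printed is 2
}
```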
I'm pretty sure this is quite hard to prove in cases like the one you describe (while limiting compile time), so you would have to resort to something like reference counting (not that I'm opposed to reference counting in general). There is already escape analysis for function arguments (which had a huge impact on the performance of the fmt package when it was introduced), so the kind of solution proposed by @valyala and @aclements should be quite possible. Can anyone confirm whether there has been any work in this direction?
Algorithmically, the idea is to do something roughly analogous to Rust's ownership tracking. Essentially, you start off with the assumption that all objects have a single unique owner - and thus, that they can be deallocated when they go out of scope - and then you work forwards from the allocation point to see if it's ever the case that the same pointer gets sent to two or more different places (sent across two channels, sent across a channel and stored in a data structure, etc), in which case the unique ownership property is broken. If the property is never broken, then you can free the object when it goes out of scope. I haven't actually prototyped this, but my guess is that that algorithm would be pretty reasonable in terms of execution time. Total speculation, though.
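A very rough toy model of that forward check, under the assumption that each allocation's uses can be listed in order (all names here are invented for illustration, not compiler API): walk the uses and count how many let the pointer escape to a distinct sink; if at most one does, unique ownership holds.

```go
package main

import "fmt"

type useKind int

const (
	readUse   useKind = iota // plain read: dereference, field access, copy
	escapeUse                // escaping use: channel send, store into a heap object
)

type use struct {
	desc string
	kind useKind
}

// uniquelyOwned reports whether the allocation flows to at most one sink,
// i.e. whether the unique-ownership property described above survives.
func uniquelyOwned(uses []use) bool {
	escapes := 0
	for _, u := range uses {
		if u.kind == escapeUse {
			escapes++
		}
	}
	return escapes <= 1
}

func main() {
	ok := []use{{"read field", readUse}, {"send on ch", escapeUse}}
	bad := []use{{"send on ch1", escapeUse}, {"store in map", escapeUse}}
	fmt.Println(uniquelyOwned(ok))  // true: one sink, ownership transfers
	fmt.Println(uniquelyOwned(bad)) // false: two sinks, property broken
}
```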
I think we're going off topic here and should probably continue on the mailing list. Let me just point out why:
Not really. I wouldn't take Rust as an indication that this is possible in Go. Rust puts the burden of deciding ownership on the developer, not the compiler, by having different kinds of references and strict rules on how you can use them. Their compiler doesn't have to do analysis like this.
You mean backwards from the receiving end, I guess, since you want to know there whether you are the sole owner of everything that comes in or whether it could have escaped. This is both complex and ends up in the "whole program analysis" category. Especially the latter doesn't play well with Go's compile time goals and seems quite different from the more localized escape analysis discussed in this issue. Don't get me wrong. I would love to see something like this too, even if it only works in very specific cases. I just don't think that it is reasonable to expect that to be accomplished on the same timescale as the original issue.
Agreed. Suffice it to say that I think there's promise here, but most of my thinking on this was done five months ago, so I don't remember why I arrived at that conclusion. If folks are interested, I'd be happy to discuss this in more detail somewhere, but I certainly don't have time to make a prototype now, so I'm not going to drive this myself. Sorry to hijack the thread :)
Change https://golang.org/cl/152478 mentions this issue:
In this program, the compiler's escape analysis judges that the array allocated by `make` escapes to the heap, when in fact it does not, presumably because its size is non-constant and thus it cannot be allocated on the stack (without `alloca`).

The lack of this optimization makes it hard to write a good bytecode interpreter in Go, because the interpreter's operand stack has a non-constant size and is thus heap-allocated, even though it is guaranteed by construction not to escape. Consequently, the interpreter incurs a heap allocation at the start of each function, and then a GC write barrier each time it stores an operand to the stack, which is a common operation.

Perhaps the notions of "escapes to heap" and "requires a write barrier" could be decoupled so that a non-constant-sized non-escaping heap variable could avoid write barriers. Or perhaps the compiler could use `alloca` to allocate non-constant-size non-escaping variables on the stack.
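As a concrete illustration (a hedged sketch, not the original issue's program, which is not reproduced above): a toy interpreter whose operand stack is sized at run time. Building it with `go build -gcflags=-m` typically reports that the slice's backing array escapes to the heap even though it never outlives the call (newer Go releases may handle some such cases differently). The opcode encoding is invented for this example.

```go
package main

import "fmt"

// run is a toy bytecode interpreter. Its operand stack has a non-constant
// capacity, so escape analysis conservatively heap-allocates the backing
// array even though the slice never leaves this function, and each store
// to it then goes through a GC write barrier.
func run(code []byte, maxStack int) int64 {
	stack := make([]int64, 0, maxStack) // non-constant size => heap-allocated
	for _, op := range code {
		switch op {
		case 0: // push the constant 1
			stack = append(stack, 1)
		case 1: // pop two operands, push their sum
			n := len(stack)
			stack[n-2] += stack[n-1]
			stack = stack[:n-1]
		}
	}
	return stack[len(stack)-1]
}

func main() {
	fmt.Println(run([]byte{0, 0, 1}, 8)) // pushes 1, 1, then adds: prints 2
}
```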