-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: Large maps cause significant GC pauses #9477
Comments
/cc @randall77 @dvyukov @rsc @RLH |
The reference conversation for this support issue can be found here: My findings at the end of the thread are here: |
This may be fixed in 1.5 with the concurrent gc. However, the work of scanning the hash tables will not go away, it will just be paid for in smaller chunks. Hash tables will still have overflow pointers so they will still need to be scanned and there are no plans to fix that. I'm open to ideas if anyone has them. |
👍. I added some example cases with preallocated maps for comparison. @joliver agree with the slice issue too. We use arrays for very large blocks where possible, but it's annoying. |
@randall77 on pain of going into a rat hole, I ran @cespare's tests with a few different values of
It's possible that in |
Is the problem about how long a call to runtime.GC() takes or is it really On Fri, Jan 2, 2015 at 2:46 PM, Josh Bleecher Snyder <
|
It actually is a problem with the garbage collection itself. The examples thus far call The biggest question being raised in this issue is whether optimizations could be made on certain kinds of structures such as maps where the contained type is known to not contain pointers, such as a |
If go reduced the observed garbage collection pauses from hundreds of On Mon, Jan 5, 2015 at 11:22 AM, Jonathan Oliver notifications@github.com
|
I run an application that uses a lot of map and sees similar issues (~600ms pause time with a 1.6GB heap). Decreasing the pause to 10ms at a time would be a big help to this app. However, I wonder if the overall cost of GC could be decreased separately. @randall77, I read through hashmap.go recently and it looks like the use of overflow buckets may be restricted enough that they could be allocated from a map-local arena instead of on the global heap. It may not have to lean on the general-purpose GC just to keep track of its overflow buckets. It looks like overflow buckets aren't ever freed except during an evacuation, and that pointers to them are only accessible within small parts of the runtime. The memory for the overflow buckets could be backed by an array similar to hmap.buckets (with a bump-the-pointer allocator), and could be referenced by their offset into the array instead of a real pointer (which would be chased by the GC). Is this approach possible? |
I suppose that the total GC time would be the same or maybe slower if the start/stop takes some time, but if GC runs 20% of the time, instead of 600 ms pause, it would be 3 seconds at 80% your code, 20% the GC. Maybe it is a solution to avoid long pauses in interactive programs but the loss of performance is there in any case. I wonder if it would be possible to produce garbage faster than it is collected... |
@rhysh, it is an interesting idea to allocate overflow buckets as a single array (perhaps contiguous with the main buckets). Then we could use bucket ids instead of pointers to reference them. The main problem is that there's no a priori maximum number of overflow buckets needed. It depends on how the hashing works out. It could be as bad as half the size of the main bucket array. You wouldn't want to allocate this worst case ahead of time. Maybe there would be a way to trigger growth on # of used overflow buckets instead of on just the number of entries as we do now. That way, we could have a fixed maximum, say 10%, of overflow buckets allocated at the start and we grow if we run out of them. |
A goroutine that is allocating or creating other GC work such as writing On Mon, Jan 5, 2015 at 4:41 PM, siritinga notifications@github.com wrote:
|
We would not have this kind of problem if go provided an "unmanaged" library which could implement manual memory management for regions of heap to be ignored by the GC. Neither a pointer to any address to the "unmanaged" heap nor any pointer type stored inside the "unmanaged" heap would be considered for GC. This could be a very simple solution which would solve once and for all the problems that go has with long-lived pointer values and which will probably never be solved by GC. |
Right @randall77, there's no limit on the number of overflow buckets. If a map stays about the same size but has a lot of churn, it seems like each of the primary buckets could grow to have a considerable number of overflow buckets chained to them - the random distribution of elements could bump each bucket over 8 or 16 elements, without ever increasing the average beyond the load factor. Since the overflow buckets aren't freed when they're emptied via runtime·mapdelete, the expansion would be permanent. There'd probably need to be an "overflow bucket evacuation" process that would operate like the current evacuation to allow resizing of the new array. Added complexity for sure, but it may be worth it. Do you happen to have statistics on how many overflow buckets end up used for long-lived maps with churn? This could maybe be collected at runtime by chasing the overflow map pointers, or offline by inspecting debug.WriteHeapDump output? |
@rhysh, no I don't have any data on long-lived maps. I have been meaning to implement compacting and freeing of the overflow buckets when the map has never been iterated over. For maps which have had an iterator started on them, however, I don't see any easy way to compact. Your idea of repurposing evacuation to do compacting may work. We'd then have two types of evacuation - to a map of the same size and a map of double the size. Both would be triggered by running out of the overflow bucket pool. We'd have to figure out which to do at that point (how?). |
@randall77 I haven't thought through how iteration interacts with this. Maybe golang-dev has some good ideas? My rough impression is that the evacuation work has to be complete by the time we'd need a subsequent resize. It looks like you guarantee this by doing evacuate work on writes and doubling the size. If the size doesn't double, then we might run into problems (needing another evacuate before the current one has finished). I wonder if the overflow compaction could happen only when the table was already twice the size it needed to be. If we've got a table with 6000 elements, it will fit in a hmap with 1024 buckets (if loadFactor is 6.5). When it runs out of overflow buckets due to churn, maybe we resize it to 2048 buckets. When it runs out of overflow buckets again (but still has ~6000 elements), we can notice that it's already twice as big as required and evacuate it to a new table of the same size. This same-size evacuation might have to run twice as fast to finish in time, but it's still a constant factor. |
If we have a map[k]v where both k and v does not contain pointers and we want to improve scan performance, then we can do the following. |
I am not sure how important are hashmaps w/o pointers, somebody needs to do a study. I feel that they are not very frequently used, but on the other hand it is a nice optimization option if you have a large map and can refactor the code so that k/v do not contain pointers (e.g. change string to [8]byte). |
I would dearly love this optimization as an option; I posted to https://groups.google.com/forum/#!topic/golang-nuts/pHYverdFcLc with a description of what I'm facing. But think 10+second stop the world GC halts frequently. @cespare indicated that he had good success rolling his own implementation, If for some reason his code isn't shareable, I'll see what I can cook up. |
This is all completely different on tip and I think the large pauses are already gone there. |
@ianlancetaylor you mean that tip should have lower GC times or simply lower pauses because it is concurrent? In a quick check, I see that GC total time is almost double than 1.4.1 for a big map[string]string but the time is in the first field of the gctrace debug instead of the third (I think the first is the mark phase that now is concurrent?). |
Lower pause times. Which is what this big is about. |
Tip contains a lot of code to check the validity of the GC phases and On Sun, Jan 25, 2015 at 3:51 AM, siritinga notifications@github.com wrote:
|
Mailed a change that makes GC ignore maps when both key and value do not contain pointers: |
Currently we scan maps even if k/v does not contain pointers. This is required because overflow buckets are hanging off the main table. This change introduces a separate array that contains pointers to all overflow buckets and keeps them alive. Buckets themselves are marked as containing no pointers and are not scanned by GC (if k/v does not contain pointers). This brings maps in line with slices and chans -- GC does not scan their contents if elements do not contain pointers. Currently scanning of a map[int]int with 2e8 entries (~8GB heap) takes ~8 seconds. With this change scanning takes negligible time. Update #9477. Change-Id: Id8a04066a53d2f743474cad406afb9f30f00eaae Reviewed-on: https://go-review.googlesource.com/3288 Reviewed-by: Keith Randall <khr@golang.org>
@dvyukov Thanks so much! |
The change is merged. |
I'm sorry to say there is; I have been having weird corruption issues with maps post-GC while running on tip, and @dvyukov, where do you want me to include details? Here, or on golang-devs? |
@vessenes please file a separate issue with repro instructions |
The bug I mentioned got discussed on -dev and fixed by @dvyukov. |
Closing this out as the low-hanging fruit was fixed by @dvyukov and there's no clear next actions wrt pointer-containing map types. Of course I'm still looking forward to future map improvements :) With @dvyukov's change, GC pause behavior for maps should more closely resemble that for the other aggregate data types and should be easier to reason about in general (say, if you're trying to reduce pause latency and you know the "duration is proportional to # of pointers on the heap" rule of thumb). |
I'm on Go 1.4 and linux/amd64. I've noticed this since at least Go 1.2.
Large maps incur large GC pause times. This happens even when the map key and value types do not contain pointers, and when the map isn't changing between GC runs. I assume it comes from the collector traversing internal pointers in the map implementation.
Roughly speaking, with a single map containing millions of entries, one will typically see GC pause times of hundreds of ms.
Here's an example program that shows a few different approaches: http://play.golang.org/p/AeHSLyLz_c. In particular, it compares
In the real program where this issue caused me problems, I have a server that used a very large map (>10M entries) as an index into off-heap data. Even after getting rid of pointers in the map value and sharding the map, the pause times were >100ms which was too large for my use case. So I ended up writing a very simple hashmap implementation using only slices/indexes and that brought the pause times to under 1ms.
I wonder if the map implementation and GC can be made to work together better to mitigate this.
(It's possible that this is expected/unfortunate, or that it will get better with improved GC along with everything else, but I wanted to have a record of it to center the discussion.)
The text was updated successfully, but these errors were encountered: