Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync: add Map.Len method? #20680

Closed
F21 opened this issue Jun 15, 2017 · 60 comments
Closed

sync: add Map.Len method? #20680

F21 opened this issue Jun 15, 2017 · 60 comments
Labels
FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@F21
Copy link

F21 commented Jun 15, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

1.9-beta1

What operating system and processor architecture are you using (go env)?

Windows 10 64-bit

set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Work
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config

It would be really useful to have a Length() method on sync.Map that can return the number of items in a map. Currently, I need to do something like this, which is quite tedious and not as readable:

length := 0

myMap.Range(func(_, _ interface{}) bool {
	length++
	
	return true
})
@randall77
Copy link
Contributor

@bcmills

@bcmills
Copy link
Contributor

bcmills commented Jun 15, 2017

What's the use-case?

@F21
Copy link
Author

F21 commented Jun 15, 2017

I am storing items filled by goroutines processing data into 2 separate maps. Once this is done, I need to compare the length of these 2 maps to perform a quick sanity check and decide on which branch to proceed for further processing.

Previously I was using github.com/orcaman/concurrent-map and there was a Count() method to do this.

@bradfitz bradfitz changed the title Length() for sync.Map sync: add Map.Length method? Jun 15, 2017
@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jun 15, 2017
@bradfitz bradfitz added this to the Go1.9Maybe milestone Jun 15, 2017
@bradfitz
Copy link
Contributor

This would normally be a Go 1.10 thing, but since this type is new in Go 1.9, I'll let @bcmills decide.

The implementation might be hairy enough to warrant Go 1.10 anyway, especially if the representation needs to change and other code gets modified.

@cespare
Copy link
Contributor

cespare commented Jun 15, 2017

Isn't the number you get out going to be a best-effort guess anyway? (Like len(channel)). Seems like an easy user workaround would be to maintain a parallel atomic int64 count.

(I don't know if that'd also be a reasonable internal implementation or not.)

@F21
Copy link
Author

F21 commented Jun 15, 2017

In my case, I am retrieving the length of the map after all goroutines have finished.

@bcmills
Copy link
Contributor

bcmills commented Jun 15, 2017

sync.Map is optimized for long-lived, mostly-write workloads, for which a Len method would either be misleading (under-counting keys) or inefficient (introducing cache contention on the length counter).

The Range workaround at least has the benefit of appearing to be as expensive as it actually is.

I'm not opposed to the idea of adding a Len method, but I think we would need to show that the use-case is common enough to merit an API with such subtleties. At the very least, I don't think we should add it in 1.9.

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe Jun 15, 2017
@rsc rsc changed the title sync: add Map.Length method? sync: add Map.Len method? Jun 26, 2017
@AlexStocks
Copy link

AlexStocks commented Jul 6, 2017

@bcmills Do not you think it's very rediculous that Go has a built in func len for map while sync.Map does not have a corresponding function func (Map)Len() int to get its size?

@bcmills
Copy link
Contributor

bcmills commented Jul 6, 2017

@AlexStocks No, I don't think it's "very [ridiculous]". Concurrent data structures are not the same as unsynchronized data structures. The built-in map type doesn't have (and doesn't need) a LoadOrStore, for example.

We should decide the API of each type based on its own tradeoffs. Consistency is a benefit, but there are costs to weigh it against.

@rfyiamcool
Copy link

I thank go1.9 sync.map length feature should be required. Of course, I'm just a suggestion.
don't range all entry, Extend a field as a atomic counter ..

@bcmills
Copy link
Contributor

bcmills commented Jul 16, 2017

@rfyiamcool

Extend a field as a atomic counter

Delete calls (and Store calls with previously-deleted keys) on disjoint keys in the read-only part of the map do not contend in the current implementation. An atomic counter would reintroduce contention for those calls.

We didn't omit Len just to be stubborn. It really is a subtle problem.

@bcmills
Copy link
Contributor

bcmills commented Jul 17, 2017

#21035 has more detail on some of the optimizations that might complicate an efficient implementation of Len.

@GoLangsam
Copy link

GoLangsam commented Jul 26, 2017

@bcmills May I suggest to have some remark in the source comments saying something like
"As of now, Len() is intentionally not implemented/provided due to ..." with a brief and concrete rationale.

This would ease understanding and adjust expectations of future users (who much more likely read source comments than old issues).

( And this should also be applied to other methods suggested elsewhere (such as UpdateOrStore) and currently not reasonably implementable. )

@maj-o
Copy link

maj-o commented Sep 7, 2017

I looked up the sourcecode of Range() and I think (knowing that the result could be wrong, see remark on Range() and above) this would be the solution for Len()

return len(read.m) instead of the for loop in line 328

@bcmills
Copy link
Contributor

bcmills commented Sep 7, 2017

@maj-o Promoting the read map makes Len an O(N) operation if interleaved with Store. By convention, Len methods on types in the Go standard library are O(1).

@ghost
Copy link

ghost commented Sep 15, 2017

I believe that having a .Count() or .Len() or even just adding a count integer would be a good addition for syncmap.Map. Personally I'm using sync.Map to track connections in an asynchronous p2p API that is heavily built around propagating network writes. The count is necessary for artificial connection limits and I had to switch from maps, because of raise conditions in goroutines.

Hope this shows some use-cases for you. I can think of various situations where this simple functionality would reduce code by many lines.

@cznic
Copy link
Contributor

cznic commented Sep 15, 2017

The count is necessary for artificial connection limits and I had to switch from maps, because of raise conditions in goroutines.

I think that does not justify adding a method to sync.Map, because it's easy to make your own derived type which just additionaly keeps the count/len information if you think such information is useful.

@maj-o
Copy link

maj-o commented Sep 16, 2017

If dirty has length oft N.
But 1sr : nobody is interested in dirty - if this blocks You, leave it out - O(1)
2nd : we can asume, that dirty is somehow const and small against the total amount - O(1) + const = O(1)
I know that in worst case (map is empty and dort is full) we could habe O(N) - this can be legt out or ignorred, because in real life it dies not matter. In real live the map is bigger then dirty and the amount of dirty is constant.. So, if it hurts You, comment the dirty loop out or leave it, because it does not matter in real live. And there it is O(1)

@bcmills
Copy link
Contributor

bcmills commented Sep 19, 2017

@maj-o

If dirty has length oft N.

That assumes that the internal representation of sync.Map always includes a single dirty map. Nothing in the API or documentation guarantees that to be the case, and in fact some optimizations (such as #21035) may require the opposite.

@dhui
Copy link

dhui commented Oct 4, 2017

My usecase for an O(1) Len() func is to pre-allocate a slice of data (an optimization) to hold a filtered subset of the the map values, populated by a call to Range(). FYI, my primary use for sync.Map is concurrent O(1) lookups.

So for my usecase, Len() doesn't need to be consistent since the worse case is an under-allocated slice (resulting in inconsistent performance) or an over-allocated slice (resulting in a bit more memory consumed).

@ghost
Copy link

ghost commented Oct 5, 2017

Hey guys,

I just want to clarify on something here. What would you say the scope of requirements adding such functionally be?

Like, to make a decision to add this utility function a part of the API, are you looking for a large quantity of developer need/want for the feature?

Or are we just aiming to decide how it should work?

I'm the former situation I believe that a count feature would just be expected by developers considering that the map type can be used in len()

In the latter situation if it's something that we all want added, but comes down to deciding how it should operate, I think that leads to two potential situations:

1 - a somewhat ballpark count of the map would be necessary for an application. This could be non blocking and the result just needs to be close to the current state.

2 - a blocking function that needs to be timed for an absolute accurate count at the given moment.

A utility function could be made for both scenarios.

I think it is best to make at least the boilerplate functionality for developers to help evade bad practices due to niavity when using the API. Best practices with concurrency is not obvious to new developers and devs trying to adopt go. I do foresee newbies at least reading a godoc and being able to choose based on need.

@jessicayuen
Copy link

jessicayuen commented Apr 28, 2020

I'd like to add another use case, and am seeking alternative suggestions. My use case is with testing. I recently moved from a conventional map + RWMutex to the sync.Map model due support for concurrent locks on a per-key basis. The value is of type chan. In various of my tests, I'd like to validate that the map is populated and cleaned up appropriately when server shutdown handlers are called. The specifics of the channel contents aren't important for my use case.

With maps I can easily check assert.Equal(t, 1, len(myMap)) before and assert.Equal(t, 0, len(myMap)) after, but it's not so simple with sync.Map.

I'm getting around this currently by wrapping the check in a function that iterates .Range().

@bcmills Has there been any reconsideration for adding native len or.Length() support?

@larytet
Copy link

larytet commented Apr 29, 2020

@jessicayuen , add an atomic counter after every add/delete. If the atomic is zero the map is empty. This approach has a benefit. You can add a simple performance monitor (see the idea https://github.com/larytet-go/accumulator) on top of the atomic and watch the load of the map in the real-time.

@bcmills
Copy link
Contributor

bcmills commented Apr 30, 2020

@jessicayuen, if the test fails, presumably you're going to need to know more about the failure mode than just the length of the map. (Which key or keys failed to be removed? Are the channels associated with those keys nil, or empty, or buffered with non-empty buffer contents?)

So you likely need that Range call to obtain enough information to diagnose the failure anyway, and given that, the Len method is superfluous.

@ocket8888
Copy link

ocket8888 commented Jul 22, 2020

My use-case for this, since one was asked for earlier, is for efficient allocation of space in a regular map when reading values out of the sync.Map. Operations are mostly write, but every once in a while (on request) I dump the contents to a regular map to serve out of an API. Currently it looks like:

mystuff := map[interface{}]interface{}{}
syncMap.Range(func (key, value interface{}) bool {
	mystuff[key] = value
	return true
})

and these maps get big, so that does a lot of reallocation. It'd be an improvement to be able to do:

mystuff := make(map[interface{}]interface{}, syncMap.Len())

even if that's misleading because other things are added before the Range because it'll still help avoid reallocs.

@larytet
Copy link

larytet commented Jul 22, 2020

and these maps get big, so that does a lot of reallocation. It'd be an improvement to be able to do:

mystuff := make(map[interface{}]interface{}, syncMap.Len())

even if that's misleading because other things are added before the Range because it'll still help avoid reallocs.

@ocket8888 ,
I think that in this specific situation it is better for the application to keep a counter, atomic or not, of calls to Put(). I would go even further suggesting to prepare the regular map "mystuff" ahead of time if the memory allows.
If the map is large a problem can have system wide impact. it makes sense to watch the map closely.

@ocket8888
Copy link

The actual sync.Map is a private member of a wrapping structure, and doesn't expose Put, only Store and LoadOrStore, but I suppose it would be simple enough to count those uses just the same.

Preparing the map ahead of time is out of the question for a litany of reasons, not the least of which being "I would have no idea how to even begin implementing that in this pile of fettucine code, and the very thought gives me shakes and dry heaves".

Thanks for the suggestion, I think I'll probably do that at some point.

@cristaloleg
Copy link

Operations are mostly write

@ocket8888 afaik sync.Map works better under ready-heavy workloads. It's also stated in the documentation https://pkg.go.dev/sync#Map Probably simple map with RWMutex might be better for your case.

@ocket8888
Copy link

From the sync.Map docs:

"The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys. In these two cases, use of a Map may significantly reduce lock contention compared to a Go map paired with a separate Mutex or RWMutex. "

My case is the second. There are many goroutines reading, writing, and especially overwriting entries for disjoint sets of keys. Actually, to be specific, there are many goroutines writing and overwriting, and only one reading.

@larytet
Copy link

larytet commented Sep 23, 2020

@ocket8888 May be a custom lock free hashtable makes sense in your case?
For example/inspiration check this https://github.com/larytet-go/hashtable
and https://github.com/larytet/mcachego

@ghost
Copy link

ghost commented Sep 26, 2020

Can someone clarify if the stance changed on this? Revisiting this after over a year, I think this entire issue needs to be closed. We're beating a dead horse that's going nowhere.

This became evident to me over time as I realized this feature doesn't matter. It's not a blocker for anything, coding around it is easy, and the only people wanting this are just asking for a convenience tool (I will admit that was the case that led me to being in this issue).

That leads me to the solution. The blocker has already been expressed in this issue of "who needs it?" The way I see it, if we really want it, we have to go make it, market it, and maybe one day we might be able to show the interest it takes to get it into the standard library.

Absent of that, this is a dead horse. Good luck comrades. I'm unsubscribing.

@pioz
Copy link

pioz commented Oct 1, 2020

What's the use-case?

Testing, I need to test if the elements on the map are exactly what I expect.

@davecheney
Copy link
Contributor

Then you’d want to iterate over the map to asset that it’s contents are identical.

@larytet
Copy link

larytet commented Nov 10, 2020

When I implement a counter of stored keys internally I occasionally do the following:

	if _, ok = m.MyLoad(key); !ok {
		// do my stuff 
		m.MyStore(key, value)
	}

In the call to MyStore() I increment the length counter m.len and call to sync.Map.Store(). In a multi-thread environment I need to call sync.Map.Lock() around the key existence test. Otherwise I would count calls to Map.Store() instead of the new keys.

Is there a chance to add a return value to the sync.Map.Store which returns Ok if the key did not exist in the map before the call?

@jeremycochoy
Copy link

jeremycochoy commented Jun 13, 2021

Just call it "BlockingLen" or "SlowLen" or "DontUseItLen" if you are worried users don't realize it is costly, but I don't think that hundreds (looks at the upvotes...) of peoples having to re-implement len by hand is very efficient...

@helphi
Copy link

helphi commented Nov 16, 2021

We waiting for it 4 years.

@cespare
Copy link
Contributor

cespare commented Nov 16, 2021

We should put this issue through the proposal process since it's an API change.

@ianlancetaylor
Copy link
Member

I think it's clear at this point that we are not going to add this method. In general making Len amortized constant time without adding cost to all users of sync.Map is complex. People who need Len can avoid the Store and Delete methods and can use simple wrappers around the LoadOrStore and LoadOrDelete methods that maintain their own count.

I'm closing this issue.

@mitar
Copy link
Contributor

mitar commented Feb 6, 2022

What about providing this simple wrapper in stdlib? sync.MapWithLen? It could even not have Store and Delete methods.

@fy0

This comment was marked as duplicate.

@mkeen

This comment was marked as duplicate.

@golang golang locked as resolved and limited conversation to collaborators May 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests