It eats tremendous amount of memory on high load #134
ReadMessage is a helper method that gets a reader using NextReader and reads from that reader into a buffer. Call NextReader directly and pass the preallocated buffer to the Read method.
Thank you. This way reduces memory usage dramatically. Sorry for not finding it myself.

ROUTINE ======================== github.com/gorilla/websocket.(*Conn).NextReader in C:/Users/roman/gocode/src/github.com/gorilla/websocket/conn.go

It would be great to introduce some way to avoid instantiating a messageReader and messageWriter per request, e.g. let the caller pass some kind of factory that creates messageReader and messageWriter instances. That way I could use pre-allocated messageReader and messageWriter instances and improve it even further.
Or just use sync.Pool to hold messageWriter and messageReader instances?
CL 8b209f6 reduces the amount of memory allocated by NextReader and NextWriter. Readers and writers are discarded to prevent reads past the end of a message and writes after a message is closed. A pool of two readers and two writers per connection is probably sufficient to prevent these errors.
Much better now.

ROUTINE ======================== vendor/github.com/gorilla/websocket.(*Conn).NextWriter in C:/Projects/spacewars.io/spacewar.go/src/vendor/github.com/gorilla/websocket/conn.go

It is already acceptable for my goals. Thank you.
A two-element pool per type per connection is better than a sync.Pool because
I agree. Do you mean a pool of two messageWriter{} elements?
Oops. Thinking about it now, you mean not calling NextReader and NextWriter on each read and write. Instead, create two readers and two writers and switch between them, so each odd write uses the first writer and each even write uses the second. Is my understanding correct? If so, I will run the test.
The proposal is that NextWriter alternates between two *messageWriter values and NextReader alternates between two *messageReader values. I prefer not to do this because a stale io.WriteCloser or io.Reader returned by these APIs can become active again. Since the memory use is acceptable for your use case, let's leave the code as is.
Ok. Thank you.
Hi, I am relatively new to developing in Go, and I would like to know if you can explain a bit more how to make better use of memory with gorilla, since I am developing a websocket server and I see high memory consumption when the server has >300K concurrent connections. Maybe with a code example? Thanks.
@mariano - are you able to share:
1. a memory profile for your application under typical load?
2. detail on how you are measuring "high memory consumption" - e.g. what metrics are you looking at? Are you looking at VM size, RSS, actual memory in use, etc.?
3. what your application does on a per-connection basis, and how you have narrowed it down to gorilla/websocket (also see 1.)
Hi @elithrar, sorry for the delay; as I said, I'm new to Go. Info related to your questions:
As you can see here, Go handles 200K concurrent connections (connected=#) and the writes show RPS. As I said before, I'm new to Go, and my question is not about which language to choose; it is about what I can do better to improve my code and get an accurate picture of the power of Go. In the image another message is printed:
This print shows when the server is stressed. Another point to understand is unbounded concurrency and how it relates to my test. For example, in NodeJS I use a batch to handle N requests from clients and achieve better results. Thanks.
@marianogenovese The Upgrade method allocated 1790 bytes per connection. That seems reasonable.
Hi @garyburd, I have a .NET Core client to generate load against my server. I have increased the connection timeout on my client and thus have fewer 1006 errors. This works, but CPU usage is still high: more connections, more CPU, more contention.
In previous comments my gorilla webserver created a goroutine per connection; I have changed my server to serve without the extra goroutine, like this:
Memory now with 250K connections increased compared with the previous profiles:
-alloc_objects
The first example starts a new goroutine and returns from the goroutine started by the HTTP server. This allows GC of per-connection data created by the HTTP server. The second example executes on the goroutine started by the HTTP server. There is one goroutine serving the connection in both examples.
Ok, I understand, but I still do not understand the difference between the profiles.
@marianogenovese What are the specific differences you are concerned about? |
Hello,
I tried to use it for a real-time MMO game and ran into a problem with the garbage collector working constantly.
With 1000 clients, each sending and receiving 100 messages per second, the library generates ~10 GB of memory in 2 minutes.
In contrast, Google's x/net/websocket generates about 10 times less work for the GC.
Here are the first lines from the profiler, gathered during 2 minutes of work:

```
(pprof) top10
12.02GB of 12.47GB total (96.38%)
Dropped 63 nodes (cum <= 0.06GB)
Showing top 10 nodes out of 35 (cum >= 0.10GB)
      flat  flat%   sum%        cum   cum%
    6.73GB 53.94% 53.94%     6.73GB 53.94%  bytes.makeSlice
    2.30GB 18.45% 72.38%     9.02GB 72.38%  io/ioutil.readAll
```
Here is the place where the memory usage gets generated:


It could easily be fixed if you allowed passing pre-allocated buffers as arguments to readMessage instead of allocating one per frame.
I am ready to provide any further information if you are interested.