Memory leak in REP socket handling #2567

Closed
dex6 opened this issue May 9, 2017 · 5 comments


dex6 commented May 9, 2017

The problem occurs in a REP server when a REQ client disconnects before the response is sent. According to valgrind, about 500 bytes leak per response on my test setup. I discovered the problem in ZMQ 4.2.1, then confirmed it in master and 4.1.6 as well.
The app I'm working on is a simple REQ/REP server running on Linux and using the tcp/ipc protocols. Because of external hardware problems, it occasionally needs a significant amount of time to process a request. During that time our client times out, which leaves the server with a small but growing leak.

I've traced the problem to a partial message being left in a pipe:

  1. When a request is received, rep_t::xrecv() calls router_t::xsend() to copy all the labels into the response message.
  2. The rest of the message is passed to the user's app, which works hard to prepare the response.
  3. In the meantime, the client goes away.
  4. The response is sent using rep_t::xsend(), which calls router_t::xsend(). That call correctly notices that the pipe has gone away and frees the response (thanks to the fix for #1313, "Memory leak on REP socket server when the REQ client disappears"). However, the labels already put into the pipe in step 1 are left there and are never freed.
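
The four steps can be walked through with a small toy model. This is only a sketch for illustration: `ToyPipe`, `frames_stranded_after_disconnect` and the frame contents are hypothetical names made up here, not libzmq classes or real label data, and the model ignores everything about the real pipe except "written but never removed".

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Toy stand-in for one outbound pipe. Not a libzmq class.
struct ToyPipe {
    std::deque<std::string> unflushed;  // written but not yet visible to the peer
    bool alive = true;                  // false once the peer has disconnected

    void write(const std::string &frame) {
        assert(alive);  // callers are expected to check 'alive' first
        unflushed.push_back(frame);
    }
};

// Walks through steps 1-4 and returns how many frames are left stranded.
std::size_t frames_stranded_after_disconnect() {
    ToyPipe pipe;

    // Step 1: the receive path copies the labels into the reply pipe.
    pipe.write("label-1");  // placeholder frame contents
    pipe.write("label-2");

    // Step 2: the application is busy preparing the response.
    // Step 3: the client disconnects in the meantime.
    pipe.alive = false;

    // Step 4: the send path sees the dead pipe and drops the response body,
    // but nothing removes the labels queued in step 1.
    std::string response = "response";
    if (pipe.alive) {
        pipe.write(response);
    } else {
        response.clear();
    }
    return pipe.unflushed.size();
}
```

In this model the function returns a non-zero count on every request whose client has already gone away, which matches the "leak per response" pattern seen under valgrind.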

I suppose session_base_t::clean_pipes() should remove them in step 3, but for some reason that does not happen. Unfortunately, I don't have enough time to get familiar with the libzmq code, so I cannot come up with a patch myself, but I would gratefully test one.

Please see the attached valgrind output and the minimal server and client test apps for reproducing the issue (my real server and client use czmq 4 and pyzmq 14.x, but that should not matter much). The sample server has its "processing" time (just a sleep) hardcoded to 1s, and the sample client has RCVTIMEO set to 0.5s. With these values I get the leak on every request. When the receive timeout is raised enough that the client reads the response, the problem stops occurring (obviously).

libzmq_valgrind.txt
testapps.zip

bluca added a commit to bluca/libzmq that referenced this issue May 10, 2017
Solution: roll back the pipe if writing messages other than the
first fails in router::xsend.
Also add test case that reproduces the memory leak when ran with
valgrind.
Fixes zeromq#2567
Member

bluca commented May 10, 2017

This is a very good analysis, thank you. I've managed to create a simple, self-contained test case that reproduces the problem both locally and on Travis.

I think the solution is to call rollback () on current_out. Once the CI is green I'll open a PR.
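
As a rough illustration of the idea (not the actual patch), rolling back means discarding every frame that was written to the outbound pipe but never flushed. The sketch below uses made-up names (`Frame`, `ToyOutPipe`, `send_reply`) and a plain vector in place of the real pipe, so treat it as a model of the technique under those assumptions only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical frame type: 'more' is true when further frames of the
// same message are still to come.
struct Frame {
    std::string data;
    bool more;
};

// Toy outbound pipe. Not libzmq's pipe_t.
class ToyOutPipe {
public:
    void write(const Frame &f) { pending_.push_back(f); }

    // Publish everything written so far to the reader side.
    void flush() {
        delivered_.insert(delivered_.end(), pending_.begin(), pending_.end());
        pending_.clear();
    }

    // Discard the frames of an unfinished message; returns how many were dropped.
    std::size_t rollback() {
        std::size_t dropped = 0;
        while (!pending_.empty()) {
            // An unfinished message can only consist of frames flagged 'more'.
            assert(pending_.back().more);
            pending_.pop_back();
            ++dropped;
        }
        return dropped;
    }

    std::size_t pending() const { return pending_.size(); }
    std::size_t delivered() const { return delivered_.size(); }

private:
    std::vector<Frame> pending_;
    std::vector<Frame> delivered_;
};

// Hypothetical send path: returns false when the reply had to be dropped.
bool send_reply(ToyOutPipe &pipe, bool peer_alive, const std::string &body) {
    if (!peer_alive) {
        pipe.rollback();  // also frees labels queued earlier by the receive path
        return false;
    }
    pipe.write({body, false});
    pipe.flush();
    return true;
}
```

With a dead peer, a label queued earlier is dropped together with the reply and nothing stays pending. With a live peer, the label and the body are delivered together.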

Author

dex6 commented May 11, 2017

@bluca Thanks for preparing the patch. I'll merge it into my company's libzmq fork and let it run over the next weekend. I'll keep you posted here.

BTW, are there any plans to release 4.2.3 in the near future?

Member

bluca commented May 11, 2017

Yes, it's almost time, but there are a couple of things that need fixing first (cmake and IPv6 related).

Author

dex6 commented May 15, 2017

The patch seems to be working for me. You may close the issue once PR #2572 is merged.

Thanks for the help!

Member

bluca commented May 15, 2017

Great, thanks for confirming.

bluca reopened this May 16, 2017
bluca added a commit to bluca/libzmq that referenced this issue May 17, 2017
Solution: roll back the pipe if writing messages other than the
first fails in router::xsend. Roll it back also when the pipe is
terminating.
Also add test case that reproduces the memory leak when ran with
valgrind.
Fixes zeromq#2567