Subscribers losing messages via xpub/xsub proxy #3214

Closed
thecodefactory opened this issue Aug 13, 2018 · 7 comments

Comments

@thecodefactory

thecodefactory commented Aug 13, 2018

Issue description

Apologies in advance if this has already been answered; I searched through the issues and didn't find it. The zmq issue we're seeing is described almost exactly by someone else here (with sample code included).

https://stackoverflow.com/questions/43129714/zeromq-xpub-xsub-serious-flaw

Our specific setup is this:

We have an external-facing service endpoint (xpub) and an internal persistent xsub connected to it, and we proxy between the two. An undetermined amount of time later, an event is fired and a new pub socket connects to the xsub and sends a single message. Each send on the event socket completes without error, but, judging by wire traffic, the external subscriber (connected to the xpub) does not receive all messages.

The workaround mentioned on Stack Overflow does appear to work for our case, but before adopting it I wanted to know: 1) whether this is actually an issue in zmq (i.e. the use case should be properly supported), 2) whether the issue has been fixed in zmq, and 3) what we should expect from using the workaround, other than the fact that push/pull can queue messages internally instead of dropping them. Are there other negative consequences of the push/pull replacements?

Environment

  • libzmq version (commit hash if unreleased): 4.2.0

  • OS: GNU/Linux

Minimal test code / Steps to reproduce the issue

  1. See link above, code included.

What's the actual result? (include assertion message & call stack if applicable)

The external subscriber(s) do not receive all messages published.

What's the expected result?

The external subscriber(s) should receive all messages.

@bluca
Member

bluca commented Aug 19, 2018

This is just a variation of #2267; see that issue for details.

@bluca bluca closed this as completed Aug 19, 2018
@thecodefactory
Author

@bluca Thanks for the response! To be clear, I was asking about the xsub/xpub case where information sharing could/should take place. Are you saying that this is no different from the sub/pub scenarios described in the linked issue, and that in short it's not a supported use case (i.e. the xsub/xpub proxy diagram in the documentation is incorrect)?

@bluca
Member

bluca commented Aug 20, 2018

It's not that it's unsupported; the issue described in that other ticket can happen when a sub socket binds and a pub connects. Because a connect is asynchronous, you carry on and write the message before the sub has sent its subscriptions. As far as the publisher is concerned there are no subscribers at that point, so the messages are dropped and you see nothing. That's why a sleep (or any other activity) makes it work.

You could use XPUB so that you know for a fact that subscribers are there, since you will receive the subscription message in the application.
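
The race described here can be sketched with a toy model in plain Python. This is not the libzmq API: `ToyPublisher` and both helper functions are hypothetical names invented for illustration, and the model only assumes what the comment states, namely that a publisher drops messages for which it has not yet seen a matching subscription and that the subscription arrives some time after the connect returns.

```python
class ToyPublisher:
    """Toy model of publisher-side filtering (hypothetical, not the libzmq API)."""

    def __init__(self):
        self.subscriptions = set()  # topic prefixes the peer has announced so far
        self.delivered = []         # messages that would go out on the wire
        self.dropped = []           # messages discarded for lack of a subscriber

    def on_subscription(self, prefix):
        # Stands in for the subscription arriving asynchronously after connect.
        self.subscriptions.add(prefix)

    def has_subscriber(self, topic):
        return any(topic.startswith(p) for p in self.subscriptions)

    def send(self, topic, payload):
        # Sending never reports an error; unmatched messages are silently dropped.
        if self.has_subscriber(topic):
            self.delivered.append((topic, payload))
        else:
            self.dropped.append((topic, payload))


def fire_event_immediately():
    """Send right after connecting, before the subscription has arrived."""
    pub = ToyPublisher()
    pub.send("event", "payload")  # no subscriber known yet: dropped
    pub.on_subscription("event")  # subscription shows up too late
    return pub


def fire_event_after_subscription():
    """Wait until the subscription has been seen, then send."""
    pub = ToyPublisher()
    pub.on_subscription("event")  # with XPUB the application can observe this
    pub.send("event", "payload")
    return pub
```

In the first helper the message ends up in `dropped` even though no send failed, which matches the symptom in the original report. The second helper mirrors the XPUB suggestion: the application holds its message until the subscription has been received.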

@thecodefactory
Author

@bluca Thank you, I understand this issue better now.

@thecodefactory
Author

@bluca Lastly, are you able to comment on the trade-offs with the push/pull workaround (described in the link in the original post)? Is this a reasonable model, or does it have significant downsides that aren't apparent at first?

@bluca
Member

bluca commented Aug 20, 2018

It's probably a better fit for that model. Just keep in mind the difference in behaviour between pub and push in the mute state (drop vs block). You can read about that in the zmq_socket manpage.
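
The drop-versus-block difference can be illustrated with a bounded queue from the Python standard library. This is only an analogy under stated assumptions: the queue stands in for a socket's outgoing buffer, and `HWM`, `pub_style_send`, `push_style_send` and `demo` are hypothetical names, not libzmq calls.

```python
import queue

HWM = 2  # hypothetical high-water mark, kept tiny on purpose


def pub_style_send(q, msg):
    """Drop the message when the queue is full, as pub does in the mute state."""
    try:
        q.put_nowait(msg)
        return True
    except queue.Full:
        return False  # message is lost; the caller is not held up


def push_style_send(q, msg, timeout):
    """Wait for room when the queue is full, as push does in the mute state."""
    try:
        q.put(msg, timeout=timeout)
        return True
    except queue.Full:
        return False  # gave up after the timeout; a fully blocking send would keep waiting


def demo():
    """Send three messages into a queue of size HWM with nobody reading."""
    dropped_q = queue.Queue(maxsize=HWM)
    pub_results = [pub_style_send(dropped_q, n) for n in range(3)]

    blocked_q = queue.Queue(maxsize=HWM)
    push_results = [push_style_send(blocked_q, n, timeout=0.05) for n in range(3)]
    return pub_results, push_results
```

In both cases the third send fails, but the pub-style sender returns at once and the message is gone, while the push-style sender stalls until the timeout expires. With a slow or absent reader, the push/pull workaround therefore trades lost messages for a sender that can stall.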

@thecodefactory
Author

@bluca Ok, I'm aware of that issue (block in push/pull vs drop in pub/sub). Thanks again, appreciated! I just wanted to make sure it's not an unsupported model that just appears to work in this case.
