Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

net.listen: port 0 in cluster #3324

Closed
mafintosh opened this issue May 26, 2012 · 23 comments
Closed

net.listen: port 0 in cluster #3324

mafintosh opened this issue May 26, 2012 · 23 comments
Labels

Comments

@mafintosh
Copy link
Member

According to the docs listening to port 0 on a net.Server will assign it a random port.
This works well for the following program (using node 0.6.18):

var net = require('net');
var s = net.createServer();
s.listen(0, function() {
   console.log(s.address().port); // will print a random port
});
var s1 = net.createServer();
s1.listen(0, function() {
   console.log(s1.address().port); // will print another random port
});

If we however run the above program in a cluster worker the servers get assigned the same port

var net = require('net');
var cluster = require('cluster');

if (cluster.isMaster) {
   cluster.fork();
   return;
}

var s = net.createServer();
s.listen(0, function() {
   console.log(s.address().port); // will print a random port
});
var s1 = net.createServer();
s1.listen(0, function() {
   console.log(s1.address().port); // will print the SAME port as above :(
});

Which seems very much like a bug caused by the cluster workers sharing port 0 like they would for any normal port.

AndreasMadsen added a commit to AndreasMadsen/node-v0.x-archive that referenced this issue May 26, 2012
AndreasMadsen added a commit to AndreasMadsen/node-v0.x-archive that referenced this issue May 26, 2012
Fix issue: nodejs#3324
Fix issue: nodejs#3325

And renames the addressType property in cluster listening event argument
to family
@bnoordhuis
Copy link
Member

That's the expected behavior as far as I'm concerned: the express purpose of the cluster module is to share a bound socket across processes. If you don't want that, don't use cluster.

If the documentation is unclear / non-obvious on that point, let me know and I'll update it.

@AndreasMadsen
Copy link
Member

@bnoordhuis I don't object to your argument, I actually tend to agree with you. However there are cases where listening to a random port is preferred.

One case is where big state objects are being send to the worker, since process.send is sync that is not the preferred way, next is UDP but that is not very reliable when sending big messages and at last there is TCP. You could of couse listen to a specified port depending on the workerID but listening to a random port may actually be the preferred choice, if the logic is decoupled intro a submodule.

@mafintosh
Copy link
Member Author

@bnoordhuis So there will be no way that 2 servers can listen to a random port if you are using cluster?

@bnoordhuis
Copy link
Member

So there will be no way that 2 servers can listen to a random port if you are using cluster?

Not for now. The current behavior of binding to port 0 won't change but maybe we can add a 'random and I really mean it' option someday.

@isaacs
Copy link

isaacs commented Jun 5, 2012

@bnoordhuis ++

When you listen in a cluster, you get the same port as any other cluster member listening with the same arguments. That's by design.

If you are just using cluster so that you can have the child.send() IPC stuff, then you can do that already with child_process.fork().

@AndreasMadsen
Copy link
Member

@isaacs I agree the current design is the prefered design. However in some situations process.send() is totally useless because it blocks the sender. Fixing that is of course the prefered solution, but when I write at #libuv I get the impression that this is very low priority and may never happen. So I seek other solutions and "fixing this issue" is one of them.

@supershabam
Copy link

I agree that this is unexpected behavior with clustering. I really need at least a 'random and I really mean it' flag.

My main concern is the source of this unexpected behavior. Look at the cluster.js code:

cluster._getServer = function(tcpSelf, address, port, addressType, cb) {

  ...

  // Store tcp instance for later use
  var key = [address, port, addressType].join(':');
  serverListeners[key] = tcpSelf;

  ...

We store the tcp instance for later use so that we can essentially bind on the same address for all workers in the cluster.

But this key we create is literally storing port "0" as the port we are listening on, which is false. We open a connection on a real port. It is merely convention that we use 0 to mean assign me a random port. The connection should be opened, and then this key should be computed with the actual port being used.

The side-effect of this key is two-fold: (1) when we listen to port 0 every worker node uses the same port/connection, and (2) after listening to port 0, there is actually a connection on some real port X that if a worker node out of some small chance tries to listen to will fail.

I suggest that this is a real bug, even if you do intend for workers listening on port 0 to unify themselves on the same port.

As cluster begins to be used more and more, this real bug will pop up more and more. The way it is currently implemented I can only ever have one server listening on a random port (in my ENTIRE project including all my dependencies). If I include a single dependency that listens on a random port, I will be unable to listen on a random port myself. Not a bug?

@bnoordhuis
Copy link
Member

(1) when we listen to port 0 every worker node uses the same port/connection

That's intentional.

(2) after listening to port 0, there is actually a connection on some real port X that if a worker node out of some small chance tries to listen to will fail.

That won't happen. Port 0 tells the operating system "give me a random port that's free".

@supershabam
Copy link

Sorry, let me rephrase (2) because I'm not debating your response at all, and that wasn't what I was trying to say.

Here's a flow of events.

  • Worker A requests new port 0 connection
  • OS gives Worker A a connection on port 9001
  • Worker B requests new port 0 connection
  • Cluster sees we already have a "port 0" connection and returns that (which is actually listening on port 9001)
  • Worker C requests new port 9001 connection
  • Cluster SHOULD return connection on port 9001, but it recognized this connection as "port 0" connection. So instead, cluster tries to create a new connection on 9001, but the address is already in use.
  • Worker C's receives error trying to connect on 9001

The problem I'm saying is that listening on port 0 consumes a real port number, let's call X. As a member of a cluster, I should be able to connect and listen on port X (because I'm in a cluster), but this causes an error.

@bnoordhuis
Copy link
Member

Can you turn that into a code snippet? I'm not sure I follow your logic.

@AndreasMadsen
Copy link
Member

Cluster SHOULD return connection on port 9001, but it recognized this connection as "port 0" connection.

I assume you mean

Cluster SHOULD return connection on port 9001, but it didn't recognize this connection as "port 0" connection.

@supershabam one of the good things about cluster is that you with little or none abstraction can execute the worker file outside a cluster environment. This makes it very easy to debug your application without the complexity of having an master.

So this is how I imagine your application:

var net = require('net'):

var random = net.createServer().listen(0, function () {
  var addr = random.address();

  // of course you won't use addr but just listen on `9001`
  // and this time you where unfortunate and addr.port was also `9001`.
  // but for the sake of testing this is will give a consistent result:
  var server = net.createServer().listen(addr.port, addr.address);

  // and then an error will throw because the address is already in use
});

So instead of doing that you will simply allocate the 9001 port before creating an connection to an random port:

var net = require('net'):

var server = net.createServer().listen(9001, function () {
  var server = net.createServer().listen(0);
});

Above will work cluster or not. I don't say that this issue isn't an issue, but if I interpret your argument correctly I would say that it is invalid, since the exact same behaviour can happen in an none-cluster worker.

@isaacs
Copy link

isaacs commented Jun 14, 2012

For v0.8, we're just going to document the edge case, and leave it at that.

For 0.9, I'm not sure whether listen(0) should even be supported in cluster, but certainly, it'd be nice for cluster to have a "listen on my own, don't ask the parent" flag.

You should structure your app accordingly. Note that each worker is given a unique numeric identifier, so you can always do something like this:

server.listen(9000 + cluster.worker.id);

I'm sorry, but I have to agree with @bnoordhuis's original point, and say no to this. There's been a fair amount of debate, and some reasonable points made. But, the bottom line is that there are no plans to change the behavior of listen(0), inside of a cluster worker or in the master process.

Let's discuss other things. This one isn't happening.

Just to reiterate, some form of listen(...., noSeriouslyListenHereDontAskMaster) API would be nice for v0.9, and would give you what you want (without changing cluster's semantics in any other ways), but is off the table for v0.8.

@supershabam
Copy link

@isaacs Thanks.

Just for a heads up for your 0.9 considerations. I am writing a helper module (malone) that makes use of listen(0). It works fine as long as the person using malone is not doing so from within a cluster worker.

When somebody uses my module in a cluster worker, strange bugs come out because of this listen(0) business. I'd love to see a flag added. I'd hate to see listen(0) go away (though I can't use it right now anyways when clustered).

My current workaround is to do

// if clustered
listen(Math.floor(Math.random() * 40000) + 20000); 

// otherwise listen(0);

It's not great. I hate embedding that slight chance of certain failure into my module, but I can't rely on using listen(0) when I'm trying to support both clustered and unclustered environments.

Also, since I'm developing this module as a helper/dependency, it would be irresponsible of me to use listen(0) even if I intend to have all workers listen on the same port because I'm just a dependency. As a dependency, I can't be claiming the listen(0) address for the entire program. Listen(0) essentially becomes a hidden global variable in cluster, and as a dependency, I shouldn't be setting any globals that could affect the rest of the program.

@bnoordhuis @AndreasMadsen thanks for your input too.

@isaacs
Copy link

isaacs commented Jun 14, 2012

@supershabam What about this?

var port = 0;
var cluster = require('cluster');
if (cluster.isWorker) {
  port = 20000 + cluster.worker.id;
}

@supershabam
Copy link

@isaacs I like that. It's an improvement as it removes the chance of having two workers choose the same port.

It still has a chance of trying to allocate a port that is currently used by the OS. AFAIK, listen(0) is the only sure way to get an available port with no random chance of failure (aside from running out of ports).

@supershabam
Copy link

For anybody still interested in this issue, I created a drop-in replacement for the net module called net-cluster.

https://npmjs.org/package/net-cluster

This module operates the same as net, except that when listening on port 0 as a worker you will always listen on a random port instead of sharing the same connection.

@starlogik
Copy link

is it possible for workers to bind to the same UDP port using cluster, similarly to sharing TCP sockets?

in all my tests, if I bind each worker to the same UDP port, only the last worker process to bind actually receives any packets sent to that port.

@bnoordhuis
Copy link
Member

is it possible for workers to bind to the same UDP port using cluster, similarly to sharing TCP sockets?

Not right now. It requires an overhaul in libuv and is kind of awkward to implement on Windows. It's on the TODO list though.

@starlogik
Copy link

what would it take to bubble it up to top of the list?
how much work would the overhaul entail?

@bnoordhuis
Copy link
Member

what would it take to bubble it up to top of the list?

I'm amenable to bribery, preferably beer.

how much work would the overhaul entail?

It's a couple of hours of work but it requires an API change in libuv. That needs some thought if we don't want to have to do it again in six months time.

@piscisaureus
Copy link

what would it take to bubble it up to top of the list?

I'm amenable to bribery, preferably beer.

Don't bribe Ben, it buys you nothing. Bribe me instead.

@AndreasMadsen
Copy link
Member

UDP feature request is at #2194

@starlogik
Copy link

@bnoordhuis & @piscisaureus, beer works ... send address!

or better yet, do you have an amazon wishlist and can we have feature by weekend :)

shigeki pushed a commit to shigeki/node-v0.x-archive that referenced this issue Oct 16, 2015
Now that we have backticks we no longer need to use util.format
to template strings!

This commit was inspired by nodejs#3324, and it replaces instances of
util.format with backtick strings in a number of tests

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: nodejs/node#3359
richardlau pushed a commit to ibmruntimes/node that referenced this issue Nov 5, 2015
Now that we have backticks we no longer need to use util.format
to template strings!

This commit was inspired by nodejs#3324, and it replaces instances of
util.format with backtick strings in a number of tests

Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: nodejs/node#3359
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

6 participants