This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Problem with cluster and passing connections to workers #7784

Closed
elad opened this issue Jun 14, 2014 · 10 comments

elad commented Jun 14, 2014

Hello,

I described this issue elsewhere (indutny/sticky-session#9), but since I can reproduce it on Ubuntu and Mac OS X, in multiple environments and without sticky-session, I'm posting it here for review.

dummy.js:

var cluster = require('cluster'),
    net = require('net'),
    http = require('http');

if (cluster.isMaster) {
    var worker = cluster.fork();

    net.createServer(function(c) {
        worker.send('conn', c);
    }).listen(3000);
} else {
    var server = http.createServer(function(req, res) {
        res.writeHead(200, { 'Content-Type': 'text/plain' });
        res.end('okay');
    }).listen(0, 'localhost');

    process.on('message', function(msg, c) {
        if (msg !== 'conn') {
            return;
        }

        server.emit('connection', c);
    });
}

stress_dummy.js:

var async = require('async'),
    request = require('request'),
    moment = require('moment');

function dummy_request(callback) {
    request.get({
        url: 'http://localhost:3000',
        json: true
    }, function(err, res, body) {
        callback(err);
    });
}

function run(n) {
    var mstart = moment();
    async.times(n, function(i, callback) {
        var mstart2 = moment();
        async.parallel({
            dummy1: dummy_request,
            dummy2: dummy_request,
            dummy3: dummy_request,
            dummy4: dummy_request,
            dummy5: dummy_request,
        }, function(err, results) {
            var mend2 = moment();
            console.log(i, 'done, err:', err, 'time elapsed (ms):', mend2.diff(mstart2));
            callback();
        });
    }, function(err) {
        var mend = moment();
        console.log('time elapsed (ms):', mend.diff(mstart));
    });
}

var n = parseInt(process.argv[2], 10) || 10;
console.log('running', n, 'times');
run(n);

Install required modules:

$ npm install async request moment

Then run:

Terminal 1:

$ node dummy

Terminal 2:

$ node stress_dummy

Very often, stress_dummy hangs before all ten batches of five requests have received their responses.

Thoughts?

cjihrig commented Jun 14, 2014

Is there a specific reason why you need to pass around connections in that way? I was able to run your stress test against the following code with no problems:

var http = require('http');
var cluster = require('cluster');

if (cluster.isMaster) {
  cluster.fork();
} else {
  http.createServer(function(req, res) {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('okay');
  }).listen(3000);
}

dashed commented Jun 14, 2014

Can sorta confirm:

$ node stress_dummy.js 
running 10 times
1 'done, err:' undefined 'time elapsed (ms):' 59
2 'done, err:' undefined 'time elapsed (ms):' 63
3 'done, err:' undefined 'time elapsed (ms):' 66
4 'done, err:' undefined 'time elapsed (ms):' 69
5 'done, err:' undefined 'time elapsed (ms):' 76
6 'done, err:' undefined 'time elapsed (ms):' 84
7 'done, err:' undefined 'time elapsed (ms):' 93
8 'done, err:' undefined 'time elapsed (ms):' 100
9 'done, err:' undefined 'time elapsed (ms):' 105

// ctrl-c here

$ node stress_dummy.js 
running 10 times
0 'done, err:' undefined 'time elapsed (ms):' 60
1 'done, err:' undefined 'time elapsed (ms):' 56
2 'done, err:' undefined 'time elapsed (ms):' 59
3 'done, err:' undefined 'time elapsed (ms):' 60
4 'done, err:' undefined 'time elapsed (ms):' 62
5 'done, err:' undefined 'time elapsed (ms):' 65
6 'done, err:' undefined 'time elapsed (ms):' 66
7 'done, err:' undefined 'time elapsed (ms):' 68
8 'done, err:' undefined 'time elapsed (ms):' 70
9 'done, err:' undefined 'time elapsed (ms):' 72
time elapsed (ms): 90

elad commented Jun 14, 2014

@cjihrig - sorry for not providing more background. I came across this behavior when modeling a server after indutny's sticky-session. In a nutshell, Socket.IO's handshake requires multiple connections, and if we leave the distribution to the operating system, connections belonging to a single handshake may end up in different workers. This of course makes Socket.IO unreliable with node cluster, so the solution was to accept connections in the master and use file descriptor passing to hand them to workers according to source address.

In other words, while the program you posted works for this particular test case, it doesn't generalize and fails when you try to use the same pattern with a Socket.IO server. :/
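
The routing by source address described above could be sketched roughly as follows. This is only an illustration, not sticky-session's actual code: the names `workerIndexFor` and `routeConnection` are hypothetical, and the hash is an arbitrary deterministic choice.

```javascript
// Hypothetical helper: map a client address to a stable worker index so that
// every connection from the same address lands on the same worker.
function workerIndexFor(remoteAddress, workerCount) {
  let hash = 0;
  for (let i = 0; i < remoteAddress.length; i++) {
    // Simple 32-bit string hash; any deterministic hash would do here.
    hash = (hash * 31 + remoteAddress.charCodeAt(i)) >>> 0;
  }
  return hash % workerCount;
}

// Hypothetical routing step for the master's connection callback.
// `workers` is an array of objects with send(message, handle),
// for example the values returned by cluster.fork().
function routeConnection(workers, socket) {
  const index = workerIndexFor(socket.remoteAddress || '', workers.length);
  workers[index].send('conn', socket);
  return index;
}
```

In the master, `routeConnection(workers, c)` would take the place of the single `worker.send('conn', c)` call in dummy.js. The sketch covers only the routing decision and does nothing about the hang reported in this issue.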

dashed commented Jun 14, 2014

@elad Maybe this is the case? http://markdawson.tumblr.com/post/17525116003/node

elad commented Jun 14, 2014

I'm not sure for several reasons:

  • I tried bumping http.globalAgent.maxSockets and passing as pool an http.Agent instance with its maxSockets member set to 10, and neither improved the behavior.
  • This behavior is client independent, because it originally happened from a browser app. I just isolated it to stress_dummy.js to make testing easier.
  • It happens whether 5 or 10 requests are sent to the server, and usually for the first one. Whether the requests are indexed 0-4 or 0-9, request 0 hangs while the others (1-4, 1-9) succeed, so the number of requests doesn't seem to matter much.

So it seems like this is something server-side. But I have no idea, hence posting here. :)
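
For reference, the two client-side settings mentioned in the first bullet might look like this with only the core `http` module. It is a sketch under assumptions: `dummyRequest` is a hypothetical stand-in for `dummy_request` in stress_dummy.js, and with the `request` module the agent would be passed as its `pool` option instead of `agent`.

```javascript
const http = require('http');

// Option 1: raise the socket limit on the shared default agent.
http.globalAgent.maxSockets = 10;

// Option 2: use a dedicated agent with its own limit.
const agent = new http.Agent({ maxSockets: 10 });

// Hypothetical stand-in for dummy_request, routed through the dedicated agent.
function dummyRequest(callback) {
  const req = http.get({ host: 'localhost', port: 3000, agent: agent }, (res) => {
    res.resume(); // drain the body so the 'end' event fires
    res.on('end', () => callback(null, res.statusCode));
  });
  req.on('error', callback);
}
```

As noted above, neither setting changed the behavior, which is consistent with the cause being on the server side.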

@mathiask88

Same problem here. I wanted to implement sticky sessions like the sticky-session module from @indutny mentioned in the first post, but after the connection event fires, the request event never does. With 0.8.x the sticky-session module works like a charm, but not with 0.10.x. So how can I pass a socket to a worker's http server?

trevnorris pushed a commit that referenced this issue Oct 27, 2014
Currently when a server receives a new connection the underlying socket
handle begins reading data immediately. This causes problems when
sockets are passed between processes, as data can be read by the first
process and thus never read by the second process.

This commit allows sockets that are constructed with a handle to be
paused initially.

PR-URL: #8576
Fixes: #7905
Fixes: #7784
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
@trevnorris

Fixed by c2b4f48.
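
With this fix, the repro from the first post could plausibly be adapted as below. This is a sketch, assuming the paused-by-default behavior is exposed as the `pauseOnConnect` option of `net.createServer()`, as it is in later Node.js releases; the commit message above does not name the option. `createMasterServer` and `attachWorker` are hypothetical helper names.

```javascript
const net = require('net');
const http = require('http');

// Master side: build a TCP server whose accepted sockets start paused, so the
// master does not consume request bytes before handing the socket to a worker.
// `worker` is anything with a send(message, handle) method, e.g. cluster.fork().
function createMasterServer(worker) {
  return net.createServer({ pauseOnConnect: true }, (socket) => {
    worker.send('conn', socket);
  });
}

// Worker side: build an HTTP server that is fed sockets received over IPC.
// `channel` is the emitter that delivers 'message' events (normally `process`).
function attachWorker(channel) {
  const server = http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('okay');
  });
  channel.on('message', (msg, socket) => {
    if (msg !== 'conn' || !socket) {
      return;
    }
    // Hand the socket to the HTTP server, which reads the request in this process.
    server.emit('connection', socket);
  });
  return server;
}
```

In a cluster script, the master branch would call `createMasterServer(cluster.fork()).listen(3000)` and the worker branch would call `attachWorker(process)`. The only change from dummy.js that matters is the options object passed to `net.createServer()`.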

elad commented Oct 27, 2014

Fantastic, thanks! Any estimate when v0.12 will be tagged and released? :)

@trevnorris

@elad Soon. I'm finishing the final patch that needs to land. Have had a training this week, so going to finish it up next week.

@migounette

@trevnorris Any news on it?
Could you provide an estimated target date (days, weeks...)?
We are waiting on it for a production deployment, and we would rather go with 0.12 than 0.11.14.
Cheers
