dns.lookup shares a thread pool with disk operations #2868

ntoshev · 2012-03-05T14:27:18Z

If you generate a lot of DNS requests, they get queued and become slow. Disk I/O gets in the same queue so it is slowed down as well.

Southern · 2012-03-11T01:55:49Z

When you say "disk I/O", are you referring to fs or the actual I/O of Node?

If you're referring to fs, the problem is that both fs and dns both use process.nextTick() callbacks. A simple solution to avoid this issue is to use child_process or cluster to create a couple of threads to handle the massive amounts of requests you require.

thejh · 2012-03-11T19:30:01Z

Am Samstag, den 10.03.2012, 17:55 -0800 schrieb Colton Baker:

A simple solution to avoid this issue is to use child_process or cluster to create a couple of threads to handle the massive amounts of requests you require.

That's not "threads", it's processes. And, uhm, this really seems like a
bad response. It's like "sorry, yes, the cooling system has a small
leak, but if you pour in a few litres of water each hour, it doesn't
blow up".

Southern · 2012-03-11T23:35:53Z

@thejh Okay. Maybe using "threads" wasn't the best word to choose. Should have known someone would be a technical brat. Regardless of whether child_process and cluster are true threads, each has their own process variable. Thus the callbacks for each process.nextTick should function independently. This eliminates the need to "revive" code that isn't broken in the first place.

This is not a Node error. It is a user error. You act as if this wouldn't happen in another coding language where a lot of requests were made back-to-back, using the same queued callback for everything.

By the way, where was your suggestion on how to fix this? If you're here to TroLLz, can I play too?

ntoshev · 2012-03-12T09:54:20Z

I'm referring to fs. See this thread for context:

http://groups.google.com/group/nodejs/browse_thread/thread/6f564ac1dcd650bf/100bc657172b88e2

I solved my own problem using dns.resolve4 and then handing the IP to an http request, the performance difference is amazing. I also contributed a patch to the fetch module (https://github.com/andris9/fetch) doing that. I'm a little surprised no other http client library implements this. It would be the bottleneck of many network-bound projects.

thejh · 2012-03-12T12:43:13Z

"Colton Baker" reply@reply.github.com schrieb:

@thejh Okay. Maybe using "threads" wasn't the best word to choose.
Should have known someone would be a technical brat. Regardless of
whether child_process and cluster are true threads, each has their
own process variable. Thus the callbacks for each process.nextTick
should function independently. This eliminates the need to "revive"
code that isn't broken in the first place.

This is not a Node error. It is a user error. You act as if this
wouldn't happen in another coding language where a lot of requests were
made back-to-back, using the same queued callback for everything.

By the way, where was your suggestion on how to fix this? If you're
here to TroLLz, can I play too?

Reply to this email directly or view it on GitHub:
#2868 (comment)

I don't mind you contributing... Uhm... half-constructive comments, too. :D

No, seriously - IMO, it'd be good to be able to reserve IO threads for specific node domains, but I don't have code for that as it's not related to any problems I have and I'm a bit lazy. Also, well, we don't have domains yet, so it won't work anyway.

bnoordhuis · 2012-03-12T14:16:40Z

To clear up any misunderstandings, what happens here is that getaddrinfo() DNS lookups and async file I/O share the same thread pool in C land. Said thread pool has limited concurrency so if DNS lookups are slow because the nameserver is down (to name something), that also slows down file ops. It's a libuv/libeio issue.

arnaud-lb · 2013-04-05T11:47:29Z

This is a very serious issue, as queuing only 4 DNS queries will delay everything else using the thread pool.

Considering that a DNS query can take more than 20 seconds with default timeout/retry settings, if the thread pool queue is filled with DNS queries, everything can be delayed for minutes or even hours.

jorangreef · 2013-10-23T08:37:54Z

Are there any blockers to Node using dns.resolve internally for net.connect etc? And leaving dns.lookup for those that need it?

bnoordhuis · 2013-10-23T09:16:32Z

I've responded to your question on the mailing list. For posterity:

Is it really necessary that Node's net.connect etc. still use dns.lookup
internally?

Yes, for several reasons. For example, c-ares doesn't speak mDNS and
probably never will, its notion of what the upstream DNS servers are
and in what order they should be tried can be wildly at odds with the
system resolver, and so on.

tl;dr net.connect() and friends won't be switching to dns.resolve() anytime soon.

The scalability of the thread pool is a subject of ongoing research; it's possible to speed up specific workloads by a factor of 10 - but not, it seems, without regressing other workloads by a factor of 5 or more. That won't do, of course.

If you want node.js to use more threads, run export UV_THREADPOOL_SIZE=<num-threads> before starting node.

bnoordhuis · 2013-10-23T10:25:09Z

Let me add that a change that makes the DNS resolver configurable in the call to net.connect() might be acceptable. (Should target master, come with docs and regression tests, etc. See CONTRIBUTING.md for details.)

The API I have in mind is something like this:

net.connect({
  host: "nodejs.org",
  port: 80,
  resolver: function(host, type, callback) {
    return dns.resolve(host, type, callback);
  }
});

It's up for debate if the 'type' argument should be called 'family' instead and be a number, like how dns.lookup() works. I'm open to well-reasoned suggestions.

jorangreef · 2013-10-23T11:42:18Z

Thanks Ben, that would help.

I had a runtime environment flag in mind though, just to indicate to net.connect that it should use dns.resolve directly. This would save me from having to update calls to net.connect to pass in a custom resolver (although that might be great for other use cases).

tjfontaine · 2013-10-23T15:27:46Z

also for posterity there are things like http://npmjs.org/package/native-dns which implement dns in terms of node primitives such that you're not subject to the thread pool

bminer · 2013-12-24T20:20:38Z

Can this issue be re-opened? I like the API that @bnoordhuis has proposed. :)

@bnoordhuis - Is it also possible to revise dns.lookup to accept either a family or rrtype? Instead of accepting 4 or 6, it could accept any of these: 4, 6, A or AAAA. This would allow one to use dns.lookup just like dns.resolve for resolving A or AAAA records.

Thoughts?

indutny · 2013-12-25T16:54:39Z

I'm ok with it, but have no time to work on it :(

bminer · 2013-12-27T14:59:35Z

That's okay. I will try to take a shot at it over the next few days here. Just too busy at the moment.

At any rate, if the issue is re-opened, it might get a bit more attention.

Thanks!

bminer · 2013-12-28T23:47:23Z

I think that there are a few issues here:

1.) The API for net.connect and for http.request should be changed to what @bnoordhuis proposed, allowing one to specify a resolver option to handle DNS resolution.
2.) dns.lookup should be revised to accept either a family or rrtype. Instead of accepting 4 or 6, it could accept any of these: 4, 6, A or AAAA. This would allow one to use dns.lookup just like dns.resolve for resolving A or AAAA records and make the API a bit more user-friendly.
3.) Most importantly: For most users who just want to use the default dns.lookup routine and utilize the operating system's resolver, DNS lookups should probably use a separate thread pool, or alternatively, they should not be allowed to fill up the entire shared thread pool. As @arnaud-lb mentioned, DNS queries can take more than a few seconds, which can significantly (and undesirably) slow other asynchronous I/O operations. This might be unacceptable in time-sensitive applications (i.e. reading from a hardware device such as a serial port).
4.) dns.resolve and friends read /etc/resolv.conf when Node starts to determine which nameservers to use. Unfortunately, C-Ares does not expose a ares_reinit() function to re-read this file if it changes. This can make using the other DNS functions a little less reliable than dns.lookup at times.

With all that being said... How should this issue be handled? Should a new issue be opened? Should this one be re-opened?

trevnorris · 2014-01-07T23:24:11Z

Going to reopen since we seem to be cool w/ the possibility of implementation. But yeah, core doesn't have time for this now. PR's will be welcome. :)

bminer · 2014-01-10T01:56:49Z

@trevnorris - Thanks for re-opening! If time permits, I will work on a PR.

chrisdickinson · 2014-09-30T17:48:25Z

Closing this in favor of the reduced issue #8475.

fxfred mentioned this issue Jun 10, 2012

getaddrinfo ERROR in chroot environment #3399

Closed

bnoordhuis closed this as completed Oct 23, 2013

bminer mentioned this issue Dec 24, 2013

Uncaught exceptions can cause node to hang serialport/node-serialport#282

Closed

bminer mentioned this issue Dec 27, 2013

Add option for DNS resolving ladjs/superagent#306

Closed

trevnorris reopened this Jan 7, 2014

simov mentioned this issue Jul 20, 2014

Request performance request/request#933

Closed

chrisdickinson mentioned this issue Sep 30, 2014

net: Allow custom DNS resolvers #8475

Closed

chrisdickinson closed this as completed Sep 30, 2014

bminer mentioned this issue Dec 3, 2014

Issue opening/closing character devices in Node.js #7101

Closed

napalm272 mentioned this issue Aug 26, 2016

Delays while making requests over a slow network serialport/node-serialport#797

Closed

ottobonn mentioned this issue Sep 7, 2016

dns.lookup blocks filesystem I/O nodejs/node#8436

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dns.lookup shares a thread pool with disk operations #2868

dns.lookup shares a thread pool with disk operations #2868

ntoshev commented Mar 5, 2012

Southern commented Mar 11, 2012

thejh commented Mar 11, 2012

Southern commented Mar 11, 2012

ntoshev commented Mar 12, 2012

thejh commented Mar 12, 2012

bnoordhuis commented Mar 12, 2012

arnaud-lb commented Apr 5, 2013

jorangreef commented Oct 23, 2013

bnoordhuis commented Oct 23, 2013

bnoordhuis commented Oct 23, 2013

jorangreef commented Oct 23, 2013

tjfontaine commented Oct 23, 2013

bminer commented Dec 24, 2013

indutny commented Dec 25, 2013

bminer commented Dec 27, 2013

bminer commented Dec 28, 2013

trevnorris commented Jan 7, 2014

bminer commented Jan 10, 2014

chrisdickinson commented Sep 30, 2014

dns.lookup shares a thread pool with disk operations #2868

dns.lookup shares a thread pool with disk operations #2868

Comments

ntoshev commented Mar 5, 2012

Southern commented Mar 11, 2012

thejh commented Mar 11, 2012

Southern commented Mar 11, 2012

ntoshev commented Mar 12, 2012

thejh commented Mar 12, 2012

bnoordhuis commented Mar 12, 2012

arnaud-lb commented Apr 5, 2013

jorangreef commented Oct 23, 2013

bnoordhuis commented Oct 23, 2013

bnoordhuis commented Oct 23, 2013

jorangreef commented Oct 23, 2013

tjfontaine commented Oct 23, 2013

bminer commented Dec 24, 2013

indutny commented Dec 25, 2013

bminer commented Dec 27, 2013

bminer commented Dec 28, 2013

trevnorris commented Jan 7, 2014

bminer commented Jan 10, 2014

chrisdickinson commented Sep 30, 2014