Curb does not seem to perform persistent connections #30

Closed
drbrain opened this issue May 7, 2010 · 13 comments

Comments

@drbrain

drbrain commented May 7, 2010

If I run:
curl http://localhost/~drbrain/zeros-1k http://localhost/~drbrain/zeros-1k

Under strace on FreeBSD I can see it perform two sendto()/recvfrom() pairs on a single socket without closing it in between (trimmed):

socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...}, 28) = 0
sendto(3, "GET /~drbrain/zeros-1k HTTP/1.1\r\n"..., 169, MSG_NOSIGNAL, NULL, 0) = 169
recvfrom(3, "HTTP/1.1 200 OK\r\nDate: Fri, 07 Ma"..., 16384, 0, NULL, NULL) = 1296
[...]
sendto(3, "GET /~drbrain/zeros-1k HTTP/1.1\r\n"..., 169, MSG_NOSIGNAL, NULL, 0) = 169
recvfrom(3, "HTTP/1.1 200 OK\r\nDate: Fri, 07 Ma"..., 16384, 0, NULL, NULL) = 1296

When I run what should be the equivalent Ruby code, I see two sockets created and closed:

require 'rubygems'
require 'curb'

# Number of requests to issue (defaults to 50)
N = (ARGV.shift || 50).to_i

# Reuse a single Easy handle for every request
c = Curl::Easy.new 'http://127.0.0.1/~drbrain/zeros-2k'
N.times do
  c.perform
end

Running this with an argument of 2 (trimmed):

socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=0x69 /* AF_??? */, sa_data="/../../crypto/"...}, 16) = 0
sendto(3, "GET /~drbrain/zeros-2k HTTP/1.1\r\n"..., 65, MSG_NOSIGNAL, NULL, 0) = 65
recvfrom(3, "HTTP/1.1 200 OK\r\nDate: Fri, 07 Ma"..., 16384, 0, NULL, NULL) = 2320
[...]
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 4
connect(4, {sa_family=0x32 /* AF_??? */, sa_data=" Apache/2.2.13"...}, 16) = 0
sendto(4, "GET /~drbrain/zeros-2k HTTP/1.1\r\n"..., 65, MSG_NOSIGNAL, NULL, 0) = 65
recvfrom(4, "HTTP/1.1 200 OK\r\nDate: Fri, 07 Ma"..., 16384, 0, NULL, NULL) = 2320
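
For anyone who wants to check connection reuse without strace, one option is to count accepted connections on the server side. The following is a minimal sketch using only Ruby's standard library. The throwaway server, the helper names (`start_counting_server`, `serve`, `get_reusing_session`, `get_with_fresh_sessions`) and the use of Net::HTTP as a stand-in client are all assumptions made for illustration and are not part of the original report. A curb client could presumably be pointed at the same server.

```ruby
require 'socket'
require 'net/http'

# Start a tiny keep-alive HTTP/1.1 server on an ephemeral loopback port.
# Returns [port, accepted]; `accepted` receives one entry per TCP connection.
def start_counting_server
  server = TCPServer.new('127.0.0.1', 0)
  accepted = Queue.new # thread-safe: one entry per accepted socket

  Thread.new do
    loop do
      sock = server.accept
      accepted << true
      Thread.new(sock) { |s| serve(s) }
    end
  end

  [server.addr[1], accepted]
end

# Answer every request on the socket with a 1 KiB body until the client hangs up.
def serve(sock)
  body = '0' * 1024
  while sock.gets # request line; nil once the client closes the socket
    loop do       # skip the request headers up to the blank line
      line = sock.gets
      break if line.nil? || line == "\r\n"
    end
    sock.write("HTTP/1.1 200 OK\r\nContent-Length: #{body.bytesize}\r\n" \
               "Connection: keep-alive\r\n\r\n#{body}")
  end
rescue IOError, SystemCallError
  # The client went away mid-request; the socket is closed below.
ensure
  sock.close
end

# One Net::HTTP session reused for every request (third argument nil = no proxy).
def get_reusing_session(port, n)
  Net::HTTP.start('127.0.0.1', port, nil) do |http|
    n.times { http.get('/zeros-1k') }
  end
end

# A brand-new session, and therefore a new TCP connection, per request.
def get_with_fresh_sessions(port, n)
  n.times do
    Net::HTTP.start('127.0.0.1', port, nil) { |http| http.get('/zeros-1k') }
  end
end

port, accepted = start_counting_server
get_reusing_session(port, 3)
reused_count = accepted.size
get_with_fresh_sessions(port, 3)
fresh_count = accepted.size - reused_count
puts "reused session: #{reused_count} connection(s), " \
     "fresh sessions: #{fresh_count} connection(s)"
```

A client that keeps its connection open should show up as a single accepted socket no matter how many requests it sends, which is the behavior the curl command line shows in the first trace.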

@mksm

mksm commented May 7, 2010

I believe there's an individual socket pool for each Easy handle. If you use Multi, it will keep a shared connection pool. Or reuse the Easy handle by setting the url again and calling #perform.

@drbrain
Author

drbrain commented May 7, 2010

Switching this example to use Curl::Multi, I see the same behavior (multiple sockets created).

The man page for curl_easy_perform() says:

   You can do any amount of calls to curl_easy_perform(3) while using the
   same handle. If you intend to transfer more than one file, you are even
   encouraged to do so. libcurl will then attempt to re-use the same
   connection for the following transfers, thus making the operations faster,
   less CPU intense and using less network resources. Just note that you
   will have to use curl_easy_setopt(3) between the invokes to set options
   for the following curl_easy_perform.

Which leads me to believe that Curl::Easy#perform should behave as I expect.

@taf2
Owner

taf2 commented May 9, 2010

This should fix the issue: http://github.com/taf2/curb/commit/73e11030c9debfc6c51f32e6aef4f597281cf6db

It keeps the original multi handle around between invocations.

I added two benchmarks, bench/curb_easy.rb and bench/nethttp_test.rb.

Here are the results for me:

time ruby bench/curb_easy.rb
Duration 0.059045 seconds

real 0m0.083s
user 0m0.019s
sys 0m0.014s

time ruby bench/nethttp_test.rb
Duration 0.063115 seconds

real 0m0.731s
user 0m0.406s
sys 0m0.070s

@mksm

mksm commented May 10, 2010

Just to point it out, libcurl already uses a multi handle to perform an easy handle, even if you don't manually add the handle.

@taf2
Owner

taf2 commented May 10, 2010

I think the issue with curb here is different... unless libcurl provides a way to get at that multi handle? For each easy request, curb was creating a new multi handle, effectively throwing away any open connections. Now, when calling the curb easy perform, it will instead reuse its internal multi handle... Hope that makes sense?
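
To make the difference concrete, here is a toy model in plain Ruby. It is not curb's C code and does not call libcurl. The classes `ToyMulti`, `ToyEasyFresh` and `ToyEasyReusing` are hypothetical names invented for this sketch, and the "connection" is just a string. The only thing it tries to show is why a per-call multi handle can never reuse a connection, while a long-lived one can.

```ruby
# Toy stand-in for a multi handle: it owns a cache of open connections.
class ToyMulti
  attr_reader :opened

  def initialize
    @connections = {} # host => connection token
    @opened = 0       # how many new connections this handle had to open
  end

  # Reuse a cached connection for the host, or "open" a new one.
  def perform(host)
    @connections[host] ||= begin
      @opened += 1
      "conn-to-#{host}"
    end
  end
end

# Before the fix (as described above): a fresh multi per perform,
# so the connection cache is always empty.
class ToyEasyFresh
  attr_reader :total_opened

  def initialize(host)
    @host = host
    @total_opened = 0
  end

  def perform
    multi = ToyMulti.new
    multi.perform(@host)
    @total_opened += multi.opened
  end
end

# After the fix (as described above): the multi lives as long as the easy handle.
class ToyEasyReusing
  def initialize(host)
    @host = host
    @multi = ToyMulti.new
  end

  def perform
    @multi.perform(@host)
  end

  def total_opened
    @multi.opened
  end
end

fresh = ToyEasyFresh.new('127.0.0.1')
reusing = ToyEasyReusing.new('127.0.0.1')
3.times do
  fresh.perform
  reusing.perform
end
puts "fresh multi per call: #{fresh.total_opened} connections opened"
puts "long-lived multi:     #{reusing.total_opened} connection opened"
```

In this model the fresh-multi variant opens one connection per call, which matches the two sockets seen in the original strace output.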

@mksm

mksm commented May 10, 2010

I'm wondering why curb creates a multi for each easy handle if libcurl does that by itself inside easy_perform?

@taf2
Owner

taf2 commented May 10, 2010

curl_easy_perform blocks the whole Ruby interpreter. To avoid the blocking and have libcurl play nice with Ruby, all IO is run through rb_thread_select. To expose that kind of IO, a multi interface was necessary... unless there is an interface to expose what select method libcurl uses, or to access the easy handle's internal multi interface?

@mksm

mksm commented May 10, 2010

I did some testing on Ruby 1.8.7 and 1.9.1, and it does some evil blocking indeed. Since we don't have access to libcurl's select or internal multi, one suggestion would be to call rb_thread_schedule() in a callback and get rid of the multi and the fds hassle.

@taf2
Owner

taf2 commented May 10, 2010

Do you mean, for example, having a libcurl callback on the easy handle, such as on_progress or another one, that calls rb_thread_schedule or even yields?

@mksm

mksm commented May 11, 2010

Yup, but according to the libcurl docs, the on_progress callback does not seem to be a good choice, since it keeps being called every second even if there is no transfer running. I did a quick test by placing rb_thread_schedule inside the CURLOPT_WRITEFUNCTION callback and it works. I didn't run any benchmarks, though.

@taf2
Owner

taf2 commented May 11, 2010

If there's no transfer running, then it's blocked... The write function would only be called when data is received?

@taf2
Owner

taf2 commented May 11, 2010

I really think select is the right thing here: you want the OS to tell you when to idle and when to read/write, and otherwise let Ruby run.
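
The idea can be illustrated at the Ruby level with IO.select, which waits for readiness without holding up other Ruby threads. This is only a sketch with a pipe standing in for the network socket. It is not curb's implementation, and the `worker` and `producer` threads are made up for the example.

```ruby
reader, writer = IO.pipe

# Stand-in for "the rest of Ruby": a thread that should keep running
# while the main thread waits for IO.
ticks = 0
worker = Thread.new do
  loop do
    ticks += 1
    sleep 0.01
  end
end

# Stand-in for the network: data arrives after a short delay.
producer = Thread.new do
  sleep 0.2
  writer.write('payload')
  writer.close
end

# Ask the OS to wake us when the pipe is readable (5 second timeout).
# IO.select returns nil on timeout, in which case `ready` is nil.
ready, = IO.select([reader], nil, nil, 5)
data = ready ? reader.read : nil

producer.join
worker.kill
puts "read #{data.inspect} after the worker ticked #{ticks} time(s)"
```

While the main thread sits in select, the worker thread keeps ticking, which is the property a blocking curl_easy_perform call does not give you according to the comments above.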

@taf2
Owner

taf2 commented May 19, 2010

Alright, it's using persistent connections now.

This issue was closed.