-
Notifications
You must be signed in to change notification settings - Fork 7.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix EINPROGRESS very rarely occurring on synchronous php_network_connect_socket() #13252
base: PHP-8.2
Are you sure you want to change the base?
Conversation
Thanks for your PR. However, PR needs to be from the lowest release branch when the fix apply, thus here PHP-8.2, PHP-8.0 is unsupported and PHP-8.1 is only for careful security fixes. |
…n synchronous mode
e46b987
to
07544a8
Compare
Thanks, I think I've updated it to now be against the correct base. |
So this was quite tricky to recreate actually. It's just not easy to recreate EINPROGRESS on TCP. I initially thought that I could just add sleep between listen and accept on the server but that's not how it works. The connect returns once the 3 way TCP handshake is done and accepting is irrelevant. So I had to find another way which is delaying SYN-ACK packet which I had to do using iptable and tc. I wrote a testing C client and server and once I recreated it in C, I did the same in PHP. You can see all my test files for this here: bukka/php-util@5fdc4df . Basically I wanted to confirm what I thought from the code which is that this is not correct change. You are right that The only other thing that came to my mind is that the errno would be checked but I did a bit of search and don't see any obvious place where it could be problematic. Might need to search more but guess we can reset it - you can test it with |
Ah, that's unfortunate that it doesn't seem to fix it. I haven't been able to verify that my fix works for us in production since the issue both crops up quite infrequently (in bursts every 3 weeks or so) and we don't compile our own PHP, so pushing this to (one of) our workers would be quite difficult to do. Thanks for figuring out a way to reproduce the Regardless, I will see if I can reproduce it now on a test setup with PHPRedis and your SYN+ACK delay. Thanks for the work you've put in so far! |
Think it's mainly to catch error case but I wouldn't think that EINPROGRESS should be ever such case because it should timeout in such case. But might be wrong so it would be good to test it. I will try to do a bit more research / testing and see. |
Could this be what causes intermittent "Operation Now In Progress" errors with various PHP functions that make network calls? For a long time now, I've seen it occur with FTP functions, stream_socket_client(), PDO trying to connect to a MySQL server, and CURL (specifically using Guzzle, that would use PHP curl) and while it is difficult to replicate, it seems to occur more often during high load on an affected server. One example we used to have was a server that ran a letsencrypt cron job every night to check for certificate renewals. That specific server had a bunch of configurations that pointed to domains that were no longer active, and I believe this caused letsencrypt's certbot to take longer than normal to complete its renewal process. It's also during this period where we would almost always see an uptick in "Operation Now In Progress" errors from various processes that could have been doing any of the actions I just described. |
I've become very interested in this as we now see both PHPRedis and PRedis encountering EINPROGRESS, and PHPRedis uses a non-blocking socket. My question is when php_network_connect_socket is called with asynchronous=0, can the caller assume that EINPROGRESS will not be returned, since that error is specific to nonblocking connects? I ask this since most callers may think that asynchronous = 0 means "blocking socket". I believe there's a code path where php_network_connect_socket receives EINPROGRESS on the connect, however the socket finishes connecting and is ready to read during the polling. The error variable never gets cleared so callers who are expecting blocking operations receive an unexpected EINPROGRESS return. |
So first of all there's non blocking switch as I see you noticed in that linked Redis issue that I went through as well. When going through it I came across redis/hiredis#1213 which showed the interrupted poll (errno == EINTR) can stop the poll and return initial error. After a while of looking to the stream (PHP network) code I realised that this is what is probably happening. If such situation happens, it goes to the |
php_network_connect_socket()
in blocking mode does aselect()
to block on aconnect()
call that reportedEINPROGRESS
. However, it then checks forselect()
errors usinggetsockopt(..., SO_ERROR, ...)
, which will return the earlierEINPROGRESS
since it wasn't cleared byphp_socket_errno()
. I believe this is the cause of phpredis/phpredis#1881 and possibly other issues.Note that, since the issue only happens very sporadically, I'm unable to write a test that reproduces it, or to test that my fix works in practice. Also new to contributing to PHP, so apologies in advance if I've done something wrong. :-)