Socket causes connection timed out exception #4092

Closed
yellow1912 opened this issue Mar 11, 2021 · 5 comments

yellow1912 commented Mar 11, 2021

  1. What did you do? If possible, provide a simple script for reproducing the error.

I'm using Swoole with pheanstalk. If I enable the sockets extension (sockets.so) for PHP, pheanstalk automatically uses the socket extension to connect to beanstalk and fails with a connection timed out error.

My code is rather long, so I will copy only the parts I think are important. Please let me know if I should copy more.

Co\run(function () {
$tubes = [
        'mailer' => [
            'concurrency' => 5,
            'process' => function ($pheanstalk, $jobId, $message) use ($logger) {
                      // do something here
                      return true;
            }],
        'file' => [
            'concurrency' => 5,
            'process' => function ($pheanstalk, $jobId, $message) use ($logger) {
                      // do something here
                      return true;
            }]
    ];

$shared = new \Swoole\Table(512);
$shared->column('status', Swoole\Table::TYPE_INT);
$shared->create();
$shared['terminated'] = ['status' => 0];

$system = new \Swoole\Table(128);
$system->column('value', Swoole\Table::TYPE_INT);
$system->create();

$wg = new WaitGroup();

// lets calculate the system usage but not so often
go(function () use ($wg, $shared, $system) {
    $wg->add();
    while (0 === $shared['terminated']['status']) {
        $system['cpu'] = ['value' => getCpuUsage()];
        $system['memory'] = ['value' => getMemoryUsage()];

        // lets sleep 20 second each loop
        co::sleep(20);
    }
    $wg->done();
});

foreach ([SIGINT, SIGTERM] as $sig) {
        pcntl_signal(
            $sig,
            function () use ($shared, $sig, $io) {
                $io->info(sprintf('Terminating after receiving signal `%s`', $sig));
                $shared['terminated']['status'] = 1;
            }
        );
    }

foreach ($tubes as $tube => $options) {
        if (!isset($tubes[$tube]['process'])) continue;

        $process = $tubes[$tube]['process'];
        go(function () use ($wg, $tube, $process, $shared, $system, $logger, $io) {
            $wg->add();

            $listener = Pheanstalk::create(BEANSTALK_ADDRESS, BEANSTALK_PORT);
            $listener->useTube($tube)->watch($tube);

            $io->info(sprintf('Start watching tube %s', $tube));

            $retries = new \Swoole\Table(1024);
            $retries->column('count', Swoole\Table::TYPE_INT);
            $retries->create();

            while (0 === $shared['terminated']['status']) {
                // okie, if we have high cpu and memory usage we should do something about it
                /** @var int $cpu */
                $cpu = $system['cpu']['value'];
                /** @var int $memory */
                $memory = $system['memory']['value'];

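                while ($cpu > 70 || $memory > 70) {
                    // lets sleep 30 seconds
                    $io->warning(sprintf('System overloaded [CPU:%s, Memory:%s]. Sleep for 30 seconds.', $cpu, $memory));
                    co::sleep(30);
                    // re-read the usage, otherwise this loop can never exit
                    $cpu = $system['cpu']['value'];
                    $memory = $system['memory']['value'];
                }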

                pcntl_signal_dispatch();

                // the timeout means that if there is nothing in the tube it will be stuck here for that period
                // for now, 5 second seems like a good time length
                $job = $listener->reserveWithTimeout(5);

                if (isset($job)) {
                    $listener->bury($job);
                    // lets process this job inside our co-routines
                    $io->info(sprintf('[%s] Executing job %s with id %s', $tube, $tube, $job->getId()));

                    go(function () use ($tube, $process, $listener, $job, $retries, $io) {
                        // lets handle the job
                        $success = $process($listener, $job->getId(), $job->getData());

                        $key = 'job_' . $job->getId();
                        if ($success) {
                            $io->success(sprintf('[%s] Finished job %s with id %s', $tube, $tube, $job->getId()));
                            // remove the job
                            $listener->delete($job);
                        } else {
                            if (!$retries->exists($key)) {
                                $retries[$key] = ['count' => 0];
                            }

                            $retries[$key] = ['count' => $retries[$key]['count'] + 1];

                            $io->error(sprintf('[%s] Failed job %s with id %s', $tube, $tube, $job->getId()));

                            if ($retries[$key]['count'] < MAX_RETRIES) {
                                $io->info(sprintf('Kicking job with id %s', $job->getId()));
                                $listener->kickJob($job);
                            } else {
                                $io->info(sprintf('Deleting job with id %s after multiple retries', $job->getId()));
                                $listener->delete($job);
                            }
                        }
                    });
                }

                if ($logs = $logger->hasLogs()) {
                    fwrite(STDERR, $logger->getLogs('log15', 'json'));
                    $logger->resetLogs();
                }

                // lets sleep 5 second each loop
                co::sleep(5);
            }

            $retries->destroy();
            $wg->done();
            pcntl_signal_dispatch();
        });
    }

    $wg->wait();
    $shared->destroy();
});
Stack trace:
#0 /vendor/pda/pheanstalk/src/Socket/SocketSocket.php(104): Pheanstalk\Socket\SocketSocket->throwException()
#1 /vendor/pda/pheanstalk/src/Connection.php(99): Pheanstalk\Socket\SocketSocket->read()
#2 /vendor/pda/pheanstalk/src/Pheanstalk.php(372): Pheanstalk\Connection->dispatchCommand()
#3 /vendor/pda/pheanstalk/src/Pheanstalk.php(267): Pheanstalk\Pheanstalk->dispatch()
#4 worker.php(205): Pheanstalk\Pheanstalk->reserveWithTimeout()
#5 {main}
  thrown in /vendor/pda/pheanstalk/src/Socket/SocketSocket.php on line 81

  2. What did you expect to see?

I expect the socket to connect properly.

  3. What did you see instead?

The connection timeout error.

As an additional note, if I build Swoole WITHOUT the sockets option and remove the sockets extension from PHP, pheanstalk automatically falls back to its FileSocket implementation, which results in a different error:

PHP Fatal error:  Uncaught Swoole\Error: Socket#11 has already been bound to another coroutine#7, reading of the same socket in coroutine#10 at the same time is not allowed in /vendor/pda/pheanstalk/src/Socket/FileSocket.php:89
Stack trace:
#0 /vendor/pda/pheanstalk/src/Socket/FileSocket.php(89): fgets()
#1 /vendor/pda/pheanstalk/src/Connection.php(84): Pheanstalk\Socket\FileSocket->getLine()
#2 /vendor/pda/pheanstalk/src/Pheanstalk.php(369): Pheanstalk\Connection->dispatchCommand()
#3 /vendor/pda/pheanstalk/src/Pheanstalk.php(72): Pheanstalk\Pheanstalk->dispatch()
#4 worker.php(219): Pheanstalk\Pheanstalk->delete()
#5 {main}
  thrown in /vendor/pda/pheanstalk/src/Socket/FileSocket.php on line 89
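
The error above says the connection's socket is already bound to one coroutine while another coroutine reads from it. In the script, the same $listener is used by the tube loop and by every job coroutine. Below is a minimal sketch of one way around that, assuming each job coroutine opens its own connection. finishJobOnOwnConnection and its $connect factory are hypothetical names, not pheanstalk API, and the retry counting from the script is left out. It relies on the job having been buried first, as the script does, since beanstalkd lets any connection delete or kick a buried job.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: finish a buried job on a connection owned by the
 * calling coroutine, so that no socket is shared between coroutines.
 *
 * $connect must return a fresh connection object exposing delete() and
 * kickJob(), for example a Pheanstalk instance. $process takes the same
 * arguments as the 'process' callbacks in the script.
 */
function finishJobOnOwnConnection(callable $connect, callable $process, object $job): bool
{
    $connection = $connect(); // new connection, used only by this coroutine

    $success = (bool) $process($connection, $job->getId(), $job->getData());
    if ($success) {
        $connection->delete($job);  // a buried job can be deleted from any connection
    } else {
        $connection->kickJob($job); // move the buried job back to the ready queue
    }

    return $success;
}

// Possible usage inside the tube loop (assumes the constants from the script):
// go(function () use ($process, $job) {
//     finishJobOnOwnConnection(
//         function () { return Pheanstalk::create(BEANSTALK_ADDRESS, BEANSTALK_PORT); },
//         $process,
//         $job
//     );
// });
```

The trade-off in this sketch is one extra connection per job in flight; whether that is acceptable depends on the concurrency the worker runs at.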

I wonder whether it is necessary to write a socket connection class for pheanstalk that uses a Swoole socket instead of the default PHP socket?
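
For illustration only, here is a rough sketch of what such a class could look like on top of Swoole\Coroutine\Socket. The class name CoroutineBeanstalkSocket is hypothetical. The method names mirror the read() and getLine() methods seen in pheanstalk's socket classes, but how the class would be plugged into pheanstalk is not covered here and would need to be checked against the installed pheanstalk version. It must be used inside a coroutine, and the timeout should be longer than any reserveWithTimeout value, otherwise the read gives up before beanstalk answers.

```php
<?php
declare(strict_types=1);

use Swoole\Coroutine\Socket;

/**
 * Hypothetical coroutine-aware socket for the beanstalk protocol.
 * Received bytes are kept in a buffer so that line reads and
 * fixed-length reads can be mixed on the same connection.
 */
final class CoroutineBeanstalkSocket
{
    private Socket $socket;
    private string $buffer = '';
    private float $timeout;

    public function __construct(string $host, int $port, float $timeout = 10.0)
    {
        $this->timeout = $timeout;
        $this->socket = new Socket(AF_INET, SOCK_STREAM, 0);
        if (!$this->socket->connect($host, $port, $timeout)) {
            throw new RuntimeException('connect failed: ' . $this->socket->errMsg);
        }
    }

    public function write(string $data): void
    {
        if ($this->socket->sendAll($data, $this->timeout) !== strlen($data)) {
            throw new RuntimeException('write failed: ' . $this->socket->errMsg);
        }
    }

    /** Return exactly $length bytes. */
    public function read(int $length): string
    {
        while (strlen($this->buffer) < $length) {
            $this->fill();
        }
        $data = substr($this->buffer, 0, $length);
        $this->buffer = substr($this->buffer, $length);

        return $data;
    }

    /** Return one line without its trailing \r\n. */
    public function getLine(): string
    {
        while (($pos = strpos($this->buffer, "\r\n")) === false) {
            $this->fill();
        }
        $line = substr($this->buffer, 0, $pos);
        $this->buffer = substr($this->buffer, $pos + 2);

        return $line;
    }

    public function disconnect(): void
    {
        $this->socket->close();
    }

    /** Yield to other coroutines until more bytes arrive, then buffer them. */
    private function fill(): void
    {
        $chunk = $this->socket->recv(8192, $this->timeout);
        if ($chunk === false || $chunk === '') {
            throw new RuntimeException('read failed: ' . $this->socket->errMsg);
        }
        $this->buffer .= $chunk;
    }
}
```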

  4. What version of Swoole are you using (show your php --ri swoole)?

swoole

Swoole => enabled
Author => Swoole Team <team@swoole.com>
Version => 4.6.4-dev
Built => Mar 11 2021 15:04:21
coroutine => enabled with boost asm context
epoll => enabled
eventfd => enabled
signalfd => enabled
cpu_affinity => enabled
spinlock => enabled
rwlock => enabled
sockets => enabled
openssl => OpenSSL 1.1.1j 16 Feb 2021
dtls => enabled
http2 => enabled
json => enabled
pcre => enabled
zlib => 1.2.11
mutex_timedlock => enabled
pthread_barrier => enabled
futex => enabled
async_redis => enabled

Directive => Local Value => Master Value
swoole.enable_coroutine => On => On
swoole.enable_library => On => On
swoole.enable_preemptive_scheduler => Off => Off
swoole.display_errors => On => On
swoole.use_shortname => On => On
swoole.unixsock_buffer_size => 8388608 => 8388608

  5. What is your machine environment (show your uname -a & php -v & gcc -v)?

Linux lead-dev-0 4.15.0-117-generic #118-Ubuntu SMP Fri Sep 4 20:02:41 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

PHP 7.4.16 (cli) (built: Mar  5 2021 07:54:20) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
    with Zend OPcache v7.4.16, Copyright (c), by Zend Technologies
    with blackfire v1.39.1~linux-x64-non_zts74, https://blackfire.io, by Blackfire

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

yellow1912 commented Mar 12, 2021

It seems like just putting the code inside Co\run(function () {}) is enough to cause the issue.

Without Co\run: (runs perfectly)

$listener = Pheanstalk::create(BEANSTALK_ADDRESS, BEANSTALK_PORT);
$job = $listener->watch('command')->reserveWithTimeout(10);
var_dump($job);

With Co\run: (timeout)

Co\run(function () {
    $listener = Pheanstalk::create(BEANSTALK_ADDRESS, BEANSTALK_PORT);
    $job = $listener->watch('command')->reserveWithTimeout(10);
    var_dump($job);
});

Also, after calling \Swoole\Runtime::enableCoroutine(), Swoole won't allow running the code unless it is inside Co\run.

Note: compiling Swoole without sockets support does seem to get around the connection timeout error. However, I still get some odd timeouts when I run the whole code from the first post of this thread inside a coroutine.

Note 2: after recompiling without sockets support, I still get the connection timeout error when I run pheanstalk's reserveWithTimeout inside Co\run. I'm quite sure this has something to do with Swoole, but I'm not sure how to debug or fix it.

Note 3: further debugging seems to indicate that the code below (inside pheanstalk) stalls at socket_read:

public function read(int $length): string
    {
        $this->checkClosed();

        $buffer = '';
        while (mb_strlen($buffer, '8BIT') < $length) {
            // my debugging shows that this socket_read always times out
            $result = socket_read($this->socket, $length - mb_strlen($buffer, '8BIT'));
            if ($result === false) {
                $this->throwException();
            }
            $buffer .= $result;
        }

        return $buffer;
    }

Interestingly enough, a similar method inside pheanstalk works fine:

public function getLine(): string
    {
        $this->checkClosed();

        $buffer = '';
        // Reading stops at \r or \n. In case it stopped at \r we must continue reading.
        while (substr($buffer, -1, 1) !== "\n") {
            // with PHP_NORMAL_READ, socket_read seems to work just fine. Strange.
            $result = socket_read($this->socket, 1024, PHP_NORMAL_READ);
            if ($result === false) {
                $this->throwException();
            }
            $buffer .= $result;
        }

        return rtrim($buffer);
    }
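
To make the difference between the two calls concrete, here is a small sketch that runs without Swoole or a beanstalk server. It assumes the sockets extension is loaded and uses a local socket pair. demoReadModes is a hypothetical name. It shows only how the two socket_read modes behave in plain PHP, not why the default mode stalls under Swoole.

```php
<?php
declare(strict_types=1);

/**
 * Read one beanstalk-style reply from a local socket pair using both
 * socket_read modes. Returns [header line, body].
 */
function demoReadModes(): array
{
    $pair = [];
    if (!socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $pair)) {
        throw new RuntimeException(socket_strerror(socket_last_error()));
    }
    [$writer, $reader] = $pair;

    // Reply shaped like beanstalk's "RESERVED <id> <bytes>" plus a 5-byte body.
    socket_write($writer, "RESERVED 1 5\r\nhello\r\n");

    // Line mode: each call stops at \r or \n, so keep reading until \n arrives.
    $line = '';
    while (substr($line, -1) !== "\n") {
        $part = socket_read($reader, 1024, PHP_NORMAL_READ);
        if ($part === false || $part === '') {
            throw new RuntimeException('line read failed');
        }
        $line .= $part;
    }

    // Binary mode (the default): returns up to the requested number of bytes,
    // so loop until the whole body has arrived.
    $body = '';
    while (strlen($body) < 5) {
        $chunk = socket_read($reader, 5 - strlen($body));
        if ($chunk === false || $chunk === '') {
            throw new RuntimeException('binary read failed');
        }
        $body .= $chunk;
    }

    socket_close($writer);
    socket_close($reader);

    return [rtrim($line), $body];
}

var_dump(demoReadModes());
```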

yellow1912 commented

I can confirm that, whether Swoole is compiled with or without sockets, reading from the socket in the default mode always times out:

No sockets:

Swoole => enabled
Author => Swoole Team <team@swoole.com>
Version => 4.6.4
Built => Mar 13 2021 11:09:50
coroutine => enabled with boost asm context
epoll => enabled
eventfd => enabled
signalfd => enabled
cpu_affinity => enabled
spinlock => enabled
rwlock => enabled
openssl => OpenSSL 1.1.1j  16 Feb 2021
dtls => enabled
http2 => enabled
json => enabled
curl-native => enabled
pcre => enabled
zlib => 1.2.11
mutex_timedlock => enabled
pthread_barrier => enabled
futex => enabled

Code:

\Swoole\Runtime::enableCoroutine();
Co\run(function () {
    go(function () {
        $listener = Pheanstalk::create(BEANSTALK_ADDRESS, BEANSTALK_PORT);
        $listener->useTube('command');
        $job = $listener->watchOnly('command')->reserveWithTimeout(10);
    });
});

The timeout happens here inside pheanstalk, as mentioned in the previous post:

$result = socket_read($this->socket, $length - mb_strlen($buffer, '8BIT'));

Running the code outside of Co\run does not result in that issue. What should I do?
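
One way to narrow this down, offered as a sketch and not as a confirmed diagnosis: since the code only fails inside Co\run, it may help to print which runtime hooks are active there. enabledHooks is a hypothetical helper. The assumption is that the installed Swoole provides Swoole\Runtime::getHookFlags() and the SWOOLE_HOOK_* constants. SWOOLE_HOOK_SOCKETS in particular is assumed to exist only on newer builds, which is why every constant is checked with defined().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return the names of the given hook constants
 * whose bit is set in $flags. Undefined constants are skipped.
 */
function enabledHooks(int $flags, array $names): array
{
    $enabled = [];
    foreach ($names as $name) {
        if (defined($name) && ($flags & constant($name)) !== 0) {
            $enabled[] = $name;
        }
    }

    return $enabled;
}

if (extension_loaded('swoole')) {
    \Swoole\Coroutine\run(function () {
        // Hook flags as seen from inside the coroutine container.
        $flags = \Swoole\Runtime::getHookFlags();
        print_r(enabledHooks($flags, [
            'SWOOLE_HOOK_TCP',
            'SWOOLE_HOOK_STREAM_FUNCTION',
            'SWOOLE_HOOK_SOCKETS',
        ]));
    });
}
```

If one of the listed hooks is active, rerunning the test with that bit cleared through Swoole's hook_flags setting would show whether the hook is involved.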

yellow1912 commented

Hello, I know everyone is busy. Is there anything that I can do to make debugging easier?

yellow1912 commented

I can confirm that if I change the code from

$result = socket_read($this->socket, $length - mb_strlen($buffer, '8BIT'));

to

$result = socket_read($this->socket, $length - mb_strlen($buffer, '8BIT'), PHP_NORMAL_READ);

Then I will not get the timeout issue.

Any idea how to fix it? This code belongs to the pheanstalk library, and it works fine when Swoole is not enabled.

yellow1912 commented

Okie, so the fixes I found are:

  1. Move the pheanstalk listener outside of Co\run
  2. Or, use another library that works better with swoole: https://github.com/xpader/swbeanstalk
