Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hung connection #135

Closed
fernandomacho opened this issue Oct 15, 2023 · 12 comments · Fixed by #138
Closed

hung connection #135

fernandomacho opened this issue Oct 15, 2023 · 12 comments · Fixed by #138
Labels

Comments

@fernandomacho
Copy link

Hi I try to update to last version 0.8.13 from 0.8.12

After install some times receive this error

Oct 15 00:39:59 server-name amqproxy[3491300]: Upstream closed connection: CHANNEL_ERROR - expected 'channel.open'
Oct 15 00:39:59 server-name amqproxy[3491300]: Upstream connection closed when returned
Oct 15 00:46:01 server-name amqproxy[3491300]: Upstream closed connection: CHANNEL_ERROR - expected 'channel.open'
Oct 15 00:46:01 server-name amqproxy[3491300]: Upstream connection closed when returned
Oct 15 01:17:24 server-name amqproxy[3491300]: Unhandled exception in spawn(name: upstream read loop localhost:5672): AMQ::Protocol::Frame::Body-end was 123, expected 206 (AMQ::Protocol::Error::InvalidFrameEnd)
Oct 15 01:17:25 server-name amqproxy[3491300]: from tmp/amqproxy/lib/amq-protocol/src/amq/protocol/frames.cr:43:13 in 'read_loop'
Oct 15 01:17:25 server-name amqproxy[3491300]: from tmp/amqproxy/src/amqproxy/upstream.cr:34:7 in '->'
Oct 15 01:17:25 server-name amqproxy[3491300]: from usr/share/crystal/src/fiber.cr:146:11 in 'run'
Oct 15 01:17:25 server-name amqproxy[3491300]: from usr/share/crystal/src/fiber.cr:98:34 in '->'
Oct 15 01:17:25 server-name amqproxy[3491300]: from ???

and in other times get

Oct 15 02:53:00 server-name amqproxy[3491300]: Receiving AMQ::Protocol::Frame::Basic::Deliver(@channel=1, @bytesize=37, @consumer_tag="consumer", @delivery_tag=882, @redelivered=false, @eXchange="logger", @routing_key="unir-ia") but no client to delivery to
Oct 15 02:53:00 server-name amqproxy[3491300]: Receiving AMQ::Protocol::Frame::Header(@channel=1, @bytesize=15, @class_id=60, @weight=0, @body_size=483, @properties=AMQ::Protocol::Properties(@timestamp_raw=nil, @content_type=nil, @content_encoding=nil, @headers=nil, @delivery_mode=2, @priority=nil, @correlation_id=nil, @reply_to=nil, @Expiration=nil, @message_id=nil, @type=nil, @user_id=nil, @app_id=nil, @reserved1=nil)) but no client to delivery to
Oct 15 02:53:00 server-name amqproxy[3491300]: Receiving AMQ::Protocol::Frame::Body(@channel=1, @bytesize=483, @body_size=483, @Body=#<TCPSocket:fd 47>) but no client to delivery to
Oct 15 03:15:33 server-name amqproxy[3491300]: Upstream closed connection: CHANNEL_ERROR - expected 'channel.open'

But most of the time the connection simply was hangs.

I am using php-amqplib version v3.5.4. Once the connection is opened, it remains "hanging" on any operation performed on the channel, for example exchange_declare.

I try debug amqproxy running:

./amqproxy --debug -l 135.125.87.151 -p 5674 amqp://localhost:5672
but not get any error only connect and disconnect (when it works)

The rabbit version I am using is 3.12.6

I have reverted to version 0.8.12 which works correctly.

@alupher
Copy link

alupher commented Oct 15, 2023

We're seeing this issue too, it makes amqproxy unusable for us. We also had to downgrade to v0.8.12 to fix it.

@dentarg
Copy link
Member

dentarg commented Oct 16, 2023

Thanks for reporting, we will take a look.

@dentarg
Copy link
Member

dentarg commented Oct 16, 2023

@alupher @fernandomacho Feel free to share any code snippets that triggers the problem, as that would help making quicker progress on this

@dentarg
Copy link
Member

dentarg commented Oct 16, 2023

Might be that we're missing a call to client_disconnected

def client_disconnected
@current_client = nil
return if closed?
@lock.synchronize do
@open_channels.each do |ch|
if @unsafe_channels.includes? ch
close = AMQ::Protocol::Frame::Channel::Close.new(ch, 200_u16, "Client disconnected", 0_u16, 0_u16)
close.to_io @socket, IO::ByteFormat::NetworkEndian
@socket.flush
end
end
end
end

in the ensure block

ensure
@socket.close unless @socket.closed?
client.close_socket if client
end

(or should we only disconnect clients when IO::Error / OpenSSL::SSL::Error happens?)

@fernandomacho
Copy link
Author

fernandomacho commented Oct 16, 2023

Hi I not use SSL

For example this code:

$connection = new AMQPStreamConnection( $host, $port, $username, $password, $vhost, false, /*insists*/ 'AMQPLAIN', /*login method*/ null, /*login response*/ null, /*locale*/ 602.0, /*connection timeout*/ 602.0, /*read write timeout */ null, /* context */ true, /* keepalive */ 60, /* hearbeat */ 602.0, /*channel_rpc_timeout*/ null /* SSL */ ); $channel = $connection->channel(); $channel->basic_qos(null, $prefetch, null);

This problem actually appears when any operation is performed on the channel

In some cases it works correctly, but in most cases the connection hangs and you have to kill the process that opens the connection

@dentarg
Copy link
Member

dentarg commented Oct 16, 2023

should we only disconnect clients when IO::Error / OpenSSL::SSL::Error happens?

Maybe? with v0.8.13 (#128) we started to close the client socket also when this happened

if frame.is_a? AMQ::Protocol::Frame::Connection::Close
@log.error "Upstream closed connection: #{frame.reply_text}"
begin
write AMQ::Protocol::Frame::Connection::CloseOk.new
rescue ex : WriteError
@log.error "Error writing CloseOk to upstream: #{ex.inspect}"
end
return

@dentarg
Copy link
Member

dentarg commented Oct 16, 2023

@fernandomacho (or anyone else) do you have more complete logs to share? You can share via https://www.cloudamqp.com/support.html if you don't want to share them here (mention this issue)

@fernandomacho
Copy link
Author

fernandomacho commented Oct 16, 2023

Hi, no more log, I try run amqproxy with -d param and not log anything
If you generate debug version I would try reproduce this error, with debug version

@TristanPouliquen
Copy link

Hi, following up on this issue as we also encountered issues where the update to 0.8.13 put our production environment down.

The symptoms were:

  • many connections being opened to our server that opened connections to amqproxy (webhook endpoints)
  • long waiting times which means that the PHP processes could not close
  • the PHP FPM worker pool was exhausted and our server could not respond anymore to additional requests

Sadly we do not have detailed logs around that as we had to act quickly to restore services.

@dentarg
Copy link
Member

dentarg commented Oct 17, 2023

Okay, I'm able to reproduce with v0.8.13

include(__DIR__ . '/config.php');

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Exchange\AMQPExchangeType;
use PhpAmqpLib\Message\AMQPMessage;

$exchange = 'router';
$queue = 'msgs';
$i = 0;

while ($i <= ATTEMPTS) {
    $connection = new AMQPStreamConnection(HOST, PORT, USER, PASS, VHOST);
    echo "Connection created" . PHP_EOL;
    $channel = $connection->channel();
    $channel->queue_declare($queue, false, true, false, false);
    $channel->exchange_declare($exchange, AMQPExchangeType::DIRECT, false, true, false);
    $channel->queue_bind($queue, $exchange);

    $msg = new AMQPMessage($i++, array('content_type' => 'text/plain'));
    $channel->basic_publish($msg, $exchange);

    echo "Message published: " . $msg->body . PHP_EOL;

    usleep(500000); // 0.5 seconds

    $channel->close();
    $connection->close();
    echo "Connection closed" . PHP_EOL;
}
integration-php-amqplib-amqproxy-1   | 2023-10-17 09:05:33 UTC: Proxy upstream: rabbitmq:5672
integration-php-amqplib-amqproxy-1   | 2023-10-17 09:05:33 UTC: Proxy listening on 0.0.0.0:5673
integration-php-amqplib-amqproxy-1   | 2023-10-17 09:05:34 UTC: Client connected: 172.26.0.5:60272
integration-php-amqplib-conns-1      | Connection created
integration-php-amqplib-rabbitmq-1   | 2023-10-17 09:05:34.006434+00:00 [info] <0.703.0> accepting AMQP connection <0.703.0> (172.26.0.4:33100 -> 172.26.0.2:5672)
integration-php-amqplib-rabbitmq-1   | 2023-10-17 09:05:34.008802+00:00 [info] <0.703.0> connection <0.703.0> (172.26.0.4:33100 -> 172.26.0.2:5672) has a client-provided name: AMQProxy 0.8.13
integration-php-amqplib-rabbitmq-1   | 2023-10-17 09:05:34.008802+00:00 [info] <0.703.0>
integration-php-amqplib-rabbitmq-1   | 2023-10-17 09:05:34.009979+00:00 [info] <0.703.0> connection <0.703.0> (172.26.0.4:33100 -> 172.26.0.2:5672 - AMQProxy 0.8.13
integration-php-amqplib-rabbitmq-1   | 2023-10-17 09:05:34.009979+00:00 [info] <0.703.0> ): user 'guest' authenticated and granted access to vhost '/'
integration-php-amqplib-conns-1      | Message published: 0
integration-php-amqplib-amqproxy-1   | 2023-10-17 09:05:34 UTC: Client disconnected: 172.26.0.5:60272
integration-php-amqplib-conns-1      | Connection closed
integration-php-amqplib-amqproxy-1   | 2023-10-17 09:05:34 UTC: Client connected: 172.26.0.5:60282
integration-php-amqplib-conns-1      | Connection created
<hung here>

with amproxy v0.8.12 the code above continues to publish messages

@akhomyakov-marfatech
Copy link

I also faced this problem.
Fortunately, it occurred when setting up new application nodes. Spent 40 minutes debugging.
If we had updated on all nodes at once, we would have stopped production.

Anyway, guys, thank you for this app.
I really like amqproxy. It has surprisingly improved the performance of the application greatly.

dentarg added a commit that referenced this issue Oct 20, 2023
Fix regression introduced by #128

Close #135
@dentarg dentarg added the bug label Oct 20, 2023
@dentarg
Copy link
Member

dentarg commented Oct 20, 2023

Sorry about that @akhomyakov-marfatech

v0.8.14 has been released addressing this issue, please give it a try and let us know how it goes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants