Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Total connections should stay positive #360

Merged
merged 12 commits into from
Jan 7, 2019
Merged

Total connections should stay positive #360

merged 12 commits into from
Jan 7, 2019

Conversation

kvosper
Copy link
Contributor

@kvosper kvosper commented Dec 14, 2018

A fix for #351

@@ -149,6 +150,7 @@ public void configure(Channel channel, HttpHandler httpPipeline) {
.addLast("connection-throttler", excessConnectionRejector)
.addLast("idle-handler", new IdleStateHandler(serverConfig.requestTimeoutMillis(), 0, serverConfig.keepAliveTimeoutMillis(), MILLISECONDS))
.addLast("channel-stats", channelStatsHandler)
.addLast("channel-connection-stats", new ConnectionCountHandler(metrics))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember the root cause for this bug: An ExcessConnectionRejector causes a channelInactive event being sent without preceding a channelActive. Therefore, we should fix the ExcessConnectionRejector and ensure channelActive is always sent before channelInactive.

For example, I sketched something like this:

public class ExcessConnectionRejector extends ChannelInboundHandlerAdapter {
    private static final Logger LOGGER = getLogger(ExcessConnectionRejector.class);
    private final ChannelGroup channelGroup;
    private final int maxConnectionsCount;
    private boolean channelActivated;

    public ExcessConnectionRejector(ChannelGroup channelGroup, int maxConnectionsCount) {
        checkArgument(maxConnectionsCount > 0);
        this.channelGroup = requireNonNull(channelGroup);
        this.maxConnectionsCount = maxConnectionsCount;
    }

    @Override
    public void channelRegistered(ChannelHandlerContext ctx) throws Exception {
        if (channelGroup.size() >= maxConnectionsCount) {
            LOGGER.warn("Max allowed connection  to server exceeded: current={} configured={}", channelGroup.size(), maxConnectionsCount);
            ctx.close();
            return;
        }
        channelGroup.add(ctx.channel());
        super.channelRegistered(ctx);
    }

    @Override
    public void channelActive(ChannelHandlerContext ctx) throws Exception {
        LOGGER.info("channel active");
        channelActivated = true;
        super.channelActive(ctx);
    }

    @Override
    public void channelInactive(ChannelHandlerContext ctx) throws Exception {
        LOGGER.info("channel inactive");
        if (channelActivated) {
            super.channelInactive(ctx);
        }
    }
    
}

You would of course make ExcessConnectionRejector non-shared. You can also move the event order enforcing logic out into a separate Netty handler if you so wish.

With this approach you don't need to widen the scope of this PR onto other non-relevant classes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not entirely true - the channelActivated field needs to be in a non-shared handler, but the channel count needs to be shared. However, it is true that the implementation can be simplified and I will do so.

(Mikko is no longer reviewing this PR, but I am leaving this comment so that other reviewers understand that I have taken the words into account.)

if (eventProcessor != null) {
eventProcessor.submit(new ChannelInactiveEvent());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes can be backed off after fixing the Excess Connection Rejector which will enforce the correct event ordering.


super.channelInactive(ctx);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class can also be left alone.

Copy link
Contributor

@mikkokar mikkokar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing from Request Changes to Comment as I'm disappearing tomorrow.

@kvosper
Copy link
Contributor Author

kvosper commented Jan 2, 2019

I have made some changes, but it appears to intermittently fail when running the make e2e, on a test that passes in my IDE. I may have to close the PR and reopen later.

@kvosper
Copy link
Contributor Author

kvosper commented Jan 3, 2019

Regarding my earlier comment, it looks like the test client will sometimes see the failure to connect as TransportLostException instead of BadHttpResponseException and the test did not account for that. This has now been fixed.

@@ -37,11 +37,11 @@
public class ChannelStatisticsHandler extends ChannelDuplexHandler {
private static final Logger LOGGER = LoggerFactory.getLogger(ChannelStatisticsHandler.class);

public static final String PREFIX = "connections";
static final String PREFIX = "connections";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are some of these static fields package-private and others private? I don't see all of them used in tests or elsewhere outside the class... For the ones that are used in tests (e.g., RECEIVED_BYTES), do we usually annotate them with VisibleForTesting ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done using the suggestions from my IDE. I will see if anything can be done to clean it up. Usually for our tests, we don't reference constants, but just use the literal values.

Copy link
Contributor

@dvlato dvlato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue that our excessConnectionRejector sends a channelInactive without a corresponding channelActive event is not fixed at all, is it? I dont'see many differences with how Netty deals with it in similar code but we might want to add parts of what they are doing (e.g., removing the handler from the pipeline). I don't know if we have looked into what the state of the network is in 'channelRegistered' , it might make sense to move the code to channelActive...
https://github.com/netty/netty/blob/22b2c4c3b8be75ad3ea97f115d1bc56fa46069d2/handler/src/main/java/io/netty/handler/ipfilter/AbstractRemoteAddressFilter.java

@kvosper kvosper merged commit 0cb3c04 into ExpediaGroup:master Jan 7, 2019
@kvosper kvosper deleted the total-connections-should-stay-positive branch January 7, 2019 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants