Record WebFlux server metrics for client disconnection #23606

Closed
@cdmatta

Description

Version: Spring Boot 2.3.4.RELEASE with WebFlux and Metrics

Scenarios:

  1. The client aborts the connection before the server sends a response.
  2. Request processing on the server takes too long (the server is configured with read/write timeouts), so the server closes the connection.

In both cases, a log entry is generated for the aborted connection (by the client or the server). However, no metrics are generated.

Project dependencies

<dependencies>
	<dependency>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-webflux</artifactId>
	</dependency>
	<dependency>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-actuator</artifactId>
	</dependency>
	<dependency>
		<groupId>io.micrometer</groupId>
		<artifactId>micrometer-registry-prometheus</artifactId>
	</dependency>
</dependencies>
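
import java.time.Duration;
import java.time.LocalDateTime;

import io.netty.channel.ChannelOption;
import io.netty.handler.timeout.ReadTimeoutHandler;
import io.netty.handler.timeout.WriteTimeoutHandler;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.web.embedded.netty.NettyReactiveWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

import static java.util.concurrent.TimeUnit.MILLISECONDS;

@SpringBootApplication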
@RestController
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }

    @GetMapping("/five")
    public Mono<String> five() {
        LocalDateTime start = LocalDateTime.now();
        return Mono.just("start")
                .delayElement(Duration.ofSeconds(5))
                .then(Mono.defer(() -> Mono.just("Call took " + Duration.between(start, LocalDateTime.now()).getSeconds() + " seconds")));
    }

    @GetMapping("/twenty")
    public Mono<String> forever() {
        LocalDateTime start = LocalDateTime.now();
        return Mono.just("start")
                .delayElement(Duration.ofSeconds(20))
                .then(Mono.defer(() -> Mono.just("Call took " + Duration.between(start, LocalDateTime.now()).getSeconds() + " seconds")));
    }

    @GetMapping("/fifty")
    public Mono<String> loopy() {
        LocalDateTime start = LocalDateTime.now();
        return Mono.fromRunnable(() -> {
                    for (int i = 0; i < 10; i++) {
                        System.out.println("Iteration " + i);
                        try {
                            Thread.sleep(5_000);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                            System.out.println("Thread was interrupted");
                        }
                    }
                }
        ).then(Mono.defer(() -> Mono.just("Call took " + Duration.between(start, LocalDateTime.now()).getSeconds() + " seconds")));
    }

    @Bean
    public WebServerFactoryCustomizer<NettyReactiveWebServerFactory> serverFactoryCustomizer() {
        return new NettyTimeoutCustomizer();
    }

    class NettyTimeoutCustomizer implements WebServerFactoryCustomizer<NettyReactiveWebServerFactory> {
        @Override
        public void customize(NettyReactiveWebServerFactory factory) {
            int timeout = 10_000;
            factory.addServerCustomizers(server -> server.tcpConfiguration(tcp ->
                    tcp.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, timeout)
                            .doOnConnection(connection ->
                                    connection
                                            .addHandlerLast(new WriteTimeoutHandler(timeout, MILLISECONDS))
                                            .addHandlerLast(new ReadTimeoutHandler(timeout, MILLISECONDS))
                            )));
        }
    }
}

Use JMeter for the tests below.

  1. Call the /five endpoint. Abort the call before 5 seconds have elapsed.
    Result -
    No metrics are generated.

  2. Call the /twenty endpoint.
    Result -
    After 10 seconds, the client sees the connection closed: org.apache.http.NoHttpResponseException: 127.0.0.1:8080 failed to respond.
    No metrics are generated.

  3. Call the /fifty endpoint.
    Result -
    Strangely enough, the call succeeds after 50 seconds with the response "Call took 50 seconds".
    A metric is generated.
    But why does the server not close the connection? The read/write timeout is configured for 10 seconds.

  4. Call the /fifty endpoint. Abort the call after 10 seconds.
    Result -
    The iterating/sleeping thread is not interrupted, and the internal loop runs to completion.
    An exception appears in the log only after the entire loop has finished.
    The metric looks like http_server_requests_seconds_max{exception="IOException",method="GET",outcome="SUCCESS",status="200",uri="/fifty",} 50.157466
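
A possible explanation for tests 3 and 4, offered as an assumption rather than something confirmed in this issue: if the /fifty handler's Thread.sleep runs on the same Netty event-loop thread that is supposed to run the timeout checks, those checks cannot fire until the blocking loop finishes. The sketch below reproduces the effect with plain JDK classes only. The single-threaded scheduler stands in for an event loop, and the class and method names (BlockedEventLoopDemo, timeoutFiredAfterMillis) are hypothetical; it does not use Netty or WebFlux.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class BlockedEventLoopDemo {

    /**
     * Schedules a "timeout" task on a single-threaded loop, then blocks that
     * same thread with a sleeping task. Returns how many milliseconds after
     * scheduling the timeout task actually ran.
     */
    public static long timeoutFiredAfterMillis(long timeoutMillis, long blockMillis)
            throws InterruptedException {
        // Stand-in for an event loop: exactly one thread runs every task.
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        AtomicLong firedAt = new AtomicLong();
        CountDownLatch fired = new CountDownLatch(1);
        long start = System.nanoTime();
        try {
            // Stand-in for the timeout check, due after timeoutMillis.
            loop.schedule(() -> {
                firedAt.set(System.nanoTime());
                fired.countDown();
            }, timeoutMillis, TimeUnit.MILLISECONDS);

            // Stand-in for the blocking handler: occupies the only thread.
            loop.execute(() -> {
                try {
                    Thread.sleep(blockMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            // Wait until the timeout task has finally been able to run.
            fired.await();
        } finally {
            loop.shutdownNow();
        }
        return TimeUnit.NANOSECONDS.toMillis(firedAt.get() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        long firedAfter = timeoutFiredAfterMillis(100, 500);
        System.out.println("Timeout configured for 100 ms fired after " + firedAfter + " ms");
    }
}
```

If this assumption holds for the reproducer, the timeout task is delayed until the blocking task releases the thread, which would match the 50-second response in test 3 and the late exception in test 4.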
