Missing requests in metrics #105

samuelg · 2020-05-29T18:41:49Z

We recently noticed that swagger-stats appears to be missing some requests in the exposed swagger-stats/metrics route. The application in question implements its OpenAPI definition with express-openapi. We also have metrics from the HA proxies fronting the application, as well as a bunyan middleware within the same application that logs all requests and reports request-finish events along with a status code.

Looking at the latest metrics, we see a little over 1.7M 200s and 100K 401s being reported by HA proxy, whereas swagger-stats is reporting a little less than 700K 200s and only 282 401s. The logs written by the bunyan middleware match what HA proxy is reporting.

We are looking specifically at the api_request_total metrics.

Our package versions are:

{
  "bunyan": "^1.8.12",
  "bunyan-middleware": "^1.0.0",
  "express": "^4.17.1",
  "express-async-handler": "^1.1.4",
  "express-openapi": "^6.0.0",
  "swagger-stats": "^0.95.17"
}

We are configuring the swagger-stats middleware as follows:

app.use(
  swStats.getMiddleware({
    swaggerSpec: apiDoc,
    swaggerOnly: true,
    authentication: true,
    onAuthenticate: (req, username, password) => {
      // check auth for metrics
    },
  })
);

In case it helps, here is how we are configuring the bunyan middleware:

app.use(
  bunyanMiddleware({
    obscureHeaders: [],
    logger,
    requestStart: true,
  })
);

We are using version 12.16.2 of Node.js.

The application where swagger-stats is installed has been running for 10 days straight without a restart, so we don't believe the gap is caused by stats being reset on restart.

The issue is also not that some routes are missing from the swagger definition and therefore untracked: we do see some metrics for all routes, just nowhere near as many as we would expect.

The swagger-stats/metrics route is being scraped by Prometheus against the application directly rather than through HA Proxy, so we do not believe that is to blame for the discrepancies either, nor would that explain the difference in 401s we are seeing.

We are at a loss as to why swagger-stats appears to be missing so many requests.

Any help tracking this down would be appreciated.


sv2 · 2020-05-29T20:24:47Z

Is it possible to run the app for some time with swagger-stats debug enabled, using the env variable DEBUG=sws:* ?
If you could provide the captured debug log, that could help to spot the issue.

As you mentioned HA Proxy: are you by any chance running multiple nodes of the app behind HA Proxy used as a load balancer? If so, each node would see only a subset of all API requests.

samuelg · 2020-06-01T13:19:24Z

We can work to add the debug ENV var to our deployment to gather logs.

Yes, we do have more than one node behind the HA proxy, but metrics are being collected by a single Prometheus and we are looking at a sum of the metrics across nodes. We also have a single Logstash server receiving logs from all nodes, and the counts there match the counts from the HA proxy.

samuelg · 2020-06-02T17:58:17Z

After deploying the application with swagger-stats debugging enabled and analyzing the logs, it appears the issue is that we are using swaggerOnly: true in our config, but some of our clients (which we do not control) are using a trailing / when accessing some of our API routes. These requests are being routed correctly by express-openapi, but because the path is not an exact match for the definition due to the trailing /, swagger-stats is not tracking them. Is that expected? If the route is part of the OpenAPI definition and is being routed correctly, I would expect swagger-stats to track it. For now we're going to disable swaggerOnly as a workaround.
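To illustrate the mismatch described above, here is a minimal sketch of exact path comparison versus comparison after stripping a trailing slash. This is not swagger-stats code: `normalizePath`, `matchesExact`, `matchesNormalized`, and the path `/v1/example` are all hypothetical names made up for the illustration, and the library's real matching logic may differ.

```javascript
// Hypothetical helper (not part of swagger-stats): drop trailing slashes,
// but keep the root path "/" intact.
function normalizePath(path) {
  const stripped = path.replace(/\/+$/, '');
  return stripped === '' ? '/' : stripped;
}

// Hypothetical spec path used only for illustration.
const specPath = '/v1/example';

// Exact comparison: a trailing slash makes the request path differ.
function matchesExact(requestPath) {
  return requestPath === specPath;
}

// Comparison after normalization: the trailing slash is ignored.
function matchesNormalized(requestPath) {
  return normalizePath(requestPath) === specPath;
}

console.log(matchesExact('/v1/example/'));      // false
console.log(matchesNormalized('/v1/example/')); // true
```

With exact comparison, a request that the router still serves would be treated as outside the definition, which is consistent with what we see when swaggerOnly is true.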

sv2 · 2020-06-02T19:58:36Z

Thanks for letting me know! This calls for an improvement in matching. I will try to reproduce it and handle the trailing / properly. Just in case, can you give me a couple of examples of your routes?

samuelg · 2020-06-03T13:47:31Z

Here is a truncated view of our API definition:

/v1/tokens:
    post:
      operationId: createToken
/v1/sessions:
    post:
      operationId: createSession

We are seeing access to the following in the logs:

/v1/tokens
/v1/tokens/
/v1/sessions
/v1/sessions/

The ones with a trailing / are the ones missing from swagger-stats. None of these routes have parameters in the route path. All parameters are passed via a POST body or a header, depending on the route.
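As a rough sketch of how the four logged paths split under exact matching against the two defined paths, the snippet below separates them into tracked and untracked. The function `splitByExactMatch` is a hypothetical name for this illustration only, and exact string matching is an assumption about the behavior, not the actual swagger-stats implementation.

```javascript
// Paths from the truncated API definition above.
const specPaths = ['/v1/tokens', '/v1/sessions'];

// Paths observed in the logs.
const observedPaths = [
  '/v1/tokens',
  '/v1/tokens/',
  '/v1/sessions',
  '/v1/sessions/',
];

// Hypothetical helper: split observed paths by whether they are an exact
// string match for a path in the definition.
function splitByExactMatch(spec, observed) {
  const known = new Set(spec);
  const tracked = [];
  const untracked = [];
  for (const path of observed) {
    (known.has(path) ? tracked : untracked).push(path);
  }
  return { tracked, untracked };
}

const result = splitByExactMatch(specPaths, observedPaths);
console.log(result.tracked);   // [ '/v1/tokens', '/v1/sessions' ]
console.log(result.untracked); // [ '/v1/tokens/', '/v1/sessions/' ]
```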

sv2 · 2021-04-03T18:29:28Z

Addressed in v0.99.1

sv2 self-assigned this May 29, 2020

sv2 added bug question labels May 29, 2020

sv2 added a commit that referenced this issue Jun 8, 2020

Handle properly URL with trailing slash - #105

f1ad550

sv2 closed this as completed Apr 3, 2021
