Multiproc doesn't capture all metrics #1089

Open
matt0x6F opened this issue Feb 12, 2025 · 2 comments

matt0x6F commented Feb 12, 2025

Hey folks,

I have a similar setup to what is described here. The only difference is that I have nginx in front talking to gunicorn over sockets, but as far as I can tell that doesn't matter.

I wasn't able to use this code:

# Using multiprocess collector for registry
def make_metrics_app():
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    return make_asgi_app(registry=registry)


metrics_app = make_metrics_app()
app.mount("/metrics", metrics_app)

With that mount, requesting /metrics returns a 307 redirect, so I implemented this instead:

@router.get("/metrics", response_class=PlainTextResponse, include_in_schema=False)
async def get_metrics() -> Response:
    """
    Get prometheus metrics
    """
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    data = generate_latest(registry)
    # media_type sets Content-Type; Starlette fills in Content-Length from the body
    return Response(content=data, media_type=CONTENT_TYPE_LATEST)

Apart from that, my setup is pretty typical. I have two gauges and a histogram:

AUTOMATIONS_TOTAL: Final = Gauge(
    name=prefix_("automations_total"),
    documentation="Number of automations run",
    labelnames=["platform", "action", "kind", "status"],
    multiprocess_mode="sum",
)

REQUEST_TIME_SECONDS: Final = Histogram(
    name=prefix_("request_time_seconds"),
    documentation="Time spent processing request",
    labelnames=["method", "url_rule", "status_code"],
)

REQUESTS_IN_PROGRESS_TOTAL: Final = Gauge(
    name=prefix_("requests_in_progress_total"),
    documentation="Number of concurrent requests",
    # See Metrics Tuning (Gauge)
    # https://github.com/prometheus/client_python#multiprocess-mode-gunicorn
    multiprocess_mode="sum",
)

My in-progress gauge always remains at zero, but I'm hoping that's because requests finish faster than I can see them. The histogram, however, never shows up: there is no file for it in my PROMETHEUS_MULTIPROC_DIR and it is never rendered in the /metrics output. Any ideas for what I could troubleshoot?
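One way to check what is actually being written is to list the multiprocess directory grouped by metric type. This is a minimal sketch, assuming the client names its files `<type>_<pid>.db` (with the gauge mode in the prefix, for example `gauge_sum_<pid>.db`); `summarize_multiproc_dir` is a hypothetical helper, not part of prometheus_client:

```python
import os
from collections import defaultdict
from pathlib import Path


def summarize_multiproc_dir(path):
    """Group the *.db files in a multiprocess directory by filename prefix.

    Assumption: files are named "<prefix>_<pid>.db", e.g. "histogram_1234.db"
    or "gauge_sum_1234.db". Files with any other extension are ignored.
    """
    groups = defaultdict(list)
    for db_file in sorted(Path(path).glob("*.db")):
        prefix, sep, pid = db_file.stem.rpartition("_")
        if not sep:
            # Unexpected name without an underscore: keep it under its own key.
            prefix, pid = db_file.stem, ""
        groups[prefix].append(pid)
    return dict(groups)


if __name__ == "__main__":
    multiproc_dir = os.environ.get("PROMETHEUS_MULTIPROC_DIR")
    if not multiproc_dir:
        print("PROMETHEUS_MULTIPROC_DIR is not set in this environment")
    else:
        for prefix, pids in summarize_multiproc_dir(multiproc_dir).items():
            print(f"{prefix}: {len(pids)} file(s), pids={pids}")
```

If no `histogram` entry appears after a few non-metrics requests, that would suggest the `.labels(...).observe(...)` call is never reached in any worker, rather than a problem on the collection side.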

matt0x6F (Author) commented

Some extra context: almost all of these metrics are updated in async middleware. Here's a rather benign one:

async def add_request_id(
    request: Request, call_next: Callable[[Request], Awaitable[Response]]
) -> Response:
    if not is_prometheus_endpoint(request):
        REQUESTS_IN_PROGRESS_TOTAL.inc()
        request.state.request_id = correlation_id.get()
        start_time = time.time()

    response = await call_next(request)

    if not is_prometheus_endpoint(request):
        time_taken = time.time() - start_time

        # Replace UUIDs with <uuid> so that unique URLs don't blow up label cardinality
        REQUEST_TIME_SECONDS.labels(
            method=request.method,
            url_rule=replace_uuids(request.url.path),
            status_code=response.status_code,
        ).observe(time_taken)
        REQUESTS_IN_PROGRESS_TOTAL.dec()

    return response
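The `replace_uuids` helper is not shown in the thread. A minimal sketch of what such a helper might look like, assuming canonical 8-4-4-4-12 hex UUIDs in the path (the real implementation may differ):

```python
import re

# Assumption: UUIDs appear in canonical hyphenated form, in either letter case.
_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def replace_uuids(path: str, placeholder: str = "<uuid>") -> str:
    """Collapse every UUID in a URL path into one placeholder label value."""
    return _UUID_RE.sub(placeholder, path)
```

With this, `/automations/<some-uuid>/run` maps to a single `url_rule` label value regardless of which UUID was requested, so the number of histogram series stays bounded by the number of routes.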

csmarchbanks (Member) commented

Hmm, I am not seeing anything obvious. Does REQUESTS_IN_PROGRESS show up in the multiproc dir?
