Error executing query: context deadline exceeded when backend error #501

Closed
hailong1004 opened this issue Feb 21, 2022 · 4 comments

@hailong1004

My promxy.yaml configuration:

global:
  evaluation_interval: 5s
  external_labels:
    source: promxy
promxy:
  server_groups:
    - static_configs:
        - targets:
          - myhost1
          - myhost2
      ignore_error: true

My start command line:

./promxy-v0.0.75-linux-amd64 \
  --config=promxy.yaml \
  --log-level=debug \
  --bind-addr=:9088 \
  --query.timeout=5s \
  --web.read-timeout=5s \
  --http.shutdown-timeout=5s \
  --rules.alert.resend-delay=5s \
  --rules.alert.for-grace-period=5s \
  --rules.alert.for-outage-tolerance=5s \
  --query.lookback-delta=5s

When the vm service on myhost2 fails but its TCP port is still open:

# telnet myhost2 80
Trying 2.3.4.5...
Connected to 2.3.4.5.
Escape character is '^]'.


# time curl -I http://myhost2/metrics
curl: (56) Recv failure: Connection reset by peer

real	2m50.354s
user	0m0.008s
sys	0m0.003s

promxy returns the error: Error executing query: context deadline exceeded

But if port 80 on myhost2 is closed, promxy ignores the data from myhost2 right away. Is this a bug in promxy?

jacksontj added a commit that referenced this issue Mar 28, 2022
Add ability to set timeout on downstream separate from dial

Fixes #501
@jacksontj
Owner

Thanks for the report! This is a bit of an odd edge case that TBH we haven't run into yet. Generally when the downstream dies the socket goes down as well, and in that case the query responds correctly. Here the socket is still live, so the query is sent but a response never arrives. To handle that we'd basically need to enforce a separate timeout that triggers an error before the full request timeout.

I have created #507 to add such an option -- but it'll require configuring a timeout for the response headers. I would recommend instead seeing if there is a way to get the TCP port to not be open when the service is down (not sure how you are doing that, maybe some LB?)

@jacksontj
Owner

Actually, after submitting that PR I noticed that there already is a Timeout option that does this. So if you also set the Timeout option, it'll handle this the way you seem to want. I'll again call out that I'd suggest getting the downstream to fail in such a way that the socket isn't left open -- but this is a workable solution.

@alessandroniciforo
Contributor

Hi @jacksontj. I'm experiencing what seems to be a similar case: a downstream Prometheus server had some issues and was accepting new connections but never returning a response. Promxy kept timing out at 30 seconds. I'd like to lower this timeout. How can I set the Timeout option you mentioned?

@alessandroniciforo
Contributor

Never mind, I found the option. I opened #524 to add it to the sample config.yaml file. Thanks for the good work.
