Knex: Timeout acquiring a connection #4598

Closed · 2 tasks done
rjeevaram opened this issue Mar 20, 2024 · 13 comments
Labels: area:core, help, question

Comments

@rjeevaram

⚠️ Please verify that this question has NOT been raised before.

  • I checked and didn't find a similar issue

🛡️ Security Policy

📝 Describe your problem

I've recently been seeing the error below in Uptime Kuma. Has anyone faced this issue, and is there a solution?

Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?

📝 Error Message(s) or Log

Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?

🐻 Uptime-Kuma Version

1.23.11

💻 Operating System and Arch

Running as K8s pods on a Debian 11 node

🌐 Browser

Chrome 122.0.6261.94

🖥️ Deployment Environment

  • Runtime:
  • Database:
  • Filesystem used to store the database on:
  • number of monitors:
rjeevaram added the help label on Mar 20, 2024
CommanderStorm added the question and area:core labels on Mar 20, 2024
@CommanderStorm (Collaborator)

Please fill out the Deployment Environment to help us debug this.
In addition, what retention have you configured?

TL;DR:
In v1, reducing retention, the ping rate, etc., or moving to a faster storage solution is the only way to get around this problem.
In v2, we have likely resolved this problem. See #4500 for details of what needs to happen before we can release this version.
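
For context on where this message comes from: Knex hands out connections from a fixed-size pool, and the error fires when no connection becomes free within the acquire timeout. A minimal sketch of the relevant generic Knex options (illustrative only; the file path and values here are assumptions, not Uptime Kuma's actual configuration):

```js
// Generic Knex setup, sketched for illustration (not Uptime Kuma's real config).
const knex = require("knex")({
    client: "sqlite3",
    connection: { filename: "./data/kuma.db" }, // hypothetical path
    useNullAsDefault: true,
    // "Timeout acquiring a connection" is thrown when every pooled
    // connection stays busy (e.g. blocked on slow disk writes) for
    // longer than acquireConnectionTimeout (default: 60000 ms).
    pool: { min: 0, max: 10 },
    acquireConnectionTimeout: 60000,
});
```

On slow storage, every heartbeat write holds a connection longer, so the pool drains faster than it refills; raising the timeout only postpones the error, which is why the advice above targets retention and storage speed instead.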

@Lanhild: This comment was marked as off-topic.

@CommanderStorm: This comment was marked as off-topic.

@Lanhild: This comment was marked as off-topic.

@jiriteach: This comment was marked as duplicate.

@CommanderStorm (Collaborator)

I am going to close this as resolved by v2.0 as I don't see evidence to the contrary.

TL;DR:
In v1, reducing retention, the ping rate, etc., or moving to a faster storage solution is the only way to get around this problem.
In v2, we have likely resolved this problem. See #4500 for details of what needs to happen before we can release this version.

@jiriteach: This comment was marked as resolved.

@Lanhild: This comment was marked as resolved.

@thebiblelover7

@CommanderStorm Is waiting for v2 the only option? I've set my activity history to 7 days, have only 12 monitors, and have wiped the database, and I still have the same issues...

@CommanderStorm (Collaborator)

CommanderStorm commented Jul 13, 2024

@thebiblelover7 Your issue seems somewhat strange.

  • What storage are you running? (NFS, eMMC, SD card?) 12 monitors with 7 days of retention is way lower than anything reported by anybody else..
  • Could you check whether this button works? (Was your db created before v1.10?) A sketch of what the shrink amounts to follows below.
    [screenshot: the "Shrink Database" button]
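
For reference, shrinking a SQLite database boils down to a VACUUM, which rebuilds the file and reclaims free pages. A hedged sketch of the equivalent raw calls through Knex, given a configured knex instance like the one sketched earlier (the helper name is made up, and Uptime Kuma's actual "Shrink Database" button may do more than this):

```js
// Hedged sketch of what a manual SQLite shrink amounts to; not
// necessarily Uptime Kuma's implementation.
async function shrinkDatabase(knex) {
    // Enabling auto_vacuum on a database created without it only takes
    // effect after a full VACUUM, which is presumably why databases
    // created before the setting was introduced need this run once.
    await knex.raw("PRAGMA auto_vacuum = FULL");
    await knex.raw("VACUUM"); // rebuild the file and reclaim free pages
}
```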

@thebiblelover7

thebiblelover7 commented Jul 13, 2024

@CommanderStorm I'm running it on an Oracle Cloud instance, so I don't know what storage they use... but other servers I've run on their instances seem to run totally fine.

I've run Shrink Database multiple times, even though I believe my db was created after v1.10.

It has definitely been strange. Now I can't believe the notifications whenever I get spammed with 12 of them saying all my services are down because of this issue...

@CommanderStorm (Collaborator)

CommanderStorm commented Jul 14, 2024

v2 will likely not solve your particular deployment issue. If you have IO that slow, there is not much that can be done for you..
Maybe Oracle has a plan or configuration where you get more than a few disk reads a second (your IO must be very slow to run into this that early)..

Maybe the machine you are running on has defective IO.. (contact support if benchmarks differ too much..)
I really can't tell without further debugging on your end.. (IO benchmarks, lscpu, a link to Oracle's page for your instance type)
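
Since the thread leaves the actual benchmarking open: because SQLite issues an fsync per committed transaction, timing fsync from Node gives a quick sanity check of commit latency. A rough sketch (the temp-file name is arbitrary, and dedicated tools like fio give far better numbers):

```js
// Rough fsync-latency probe (sketch; "fsync-test.tmp" is an arbitrary name).
// SQLite fsyncs on every commit, so slow fsync directly stalls the Knex pool.
const fs = require("fs");

const ROUNDS = 100;
const fd = fs.openSync("./fsync-test.tmp", "w");
const start = process.hrtime.bigint();
for (let i = 0; i < ROUNDS; i++) {
    fs.writeSync(fd, "x");
    fs.fsyncSync(fd); // force the write to disk, like a SQLite commit
}
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
fs.closeSync(fd);
fs.unlinkSync("./fsync-test.tmp");
console.log(`${(elapsedMs / ROUNDS).toFixed(2)} ms per fsync`);
```

Low single-digit milliseconds per fsync is typical for local SSDs; tens of milliseconds or worse would be consistent with the pool running dry.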

@thebiblelover7

Well @CommanderStorm, I recreated the instance, set the IO as fast as I could, and used Ubuntu Minimal instead of Oracle Linux... we'll see how it goes now. So far (the last hour) it's working well.
