Any specific reason why uptime-kuma does not support NFS file systems #2668

Closed
2 tasks done
Maven35 opened this issue Jan 24, 2023 · 20 comments

@Maven35

Maven35 commented Jan 24, 2023

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find similar issue

🛡️ Security Policy

📝 Describe your problem

I'm running uptime-kuma in a Kubernetes environment, and I'm wondering what the specific limitations of running with NFS storage are. I've also noticed it contains an embedded SQLite database; is there any discussion or thought about using an external database or adding support for one? Also, are there any plans for dedicated Kubernetes Helm charts? It would be cool to have this product in the CNCF, since it's pretty awesome.

🐻 Uptime-Kuma Version

1.19.6

💻 Operating System and Arch

debian

🌐 Browser

chrome

🐋 Docker Version

kubernetes 1.21

🟩 NodeJS Version

No response

@Maven35 Maven35 added the help label Jan 24, 2023
@Maven35
Author

Maven35 commented Jan 24, 2023

Also, are there any plans to make high availability possible for uptime-kuma? Currently, with the single container and built-in database, it does not seem possible for me to run multiple instances with the same data.

@yitsushi

I have Uptime Kuma running on my clusters with nfs-subdir-external-provisioner, and it has been running that way ever since I first deployed it. The oldest timestamp on the filesystem is 2021-10-12 17:11:48.039084132 +0000.

HA would be awesome. To be precise, I ended up on this issue while looking for an existing issue about supporting psql or something similar as the database instead of SQLite.

@Maven35
Author

Maven35 commented Feb 28, 2023 via email

@AndrewKvalheim

AndrewKvalheim commented Apr 23, 2023

“due to SQLite”. Its FAQ warns:

SQLite uses reader/writer locks to control access to the database. But use caution: this locking mechanism might not work correctly if the database file is kept on an NFS filesystem. This is because fcntl() file locking is broken on many NFS implementations.

The documentation elaborates:

SQLite uses POSIX advisory locks to implement locking on Unix. SQLite assumes that these system calls all work as advertised. If that is not the case, then database corruption can result. One should note that POSIX advisory locking is known to be buggy or even unimplemented on many NFS implementations (including recent versions of Mac OS X) and that there are reports of locking problems for network filesystems under Windows. Your best defense is to not use SQLite for files on a network filesystem.
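
The advisory locking described in the quote can be probed on a given mount with a small script. This is only a rough sketch, not something from the SQLite or Uptime Kuma docs: the function name `advisory_locks_work` and the probe file name are made up for illustration. It takes an exclusive `fcntl()` lock in one process and checks that a second process is refused the same lock. A passing probe does not prove a filesystem is safe for SQLite; a failing one is a strong hint that it is not.

```python
import fcntl
import os
import subprocess
import sys

# Source of the second process. POSIX locks are per-process, so the
# conflicting attempt has to come from a different process.
CHILD_SOURCE = """
import fcntl, sys
with open(sys.argv[1], "r+b") as f:
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        sys.exit(0)  # refused: the first lock is being honoured
    sys.exit(1)      # granted twice: locking is not being honoured
"""


def advisory_locks_work(directory):
    """Return True if a second process is refused a lock we already hold.

    `directory` is the path to test, e.g. the NFS-backed data directory.
    The probe file name is a hypothetical placeholder.
    """
    probe_path = os.path.join(directory, "lock-probe.tmp")
    try:
        with open(probe_path, "w+b") as probe:
            # Exclusive, non-blocking lock over the whole file.
            fcntl.lockf(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
            child = subprocess.run([sys.executable, "-c", CHILD_SOURCE, probe_path])
        return child.returncode == 0
    finally:
        if os.path.exists(probe_path):
            os.remove(probe_path)
```

Calling `advisory_locks_work("/data")` inside the container would exercise the mounted volume; the `/data` path here is just an example.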

It would be helpful if Uptime Kuma was clearer about its actual requirements instead of vaguely saying that specific filesystems aren’t “supported”.

For example, the documentation could say something like:

System requirements:

@chevdor

chevdor commented May 23, 2023

While testing NFS on K8s, I ran into the typical chmod issue due to the fact that the container cannot change the permissions of the NFS share.

This is usually solved by using a sub-folder so I tried using /app/data/sub and setting DATA_DIR accordingly. This did not work.

Instead, I could get the container started using:

  • NFS mapped to /data
  • DATA_DIR: /data/sub
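
For anyone wanting to reproduce the layout above, here is a minimal sketch that builds a Pod manifest as a plain dict and prints it as JSON (which `kubectl apply -f` accepts). The NFS server name and export path are hypothetical placeholders, the helper name `uptime_kuma_pod` is made up, and the image tag is the one mentioned later in this thread. Only the `/data` mount and `DATA_DIR=/data/sub` come from the comment itself.

```python
import json


def uptime_kuma_pod(nfs_server, nfs_path, data_subdir="sub"):
    """Build a Pod manifest with NFS mounted at /data and DATA_DIR in a sub-folder."""
    mount_path = "/data"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "uptime-kuma"},
        "spec": {
            "containers": [
                {
                    "name": "uptime-kuma",
                    "image": "louislam/uptime-kuma:1",
                    # Point the app at a sub-folder of the share, not its root.
                    "env": [
                        {"name": "DATA_DIR", "value": f"{mount_path}/{data_subdir}"}
                    ],
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                # Assumed values: replace with your own NFS server and export.
                {"name": "data", "nfs": {"server": nfs_server, "path": nfs_path}}
            ],
        },
    }


if __name__ == "__main__":
    manifest = uptime_kuma_pod("nfs.example.internal", "/exports/uptime-kuma")
    print(json.dumps(manifest, indent=2))
```

A real deployment would more likely use a PersistentVolumeClaim and a Deployment or StatefulSet; the inline `nfs` volume just keeps the sketch short.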

@CommanderStorm
Collaborator

@AndrewKvalheim

It would be helpful if Uptime Kuma was clearer about its requirements instead of vaguely saying that specific filesystems aren’t “supported”.

Please refer to this article on why NFS might corrupt your data: https://www.sqlite.org/howtocorrupt.html
TL;DR: unless you are running multiple pods or have an unsuitable restart policy, this likely won't happen to you.

However: Running a database on a distributed storage backend has significant performance impacts that could make this app unusable.

@CommanderStorm
Collaborator

CommanderStorm commented May 23, 2023

@Maven35 @chevdor Please note that the unofficial Helm chart is maintained by Dennis at https://github.com/dirsigler/uptime-kuma-helm

On the topic of the external DB:
Please see #2720 and associated PRs like #3017

There is also a milestone for this

@CommanderStorm
Collaborator

CommanderStorm commented May 23, 2023

@Maven35

it would be cool to have this product on the CNCF since its pretty awesome.

Given the scope and core-engineering of this project (it is not designed to scale, it is not a distributed system) I would be very surprised if the wonderful people at the CNCF would even consider this project.
The project proposal steps for sandbox projects seem quite steep; if you have more insight, please comment.

If using CNCF Tooling:
Prometheus can also do uptime monitoring with the right dashboards. It is not as simple to set up, but works very reliably.

@mabed-fr

I just came out of a lab where I set up Uptime Kuma in a highly available environment with auto scaling (min and max both set to 1) and a Docker volume on AWS EFS (NFS v4). I haven't encountered any problems.

@Maven35
Author

Maven35 commented Jun 17, 2023 via email

@CommanderStorm
Collaborator

CommanderStorm commented Jun 30, 2023

@Maven35
Currently, this Project does not support high availability.
An environment with 1 Instance is by definition not highly available.
I would argue that it does not need to be, as an uptime monitor should not be co-located with what you are monitoring.
If in doubt, you can set up monitors to monitor your uptime monitor.

Reasoning:
This way we don't need

  • worker-coordinator or
  • consistent hashing or
  • leader-election to schedule heartbeats

The general distributed-system drawbacks/advantages don't apply.

  • Scale is currently limited to 200…2000 monitors and
  • availability is limited to the machine the uptime-monitor is running on.

@Aur0nd

Aur0nd commented Aug 19, 2023

While testing NFS on K8s, I ran into the typical chmod issue due to the fact that the container cannot change the permissions of the NFS share.

This is usually solved by using a sub-folder so I tried using /app/data/sub and setting DATA_DIR accordingly. This did not work.

Instead, I could get the container started using:

  • NFS mapped to /data
  • DATA_DIR: /data/sub

This is literally the solution, nice one thank you!


We are clearing up our old issues and your ticket has been open for 3 months with no activity. Remove stale label or comment or this will be closed in 2 days.

@chakflying
Collaborator

Closing since louislam/uptime-kuma-wiki#68 has been merged, which should solve this.

@chevdor

chevdor commented Dec 11, 2023

I think the blanket statement about NFS is too broad. Indeed, SQLite + NFS 3 is asking for trouble, but there is no reason for NFS4+ to be problematic. I have been running on NFS4.1 for a while without an issue (I know that does not mean it will never happen, but still...).

Has anyone ever run into issues using SQLite over NFS 4+?

@CommanderStorm
Collaborator

While NFS4 has resolved the file locking issue according to the MySQL docs, I think sharing data directories is still a footgun, and I would expect the support effort ("help, my db suddenly got corrupted") to be substantial. Given how much time tackling the current issue load takes, I am not certain that would be doable.

@chevdor

chevdor commented Dec 11, 2023

Considering how simple it is to back up SQLite, I will probably take the risk and make backups often, then see when it breaks, if that ever happens. The benefit of the solution is worth the effort. I also tested in another context and sqlite is much faster than postgres for instance.
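
As an illustration of how small such a backup can be, here is a sketch using Python's standard `sqlite3` module and its `Connection.backup()` method, which wraps SQLite's online backup API. The function name `backup_sqlite` and any file paths are hypothetical; point it at wherever your database file actually lives. Copying the file with `cp` while the app is writing is not equivalent, since it can capture a half-written state.

```python
import sqlite3


def backup_sqlite(source_path, backup_path):
    """Copy a (possibly live) SQLite database to backup_path.

    Both paths are placeholders supplied by the caller. Note that
    sqlite3.connect() creates an empty database if source_path does not exist.
    """
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        # Copies all pages of the source database into the target.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

Writing the backup to local disk or object storage rather than back onto the same NFS share keeps it independent of whatever might corrupt the original.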

@CommanderStorm
Collaborator

CommanderStorm commented Dec 11, 2023

I also tested in another context and sqlite is much faster than postgres for instance.

Such benchmarking is highly application-dependent and cannot be generalised to anything.

For example:

  • SQLite is optimised for small queries (some of our queries are big in v1!). As such, the query optimisation engine does not put in the effort (this effort takes time) to find the optimal query plan, but often chooses a simpler one instead.
  • SQLite can only run one query at a time
  • most "external DBs" do have to contend with network congestion and rely on packet-based communication, adding latency and resource consumption (but also scalability)
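
The second point can be seen in a few lines. More precisely, SQLite allows only one writer at a time (in the default rollback-journal mode, readers can still share the file). The sketch below is an assumption-laden demo, not Uptime Kuma code: the `heartbeat` table and the function name are invented for illustration. It opens two connections to the same file and shows that the second write transaction is refused while the first is open.

```python
import sqlite3


def second_writer_is_blocked(db_path):
    """Return True if a second connection cannot start a write transaction
    while the first connection holds one."""
    # timeout=0 makes a busy database fail immediately instead of waiting;
    # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves.
    first = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    second = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        first.execute(
            "CREATE TABLE IF NOT EXISTS heartbeat (id INTEGER PRIMARY KEY, ok INTEGER)"
        )
        first.execute("BEGIN IMMEDIATE")  # take the write lock
        first.execute("INSERT INTO heartbeat (ok) VALUES (1)")
        try:
            second.execute("BEGIN IMMEDIATE")  # competing writer
        except sqlite3.OperationalError:
            return True  # "database is locked"
        second.execute("ROLLBACK")
        return False
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()
```

With a non-zero `timeout` the second writer would wait for the lock instead of failing straight away, which is closer to how an application would normally be configured.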

@jledesma84

While testing NFS on K8s, I ran into the typical chmod issue due to the fact that the container cannot change the permissions of the NFS share.

This is usually solved by using a sub-folder so I tried using /app/data/sub and setting DATA_DIR accordingly. This did not work.

Instead, I could get the container started using:

  • NFS mapped to /data
  • DATA_DIR: /data/sub

@chevdor I'm using the image louislam/uptime-kuma:1.
How can I use NFS4+?
How did you change the DATA_DIR? Was it using an environment variable on the dockerfile?

@CommanderStorm
Collaborator

The environment variables can be found here.

I don't get what you mean by

Was it using an environment variable on the dockerfile

docker has the -e flag while docker-compose has this
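
To make the `-e` route concrete, here is a small sketch that assembles the `docker run` arguments for the layout discussed earlier in the thread (share mounted at /data, DATA_DIR pointing at /data/sub). The helper name `docker_run_command` and the host path in the usage example are hypothetical; `-d`, `-e` and `-v` are standard `docker run` flags.

```python
import shlex


def docker_run_command(host_dir, data_dir="/data/sub", image="louislam/uptime-kuma:1"):
    """Build the argument list for `docker run` with DATA_DIR set via -e.

    host_dir is wherever the NFS share is mounted on the host (an assumption
    of this sketch); it is bind-mounted to /data inside the container.
    """
    mount_point = "/data"
    return [
        "docker", "run", "-d",
        "-e", f"DATA_DIR={data_dir}",       # environment variable for the app
        "-v", f"{host_dir}:{mount_point}",  # host path : container path
        image,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command line; the host path is a placeholder.
    print(shlex.join(docker_run_command("/mnt/nfs/uptime-kuma")))
```

The environment variable is set when the container is started, so nothing in the image or a Dockerfile has to change.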
