promtail causing very high cpu load when running and stops #5350
When reporting issues, there is a template that we have in place to extract some relevant information related to the bug. This is the template:
Please update your description using this template. |
I have the same issue. In my production environment I solved the excessive CPU usage with this 'limit_config' configuration:
limit_config:
  readline_rate: 1000
  readline_burst: 2000
  readline_rate_enabled: true
Refer to PR #5031. |
@dannykopping First, to fill in the template:
Describe the bug
To Reproduce
Expected behavior
Environment:
Screenshots, Promtail config, or terminal output
When running it like so, the load goes through the roof: And promtail stops. In answer to @liguozhong:
I get the message: |
Please include the version of Loki/Promtail you are using |
Added them to the message :) |
Thanks 👍 Out of interest, why are you running Promtail 2.4.2 but with Loki 2.2.1? |
I tried several versions of promtail to see whether it made any difference, in case bugs or new code were involved. |
OK, I thought so. Thanks.
Coming back to your original message:
This activates when your system is under memory pressure. If your system starts swapping, that can in turn put a lot of strain on the disk. When the disk is under pressure and promtail needs to read from disk, there might be a lot of I/O wait which can factor into load.
The
Your system seems to have a lot of physical memory free, though:
So I'm not sure why it's swapping so much. You'll have to investigate that to find out why, or you can try disabling swap and re-enabling it which will force the OS to reallocate all of that memory into physical RAM. I'm not yet convinced that this is an issue within promtail. Happy to re-evaluate if new data is presented. |
What I do notice is that other machines work fine with promtail. |
Please run |
Just before promtail stops the result is:
|
9MB of RAM is basically nothing. I don't think it's promtail that's causing your system to swap. I'm pretty sure this is a swap -> disk saturation -> I/O load problem. |
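For anyone wanting to check the swap theory on their own machine, here is a minimal Go sketch that reads /proc/meminfo and reports how much swap is in use. It assumes a Linux host; the helper names `parseMeminfo` and `swapUsedKB` are made up for this example and are not part of promtail.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseMeminfo (hypothetical helper) turns /proc/meminfo-style text into a
// map from field name to its numeric value (kB for most fields).
func parseMeminfo(text string) map[string]uint64 {
	out := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		key, rest, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseUint(fields[0], 10, 64)
		if err != nil {
			continue
		}
		out[strings.TrimSpace(key)] = v
	}
	return out
}

// swapUsedKB (hypothetical helper) returns SwapTotal minus SwapFree.
func swapUsedKB(info map[string]uint64) uint64 {
	total, free := info["SwapTotal"], info["SwapFree"]
	if free > total {
		return 0
	}
	return total - free
}

func main() {
	data, err := os.ReadFile("/proc/meminfo")
	if err != nil {
		fmt.Println("cannot read /proc/meminfo:", err)
		return
	}
	info := parseMeminfo(string(data))
	fmt.Printf("MemAvailable: %d kB, swap used: %d kB\n",
		info["MemAvailable"], swapUsedKB(info))
}
```

If swap used keeps growing while MemAvailable stays high, that would point at something other than promtail's own memory footprint.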
swapon -s gives following:
|
We are creating new RHEL8 virtual machines for our applications.
Within 1 minute it uses up all memory |
The issue was solved with the latest version of promtail that I installed yesterday. |
This just happened to me as well (2.4.2), with *log being the only thing I was grabbing. Trying the upgrade advice, thanks for the tip @waardd |
From which to which version did it help? We are observing 15 cores of CPU usage on high log traffic. |
I can also confirm that the high cpu went away with a new version. Something was off with 2.4.2 |
The older version we tried was 2.4.2. |
I've been hitting this, very high CPU on one particular node (with no clear reason). |
I have a similar issue. A bunch of EC2 machines in AWS start up with promtail at 100% CPU. Promtail is working and sending logs. We are getting rid of promtail/loki. Sorry. No production usage.
Here is my config: `server: positions: clients:
scrape_configs:
|
My machine is a 4 core GCP instance, with promtail running in a container. I have about 200 such instances, and so far I am only seeing this on 1 machine. promtail is maxing out all cores. Profiling suggests it's just golang select. Could this possibly be a loop over a closed channel? |
I am currently facing the same issue, running promtail |
I had a look with perf-trace to see if I could get some more details on what is happening.
The machine runs at 100% until promtail is restarted. There's no large volume of logs. perf trace shows a continuous stream of:
920.450 promtail/277361 syscalls:sys_enter_newfstatat:dfd: CWD, filename: "", statbuf: 0xc00253bbd8
    syscall.Syscall6.abi0 (/usr/bin/promtail)
    os.statNolog (/usr/bin/promtail)
    os.Stat (/usr/bin/promtail)
    github.com/hpcloud/tail/watch.(*PollingFileWatcher).ChangeEvents.func1 (/usr/bin/promtail)
    runtime.goexit.abi0 (/usr/bin/promtail)
Initially CWD was set to |
Maybe on Windows, MacOS and Linux we can set |
Any updates on this? I have updated promtail to the latest version and it's still maxing out the CPU cores. |
promtail just nuked another one of our machines, this is an urgent issue 😢 |
I'm getting a little closer. When one of our workflows stops, the file watcher begins to loop continuously, receiving nil events from the file watcher channel. This code here... The problem is that the file events channel gets closed, so the loop repeatedly reads nil values from it (it's not a nil channel, it's a closed channel). It's not obvious to me why this only triggers in very specific circumstances. In our case I think it's the updating of annotations on a pod causing it to be rediscovered during the shutdown process. I suspect the following:
The PR below (#6135) fixes the issue in our case. |
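To illustrate the closed-channel behaviour described above, here is a minimal standalone Go sketch. It is not promtail's code and not the change in the PR; the `event` type and both function names are made up for the example. It only shows that a receive on a closed channel returns the zero value immediately, so a select loop that ignores closure never blocks, while the two-value receive lets the loop exit.

```go
package main

import "fmt"

// event stands in for a file-change notification (hypothetical type).
type event struct{ name string }

// spinOnClosed mimics a loop that ignores channel closure. Once events is
// closed, every receive yields nil immediately, so the loop never blocks.
// maxIter caps the iterations so this demo terminates.
func spinOnClosed(events <-chan *event, quit <-chan struct{}, maxIter int) int {
	nilReads := 0
	for i := 0; i < maxIter; i++ {
		select {
		case ev := <-events:
			if ev == nil {
				nilReads++
			}
		case <-quit:
			return nilReads
		}
	}
	return nilReads
}

// drainUntilClosed uses the two-value receive and returns as soon as the
// events channel reports that it has been closed.
func drainUntilClosed(events <-chan *event, quit <-chan struct{}) int {
	handled := 0
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return handled
			}
			if ev != nil {
				handled++
			}
		case <-quit:
			return handled
		}
	}
}

func main() {
	quit := make(chan struct{}) // never signalled in this demo

	a := make(chan *event, 1)
	a <- &event{name: "modified"}
	close(a)
	fmt.Println("nil reads without closure check:", spinOnClosed(a, quit, 1000))

	b := make(chan *event, 2)
	b <- &event{name: "modified"}
	b <- &event{name: "truncated"}
	close(b)
	fmt.Println("events handled with closure check:", drainUntilClosed(b, quit))
}
```

Without the iteration cap, the first loop would run flat out on one core, which matches the "just golang select" profile mentioned earlier in the thread.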
Is there any workaround before the new version is released? |
Not that I know of, I had to run a build with patches applied. |
FYI, this syntax has changed, for anyone coming across this
|
On several of my servers I want to use promtail like I did on a bunch of other machines.
But when I start promtail on only the /var/log logs, the CPU load goes through the roof (like 15 to 20) and then promtail stops.
Looks like kswap is joining the party in being high on CPU when running promtail.
Is there anybody out there who has this same issue?
It's on RHEL7, by the way.