filebeat preventing freeing of disk space on deleted files #896
Comments
Unfortunately, in 1.0.1 ignore_older has multiple overlapping "features". That is the reason we are introducing close_older in addition: #718. In the future you will be able to set close_older to a low value to release file handles and ignore_older to a high value to make sure all files are still picked up. Filebeat should definitely not keep file handles open longer than ignore_older, but it is expected that it keeps files open until ignore_older is reached (which blocks file deletion).

There are a lot of improvements related to this file handling in the upcoming 1.1; unfortunately close_older is not part of 1.1 but of the next major release. There is an additional flag which I don't like to recommend but which can sometimes help: force_close_files: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-configuration-details.html#_force_close_files. Have a look at whether this could work in your setup. Another option is to try our nightly builds, which already include the close_older feature: https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/
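To illustrate the intended interplay, a minimal prospector sketch (the path and values are placeholders, and close_older is only available in the nightly builds linked above, not in 1.0.x):

    filebeat:
      prospectors:
        - paths:
            - "C:/logs/*.log"        # placeholder path
          # Release the file handle an hour after the last read, so
          # deleted or rotated files can actually be freed on disk.
          close_older: 1h
          # Still pick up any file modified within the last 7 days.
          ignore_older: 168h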
Thanks for the feedback. The close_older setting may suit our particular combination of factors quite well, though ultimately the root problem is our logging framework not updating timestamps. I'll keep an eye out for it and try it when it becomes available in a release. I've dropped the ignore_older setting significantly (to 12 hours) on one of our problematic servers that generates high volumes. If everything works as expected, we should see the older files being deleted tonight.

I am actually already using the force_close_files setting, as posted in my config above. If my understanding of that setting is correct, it should allow deletion of the files and release of the free space, since filebeat should release the handle when the deletion/move is detected. (Or does it close the file after each check interval? More technical detail here would be awesome.) However, the observed behavior suggests this isn't working (or I am misunderstanding it). I'm also confused by the file apparently being deleted but the space not being freed. Is it the force_close_files setting that allows the file to be deleted?

To restate: the file appears to be deleted, since it cannot be opened, nor can you view or change its permissions, yet it still shows in Explorer. In this state, is the file really deleted? Is it possible for processes to append lines to it? If it is deleted, how will filebeat ever harvest new lines from it?
We just released 1.1. It would be nice if you could upgrade to 1.1 and see if you still have the same issues.

Regarding the timestamp not being updated: we sometimes see this issue on shared drives, as they seem to cache some information. Could this also apply to your case?

force_close_files works as follows. Normally filebeat identifies a file by the three identifiers listed here: https://github.com/elastic/beats/blob/master/filebeat/input/file_windows.go#L13. The filename itself is only secondary, as it can change over time (rotation). If a filename disappears, filebeat keeps the harvester open, assuming the file was rotated, and waits until the identifiers show up again under a different file name so it can continue reading the file immediately. What force_close_files does is close the file handle when the file is no longer found under the same file name (see beats/filebeat/harvester/reader.go, line 121 at commit d096631).

So force_close_files should definitely close the file handle and make it possible to delete the file. If this is not the case, it sounds like a bug.
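A minimal sketch of a 1.x prospector with that flag set (the path is a placeholder, not a copy of any config in this thread):

    filebeat:
      prospectors:
        - paths:
            - "C:/logs/*.log"        # placeholder path
          # Close the handle as soon as the file is no longer found under
          # its original name, so the OS can complete the delete and free space.
          force_close_files: true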
I have noticed the same issue on Atomic Linux, where files removed by logrotate are not fully deleted because filebeat is holding the file handle.
@vjsamuel Which filebeat version are you using? Can you share your config?
This is my config:
The issue got resolved when I added:
Hm, strange. You seem to be on alpha2. close_older should be set to 1h by default and not require force_close_files. Did you add the lines above to all prospectors?
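These options are prospector-level, so with multiple prospectors each entry needs them. A hedged sketch in the filebeat.prospectors layout (paths are placeholders, and it is an assumption that these option names are unchanged on the alpha2 build):

    filebeat.prospectors:
      - paths:
          - "/var/log/app/*.log"     # placeholder
        close_older: 1h
        force_close_files: true
      - paths:
          - "/var/log/other/*.log"   # placeholder
        close_older: 1h
        force_close_files: true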
I am on version filebeat-1.2.3-1.x86_64 but still seem to have the same issue. If my logstash server goes down, filebeat will keep my logs open forever, causing my server to run out of hard drive space. I have configured my application server to rotate logs every hour, or once they reach 100MB. Below are my config file and a list showing all the open files filebeat is keeping open:
@knap1930 So this problem also persists when logstash comes back online?
@ruflin When logstash comes back online, things recover. The problem is if the host's hard drive fills up before logstash comes back online. I have a cron job that moves rotated logs to S3 after each file is rotated. Ideally, if there is a problem with logstash, filebeat would give up eventually and allow the space to be reclaimed, not keep trying forever. I thought that was the purpose of the
Filebeat will not release the file handle until it has successfully sent the data. So the option you are looking for is probably
Any suggestions for me? I am trying to add close_older: 30m. My configuration follows:

    filebeat.prospectors:
    # Each - is a prospector. Most options can be set at the prospector level, so
    # you can use different prospectors for various configurations.
    # Below are the prospector specific configurations.

    #----------------------------- Logstash output --------------------------------
    # The Logstash hosts
    hosts: ["10.146.134.15:5044","10.146.134.16:5044","10.146.134.17:5044"]
I have tried all formats with no luck. Any help would be appreciated.
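For illustration, close_older is a prospector-level option in this layout, so it would sit inside a prospector entry rather than at the top of the file. A sketch under that assumption (input_type and paths are placeholders; later 5.x releases renamed the option close_inactive):

    filebeat.prospectors:
      - input_type: log
        paths:
          - "/var/log/myapp/*.log"   # placeholder path
        # Close the file handle 30 minutes after the last read.
        close_older: 30m

    output.logstash:
      # The Logstash hosts (taken from the config above)
      hosts: ["10.146.134.15:5044","10.146.134.16:5044","10.146.134.17:5044"]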
We have a bunch of Windows services that generate large amounts of logs with the Enterprise Library Logging Application Block. I've installed the latest filebeat, 1.0.1, as a Windows service; it forwards to a cluster of Linux logstash servers, with output to a cluster of Linux elasticsearch servers. We have scheduled tasks running daily that delete log files older than a given time interval.
This weekend, we received some alerts about server disks filling up. Our on-call staff tried to review and delete log files, but got odd permissions issues suggesting they had no access. At a loss for what to do next, they restarted the server, and a bunch of log files disappeared. This morning I came in, caught up, and realized what had happened: the scheduled task had deleted the files, but they were still showing in Explorer, and the disk had not released the space. I confirmed this on another server by restarting the filebeat service and watching a number of files disappear.
My filebeat config looks like this:
Some more context:
Our servers' disks are undersized and of various capacities, and different services generate varying amounts of logs, some up to 300MB/hr. This has generally made managing free space troublesome and regularly causes issues. Our IT department has installed a bunch of one-off scheduled tasks to keep disks from filling up, deleting files by last-modified date, some retaining a week and some a day's worth of logs.
Our logging library does not release file handles on actively logging files, and modified dates do not get updated until a file handle is released. Files are set to roll over at 10MB, and depending on the rate of logging, one 10MB file can last anywhere from a few minutes to a month. This leads to a poor scenario where files can be seen as 'older' than the ignore_older setting and are not harvested. Consequently, we've bumped the setting up from 1 day to 7 days to try to pick up these files. This setting is a compromise and will probably need to be adjusted on a per-server basis.

Lastly, I realize that this ignore_older setting is likely at play here and that filebeat will continue to monitor files up until this interval. Is it expected behavior that filebeat keeps an open file handle and prevents deletion/release of files? Second, while I have not seen it directly, our on-call team reports some files were up to a month old, which suggests there is a bug where filebeat is monitoring and holding files older than the specified interval.
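For reference, the setup described in this report corresponds roughly to the following 1.0.1 prospector sketch (paths are placeholders; this is an illustration, not the actual config omitted earlier):

    filebeat:
      prospectors:
        - paths:
            - "C:/ProgramData/MyService/logs/*.log"   # placeholder path
          # 7 days, so files whose modified date lags (because the logging
          # library holds the handle without updating mtime) still get picked up.
          ignore_older: 168h
          # Workaround discussed in this thread: drop the handle once the
          # file disappears under its original name, so deletes can complete.
          force_close_files: true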