File watcher hangs #35
Comments
We have the same issue. Did you get any solution or a workaround?
No solution yet. Are you facing this in a Docker or Kubernetes environment?
In all environments. Even when we are not using Kubernetes, on my local machine for example, LogMonitor stops sending the log content to standard output, so we don't see anything in the docker logs. We changed the log4net config to use small files and so on, but the problem persists. We are not able to fully reproduce it, but it happens after running for a couple of days.
Are you using any Docker logging driver (https://docs.docker.com/config/containers/logging/configure/)?
I think we have an issue with LogMonitor reading the log4net log file. In our case, we are using a log4net.Appender.RollingFileAppender. Sometimes when the file reaches the maximum size and is replaced, LogMonitor stops reading the file and logs are not shipped to STDOUT. I know LogMonitor is still working because I can see entries from the EventLog. Sometimes, randomly, LogMonitor is able to read the file again and send the logs to STDOUT.
<appender name="JsonFileAppender" type="log4net.Appender.RollingFileAppender">
<file type="log4net.Util.PatternString" value="%property{ApplicationPath}\logs\application-logs.json"/>
<appendToFile value="true" />
<rollingStyle value="Size" />
<datePattern value="yyyy-MM-dd.lo\g" />
<maxSizeRollBackups value="20" />
<maximumFileSize value="40KB" />
<!-- logs to the same file always and rolls the others -->
<staticLogFileName value="true" />
<layout type="log4net.Layout.SerializedLayout, log4net.Ext.Json">
..
</layout>
</appender>
{
"LogConfig": {
"sources": [
{
"type": "File",
"directory": "******\\logs",
"filter": "*.json",
"includeSubdirectories": false
},
{
"type": "EventLog",
"startAtOldestRecord": true,
"eventFormatMultiLine": false,
"channels": [
{
"name": "system",
"level": "Information"
},
{
"name": "application",
"level": "Error"
},
{
"name": "application",
"level": "Warning"
}
]
},
{
"type": "ETW",
"providers": [
{
"providerName": "IIS: WWW Server",
"ProviderGuid": "3A2A4E84-4C21-4981-AE10-3FDA0D9B0F83",
"level": "Information"
},
{
"providerName": "Microsoft-Windows-IIS-Logging",
"ProviderGuid ": "7E8AD27F-B271-4EA2-A783-A47BDE29143B",
"level": "Information",
"keywords": "0xFF"
}
]
}
]
}
}
@profnandaa review #26 as well
Is the file watcher not safe to use until this issue is fixed? @iankingori
Hey @knunery, we are actively working to resolve this, as it also impacts log rotation. As for whether it is safe: I haven't seen an instance where logs fail to write to the file because of LogMonitor; we only fail to read and write to stdout, so the logs can still be accessed in the file.
We are also experiencing this. We have logs set up to rotate hourly, and so far there doesn't seem to be a pattern to when or why LogMonitor stops sending new logs to stdout.
Hey @seamustheseagull, we don't currently have a debug output option but that would be great to add. A couple of questions to help us repro the issue:
Hi @iankingori
One of the curious things I've noticed in the LogMonitor output is that every so often there'll be a message like this:
Error 32 is a file access error (a sharing violation), and given that it happens on the hour, this error likely occurs when the log rotation deletes the file. So that's to be expected. What is unexpected for me is that the file filter in LogMonitorConfig.json is "*.log.json", so LogMonitor shouldn't be monitoring "*.archive.json". But I don't know the nuances of file handles; perhaps when the file is renamed on rotation, LogMonitor's handle to it is retained? Maybe that's relevant to the issue, maybe not; a small sketch of the share-flag behaviour follows below. Any other questions, fire away.
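As a side note on the handle question, here is a minimal C# sketch (not LogMonitor's actual code; the file name is made up) showing how the share flags on a reader's open handle determine whether the delete that rotation performs fails with Win32 error 32 or goes through:

// Minimal sketch (not LogMonitor code): shows how a reader's share flags
// determine whether a log-rotation delete fails with Win32 error 32
// (ERROR_SHARING_VIOLATION) or succeeds.
using System;
using System.IO;

class ShareModeDemo
{
    static void Main()
    {
        const string path = "app.log.json";   // hypothetical file name
        File.WriteAllText(path, "hello\n");

        // Reader that does NOT allow other handles to delete/rename the file.
        using (var reader = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            try
            {
                File.Delete(path);             // simulates log rotation
            }
            catch (IOException ex)
            {
                // On Windows this is typically error 32: the file is in use.
                Console.WriteLine($"Rotation failed while reader holds handle: {ex.Message}");
            }
        }

        // Reader that also passes FileShare.Delete: rotation can now delete
        // or rename the file even while it is being read.
        using (var reader = new FileStream(
            path, FileMode.Open, FileAccess.Read,
            FileShare.ReadWrite | FileShare.Delete))
        {
            File.Delete(path);                 // succeeds; file is removed once all handles close
            Console.WriteLine("Rotation succeeded while reader holds handle with FileShare.Delete.");
        }
    }
}

If the reader is opened without FileShare.Delete, the rename/delete that rotation performs gets the sharing violation; with FileShare.Delete it succeeds, and the old file disappears once the handle closes.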
The problem in our case seems to be caused by a massive number of writes to the log files. From what we understood from the source code, LogMonitor captures all file system events in a given folder, filtered to specific files (if the filter is defined).
namespace IntApp.Wilco.Common.Logs
{
using System;
using System.IO;
using System.Linq;
using log4net.Appender;
using log4net.Util;
public class CloudFileAppender : RollingFileAppender
{
/// <summary>
/// Rolls the file as usual, then deletes the oldest *.log files so that at most MaxSizeRollBackups backups remain in the logs directory.
/// </summary>
protected override void AdjustFileBeforeAppend()
{
base.AdjustFileBeforeAppend();
var logsDirectory = Directory.GetParent(this.File).FullName;
var logFiles = Directory.GetFiles(logsDirectory, "*.log");
if (logFiles.Length > this.MaxSizeRollBackups)
{
logFiles
.OrderBy(System.IO.File.GetLastWriteTime)
.Take(logFiles.Length - this.MaxSizeRollBackups)
.ToList()
.ForEach(DeleteFile);
LogLog.Debug(typeof(CloudFileAppender), $"Deleting {logFiles.Length - this.MaxSizeRollBackups} old log files");
}
}
}
}
<appender name="CloudTimeBasedRollingFileAppender" type="IntApp.Wilco.Common.Logs.CloudFileAppender, IntApp.Wilco.Common">
<file value="logs\" />
<appendToFile value="true" />
<maxSizeRollBackups value="180" />
<datePattern value=".MM.dd.yyyy.HH.mm'.log'" />
<preserveLogFileNameExtension value="true" />
<rollingStyle value="Date" />
<staticLogFileName value="false" />
<lockingModel type="log4net.Appender.FileAppender+ExclusiveLock"/>
<layout type="IntApp.Wilco.Common.Logs.JsonLayout, IntApp.Wilco.Common">
</layout>
</appender>
After delivering all of this to Prod, we discovered that generating tons of log records now leads to terrible performance of our app. We have not figured out whether it was caused by the application's writes to the file system or by LogMonitor furiously reading from the generated files, but after we rolled back to our old approach with a filebeat sidecar, things returned to normal.
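To make the "massive number of writes" idea concrete, here is a small self-contained C# sketch (an illustration only, not LogMonitor's implementation; the scratch folder and file name are made up) that counts how many directory change notifications a FileSystemWatcher raises while roughly a thousand flushed records are appended to a single file:

// Illustration only (not LogMonitor code): count directory change notifications
// generated while ~1000 flushed writes land in a single log file.
using System;
using System.IO;
using System.Threading;

class WatcherFloodDemo
{
    static void Main()
    {
        // Hypothetical scratch folder for the demo.
        string dir = Path.Combine(Path.GetTempPath(), "watcher-flood-demo");
        Directory.CreateDirectory(dir);
        string logFile = Path.Combine(dir, "app.log");

        long notifications = 0;
        using var watcher = new FileSystemWatcher(dir, "*.log");
        watcher.NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.Size;
        watcher.Changed += (_, _) => Interlocked.Increment(ref notifications);
        watcher.EnableRaisingEvents = true;

        // Writer: 1000 small records, flushed one by one so each write is
        // visible to the file system (and therefore the watcher) immediately.
        using (var writer = new StreamWriter(logFile, append: true))
        {
            for (int i = 0; i < 1000; i++)
            {
                writer.WriteLine($"{DateTime.UtcNow:O} record {i}");
                writer.Flush();
            }
        }

        Thread.Sleep(1000); // let pending notifications drain
        Console.WriteLine($"Wrote 1000 records, observed {Interlocked.Read(ref notifications)} change notifications.");
    }
}

A reader that re-scans or re-reads the file on every notification does work proportional to that event count, which is one plausible way heavy logging could translate into the performance cost described above.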
Thanks for the detailed info @DmitrySenin. We've had trouble getting this to repro on our end before. I'll try to use your setup and see whether I can repro. Please share a sample of your FileMonitor configuration and base image.
@iankingori, here it is:
{
"LogConfig": {
"logFormat": "custom",
"sources": [
{
"type": "EventLog",
"eventFormatMultiLine": false,
"startAtOldestRecord": true,
"channels": [
{
"name": "application",
"level": "Information",
"higher": true
},
{
"name": "system",
"level": "Information",
"higher": true
}
],
"customLogFormat": "{'timestamp':'%TimeStamp%', 'message':'%Message%', 'eventLog':{'eventId':'%EventId%'}, 'level':'%Severity%'}|json"
},
{
"type": "File",
"eventFormatMultiLine": false,
"directory": "c:\\logs",
"filter": "*.log",
"customLogFormat": "%Message%"
},
{
"type": "ETW",
"eventFormatMultiLine": false,
"providers": [
{
"providerName": "IIS: WWW Server",
"providerGuid": "3A2A4E84-4C21-4981-AE10-3FDA0D9B0F83",
"level": "Information",
"higher": true
},
{
"providerName": "Microsoft-Windows-IIS-Logging",
"providerGuid": "7E8AD27F-B271-4EA2-A783-A47BDE29143B",
"level": "Information",
"higher": true
}
],
"customLogFormat": "{'timestamp':'%TimeStamp%', 'message':'%EventData%', 'etw':{'eventId':'%EventId%', 'providerId':'%ProviderId%', 'providerName':'%ProviderName%', 'executionProcessId':'%ExecutionProcessId%', 'executionThreadId':'%ExecutionThreadId%', 'keyword':'%Keyword%'}, 'level':'%Severity%'}|json"
}
]
}
}
From what I recall, 1k log records a second was enough to replicate the problem.
Which base container image are you using, and is this happening on Docker locally or just on Kubernetes in production?
@iankingori,
{
"LogConfig": {
"logFormat": "custom",
"sources": [
{
"type": "File",
"directory": "C:\\projects\\Fiddle\\Fiddle\\bin\\Debug\\net7.0\\Logs",
"filter": "*.log",
"includeSubdirectories": false,
"includeFileNames": false,
"customLogFormat": "%Message%"
}
]
}
}
I reproduced the problem locally without Docker or anything: just LogMonitor with the config above, started from a terminal, and a side app generating tons of logs to a single file.
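For reference, a rough sketch of such a side app, assuming log4net with a programmatically configured date-rolling appender (the output path, record size, and rate are made up; this approximates the setup described here, not the exact repro project):

// Rough sketch of a log-flood generator: ~1k records/second via log4net
// to a date-rolling file. Path and payload size are placeholders.
using System;
using System.Threading;
using log4net;
using log4net.Appender;
using log4net.Config;
using log4net.Layout;

class LogFlood
{
    static void Main()
    {
        var layout = new PatternLayout("%date [%thread] %-5level %logger - %message%newline");
        layout.ActivateOptions();

        var appender = new RollingFileAppender
        {
            File = @"C:\logs\flood",                   // hypothetical output path
            AppendToFile = true,
            RollingStyle = RollingFileAppender.RollingMode.Date,
            DatePattern = ".yyyy-MM-dd.HH'.log'",      // rolls hourly
            StaticLogFileName = false,
            PreserveLogFileNameExtension = true,
            Layout = layout
        };
        appender.ActivateOptions();
        BasicConfigurator.Configure(LogManager.GetRepository(), appender);

        var log = LogManager.GetLogger(typeof(LogFlood));
        string payload = new string('x', 2048);        // "beefy" record body

        // Roughly 1k records per second for one minute: 100 records every 100 ms.
        for (int batch = 0; batch < 600; batch++)
        {
            for (int i = 0; i < 100; i++)
            {
                log.Info(payload);
            }
            Thread.Sleep(100);
        }
    }
}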
@iankingori, I found the project which I used to repro the problem locally. Here it is: Fiddle.zip. It's a C# app generating 1k beefy log records a second via log4net to a file which rolls over hourly. You will also find the LogMonitor config inside, but you will need to correct the value of
We have also encountered the same issue, similar to what @DmitrySenin has reported (one of our app processes generated >1k log records and the LogMonitor process would stop outputting anything to stdout after that). We tried it with Docker locally and the same issue was present. We have tried a few of the workarounds mentioned for log4net (log rotation set to small files, disabling log rotation altogether and outputting to a single log file), and it does seem that a large volume of records is what reproduces this issue. Our base image is
Just checking: is any work going on for this problem?
Hey @DmitrySenin, we were able to successfully repro using the steps you shared and are working on fixing the issue.
Hi, any updates on this issue?
I have a Windows Docker image and started using this to stream logs from my application log folder to stdout. It all works fine, but sometimes it stops pushing logs to stdout.
At first I thought there might be no new logs, but when I open the files, new lines are continuously being added by the actual application; LogMonitor just does not push those new lines anymore. Every time, I have to restart the container. My configuration JSON file is pulling logs from the Event Viewer and the log folder. The Event Viewer source keeps working, though.
What could be the possible reasons? Could there be file locking that is preventing LogMonitor from reading new lines? I need some urgent guidance on this. (A small probe to test the locking hypothesis is sketched below.)
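One quick way to test the locking hypothesis is to run a small probe from another terminal that tries to open the log file with the permissive share flags a tailer would use (a sketch; the default path is made up, so point it at the real log file):

// Sketch: probe whether a log file can still be opened for shared reading.
using System;
using System.IO;

class LockProbe
{
    static void Main(string[] args)
    {
        // Hypothetical default path; pass the real log file as an argument.
        string path = args.Length > 0 ? args[0] : @"C:\app\logs\app.log";
        try
        {
            using var stream = new FileStream(
                path, FileMode.Open, FileAccess.Read,
                FileShare.ReadWrite | FileShare.Delete);
            stream.Seek(0, SeekOrigin.End);
            Console.WriteLine($"Opened {path} for shared read; current length {stream.Length} bytes.");
        }
        catch (IOException ex)
        {
            // A sharing violation here would mean the writer holds the file
            // with an exclusive lock that blocks readers.
            Console.WriteLine($"Could not open {path}: {ex.Message}");
        }
    }
}

If the open fails with a sharing violation, the writer is holding the file exclusively and no external reader could tail it; if it succeeds, the hang is more likely in how the reader tracks new data than in file locking.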