Invoke-IcingaCheckMemory detects wrong size for Pagefile #360

Closed
mattpoel opened this issue Aug 6, 2023 · 19 comments · Fixed by #363

@mattpoel
mattpoel commented Aug 6, 2023

Invoke-IcingaCheckMemory reports a Pagefile usage of 157.29TB:

> Invoke-IcingaCheckMemory -PageFileWarning 20GB -PageFileCritical 30GB -Verbose 2 -Debug
[CRITICAL] Memory Usage [CRITICAL] PageFile Usage
\_ [CRITICAL] PageFile Usage
   \_ [CRITICAL] C:\pagefile.sys: 157.29TB is greater than threshold 32.21GB
| 'memory::ifw_memory::used'=3155665000B;;;0;8588939000 'cpagefilesys::ifw_pagefile::used'=157286400000000B;21474836480;32212254720;0;2013266000000000

Actual usage via Win32_PageFileUsage reports 150MB out of 1920MB:

> Get-CimInstance -ClassName Win32_PageFileUsage -Property *

Status                :
Name                  : C:\pagefile.sys
CurrentUsage          : 150
Caption               : C:\pagefile.sys
Description           : C:\pagefile.sys
InstallDate           : 21/06/2023 5:53:04 AM
AllocatedBaseSize     : 1920
PeakUsage             : 191
TempPageFile          : False
PSComputerName        :
CimClass              : root/cimv2:Win32_PageFileUsage
CimInstanceProperties : {Caption, Description, InstallDate, Name...}
CimSystemProperties   : Microsoft.Management.Infrastructure.CimSystemProperties

Version information:

Component    Version   Available
---          ---       ---
agent        2.14.0    2.14.0
framework    1.11.0    1.11.0
plugins      1.11.0    1.11.0
service      1.2.0     1.2.0
@atj
atj commented Aug 8, 2023

+1

> Invoke-IcingaCheckMemory -PageFileWarning 30GB -PageFileCritical 35GB -Verbosity 2
[CRITICAL] Memory Usage [CRITICAL] PageFile Usage
\_ [CRITICAL] PageFile Usage
   \_ [CRITICAL] C:\pagefile.sys: 245.37TB is greater than threshold 37.58GB
\_ [OK] Used Memory: 3.13GiB
| 'memory::ifw_memory::used'=3363602000B;;;0;4294414000 'cpagefilesys::ifw_pagefile::used'=245366800000000B;32212254720;37580963840;0;12884900000000000

> Get-CimInstance -ClassName Win32_PageFileUsage -Property *

Status                :
Name                  : C:\pagefile.sys
CurrentUsage          : 259
Caption               : C:\pagefile.sys
Description           : C:\pagefile.sys
InstallDate           : 4/28/2023 3:30:34 AM
AllocatedBaseSize     : 1024
PeakUsage             : 465
TempPageFile          : False
PSComputerName        :
CimClass              : root/cimv2:Win32_PageFileUsage
CimInstanceProperties : {Caption, Description, InstallDate, Name...}
CimSystemProperties   : Microsoft.Management.Infrastructure.CimSystemProperties

> Show-Icinga
[...]
Component    Version   Available
---          ---       ---
agent        2.14.0    2.14.0
apichecks    1.2.0     1.2.0
framework    1.11.0    1.11.0
plugins      1.11.0    1.11.0
restapi      1.2.0     1.2.0
service      1.2.0     1.2.0

@LordHepipud
Collaborator

It seems the issue is caused by different versions of Windows. One will report the Pagefile in bytes, while the other will report it in MB/GB.

I will have to check on which systems the values are different and provide a global fix.

@LordHepipud LordHepipud added this to the v1.11.1 milestone Aug 8, 2023
@LordHepipud LordHepipud added the bug Something isn't working label Aug 8, 2023
@LordHepipud LordHepipud self-assigned this Aug 8, 2023
@LordHepipud LordHepipud added the Investigation The team is looking into the cause of the issue label Aug 8, 2023
@LordHepipud
Collaborator

#338 and #359 should be related to this issue

@atj
atj commented Aug 9, 2023

@LordHepipud wrote: "It seems the issue is caused by different versions of Windows. One will report the Pagefile in bytes, while the other will report it in MB/GB."

Are you sure about this? Every reference I can find to the Win32_PageFileSetting WMI class states that the size properties are in megabytes.

https://learn.microsoft.com/en-us/windows/win32/cimwin32prov/win32-pagefilesetting

InitialSize

Data type: uint32

Access type: Read/write

Qualifiers: MappingStrings ("Win32Registry|System\CurrentControlSet\Control\Session Manager\Memory Management|PagingFiles"), Units ("megabytes")

Anyway here's the version info from my system which presents the values in megabytes:

> Get-ComputerInfo -Property @('OsName','OsOperatingSystemSKU','OSArchitecture','WindowsVersion','WindowsBuildLabEx')

OsName               : Microsoft Windows Server 2022 Standard
OsOperatingSystemSKU : StandardServerEdition
OsArchitecture       : 64-bit
WindowsVersion       : 2009
WindowsBuildLabEx    : 20348.1.amd64fre.fe_release.210507-1500
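
If the WMI values are indeed in megabytes, the figures a correct check should report can be derived directly from the Win32_PageFileUsage output. The following is a minimal sketch under that assumption; ConvertTo-PageFileBytes is a hypothetical helper name and not part of the Icinga plugins, and the sample values are copied from the first report in this issue.

```powershell
# Hypothetical helper (not part of icinga-powershell-plugins):
# converts the megabyte values of a Win32_PageFileUsage-like object to bytes.
function ConvertTo-PageFileBytes {
    param(
        [Parameter(Mandatory)]$PageFile
    )

    # Assumption: CurrentUsage and AllocatedBaseSize are in megabytes (1MB = 1048576 bytes)
    $used = [long]$PageFile.CurrentUsage * 1MB
    $size = [long]$PageFile.AllocatedBaseSize * 1MB

    [pscustomobject]@{
        Name        = $PageFile.Name
        UsedBytes   = $used
        SizeBytes   = $size
        UsedPercent = [math]::Round(100 * $used / $size, 2)
    }
}

# Sample values taken from the first report (150 MB used out of 1920 MB)
$sample = [pscustomobject]@{
    Name              = 'C:\pagefile.sys'
    CurrentUsage      = 150
    AllocatedBaseSize = 1920
}
$result = ConvertTo-PageFileBytes -PageFile $sample
$result
```

On a Windows host the same helper could be fed from `Get-CimInstance -ClassName Win32_PageFileUsage | ForEach-Object { ConvertTo-PageFileBytes -PageFile $_ }`.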

@mattpoel
Author

mattpoel commented Aug 9, 2023

@LordHepipud wrote: "It seems the issue is caused by different versions of Windows. One will report the Pagefile in bytes, while the other will report it in MB/GB. I will have to check on which systems the values are different and provide a global fix."

@LordHepipud, I do have servers with Windows 2022 where the check is correctly gathering the size and others where it is not (and all of them are VMs cloned from the same template with the exact same update level).

atj added a commit to transitiv/icinga-powershell-plugins that referenced this issue Aug 9, 2023
The unit for pagefile attributes was changed from megabytes to bytes in
PR#346 but the call to New-IcingaCheck was not updated to reflect this.

Fixes Icinga#360.
@atj
atj commented Aug 9, 2023

@mattpoel wrote: "@LordHepipud, I do have servers with Windows 2022 where the check is correctly gathering the size and others where it is not (and all of them are VMs cloned from the same template with the exact same update level)."

Can you check the version of the plugins component on the servers where it is working? This issue seems to be a side effect of #346 rather than the result of any difference in WMI class property units.

@mattpoel
Author

mattpoel commented Aug 9, 2023

@atj outputs from a Windows 2022 VM where it is working:

[OK] Memory Usage
\_ [OK] PageFile Usage
   \_ [OK] C:\pagefile.sys: 0B
\_ [OK] Used Memory: 14.89% (4.76GiB)

Installed components on this system:

Component    Version   Available
---          ---       ---
agent        2.14.0    2.14.0
framework    1.11.0    1.11.0
plugins      1.11.0    1.11.0
service      1.2.0     1.2.0

Outputs from a VM with the same Windows 2022 template (same version, updates, etc.) where huge numbers are presented for the pagefile:

[OK] Memory Usage
\_ [OK] PageFile Usage
   \_ [OK] C:\pagefile.sys: 355.47TB
\_ [OK] Used Memory: 50.45% (8.07GiB)

Installed components on this system:

Component    Version   Available
---          ---       ---
agent        2.14.0    2.14.0
framework    1.11.0    1.11.0
plugins      1.11.0    1.11.0
service      1.2.0     1.2.0

Usage data from this system:

> Get-CimInstance -ClassName Win32_PageFileUsage -Property *

Status                :
Name                  : C:\pagefile.sys
CurrentUsage          : 339
Caption               : C:\pagefile.sys
Description           : C:\pagefile.sys
InstallDate           : 21/06/2023 5:53:04 AM
AllocatedBaseSize     : 2432
PeakUsage             : 553
TempPageFile          : False
PSComputerName        :
CimClass              : root/cimv2:Win32_PageFileUsage
CimInstanceProperties : {Caption, Description, InstallDate, Name...}
CimSystemProperties   : Microsoft.Management.Infrastructure.CimSystemProperties

@mattpoel
Author

mattpoel commented Aug 9, 2023

@atj the problem might only start once the pagefile is actually used, so systems without pagefile consumption might be fine. I would have to go through all the Windows servers where we adapted the thresholds to TB or PB.

@atj
atj commented Aug 9, 2023

@mattpoel, the issue is that the call to New-IcingaCheck in Invoke-IcingaCheckMemory.psm1 specifies the unit as megabytes when the value is actually in bytes.

So if your pagefile usage is zero then the issue won't be triggered:

 [OK] Memory Usage
 \_ [OK] PageFile Usage
    \_ [OK] C:\pagefile.sys: 0B
 \_ [OK] Used Memory: 14.89% (4.76GiB)
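
As a rough cross-check of this diagnosis, the inflated figures from the reports can be reproduced by hand. The sketch below assumes (this is not confirmed in the thread) that the plugin first converts CurrentUsage from megabytes to bytes using 1048576 bytes per megabyte, and that the framework then treats that byte count as decimal megabytes. Get-MislabeledPageFileBytes is a hypothetical helper name and not part of the Icinga plugins.

```powershell
# Hypothetical helper (not part of icinga-powershell-plugins):
# shows what happens when a byte count is labeled as 'MB'.
function Get-MislabeledPageFileBytes {
    param(
        [Parameter(Mandatory)][long]$CurrentUsageMB
    )

    # Assumption: the plugin converts the WMI megabyte value to bytes (1MB = 1048576)
    $actualBytes = $CurrentUsageMB * 1MB

    # Assumption: the byte count is then interpreted as decimal megabytes (x 1,000,000)
    $reportedBytes = $actualBytes * 1000000

    [pscustomobject]@{
        ActualBytes   = $actualBytes
        ReportedBytes = $reportedBytes
        ReportedTB    = [math]::Round($reportedBytes / 1e12, 2)
    }
}

# CurrentUsage values copied from the reports in this issue
$first  = Get-MislabeledPageFileBytes -CurrentUsageMB 150
$second = Get-MislabeledPageFileBytes -CurrentUsageMB 339
$zero   = Get-MislabeledPageFileBytes -CurrentUsageMB 0
$first, $second, $zero
```

Under these assumptions a CurrentUsage of 150 yields 157286400000000 bytes, which matches the performance data in the first report, and a CurrentUsage of 0 stays at 0 regardless of the unit, which is consistent with the working system.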

If you want to try the fix, you can do the following:

Edit C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-plugins\plugins\Invoke-IcingaCheckMemory.psm1 and make the following change on line 132:

-                    -Unit 'MB' `
+                    -Unit 'B' `

Then run Copy-IcingaFrameworkCacheTemplate, open a new PS session and re-run the check:

> Invoke-IcingaCheckMemory -PageFileWarning 10GB -PageFileCritical 20GB -Verbosity 3
[CRITICAL] Memory Usage [CRITICAL] PageFile Usage (All must be [OK])
\_ [CRITICAL] PageFile Usage (All must be [OK])
   \_ [CRITICAL] C:\pagefile.sys: 363.86TB is greater than threshold 21.47GB
\_ [OK] Used Memory: 3.85GiB
| 'memory::ifw_memory::used'=4136940000B;;;0;4294414000 'cpagefilesys::ifw_pagefile::used'=363855900000000B;10737418240;21474836480;0;12884900000000000
2

> # edit Invoke-IcingaCheckMemory.psm1
> Copy-IcingaFrameworkCacheTemplate

> # open new PS session
> Invoke-IcingaCheckMemory -PageFileWarning 10GB -PageFileCritical 20GB -Verbosity 3
[OK] Memory Usage (All must be [OK])
\_ [OK] PageFile Usage (All must be [OK])
   \_ [OK] C:\pagefile.sys: 452.00MiB
\_ [OK] Used Memory: 3.85GiB
| 'memory::ifw_memory::used'=4129587000B;;;0;4294414000 'cpagefilesys::ifw_pagefile::used'=473956400B;10737418240;21474836480;0;12884900000
0

@mattpoel
Author

mattpoel commented Aug 9, 2023

@atj looking good. However, Icinga is still reporting the wrong usage after the modification (the service was restarted). Do you know what could be the cause of this?

@atj
atj commented Aug 9, 2023

@mattpoel: that's strange, I'd expect the changes to take effect after a service restart. Are you using the API check forwarder?

@mattpoel
Author

mattpoel commented Aug 9, 2023

@atj: It's a "default" setup. No explicit configuration.

We stumbled upon another issue on a Windows 2016 server, where IcingaCheckMemory (only via Icinga) will report an invalid threshold for the pagefile. Per Icinga command definition, the following command is executed from the debug.log:

W:\>C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NoLogo -ExecutionPolicy ByPass -C "try { Use-Icinga -Minimal; } catch { Write-Output 'The Icinga PowerShell Framework is either not installed on the system or not configured properly. Please check https://icinga.com/docs/windows for further details'; Write-Output 'Error:' $($_.Exception.Message)Components:`r`n$( Get-Module -ListAvailable 'icinga-powershell-*' )`r`n'Module-Path:'`r`n$($Env:PSModulePath); exit 3; }; Exit-IcingaExecutePlugin -Command 'Invoke-IcingaCheckMemory' " -Warning 98% -Critical 99% -PageFileWarning 4GB -PageFileCritical 16GB -IncludePageFile @() -ExcludePageFile @() -Verbosity 2
Exception calling "WarnOutOfRange" with "1" argument(s):  "[UNKNOWN]: Icinga Invalid Input Error was thrown: ConversionUnitMissing

Unable to parse input value. You have to add an unit to your input value. Example: "10GB". Allowed units are: "B, KB, MB, GB, TB, PB, KiB, MiB, GiB, TiB, PiB"."

A "pure" execution of Invoke-IcingaCheckMemory doesn't report the error and even a percent threshold is working:

PS C:\Windows\system32> Invoke-IcingaCheckMemory -Warning 98% -Critical 99% -PageFileWarning 4GB -PageFileCritical 16GB -IncludePageFile @() -ExcludePageFile @() -Verbosity 2
[WARNING] Memory Usage [WARNING] PageFile Usage
\_ [WARNING] PageFile Usage
   \_ [WARNING] C:\pagefile.sys: 4.39GiB is greater than threshold 4GiB
\_ [OK] Used Memory: 94.57% (18.91GiB)
| 'memory::ifw_memory::used'=20309230000B;21044814000;21259557000;0;21474300000 'cpagefilesys::ifw_pagefile::used'=4715446000B;4294967296;17179869184;0;8321499000
1
PS C:\Windows\system32> Invoke-IcingaCheckMemory -Warning 98% -Critical 99% -PageFileWarning 80% -PageFileCritical 90% -IncludePageFile @() -ExcludePageFile @() -Verbosity 2
[OK] Memory Usage
\_ [OK] PageFile Usage
   \_ [OK] C:\pagefile.sys: 56.67% (4.39GiB)
\_ [OK] Used Memory: 94.64% (18.93GiB)
| 'memory::ifw_memory::used'=20322400000B;21044814000;21259557000;0;21474300000 'cpagefilesys::ifw_pagefile::used'=4715446000B;6657199200;7489349100;0;8321499000
0

This is only happening on this specific Windows 2016 server where we just deployed Icinga for Windows.

Shall I report this problem in a separate issue?

Thanks a lot in advance!

@atj
atj commented Aug 9, 2023

@mattpoel: I believe the cause of the invalid threshold error is the same as this issue, so my fix should resolve it:

> Invoke-IcingaCheckMemory -PageFileWarning 5% -PageFileCritical 10% -Verbosity 3
[OK] Memory Usage (All must be [OK])
\_ [OK] PageFile Usage (All must be [OK])
   \_ [OK] C:\pagefile.sys: 0.57% (70MiB)
\_ [OK] Used Memory: 2.51GiB
| 'memory::ifw_memory::used'=2690613000B;;;0;4294414000 'cpagefilesys::ifw_pagefile::used'=73400320B;644245000;1288490000;0;12884900000
0

My guess as to why you're seeing different behaviour between Exit-IcingaExecutePlugin and Invoke-IcingaCheckMemory is that one invocation is using the cache file and the other isn't. Both the PowerShell Framework and the plugins module create a cache file for performance reasons, which is essentially all of the individual .psm1 files concatenated together. You should find the plugin cache at C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-plugins\compiled\icinga-powershell-plugins.ifw_compilation.psm1. Open it in a text editor and check whether the changes you made to Invoke-IcingaCheckMemory.psm1 have been propagated to it. If not, try running Copy-IcingaFrameworkCacheTemplate.
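
Instead of opening the cache in a text editor, the check can be scripted. This is only a sketch: Find-LiteralText is a hypothetical helper name, and a match is merely a hint, because other plugins in the compiled file may legitimately pass -Unit 'MB'. The line numbers in the output show where to look.

```powershell
# Hypothetical helper (not part of the Icinga framework):
# lists the lines of a file that contain a literal string.
function Find-LiteralText {
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(Mandatory)][string]$Text
    )

    # -SimpleMatch disables regex handling, so quotes and dashes are matched literally
    Select-String -LiteralPath $Path -Pattern $Text -SimpleMatch |
        Select-Object -Property LineNumber, Line
}

# Cache path as given in the comment above
$cache = 'C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-plugins\compiled\icinga-powershell-plugins.ifw_compilation.psm1'

if (Test-Path -LiteralPath $cache) {
    # Any hit near the pagefile check suggests the cache still holds the old code
    Find-LiteralText -Path $cache -Text "-Unit 'MB'"
}
```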

@atj
atj commented Aug 11, 2023

As a final update on this: contrary to what I stated in my previous comment, Copy-IcingaFrameworkCacheTemplate doesn't recompile the plugins cache. I haven't been able to find a way to trigger a recompile, so I've ended up calling Write-IcingaForWindowsComponentCompilationFile manually, as per the template file:

https://github.com/Icinga/icinga-powershell-framework/blob/093c5e7935eb57a4e12a7056806b023ed360e09e/templates/compilation.psm1.template#L1

The following PowerShell snippet will fix the plugin, recompile the plugins cache and restart the Icinga 2 service:

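# Abort on the first error so the cache is not rebuilt from a half-edited file
$ErrorActionPreference = 'Stop';

# Patch the unit in the plugin source
$file = 'C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-plugins\plugins\Invoke-IcingaCheckMemory.psm1';
$content = Get-Content -Path $file -Raw;
# -NoNewline avoids appending an extra trailing newline to the raw content
$content.Replace("-Unit 'MB'", "-Unit 'B'") | Set-Content -Path $file -NoNewline;

# Recompile the plugins cache from the patched sources
Write-IcingaForWindowsComponentCompilationFile `
    -ScriptRootPath 'C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-plugins\plugins' `
    -CompiledFilePath 'C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-plugins\compiled\icinga-powershell-plugins.ifw_compilation.psm1';

# Restart the Icinga 2 service so it picks up the new cache
Restart-IcingaWindowsService;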

@LordHepipud
Collaborator

Thanks for all the input. You are correct, the unit was the error in the plugin. I'm still not sure why only some of my test machines reported wrong values while others worked fine with v1.11.0.

I will provide a fix for this with v1.11.1.

@LordHepipud
Collaborator

You can simply run

Publish-IcingaForWindowsComponent -Name plugins

by the way. This will force the cache rebuild. In addition starting with v1.11.0, you can use

icinga -RebuildCache -NoNewInstance

which will then compile the cache for all components

@atj
atj commented Aug 15, 2023

After finding the actual cause of this issue I went to the effort of submitting an (admittedly trivial) PR to fix it, but you created your own and merged it without even notifying me. If Icinga wants to encourage contributions from partners, this isn't the way to do it.

@LordHepipud
Collaborator

Hello, I'm really sorry about that - I didn't see your PR for this issue 🙁
Sorry that this slipped through 🙁

@geotekberlin

May I politely ask when we can expect v1.11.1 to be released? We are waiting to deploy several new hosts, but can't because the PS "update from snapshot option" doesn't offer to install anything.

IMO if a released version has severe bugs, like complete failure of check plugins, either a new patch version should be made available as soon as the issue is fixed, or the whole rollout procedure should be reconsidered. Something like a Release Candidate or Beta channel for early adopters would help, but then the released software should be more stable than it is now.
