Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perfhelper.ps1 should check if perf counters started before creating perfhash.hsh #83

Open
robertkaucher opened this issue Apr 15, 2019 · 5 comments

Comments

@robertkaucher
Copy link

We had a few new systems added this week and the "Server Manager Performance Monitor" collector set had not been started on these and the metric plugins were sending strange/no data. I realized the counters were not started and started them on all the servers but this did not have any impact on the metrics. I found that Get-PerformanceCounterByID in perfhelper.ps1 was creating a cache file but was not checking if the performance counters were started and if they were not most of the metrics would not function correctly. I'm not sure if the scripts should start the counters or simply not create the $hashFile unless the counters are started.

$datacollectorset = New-Object -COM Pla.DataCollectorSet
$datacollectorset.Query("Server Manager Performance Monitor",$null)
if ($datacollectorset.Status -ne 0) {
	Export-Clixml -InputObject $perfHash -Path $hashfile
}

Or perhaps the counters should be started since anyone attempting to use the plugins will need this to be the case to use them any way.

function Get-PerformanceCounterByID
{
    param
    (
        [Parameter(Mandatory=$true)]
        $Name
    )
    $hashfile = (Join-Path $PSScriptRoot perfhash.hsh)
   $datacollectorset = New-Object -COM Pla.DataCollectorSet
   $datacollectorset.Query("Server Manager Performance Monitor",$null)
   if ($datacollectorset.Status -eq 0) {
	$datacollectorset.Start($true)
        if((Test-Path $hashFile)){
             Remove-Item $hashFile
        }
   }
    if ([System.IO.File]::Exists($hashfile)) {       
        $perfHash = Import-Clixml -Path $hashfile
    }
    if ($perfHash -eq $null)
    {
        $key = 'Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib\009'
        $counters = (Get-ItemProperty -Path $key -Name Counter).Counter
        $perfHash = @{}
        $all = $counters.Count
         for($i = 0; $i -lt $all; $i+=2)
        {
           $perfHash.$($counters[$i+1]) = $counters[$i]
        }
        Export-Clixml -InputObject $perfHash -Path $hashfile
    }
    $perfHash.$Name
}

It's probably wise to perform a synchronous start on the data collector set and delete the $hashFile if it has been previously created.
https://docs.microsoft.com/en-us/windows/desktop/api/pla/nf-pla-idatacollectorset-start

@robertkaucher
Copy link
Author

I just tested this and it looks like starting the data collector set will not work. At the least, the $hashFile probably should not be created.

The operator or administrator has refused the request. (Exception from 
HRESULT: 0x800710E0)
At C:\sensu-go\plugins\PowerShell\perfhelper.ps1:44 char:2
+     $datacollectorset.Start($true)
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationStopped: (:) [], COMException
    + FullyQualifiedErrorId : System.Runtime.InteropServices.COMException

@majormoses
Copy link
Member

@csabo any thoughts? I don't really know what I am talking about with windows...

@csabo
Copy link
Contributor

csabo commented Apr 16, 2019

@majormoses I haven’t personally seen this issue, but it looks legit, I’ll try to squeeze some testing in tomorrow morning, and then work with @robertkaucher on a fix as needed.

Sent with GitHawk

@csabo
Copy link
Contributor

csabo commented Apr 17, 2019

@robertkaucher So I agree that dropping the hashFile will be the way to do this. Rather than using the Get-* helper functions, I did a quick test just using Get-Counter directly, and I was seeing about a 1 second per Get-Counter call on total execution time, which is quicker than I was seeing using the original helper functions. Below is my test with metrics-windows-system.ps1.

If starting the counters doesn't work, would wrapping in some conditional to return a message stating the counter isn't started, or something of the like meet your needs?

param(
    [switch]$UseFullyQualifiedHostname
    )

$ThisProcess = Get-Process -Id $pid
$ThisProcess.PriorityClass = "BelowNormal"

. (Join-Path $PSScriptRoot perfhelper.ps1)

if ($UseFullyQualifiedHostname -eq $false) {
    $Path = ($env:computername).ToLower()
}else {
    $Path = [System.Net.Dns]::GetHostEntry([string]"localhost").HostName.toLower()
}

$count_interrupt = [System.Math]::Round((Get-Counter "\Processor Information(_total)\Interrupts/sec" -SampleInterval 1 -MaxSamples 1).CounterSamples.CookedValue)
$count_context = [System.Math]::Round((Get-Counter "\System\Context Switches/sec" -SampleInterval 1 -MaxSamples 1).CounterSamples.CookedValue)
$Time = DateTimeToUnixTimestamp -DateTime (Get-Date)

Write-Host "$Path.system.irq_per_second $count_interrupt $Time"
Write-Host "$Path.system.context_switches_per_second $count_context $Time"

@robertkaucher
Copy link
Author

@csabo I think what should be done is more of a philosophical question about how plugin errors should be communicated to Sensu users. Using write-host may not be ideal because then someone is going to need to be running the scripts manually to see it. If there is already a defined approach for what to do when plugins start throwing errors, I'd just follow that and maybe use write-host as a courtesy. It was by manually running the plugins that we discovered the issues after all. As long as the cache file is not created before the counters are started, I'll be happy with the solution.

I suppose another concern will be if additional metrics are added at a later point and new counters need to be included that were not available when the system was initially added to Sensu. How necessary is the hash file? If it doesn't need to be created and this would not impact the performance of the metric plugins, then that would be the best option as not generating it fixes all of the problems in one go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants