Start-RSJob -Timeout #186
Comments
Yes, it applies to the entire batch, but do you really need anything else?
I can see why this would be needed. I run scans across our network on a few thousand machines, and sometimes 1-2 machines do not error out when I cannot remote into them. The script never completes because of those. If each job had its own timeout, I could mitigate those issues. I'd prefer a per-job timeout over a batch timeout. That way I can be sure the entire scan finishes: machines that never error out are still cleaned up, so the script can end properly.
@brian-heidrich That is exactly the issue I am trying to resolve. Particularly when dealing with WMI, I always see a few hung jobs out of a couple thousand. It is likely not very elegant, but this is the workaround I have developed:
$RSJobTimeout = 3600
$RSJobPollingMilliseconds = 2000
$RSBatchTimeout = 3600 * 24
$RSJobArguments = @{
    Throttle        = $Throttle
    Batch           = $Batch
    FunctionsToLoad = $FunctionsToLoad
    Name            = $Name
    ScriptBlock     = $ScriptBlock
}
$RSJob = $Computers | Start-RSJob @RSJobArguments
$RSJob | ForEach-Object { $_ | Add-Member -NotePropertyName ExpirationTime -NotePropertyValue (Get-Date).AddSeconds($RSJobTimeout) }
# Record batch start time.
$BatchStartTime = Get-Date
While ($true) {
    $StartTime = Get-Date
    # Update ExpirationTime on NotStarted jobs so the per-job clock only runs once a job has started.
    $NotStartedRSJobs = Get-RSJob -Batch $Batch -State NotStarted
    $NotStartedRSJobs | ForEach-Object { $_.ExpirationTime = (Get-Date).AddSeconds($RSJobTimeout) }
    # Count the number of unfinished jobs before acting on the queues to ensure we don't miss any output.
    $UnfinishedRSJobs = @(Get-RSJob -Batch $Batch | Where-Object -Property State -NE "Completed")
    if ($UnfinishedRSJobs.Count -gt 0) {
        $CurrentTime = Get-Date
        # Remove every unfinished job whose ExpirationTime has passed.
        $UnfinishedRSJobs | ForEach-Object {
            if (($CurrentTime - $_.ExpirationTime).TotalSeconds -gt 0) { $_ | Remove-RSJob -Force }
        }
    }
    # Output completed jobs.
    $FinishedRSJobs = Get-RSJob -Batch $Batch -State Completed
    $FinishedRSJobs | Receive-RSJob
    $FinishedRSJobs | Remove-RSJob -Force
    if ($UnfinishedRSJobs.Count -eq 0 -or ((Get-Date) - $BatchStartTime).TotalSeconds -gt $RSBatchTimeout) {
        # All jobs finished or the batch timed out.
        break
    } else {
        # Calculate how long we just spent polling the queues.
        $LoopDuration = ((Get-Date) - $StartTime).TotalMilliseconds
        # Use that to adjust how long we wait.
        Start-Sleep -Milliseconds ([System.Math]::Max(0, $RSJobPollingMilliseconds - $LoopDuration))
    }
}
# Remove background jobs. If the batch timed out some of them may still be running.
Get-RSJob -Batch $Batch | Remove-RSJob -Force
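The expiry rule in the workaround above (queued jobs never expire, started jobs expire once they exceed the per-job timeout) can be looked at in isolation. The following is only a minimal sketch with no dependency on PoshRSJob: the function name `Test-JobExpired`, the `StartedAt` property, and the sample job objects are all hypothetical and exist purely for illustration.

```powershell
# Hypothetical helper: decides whether a job has exceeded its per-job timeout.
function Test-JobExpired {
    param(
        [Parameter(Mandatory)]$Job,
        [Parameter(Mandatory)][datetime]$Now,
        [Parameter(Mandatory)][int]$TimeoutSeconds
    )
    # A job that is still queued has used none of its time budget yet.
    if ($Job.State -eq 'NotStarted' -or $null -eq $Job.StartedAt) { return $false }
    return ($Now - $Job.StartedAt).TotalSeconds -gt $TimeoutSeconds
}

# Made-up sample jobs standing in for real RSJob objects.
$now = Get-Date
$jobs = @(
    [pscustomobject]@{ Name = 'queued'; State = 'NotStarted'; StartedAt = $null }
    [pscustomobject]@{ Name = 'fresh';  State = 'Running';    StartedAt = $now.AddSeconds(-10) }
    [pscustomobject]@{ Name = 'hung';   State = 'Running';    StartedAt = $now.AddSeconds(-7200) }
)

# With a 3600-second timeout only the job running for two hours is expired.
$expired = @($jobs | Where-Object { Test-JobExpired -Job $_ -Now $now -TimeoutSeconds 3600 })
$expired.Name
```

Passing the current time in as a parameter keeps the check deterministic, which makes it easier to reason about than calling `Get-Date` inside the helper.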
So, what exactly is the problem with
in this scenario? Does it never time out, or what?
@MVKozlov the issue is that the Timeout value needs to be guessed. Scenarios:
The code above sets both: a timeout for each individual RSJob and a timeout for the whole batch. It returns results as soon as feasible while terminating individual unresponsive jobs, and the batch timeout handles runaway processes. One more important point: a timeout on individual jobs maximizes throttle usage. Assuming a throttle value of 5 (the default), if one job hangs, only 4 jobs can run concurrently; if two hang, only 3. That is inefficient when dealing with a large number of endpoints.
@MVKozlov Exactly what @opustecnica said. I never know for sure how long my global timeout needs to be, because the number of machines on my network keeps changing. We are always imaging new machines and taking machines off the network, so there is never a fixed count I could use to size the global timeout. A per-job timeout would be a lot more useful, as I know how long one job should take.
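To put rough numbers on the throttle argument, here is a back-of-the-envelope sketch. It assumes every healthy job takes the same time and that hung jobs hold their slot forever; the function name `Get-EstimatedBatchSeconds` and the figures (2000 machines, 30 seconds per job) are hypothetical.

```powershell
# Hypothetical estimate of batch duration when some throttle slots are held by hung jobs.
function Get-EstimatedBatchSeconds {
    param(
        [Parameter(Mandatory)][int]$JobCount,
        [Parameter(Mandatory)][double]$SecondsPerJob,
        [int]$Throttle = 5,
        [int]$HungJobs = 0
    )
    # Slots still available for healthy jobs.
    $freeSlots = $Throttle - $HungJobs
    # With no free slot left, the batch can never finish on its own.
    if ($freeSlots -le 0) { return [double]::PositiveInfinity }
    # Jobs run in waves of $freeSlots; each wave takes one job duration.
    return [math]::Ceiling([double]$JobCount / $freeSlots) * $SecondsPerJob
}

$healthy  = Get-EstimatedBatchSeconds -JobCount 2000 -SecondsPerJob 30
$twoHung  = Get-EstimatedBatchSeconds -JobCount 2000 -SecondsPerJob 30 -HungJobs 2
$allStuck = Get-EstimatedBatchSeconds -JobCount 2000 -SecondsPerJob 30 -HungJobs 5
```

Under these assumptions two hung jobs stretch the run from 12000 to 20010 seconds, and five hung jobs stall it entirely, which is the case only a per-job timeout (or a batch timeout) can break.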
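For readers unfamiliar with how a per-job timeout can work at the runspace level, the sketch below uses only the built-in `[powershell]` API, not PoshRSJob. The function name `Invoke-WithTimeout` is hypothetical, it runs one script block at a time rather than a throttled pool, and `Stop()` may itself block if the underlying call (for example a stuck WMI request) cannot be interrupted.

```powershell
# Hypothetical helper: run a script block in its own runspace and give up after a timeout.
function Invoke-WithTimeout {
    param(
        [Parameter(Mandatory)][scriptblock]$ScriptBlock,
        [object]$Argument,
        [int]$TimeoutMilliseconds = 5000
    )
    $ps = [powershell]::Create()
    try {
        # Pass the script as text so it is evaluated inside the new runspace.
        [void]$ps.AddScript($ScriptBlock.ToString()).AddArgument($Argument)
        $handle = $ps.BeginInvoke()
        if ($handle.AsyncWaitHandle.WaitOne($TimeoutMilliseconds)) {
            # Finished in time: collect the output.
            [pscustomobject]@{ TimedOut = $false; Output = $ps.EndInvoke($handle) }
        } else {
            # Took too long: request a stop and report the timeout.
            $ps.Stop()
            [pscustomobject]@{ TimedOut = $true; Output = $null }
        }
    } finally {
        $ps.Dispose()
    }
}

# A quick job completes; a sleeping job is cut off after half a second.
$quick = Invoke-WithTimeout -ScriptBlock { param($x) $x * 2 } -Argument 21 -TimeoutMilliseconds 5000
$slow  = Invoke-WithTimeout -ScriptBlock { Start-Sleep -Seconds 30 } -TimeoutMilliseconds 500
```

The key point is that the clock starts when the work actually begins, so the timeout can be sized to the expected duration of a single job regardless of how many machines are in the batch.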
Do {
    $JustFinishedJobs = New-Object System.Collections.ArrayList
    $RunningJobs = New-Object System.Collections.ArrayList
    ForEach ($WaitJob in $WaitJobs) {
        If ($WaitJob.State -match 'Completed|Failed|Stopped|Suspended|Disconnected') {
            [void]$JustFinishedJobs.Add($WaitJob)
        } Else {
            [void]$RunningJobs.Add($WaitJob)
        }
    }
    $WaitJobs = $RunningJobs
    # Wait just a bit so the HasMoreData can update if needed
    Start-Sleep -Milliseconds 100
    $JustFinishedJobs
    $Completed += $JustFinishedJobs.Count
    if ($Timeout) {
        if ((New-Timespan $Date).TotalSeconds -ge $Timeout) {
            $TimedOut = $True
            break
        }
    }
}
While ($Waitjobs.Count -ne 0)
@opustecnica, @brian-heidrich The only reason I see for an individual timeout is throttling. The actual job start may be long after
Unfortunately, @proxb has not been here since April, so if I have some time, I will try to implement it next week in my own fork.
Implemented RunDate property to reflect actual job start.
Refactored job.State.
Added -PerJobTimeout switch to Wait-RSJob.
Added tests for the new work mode.
Please test.
Sorry, the throttling slot is still not freed because
But now you can use it like this:
$goodJobs = $data | Start-RSJob { ... } | Wait-RSJob -Timeout 120 -PerJobTimeout
$goodJobs | WorkOnThis
$goodJobs | Remove-RSJob
$badJobs = Get-RSJob
Do you want to request a feature or report a bug?
Would it be possible to add a "-Timeout" feature to Start-RSJob?
What is the current behavior?
There is currently no timeout. When working with a large number of computers, particularly when making WMI calls, a few might suffer from a hung WMI connection. When piped to Wait-RSJob, the process never completes because of the still-running RSJobs. The Wait-RSJob Timeout applies, if I understand correctly, to the entire batch of RSJobs, and there seems to be no way to apply a timeout to individual RSJobs.
Thank you.