Invoke-Pester with Code-Coverage taking longer with latest OS update #1318

Closed
Mahendra1260 opened this issue May 31, 2019 · 56 comments

@Mahendra1260

Mahendra1260 commented May 31, 2019

1. General summary of the issue

Hi Team,
We run our Pester test cases with Invoke-Pester and the -CodeCoverage flag. The build used to take a maximum of 10 minutes.
After the recent Microsoft OS updates, the build time of Invoke-Pester increased drastically to around 40 minutes.
I am wondering if this is related to KB article kb4494441.
After I uninstall the KB, my build time is back to the normal 10 minutes. Is there something we need to do in Pester in order to be compatible with the OS patch?

2. Describe Your Environment

Operating System: Windows 10
Pester version: 4.8.1

3. Expected Behavior

The build should take around 11 minutes.

4. Current Behavior

The build is taking around 42 minutes.

5. Possible Solution

6. Context

@nohwnd
Member

nohwnd commented May 31, 2019

Hello, thanks for the report. Could you provide more info please? This is pretty hard to guess, because it can be caused by Pester, by your code running in Pester, or both.

Are you able to identify where it slows down?
Is every test simply slower, or are there some tests that take way longer while others run in approximately the same time?
Are you sure this is because of CodeCoverage? Why do you think that?

@DavidGibsonNorth

DavidGibsonNorth commented May 31, 2019

I'm seeing the same issue on Windows 10 and Server 2016 machines with recent Windows Updates. I've got better info on Server 2016. Tests with coverage on are taking 10 times longer on servers with kb4494440 (May 2019) than on servers with kb4493470 (April 2019).

Proven on one server by removing kb4494440, rebooting, and seeing our whole test suite drop from 11 minutes back to 1 minute.

> Is every test simply slower, or are there some tests that take way longer while others run in approximately the same time?
Slowdown is on every test. About 5-20 times slower on each module.

Demo of the issue using Pester's own tests

  • Download https://github.com/pester/Pester/archive/4.8.1.zip and extract it.
Import-Module .\Pester-4.8.1\Pester.psm1
$testDir = (Get-Module Pester).Path | Split-Path | Join-Path -ChildPath 'Functions\Assertions'
(Invoke-Pester -Script "$testDir\Be.Tests.ps1" -PassThru -Show None).Time.TotalSeconds
(Invoke-Pester -Script "$testDir\Be.Tests.ps1" -PassThru -Show None -CodeCoverage "$testDir\*.ps1").Time.TotalSeconds

The times with CodeCoverage on go up from 1 second to 10 on machines with kb4494440.

I've not spotted any obvious cause in the support article. There are a load of security fixes, but not much detail.

@Mahendra1260
Author

Mahendra1260 commented May 31, 2019

Thanks @nohwnd for addressing it.
All the tests are taking slightly longer than usual. Once I uninstall the particular update (kb4494441) from the machine, everything runs as it did before. This is why I feel it's something to do with Pester and the way it runs.
I doubt that it could be due to my code running in Pester, because it was working fine until the particular patch was applied, and it works fine again if I uninstall the patch.

This is how I am running Pester:
Invoke-Pester $TestFiles -ExcludeTag $excludeTags -CodeCoverage $CodeFiles -PassThru -OutputFormat NUnitXml -OutputFile $config.testResultsFile -CodeCoverageOutputFile $config.coverageFile -Show Fails

Also, I feel it's due to code coverage, as the time taken by Pester is long with the -CodeCoverage flag.
Without the -CodeCoverage flag, the tests complete pretty smoothly.
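
For what it's worth, one quick way to quantify that difference on a single machine is to time the same invocation with and without the flag. A minimal sketch, reusing the variables from the command above (the timings will obviously vary per machine):

# Same invocation as above, once with and once without -CodeCoverage, timed with Measure-Command.
$withCoverage = Measure-Command { Invoke-Pester $TestFiles -ExcludeTag $excludeTags -CodeCoverage $CodeFiles -Show None }
$noCoverage   = Measure-Command { Invoke-Pester $TestFiles -ExcludeTag $excludeTags -Show None }
'With -CodeCoverage: {0:N0}s  Without -CodeCoverage: {1:N0}s' -f $withCoverage.TotalSeconds, $noCoverage.TotalSeconds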

@nohwnd
Member

nohwnd commented Jun 2, 2019

Okay, thank you both for the info. I will try to find out what is wrong. Code coverage depends on breakpoints, so hopefully it is not that, and instead it is something less fundamental, like adding to an array being slower on the patched systems.
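
For readers unfamiliar with the mechanism: Pester's code coverage places a line breakpoint on every command it wants to track and later inspects the hit counts. A minimal sketch of the idea (not Pester's actual implementation; the throwaway script below stands in for the code under test):

# Generate a throwaway 20-line script standing in for the covered file.
$target = Join-Path $env:TEMP 'CoverageDemo.ps1'
Set-Content -Path $target -Value (1..20 | ForEach-Object { "`$null = $_" })

# Put a non-breaking line breakpoint on each line we want to cover.
$breakpoints = 1..20 | ForEach-Object {
    Set-PSBreakpoint -Script $target -Line $_ -Action {}   # empty action: count the hit, keep running
}

& $target   # stands in for running the tests that exercise the covered file

# A line counts as covered when its breakpoint was hit at least once.
$covered = $breakpoints | Where-Object { $_.HitCount -gt 0 }
'{0} of {1} lines covered' -f $covered.Count, $breakpoints.Count

# Remove the breakpoints so later code is not slowed down by them.
$breakpoints | Remove-PSBreakpoint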

@nohwnd nohwnd self-assigned this Jun 2, 2019
@DavidGibsonNorth

Thanks @nohwnd. If you want any more info, or something tested, let me know.

One minor bit of info: I'm seeing the slowdown on a mix of hardware - VMs, tablets & desktops.

@nohwnd
Member

nohwnd commented Jun 3, 2019

Great to know, I will most likely be testing it on a VM.

@nohwnd
Member

nohwnd commented Jun 3, 2019

Yeah, one more thing: are you running it on PowerShell 5 or 6, or both?

@DavidGibsonNorth

PowerShell 5. Using Pester 4.8.1 from GitHub, and Pester 4.8.1 & 4.4.0 from PSGallery.

@Mahendra1260
Author

Major  Minor  Build  Revision
-----  -----  -----  --------
5      1      17763  503

@nohwnd: I am mainly facing the issue when I pass the -CodeCoverage flag while invoking Pester.

@tfabris

tfabris commented Jun 4, 2019

@nohwnd - My company is experiencing the exact same problem. It only occurs on some of our test agent machines, not all of them, and I'm not sure what the difference is between the machines. Pester module version is 4.8.1.

Same symptom: when running code coverage, tests take longer to run. Though each individual test case only takes a few extra seconds, across hundreds of tests it adds up to real money eventually: test suites which used to take about 10-15 minutes to run are now taking about 45-60 minutes.

The problem started occurring for us on approximately 2019-05-16; perhaps there was a change in the operating system, or in PowerShell, on the prior Patch Tuesday that Pester isn't very compatible with?

Example output from slow and fast tests; both of these are with Pester 4.8.1, but on different computers:

Fast:

[21:07:06] :	 [Step 2/2] Executing unit tests for 'C:\BuildAgent\work\b41c9303f646ade0\BetaModules\BitTitan.Runbooks.MSOnline\Functions\Block-UntilMSOnlineUserAvailable.ps1'.
[21:07:07] :	 [Step 2/2] Executing all tests in 'C:\BuildAgent\work\b41c9303f646ade0\BetaModules\BitTitan.Runbooks.MSOnline\Functions\Block-UntilMSOnlineUserAvailable.Tests.ps1' with Tags unit
[21:07:07] :	 [Step 2/2] 
[21:07:07] :	 [Step 2/2] Executing script C:\BuildAgent\work\b41c9303f646ade0\BetaModules\BitTitan.Runbooks.MSOnline\Functions\Block-UntilMSOnlineUserAvailable.Tests.ps1
[21:07:07] :	 [Step 2/2] 
[21:07:07] :	 [Step 2/2]   Describing BitTitan.Runbooks.MSOnline/Block-UntilMSOnlineUserAvailable
[21:07:07] :	 [Step 2/2] 
[21:07:07] :	 [Step 2/2]     Context when the user is immediately available
[21:07:07] :	 [Step 2/2]       [+] returns the user after trying once 81ms
[21:07:07] :	 [Step 2/2] 
[21:07:07] :	 [Step 2/2]     Context when the user is available after 10 tries
[21:07:08] :	 [Step 2/2]       [+] returns the user after trying 10 times 605ms
[21:07:08] :	 [Step 2/2] 
[21:07:08] :	 [Step 2/2]     Context when the user is not available
[21:07:09] :	 [Step 2/2]       [+] returns null after trying the maximum number of times and outputs an error 1.16s
[21:07:09] :	 [Step 2/2] Tests completed in 2.5s

Slow:

[21:07:06] :	 [Step 2/2] Executing unit tests for 'C:\BuildAgent\work\b41c9303f646ade0\BetaModules\BitTitan.Runbooks.MSOnline\Functions\Block-UntilMSOnlineUserAvailable.ps1'.
[21:07:06] :	 [Step 2/2] Executing all tests in 'C:\BuildAgent\work\b41c9303f646ade0\BetaModules\BitTitan.Runbooks.MSOnline\Functions\Block-UntilMSOnlineUserAvailable.Tests.ps1' with Tags unit
[21:07:06] :	 [Step 2/2] 
[21:07:06] :	 [Step 2/2] Executing script C:\BuildAgent\work\b41c9303f646ade0\BetaModules\BitTitan.Runbooks.MSOnline\Functions\Block-UntilMSOnlineUserAvailable.Tests.ps1
[21:07:06] :	 [Step 2/2] 
[21:07:06] :	 [Step 2/2]   Describing BitTitan.Runbooks.MSOnline/Block-UntilMSOnlineUserAvailable
[21:07:07] :	 [Step 2/2] 
[21:07:07] :	 [Step 2/2]     Context when the user is immediately available
[21:07:07] :	 [Step 2/2]       [+] returns the user after trying once 286ms
[21:07:07] :	 [Step 2/2] 
[21:07:07] :	 [Step 2/2]     Context when the user is available after 10 tries
[21:07:11] :	 [Step 2/2]       [+] returns the user after trying 10 times 2.78s
[21:07:11] :	 [Step 2/2] 
[21:07:11] :	 [Step 2/2]     Context when the user is not available
[21:07:16] :	 [Step 2/2]       [+] returns null after trying the maximum number of times and outputs an error 5.26s
[21:07:16] :	 [Step 2/2] Tests completed in 10.1s

@tfabris

tfabris commented Jun 4, 2019

Update: Machine differences:

RUNS PESTER SLOW:
ver
Microsoft Windows [Version 10.0.14393]
winver
Version 1607 (OS Build 14393.2999)

RUNS PESTER FAST:
ver
Microsoft Windows [Version 10.0.14393]
winver
Version 1607 (OS Build 14393.2969)

@nohwnd
Member

nohwnd commented Jun 5, 2019

@SteveL-MSFT are you aware of anything specific that really slowed down in PowerShell 5 after kb4494440? I searched the PowerShell repo and did not find anything related to v6 to give me a starting point.

@SteveL-MSFT
Contributor

@nohwnd I'm not aware of any specific changes in that KB that would have affected WinPS performance. If you can isolate it to a small repro using just PowerShell, I can have someone look into it.
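
One way to get such a Pester-free repro is to time a script with and without a pile of line breakpoints on it. This is only a sketch added here for illustration, not a repro that anyone in the thread actually posted:

# Generate a throwaway 500-line script, then time it with and without line breakpoints.
$script = Join-Path $env:TEMP 'bp-repro.ps1'
Set-Content -Path $script -Value (1..500 | ForEach-Object { "`$null = $_ * $_" })

$baseline = Measure-Command { & $script }

$bps = 1..500 | ForEach-Object { Set-PSBreakpoint -Script $script -Line $_ -Action {} }
$withBreakpoints = Measure-Command { & $script }
$bps | Remove-PSBreakpoint

'Baseline: {0:N0} ms   With 500 breakpoints: {1:N0} ms' -f $baseline.TotalMilliseconds, $withBreakpoints.TotalMilliseconds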

@tfabris

tfabris commented Jun 5, 2019

@nohwnd - @SteveL-MSFT - Any chance it's the recent security fixes, which nerf the speculative execution pipeline on Intel chips? Those are expected to cause performance hits (although not necessarily expected to be this severe).

@nohwnd
Member

nohwnd commented Jun 6, 2019

Yeah, I suspect it's the fix for that, but pinpointing which thing exactly is the hard part :)

@Mahendra1260
Author

@nohwnd Does it mean there will be a fix on the Pester side?

@nohwnd
Member

nohwnd commented Jun 13, 2019

@Mahendra1260 yes, if I find that the slow-down is caused by something that I can avoid in the code.

@SteveL-MSFT
Contributor

I've seen other reports that the KB associated with the speculative execution fix seems to be the cause of the perf degradation, as uninstalling that specific KB restores perf.

@Mahendra1260
Author

@nohwnd : Not sure if you have any update on the issue

@Outek

Outek commented Jun 19, 2019

I can confirm this behavior on Windows Server 2019 1809.

@nohwnd
Member

nohwnd commented Jun 19, 2019

No update yet :/

@Mahendra1260
Author

@nohwnd waiting for some update on the issue, as this is causing a huge performance issue in our system :)

@nohwnd
Member

nohwnd commented Jul 3, 2019

Tried to replicate this yesterday, but it is fast on all the systems I have. I installed 1607 and tried updating it with the KB, but it took 3 hours and failed to finish after restart. Any tips on how to get the appropriate version? I have MSDN, but I only see a single build of 1607 there, from Jan 17 I think.

I will try Windows Server 1809. @Outek could you share the exact version? Is that after some update as well, or on a clean install?

@nohwnd
Member

nohwnd commented Jul 3, 2019

Can't replicate on Windows Server 1809 (10.0.17763) either.

@DavidGibsonNorth

I can still replicate on Server build 1607.
This is with update KB4498947 from May, plus KB4503267 & KB4503537 from June, applied.
The previous May update, KB4494440, was replaced by KB4498947; both exhibited the coverage slowdown.

I can also still replicate on Windows 10 build 1803, with 3 post-April KBs applied: KB4497398, KB4503308 & KB4503286.

@toose

toose commented Jul 3, 2019

I've been able to reproduce this behavior after installing Windows 10 feature update v1809 (updated from v1803).

I know that installing 1709/1803 bare and running Pester without any updates installed allows code coverage to complete quickly, as I had tested this myself before discovering that the KB(s) referenced above were responsible for breaking it. From there I would try installing the update manually to see if it breaks.

Hope this helps.

@nohwnd
Member

nohwnd commented Jul 5, 2019

Tried building 4 VMs to do this, but either it doesn't replicate, or the updates fail to apply and I end up with an update rollback. Does anyone have a public VM that I can use to replicate this? :D

@suprak

suprak commented Jul 17, 2019

I'm on Win 10, 1903 (18362.239) and Pester test cases are running extremely slowly.
Tests that used to take milliseconds now take seconds.

@nohwnd
Member

nohwnd commented Jul 18, 2019

@suprak Finally, that replicates it for me. Windows 10 Business version 1903 (updated July 2019), straight from MSDN (gwmi win32_operatingsystem -> version 10.0.18362).

@Mahendra1260
Author

@nohwnd @SteveL-MSFT Any recent update on the issue?

@SteveL-MSFT
Contributor

The perf regression due to PSBreakpoint is fixed in the latest PS7 (preview.3 is expected soon, but you can try the nightly build). The fix will be backported to Windows PowerShell 5.1 (no date, as we don't control that).

@DavidGibsonNorth

@SteveL-MSFT thanks for the update and the fix. That fix works for me: I tried the nightly build of PowerShell 7 on Server 2016 and Windows 10. On both systems, coverage is back to 1.5 minutes for my test suite using PowerShell 7, versus 15 minutes using PowerShell 5.1.

@SteveL-MSFT
Contributor

This is the specific PR PowerShell/PowerShell#10269

@jborean93

This is a long shot, but I thought I would mention it. When implementing a coverage collector for Ansible using breakpoints, like Pester does, I found that it was performing really poorly. As soon as I set anything on the -Action parameter of Set-PSBreakpoint, even an empty scriptblock, the run time would just balloon. It confused me, because Pester sets -Action {} as per the line in Coverage.ps1, yet I never saw this issue there. In the end we just don't specify -Action at all, which works just fine for Ansible's case. I have a feeling that might not be viable in Pester, due to it entering a nested pipeline and waiting for user input on an interactive console, but potentially it's worth investigating.
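
For reference, the two flavours of breakpoint being compared look roughly like this. This is only a sketch against a throwaway script; the timing difference is jborean93's observation, not something this snippet measures:

# Throwaway one-line script, just so the breakpoints have something to bind to.
$demo = Join-Path $env:TEMP 'action-demo.ps1'
Set-Content -Path $demo -Value '$null = Get-Date'

# Pester's Coverage.ps1 style: an empty -Action block, so the hit is recorded
# and execution continues instead of dropping into the debugger.
$bpWithAction = Set-PSBreakpoint -Script $demo -Line 1 -Action {}

# The Ansible collector style: no -Action at all. HitCount still increments, but on an
# interactive console execution would stop at the line, which is why this is only viable
# in a non-interactive/remoting host.
$bpWithoutAction = Set-PSBreakpoint -Script $demo -Line 1

# Clean up; the script is never executed here, so neither breakpoint fires.
Get-PSBreakpoint -Script $demo | Remove-PSBreakpoint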

@nohwnd
Member

nohwnd commented Sep 21, 2019

@jborean93 Thanks for the suggestion, I am afraid this won't help because I was able to reproduce the problem above without using any -Action { }.

@jborean93

Darn, thought I would mention it just in case it did help.

@nohwnd
Member

nohwnd commented Sep 24, 2019

@jborean93 Thanks anyway ☺️

@asears
Contributor

asears commented Jan 21, 2020

Is this still waiting on PowerShell 7 to be released before it can be closed? We're experiencing very slow code coverage when the feature is turned on.

@nohwnd
Member

nohwnd commented Jan 23, 2020

@asears this should already be included in 6.2.3 (per #1318 (comment)). Are you using PowerShell Core or Windows PowerShell?

@asears
Contributor

asears commented Feb 25, 2020

Using the Azure DevOps PowerShell task.

Tried 4.0

Task         : Azure PowerShell
Description  : Run a PowerShell script within an Azure environment
Version      : 4.159.12
Author       : Microsoft Corporation
Help         : https://aka.ms/azurepowershelltroubleshooting

Tried 5.0

Task         : Azure PowerShell
Description  : Run a PowerShell script within an Azure environment
Version      : 5.165.0
Author       : Microsoft Corporation
Help         : https://aka.ms/azurepowershelltroubleshooting

Azure Pipelines, vs2017-win2016 Agent Specification
Pester v4.10.1

We do have something running at around 12-15 minutes per build.

We also need a way to consolidate multiple JaCoCo .xml summary results into a single file, to deal with the DevOps Code Coverage Publish task limitations. That's not really a Pester issue, but if anyone has a method to do so, please chime in.

@nohwnd
Copy link
Member

nohwnd commented Mar 28, 2020

@asears At one point in v5 I did this on the breakpoints. I think BPs are accessible on the CC part of the result object, so maybe you could save them into clixml, then load all of them and merge?

https://github.com/pester/Pester/blob/v5.0/Functions/Coverage.ps1#L469

Just from explaining this I feel like it is not the way to go, but I already typed it 😁

We probably need to code it ourselves 🙂
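
For what it's worth, a rough sketch of the clixml idea above. Treat the property names (HitCommands, MissedCommands, and their File/Line members) as assumptions recalled about the Pester 4 coverage object rather than a tested recipe, and Module1 is just a placeholder:

# On each build job: run with -PassThru and persist the coverage part of the result.
$result = Invoke-Pester -Script .\Module1.Tests.ps1 -CodeCoverage .\Module1.ps1 -PassThru -Show None
$result.CodeCoverage | Export-Clixml -Path ".\coverage-$env:COMPUTERNAME.clixml"

# Later, on one machine: load every saved coverage object and merge the command lists.
$all    = Get-ChildItem .\coverage-*.clixml | ForEach-Object { Import-Clixml $_.FullName }
$hit    = @($all | ForEach-Object { $_.HitCommands })
$missed = @($all | ForEach-Object { $_.MissedCommands } | Where-Object {
    $m = $_
    -not ($hit | Where-Object { $_.File -eq $m.File -and $_.Line -eq $m.Line })
})
'Merged: {0} commands hit, {1} still missed' -f $hit.Count, $missed.Count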

@johnwc

johnwc commented Apr 23, 2020

Just as an FYI: running code coverage in a DevOps pipeline with PSCore takes over an hour, which causes the build to be cancelled. Locally in PS 6.2.4 it takes 20 minutes with code coverage enabled.

DevOps Pipeline's $PSVersionTable

Name                           Value
----                           -----
PSVersion                      7.0.0
PSEdition                      Core
GitCommitId                    7.0.0
OS                             Microsoft Windows 10.0.17763
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

@iSazonov

> Locally in PS 6.2.4 it takes 20 min with code coverage enabled.

Can you test with the latest PowerShell 7.0 build? And 7.1 Preview 2?

@johnwc

johnwc commented Apr 26, 2020

@iSazonov the DevOps pipeline uses PS 7.0; that is the version table shown above.

@nohwnd
Member

nohwnd commented May 21, 2021

Breakpoints in PowerShell are really slow when there are many of them, and this is multiplied on some systems with 5.1 where additional security checks are done (plus PowerShell 5 is really slow in general). Not much can be done about it in the breakpoint-based code coverage.

@nohwnd nohwnd closed this as completed May 21, 2021
@johnwc

johnwc commented May 22, 2021

@nohwnd We were not using PS5... And there are no breakpoints in a pipeline build...

@nohwnd
Member

nohwnd commented May 22, 2021

@johnwc I am sorry, I explained that with too much insider knowledge omitted. Here's another try:

Pester defines breakpoints for code coverage. We use them to see which code was executed and which code was not.

PowerShell, including the latest version, does not expect us to define more than about 100 breakpoints. Internally it handles them in a very slow way, and the performance of the whole shell gets exponentially worse as you add more, because the breakpoints are checked and moved to a new array multiple times on every single line of code.

When running Pester's own tests, for example, we define over 7000 breakpoints. I've written a fix for that, but it is not merged into PowerShell, and it would not apply to any prior versions even if it were.

To make it even worse, there is a security fix that slows down execution of breakpoints on some versions of Windows, because on every BP it also checks in the registry whether the fix is applied (or something along those lines). We experience this in our pipelines as well; just look at the execution times of PS5 and PS7 on the Pester pipeline.

The issue above is caused by some change in the OS, and from our point of view there is nothing more I can do about it. I investigated the performance, and am working on implementing a new way of collecting code coverage.
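
A rough way to see that scaling for yourself is below; a sketch only, and absolute numbers vary a lot with the OS patch level and PowerShell version, which is exactly what this issue is about:

# Time the same throwaway script with an increasing number of line breakpoints.
$script = Join-Path $env:TEMP 'bp-scaling.ps1'
Set-Content -Path $script -Value (1..1000 | ForEach-Object { "`$null = $_" })

foreach ($count in 0, 100, 1000) {
    Get-PSBreakpoint -Script $script | Remove-PSBreakpoint
    if ($count -gt 0) {
        $null = 1..$count | ForEach-Object { Set-PSBreakpoint -Script $script -Line $_ -Action {} }
    }
    $elapsed = Measure-Command { & $script }
    '{0,5} breakpoints: {1,10:N0} ms' -f $count, $elapsed.TotalMilliseconds
}
Get-PSBreakpoint -Script $script | Remove-PSBreakpoint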

@aolszowka
Contributor

@nohwnd I know this is an old thread, but as you can see I've linked a few related issues.

> To make it even worse, there is a security fix that slows down execution of breakpoints on some versions of Windows, because on every BP it also checks in the registry whether the fix is applied (or something along those lines). We experience this in our pipelines as well; just look at the execution times of PS5 and PS7 on the Pester pipeline.

I opened a bug report against the hosted agents here, actions/runner-images#4398, and as noted there, it seems the issue is remediated in windows-2022. I am not sure how much flexibility you have in your pipelines, but can you try that pool and see if the runtimes improve? For us they improved drastically.

It might help other users like @johnwc above.

@nohwnd
Member

nohwnd commented Nov 2, 2021

@aolszowka Thanks for the report. I believe you that it helped, and will leave it up to others to move their pipelines to windows-2022 and try it out.

There is also a new approach to code coverage that I added to Pester 5 which does not use breakpoints and seems to be as accurate as the breakpoint-based approach. It's rather standalone, and could be backported to Pester 4 if someone were willing to do the work. It is also way more performant on larger code bases.

https://youtu.be/5oMTBKZDh-c?t=2478
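
For anyone on Pester 5 who wants to try that breakpoint-free collector, it is switched on through the configuration object. A minimal sketch, assuming the setting is CodeCoverage.UseBreakpoints (check New-PesterConfiguration on your version; the exact property name and the '.\src' path are assumptions here):

$configuration = New-PesterConfiguration
$configuration.Run.Path = '.\src'                      # wherever your tests live
$configuration.CodeCoverage.Enabled = $true
# Opt out of breakpoint-based coverage in favour of the tracer/profiler-based approach.
$configuration.CodeCoverage.UseBreakpoints = $false
Invoke-Pester -Configuration $configuration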

@ladenedge

Setting

vmImage: 'windows-2022'

worked for me as well.

  • windows-latest: >13m
  • windows-2022: ~1m 30s

Happily, windows-2022 is slated to become the default on March 6, so that should help anyone still experiencing this issue.
