Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New rule for ForEach-Object -Parallel that helps users pinpoint problems with their variable scoping #1410

Closed
Jawz84 opened this issue Feb 8, 2020 · 9 comments · Fixed by #1419

Comments

@Jawz84
Copy link
Contributor

Jawz84 commented Feb 8, 2020

Summary of the new feature

In response to questions seen on Twitter, like this one: https://twitter.com/_bateskevin/status/1225536662769405952?s=20

As a PowerShell 7 user, I would like to be warned when I use variables incorrectly with ForEach-Object -Parallel, so I don't have to wonder why I get an error related to a non-initialized variable.

Example with incorrect variable usage:

$myVar = 10
1..2 | ForEach-Object -Parallel {
    $myVar | Get-Member
}

I would like PSScriptAnalyzer to warn me that $myVar is not going to work. (I need to use $using:myVar)

Proposed technical implementation details (optional)

A quick proof of context:

function Test-ForeachParallelWarningForNonAssignedNonUsingVars {
    param(
        $script
    )

    if ($script -is [scriptblock]) {
        $ScriptBlocks = $script.Ast.FindAll( {
            $args[0] -is [System.Management.Automation.Language.ScriptBlockAst]
        }, $true)
    }
    else {
        $ScriptBlocks = [scriptblock]::Create($script).Ast.FindAll( {
            $args[0] -is [System.Management.Automation.Language.ScriptBlockAst]
        }, $true)
    }

    foreach ($ScriptBlockAst in $ScriptBlocks) {
        $foreachObjectParallelCommandAst = $ScriptBlockAst.FindAll( {
            $args[0] -is [System.Management.Automation.Language.CommandAst] -and 
            $args[0].GetCommandName() -match "^foreach$|^%$|^foreach-object$" -and 
            $args[0].CommandElements.ParameterName -like "pa*"
        }, $false)
        
        foreach ($parallelCommandAst in $foreachObjectParallelCommandAst) {  
            $assignments = $parallelCommandAst.CommandElements.FindAll({$args[0] -is [System.Management.Automation.Language.AssignmentStatementAst]}, $true)
            $varsInAssignments = $assignments | ForEach-Object { $_.Left.FindAll( { $args[0] -is [System.Management.Automation.Language.VariableExpressionAst]}, $true)}
            
            $nonAssignedNonUsingVars = $parallelCommandAst.CommandElements.FindAll({
                    $args[0] -is [System.Management.Automation.Language.VariableExpressionAst] -and 
                    -not ($args[0].Parent -is [System.Management.Automation.Language.UsingExpressionAst]) -and
                    $varsInAssignments -notcontains $args[0]
                }, $true)

            return $nonAssignedNonUsingVars
        }
    }
}


## tests
Describe "Test-ForeachParallelWarningForNonAssignedNonUsingVars" {
    Context "Should detect something" {
        $testcases = @(
            @{
                Description = "Foreach-Object -Parallel with undeclared var"
                ScriptBlock = { 
                    1..2 | ForEach-Object -Parallel { $var }
                }
            }
            @{
                Description = "alias foreach -parallel with undeclared var"
                ScriptBlock = {
                    1..2 | ForEach -Parallel { $var }
                }
            }
            @{
                Description = "alias % -parallel with undeclared var"
                ScriptBlock = {
                    1..2 | % -Parallel { $var }
                }
            }
            @{
                Description = "abbreviated param Foreach-Object -pa with undeclared var"
                ScriptBlock = {
                    1..2 | foreach-object -pa { $var }
                }
            }
            @{
                Description = "Nested Foreach-Object -Parallel with undeclared var"
                ScriptBlock = {
                    $myNestedScriptBlock = {
                        1..2 | ForEach-Object -Parallel { $var }                   
                    }
                }
            }
        )

        it "should emit for: <Description>" -TestCases $testcases {
            param($Description, $ScriptBlock)
            Test-ForeachParallelWarningForNonAssignedNonUsingVars $ScriptBlock | Should -Be "`$var"
        }
    }

    Context "Should not detect anything" {
        $testcases = @(
            @{
                Description = "Foreach-Object with uninitialized var inside"
                ScriptBlock = { 
                    1..2 | ForEach-Object { $var }
                }
            }
            @{
                Description = "Foreach-Object -Parallel with uninitialized `$using: var"
                ScriptBlock = {
                    1..2 | foreach-object -Parallel { $using:var }
                }
            }
            @{
                Description = "Foreach-Object -Parallel with var assigned locally"
                ScriptBlock = { 
                    1..2 | ForEach-Object -Parallel { $var="somevalue" }
                }
            }
            @{
                <# Not implemented #>
                Description = "Foreach-Object -Parallel with built-in var '`$Args' inside"
                ScriptBlock = { 
                    1..2 | ForEach-Object { $Args[0] } -ArgumentList "a" -Parallel
                }            
            }
        )

        it "should not emit anything for: <Description>" -TestCases $testcases {
            param($Description, $ScriptBlock)
            Test-ForeachParallelWarningForNonAssignedNonUsingVars $ScriptBlock | Should -BeNullOrEmpty
        }
    }
}

What is the latest version of PSScriptAnalyzer at the point of writing

1.18.3

@ghost ghost added the Needs: Triage 🔍 label Feb 8, 2020
@felixfbecker
Copy link

I think your tests are missing a case for assigning $var inside the scriptblock and then referencing it (should not error)

@Jawz84
Copy link
Contributor Author

Jawz84 commented Feb 8, 2020

@felixfbecker thank you, added a test :-)
The code needs to be redone in C# completely (to be able to reuse parts from UseDeclaredVarsMoreThanAssignments.cs for example), and needs refactoring and optimizing, but it shows a way to handle this particular case.

@bergmeister
Copy link
Collaborator

bergmeister commented Feb 8, 2020

You could publish this as a custom rule, which can be written in PowerShell. Otherwise, I'd rather create a new rule to keep it separate from UseDeclaredVarsMoreThanAssignments. Due to PowerShell's scoping nothing prevents someone from declaring a variable with the same name inside the -Parallel scriptblock but using it for a different purpose as their scope is isolated, therefore inevitably your rule could produce false positives like UseDeclaredVarsMoreThanAssignments does.
To me this rather sounds like user education is needed because the same problem applies to using the background operator &, which is just a scriptblock as an argument for Start-Job...

@Jawz84
Copy link
Contributor Author

Jawz84 commented Feb 8, 2020

I was hoping to generate some built-in user education through a rule like this. I am aware the same kind of problems are also present in Invoke-Command and & etc. If we can notify users for these cases too, that would be added bonus.

You have a point with the user being able to declare a variable with the same name in the new scope. There is nothing to prevent that, and it's a harder situation to check against. And I am guessing would 99% of the time be non-intentional. I have no real fix for that.

If we go forward with this idea, I agree this should be a separate rule in PsScriptAnalyzer, or a custom (powershell based) rule. I just think we can reuse ideas from the rule I mentioned, because it already solves some issues with detecting variables, built in variables, scoping etc.

If we can find another way, even outside of PsScriptAnalyzer, how we can help users find out about the difference between ForEach-Object and ForEach-Object -Parallel I am very much open to that as well.

@felixfbecker
Copy link

I think a rule like this definitely belongs into core PSScriptAnalyzer, as it can be immensely useful in catching errors before execution. Especially in the scenario where you're trying to parallelize a ForEach-Object that was sequential before, it's hard to catch all the variables that need to be changed. This problem doesn't apply as much to the other cases, because there is no direct "migration" scenario.

@bergmeister
Copy link
Collaborator

Maybe this new rule should be applied to a couple of cmdlets/parameter pairs.
@Jawz84 Do you plan to open a PR? Let me know if you need to some help/guidance. The prototype looks ok but I in C# we need to be a bit stricter and cannot just use matching to find the cmdlet name, It would rather be a case insensitive comparison against the cmdlet and its known alias

@Jawz84
Copy link
Contributor Author

Jawz84 commented Feb 10, 2020

Yes, I might, but it will have to wait until next week. I am also keeping an eye on PowerShell/PowerShell#11811
I think detection at develop time is best, but not everyone uses psscriptanalyzer, so detection at runtime is good too.
If you have tips for me to consider before I give it a try next week, please let me know. I love to learn.

@bergmeister
Copy link
Collaborator

bergmeister commented Feb 15, 2020

@Jawz84
In terms of tips: The ShowPSAst is useful for showing the AST.
Although you can just use code and the command line for most development, if you want to add resource strings (the ones in the .resx files), then you need to either use VisualStudio (which will update some auto-generated code, which are the *.Designer.cs files) or otherwise manually add the entries in *.Designer.cs (this is a deficiency of .net core)
Just follow the instructions in the readme around installing the correct .net core sdk and latest version of platyps module. Then you can build the module using ./build.ps1 (or use the -All switch to build the full module for all ps versions) and the produced module is in a folder called out. When running tests locally, you cannot use the integrated ps terminal in code because it already has PSSA loaded...
You can develop using either powershell core or windows powershell, there is not advantage or disadvantage to either of them.
I suggest you use one of the existing rules and just copy paste the .cs file as a starting point. You can open a draft PR early even if it's just rough work.

@Jawz84
Copy link
Contributor Author

Jawz84 commented Feb 16, 2020

Thanks Christoph. Working on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants