Behind the PowerShell Pipeline

January 17, 2025

PowerShell Parsing and Processing

Let's continue diving down the rabbit hole to see what we can do with the PowerShell Abstract Syntax Tree (AST). And believe me, this is a very deep rabbit hole with many branches. I think there are things to learn throughout the entire journey.

So far, I've been demonstrating how to parse a PowerShell script file with the AST. However, you can also input PowerShell code as a string.

$cmd = 'get-process s* -includeUserName | Where {$_.WS -ge 250MB} -ov w | Sort WS -Desc | Select -Fi 5 -outv x'

Make sure the command will properly run in PowerShell. I also recommend defining it as a literal string using single quotes so that PowerShell doesn't try to expand variables.
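Here's a quick illustration of why the quotes matter. The variable names are made up purely for this demonstration.

```powershell
# $name is a throwaway variable just for this comparison
$name = 'svchost'

# Single quotes: the text is stored exactly as typed
$literal = 'Get-Process $name'

# Double quotes: PowerShell expands $name before the string is stored
$expanded = "Get-Process $name"

$literal
$expanded
```

The same thing would happen to `$_` in my command string if I had used double quotes. The parser would never see the variable, only whatever it expanded to.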

Instead of the ParseFile() method, we'll use ParseInput().

New-Variable astTokens
New-Variable astErr
$AST = [System.Management.Automation.Language.Parser]::ParseInput($cmd, [ref]$astTokens, [ref]$astErr)

You still need the two variables for tokens and errors, and you must pass them to the method with the [ref] type. Because we aren't parsing a file, the AST object doesn't have much to work with.

Figure 1: PowerShell parsed input AST
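The error variable is worth a quick look as well, since it is one way to confirm the string parses cleanly. This is only a rough sketch. The `$badCmd` string and the `$bad*` variable names are mine, and I'm deliberately parsing something broken to show what comes back.

```powershell
# A made-up command with a dangling pipe so the parser has something to complain about
$badCmd = 'Get-Process | '
$badTokens = $null
$badErr = $null
$null = [System.Management.Automation.Language.Parser]::ParseInput($badCmd, [ref]$badTokens, [ref]$badErr)

if ($badErr.Count -gt 0) {
    # Each ParseError object includes a Message and the Extent where the problem was found
    $badErr | Select-Object -Property Message, Extent
}
```

Keep in mind that an empty error collection only tells you the syntax is valid. It doesn't prove the command will do what you want when it runs.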

Don't be deceived. There are still tokens.

PS C:\> $astTokens | group Kind

Count Name                      Group
----- ----                      -----
    1 Variable                  {$_}
    5 Parameter                 {-includeUserName, -ov, -Desc, -Fi…}
    2 Number                    {250MB, 5}
    7 Identifier                {Where, WS, w, Sort…}
    2 Generic                   {get-process, s*}
    1 EndOfInput                {<eof>}
    1 LCurly                    {{}
    1 RCurly                    {}}
    3 Pipe                      {|, |, |}
    1 Dot                       {.}
    1 Ige                       {-ge}

PS C:\> $astTokens | group TokenFlags

Count Name                      Group
----- ----                      -----
   12 None                      {s*, -includeUserName, $_, 250MB…}
    1 BinaryPrecedenceComparis… {-ge}
    3 ParseModeInvariant        {{, }, <eof>}
    3 SpecialOperator, ParseMo… {|, |, |}
    1 SpecialOperator, Disallo… {.}
    4 CommandName               {get-process, Where, Sort, Select}
    1 MemberName                {WS}

By the way, there are many token kinds and flags. You can use Get-TypeMember from the PSScriptTools module to view them. I'll let you run these commands to see the results.

Get-TypeMember System.Management.Automation.Language.TokenKind
Get-TypeMember System.Management.Automation.Language.TokenFlags
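If you don't have PSScriptTools installed, a minimal alternative is to ask .NET for the enumeration names directly. This isn't as rich as Get-TypeMember, and the `$kinds` and `$flags` variables are just my names for the results.

```powershell
# List the names defined in each enumeration using the built-in [enum] type
$kinds = [enum]::GetNames([System.Management.Automation.Language.TokenKind])
$flags = [enum]::GetNames([System.Management.Automation.Language.TokenFlags])

# How many token kinds are there?
$kinds.Count

# The flags list is short enough to display in full
$flags
```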

Aliases

My command string uses aliases. Let's use the AST to find them and replace the text with the full command. I know the names:

PS C:\> $astTokens | where text -match "Sort|Where|Select" | Format-Table

Text    TokenFlags       Kind HasError Extent
----    ----------       ---- -------- ------
Where  CommandName Identifier    False Where
Sort   CommandName Identifier    False Sort
Select CommandName Identifier    False Select

Looking at this, I now have an idea of how I can identify them programmatically.

PS C:\> $astTokens | where { $_.TokenFlags -eq 'CommandName' -AND $_.Kind -eq 'Identifier' } | Format-Table

Text    TokenFlags       Kind HasError Extent
----    ----------       ---- -------- ------
Where  CommandName Identifier    False Where
Sort   CommandName Identifier    False Sort
Select CommandName Identifier    False Select
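One thing to keep in mind is that TokenFlags is a flags enumeration, so a token can carry more than one value. You can see that in the grouped output earlier. Comparing with -eq only matches when CommandName is the sole flag. Here is a sketch of a slightly more defensive filter using the HasFlag() method. The shortened `$sample` string and the other variable names are my own so that the snippet can run by itself.

```powershell
# A shortened version of the command, used only for this example
$sample = 'get-process s* | Where {$_.WS -ge 250MB} | Sort WS'
$tokens = $null
$errors = $null
$null = [System.Management.Automation.Language.Parser]::ParseInput($sample, [ref]$tokens, [ref]$errors)

$commandName = [System.Management.Automation.Language.TokenFlags]::CommandName
$identifier = [System.Management.Automation.Language.TokenKind]::Identifier

# HasFlag() is true even if the token carries other flags alongside CommandName
$found = $tokens | Where-Object {
    $_.TokenFlags.HasFlag($commandName) -and $_.Kind -eq $identifier
}
$found.Text
```

For my command string both approaches find the same tokens, so treat this as an optional refinement.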

I can use the Get-Alias cmdlet to find the full command and resolve the alias.

PS C:\> $astTokens | where { $_.TokenFlags -eq 'CommandName' -AND $_.Kind -eq 'Identifier' } |
Select-Object -expand Text | Get-Alias | Select-Object  Name, ResolvedCommand

Name   ResolvedCommand
----   ---------------
where  Where-Object
sort   Sort-Object
select Select-Object

I'm not doing it in my sample code, but it is possible to use the same alias multiple times in the same command. I only want to resolve unique names. One technique I use is Group-Object.

$Aliases = $astTokens | where { $_.TokenFlags -eq 'CommandName' -AND $_.Kind -eq 'Identifier' } |
Group-Object -Property Text -NoElement

I don't need the elements. I only need the names. Now I can iterate through the names, resolve each alias, and replace it in the command string, $cmd.

foreach ($alias in $Aliases.Name) {
    Try {
        #using Try/Catch in case the alias can't be resolved
        $resolved = Get-Alias -Name $alias -ErrorAction Stop
        #replace the alias text in the command string with the resolved command
        #I'm using a regex word boundary to ensure I'm replacing the whole word
        #and escaping the alias in case it contains regex metacharacters
        $cmd = $cmd -replace "\b$([regex]::Escape($alias))\b", $resolved.ResolvedCommand.Name
    }
    Catch {
        #$alias is already the name string, so there is no Text property to reference
        Write-Warning "Unable to resolve alias $alias"
    }
}

Here's the result:

PS C:\> $cmd
get-process s* -includeUserName | Where-Object {$_.WS -ge 250MB} -ov w | Sort-Object WS -Desc | Select-Object -Fi 5 -outv x
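The regex replacement works on the entire string, so it could also change a matching word that isn't a command, such as a property that happens to be called Sort. As a possible alternative, here is a sketch that uses each token's extent offsets to replace only the command name tokens. I'm assuming the `$text` sample string and the variable names here for illustration. Also note that some aliases, like sort, aren't defined in PowerShell on non-Windows platforms, so results will vary by system.

```powershell
# A sample command string used only for this example
$text = 'get-process s* | Where {$_.WS -ge 250MB} | Sort WS | Select -First 5'
$tokens = $null
$errors = $null
$null = [System.Management.Automation.Language.Parser]::ParseInput($text, [ref]$tokens, [ref]$errors)

$commandName = [System.Management.Automation.Language.TokenFlags]::CommandName

# Process the tokens from the end of the string backwards
# so that the offsets of earlier tokens remain valid after each replacement
$cmdTokens = $tokens | Where-Object { $_.TokenFlags.HasFlag($commandName) } |
    Sort-Object -Property { $_.Extent.StartOffset } -Descending

foreach ($token in $cmdTokens) {
    # Names that aren't aliases, like get-process, are quietly skipped
    $alias = Get-Alias -Name $token.Text -ErrorAction SilentlyContinue |
        Select-Object -First 1
    if ($alias -and $alias.ResolvedCommand) {
        $start = $token.Extent.StartOffset
        $length = $token.Extent.EndOffset - $start
        # Swap only the characters that belong to this token
        $text = $text.Remove($start, $length).Insert($start, $alias.ResolvedCommand.Name)
    }
}
$text
```

The trade-off is a little more code in exchange for not having to worry about word boundaries or what else in the string might match.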