Faster Filtering
Today I want to cover a topic that you should know about but that I don't think gets much attention or discussion. It can boost the performance of your PowerShell expressions considerably. Even though it was introduced in PowerShell v4 to support Desired State Configuration (DSC), you can use it in normal PowerShell expressions.
You should recognize a statement like this.
PS C:\> Get-Service | Where-Object {$_.StartType -eq 'Automatic' -AND $_.Status -ne 'Running'}
Status   Name                                         DisplayName
------   ----                                         -----------
Stopped  DropboxUpdaterInternalService123.0.6299.144  DropboxUpdater InternalServ…
Stopped  DropboxUpdaterService123.0.6299.144          DropboxUpdater Service 123.…
Stopped  edgeupdate                                   Microsoft Edge Update Servi…
Stopped  GoogleUpdaterInternalService148.0.7730.0     Google Updater Internal Ser…
Stopped  GoogleUpdaterService148.0.7730.0             Google Updater Service (Goo…
Stopped  Intel(R) TPM Provisioning Service            Intel(R) TPM Provisioning S…
Stopped  MapsBroker                                   Downloaded Maps Manager
Stopped  sppsvc                                       Software Protection
Stopped  vmicrdv                                      Hyper-V Remote Desktop Virt…
Stopped  WbioSrvc                                     Windows Biometric Service
I'm piping a collection of service objects to the Where-Object cmdlet, which filters for services that are configured to start automatically but are not currently running. These services don't necessarily indicate a problem; many services stop when they have nothing to do. That's not what I want to cover, though.
This expression runs smoothly and relatively quickly, taking 1.3 seconds to run on my desktop. I could easily pipe this output to another command.
The alternative that I want to show you is the Where() method. The tricky thing about this method is that it is technically not part of the .NET Framework, so you won't see it with Get-Member. The method is defined in the System.Management.Automation DLL, and the PowerShell engine injects it when processing collections; specifically, collections that implement the System.Collections.IList interface. This is convenient because just about everything you run in PowerShell that generates a bunch of things will be a collection.
Where()
The basic syntax is similar to Where-Object. Let me create a small collection of numbers.
PS C:\> $n = 1..10
You won't see the Where() method as an option, but it is there. Invoke the method on the collection object using a filtering scriptblock like you would with Where-Object. You can use $_ or $PSItem to reference each object.
PS C:\> $n.Where({$_%2})
1
3
5
7
9
I quickly filtered for odd numbers. I can verify that $n supports the method by testing for the type.
PS C:\> $n -is [System.Collections.IList]
True
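To make the "hidden but available" behavior concrete, here is a minimal sketch that reuses the $n collection from above. The variable name $listed is my own, and the check assumes Get-Member returns nothing for a member name it cannot find.

```powershell
# Sketch: Where() does not show up in Get-Member output but is still callable.
$n = 1..10

# -InputObject inspects the array itself instead of each number inside it.
$listed = Get-Member -InputObject $n -Name Where

# Reports whether Get-Member found a member called Where on the array.
"Get-Member lists Where(): $($null -ne $listed)"

# The engine resolves the call at run time regardless.
$n.Where({ $_ -gt 8 })
```

If Get-Member reports nothing while the last line still returns 9 and 10, you are seeing the method being supplied by the engine rather than by the array type.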
The shortcut syntax is to wrap the command that generates the list of objects in parentheses. This forces PowerShell to run the command to completion and hold the results in memory.
PS C:\> (1..10).Where({-Not ($_%2)})
2
4
6
8
10
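If you want to confirm that the method and the cmdlet select the same items, a quick comparison like this should do it. This is only a sketch; the variable names ($isEven, $fromCmdlet, $fromMethod, $difference) are mine and not part of any API.

```powershell
# Sketch: compare Where-Object and Where() using the same filter.
$isEven = { -not ($_ % 2) }

$fromCmdlet = 1..10 | Where-Object $isEven
$fromMethod = (1..10).Where($isEven)

# Compare-Object emits nothing when both sides hold the same items.
$difference = Compare-Object -ReferenceObject $fromCmdlet -DifferenceObject $fromMethod
"Results differ: $($null -ne $difference)"

# The containers are not the same type: pipeline output saved to a variable
# is an Object[] array, while Where() hands back a Collection[PSObject].
$fromCmdlet.GetType().Name
$fromMethod.GetType().Name
```

The difference in container type rarely matters at the prompt, but it is worth knowing if later code tests for an array.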
You can treat the wrapped expression as a variable and invoke the Where() method. Returning to my Get-Service example, I could refactor it:
(Get-Service).Where({$_.StartType -eq 'Automatic' -AND $_.Status -ne 'Running'})
This runs slightly faster. Your performance gains will vary depending on the code you are using to generate the list of objects and your filtering code. The Where() method should be faster, but you should always test.
PS C:\> Measure-Command { 1..5000 | Where {$_%2} }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 186
Ticks             : 1861224
TotalDays         : 2.15419444444444E-06
TotalHours        : 5.17006666666667E-05
TotalMinutes      : 0.00310204
TotalSeconds      : 0.1861224
TotalMilliseconds : 186.1224
PS C:\> Measure-Command { (1..5000).Where({$_%2}) }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 59
Ticks             : 593297
TotalDays         : 6.86686342592593E-07
TotalHours        : 1.64804722222222E-05
TotalMinutes      : 0.000988828333333333
TotalSeconds      : 0.0593297
TotalMilliseconds : 59.3297
PS C:\> Measure-Command { Get-Process | Where WS -ge 500MB}
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 135
Ticks             : 1356487
TotalDays         : 1.57000810185185E-06
TotalHours        : 3.76801944444444E-05
TotalMinutes      : 0.00226081166666667
TotalSeconds      : 0.1356487
TotalMilliseconds : 135.6487
PS C:\> Measure-Command { (Get-Process).Where({$_.WS -ge 500MB}) }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 105
Ticks             : 1055422
TotalDays         : 1.22155324074074E-06
TotalHours        : 2.93172777777778E-05
TotalMinutes      : 0.00175903666666667
TotalSeconds      : 0.1055422
TotalMilliseconds : 105.5422
I think you'll see even larger gains at greater scale.
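A single Measure-Command run can be noisy, so when you test your own code it may help to average several runs. Here is a rough sketch of one way to do that. Measure-AverageTime is a hypothetical helper I made up for illustration, not a built-in command, and the collection size and run count are arbitrary assumptions.

```powershell
# Hypothetical helper (not a built-in command): average several timed runs.
function Measure-AverageTime {
    param(
        [Parameter(Mandatory)]
        [scriptblock]$ScriptBlock,

        # Number of timed runs to average; 10 is an arbitrary default.
        [int]$Count = 10
    )

    # Collect the elapsed milliseconds for each run.
    $samples = foreach ($i in 1..$Count) {
        (Measure-Command -Expression $ScriptBlock).TotalMilliseconds
    }

    # Emit the mean of the samples.
    ($samples | Measure-Object -Average).Average
}

# Assumed test data: a larger collection than the examples above.
$data = 1..50000

$comparison = [pscustomobject]@{
    WhereObjectMs = Measure-AverageTime { $data | Where-Object { $_ % 2 } }
    WhereMethodMs = Measure-AverageTime { $data.Where({ $_ % 2 }) }
}
$comparison
```

Swap in your own command and filter for $data and the two scriptblocks. The numbers will differ from machine to machine, so treat them as a relative comparison only.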