PowerShell Filters and Functions
The other day I saw a social media post from Adam Bertram about PowerShell filters. Talk about a blast from the past. Back in the early days of PowerShell, when scripting was still in its infancy, we could write functions or filters. Filters were a special type of function optimized for processing data in the pipeline. They were a way to make your scripts more efficient by reducing the amount of memory used and the number of objects created. I think they have fallen out of favor, as most people simply write functions. But Adam raised the issue of performance, so I thought we'd spend a little time comparing filters and functions and figure out whether there is a place for filters in today's PowerShell.
Your first thought might be, "I can already filter with Where-Object. What's the big deal?" In many situations you may be right.
PS C:\> dir c:\scripts\ -file | where {$_.LastWriteTime -gt (Get-Date).AddHours(-36)}
Directory: C:\Scripts
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 5/22/2024 3:31 PM 1214 select-until.ps1
But what if you run this command often? Or would like to customize it? Suppose I want files modified in the last 48 hours. A filtering function might be a better choice.
Function
Most of you would probably write a function like this:
Function Get-LastModified {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [System.IO.FileInfo]$InputObject,
        [int]$Hours = 24
    )
    Begin {}
    Process {
        if ($InputObject.LastWriteTime -gt (Get-Date).AddHours(-$Hours)) {
            $InputObject
        }
    }
    End {}
}
I know that in this example, there are other ways to write this function that include specifying the path. But I wanted to focus on the filtering aspect.
PS C:\> dir c:\scripts\ -file | Get-LastModified -Hours 48
Directory: C:\Scripts
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 5/23/2024 11:24 AM 414 Demo-FilterFunction.ps1
-a--- 5/21/2024 1:21 PM 1502 demo-regex-look.ps1
-a--- 5/21/2024 12:36 PM 936 demo-regex-namedcaptures.ps1
-a--- 5/22/2024 3:31 PM 1214 select-until.ps1
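As one example of those other ways, here is a minimal sketch that builds the path into the command. The name Get-RecentFile and its -Path parameter are my own placeholders for illustration, not part of the function above, and I'm assuming it is acceptable to calculate the cutoff time once instead of once per file.

```powershell
# Hypothetical alternative: take a path instead of pipeline input.
function Get-RecentFile {
    [CmdletBinding()]
    param(
        [Parameter(Position = 0)]
        [string]$Path = '.',

        [int]$Hours = 24
    )

    # Work out the cutoff a single time rather than for every file.
    $cutoff = (Get-Date).AddHours(-$Hours)

    Get-ChildItem -Path $Path -File |
        Where-Object { $_.LastWriteTime -gt $cutoff }
}

# Example usage (the path is only an illustration):
# Get-RecentFile -Path C:\Scripts -Hours 48
```

The trade-off is that this version no longer accepts files from the pipeline, which is the part I want to focus on here.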
Filter
However, this can also be written as a Filter. This is a coding structure very similar to an advanced function. Whereas an advanced function uses the Begin, Process, and End script blocks, the code in a filter is what you would put in a Process script block. You don't even need to define the script block.
Filter LastModified {
    Param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [System.IO.FileInfo]$InputObject,
        [int]$Hours = 24
    )
    if ($InputObject.LastWriteTime -gt (Get-Date).AddHours(-$Hours)) {
        $InputObject
    }
}
Notice the lack of [CmdletBinding()]. I'm keeping the filter simpler than an advanced function, so I'm not adding extras such as parameter validation like [ValidateNotNullOrEmpty()]. You should define a parameter for the incoming object. The $InputObject parameter is a common convention. I also suggest specifying the type name.
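To make the relationship between a filter and a Process script block concrete, here is a minimal sketch. The names Double and Get-Double are hypothetical and exist only for this comparison.

```powershell
# A filter: the body runs once for each pipeline object, exposed as $_.
filter Double { $_ * 2 }

# A function with the same body placed in an explicit process block.
function Get-Double {
    process { $_ * 2 }
}

$fromFilter   = 1..3 | Double
$fromFunction = 1..3 | Get-Double

# Both pipelines emit 2, 4, 6.
$fromFilter
$fromFunction
```

In this simple form the filter saves you nothing more than the process keyword and a pair of braces.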
I don't think there are any rules regarding the naming of filters. I'm inclined not to use a standard verb-noun name, primarily because a filter is intended to be used in a pipeline expression. You wouldn't run it on its own as you would a function.
The filter is a distinct command type.
PS C:\> Get-Command -CommandType filter
CommandType Name Version Source
----------- ---- ------- ------
Filter LastModified
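If you want to poke at this a little more, here is a small sketch using a throwaway filter named Odd. It relies only on Get-Command and the Function: drive, where filters are stored alongside regular functions.

```powershell
# Throwaway filter used only for inspection.
filter Odd { if ($_ % 2) { $_ } }

$cmd = Get-Command -Name Odd

# The command type is reported as Filter, not Function.
$cmd.CommandType

# Filters live on the same Function: drive as functions.
$onDrive = Test-Path -Path Function:\Odd
$onDrive

# The body of the filter is available as text.
$cmd.Definition
```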
Use the filter where you would normally use the advanced function.
PS C:\> dir c:\scripts\ -file | LastModified -Hours 48 |
Sort LastWriteTime | Select Name,LastWriteTime
Name LastWriteTime
---- -------------
demo-regex-namedcaptures.ps1 5/21/2024 12:36:48 PM
demo-regex-look.ps1 5/21/2024 1:21:14 PM
select-until.ps1 5/22/2024 3:31:04 PM
Demo-FilterFunction.ps1 5/23/2024 11:24:22 AM
Let's look at another example. I'll begin with this.
PS C:\> Get-Process | Where {$_.Name -ne 'Explorer' -AND $_.WS -gt 250MB}
Handles NPM(K) PM(K) WS(K) CPU(s) Id SI ProcessName
------- ------ ----- ----- ------ -- -- -----------
521 39 296276 354712 23.47 3148 1 brave
1838 131 161008 269144 11.09 21088 1 brave
366 33 566948 616720 39.31 8820 1 Code
684 39 286776 323444 36.42 25992 1 Code
1026 77 219812 294228 9.08 20372 1 Discord
6662 165 364120 422332 16.17 28492 1 Dropbox
1362 8133 111468 304172 78.05 4040 0 ekrn
763 130 248256 328868 28.31 18632 1 PowerToys.PowerLauncher
1104 232 345492 521308 12.58 20520 1 pwsh
1294 222 320676 514092 12.66 26184 1 pwsh
705 36 230284 291612 17.55 27760 1 slack
803 63 916660 719652 1.98 7432 0 sqlservr
I can turn that filtering script block into a filtering function.
Filter ws {
    Param(
        [Parameter(Position = 0)]
        [int]$WS = 100mb,
        [Parameter(Mandatory, ValueFromPipeline)]
        [System.Diagnostics.Process]$InputObject
    )
    if ($InputObject.Name -ne 'Explorer' -AND $InputObject.WS -gt $WS) {
        $InputObject
    }
}
I switched parameters around a bit to make the filter easy to use.
PS C:\> Get-Process | ws 500mb
Handles NPM(K) PM(K) WS(K) CPU(s) Id SI ProcessName
------- ------ ----- ----- ------ -- -- -----------
1091 239 371804 550860 19.17 20520 1 pwsh
1282 222 320644 514128 12.83 26184 1 pwsh
805 63 916396 719592 2.03 7432 0 sqlservr
I suppose another reason you don't need a standard name for a filter is that you will typically use it interactively, like you would an alias. If you are using a filter in a script file, I'd recommend adding a comment or giving your filter a more meaningful name.
This filter will accept any Process object.
PS C:\> Get-Process -ComputerName Dom1,Srv1,Srv2 | ws | Select ID,Name,WS,MachineName
Id Name WS MachineName
-- ---- -- -----------
656 lsass 151953408 Dom1
1096 MsMpEng 143548416 Srv1
2836 MsMpEng 142053376 Srv2
3944 MsMpEng 162484224 Dom1
1184 svchost 126734336 Dom1
> This example will not work in PowerShell 7 because the ComputerName parameter has been removed.
Performance Testing
Where this gets interesting is in performance. I'm going to test in PowerShell 7.4. First, let's see how long Where-Object takes. I'll use my file filtering example from earlier.
Measure-Command {
$r = Get-ChildItem c:\scripts\ -file -Recurse |
Where-Object { $_.LastWriteTime -gt (Get-Date).AddHours(-36) }
}
This took 1.2487 seconds. By the way, there are over 10K files in my scripts folder.
Now for the advanced function.
Measure-Command {
$r = Get-ChildItem c:\scripts\ -file -Recurse | Get-LastModified -Hours 36
}
This took 1.3039 seconds. I tested in a new PowerShell session to avoid any caching effects that would throw off the measurement.
Now for the filter.
Measure-Command {
    $r = Get-ChildItem c:\scripts\ -file -Recurse | LastModified -Hours 36
}
Surprisingly, this took longer than expected at 1.385 seconds. This may be a disk-related factor, as I am searching through subfolders. Let me retry without recursing.
Measure-Command {
Get-ChildItem c:\scripts\ -file |
Where-Object { $_.LastWriteTime -gt (Get-Date).AddHours(-36) }
}
583ms.
Measure-Command {
Get-ChildItem c:\scripts\ -file | Get-LastModified -Hours 36
}
527ms.
Measure-Command {
Get-ChildItem c:\scripts\ -file | LastModified -Hours 36
}
454ms.
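The gaps between these numbers are small enough that run-to-run noise could account for some of them. Here is a rough sketch of a helper that repeats a measurement and averages the results. Measure-Average is a hypothetical name of my own, and the default of 20 runs is an arbitrary choice.

```powershell
# Hypothetical helper: run a script block several times and summarize the timings.
function Measure-Average {
    param(
        [Parameter(Mandatory, Position = 0)]
        [scriptblock]$ScriptBlock,

        [int]$Count = 20
    )

    # Collect the elapsed milliseconds of each run.
    $times = foreach ($i in 1..$Count) {
        (Measure-Command -Expression $ScriptBlock).TotalMilliseconds
    }

    $stats = $times | Measure-Object -Average -Minimum -Maximum

    [pscustomobject]@{
        Runs      = $Count
        AverageMs = [math]::Round($stats.Average, 3)
        MinimumMs = [math]::Round($stats.Minimum, 3)
        MaximumMs = [math]::Round($stats.Maximum, 3)
    }
}

# Example: average the Where-Object version against the current folder.
Measure-Average {
    Get-ChildItem -File |
        Where-Object { $_.LastWriteTime -gt (Get-Date).AddHours(-36) }
}
```

The first run will usually be the slowest, so it is worth looking at the minimum and maximum as well as the average.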
Let's compare Where-Object to the filter on process objects.
Measure-Command {
Get-Process | Where {$_.Name -ne 'Explorer' -AND $_.WS -gt 250MB}
}
That took 21.1ms.
Measure-Command {
Get-Process | ws 250mb
}
Hmmm. This took 43ms. I know filters can make a difference. Let's simplify everything.
PS C:\> Measure-Command {
>> 1..1000 | where {$_%2}
>> } | Select TotalMilliseconds
TotalMilliseconds
-----------------
15.4808
Now I'll test with this filter.
Filter Odd {
    if ($_ % 2) {
        $_
    }
}
PS C:\> Measure-Command {
>> 1..1000 | Odd
>> }| Select TotalMilliseconds
TotalMilliseconds
-----------------
6.2209
That's what I was expecting. Here's the deal.
As soon as you add parameters to a filter, PowerShell adds overhead to process the parameters. A true filter is a simple filtering script block.
Filter ws {
    if ($_.Name -ne 'Explorer' -AND $_.WS -gt 250mb) {
        $_
    }
}
Now Get-Process | ws only takes 31ms. That's an improvement, but I've lost the ability to specify a value for the WS parameter. Since I'm using filters interactively, I could hack my way around this.
$wsLimit = 500MB
Filter ws {
    if ($_.Name -ne 'Explorer' -AND $_.WS -gt $wsLimit) {
        $_
    }
}
This now takes about 14ms to run, so that is an improvement.
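The hack works because the filter looks up $wsLimit each time it runs, so changing the variable changes the result without redefining the filter. Here is a minimal sketch of that behavior. The sample objects are made up and stand in for Get-Process output so the result is predictable.

```powershell
# Made-up sample data standing in for process objects.
$sample = @(
    [pscustomobject]@{ Name = 'pwsh';     WS = 550MB }
    [pscustomobject]@{ Name = 'Code';     WS = 300MB }
    [pscustomobject]@{ Name = 'Explorer'; WS = 700MB }
)

# The filter reads $wsLimit from the calling scope at run time.
filter ws {
    if ($_.Name -ne 'Explorer' -and $_.WS -gt $wsLimit) {
        $_
    }
}

$wsLimit = 500MB
$over500 = @($sample | ws)   # only pwsh passes

$wsLimit = 250MB
$over250 = @($sample | ws)   # pwsh and Code pass

$over500.Name
$over250.Name
```

The downside is that the filter now depends on a variable it doesn't declare, which is fine at the prompt but easy to forget in a script.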
Summary
Deciding when to use filters over functions or even Where-Object will take some trial and error. I like the idea of a filter, even with parameters, as it means less typing and more flexibility. Before finishing this article, I tried testing one more time, but without Measure-Command, which incurs overhead.
Running Get-Process | where { $_.Name -ne 'Explorer' -AND $_.WS -gt 250MB } in a clean PowerShell 7 window took 84ms. Testing with my original filter that has parameters took 79ms.
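If you want to try this kind of timing yourself, one option is the .NET Stopwatch class. This is only a sketch of one way to measure without Measure-Command, not necessarily the exact commands behind the numbers above.

```powershell
# Start a high-resolution timer.
$sw = [System.Diagnostics.Stopwatch]::StartNew()

# The pipeline being timed. The results are captured so display formatting isn't measured.
$result = Get-Process |
    Where-Object { $_.Name -ne 'Explorer' -and $_.WS -gt 250MB }

$sw.Stop()

# Report the elapsed time in milliseconds.
'{0:N2} ms' -f $sw.Elapsed.TotalMilliseconds
```

Swap in the filter version of the pipeline to compare, ideally in a fresh session for each test.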
I guess, like so many things in PowerShell, it just depends. I'd love to hear about your experiences using filters.