Documents and Objects
Over the last few weeks, I've been sharing my experiences building PowerShell tooling that I can use to create an archive index for this newsletter. I'm still unsure how I want to present this information to you. I've been able to get a list of all archived emails from the Buttondown API, and I've been using that data to create a content summary with excerpts. This has involved a lot more string parsing than I had hoped, but regular expressions help a lot.
Because this is a long process, I've saved the data to an XML file using Export-Clixml. At some point, I'll need to revisit the code I used to get the data via the API and turn it into a proper PowerShell tool. For now, I'll import this data to continue parsing it.
$all = Import-Clixml C:\scripts\behind-api-emails.xml
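As a minimal sketch of the save-and-reload round trip, here is the same pattern on a small stand-in object. The sample object and the temporary file path are assumptions made up for illustration; only the property names mirror the API data used in this article.

```powershell
# Hypothetical sample record standing in for one API result
$sample = [PSCustomObject]@{
    Subject      = 'Sample Issue'
    email_type   = 'public'
    publish_date = Get-Date '2024-01-15'
}

# A temporary path used only for this illustration
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'clixml-demo.xml'

# Serialize the object to XML, then read it back in
$sample | Export-Clixml -Path $path
$restored = Import-Clixml -Path $path

# The restored object keeps its property names, and the date is still a DateTime
$restored.Subject
$restored.publish_date.GetType().Name
```

The restored object is a deserialized copy, so it carries the data but not any methods from the original type. For plain data like this, that is usually all that is needed.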
When we left off, I had code to parse the content and create a summary document using markdown.

But there are problems.

The first problem to deal with is misconfigured code fences. In the sample markdown document, I'm creating a preview snippet by parsing text and counting words. The text is markdown, which means the snippet can include markup like code fences.

The problem is that I have an opening code fence as indicated by the arrow, but there is no closing code fence. How can I account for this?
We might need to back up a moment. My code to create the markdown document is based on parsing the original data from the API and creating a summary. I am using this private function.
Function parsePreview {
[cmdletbinding()]
Param([string]$Body,[int]$WordCount = 100)
[regex]$rx = '<[^>]+>'
#replace H tags with markdown
$body = $body -replace '<h1>', "`n`n# " -replace '<h2>', "`n`n## " -replace '<h3>', "`n`n### "
$body = $body -replace '', "`n`n"
$body = $body -replace "’","'"
$body = $body -replace "&gt;",">"
#define a new variable
$preview = $rx.replace($body, '')
#strip out markdown quote blocks
$preview = $preview -replace "(?<=\s*)\>(\s+)?.*(\r)?\n", ''
#get the first $WordCount number of words
$preview = ($preview -split " " | Select-Object -First $WordCount ) -join ' '
#append an ellipsis to indicate more
"$preview`n..."
}
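The two core steps in this function, stripping tags and keeping the first few words, can be sketched on their own. The HTML fragment below is a made-up sample, not newsletter content, and the five-word limit is chosen only to keep the output short.

```powershell
# Hypothetical HTML fragment used only for this illustration
$html = '<p>PowerShell <b>objects</b> make reporting easier than parsing text.</p>'

# Same pattern as in parsePreview: match anything that looks like a tag
[regex]$rx = '<[^>]+>'
$text = $rx.Replace($html, '')

# Split on spaces and keep only the first five words
$firstWords = ($text -split ' ' | Select-Object -First 5) -join ' '
$firstWords
```

Splitting on a single space treats line breaks as part of a word. Splitting on '\s+' would count words separated by newlines as well, at the cost of flattening the line structure that the markdown relies on.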
My earlier code processed the API data using Select-Object.
$e = $all | Select-Object @{Name = 'Title'; Expression = { $_.Subject } },
@{Name = 'Type'; Expression = { $_.email_type } },
@{Name = 'Published'; Expression = { $_.publish_date } },
@{Name = 'Preview'; Expression = { parsePreview -Body $_.body } },
@{Name = 'Link'; Expression = { $_.absolute_url } } |
Sort-Object Published -Descending
I don't want to type this all the time, so I'll turn this into a function.
Function New-BDEmailItem {
[cmdletbinding()]
Param(
[Parameter(Mandatory,ValueFromPipeline)]
[object]$InputObject
)
Begin {
Write-Verbose "[$((Get-Date).TimeOfDay) BEGIN ] Starting $($MyInvocation.MyCommand)"
. $PSScriptRoot\parsePreview.ps1
} #begin
Process {
Write-Verbose "[$((Get-Date).TimeOfDay) PROCESS] Processing $($InputObject.Subject)"
$InputObject | Select-Object -Property @{Name = 'Title'; Expression = { $_.Subject } },
@{Name = 'Type'; Expression = { $_.email_type } },
@{Name = 'Published'; Expression = { $_.publish_date } },
@{Name = 'Preview'; Expression = { parsePreview -Body $_.body } },
@{Name = 'Link'; Expression = { $_.absolute_url } } |
Sort-Object Published -Descending
} #process
End {
Write-Verbose "[$((Get-Date).TimeOfDay) END ] Ending $($MyInvocation.MyCommand)"
} #end
}
This function relies on the parsePreview function being loaded into my session. I'll manually load it for now by dot-sourcing its script file in the Begin block, so both files need to be in the same folder. If I eventually create a module, I could simply reference it as a private function.
With this function, I can parse the API data.
$e = $all | New-BDEmailItem -Verbose
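A small sketch, using a hypothetical function name, of how a Process block handles pipeline input one object at a time. Because of that, a Sort-Object call inside Process only ever sees a single item, so ordering the complete set is a job for the end of the pipeline.

```powershell
# Hypothetical function name used only for this illustration
function Test-PipelineItem {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [int]$InputObject
    )
    process {
        # Sort-Object receives just the current item here
        $InputObject | Sort-Object -Descending
    }
}

# Items come out in the order they went in
$unsorted = 1, 3, 2 | Test-PipelineItem

# Sorting after the function sees the whole set
$sorted = 1, 3, 2 | Test-PipelineItem | Sort-Object -Descending

$unsorted -join ','
$sorted -join ','
```

The first result keeps the input order and the second is in descending order, which suggests sorting the collected results afterwards rather than inside the function.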

Document Parsing
I can use this to create my markdown document.
$doc = [System.Collections.Generic.List[string]]::new()
$doc.Add("# Behind the PowerShell Pipeline Archive`n")
foreach ($item in $e) {
...
Here's how I can find missing closing code fences. I know that the group of backticks should come in pairs. I can use Select-String to help me "count".
$codeFence = $a | Select-String -Pattern "``````" -AllMatches
$a is the Preview property of an object with a code fence. Using the -AllMatches parameter, I can get all the matching backtick groups. I can then count the number of matches.
$codefence.matches.count
1
The $codefence variable is a MatchInfo object.
PS C:\> $codefence.GetType().Name
MatchInfo
PS C:\> $codefence.matches
Groups : {0}
Success : True
Name : 0
Captures : {0}
Index : 580
Length : 3
Value : ```
ValueSpan :
I can use the modulo operator to test if this is an even or odd number.
PS C:\> $codefence.matches.count%2
1
I can treat this as a Boolean result.
if ([bool]($codeFence.Matches.count % 2)) {
#append a closing code fence to the preview text
$short += "`n``````"
}
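The counting approach can be tried end to end on a small string. This is only a sketch: the sample text is invented, and the fence is built by repeating a single backtick so the example avoids backtick escaping altogether.

```powershell
# Build the three-backtick fence without escaping
$fence = '`' * 3

# Hypothetical preview text with an opening fence but no closing fence
$text = "Some text`n$fence`nGet-Process"

# Count every fence in the string
$m = $text | Select-String -Pattern $fence -AllMatches

# An odd count means a fence was never closed
if ($m.Matches.Count % 2) {
    $text += "`n$fence"
}

# Count again to confirm the fences are now paired
$balanced = ($text | Select-String -Pattern $fence -AllMatches).Matches.Count
$balanced
```

The whole preview is passed as one string, so Select-String treats it as a single input and -AllMatches reports every fence in it.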
Here is the complete markdown generation code, which also includes a few other cleanup steps.
$doc = [System.Collections.Generic.List[string]]::new()
$doc.Add("# Behind the PowerShell Pipeline Archive`n")
foreach ($item in $e) {
if ($item.Type -eq 'premium') {
$emoji = ':heavy_dollar_sign:'
}
else {
$emoji = ':globe_with_meridians:'
}
#reduce any headings in the title
$doc.Add("## [$($item.Title)]($($item.Link))")
#insert blank line
$doc.Add('')
$doc.Add("***Published***: $($item.Published) UTC $emoji")
$doc.Add('')
$short = $item.Preview -replace "#{1,2}(?=\s)", '###'
#test for matching code fences
$codeFence = $short | Select-String -Pattern "``````" -AllMatches
#test if an odd number
if ([bool]($codeFence.Matches.count % 2)) {
#append a closing code fence to the preview text
$short += "`n``````"
}
#replace ? with apostrophe
$short = $short -replace '\?', "'"
#trim out empty lines
$short = $short -replace '\s+(\r?)\n', "`n`n"
$doc.Add($short.Trim())
$doc.Add('')
}
$doc | Out-File c:\work\behind-test.md
This fixes the initial problem, which exposes more problems. It might not be possible to completely parse the markdown content automatically. I may have no choice but to manually edit the content. I have an idea on how to make this a little easier.
Using Objects
The ultimate solution is to have objects that I can use to create output. I have been using objects to create the markdown document, but I could just as easily use ConvertTo-Html.
$e | ConvertTo-Html -Title "Behind the PowerShell Pipeline Archive" -PreContent "Behind the PowerShell Pipeline Archive" -PostContent "created $(Get-Date)" -As List -CssUri C:\scripts\samplecss\sample3.css |
Out-File d:\temp\behind.html

Let's clean this up and write a script to make a proper HTML page.
#requires -version 5.1
#Make-BDArchiveHTML.ps1
Param(
[Parameter(
Position = 0,
Mandatory,
HelpMessage = "Specify the processed Buttondown archive data"
)]
[ValidateNotNullOrEmpty()]
[object]$Data,
[Parameter(HelpMessage = "The path to a CSS file to embed in the HTML output.")]
[ValidateScript({Test-Path $_})]
[string]$CSSPath = "c:\scripts\sample3.css",
[string]$ReportTitle = "Behind the PowerShell Pipeline Archive",
[string]$OutputPath = "behind-archive.html"
)
#emojis to insert
$globe = '🌍'
$dollar = '💲'
#import the CSS file to make a stand-alone HTML page
$head = @"
<title>$ReportTitle</title>
<style>
$(Get-Content $CSSPath -Raw)
</style>
"@
#define a list to hold the HTML body
$body = [System.Collections.Generic.List[string]]::new()
$body.Add("$ReportTitle
")
foreach ($item in $data) {
if ($item.Type -eq 'premium') {
$emoji = $dollar
}
else {
$emoji = $globe
}
$body.Add("
$($item.Title) $emoji")
#add an HTML fragment
$item | Select-Object -Property Published,Preview |
ConvertTo-HTML -As List -Fragment |
Out-String |
ForEach-Object { $body.Add($_) }
}
$body.Add("Created
$(Get-Date)")
ConvertTo-HTML -Body $body -head $Head | Out-File -FilePath $OutputPath
The script will import a CSS file and use it to create a stand-alone HTML page. I can use this script to create an HTML page from the API data.
.\Make-BDArchiveHtml.ps1 -Data ($e | Sort-Object Published -descending) -CSSPath C:\scripts\sample.css
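The fragment technique at the heart of the script can be sketched in isolation. The entry object and the title text below are placeholders invented for illustration; only the property names match the processed archive data.

```powershell
# Hypothetical entry shaped like one processed archive object
$entry = [PSCustomObject]@{
    Published = '2024-01-15'
    Preview   = 'Sample preview text'
}

# -Fragment emits only the table markup, not a complete HTML document
$fragment = $entry | ConvertTo-Html -As List -Fragment | Out-String

# Wrap the fragment in a full page with a custom head
$page = ConvertTo-Html -Body $fragment -Head '<title>Demo</title>' | Out-String
$page
```

Building the body from fragments like this leaves room to add custom headings or links between the tables before the final ConvertTo-Html call assembles the page.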

This page has links to the original content. I've also recreated the emoji indicators. I will try to attach this file to the newsletter.
Summary
Working with objects is definitely the way to go. It is easy to get sidetracked into parsing text. I think I need to take one more step in refining the data. For now, I'll save $e so I can re-use it later this week.
$e | Export-Clixml c:\scripts\behind-archive-entries.xml
I have something fun in mind so keep an eye on your inbox.