Creating Pandoc PowerShell Tools Part 2
I've been sharing my experiences in building Pandoc-based PowerShell tools. My ultimate goal is to create a tool that will convert a Markdown README file from a GitHub repository into a PDF file. As always, I encourage you to focus on the techniques and concepts I use and not necessarily the specific tools I'm building. I am determined to get through the remaining details today so let's get to it.
Fonts
One element I might want to take advantage of is the ability to use different fonts in my PDF file. I can do this by using the --variable option with the mainfont and monofont variables. I also showed you last time how you can use a metadata header in your Markdown file to set these variables. However, this can be a little tricky and I have done more than my share of trial and error. The font name you use can refer to a font family, or you can specify a specific font including the extension like CascadiaMono.ttf. But it needs to be a font that the LaTeX engine recognizes.
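As a rough sketch of the command-line form, the call might look something like this. The file names, the font values, and the choice of xelatex as the PDF engine are assumptions for illustration; substitute fonts that your LaTeX installation can actually find.

```powershell
# Hypothetical input and output paths, for illustration only
$source = '.\sample.md'
$target = '.\sample.pdf'

# Build the arguments as an array so each value is passed as a single argument
$pandocArgs = @(
    $source
    '-o', $target
    '--pdf-engine=xelatex'                     # assumption: xelatex is installed
    '--variable', 'mainfont=Noto Sans'         # a font family name
    '--variable', 'monofont=CascadiaMono.ttf'  # or a specific font file
)

# Only invoke pandoc if it is available and the source file exists
if ((Get-Command pandoc -ErrorAction SilentlyContinue) -and (Test-Path $source)) {
    pandoc @pandocArgs
}
```

Splatting the array keeps a value with a space, like the font family name, from being split into two arguments.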
I discovered the fc-list utility that was installed when I installed MiKTeX.
PS C:\> fc-list cascadiacode
fc-list: security risk: running with elevated privileges
C:/WINDOWS/Fonts/CascadiaCode-SemiBold.ttf: Cascadia Code,Cascadia Code SemiBold:style=SemiBold,Regular
C:/WINDOWS/Fonts/CascadiaCode.ttf: Cascadia Code:style=ExtraLight
...
The utility writes colon (:) delimited output, but I can't easily use ConvertFrom-Csv because the path also contains a colon. So I used PowerShell to parse the output, beginning with a regular expression to split the output into named captures.
[regex]$rx = '^(?<path>.+): (?<name>.+):style=(?<style>.+)$'
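Before building anything on the pattern, it is worth a quick test against one line of the sample output shown earlier. This is only a sanity check, not part of the final tool.

```powershell
[regex]$rx = '^(?<path>.+): (?<name>.+):style=(?<style>.+)$'

# One line copied from the fc-list output shown earlier
$line = 'C:/WINDOWS/Fonts/CascadiaCode-SemiBold.ttf: Cascadia Code,Cascadia Code SemiBold:style=SemiBold,Regular'

$m = $rx.Match($line)
if ($m.Success) {
    # Each named capture is available through the Groups collection
    $m.Groups['path'].Value
    $m.Groups['name'].Value
    $m.Groups['style'].Value
}
```

The path capture ends at the colon followed by a space, so the colon after the drive letter does not split the path.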
I can create a custom object in PowerShell using the named captures.
$f = fc-list cascadiacode
$fontList = foreach ($font in $f) {
    if ($rx.IsMatch($font)) {
        $r = $rx.Match($font)
        [PSCustomObject]@{
            Name  = $r.Groups['name'].Value
            Style = $r.Groups['style'].Value -split ','
            Path  = $r.Groups['path'].Value
        }
    }
    else {
        Write-Warning "Failed to parse:$font"
    }
}
Once I know this will work, I can generate a list of all fonts and save the data to a JSON file.
$fontlist | Sort-Object Name | ConvertTo-Json | Out-File c:\scripts\latex-fontlist.json
In my PowerShell session, I can import this list.
$fl = Get-Content C:\scripts\latex-fontlist.json | ConvertFrom-Json
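Here is a small sketch of how I might query the imported list. To keep it self-contained, the JSON below is sample data built from the two fc-list lines shown earlier, standing in for the real file.

```powershell
# Sample data standing in for the contents of the JSON file
$json = @'
[
  { "Name": "Cascadia Code,Cascadia Code SemiBold", "Style": ["SemiBold", "Regular"], "Path": "C:/WINDOWS/Fonts/CascadiaCode-SemiBold.ttf" },
  { "Name": "Cascadia Code", "Style": ["ExtraLight"], "Path": "C:/WINDOWS/Fonts/CascadiaCode.ttf" }
]
'@
$fl = $json | ConvertFrom-Json

# Find fonts by name and add the file name, which is a candidate font value
$found = $fl | Where-Object Name -like '*Cascadia*' |
    Select-Object Name, @{Name = 'File'; Expression = { [System.IO.Path]::GetFileName($_.Path) } }
$found
```

The File property is a calculated property added only for this example; the objects in the list have just Name, Style, and Path.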
This will help me determine variable names, or values to use in a metadata YAML file.
mainfont: notosans-regular.ttf
fontsize: 10pt
monofont: Consola.ttf
monofontoptions: Scale=0.75
Colors
Likewise, I might want to specify colors for my PDF file. I might want to change link colors or even the page color. If I am using a table of contents, I can even set a color for that. You can use standard colors like red or blue, but I discovered other options. Take a look at the documentation at https://www.overleaf.com/learn/latex/Using_colors_in_LaTeX#Named_colors_provided_by_the_xcolor_package. You can use values from the xcolor package. I can set these colors in the metadata YAML file. If I load the package with an option like dvipsnames, I can specify even more colors. I need to define this in the metadata.
mainfont: notosans-regular.ttf
fontsize: 12pt
monofont: Consola.ttf
monofontoptions: Scale=0.85
subject: "Sample Markdown Document"
title: "Sample Markdown to PDF"
keywords: [Markdown, PowerShell, Pandoc,demo]
author: "Jeffery D. Hicks"
date: "2024-09-05"
geometry: margin=1in
toc: true
toc-depth: 2
documentclass: report
urlcolor: BlueViolet
toccolor: PineGreen
header-includes:
- \usepackage[dvipsnames]{xcolor}
- \definecolor{bg}{RGB}{240, 240, 240}
- \pagecolor{bg}
I can even define a custom color. Using the YAML file, I can create a test PDF.
pandoc -s -o d:\temp\sample5.pdf .\sample5.md --embed-resources --metadata-file=.\meta.yml
Modify the File
The last part of my process involves editing the Markdown file. Remember, I need to replace emojis with corresponding image links, among other things. If I only had a few simple changes to make, I might attempt simple text replacements. Instead, I am going to load the original Markdown file into a generic list. It will be easier to modify the file contents using the list object. Then I can write a new, temporary Markdown file and convert that to a PDF. My original Markdown file will remain unchanged.
Let me walk you through the process. I'm going to use the README file from my PSTimers module.
$path = "C:\scripts\PSTimers\README.md"
I'll create a generic list object to store the Markdown file and read the contents of the file into the list.
$content = [System.Collections.Generic.List[string]]::new()
$content.AddRange([string[]]$(Get-Content -Path $Path))
This is the content I will manipulate.
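To illustrate why the list is convenient, here is a minimal sketch using a few made-up lines instead of a real README. The sample text and the temporary file name are assumptions for illustration only.

```powershell
# A few sample lines standing in for a README file
$content = [System.Collections.Generic.List[string]]::new()
$content.AddRange([string[]]@('# Sample', '', 'Old line', 'Last line'))

# Replace a line in place by index
$i = $content.IndexOf('Old line')
if ($i -ge 0) { $content[$i] = 'New line' }

# Insert and remove lines without rebuilding the whole collection
$content.Insert(1, 'Inserted line')
$content.RemoveAt($content.Count - 1)

# Write the modified content to a temporary file; the source file is never touched
$tempFile = Join-Path ([System.IO.Path]::GetTempPath()) 'README-temp.md'
$content | Set-Content -Path $tempFile
```

An array would need to be rebuilt for every insert or removal, while the list methods change the collection directly.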
Doc Links
One of the things I want to fix is the relative references to the docs folder.
I want to change these links to point to the online GitHub links. I'll use a regular expression pattern to find these links.
[regex]$rx = '(?<=\()docs\/'
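A minimal sketch of how the pattern could be applied to a single line follows. The base URL is a placeholder, the branch name is an assumption, and the sample line is made up for illustration.

```powershell
[regex]$rx = '(?<=\()docs\/'

# Placeholder base URL; <owner> and the branch name are assumptions
$baseUrl = 'https://github.com/<owner>/PSTimers/blob/main/docs/'

# A made-up line with one relative link and one plain mention of docs/
$line = 'See [help](docs/Some-Command.md) for details about docs/ files.'
$result = $rx.Replace($line, $baseUrl)
$result
```

The look-behind only matches docs/ when it immediately follows an opening parenthesis, so the link target is rewritten while the plain mention later in the line is left alone.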