Creating Pandoc PowerShell Tools
One of the things I enjoy about creating a PowerShell-based tool is discovery. When writing code, I will realize that it would be helpful if I could use a particular command, feature, or scripting element. However, I may need to learn how to implement it. I am forced to learn something new, which is always a good thing. Sometimes, after going down a rabbit hole, I realize what I thought would be my solution was the wrong choice. You have to be willing to abandon a direction and pivot. Don't force your code. And don't feel discouraged about the time and effort you invested exploring. You may not need what you learned in this project, but it might be helpful down the road. And the process of learning reinforces good learning patterns.
I want to share some of my experiences creating a PowerShell tool to convert a module's README Markdown file to a PDF using the Pandoc utility we've been looking at. As a reminder, as with much of my work, don't focus on or get hung up on the end result. Instead, focus on the process and techniques I used.
The first step is to outline my requirements. This step will help me determine what parameters I will need and identify what challenges I need to overcome. Because I want to run my command with minimal effort, I can set default parameter values. It is better to have parameters than to hard-code the settings. The command may have what appear to be many parameters, but they provide flexibility for that odd file I want to convert.
This is my initial list of items to configure in the PDF.
- Metadata
- Title
- Table Of Contents
- Code Syntax Highlighting
- Document font settings
- Code font setting
- Page settings
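To illustrate the idea of parameters with sensible defaults, here is a minimal sketch. The function name, parameter names, and default values are all hypothetical and are not the finished tool. The function only builds a Pandoc argument list, so it can be tried without running a conversion.

```powershell
# Hypothetical sketch: parameters with defaults instead of hard-coded settings.
function Get-PandocArgument {
    [CmdletBinding()]
    [OutputType([string[]])]
    param(
        [Parameter(Mandatory)]
        [string]$Path,

        # Default to a PDF with the same name as the source file
        [string]$OutputPath = ([System.IO.Path]::ChangeExtension($Path, '.pdf')),

        [string]$Title,
        [switch]$TableOfContents,
        [string]$HighlightStyle = 'espresso',
        [string]$MainFont = 'Verdana',
        [string]$MonoFont = 'Lucida Console'
    )

    # Settings that always apply, using the defaults unless overridden
    $pandocArgs = @(
        '-s', $Path, '-o', $OutputPath
        "--highlight-style=$HighlightStyle"
        '--variable', "mainfont=$MainFont"
        '--variable', "monofont=$MonoFont"
    )

    # Optional settings are added only when requested
    if ($Title) { $pandocArgs += '--metadata', "title=$Title" }
    if ($TableOfContents) { $pandocArgs += '--toc' }

    $pandocArgs
}

$pandocArgs = Get-PandocArgument -Path 'README.md' -TableOfContents
$pandocArgs
```

The resulting array could then be splatted to the native command, as in pandoc @pandocArgs. Only the file path is required, and every other setting can be overridden for that odd file.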
I also know I'll need to address a few things. Often, my README files include links to command Markdown files in the docs folder. This works just fine when viewing the README file in GitHub. But for a standalone PDF, I need to convert those links to point to the corresponding Markdown file in the GitHub repository.
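One way to approach the link conversion is a regular expression replacement. This is only a sketch: the function name, the BaseUrl parameter, and the owner/repo URL are hypothetical, and it assumes the relative links always begin with docs/ and end with .md.

```powershell
# Hypothetical helper: rewrite relative docs links as absolute GitHub URLs.
function Convert-DocLink {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [AllowEmptyString()]
        [string]$Text,

        # Assumed base URL, such as https://github.com/owner/repo/blob/main
        [Parameter(Mandatory)]
        [string]$BaseUrl
    )
    begin {
        # Match ](docs/Some-Command.md) and capture the relative path
        $pattern = '\]\((?<path>docs/[^)\s]+\.md)\)'
        $base = $BaseUrl.TrimEnd('/')
    }
    process {
        # The backtick keeps PowerShell from expanding ${path},
        # leaving it for the regex engine as a named-group reference
        $Text -replace $pattern, "]($base/`${path})"
    }
}

$sample = 'See [Get-Foo](docs/Get-Foo.md) for details.'
$converted = $sample | Convert-DocLink -BaseUrl 'https://github.com/owner/repo/blob/main'
$converted
```

Because the function accepts pipeline input, the output of Get-Content could be piped through it line by line. Links that don't match the pattern pass through unchanged.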
I know I can embed images with Pandoc. However, I sometimes add a few emojis to a README. I need to be able to embed them as images.
Finally, I often use Markdown code like this:
> *This is a special note*
When converting to a PDF, this doesn't format the way I would like. I want to replace > with __Note:__
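A regular expression can handle this replacement as well. The sketch below assumes that a note is always a block quote whose text starts with an italic marker, so ordinary block quotes are left alone. The sample text is made up for the demonstration.

```powershell
# Sample Markdown with one italic note and one ordinary block quote
$markdown = @'
# Sample

> *This is a special note*

> An ordinary quote.
'@

# (?m) makes ^ match at the start of every line.
# (?=\*) requires an asterisk after the marker without consuming it.
$converted = $markdown -replace '(?m)^>\s*(?=\*)', '__Note:__ '
$converted
```

The lookahead is a design choice. Without it, every block quote in the file would be turned into a note.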
Using Markdown Metadata
As I was researching and testing Pandoc commands, I realized I might be able to use Markdown metadata. This is a block of text at the beginning of a Markdown file that is not rendered but can be used by third-party tools, like Pandoc.
---
mainfont: 'Verdana'
monofont: 'Lucida Console'
title: 'Sample Markdown Document'
---
# My Title
What you put in the metadata block depends on how it will be used. After a little testing, I realized I could use Pandoc variable names in the metadata block. Instead of having to use a parameter like --variable mainfont='Verdana', I could use a metadata block. This would make my command much cleaner and easier to use.
Here's the metadata header for a test document.
---
mainfont: 'Verdana'
monofont: 'Lucida Console'
monobackgroundcolor: 'white'
backgroundcolor: '#f3f4f6'
linkcolor: green
fontsize: 14pt
maxwidth: 85%
author: 'Jeffery Hicks'
date: '2024-09-03'
title: 'Sample Markdown Document'
subject: 'pandoc markdown'
keywords: ['Markdown','PowerShell']
---
Now, instead of having to specify all of these parameters on the command line, I can run a much simpler command like this:
pandoc -s -o d:\temp\3.html .\sample3.md --embed-resources --highlight-style=espresso --from markdown+emoji
The only parameters I need to specify are those specific to the conversion process. Note that I explicitly set the --from parameter because my Markdown file has emojis. This creates a nicely formatted HTML file.
I even get updated metadata in the HTML file.
I did have to experiment with the date value, though. Values with spaces failed to work.
The HTML conversion was able to embed the emojis as images in the HTML file.
Maybe I can insert a metadata block, or create a metadata file, for the PDF conversion. But there are multiple problems when I try to run this command:
pandoc -s -o d:\temp\3.pdf .\sample3.md --embed-resources --highlight-style=espresso --from markdown+emoji --pdf-engine=xelatex
I had to specify a different PDF engine to avoid fatal errors. Even then, I encountered warnings about setting the main font size and recognizing emojis.
[WARNING] [makePDF] LaTeX Warning: Unused global option(s): [14pt].
[WARNING] Missing character: There is no 😄 (U+1F604) in font Verdana/OT:script=latn;language=dflt;m
[WARNING] Missing character: There is no ❗ (U+2757) in font Verdana/OT:script=latn;language=dflt;m
The rest of the PDF looks good, though, including the PDF metadata.
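If I wanted the tool to generate the metadata block rather than type it by hand, a sketch might look like this. The function name is hypothetical, and it assumes every value is a simple string. A list value such as keywords would need extra handling.

```powershell
# Hypothetical helper: turn a hashtable of Pandoc variables into a YAML metadata block.
function New-PandocMetadata {
    [CmdletBinding()]
    [OutputType([string])]
    param(
        [Parameter(Mandatory)]
        [System.Collections.IDictionary]$Metadata
    )

    $lines = foreach ($key in $Metadata.Keys) {
        # Single-quote every value; YAML escapes an embedded quote by doubling it
        $value = ([string]$Metadata[$key]) -replace "'", "''"
        "{0}: '{1}'" -f $key, $value
    }

    # A metadata block is delimited by --- lines
    (@('---') + $lines + '---') -join [Environment]::NewLine
}

# An ordered hashtable keeps the entries in the order written
$meta = [ordered]@{
    mainfont = 'Verdana'
    monofont = 'Lucida Console'
    title    = 'Sample Markdown Document'
}
$yaml = New-PandocMetadata -Metadata $meta
$yaml
```

The resulting text could be prepended to a temporary copy of the Markdown file before conversion. Saving it to a separate file for Pandoc's --metadata-file option is another possibility, although I have not verified that approach here.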
Integrating Emojis
After more research, I realized I could insert an image link to the emoji. The :smile: emoji has a Unicode value of 0x1f604.
PS C:\> [char]::ConvertFromUtf32(0x1f604)
😄
I tend to use GitHub-compatible emojis, which is fortunate because GitHub maintains an image library. Instead of using the Unicode value, I can use the image URL.
![](https://github.com/images/icons/emoji/unicode/1f604.png)
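Building on that link format, here is a rough sketch of how emoji shortcodes might be swapped for image links. The function name and the lookup table are hypothetical. The base URL is taken from the example link above, and a real tool would need a complete shortcode-to-code-point map.

```powershell
# Hypothetical helper: build a Markdown image link for an emoji code point.
function Get-EmojiImageLink {
    [CmdletBinding()]
    [OutputType([string])]
    param(
        [Parameter(Mandatory)]
        [int]$CodePoint,

        # Base URL taken from the example image link in the text
        [string]$BaseUrl = 'https://github.com/images/icons/emoji/unicode'
    )
    # {1:x} formats the code point as lowercase hex, such as 0x1f604 -> 1f604
    '![]({0}/{1:x}.png)' -f $BaseUrl, $CodePoint
}

# Hypothetical lookup table covering only the shortcodes I happen to use
$emojiMap = @{
    ':smile:'       = 0x1f604
    ':exclamation:' = 0x2757
}

$text = 'All done :smile:'
foreach ($code in $emojiMap.Keys) {
    # Replace each shortcode with its image link
    $text = $text.Replace($code, (Get-EmojiImageLink -CodePoint $emojiMap[$code]))
}
$text
```

With the shortcodes replaced by ordinary image links, the conversion no longer depends on the document font containing the emoji characters.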