Creating a GitHub Repository Tool
Not long ago, I saw something on social media that got me thinking. As usually happens, one thing led to another, and before I knew it I was heads down in PowerShell code. The target of my scripting fever may not be relevant to everyone, but I think there is value in sharing my process. What did I consider while building my tool? How did I refine it? I firmly believe that you can learn as much from the journey as you can from the destination, so focus less on the end result than on how I get there. I think this process will take a few articles. Of course, there may be things for you to pick up from my code, but I think the real value is in the process.
GitHub Repository Information
My initial thought was that I wanted a PowerShell tool I could run that would give me information about my GitHub repositories. Ultimately, I want to tie this to my projects in the PowerShell Gallery, but let's not get ahead of ourselves.
Initially, I knew I wanted the output to include the repository name and URL. I assumed there would be a way to get at least that information. I expected to add to the list of properties once I discovered what else was available. Naturally, I will need to query GitHub for my repository information. There is a well-documented API for GitHub, which I've used in the past. But I decided to take what I think will be an easier route.
Instead of fussing with tokens and URLs, I decided to use the GitHub CLI tool, gh.exe. You can install this from GitHub, or use a package manager like winget.
winget install --id GitHub.cli
When you first install it, you will need to configure it with your account and authentication information. You can find complete documentation at https://cli.github.com/manual/. Once configured, you can do just about everything you can do with the API, but with the convenience of a command-line tool.
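Before going any further, it can help to confirm that the tool is installed and signed in. Here is a minimal sketch of such a check. The function name Test-GhReady is hypothetical, and I'm assuming that gh auth status returns a non-zero exit code when it can't find a valid login. If the check fails, gh auth login walks you through signing in.

```powershell
function Test-GhReady {
    [CmdletBinding()]
    [OutputType([bool])]
    param()

    # Is gh available on the PATH?
    $cmd = Get-Command -Name gh -CommandType Application -ErrorAction SilentlyContinue
    if (-not $cmd) {
        Write-Warning 'gh was not found. Install it first, for example with winget.'
        return $false
    }

    # Assumption: gh auth status exits with a non-zero code when no valid login exists.
    $null = gh auth status 2>&1
    if ($LASTEXITCODE -ne 0) {
        Write-Warning "gh is installed but not authenticated. Run 'gh auth login'."
        return $false
    }

    $true
}

Test-GhReady
```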
Repo commands
You can run gh or gh --help to get started. Do so, and you'll see an option to manage repositories.
gh repo
Drilling down, I can get help on the list command.
Based on the help, I can try running the command.
gh repo list --no-archived --visibility public
I expect to see my public repositories that I have not marked as archived.
This is a good start. However, the output is text which means there's not much I can do with it. Nor am I getting the information I want.
JSON to the Rescue
Going back to the help, I see an option to format the output as JSON. This is ideal. If you can get JSON output, you can convert it to PowerShell objects and do whatever you want with it. I didn't show it in the screenshot above, but the help shows me the available fields I can include in the output.
JSON FIELDS
assignableUsers, codeOfConduct, contactLinks, createdAt, defaultBranchRef,
deleteBranchOnMerge, description, diskUsage, forkCount, fundingLinks,
hasDiscussionsEnabled, hasIssuesEnabled, hasProjectsEnabled, hasWikiEnabled,
homepageUrl, id, isArchived, isBlankIssuesEnabled, isEmpty, isFork,
isInOrganization, isMirror, isPrivate, isSecurityPolicyEnabled, isTemplate,
isUserConfigurationRepository, issueTemplates, issues, labels, languages,
latestRelease, licenseInfo, mentionableUsers, mergeCommitAllowed, milestones,
mirrorUrl, name, nameWithOwner, openGraphImageUrl, owner, parent,
primaryLanguage, projects, projectsV2, pullRequestTemplates, pullRequests,
pushedAt, rebaseMergeAllowed, repositoryTopics, securityPolicyUrl,
squashMergeAllowed, sshUrl, stargazerCount, templateRepository, updatedAt, url,
usesCustomOpenGraphImage, viewerCanAdminister, viewerDefaultCommitEmail,
viewerDefaultMergeMethod, viewerHasStarred, viewerPermission,
viewerPossibleCommitEmails, viewerSubscription, visibility, watchers
I can use the --json option to get the output in JSON format.
gh repo list --no-archived -L 3 --json id,name,url,description
I only need a small sample to verify the syntax.
A word of caution with this command. The JSON field names are case-sensitive. And even though it looks like you are passing an array of field names, you are not. You are passing a single string in which the field names are separated by commas, so you can't have any spaces between the names.
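If you would rather maintain the field names as a list, one option is to keep them in an array and join them into the single string that gh expects. This is only a sketch. The variable names are mine, and the gh call is commented out because it needs a configured gh installation.

```powershell
# Keep the field names in an array so they are easy to read and edit.
# Remember that the names are case-sensitive.
$fieldList = 'id', 'name', 'url', 'description'

# Join them into one comma-separated string with no spaces.
$fields = $fieldList -join ','
$fields

# The call would then look like this:
# gh repo list --no-archived -L 3 --json $fields
```

The joined value is the string id,name,url,description, which is the same value passed to --json in the earlier command.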
But now I can easily create PowerShell output.
gh repo list --no-archived -L 3 --json 'id,name,url,description' |
    ConvertFrom-Json |
    Format-List
Notice that I didn't try to create a complex gh command to give me the final result. I took baby steps to validate the process. Too often, I see people try to write the finished product from the very beginning, which is inevitably complicated and error-prone. When it fails, it can be difficult to know what part of the process failed. By taking small steps, I can verify each step and make sure it works before moving on.
Objects, Objects, and More Objects
The ConvertFrom-Json cmdlet is a good start, but I can do better. I want to do better. For one thing, I'd like to have properly cased property names. Instead of description it should be Description. Little details like this separate a good tool from a great tool. If you take the time to pay attention to this detail, you are likely to pay attention to other details, and the result is better, more professional code.
Another challenge is that some properties are nested objects.
gh repo view PSGalleryReport --json "name,url,defaultBranchRef,watchers"
If I want to surface these types of values, I need to customize the output from ConvertFrom-Json. I can do this by creating a custom object.
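To see what nested means in practice, here is a small sketch using a made-up JSON sample shaped like the gh repo view output. The values are placeholders. Dotted notation reaches into the nested objects, and calculated properties with Select-Object are one possible way to flatten them while fixing the casing at the same time.

```powershell
# Hypothetical sample; real output comes from gh repo view <name> --json ...
$json = @'
{
  "name": "SampleRepo",
  "url": "https://example.com/SampleRepo",
  "defaultBranchRef": { "name": "main" },
  "watchers": { "totalCount": 4 }
}
'@
$repo = $json | ConvertFrom-Json

# Dotted notation reaches the nested values.
$repo.defaultBranchRef.name
$repo.watchers.totalCount

# One option: flatten and rename with calculated properties.
$flat = $repo | Select-Object -Property @(
    @{ Name = 'Name';          Expression = { $_.name } }
    @{ Name = 'DefaultBranch'; Expression = { $_.defaultBranchRef.name } }
    @{ Name = 'WatcherCount';  Expression = { $_.watchers.totalCount } }
)
$flat
```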
$fields = 'id,name,url,description,diskUsage,pushedAt,updatedAt,latestRelease,stargazerCount,watchers,defaultBranchRef,visibility'
$r = gh repo list --no-archived -L 5 --json $fields
$x = $r | ConvertFrom-Json | Select-Object -Last 1
Here's the converted object from the properties I want.
It is easy enough to create another custom object from this.
$y = [PSCustomObject]@{
    PSTypeName        = 'GitHubRepo'
    Name              = $x.name
    Description       = $x.description
    LastUpdate        = $x.updatedAt
    Visibility        = $x.visibility
    DefaultBranch     = $x.defaultBranchRef.name
    LatestReleaseName = $x.latestRelease.name
    LatestReleaseDate = $x.latestRelease.publishedAt
    LastPush          = $x.pushedAt
    StargazerCount    = $x.stargazerCount
    WatcherCount      = $x.watchers.totalCount
    URL               = $x.url
    DiskUsage         = $x.diskUsage
    ID                = $x.id
}
I'm giving my custom object a type name so I can create a custom format file later. But now, I have a better-formatted object that meets my requirements.
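One possible next step is to wrap that mapping in a function so it can process every repository from the pipeline. This is a sketch, not the finished tool. The function name ConvertTo-GitHubRepo is hypothetical, and the JSON is a made-up sample with only a few of the fields, so the remaining properties come back empty.

```powershell
function ConvertTo-GitHubRepo {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object]$InputObject
    )
    process {
        # Same mapping as above, applied to each incoming object.
        [PSCustomObject]@{
            PSTypeName        = 'GitHubRepo'
            Name              = $InputObject.name
            Description       = $InputObject.description
            LastUpdate        = $InputObject.updatedAt
            Visibility        = $InputObject.visibility
            DefaultBranch     = $InputObject.defaultBranchRef.name
            LatestReleaseName = $InputObject.latestRelease.name
            LatestReleaseDate = $InputObject.latestRelease.publishedAt
            LastPush          = $InputObject.pushedAt
            StargazerCount    = $InputObject.stargazerCount
            WatcherCount      = $InputObject.watchers.totalCount
            URL               = $InputObject.url
            DiskUsage         = $InputObject.diskUsage
            ID                = $InputObject.id
        }
    }
}

# Hypothetical sample standing in for gh repo list --json output.
$sample = @'
[
  { "name": "SampleRepoA", "defaultBranchRef": { "name": "main" }, "watchers": { "totalCount": 2 } },
  { "name": "SampleRepoB", "defaultBranchRef": { "name": "main" }, "watchers": { "totalCount": 0 } }
]
'@

# The parentheses make sure the array is sent down the pipeline one item at a time.
$repos = ($sample | ConvertFrom-Json) | ConvertTo-GitHubRepo
$repos | Select-Object -Property Name, DefaultBranch, WatcherCount

# The custom type name is what a format file would target.
$repos[0].PSObject.TypeNames[0]
```

With real data, the sample would be replaced by the output of gh repo list using the $fields string from earlier.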