On Metafiles

                July 19, 2022

            A metafile is a file that represents multiple possible files. The most common type of metafile in use is the template file:
<p>I passed in {number}</p>

Templates are usually filled in at runtime, so a template "looks like" a single file, but you can also write a script that takes the template and produces a whole set of output files. That's what makes it a metafile.
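To make that concrete, here's a minimal sketch of such a script, assuming the template uses Python's `str.format` placeholders. The `expand_template` function and the `page_N.html` filenames are hypothetical names for illustration, not part of any real tool.

```python
TEMPLATE = "<p>I passed in {number}</p>"


def expand_template(template: str, values) -> dict[str, str]:
    """Map a (hypothetical) output filename to the template filled in with each value."""
    return {f"page_{v}.html": template.format(number=v) for v in values}


# One template, three "actual" files.
outputs = expand_template(TEMPLATE, [1, 2, 3])
for name, text in outputs.items():
    print(name, "->", text)
```

Writing each entry of `outputs` to disk would turn the one template into a directory of concrete files.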
I first got interested in metafiles while working on learntla. I wanted to present specifications as a sequence of iterations but keep them all in sync, so that changing a name in the first one automatically propagated through the rest. My solution was to put them all in a single XML file:
  IsUnique(s) == <s on="1">Cardinality(seen) = Len(s)</s><s on="2-">
    \A i, j \in 1..Len(s): 
      <s on="3">i # j => </s>seq[i] # seq[j]</s>

Then I used a script to expand the XML into multiple files:
\* file__1.tla
  IsUnique(s) == Cardinality(seen) = Len(s)

\* file__2.tla
  IsUnique(s) == 
    \A i, j \in 1..Len(s): 
      seq[i] # seq[j]

\* file__3.tla
  IsUnique(s) == 
    \A i, j \in 1..Len(s): 
      i # j => seq[i] # seq[j]

The XML is a metafile which represents three different "actual" files. I'd estimate that this trick cut at least a week off the project time.
(Why XML? It's a poor data language, but it's the best markup language for text. XML is good because it 1) preserves whitespace, 2) has inline markup, and 3) cleanly separates content from data about the content (via attributes).¹)
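The expansion script itself isn't shown here, so the following is only a sketch of how one could work, using Python's standard `xml.etree.ElementTree`. The function names are hypothetical. It assumes `on="N"` means "only version N" and `on="N-"` means "version N and later", as in the example above, and it also accepts a closed range `N-M`, which is my own assumption.

```python
import xml.etree.ElementTree as ET

SOURCE = r'''  IsUnique(s) == <s on="1">Cardinality(seen) = Len(s)</s><s on="2-">
    \A i, j \in 1..Len(s):
      <s on="3">i # j => </s>seq[i] # seq[j]</s>
'''


def is_on(spec: str, version: int) -> bool:
    """'3' matches only version 3; '2-' matches 2 and up; '1-2' matches 1 through 2."""
    if "-" not in spec:
        return version == int(spec)
    low, high = spec.split("-", 1)
    return int(low) <= version and (high == "" or version <= int(high))


def render(node, version: int) -> str:
    """Concatenate the text of a node, skipping children that are off for this version."""
    out = [node.text or ""]
    for child in node:
        # A child without an "on" attribute is treated as always on (an assumption).
        if is_on(child.get("on", "1-"), version):
            out.append(render(child, version))
        out.append(child.tail or "")  # text after the child's closing tag
    return "".join(out)


def expand(source: str, versions) -> dict[str, str]:
    # Wrap the fragment in a root element so it parses as one XML document.
    root = ET.fromstring(f"<root>{source}</root>")
    return {f"file__{v}.tla": render(root, v) for v in versions}


files = expand(SOURCE, [1, 2, 3])
for name, text in files.items():
    print(f"\\* {name}")
    print(text)
```

One limitation of parsing the source as XML: any literal `<` or `&` in the specification would need to be escaped.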
The Switchfile
The learntla format used number attributes to represent sequential versions of a file. A more general metafile would use string flags:²
<on _="debug,test">
print("Debugging line")</on>
f()

If at least one of {debug, test} is passed in during expansion, the print statement appears in the file; otherwise it'll just contain f(). This metafile format can do all sorts of interesting things:

<on _=""> starts a "comment" block: whatever's in it will never appear in an output file.
<on _="profiling"> makes it easy to keep instrumentation on your local copy of the code without it polluting the master branch.
<on _="mac"> gives you multiple compilation targets and, unlike #ifdef, leaves out all the "cruft" code for the OSes you're not on.
When refactoring code I often want to toggle between running the original version and running the refactor, which means writing shims or manually tweaking the code. It'd be easier if I could put both versions in the same metafile, with different switches.
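A switchfile expander could be sketched along these lines. This is a hypothetical implementation, not an existing tool: it assumes the `_` attribute holds a comma-separated list of flags and that a block is kept when at least one of them is passed in.

```python
import xml.etree.ElementTree as ET

SWITCHFILE = '''<on _="debug,test">
print("Debugging line")</on>
f()
'''


def _render(node, flags: set[str]) -> str:
    """Keep an <on> block only if it shares at least one flag with `flags`."""
    out = [node.text or ""]
    for child in node:
        wanted = {f.strip() for f in child.get("_", "").split(",") if f.strip()}
        if child.tag == "on" and wanted & flags:
            out.append(_render(child, flags))
        out.append(child.tail or "")  # text following the block is always kept
    return "".join(out)


def expand(source: str, flags: set[str]) -> str:
    # Wrap the fragment in a root element so it parses as one XML document.
    root = ET.fromstring(f"<root>{source}</root>")
    return _render(root, flags)


print(expand(SWITCHFILE, {"debug"}))  # includes the print statement
print(expand(SWITCHFILE, set()))      # only f()
```

An empty `_=""` yields an empty flag set, so the block can never match, which gives the "comment" behavior. Because the source is parsed as XML, code containing `<` or `&` would need escaping, which is a real cost for a format meant to wrap program text.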

You can also incorporate metafiles into larger scripts. One common problem in constraint solving code is that optimizations interact nonlinearly. If you have optimizations A, B, and C, it could be that BC is the fastest combination, then A, then B or C, then AB or AC, then no optimizations at all, then ABC. Checking all eight possibilities is a huge pain, especially if one of them is a change in two parts of the code.
With metafiles, we put them all in a single metafile, and then write a script that runs benchmarks on each possible expansion.
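Such a script might look roughly like the sketch below. Three independent switches give 2^3 = 8 combinations, which `itertools.combinations` can enumerate. The `build_and_run` callback is a placeholder for "expand the metafile with these switches, then run the benchmark", and all names here are hypothetical.

```python
import time
from itertools import combinations


def all_flag_sets(flags):
    """Yield every subset of the flags, from no optimizations to all of them."""
    for size in range(len(flags) + 1):
        for combo in combinations(flags, size):
            yield frozenset(combo)


def benchmark_all(flags, build_and_run):
    """Time build_and_run(flag_set) for each subset; return results fastest first."""
    timings = []
    for flag_set in all_flag_sets(flags):
        start = time.perf_counter()
        build_and_run(flag_set)
        elapsed = time.perf_counter() - start
        timings.append((elapsed, sorted(flag_set)))
    return sorted(timings, key=lambda pair: pair[0])


def fake_build_and_run(flag_set):
    """Stand-in: a real version would expand the metafile and run the solver."""
    pass


subsets = list(all_flag_sets(["A", "B", "C"]))
results = benchmark_all(["A", "B", "C"], fake_build_and_run)
for elapsed, flag_names in results:
    print(f"{elapsed:.6f}s  {'+'.join(flag_names) or '(none)'}")
```

A single timing per combination is noisy, so a real script would probably repeat each run several times before comparing.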
Other possible metafiles
I can't think of any other general metafiles off the top of my head, but there are probably good metafiles for specific tasks. I haven't tried writing a metafile for preprocessing, but I can imagine it'd be good for that.
Another idea: a while back I was working on a "predicate logic for programmers" guide and was stuck on how to write the logic.³ Should I write ∀x∈S or all x in S? Former is how mathematicians write it, latter is easier for beginners. I eventually settled on the latter syntax, but I could have also written it as <fa/>x<in/>S and output both versions. Then the reader could choose which one they wanted to read.
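As a rough sketch of that idea, each notation can be a lookup table from tag to replacement text. The style names and the `render` function are made up for illustration, and plain string replacement is used instead of a real XML parser to keep it short.

```python
NOTATIONS = {
    "symbolic": {"<fa/>": "∀", "<in/>": "∈"},
    # The English words carry their own spacing so the output reads naturally.
    "english": {"<fa/>": "all ", "<in/>": " in "},
}


def render(text: str, style: str) -> str:
    """Replace each notation tag with its spelling in the chosen style."""
    for tag, replacement in NOTATIONS[style].items():
        text = text.replace(tag, replacement)
    return text


source = "<fa/>x<in/>S"
print(render(source, "symbolic"))  # ∀x∈S
print(render(source, "english"))   # all x in S
```

Adding a third notation would then just mean adding another entry to the table.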
Metafiles as toolkit
I don't know if this would work well in a team setting. My metafile examples all fall into two categories: metafiles that represent different possibilities for the same file (debug, profiling, optimization), and ones which represent a set of different files (versions, compilation targets). In the former case, different people would all have different possibilities they want to encode, which would turn the metafile into a messafile. The latter case would work better, I think.
The "different possibilities" metafiles would then work as part of your "toolkit": the software you use to make your software product. You'd take a couple of metafile formats that serve your specific purposes and write tooling around them. Toolkits are a topic I'd like to explore more. I've seen people talk about what's in their toolkits but haven't read anything about the abstract concept of a toolkit.

¹ I find scribble more aesthetically pleasing but it doesn't have much tooling or editor support.

² You could generalize this further with more complicated conditionals, but that's too much effort.

³ Originally this was gonna be a book, but I'll start with an article crash course instead.
