Ideas for a "Production Cafe"
Ever since I wrote What’s in Production? I’ve grown steadily more obsessed with it. The act of writing these thoughts down focused the discordant, nagging voice in my head into a church choir singing the praises of production applications.
The best part is that the more I wonder, the more interesting examples I learn about. Some are pleasantly expected, like bandit learning being used for COVID testing, some are surprisingly simplistic, like bandit learning NOT being used for a critical explore-exploit tradeoff, some are genuinely new, like an application of submodular maximization to Amazon product exploration, and some are old favorites I’m revisiting, like Reed-Solomon codes used in space exploration.
It’s hard to remember all of this information, so I’d like to make a website to categorize it. It’s basically a CRUD database of topics and examples of those topics used in practical settings. Anyone can submit data, but I will personally review submissions to see if they’re up to my standards before they’re published. I’m calling the idea the “Production Cafe.” I registered the prod.cafe domain (currently forwarding to a Google form).
The way I see it, applications come and go, as do the implementing engineers. Someone might implement submodular maximization at Amazon, but then leave and not know that the maintainers decided to redesign that feature. An application might also have been successful ten years ago, only to be replaced since by a newer, better alternative.
This will influence the Production Cafe by framing applications more accurately as evidence that a topic is useful, with associated dates describing when the application was created, and when it was last verified as still in use.
A quick sketch of a database schema might look like
    Topic:
      name: string
      description: string
      definition_reference: url

    System:
      name: string
      link: url

    Evidence:
      // (live system, publication, source code, personal vouch, deprecation, ...)
      evidence_type: enum
      topic: Topic
      system: System
      description: string
      reference_link: url
      evidence_date: timestamp
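As a rough illustration, the schema sketch could be written as plain Python dataclasses. This is not the actual Django code, just a minimal stand-in. The enum member names and the choice to store URLs as plain strings are my own assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class EvidenceType(Enum):
    # Member names are assumed; the values mirror the types listed in the schema.
    LIVE_SYSTEM = "live system"
    PUBLICATION = "publication"
    SOURCE_CODE = "source code"
    PERSONAL_VOUCH = "personal vouch"
    DEPRECATION = "deprecation"


@dataclass(frozen=True)
class Topic:
    name: str
    description: str
    definition_reference: str  # URL, kept as a plain string in this sketch


@dataclass(frozen=True)
class System:
    name: str
    link: str  # URL


@dataclass(frozen=True)
class Evidence:
    evidence_type: EvidenceType
    topic: Topic
    system: System
    description: str
    reference_link: str  # URL
    evidence_date: datetime
```

A record would then be built from placeholder values like so (every value here is made up for illustration):

```python
topic = Topic("Some topic", "A short description", "https://example.com/definition")
system = System("Some system", "https://example.com/system")
evidence = Evidence(
    EvidenceType.SOURCE_CODE,
    topic,
    system,
    "Placeholder description of how the topic is used",
    "https://example.com/source",
    datetime(2020, 1, 1),
)
```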
You can imagine that not all types of evidence are weighted equally. A link to source code for an active (open source) project is the best, publications describing production systems are somewhere lower, and publications describing prototypes are even lower.
I also want to add a personal vouch, but it’s a tough call. For one, I want people to be able to say, “I used this math in a real project!” If provable artifacts are hidden behind corporate IP shields, then I would have to take the person’s word for it. A reference to, say, a blog post describing it would be better. But even without that, there should be some value, even if minuscule, in hearing enough people claim something is used. I just wouldn’t want that to somehow enable more deceit.
One other aspect of this that seems important to me is that evidence is dated. That is, evidence provided that something was used in production in 2009 is weighted lower than evidence from 2020. I suspect that would provide a decent filter, where ideas that are truly useful would be used in multiple settings over many years. And ideas that are not useful would be tried once, perhaps submitted as evidence to the site, and then not used again. All of these weights could coalesce into a ranking, or at least enough caveat emptors so that readers have the clearest possible picture of what’s being claimed.
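To make the weighting idea concrete, here is one way the evidence type and the evidence date could combine into a single score. Everything numeric is an assumption for illustration: the per-type weights, the five-year half-life, and the choice of exponential decay are all invented, and where a personal vouch sits relative to a prototype publication is a guess. Deprecation evidence is left out, since I haven't decided how it should count.

```python
from datetime import date
from enum import Enum


class EvidenceType(Enum):
    SOURCE_CODE = "source code"
    PRODUCTION_PUBLICATION = "publication describing a production system"
    PROTOTYPE_PUBLICATION = "publication describing a prototype"
    PERSONAL_VOUCH = "personal vouch"


# Hypothetical weights: only the ordering of the first three comes from the text.
TYPE_WEIGHTS = {
    EvidenceType.SOURCE_CODE: 1.0,
    EvidenceType.PRODUCTION_PUBLICATION: 0.6,
    EvidenceType.PROTOTYPE_PUBLICATION: 0.3,
    EvidenceType.PERSONAL_VOUCH: 0.1,
}

# Hypothetical: a piece of evidence loses half its weight every five years.
HALF_LIFE_YEARS = 5.0


def evidence_weight(evidence_type, evidence_date, today):
    """Weight one piece of evidence by its type, decayed by its age."""
    age_years = max(0.0, (today - evidence_date).days / 365.25)
    decay = 0.5 ** (age_years / HALF_LIFE_YEARS)
    return TYPE_WEIGHTS[evidence_type] * decay


def topic_score(evidence, today):
    """Sum the weights of (evidence_type, evidence_date) pairs for one topic."""
    return sum(evidence_weight(kind, when, today) for kind, when in evidence)
```

With this scheme, a topic backed by evidence from many different years keeps accumulating score, while a topic tried once fades as its single piece of evidence ages. A real ranking would probably need caps or normalization so that a pile of personal vouches can't outweigh one link to source code.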
Now, I don’t want to get ahead of myself here. The site will probably not become popular, and I’ll just use it as my personal journal to record the examples I see and am convinced by.
But if it does become popular, I could see it being a nice resource for researchers looking to understand important applications of their research field, or educators who want to answer students who ask, “when am I ever going to use this?”, or math enthusiasts like myself who want to know what works and what doesn’t. All in a searchable, indexed format. Perhaps it could even grow beyond math or spawn sister sites for other types of topics. Math can’t be the only domain to have this problem. I also imagine I could find a way to profit from this in the long term, though I don’t want to think about that before I have something that satisfies my own needs.
Right now I’m just fiddling around with a Django app to start. The UI will likely be ugly because I’m a pretty simple guy on that front. Since I’m busy with my baby and house hunting and a full-time job, I probably won’t get prod.cafe out the door for a bit.
If you want to contribute to the underlying knowledge base, please fill out this form. Or if you want to share ideas or encouragement, or you know of websites that already do this, reply to this newsletter email and let me know.