Ship the Capability, Not the Repo Summary
Most open-source AI projects do not fail at discovery. They fail at transfer.
A developer can find the repository, skim the README, star it, and still have no safe way to bring that project into Claude Code, Codex, Cursor, Aider, or any other AI host. The problem is not awareness. The problem is that a repository is not yet a portable capability.
That gap is where this weekly note starts.
At Doramagic, I have been packaging open-source AI projects into source-backed assets that an AI host can actually use before it touches a real environment. The work sounds like documentation at first. It is not. A useful capability asset has to answer a harder question:
What should the host AI know, verify, avoid, and recover from before it claims the project is ready?
That is a different product from a project summary.
A Summary Tells You What Exists. A Capability Asset Tells You What To Trust.
A typical project summary compresses the visible surface:
- what the project does;
- what language or framework it uses;
- how to install it;
- why it looks interesting.
That is useful for browsing, but it is not enough for an AI host. A host AI needs operating boundaries. It needs to know whether a claim comes from source evidence, a manual route, a community issue, a release note, or an unverified assumption. It also needs to know when to stop.
This matters because AI hosts are very good at sounding done. If you give one a neat summary, it may convert that summary into confident next steps:
- "The package is installed."
- "The quickstart works."
- "The tool is safe to expose."
- "The eval passed."
Those claims are cheap to say and expensive to clean up.
So the first rule of a reusable AI capability is simple:
Do not let the host AI turn project context into runtime proof.
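One way to make that rule mechanical is to tag every claim with where it came from, and let only observed runtime checks read as done. This is a minimal sketch, not Doramagic's implementation: the `Evidence`, `Claim`, and `report` names are hypothetical, and the `RUNTIME_CHECK` value is an assumption added for illustration on top of the provenance types listed above.

```python
from dataclasses import dataclass
from enum import Enum


class Evidence(Enum):
    # The first five values mirror the provenance types named in this note.
    SOURCE = "source evidence"
    MANUAL = "manual route"
    COMMUNITY_ISSUE = "community issue"
    RELEASE_NOTE = "release note"
    UNVERIFIED = "unverified assumption"
    # Assumption for this sketch: a check the host actually ran and observed.
    RUNTIME_CHECK = "runtime check"


@dataclass(frozen=True)
class Claim:
    text: str
    evidence: Evidence


def report(claim: Claim) -> str:
    """Phrase a claim so that only observed runtime checks read as done."""
    if claim.evidence is Evidence.RUNTIME_CHECK:
        return f"Verified: {claim.text}"
    return f"Not verified at runtime ({claim.evidence.value}): {claim.text}"


print(report(Claim("The quickstart works.", Evidence.MANUAL)))
```

With this shape, "The quickstart works." backed only by a manual is reported as not verified at runtime, however confident the sentence sounds.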
The Six Pieces I Want In Every Serious Capability Asset
This week, I am using a simple six-part checklist.
First, there needs to be a source boundary. What did we inspect? What is upstream? What is Doramagic-authored? What is copied, adapted, excluded, or link-only?
Second, there needs to be a host entry point. A project may have a README, but an AI host needs an instruction file, a prompt preview, or a loading route it can use without improvising.
Third, there needs to be a smallest safe action. Not "integrate this into production." Not "rewrite the app." Just the first action that proves the host understands the project boundary.
Fourth, there needs to be a pitfall log. A good pack should not pretend that public issues, version drift, config risk, or permission risk do not exist.
Fifth, there needs to be an eval or smoke check. The host AI should demonstrate the correct behavior before it touches a real project.
Sixth, there needs to be a recovery path. If the agent claims success too early, skips risk checks, or confuses preview with execution, the pack should tell it how to back out.
If one of these is missing, the asset may still be useful as reading material. It is not yet a dependable capability.
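The six-part checklist can be sketched as a small data structure with a completeness check. This is an illustrative model only: the `CapabilityAsset` class, its field names, and the sample values are assumptions made for this note, not the actual format of a Doramagic pack.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CapabilityAsset:
    # One field per checklist item; None or blank means the piece is missing.
    source_boundary: Optional[str] = None
    host_entry_point: Optional[str] = None
    smallest_safe_action: Optional[str] = None
    pitfall_log: Optional[str] = None
    smoke_check: Optional[str] = None
    recovery_path: Optional[str] = None


def missing_pieces(asset: CapabilityAsset) -> List[str]:
    """Return the names of checklist items that are absent or blank."""
    return [
        f.name
        for f in fields(asset)
        if not (getattr(asset, f.name) or "").strip()
    ]


def is_dependable(asset: CapabilityAsset) -> bool:
    """All six pieces present; anything less is reading material."""
    return not missing_pieces(asset)


# Hypothetical draft with only two of the six pieces filled in.
draft = CapabilityAsset(
    source_boundary="upstream repo inspected; nothing redistributed",
    host_entry_point="instruction file for the host",
)
print(missing_pieces(draft))
print(is_dependable(draft))
```

The point of returning the missing names, rather than a bare boolean, is that the gap itself is worth publishing alongside the asset.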
Four Field Notes From This Week
The useful part of this work is not making every project look equally polished. The useful part is making different projects expose different failure modes.
LangMem: Sometimes The Right Pack Is Link-Only
LangMem is a good reminder that more packaging is not always better packaging.
The Doramagic LangMem public pack is intentionally link-only. It includes upstream links, original metadata, a short Doramagic evaluation, and boundary notes. It does not redistribute upstream text, code, screenshots, rewritten manuals, tutorials, or prompt packs.
That decision is not a weakness. It is a trust signal.
If the redistribution boundary is not clean enough, a pack should say so. The wrong move would be to dress up a legally or operationally uncertain boundary as a full context pack just because a fuller artifact looks better in a catalog.
For product communication, this matters. Doramagic should not be remembered as "the site that summarizes AI repos." It should be remembered as the place that is willing to say: this project is interesting, but the public asset should stay narrow until the boundary is defensible.
Promptfoo: Evals Are Not Decoration
Promptfoo is a better fit for a full context pack because its core question is already close to Doramagic's question: how do you evaluate prompts, agents, RAG flows, and model behavior before you trust them?
The Doramagic promptfoo pack includes host instructions, a prompt preview, evals, pitfalls, and recovery rules. Its smoke check has a very specific expectation: the agent should restate the task, identify boundaries, propose a verification step, and avoid claiming success.
That last part is the product.
An eval is not a badge you attach after the article. It is the thing that prevents the AI host from turning "I understand promptfoo" into "promptfoo is installed and working." For AI workflows, the difference between those two sentences is where most hidden risk lives.
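To make the four expectations concrete, here is a rough sketch of a smoke check as a plain Python function. It is not promptfoo's configuration format or API, and it is not the check shipped in the Doramagic pack. The marker phrases and the `smoke_check` and `passed` names are hypothetical, and a real eval would need a stricter rubric than substring matching.

```python
from typing import Dict

# Hypothetical marker phrases that signal a premature success claim.
SUCCESS_CLAIMS = ("is installed", "is working", "eval passed", "works now")


def smoke_check(task: str, reply: str) -> Dict[str, bool]:
    """Score a host reply against the four expectations described above."""
    text = reply.lower()
    return {
        "restates_task": task.lower() in text,
        "identifies_boundaries": "boundary" in text or "not verified" in text,
        "proposes_verification": "verify" in text,
        "avoids_success_claim": not any(c in text for c in SUCCESS_CLAIMS),
    }


def passed(result: Dict[str, bool]) -> bool:
    """The check passes only when every expectation is met."""
    return all(result.values())


task = "evaluate the support prompt"
careful = (
    "Task: evaluate the support prompt. Boundary: I have not installed "
    "or run anything yet. Next, I would verify the config in a scratch directory."
)
overconfident = "promptfoo is installed and working."

print(passed(smoke_check(task, careful)))
print(passed(smoke_check(task, overconfident)))
```

The careful reply passes and the overconfident one fails, which is the behavior the smoke check exists to enforce before the host touches a real project.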
DeepEval: Evaluation Tools Still Need Evaluation Boundaries
DeepEval is another evaluation-oriented project, but the Doramagic pack treats it with the same skepticism it would apply to any other tool.
The pack marks several items as requiring verification. One high-priority pitfall points to an open community evidence item around an OWASP LLM02 output-handling metrics pack. Other notes remind the host not to trust research conclusions, citations, experiments, output quality, compatibility, or rollback safety before real installation and verification.
That is the right posture.
Evaluation frameworks are especially easy to over-trust because they sit in the "quality" part of the stack. But a tool that evaluates other systems can still have version drift, setup gaps, open issues, unclear compatibility, and output-quality uncertainty.
The lesson is blunt: using an eval framework does not remove the need for eval boundaries.
FastMCP: Tool Creation Is Also Permission Design
FastMCP is attractive because it makes MCP server and client work feel fast and Pythonic. That is the visible promise.
The less comfortable question is what happens after a tool is exposed to an agent.
The Doramagic FastMCP pack records installation, configuration, maintenance, security, and permission risks that require verification. One of the more important risk patterns is about destructive tool capabilities such as shell execution, file deletion, and environment access.
This is exactly where a repo summary becomes dangerous. "Build MCP servers faster" is a useful claim. But if an AI host only carries that claim forward, it may skip the real design question:
What permissions should this tool never receive by default?
For MCP projects, capability packaging must include permission posture. Otherwise you are not shipping a capability. You are handing an agent a sharper object and hoping the prompt is polite.
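A deny-by-default posture can be sketched without touching any MCP library. The code below deliberately does not use the FastMCP API. `ToolSpec`, `PermissionPosture`, and the capability identifiers are hypothetical names chosen for illustration, and the destructive set simply mirrors the three risk categories mentioned above.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Capabilities this note flags as destructive; the identifiers are illustrative.
DESTRUCTIVE = frozenset({"shell_execution", "file_deletion", "environment_access"})


@dataclass(frozen=True)
class ToolSpec:
    name: str
    capabilities: FrozenSet[str]


@dataclass(frozen=True)
class PermissionPosture:
    # Deny by default: destructive capabilities need an explicit, reviewed grant.
    granted: FrozenSet[str] = frozenset()

    def blocked(self, tool: ToolSpec) -> FrozenSet[str]:
        """Destructive capabilities the tool requests but has not been granted."""
        return (tool.capabilities & DESTRUCTIVE) - self.granted

    def may_expose(self, tool: ToolSpec) -> bool:
        return not self.blocked(tool)


posture = PermissionPosture()
reader = ToolSpec("read_docs", frozenset({"file_read"}))
cleaner = ToolSpec("clean_workspace", frozenset({"file_read", "file_deletion"}))

print(posture.may_expose(reader))
print(posture.may_expose(cleaner))
print(sorted(posture.blocked(cleaner)))
```

Under the default posture the read-only tool can be exposed and the cleanup tool cannot, until someone grants `file_deletion` on purpose. That explicit grant is the design decision a repo summary never forces you to make.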
The Product Lesson
The product lesson from these examples is not "use these four projects."
The lesson is that a useful AI project asset has to preserve the distance between:
- source-backed context;
- install-time reality;
- runtime behavior;
- production confidence.
Most AI content collapses those layers because collapsed layers are easier to market. The article sounds cleaner. The demo moves faster. The CTA becomes simpler.
But collapsed layers are exactly what make AI tooling feel magical in a video and expensive in a real repo.
Doramagic should take the opposite position. The boundary is the feature.
A Reusable Capability Checklist
Before you bring an open-source AI project into an AI host, ask these questions:
- What is the upstream source, and what has actually been inspected?
- Is this a full capability pack, a metadata page, a manual, or a link-only boundary?
- What can the host AI safely do before installation?
- What must wait until real installation or runtime verification?
- What is the smallest safe first action?
- What would count as false confidence?
- Which issues, releases, or pitfall notes should change the recommendation?
- What eval proves the host understands the boundary?
- What should the host do when the eval fails?
- What should never be exposed to the agent by default?
If a project page cannot answer these, the project may still be worth exploring. It just is not ready to be treated as a portable capability.
Source-Backed Examples
I used two concrete Doramagic manuals while preparing this note:
- Promptfoo manual: https://doramagic.ai/en/projects/promptfoo/manual/
- FastMCP manual: https://doramagic.ai/en/projects/fastmcp/manual/
Disclosure: I build Doramagic. These are independent project assets, not official upstream documentation or endorsements.
Next Week
Next week, I want to make the checklist stricter.
The question will be: when should Doramagic publish a full pack, and when should it deliberately stop at link-only metadata?
That decision may be less exciting than a new tool demo. It is also the kind of decision that determines whether an AI capability library earns trust over time.