Archive • vadimall.com • Buttondown

Run Real AI Features in the Browser with Transformers.js v4 and WebGPU

July 1, 2026

A TypeScript tutorial on browser AI with Transformers.js v4 and WebGPU: build client-side semantic search and on-device summaries with a WASM fallback and real benchmarks....

Edge RAG: Build a Sub-100ms Retrieval App with Cloudflare Workers AI and Vectorize

June 29, 2026

A complete Cloudflare Workers AI Vectorize edge RAG tutorial in TypeScript. Embed, index, retrieve, and stream answers from a single Worker on 300+ PoPs —...

Give Your AI Agent Persistent Memory with Anthropic Managed Agents

June 19, 2026

Stop building your own vector store just so your assistant remembers things. This TypeScript tutorial on the Anthropic Managed Agents memory store gives your agent...

Anthropic Agent Skills in TypeScript: Package Reusable Instructions and Code as Tools

June 17, 2026

Stop copy-pasting the same 2,000-token system prompt into every feature. This Anthropic Agent Skills TypeScript SDK tutorial shows how to package...

Build a ChatGPT App with the OpenAI Apps SDK and MCP in TypeScript

June 15, 2026

Ship an interactive app inside ChatGPT with the OpenAI Apps SDK and MCP in TypeScript. Build an MCP server, an iframe widget, and the postMessage bridge that...

Why Your "Working" AI Demo Will Break in Production: A Reality Check for PMs and Founders

June 12, 2026

The AI demo vs production gap, explained for PMs and founders: eight specific ways a working prototype breaks at scale, plus a pre-roadmap checklist.

Pricing AI Features: A Founder's Guide to Per-Seat vs Usage-Based Models

June 10, 2026

Per-seat vs usage-based pricing for AI features, with real gross-margin math for three products and a TypeScript calculator you can adapt to your own...

OpenAI Realtime API vs Gemini Live vs Pipecat: Which for Voice AI in TypeScript

June 8, 2026

Build the same voice agent in OpenAI Realtime, Gemini Live, and Pipecat-JS. A TypeScript comparison of latency, interruption handling, mid-stream function...

Hybrid Search That Actually Works: BM25 + Embeddings + Reranking in TypeScript

June 5, 2026

Pure vector search misses exact-match queries like SKUs and error codes. Fix retrieval with hybrid search in TypeScript: BM25 + embeddings, fused with RRF,...

PII Redaction Middleware: Strip Sensitive Data Before It Reaches the LLM

June 3, 2026

Build TypeScript middleware that redacts PII before it reaches the LLM, then re-hydrates the response: reversible redaction, audit logging, and a test...

Build an AI Eval Suite with Promptfoo: Catch Prompt Regressions Before Production

June 1, 2026

How to use Promptfoo to set up a TypeScript AI eval suite that catches prompt regressions in CI: deterministic asserts, LLM-as-judge, cost budgets, and a...

Why Your AI Feature Needs a Job Queue (And How to Add One with BullMQ)

May 29, 2026

Why synchronous LLM calls in Next.js API routes break under real load — and how to refactor to a BullMQ job queue with idempotency, priority lanes, SSE...

Add AI Image Generation to Your Next.js App with Replicate, Fal, and Cloudflare R2

May 27, 2026

Add AI image generation to your Next.js app end-to-end — call Replicate or Fal with Flux, store outputs in Cloudflare R2, serve via signed URLs, handle...

Build Generative UI with Vercel AI SDK: Stream React Components from an LLM

May 25, 2026

Build generative UI with the Vercel AI SDK — stream real React components from an LLM tool call, render partial state during streaming, and gracefully fall...

Pinecone vs Turbopuffer vs pgvector: Which Vector Database for Production RAG in 2026

May 22, 2026

A real benchmark of Pinecone, Turbopuffer, and pgvector for production RAG in TypeScript. Latency, recall, and monthly cost at 1M, 10M, and 100M chunks.

Build a Text-to-SQL Feature: Let Users Query Your Database in Plain English

May 20, 2026

Build a production-safe text-to-SQL feature in TypeScript with Postgres. Schema context, Zod validation, sandboxed roles, EXPLAIN gating, and a self-correction...

Cut Your Claude API Bill by 90% Using Prompt Caching in TypeScript

May 18, 2026

How to use Anthropic prompt caching in TypeScript to cut Claude API costs by up to 90%. Where to place cache_control breakpoints, TTL tradeoffs, and real RAG...

When Your AI Feature Gets Gamed: Prompt Injection Defense for JavaScript Apps

May 15, 2026

Practical prompt injection defense for JavaScript web apps: input sanitization middleware, system prompt hardening, canary tokens, and output validation with...

How to Write an AI Feature Spec That Engineers Won't Push Back On

May 13, 2026

An AI feature spec template with the eight sections engineers actually want: success metrics, fallback behavior, latency budgets, edge cases, and eval sets....

Build a Voice-Enabled AI Assistant in the Browser with TypeScript

May 11, 2026

Build a voice AI assistant in the browser with TypeScript using the Web Speech API, an LLM, and speech synthesis. Handle interruption, wake words, and mobile...