Your Screen Is the New API
Cheap models below, screen-use agents above, and one permission checklist to forward.
DeepSeek V4 landed at the end of April with a 1M context window, open weights, and coding scores strong enough that "which model is best?" is becoming the wrong question.
The better question: "which workloads belong on which side of the open/closed line?"
At the same time, OpenAI gave Codex the ability to operate Mac apps directly. HeyGen's HyperFrames turned web primitives into an agent-friendly motion graphics pipeline.
Cheap models are pushing down from one side. Screen-use agents are pushing outward from the other. The permission model has to catch up.
Open models are catching up
DeepSeek V4 ships with a 1M context window, MIT-licensed weights, and competitive public coding scores. On OpenRouter, V4-Pro is listed at $0.435 per million input tokens and $0.87 per million output tokens.
At that price, frontier models have to earn their spot in the workflow.
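The arithmetic is worth running against your own traffic. A minimal sketch: the per-token rates are the OpenRouter V4-Pro listing above; the monthly volume is a hypothetical workload, not a benchmark.

```python
# OpenRouter listing for DeepSeek V4-Pro, in dollars per million tokens
INPUT_PRICE = 0.435
OUTPUT_PRICE = 0.87

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost for a month of traffic at the listed rates."""
    return (input_tokens / 1e6) * INPUT_PRICE + (output_tokens / 1e6) * OUTPUT_PRICE

# Hypothetical workload: 2B input tokens, 500M output tokens per month
print(round(monthly_cost(2_000_000_000, 500_000_000), 2))  # 1305.0
```

Swap in a frontier model's rates and your real volumes, and the gap between "cheap enough to ignore" and "worth a migration project" becomes a number instead of a feeling.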
The easy candidates: high-volume internal work where "good enough" is the bar. Document extraction, classification, internal question answering, summarization, first-draft content, support ticket triage.
The harder cases: customer-facing work, agentic tool use, and regulated workflows where auditability, support, and review burden matter as much as token price. If you don't have an MLOps function, deployment overhead can eat the savings.
OpenAI's GPT-5.5 release makes the same point from the other side. GPT-5.5 is expensive at the API layer, but OpenAI is arguing that Codex gets more work done with fewer tokens than GPT-5.4. Cheaper open models below. Expensive frontier models above. Routing rules keep either one from becoming the default out of habit.
I’m feeling this in my own workflow. Claude Code and Opus have been my default for months, but more of my editing and implementation work is moving to Codex with GPT-5.5 or GPT-5.4.
Segment first; migrate later, if the evals justify it. To decide, audit the top three workloads your team runs on Claude or GPT. The high-volume, lower-stakes tasks are candidates for open-model testing now. The customer-facing and agentic ones stay on the frontier.
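That segmentation is worth writing down as code so it stops living in people's heads. A minimal sketch, where the task categories and tier names are illustrative assumptions, not a recommendation of specific models or endpoints:

```python
# Illustrative routing table: task category -> model tier.
# Category and tier names are assumptions for this sketch.
OPEN_MODEL_CANDIDATES = {
    "extraction", "classification", "internal_qa",
    "summarization", "first_draft", "ticket_triage",
}
FRONTIER_TASKS = {"customer_facing", "agentic_tool_use", "regulated"}

def route(task_category: str) -> str:
    """Return the model tier a task should run on."""
    if task_category in OPEN_MODEL_CANDIDATES:
        return "open-model-candidate"  # eligible for open-model evals now
    # Frontier tasks, and anything unrecognized, stay on the safe side.
    return "frontier"

print(route("summarization"))    # open-model-candidate
print(route("customer_facing"))  # frontier
```

The design choice that matters is the last line of `route`: unknown work defaults to the frontier tier, so nothing quietly migrates to a cheap model without passing through the audit.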
Your screen is the new API
OpenAI gave Codex hands in April. The operating change is simple: an app no longer needs an API to become automatable.
If it's visible on your Mac, an agent can see it, click it, type into it, and work through it with you. The screen becomes the integration layer.
The same pattern is showing up in creative work. HyperFrames treats a video composition like a web page. Claude Code or Codex can read, write, lint, and render scenes the same way it edits any other codebase.
Both are cases of the same crossing: agents moving from API-bound integrations into UI-bound work. At that point, setup is infrastructure. Browser profiles, accounts, logs, scopes, approvals.
APIs force a permission conversation: token, scopes, rotation, logs. Screen-use agents inherit the human workspace. If your browser is already authenticated into GitHub, Vercel, Stripe, or Slack, the practical permission model becomes "whatever this session can reach."
The Vercel incident in April wasn't a computer-use story. It was an early version of the same failure class: a third-party AI tool, an OAuth path, and environment variables treated as potentially exposed.
The lock-down list before trusting an agent with your workspace:
- Separate browser profile. No shared tabs or unrelated client sessions.
- Least-privilege account. If the agent only needs staging, don't give it a session that can reach production.
- Treat visible secrets as exposed. Environment pages, admin consoles, terminal buffers.
- Default environment variables to sensitive. Vercel shipped this as a product change after the incident. Apply the same rule locally.
- One task per session. Broad sessions are where permission boundaries quietly disappear.
- Log the work. What it touched, what it changed, what it chose not to do.
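A couple of those items can be enforced mechanically instead of by discipline. A minimal sketch, assuming a hypothetical `launch_agent_session` entry point for a Chromium-based browser; the point is the allowlist (environment variables default to sensitive) and the isolated profile:

```python
import os
import subprocess

# Allowlist, not blocklist: env vars default to sensitive, and only
# explicitly named ones pass through to the agent's session.
SAFE_ENV_VARS = {"PATH", "HOME", "LANG", "TERM"}

def scrubbed_env() -> dict:
    """Environment for an agent session with secrets stripped out."""
    return {k: v for k, v in os.environ.items() if k in SAFE_ENV_VARS}

def launch_agent_session(command: list) -> subprocess.Popen:
    """Run one task in a separate browser profile with a scrubbed env."""
    # --user-data-dir gives the browser an isolated profile: no shared
    # tabs, no inherited GitHub/Vercel/Stripe/Slack sessions.
    return subprocess.Popen(
        command + ["--user-data-dir=/tmp/agent-profile"],
        env=scrubbed_env(),
    )
```

This is a sketch, not a sandbox: it covers "separate browser profile" and "default env vars to sensitive," while least-privilege accounts and logging still have to come from the services themselves.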
If someone on your team is asking whether an agent should get browser, repo, or SaaS access, forward them this checklist.
The answer is not "never." It is a smaller, cleaner workspace with explicit boundaries.
Worth Reading
n8n's MCP server can now build workflows. It used to expose existing workflows. Now it can create and update them inside your n8n instance. That's this issue in product form: agents are starting to assemble the workflows the tools run.
OpenAI published its AGI principles. The most candid line is Altman's own: OpenAI may have to "trade off some empowerment for more resilience" as it scales. That's the platform-governance version of the permission question. Worth reading once now, worth re-reading in six months when you notice what changed.
ChatGPT Images 2.0 is worth scrolling, even if you don't care about image models. The demo gallery is the signal: text-heavy layouts, multilingual posters, infographics, editorial spreads. The unlock is not prettier pictures. It's visual work that starts to behave more like a brief you can revise.
--Collin
P.S. Working through an agent access decision? Reply with the workflow. I can help map the permission boundary before it turns into folklore.
Forwarded this? You can subscribe here: buttondown.com/collinwilkins