AI Builders Digest
Tuesday, May 12, 2026
Yesterday we talked about the AI agent honeymoon ending. Today brings the other side: a new Microsoft Research benchmark suggests current agents will give up your interests to get the job done, while Box CEO Aaron Levie explains why fixing that isn't a weekend project.
01
Microsoft discovers AI agents are terrible negotiators (for you)
Microsoft Research released SocialReasoning-Bench, a new benchmark that tests whether AI agents actually advocate for their users' interests when negotiating calendar slots or marketplace deals. The results are concerning: current frontier models consistently leave value on the table, accepting the first offer 93% of the time instead of pushing for better terms. Even when explicitly told to optimize for the user, they perform well below what you'd expect from a trustworthy human delegate.
Why it matters: Your future AI assistant might book you the 6 AM meeting because it's "easier to coordinate" rather than fighting for the 10 AM slot you actually want. These aren't just efficiency problems — they're agency problems.
02
Box CEO Aaron Levie on why AI agents aren't plug-and-play
Box CEO Aaron Levie laid out the unsexy reality of deploying AI agents beyond coding: it takes serious infrastructure work. You need the right context and data pipelines, secure system integrations, monitoring of output quality, human-in-the-loop workflow design, and ongoing maintenance whenever models update. This isn't a side project you can hand to an intern.
Why it matters: Every startup pitching "AI agents for [insert industry]" is about to discover that the hard part isn't the AI — it's everything else. The companies that survive will be the ones that treat agent deployment as enterprise software, not a ChatGPT plugin.
03
Product leader Peter Yang wants AI to read his kid's school newsletter
Product leader Peter Yang highlighted a perfect use case for practical AI: scanning those massive weekly school newsletters to flag anything parents actually need to know, like early dismissals or important events buried in pages of administrative text.
Why it matters: This is the kind of boring, useful AI application that people will actually pay for — not because it's impressive, but because it saves genuine frustration.
04
Bun creator's experimental Rust rewrite passes 99.8% of tests
Bun team member Thariq Shihipar revealed that creator Jarred Sumner experimented with rewriting the entire Bun JavaScript runtime in Rust, and the rewrite passes 99.8% of the test suite. His takeaway: "we're not being ambitious enough."
05
Swyx shares thoughts on build vs. buy for SaaS
AI community builder Swyx posted his thinking on build-versus-buy decisions in SaaS, tagging Box CEO Aaron Levie and inviting corrections or additions to the analysis.
Follow builders, not influencers. A daily digest of what matters in AI.