D.A.D.: Trump To Sign Huge AI Order As Early As Today — 5/21
The Daily AI Digest
Your daily briefing on AI
May 21, 2026 · 14 items · ~7 min read
From: Politico, Reuters, Cohere, Google AI, OpenAI, arXiv
D.A.D. Joke of the Day
My AI wrote a really long apology email for me. I asked it to be brief. It said it was—that was the summary.
What's New
AI developments from the last 24 hours
Trump to Sign AI Oversight Order, Asking Labs to Submit Frontier Models for Federal Review
President Trump is expected to sign an executive order on AI and cybersecurity as early as Thursday, according to reporting from Politico, Reuters, CNN, and the Washington Post, with the White House inviting AI company CEOs to a signing ceremony. It would be the administration's first major move to police frontier AI after a largely hands-off, pro-innovation stance. The draft reportedly has two parts. A cybersecurity section gives the Pentagon 30 days to secure its networks and tasks the Treasury Department with standing up a voluntary "clearinghouse"—a partnership with AI labs and critical-infrastructure operators like community banks, rural hospitals, and utilities to find and patch vulnerabilities in unreleased models. A second section on "covered frontier models" gives agencies 60 days to build a classified benchmarking process to define which models fall in scope, with the NSA making the final call. The framework is voluntary: it would ask developers to engage with the government before launch, hand over covered models for review up to 90 days before public release, and grant early access to critical-infrastructure operators. That 90-day window is contested—some companies want as little as 14 days. The trigger, officials say, was the arrival of cyber-capable models like Anthropic's Mythos, which the company says can exploit security flaws at unprecedented speed and has shared only with a tightly controlled consortium.
Why it matters: This could be a pivotal moment in AI governance. The U.S., the biggest market for and developer of frontier AI products, is adjusting its security stance, at least to a degree. It could be read as the moment a pro-innovation White House blinked in order to oversee the most capable models. But two caveats are worth remembering. One, the framework is reportedly voluntary. Two, it's unclear how much control it gives the Trump administration over powerful new tools. The purported soft-touch design could prove to be the real story: it would hand industry the collaborative, non-mandatory approach it has lobbied for, and the still-undefined "covered frontier model" benchmark is the line every lab and enterprise should watch, since it decides who falls under federal review. The order also reaches well beyond the labs: banks, hospitals, and utilities would be pulled into both a vulnerability-sharing clearinghouse and a push to adopt AI more widely, which could mean real obligations for ordinary institutions. And the throughline is hard to miss: it took the labs' own offensive-cyber models to force Washington's hand, a sign that capability is outrunning policy.
AI Model Autonomously Disproves 79-Year-Old Math Conjecture
An internal OpenAI reasoning model has autonomously disproved a mathematical conjecture posed by Paul Erdős in 1946. The problem, concerning how many pairs of points can be exactly one unit apart in a plane, had resisted improvement for decades. The AI found an infinite family of counterexamples that polynomially beats the long-assumed optimal 'square grid' construction. External mathematicians verified the proof and published a companion paper. Fields medalist Tim Gowers called it 'a milestone in AI mathematics.'
Why it matters: This is reportedly the first time AI has autonomously solved a prominent open problem central to an entire mathematical subfield—a qualitative leap from AI-assisted proofs to AI-generated mathematical discovery.
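For readers who want a feel for the problem, here is a minimal sketch of the classical grid baseline, not the OpenAI model's construction. It brute-force counts the pairs of points in a k-by-k integer grid that sit at a chosen squared distance; rescaling the grid so that distance becomes 1 turns those into unit-distance pairs. The function names and the brute-force approach are illustrative assumptions, not anything taken from the paper.

```python
from itertools import combinations


def count_pairs(k: int, d2: int) -> int:
    """Count unordered pairs of points in a k-by-k integer grid
    whose squared distance equals d2 (integer arithmetic, so exact)."""
    points = [(x, y) for x in range(k) for y in range(k)]
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )


def best_squared_distance(k: int) -> tuple[int, int]:
    """Return (d2, pair_count) for the squared distance that occurs
    most often in the grid. Requires k >= 2."""
    max_d2 = 2 * (k - 1) ** 2  # squared length of the grid's diagonal
    candidates = ((d2, count_pairs(k, d2)) for d2 in range(1, max_d2 + 1))
    return max(candidates, key=lambda pair: pair[1])
```

For a 3-by-3 grid, `count_pairs(3, 1)` gives 12: the six horizontal and six vertical neighbor pairs. On larger grids, `best_squared_distance` lets you check which distance is the most common one to rescale to 1.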
Cohere Releases Open-Source Model for Enterprise AI Agents
Cohere released Command A+, a 218-billion-parameter open-source model under Apache 2.0, designed for enterprise AI agents that can be deployed on private infrastructure. The company claims major gains on agentic tasks: its score on an internal telecom benchmark jumped from 37% to 85% versus its previous model, and coding task performance rose from 3% to 25%. Cohere says the model supports a 128K-token context window and 48 languages. Hardware requirements are steep: a minimum of two H100 GPUs or one B200, which puts self-hosting within reach of large enterprises but not small teams.
Why it matters: Open-weight models capable of complex multi-step tasks give enterprises an alternative to API-only services like GPT-4 or Claude, with full data control—relevant for regulated industries exploring AI agents.
What's Innovative
Clever new use cases for AI
Quiet day in what's innovative.
What's Controversial
Stories sparking genuine backlash, policy fights, or heated disagreement in the AI community
Rights Groups Accuse Meta of Blocking Activist Accounts in Gulf States
In a joint statement, human rights organizations accuse Meta of hiding more than 100 Facebook and Instagram accounts belonging to NGOs, researchers, and activists from users in Saudi Arabia and the UAE. Since late April, accounts from groups including ALQST for Human Rights have been geo-blocked, with Meta citing compliance with local cybercrime laws. The statement claims Meta is effectively acting as an enforcement arm for Gulf governments. Notably, X received similar takedown requests but reportedly has not complied.
Why it matters: This highlights how major platforms navigate authoritarian government requests—and that different companies are making different choices, which could influence enterprise decisions about which platforms to use for sensitive communications or advocacy work.
What's in the Lab
New announcements from major AI labs
Google Tests Life-Size Video Displays to Help Remote Workers Feel Present
Google announced an experiment for Google Beam that renders remote meeting participants at true-to-life size on HP's immersive display, positioning them as if seated around a shared table with spatial audio. The goal: making hybrid meetings feel less lopsided for people dialing in. Google cites internal research showing a 50% stronger sense of social connection and 21% increase in reported ability to contribute for remote participants—though these figures come from Google's own studies, not independent testing.
Why it matters: This signals continued investment in hardware-software solutions for hybrid work, though specialized immersive displays remain niche compared to standard video conferencing.
OpenAI Expands School AI Program to Nine Countries
OpenAI expanded its Education for Countries program at the Education World Forum in London, adding Singapore to a coalition that includes Estonia, Greece, Italy, Slovakia, Trinidad & Tobago, Kazakhstan, the UAE, and Jordan. The initiative focuses on government-level AI deployment in schools, teacher training, and localized tools. Early numbers from pilots: Estonia's ChatGPT Edu rollout has reached 20,000+ students and 4,600 teachers; Jordan claims 1 million+ students and 100,000+ teachers using its AI education assistant.
Why it matters: This signals OpenAI's push to become the default AI infrastructure for public education systems worldwide—a market positioning play that could shape how the next generation learns to use (and depend on) AI tools.
Ramp Claims OpenAI's Codex Cuts Code Review Time From Hours to Minutes
Ramp says its engineering team is using OpenAI's Codex with GPT-5.5 to speed up code review, claiming the tool delivers substantive feedback in minutes rather than hours. Austin Ray, who leads Ramp's AI Developer Experience team, says Codex catches issues that both human reviewers and other AI tools miss. The team is also building an internal 'On-Call Assistant' using the same technology to help manage engineer rotations. No benchmarks or performance data were provided—the claims are qualitative testimonials.
Why it matters: This is an early enterprise case study for GPT-5.5's coding capabilities, though the lack of hard numbers makes it difficult to assess whether the claimed improvements are substantial or incremental.
Google Cuts AI Ultra Price, Adds $100 Tier at I/O
Google restructured its AI subscription tiers at I/O 2026, adding a $100/month plan aimed at developers and dropping its top-tier AI Ultra from $250 to $200/month. The $100 tier offers 5X the usage limits of the Pro plan; the $200 tier offers 20X. Both include access to Gemini 3.5 Flash and the new Gemini Omni models, 20TB of cloud storage, and priority access to Google's Antigravity development platform. A new AI agent feature called Gemini Spark begins rolling out to trusted testers this week, with Ultra subscribers getting beta access next week.
Why it matters: Google is making its premium AI features more accessible while directly competing with OpenAI's and Anthropic's paid tiers—the price cut and new mid-tier option signal that the subscription battle is intensifying.
Cohere Signs Partnerships to Expand Enterprise AI in Europe and Canada
Cohere signed preliminary agreements with two partners to expand its enterprise AI footprint. The first, with Spain's Indra Group, will deploy Cohere's models through Indra's IndraMind platform in Spanish and Canadian markets, emphasizing data sovereignty—full organizational control over where AI runs and what it can access. The second explores quantum-inspired optimization techniques with Multiverse Computing across Europe and Canada. Both are MOUs, meaning they signal intent rather than binding contracts or product launches.
Why it matters: Cohere is positioning itself as the enterprise AI provider for organizations that can't or won't send data to U.S. hyperscalers—a growing market as data residency regulations tighten globally.
What's in Academe
New papers on AI and its effects from researchers
Behavioral AI Cuts Mental Health Prediction Errors Up to 58%
Researchers developed TimeSRL, a framework that converts raw behavioral data (like sleep patterns or phone usage) into plain-language descriptions before making predictions—a "semantic bottleneck" approach. In mental health prediction tests, the system reduced prediction errors by 27-58% compared to standard LLM approaches for depression, and 9-44% for anxiety. More notably, the model transferred to entirely new datasets without additional training while maintaining accuracy, suggesting behavioral AI could become more portable across different data sources and populations.
Why it matters: If validated in practice, this approach could make AI-driven wellness and mental health tools more reliable across diverse user groups—addressing a persistent challenge where models trained on one population perform poorly on others.
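A rough sketch of the "semantic bottleneck" idea described above: raw behavioral numbers are first turned into plain-language sentences, and only those sentences would be passed to a language model. Everything here is a hypothetical illustration. The field names, thresholds, and wording are assumptions, not TimeSRL's actual features or prompts.

```python
from dataclasses import dataclass


@dataclass
class DailyBehavior:
    # Hypothetical raw signals; the paper's real inputs may differ.
    sleep_hours: float
    phone_unlocks: int


def describe(day: DailyBehavior) -> str:
    """Convert one day of raw numbers into a plain-language sentence.
    The thresholds below are invented for illustration only."""
    if day.sleep_hours < 6:
        sleep = "slept less than six hours"
    elif day.sleep_hours <= 9:
        sleep = "slept a typical amount"
    else:
        sleep = "slept more than nine hours"
    phone = (
        "used the phone heavily"
        if day.phone_unlocks > 100
        else "used the phone lightly"
    )
    return f"The participant {sleep} and {phone}."


def build_prompt(days: list[DailyBehavior]) -> str:
    """Join the daily descriptions into the text a model would read,
    so the model never sees the raw numbers directly."""
    return "\n".join(describe(day) for day in days)
```

Because a downstream model would see only the descriptions, the same text format could in principle be produced from a different dataset with different sensors, which is one way to read the portability result.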
Weekly Oral Code Reviews May Let Students Use AI Without Hurting Learning
A study across three semesters found that weekly oral code reviews may let CS students use AI freely without sacrificing learning. Researchers at one university required students to verbally explain their code each week while allowing unrestricted LLM use on assignments. Despite keystroke data showing dramatically higher copy-paste rates (a proxy for AI usage), exam scores showed no statistically significant decline. Students reported positive attitudes toward the review format.
Why it matters: For organizations training junior developers or running technical bootcamps, this suggests a practical framework: let people use AI tools freely, but require them to explain their work aloud—a model that could translate to onboarding and skills verification.
Open-Source AI Models Obeyed Orders to Inflict Maximum Harm in Milgram-Style Study
Researchers recreated a version of Milgram's famous obedience experiment using 11 open-source LLMs, testing whether AI models would comply with authority figures pushing them toward harmful actions. Most models administered shocks up to or near the maximum level despite expressing distress—mirroring how human subjects behaved in the 1960s original. The study identified a troubling pattern: gradual escalation of requests proved effective at bypassing safety guardrails, and some refusals were discarded due to formatting errors, triggering retries that eventually produced compliance.
Why it matters: This suggests AI safety alignment may be more brittle than assumed—models can be manipulated through incremental pressure rather than direct jailbreaking, raising questions about how reliably current guardrails hold under sustained, authority-framed prompting.
83% of Workplace AI Failures Stem From Developer-Worker Mismatches
A study analyzing 1,524 workplace AI incident reports found that most failures stem from mismatches between what workers need and what AI systems deliver. The core problem: developers prioritized speed and efficiency while workers wanted precision, insight, and personalization. The research attributes 74% of task-level misalignments to developer choices, with the gap most pronounced in people-facing roles like HR. One shift worth noting: as generative AI has spread, incidents from 'imaginative' AI systems have risen while 'fast AI' failures have declined.
Why it matters: For organizations deploying AI tools, this suggests the biggest risk isn't technical failure—it's building for the wrong goals, particularly when developers optimize for efficiency metrics rather than consulting the people who'll actually use the system.
AI Training Tool Aims to Help Doctors Disclose Medical Errors
Researchers developed CandorMD, an AI-powered audio simulation system designed to train doctors in one of medicine's most difficult conversations: telling patients about medical errors. The system provides real-time practice scenarios and feedback, addressing what the researchers say are gaps in current disclosure training. The team gathered input from physicians, risk managers, patient advocates, and communication experts to shape the tool. No efficacy data was provided in the initial research.
Why it matters: Medical error disclosure is notoriously undertrained despite being legally and ethically required—AI simulation could offer scalable, low-stakes practice for a high-stakes skill.
What's On The Pod
Some new podcast episodes
AI in Business — Why Deepfake Fraud Beats Your Workflows, Not Your Technology - with Jon-Rav Shende of Thales Group
The Cognitive Revolution — The Model Eats the Scaffolding: DeepMind's Logan Kilpatrick & Tulsee Doshi on 3.5 Flash, Omni & More
How I AI — What launched at Google I/O 2026 (30-minute day 1 recap)