OpenAI and Google Race to the Bottom on Price While Meta's Kenya Workers See It All
1. OpenAI and Google Ship Budget Models Hours Apart
On March 3, OpenAI released GPT-5.3 Instant as ChatGPT's new default model, available to all users immediately. Hours later, Google launched Gemini 3.1 Flash-Lite, its cheapest Gemini 3 model to date.

2. Meta's Data Workers in Kenya See "Everything" Ray-Ban Glasses Record
Data annotators at Sama's Nairobi office review footage from Meta's Ray-Ban smart glasses. They label video clips so Meta's AI assistant can learn.

3. Microsoft Banned "Microslop" and Lost Control of Its Own Discord
The word "Microslop" got Microsoft's official Copilot Discord server locked down last week. Users coined the nickname to mock Copilot's output quality.
In Brief
- SWE-rebench V2 Delivers Multi-Language Task Collection for Training Software Engineering Agents
  Researchers released SWE-rebench V2, a large-scale benchmark with reproducible execution environments and test suites spanning multiple programming languages. Existing SWE-agent training datasets cover only a few high-resource ecosystems; V2 targets the data scarcity bottleneck that limits reinforcement learning for code agents.
- CHIMERA Generates Compact Synthetic Data to Bootstrap Open-Source Reasoning Models
  CHIMERA addresses three bottlenecks in open-source LLM reasoning training: no seed datasets with detailed chain-of-thought traces, limited scalability of human annotation, and poor reproducibility. The framework produces synthetic training data usable for both supervised fine-tuning and reinforcement learning.
- LLaDA-o Unifies Text Understanding and Image Generation in a Single Diffusion Model
  LLaDA-o pairs masked diffusion for text with continuous diffusion for images through a shared attention backbone, cutting redundant computation for fixed conditions. A data-centric length adaptation method handles variable-length multimodal outputs.
- CoVe Trains Multi-Turn Tool-Use Agents via Constraint-Guided Data Synthesis
  CoVe defines explicit task constraints that serve double duty: guiding synthetic training data generation and verifying correctness. The framework targets the gap between ambiguous real-world user requests and the deterministic API calls agents must execute.
- OmniLottie Generates Vector Animations from Text, Image, and Video Inputs
  OmniLottie converts multimodal instructions into Lottie-format vector animations by tokenizing the JSON structure into compact parameterized tokens. Raw Lottie files contain too much structural metadata for direct learning; the tokenization strips invariant formatting while preserving shape and motion data.
- Developer Ships Open-Source Voice Agent with Sub-500ms End-to-End Latency
  A developer published a full technical writeup on building a voice agent from scratch that achieves under 500 milliseconds of latency. The post covers the complete pipeline from speech recognition through response generation.
- OpenAutoNLU Automates Model Training for Text Classification and NER
  OpenAutoNLU selects training regimes automatically based on dataset characteristics, requiring zero manual configuration. The open-source library bundles data quality diagnostics, out-of-distribution detection, and LLM integration in a low-code API.
- Google DeepMind Publishes Prompt Guide for Project Genie World Generation
  Google DeepMind released four prompting techniques for its Project Genie model, which generates interactive 3D environments from text descriptions. The guide covers spatial layout, object placement, and style control.
- WorldStereo Maintains 3D Consistency in Camera-Guided Video Generation
  WorldStereo introduces geometric memory modules that preserve consistent 3D structure when reconstructing scenes from generated video. Current video diffusion models produce high visual quality but break geometric coherence across different camera paths.
- Spatial Reward Model Fixes Layout Errors in Text-to-Image Generation
  A new reward modeling method improves spatial understanding in image generation using a dataset of over 80,000 annotated examples. The approach reduces the repeated sampling attempts currently needed when prompts specify precise spatial relationships.
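
For readers curious how a constraint can "serve double duty" as the CoVe item above describes, here is a minimal sketch of the general idea. It is not CoVe's actual implementation: the `ToolCall` and `Constraint` classes, the `build_prompt` and `verify` helpers, and the `search_flights` / `book_flight` tool names are all hypothetical, invented for illustration. Each constraint carries a plain-language description that could seed a data-generation prompt and an executable check that could verify an agent's tool-call trace.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    """One API call made by an agent (hypothetical shape)."""
    name: str
    args: dict


@dataclass
class Constraint:
    """A task constraint with two faces (illustrative, not CoVe's format)."""
    description: str                           # text form, usable in a generation prompt
    check: Callable[[list[ToolCall]], bool]    # executable form, usable as a verifier


def build_prompt(constraints: list[Constraint]) -> str:
    """Turn constraints into instructions for a synthetic-data generator."""
    lines = ["Write a user request that satisfies all of the following:"]
    lines += [f"- {c.description}" for c in constraints]
    return "\n".join(lines)


def verify(trace: list[ToolCall], constraints: list[Constraint]) -> list[str]:
    """Return descriptions of violated constraints; an empty list means the trace passes."""
    return [c.description for c in constraints if not c.check(trace)]


# Hypothetical example task: book a single flight to Nairobi (NBO).
constraints = [
    Constraint(
        "book exactly one flight",
        lambda trace: sum(call.name == "book_flight" for call in trace) == 1,
    ),
    Constraint(
        "the destination is NBO",
        lambda trace: any(
            call.name == "book_flight" and call.args.get("dest") == "NBO"
            for call in trace
        ),
    ),
]

good = [ToolCall("search_flights", {"dest": "NBO"}), ToolCall("book_flight", {"dest": "NBO"})]
bad = [ToolCall("book_flight", {"dest": "LHR"})]

print(build_prompt(constraints))
print(verify(good, constraints))  # no violations
print(verify(bad, constraints))   # lists the destination constraint
```

Keeping both forms on one object means the generator and the verifier cannot drift apart, which is one plausible reason to structure constraints this way.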