LLM Daily: July 23, 2025

🔍 LLM DAILY
Your Daily Briefing on Large Language Models
July 23, 2025
HIGHLIGHTS
• Alibaba has released Qwen3-Coder-480B-A35B-Instruct, its most powerful open agentic code model to date, featuring 480B parameters (35B active) with native support for 256K context and potential scaling to 1M context through extrapolation.
• Former Anduril employees have secured $24 million in Series A funding for Rune Technologies to develop TyrOS, AI-enabled predictive software for military logistics that can operate without an internet connection.
• Researchers have introduced StackTrans, an architecture that augments standard Transformers with stack-like structures to better handle hierarchical input and formal languages; the authors report superior performance in code generation and mathematical reasoning.
• Microsoft's educational resource "ai-agents-for-beginners" has gained significant traction, offering 11 comprehensive lessons on building AI agents and demonstrating the growing interest in practical AI agent development.
• The open-source web crawler "crawl4ai" is rapidly gaining adoption (49,235 stars) as developers seek better ways to supply their applications with relevant web data in an LLM-friendly format.

BUSINESS
Funding & Investment
Anduril alums raise $24M Series A for military logistics platform (2025-07-21)

Rune Technologies has secured $24 million in Series A funding to develop TyrOS, AI-enabled predictive software for military logistics that can operate without an internet connection. The company, founded by former Anduril employees, aims to move military logistics planning beyond Excel spreadsheets with offline-capable AI tools. Source: TechCrunch
Sequoia Capital invests in Magentic for AI-driven supply chain optimization (2025-07-22)

Sequoia Capital announced a partnership with Magentic, a startup focused on bringing AI-driven cost savings to global supply chains. The investment highlights Sequoia's continued interest in AI applications for enterprise operations. Source: Sequoia Capital
M&A
Amazon acquires AI wearable startup Bee (2025-07-22)

Amazon has acquired Bee, an AI wearables startup that makes bracelets and an Apple Watch app designed to record users' daily lives and function as an AI assistant. This acquisition signals Amazon's continued expansion into the growing AI wearables market. Source: TechCrunch
Company Updates
OpenAI signs $30B annual deal with Oracle for data center services (2025-07-22)

OpenAI has agreed to pay Oracle $30 billion per year for data center services, confirming that OpenAI was the previously undisclosed customer behind Oracle's massive deal announced last month. This substantial investment highlights the enormous infrastructure requirements for leading AI companies. Source: TechCrunch
ChatGPT reaches 2.5 billion daily prompts (2025-07-21)

OpenAI's ChatGPT is now receiving 2.5 billion prompts daily from users worldwide, demonstrating the massive scale and continued growth of the platform's usage. Source: TechCrunch
Grok 4 drives revenue growth despite smaller user base (2025-07-21)

While xAI's Grok AI companions helped drive initial downloads, the company's latest Grok 4 model, sold through premium subscription tiers, is now its primary revenue generator. The new pricing strategy, together with the improved model capabilities, has significantly increased iOS revenue despite a comparatively small increase in subscribers. Source: TechCrunch
Intuit launches agentic AI for mid-market businesses (2025-07-22)

Intuit has introduced a new series of agentic AI experiences designed for mid-market businesses, claiming to save organizations 17 to 20 hours per month through automated workflows and AI agents. The solution integrates with Intuit's Enterprise Suite and QuickBooks to deliver ROI for mid-sized companies. Source: VentureBeat
Market Analysis
72% of US teens have used AI companions (2025-07-21)

A new study by Common Sense Media reveals that 72% of American teenagers have used AI companions, defined specifically as chatbots designed for personal conversations rather than homework help or voice assistants. This indicates significant penetration of AI companionship technology among younger demographics. Source: TechCrunch
Google DeepMind's Gemini wins gold at International Mathematical Olympiad (2025-07-21)

Google DeepMind's Gemini AI has achieved a historic milestone by winning a gold medal at the International Mathematical Olympiad, solving complex mathematical problems using natural language. This result demonstrates a significant advance in AI reasoning capabilities and marks a new level of human-comparable performance in a specialized domain. Source: VentureBeat
Cartken pivots from last-mile delivery to industrial robots (2025-07-20)

Robotics company Cartken has shifted its strategic focus from last-mile delivery to industrial robots due to increasing demand from industrial customers. This pivot highlights evolving market dynamics in the commercial robotics sector as AI-powered automation continues to find new applications. Source: TechCrunch

PRODUCTS
Qwen3-Coder-480B-A35B-Instruct Released

Company: Alibaba (established player)
Release Date: (2025-07-22)
Source: Reddit Announcement

Alibaba has released Qwen3-Coder-480B-A35B-Instruct, its most powerful open agentic code model to date. This is a Mixture-of-Experts (MoE) model with 480B total parameters, of which 35B are active per token, that natively supports 256K context and can scale up to 1M context through extrapolation. Alibaba claims top-tier performance among open models across multiple agentic coding benchmarks, including SWE-bench-Verified. Alongside the model, the company has also open-sourced a command-line tool called "Qwen Code" for agentic coding tasks.
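To put the quoted sizes in perspective, the short sketch below works out what share of the model is active per token and how far the context window is stretched. Only the 480B, 35B, 256K, and 1M figures come from the announcement; the function names are illustrative, and "1M" is assumed here to mean 1,024K tokens.

```python
# Figures quoted in the announcement (parameters in billions, context in
# thousands of tokens). Treating "1M" as 1,024K is an assumption.
TOTAL_PARAMS_B = 480
ACTIVE_PARAMS_B = 35
NATIVE_CONTEXT_K = 256
EXTENDED_CONTEXT_K = 1024


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the total parameters used for any single token."""
    return active_b / total_b


def extrapolation_factor(native_k: float, extended_k: float) -> float:
    """How many times longer the extended window is than the native one."""
    return extended_k / native_k


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    stretch = extrapolation_factor(NATIVE_CONTEXT_K, EXTENDED_CONTEXT_K)
    print(f"active share per token: {share:.1%}")   # prints 7.3%
    print(f"context stretch factor: {stretch:.1f}x")  # prints 4.0x
```

In other words, roughly one parameter in fourteen participates in each forward step, which is why an MoE model of this size can cost far less to run per token than a dense model with the same total parameter count.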
Flux Kontext Zoom Out LoRA

Company: Community developer (independent)
Release Date: (2025-07-22)
Source: Reddit Post | Civitai | Hugging Face

A new LoRA for Flux Kontext has been released that enables more reliable "zoom out" edits for images. After extensive experimentation, the developer arrived at a version that produces consistent zoom-out results, expanding the visible area around an image while keeping it coherent with the original content. The tool offers an alternative to traditional outpainting, giving artists and creators more flexibility when expanding their generated images.

TECHNOLOGY
Open Source Projects
ChatGPTNextWeb/NextChat
A lightweight and fast AI assistant with cross-platform support for Web, iOS, macOS, Android, Linux, and Windows. Built with TypeScript, this project has more than 84,870 stars and gained 430 new stars today, demonstrating strong community interest in versatile AI assistant interfaces.
unclecode/crawl4ai
An open-source web crawler and scraper specifically designed to be LLM-friendly, helping AI systems efficiently extract and process web content. With 49,235 stars and 426 new stars today, this Python-based tool is rapidly gaining adoption as developers seek better ways to feed relevant web data to their AI applications.
microsoft/ai-agents-for-beginners
A comprehensive educational resource from Microsoft featuring 11 lessons to help beginners get started with building AI agents. This Jupyter Notebook-based curriculum has attracted 31,780 stars and continues to grow with 338 new stars today, highlighting the increasing interest in AI agent development education.
Models & Datasets
moonshotai/Kimi-K2-Instruct
Moonshot AI's latest instruction-tuned model has quickly become popular with 1,707 likes and nearly 195,000 downloads. This conversational model is listed as compatible with both endpoints and AutoTrain, making it accessible for various deployment scenarios.
mistralai/Voxtral-Mini-3B-2507
Mistral AI's new 3B-parameter model, optimized for audio-text-to-text generation with multilingual support (English, French, German, Spanish, Italian, Portuguese, Dutch, and Hindi). With 406 likes and over 46,400 downloads, it's gaining traction as a compact option for speech-related applications.
Qwen/Qwen3-235B-A22B-Instruct-2507
Alibaba's latest instruction-tuned MoE (Mixture of Experts) model with 235B total parameters but a much smaller active parameter count (22B). Based on the architecture described in their recent arXiv paper (2505.09388), this model offers high capability with improved efficiency.
NousResearch/Hermes-3-Dataset
A substantial instruction dataset for training LLMs with 206 likes and over 3,000 downloads. The dataset contains between 100K and 1M examples in JSON format and is released under the Apache-2.0 license, making it valuable for researchers and developers working on conversational AI.
microsoft/rStar-Coder
Microsoft's new coding dataset with 127 likes and over 5,300 downloads. This extensive collection (1-10M samples) is designed for training code generation models and is associated with the research paper arXiv:2505.21297, representing a significant resource for coding assistance models.
Developer Tools & Infrastructure
snorkelai/agent-finance-reasoning
A specialized finance-focused question-answering dataset from Snorkel AI with 39 likes and over 2,300 downloads. Despite its smaller size (<1K examples), this dataset provides valuable domain-specific reasoning examples for training finance-oriented AI assistants.
galileo-ai/agent-leaderboard
A Gradio-based leaderboard for comparing and evaluating AI agent performance with 378 likes. This space provides a standardized way to assess different AI agents across various tasks, helping developers benchmark their agent implementations against others in the field.
open-llm-leaderboard/open_llm_leaderboard
The widely-used Open LLM Leaderboard with over 13,300 likes, providing standardized evaluation metrics for language models across code, math, and general language tasks. This Docker-based space has become an essential resource for tracking progress in open LLM development and benchmarking model performance.
Miragic-AI/Miragic-Virtual-Try-On
A Gradio-based virtual clothing try-on demo with 130 likes, allowing users to visualize clothing items on different body types. This space demonstrates practical applications of generative AI in the fashion retail industry.

RESEARCH
Paper of the Day
StackTrans: From Large Language Model to Large Pushdown Automata Model (2025-07-21)
Authors: Kechi Zhang, Ge Li, Jia Li, Huangzhao Zhang, Yihong Dong, Jia Li, Jingjing Xu, Zhi Jin
Institution(s): Multiple institutions including research teams from China
This paper is significant because it addresses a fundamental limitation of the Transformer architecture by introducing StackTrans, a novel model that enhances standard Transformers with stack-like structures to capture context-free languages. The work represents a major architectural advancement that could help LLMs better handle hierarchical structures and formal languages.
The researchers report that StackTrans outperforms traditional Transformer architectures on tasks requiring context-free language processing, including code generation, mathematical reasoning, and formal language recognition. By implementing a pushdown-automaton-like mechanism within the model, StackTrans shows improved capabilities for tracking nested structures and maintaining long-range dependencies, two areas where current LLMs are limited.
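For readers unfamiliar with pushdown automata, the sketch below shows the classical, non-neural idea that the paper builds on: a stack lets a recognizer track arbitrarily deep nesting, something a fixed amount of state cannot do. This is a textbook bracket matcher written for illustration, not the StackTrans architecture, and the function name is hypothetical.

```python
# Maps each closing bracket to the opener it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def is_balanced(text: str) -> bool:
    """Recognize well-nested brackets with an explicit stack.

    Characters that are not brackets are ignored, so the function can be
    run directly on code-like strings.
    """
    stack: list[str] = []
    for ch in text:
        if ch in OPENERS:
            stack.append(ch)  # push: remember the unmatched opener
        elif ch in PAIRS:
            # pop: the most recent opener must match this closer
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack  # accept only if every opener was closed


if __name__ == "__main__":
    print(is_balanced("f(a[0], {b: (c)})"))  # True
    print(is_balanced("f(a[0)]"))            # False
```

The push and pop steps are what a finite-state recognizer lacks; according to the summary above, StackTrans adds a comparable stack-like structure inside the Transformer itself.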
Notable Research
Spatial 3D-LLM: Exploring Spatial Awareness in 3D Vision-Language Models (2025-07-22)
Authors: Xiaoyan Wang, Zeju Li, Yifan Xu, et al.
This paper introduces Spatial 3D-LLM, a multimodal LLM specifically designed to enhance spatial reasoning in 3D environments by using novel scene representation methods that preserve spatial relationships between objects, showing significant improvements over existing 3D-LLMs on spatial reasoning tasks.
PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization (2025-07-22)
Authors: Han Jiang, Dongyao Zhu, Zhihua Wei, et al.
PICACO introduces a novel framework for aligning LLMs with human values through in-context learning that explicitly addresses value tensions and pluralism, optimizing the correlation between demonstrations to better handle conflicting value preferences without requiring fine-tuning.
LLM world models are mental: Output layer evidence of brittle world model use in LLM mechanical reasoning (2025-07-21)
Authors: Cole Robertson, Philip Wolff
This research adapts cognitive science methodologies to investigate whether LLMs construct internal world models for mechanical reasoning, finding that while state-of-the-art models perform above chance on pulley system problems, their performance decreases dramatically when facing minor variations, suggesting brittle mental modeling capabilities.
Towards Enforcing Company Policy Adherence in Agentic Workflows (2025-07-22)
Authors: Naama Zwerdling, David Boaz, Ella Rabinovich, et al.
The researchers present a deterministic, transparent framework for enforcing business policy adherence in LLM-based agentic workflows, using a two-phase approach that compiles policy documents into verifiable guard code at build time and then enforces those policies at runtime.
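The two-phase idea can be sketched in a few lines. The example below is an illustrative toy, not the authors' framework: it assumes policies have already been reduced to simple declarative rules (the paper starts from policy documents), and every name in it (`PolicyRule`, `compile_guards`, `guarded_call`, `issue_refund`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


class PolicyViolation(Exception):
    """Raised when a tool call would break a compiled policy."""


@dataclass(frozen=True)
class PolicyRule:
    tool: str         # name of the tool the rule applies to
    field: str        # argument to inspect
    max_value: float  # largest value allowed without escalation
    message: str      # explanation returned to the agent


Guard = Callable[[dict[str, Any]], None]


def compile_guards(rules: list[PolicyRule]) -> dict[str, list[Guard]]:
    """Build-time phase: turn each rule into a deterministic check."""
    guards: dict[str, list[Guard]] = {}
    for rule in rules:
        def guard(args: dict[str, Any], rule: PolicyRule = rule) -> None:
            if args.get(rule.field, 0) > rule.max_value:
                raise PolicyViolation(rule.message)
        guards.setdefault(rule.tool, []).append(guard)
    return guards


def guarded_call(guards: dict[str, list[Guard]], tool_name: str,
                 tool_fn: Callable[..., Any], args: dict[str, Any]) -> Any:
    """Runtime phase: run every guard before the tool is executed."""
    for guard in guards.get(tool_name, []):
        guard(args)
    return tool_fn(**args)


def issue_refund(amount: float) -> str:
    """Hypothetical tool an agent might invoke."""
    return f"refunded {amount:.2f}"


RULES = [PolicyRule("issue_refund", "amount", 500.0,
                    "refunds above 500 need human approval")]
GUARDS = compile_guards(RULES)
```

Calling `guarded_call(GUARDS, "issue_refund", issue_refund, {"amount": 120.0})` runs the tool, while an amount of 900.0 raises `PolicyViolation` before the tool is ever executed. Because the checks are ordinary code rather than model output, they behave the same way on every run and can be reviewed or tested like any other code.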

LOOKING AHEAD
As we move through Q3 2025, the AI landscape continues its rapid evolution. Watch for the emergence of "cognitive architectures" that integrate multiple specialized models, a shift beyond today's monolithic LLMs toward systems that more closely mimic human reasoning. These architectures promise significant improvements in complex decision-making and causal understanding.
Meanwhile, regulatory frameworks are finally catching up. The EU's AI Act 2.0 negotiations and China's forthcoming Algorithm Transparency Initiative signal a global push for standardized AI governance. Companies deploying multimodal AI at the edge for real-time applications will likely face the first major compliance hurdles by early 2026, potentially reshaping development priorities across the industry.
