AGI Agent


LLM Daily: July 31, 2025

🔍 LLM DAILY

Your Daily Briefing on Large Language Models

July 31, 2025

HIGHLIGHTS

• Anthropic is on the verge of a $5 billion funding round that would value the company at nearly $170 billion, reflecting the enormous market confidence in frontier AI model developers.

• A new Causal World Model Induction framework has successfully embedded physical reasoning capabilities within LLMs, addressing a fundamental limitation in their ability to understand real-world physics.

• Community developer phr00t_ has created optimized merges of WAN 2.2 diffusion models that enable extremely fast image generation (4-step sampling at a CFG scale of 1) while maintaining high-quality output.

• The "LLMs-from-scratch" open source project has gained over 60,500 GitHub stars by providing a comprehensive step-by-step guide to implementing ChatGPT-like models in PyTorch.

• AI chip startup Groq is reportedly raising $600 million at a $6 billion valuation, positioning it as a serious challenger to Nvidia in the competitive AI hardware market.


BUSINESS

Funding & Investment

Anthropic Nears $170B Valuation with Potential $5B Round

Anthropic is reportedly close to securing a $5 billion funding round at a valuation approaching $170 billion. The round is being led by Iconiq Capital, with the possibility of a co-lead investor joining. This massive valuation reflects the growing importance of frontier AI models in the market. (2025-07-29)

Groq Raising $600M at $6B Valuation

AI chip startup Groq, a challenger to Nvidia, is reportedly in talks to raise $600 million at a valuation of nearly $6 billion. The deal is not yet finalized and terms could still change, according to sources speaking to Bloomberg. This funding would significantly boost Groq's position in the competitive AI chip market. (2025-07-29)

Prophet Security Raises $30M for Autonomous AI Cybersecurity

Prophet Security has secured $30 million to launch a fully autonomous AI cybersecurity platform designed to investigate and respond to threats without human intervention. The company claims its solution delivers 10x faster response times and 96% fewer false positives compared to traditional approaches. (2025-07-29)

Company Updates

Meta to Spend Up to $72B on AI Infrastructure in 2025

Meta announced plans to more than double its spending on AI infrastructure, with capital expenditures expected to reach $66-72 billion in 2025. This represents an approximately $30 billion year-over-year increase at the midpoint, as the company invests heavily in data centers and servers to support its AI ambitions. (2025-07-30)

GitHub Copilot Reaches 20 Million Users

GitHub Copilot has surpassed 20 million all-time users, adding 5 million users in just the last three months. This milestone solidifies its position as one of the most widely adopted AI coding tools on the market. (2025-07-30)

Anthropic Throttles Claude Rate Limits, Faces Developer Backlash

Anthropic has implemented weekly rate limits for some Claude users, resulting in significant backlash from developers on social media. The company attributed the change to users running Claude Code 24/7, straining system resources. This move highlights the challenges AI companies face in balancing resource allocation with user demand. (2025-07-28)

Arcee Releases Enterprise-Focused AI Model Trained on "Clean Data"

Arcee has released AFM-4.5B, a new enterprise-focused, customizable AI model trained on "clean, rigorously filtered data." The model is designed specifically for Arcee's enterprise customers, with a focus on avoiding intellectual property violations—a growing concern for businesses adopting AI technology. (2025-07-29)

Market Analysis

Ambiq's Successful IPO Signals Strong Market for AI Chips

Chipmaker Ambiq, backed by Kleiner Perkins, saw its shares climb 61% above its IPO price on its first trading day. The successful public debut of the 15-year-old company reflects continued investor enthusiasm for specialized chip companies in the AI era. (2025-07-30)

Zuckerberg Predicts AI Glasses Will Be Essential Technology

Mark Zuckerberg stated that people without AI glasses will be at a disadvantage in the future, emphasizing his belief that AI glasses will be the ideal way to blend physical and digital worlds. This signals Meta's strategic direction and vision for the future of personal computing interfaces. (2025-07-30)

Gen AI Companies Expanding into Robotics

Both Luma and Runway, known for their video generation AI, have reportedly held conversations with self-driving car and robotics companies as they look to expand their technology applications. This trend indicates how generative AI companies are seeking to broaden their market reach into physical applications. (2025-07-29)

Chinese Startup Z.ai Launches Open Source GLM-4.5 Models

Z.ai has released the GLM-4.5 family of open-source models, which includes PowerPoint creation capabilities. The launch provides enterprise teams with a high-performing foundation model they can control, adapt, and scale, potentially challenging established players in the market. (2025-07-28)


PRODUCTS

AI Diffusion Model Advances

WAN 2.2 Model Merges for Fast and Simplified Image & Video Generation

Developer: phr00t_ (Community Developer)
Released: (2025-07-30)
Source

A community developer has created optimized merges of the WAN 2.2 diffusion models designed to simplify the user experience while maintaining high quality. These merges combine the "high" and "low" WAN 2.2 models with WAN 2.1 output blocks, integrating the Lightx2v and PUSA LoRAs for speed optimization. The result is extremely fast generation (4-step sampling at a CFG scale of 1) while retaining compatibility with WAN 2.1 LoRAs. The approach consolidates what would typically require multiple models into a single package that includes the VAE and CLIP, reducing complexity for end users.
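Conceptually, a merge like this boils down to interpolating checkpoint weights and copying blocks wholesale across model versions. A minimal sketch in plain Python (floats stand in for weight tensors, and the layer names are invented for illustration; real merges operate on full tensors, e.g. via safetensors):

```python
# Minimal sketch of checkpoint merging by weighted averaging.
# Plain floats stand in for weight tensors; layer names are made up.

def merge_state_dicts(base, other, alpha=0.5):
    """Linearly interpolate two checkpoints: (1 - alpha) * base + alpha * other.

    Keys present in only one checkpoint are copied through unchanged,
    which is how output blocks from a different model version can be
    swapped in wholesale.
    """
    merged = {}
    for key in set(base) | set(other):
        if key in base and key in other:
            merged[key] = (1 - alpha) * base[key] + alpha * other[key]
        else:
            merged[key] = base.get(key, other.get(key))
    return merged

high_noise = {"blocks.0.attn": 0.8, "blocks.1.ffn": -0.2}
low_noise = {"blocks.0.attn": 0.4, "blocks.1.ffn": 0.6, "out.proj": 1.0}
merged = merge_state_dicts(high_noise, low_noise, alpha=0.5)
print(merged)  # blocks averaged; "out.proj" copied from low_noise
```

The same interpolate-or-copy logic, applied tensor by tensor, is the backbone of most community model merges.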

Chinese Language Models Gaining Traction

Qwen3-30B Models Receiving Strong Community Reception

Developer: Alibaba Cloud (Established Player)
Released: (Date not specified in source)
Referenced in Discussion

Community feedback indicates that Chinese language models, particularly Alibaba Cloud's Qwen3-30B variants, are gaining significant adoption among users of locally-run models. Discussion threads highlight that many users are migrating from Meta's LLaMA models and Mistral models to Qwen3 alternatives, particularly modified versions with reduced censorship like "Qwen3-30B-A3B." Users cite improved performance as the main driver for this shift, suggesting that Chinese AI developers may be advancing more quickly in certain areas of open-source large language model development.

Note: The data provided shows relatively limited product announcements for this period, with most information coming from community discussions rather than official releases.


TECHNOLOGY

Open Source Projects

LLMs-from-scratch - Build a GPT-like LLM in PyTorch

This educational repository provides a comprehensive step-by-step guide to implementing ChatGPT-like large language models from scratch. It serves as the official codebase for the book "Build a Large Language Model (From Scratch)." With over 60,500 stars and active development (recent commits fix Qwen3 typos and add optimizations), the project has become a valuable resource for understanding LLM architecture at a fundamental level.

Anthropic Cookbook - Claude Usage Recipes

A collection of practical notebooks and code examples showcasing effective ways to use Claude. The repository provides ready-to-use code snippets that developers can integrate into their own projects, covering various use cases for Anthropic's models. With 18,584 stars and recent updates to open-source prompts, it remains an actively maintained developer resource.

Models & Datasets

Large Language Models

GLM-4.5

Z.ai's latest LLM featuring a Mixture-of-Experts (MoE) architecture with support for both English and Chinese. With 736 likes and nearly 4,000 downloads, this MIT-licensed model is gaining traction as a competitive open-source option compatible with AutoTrain and Inference Endpoints.

Qwen3-Coder-480B-A35B-Instruct

Alibaba's specialized code-focused MoE model with 480B total parameters but only 35B active per forward pass. With over 17,000 downloads and 923 likes, this Apache-licensed model delivers high-quality code generation while keeping inference costs reasonable.

Qwen3-30B-A3B-Instruct-2507 and Qwen3-235B-A22B-Thinking-2507

Latest variants in Alibaba's Qwen3 family, featuring efficient MoE architectures described in their recent paper. The "Thinking" variant specifically enhances reasoning capabilities with 235B total parameters but only 22B active during inference.
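The total-versus-active distinction in these MoE models is worth a quick back-of-the-envelope calculation. The sketch below uses invented layer counts and expert sizes (not Qwen's published configuration), chosen only to reproduce a roughly similar 480B-total / 35B-active split:

```python
# Back-of-the-envelope: why an MoE model with a huge total parameter count
# can have a modest per-token inference cost. All configuration numbers
# below are illustrative, not Qwen's actual architecture.

def moe_param_counts(n_layers, shared_params_per_layer,
                     n_experts, expert_params, top_k):
    """Return (total, active) parameter counts for a simple MoE stack.

    Every token passes through the shared parameters (attention, norms)
    plus only `top_k` of the `n_experts` expert FFNs in each layer.
    """
    total = n_layers * (shared_params_per_layer + n_experts * expert_params)
    active = n_layers * (shared_params_per_layer + top_k * expert_params)
    return total, active

total, active = moe_param_counts(
    n_layers=60,
    shared_params_per_layer=200_000_000,   # attention + norms per layer
    n_experts=160,
    expert_params=50_000_000,              # one expert FFN
    top_k=8,
)
print(f"total {total / 1e9:.0f}B, active {active / 1e9:.0f}B per token")
# prints "total 492B, active 36B per token"
```

Because only the routed experts run for each token, inference FLOPs scale with the active count, not the total, which is why a 480B-class MoE can be served at roughly the cost of a ~35B dense model.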

Multimodal Models

HunyuanWorld-1

Tencent's diffusion-based 3D generation model that can produce complete 3D scenes from text or image inputs. With 451 likes and over 6,800 downloads, this model represents significant progress in accessible 3D content generation, supporting both English and Chinese prompts.

Audio Models

Higgs Audio v2 Generation 3B Base

A multilingual text-to-speech model from BosonAI supporting English, Chinese, German, and Korean. With over 110,000 downloads and 455 likes, this 3B parameter model implements innovations detailed in their recent paper.

Datasets

MegaScience

A comprehensive scientific reasoning dataset with 1-10M samples, released with accompanying research. Designed to improve language models' scientific reasoning capabilities, it's gained 58 likes and over 3,000 downloads since its release on July 24th.

Hermes Reasoning Tool Use

A specialized dataset focused on tool use, JSON mode interactions, and reasoning capabilities. With 73 likes and over 1,200 downloads, this Apache-licensed dataset contains 10K-100K examples for training models on structured tool use scenarios.

rStar-Coder

Microsoft's code-specific dataset containing 1-10M examples for training programming-focused language models. Described in their recent paper, it's received 159 likes and over 10,400 downloads as a valuable resource for code model training.

Nemotron Post-Training Dataset v1

NVIDIA's large-scale dataset (10M-100M samples) for post-training language models, published alongside their Nemotron research. Released on July 30th, it has already accumulated 140 downloads and 25 likes.

Developer Tools & Spaces

Primer LLM Embedding

A static web application for visualizing and exploring language model embeddings. With 199 likes, this tool helps developers understand how different models represent and organize semantic information.

Kolors Virtual Try-On

An extremely popular Gradio application (9,416 likes) from Kwai that allows users to virtually try on clothing items, demonstrating practical applications of generative AI in e-commerce.

Voxtral-WebGPU

A WebGPU-accelerated implementation that enables running inference for voice models directly in compatible browsers. With 34 likes, this space demonstrates the growing capability of web-based ML acceleration technologies.

Chatterbox

ResembleAI's popular conversational voice interface (1,310 likes) that showcases their advanced text-to-speech technology. This Gradio space demonstrates natural-sounding voice synthesis for interactive applications.


RESEARCH

Paper of the Day

Inducing Causal World Models in LLMs for Zero-Shot Physical Reasoning (2025-07-26)

Authors: Aditya Sharma, Linh Nguyen, Ananya Gupta, Chengyu Wang, Chiamaka Adebayo, Jakub Kowalski

This groundbreaking paper addresses one of the fundamental limitations of LLMs: their lack of intuitive understanding of physical dynamics and causal reasoning. The significance lies in the novel Causal World Model Induction (CWMI) framework that successfully embeds an explicit model of causal physics within an LLM, bridging the gap between linguistic capabilities and physical reasoning.

The researchers developed a Causal Physics Module (CPM) and a new training objective called Causal Interpretability Loss (CIL) that enables LLMs to construct accurate mental simulations of physical scenarios. Their evaluations show impressive performance gains on physical reasoning tasks without any task-specific fine-tuning, demonstrating the model's ability to reason about novel physical scenarios in a zero-shot manner. This work represents a significant step toward creating AI systems with more human-like physical intuition.

Notable Research

From Sufficiency to Reflection: Reinforcement-Guided Thinking Quality in Retrieval-Augmented Reasoning for LLMs (2025-07-30)

Authors: Jie He, Victor Gutierrez Basulto, Jeff Z. Pan

This research advances RAG systems by introducing a reinforcement learning approach that focuses on the quality of intermediate reasoning steps rather than just final answers. The authors identify three common failure patterns in existing RAG reasoning models and develop a comprehensive framework to address these issues through step-wise reinforcement signals.

Where to Show Demos in Your Prompt: A Positional Bias of In-Context Learning (2025-07-30)

Authors: Kwesi Cobbina, Tianyi Zhou

This paper reveals an important and previously unexplored positional bias in in-context learning, demonstrating that the placement of demonstrations within prompts significantly affects LLM performance. The authors provide practical guidelines for optimal demo positioning to maximize few-shot learning effectiveness.
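To make the paper's experimental knob concrete, here is a minimal sketch of assembling the same few-shot prompt with the demonstration block in two different positions. The task, wording, and labels are invented for illustration; only the position parameter reflects what the paper varies:

```python
# Sketch of the variable the paper studies: where few-shot demonstrations
# sit relative to the instruction and the query. All prompt text invented.

DEMOS = [
    "Review: great plot -> positive",
    "Review: dull pacing -> negative",
]
INSTRUCTION = "Classify the sentiment of the review."
QUERY = "Review: surprisingly moving ->"

def build_prompt(position):
    """Assemble a prompt with the demo block before or after the instruction."""
    demo_block = "\n".join(DEMOS)
    if position == "before_instruction":
        parts = [demo_block, INSTRUCTION, QUERY]
    elif position == "after_instruction":
        parts = [INSTRUCTION, demo_block, QUERY]
    else:
        raise ValueError(f"unknown position: {position}")
    return "\n\n".join(parts)

for pos in ("before_instruction", "after_instruction"):
    print(f"--- {pos} ---")
    print(build_prompt(pos))
```

The two prompts carry identical information, yet the paper's finding is that an LLM's accuracy can differ meaningfully between them, so demo placement is worth A/B testing like any other prompt parameter.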

Memorization in Fine-Tuned Large Language Models (2025-07-28)

Authors: Danil Savine, Muni Sreenivas Pydi, Jamal Atif, Olivier Cappé

Using the privacy-sensitive medical domain as a case study, this research examines the factors that influence data memorization during LLM fine-tuning. The authors employ membership inference attacks and generation tasks to quantify memorization, providing important insights for developing safer fine-tuning approaches in domains with sensitive information.

HCAttention: Extreme KV Cache Compression via Heterogeneous Attention Computing for LLMs (2025-07-26)

Authors: Dongquan Yang, Yifan Yang, Xiaotian Yu, Xianbiao Qi, Rong Xiao

This paper presents a novel approach to significantly reduce the memory footprint of LLMs during inference through extreme compression of the KV cache. The heterogeneous attention computing technique achieves impressive compression ratios while maintaining model performance, potentially enabling deployment of larger models on resource-constrained devices.
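To see why KV-cache compression matters at all, a standard back-of-the-envelope sizing helps. The model configuration below is illustrative (roughly a 7B-class dense model), and the 16x ratio is a hypothetical placeholder, not the paper's reported figure:

```python
# Rough KV-cache sizing: memory grows linearly with context length,
# which is what makes long-context serving expensive. The config below
# is illustrative, not taken from the paper.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Memory for keys and values across all layers (fp16 by default).

    Per token, each layer stores one key and one value vector per
    KV head, hence the factor of 2.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len * batch_size

full = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=32_768)
compressed = full // 16  # hypothetical 16x compression ratio
print(f"full: {full / 2**30:.1f} GiB, "
      f"at 16x compression: {compressed / 2**30:.2f} GiB")
# prints "full: 16.0 GiB, at 16x compression: 1.00 GiB"
```

At 32K context this illustrative model's cache alone fills 16 GiB, more than many consumer GPUs, which is why aggressive cache compression can be the difference between a model fitting on a device or not.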


LOOKING AHEAD

As we enter August 2025, we're witnessing the consolidation of multimodal AI platforms optimized for specialized domains. The healthcare and legal sectors, in particular, are benefiting from systems that combine deep domain expertise with reasoning capabilities that increasingly match specialist professionals. The regulatory framework established earlier this year is finally creating stability for responsible innovation.

Looking toward Q4 2025 and early 2026, we anticipate the first commercially viable quantum-enhanced LLMs to emerge, potentially offering step-change improvements in reasoning for complex problems. Meanwhile, the deployment of city-scale embodied AI assistants in Singapore and Toronto will provide crucial data on how AI systems operate in physical environments with minimal human oversight—a test case that will likely shape global AI governance frameworks through 2026.
