AGI Agent


LLM Daily: September 26, 2025

🔍 LLM DAILY

Your Daily Briefing on Large Language Models

September 26, 2025

HIGHLIGHTS

• Juicebox has secured $30M in funding led by Sequoia Capital for its LLM-powered hiring platform, while Cohere reached a $7B valuation following new funding and a strategic partnership with AMD, reflecting continued strong investment in AI startups.

• A community developer has successfully trained a 960M parameter language model from scratch using the Llama 3 architecture for approximately $500 using Amazon credits, demonstrating the increasing accessibility of LLM development.

• A theoretical paper proves that Group Relative Policy Optimization (GRPO), a popular RL algorithm for training LLMs, induces a non-trivial process reward model under specific conditions, both revealing a connection between two seemingly distinct approaches and exposing a critical flaw in the GRPO objective.

• Hugging Face's Transformers library continues to dominate the open source ML ecosystem with over 150,000 GitHub stars, providing a unified API for both inference and training across text, vision, audio, and multimodal domains.


BUSINESS

Funding & Investment

Juicebox Raises $30M from Sequoia for LLM-Powered Hiring Platform (2025-09-25)
Recruiting startup Juicebox secured $30 million in funding led by Sequoia Capital to advance its LLM-powered hiring platform. Sequoia confirmed the investment in a companion blog post titled "Why We're Partnering with Juicebox: The Recruiting Platform Founders Are Obsessed With." The platform applies LLMs across the recruiting process. TechCrunch

Cohere Reaches $7B Valuation After New Funding and AMD Partnership (2025-09-24)
Enterprise AI company Cohere raised an additional $100 million and announced a strategic partnership with AMD, boosting its valuation to $7 billion. This financing round comes just a month after the company's previous raise, signaling strong investor confidence in Cohere's growth trajectory and technology. TechCrunch

Oracle Reportedly Planning $15B Corporate Bond Sale (2025-09-24)
Oracle is reportedly looking to raise $15 billion through a corporate bond sale. This move comes shortly after Oracle reportedly secured a massive $300 billion compute deal with OpenAI, suggesting significant capital needs as the company expands its AI infrastructure capabilities. TechCrunch

Company Updates

OpenAI Launches ChatGPT Pulse for Proactive Morning Briefs (2025-09-25)
OpenAI introduced ChatGPT Pulse, a new feature designed to proactively create morning briefs for users without requiring direct prompting. This release represents a strategic shift in OpenAI's product design philosophy, moving toward asynchronous AI assistance that works independently on users' behalf. TechCrunch

Microsoft Integrates Anthropic's AI into Copilot (2025-09-24)
Microsoft announced the integration of Anthropic's AI technology into its Copilot assistant, a notable development that adds an OpenAI competitor to Microsoft's AI portfolio. This move suggests a potential diversification strategy in Microsoft's AI partnerships, despite its significant investment in OpenAI. TechCrunch

Microsoft Cuts Cloud Services to Israeli Military Unit Over Surveillance Concerns (2025-09-25)
Microsoft has terminated cloud services to an elite Israeli military intelligence unit (Unit 8200) following reports that Azure cloud storage was being used to house surveillance data on Palestinians. The decision came after an investigation triggered by reporting in The Guardian, highlighting growing ethical considerations in AI and cloud infrastructure deployment. TechCrunch

Market Trends

Data Collection App Neon Rises to #2 on App Store, Selling User Data to AI Companies (2025-09-24)
Neon, a call recording app that has climbed to the second position on Apple's App Store, is gaining traction with its model of paying users to record their phone calls and selling this voice data to AI companies. The app's rapid rise highlights growing consumer willingness to trade personal data for compensation as AI firms seek more training data. TechCrunch


PRODUCTS

New Releases

LLM From Scratch by Community Developer

  • Source: Reddit Post by thebadslime
  • Developer: Individual developer (community project)
  • Date: (2025-09-25)
  • A community developer has trained a 960M parameter language model from scratch using the Llama 3 architecture. The model features 3:1 GQA (Grouped-Query Attention), Flash Attention 2, and sink tokens. The developer used Claude to write training scripts and trained on public domain data at a cost of approximately $500 using Amazon credits. The model is currently in the pre-training stage and not yet optimized for general use.
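For context, 3:1 GQA means each key/value head is shared by three query heads, shrinking the K/V projection matrices (and the KV cache) roughly threefold. A back-of-the-envelope sketch of the per-layer savings, using hypothetical dimensions chosen for illustration (the post does not disclose the model's exact hyperparameters):

```python
def attention_params(hidden_size: int, n_heads: int, n_kv_heads: int) -> int:
    """Per-layer attention parameter count (bias-free projections)
    for a Llama-style block with grouped-query attention."""
    head_dim = hidden_size // n_heads
    q_proj = hidden_size * n_heads * head_dim      # full set of query heads
    k_proj = hidden_size * n_kv_heads * head_dim   # shared key heads
    v_proj = hidden_size * n_kv_heads * head_dim   # shared value heads
    o_proj = n_heads * head_dim * hidden_size      # output projection
    return q_proj + k_proj + v_proj + o_proj

# Hypothetical dimensions: hidden=1536, 24 query heads, 8 KV heads (3:1 GQA)
mha = attention_params(1536, 24, 24)   # baseline multi-head attention
gqa = attention_params(1536, 24, 8)    # 3:1 grouped-query attention
print(f"MHA: {mha:,}  GQA: {gqa:,}  saved: {mha - gqa:,}")
```

The K/V projections shrink by exactly the 3:1 ratio; the query and output projections are unchanged, so the total attention parameter count drops by about a third in this sketch.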

WAN 2.5 Preview (Video Generation Model)

  • Source: Reddit Discussion
  • Developer: WAN Project
  • Date: (2025-09-25)
  • The WAN project has released WAN 2.5 Preview, an early version of its upcoming video generation model. The preview is intended to collect user feedback for fine-tuning before the full release. The final version will ship with open training and inference code, though whether the model weights will be released remains undecided. The model generates 10-second 1080p videos but requires significantly more VRAM than previous versions; final system requirements have not yet been announced.

Developer Tools

Training Data Representativeness Tool

  • Source: Reddit Post
  • Developer: Individual researcher/developer
  • Date: (2025-09-25)
  • A new Python-based guide and toolkit for evaluating training data representativeness and detecting dataset shift has been shared with the ML community. The tool implements two statistical methods: Population Stability Index (PSI) to measure distributional changes, and Cramer's V to assess the intensity of changes. This toolkit aims to help machine learning practitioners ensure their training data is representative of real-world conditions.
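As an illustration of the first method, PSI compares the binned distribution of a feature between a reference set and a new set; values below roughly 0.1 are conventionally read as stable and above 0.25 as a major shift. A minimal sketch over categorical values (not the shared toolkit's actual code; the function name and thresholds here are assumptions):

```python
import math
from collections import Counter

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two samples of a categorical
    feature. Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 major shift."""
    categories = set(expected) | set(actual)
    e_counts, a_counts = Counter(expected), Counter(actual)
    n_e, n_a = len(expected), len(actual)
    score = 0.0
    for cat in categories:
        # Clamp proportions to eps so a category absent from one sample
        # does not produce log(0).
        e = max(e_counts[cat] / n_e, eps)
        a = max(a_counts[cat] / n_a, eps)
        score += (a - e) * math.log(a / e)
    return score

reference = ["a"] * 50 + ["b"] * 50
drifted = ["a"] * 80 + ["b"] * 20
print(psi(reference, reference))  # identical distributions -> 0.0
print(psi(reference, drifted))    # large shift -> above the 0.25 threshold
```

Continuous features would first be discretized into quantile bins of the reference sample; the same formula then applies to the bin frequencies.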

Note: There were no new AI product launches reported on Product Hunt for this period.


TECHNOLOGY

Open Source Projects

huggingface/transformers

Transformers is the leading framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains. It provides a unified API for both inference and training of popular model architectures. With over 150,000 stars (+70 today) and 30,500+ forks, it remains the go-to library for implementing and working with transformer-based models.

openai/openai-cookbook

This repository provides practical examples and guides for using the OpenAI API effectively. With 68,000+ stars and frequent updates, it's a valuable resource for developers integrating OpenAI's models into their applications. Recent commits show active maintenance, including updates to match the latest OpenAI APIs and improvements to the RFT (Reinforcement Fine-Tuning) cookbook.

microsoft/ai-agents-for-beginners

Microsoft's comprehensive course consisting of 12 lessons designed to introduce beginners to building AI agents. With nearly 40,000 stars and significant daily growth (+103 today), this resource is gaining substantial traction in the developer community as an entry point into AI agent development.

Models & Datasets

Wan-AI/Wan2.2-Animate-14B

A diffusion model with 14B parameters specialized in animation generation. With 464 likes and over 20,000 downloads, it's gaining popularity for creating high-quality animated visuals. The model is available in ONNX format and released under the Apache 2.0 license.

ibm-granite/granite-docling-258M

IBM's compact 258M parameter model for document understanding that can handle text, code, formulas, charts, and tables. Based on IDEFICS3 architecture, it specializes in document parsing, OCR, and information extraction. With almost 700 likes and 42,000+ downloads, it demonstrates IBM's commitment to efficient document AI.

Qwen/Qwen3-Omni-30B-A3B-Instruct

Alibaba's multimodal model that enables any-to-any conversion including text-to-audio capabilities. Built on Qwen3's MoE (Mixture of Experts) architecture, it has accumulated over 400 likes and 20,000 downloads, establishing itself as a powerful option for developers needing versatile multimodal capabilities.

InternRobotics/OmniWorld

A comprehensive dataset of over 1M samples for robotics, text-to-video, image-to-video, and image-to-3D tasks. With 62 likes and 20,000+ downloads, this WebDataset-formatted resource is already gaining adoption for multimodal AI research and development.

openai/gdpval

OpenAI's recently released validation dataset containing multimodal content (audio, documents, images, text, and video). Though small in size (less than 1K samples), this dataset is significant as it likely represents OpenAI's benchmark for evaluating general-purpose models across modalities.

Developer Tools & Spaces

Wan-AI/Wan2.2-Animate

A Gradio-based interface for the Wan2.2 animation model, allowing users to generate animations through an intuitive UI. With 580 likes, it demonstrates how model creators are increasingly packaging their technologies with accessible interfaces.

XiaomiMiMo/mimo_audio_chat

Xiaomi's Docker-based demo for their audio chat technology, showcasing their capabilities in voice interaction systems. This space highlights the growing trend of major tech companies using Hugging Face to demonstrate their AI innovations.

yonigozlan/Transformers-Timeline

A visual timeline of transformer-based models, providing researchers and developers with historical context on the evolution of the transformer architecture. With 40 likes, it serves as both an educational resource and reference tool for understanding the rapid progression of transformer models.


RESEARCH

Paper of the Day

GRPO is Secretly a Process Reward Model (2025-09-25)
Authors: Michael Sullivan

This paper makes a significant theoretical contribution by proving that Group Relative Policy Optimization (GRPO), a popular RL algorithm for training LLMs, induces a non-trivial process reward model (PRM) under specific conditions. This insight matters because it both reveals a fundamental connection between two seemingly distinct approaches and exposes a critical flaw in the GRPO objective related to non-uniformly distributed process steps. The author provides not only a theoretical proof but also empirical validation that these conditions hold in real-world scenarios, and offers practical suggestions for improving both exploration and exploitation in RLHF.
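For readers new to GRPO, its central step scores each sampled completion against its own group's statistics rather than against a learned value function. A minimal sketch of that group-relative normalization (a simplification; published variants differ in details such as the standard-deviation estimator and clipping):

```python
import statistics

def grpo_advantages(group_rewards):
    """Group-relative advantages: normalize each completion's reward by
    the mean and standard deviation of its sampling group. With
    outcome-level rewards this replaces a learned critic; the paper
    argues this normalization implicitly induces a process reward model."""
    mu = statistics.mean(group_rewards)
    sigma = statistics.pstdev(group_rewards)
    if sigma == 0:                      # uniform group: no learning signal
        return [0.0] * len(group_rewards)
    return [(r - mu) / sigma for r in group_rewards]

# Four sampled answers to one prompt, binary correctness rewards
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))   # [1.0, -1.0, -1.0, 1.0]
```

Because every completion in a group is compared only to its peers, a group that is all-correct or all-wrong yields zero advantage everywhere, which is one reason the per-group distribution of rewards matters to the objective.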

Notable Research

RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards (2025-09-25)
Authors: Zhilin Wang, Jiaqi Zeng, Olivier Delalleau, et al.
Introduces Reinforcement Learning with Binary Flexible Feedback (RLBFF), a novel approach that bridges the gap between traditional RLHF and RLVR by incorporating explicit criteria into human feedback, improving interpretability while maintaining flexibility.

Tree Search for LLM Agent Reinforcement Learning (2025-09-25)
Authors: Yuxiang Ji, Ziyu Ma, Yong Wang, et al.
Proposes Tree-GRPO, a grouped agent RL method that addresses sparse supervision in long-term agent tasks through tree search, where each node represents a complete interaction step, significantly improving decision-making in complex environments.

Explaining Fine Tuned LLMs via Counterfactuals: A Knowledge Graph Driven Framework (2025-09-25)
Authors: Yucheng Wang, Ziyang Chen, Md Faisal Kabir
Presents a novel framework that explains how fine-tuning mechanisms like LoRA alter an LLM's structural reasoning using counterfactuals grounded in domain-specific knowledge graphs, with a specific application in the biomedical domain.

SGMem: Sentence Graph Memory for Long-Term Conversational Agents (2025-09-25)
Authors: Yaxiong Wu, Yongyue Zhang, Sheng Liang, Yong Liu
Introduces a sentence-level graph memory architecture that enhances LLM-based agents' ability to retain and utilize information from long conversations, addressing limitations in current context window approaches for extended interactions.

ToMPO: Training LLM Strategic Decision Making from a Multi-Agent Perspective (2025-09-25)
Authors: Yiwen Zhang, Ziang Chen, Fanqi Kong, et al.
Presents a novel framework for training LLMs in strategic decision-making by incorporating theory of mind and multi-agent perspectives, enabling models to better understand and predict other agents' intentions and behaviors.


LOOKING AHEAD

As we close Q3 2025, the convergence of multimodal LLMs with embodied AI stands out as the defining trend heading into year-end. The recent integration of real-time sensory feedback loops in commercial models has dramatically improved contextual reasoning in physical environments, suggesting we'll see the first truly adaptive home robots by Q1 2026.

Meanwhile, the regulatory landscape continues shifting rapidly. With the EU's AI Harmony Act implementation deadline approaching in February and similar frameworks advancing in the US and Asia, we're entering a period of global standardization. Companies developing foundation models are advised to closely monitor these converging compliance requirements as they finalize their 2026 development roadmaps.
