LLM Daily: September 18, 2025
🔍 LLM DAILY
Your Daily Briefing on Large Language Models
HIGHLIGHTS
• AI chip startup Groq has closed a funding round at a $6.9B valuation, positioning it as a significant challenger to Nvidia in AI inference hardware.
• The "Compute as Teacher" (CaT) research breakthrough enables LLMs to generate their own learning signals without human-created references by synthesizing guidance from multiple parallel inference rollouts.
• Irregular has raised $80M to focus on securing frontier AI models, highlighting the growing importance of AI safety as models become more powerful.
• China has reportedly banned its largest tech companies from acquiring Nvidia chips, accelerating domestic AI hardware development efforts to reduce dependency on US technology.
• LM Studio, a popular desktop application for running local LLMs, will host an AMA session on Reddit, offering insights into the evolving landscape of local AI deployment.
BUSINESS
Funding & Investment
Groq Raises New Round at $6.9B Valuation
- Nvidia AI chip challenger Groq raised a new funding round that exceeded earlier rumors, reaching a $6.9 billion valuation
- TechCrunch (2025-09-17)
Irregular Secures $80M for AI Security
- Irregular, an AI security startup focused on securing frontier AI models, raised $80 million at a $450 million valuation
- Sequoia Capital announced their partnership with Irregular in a post titled "Partnering with Irregular: Ahead of the Curve"
- TechCrunch (2025-09-17)
- Sequoia Capital (2025-09-17)
CodeRabbit Raises $60M for AI Code Review
- CodeRabbit, an AI code review startup, raised $60M led by Scale Venture Partners, valuing the two-year-old company at $550M
- Total funding now stands at $88 million
- TechCrunch (2025-09-16)
Keplar Secures $3.2M Seed Round for Voice AI
- Keplar, a voice AI startup aiming to replace traditional market research, raised a $3.2 million seed round led by Kleiner Perkins
- The company is two years old and focuses on market research applications
- TechCrunch (2025-09-17)
Company Updates
Meta Unveils New Smart Glasses with AI Features
- Meta showcased its newest AI-powered smart glasses at Meta Connect 2025
- The glasses feature a display and are controlled by a wristband
- TechCrunch (2025-09-17)
Macroscope Launches AI Tool for Code Management
- Former Twitter head of product Kayvon Beykpour announced the launch of Macroscope, an AI system for developers
- The tool summarizes codebase updates and helps catch bugs, among other features
- TechCrunch (2025-09-17)
Google's Gemini Tops App Store with New AI Image Model
- Google's Gemini has reached the top of the App Store following the release of its new AI image model, Nano Banana
- The app gained 12.6 million downloads in September so far, up from 8.7 million in August
- TechCrunch (2025-09-16)
Market Analysis
China Bans Tech Companies from Buying Nvidia AI Chips
- China has officially banned its tech companies from purchasing Nvidia's AI chips, escalating beyond the informal discouragement issued in August
- This move could significantly impact the global AI chip market and accelerate China's domestic chip development
- TechCrunch (2025-09-17)
Silicon Valley Invests in AI Training Environments
- A wave of startups is creating reinforcement learning (RL) environments to help AI labs train agents
- This trend is gaining significant investment attention and could be "Silicon Valley's next craze"
- TechCrunch (2025-09-16)
PRODUCTS
LM Studio Team to Host AMA on r/LocalLLaMA
Source: Reddit (2025-09-17)
The team behind LM Studio, a popular desktop application for running local large language models, will be hosting an "Ask Me Anything" session on the r/LocalLLaMA subreddit. The AMA is scheduled for Thursday from 11 AM to 1 PM PDT. This presents an opportunity for users to directly engage with the developers of one of the leading tools in the local LLM ecosystem, potentially gaining insights into upcoming features, technical capabilities, and the team's vision for local AI deployment.
Chinese AI Hardware Development Accelerates Amid Nvidia Restrictions
Source: Reddit Discussion (2025-09-17)
In a significant market shift, China has reportedly banned its largest tech companies from acquiring Nvidia AI chips, claiming that domestically developed AI processors now match the capabilities of Nvidia's H20 and RTX Pro 6000D models. This development has sparked discussions in the AI community about potential divergence in AI hardware ecosystems and what it might mean for model compatibility. Industry observers note this could accelerate the development of alternative AI acceleration platforms that compete with Nvidia's dominant CUDA ecosystem, potentially impacting how future AI models are trained and deployed globally.
TECHNOLOGY
Open Source Projects
langchain-ai/langchain - 115,651 ⭐
LangChain provides a framework for building context-aware reasoning applications with large language models. Recent updates include improved HITL (Human-In-The-Loop) patterns and documentation enhancements for JSON schema functionality, making it easier for developers to implement structured outputs in their GenAI applications.
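As a rough illustration of what structured-output handling involves (a hypothetical stdlib helper, not LangChain's actual API), an application can parse a model's raw reply as JSON and check it against an expected schema before using it:

```python
import json

def parse_structured_output(raw: str, required: dict[str, type]) -> dict:
    """Minimal sketch of structured-output validation: parse the model's
    raw text as JSON and verify each required field has the expected type."""
    obj = json.loads(raw)
    for field, typ in required.items():
        if not isinstance(obj.get(field), typ):
            raise ValueError(f"field {field!r} missing or not {typ.__name__}")
    return obj

reply = '{"title": "Groq raises round", "valuation_usd_b": 6.9}'
record = parse_structured_output(reply, {"title": str, "valuation_usd_b": float})
print(record["valuation_usd_b"])  # -> 6.9
```

Frameworks like LangChain wrap this pattern with schema generation and retry-on-failure logic, but the core contract is the same: the model's output must round-trip through a declared schema.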
rasbt/LLMs-from-scratch - 71,884 ⭐
This educational repository offers a step-by-step guide to implementing a ChatGPT-like LLM in PyTorch from scratch. Recent updates include a fix for the Qwen3Tokenizer to address generation mismatches with Hugging Face models, plus an optimization that makes the RoPE (Rotary Position Embedding) angle calculation more efficient.
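For context, the RoPE angle calculation follows the standard formulation with inverse frequencies theta_i = base^(-2i/d); a minimal sketch (the repository's exact optimization may differ) precomputes those frequencies once and reuses them across all positions:

```python
import math

def rope_angles(seq_len: int, head_dim: int, base: float = 10000.0):
    """Precompute RoPE rotation angles for every position.
    inv_freq is computed once (head_dim // 2 values) and shared across
    positions; each position's angle is simply pos * inv_freq[i]."""
    inv_freq = [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]
    return [[pos * f for f in inv_freq] for pos in range(seq_len)]

angles = rope_angles(seq_len=4, head_dim=8)
# angles[0] is all zeros: position 0 receives no rotation
```

Each angle is then applied as a 2D rotation to a pair of embedding dimensions, which is what gives RoPE its relative-position property.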
infiniflow/ragflow - 64,634 ⭐
RAGFlow is an open-source Retrieval-Augmented Generation engine that combines RAG with Agent capabilities to create an enhanced context layer for LLMs. Recent commits show active development with new features including KB document basic info support, CometAPI integration for LLMFactory, and expanded SQL tool documentation.
Models & Datasets
tencent/SRPO
A new text-to-image diffusion model from Tencent that has drawn significant attention (800 likes). Public details are sparse, but the model appears to build on the cited arXiv paper (2509.06942) and to target high-quality image generation.
Qwen/Qwen3-Next-80B-A3B-Instruct
An 80B-parameter instruction-tuned model from the Qwen3-Next family with over 300K downloads. The A3B designation indicates a mixture-of-experts design that activates roughly 3B parameters per token, keeping inference costs low while delivering strong performance across multiple domains in conversational applications.
Qwen/Qwen3-Next-80B-A3B-Thinking
A variant of the Qwen3-Next 80B model specifically optimized for reasoning and "thinking" tasks. With nearly 160K downloads, this model appears to be engineered to excel at tasks requiring deeper analytical capabilities and chain-of-thought reasoning.
google/vaultgemma-1b
A compact 1B parameter model from Google that incorporates differential privacy techniques (DP-SGD) for enhanced security and privacy guarantees. Based on the Gemma architecture, this model is notable for its privacy-preserving approach to language modeling, as indicated by its numerous privacy-related arxiv citations.
HuggingFaceFW/finepdfs
A multilingual dataset with over 57K downloads designed for text generation tasks. The extensive language tags suggest it contains PDF content in hundreds of languages, making it a comprehensive resource for training and fine-tuning models across diverse linguistic contexts.
Developer Tools & Spaces
Kwai-Kolors/Kolors-Virtual-Try-On
A highly popular Gradio space (9,642 likes) that enables virtual clothing try-on, allowing users to visualize how different garments would look on them without physical fitting. The application leverages advanced computer vision and generative AI techniques to create realistic try-on visualizations.
not-lain/background-removal
A practical tool with over 2,300 likes that automatically removes backgrounds from images. Built with Gradio, this space provides a simple interface for a common image processing task that's useful for e-commerce, design, and content creation workflows.
haggai/wan2-2-fp8da-aoti-faster
An optimized implementation of a generative model (likely Wan 2.2) that uses the FP8 data format and AOTI (PyTorch's AOTInductor ahead-of-time compilation) to speed up inference. This space demonstrates techniques for model optimization and deployment efficiency.
umint/searchgpt
A Docker-based application that pairs search with GPT-style generation, letting users query content in natural language. With 60 likes, it represents an interesting integration of retrieval and generative AI technologies.
InstantX/Qwen-Image-ControlNet-Inpainting
A specialized image editing tool that combines Qwen's image models with ControlNet for inpainting tasks, enabling precise image manipulation and restoration. This integration allows for controlled generation of missing or removed parts of images with high fidelity.
RESEARCH
Paper of the Day
Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision
Authors: Dulhan Jayalath, Shashwat Goel, Thomas Foster, Parag Jain, Suchin Gururangan, Cheng Zhang, Anirudh Goyal, Alan Schelten
Institutions: University of Toronto, Microsoft Research
Published: (2025-09-17)
This paper stands out for introducing a novel approach to generating learning signals in post-training scenarios where no ground truth exists. The authors' "Compute as Teacher" (CaT) method converts a model's own exploration at inference-time into reference-free supervision by synthesizing a single reference from multiple parallel rollouts.
CaT represents a significant advancement in self-improvement techniques for large language models, allowing models to continuously refine their outputs without human-created references. The authors demonstrate that this approach yields substantial improvements across tasks like reasoning, creative writing, and code generation, outperforming existing reference-free approaches while being more computationally efficient than alternatives like best-of-n sampling.
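The core loop can be sketched in miniature. The toy below stands in for CaT's synthesis step with a simple majority vote over rollout answers; note that the actual paper has the model itself compose a fresh reference from all rollouts, so treat this as an illustration of the reference-free signal, not the authors' method:

```python
from collections import Counter

def synthesize_reference(rollouts: list[str]) -> str:
    """Toy stand-in for CaT's synthesis step: treat the answer that the
    most parallel rollouts agree on as the synthesized reference."""
    counts = Counter(r.strip() for r in rollouts)
    answer, _ = counts.most_common(1)[0]
    return answer

def reward(candidate: str, rollouts: list[str]) -> float:
    """Reference-free learning signal: agreement with the synthesized
    reference, usable as a post-training reward without human labels."""
    return 1.0 if candidate.strip() == synthesize_reference(rollouts) else 0.0

rollouts = ["42", "42", "41", "42"]
print(synthesize_reference(rollouts))  # -> 42
print(reward("41", rollouts))          # -> 0.0
```

The key design point survives the simplification: the supervision signal is manufactured entirely from the model's own inference-time exploration, with no ground-truth reference required.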
Notable Research
Reasoning Efficiently Through Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework
Authors: Kerui Huang, Shuhan Liu, Xing Hu, et al.
Published: (2025-09-17)
This paper introduces an adaptive Chain-of-Thought compression technique that preserves reasoning accuracy while significantly reducing computational costs, particularly valuable for software engineering tasks requiring concise outputs.
CrowdAgent: Multi-Agent Managed Multi-Source Annotation System
Authors: Maosheng Qin, Renyu Zhu, Mingxuan Xia, et al.
Published: (2025-09-17)
The authors present a holistic annotation system that dynamically manages diverse annotation sources (LLMs, SLMs, and human experts) using a multi-agent framework, optimizing for quality-cost trade-offs in data annotation for NLP tasks.
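The quality-cost trade-off at the heart of such a system can be illustrated with a toy router (purely illustrative, not CrowdAgent's actual algorithm) that sends each task to the cheapest annotation source expected to meet its difficulty:

```python
def route_annotation(difficulty: float, sources: list[dict]) -> str:
    """Toy quality-cost router: pick the cheapest source whose estimated
    quality clears the task's difficulty; fall back to the best source
    when no one clears it."""
    eligible = [s for s in sources if s["quality"] >= difficulty]
    if not eligible:
        return max(sources, key=lambda s: s["quality"])["name"]
    return min(eligible, key=lambda s: s["cost"])["name"]

sources = [
    {"name": "SLM",   "quality": 0.60, "cost": 0.01},
    {"name": "LLM",   "quality": 0.80, "cost": 0.10},
    {"name": "human", "quality": 0.95, "cost": 1.00},
]
print(route_annotation(0.7, sources))  # -> LLM
```

A multi-agent system like CrowdAgent layers dynamic quality estimation and budget tracking on top of this basic routing decision.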
Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST
Authors: Monica Sekoyan, Nithin Rao Koluguri, Nune Tadevosyan, et al.
Published: (2025-09-17)
This paper introduces a fast, robust multilingual model for Automatic Speech Recognition and Speech-to-Text Translation supporting 25 languages, trained on 1.7M hours of data with a novel two-stage pre-training and fine-tuning process.
LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology
Authors: Renan Souza, Timothy Poteet, Brian Etz, et al.
Published: (2025-09-17)
The researchers propose a reference architecture for LLM-based agents that can analyze complex scientific workflow provenance data across Edge, Cloud, and HPC environments, offering natural language interfaces for provenance analysis.
LOOKING AHEAD
As we approach Q4 2025, the integration of neural-symbolic systems is poised to address current LLM limitations in reasoning and factuality. Several labs have demonstrated promising prototypes combining the pattern recognition strengths of neural networks with the logical precision of symbolic AI, suggesting commercial applications by early 2026.
Meanwhile, the regulatory landscape continues evolving rapidly. With the EU AI Act implementation phase ending in December and similar frameworks advancing in Asia-Pacific regions, we anticipate a significant shift toward global AI governance harmonization by mid-2026. Companies developing multi-modal foundation models should particularly monitor these developments, as specialized regulations for synthetic media capabilities appear increasingly likely before year-end.