AGI Agent

LLM Daily: June 26, 2025

🔍 LLM DAILY

Your Daily Briefing on Large Language Models

June 26, 2025

HIGHLIGHTS

• Abridge has secured a $300 million Series E funding round led by a16z, doubling its valuation to $5.3B in just 4 months and cementing its position as the leader in the AI medical scribe market, largely due to its early integration with Epic Systems.

• Google's Gemini team has released an open-source CLI tool for code with generous free limits, including a 1 million token context window and 1,000 requests per day, likely aimed at gathering valuable training data for future model iterations.

• Mercury, developed by Inception Labs, represents a breakthrough in LLM efficiency by using diffusion models to predict multiple tokens in parallel, with their Mercury Coder achieving up to 10x faster generation speeds than competitors.

• The open-source ecosystem continues to thrive: LangChain has reached 110,158 GitHub stars as it improves documentation for SurrealDB vector store integration and Ollama support, maintaining its position as the go-to orchestration layer for LLM applications.


BUSINESS

Funding & Investment

Abridge Doubles Valuation to $5.3B in Just 4 Months

AI medical scribe Abridge has secured a $300 million Series E funding round led by a16z, just months after closing its previous $250 million fundraise. The 7-year-old company has established itself as the leader in the increasingly competitive AI-powered medical scribe market, largely due to its early entry and integration with Epic Systems. (TechCrunch, 2025-06-24)

Sequoia Capital Backs Delphi

Sequoia Capital announced a new investment in Delphi, though specific funding amounts were not disclosed. The announcement titled "Partnering with Delphi: Meet Your Heroes" suggests a significant vote of confidence from the prominent venture capital firm. (Sequoia Capital, 2025-06-24)

Wispr Flow Raises $30M for AI-Powered Dictation

Wispr Flow has secured $30 million in funding from Menlo Ventures to further develop its AI-powered dictation application. The investment will help the company expand its capabilities in voice-to-text technology. (TechCrunch, 2025-06-24)

Perplexity Co-Founder Pledges $100M for AI Research

Andy Konwinski, co-founder of Databricks and Perplexity, has committed $100 million of his personal funds to establish a new institute for AI researchers. The fund has already backed Ion Stoica's new research lab, signaling a significant private investment in advancing AI research. (TechCrunch, 2025-06-23)

M&A

Rubrik Acquires Predibase to Enhance AI Agent Capabilities

Data security company Rubrik has acquired Predibase, a startup that helps companies train and fine-tune open source AI models. The acquisition aims to accelerate the adoption of AI agents among Rubrik's customers by enabling faster deployment of customized AI solutions. (TechCrunch, 2025-06-25)

Company Updates

Meta Wins Copyright Lawsuit Over AI Training

A federal judge has sided with Meta in a lawsuit brought by 13 book authors, including Sarah Silverman, who alleged the company illegally trained its AI models on their copyrighted works. Judge Vince Chhabria issued a summary judgment in favor of Meta, a significant legal victory for AI companies using copyrighted materials for training. (TechCrunch, 2025-06-25)

Salesforce Launches Agentforce 3 with Enhanced Monitoring Capabilities

Salesforce has introduced Agentforce 3, featuring AI agent observability and native Model Context Protocol (MCP) support. The platform provides real-time visibility and secure interoperability for enterprise AI deployments, allowing companies to better manage and monitor their AI workforce. (VentureBeat, 2025-06-23)

OpenAI and io's AI Device Development Revealed in Court Filings

Court documents have revealed that OpenAI and Jony Ive's design firm io are working together on AI hardware devices. The filings suggest that OpenAI is more than a year away from releasing its first hardware product, which may not be limited to the rumored in-ear form factor. (TechCrunch, 2025-06-23)

xAI's Grok May Soon Edit Spreadsheets

Leaked code suggests that xAI is developing an advanced file editor for its Grok AI assistant with spreadsheet support. This development signals the company's ambition to compete with OpenAI, Google, and Microsoft in the productivity tools space by embedding AI capabilities directly into document and spreadsheet editing. (TechCrunch, 2025-06-23)

Market Analysis

IBM Reports Enterprises Using Multiple AI Models Simultaneously

IBM has observed that real-world enterprise AI deployments increasingly involve multiple AI models running concurrently, creating new challenges in matching the right large language model (LLM) to specific use cases. This trend is forcing a fundamental shift in enterprise AI architecture toward more sophisticated model routing systems. (VentureBeat, 2025-06-25)
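To make the routing idea concrete, here is an illustrative-only Python sketch of a request router that picks a model from simple task signals; the model names and rules are hypothetical, and the routing systems IBM describes would typically rely on learned classifiers plus cost and latency data rather than keyword checks.

    # Hypothetical example only: route a request to one of several models based
    # on crude task signals. Model names and thresholds are invented for
    # illustration, not taken from IBM's report.
    def route(request: str) -> str:
        text = request.lower()
        if any(k in text for k in ("stack trace", "refactor", "unit test")):
            return "code-specialist-llm"      # code-heavy tasks
        if len(request) > 4000:
            return "long-context-llm"         # very long inputs
        return "general-purpose-llm"          # default

    print(route("Please refactor this function and add a unit test."))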

Creative Commons Launches Framework for Open AI Ecosystem

Creative Commons has introduced "CC signals," a framework designed to enable dataset holders to specify how their content can or cannot be used by machines, particularly for training AI models. This initiative aims to create a more transparent and ethically sound foundation for AI development. (TechCrunch, 2025-06-25)

Boston Consulting Group: Untapped Data Key to Enterprise AI Value

According to Boston Consulting Group, companies are moving beyond the experimental phase of AI and looking to scale agents and other applications. The consulting firm advises enterprises to focus on leveraging previously ignored data sources to unlock greater value from their AI initiatives. (VentureBeat, 2025-06-25)

LAION Releases Tools Focused on Emotional Intelligence for LLMs

Open source group LAION has released a suite of tools dedicated to enhancing the emotional intelligence of language models. This development highlights the growing industry focus on making AI systems more empathetic and better at understanding human emotions. (TechCrunch, 2025-06-24)


PRODUCTS

Gemini's Open Source CLI Tool for Code (2025-06-25)

Google's Gemini team has released an open-source command-line interface (CLI) tool similar to Claude Code but with more generous free limits. The tool offers a 1 million token context window, 60 model requests per minute, and 1,000 requests per day at no charge. User feedback indicates it works well with codebases, suggesting Google is leveraging this free offering to gather valuable training data for future Gemini model iterations. Some users have reported issues with Google's billing practices for previous Gemini services, so potential users may want to monitor usage carefully.

Google Veo 3 Video Generation (2025-06-25)

Google's Veo 3 model appears to be behind popular AI-generated videos circulating online, including viral "bigfoot selfie" videos. Reddit users identified the distinctive style and quality of Veo 3 in these creations. The videos showcase Google's advances in video generation capabilities, with the content becoming increasingly popular across social media platforms. The original video was created by @unreelinc, according to attribution in the Reddit post.


TECHNOLOGY

Open Source Projects

langchain-ai/langchain - 110,158 ⭐

Build context-aware reasoning applications with this popular framework that connects large language models to external tools and data sources. Recent updates focus on improving documentation for SurrealDB vector store integration and Ollama support, demonstrating continued momentum as the go-to orchestration layer for LLM applications.
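As a minimal sketch of the Ollama integration mentioned above (assuming the langchain-ollama partner package is installed and a local Ollama server has the model pulled; the model name and prompt are illustrative):

    # Minimal sketch, not canonical: call a local Ollama model via LangChain.
    # Assumes `pip install langchain-ollama` and `ollama pull llama3` have run.
    from langchain_ollama import ChatOllama

    llm = ChatOllama(model="llama3", temperature=0)
    reply = llm.invoke("Summarize what a vector store is in one sentence.")
    print(reply.content)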

comfyanonymous/ComfyUI - 80,752 ⭐

A modular, node-based interface for diffusion models that offers unprecedented flexibility and control for image generation workflows. Recent commits introduce an Omnigen2 model implementation, transition terminology from "unet" to the more accurate "diffusion model," and add Singlestep DPM++ SDE sampling for RF support.

unclecode/crawl4ai - 46,424 ⭐

An LLM-friendly web crawler and scraper designed specifically for AI applications, making it easy to collect and process training data from websites. The project continues to grow quickly, with recent commits updating its stargazer tracking, a sign of strong community adoption for this specialized data collection tool.
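A hedged sketch of the library's async quickstart pattern follows; the exact API surface can shift between releases, so treat the method and attribute names here as assumptions to verify against the current docs.

    # Illustrative sketch of crawl4ai's async crawling pattern; method and
    # attribute names may differ by version.
    import asyncio
    from crawl4ai import AsyncWebCrawler

    async def main():
        async with AsyncWebCrawler() as crawler:
            result = await crawler.arun(url="https://example.com")
            print(str(result.markdown)[:500])  # LLM-ready markdown from the page

    asyncio.run(main())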

Models & Datasets

Models

  • nanonets/Nanonets-OCR-s - A specialized OCR model built on Qwen2.5-VL that excels at converting images and PDFs to markdown text, with over 177K downloads demonstrating its utility for document processing pipelines.
  • mistralai/Mistral-Small-3.2-24B-Instruct-2506 - The latest version of Mistral's instruction-tuned 24B parameter model, now with enhanced multilingual capabilities supporting 25+ languages including French, German, Spanish, Japanese, and more.
  • MiniMaxAI/MiniMax-M1-80k - A text generation model with impressive 80K context window capability, gaining rapid traction with 577 likes and compatibility with vLLM for efficient inference.
  • Menlo/Jan-nano - A lightweight conversational model based on Qwen3-4B that's garnered over 29K downloads, offering an efficient option for resource-constrained deployments; a loading sketch follows this list.
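As the loading sketch referenced above, here is a hedged example that pulls Menlo/Jan-nano through the Hugging Face transformers pipeline; Qwen3-based checkpoints need a recent transformers release, and the prompt and generation settings are illustrative only.

    # Hedged sketch: load and prompt Menlo/Jan-nano via transformers.
    # Requires a recent transformers release with Qwen3 support; prompt and
    # max_new_tokens are illustrative, not recommended settings.
    from transformers import pipeline

    generator = pipeline("text-generation", model="Menlo/Jan-nano", max_new_tokens=64)
    out = generator("Question: What does an OCR model do?\nAnswer:")
    print(out[0]["generated_text"])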

Datasets

  • EssentialAI/essential-web-v1.0 - A large-scale web dataset with 10B-100B tokens for training language models, released under a permissive ODC-BY license and referenced in a recent arXiv paper (2506.14111); a streaming sketch follows this list.
  • institutional/institutional-books-1.0 - A substantial collection of books in parquet format containing 100K-1M entries, supporting multiple libraries including datasets, dask, and polars for flexible processing.
  • nvidia/AceReason-1.1-SFT - NVIDIA's supervised fine-tuning dataset specifically for improving reasoning abilities in LLMs, with a focus on math and code tasks, containing 1-10M entries and published under CC-BY-4.0 license.
  • nvidia/OpenScience - A comprehensive scientific dataset with 1-10M entries designed for training models on scientific content, referenced in multiple arXiv papers and compatible with various data processing libraries.
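As the streaming sketch referenced above, here is a hedged example that samples a few records from EssentialAI/essential-web-v1.0 with the Hugging Face datasets library; the split name and record fields are assumptions, so the code inspects keys rather than relying on a specific schema.

    # Hedged sketch: stream a handful of records without downloading the full
    # dataset. The "train" split name is an assumption; adjust if the repo
    # exposes different splits.
    from datasets import load_dataset

    ds = load_dataset("EssentialAI/essential-web-v1.0", split="train", streaming=True)
    for i, record in enumerate(ds):
        print(sorted(record.keys()))   # inspect fields instead of assuming them
        if i >= 2:
            break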

Developer Tools & Spaces

  • MiniMaxAI/MiniMax-M1 - A Gradio-based interface for exploring the capabilities of the MiniMax-M1 language model, attracting 270 likes for its interactive demonstration.
  • Kwai-Kolors/Kolors-Virtual-Try-On - An extremely popular virtual clothing try-on system with over 9,100 likes, allowing users to visualize how different garments would look on models or themselves.
  • ResembleAI/Chatterbox - A conversational interface powered by ResembleAI's voice synthesis technology, garnering over 1,170 likes for its natural-sounding voice interactions.
  • open-llm-leaderboard/open_llm_leaderboard - The definitive benchmark space for comparing open-source language models with over 13,200 likes, featuring automated evaluation across code, math, and general language understanding tasks.
  • jbilcke-hf/ai-comic-factory - A popular tool for generating comics using AI with over 10,400 likes, demonstrating the growing interest in creative applications of generative models.

RESEARCH

Paper of the Day

Mercury: Ultra-Fast Language Models Based on Diffusion (2025-06-17)

Authors: Inception Labs, Samar Khanna, Siddhant Kharbanda, Shufan Li, Harshit Varma, Eric Wang, Sawyer Birnbaum, Ziyang Luo, Yanis Miraoui, Akash Palrecha, Stefano Ermon, Aditya Grover, Volodymyr Kuleshov

Institution: Inception Labs (with collaborators from Stanford University)

Mercury represents a significant breakthrough in LLM efficiency by leveraging diffusion models for text generation. This work is particularly notable as it demonstrates commercial-scale language models that can predict multiple tokens in parallel, potentially solving one of the major bottlenecks in current LLM inference.

The paper introduces Mercury Coder (in Mini and Small sizes), setting a new state-of-the-art on the speed-quality frontier for code generation. According to independent evaluations, these models achieve up to 10x faster generation speeds than comparable models while maintaining competitive performance. This approach could fundamentally change how LLMs are deployed in production environments where latency is critical.
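To illustrate the core idea (and only the idea; this is not Inception Labs' implementation), the toy sketch below unmasks a batch of positions per denoising step instead of emitting one token per forward pass, using a random "denoiser" as a stand-in for the learned model.

    # Toy illustration of diffusion-style parallel decoding. The dummy denoiser
    # fills masked slots with random token ids purely to show the control flow;
    # a real model would score every position from the current partial sequence.
    import random

    VOCAB = list(range(100))   # hypothetical token ids
    MASK = -1                  # placeholder for a not-yet-generated token

    def dummy_denoiser(tokens):
        return [random.choice(VOCAB) if t == MASK else t for t in tokens]

    def diffusion_decode(seq_len=16, steps=4):
        tokens = [MASK] * seq_len
        per_step = seq_len // steps
        for _ in range(steps):
            proposal = dummy_denoiser(tokens)   # predict all positions at once
            masked = [i for i, t in enumerate(tokens) if t == MASK]
            for i in masked[:per_step]:         # commit a batch per step
                tokens[i] = proposal[i]
        return tokens

    print(diffusion_decode())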

Notable Research

JoyAgents-R1: Joint Evolution Dynamics for Versatile Multi-LLM Agents with Reinforcement Learning (2025-06-24)

Authors: Ai Han, Junxing Hu, Pu Wei, Zhiqian Zhang, Yuhang Guo, Jiawei Lu, Zicheng Zhang

This work applies Group Relative Policy Optimization (GRPO) to the joint training of heterogeneous multi-agent LLM systems, addressing challenges in cooperative efficiency and training stability for complex multi-agent reinforcement learning tasks.
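For context, the group-relative advantage at the heart of GRPO normalizes each sampled rollout's reward against its own group rather than a learned value critic; the sketch below shows that computation with hypothetical reward values.

    # Minimal sketch of GRPO's group-relative advantage (rewards are invented):
    # each reward is normalized against the mean and std of its sampled group.
    import statistics

    def grpo_advantages(group_rewards, eps=1e-8):
        mean = statistics.fmean(group_rewards)
        std = statistics.pstdev(group_rewards)
        return [(r - mean) / (std + eps) for r in group_rewards]

    print(grpo_advantages([0.2, 0.9, 0.5, 0.1]))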

ECCoT: A Framework for Enhancing Effective Cognition via Chain of Thought in Large Language Model (2025-06-24)

Authors: Zhenke Duan, Jiqun Pan, Jiani Tu, Xiaoyi Wang, Yanqing Wang

The authors propose an End-to-End Cognitive Chain of Thought framework that structures reasoning into step-by-step deductions with built-in error detection mechanisms, enhancing the reliability and interpretability of LLM reasoning processes.

A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures (2025-06-24)

Authors: Dezhang Kong, Shi Lin, Zhenhua Xu, Zhebo Wang, Minghao Li, and others

This comprehensive survey examines communication protocols between LLM-driven agents, identifying potential security vulnerabilities and proposing countermeasures as these systems increasingly interact with each other and external tools.

Multimodal large language models and physics visual tasks: comparative analysis of performance and costs (2025-06-24)

Authors: Giulia Polverini, Bor Gregorcic

A systematic evaluation of 15 multimodal LLMs across 102 physics concept inventory items, revealing substantial performance variations between models and highlighting cost-effectiveness considerations for educational applications.


LOOKING AHEAD

As we move into Q3 2025, the integration of multimodal capabilities with real-time knowledge retrieval is poised to redefine AI utility across enterprises. The emergence of computationally efficient models trained on continuously updated data streams, rather than static datasets, signals a shift toward truly adaptive AI systems that maintain relevance without complete retraining cycles.

Watch for the first regulatory frameworks addressing AI hallucination liability to crystallize by Q4, particularly as LLM-driven decision support becomes standard in healthcare and financial services. Meanwhile, the convergence of neuromorphic computing with traditional transformer architectures promises significant efficiency breakthroughs, potentially reducing inference costs by 60-70% before year-end—a development that may finally democratize advanced AI deployment beyond tech giants and well-funded startups.

Don't miss what's next. Subscribe to AGI Agent.