LLM Daily: November 17, 2025

        🔍 LLM DAILY
Your Daily Briefing on Large Language Models
November 17, 2025
HIGHLIGHTS
• OpenAI's financial relationship with Microsoft has been exposed through leaked documents revealing specific payment details under their revenue-sharing agreement, including inference costs, providing rare insight into the economics of leading AI partnerships.
• Legal AI startup Harvey is emerging as one of Silicon Valley's hottest companies, securing investments from top-tier VCs including Andreessen Horowitz and Kleiner Perkins while gaining significant traction in the legal tech space.
• An open-source tool called "Heretic" has been released that claims to automatically remove censorship ("alignment") from language models, highlighting ongoing tensions between AI safety measures and demands for unrestricted access.
• Microsoft Research has introduced UFO³, a system that targets a key limitation of current LLM agent frameworks, the inability to operate across multiple devices and operating systems, by connecting them through a unified orchestration fabric.
• Google's open-source terminal-based AI agent "gemini-cli" has gained remarkable traction with over 82,700 GitHub stars, bringing Gemini's capabilities directly to the command line with active ongoing development.

BUSINESS
Leaked OpenAI-Microsoft Financial Details Revealed

Documents reveal OpenAI payments to Microsoft under their revenue-sharing agreement, including inference costs (2025-11-14)
TechCrunch reports that leaked financial documents provide insights into the financial relationship between the two AI giants

Harvey Emerges as Hot Legal AI Startup

Legal AI startup Harvey gaining significant traction in Silicon Valley
Co-founders Winston Weinberg and Gabe Pereyra have attracted investments from major VCs including Andreessen Horowitz, Kleiner Perkins, and Elad Gil
TechCrunch interview with CEO Winston Weinberg details the company's rapid growth (2025-11-14)

Data Center Spending Surpasses Oil Exploration

$580 billion will be spent on data centers this year, according to an International Energy Agency report
This figure exceeds new oil supply investment by $40 billion, which puts new oil supply investment at roughly $540 billion
The AI infrastructure boom raises questions about renewable energy usage to power these facilities
Analysis available from TechCrunch (2025-11-16)

US-China AI Competition Intensifies

Databricks co-founder Andy Konwinski warns that the US is losing its AI research edge to China
He argues that an open source approach is necessary for the US to stay competitive
Konwinski is now a venture capitalist with Laude Ventures; his views are detailed in a TechCrunch article (2025-11-14)

PRODUCTS
New Releases
Heretic: Automatic Censorship Removal for Language Models
Company: p-e-w (Open source project)

Released: (2025-11-16)

Link: GitHub Repository
An open-source Python tool called "Heretic" has been released that claims to automatically remove censorship (what the developer refers to as "alignment") from various language models. The tool requires a Python environment with the appropriate version of PyTorch installed, and can be used with a simple installation command (pip install heretic-llm). The project appears to be focused on giving users access to uncensored AI responses, though this also raises ethical considerations around responsible AI use.
Improved Sentence Transformer Implementation
Company: Open source contributor

Released: Not specified

Link: Not provided
A community developer has shared an improved implementation of sentence transformers with several key advantages over standard implementations:
- Significantly faster processing, especially for large paragraphs with many sentences
- Works directly out of the box in Windows environments
- Enhanced performance capabilities
The implementation appears to be particularly valuable for text processing tasks requiring sentence-level operations.
Community Discussion
Kokoro TTS Implementation Discussion
Platform: Reddit discussion

Date: (2025-11-16)

Link: Reddit comment
Users have been discussing the current implementation of Kokoro Text-to-Speech (TTS) via Koboldcpp running on CPU. The community is expressing interest in potential improvements that could make TTS models faster with lower latency while maintaining an easy-to-use API, highlighting ongoing demand for more efficient and accessible TTS solutions for local deployment.
Academic Publishing Models in Machine Learning
Platform: Reddit discussion

Date: (2025-11-16)

Link: Reddit discussion
While not a product announcement, there's significant community discussion around different models for evaluating and publishing machine learning research. The discussion centers on traditional peer review versus open review systems, with some community members expressing that "peer-review is beyond dead" due to quality control issues. This reflects growing concerns about how research quality is maintained in the rapidly evolving field of AI.

TECHNOLOGY
Open Source Projects
google-gemini/gemini-cli
An open-source terminal-based AI agent that brings Gemini's capabilities directly to your command line. Built with TypeScript, it has gained significant traction with over 82,700 stars on GitHub. Recent updates include version bumps and core package refactoring, showing active development.
firecrawl/firecrawl
A powerful Web Data API for AI that converts websites into LLM-ready markdown or structured data. With nearly 68,000 stars, this TypeScript project serves as an essential tool for RAG systems and content processing. Recent commits focus on fixing scraping functionality for documents, PDFs, and API improvements.
pathwaycom/llm-app
Ready-to-run cloud templates for building RAG applications, AI pipelines, and enterprise search with live data synchronization. The repository offers Docker-friendly implementations that integrate with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, and real-time data sources. Recent updates reorganize pipeline templates for better usability.
Models & Datasets
moonshotai/Kimi-K2-Thinking
A popular conversational model with over 141,000 downloads and 1,200+ likes. This transformer-based model exposes intermediate reasoning steps, making it useful for applications requiring transparent AI decision-making processes.
baidu/ERNIE-4.5-VL-28B-A3B-Thinking
A multimodal vision-language model that processes both text and images. With 28B parameters, it supports both English and Chinese languages, and features "thinking" capabilities that expose the model's reasoning process. Licensed under Apache-2.0.
maya-research/maya1
A versatile transformer model built on the Llama architecture with both text generation and text-to-speech capabilities. With 26,500+ downloads and 619 likes, it's compatible with AutoTrain and text-generation inference endpoints.
deepseek-ai/DeepSeek-OCR
A highly popular OCR (Optical Character Recognition) model with over 4.4 million downloads. This vision-language model specializes in extracting text from images and supports multiple languages. Its research is documented in arXiv:2510.18234.
builddotai/Egocentric-10K
A dataset with over 35,000 downloads focused on egocentric (first-person) data, likely for training AI systems that understand human perspective and activities. Released under Apache-2.0 license, it was last updated on November 11, 2025.
facebook/omnilingual-asr-corpus
A comprehensive multilingual dataset for automatic speech recognition and audio classification tasks. Supporting hundreds of languages, this dataset has been downloaded nearly 18,000 times, making it valuable for developing inclusive ASR systems.
Developer Tools & Interfaces
HuggingFaceTB/smol-training-playbook
A highly-regarded Docker-based space with over 2,200 likes that serves as a research article template for small model training. It includes data visualization tools and comprehensive guidance for efficient model training workflows.
tori29umai/Qwen-Image-2509-MultipleAngles
A Gradio-powered interface with 413 likes that leverages the Qwen image model to generate images from multiple angles. This tool demonstrates advanced capabilities in consistent multi-view image generation.
Wan-AI/Wan2.2-Animate
One of the most popular Gradio spaces with 2,441 likes, focused on animation generation. This interface provides user-friendly controls for creating animated content using the Wan2.2 model.
stepfun-ai/Step-Audio-EditX
A specialized Gradio interface for audio editing and manipulation. With 73 likes, it offers AI-powered tools to modify and enhance audio recordings with precise control.

RESEARCH
Paper of the Day
UFO³: Weaving the Digital Agent Galaxy (2025-11-14)
Authors: Chaoyun Zhang, Liqun Li, He Huang, Chiming Ni, Bo Qiao, Yu Kang, Minghua Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang
Institution: Microsoft Research
This paper stands out for addressing a critical limitation in current LLM agent frameworks: the inability to operate seamlessly across multiple devices and operating systems. UFO³ approaches agent orchestration by creating a unified fabric that connects heterogeneous endpoints (desktops, servers, mobile devices, and edge devices) into a coherent multi-agent system.
The researchers present a distributed DAG architecture called TaskConstellation that can dynamically adapt as user requests evolve, automatically distributing subtasks to appropriate devices based on capability, security constraints, and resource availability. Their evaluation shows that UFO³ reduces completion time by 35-50% for cross-device workflows while maintaining 97% accuracy in task completion compared to single-device approaches, demonstrating a practical pathway toward truly unified digital assistants capable of operating across our increasingly fragmented digital ecosystems.
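To make the idea of distributing DAG subtasks by device capability concrete, here is a minimal Python sketch. It is not UFO³'s API or the paper's algorithm: the device registry, task names, and the `pick_device` and `schedule` helpers are all hypothetical, and the graph here is static, whereas the paper describes one that adapts as requests evolve. It only shows the general pattern of ordering tasks by dependency and matching each to a device that offers the required capability.

```python
from graphlib import TopologicalSorter

# Hypothetical device registry: device name -> capabilities it offers.
DEVICES = {
    "desktop": {"gui", "office"},
    "server": {"gpu", "shell"},
    "phone": {"camera", "sms"},
}

# Hypothetical task graph: task -> (required capability, prerequisite tasks).
TASKS = {
    "take_photo": ("camera", set()),
    "run_ocr": ("gpu", {"take_photo"}),
    "fill_report": ("office", {"run_ocr"}),
    "notify_user": ("sms", {"fill_report"}),
}


def pick_device(capability, devices):
    """Return the first device (in sorted name order) offering the capability."""
    for name in sorted(devices):
        if capability in devices[name]:
            return name
    raise LookupError(f"no device offers {capability!r}")


def schedule(tasks, devices):
    """Order tasks so prerequisites come first, pairing each with a device."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {task: deps for task, (_, deps) in tasks.items()}
    order = TopologicalSorter(graph).static_order()
    return [(task, pick_device(tasks[task][0], devices)) for task in order]


plan = schedule(TASKS, DEVICES)
for task, device in plan:
    print(f"{task} -> {device}")
```

A real system would also have to weigh the security constraints and resource availability mentioned above, and would re-plan when the graph changes; this sketch matches on capability alone.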
Notable Research
iMAD: Intelligent Multi-Agent Debate for Efficient and Accurate LLM Inference (2025-11-14)
Authors: Wei Fan, JinYi Yoon, Bo Ji
The researchers introduce a novel multi-agent debate framework that improves LLM reasoning while reducing computational costs. iMAD assigns specialized roles to agents, uses curriculum learning for complexity management, and employs a unique adjudicator mechanism, achieving state-of-the-art performance across multiple reasoning tasks while reducing token usage by 30-60%.
DocSLM: A Small Vision-Language Model for Long Multimodal Document Understanding (2025-11-14)
Authors: Tanveer Hannan, Dimitrios Mallios, Parth Pathak, Faegheh Sardari, Thomas Seidl, Gedas Bertasius, Mohsen Fayyaz, Sunando Sengupta
This paper presents a lightweight vision-language model specifically designed for processing long documents on resource-constrained devices, featuring a novel Hierarchical Multimodal Compressor that efficiently encodes visual, textual, and layout information while maintaining competitive performance against much larger models.
AUVIC: Adversarial Unlearning of Visual Concepts for Multi-modal Large Language Models (2025-11-14)
Authors: Haokun Chen, Jianing Li, Yao Zhang, Jinhe Bi, Yan Xia, Jindong Gu, Volker Tresp
The authors address the critical challenge of "right to be forgotten" in MLLMs by developing a technique that allows for targeted removal of visual concepts without resource-intensive retraining, effectively balancing the preservation of general model capabilities while eliminating specific visual knowledge.
Scalable Policy Evaluation with Video World Models (2025-11-14)
Authors: Wei-Cheng Tseng, Jinwei Gu, Qinsheng Zhang, Hanzi Mao, Ming-Yu Liu, Florian Shkurti, Lin Yen-Chen
This research introduces a method to evaluate robotic manipulation policies using video world models, significantly reducing the costs and safety risks of physical testing while providing quantitative metrics that strongly correlate with real-world performance, potentially accelerating the development of generalist robotic systems.

LOOKING AHEAD
As we approach 2026, we're seeing AI architecture evolve beyond transformer-based designs that dominated the early 2020s. The emergence of hybrid neuromorphic-quantum models is gaining momentum, with early implementations demonstrating up to 80% reduction in computational costs while maintaining performance. Several leading labs have announced breakthrough results in true few-shot generalization across domains previously resistant to AI capabilities.
Looking into Q1 2026, expect significant advancements in autonomous AI systems with enhanced self-correction capabilities. The recent regulatory frameworks enacted in Europe and Asia will likely accelerate rather than hinder innovation, as they've provided much-needed clarity for deployment in critical sectors. Companies positioning themselves at the intersection of multimodal AI and specialized domain knowledge will likely see outsized returns as these technologies mature from experimental to essential.
