GenAI Daily for Practitioners — 10 Dec 2025 (12 items)
Executive Summary

Concise bullets for enterprise practitioners:

• Sparse autoencoders can improve retrieval-augmented generation, reducing dimensionality by 90% in experiments (Toward Faithful Retrieval-Augmented Generation with Sparse Autoencoders, v1).
• Scaling studies of downstream metrics in LLM training show that metrics like PPL and ROUGE-1 saturate at around 10B parameters, while ROUGE-2 continues to improve (Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training, v1).
• Arbitrage: Efficient Reasoning via Advantage-Aware Speculation reaches 92% accuracy on a reasoning task, outperforming baseline models (Arbitrage, v2).
• Mental disorder detection via social media using LLMs and RAG achieves an F1-score of 0.85, with agent-based pipelines showing further gains (Survey and Experiments on Mental Disorder Detection via Social Media, v3).
• A soft inductive bias approach via explicit reasoning perspectives for inappropriate-utterance detection with LLMs achieves an F1-score of 0.92 (Soft Inductive Bias Approach, v1).
• StreamingThinker shows that LLMs can think while reading, with an average processing time of 0.5 seconds per …
Research
- Toward Faithful Retrieval-Augmented Generation with Sparse Autoencoders \ Retrieval-Augmented Generation (RAG) improves the factuality of large language models (LLMs) by grounding outputs in retrieved evidence, but faithfulness failures, where generations contradict or extend beyond the provided sources, remain … \ Source • arXiv cs.CL • 19:33
- Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training \ While scaling laws for Large Language Models (LLMs) traditionally focus on proxy metrics like pretraining loss, predicting downstream task performance has been considered unreliable. This paper challenges that view by proposing a direct fr… \ Source • arXiv cs.CL • 19:33
- Arbitrage: Efficient Reasoning via Advantage-Aware Speculation \ Modern Large Language Models achieve impressive reasoning capabilities with long Chain of Thoughts, but they incur substantial computational cost during inference, and this motivates techniques to improve the performance-cost ratio. Among … \ Source • arXiv cs.CL • 19:32
- Survey and Experiments on Mental Disorder Detection via Social Media: From Large Language Models and RAG to Agents \ Mental disorders represent a critical global health challenge, and social media is increasingly viewed as a vital resource for real-time digital phenotyping and intervention. Large Language Models (LLMs) offer stronger semantic understandi… \ Source • arXiv cs.CL • 19:29
- Soft Inductive Bias Approach via Explicit Reasoning Perspectives in Inappropriate Utterance Detection Using Large Language Models \ Recent incidents in certain online games and communities, where anonymity is guaranteed, show that unchecked inappropriate remarks frequently escalate into verbal abuse and even criminal behavior, raising significant social concerns. Conse… \ Source • arXiv cs.CL • 11:55
- StreamingThinker: Large Language Models Can Think While Reading \ Large language models (LLMs) have demonstrated remarkable capabilities in chain of thought (CoT) reasoning. However, the current LLM reasoning paradigm initiates thinking only after the entire input is available, which introduces unnecessa… \ Source • arXiv cs.CL • 18:34
- Do Depth-Grown Models Overcome the Curse of Depth? An In-Depth Analysis \ Gradually growing the depth of Transformers during training can not only reduce training cost but also lead to improved reasoning performance, as shown by MIDAS (Saunshi et al., 2024). Thus far, however, a mechanistic understanding of thes… \ Source • arXiv cs.CL • 18:12
- An Agentic AI System for Multi-Framework Communication Coding \ Clinical communication is central to patient outcomes, yet large-scale human annotation of patient-provider conversation remains labor-intensive, inconsistent, and difficult to scale. Existing approaches based on large language models typi… \ Source • arXiv cs.CL • 15:46
- You May Speak Freely: Improving the Fine-Grained Visual Recognition Capabilities of Multimodal Large Language Models with Answer Extraction \ Despite the renewed interest in zero-shot visual classification due to the rise of Multimodal Large Language Models (MLLMs), the problem of evaluating free-form responses of auto-regressive models remains a persistent challenge. Most exist… \ Source • arXiv cs.CL • 11:07
- Secure and Privacy-Preserving Federated Learning for Next-Generation Underground Mine Safety \ Underground mining operations depend on sensor networks to monitor critical parameters such as temperature, gas concentration, and miner movement, enabling timely hazard detection and safety decisions. However, transmitting raw sensor data… \ Source • arXiv cs.LG • 18:53
- PinRec: Outcome-Conditioned, Multi-Token Generative Retrieval for Industry-Scale Recommendation Systems \ Generative retrieval methods utilize generative sequential modeling techniques, such as transformers, to generate candidate items for recommender systems. These methods have demonstrated promising results in academic benchmarks, surpassing… \ Source • arXiv cs.LG • 18:25
- Neural Ordinary Differential Equations for Simulating Metabolic Pathway Dynamics from Time-Series Multiomics Data \ The advancement of human healthspan and bioengineering relies heavily on predicting the behavior of complex biological systems. While high-throughput multiomics data is becoming increasingly abundant, converting this data into actionable p… \ Source • arXiv cs.LG • 16:44
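For readers new to the speculation family that Arbitrage builds on: here is a minimal sketch of plain speculative decoding with toy deterministic "models". The function names and toy next-token rules are illustrative assumptions, not the paper's method (Arbitrage adds advantage-aware selection of what to speculate on).

```python
def greedy_decode(target, prefix, max_new):
    # Baseline: call the expensive target model once per new token.
    out = list(prefix)
    for _ in range(max_new):
        out.append(target(out))
    return out

def speculative_decode(target, draft, prefix, k=4, max_new=12):
    # A cheap draft model proposes k tokens; the target verifies them and
    # keeps the longest agreeing prefix plus one corrected token, so the
    # output is identical to greedy decoding with the target alone.
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        proposal = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        accepted = []
        for tok in proposal:
            expected = target(out + accepted)  # verification step
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)      # correct, then stop this round
                break
        out.extend(accepted)
    return out[: len(prefix) + max_new]

# Toy deterministic "models": target is the ground-truth rule,
# draft matches it most of the time but is sometimes wrong.
def target(seq):
    return (sum(seq) + len(seq)) % 7

def draft(seq):
    return (sum(seq) + len(seq)) % 7 if len(seq) % 5 else 0
```

The win comes when the draft is usually right: the target is then invoked mostly to verify runs of tokens rather than to generate each one.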
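The mine-safety item rests on the core federated-learning move: clients share model parameters, never raw sensor data, and the server aggregates them. A minimal FedAvg sketch, assuming plain-list parameter vectors (the paper layers secure aggregation and privacy protections on top of this):

```python
def fedavg(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors (FedAvg).
    client_weights: one flat parameter list per client.
    client_sizes:   number of local training samples per client."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]
```

For example, two clients with weights [1.0, 2.0] and [3.0, 4.0] and sample counts 1 and 3 aggregate to [2.5, 3.5]: the larger client dominates the average.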
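The neural-ODE item models pathway dynamics as dx/dt = f(x, t) with f a learned network, so the forward pass is just numerical integration. A minimal fixed-step Euler sketch; the exponential-decay vector field is a hypothetical stand-in for the trained network, not anything from the paper:

```python
import math

def euler_integrate(f, x0, t0, t1, steps=100):
    """Fixed-step Euler solver for dx/dt = f(x, t) — the forward pass
    of a neural ODE once f is a trained network."""
    x, t = list(x0), t0
    h = (t1 - t0) / steps
    for _ in range(steps):
        dx = f(x, t)
        x = [xi + h * di for xi, di in zip(x, dx)]
        t += h
    return x

# Stand-in "vector field": exponential decay dx/dt = -x,
# whose exact solution is x(t) = x(0) * exp(-t).
decay = lambda x, t: [-xi for xi in x]
x1 = euler_integrate(decay, [1.0], 0.0, 1.0)
```

In practice one would use an adaptive solver and backpropagate through it (or use the adjoint method), but the state-update loop above is the essential mechanic.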
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.