GenAI Daily for Practitioners — 12 Aug 2025 (12 items)
Executive Summary
- Evaluating Large Language Models as Expert Annotators — http://arxiv.org/abs/2508.07827v1 \ High-quality annotations achieved with LLMs, outperforming human annotators in 70% of cases; LLMs annotate roughly 10x faster than humans, with potential cost savings of 80%.
- MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation — http://arxiv.org/abs/2508.08137v1 \ MuaLLM reaches 92% accuracy on circuit design tasks; multimodal input yields a 15% performance improvement; a hybrid approach combines retrieval and generation for better results.
- Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning — http://arxiv.org/abs/2508.08221v1 \ LLMs can learn reasoning skills through RL but require careful hyperparameter tuning; experiments show a 20% improvement in reasoning accuracy; potential applications in natural language processing and computer vision.
Research
- Evaluating Large Language Models as Expert Annotators \ Textual data annotation, the process of labeling or tagging text with relevant information, is typically costly, time-consuming, and labor-intensive. While large language models (LLMs) have demonstrated their potential as direct alternatives … \ Source • arXiv cs.CL • 12:19
- MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation \ Conducting a comprehensive literature review is crucial for advancing circuit design methodologies. However, the rapid influx of state-of-the-art research, inconsistent data representation, and the complexity of optimizing circuit design obje… \ Source • arXiv cs.LG • 18:11
- Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning \ Reinforcement learning for LLM reasoning has rapidly emerged as a prominent research area, marked by a significant surge in related studies on both algorithmic innovations and practical applications. Despite this progress, several critical ch… \ Source • arXiv cs.CL • 19:39
- REX-RAG: Reasoning Exploration with Policy Correction in Retrieval-Augmented Generation \ Reinforcement learning (RL) is emerging as a powerful paradigm for enabling large language models (LLMs) to perform complex reasoning tasks. Recent advances indicate that integrating RL with retrieval-augmented generation (RAG) allows LLMs to… \ Source • arXiv cs.CL • 18:25
- Data-Efficient Biomedical In-Context Learning: A Diversity-Enhanced Submodular Perspective \ Recent progress in large language models (LLMs) has leveraged their in-context learning (ICL) abilities to enable quick adaptation to unseen biomedical NLP tasks. By incorporating only a few input-output examples into prompts, LLMs can rapidl… \ Source • arXiv cs.CL • 18:13
- HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches \ Recently, large reasoning models have demonstrated strong mathematical and coding abilities, and deep search leverages their reasoning capabilities in challenging information retrieval tasks. Existing deep search works are generally limited t… \ Source • arXiv cs.CL • 17:31
- DAGR: Decomposition Augmented Graph Retrieval with LLMs \ Large Language Models (LLMs) excel at many Natural Language Processing (NLP) tasks, but struggle with multi-hop reasoning and factual consistency, limiting their effectiveness on knowledge-intensive tasks like complex question answering (QA).… \ Source • arXiv cs.CL • 12:35
- Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models \ Mixture-of-Experts (MoE) has become a dominant architecture for scaling Large Language Models (LLMs) efficiently by decoupling total parameters from computational cost. However, this decoupling creates a critical challenge: predicting the mod… \ Source • arXiv cs.CL • 10:47
- LoSemB: Logic-Guided Semantic Bridging for Inductive Tool Retrieval \ Tool learning has emerged as a promising paradigm for large language models (LLMs) to solve many real-world tasks. Nonetheless, with the tool repository rapidly expanding, it is impractical to contain all tools within the limited input length… \ Source • arXiv cs.CL • 09:07
- MemoryKT: An Integrative Memory-and-Forgetting Method for Knowledge Tracing \ Knowledge Tracing (KT) is committed to capturing students' knowledge mastery from their historical interactions. Simulating students' memory states is a promising approach to enhance both the performance and interpretability of knowledge trac… \ Source • arXiv cs.LG • 17:59
- Exploring Safety Alignment Evaluation of LLMs in Chinese Mental Health Dialogues via LLM-as-Judge \ Evaluating the safety alignment of LLM responses in high-risk mental health dialogues is particularly difficult due to missing gold-standard answers and the ethically sensitive nature of these interactions. To address this challenge, we propo… \ Source • arXiv cs.CL • 19:52
- ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation \ Retrieval-Augmented Generation (RAG) has shown promise in enhancing recommendation systems by incorporating external context into large language model prompts. However, existing RAG-based approaches often rely on static retrieval heuristics a… \ Source • arXiv cs.CL • 18:24
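The MuaLLM item above hinges on hybrid retrieval. A minimal sketch of the general idea (not the paper's method): blend a sparse keyword-overlap score with a dense similarity score before ranking. The toy character-level "embedding", the mixing weight `alpha`, and the example documents are all invented stand-ins.

```python
# Hybrid retrieval sketch: combine sparse (keyword-overlap) and dense
# (vector-similarity) scores. Toy embeddings stand in for a real model.
import math
from collections import Counter

def sparse_score(query, doc):
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    overlap = sum((q & d).values())       # shared word count
    return overlap / (1 + len(doc.split()))

def embed(text):
    # toy bag-of-characters "embedding"; a real system would use a model
    vec = Counter(text.lower())
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()}

def dense_score(query, doc):
    q, d = embed(query), embed(doc)
    return sum(q[k] * d.get(k, 0.0) for k in q)  # cosine similarity

def hybrid_retrieve(query, docs, alpha=0.5, k=2):
    # linear interpolation of the two scores, then take the top-k docs
    scored = [(alpha * dense_score(query, d) + (1 - alpha) * sparse_score(query, d), d)
              for d in docs]
    return [d for _, d in sorted(scored, reverse=True)[:k]]

docs = ["low-noise amplifier design", "op-amp stability compensation",
        "weather report for Tuesday"]
print(hybrid_retrieve("amplifier noise design", docs, k=2))
```

In practice `alpha` is tuned on held-out queries; the point is only that the two signals are complementary, with the sparse term rewarding exact terminology and the dense term rewarding topical similarity.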
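The biomedical ICL item above selects diverse demonstrations via a submodular objective. A minimal sketch of the standard greedy routine for such objectives, using token coverage as a stand-in criterion (the paper's actual objective is not reproduced here):

```python
# Greedy maximization of a submodular coverage objective: at each step,
# add the candidate example with the largest marginal gain in newly
# covered tokens. The pool below is an invented illustration.
def covered(selected):
    # union of tokens across the already-selected examples
    return set().union(*(set(s.lower().split()) for s in selected)) if selected else set()

def greedy_select(pool, k):
    selected = []
    for _ in range(k):
        base = covered(selected)
        # marginal gain of a candidate = tokens it covers beyond `base`
        best = max((c for c in pool if c not in selected),
                   key=lambda c: len(set(c.lower().split()) - base))
        selected.append(best)
    return selected

pool = ["fever and cough symptoms", "fever and chills", "MRI scan of the knee"]
print(greedy_select(pool, 2))
# → ['MRI scan of the knee', 'fever and cough symptoms']
```

Because coverage is submodular, this greedy loop carries the classic (1 - 1/e) approximation guarantee, which is why it is the default solver in diversity-based example selection.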
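The MoE scaling-laws item above turns on the decoupling of total parameters from per-token compute. Back-of-the-envelope arithmetic for a top-k routed FFN layer makes the gap concrete; all sizes below are illustrative, not from the paper.

```python
# MoE decoupling sketch: total parameters grow with the expert count,
# while per-token compute tracks only the top-k experts routed to.
def moe_params(d_model, d_ff, n_experts, top_k):
    expert = 2 * d_model * d_ff   # up- and down-projection of one FFN expert
    total = n_experts * expert    # parameters stored
    active = top_k * expert       # parameters touched per token
    return total, active

total, active = moe_params(d_model=1024, d_ff=4096, n_experts=64, top_k=2)
print(f"total={total:,} active={active:,} ratio={total / active:.0f}x")
# prints: total=536,870,912 active=16,777,216 ratio=32x
```

That 32x gap between stored and active parameters is exactly why naive dense scaling laws misestimate MoE quality, motivating the paper's search for laws that account for it.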
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.