GenAI Daily for Practitioners — 11 Dec 2025 (12 items)
Executive Summary
- Mitigating Social Bias in English and Urdu Language Models: PRM-guided candidate selection and sequential refinement reduce bias by 35% in English and 25% in Urdu, with no significant impact on model performance and potential to improve fairness in NLP applications.
- MedForget: a multimodal unlearning testbed for medical AI, focusing on hierarchy-aware unlearning.
Research
- Mitigating Social Bias in English and Urdu Language Models Using PRM-Guided Candidate Selection and Sequential Refinement \ Large language models (LLMs) increasingly mediate human communication, decision support, content creation, and information retrieval. Despite impressive fluency, these systems frequently produce biased or stereotypical content, especially … \ Source • arXiv cs.CL • 18:36
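The abstract names two mechanisms, PRM-guided candidate selection and sequential refinement. A minimal sketch of how those two steps typically compose, with hypothetical function names and toy scorers standing in for the paper's actual models:

```python
def select_and_refine(prompt, generate, prm_score, n_candidates=2, rounds=2):
    """Best-of-N selection scored by a process reward model (PRM),
    repeated for several refinement rounds seeded on the current best."""
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        # Refinement rounds condition on the best output so far, if any.
        seed = best if best is not None else prompt
        candidates = [generate(seed) for _ in range(n_candidates)]
        for cand in candidates:
            score = prm_score(cand)  # higher = less biased, per the PRM
            if score > best_score:
                best, best_score = cand, score
    return best, best_score

# Toy stand-ins for an LLM sampler and a PRM scorer (illustrative only):
samples = iter(["stereotyped phrasing", "hedged phrasing",
                "neutral phrasing", "hedged phrasing"])
gen = lambda seed: next(samples)
scores = {"stereotyped phrasing": 0.1, "hedged phrasing": 0.6,
          "neutral phrasing": 0.9}
best, s = select_and_refine("rewrite this sentence", gen, scores.get)
```

Here the second round re-samples around the round-one winner and keeps "neutral phrasing" (score 0.9); the paper's actual selection and refinement policies may differ.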
- MedForget: Hierarchy-Aware Multimodal Unlearning Testbed for Medical AI \ Pretrained Multimodal Large Language Models (MLLMs) are increasingly deployed in medical AI systems for clinical reasoning, diagnosis support, and report generation. However, their training on sensitive patient data raises critical privacy… \ Source • arXiv cs.CL • 18:55
- CryptoBench: A Dynamic Benchmark for Expert-Level Evaluation of LLM Agents in Cryptocurrency \ This paper introduces CryptoBench, the first expert-curated, dynamic benchmark designed to rigorously evaluate the real-world capabilities of Large Language Model (LLM) agents in the uniquely demanding and fast-paced cryptocurrency domain.… \ Source • arXiv cs.CL • 18:52
- Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual Speech Recognition Evaluation \ Despite rapid progress, ASR evaluation remains saturated with short-form English, and efficiency is rarely reported. We present the Open ASR Leaderboard, a fully reproducible benchmark and interactive leaderboard comparing 60+ open-source … \ Source • arXiv cs.CL • 18:30
- LLMs in Interpreting Legal Documents \ This chapter explores the application of Large Language Models in the legal domain, showcasing their potential to optimise and augment traditional legal tasks by analysing possible use cases, such as assisting in interpreting statutes, con… \ Source • arXiv cs.CL • 18:09
- MentraSuite: Post-Training Large Language Models for Mental Health Reasoning and Assessment \ Mental health disorders affect hundreds of millions globally, and the Web now serves as a primary medium for accessing support, information, and assessment. Large language models (LLMs) offer scalable and accessible assistance, yet their d… \ Source • arXiv cs.CL • 14:26
- An Offline Mobile Conversational Agent for Mental Health Support: Learning from Emotional Dialogues and Psychological Texts with Student-Centered Evaluation \ Mental health plays a crucial role in the overall well-being of an individual. In recent years, digital platforms have increasingly been used to expand mental health and emotional support. However, there are persistent challenges related t… \ Source • arXiv cs.CL • 12:47
- SEAL: Speech Embedding Alignment Learning for Speech Large Language Model with Retrieval-Augmented Generation \ Embedding-based retrieval models have made significant strides in retrieval-augmented generation (RAG) techniques for text and multimodal large language models (LLMs) applications. However, when it comes to speech large language models (S… \ Source • arXiv cs.CL • 12:14
- A roadmap of geospatial soil quality analysis systems \ Soil quality (SQ) plays a crucial role in sustainable agriculture, environmental conservation, and land-use planning. Traditional SQ assessment techniques rely on costly, labor-intensive sampling and laboratory analysis, limiting their spa… \ Source • arXiv cs.LG • 17:40
- Mixture of Lookup Key-Value Experts \ Recent research has developed several LLM architectures suitable for inference on end-user devices, such as the Mixture of Lookup Experts (MoLE) (Jie et al., 2025). A key feature of MoLE is that each token id is associated with a… \ Source • arXiv cs.LG • 16:05
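The key idea behind lookup experts is that when an expert's input depends only on the token id (not the hidden state), every (token, expert) output can be precomputed offline into a table, so inference replaces expert computation with an embedding-style lookup. A minimal NumPy sketch of that idea, with illustrative shapes and names (not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, n_experts, d = 10, 4, 8

# Offline: precompute every expert's output for every token id.
token_embed = rng.normal(size=(vocab, d))
expert_weights = rng.normal(size=(n_experts, d, d))
# lookup_table[v, e] = token_embed[v] @ expert_weights[e]  -> (vocab, n_experts, d)
lookup_table = np.einsum("vd,edf->vef", token_embed, expert_weights)

def router(hidden, w_router):
    """Softmax gate over experts, computed from the hidden state."""
    logits = hidden @ w_router
    e = np.exp(logits - logits.max())
    return e / e.sum()

def mole_forward(token_id, hidden, w_router):
    gates = router(hidden, w_router)        # (n_experts,)
    return gates @ lookup_table[token_id]   # mix precomputed expert outputs

w_router = rng.normal(size=(d, n_experts))
hidden = rng.normal(size=d)
out = mole_forward(3, hidden, w_router)

# Sanity check: identical to running each expert on the fly.
direct = sum(router(hidden, w_router)[e] * (token_embed[3] @ expert_weights[e])
             for e in range(n_experts))
```

The lookup table costs vocab × n_experts × d memory but removes per-token expert matmuls at inference, which is the device-friendly trade-off the abstract alludes to.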
- WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving \ Deploying multiple models within shared GPU clusters is promising for improving resource efficiency in large language model (LLM) serving. Existing multi-LLM serving systems optimize GPU utilization at the cost of worse inference performan… \ Source • arXiv cs.LG • 10:47
- SAFT: Structure-Aware Fine-Tuning of LLMs for AMR-to-Text Generation \ Large Language Models (LLMs) are increasingly applied to tasks involving structured inputs such as graphs. Abstract Meaning Representations (AMRs), which encode rich semantics as directed graphs, offer a rigorous testbed for evaluating LLM… \ Source • arXiv cs.CL • 18:26
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.