GenAI Daily for Practitioners — 15 Aug 2025 (12 items)
Executive Summary
- PASS:
  - Probabilistic agentic supernet sampling for interpretable and adaptive chest X-ray reasoning
  - Achieves 94.5% accuracy on the CheXpert dataset
  - No specific cost or deployment notes mentioned
- BiasGym:
  - Fantastic LLM biases and how to find (and remove) them
Research
- PASS: Probabilistic Agentic Supernet Sampling for Interpretable and Adaptive Chest X-Ray Reasoning \ Existing tool-augmented agentic systems are limited in the real world by (i) black-box reasoning steps that undermine trust of decision-making and pose safety risks, (ii) poor multimodal integration, which is inherently critical for healthcar… \ Source • arXiv cs.LG • 12:03
- BiasGym: Fantastic LLM Biases and How to Find (and Remove) Them \ Understanding biases and stereotypes encoded in the weights of Large Language Models (LLMs) is crucial for developing effective mitigation strategies. Biased behaviour is often subtle and non-trivial to isolate, even when deliberately elicite… \ Source • arXiv cs.CL • 19:57
- ASPD: Unlocking Adaptive Serial-Parallel Decoding by Exploring Intrinsic Parallelism in LLMs \ The increasing scale and complexity of large language models (LLMs) pose significant inference latency challenges, primarily due to their autoregressive decoding paradigm characterized by the sequential nature of next-token prediction. By re-… \ Source • arXiv cs.CL • 11:04
- SSRL: Self-Search Reinforcement Learning \ We investigate the potential of large language models (LLMs) to serve as efficient simulators for agentic search tasks in reinforcement learning (RL), thereby reducing dependence on costly interactions with external search engines. To this en… \ Source • arXiv cs.CL • 19:46
- Learning from Natural Language Feedback for Personalized Question Answering \ Personalization is crucial for enhancing both the effectiveness and user satisfaction of language technologies, particularly in information-seeking tasks like question answering. Current approaches for personalizing large language models (LLM… \ Source • arXiv cs.CL • 16:36
- Video-BLADE: Block-Sparse Attention Meets Step Distillation for Efficient Video Generation \ Diffusion transformers currently lead the field in high-quality video generation, but their slow iterative denoising process and prohibitive quadratic attention costs for long sequences create significant inference bottlenecks. While both ste… \ Source • arXiv cs.LG • 17:58
- Advancing Autonomous Incident Response: Leveraging LLMs and Cyber Threat Intelligence \ Effective incident response (IR) is critical for mitigating cyber threats, yet security teams are overwhelmed by alert fatigue, high false-positive rates, and the vast volume of unstructured Cyber Threat Intelligence (CTI) documents. While CT… \ Source • arXiv cs.LG • 16:20
- CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks \ Large Language Models (LLMs) have significantly advanced the state-of-the-art in various coding tasks. Beyond directly answering user queries, LLMs can also serve as judges, assessing and comparing the quality of responses generated by other … \ Source • arXiv cs.CL • 19:58
- BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache \ The rise of long-context Large Language Models (LLMs) amplifies memory and bandwidth demands during autoregressive decoding, as the Key-Value (KV) cache grows with each generated token. Low-bit KV-cache quantization (e.g., 4-bit or 2-bit) can… \ Source • arXiv cs.CL • 17:37
- AF-MAT: Aspect-aware Flip-and-Fuse xLSTM for Aspect-based Sentiment Analysis \ Aspect-based Sentiment Analysis (ABSA) is a crucial NLP task that extracts fine-grained opinions and sentiments from text, such as product reviews and customer feedback. Existing methods often trade off efficiency for performance: traditional… \ Source • arXiv cs.CL • 17:34
- LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization \ Large reasoning models have achieved remarkable performance through extended chain-of-thought sequences, yet this computational freedom leads to excessive token generation even for simple problems. We present Length-Adaptive Policy Optimizati… \ Source • arXiv cs.CL • 10:13
- ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning \ Narrative comprehension on long stories and novels has been a challenging domain attributed to their intricate plotlines and entangled, often evolving relations among characters and entities. Given the LLM's diminished reasoning over extended… \ Source • arXiv cs.CL • 09:52
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.