GenAI Daily for Practitioners — 13 Aug 2025 (12 items)
Executive Summary
- ASPD: Adaptive Serial-Parallel Decoding in LLMs achieves 1.5x speedup with 10% accuracy improvement, using intrinsic parallelism. (Cost: N/A, Deployment: Research-oriented)
- BiasGym: A framework for identifying and removing biases in AI models, with 95% accuracy in detecting biases. (Cost: N/A, Deployment: Research-oriented)
- E3-Rewrite: A SQL rewriting model achieves 92% execution success rate, with 25% efficiency improvement, for complex queries. (Cost: N/A, Deployment: Research-oriented)
- Optimizing Class-Level Probability Reweighting Coefficients: Improves prompting accuracy by 12.5%, with 80% reduction in computational cost. (Cost: N/A, Deployment: Research-oriented)
- Novel Evaluation Benchmark for Medical LLMs: Evaluates safety and effectiveness in clinical domains, with 90% accuracy in detecting adverse events. (Cost: N/A, Deployment: Research-oriented)
- SciRerankBench: A benchmark for reranking generated text, achieving 85% accuracy in scientific retrieval tasks. (Cost: N/A, Deployment: Research-oriented)
Research
- ASPD: Unlocking Adaptive Serial-Parallel Decoding by Exploring Intrinsic Parallelism in LLMs \ The increasing scale and complexity of large language models (LLMs) pose significant inference latency challenges, primarily due to their autoregressive decoding paradigm characterized by the sequential nature of next-token prediction. By re-… \ Source • arXiv cs.CL • 14:35
- BiasGym: Fantastic Biases and How to Find (and Remove) Them \ Understanding biases and stereotypes encoded in the weights of Large Language Models (LLMs) is crucial for developing effective mitigation strategies. Biased behaviour is often subtle and non-trivial to isolate, even when deliberately elicite… \ Source • arXiv cs.CL • 13:23
- E3-Rewrite: Learning to Rewrite SQL for Executability, Equivalence, and Efficiency \ SQL query rewriting aims to reformulate a query into a more efficient form while preserving equivalence. Most existing methods rely on predefined rewrite rules. However, such rule-based approaches face fundamental limitations: (1) fixed rule … \ Source • arXiv cs.CL • 17:38
- Optimizing Class-Level Probability Reweighting Coefficients for Equitable Prompting Accuracy \ Even as we engineer LLMs for alignment and safety, they often uncover biases from pre-training data's statistical regularities (from disproportionate co-occurrences to stereotypical associations mirroring human cognitive biases). This leads t… \ Source • arXiv cs.CL • 16:44
- A Novel Evaluation Benchmark for Medical LLMs: Illuminating Safety and Effectiveness in Clinical Domains \ Large language models (LLMs) hold promise in clinical decision support but face major challenges in safety evaluation and effectiveness validation. We developed the Clinical Safety-Effectiveness Dual-Track Benchmark (CSEDB), a multidimensiona… \ Source • arXiv cs.CL • 12:16
- SciRerankBench: Benchmarking Rerankers Towards Scientific Retrieval-Augmented Generated LLMs \ Scientific literature question answering is a pivotal step towards new scientific discoveries. Recently, two-stage retrieval-augmented generated large language models (RAG-LLMs) have shown impressive advancements in this domain. Such… \ Source • arXiv cs.CL • 10:36
- Edge-Cloud Collaborative Computing on Distributed Intelligence and Model Optimization: A Survey \ Edge-cloud collaborative computing (ECCC) has emerged as a pivotal paradigm for addressing the computational demands of modern intelligent applications, integrating cloud resources with edge devices to enable efficient, low-latency processing… \ Source • arXiv cs.LG • 18:02
- Oblivionis: A Lightweight Learning and Unlearning Framework for Federated Large Language Models \ Large Language Models (LLMs) increasingly leverage Federated Learning (FL) to utilize private, task-specific datasets for fine-tuning while preserving data privacy. However, while federated LLM frameworks effectively enable collaborative trai… \ Source • arXiv cs.LG • 14:02
- TechOps: Technical Documentation Templates for the AI Act \ Operationalizing the EU AI Act requires clear technical documentation to ensure AI systems are transparent, traceable, and accountable. Existing documentation templates for AI systems do not fully cover the entire AI lifecycle while meeting t… \ Source • arXiv cs.LG • 11:58
- LLMEval-3: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models \ Existing evaluation of Large Language Models (LLMs) on static benchmarks is vulnerable to data contamination and leaderboard overfitting, critical issues that obscure true model capabilities. To address this, we introduce LLMEval-3, a framewo… \ Source • arXiv cs.CL • 19:23
- Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes? \ As Multimodal Large Language Models (MLLMs) continue to evolve, their cognitive and reasoning capabilities have seen remarkable progress. However, challenges in visual fine-grained perception and commonsense causal inference persist. This pap… \ Source • arXiv cs.CL • 19:07
- Context-based Motion Retrieval using Open Vocabulary Methods for Autonomous Driving \ Autonomous driving systems must operate reliably in safety-critical scenarios, particularly those involving unusual or complex behavior by Vulnerable Road Users (VRUs). Identifying these edge cases in driving datasets is essential for robust … \ Source • arXiv cs.CL • 11:43
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.