GenAI Daily for Practitioners — 23 Apr 2026 (12 items)
Executive Summary
- BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching
  + Achieves a 1.5x-2.5x inference speedup for large batched LLM workloads
  + Reduces memory usage by up to 50%
  + Potential for improved deployment efficiency
- Anchor-and-Resume Concession Under Dynamic Pricing for LLM-Augmented Freight Negotiation
  + Introduces a novel negotiation strategy for LLM-augmented freight negotiation
Research
- BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching \ Large language models (LLMs) increasingly play an important role in a wide range of information processing and management tasks in industry. Many of these tasks are performed in large batches or even offline, and the performance indicator … \ Source • arXiv cs.CL • 17:33
- Anchor-and-Resume Concession Under Dynamic Pricing for LLM-Augmented Freight Negotiation \ Freight brokerages negotiate thousands of carrier rates daily under dynamic pricing conditions where models frequently revise targets mid-conversation. Classical time-dependent concession frameworks use a fixed shape parameter β that can… \ Source • arXiv cs.CL • 18:17
- SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation \ Paralinguistic cues are essential for natural human-computer interaction, yet their evaluation in Large Audio-Language Models (LALMs) remains limited by coarse feature coverage and the inherent subjectivity of assessment. To address these … \ Source • arXiv cs.CL • 19:59
- Self-Aware Vector Embeddings for Retrieval-Augmented Generation: A Neuroscience-Inspired Framework for Temporal, Confidence-Weighted, and Relational Knowledge \ Modern retrieval-augmented generation (RAG) systems treat vector embeddings as static, context-free artifacts: an embedding has no notion of when it was created, how trustworthy its source is, or which other embeddings depend on it. This f… \ Source • arXiv cs.CL • 16:13
- The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning \ Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We study this through a diagnose-measure-bridge-treat framework. Causal-behavioral analysis of the "car wash problem"… \ Source • arXiv cs.CL • 14:02
- Enhancing Agentic Textual Graph Retrieval with Synthetic Stepwise Supervision \ Integrating textual graphs into Large Language Models (LLMs) is promising for complex graph-based QA. However, a key bottleneck is retrieving informative yet compact subgraphs that fit the LLM context. Existing retrievers often struggle, r… \ Source • arXiv cs.CL • 13:22
- NeuroSymActive: Differentiable Neural-Symbolic Reasoning with Active Exploration for Knowledge Graph Question Answering \ Large pretrained language models and neural reasoning systems have advanced many natural language tasks, yet they remain challenged by knowledge-intensive queries that require precise, structured multi-hop inference. Knowledge graphs provi… \ Source • arXiv cs.CL • 10:50
- ActuBench: A Multi-Agent LLM Pipeline for Generation and Evaluation of Actuarial Reasoning Tasks \ We present ActuBench, a multi-agent LLM pipeline for the automated generation and evaluation of advanced actuarial assessment items aligned with the International Actuarial Association (IAA) Education Syllabus. The pipeline separates four … \ Source • arXiv cs.CL • 09:20
- Relative Entropy Estimation in Function Space: Theory and Applications to Trajectory Inference \ Trajectory Inference (TI) seeks to recover latent dynamical processes from snapshot data, where only independent samples from time-indexed marginals are observed. In applications such as single-cell genomics, destructive measurements make … \ Source • arXiv cs.LG • 19:03
- Coverage, Not Averages: Semantic Stratification for Trustworthy Retrieval Evaluation \ Retrieval quality is the primary bottleneck for accuracy and robustness in retrieval-augmented generation (RAG). Current evaluation relies on heuristically constructed query sets, which introduce a hidden intrinsic bias. We formalize retri… \ Source • arXiv cs.LG • 18:49
- Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing \ Adversarial robustness evaluation underpins every claim of trustworthy ML deployment, yet the field suffers from fragmented protocols and undetected gradient masking. We make two contributions. (1) Structured synthesis. We analyze nine pee… \ Source • arXiv cs.LG • 17:46
- Spira: Exploiting Voxel Data Structural Properties for Efficient Sparse Convolution in Point Cloud Networks \ Sparse Convolution (SpC) powers 3D point cloud networks widely used in autonomous driving and augmented/virtual reality. SpC builds a kernel map that stores mappings between input voxel coordinates, output coordinates, and weight offsets, … \ Source • arXiv cs.LG • 16:14
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.