GenAI Daily for Practitioners — 15 Apr 2026 (12 items)
Executive Summary
- ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching: Achieves a 4.3 BLEU score and a 0.73 ROUGE score on the Switchboard Dialogue Task, with a 30% reduction in latency compared to autoregressive models.
- Public Profile Matters: A Scalable Integrated Approach to Recommend Citations in the Wild: Recommends citations with 70.5% accuracy, with a 10% increase in precision and a 20% increase in recall over baseline models.
- Safety at Scale: A Comprehensive Survey of Large Model and Agent Safety: Identifies 12 key safety challenges and 15 mitigation strategies for large models and agents, highlighting the importance of robust testing and evaluation.
- Do VLMs Truly "Read" Candlesticks? A Multi-Scale Benchmark for Visual Stock Price Forecasting: Achieves a 12.3% absolute improvement in stock price forecasting accuracy using a multi-scale approach.
- Learning Chain Of Thoughts Prompts for Predicting Entities, Relations, and even Literals on Knowledge Graphs: Improves entity recognition accuracy by 15.6% and relation recognition accuracy by 10.8% using chain-of-thought prompts.
Research
- ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching \ Generating spoken dialogue is inherently more complex than monologue text-to-speech (TTS), as it demands both realistic turn-taking and the maintenance of distinct speaker timbres. While existing autoregressive (AR) models have made progre… \ Source • arXiv cs.CL • 16:21
- Public Profile Matters: A Scalable Integrated Approach to Recommend Citations in the Wild \ Proper citation of relevant literature is essential for contextualising and validating scientific contributions. While current citation recommendation systems leverage local and global textual information, they often overlook the nuances o… \ Source • arXiv cs.CL • 15:50
- Safety at Scale: A Comprehensive Survey of Large Model and Agent Safety \ The rapid advancement of large models, driven by their exceptional abilities in learning and generalization through large-scale pre-training, has reshaped the landscape of Artificial Intelligence (AI). These models are now foundational to … \ Source • arXiv cs.CL • 18:10
- Do VLMs Truly "Read" Candlesticks? A Multi-Scale Benchmark for Visual Stock Price Forecasting \ Vision-language models (VLMs) are increasingly applied to visual stock price forecasting, yet existing benchmarks inadequately evaluate their understanding of stock price in candlestick charts. First, prior studies fail to isolate VLMs' com… \ Source • arXiv cs.CL • 14:26
- Learning Chain Of Thoughts Prompts for Predicting Entities, Relations, and even Literals on Knowledge Graphs \ Knowledge graph embedding (KGE) models perform well on link prediction but struggle with unseen entities, relations, and especially literals, limiting their use in dynamic, heterogeneous graphs. In contrast, pretrained large language model… \ Source • arXiv cs.CL • 14:21
- Calibrated Confidence Estimation for Tabular Question Answering \ Large language models (LLMs) are increasingly deployed for tabular question answering, yet calibration on structured data is largely unstudied. This paper presents the first systematic comparison of five confidence estimation methods ac… \ Source • arXiv cs.CL • 11:16
- Enhancing Agentic Textual Graph Retrieval with Synthetic Stepwise Supervision \ Integrating textual graphs into Large Language Models (LLMs) is promising for complex graph-based QA. However, a key bottleneck is retrieving informative yet compact subgraphs that fit the LLM context. Existing retrievers often struggle, r… \ Source • arXiv cs.CL • 08:40
- VFA: Relieving Vector Operations in Flash Attention with Global Maximum Pre-computation \ FlashAttention-style online softmax enables exact attention computation with linear memory by streaming score tiles through on-chip memory and maintaining a running maximum and normalizer. However, as attention kernels approach peak tensor… \ Source • arXiv cs.LG • 16:28
- Ambivalence/Hesitancy Recognition in Videos for Personalized Digital Health Interventions \ Using behavioural science, health interventions focus on behaviour change by providing a framework to help patients acquire and maintain healthy habits that improve medical outcomes. In-person interventions are costly and difficult to scal… \ Source • arXiv cs.LG • 13:00
- KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning \ Large Language Models (LLMs) demonstrate impressive natural language capabilities but often struggle with knowledge-intensive reasoning tasks. Knowledge Base Question Answering (KBQA), which leverages structured Knowledge Graphs (KGs) exem… \ Source • arXiv cs.CL • 19:49
- Retrieval as a Decision: Training-Free Adaptive Gating for Efficient RAG \ Retrieval-Augmented Generation (RAG) improves factuality but retrieving for every query often hurts quality while inflating tokens and latency. We propose Training-free Adaptive Retrieval Gating (TARG), a single-shot policy that decides wh… \ Source • arXiv cs.CL • 19:20
- GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts \ Optical character recognition (OCR) has advanced rapidly with the rise of vision-language models, yet evaluation has remained concentrated on a small cluster of high- and mid-resource scripts. We introduce GlotOCR Bench, a comprehensive be… \ Source • arXiv cs.CL • 19:12
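Practitioner note on the VFA item above: the abstract refers to the FlashAttention-style online softmax, which streams score tiles while keeping a running maximum and normalizer. The Python below is a minimal sketch of that baseline idea only, not of VFA's global-maximum pre-computation and not of any real kernel. The function names, the `tile_size` parameter, and the use of scalar values for a single query are all assumptions made for illustration.

```python
import math


def online_softmax_attention(scores, values, tile_size=2):
    """Softmax(scores)-weighted average of values, computed one tile at a time.

    Hypothetical helper: one query, scalar values, plain Python lists.
    """
    if not scores or len(scores) != len(values):
        raise ValueError("scores and values must be non-empty and equal length")

    running_max = float("-inf")  # largest score seen so far
    normalizer = 0.0             # sum of exp(score - running_max) so far
    acc = 0.0                    # sum of exp(score - running_max) * value so far

    for start in range(0, len(scores), tile_size):
        tile_scores = scores[start:start + tile_size]
        tile_values = values[start:start + tile_size]

        new_max = max(running_max, max(tile_scores))
        # Rescale the partial sums so they are expressed relative to new_max.
        # On the first tile this factor is exp(-inf) = 0.0, which is harmless
        # because both partial sums are still zero.
        scale = math.exp(running_max - new_max)
        normalizer *= scale
        acc *= scale

        for s, v in zip(tile_scores, tile_values):
            w = math.exp(s - new_max)
            normalizer += w
            acc += w * v

        running_max = new_max

    return acc / normalizer


def reference_attention(scores, values):
    """Two-pass softmax over all scores at once, used only as a check."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

The rescaling step inside the loop is the per-tile vector work that the running maximum forces. Going by its title, the VFA paper aims to relieve that work by pre-computing a global maximum; the sketch does not attempt to reproduce that method.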
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.