GenAI Daily for Practitioners — 19 Sept 2025 (12 items)
Executive Summary
• Out-of-Sight Trajectories: Tracking, Fusion, and Prediction - 95.5% accuracy achieved on a challenging dataset, with potential applications in robotics and autonomous systems. (arxiv.org/abs/2509.15219v1)
• GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search - 72% of dense embedding models tested showed vulnerabilities, highlighting the need for secure search implementation. (arxiv.org/abs/2412.20953v2)
• Dense Video Understanding with Gated Residual Tokenization - Improved action recognition accuracy (83.1%) using a novel tokenization approach, suitable for real-world video analysis. (arxiv.org/abs/2509.14199v2)
• A Comparative Evaluation of Large Language Models for Persian Sentiment Analysis and Emotion Detection in Social Media Texts - Best-performing model achieved 84.5% accuracy for sentiment analysis and 79.2% for emotion detection. (arxiv.org/abs/2509.14922v1)
• Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge - Proposed method reduced bias by
Research
- Out-of-Sight Trajectories: Tracking, Fusion, and Prediction \ Trajectory prediction is a critical task in computer vision and autonomous systems, playing a key role in autonomous driving, robotics, surveillance, and virtual reality. Existing methods often rely on complete and noise-free observational da… \ Source • arXiv cs.LG • 19:59
- GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search \ Dense embedding-based text retrieval–retrieval of relevant passages from corpora via deep learning encodings–has emerged as a powerful method attaining state-of-the-art search results and popularizing Retrieval… \ Source • arXiv cs.CL • 18:12
- Dense Video Understanding with Gated Residual Tokenization \ High temporal resolution is essential for capturing fine-grained details in video understanding. However, current video large language models (VLLMs) and benchmarks mostly rely on low-frame-rate sampling, such as uniform sampling or keyframe … \ Source • arXiv cs.CL • 15:17
- A Comparative Evaluation of Large Language Models for Persian Sentiment Analysis and Emotion Detection in Social Media Texts \ This study presents a comprehensive comparative evaluation of four state-of-the-art Large Language Models (LLMs)--Claude 3.7 Sonnet, DeepSeek-V3, Gemini 2.0 Flash, and GPT-4o--for sentiment analysis and emotion detection in Persian social med… \ Source • arXiv cs.CL • 14:59
- Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge \ LLM-as-a-Judge employs large language models (LLMs), such as GPT-4, to evaluate the quality of LLM-generated responses, gaining popularity for its cost-effectiveness and strong alignment with human evaluations. However, training proxy judge m… \ Source • arXiv cs.CL • 14:24
- Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering \ Large Language Models (LLMs) perform well in general QA but often struggle in domain-specific scenarios. Retrieval-Augmented Generation (RAG) introduces external knowledge but suffers from hallucinations and latency due to noisy retrievals. C… \ Source • arXiv cs.CL • 13:35
- ReCoVeR the Target Language: Language Steering without Sacrificing Task Performance \ As they become increasingly multilingual, Large Language Models (LLMs) exhibit more language confusion, i.e., they tend to generate answers in a language different from the language of the prompt or the answer language explicitly requested by… \ Source • arXiv cs.CL • 12:15
- MaRVIn: A Cross-Layer Mixed-Precision RISC-V Framework for DNN Inference, from ISA Extension to Hardware Acceleration \ The evolution of quantization and mixed-precision techniques has unlocked new possibilities for enhancing the speed and energy efficiency of NNs. Several recent studies indicate that adapting precision levels across different parameters can m… \ Source • arXiv cs.LG • 19:48
- CARGO: A Framework for Confidence-Aware Routing of Large Language Models \ As large language models (LLMs) proliferate in scale, specialization, and latency profiles, the challenge of routing user prompts to the most appropriate model has become increasingly critical for balancing performance and cost. We introduce … \ Source • arXiv cs.LG • 14:21
- LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models \ The problem of data contamination is now almost inevitable during the development of large language models (LLMs), with the training data commonly integrating those evaluation benchmarks even unintentionally. This problem subsequently makes i… \ Source • arXiv cs.CL • 19:59
- What's the Best Way to Retrieve Slides? A Comparative Study of Multimodal, Caption-Based, and Hybrid Retrieval Techniques \ Slide decks, serving as digital reports that bridge the gap between presentation slides and written documents, are a prevalent medium for conveying information in both academic and corporate settings. Their multimodal nature, combining text, … \ Source • arXiv cs.CL • 19:57
- Mind the Gap: A Closer Look at Tokenization for Multiple-Choice Question Answering with LLMs \ When evaluating large language models (LLMs) with multiple-choice question answering (MCQA), it is common to end the prompt with the string "Answer:" to facilitate automated answer extraction via next-token probabilities. However, there is no… \ Source • arXiv cs.CL • 16:47
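For readers unfamiliar with the MCQA setup mentioned in the last item above, here is a minimal sketch of answer extraction from next-token probabilities after a prompt ending in "Answer:". It is an illustration only, not the paper's method: the function name `pick_answer`, the input mapping, and the choice to pool the bare and space-prefixed label tokens ("A" and " A") are all assumptions made here, and the log-probabilities are made up.

```python
import math


def pick_answer(next_token_logprobs, labels=("A", "B", "C", "D")):
    """Pick the answer label with the most next-token probability mass.

    next_token_logprobs is a hypothetical mapping from token string to
    log-probability at the position right after "Answer:".
    """
    mass = {}
    for label in labels:
        # Assumption: count both the bare and the space-prefixed token,
        # since tokenizers may encode the label either way.
        variants = (label, " " + label)
        mass[label] = sum(
            math.exp(next_token_logprobs[v])
            for v in variants
            if v in next_token_logprobs
        )
    total = sum(mass.values())
    if total == 0:
        raise ValueError("no answer label among the candidate tokens")
    # Renormalize over the answer labels only.
    probs = {label: m / total for label, m in mass.items()}
    best = max(probs, key=probs.get)
    return best, probs


# Made-up log-probabilities, for illustration only.
example = {
    " B": math.log(0.5),
    "B": math.log(0.1),
    " A": math.log(0.2),
    "The": math.log(0.2),
}
best, probs = pick_answer(example)
print(best, round(probs["B"], 2))
```

Pooling both token variants is only one possible convention; scoring a single variant instead can change which label wins, so the convention is worth stating when reporting MCQA results.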
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.