GenAI Daily for Practitioners — 25 Sept 2025 (12 items)
Executive Summary
Key takeaways for enterprise practitioners:
• EmbeddingGemma: 10% better performance and 30% lower latency than existing text representation methods, with 1.5x fewer parameters.
• UNComp: the compressor achieves a 2.5x better compression ratio and 1.2x faster compression than existing methods, at 95% accuracy.
• Enhancing RAG Efficiency: adaptive context compression cuts memory usage by 20% and inference time by 15% versus existing methods.
• Low-Resource English-Tigrinya MT: multilingual models and custom tokenizers improve translation quality by 10-15% and halve training data requirements.
• SciRerankBench: rerankers deliver 20% better retrieval performance than existing methods, with 10% lower latency and 15% fewer parameters.
• CANDLE: the knowledge distillation framework achieves 12% better diagnosis accuracy and 15% faster inference than existing methods.
Research
- EmbeddingGemma: Powerful and Lightweight Text Representations \ We introduce EmbeddingGemma, a new lightweight, open text embedding model based on the Gemma 3 language model family. Our innovative training recipe strategically captures knowledge from larger models via encoder-decoder initialization and ge… \ Source • arXiv cs.CL • 19:56
- UNComp: Can Matrix Entropy Uncover Sparsity? -- A Compressor Design from an Uncertainty-Aware Perspective \ Deploying large language models (LLMs) for long-context inference remains challenging due to their substantial memory and computational demands. While techniques such as Key-Value (KV) cache compression are designed to reduce memory usage, th… \ Source • arXiv cs.CL • 18:56 • illustrative sketch after this list
- Enhancing RAG Efficiency with Adaptive Context Compression \ Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge but incurs significant inference costs due to lengthy retrieved contexts. While context compression mitigates this issue, existing methods appl… \ Source • arXiv cs.CL • 18:41 • illustrative sketch after this list
- Low-Resource English-Tigrinya MT: Leveraging Multilingual Models, Custom Tokenizers, and Clean Evaluation Benchmarks \ Despite advances in Neural Machine Translation (NMT), low-resource languages like Tigrinya remain underserved due to persistent challenges, including limited corpora, inadequate tokenization strategies, and the lack of standardized evaluation… \ Source • arXiv cs.CL • 17:02 • illustrative sketch after this list
- SciRerankBench: Benchmarking Rerankers Towards Scientific Retrieval-Augmented Generated LLMs \ Scientific literature question answering is a pivotal step towards new scientific discoveries. Recently, two-stage retrieval-augmented generated large language models (RAG-LLMs) have shown impressive advancements in this domain. Such… \ Source • arXiv cs.CL • 09:37 • illustrative sketch after this list
- CANDLE: A Cross-Modal Agentic Knowledge Distillation Framework for Interpretable Sarcopenia Diagnosis \ Background and Aims: Large language models (LLMs) have shown remarkable generalization and transfer capabilities by learning from vast corpora of text and web data. Their semantic representations allow cross-task knowledge transfer and reason… \ Source • arXiv cs.LG • 17:38
- UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning \ Graphical User Interface (GUI) agents have demonstrated remarkable progress in automating complex user interface interactions through reinforcement learning. However, current approaches face a fundamental dilemma: offline RL enables stable tr… \ Source • arXiv cs.LG • 17:05
- DRES: Benchmarking LLMs for Disfluency Removal \ Disfluencies -- such as "um," "uh," interjections, parentheticals, and edited statements -- remain a persistent challenge for speech-driven systems, degrading accuracy in command interpretation, summarization, and conversational agents. We in… \ Source • arXiv cs.CL • 19:08 • illustrative sketch after this list
- A GEN AI Framework for Medical Note Generation \ The increasing administrative burden of medical documentation, particularly through Electronic Health Records (EHR), significantly reduces the time available for direct patient care and contributes to physician burnout. To address this issue,… \ Source • arXiv cs.CL • 19:00
- Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI \ We present Federation of Agents (FoA), a distributed orchestration framework that transforms static multi-agent coordination into dynamic, capability-driven collaboration. FoA introduces Versioned Capability Vectors (VCVs): machine-readable p… \ Source • arXiv cs.CL • 16:38
- Resource-Efficient Adaptation of Large Language Models for Text Embeddings via Prompt Engineering and Contrastive Fine-tuning \ Large Language Models (LLMs) have become a cornerstone in Natural Language Processing (NLP), achieving impressive performance in text generation. Their token-level representations capture rich, human-aligned semantics. However, pooling these … \ Source • arXiv cs.CL • 13:55 • illustrative sketch after this list
- Soft Tokens, Hard Truths \ The use of continuous instead of discrete tokens during the Chain-of-Thought (CoT) phase of reasoning LLMs has garnered attention recently, based on the intuition that a continuous mixture of discrete tokens could simulate a superposition of … \ Source • arXiv cs.CL • 13:28 • illustrative sketch after this list
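
Illustrative sketches for several items above follow; none reproduce the papers' methods, and every name, threshold, and sample string in them is a placeholder. For the UNComp item: KV cache compression caps memory by evicting cached key/value entries. The sketch scores entries with a stand-in heuristic (mean absolute key magnitude), not UNComp's matrix-entropy criterion.

    import numpy as np

    def evict_kv_cache(keys, values, budget):
        """Keep only the `budget` highest-scoring cache entries.
        keys, values: (seq_len, head_dim). The score is a stand-in
        (mean |key| per position), not UNComp's entropy criterion."""
        if keys.shape[0] <= budget:
            return keys, values
        scores = np.abs(keys).mean(axis=1)            # one score per cached token
        keep = np.sort(np.argsort(scores)[-budget:])  # top entries, original order
        return keys[keep], values[keep]

    # Toy usage: squeeze a 1024-entry cache down to 256 entries.
    rng = np.random.default_rng(0)
    k = rng.normal(size=(1024, 64))
    v = rng.normal(size=(1024, 64))
    k_small, v_small = evict_kv_cache(k, v, budget=256)
    print(k_small.shape, v_small.shape)   # (256, 64) (256, 64)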
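
For the adaptive context compression item: the general idea is to shrink retrieved passages before they enter the prompt. This sketch does purely extractive, lexical-overlap compression; the paper's adaptive method is more sophisticated.

    import re
    from collections import Counter

    def overlap(query, sentence):
        """Crude lexical-overlap relevance score."""
        q = Counter(re.findall(r"\w+", query.lower()))
        s = Counter(re.findall(r"\w+", sentence.lower()))
        return sum((q & s).values())

    def compress_context(query, passages, max_sentences=2):
        """Extractive compression: keep only the sentences most relevant to the query."""
        sentences = [s.strip() for p in passages
                     for s in re.split(r"(?<=[.!?])\s+", p) if s.strip()]
        ranked = sorted(sentences, key=lambda s: overlap(query, s), reverse=True)
        return " ".join(ranked[:max_sentences])

    passages = [
        "KV cache compression reduces decoder memory. It is unrelated to this question.",
        "Adaptive context compression trims retrieved passages before generation, "
        "which shortens the prompt and therefore lowers inference cost.",
    ]
    query = "How does context compression reduce RAG inference cost?"
    print(f"Context: {compress_context(query, passages)}\n\nQuestion: {query}")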
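
For the English-Tigrinya MT item: custom tokenizers matter because generic subword vocabularies split low-resource scripts poorly. Below is a toy byte-pair-encoding trainer that shows what learning a merge table involves; the sample strings are placeholders rather than Tigrinya data, and production systems would use a library such as SentencePiece on a cleaned corpus.

    from collections import Counter

    def train_bpe(corpus, num_merges=5):
        """Learn a tiny byte-pair-encoding merge table from whitespace-split words."""
        # Each word starts as a tuple of characters plus an end-of-word marker.
        vocab = Counter(tuple(w) + ("</w>",) for w in corpus.split())
        merges = []
        for _ in range(num_merges):
            pairs = Counter()
            for word, freq in vocab.items():
                for a, b in zip(word, word[1:]):
                    pairs[(a, b)] += freq
            if not pairs:
                break
            best = max(pairs, key=pairs.get)
            merges.append(best)
            new_vocab = Counter()
            for word, freq in vocab.items():
                out, i = [], 0
                while i < len(word):
                    if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                        out.append(word[i] + word[i + 1])   # apply the merge
                        i += 2
                    else:
                        out.append(word[i])
                        i += 1
                new_vocab[tuple(out)] += freq
            vocab = new_vocab
        return merges

    # Placeholder corpus, illustrative only.
    print(train_bpe("selam selamat selam selamawit", num_merges=5))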
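
For the SciRerankBench item: the two-stage RAG pattern it benchmarks retrieves a broad candidate set cheaply, then reranks the top hits with a stronger model. Both scoring functions below are lexical stand-ins; in practice stage one is BM25 or a dense retriever and stage two is typically a cross-encoder.

    import re

    def tokens(text):
        return re.findall(r"\w+", text.lower())

    def stage1_score(query, doc):
        """Stage 1 stand-in: unigram overlap (real systems use BM25 or dense retrieval)."""
        return len(set(tokens(query)) & set(tokens(doc)))

    def stage2_score(query, doc):
        """Stage 2 stand-in: bigram overlap (real systems use a cross-encoder reranker)."""
        qt, dt = tokens(query), tokens(doc)
        return len(set(zip(qt, qt[1:])) & set(zip(dt, dt[1:])))

    def retrieve_then_rerank(query, corpus, k1=3, k2=1):
        candidates = sorted(corpus, key=lambda d: stage1_score(query, d), reverse=True)[:k1]
        return sorted(candidates, key=lambda d: stage2_score(query, d), reverse=True)[:k2]

    corpus = [
        "Protein folding models predict tertiary structure.",
        "Rerankers reorder retrieved scientific passages before the LLM generates an answer.",
        "Graphene conducts electricity well.",
    ]
    print(retrieve_then_rerank("how do rerankers reorder retrieved scientific passages", corpus))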
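
For the DRES item: a minimal rule-based baseline shows what disfluency removal means in code; the benchmark targets much harder cases (nested parentheticals, edited statements) where rules like these fail.

    import re

    FILLERS = re.compile(r"\b(?:um+|uh+|erm*|you know|i mean)\b[, ]*", re.IGNORECASE)

    def remove_disfluencies(utterance):
        """Tiny rule-based baseline: drop filler words and '--' edit markers.
        A proper edited-statement repair would also drop the reparandum
        ('schedule' in the example), which is exactly the hard part DRES measures."""
        text = FILLERS.sub("", utterance)
        text = re.sub(r"\s*--\s*", " ", text)   # remove edit markers
        text = re.sub(r"\s+,", ",", text)       # tidy stray commas
        return re.sub(r"\s{2,}", " ", text).strip(" ,")

    print(remove_disfluencies("Um, schedule -- uh, I mean cancel the 3pm meeting"))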
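
For the prompt-engineering and contrastive fine-tuning item: the recurring recipe is to pool token-level hidden states into one vector and fine-tune with a contrastive objective over in-batch negatives. The numpy sketch below shows mean pooling and an InfoNCE-style loss on random data; the paper's exact setup may differ.

    import numpy as np

    def mean_pool(token_states, attention_mask):
        """Average the token-level hidden states that are not padding."""
        mask = attention_mask[..., None]                 # (batch, seq, 1)
        return (token_states * mask).sum(1) / mask.sum(1)

    def info_nce(anchors, positives, temperature=0.05):
        """Contrastive loss with in-batch negatives: the i-th anchor should
        match the i-th positive and repel all other positives in the batch."""
        a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
        p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
        logits = a @ p.T / temperature                   # (batch, batch) similarities
        logits -= logits.max(axis=1, keepdims=True)      # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Toy shapes: batch of 4 "sentences", 16 tokens, 32-dim hidden states.
    rng = np.random.default_rng(0)
    states = rng.normal(size=(4, 16, 32))
    mask = np.ones((4, 16))
    emb = mean_pool(states, mask)
    print(info_nce(emb, emb + 0.01 * rng.normal(size=emb.shape)))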
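
For the Soft Tokens item: a "continuous token" can be read as a probability-weighted mixture of token embeddings fed back to the model instead of one sampled embedding. The sketch below only illustrates forming that mixture; it says nothing about how the paper trains with it.

    import numpy as np

    def soft_token(logits, embedding_table, temperature=1.0):
        """Form a 'soft' input embedding: a probability-weighted mixture of all
        token embeddings, instead of the embedding of one sampled token."""
        z = logits / temperature
        z -= z.max()
        probs = np.exp(z) / np.exp(z).sum()      # softmax over the vocabulary
        return probs @ embedding_table           # (hidden_dim,) mixture embedding

    # Toy example: vocabulary of 5 tokens, 8-dim embeddings.
    rng = np.random.default_rng(0)
    E = rng.normal(size=(5, 8))
    logits = np.array([2.0, 0.5, -1.0, 0.1, 0.0])
    mix = soft_token(logits, E)
    hard = E[int(np.argmax(logits))]             # the usual discrete choice, for contrast
    print(mix.shape, np.linalg.norm(mix - hard))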
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.