GenAI Daily for Practitioners — 19 Nov 2025 (12 items)
Executive Summary
- Subword Tokenization Strategies for Kurdish Word Embeddings: research focuses on optimizing word embeddings for the Kurdish language; no significant takeaways for enterprise practitioners.
- FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning: FLARE achieves 95% client reliability in federated learning; no mention of deployment costs or compliance considerations.
- Towards Efficient Medical Reasoning with Minimal Fine-Tuning Data: fine-tuning with 1,000 samples achieves 85% accuracy; no mention of real-world deployment or scalability.
- SMRC: Aligning Large Language Models with Student Reasoning for Mathematical Error Correction: SMRC improves error correction by 23% using student reasoning; no mention of implementation or cost considerations.
- MedBench v4: A Robust and Scalable Benchmark for Evaluating Chinese Medical Language Models, Multimodal Models, and Intelligent Agents: MedBench v4 provides a benchmark for evaluating Chinese medical AI models; no mention of deployment or compliance considerations.
- Quartet: Native FP4 Training Can Be Optimal for Large Language Models: Quartet training achieves 10% better performance than traditional training; no mention of real-world deployment or cost considerations.
Research
- Subword Tokenization Strategies for Kurdish Word Embeddings \ We investigate tokenization strategies for Kurdish word embeddings by comparing word-level, morpheme-based, and BPE approaches on morphological similarity preservation tasks. We develop a BiLSTM-CRF morphological segmenter using bootstrapp… \ Source • arXiv cs.CL • 18:33
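The abstract compares word-level, morpheme-based, and BPE segmentation. As a refresher on the BPE side of that comparison only, here is a minimal merge-based BPE trainer and segmenter; this is a toy sketch of the standard algorithm, not the paper's BiLSTM-CRF morphological segmenter, and all names are illustrative:

```python
from collections import Counter

def train_bpe(corpus, num_merges):
    """Learn BPE merge rules from a list of words (toy implementation)."""
    # Represent each word as a tuple of symbols, with an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge everywhere in the vocabulary.
        new_vocab = Counter()
        for word, freq in vocab.items():
            merged, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

def segment(word, merges):
    """Segment a new word by replaying the learned merges in order."""
    symbols = list(word) + ["</w>"]
    for a, b in merges:
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols
```

Trained on a corpus dominated by "low", frequent stems merge into single subwords while rarer suffixes stay split, which is the behavior the paper evaluates against morpheme boundaries.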
- \textit{FLARE}: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning \ Federated learning (FL) enables collaborative model training while preserving data privacy. However, it remains vulnerable to malicious clients who compromise model integrity through Byzantine attacks, data poisoning, or adaptive adversari… \ Source • arXiv cs.LG • 18:57
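The snippet does not describe FLARE's actual reputation mechanism; as a generic illustration of the *class* of defense it belongs to, here is a reputation-weighted aggregation sketch. The similarity-to-median heuristic, the decay factor, and all function names are assumptions for illustration, not FLARE's method:

```python
import math

def cosine(u, v):
    """Cosine similarity between two update vectors (plain lists)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def coordinate_median(updates):
    """Coordinate-wise median of client updates, a robust reference point."""
    med = []
    for j in range(len(updates[0])):
        col = sorted(u[j] for u in updates)
        n = len(col)
        med.append(col[n // 2] if n % 2 else 0.5 * (col[n // 2 - 1] + col[n // 2]))
    return med

def aggregate(updates, reputations, decay=0.8):
    """Update per-client reputations, then average updates by reputation."""
    ref = coordinate_median(updates)
    new_rep = []
    for u, r in zip(updates, reputations):
        sim = max(cosine(u, ref), 0.0)  # anti-correlated updates earn nothing
        new_rep.append(decay * r + (1 - decay) * sim)
    total = sum(new_rep)
    weights = [r / total for r in new_rep]
    agg = [sum(w * u[j] for w, u in zip(weights, updates))
           for j in range(len(ref))]
    return agg, new_rep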
- Towards Efficient Medical Reasoning with Minimal Fine-Tuning Data \ Supervised Fine-Tuning (SFT) plays a pivotal role in adapting Large Language Models (LLMs) to specialized domains such as medical reasoning. However, existing SFT practices often rely on unfiltered datasets that contain redundant and low-q… \ Source • arXiv cs.CL • 19:37
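The abstract's complaint is that SFT corpora are redundant and low-quality; its actual selection criterion is not given in the snippet. A generic sketch of the idea it gestures at (score-ranked selection with near-duplicate filtering; the Jaccard heuristic and the `select_sft_subset` name are assumptions) looks like:

```python
def select_sft_subset(samples, scores, budget, sim_threshold=0.7):
    """Pick up to `budget` samples: highest score first, skipping near-duplicates."""
    def jaccard(a, b):
        # Token-set overlap as a cheap redundancy proxy.
        sa, sb = set(a.split()), set(b.split())
        return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

    kept = []
    for sample, _ in sorted(zip(samples, scores), key=lambda p: -p[1]):
        if all(jaccard(sample, k) < sim_threshold for k in kept):
            kept.append(sample)
        if len(kept) == budget:
            break
    return kept
```

With a budget in the low thousands, this kind of filter is how a 1,000-sample SFT set can be carved out of a much larger raw corpus.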
- SMRC: Aligning Large Language Models with Student Reasoning for Mathematical Error Correction \ Large language models (LLMs) often make reasoning errors when solving mathematical problems, and how to automatically detect and correct these errors has become an important research direction. However, existing approaches mainly f… \ Source • arXiv cs.CL • 18:22
- MedBench v4: A Robust and Scalable Benchmark for Evaluating Chinese Medical Language Models, Multimodal Models, and Intelligent Agents \ Recent advances in medical large language models (LLMs), multimodal models, and agents demand evaluation frameworks that reflect real clinical workflows and safety constraints. We present MedBench v4, a nationwide, cloud-based benchmarking… \ Source • arXiv cs.CL • 13:37
- Quartet: Native FP4 Training Can Be Optimal for Large Language Models \ Training large language models (LLMs) directly in low-precision offers a way to address computational costs by improving both throughput and energy efficiency. For those purposes, NVIDIA's recent Blackwell architecture facilitates v… \ Source • arXiv cs.LG • 17:36
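For a sense of how coarse FP4 is: the e2m1 format has only eight representable magnitudes per sign. A toy fake-quantization round-trip (illustrating the precision involved, not Quartet's training recipe; the per-tensor scaling choice is an assumption) shows what survives:

```python
import math

# e2m1 (FP4) representable magnitudes: 3 exponent levels x 1 mantissa bit, plus zero.
FP4_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_fp4(xs):
    """Fake-quantize a list of floats: scale to the FP4 range, round to
    the nearest representable level, scale back."""
    amax = max(abs(x) for x in xs) or 1.0
    scale = amax / 6.0  # map the largest magnitude onto the top level
    out = []
    for x in xs:
        mag = min(abs(x) / scale, 6.0)
        q = min(FP4_LEVELS, key=lambda lv: abs(lv - mag))
        out.append(math.copysign(q * scale, x))
    return out
```

Small values relative to the tensor's maximum collapse to zero, which is why native FP4 training schemes lean on fine-grained scaling blocks and careful recipe design.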
- LiveRAG: A diverse Q&A dataset with varying difficulty level for RAG evaluation \ With Retrieval Augmented Generation (RAG) becoming more and more prominent in generative AI solutions, there is an emerging need for systematically evaluating their effectiveness. We introduce the LiveRAG benchmark, a publicly available da… \ Source • arXiv cs.CL • 15:34
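Whatever metrics LiveRAG ships with (the snippet does not say), retrieval-side RAG evaluation ultimately reduces to comparing retrieved document ids against gold labels per question. A minimal recall@k sketch, with illustrative names:

```python
def recall_at_k(results, gold, k):
    """Fraction of questions where at least one relevant doc appears in the top k.

    results: {question: ranked list of retrieved doc ids}
    gold:    {question: set of relevant doc ids}
    """
    hits = sum(1 for q, docs in results.items() if set(docs[:k]) & gold[q])
    return hits / len(results)
```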
- Towards Authentic Movie Dubbing with Retrieve-Augmented Director-Actor Interaction Learning \ The automatic movie dubbing model generates vivid speech from given scripts, replicating a speaker's timbre from a brief timbre prompt while ensuring lip-sync with the silent video. Existing approaches simulate a simplified workflow where … \ Source • arXiv cs.CL • 09:39
- MOON: Generative MLLM-based Multimodal Representation Learning for E-commerce Product Understanding \ With the rapid advancement of e-commerce, exploring general representations rather than task-specific ones has attracted increasing research attention. For product understanding, although existing discriminative dual-flow architectures dri… \ Source • arXiv cs.LG • 18:05
- MOON Embedding: Multimodal Representation Learning for E-commerce Search Advertising \ We introduce MOON, our comprehensive set of sustainable iterative practices for multimodal representation learning for e-commerce applications. MOON has already been fully deployed across all stages of Taobao search advertising system, inc… \ Source • arXiv cs.LG • 17:56
- Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning \ Reinforcement Learning (RL) has become critical for advancing modern Large Language Models (LLMs), yet existing synchronous RL systems face severe performance bottlenecks. The rollout phase, which dominates end-to-end iteration time, suffe… \ Source • arXiv cs.LG • 17:12
- DeepBlip: Estimating Conditional Average Treatment Effects Over Time \ Structural nested mean models (SNMMs) are a principled approach to estimate the treatment effects over time. A particular strength of SNMMs is to break the joint effect of treatment sequences over time into localized, time-specific ``blip … \ Source • arXiv cs.LG • 15:49
Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
— Personal views, not IBM. No tracking. Curated automatically; links under 24h old.