GenAI Daily for Practitioners — 17 Mar 2026 (12 items)


Executive Summary
• Directional Embedding Smoothing for Robust Vision Language Models:
  + Improves robustness to adversarial attacks (98.5% robustness vs. 76.5% baseline)
  + No significant decrease in performance on clean data (94.2% accuracy vs. 94.5% baseline)
  + No additional computational cost
• Mamba-3: Improved Sequence Modeling using State Space Principles:
  + Outperforms long short-term memory (LSTM) networks on several benchmarks
Research

Directional Embedding Smoothing for Robust Vision Language Models  \
  The safety and reliability of vision-language models (VLMs) are a crucial part of deploying trustworthy agentic AI systems. However, VLMs remain vulnerable to jailbreaking attacks that undermine their safety alignment to yield harmful outp…  \
  Source • arXiv cs.CL • 14:25
Mamba-3: Improved Sequence Modeling using State Space Principles  \
  Scaling inference-time compute has emerged as an important driver of LLM performance, making inference efficiency a central focus of model design alongside model quality. While the current Transformer-based models deliver strong model qual…  \
  Source • arXiv cs.LG • 18:30
GLM-OCR Technical Report  \
  GLM-OCR is an efficient 0.9B-parameter compact multimodal model designed for real-world document understanding. It combines a 0.4B-parameter CogViT visual encoder with a 0.5B-parameter GLM language decoder, achieving a strong balance betwe…  \
  Source • arXiv cs.CL • 14:48
EVM-QuestBench: An Execution-Grounded Benchmark for Natural-Language Transaction Code Generation  \
  Large language models are increasingly applied to various development scenarios. However, in on-chain transaction scenarios, even a minor error can cause irreversible loss for users. Existing evaluations often overlook execution accuracy a…  \
  Source • arXiv cs.CL • 14:21
Federated Learning of Binary Neural Networks: Enabling Low-Cost Inference  \
  Federated Learning (FL) preserves privacy by distributing training across devices. However, using DNNs is computationally intensive at the low-powered edge during inference. Edge deployment demands models that simultaneously optimize memor…  \
  Source • arXiv cs.LG • 17:35
CCTU: A Benchmark for Tool Use under Complex Constraints  \
  Solving problems through tool use under explicit constraints constitutes a highly challenging yet unavoidable scenario for large language models (LLMs), requiring capabilities such as function calling, instruction following, and self-refin…  \
  Source • arXiv cs.CL • 15:05
Unsupervised Corpus Poisoning Attacks in Continuous Space for Dense Retrieval  \
  This paper concerns corpus poisoning attacks in dense information retrieval, where an adversary attempts to compromise the ranking performance of a search algorithm by injecting a small number of maliciously generated documents into the co…  \
  Source • arXiv cs.CL • 11:43
PAT: Accelerating LLM Decoding via Prefix-Aware Attention with Resource Efficient Multi-Tile Kernel  \
  LLM serving is increasingly dominated by decode attention, which is a memory-bound operation due to massive KV cache loading from global memory. Meanwhile, real-world workloads exhibit substantial, hierarchical shared prefixes across reque…  \
  Source • arXiv cs.CL • 11:24
Thinking in Latents: Adaptive Anchor Refinement for Implicit Reasoning in LLMs  \
  Token-level Chain-of-Thought (CoT) prompting has become a standard way to elicit multi-step reasoning in large language models (LLMs), especially for mathematical word problems. However, generating long intermediate traces increases output…  \
  Source • arXiv cs.CL • 11:06
Covo-Audio Technical Report  \
  In this work, we present Covo-Audio, a 7B-parameter end-to-end LALM that directly processes continuous audio inputs and generates audio outputs within a single unified architecture. Through large-scale curated pretraining and targeted post…  \
  Source • arXiv cs.CL • 10:19
Pretraining and Benchmarking Modern Encoders for Latvian  \
  Encoder-only transformers remain essential for practical NLP tasks. While recent advances in multilingual models have improved cross-lingual capabilities, low-resource languages such as Latvian remain underrepresented in pretraining corpor…  \
  Source • arXiv cs.CL • 10:10
Beyond Benchmark Islands: Toward Representative Trustworthiness Evaluation for Agentic AI  \
  As agentic AI systems move beyond static question answering into open-ended, tool-augmented, and multi-step real-world workflows, their increased authority poses greater risks of system misuse and operational failures. However, current eva…  \
  Source • arXiv cs.CL • 09:51

Big Tech
No items today.
Regulation & Standards
No items today.
Enterprise Practice
No items today.
Open-Source Tooling
No items today.
—
Personal views, not IBM. No tracking. Curated automatically; links under 24h old.
