AI Research Brief

April 21, 2026

Agents Ignore Answers Placed in Plain Sight

  • Cohere Puts the Solution Directly in the Agent's Reading Path and the Agent Still Follows Its Own Reasoning Trace. On Terminal-Bench, agents encountered the shortcut in 79-81% of runs but acted on it only 37-50% of the time; on AppWorld, fewer than 7% of agents that read the hint actually called it.
  • SkillFlow Shifts Agent Evaluation From "Can You Use Tools" to "Can You Build Skills Over a Lifetime." 166 tasks across 20 families expose lifetime-level failures; Kimi K2.5 hit 66.87% skill usage for a +0.60 point gain.
  • JuRe Takes Second on TSB-AD With a 128-Dim Depthwise-Separable Conv Block. No attention, no latent variables, no adversarial components; ablations show training perturbations drive the gap, not network capacity.
  • MedFocusLeak Injects Invisible Perturbations Into Non-Diagnostic Regions of Medical Scans. SOTA attack success across six imaging modalities, with black-box transfer between medical VLMs.

Also Notable

  • Position Paper Calls Flat-Fact Memory APIs AI's Most Critical Architecture Flaw — proposes an independent continuity layer that carries what the model already understood.
  • Tsinghua AnchorMem Splits Memory Into Anchored Facts and Associative Contexts — avoids the frequent-rewrite path used by A-Mem and Mem0.
  • HSG Moves Scene Graphs From Euclidean to Hyperbolic Space — explicitly represents place-object hierarchy for multi-view and 3D scene reasoning.
  • Dynamic Compute Depth Per Position for Visual Autoregressive Models — CVPR paper presented as an alternative to hard pruning.
  • Systematic Survey of LLM Reinforcement Learning Under Data Scarcity — ACL paper focused on the cost of obtaining external supervision signals.
  • LLM Calibration on Medical QA Skews Across Sexual Orientation and Religion Markers — ACL paper; the bias shows up in confidence, not accuracy.
  • ThreadSumm Frames Nested Discussion Summarization as Hierarchical Reasoning — ACL paper using tree of thoughts for interleaved replies and overlapping topics.
  • LookasideVLN Adds Orientation Awareness to Drone VLN — CVPR paper improving natural-language navigation in urban environments.
  • Adaptive Masking Locates Sentiment and Rhetoric Neurons in LLMs — ACL paper offering controllable steering of generation direction.
  • PBSBench Targets Single-Cell Morphology in Blood Smears Rather Than Tissue Structure — CVPR paper providing a multi-level VLM framework and benchmark for whole-slide images.
