Geometry Conflict Predicts Continual Fine-Tuning Forgetting
- Geometry Conflict Predicts Continual Fine-Tuning Forgetting. Treating each task's parameter-update covariance as a measurable signal, GCWM predicts forgetting and beats data-free baselines on Qwen3 models from 0.6B to 14B, across both domain and capability continual-learning settings (covariance sketch after this list).
- Full-Cache Is No Longer the Ceiling for KV Eviction. Irrelevant tokens dilute attention in long contexts, so a learnable global-budget eviction policy beats keeping the full cache; KV-cache management is better framed as signal filtering (eviction sketch after this list).
- MLLMs Get Wrecked by Blurry Production Images. Naively mixing degraded samples into RL rollouts triggers reward poisoning; ROMA adds a second forward pass with teacher forcing to handle visual degradation on the training side (teacher-forcing sketch after this list).
- Apple Silicon LLM Kernel Tuning Finally Has a Benchmark. Metal-Sci packages 10 scientific compute tasks, CPU baselines, roofline-anchored fitness, and an evolutionary search harness, with held-out problem sizes to catch silent regressions (fitness sketch after this list).
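On the geometry-conflict item: a minimal sketch of the core measurement, assuming conflict is scored as the overlap between the dominant eigenspaces of two tasks' update covariances. The metric choice, shapes, and toy data are all illustrative, not GCWM's actual recipe.

```python
# Minimal sketch (not the GCWM implementation): estimate geometry conflict
# between two tasks from their parameter-update covariances. The conflict
# metric -- overlap of top principal update directions -- is an assumption.
import numpy as np

def update_covariance(updates: np.ndarray) -> np.ndarray:
    """updates: (n_steps, n_params) matrix of per-step parameter deltas."""
    centered = updates - updates.mean(axis=0, keepdims=True)
    return centered.T @ centered / max(len(updates) - 1, 1)

def geometry_conflict(cov_a, cov_b, k=5):
    """Overlap of the top-k eigenspaces of two update covariances.

    0 = orthogonal update geometries (low forgetting risk under this proxy),
    1 = identical dominant directions (high interference risk).
    """
    _, vecs_a = np.linalg.eigh(cov_a)
    _, vecs_b = np.linalg.eigh(cov_b)
    Ua, Ub = vecs_a[:, -k:], vecs_b[:, -k:]   # top-k eigenvectors (columns)
    # Mean squared canonical correlation between the two subspaces.
    return np.linalg.norm(Ua.T @ Ub) ** 2 / k

rng = np.random.default_rng(0)
task_a = rng.normal(size=(64, 32))                 # toy per-step deltas, task A
task_b = task_a + 0.3 * rng.normal(size=(64, 32))  # correlated deltas, task B
print(geometry_conflict(update_covariance(task_a), update_covariance(task_b)))
```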
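On KV eviction: a sketch of the global-budget idea, keeping only the top-B tokens across the whole cache. The attention-mass scorer is a hand-rolled stand-in (the paper's policy is learned), and `evict_to_budget` is a hypothetical helper, not the paper's API.

```python
# Minimal sketch of global-budget KV eviction: keep the B highest-scoring
# tokens across the whole cache instead of keeping everything. A running sum
# of attention mass stands in for the learned scorer.
import torch

def evict_to_budget(keys, values, attn_weights, budget):
    """keys/values: (seq, d); attn_weights: (n_queries, seq); budget: int."""
    scores = attn_weights.sum(dim=0)            # attention mass per cached token
    keep = torch.topk(scores, k=min(budget, keys.shape[0])).indices.sort().values
    return keys[keep], values[keep], keep       # compacted cache + kept indices

seq, d = 1024, 64
keys, values = torch.randn(seq, d), torch.randn(seq, d)
attn = torch.softmax(torch.randn(8, seq), dim=-1)   # toy attention rows
k2, v2, kept = evict_to_budget(keys, values, attn, budget=256)
print(k2.shape, kept[:5])                           # (256, 64) and indices
```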
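On ROMA: one plausible reading of "second forward pass plus teacher forcing", with every detail below an assumption rather than the paper's recipe: sample once from the degraded input, then re-score the fixed rollout under teacher forcing so the update uses stable log-probs. `ToyLM` stands in for the real policy model.

```python
# Highly schematic sketch of a second-pass, teacher-forced re-scoring step.
import torch
import torch.nn.functional as F

class ToyLM(torch.nn.Module):
    """Stand-in for the policy model; returns logits over a tiny vocab."""
    def __init__(self, vocab=100, d=32):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, d)
        self.head = torch.nn.Linear(d, vocab)
    def forward(self, ids):
        return self.head(self.emb(ids))

def teacher_forced_logprobs(model, prompt_ids, rollout_ids):
    """Second pass: re-score an already-sampled rollout under teacher forcing."""
    full = torch.cat([prompt_ids, rollout_ids], dim=-1)
    # Position i predicts token i+1, so slice logits that predict the rollout.
    logits = model(full)[:, prompt_ids.shape[-1] - 1 : -1, :]
    logp = F.log_softmax(logits, dim=-1)
    return logp.gather(-1, rollout_ids.unsqueeze(-1)).squeeze(-1)

model = ToyLM()
prompt = torch.randint(0, 100, (1, 8))    # stands in for degraded-input tokens
rollout = torch.randint(0, 100, (1, 16))  # tokens sampled in the first pass
print(teacher_forced_logprobs(model, prompt, rollout).shape)  # (1, 16)
```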
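On Metal-Sci's fitness: a sketch of what roofline anchoring means, scoring a kernel as the fraction of its roofline bound it actually achieves. The peak-FLOP and bandwidth numbers are made up, and Metal-Sci's exact normalization may differ.

```python
# Minimal sketch of a roofline-anchored fitness score.
def roofline_bound(arith_intensity, peak_gflops=3500.0, peak_gbps=400.0):
    """Attainable GFLOP/s at a given arithmetic intensity (FLOPs per byte)."""
    return min(peak_gflops, peak_gbps * arith_intensity)

def fitness(measured_gflops, flops, bytes_moved):
    """Fraction of the roofline bound actually achieved, in [0, 1]."""
    bound = roofline_bound(flops / bytes_moved)
    return min(measured_gflops / bound, 1.0)

# Toy kernel: 2 GFLOP over 1 GB moved -> intensity 2 FLOP/byte,
# bound min(3500, 400 * 2) = 800 GFLOP/s, so 240 GFLOP/s scores 0.3.
print(fitness(measured_gflops=240.0, flops=2e9, bytes_moved=1e9))
```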
Also Notable
- Multimodal Reward Formalizes Plan-Then-Verify. DeltaRubric plans a set of checkable visual details and then verifies each one, fixing the lazy judging of fine visual details that single-step evaluators fall into (judge sketch after this list).
- Cross-Lingual Self-Distillation Skips Translation Data. Train low-resource languages on the model's own high-resource reasoning traces; trace quality is more controllable than translation quality (distillation sketch after this list).
- VLM Web Agent Deception Resistance Enters the Training Objective. Detection used to be post-hoc; this moves resistance to deceptive UI elements upstream into training.
- Industrial QA: RAG vs Fine-Tuning Head-to-Head. Cost-versus-effectiveness data to ground enterprise procurement and selection decisions.
- Token-Entropy Decides When to Branch. Branching where next-token entropy spikes lands closer to information-theoretic split points than a fixed beam width or best-of-n (branching sketch after this list).
- Parallel Masked Diffusion LM Adds Edit-Style Refinement. Closes joint-sampling drift in the parallel-generation line with a polish pass.
- Rectified Flow Preference Optimization Needs Noise-Trajectory Pairing. Storing only the final winner/loser samples drops the preference signal carried by the intermediate steps (pairing sketch after this list).
- Instruction-Tuning Data Selection Goes Task-Model Adaptive. Multi-dimensional heuristics shift from static to dynamic; the selection function is worth modeling more than raw data volume.
- Evolutionary Coding Agent Generates 3D Training Environments. SimWorld Studio gives embodied agents the equivalent of the web and coding sandboxes they were missing.
- Position Paper: Natural Language Isn't Enough as the Default LLM Medium. Argues for more structured schema representations as the next "language."
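On DeltaRubric: a schematic of the plan-then-verify judging loop. The prompts and the `ask` helper are hypothetical stubs; only the two-stage structure comes from the summary above.

```python
# Schematic plan-then-verify judge: plan checkable details, verify each one.
def ask(prompt: str) -> str:
    """Placeholder for a call to a multimodal judge model."""
    return "yes"  # stub so the sketch runs end to end

def plan_then_verify(question, answer, n_checks=3):
    # Stage 1: plan -- enumerate concrete, checkable visual details.
    plan = ask(f"List {n_checks} visual details that must hold for this "
               f"answer to be correct.\nQ: {question}\nA: {answer}")
    checks = [c for c in plan.splitlines() if c.strip()][:n_checks]
    # Stage 2: verify each detail independently, then aggregate a score.
    verdicts = [ask(f"Does the image satisfy: {c}? Answer yes/no.")
                for c in checks]
    return (sum(v.strip().lower().startswith("yes") for v in verdicts)
            / max(len(verdicts), 1))

print(plan_then_verify("What is on the table?", "A red mug and a laptop."))
```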
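On cross-lingual self-distillation: a sketch of the data-construction loop, with correctness filtering as the quality-control knob the item alludes to. Function names and the filtering rule are assumptions.

```python
# Sketch of the self-distillation data loop: generate a trace in the
# high-resource language, keep it only if verifiably correct, pair it with
# the low-resource prompt.
def build_self_distill_set(model_generate, problems, verify):
    """problems: [(prompt_lowres, prompt_highres, gold_answer), ...]"""
    examples = []
    for low, high, gold in problems:
        trace = model_generate(high)            # model's own high-resource trace
        if verify(trace, gold):                 # keep only correct traces --
            examples.append({"input": low,      # this is the controllable
                             "target": trace})  # quality gate
    return examples

demo = [("2+2 kitne hote hain?", "What is 2+2?", "4")]
print(build_self_distill_set(lambda p: "2+2 = 4", demo, lambda t, g: g in t))
```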
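On entropy-gated branching: a minimal sketch that spawns branches only where next-token entropy exceeds a threshold, instead of holding a fixed beam. The threshold and branch count are arbitrary illustration values, not the paper's settings.

```python
# Minimal sketch of entropy-gated branching during decoding.
import torch
import torch.nn.functional as F

def entropy_bits(logits):
    """Shannon entropy of the next-token distribution, in bits."""
    p = F.softmax(logits, dim=-1)
    return -(p * torch.log2(p.clamp_min(1e-12))).sum(-1)

def maybe_branch(logits, threshold_bits=2.0, n_branches=4):
    """Return candidate next tokens: several where uncertain, one where not."""
    if entropy_bits(logits) > threshold_bits:
        return torch.topk(logits, n_branches).indices   # split point: branch
    return logits.argmax(dim=-1, keepdim=True)          # continue greedily

flat = torch.zeros(100)                  # near-uniform: high entropy -> branch
peaked = torch.full((100,), -10.0)
peaked[7] = 10.0                         # near-deterministic: stay greedy
print(maybe_branch(flat).shape, maybe_branch(peaked))   # (4,) and tensor([7])
```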
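On noise-trajectory pairing: a sketch of what gets stored, using a toy Euler rollout of a rectified flow. Recording the shared noise plus every intermediate latent is what lets a preference loss reach the intermediate steps; the record layout and toy velocity fields are assumptions.

```python
# Sketch: roll out a rectified flow while recording the full trajectory,
# then pair winner and loser traces under the same seed noise.
import torch

def rollout_with_trace(velocity_fn, noise, n_steps=8):
    """Euler-integrate a rectified flow, recording every intermediate latent."""
    x, trace = noise.clone(), [noise.clone()]
    for i in range(n_steps):
        t = torch.tensor(i / n_steps)
        x = x + velocity_fn(x, t) / n_steps
        trace.append(x.clone())
    return x, trace

noise = torch.randn(4)                   # same seed noise for both samples
v_win = lambda x, t: -x                  # toy velocity fields standing in
v_lose = lambda x, t: -0.5 * x           # for winner/loser generations
_, trace_w = rollout_with_trace(v_win, noise)
_, trace_l = rollout_with_trace(v_lose, noise)
pair = {"noise": noise, "winner": trace_w, "loser": trace_l}  # full pairing
print(len(pair["winner"]), len(pair["loser"]))   # 9 latents each, step-aligned
```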