Geometry Conflict Predicts Continual Fine-Tuning Forgetting
- Geometry Conflict Predicts Continual Fine-Tuning Forgetting. Treating each task's parameter-update covariance as a measurable signal, GCWM predicts forgetting and beats data-free baselines on Qwen3 models from 0.6B to 14B, across both domain and capability continual-learning settings (covariance sketch after this list).
- Full-Cache Is No Longer the Ceiling for KV Eviction. Irrelevant tokens dilute attention in long contexts, so a learnable global-budget eviction policy beats keeping the full cache; KV-cache management is better framed as signal filtering (eviction sketch after this list).
- MLLMs Get Wrecked by Blurry Production Images. Naively mixing degraded samples into RL rollouts triggers reward poisoning; ROMA adds a second forward pass with teacher forcing to handle visual degradation on the training side (teacher-forcing sketch after this list).
- Apple Silicon LLM Kernel Tuning Finally Has a Benchmark. Metal-Sci packages 10 scientific compute tasks, CPU baselines, roofline-anchored fitness, and an evolutionary search harness, with held-out problem sizes to catch silent regressions (fitness sketch after this list).
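On the geometry-conflict item: a minimal sketch of the core measurement, assuming conflict is scored as the overlap between the dominant eigenspaces of two tasks' update covariances. The metric choice, shapes, and toy data are all illustrative, not GCWM's actual recipe.

```python
# Minimal sketch (not the GCWM implementation): estimate geometry conflict
# between two tasks from their parameter-update covariances. The conflict
# metric -- overlap of top principal update directions -- is an assumption.
import numpy as np

def update_covariance(updates: np.ndarray) -> np.ndarray:
    """updates: (n_steps, n_params) matrix of per-step parameter deltas."""
    centered = updates - updates.mean(axis=0, keepdims=True)
    return centered.T @ centered / max(len(updates) - 1, 1)

def geometry_conflict(cov_a, cov_b, k=5):
    """Overlap of the top-k eigenspaces of two update covariances.

    0 = orthogonal update geometries (low forgetting risk under this proxy),
    1 = identical dominant directions (high interference risk).
    """
    _, vecs_a = np.linalg.eigh(cov_a)
    _, vecs_b = np.linalg.eigh(cov_b)
    Ua, Ub = vecs_a[:, -k:], vecs_b[:, -k:]   # top-k eigenvectors (columns)
    # Mean squared canonical correlation between the two subspaces.
    return np.linalg.norm(Ua.T @ Ub) ** 2 / k

rng = np.random.default_rng(0)
task_a = rng.normal(size=(64, 32))                 # toy per-step deltas, task A
task_b = task_a + 0.3 * rng.normal(size=(64, 32))  # correlated deltas, task B
print(geometry_conflict(update_covariance(task_a), update_covariance(task_b)))
```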
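On KV eviction: a sketch of the global-budget idea, keeping only the top-B tokens across the whole cache. The attention-mass scorer is a hand-rolled stand-in (the paper's policy is learned), and `evict_to_budget` is a hypothetical helper, not the paper's API.

```python
# Minimal sketch of global-budget KV eviction: keep the B highest-scoring
# tokens across the whole cache instead of keeping everything. A running sum
# of attention mass stands in for the learned scorer.
import torch

def evict_to_budget(keys, values, attn_weights, budget):
    """keys/values: (seq, d); attn_weights: (n_queries, seq); budget: int."""
    scores = attn_weights.sum(dim=0)            # attention mass per cached token
    keep = torch.topk(scores, k=min(budget, keys.shape[0])).indices.sort().values
    return keys[keep], values[keep], keep       # compacted cache + kept indices

seq, d = 1024, 64
keys, values = torch.randn(seq, d), torch.randn(seq, d)
attn = torch.softmax(torch.randn(8, seq), dim=-1)   # toy attention rows
k2, v2, kept = evict_to_budget(keys, values, attn, budget=256)
print(k2.shape, kept[:5])                           # (256, 64) and indices
```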
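On ROMA: one plausible reading of "second forward pass plus teacher forcing", with every detail below an assumption rather than the paper's recipe: sample once from the degraded input, then re-score the fixed rollout under teacher forcing so the update uses stable log-probs. `ToyLM` stands in for the real policy model.

```python
# Highly schematic sketch of a second-pass, teacher-forced re-scoring step.
import torch
import torch.nn.functional as F

class ToyLM(torch.nn.Module):
    """Stand-in for the policy model; returns logits over a tiny vocab."""
    def __init__(self, vocab=100, d=32):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, d)
        self.head = torch.nn.Linear(d, vocab)
    def forward(self, ids):
        return self.head(self.emb(ids))

def teacher_forced_logprobs(model, prompt_ids, rollout_ids):
    """Second pass: re-score an already-sampled rollout under teacher forcing."""
    full = torch.cat([prompt_ids, rollout_ids], dim=-1)
    # Position i predicts token i+1, so slice logits that predict the rollout.
    logits = model(full)[:, prompt_ids.shape[-1] - 1 : -1, :]
    logp = F.log_softmax(logits, dim=-1)
    return logp.gather(-1, rollout_ids.unsqueeze(-1)).squeeze(-1)

model = ToyLM()
prompt = torch.randint(0, 100, (1, 8))    # stands in for degraded-input tokens
rollout = torch.randint(0, 100, (1, 16))  # tokens sampled in the first pass
print(teacher_forced_logprobs(model, prompt, rollout).shape)  # (1, 16)
```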
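On Metal-Sci's fitness: a sketch of what roofline anchoring means, scoring a kernel as the fraction of its roofline bound it actually achieves. The peak-FLOP and bandwidth numbers are made up, and Metal-Sci's exact normalization may differ.

```python
# Minimal sketch of a roofline-anchored fitness score.
def roofline_bound(arith_intensity, peak_gflops=3500.0, peak_gbps=400.0):
    """Attainable GFLOP/s at a given arithmetic intensity (FLOPs per byte)."""
    return min(peak_gflops, peak_gbps * arith_intensity)

def fitness(measured_gflops, flops, bytes_moved):
    """Fraction of the roofline bound actually achieved, in [0, 1]."""
    bound = roofline_bound(flops / bytes_moved)
    return min(measured_gflops / bound, 1.0)

# Toy kernel: 2 GFLOP over 1 GB moved -> intensity 2 FLOP/byte,
# bound min(3500, 400 * 2) = 800 GFLOP/s, so 240 GFLOP/s scores 0.3.
print(fitness(measured_gflops=240.0, flops=2e9, bytes_moved=1e9))
```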
Also Notable
- Multimodal Reward Formalizes Plan-Then-Verify. DeltaRubric plans a set of checkable visual details and then verifies each one, fixing the lazy judging of fine visual details that single-step evaluators fall into (judge sketch after this list).
- Cross-Lingual Self-Distillation Skips Translation Data. Train low-resource languages on the model's own high-resource reasoning traces; trace quality is more controllable than translation quality (distillation sketch after this list).
- VLM Web Agent Deception Resistance Enters the Training Objective. Detection used to be post-hoc; this moves resistance to deceptive UI elements upstream into training.
- Industrial QA: RAG vs Fine-Tuning Head-to-Head. Cost-versus-effectiveness data to ground enterprise procurement and selection decisions.
- Token-Entropy Decides When to Branch. Branching where next-token entropy spikes lands closer to information-theoretic split points than a fixed beam width or best-of-n (branching sketch after this list).
- Parallel Masked Diffusion LM Adds Edit-Style Refinement. Closes joint-sampling drift in the parallel-generation line with a polish pass.
- Rectified Flow Preference Optimization Needs Noise-Trajectory Pairing. Storing only the final winner/loser samples drops the preference signal carried by the intermediate steps (pairing sketch after this list).
- Instruction-Tuning Data Selection Goes Task-Model Adaptive. Multi-dimensional heuristics shift from static to dynamic; the selection function is worth modeling more than raw data volume.
- Evolutionary Coding Agent Generates 3D Training Environments. SimWorld Studio gives embodied agents the equivalent of the web and coding sandboxes they were missing.
- Position Paper: Natural Language Isn't Enough as the Default LLM Medium. Argues for more structured schema representations as the next "language."
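On DeltaRubric: a schematic of the plan-then-verify judging loop. The prompts and the `ask` helper are hypothetical stubs; only the two-stage structure comes from the summary above.

```python
# Schematic plan-then-verify judge: plan checkable details, verify each one.
def ask(prompt: str) -> str:
    """Placeholder for a call to a multimodal judge model."""
    return "yes"  # stub so the sketch runs end to end

def plan_then_verify(question, answer, n_checks=3):
    # Stage 1: plan -- enumerate concrete, checkable visual details.
    plan = ask(f"List {n_checks} visual details that must hold for this "
               f"answer to be correct.\nQ: {question}\nA: {answer}")
    checks = [c for c in plan.splitlines() if c.strip()][:n_checks]
    # Stage 2: verify each detail independently, then aggregate a score.
    verdicts = [ask(f"Does the image satisfy: {c}? Answer yes/no.")
                for c in checks]
    return (sum(v.strip().lower().startswith("yes") for v in verdicts)
            / max(len(verdicts), 1))

print(plan_then_verify("What is on the table?", "A red mug and a laptop."))
```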
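On cross-lingual self-distillation: a sketch of the data-construction loop, with correctness filtering as the quality-control knob the item alludes to. Function names and the filtering rule are assumptions.

```python
# Sketch of the self-distillation data loop: generate a trace in the
# high-resource language, keep it only if verifiably correct, pair it with
# the low-resource prompt.
def build_self_distill_set(model_generate, problems, verify):
    """problems: [(prompt_lowres, prompt_highres, gold_answer), ...]"""
    examples = []
    for low, high, gold in problems:
        trace = model_generate(high)            # model's own high-resource trace
        if verify(trace, gold):                 # keep only correct traces --
            examples.append({"input": low,      # this is the controllable
                             "target": trace})  # quality gate
    return examples

demo = [("2+2 kitne hote hain?", "What is 2+2?", "4")]
print(build_self_distill_set(lambda p: "2+2 = 4", demo, lambda t, g: g in t))
```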
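On entropy-gated branching: a minimal sketch that spawns branches only where next-token entropy exceeds a threshold, instead of holding a fixed beam. The threshold and branch count are arbitrary illustration values, not the paper's settings.

```python
# Minimal sketch of entropy-gated branching during decoding.
import torch
import torch.nn.functional as F

def entropy_bits(logits):
    """Shannon entropy of the next-token distribution, in bits."""
    p = F.softmax(logits, dim=-1)
    return -(p * torch.log2(p.clamp_min(1e-12))).sum(-1)

def maybe_branch(logits, threshold_bits=2.0, n_branches=4):
    """Return candidate next tokens: several where uncertain, one where not."""
    if entropy_bits(logits) > threshold_bits:
        return torch.topk(logits, n_branches).indices   # split point: branch
    return logits.argmax(dim=-1, keepdim=True)          # continue greedily

flat = torch.zeros(100)                  # near-uniform: high entropy -> branch
peaked = torch.full((100,), -10.0)
peaked[7] = 10.0                         # near-deterministic: stay greedy
print(maybe_branch(flat).shape, maybe_branch(peaked))   # (4,) and tensor([7])
```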
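On noise-trajectory pairing: a sketch of what gets stored, using a toy Euler rollout of a rectified flow. Recording the shared noise plus every intermediate latent is what lets a preference loss reach the intermediate steps; the record layout and toy velocity fields are assumptions.

```python
# Sketch: roll out a rectified flow while recording the full trajectory,
# then pair winner and loser traces under the same seed noise.
import torch

def rollout_with_trace(velocity_fn, noise, n_steps=8):
    """Euler-integrate a rectified flow, recording every intermediate latent."""
    x, trace = noise.clone(), [noise.clone()]
    for i in range(n_steps):
        t = torch.tensor(i / n_steps)
        x = x + velocity_fn(x, t) / n_steps
        trace.append(x.clone())
    return x, trace

noise = torch.randn(4)                   # same seed noise for both samples
v_win = lambda x, t: -x                  # toy velocity fields standing in
v_lose = lambda x, t: -0.5 * x           # for winner/loser generations
_, trace_w = rollout_with_trace(v_win, noise)
_, trace_l = rollout_with_trace(v_lose, noise)
pair = {"noise": noise, "winner": trace_w, "loser": trace_l}  # full pairing
print(len(pair["winner"]), len(pair["loser"]))   # 9 latents each, step-aligned
```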