LoRA as a Ruler for Memory, Reversed Video as a Free Counterfactual

June 1, 2026

Turn LoRA into a measuring stick and the real capacity of parametric memory falls out: that capacity follows a power law you can estimate in advance, and a token-prediction probability of 0.5 marks the threshold for verbatim recall.

Unified retrieval isn't about the interface — it's about not throwing away structure. OmniRetrieval routes queries to each source's native engine instead of crushing everything into a shared vector space, and beats single-source baselines across 309 knowledge bases.

Play a real video backwards and you get a free counterfactual. YoCausal uses reversed clips as expectation-violating negatives, and finds 13 video diffusion models can sense the arrow of time but can't explain the causality.

Image agents shift from rewriting prompts to writing code. GenClaw has the LLM nail down composition in SVG/HTML/Three.js as an executable sketch, then hands it to a generative model to color in — the value is control, not fidelity.

Agent guardrails pile on "lightweight" and "real-time," but the real novelty hides in the taxonomy. AgentDoG 1.5's substantive contribution is an updated open-world agent risk taxonomy. Treat the "1k samples matches closed-source" claim with caution; the model and dataset are open, so you can verify it yourself.

Also Notable

A Fully Open-Source Real-Time Interactive Video World Model — the whole chain is open, from data construction to streaming inference, worth a look if you want to run your own world model. minWM
Move Token Compression From Late Prefill Up Into the Vision Encoder — video understanding usually compresses late in prefill; this skips the wasted stretch before it. EarlyTom
A Third Path for Joint Audio-Video Generation — neither two-tower post-alignment nor full tri-modal mixing, a new route to native fine-grained audio-visual sync. Native Audio-Visual Alignment
Render a Text Problem as an Image for a VLM and Performance Collapses — this traces where that "carrier-sensitive" bias comes from. LoMo
A Mechanistic Account of Why Dense Retrieval Scores High, at the Embedding Level — makes the long-black-box relevance score legible. Xetrieval
Self-Evolving Anchors Loosen Autoregressive Video's Over-Reliance on the First Frame — no longer chained to frame one. AdaState
Let the Rewriter and Encoder Co-Train Iteratively — in tool retrieval, casual queries don't match technical API terms; this evolves both ends together. CoHyDE
Inject 3D Spatial Priors Into a VLM Without a Dedicated 3D Encoder or 3D VQA Fine-Tuning — patches the geometric-reasoning gap. Beyond 3D VQAs
Generative 4D Neural Object Kinematics — lets static 3D objects produce realistic time-varying deformation under different physical conditions. NeuROK
An Interactive Assistant for Scientific Hypothesis Discovery — folds divergent exploration and convergent refinement into one workflow. MOOSE-Copilot
