AI Research Brief

April 17, 2026

Tencent Open-Sources 3D World Generation, VLM Modal Bias Probe

  • Tencent HY-World 2.0 ships 3D world generation as a four-stage pipeline (panorama → trajectory → view expansion → multi-view synthesis), turning text or a single image into a navigable 3D Gaussian splatting (3DGS) scene. It's the open-source answer to the closed-source Marble.
  • The bottleneck in visual tasks is actually on the language side. Stanford's centroid-replacement probe shows that erasing text information costs 4x more accuracy than erasing visual information across 7 VLMs. A contrastive decode built on that asymmetry gains up to 16.9% per task, with no retraining.
  • VGF reframes RL fine-tuning as optimal transport. Instead of parameterizing the policy, it moves reference-distribution particles along the value gradient, with a transport budget that maps cleanly onto test-time scaling. Clean idea, but only 2 HF upvotes — keep on the watchlist.
  • 3PT bakes a "three-phase current" prior into the residual stream. Hidden vectors are sliced into cyclic channels and aligned block-to-block via Givens rotations. At 123M params, perplexity drops 7.2% over RoPE-only, but N=3 and N=1 are statistically indistinguishable.
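The centroid-replacement probe and the contrastive decode built on it can be sketched in a few lines. This is a minimal numpy illustration of the idea as the bullet describes it, not the paper's code: `centroid_ablate`, `contrastive_logits`, and the `alpha` strength knob are all illustrative names and assumptions.

```python
import numpy as np

def centroid_ablate(embeddings, mask):
    """Replace the masked rows (one modality's token embeddings) with their
    centroid, erasing token-level information while keeping that modality's
    average contribution. Sketch of the centroid-replacement probe idea."""
    out = embeddings.copy()
    out[mask] = embeddings[mask].mean(axis=0)
    return out

def contrastive_logits(logits_full, logits_text_ablated, alpha=1.0):
    """Hypothetical contrastive decode: push predictions away from what the
    model says once text information is erased, exploiting the asymmetry
    the probe reveals. `alpha` is an assumed strength parameter."""
    return logits_full + alpha * (logits_full - logits_text_ablated)

# toy usage: ablating the "text" rows collapses them onto one vector
emb = np.arange(12, dtype=float).reshape(4, 3)
text_mask = np.array([True, True, False, False])
ablated = centroid_ablate(emb, text_mask)
```

Running the same forward pass on `emb` and `ablated` and comparing task accuracy is the probe; feeding both sets of logits to `contrastive_logits` is the training-free decode.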
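The VGF bullet's core move, transporting reference-distribution particles along a value gradient instead of parameterizing a policy, reduces to a short loop. A toy numpy sketch under stated assumptions: the quadratic value function, the `step` size, and reading the transport budget as a step count are all illustrative, not the paper's formulation.

```python
import numpy as np

def transport_particles(particles, value_grad, step=0.1, budget=20):
    """Sketch of a value-gradient flow: move samples from the reference
    distribution uphill along the value gradient. `budget` (the number of
    transport steps) is the knob the brief maps onto test-time scaling."""
    x = particles.copy()
    for _ in range(budget):
        x += step * value_grad(x)
    return x

# toy value V(x) = -||x - target||^2, so its gradient is 2 * (target - x)
target = np.array([1.0, -2.0])
value_grad = lambda x: 2.0 * (target - x)

# reference "distribution": 128 particles at the origin
moved = transport_particles(np.zeros((128, 2)), value_grad)
# the particle cloud's mean converges toward `target` as budget grows
```

Larger `budget` buys a closer fit to the high-value region, which is the clean test-time-scaling story the bullet highlights.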
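The two ingredients the 3PT bullet names, slicing hidden vectors into cyclic channels and aligning them via Givens rotations, are standard operations that can be sketched directly. A minimal numpy illustration: the channel count `n=3` mirrors the "three-phase" prior, while the function names and the per-block rotation angle are assumptions for exposition.

```python
import numpy as np

def split_channels(h, n=3):
    """Slice a hidden vector into n cyclic channels: element k goes to
    channel k mod n, mirroring the three-phase slicing as described."""
    return [h[k::n] for k in range(n)]

def givens(dim, i, j, theta):
    """Givens rotation acting only in the (i, j) plane of a dim-d space.
    Orthogonal, so it realigns channels without changing vector norms."""
    g = np.eye(dim)
    c, s = np.cos(theta), np.sin(theta)
    g[i, i] = c
    g[j, j] = c
    g[i, j] = -s
    g[j, i] = s
    return g

# toy usage: slice a 12-d hidden state, rotate one channel pair block-to-block
h = np.arange(12, dtype=float)
channels = split_channels(h, n=3)
rot = givens(len(channels[0]), 0, 2, theta=0.7)  # angle chosen arbitrarily
channels[0] = rot @ channels[0]
```

Because each Givens rotation is orthogonal and touches only one coordinate pair, block-to-block alignment is cheap and norm-preserving, which is plausibly why the prior can be added without destabilizing the residual stream.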

Also Notable

  • CMU Built a Simulated AI Marketplace for Multi-Agent Competition Dynamics — when retrieval systems and LLMs compete for users, market incentives distort behavior distributions in predictable directions.
  • APEX-MEM Uses Semi-Structured Memory and Temporal Reasoning Against Long-Dialogue Memory Hallucination — more stable than simply expanding context windows or naive retrieval. Accepted at ACL.
  • Google's FoodSense Has VLMs Predict Taste, Smell, Texture, Even Sound From Food Images — a multi-sensory benchmark pushing VLMs toward human cross-modal intuition.
  • Berkeley Checks if LMs Trained on Developmental-Scale Data Form the Same Cross-Construction Filler-Gap Representations as Large Models — mechanisms visible in small models aren't guaranteed to match what large models do.
  • UW Upgrades User-Memory Selection From "Similar to Query" to "Useful for Response" — an underused reverse signal in LLM personalization.

Read the full edition →
