Daily Briefing – Apr 21 (96 Articles)
Babak's Daily Briefing
Tuesday, April 21, 2026
Sources: 20 | Total Articles: 96
6G World
1.Evaluating 6G PHY Evolution: What the Industry Is Really Trying to Solve
Summary available at source link.
2.Amazon’s Globalstar deal gives Amazon Leo a faster path into D2D
Amazon’s planned acquisition of Globalstar is about far more than satellites. It gives Amazon Leo a faster path into direct-to-device connectivity, combining spectrum, operational assets, and Apple-facing service continuity in a move that could reshape the hybrid terrestrial-NTN landscape.
3.SoftBank’s Physical AI push gives AI-RAN a sharper purpose
SoftBank is starting to give AI-RAN a more concrete job description: not just running AI workloads near the network, but serving as the real-time infrastructure layer for robots and other physical systems. The company’s recent materials suggest it wants to move the AI-RAN conversation from telecom architecture to real-world machine action.
4.South Korea puts 6G inside its national AI push
South Korea has unveiled a three-year national roadmap aimed at becoming one of the world’s top three AI powers by 2028, with 6G commercialization positioned as part of that broader push.
5.b-com’s Open XG Hub targets one of telecom’s biggest gaps: turning experimentation into deployment
In an interview with Peter Pietrzyk, Managing Director of 6GWorld, Patrick Savell, Head of Connectivity at b-com, said platforms such as Open XG Hub are designed to help bridge one of the industry’s most persistent challenges: moving promising ideas from research environments into deployable network systems. The bigger point is that, as telecom becomes more software-driven and AI-native, the bottleneck is increasingly less about invention and more about validation, integration, and operational readiness.
AI Agents
1.WebUncertainty: Dual-Level Uncertainty Driven Planning and Reasoning For Autonomous Web Agent
Recent advancements in large language models (LLMs) have empowered autonomous web agents to execute natural language instructions directly on real-world webpages. However, existing agents often struggle with complex tasks involving dynamic interactions and long-horizon execution due to rigid planning strategies and hallucination-prone reasoning. To address these limitations, we propose WebUncertainty, a novel autonomous agent framework designed to tackle dual-level uncertainty in planning and reasoning. Specifically, we design a Task Uncertainty-Driven Adaptive Planning Mechanism that adaptively selects planning modes to navigate unknown environments. Furthermore, we introduce an Action Uncertainty-Driven Monte Carlo tree search (MCTS) Reasoning Mechanism. This mechanism incorporates the Confidence-induced Action Uncertainty (ConActU) str...
2.Prompt Optimization Enables Stable Algorithmic Collusion in LLM Agents
LLM agents in markets present algorithmic collusion risks. While prior work shows LLM agents reach supracompetitive prices through tacit coordination, existing research focuses on hand-crafted prompts. The emerging paradigm of prompt optimization necessitates new methodologies for understanding autonomous agent behavior. We investigate whether prompt optimization leads to emergent collusive behaviors in market simulations. We propose a meta-learning loop where LLM agents participate in duopoly markets and an LLM meta-optimizer iteratively refines shared strategic guidance. Our experiments reveal that meta-prompt optimization enables agents to discover stable tacit collusion strategies with substantially improved coordination quality compared to baseline agents. These behaviors generalize to held-out test markets, indicating discovery of g...
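The duopoly setup described above can be sketched with a textbook linear-demand pricing game (an illustrative stand-in; the paper's exact market environment, parameters, and agent interface are assumptions here):

```python
# Minimal duopoly market sketch: quantity falls in own price, rises in the
# rival's, and profit is margin times quantity. Parameter values are
# hypothetical, chosen only to make the collusion effect visible.
def demand(p_own: float, p_rival: float, a: float = 10.0, b: float = 1.0,
           d: float = 0.5) -> float:
    return max(0.0, a - b * p_own + d * p_rival)

def profit(p_own: float, p_rival: float, cost: float = 1.0) -> float:
    return (p_own - cost) * demand(p_own, p_rival)

# symmetric low (competitive) vs. high (tacitly collusive) prices
competitive = profit(4.0, 4.0)
collusive = profit(7.0, 7.0)
assert collusive > competitive  # supracompetitive prices raise joint profit
```

Tacit collusion corresponds to both agents sustaining the higher price point; the meta-optimizer's role in the paper is to iteratively refine shared prompt guidance until agents discover and hold such prices.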
3.Know When to Trust the Skill: Delayed Appraisal and Epistemic Vigilance for Single-Agent LLMs
As large language models (LLMs) transition into autonomous agents integrated with extensive tool ecosystems, traditional routing heuristics increasingly succumb to context pollution and "overthinking". We argue that the bottleneck is not a deficit in algorithmic capability or skill diversity, but the absence of disciplined second-order metacognitive governance. In this paper, our scientific contribution focuses on the computational translation of human cognitive control - specifically, delayed appraisal, epistemic vigilance, and region-of-proximal offloading - into a single-agent architecture. We introduce MESA-S (Metacognitive Skills for Agents, Single-agent), a preliminary framework that shifts scalar confidence estimation into a vector separating self-confidence (parametric certainty) from source-confidence (trust in retrieved external...
4.SocialGrid: A Benchmark for Planning and Social Reasoning in Embodied Multi-Agent Systems
As Large Language Models (LLMs) transition from text processors to autonomous agents, evaluating their social reasoning in embodied multi-agent settings becomes critical. We introduce SocialGrid, an embodied multi-agent environment inspired by Among Us that evaluates LLM agents on planning, task execution, and social reasoning. Our evaluations reveal that even the strongest open model (GPT-OSS-120B) achieves below 60% accuracy in task completion and planning, with agents getting stuck in repetitive behaviors or failing to navigate basic obstacles. Since poor navigation confounds evaluation of social intelligence, SocialGrid offers an optional Planning Oracle to isolate social reasoning from planning deficits. While planning assistance improves task completion, social reasoning remains a bottleneck: agents fail to detect deception at near-...
5.Exploring Agentic Visual Analytics: A Co-Evolutionary Framework of Roles and Workflows
Agentic visual analytics (VA) represents an emerging class of systems in which large language model (LLM)-driven agents autonomously plan, execute, evaluate, and iterate across the full visual analytics pipeline. By shifting users from low-level tool operations to high-level analytical goals expressed through natural language, these systems are fundamentally transforming how humans interact with data. However, the rapid proliferation of such systems in recent years has outpaced our understanding of their design landscape. Two intertwined problems remain open: how do autonomous agents reshape the traditional VA pipeline, and how must human involvement adapt as agent autonomy increases? To address these questions, this paper presents a comprehensive survey of 55 primary agentic VA systems and introduces a co-evolutionary framework. This fra...
AI Computation & Hardware
1.Multimodal Claim Extraction for Fact-Checking
arXiv:2604.16311v1: Automated Fact-Checking (AFC) relies on claim extraction as a first step, yet existing methods largely overlook the multimodal nature of today's misinformation. Social media posts often combine short, informal text with images such as memes, screenshots, and photos, creating challenges that differ from both text-only claim extraction and well-studied multimodal tasks like image captioning or visual question answering. In this work, we present the first benchmark for multimodal claim extraction from social media, consisting of posts containing text and one or more images, annotated with gold-standard claims derived from real-world fact-checkers. We evaluate state-of-the-art multimodal LLMs (MLLMs) under a three-part evaluation framework (semantic alignment, faithfulness, and decontextualizat...
2.Cross-Family Speculative Decoding for Polish Language Models on Apple Silicon: An Empirical Evaluation of Bielik 11B with UAG-Extended MLX-LM
arXiv:2604.16368v1: Speculative decoding accelerates LLM inference by using a small draft model to propose k candidate tokens for a target model to verify. While effective for same-tokenizer pairs on high-bandwidth GPUs, its applicability to cross-family pairs with mismatched tokenizers and consumer-grade unified memory remains underexplored. We extend the MLX-LM framework with Universal Assisted Generation (UAG) to enable cross-tokenizer speculative decoding on Apple Silicon. We evaluate Bielik 11B-Instruct (Mistral-based) as the target model, paired with three draft models: Bielik 1.5B (Qwen-based with custom tokenizer), Qwen2.5-1.5B, and Llama 3.2-1B. Experiments on three Polish-language datasets (Wikipedia, pl_alpaca, synthetic) use draft lengths k in {2, 4, 6} to compare naive and context-aware token tran...
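The core draft-and-verify loop that speculative decoding builds on can be illustrated with dummy models (a schematic only: real verification compares draft and target token distributions for provable equivalence, and UAG adds cross-tokenizer re-mapping on top):

```python
# Toy sketch of the verify step: accept the longest prefix of the draft that
# the target would also emit greedily, then append the target's own token at
# the first disagreement (or a bonus token if everything was accepted).
from typing import Callable, List

def greedy_verify(target_next: Callable[[List[int]], int],
                  prefix: List[int], draft: List[int]) -> List[int]:
    accepted: List[int] = []
    ctx = list(prefix)
    for tok in draft:
        t = target_next(ctx)
        if t == tok:
            accepted.append(tok)
            ctx.append(tok)
        else:
            accepted.append(t)      # target's correction ends this round
            break
    else:
        accepted.append(target_next(ctx))  # bonus token: all k accepted
    return accepted

# dummy "target model": always continues with last token + 1
target = lambda ctx: ctx[-1] + 1
print(greedy_verify(target, [0], [1, 2, 9]))  # → [1, 2, 3]
```

One target call per draft token still happens here; the speed-up in practice comes from batching those verifications into a single forward pass.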
3.Brain-CLIPLM: Decoding Compressed Semantic Representations in EEG for Language Reconstruction
arXiv:2604.16370v1: Decoding natural language from non-invasive electroencephalography (EEG) remains fundamentally limited by low signal-to-noise ratio and restricted information bandwidth. This raises a fundamental question regarding whether sentence-level linguistic structure can be reliably recovered from such signals. In this work, we suggest that this assumption may not hold under realistic information constraints, and instead propose a semantic compression hypothesis in which EEG signals encode a compressed set of semantic anchors rather than full linguistic structure. Under our new perspective, direct sentence reconstruction becomes an overparameterized objective relative to the intrinsic information capacity of EEG. To address this mismatch, we introduce Brain-CLIPLM, a two-stage framework that decompo...
4.CFMS: Towards Explainable and Fine-Grained Chinese Multimodal Sarcasm Detection Benchmark
arXiv:2604.16372v1: Multimodal sarcasm detection has recently garnered significant attention. However, existing benchmarks suffer from coarse-grained annotations and limited cultural coverage, which hinder research into fine-grained semantic understanding. To address this, we construct CFMS, the first fine-grained multimodal sarcasm dataset tailored for Chinese social media. It comprises 2,796 high-quality image-text pairs and provides a triple-level annotation framework: sarcasm identification, target recognition, and explanation generation. We find that the fine-grained explanation annotations effectively guide AI in generating images with explicit sarcastic intent. Furthermore, we curate a high-consistency parallel Chinese-English metaphor subset (200 entries each), revealing significant limitations of curr...
5.Foundational Study on Authorship Attribution of Japanese Web Reviews for Actor Analysis
arXiv:2604.16376v1: This study investigates the applicability of authorship attribution based on stylistic features to support actor analysis in threat intelligence. As a foundational step toward future application to dark web forums, we conducted experiments using Japanese review data from clear web sources. We constructed datasets from Rakuten Ichiba reviews and compared four methods: TF-IDF with logistic regression (TF-IDF+LR), BERT embeddings with logistic regression (BERT-Emb+LR), BERT fine-tuning (BERT-FT), and metric learning with k-nearest neighbors (Metric+kNN). Results showed that BERT-FT achieved the best performance; however, training became unstable as the number of authors scaled to several hundred, where TF-IDF+LR proved superior in terms of accuracy, stability, and computational cost. Further...
AI Machine Learning
1.BASIS: Balanced Activation Sketching with Invariant Scalars for "Ghost Backpropagation"
arXiv:2604.16324v1: The activation memory required for exact backpropagation scales linearly with network depth, context length, and feature dimensionality, forming an O(L·B·N) spatial bottleneck (where L is the network depth, B is the sequence-batch cardinality, and N is the feature dimension). This constraint throttles the scaling of deep neural networks. While randomized automatic differentiation attempts to mitigate this, it has historically suffered from catastrophic variance. In this paper, we introduce BASIS (Balanced Activation Sketching with Invariant Scalars), an efficient backpropagation algorithm that fully decouples activation memory from the batch and sequence dimensions. BASIS propagates the exact error signal (dX) to preserve flawless gradient flow, but computes the weight updates (dW) using massively compress...
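The O(L·B·N) accounting can be made concrete with a back-of-envelope calculator (fp16 and one saved tensor per layer are simplifying assumptions; real training stacks store several activation tensors per layer, so this is a lower bound):

```python
def activation_bytes(layers: int, batch_tokens: int, hidden: int,
                     bytes_per_value: int = 2) -> int:
    """Lower-bound activation footprint for exact backprop: one fp16
    tensor of shape (batch_tokens, hidden) per layer, i.e. L * B * N values."""
    return layers * batch_tokens * hidden * bytes_per_value

# hypothetical model: 32 layers, an 8192-token batch, hidden size 4096, fp16
gib = activation_bytes(32, 8192, 4096) / 2**30
print(round(gib, 1))  # → 2.0
```

Decoupling the stored activations from B and the sequence length, as BASIS claims to do for the dW computation, removes exactly the batch_tokens factor from this product.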
2.UniMamba: A Unified Spatial-Temporal Modeling Framework with State-Space and Attention Integration
arXiv:2604.16325v1: Multivariate time series forecasting is fundamental to numerous domains such as energy, finance, and environmental monitoring, where complex temporal dependencies and cross-variable interactions pose enduring challenges. Existing Transformer-based methods capture temporal correlations through attention mechanisms but suffer from quadratic computational cost, while state-space models like Mamba achieve efficient long-context modeling yet lack explicit temporal pattern recognition. We therefore introduce UniMamba, a unified spatial-temporal forecasting framework that integrates efficient state-space dynamics with attention-based dependency learning. UniMamba employs a Mamba Variate-Channel Encoding Layer enhanced with FFT-Laplace Transform and TCN to capture global temporal dependencies, and a...
3.Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning
arXiv:2604.16332v1: We find that LoRA fine-tuning exhibits un-learning on contested examples: items with high annotator disagreement show increasing loss during training, a qualitatively distinct pattern largely absent under full fine-tuning and consistent across all six models tested (four encoder, two decoder-only). This discovery emerges from correlating annotation entropy, computed from ChaosNLI's 100 labels per example, with per-example area under the loss curve (AULC) on SNLI and MNLI. The correlation is positive in all 25 conditions tested (Spearman ρ = 0.06-0.43), with decoder-only models showing stronger correlations than encoders at matched LoRA rank. The effect survives partial-correlation controls and replicates across seeds and datasets. A preliminary noise-injection experiment is consistent...
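Annotation entropy as used here is just the Shannon entropy of the per-example label distribution; a minimal sketch, assuming the ChaosNLI-style setup of roughly 100 annotator votes per example:

```python
import math
from collections import Counter

def annotation_entropy(labels) -> float:
    """Shannon entropy (in nats) of the empirical label distribution,
    e.g. over ChaosNLI's 100 annotator votes for one example."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# contested example (votes split) vs. a near-unanimous one
contested = annotation_entropy(["E"] * 40 + ["N"] * 35 + ["C"] * 25)
clear = annotation_entropy(["E"] * 98 + ["N"] * 2)
assert contested > clear  # high entropy marks the contested items
```

The paper's analysis then rank-correlates this per-example entropy against the area under that example's training-loss curve.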
4.A Discordance-Aware Multimodal Framework with Multi-Agent Clinical Reasoning
arXiv:2604.16333v1: Knee osteoarthritis frequently exhibits discordance between structural damage observed in imaging and patient-reported symptoms such as pain. This mismatch complicates clinical interpretation and patient stratification and remains insufficiently modeled in existing decision support systems. We propose a discordance-aware multimodal framework that combines machine learning prediction models with a tool-grounded multi-agent reasoning system. Using baseline data from the FNIH Osteoarthritis Biomarkers Consortium, we trained multimodal models to predict two progression tasks: joint space loss-only progression versus non-progression, and pain-only progression versus non-progression. The predictive system integrates three modality-specific experts: a CatBoost tabular model using demographic, radio...
5.Preventing overfitting in deep learning using differential privacy
arXiv:2604.16334v1: The use of Deep Neural Network based systems in the real world is growing. They have achieved state-of-the-art performance on many image, speech and text datasets. They have been shown to be powerful systems that are capable of learning detailed relationships and abstractions from the data. This is a double-edged sword which makes such systems vulnerable to learning the noise in the training set, thereby negatively impacting performance. This is also known as the problem of overfitting or poor generalization. In a practical setting, analysts typically have limited data to build models that must generalize to unseen data. In this work, we explore the use of a differential-privacy based approach to improve generalization in Deep Neural Networks.
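The standard differential-privacy mechanism for training neural networks is DP-SGD-style gradient aggregation: clip each per-example gradient, then add Gaussian noise calibrated to the clipping bound. A minimal sketch (the paper's exact mechanism and hyperparameters are not specified here, so this illustrates the general technique only):

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_mult=1.0, rng=random):
    """One DP-SGD aggregation step (sketch): clip each per-example gradient
    to L2 norm <= clip_norm, sum, add Gaussian noise scaled to the clipping
    bound, and average. The noise bounds any single example's influence."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            total[i] += g[i] * scale
    sigma = noise_mult * clip_norm
    return [(total[i] + rng.gauss(0.0, sigma)) / n for i in range(dim)]

random.seed(0)
g = dp_sgd_step([[3.0, 4.0], [0.1, -0.2]], clip_norm=1.0, noise_mult=0.1)
print(g)
```

The regularization effect on overfitting comes from the same two ingredients that give the privacy guarantee: no single training example can dominate the update, and the added noise discourages memorizing per-example idiosyncrasies.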
AI Robotics
1.BrainMem: Brain-Inspired Evolving Memory for Embodied Agent Task Planning
arXiv:2604.16331v1: Embodied task planning requires agents to execute long-horizon, goal-directed actions in complex 3D environments, where success depends on both immediate perception and accumulated experience across tasks. However, most existing LLM-based planners are stateless and reactive, operating without persistent memory and therefore repeating errors and struggling with spatial or temporal dependencies. We propose BrainMem (Brain-Inspired Evolving Memory), a training-free hierarchical memory system that equips embodied agents with working, episodic, and semantic memory inspired by human cognition. BrainMem continuously transforms interaction histories into structured knowledge graphs and distilled symbolic guidelines, enabling planners to retrieve, reason over, and adapt behaviors from past experience ...
2.Interdisciplinary Workshop on Mechanical Intelligence: Summary Report
arXiv:2604.16381v1: This report provides a summary of the outcomes of the Interdisciplinary Workshop on Mechanical Intelligence held in 2024. Mechanical Intelligence (MI) represents the phenomenon that novel structural features of material/biological/robotic systems can encode intelligence through responsiveness, adaptivity, memory, and learning in the mechanical structure itself. This is in contrast to computational intelligence, wherein the intelligence functions occur through electrical signaling and computer code. The two-day workshop was held at NSF headquarters on May 30-31 and included 38 invited academic researcher participants and 8 program officers from the NSF. The workshop was structured around active small and large group discussions in groups of 4-5 and 9-10 with the goal of addressing topical qu...
3.RHINO-AR: An Augmented Reality Exhibit for Teaching Mobile Robotics Concepts in Museums
arXiv:2604.16384v1: We present RHINO-AR, an interactive Augmented Reality (AR) museum exhibit that reintroduces the historical mobile robot RHINO into its original exhibition environment at the Deutsches Museum Bonn. The system builds on our previous work RHINO-VR, which reconstructed the robot and the environment in virtual reality. Although this created an engaging experience, it also revealed an important limitation: visitors were separated from the real exhibition space and from the physical robot on display. RHINO-AR addresses this reality gap by placing a virtual reconstruction of the robot directly into the real museum space. Implemented on a Magic Leap 2 headset using Unity, our system combines real-time environment meshing with interactive visualizations of LiDAR sensing, traversability, and pa...
4.Visual-RRT: Finding Paths toward Visual-Goals via Differentiable Rendering
arXiv:2604.16388v1: Rapidly-exploring random trees (RRTs) have been widely adopted for robot motion planning due to their robustness and theoretical guarantees. However, existing RRT-based planners require explicit goal configurations specified as numerical joint angles, while many practical applications provide goal specifications through visual observations such as images or demonstration videos where precise goal configurations are unavailable. In this paper, we propose visual-RRT (vRRT), a motion planner that enables visual-goal planning by unifying gradient-based exploitation from differentiable robot rendering with sampling-based exploration from RRTs. We further introduce (i) a frontier-based exploration-exploitation strategy that adaptively prioritizes visually promising search regions, and (ii) inertia...
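For reference, the sampling-based half of this hybrid, the classical RRT grow loop, looks as follows in an empty 2D workspace (a baseline sketch with a numeric goal and goal-biased sampling; vRRT's point is precisely to replace that numeric goal with a visual one):

```python
import math
import random

def rrt(start, goal, step=0.5, iters=2000, goal_tol=0.5, seed=0):
    """Classical RRT: repeatedly sample a point, find the nearest tree node,
    and extend one step toward the sample. Obstacle checks are omitted."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(iters):
        # goal-biased sampling: 10% of draws pull the tree toward the goal
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 10),
                                                  rng.uniform(0, 10))
        near = min(range(len(nodes)), key=lambda j: math.dist(nodes[j], sample))
        nx, ny = nodes[near]
        d = math.dist((nx, ny), sample)
        if d == 0:
            continue
        if d <= step:
            new = sample                      # clamp: don't overshoot
        else:
            new = (nx + step * (sample[0] - nx) / d,
                   ny + step * (sample[1] - ny) / d)
        parent[len(nodes)] = near
        nodes.append(new)
        if math.dist(new, goal) < goal_tol:
            path, k = [], len(nodes) - 1
            while k is not None:              # walk parents back to the root
                path.append(nodes[k])
                k = parent[k]
            return path[::-1]
    return None

path = rrt((0.0, 0.0), (9.0, 9.0))
assert path is not None and path[0] == (0.0, 0.0)
```

vRRT keeps this exploration loop but scores extensions with a differentiable-rendering loss against the goal image, so the "nearest to goal" notion becomes visual rather than geometric.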
5.Disentangled Robot Learning via Separate Forward and Inverse Dynamics Pretraining
arXiv:2604.16391v1: Vision-language-action (VLA) models have shown great potential in building generalist robots, but still face a dilemma: misalignment of 2D image forecasting and 3D action prediction. Moreover, such a vision-action entangled training manner limits model learning from large-scale, action-free web video data. To address these issues, we propose DeFI, a novel framework that Decouples visual Forward and Inverse dynamics pretraining to exploit respective data sources, wherein video generation and action prediction are disentangled. We introduce the General Forward Dynamics Model (GFDM), pretrained on diverse human and robot videos for future prediction, and the General Inverse Dynamics Model (GIDM), trained via self-supervised learning to infer latent actions from unlabeled video transitions. These ...
Financial AI
1.The Virtue of Sparsity in Complexity
Sparsity or complexity? In modern high-dimensional asset pricing, these are often viewed as competing principles: richer feature spaces appear to favor complexity, while economic intuition has long favored parsimony. We show that this tension is misplaced. We distinguish capacity sparsity (the dimensionality of the candidate feature space) from factor sparsity (the parsimonious structure of priced risks) and argue that the two are complements: expanding capacity enables the discovery of factor sparsity. Revisiting the benchmark empirical design of Didisheim et al. (2025) and pushing it to higher complexity regimes, we show that nonlinear feature expansions combined with basis pursuit yield portfolios whose out-of-sample performance dominates ridgeless benchmarks beyond a critical complexity threshold. The evidence shows that the gains from co...
2.Spurious Predictability in Financial Machine Learning
Adaptive specification search generates statistically significant backtests even under martingale-difference nulls. We introduce a falsification audit testing complete predictive workflows against synthetic reference classes, including zero-predictability environments and microstructure placebos. Workflows generating significant walk-forward evidence in these environments are falsified. For passing workflows, we quantify selection-induced performance inflation using an absolute magnitude gap linking optimized in-sample evidence to disjoint walk-forward realizations, adjusted for effective multiplicity. Simulations validate extreme-value scaling under correlated searches and demonstrate detection power under genuine structure. Empirical case studies confirm that many apparent findings represent methodological artifacts rather than genuine ...
3.The Acoustic Camouflage Phenomenon: Re-evaluating Speech Features for Financial Risk Prediction
In computational paralinguistics, detecting cognitive load and deception from speech signals is a heavily researched domain. Recent efforts have attempted to apply these acoustic frameworks to corporate earnings calls to predict catastrophic stock market volatility. In this study, we empirically investigate the limits of acoustic feature extraction (pitch, jitter, and hesitation) when applied to highly trained speakers in in-the-wild teleconference environments. Utilizing a two-stream late-fusion architecture, we contrast an acoustic-based stream with a baseline Natural Language Processing (NLP) stream. The isolated NLP model achieved a recall of 66.25% for tail-risk downside events. Surprisingly, integrating acoustic features via late fusion significantly degraded performance, reducing recall to 47.08%. We identify this degradation as Ac...
4.PRAGMA: Revolut Foundation Model
Modern financial systems generate vast quantities of transactional and event-level data that encode rich economic signals. This paper presents PRAGMA, a family of foundation models for multi-source banking event sequences. Our approach pre-trains a Transformer-based architecture with masked modelling on a large-scale, heterogeneous banking event corpus using a self-supervised objective tailored to the discrete, variable-length nature of financial records. The resulting model supports a wide range of downstream tasks such as credit scoring, fraud detection, and lifetime value prediction: strong performance can be achieved by training a simple linear model on top of the extracted embeddings and can be further improved with lightweight fine-tuning. Through extensive evaluation on downstream tasks, we demonstrate that PRAGMA achieves superior...
5.Quantum Computing for Financial Transformation: A Review of Optimisation, Pricing, Risk, Machine Learning, and Post-Quantum Security
Quantum computing is becoming strategically relevant to finance because several core financial bottlenecks are already defined by combinatorial search, expectation estimation, rare-event analysis, representation learning, and long-horizon cryptographic resilience. This review examines that landscape across five connected domains: constrained portfolio optimisation, derivative pricing, tail-risk and scenario estimation, quantum machine learning, and post-quantum security. Rather than treating these topics as isolated demonstrations, the article studies them as linked layers of a financial-computation stack. Across all five domains, the review applies a common evaluative logic: identify the financial bottleneck, specify the relevant quantum primitive, compare it with an explicit classical benchmark, and assess the result under realistic imp...
GSMA Newsroom
1.GSMA Report Urges Japan to Take Bold Action to Convert Technical Excellence into Global Digital Leadership
Summary available at source link.
2.From Rich Text to Video: RCS Universal Profile 4.0 has arrived
Summary available at source link.
3.Mobile Money accounted for $2 trillion in transactions in 2025, doubling since 2021 as active accounts continue to grow
Summary available at source link.
4.Strengthening the Global Fight Against Fraud and Scams – Takeaways from the Global Fraud Summit in Vienna
Summary available at source link.
5.GSMA MWC26 Barcelona closes 20th anniversary edition
Summary available at source link.
Generative AI (arXiv)
1.When Can LLMs Learn to Reason with Weak Supervision?
Large language models have achieved significant reasoning improvements through reinforcement learning with verifiable rewards (RLVR). Yet as model capabilities grow, constructing high-quality reward signals becomes increasingly difficult, making it essential to understand when RLVR can succeed under weaker forms of supervision. We conduct a systematic empirical study across diverse model families and reasoning domains under three weak supervision settings: scarce data, noisy rewards, and self-supervised proxy rewards. We find that generalization is governed by training reward saturation dynamics: models that generalize exhibit a prolonged pre-saturation phase during which training reward and downstream performance climb together, while models that saturate rapidly memorize rather than learn. We identify reasoning faithfulness, defined as ...
2.Latent Phase-Shift Rollback: Inference-Time Error Correction via Residual Stream Monitoring and KV-Cache Steering
Large language models frequently commit unrecoverable reasoning errors mid-generation: once a wrong step is taken, subsequent tokens compound the mistake rather than correct it. We introduce Latent Phase-Shift Rollback (LPSR): at each generation step, we monitor the residual stream at a critical layer l_crit, detect abrupt directional reversals (phase shifts) via a cosine-similarity + entropy dual gate, and respond by rolling back the KV-cache and injecting a pre-computed steering vector. No fine-tuning, gradient computation, or additional forward passes are required. LPSR achieves 44.0% on MATH-500 with an 8B model versus 28.8% for standard AR (+15.2 pp; McNemar χ² = 66.96, p < 10⁻¹⁵). Critically, prompted self-correction, the most natural inference-time baseline, scores only 19.8%, below standard ...
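The dual gate can be sketched in a few lines: flag a candidate phase shift when the residual-stream direction reverses and the next-token distribution is high-entropy (the thresholds and toy vectors below are illustrative assumptions, not the paper's values):

```python
import math

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def phase_shift_gate(h_prev, h_curr, next_token_probs,
                     cos_thresh=0.0, ent_thresh=1.0):
    """Dual gate (schematic): fire only when the residual-stream direction
    reverses AND the model is simultaneously uncertain about the next token."""
    return (cosine(h_prev, h_curr) < cos_thresh
            and entropy(next_token_probs) > ent_thresh)

steady = phase_shift_gate([1.0, 0.0], [0.9, 0.1], [0.9, 0.05, 0.05])
reversal = phase_shift_gate([1.0, 0.0], [-1.0, 0.2], [0.4, 0.3, 0.3])
print(steady, reversal)  # → False True
```

Requiring both signals is what keeps the gate cheap and selective: a direction flip alone can be benign, and high entropy alone is common at genuine branch points.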
3.Benchmarking System Dynamics AI Assistants: Cloud Versus Local LLMs on CLD Extraction and Discussion
We present a systematic evaluation of large language model families -- spanning both proprietary cloud APIs and locally-hosted open-source models -- on two purpose-built benchmarks for System Dynamics AI assistance: the CLD Leaderboard (53 tests, structured causal loop diagram extraction) and the Discussion Leaderboard (interactive model discussion, feedback explanation, and model building coaching). On CLD extraction, cloud models achieve 77-89% overall pass rates; the best local model reaches 77% (Kimi K2.5 GGUF Q3, zero-shot engine), matching mid-tier cloud performance. On Discussion, the best local models achieve 50-100% on model building steps and 47-75% on feedback explanation, but only 0-50% on error fixing -- a category dominated by long-context prompts that expose memory limits in local deployments. A c...
4.OGER: A Robust Offline-Guided Exploration Reward for Hybrid Reinforcement Learning
Recent advancements in Reinforcement Learning with Verifiable Rewards (RLVR) have significantly improved Large Language Model (LLM) reasoning, yet models often struggle to explore novel trajectories beyond their initial latent space. While offline teacher guidance and entropy-driven strategies have been proposed to address this, they often lack deep integration or are constrained by the model's inherent capacity. In this paper, we propose OGER, a novel framework that unifies offline teacher guidance and online reinforcement learning through a specialized reward modeling lens. OGER employs multi-teacher collaborative training and constructs an auxiliary exploration reward that leverages both offline trajectories and the model's own entropy to incentivize autonomous exploration. Extensive experiments across mathematical and general reasonin...
5.MASS-RAG: Multi-Agent Synthesis Retrieval-Augmented Generation
Large language models (LLMs) are widely used in retrieval-augmented generation (RAG) to incorporate external knowledge at inference time. However, when retrieved contexts are noisy, incomplete, or heterogeneous, a single generation process often struggles to reconcile evidence effectively. We propose MASS-RAG, a multi-agent synthesis approach to retrieval-augmented generation that structures evidence processing into multiple role-specialized agents. MASS-RAG applies distinct agents for evidence summarization, evidence extraction, and reasoning over retrieved documents, and combines their outputs through a dedicated synthesis stage to produce the final answer. This design exposes multiple intermediate evidence views, allowing the model to compare and integrate complementary information before answer generation. Experiments on four...
Hugging Face Daily Papers
1.MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval
Mathematical problem solving remains a challenging test of reasoning for large language and multimodal models, yet existing benchmarks are limited in size, language coverage, and task diversity. We introduce MathNet, a high-quality, large-scale, multimodal, and multilingual dataset of Olympiad-level math problems together with a benchmark for evaluating mathematical reasoning in generative models and mathematical retrieval in embedding-based systems. MathNet spans 47 countries, 17 languages, and two decades of competitions, comprising 30,676 expert-authored problems with solutions across diverse domains. In addition to the core dataset, we construct a retrieval benchmark consisting of mathematically equivalent and structurally similar problem pairs curated by human experts. MathNet supports three tasks: (i) Problem Solving, (ii) Math-Awar...
2.MUA: Mobile Ultra-detailed Animatable Avatars
Building photorealistic, animatable full-body digital humans remains a longstanding challenge in computer graphics and vision. Recent advances in animatable avatar modeling have largely progressed along two directions: improving the fidelity of dynamic geometry and appearance, or reducing computational complexity to enable deployment on resource-constrained platforms, e.g., VR headsets. However, existing approaches fail to achieve both goals simultaneously: Ultra-high-fidelity avatars typically require substantial computation on server-class GPUs, whereas lightweight avatars often suffer from limited surface dynamics, reduced appearance details, and noticeable artifacts. To bridge this gap, we propose a novel animatable avatar representation, termed Wavelet-guided Multi-level Spatial Factorized Blendshapes, and a corresponding distillatio...
3.Sessa: Selective State Space Attention
Modern sequence models are dominated by Transformers, where self-attention mixes information from the visible context in an input-dependent way. However, when retrieval is not sharp and attention remains diffuse over an effective support $S_{\mathrm{eff}}(t)$, the influence of any individual token is diluted, typically scaling as $O(1/S_{\mathrm{eff}}(t))$ and reaching $O(1/\ell)$ for old tokens in full-prefix settings. Structured state-space models process sequences recurrently through an explicit feedback path; selective variants such as Mamba make this feedback input-dependent, yet unless the feedback state can be held effectively frozen over long intervals, their long-range sensitivity decays exponentially with lag. Existing architectures therefore either retrieve from the past in a single read or propagate information through a single feedback chain. We...
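The dilution claim is easy to check numerically. A minimal sketch (illustrative, not the paper's model): when attention stays diffuse over a context of length ℓ, perturbing a single token's value shifts the output by roughly 1/ℓ.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_output(values, logits):
    """Single-query attention: softmax(logits) @ values."""
    w = np.exp(logits - logits.max())
    return (w / w.sum()) @ values

deltas = []
for ell in [16, 256, 4096]:
    values = rng.normal(size=ell)
    logits = rng.normal(scale=0.1, size=ell)  # diffuse attention: no sharp retrieval
    bumped = values.copy()
    bumped[0] += 1.0                          # perturb one old token's value
    delta = abs(attention_output(bumped, logits) - attention_output(values, logits))
    deltas.append(delta)
    print(f"context {ell:5d}: influence of one token = {delta:.5f}  (1/l = {1/ell:.5f})")
```

The measured influence tracks 1/ℓ closely, which is exactly the dilution the abstract describes for old tokens in full-prefix settings.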
4.Revisiting Active Sequential Prediction-Powered Mean Estimation
In this work, we revisit the problem of active sequential prediction-powered mean estimation, where at each round one must decide the query probability of the ground-truth label upon observing the covariates of a sample. Furthermore, if the label is not queried, the prediction from a machine learning model is used instead. Prior work proposed an elegant scheme that determines the query probability by combining an uncertainty-based suggestion with a constant probability that encodes a soft constraint on the query probability. We explored different values of the mixing parameter and observed an intriguing empirical pattern: the smallest confidence width tends to occur when the weight on the constant probability is close to one, thereby reducing the influence of the uncertainty-based component. Motivated by this observation, we develop a non...
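The query rule and estimator described above can be sketched in a toy simulation. All distributions, parameter values, and the helper `pp_mean` are illustrative assumptions, not from the paper; the sketch only shows that mixing an uncertainty-based suggestion with a constant probability still yields an unbiased mean estimate, because unqueried samples use the (possibly biased) model prediction plus an inverse-probability-weighted correction on queried labels.

```python
import numpy as np

rng = np.random.default_rng(1)

def pp_mean(y, f, p, queried):
    """Prediction-powered mean: prediction f plus an IPW correction on queried labels."""
    correction = np.where(queried, (y - f) / p, 0.0)
    return np.mean(f + correction)

n, lam, p_const = 2000, 0.7, 0.3
y = rng.normal(loc=2.0, size=n)               # ground-truth labels
f = y + rng.normal(scale=0.5, size=n) + 0.4   # deliberately biased model predictions
u = rng.uniform(0.05, 0.9, size=n)            # uncertainty-based suggestion
p = lam * p_const + (1 - lam) * u             # mixed query probability

estimates = []
for _ in range(500):
    queried = rng.uniform(size=n) < p
    estimates.append(pp_mean(y, f, p, queried))

print(f"true mean          = {y.mean():.4f}")
print(f"avg. PP estimate   = {np.mean(estimates):.4f}  (unbiased despite biased f)")
```

Because the correction term has expectation y - f for every choice of p &gt; 0, the mixing weight lam affects only the variance, which is the quantity the paper's empirical pattern concerns.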
5.A Note on TurboQuant and the Earlier DRIVE/EDEN Line of Work
This note clarifies the relationship between the recent TurboQuant work and the earlier DRIVE (NeurIPS 2021) and EDEN (ICML 2022) schemes. DRIVE is a 1-bit quantizer that EDEN extended to any $b>0$ bits per coordinate; we refer to them collectively as EDEN. First, TurboQuant$_{\text{mse}}$ is a special case of EDEN obtained by fixing EDEN's scalar scale parameter to $S=1$. EDEN supports both biased and unbiased quantization, each optimized by a different $S$ (chosen via methods described in the EDEN works). The fixed choice $S=1$ used by TurboQuant is generally suboptimal, although the optimal $S$ for biased EDEN converges to $1$ as the dimension grows; accordingly TurboQuant$_{\text{mse}}$ approaches EDEN's behavior for large $d$. Second, TurboQuant$_{\text{prod}}$ combines a biased $(b-1)$-bit EDEN step with an unbiased 1-bit QJL quanti...
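A toy sketch of why the scalar scale S matters in biased sign quantization. This is not the DRIVE/EDEN pipeline (which applies a random rotation first, and uses its own scale convention); the normalization `c` below is one arbitrary choice. The point is only that reconstruction MSE is quadratic in S with a closed-form minimizer, so any fixed choice such as S=1 is optimal only under a particular parameterization.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 100_000
x = rng.normal(size=d)                  # high-dimensional Gaussian-like vector

c = np.linalg.norm(x) / np.sqrt(d)      # one possible per-coordinate scale (illustrative)

def mse(S):
    """Reconstruction error of the biased decode x_hat = S * c * sign(x)."""
    return np.mean((x - S * c * np.sign(x)) ** 2)

# MSE(S) is quadratic in S; its minimizer has a closed form.
S_opt = np.sum(np.abs(x)) / (c * d)
print(f"S_opt        = {S_opt:.4f}  (about sqrt(2/pi) for Gaussian x under this convention)")
print(f"MSE at S=1   = {mse(1.0):.4f}")
print(f"MSE at S_opt = {mse(S_opt):.4f}  (lower: the fixed S=1 is suboptimal here)")
```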
IEEE Xplore AI
1.Optical Fiber Networks Can Keep Rail Networks Safe
This article is part of our exclusive IEEE Journal Watch series in partnership with IEEE Xplore. Rail networks are vast, which makes it difficult to conduct comprehensive, continuous safety monitoring. Researchers in China have suggested analyzing the vibrations of existing fiber cables buried underground alongside railway tracks to detect problems. In a study published 5 March in the Journal of Optical Communications and Networking, the research group demonstrated through experiments how the technique can successfully identify a number of issues associated with train safety, including faulty train wheels and broken sound barriers alongside the railway tracks. Sasha Dong is a junior chair professor at Southeast University’s School of Transportation, in Nanjing, China. She notes that traditional approaches for monitoring railways—such as ...
2.Boston Dynamics and Google DeepMind Teach Spot to Reason
The amazing and frustrating thing about robots is that they can do almost anything you want them to do, as long as you know how to ask properly. In the not-so-distant past, asking properly meant writing code, and while we’ve thankfully moved beyond that brittle constraint, there’s still an irritatingly inverse correlation between ease of use and complexity of task. AI has promised to change that. The idea is that when AI is embodied within robots—giving AI software a physical presence in the world—those robots will be imbued with reasoning and understanding. This is cutting-edge stuff, though, and while we’ve seen plenty of examples of embodied AI in a research context, finding applications where reasoning robots can provide reliable commercial value has not been easy. Boston Dynamics is one of the few companies to commercially deploy leg...
3.Sarang Gupta Builds AI Systems With Real-World Impact
Like many engineers, Sarang Gupta spent his childhood tinkering with everyday items around the house. From a young age he gravitated to projects that could make a difference in someone’s everyday life. When the family’s microwave plug broke, Gupta and his father figured out how to fix it. When a drawer handle started jiggling annoyingly, the youngster made sure it didn’t do so for long. (Gupta is a data science staff member at OpenAI in San Francisco; his alma maters are The Hong Kong University of Science and Technology and Columbia.) By age 11, his interest expanded from nuts and bolts to software. He learned programming languages such as Basic and Logo and designed simple programs including one that helped a local restaurant automate online ordering and billing. Gupta, an IEEE senior member, brings his mix of cu...
4.12 Graphs That Explain the State of AI in 2026
The capabilities of leading AI models continue to accelerate, and the largest AI companies, including OpenAI and Anthropic, are hurtling toward IPOs later this year. Yet resentment toward AI continues to simmer, and in some cases has boiled over, especially in the United States, where local governments are beginning to embrace restrictions or outright bans on new data center development. It’s a lot to keep track of, but the 2026 edition of the AI Index from Stanford University’s Institute for Human-Centered Artificial Intelligence pulls it off. The report, which comes in at over 400 pages, includes dozens of data points and graphs that approach the topic from multiple angles, from benchmark scores to investment and public perception. As in prior years (see our coverage from 2021, 2022, 2023, 2024, and 2025), we’ve read the report and ident...
5.ZTASP: A Zero-Trust Platform for Governing Autonomous Systems at Mission Scale
ZTASP is a mission-scale assurance and governance platform designed for autonomous systems operating in real-world environments. It integrates heterogeneous systems—including drones, robots, sensors, and human operators—into a unified zero-trust architecture. Through Secure Runtime Assurance (SRTA) and Secure Spatio-Temporal Reasoning (SSTR), ZTASP continuously verifies system integrity, enforces safety constraints, and enables resilient operation even under degraded conditions. ZTASP has progressed beyond conceptual design, with operational validation at Technology Readiness Level (TRL) 7 in mission-critical environments. Core components, including Saluki secure flight controllers, have reached TRL 8 and are deployed in customer systems. While initially developed for high-consequence mission environments, the same assurance challenges are...
MIT Sloan Management
1.Industrial AI for the Physical World: Siemens’s Peter Koerte
In this episode of the Me, Myself, and AI podcast, host Sam Ransbotham talks with Peter Koerte, a member of the managing board and chief strategy and technology officer of Siemens, about how industrial AI is quietly transforming the infrastructure that powers everyday life. While consumer AI grabs headlines, Peter explains how artificial intelligence is […]
2.Beyond the Model — Why Responsible AI Must Address Workforce Impact
For the fifth year in a row, MIT Sloan Management Review and Boston Consulting Group (BCG) have assembled an international panel of AI experts that includes academics and practitioners to help us understand how responsible artificial intelligence (RAI) is being implemented across organizations worldwide. In prior years, we examined organizational RAI maturity; third-party, generative, and […]
3.How AI Helps the Best and Hurts the Rest
Can generative AI serve as an effective adviser for business owners and entrepreneurs? Intuitive chat-based natural language interfaces mean that anyone who can read and write can use GenAI tools for a wide range of tasks, even if they lack technical skills. This has obvious appeal for entrepreneurs and small business owners, many […]
4.Lessons From Innovation Pioneer Florence Nightingale
Florence Nightingale may be best remembered as the epitome of a kind, caring nurse, but she was also a force for disruptive innovation in health care. Three distinct elements of her work — communicating data compellingly, publicizing clear and simple instructions, and expanding professionalized training — carry timeless lessons […]
5.The Human Side of AI Adoption: Lessons From the Field
Not a day goes by without another article being published about how AI could disrupt yet another aspect of our business or personal lives. In recent years, AI adoption has indeed taken off. However, if you pay close attention, you’ll notice a dichotomy. Many examples of successful early adoption of artificial intelligence […]
NBER Working Papers
1.The Impact of Maternal Education on Early Childhood Development -- by Moriam Khanam, Mohammad Hajizadeh, Casey Warman
This study leverages exogenous variation from a secondary school stipend program for female students in rural Bangladesh to estimate the causal effect of maternal education on early childhood development. Using data from the 2019 Bangladesh Multiple Indicator Cluster Survey, we find that five years of stipend eligibility increase mothers' schooling by about one year. Instrumental variable estimates show that an additional year of maternal education improves early childhood development scores by 0.5 points on a scale of 0-10, with gains in overall developmental readiness (7.5 percentage points) and in the literacy–numeracy (7.7 percentage points) and physical (1.9 percentage points) domains. The results are robust across specifications. We also estimate the effects of maternal education on potential mechanisms, including children's nu...
2.Tariffs and the Term Structure of Inflation Expectations -- by Stéphane Auray, Michael B. Devereux, Anthony M. Diercks, Aurélien Eyquem, Joon Kim
Inflation expectations derived from financial markets exhibited unprecedented dynamics in 2025: the correlation between one-year inflation swaps and one-year-ahead one-year forward rates turned significantly negative for the first time on record. We show that this decoupling occurred primarily on days when tariff news dominated market pricing, using a two-stage event classification validated by Bloomberg news trends. Standard small open-economy New Keynesian models in which tariffs generate a one-time price-level increase imply positive comovement across horizons and cannot explain this pattern. We explain these occurrences through the lens of an amended small open-economy New Keynesian model. Three ingredients prove critical for reproducing the observed negative conditional correlation between spot and forward inflation after tariff sho...
3.Bilateral Conflict Risk and Trade: Military Wars, Trade Wars, and Diplomatic Noise -- by Joshua Aizenman, Rodolphe Desbordes, Jamel Saadaoui
How damaging is a “trade war” compared to a “military war” or a “war of words”? Aggregate conflict indicators cannot say, because they treat missile strikes, sanctions, and diplomatic protests as equivalent. We build a monthly bilateral indicator from GDELT event data, calibrated against human-curated ground truth, that decomposes hostility into four layers: kinetic fighting (“military war”), military posture, sanctions-context tensions (“trade war”), and routine diplomacy. The decomposed panel reveals a secular shift: over the past decade, governments have steadily substituted economic coercion for military confrontation, nearly doubling the trade-weighted share of hostility channelled through sanctions contexts. In a gravity trade model, the aggregate indicator is negative, large, and statistically significant, but the decomposition rev...
4.The “Peace Dividend” of International Trade: A New Empirical Approach -- by Ling Feng, Qiuyue Huang, Zhiyuan Li, Christopher M. Meissner
This paper investigates the causal impact of international trade on interstate military conflicts using global bilateral data from 1962 to 2014. To address endogeneity concerns, we exploit exogenous spatial-temporal variation in international trade stemming from technological advances in air relative to maritime transport. Empirical results demonstrate a strong “peace dividend” of international trade: that is, increased trade significantly reduces the probability and intensity of conflicts between nations. This effect remains robust across specifications and withstands a wide range of potential confounders. Such findings highlight how economic interdependence shapes international conflict—a relationship that is especially relevant amid escalating geopolitical tensions and the global shift toward “decoupling”, “de-risking”, and greater tra...
5.How Have Universities Survived for Nearly a Millennium -- by David M. Cutler, Edward L. Glaeser
How have universities managed to survive and evolve over almost 1,000 years to become wildly heterogeneous, unusually fractious, multi-product, non-profit entities? Universities began as teachers’ guilds, and they still give faculty a remarkable degree of autonomy. That structure attracts and empowers intellectuals, who are selected in part on their taste for knowledge, and that structure, together with support from entrepreneurs and philanthropists, has enabled universities to morph in ways that firms rarely do. Intellectual autonomy can also explain why universities are so often at odds with legal authorities and why faculty fight so often with each other and with their bosses. This essay presents a model of university organization and sketches the evolution of the university’s products and conflicts over the last 900 years. We also discuss the social value of university e...
NY Fed - Liberty Street
1.Bank Failures: The Roles of Solvency and Liquidity
Do banks fail because of runs or because they become insolvent? Answering this question is central to understanding financial crises and designing effective financial stability policies. Long-run historical evidence reveals that the root cause of bank failures is usually insolvency. The importance of bank runs is somewhat overstated. Runs matter, but in most cases they trigger or accelerate failure at already weak banks, rather than cause otherwise sound banks to fail.
2.The R*–Labor Share Nexus
Over the past quarter century, the U.S. economy has experienced significant declines in both the labor share of income and the natural rate of interest, referred to as R*. Existing research has largely analyzed these two developments in isolation. In this post, we provide a simple model that captures the joint evolution of the labor share and R*, which we call the R*–labor share nexus. Our key finding is that structural changes affecting R* also influence the evolution of the labor share, and thereby wages and prices. This highlights a potentially important channel, absent from many macroeconomic models, through which the factors that determine R* also affect the labor share and, in turn, broader macroeconomic developments, with implications for monetary policy.
3.Use of Gen AI in the Workplace and the Value of Access to Training
The rapid spread of generative AI (gen AI) tools is reshaping the workplace at a remarkable rate. Yet relatively little is known about whether workers have access to these tools, how the tools affect workers’ daily productivity, and how much workers value the training needed to use the tools effectively. In this post, we shed light on these issues by drawing on supplemental questions in the November 2025 Survey of Consumer Expectations (SCE), fielded to a representative sample of the U.S. population. We find that adoption of AI tools at work is heterogeneous, that a sizable share of workers see AI training as important, and that a significant share of employers are nonetheless not yet providing access to AI tools or training on how to use them.
4.What Millions of Homeowner’s Insurance Contracts Reveal About Risk Sharing
Housing is the largest component of assets held by households in the United States, totaling $48 trillion in 2025. When natural disasters strike, the resulting damage to homes can be large relative to households’ liquid savings. Homeowner’s insurance is the primary financial tool households use to protect themselves against property risk. Despite the economic importance of homeowner’s insurance, we know surprisingly little about how insurance contracts are actually designed with respect to property risk. In this post, which is based on our new paper, “Economics of Property Insurance,” we examine how homeowner’s insurance contracts are structured in practice. Using a new granular dataset covering millions of homeowner’s insurance policies, we document ...
5.A Closer Look at Emerging Market Resilience During Recent Shocks
A succession of shocks to the global economy in recent years has focused attention on the improved economic and financial resilience of emerging market economies. For some of these economies, this assessment is well-founded and highlights the fruits of deep, structural economic reforms since the 1990s. However, for a much larger universe of countries, the ability to weather shocks is still mixed and many remain vulnerable. In this post, we explore the divide between the two sets of countries and focus on the effects of recent economic shocks, including the ongoing conflict in the Middle East.
Project Syndicate
1.The West Is Still Getting Russia Wrong
While Vladimir Putin views Russia as a great power and civilizational counterweight to Western liberalism, he does not have a coherent plan to remake the world, let alone the capacity to do so. Because of this weakness, the Kremlin is focused not on domination, but on disruption.
2.The Global AI Threat Has Arrived
The ability of Anthropic’s new AI model, Claude Mythos Preview, to find vulnerabilities across major operating systems and web browsers has dangerous implications for today’s highly interconnected world. Instead of merely securing its own firms, the United States must address the problem through diplomacy, particularly with China.
3.The Geopolitics of Infrastructure
No longer does global power rest only on alliances, military might, currency dynamics, and effective control of multilateral institutions. The new geopolitical contest is between competing infrastructure blocs: packages of finance, contractors, standards, and data systems that create long-term dependencies.
4.Africa’s Geopolitical Hand Is Stronger than Ever
As the old world order fractures and a multipolar era takes shape, Africa has an unprecedented opportunity to translate its global position and enviable economic endowments into durable geopolitical and economic leverage. But African leaders will have to play their cards right.
5.The Private Credit Panic Is Overblown
Private credit markets are under growing stress, fueling fears of a financial crisis that could spill over to the real economy. But a closer look at the evidence suggests that the risks are less severe than in previous cycles, and that predictions of a meltdown are running well ahead of the facts.
RCR Wireless
1.SFR bid tests French telecom market structure
The potential acquisition of French telco SFR by Orange, Bouygues Telecom, and Iliad could reshape pricing dynamics and investment incentives in the local telecoms market In sum – what to know: Market consolidation – Fewer operators could ease price competition,…
2.Rising GPS jamming attacks threaten critical sectors
An uptick in GPS jamming is disrupting navigation along maritime and aviation routes in active conflict zones and beyond. GPS jamming incidents have grown sharply in recent years. According to multiple sources, thousands of jamming attacks have been reported…
3.Verizon details World Cup 2026 network strategy
Verizon explains how the company is scaling network capacity and deploying infrastructure across host venues In sum – what to know: Capacity boost – Verizon will increase bandwidth by 3–5x across U.S. stadiums using 5G, C-band, and mmWave spectrum to…
4.‘No token performance here’ – physical AI on private 5G (easy as Raspberry Pi)
Physical AI for robot automation in factories and plants requires lightweight maths models, not token-hungry language models, says NTT Data. It does not have to wait on expensive GPUs when CPUs work fine. But it does need private 5G for…
5.AI-native 6G will converge connectivity, compute and sensing
From ISAC and digital twins to personal AI and physical AI, Qualcomm and its partners are designing 6G as an end-to-end intelligent platform As Qualcomm looks beyond 5G, the company is framing 6G less as a generational upgrade and more…
Semantic Scholar – Machine Learning
1.Source Error
Check Feed
Telecom & 6G AI
1.Far-Field Absolute Gain Antenna Measurements at Sub-THz Frequencies: A New Interpretation
The evolution of large-aperture antennas and arrays in the sub-THz band (100-300 GHz) means that traditional far-field (FF) gain measurements require large distances, making them impractical in many laboratory environments. In the present work, absolute antenna gain measurements are performed in localized distance clusters for commercial horn antennas in the sub-THz range of 145-170 GHz using the three-antenna method, leveraging a theoretically derived modified FF equation along with the Friis transmission equation to enable a compact measurement setup. By applying the proposed modified FF formulation, the approach aims to redefine the FF distance by considering the combined effects of both the transmitting and receiving antennas, accounting for their aperture sizes and radiation characteristics. This ...
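The distance problem the paper addresses follows directly from the conventional Fraunhofer criterion d_F = 2*D^2/lambda: for a fixed aperture D, the required far-field range grows with frequency. A back-of-envelope check (the aperture sizes are illustrative assumptions, not values from the paper):

```python
# Fraunhofer far-field distance d_F = 2*D^2 / lambda at sub-THz frequencies.
C = 299_792_458.0  # speed of light, m/s

def fraunhofer_distance(aperture_m, freq_hz):
    lam = C / freq_hz
    return 2 * aperture_m ** 2 / lam

for D in [0.02, 0.05, 0.10]:               # 2 cm, 5 cm, 10 cm apertures (illustrative)
    d_f = fraunhofer_distance(D, 160e9)    # mid-band of the 145-170 GHz range
    print(f"D = {D * 100:4.0f} cm -> d_F = {d_f:6.2f} m at 160 GHz")
```

Even a 10 cm aperture already pushes the conventional far-field boundary past 10 m at 160 GHz, which is why a modified FF formulation enabling compact setups is attractive.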
2.Passive RIS Is Not Silent: Revisiting Performance Limits Under Thermal Noise
Reconfigurable intelligent surfaces (RISs) have emerged as a promising solution for enabling energy-efficient and flexible spectrum usage in wireless communication, particularly in the context of sixth-generation (6G) networks. While passive RIS architectures are widely regarded as virtually noiseless due to the lack of active components, this idealized assumption can lead to misleading performance evaluations. In this paper, we revisit this assumption and demonstrate that the thermal noise generated by passive RIS elements, though often neglected, can significantly affect system performance. We propose a tractable approximate analytical framework that incorporates RIS-induced thermal noise into the system and derive closed-form expressions for key performance metrics, such as outage probability and throughput. Simulation results validat...
3.ExAI5G: A Logic-Based Explainable AI Framework for Intrusion Detection in 5G Networks
Intrusion detection systems (IDSs) for 5G networks must handle complex, high-volume traffic. Although opaque "black-box" models can achieve high accuracy, their lack of transparency hinders trust and effective operational response. We propose ExAI5G, a framework that prioritizes interpretability by integrating a Transformer-based deep learning IDS with logic-based explainable AI (XAI) techniques. The framework uses Integrated Gradients to attribute feature importance and extracts a surrogate decision tree to derive logical rules. We introduce a novel evaluation methodology for LLM-generated explanations, using a powerful evaluator LLM to assess actionability and measuring their semantic similarity and faithfulness. On a 5G IoT intrusion dataset, our system achieves 99.9% accuracy and a 0.854 macro F1-score, demonstrating strong performan...
4.User Mobility Demands Near-Field Communications in Terahertz Band Wireless Networks Beyond 6G
Near-field propagation is often unavoidable at terahertz (THz) frequencies due to the large apertures needed for sufficient array gain, yet near-field operation complicates practical system design, especially under user mobility. This paper asks whether a mobile THz link can remain broadband, achieve the desired high rates and coverage, while operating exclusively in the radiative far field. To answer this question, we develop a proof-by-contradiction feasibility framework that jointly enforces (i) a far-field requirement based on the Fraunhofer distance and (ii) a reliability requirement specified by a target SNR at the worst-case link distance. We derive closed-form upper bounds on the far-field-feasible bandwidth for stationary and mobile links. We further incorporate practical misalignment through several UE rotation and mobility scen...
5.Paradigm Shift from Statistical Channel Modeling to Digital Twin Prediction: An Environment-Generalizable ChannelLM for 6G AI-enabled Air Interface
As 6G advances, ubiquitous connectivity and higher capacity requirements of the air interface pose substantial challenges for accurate and real-time wireless channel acquisition in diverse environments. Conventional statistical channel modeling relies on offline measurement data from limited environments, struggling to support online applications facing diverse environments. To this end, the digital twin channel (DTC) has emerged as a novel paradigm that constructs a digital replica of the physical environment through high-fidelity sensing and predicts the corresponding channel in real time using artificial intelligence (AI) models. As the engine of DTC, existing AI models struggle to simultaneously achieve strong environmental generalization in the real world and end-to-end channel prediction for real-time tasks. Therefore, this paper propos...
arXiv Quantitative Finance
1.QRAFTI: An Agentic Framework for Empirical Research in Quantitative Finance
We introduce a multi-agent framework intended to emulate parts of a quantitative research team and support equity factor research on large financial panel datasets. QRAFTI integrates a research toolkit for panel data with MCP servers that expose data access, factor construction, and custom coding operations as callable tools. It can help replicate established factors, formulate and test new signals, and generate standardized research reports accompanied by narrative analysis and computational traces. On multi-step empirical tasks, using chained tool calls and reflection-based planning may offer better performance and explainability than dynamic code generation alone.
2.Dissecting AI Trading: Behavioral Finance and Market Bubbles
We study how AI agents form expectations and trade in experimental asset markets. Using a simulated open-call auction populated by autonomous Large Language Model (LLM) agents, we document three main findings. First, AI agents exhibit classic behavioral patterns: a pronounced disposition effect and recency-weighted extrapolative beliefs. Second, these individual-level patterns aggregate into equilibrium dynamics that replicate classic experimental findings (Smith et al., 1988), including the predictive power of excess demand for future prices and the positive relationship between disagreement and trading volume. Third, by analyzing the agents' reasoning text through a twenty-mechanism scoring framework, we show that targeted prompt interventions causally amplify or suppress specific behavioral mechanisms, significantly altering the magnit...
3.Signal or Noise in Multi-Agent LLM-based Stock Recommendations?
We present the first portfolio-level validation of MarketSenseAI, a deployed multi-agent LLM equity system. All signals are generated live at each observation date, eliminating look-ahead bias. The system routes four specialist agents (News, Fundamentals, Dynamics, and Macro) through a synthesis agent that issues a monthly equity thesis and recommendation for each stock in its coverage universe, and we ask two questions: do its buy recommendations add value over both passive benchmarks and random selection, and what does the internal agent structure reveal about the source of the edge? On the S&P 500 cohort (19 months) the strong-buy equal-weight portfolio earns +2.18%/month against a passive equal-weight benchmark of +1.15% (approximating RSP), a +25.2% compound excess, and ranks at the 99.7th percentile of 10,000 Monte Carlo portfol...
4.The Virtue of Sparsity in Complexity
Sparsity or complexity? In modern high-dimensional asset pricing, these are often viewed as competing principles: richer feature spaces appear to favor complexity, while economic intuition has long favored parsimony. We show that this tension is misplaced. We distinguish capacity sparsity (the dimensionality of the candidate feature space) from factor sparsity (the parsimonious structure of priced risks) and argue that the two are complements: expanding capacity enables the discovery of factor sparsity. Revisiting the benchmark empirical design of Didisheim et al. (2025) and pushing it to higher complexity regimes, we show that nonlinear feature expansions combined with basis pursuit yield portfolios whose out-of-sample performance dominates ridgeless benchmarks beyond a critical complexity threshold. The evidence shows that the gains from co...
5.Hedging the Singularity
Summary available at source link.
arXiv – 6G & Networking
1.Far-Field Absolute Gain Antenna Measurements at Sub-THz Frequencies: A New Interpretation
The evolution of large-aperture antennas and arrays in the sub-THz band (100-300 GHz) means that traditional far-field (FF) gain measurements require large distances, making them impractical in many laboratory environments. In the present work, absolute antenna gain measurements are performed in localized distance clusters for commercial horn antennas in the sub-THz range of 145-170 GHz using the three-antenna method, leveraging a theoretically derived modified FF equation along with the Friis transmission equation to enable a compact measurement setup. By applying the proposed modified FF formulation, the approach aims to redefine the FF distance by considering the combined effects of both the transmitting and receiving antennas, accounting for their aperture sizes and radiation characteristics. This ...
2.Passive RIS Is Not Silent: Revisiting Performance Limits Under Thermal Noise
Reconfigurable intelligent surfaces (RISs) have emerged as a promising solution for enabling energy-efficient and flexible spectrum usage in wireless communication, particularly in the context of sixth-generation (6G) networks. While passive RIS architectures are widely regarded as virtually noiseless due to the lack of active components, this idealized assumption can lead to misleading performance evaluations. In this paper, we revisit this assumption and demonstrate that the thermal noise generated by passive RIS elements, though often neglected, can significantly affect system performance. We propose a tractable approximate analytical framework that incorporates RIS-induced thermal noise into the system and derive closed-form expressions for key performance metrics, such as outage probability and throughput. Simulation results validat...
3.User Mobility Demands Near-Field Communications in Terahertz Band Wireless Networks Beyond 6G
Near-field propagation is often unavoidable at terahertz (THz) frequencies due to the large apertures needed for sufficient array gain, yet near-field operation complicates practical system design, especially under user mobility. This paper asks whether a mobile THz link can remain broadband, achieve the desired high rates and coverage, while operating exclusively in the radiative far field. To answer this question, we develop a proof-by-contradiction feasibility framework that jointly enforces (i) a far-field requirement based on the Fraunhofer distance and (ii) a reliability requirement specified by a target SNR at the worst-case link distance. We derive closed-form upper bounds on the far-field-feasible bandwidth for stationary and mobile links. We further incorporate practical misalignment through several UE rotation and mobility scen...
4.Paradigm Shift from Statistical Channel Modeling to Digital Twin Prediction: An Environment-Generalizable ChannelLM for 6G AI-enabled Air Interface
As 6G advances, ubiquitous connectivity and higher capacity requirements of the air interface pose substantial challenges for accurate and real-time wireless channel acquisition in diverse environments. Conventional statistical channel modeling relies on offline measurement data from limited environments, struggling to support online applications facing diverse environments. To this end, the digital twin channel (DTC) has emerged as a novel paradigm that constructs a digital replica of the physical environment through high-fidelity sensing and predicts the corresponding channel in real time using artificial intelligence (AI) models. As the engine of the DTC, existing AI models struggle to simultaneously achieve strong environmental generalization in real-world settings and end-to-end channel prediction for real-time tasks. Therefore, this paper propos...
5.UAVs as Dynamic Nodes in Communication Networks
Driven by the demands of 5G/Beyond 5G and 6G networks, Unmanned Aerial Vehicles (UAVs) have taken on critical roles in aerial communications. In the present survey, we explore the multi-mode roles of UAVs as relays, User Equipment (UE), gNBs, and Reconfigurable Intelligent Surfaces (RIS), along with their deployment scenarios, architectural frameworks, and different communication models incorporating Artificial Intelligence (AI) configurations. We consider the effects of alternate power sources on the communication payload. The survey also addresses security issues in UAV communications. As an advancement, we propose a novel UAV-Network-in-a-Box (NIB) architecture for disaster recovery and temporary coverage as an alternative to traditional network infrastructure.
arXiv – Network Architecture (6G/Slicing)
1.Scheduling in Multi-Hop Wireless Networks With Deadlines
We analyze the problem of scheduling in wireless networks to meet end-to-end service guarantees, defined by instantaneous throughput and hard packet deadlines. Using a network slicing model to decouple the queueing dynamics between flows, we show that the network's ability to meet hard deadline guarantees under interference is largely influenced by the link scheduling policy. We characterize throughput- and deadline-optimal policies for a solitary flow operating in isolation, which provide bounds on feasibility in the general case with multiple flows. We prove that packet delays can grow arbitrarily large in the multi-flow setting under a worst-case stabilizing policy, showing that queue stability is not sufficient to guarantee tight deadlines. We derive conditions on end-to-end packet delays in terms of link inter-scheduling times, and s...
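One intuition behind the paper's conditions on link inter-scheduling times can be sketched with a toy bound: if every link on an H-hop path is activated at least once every T slots, a packet waits at most T slots per hop, so a hard deadline D is feasible in this simplified reading whenever H * T <= D. This is an illustrative simplification, not the paper's actual characterization.

```python
def deadline_feasible(hops, inter_schedule_slots, deadline_slots):
    """Toy feasibility check: worst-case per-hop wait is one inter-scheduling
    period, so end-to-end delay is bounded by hops * inter_schedule_slots."""
    return hops * inter_schedule_slots <= deadline_slots

# A 3-hop flow with links served every 4 slots meets a 12-slot deadline,
# but not if the inter-scheduling time stretches to 5 slots
print(deadline_feasible(3, 4, 12))  # True
print(deadline_feasible(3, 5, 12))  # False
```

The paper's multi-flow results show why such per-link bounds matter: mere queue stability can coexist with arbitrarily large packet delays, so deadline guarantees need explicit control of how often each link is scheduled.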
2.Safety-Aware AoI Scheduling for LEO Satellite-Assisted Autonomous Driving
Autonomous platoons traversing infrastructure gaps increasingly depend on LEO satellite backhaul for safety-critical updates, yet no existing framework jointly addresses compound Doppler from simultaneous satellite and vehicle motion, sub-slot handover outages that exceed collision-alert deadlines, and heterogeneous freshness requirements across three vehicular priority classes. The core challenge is a timescale mismatch: coarse control slots hide sub-slot outages, which makes both AoI spike analysis and safety verification ill-posed. Ping-pong handover oscillations further compound AoI cost in a way that purely reactive schedulers cannot mitigate. We address these challenges through a unified framework that couples a two-timescale AoI model with tiered time-average safety constraints enforced by virtual queues. A closed-form ping-...
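The Age-of-Information (AoI) spikes the paper analyzes follow a simple dynamic: age grows linearly between deliveries and resets on each successful update, so an outage window (such as a handover gap) produces a growing spike. The sketch below uses a one-slot delivery delay and a hypothetical outage pattern; it illustrates the AoI mechanics, not the paper's two-timescale model.

```python
def aoi_trace(deliveries, delay=1):
    """Per-slot AoI: age resets to the delivery delay on success,
    otherwise grows by one slot. `deliveries` is a per-slot boolean list."""
    age, trace = 0, []
    for ok in deliveries:
        age = delay if ok else age + 1
        trace.append(age)
    return trace

# Regular updates, then a handover outage in slots 5-8 (hypothetical)
print(aoi_trace([1, 1, 1, 1, 0, 0, 0, 0, 1, 1]))
# -> [1, 1, 1, 1, 2, 3, 4, 5, 1, 1]
```

The spike (age climbing to 5 during the outage) is exactly the behavior that coarse control slots can hide, which motivates the paper's finer sub-slot outage model.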
3.Toward EU Sovereignty in Space: A Comparative Simulation Study of IRIS 2 and Starlink
The evolution of 6th generation (6G) networks increasingly relies on satellite-based Non-Terrestrial Networks (NTNs) to extend broadband connectivity to remote and unserved regions, and to support public safety. In this paper, we compare two representative and conceptually different satellite constellation architectures, namely Starlink and IRIS 2. Starlink is a commercial private Internet constellation by SpaceX, based on dense Low Earth Orbit (LEO) satellites. It is primarily designed to deliver high-capacity broadband services for civil applications, with performance targets comparable to those of terrestrial networks. In contrast, IRIS 2 is a planned public initiative to be deployed by the European Union, based on a multi-layer combination of LEO, Medium Earth Orbit (MEO), and Geostationary Earth Orbit (GEO) satellites. It is primaril...
4.Towards Trustworthy 6G Network Digital Twins: A Framework for Validating Counterfactual What-If Analysis in Edge Computing Resources
Network Digital Twins (NDTs) enable safe what-if analysis for 6G cloud-edge infrastructures, but adoption is often limited by fragmented workflows from telemetry to validation. We present a data-driven NDT framework that extends 6G-TWIN with a scalable pipeline for cloud-edge telemetry aggregation and semantic alignment into unified data models. Our contributions include: (i) scalable cloud-edge telemetry collection, (ii) regime-aware feature engineering capturing the network's scaling behavior, and (iii) a validation methodology based on Sign Agreement and Directional Sensitivity. Evaluated on a Kubernetes-managed cluster, the framework extrapolates performance to unseen high-load regimes. Results show both Deep Neural Network (DNN) and XGBoost achieve high regression accuracy (R² > 0.99), while the XGBoost model delivers superior dir...
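A Sign Agreement metric of the kind named above can be sketched as the fraction of step-to-step changes whose direction matches between the observed and predicted series. This is one plausible reading of the term; the paper's exact definition may differ, and the series below are made-up values.

```python
import numpy as np

def sign_agreement(y_true, y_pred):
    """Fraction of consecutive-step changes whose sign matches between
    the observed and predicted series (illustrative definition)."""
    dt = np.sign(np.diff(y_true))
    dp = np.sign(np.diff(y_pred))
    return float(np.mean(dt == dp))

obs  = np.array([1.0, 2.0, 1.5, 3.0, 4.0])   # hypothetical measured metric
pred = np.array([1.1, 1.9, 1.6, 2.8, 2.7])   # hypothetical twin prediction
print(sign_agreement(obs, pred))  # 0.75
```

Unlike R², which rewards matching magnitudes, a directional metric like this checks whether the twin predicts the right trend for a what-if change, which is what counterfactual analysis actually relies on.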
5.Cross-Domain Query Translation for Network Troubleshooting: A Multi-Agent LLM Framework with Privacy Preservation and Self-Reflection
This paper presents a hierarchical multi-agent LLM architecture to bridge communication gaps between non-technical end users and telecommunications domain experts in private network environments. We propose a cross-domain query translation framework that leverages specialized language models coordinated through multi-agent reflection-based reasoning. The resulting system addresses three critical challenges: (1) accurately classifying user queries related to telecommunications network issues using a dual-stage hierarchical approach, (2) preserving user privacy through anonymization of semantically relevant personally identifiable information (PII) while maintaining diagnostic utility, and (3) translating technical expert responses into user-comprehensible language. Our approach employs ReAct-style agents enhanced with self-reflection mechan...