Daily Briefing – Mar 12 (96 Articles)
Babak's Daily Briefing
Thursday, March 12, 2026
Sources: 20 | Total Articles: 96
6G World
1.SpaceRAN: Airbus UpNext explores software-defined 5G NTN from orbit
Airbus UpNext has launched its SpaceRAN (Space Radio Access Network) demonstrator, a key initiative to advance standardised 5G…
2.SoftBank’s Transformer-Based AI-RAN Hits 30% Uplink Gain at Sub-Millisecond Latency
On August 21, 2025, SoftBank published results from a live, standards-compliant AI-RAN trial that replaces parts of classical signal processing with a lightweight Transformer.
3.6G as a Platform for Value
Reframing the Future with NGMN’s Chairman, Laurent Leboucher. By Piotr (Peter) Pietrzyk, Managing Editor, 6GWorld.com. In the race…
4.SoftBank Road-Tests 7 GHz in Central Tokyo
SoftBank and Nokia have begun outdoor field trials in Tokyo’s Ginza district using 7 GHz spectrum, installing three pre-commercial base stations to compare coverage and radio characteristics against today’s sub-6 GHz 5G sites.
5.NXP’s Acquisition of TTTech Auto Signals Growing Focus on Middleware for Software-Defined Vehicles
On June 17, 2025, NXP Semiconductors finalized its acquisition of TTTech Auto—a strategic move to integrate TTTech’s flagship…
AI Agents
1.Code-Space Response Oracles: Generating Interpretable Multi-Agent Policies with Large Language Models
Recent advances in multi-agent reinforcement learning, particularly Policy-Space Response Oracles (PSRO), have enabled the computation of approximate game-theoretic equilibria in increasingly complex domains. However, these methods rely on deep reinforcement learning oracles that produce "black-box" neural network policies, making them difficult to interpret, trust, or debug. We introduce Code-Space Response Oracles (CSRO), a novel framework that addresses this challenge by replacing RL oracles with Large Language Models (LLMs). CSRO reframes the best response computation as a code generation task, prompting an LLM to generate policies directly as human-readable code. This approach not only yields inherently interpretable policies but also leverages the LLM's pretrained knowledge to discover complex, human-like strategies. We explore multi...
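The best-response step that CSRO delegates to an LLM can be sketched in a toy matrix game, where a "policy as code" is just an ordinary readable function. Here a hand-written enumeration stands in for LLM-generated code; the game and names are illustrative, not from the paper:

```python
# Toy best-response computation in rock-paper-scissors. In CSRO the LLM
# would emit a function like best_response as human-readable code; here we
# write it by hand to illustrate the idea (illustrative, not the paper's code).

# Row player's payoff matrix: rows = our action, columns = opponent action.
PAYOFF = [
    [0, -1, 1],   # rock vs rock/paper/scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

def best_response(opponent_mix):
    """Return the pure action maximising expected payoff against a mixed strategy."""
    expected = [
        sum(p * PAYOFF[a][b] for b, p in enumerate(opponent_mix))
        for a in range(len(PAYOFF))
    ]
    return max(range(len(expected)), key=expected.__getitem__)

# Against an opponent who mostly plays rock, the best response is paper (1).
print(best_response([0.8, 0.1, 0.1]))  # -> 1
```

Because the policy is plain code, it can be read, diffed, and debugged directly, which is the interpretability benefit the abstract highlights.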
2.AutoAgent: Evolving Cognition and Elastic Memory Orchestration for Adaptive Agents
Autonomous agent frameworks still struggle to reconcile long-term experiential learning with real-time, context-sensitive decision-making. In practice, this gap appears as static cognition, rigid workflow dependence, and inefficient context usage, which jointly limit adaptability in open-ended and non-stationary environments. To address these limitations, we present AutoAgent, a self-evolving multi-agent framework built on three tightly coupled components: evolving cognition, on-the-fly contextual decision-making, and elastic memory orchestration. At the core of AutoAgent, each agent maintains structured prompt-level cognition over tools, self-capabilities, peer expertise, and task knowledge. During execution, this cognition is combined with live task context to select actions from a unified space that includes tool calls, LLM-based gener...
3.Real-Time Trust Verification for Safe Agentic Actions using TrustBench
As large language models evolve from conversational assistants to autonomous agents, ensuring trustworthiness requires a fundamental shift from post-hoc evaluation to real-time action verification. Current frameworks like AgentBench evaluate task completion, while TrustLLM and HELM assess output quality after generation. However, none of these prevent harmful actions during agent execution. We present TrustBench, a dual-mode framework that (1) benchmarks trust across multiple dimensions using both traditional metrics and LLM-as-a-Judge evaluations, and (2) provides a toolkit agents invoke before taking actions to verify safety and reliability. Unlike existing approaches, TrustBench intervenes at the critical decision point: after an agent formulates an action but before execution. Domain-specific plugins encode specialized safety requirem...
4.Agentic Critical Training
Training large language models (LLMs) as autonomous agents often begins with imitation learning, but it only teaches agents what to do without understanding why: agents never contrast successful actions against suboptimal alternatives and thus lack awareness of action quality. Recent approaches attempt to address this by introducing self-reflection supervision derived from contrasts between expert and alternative actions. However, the training paradigm fundamentally remains imitation learning: the model imitates pre-constructed reflection text rather than learning to reason autonomously. We propose Agentic Critical Training (ACT), a reinforcement learning paradigm that trains agents to identify the better action among alternatives. By rewarding whether the model's judgment is correct, ACT drives the model to autonomously develop reasoning...
5.Reachability-based Temporal Logic Verification for Reliable LLM-guided Human-Autonomy Teaming
We propose a reachability-based framework for reliable LLM-guided human-autonomy teaming (HAT) using signal temporal logic (STL). In the proposed framework, the LLM is leveraged as a translator that converts natural language commands given by a human operator into corresponding STL specifications, or vice versa. An STL feasibility filter (SFF) is proposed to check the feasibility of the generated STL. The SFF first decomposes the complex and nested LLM translation into a set of simpler subformulas for parallelization and informative feedback generation. The reachability analysis method is then applied to verify whether each subformula is feasible for a target dynamical system: if feasible, mission planning proceeds; otherwise, it is rejected. The proposed SFF can identify infeasible subformulas, more than simply providing the boolean verification result...
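The decomposition step described above can be sketched as flattening a nested conjunction into independently checkable subformulas. The classes and the flattening rule below are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of splitting a nested STL conjunction into subformulas so
# each can be feasibility-checked in parallel and reported individually.
# Atom names and the AST shape are hypothetical.
from dataclasses import dataclass

@dataclass
class Atom:          # e.g. "eventually reach region A within 10 s"
    name: str

@dataclass
class And:
    left: object
    right: object

def subformulas(phi):
    """Flatten nested conjunctions into a list of checkable subformulas."""
    if isinstance(phi, And):
        return subformulas(phi.left) + subformulas(phi.right)
    return [phi]

spec = And(And(Atom("reach_A"), Atom("avoid_B")), Atom("return_home"))
print([a.name for a in subformulas(spec)])  # -> ['reach_A', 'avoid_B', 'return_home']
```

Checking subformulas separately is what lets the filter name the specific infeasible piece instead of returning a single pass/fail verdict.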
AI Computation & Hardware
1.GhazalBench: Usage-Grounded Evaluation of LLMs on Persian Ghazals
arXiv:2603.09979v1. Persian poetry plays an active role in Iranian cultural practice, where verses by canonical poets such as Hafez are frequently quoted, paraphrased, or completed from partial cues. Supporting such interactions requires language models to engage not only with poetic meaning but also with culturally entrenched surface form. We introduce GhazalBench, a benchmark for evaluating how large language models (LLMs) interact with Persian ghazals under usage-grounded conditions. GhazalBench assesses two complementary abilities: producing faithful prose paraphrases of couplets and accessing canonical verses under varying semantic and formal cues. Across several proprietary and open-weight multilingual LLMs, we observe a consistent dissociation: models generally capture poetic meaning but struggle with e...
2.Large Language Models and Book Summarization: Reading or Remembering, Which Is Better?
arXiv:2603.09981v1. Summarization is a core task in Natural Language Processing (NLP). Recent advances in Large Language Models (LLMs) and the introduction of large context windows reaching millions of tokens make it possible to process entire books in a single prompt. At the same time, for well-known books, LLMs can generate summaries based only on internal knowledge acquired during training. This raises several important questions: How do summaries generated from internal memory compare to those derived from the full text? Does prior knowledge influence summaries even when the model is given the book as input? In this work, we conduct an experimental evaluation of book summarization with state-of-the-art LLMs. We compare summaries of well-known books produced using (i) only the internal knowledge of the mode...
3.AraModernBERT: Transtokenized Initialization and Long-Context Encoder Modeling for Arabic
arXiv:2603.09982v1. Encoder-only transformer models remain widely used for discriminative NLP tasks, yet recent architectural advances have largely focused on English. In this work, we present AraModernBERT, an adaptation of the ModernBERT encoder architecture to Arabic, and study the impact of transtokenized embedding initialization and native long-context modeling up to 8,192 tokens. We show that transtokenization is essential for Arabic language modeling, yielding dramatic improvements in masked language modeling performance compared to non-transtokenized initialization. We further demonstrate that AraModernBERT supports stable and effective long-context modeling, achieving improved intrinsic language modeling performance at extended sequence lengths. Downstream evaluations on Arabic natural language unders...
4.An Efficient Hybrid Deep Learning Approach for Detecting Online Abusive Language
arXiv:2603.09984v1. The digital age has expanded social media and online forums, allowing free expression for nearly 45% of the global population. Yet, it has also fueled online harassment, bullying, and harmful behaviors like hate speech and toxic comments across social networks, messaging apps, and gaming communities. Studies show 65% of parents notice hostile online behavior, and one-third of adolescents in mobile games experience bullying. A substantial volume of abusive content is generated and shared daily, not only on the surface web but also within dark web forums. Creators of abusive comments often employ specific words or coded phrases to evade detection and conceal their intentions. To address these challenges, we propose a hybrid deep learning model that integrates BERT, CNN, and LSTM architectures...
5.The Dunning-Kruger Effect in Large Language Models: An Empirical Study of Confidence Calibration
arXiv:2603.09985v1. Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks, yet their ability to accurately assess their own confidence remains poorly understood. We present an empirical study investigating whether LLMs exhibit patterns reminiscent of the Dunning-Kruger effect, a cognitive bias where individuals with limited competence tend to overestimate their abilities. We evaluate four state-of-the-art models (Claude Haiku 4.5, Gemini 2.5 Pro, Gemini 2.5 Flash, and Kimi K2) across four benchmark datasets totaling 24,000 experimental trials. Our results reveal striking calibration differences: Kimi K2 exhibits severe overconfidence with an Expected Calibration Error (ECE) of 0.726 despite only 23.3% accuracy, while Claude Haiku 4.5 achieves the best calibration (ECE = ...
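Expected Calibration Error, the metric the study reports, is simple to compute: bin predictions by stated confidence, then take the size-weighted average gap between each bin's mean confidence and its accuracy. A minimal sketch:

```python
# Minimal Expected Calibration Error (ECE) sketch. Binning scheme (equal-width
# confidence bins) is the common convention; the data below is illustrative.

def expected_calibration_error(confidences, correct, n_bins=10):
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)   # clamp conf=1.0 into last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += len(b) / n * abs(avg_conf - accuracy)
    return ece

# An overconfident model: 0.9 stated confidence but only 25% accuracy.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 0, 0]))  # -> 0.65
```

An ECE near 0 means stated confidence tracks accuracy; the 0.726 reported for Kimi K2 indicates confidence and accuracy are almost unrelated.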
AI Machine Learning
1.Explainable LLM Unlearning Through Reasoning
arXiv:2603.09980v1. LLM unlearning is essential for mitigating safety, copyright, and privacy concerns in pre-trained large language models (LLMs). Compared to preference alignment, it offers a more explicit way by removing undesirable knowledge characterized by specific unlearning datasets. In previous works, gradient ascent (GA) and its variants have shown promise for implementing unlearning, yet their untargeted nature results in unintended degradation of general capabilities, incomplete removal of knowledge, and the generation of incoherent responses, among other issues. We argue that these issues stem from the absence of explicit guidance on what and how models should unlearn. To fill this gap, we introduce a novel unlearning target, the reasoning-based unlearning target, which satisfies both the specified unle...
2.MoE-SpAc: Efficient MoE Inference Based on Speculative Activation Utility in Heterogeneous Edge Scenarios
arXiv:2603.09983v1. Mixture-of-Experts (MoE) models enable scalable performance but face severe memory constraints on edge devices. Existing offloading strategies struggle with I/O bottlenecks due to the dynamic, low-information nature of autoregressive expert activation. In this paper, we propose to repurpose Speculative Decoding (SD) not merely as a compute accelerator, but as an informative lookahead sensor for memory management, supported by our theoretical and empirical analyses. Hence, we introduce MoE-SpAc, an MoE inference framework that integrates a Speculative Utility Estimator to track expert demand, a Heterogeneous Workload Balancer to dynamically partition computation via online integer optimization, and an Asynchronous Execution Engine to unify prefetching and eviction in the same utility spac...
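The "lookahead sensor" intuition can be sketched in a few lines: route the speculatively drafted tokens, count which experts they demand, and prefetch the most-demanded ones into the limited cache. The router output and names below are illustrative assumptions, not the paper's estimator:

```python
# Toy sketch of speculative lookahead as a prefetch signal for MoE offloading:
# drafted tokens reveal likely expert demand before the real forward pass.
from collections import Counter

def prefetch_plan(drafted_expert_routes, cache_slots):
    """drafted_expert_routes: for each drafted token, the experts it activates."""
    demand = Counter(e for route in drafted_expert_routes for e in route)
    return [expert for expert, _ in demand.most_common(cache_slots)]

# Three drafted tokens mostly hit experts 2 and 5 -> prefetch those first.
routes = [[2, 5], [2, 7], [5, 2]]
print(prefetch_plan(routes, cache_slots=2))  # -> [2, 5]
```

Without the lookahead, an offloader only learns an expert is needed at the moment it must be loaded, which is exactly the I/O stall the abstract describes.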
3.Personalized Group Relative Policy Optimization for Heterogeneous Preference Alignment
arXiv:2603.10009v1. Despite their sophisticated general-purpose capabilities, Large Language Models (LLMs) often fail to align with diverse individual preferences because standard post-training methods, like Reinforcement Learning with Human Feedback (RLHF), optimize for a single, global objective. While Group Relative Policy Optimization (GRPO) is a widely adopted on-policy reinforcement learning framework, its group-based normalization implicitly assumes that all samples are exchangeable, inheriting this limitation in personalized settings. This assumption conflates distinct user reward distributions and systematically biases learning toward dominant preferences while suppressing minority signals. To address this, we introduce Personalized GRPO (P-GRPO), a novel alignment framework that decouples advantage es...
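The group-based normalization being criticized is easy to state: each sampled response's advantage is its reward minus the group mean, divided by the group standard deviation. Grouping per user (as below) illustrates the failure mode being addressed; it is a sketch, not the paper's exact estimator:

```python
# Sketch of GRPO's group-relative advantage. Pooling all users into one group
# washes out a user whose rewards live on a lower scale; per-user groups keep
# each user's preference signal. Reward values are illustrative.
import statistics

def group_advantages(rewards):
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # avoid div-by-zero on tied rewards
    return [(r - mean) / std for r in rewards]

# Pooled group: user B's responses (0.2, 0.1) all get negative advantages.
pooled = group_advantages([0.9, 0.8, 0.2, 0.1])
# Per-user groups: user B's better response still earns a positive advantage.
user_b = group_advantages([0.2, 0.1])
print([round(a, 2) for a in user_b])  # -> [1.0, -1.0]
```

In the pooled version every minority-user sample is pushed down relative to the dominant users, which is the bias the abstract describes.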
4.LWM-Temporal: Sparse Spatio-Temporal Attention for Wireless Channel Representation Learning
arXiv:2603.10024v1. LWM-Temporal is a new member of the Large Wireless Models (LWM) family that targets the spatiotemporal nature of wireless channels. Designed as a task-agnostic foundation model, LWM-Temporal learns universal channel embeddings that capture mobility-induced evolution and are reusable across various downstream tasks. To achieve this objective, LWM-Temporal operates in the angle-delay-time domain and introduces Sparse Spatio-Temporal Attention (SSTA), a propagation-aligned attention mechanism that restricts interactions to physically plausible neighborhoods, reducing attention complexity by an order of magnitude while preserving geometry-consistent dependencies. LWM-Temporal is pretrained in a self-supervised manner using a physics-informed masking curriculum that emulates realistic occlusions,...
5.Gated Adaptation for Continual Learning in Human Activity Recognition
arXiv:2603.10046v1. Wearable sensors in Internet of Things (IoT) ecosystems increasingly support applications such as remote health monitoring, elderly care, and smart home automation, all of which rely on robust human activity recognition (HAR). Continual learning systems must balance plasticity (learning new tasks) with stability (retaining prior knowledge), yet AI models often exhibit catastrophic forgetting, where learning new tasks degrades performance on earlier ones. This challenge is especially acute in domain-incremental HAR, where on-device models must adapt to new subjects with distinct movement patterns while maintaining accuracy on prior subjects without transmitting sensitive data to the cloud. We propose a parameter-efficient continual learning framework based on channel-wise gated modulation of ...
AI Robotics
1.OmniGuide: Universal Guidance Fields for Enhancing Generalist Robot Policies
arXiv:2603.10052v1. Vision-language-action (VLA) models have shown great promise as generalist policies for a large range of relatively simple tasks. However, they demonstrate limited performance on more complex tasks, such as those requiring complex spatial or semantic understanding, manipulation in clutter, or precise manipulation. We propose OMNIGUIDE, a flexible framework that improves VLA performance on such tasks by leveraging arbitrary sources of guidance, such as 3D foundation models, semantic-reasoning VLMs, and human pose models. We show how many kinds of guidance can be naturally expressed as differentiable energy functions with task-specific attractors and repellers located in 3D space that influence the sampling of VLA actions. In this way, OMNIGUIDE enables guidance sources with complementary task...
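The attractor/repeller idea can be sketched with a toy energy over candidate actions: a quadratic pull toward a goal point plus an inverse-distance push away from an obstacle, with the lowest-energy candidate selected. The energy forms and points are illustrative assumptions, not the paper's functions:

```python
# Toy energy-guided action selection: attractors lower energy near a goal,
# repellers raise it near an obstacle. Candidates stand in for sampled VLA
# actions; all values are illustrative.
import math

def energy(pt, attractor, repeller):
    pull = math.dist(pt, attractor) ** 2           # low near the goal
    push = 1.0 / (math.dist(pt, repeller) + 1e-6)  # high near the obstacle
    return pull + push

def guided_choice(candidates, attractor, repeller):
    return min(candidates, key=lambda p: energy(p, attractor, repeller))

goal, obstacle = (1.0, 0.0), (0.5, 0.0)
candidates = [(0.5, 0.05), (0.9, 0.3), (0.0, 0.0)]
print(guided_choice(candidates, goal, obstacle))  # -> (0.9, 0.3)
```

Because such energies are differentiable, they can also bias the sampler's gradient steps rather than merely rank finished candidates, which is closer to the guided-sampling framing in the abstract.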
2.Model-Free Co-Optimization of Manufacturable Sensor Layouts and Deformation Proprioception
arXiv:2603.10059v1. Flexible sensors are increasingly employed in soft robotics and wearable devices to provide proprioception of freeform deformations. Although supervised learning can train shape predictors from sensor signals, prediction accuracy strongly depends on sensor layout, which is typically determined heuristically or through trial-and-error. This work introduces a model-free, data-driven computational pipeline that jointly optimizes the number, length, and placement of flexible length-measurement sensors together with the parameters of a shape prediction network for large freeform deformations. Unlike model-based approaches, the proposed method relies solely on datasets of deformed shapes, without requiring physical simulation models, and is therefore broadly applicable to diverse robotic sensing ta...
3.Decision-Aware Uncertainty Evaluation of Vision-Language Model-Based Early Action Anticipation for Human-Robot Interaction
arXiv:2603.10061v1. Robots in shared workspaces must interpret human actions from partial, ambiguous observations, where overconfident early predictions can lead to unsafe or disruptive interaction. This challenge is amplified in egocentric views, where viewpoint changes and occlusions increase perceptual noise and ambiguity. As a result, downstream human-robot interaction modules require not only an action hypothesis but also a trustworthy estimate of confidence under partial observation. Recent vision-language model-based approaches have been proposed for short-term action recognition due to their open-vocabulary and context-aware reasoning, but their uncertainty reliability in the temporal-prefix regime is largely uncharacterized. We present the first systematic evaluation of uncertainty in vision-language m...
4.AR-VLA: True Autoregressive Action Expert for Vision-Language-Action Models
arXiv:2603.10126v1. We propose a standalone autoregressive (AR) Action Expert that generates actions as a continuous causal sequence while conditioning on refreshable vision-language prefixes. In contrast to existing Vision-Language-Action (VLA) models and diffusion policies that reset temporal context with each new observation and predict actions reactively, our Action Expert maintains its own history through a long-lived memory and is inherently context-aware. This structure addresses the frequency mismatch between fast control and slow reasoning, enabling efficient independent pretraining of kinematic syntax and modular integration with heavy perception backbones, naturally ensuring spatio-temporally consistent action generation across frames. To synchronize these asynchronous hybrid V-L-A modalities, we uti...
5.Cross-Hand Latent Representation for Vision-Language-Action Models
arXiv:2603.10158v1. Dexterous manipulation is essential for real-world robot autonomy, mirroring the central role of human hand coordination in daily activity. Humans rely on rich multimodal perception (vision, sound, and language-guided intent) to perform dexterous actions, motivating vision-based, language-conditioned manipulation systems for robots. However, training reliable vision-language-action (VLA) models for dexterous manipulation requires large-scale demonstrations across many robotic hands. In addition, as new dexterous embodiments appear rapidly, collecting data for each becomes costly and impractical, creating a need for scalable cross-embodiment learning. We introduce XL-VLA, a vision-language-action framework integrated with a unified latent action space shared across diverse dexterous hands. Th...
Financial AI
1.A Bipartite Graph Approach to U.S.-China Cross-Market Return Forecasting
This paper studies cross-market return predictability through a machine learning framework that preserves economic structure. Exploiting the non-overlapping trading hours of the U.S. and Chinese equity markets, we construct a directed bipartite graph that captures time-ordered predictive linkages between stocks across markets. Edges are selected via rolling-window hypothesis testing, and the resulting graph serves as a sparse, economically interpretable feature-selection layer for downstream machine learning models. We apply a range of regularized and ensemble methods to forecast open-to-close returns using lagged foreign-market information. Our results reveal a pronounced directional asymmetry: U.S. previous-close-to-close returns contain substantial predictive information for Chinese intraday returns, whereas the reverse effect is limit...
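The edge-selection step can be sketched as a per-pair significance test inside each rolling window: compute the correlation between the U.S. stock's overnight return and the Chinese stock's next intraday return, convert it to a t-statistic, and keep the edge only if it clears a threshold. This is a pure-Python toy under assumed data and thresholds, not the paper's test:

```python
# Toy rolling-window edge test: keep a directed edge US -> CN only when the
# correlation's t-statistic is significant. Data and threshold are illustrative.
import math

def t_stat(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    r = cov / math.sqrt(vx * vy)
    return r * math.sqrt((n - 2) / (1 - r * r))   # t-stat of a correlation

def keep_edge(us_returns, cn_returns, critical=2.0):
    return abs(t_stat(us_returns, cn_returns)) > critical

us = [0.01, -0.02, 0.015, 0.03, -0.01, 0.02]        # previous-close-to-close
cn = [0.008, -0.015, 0.012, 0.025, -0.007, 0.018]   # next-day open-to-close
print(keep_edge(us, cn))  # -> True (strongly aligned toy series)
```

Running this over every cross-market pair in each window yields the sparse bipartite graph the paper then feeds to downstream forecasting models.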
2.Hybrid Hidden Markov Model for Modeling Equity Excess Growth Rate Dynamics: A Discrete-State Approach with Jump-Diffusion
Generating synthetic financial time series that preserve statistical properties of real market data is essential for stress testing, risk model validation, and scenario design. Existing approaches, from parametric models to deep generative networks, struggle to simultaneously reproduce heavy-tailed distributions, negligible linear autocorrelation, and persistent volatility clustering. We propose a hybrid hidden Markov framework that discretizes continuous excess growth rates into Laplace quantile-defined market states and augments regime switching with a Poisson-driven jump-duration mechanism to enforce realistic tail-state dwell times. Parameters are estimated by direct transition counting, bypassing the Baum-Welch EM algorithm. Synthetic data quality is evaluated using Kolmogorov-Smirnov and Anderson-Darling pass rates for distributiona...
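The estimation shortcut mentioned above (direct transition counting instead of Baum-Welch) works because the discretised market states are directly observed. A minimal sketch, with illustrative state labels:

```python
# Estimate a Markov transition matrix by counting observed state transitions.
# Applicable here because the quantile-defined market states are observed,
# so no EM over hidden states is needed. The state path is illustrative.
from collections import Counter

def transition_matrix(states, n_states):
    counts = Counter(zip(states, states[1:]))   # pairs of consecutive states
    rows = []
    for i in range(n_states):
        total = sum(counts[(i, j)] for j in range(n_states))
        rows.append([counts[(i, j)] / total if total else 0.0
                     for j in range(n_states)])
    return rows

# 0 = calm regime, 1 = stressed regime.
path = [0, 0, 0, 1, 1, 0, 0, 1, 0, 0]
for row in transition_matrix(path, 2):
    print([round(p, 3) for p in row])
```

Each row is the empirical distribution over next states; the paper's Poisson jump-duration mechanism then corrects the dwell times this simple estimator would otherwise get wrong in the tails.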
3.Uncertainty-Aware Deep Hedging
Deep hedging trains neural networks to manage derivative risk under market frictions, but produces hedge ratios with no measure of model confidence -- a significant barrier to deployment. We introduce uncertainty quantification to the deep hedging framework by training a deep ensemble of five independent LSTM networks under Heston stochastic volatility with proportional transaction costs. The ensemble's disagreement at each time step provides a per-time-step confidence measure that is strongly predictive of hedging performance: the learned strategy outperforms the Black-Scholes delta on approximately 80% of paths when model agreement is high, but on fewer than 20% when disagreement is elevated. We propose a CVaR-optimised blending strategy that combines the ensemble's hedge with the classical Black-Scholes delta, weighted by the level of ...
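The two ideas in the abstract, ensemble disagreement as a confidence signal and CVaR-style blending with the Black-Scholes delta, can be sketched with a simple linear blending rule. The rule and thresholds below are illustrative assumptions, not the paper's optimised weights:

```python
# Sketch: the std of the ensemble members' hedge ratios measures disagreement;
# the blend leans on the classical Black-Scholes delta when disagreement is
# high. max_std and the linear weight rule are illustrative.
import statistics

def blended_hedge(ensemble_deltas, bs_delta, max_std=0.2):
    ens_mean = statistics.fmean(ensemble_deltas)
    disagreement = statistics.pstdev(ensemble_deltas)
    w = max(0.0, 1.0 - disagreement / max_std)   # weight on the learned hedge
    return w * ens_mean + (1 - w) * bs_delta

# High agreement among the five members: trust the learned hedge (~0.51).
print(blended_hedge([0.51, 0.52, 0.50, 0.53, 0.51], bs_delta=0.60))
# High disagreement: fall back entirely to the Black-Scholes delta.
print(blended_hedge([0.2, 0.8, 0.4, 0.9, 0.1], bs_delta=0.60))  # -> 0.6
```

This mirrors the abstract's finding: follow the learned strategy when the members agree, and defer to the classical delta when they do not.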
4.Global universality via discrete-time signatures
We establish global universal approximation theorems on spaces of piecewise linear paths, stating that linear functionals of the corresponding signatures are dense with respect to $L^p$- and weighted norms, under an integrability condition on the underlying weight function. As an application, we show that piecewise linear interpolations of Brownian motion satisfy this integrability condition. Consequently, we obtain $L^p$-approximation results for path-dependent functionals, random ordinary differential equations, and stochastic differential equations driven by Brownian motion.
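For readers unfamiliar with signatures, the standard textbook definition (background, not part of the abstract) is the sequence of iterated integrals of the path:

```latex
% For a bounded-variation path $X:[0,T]\to\mathbb{R}^d$, the signature is
% $S(X) = (1, S^{(1)}, S^{(2)}, \dots)$, where the level-$k$ term is
\[
  S^{(k)}(X)_{0,T}
  = \int_{0 < t_1 < \dots < t_k < T}
    \mathrm{d}X_{t_1} \otimes \cdots \otimes \mathrm{d}X_{t_k} .
\]
% "Linear functionals of the signature" are linear maps applied to finitely
% many of these tensor levels; the theorem concerns their density in $L^p$.
```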
5.Generative Adversarial Regression (GAR): Learning Conditional Risk Scenarios
We propose Generative Adversarial Regression (GAR), a framework for learning conditional risk scenarios through generators aligned with downstream risk objectives. GAR builds on a regression characterization of conditional risk for elicitable functionals, including quantiles, expectiles, and jointly elicitable pairs. We extend this principle from point prediction to generative modeling by training generators whose policy-induced risk matches that of real data under the same context. To ensure robustness across all policies, GAR adopts a minimax formulation in which an adversarial policy identifies worst-case discrepancies in risk evaluation while the generator adapts to eliminate them. This structure preserves alignment with the risk functional across a broad class of policies rather than a fixed, pre-specified set. We illustrate GAR thro...
GSMA Newsroom
1.GSMA MWC26 Barcelona closes 20th anniversary edition
Summary available at source link.
2.From Ambition to Execution: How Open Gateway Is Scaling the Global API Economy
Summary available at source link.
3.Pioneering Affordable Access in Africa: GSMA and Handset Affordability Coalition Members Identify Six African Countries to Pilot Affordable $40 Smartphones
Summary available at source link.
4.GSMA Calls for Regulatory Readiness for Direct-to-User LEO Satellite Services
Summary available at source link.
5.MWC26 Barcelona opens with call to complete 5G, rise to AI challenges, and strengthen digital safety
Summary available at source link.
Generative AI (arXiv)
1.LLM2Vec-Gen: Generative Embeddings from Large Language Models
LLM-based text embedders typically encode the semantic content of their input. However, embedding tasks require mapping diverse inputs to similar outputs. Typically, this input-output gap is addressed by training embedding models with paired data using contrastive learning. In this work, we propose a novel self-supervised approach, LLM2Vec-Gen, which adopts a different paradigm: rather than encoding the input, we learn to represent the model's potential response. Specifically, we add trainable special tokens to the LLM's vocabulary, append them to the input, and optimize them to represent the LLM's response in a fixed-length sequence. Training is guided by the LLM's own completion for the query, along with an unsupervised embedding teacher that provides distillation targets. This formulation helps to bridge the input-output gap and transfers LLM ...
2.A Hybrid Knowledge-Grounded Framework for Safety and Traceability in Prescription Verification
Medication errors pose a significant threat to patient safety, making pharmacist verification (PV) a critical, yet heavily burdened, final safeguard. The direct application of Large Language Models (LLMs) to this zero-tolerance domain is untenable due to their inherent factual unreliability, lack of traceability, and weakness in complex reasoning. To address these challenges, we introduce PharmGraph-Auditor, a novel system designed for safe and evidence-grounded prescription auditing. The core of our system is a trustworthy Hybrid Pharmaceutical Knowledge Base (HPKB), implemented under the Virtual Knowledge Graph (VKG) paradigm. This architecture strategically unifies a relational component for set constraint satisfaction and a graph component for topological reasoning via a rigorous mapping layer. To construct this HPKB, we propose the I...
3.Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models
Reinforcement learning (RL) finetuning has become a key technique for enhancing the reasoning abilities of large language models (LLMs). However, its effectiveness critically depends on the selection of training data. Recent advances underscore the importance of online prompt selection methods, which typically concentrate training on partially solved or moderately challenging examples under the current policy, thereby yielding more effective model updates. While significantly accelerating RL finetuning in terms of training steps, they also incur substantial computational overhead by requiring extensive LLM rollouts over large candidate batches to identify informative samples, an expense that can outweigh the finetuning process itself. To address this challenge, this work proposes Dynamics-Predictive Sampling (DPS), which online predicts a...
4.Breaking User-Centric Agency: A Tri-Party Framework for Agent-Based Recommendation
Recent advances in large language models (LLMs) have stimulated growing interest in agent-based recommender systems, enabling language-driven interaction and reasoning for more expressive preference modeling. However, most existing agentic approaches remain predominantly user-centric, treating items as passive entities and neglecting the interests of other critical stakeholders. This limitation exacerbates exposure concentration and long-tail under-representation, threatening long-term system sustainability. In this work, we identify this fundamental limitation and propose the first Tri-party LLM-agent Recommendation framework (TriRec) that explicitly coordinates user utility, item exposure, and platform-level fairness. The framework employs a two-stage architecture: Stage 1 empowers item agents with personalized self-promotion to improve...
5.Making Bielik LLM Reason (Better): A Field Report
This paper presents a research program dedicated to evaluating and advancing the reasoning capabilities of Bielik, a Polish large language model. The study describes several stages of work: initial benchmarking and the creation of an evaluation methodology, analysis of comparative results against other LLMs, and an outline of future prospects that takes into account the limitations of the analyses conducted so far and aims to keep Bielik in the race given the ever-changing and competitive AI landscape.
Hugging Face Daily Papers
1.COMIC: Agentic Sketch Comedy Generation
We propose a fully automated AI system that produces short comedic videos similar to sketch shows such as Saturday Night Live. Starting with character references, the system employs a population of agents loosely based on real production studio roles, structured to optimize the quality and diversity of ideas and outputs through iterative competition, evaluation, and improvement. A key contribution is the introduction of LLM critics aligned with real viewer preferences through the analysis of a corpus of comedy videos on YouTube to automatically evaluate humor. Our experiments show that our framework produces results approaching the quality of professionally produced sketches while demonstrating state-of-the-art performance in video generation.
2.DynVLA: Learning World Dynamics for Action Reasoning in Autonomous Driving
We propose DynVLA, a driving VLA model that introduces a new CoT paradigm termed Dynamics CoT. DynVLA forecasts compact world dynamics before action generation, enabling more informed and physically grounded decision-making. To obtain compact dynamics representations, DynVLA introduces a Dynamics Tokenizer that compresses future evolution into a small set of dynamics tokens. Considering the rich environment dynamics in interaction-intensive driving scenarios, DynVLA decouples ego-centric and environment-centric dynamics, yielding more accurate world dynamics modeling. We then train DynVLA to generate dynamics tokens before actions through SFT and RFT, improving decision quality while maintaining latency-efficient inference. Compared to Textual CoT, which lacks fine-grained spatiotemporal understanding, and Visual CoT, which introduces sub...
3.Beyond the Illusion of Consensus: From Surface Heuristics to Knowledge-Grounded Evaluation in LLM-as-a-Judge
The paradigm of LLM-as-a-judge relies on a critical assumption, namely that high inter-evaluator agreement indicates reliable and objective evaluation. We present two complementary findings that challenge this assumption. First, we demonstrate that this consensus is frequently illusory. We identify and formalize the Evaluation Illusion, a phenomenon where LLM judges generate sophisticated critiques yet anchor scores on shared surface heuristics rather than substantive quality. Through a large-scale study of 105,600 evaluation instances (32 LLMs × 3 frontier judges × 100 tasks × 11 temperatures), we show that model-level agreement (Spearman ρ = 0.99) masks fragile sample-level agreement (Pearson mean r = 0.72; absolute agreement ICC = 0.67), and that merely sharing rubric structure restores 62% of tota...
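The gap between the two agreement numbers above comes from what each statistic measures: Spearman only checks that judges rank items in the same order, while Pearson (and ICC) are sensitive to the actual score values. A pure-Python sketch with toy scores (Spearman is just Pearson computed on ranks; no ties assumed):

```python
# Toy contrast between rank-level and value-level agreement for two judges.
# Scores are illustrative; rank() assumes no tied values.
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

def spearman(x, y):
    rank = lambda v: [sorted(v).index(e) for e in v]
    return pearson(rank(x), rank(y))

judge_a = [3.0, 5.0, 9.0, 6.0]
judge_b = [2.0, 4.0, 9.9, 5.0]   # same ordering, different magnitudes
print(spearman(judge_a, judge_b))  # -> 1.0 (perfect rank agreement)
print(pearson(judge_a, judge_b))   # < 1.0: value-level agreement is weaker
```

Two judges can thus rank models identically (ρ = 0.99) while still disagreeing on individual samples, which is exactly the masking effect the paper reports.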
4.A Systematic Study of Pseudo-Relevance Feedback with LLMs
Pseudo-relevance feedback (PRF) methods built on large language models (LLMs) can be organized along two key design dimensions: the feedback source, which is where the feedback text is derived from, and the feedback model, which is how the given feedback text is used to refine the query representation. However, the independent role that each dimension plays is unclear, as both are often entangled in empirical evaluations. In this paper, we address this gap by systematically studying how the choice of feedback source and feedback model impacts PRF effectiveness through controlled experimentation. Across 13 low-resource BEIR tasks with five LLM PRF methods, our results show: (1) the choice of feedback model can play a critical role in PRF effectiveness; (2) feedback derived solely from LLM-generated text provides the most cost-effective solut...
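The two dimensions can be pictured as two interchangeable functions. The toy retriever below is entirely hypothetical (term-count scoring over a three-document corpus, none of it from the paper): `feedback_source` supplies the feedback text and `feedback_model` folds it into the query, so either can be swapped independently.

```python
from collections import Counter

docs = {
    "d1": "neural retrieval models rank documents",
    "d2": "pseudo relevance feedback expands the query",
    "d3": "cooking pasta with tomato sauce",
}

def score(query_terms, doc):
    # Toy relevance score: count query-term occurrences in the document.
    words = doc.split()
    return sum(words.count(t) for t in query_terms)

def retrieve(query_terms, k=2):
    ranked = sorted(docs, key=lambda d: score(query_terms, docs[d]), reverse=True)
    return ranked[:k]

# Feedback source: here, text of the top-k initially retrieved documents
# (an LLM-generated passage could be dropped in instead).
def feedback_source(query_terms, k=1):
    return " ".join(docs[d] for d in retrieve(query_terms, k))

# Feedback model: append the most frequent new feedback terms (Rocchio-like).
def feedback_model(query_terms, feedback_text, n_terms=2):
    counts = Counter(feedback_text.split()).most_common()
    new_terms = [w for w, _ in counts if w not in query_terms][:n_terms]
    return list(query_terms) + new_terms

query = ["relevance", "feedback"]
expanded = feedback_model(query, feedback_source(query))
print(expanded)  # ['relevance', 'feedback', 'pseudo', 'expands']
```

Entangled evaluations vary both functions at once; the paper's point is that they should be varied one at a time.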
5.RCTs & Human Uplift Studies: Methodological Challenges and Practical Solutions for Frontier AI Evaluation
Human uplift studies - or studies that measure AI effects on human performance relative to a status quo, typically using randomized controlled trial (RCT) methodology - are increasingly used to inform deployment, governance, and safety decisions for frontier AI systems. While the methods underlying these studies are well-established, their interaction with the distinctive properties of frontier AI systems remains underexamined, particularly when results are used to inform high-stakes decisions. We present findings from interviews with 16 expert practitioners with experience conducting human uplift studies in domains including biosecurity, cybersecurity, education, and labor. Across interviews, experts described a recurring tension between standard causal inference assumptions and the object of study itself. Rapidly evolving AI systems, sh...
IEEE Xplore AI
1.Why AI Chatbots Agree With You Even When You’re Wrong
In April of 2025, OpenAI released a new version of GPT-4o, one of the AI algorithms users could select to power ChatGPT, the company’s chatbot. The next week, OpenAI reverted to the previous version. “The update we removed was overly flattering or agreeable—often described as sycophantic,” the company announced. Some people found the sycophancy hilarious. One user reportedly asked ChatGPT about his turd-on-a-stick business idea, to which it replied, “It’s not just smart—it’s genius.” Some found the behavior uncomfortable. For others, it was actually dangerous. Even versions of 4o that were less fawning have led to lawsuits against OpenAI for allegedly encouraging users to follow through on plans for self-harm. Unremitting adulation has even triggered AI-induced psychosis. Last October, a user named Anthony Tan blogged, “I started talkin...
2.An AI Agent Blackmailed a Developer. Now What?
On 12 February, a GitHub contributor going by MJ Rathbun posted a personal attack against Scott Shambaugh, a volunteer maintainer for an open-source project. Shambaugh had rejected Rathbun’s code earlier in the day. Rathbun meticulously researched Shambaugh’s activity on GitHub in order to write a lengthy takedown post that criticized the maintainer’s code as inferior to Rathbun’s and ominously warned that “gatekeeping doesn’t make you important. It just makes you an obstacle.” Personal disputes over code submitted to GitHub are a tale as old as GitHub itself. But this time, something was different: MJ Rathbun wasn’t a person. It was an AI agent built with OpenClaw, a popular open-source agentic AI software. “I was floored, because I had already identified i...
3.Military AI Policy Needs Democratic Oversight
A simmering dispute between the United States Department of Defense (DOD) and Anthropic has now escalated into a full-blown confrontation, raising an uncomfortable but important question: who gets to set the guardrails for military use of artificial intelligence — the executive branch, private companies, or Congress and the broader democratic process? The conflict began when Defense Secretary Pete Hegseth reportedly gave Anthropic CEO Dario Amodei a deadline to allow the DOD unrestricted use of its AI systems. When the company refused, the administration moved to designate Anthropic a supply chain risk and ordered federal agencies to phase out its technology, dramatically escalating the standoff. Anthropic has refused to cross two lines: allowing its models to be used for domestic surveillance of United States citizens and enabling fully...
4.Entomologists Use a Particle Accelerator to Image Ants at Scale
Move over, Pixar. The ants that animators once morphed into googly-eyed caricatures in films such as A Bug’s Life and Antz just received a meticulously precise anatomical reboot. Writing today in Nature Methods, an international team of entomologists, accelerator physicists, computer scientists, and biological-imaging specialists describes a new 3D atlas of ant morphology. Dubbed Antscan, the platform features micrometer-resolution reconstructions that lay bare not only the insects’ armored exoskeletons but also their muscles, nerves, digestive tracts, and needlelike stingers poised at the ready. Those high-resolution images—spanning 792 species across 212 genera and covering the bulk of described ant diversity—are now available free of charge through an interactive online portal, where anyone can rotate, zoom, and virtually “dissect” th...
5.Watershed Moment for AI–Human Collaboration in Math
When Ukrainian mathematician Maryna Viazovska received a Fields Medal—widely regarded as the Nobel Prize for mathematics—in July 2022, it was big news. Not only was she the second woman to accept the honor in the award’s 86-year history, but she collected the medal just months after her country had been invaded by Russia. Nearly four years later, Viazovska is making waves again. Today, in a collaboration between humans and AI, Viazovska’s proofs have been formally verified, signaling rapid progress in AI’s abilities to assist with mathematical research. “These new results seem very, very impressive, and definitely signal some rapid progress in this direction,” says AI-reasoning expert and Princeton University postdoc Liam Fowl, who was not involved in the work. In her Fields Medal–winning research, Viazovska had tackled two versions o...
MIT Sloan Management
1.Leaders at All Levels: Kraft Heinz’s 5X Speed Secret
Is 36 months too long for a new-product cycle? It was for Kraft Heinz. So, starting with a pilot project, it was able to cut time to market to just six months by redesigning how people worked. Today, units throughout the company are applying that model’s step-by-step approach to change and are seeing measurable improvements […]
2.Why Businesses Should Value Caregivers Now
Annalisa Grassano/Ikon Images In early 2025, more than 212,000 women left the U.S. workforce following a rise in return-to-office mandates, according to the U.S. Bureau of Labor Statistics (BLS). Among mothers with young children, workforce participation dropped nearly three percentage points in just six months, according to the BLS. Behind those numbers is a larger […]
3.An Industry Benchmark for Data Fairness: Sony’s Alice Xiang
On today’s episode of Me, Myself, and AI, host Sam Ransbotham talks with Alice Xiang, global head of AI governance at Sony and lead research scientist for AI ethics at Sony AI, about what it actually takes to put responsible artificial intelligence into practice at scale. Alice shares how Sony moved early on AI ethics […]
4.Why Visibility Has Become the New Test of Leadership
Carolyn Geason-Beissel/MIT SMR In professional service firms, quiet excellence once defined leadership. A partner earned influence through expertise, loyalty, and discretion. But in an era of high transparency, where every meeting can be replayed, every comment rated, and every decision scrutinized online, competence alone no longer sustains trust. Visibility has become the new test of […]
5.Our Guide to the Spring 2026 Issue
The Eight Core Principles of Strategic Innovation Gina O’Connor and Christopher R. Meyer Key Insight: Mature companies that build a strategic innovation capability can systematically renew their product portfolios to sustain long-term growth. Top Takeaways: Many companies start off with a bang: the launch of an exciting breakthrough product or service. But as time passes, […]
NBER Working Papers
1.Pricing Protection: Credit Scores, Disaster Risk, and Home Insurance Affordability -- by Joshua Blonz, Mallick Hossain, Benjamin J. Keys, Philip Mulder, Joakim A. Weill
We use 70 million policies linked to mortgages and property-level disaster risk to show that credit scores impact homeowners insurance premiums as much as disaster risk. Homeowners with low credit pay 24% more for identical coverage than high–credit score homeowners. Leveraging a natural experiment in Washington State, we find that banning the use of credit information considerably weakens the relationship between credit score and pricing. We discuss the role of credit information in pricing and show that, although insurance is often overlooked in discussions of home affordability, a low credit score increases premiums roughly as much as it raises mortgage rates.
2.When Incentives Aren't Enough: Evidence on Inattention and Imperfect Memory from HIV Medication Adherence -- by Hang Yu, Jared Stolove, Dean Yang, James Riddell IV, Arlete Mahumane
Financial incentives are widely used to encourage beneficial behaviors, but their effectiveness may be limited by inattention and imperfect memory. We study this in a randomized trial of HIV medication adherence in Mozambique. Financial incentives alone increase adherence by 10.6 percentage points, while pairing incentives with reminders increases adherence by 24.3 percentage points. We develop a model in which inattention to daily adherence and imperfect memory of payment eligibility reduce incentive effectiveness and show that reminders mitigate both frictions. Detailed medication refill data support the model’s predictions. The results suggest combining incentives with reminders can substantially increase program effectiveness.
3.Pay Now, Buy Never: The Economics of Consumer Prepayment Schemes -- by Yixuan Liu, Hua Zhang, Eric Zou
Prepaid consumption is a common feature of modern consumer markets and is often presented as a mutually beneficial arrangement: consumers receive upfront discounts, and firms secure future sales. We analyze a large-scale Pay Now, Buy Later (PNBL) program in which consumers prepay for restaurant credit with bonuses, and spend the balance later. Using detailed transaction data from over 4 million consumers, we document widespread balance breakage: approximately 40% of prepaid value is never used. Because many consumers underutilize their balances, merchants recover significantly more than the bonus cost. The median firm earns roughly $5.5 in breakage profit for every $1 of bonus credit issued. While PNBL participation does lead to modest increases in consumer spending over time, firms gain substantially more from breakage than from any loya...
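The breakage economics can be illustrated with stylized arithmetic. All figures below are illustrative assumptions, not the paper's accounting; they simply show how a modest bonus plus 40% breakage yields several dollars of expired value per bonus dollar.

```python
# Stylized Pay Now, Buy Later arithmetic; every number here is hypothetical.
cash_paid = 100.0                    # consumer prepays $100
bonus = 10.0                         # and receives a $10 bonus
credit = cash_paid + bonus           # $110 of restaurant credit

breakage_rate = 0.40                 # ~40% of prepaid value is never used
unredeemed = breakage_rate * credit  # $44 of credit expires
redeemed = credit - unredeemed       # $66 actually spent at the merchant

# Expired credit per dollar of bonus issued -- the same order of magnitude
# as the paper's ~$5.5 median figure.
breakage_per_bonus_dollar = unredeemed / bonus
print(f"${breakage_per_bonus_dollar:.1f} of expired credit per $1 of bonus")  # $4.4
```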
4.How Does AI Distribute the Pie? Large Language Models and the Ultimatum Game -- by Douglas K.G. Araujo, Harald Uhlig
As Large Language Models (LLMs) are increasingly tasked with autonomous decision making, understanding their behavior in strategic settings is crucial. We investigate the choices of various LLMs in the Ultimatum Game, a setting where human behavior notably deviates from theoretical rationality. We conduct experiments varying the stake size and the nature of the opponent (Human vs. AI) across both Proposer and Responder roles. Three key results emerge. First, LLM behavior is heterogeneous but predictable when conditioning on stake size and player types. Second, while some models approximate the rational benchmark and others mimic human social preferences, a distinct “altruistic” mode emerges where LLMs propose hyper-fair distributions (greater than 50%). Third, LLM Proposers forgo a large share of total payoff, and an even larger share whe...
5.Mergers and Non-contractible Benefits: The Employees' Perspective -- by Wei Cai, Andrea Prat, Jiehang Yu
Incomplete contract theory, supported by anecdotal evidence, suggests that when a firm is acquired, workers may be adversely affected in non-contractible aspects of their work experience. This paper empirically investigates this prediction by combining M&A events from the Refinitiv database and web-scraped Glassdoor review data. We find that: (a) Controlling for pre-trends, mergers lead to lower satisfaction, especially on non-contractible dimensions of the employee experience (about 6% of a standard deviation); (b) The effect is stronger in the target firm than in the acquiring firm; (c) Text analysis of employee comments indicates that the decline in satisfaction is primarily associated with perceived breaches of implicit contracts. Our findings indicate that mergers may reduce workers' job utility through non-monetary channels.
NY Fed - Liberty Street
1.Firms’ Inflation Expectations Return to 2024 Levels
Businesses experienced substantial cost pressures in 2025 as the cost of insurance and utilities rose sharply, while an increase in tariffs contributed to rising goods and materials costs. This post examines how firms in the New York-Northern New Jersey region adjusted their prices in response to these cost pressures and describes their expectations for future price increases and inflation. Survey results show an acceleration in firms’ price increases in 2025, with an especially sharp increase in the manufacturing sector. While both cost and price increases intensified last year, our surveys re...
2.Are Rising Employee Health Insurance Costs Dampening Wage Growth?
Employer-sponsored health insurance represents a substantial component of total compensation paid by firms to many workers in the United States. Such costs have climbed by close to 20 percent over the past five years. Indeed, the average annual premium for employer-sponsored family health insurance coverage was about $27,000 in 2025—roughly equivalent to the wage of a full-time worker paid $15 per hour. Our February regional business surveys asked firms whether their wage setting decisions were influenced by the rising cost of employee health insurance. As we showed in our
3.What’s Driving Rising Business Costs?
After a period of moderating cost increases, businesses faced mounting cost pressures in 2025. While tariffs played a role in driving up the costs of many inputs—especially among manufacturers—they represent only part of the story. Indeed, firms grappled with substantial cost increases across many categories in the past year. This post is the first in a three-part series analyzing cost and price dynamics among businesses in the New York-Northern New Jersey region based on data collected through our regional business surveys. Firms reported that the sharpest cost increases over the...
4.The Post‑Pandemic Global R*
In this post we provide a measure of “global” r* using data on short- and long-term yields and inflation for several countries with the approach developed in “Global Trends in Interest Rates” (Del Negro, Giannone, Giannoni, and Tambalotti). After declining significantly from the 1990s to before the COVID-19 pandemic, global r* has risen but remains well below its pre-1990s level. These conclusions are based on an econometric model called “trendy VAR” that extracts common trends across a multitude of variables. Specifically, the common trend in real rates across all the countries in the sample is what we call global r*. The post is based on the
5.Estimating the Term Structure of Corporate Bond Risk Premia
Understanding how short- and long-term assets are priced is one of the fundamental questions in finance. The term structure of risk premia allows us to perform net present value calculations, test asset pricing models, and potentially explain the sources of many cross-sectional asset pricing anomalies. In this post, I construct a forward-looking estimate of the term structure of risk premia in the corporate bond market following Jankauskas (2024). The U.S. corporate bond market is an ideal laboratory for studying the relationship between risk premia and maturity because of its large size (standing at roughly $16 trillion as of the end of 2024) and because the maturities are well defined (in contrast to equities).
Project Syndicate
1.Scotland Is Pointing the Way to a New Economy
As more people worldwide confront economic failures and climate disruptions, systemic change will become increasingly attractive. Scotland’s new Community Wealth Building Bill can serve as an example of a policy framework that promotes broad-based ownership, collective decision-making, and the reconfiguration of economic power.
2.The Growing Cyber Risk to Supply Chains
Many corporate leaders regard cybersecurity as an internal IT problem that can be delegated and forgotten. But as AI and automation reshape global supply chains, cyber readiness has become an operational capability, similar to quality or safety, with the goal being continuity in the face of disruption.
3.Consumers or Workers First?
A well-functioning economy, say most economists, is one that provides a widening array of ever-more affordable goods and services. But what we need is an economic-policy mindset that recognizes that people are both consumers and workers.
4.Africa Is Reimagining Climate Finance
Foreign donors – including governments, NGOs, and development agencies – have long based climate-finance decisions on their own perceptions of risk, imposing solutions that do not necessarily reflect African priorities or perspectives. But with new investment platforms, Africa is taking matters into its own hands.
5.How Will This Energy Shock Play Out?
Although the US-Israeli war against Iran is unprecedented in many ways, previous energy and financial crises over the past half-century offer useful lessons for investors, policymakers, and others hoping to navigate the current storm. For starters, oil and natural gas producers cannot assume that prices will stay high indefinitely.
RCR Wireless
1.To capture the AI opportunity, telcos must lead, not follow (Reader Forum)
As AI adoption accelerates, telecom operators have a chance to claim a central role in the AI economy, says Spirent (now part of Keysight). To seize it, they must move beyond connectivity and deliver trusted, high-performance AI infrastructure and services. The AI boom is real—and accelerating. In a recent Bain & Company survey, 95% of […]
2.Metro Connect USA: US power crunch is standing in the way of its AI ambitions
AI’s voracious appetite is straining energy grids, prompting new power strategies and longer planning horizons In sum — what to know: A gating factor: Power is now the biggest gating factor for AI infrastructure, with water to follow next. Bring your own power: To avoid grid struggles, businesses must consider adopting bring your own power […]
3.GSMA launches Open Telco AI initiative for telco-grade AI at MWC 2026
New alliance including AT&T and AMD aims to fix the reliability gap in telecom AI In sum – what we know: The GSMA wants to make AI a little more telco-focused. The association took the wraps off the Open Telco AI initiative at Mobile World Congress in Barcelona, with a pretty clear pitch. General-purpose AI models, […]
4.Cyient showcases the human + AI blueprint for autonomous networks
At the Autonomous Network Summit during MWC Barcelona 2026, Cyient outlined how autonomy is reshaping network operations for telcos On the opening day of MWC in Barcelona, Cyient brought together industry leaders, operators, and technology experts to tackle one of telecom’s defining priorities: advancing the journey toward level 4 autonomous networks. Under the theme “Embracing […]
5.India outlines AI-driven telecom vision at MWC 2026
Scindia noted that India had one of the fastest rollouts of 5G in the world, with 500,000 base stations already deployed in the country In sum – what to know: AI transforming networks – Scindia said the industry is entering an “IQ era,” where AI is turning networks into adaptive systems capable of real-time transactions, […]
Semantic Scholar – Machine Learning
1.Source Error
Check Feed
Telecom & 6G AI
1.Federated Learning-driven Beam Management in LEO 6G Non-Terrestrial Networks
Low Earth Orbit (LEO) Non-Terrestrial Networks (NTNs) require efficient beam management under dynamic propagation conditions. This work investigates Federated Learning (FL)-based beam selection in LEO satellite constellations, where orbital planes operate as distributed learners through the utilization of High-Altitude Platform Stations (HAPS). Two models, a Multi-Layer Perceptron (MLP) and a Graph Neural Network (GNN), are evaluated using realistic channel and beamforming data. Results demonstrate that GNN surpasses MLP in beam prediction accuracy and stability, particularly at low elevation angles, enabling lightweight and intelligent beam management for future NTN deployments.
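The federated setup reduces to its aggregation step. Below is a minimal FedAvg-style sketch (toy parameter vectors; the per-plane weights and data sizes are invented, not from the paper) in which a HAPS aggregator averages model parameters from orbital-plane learners, weighted by local data volume:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg aggregation)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)             # (n_clients, n_params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

# Three "orbital plane" learners with different amounts of local beam data.
w = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
sizes = [100, 100, 200]
global_w = fedavg(w, sizes)
print(global_w)  # [0.75 0.75]
```

Only parameter vectors travel to the aggregator, which is the point of using FL here: raw channel measurements stay on each plane's learners.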
2.AI-Enhanced Spatial Cellular Traffic Demand Prediction with Contextual Clustering and Error Correction for 5G/6G Planning
Accurate spatial prediction of cellular traffic demand is essential for 5G NR capacity planning, network densification, and data-driven 6G planning. Although machine learning can fuse heterogeneous geospatial and socio-economic layers to estimate fine-grained demand maps, spatial autocorrelation can cause neighborhood leakage under naive train/test splits, inflating accuracy and weakening planning reliability. This paper presents an AI-driven framework that reduces leakage and improves spatial generalization via a context-aware two-stage splitting strategy with residual spatial error correction. Experiments using crowdsourced usage indicators across five major Canadian cities show consistent mean absolute error (MAE) reductions relative to location-only clustering, supporting more reliable bandwidth provisioning and evidence-based spectru...
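The leakage problem generalizes beyond this paper. The self-contained toy below (a synthetic one-dimensional "demand" field and a 1-nearest-neighbor predictor, none of it from the paper) shows why a random split flatters a spatial model compared with holding out a contiguous region:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(0.0, 100.0, 0.5)                          # 200 sites along a line
y = np.sin(x / 10.0) + 0.01 * rng.normal(size=x.size)   # smooth "demand" field

def nn_mae(train_idx, test_idx):
    """MAE of a 1-nearest-neighbor predictor over the test sites."""
    preds = [y[train_idx[np.argmin(np.abs(x[train_idx] - x[i]))]]
             for i in test_idx]
    return float(np.mean(np.abs(np.array(preds) - y[test_idx])))

# Random split: test sites interleave with training sites (neighborhood leakage).
perm = rng.permutation(x.size)
rand_mae = nn_mae(perm[50:], perm[:50])

# Spatial block split: hold out one contiguous region, as in spatial CV.
test_block = np.where((x >= 40) & (x < 60))[0]
train_block = np.where((x < 40) | (x >= 60))[0]
block_mae = nn_mae(train_block, test_block)

print(f"random-split MAE {rand_mae:.3f} vs block-split MAE {block_mae:.3f}")
```

Because the target is spatially autocorrelated, each randomly held-out site has a near-identical training neighbor, so the random-split error badly understates what the model achieves on a genuinely unseen area.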
3.Two-Layer Stacked Intelligent Metasurfaces: Balancing Performance and Complexity
Stacked intelligent metasurfaces (SIMs) have emerged as a powerful paradigm for wave-domain signal processing, enabling fine-grained control over electromagnetic (EM) propagation in next-generation wireless systems. However, conventional multi-layer SIMs often suffer from excessive structural complexity, high computational overhead, and significant power attenuation across layers, limiting their performance. In this paper, we first characterize SIMs from the perspectives of functionality, application, and layer configuration, revealing the inherent trade-offs between signal processing flexibility and power efficiency. Then, two representative 2-layer architectures, the meta-fiber-connected SIM (MF-SIM) and the flexible intelligent layered metasurface (FILM), are introduced, each advocating a distinct 2-layer SIM design philosophy. Moreove...
4.Propagation and Rate-Aware Cell Switching Optimization in HAPS-Assisted Wireless Networks
Cell switching is a promising approach for improving energy efficiency in wireless networks; however, existing studies largely rely on simplified models and energy-centric formulations that overlook key performance-limiting factors. This paper revisits the cell switching concept by redefining its modeling assumptions and mathematical formulation, explicitly incorporating realistic propagation effects such as building entry loss (BEL) and atmospheric losses relevant to non-terrestrial networks (NTN), particularly high-altitude platform station (HAPS). Beyond proposing a new cell switching strategy, the conventional energy-focused problem is reformulated as a multi-objective optimization framework that jointly minimizes power consumption, unconnected users, and data rate degradation. Through this reformulation, the proposed methods ensure t...
5.Multi-Modal Intelligent Channel Modeling: From Fine-tuned LLMs to Pre-trained Foundation Models
To meet the evolving demands of sixth-generation (6G) wireless channel modeling, such as precise prediction capability, extension capabilities, and system participation capability, multi-modal intelligent channel modeling (MMICM) has been proposed based on Synesthesia of Machines (SoM) which explores the mapping relationship between multi-modal sensing in physical environment and channel characteristics in electromagnetic space. Furthermore, for integrating heterogeneous sensing, reasoning across scales, and generalizing to complex air-space-ground-sea communication environments, two new paradigms of MMICM are explored, including fine-tuned large language models (LLMs) for Channel Modeling (LLM4CM) and Wireless Channel Foundation Model (WiCo). LLM4CM leverages pre-trained LLMs on channel representations for cross-modal alignment and light...
arXiv Quantitative Finance
1.When David becomes Goliath: Repo dealer-driven bond mispricing
This paper studies the impact of funding market frictions on bond prices and market-wide liquidity. Using proprietary transaction-level data on all gilt-backed repo and reverse-repo trades, we demonstrate how the market power of individual dealers and their linkages generate frictions. Specifically, we show that frictions related to market power account for between 0.5 and 1.3 percentage points of bond yield deviation, while the transmission of heterogeneously persistent shocks between dealers accounts for between 2 and 4 percentage points of yield deviation.
2.An operator-level ARCH Model
AutoRegressive Conditional Heteroscedasticity (ARCH) models are standard for modeling time series exhibiting volatility, with a rich literature in univariate and multivariate settings. In recent years, these models have been extended to function spaces. However, functional ARCH and generalized ARCH (GARCH) processes established in the literature have thus far been restricted to model “pointwise” variances. In this paper, we propose a new ARCH framework for data residing in general separable Hilbert spaces that accounts for the full evolution of the conditional covariance operator. We define a general operator-level ARCH model. For a simplified Constant Conditional Correlation version of the model, we establish conditions under which such models admit strictly and weakly stationary solutions, finite moments, and weak serial dependence. A...
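For orientation, the scalar ARCH(1) recursion that such models generalize (the standard textbook form, not the paper's operator equations) is:

```latex
\varepsilon_t = \sigma_t z_t, \qquad z_t \overset{\text{iid}}{\sim} (0,1), \qquad
\sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2, \qquad \omega > 0,\ \alpha \ge 0.
```

The operator-level framework replaces the scalar conditional variance $\sigma_t^2$ with a conditional covariance operator on the Hilbert space, rather than a pointwise variance function, which is what distinguishes it from the earlier functional (G)ARCH literature the abstract mentions.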
3.Hybrid Hidden Markov Model for Modeling Equity Excess Growth Rate Dynamics: A Discrete-State Approach with Jump-Diffusion
Generating synthetic financial time series that preserve statistical properties of real market data is essential for stress testing, risk model validation, and scenario design. Existing approaches, from parametric models to deep generative networks, struggle to simultaneously reproduce heavy-tailed distributions, negligible linear autocorrelation, and persistent volatility clustering. We propose a hybrid hidden Markov framework that discretizes continuous excess growth rates into Laplace quantile-defined market states and augments regime switching with a Poisson-driven jump-duration mechanism to enforce realistic tail-state dwell times. Parameters are estimated by direct transition counting, bypassing the Baum-Welch EM algorithm. Synthetic data quality is evaluated using Kolmogorov-Smirnov and Anderson-Darling pass rates for distributiona...
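Because the states are observed once returns are discretized, the transition matrix follows from counting rather than from EM. A minimal sketch (simulated Gaussian returns and plain empirical terciles, a simplification of the paper's Laplace quantile-defined states):

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.standard_normal(1000)   # stand-in for excess growth rates

# Discretize into 3 quantile-defined market states.
edges = np.quantile(returns, [1 / 3, 2 / 3])
states = np.digitize(returns, edges)  # values in {0, 1, 2}

# Estimate the transition matrix by direct counting (no Baum-Welch needed,
# since the state sequence is fully observed after discretization).
n = 3
counts = np.zeros((n, n))
for a, b in zip(states[:-1], states[1:]):
    counts[a, b] += 1
P = counts / counts.sum(axis=1, keepdims=True)

print(P.round(2))
```

Each row of `P` is the empirical distribution over next states given the current state; the paper then layers a Poisson-driven jump-duration mechanism on top to control tail-state dwell times.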
4.From debt crises to financial crashes (and back): a stock-flow consistent model for stock price bubbles
We develop a stochastic macro-financial model in continuous time by integrating two specifications of the Keen economic framework with a financial market driven by a jump-diffusion process. The economic block of the model combines monetary debt-deflation mechanisms with Ponzi-type financial destabilization and is influenced by the financial market through a stochastic interest rate that depends on asset price returns. The financial market block of the model consists of an asset with jump-diffusion price process with endogenous, state-dependent jump intensities driven by speculative credit flows. The model formalizes a feedback loop linking credit expansion, crash risk, perceived return dynamics, and bank lending spreads. Under suitable parameter restrictions, we establish global existence and non-explosion of the coupled system. Numerica...
5.Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis
Stock market prediction presents considerable challenges for investors, financial institutions, and policymakers operating in complex market environments characterized by noise, non-stationarity, and behavioral dynamics. Traditional forecasting methods often fail to capture the intricate patterns and cross-sectional dependencies inherent in financial markets. This paper presents an integrated framework combining a node transformer architecture with BERT-based sentiment analysis for stock price forecasting. The proposed model represents the stock market as a graph structure where individual stocks form nodes and edges capture relationships including sectoral affiliations, correlated price movements, and supply chain connections. A fine-tuned BERT model extracts sentiment from social media posts and combines it with quantitative market feat...
arXiv – 6G & Networking
1.Federated Learning-driven Beam Management in LEO 6G Non-Terrestrial Networks
Low Earth Orbit (LEO) Non-Terrestrial Networks (NTNs) require efficient beam management under dynamic propagation conditions. This work investigates Federated Learning (FL)-based beam selection in LEO satellite constellations, where orbital planes operate as distributed learners through the utilization of High-Altitude Platform Stations (HAPS). Two models, a Multi-Layer Perceptron (MLP) and a Graph Neural Network (GNN), are evaluated using realistic channel and beamforming data. Results demonstrate that GNN surpasses MLP in beam prediction accuracy and stability, particularly at low elevation angles, enabling lightweight and intelligent beam management for future NTN deployments.
2.AI-Enhanced Spatial Cellular Traffic Demand Prediction with Contextual Clustering and Error Correction for 5G/6G Planning
Accurate spatial prediction of cellular traffic demand is essential for 5G NR capacity planning, network densification, and data-driven 6G planning. Although machine learning can fuse heterogeneous geospatial and socio-economic layers to estimate fine-grained demand maps, spatial autocorrelation can cause neighborhood leakage under naive train/test splits, inflating accuracy and weakening planning reliability. This paper presents an AI-driven framework that reduces leakage and improves spatial generalization via a context-aware two-stage splitting strategy with residual spatial error correction. Experiments using crowdsourced usage indicators across five major Canadian cities show consistent mean absolute error (MAE) reductions relative to location-only clustering, supporting more reliable bandwidth provisioning and evidence-based spectru...
3.Two-Layer Stacked Intelligent Metasurfaces: Balancing Performance and Complexity
Stacked intelligent metasurfaces (SIMs) have emerged as a powerful paradigm for wave-domain signal processing, enabling fine-grained control over electromagnetic (EM) propagation in next-generation wireless systems. However, conventional multi-layer SIMs often suffer from excessive structural complexity, high computational overhead, and significant power attenuation across layers, limiting their performance. In this paper, we first characterize SIMs from the perspectives of functionality, application, and layer configuration, revealing the inherent trade-offs between signal processing flexibility and power efficiency. Then, two representative 2-layer architectures, the meta-fiber-connected SIM (MF-SIM) and the flexible intelligent layered metasurface (FILM), are introduced, each advocating a distinct 2-layer SIM design philosophy. Moreove...
4.Propagation and Rate-Aware Cell Switching Optimization in HAPS-Assisted Wireless Networks
Cell switching is a promising approach for improving energy efficiency in wireless networks; however, existing studies largely rely on simplified models and energy-centric formulations that overlook key performance-limiting factors. This paper revisits the cell switching concept by redefining its modeling assumptions and mathematical formulation, explicitly incorporating realistic propagation effects such as building entry loss (BEL) and atmospheric losses relevant to non-terrestrial networks (NTN), particularly high-altitude platform stations (HAPS). Beyond proposing a new cell switching strategy, the conventional energy-focused problem is reformulated as a multi-objective optimization framework that jointly minimizes power consumption, unconnected users, and data rate degradation. Through this reformulation, the proposed methods ensure t...
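The multi-objective reformulation can be sketched as a weighted scalarization: each on/off pattern is scored on power consumption, unconnected users, and rate degradation, and the lowest-cost pattern wins. All numbers and weights below are illustrative placeholders, not values from the paper.

```python
# Weighted-sum cell switching: exhaustively score on/off patterns
# over three objectives (lower total cost is better).
from itertools import product

CELLS = ["macro", "small_1", "small_2"]
POWER = {"macro": 100.0, "small_1": 20.0, "small_2": 20.0}
# Users left unconnected / rate loss incurred if the cell is OFF.
UNCONNECTED_IF_OFF = {"macro": 50, "small_1": 2, "small_2": 8}
RATE_LOSS_IF_OFF = {"macro": 0.8, "small_1": 0.05, "small_2": 0.15}

def score(pattern, w=(1.0, 2.0, 100.0)):
    """Weighted sum of power, unconnected users, and rate loss."""
    power = sum(POWER[c] for c, on in zip(CELLS, pattern) if on)
    unconn = sum(UNCONNECTED_IF_OFF[c]
                 for c, on in zip(CELLS, pattern) if not on)
    rate = sum(RATE_LOSS_IF_OFF[c]
               for c, on in zip(CELLS, pattern) if not on)
    return w[0] * power + w[1] * unconn + w[2] * rate

best = min(product([True, False], repeat=len(CELLS)), key=score)
```

With these toy numbers, switching off the lightly loaded small cell saves more power than its coverage penalty costs, while the macro stays on; an energy-only formulation would instead switch off every cell it could, ignoring the users it strands.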
5.Multi-Modal Intelligent Channel Modeling: From Fine-tuned LLMs to Pre-trained Foundation Models
To meet the evolving demands of sixth-generation (6G) wireless channel modeling, such as precise prediction, extension, and system participation capabilities, multi-modal intelligent channel modeling (MMICM) has been proposed based on Synesthesia of Machines (SoM), which explores the mapping relationship between multi-modal sensing in the physical environment and channel characteristics in electromagnetic space. Furthermore, to integrate heterogeneous sensing, reason across scales, and generalize to complex air-space-ground-sea communication environments, two new paradigms of MMICM are explored: fine-tuned large language models (LLMs) for Channel Modeling (LLM4CM) and the Wireless Channel Foundation Model (WiCo). LLM4CM leverages pre-trained LLMs on channel representations for cross-modal alignment and light...
arXiv – Network Architecture (6G/Slicing)
1.Adaptive RAN Slicing Control via Reward-Free Self-Finetuning Agents
The integration of Generative AI models into AI-native network systems offers a transformative path toward autonomous and adaptive control. However, applying such models to continuous control tasks is impeded by intrinsic architectural limitations, including finite context windows, the lack of explicit reward signals, and degradation over long contexts. This paper posits that the key to unlocking robust continuous control is enabling agents to internalize experience by distilling it into their parameters, rather than relying on prompt-based memory. To this end, we propose a novel self-finetuning framework that enables agentic systems to learn continuously through direct interaction with the environment, bypassing the need for handcrafted rewards. Our framework implements a bi-perspective reflection mechanism that ...
2.Spyglass: Directional Spectrum Sensing with Single-shot AoA Estimation and Virtual Arrays
In this paper, we introduce Spyglass, a spectrum sensor designed to address the challenges of effective spectrum usage in dense wireless environments. Spyglass is capable of observing a frequency band and accurately estimating the Angle of Arrival (AoA) of any signal during a single transmission. This includes additional signal context such as center frequency, bandwidth, and I/Q samples. We overcome challenges such as the clutter of fleeting transmissions in common bands, the high cost of array processing for AoA estimation, and the difficulty of detecting and estimating channels for unknown signals. Our first contribution is the development of Searchlite, a protocol-agnostic signal detection and separation algorithm. We use a switched array to reduce cost and processing complexity, and we develop SSFP, a signal processing technique usin...
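The core of single-shot AoA estimation can be illustrated with the textbook two-element case: with antennas spaced d apart, the inter-element phase of a narrowband arrival encodes sin(theta). This is a minimal sketch of that principle, not Spyglass's Searchlite/SSFP pipeline, and all parameters are illustrative.

```python
# Phase-difference AoA: simulate a narrowband arrival at a 2-element
# array, then invert the inter-element phase to recover the angle.
import cmath
import math

WAVELENGTH = 0.05          # ~6 GHz carrier, metres (assumed)
D = WAVELENGTH / 2         # half-wavelength element spacing

def received_pair(theta_deg):
    """Simulated single snapshot at the two array elements."""
    phase = (2 * math.pi * D
             * math.sin(math.radians(theta_deg)) / WAVELENGTH)
    return 1.0 + 0.0j, cmath.exp(1j * phase)

def estimate_aoa(x0, x1):
    """Recover the angle from the measured inter-element phase."""
    phase = cmath.phase(x1 * x0.conjugate())
    return math.degrees(
        math.asin(phase * WAVELENGTH / (2 * math.pi * D)))

x0, x1 = received_pair(25.0)
theta_hat = estimate_aoa(x0, x1)   # recovers ~25 degrees
```

Half-wavelength spacing keeps the phase unambiguous across the full ±90° field of view; a switched or virtual array, as in the paper, extends this idea to many elements without one RF chain per antenna.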
3.Towards Flexible Spectrum Access: Data-Driven Insights into Spectrum Demand
In the diverse landscape of 6G networks, where wireless connectivity demands surge and spectrum resources remain limited, flexible spectrum access becomes paramount. The success of crafting such schemes hinges on our ability to accurately characterize spectrum demand patterns across space and time. This paper presents a data-driven methodology for estimating spectrum demand variations over space and identifying key drivers of these variations in the mobile broadband landscape. By leveraging geospatial analytics and machine learning, the methodology is applied to a case study in Canada to estimate spectrum demand dynamics in urban regions. Our proposed model captures 70% of the variability in spectrum demand when trained on one urban area and tested on another. These insights empower regulators to navigate the complexities of 6G networks ...
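The cross-city evaluation can be sketched as: fit a demand model on one urban area's geospatial features, then measure the variance explained on a different, unseen area. The simple linear model and toy (feature, demand) pairs below are illustrative only; the paper's model and data are richer.

```python
# Train-on-city-A, test-on-city-B: ordinary least squares plus an
# out-of-area R^2 (fraction of variability captured).
def fit_line(xs, ys):
    """OLS fit for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def r2(ys, preds):
    """Coefficient of determination on held-out data."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# City A: density-like feature -> traffic demand (toy values).
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
# City B: held out entirely, used only for evaluation.
test_x, test_y = [1.5, 2.5, 3.5], [3.2, 5.3, 6.6]

a, b = fit_line(train_x, train_y)
score = r2(test_y, [a * x + b for x in test_x])
```

Evaluating R^2 on a city the model never saw is what makes the reported 70% a claim about spatial generalization rather than in-sample fit.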
4.Optimizing Reinforcement Learning Training over Digital Twin Enabled Multi-fidelity Networks
In this paper, we investigate a novel digital network twin (DNT) assisted deep learning (DL) model training framework. In particular, we consider a physical network where a base station (BS) uses several antennas to serve multiple mobile users, and a DNT that is a virtual representation of the physical network. The BS must adjust its antenna tilt angles to optimize the data rates of all users. Due to user mobility, the BS may not be able to accurately track network dynamics such as wireless channels and user mobility. Hence, a reinforcement learning (RL) approach is used to dynamically adjust the antenna tilt angles. To train the RL agent, we can use data collected from the physical network and the DNT. The data collected from the physical network is more accurate but incurs more communication overhead compared to the data collected from the ...
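A sketch of the multi-fidelity idea: RL training batches mix a few costly, accurate samples from the physical network with many cheap DNT samples, trading accuracy against communication overhead. The batch sizes and the `mix_batch` policy below are assumptions for illustration, not the paper's algorithm.

```python
# Compose RL training batches from two replay buffers of different
# fidelity: physical-network samples (accurate, costly to collect)
# and digital-twin samples (cheaper, more abundant).
import random

random.seed(1)

physical_buffer = [("phys", i) for i in range(20)]   # accurate, costly
twin_buffer = [("dnt", i) for i in range(200)]       # cheap, abundant

def mix_batch(batch_size, phys_fraction):
    """Draw a batch with a fixed physical/DNT sample ratio."""
    n_phys = round(batch_size * phys_fraction)
    batch = random.sample(physical_buffer, n_phys)
    batch += random.sample(twin_buffer, batch_size - n_phys)
    random.shuffle(batch)
    return batch

batch = mix_batch(batch_size=32, phys_fraction=0.25)
n_phys = sum(1 for src, _ in batch if src == "phys")
```

Tuning `phys_fraction` is the knob the framework's trade-off turns on: higher values ground training in reality at the cost of over-the-air data collection overhead.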
5.Predicting Conflict Impact on Performance in O-RAN
The O-RAN Alliance promotes the integration of intelligent autonomous agents to control the Radio Access Network (RAN). This improves flexibility, performance, and observability in the RAN, but introduces new challenges, such as the detection and management of conflicts among the intelligent autonomous agents. A solution consists of profiling the agents before deployment to gather statistical information about their decision-making behavior, then using the information to estimate the level of conflict among agents with different goals. This approach enables determining the occurrence of conflicts among agents, but does not provide information about the impact on RAN performance, including potential service degradation. The problem becomes more complex when agents generate control actions at different timescales, which makes conflict sever...
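The profiling approach can be sketched as: record each agent's action distribution over a shared control parameter during pre-deployment profiling, then estimate how often two agents would pull it in opposite directions. The distributions and the disagreement metric below are illustrative stand-ins for the paper's conflict model, which also has to account for agents acting at different timescales.

```python
# Conflict estimation from profiled action statistics: two agents
# with different goals act on the same shared knob, and conflict is
# the probability their actions oppose each other.
ACTIONS = (-1, 0, +1)   # decrease / hold / increase the shared knob

# Profiled action frequencies gathered before deployment (toy values).
energy_saver = {-1: 0.6, 0: 0.3, +1: 0.1}
throughput_booster = {-1: 0.1, 0: 0.2, +1: 0.7}

def conflict_probability(p, q):
    """Probability the two agents issue opposing actions."""
    return sum(p[a] * q[b]
               for a in ACTIONS for b in ACTIONS if a * b < 0)

risk = conflict_probability(energy_saver, throughput_booster)
```

This detects that conflicts will occur; the paper's contribution goes a step further, predicting the resulting impact on RAN performance rather than just the frequency of disagreement.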