Daily Briefing – Mar 25 (100 Articles)
Babak's Daily Briefing
Wednesday, March 25, 2026
Sources: 20 | Total Articles: 100
6G World
1.SpaceRAN: Airbus UpNext explores software-defined 5G NTN from orbit
Airbus UpNext has launched its SpaceRAN (Space Radio Access Network) demonstrator, a key initiative to advance standardised 5G…
2.SoftBank’s Transformer-Based AI-RAN Hits 30% Uplink Gain at Sub-Millisecond Latency
On August 21, 2025, SoftBank published results from a live, standards-compliant AI-RAN trial that replaces parts of classical signal processing with a lightweight Transformer.
3.6G as a Platform for Value
Reframing the Future with NGMN’s Chairman, Laurent Leboucher. By Piotr (Peter) Pietrzyk, Managing Editor, 6GWorld.com. In the race…
4.SoftBank Road-Tests 7 GHz in Central Tokyo
SoftBank and Nokia have begun outdoor field trials in Tokyo’s Ginza district using 7 GHz spectrum, installing three pre-commercial base stations to compare coverage and radio characteristics against today’s sub-6 GHz 5G sites.
5.NXP’s Acquisition of TTTech Auto Signals Growing Focus on Middleware for Software-Defined Vehicles
On June 17, 2025, NXP Semiconductors finalized its acquisition of TTTech Auto—a strategic move to integrate TTTech’s flagship…
AI Agents
1.TrustTrade: Human-Inspired Selective Consensus Reduces Decision Uncertainty in LLM Trading Agents
Large language models (LLMs) are increasingly deployed as autonomous agents in financial trading. However, they often exhibit a hazardous behavioral bias that we term uniform trust, whereby retrieved information is implicitly assumed to be factual and heterogeneous sources are treated as equally informative. This assumption stands in sharp contrast to human decision-making, which relies on selective filtering, cross-validation, and experience-driven weighting of information sources. As a result, LLM-based trading systems are particularly vulnerable to multi-source noise and misinformation, amplifying factual hallucinations and leading to unstable risk-return performance. To bridge this behavioral gap, we introduce TrustTrade (Trust-Rectified Unified Selective Trader), a multi-agent selective consensus framework inspired by human epistemic...
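The contrast between "uniform trust" and selective, experience-weighted aggregation can be made concrete with a toy buy/sell vote. The source names, signals, trust weights, and decision rule below are invented for illustration; this is not TrustTrade's actual mechanism.

```python
def aggregate(signals: dict, trust: dict = None) -> int:
    """signals maps source -> +1 (buy) / -1 (sell); returns the consensus sign."""
    weights = trust or {}
    score = sum(s * weights.get(src, 1.0) for src, s in signals.items())
    return (score > 0) - (score < 0)

signals = {"filings": 1, "news_wire": 1, "rumor_blog": -1, "spam_feed": -1}

# Uniform trust: every source counts equally, so two noisy sources
# cancel two reliable ones and no usable consensus emerges.
uniform = aggregate(signals)            # 0

# Selective weighting: reliable sources dominate unvetted ones.
trust = {"filings": 2.0, "news_wire": 1.0, "rumor_blog": 0.2, "spam_feed": 0.1}
weighted = aggregate(signals, trust)    # +1, a buy consensus emerges
```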
2.Causal Evidence that Language Models use Confidence to Drive Behavior
Metacognition -- the ability to assess one's own cognitive performance -- is documented across species, with internal confidence estimates serving as a key signal for adaptive behavior. While confidence can be extracted from Large Language Model (LLM) outputs, whether models actively use these signals to regulate behavior remains a fundamental question. We investigate this through a four-phase abstention paradigm. Phase 1 established internal confidence estimates in the absence of an abstention option. Phase 2 revealed that LLMs apply implicit thresholds to these estimates when deciding to answer or abstain. Confidence emerged as the dominant predictor of behavior, with effect sizes an order of magnitude larger than knowledge retrieval accessibility (RAG scores) or surface-level semantic features. Phase 3 provided causal evidence through a...
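The implicit thresholding described in Phase 2 amounts to a simple decision rule when made explicit. A minimal sketch, with a hypothetical threshold value not taken from the paper:

```python
def answer_or_abstain(answer: str, confidence: float, threshold: float = 0.6) -> str:
    """Answer only when the internal confidence estimate clears the threshold."""
    return answer if confidence >= threshold else "I don't know"

print(answer_or_abstain("Paris", 0.92))   # confident  -> "Paris"
print(answer_or_abstain("Paris", 0.31))   # uncertain  -> "I don't know"
```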
3.Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe
Reinforcement Learning (RL) is essential for evolving Large Language Models (LLMs) into autonomous agents capable of long-horizon planning, yet a practical recipe for scaling RL in complex, multi-turn environments remains elusive. This paper presents a systematic empirical study using TravelPlanner, a challenging testbed requiring tool orchestration to satisfy multifaceted constraints. We decompose the agentic RL design space along 5 axes: reward shaping, model scaling, data composition, algorithm selection, and environmental stability. Our controlled experiments yield 7 key takeaways, e.g., (1) reward and algorithm choices are scale-dependent as smaller models benefit from staged rewards and enhanced exploration, whereas larger models converge efficiently with simpler dense rewards, (2) ~ 1K training samples with a balanced difficulty mi...
4.Reasoner-Executor-Synthesizer: Scalable Agentic Architecture with Static O(1) Context Window
Large Language Models (LLMs) deployed as autonomous agents commonly use Retrieval-Augmented Generation (RAG), feeding retrieved documents into the context window, which creates two problems: the risk of hallucination grows with context length, and token cost scales linearly with dataset size. We propose the Reasoner-Executor-Synthesizer (RES) architecture, a three-layer design that strictly separates intent parsing (Reasoner), deterministic data retrieval and aggregation (Executor), and narrative generation (Synthesizer). The Executor uses zero LLM tokens and passes only fixed-size statistical summaries to the Synthesizer. We formally prove that RES achieves O(1) token complexity with respect to dataset size, and validate this on ScholarSearch, a scholarly research assistant backed by the Crossref API (130M+ articles). Across 100 benchmar...
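The three-layer split described above can be sketched in a few lines: the Executor touches the raw data deterministically, with zero LLM tokens, and hands only a fixed-size summary downstream, so the Synthesizer's prompt size does not grow with the dataset. The record fields, plan format, and stub functions here are invented for illustration.

```python
from statistics import mean

def reasoner(query: str) -> dict:
    # In the real system an LLM would parse intent; here a stub returns a plan.
    return {"metric": "citations", "op": "summary"}

def executor(plan: dict, records: list) -> dict:
    # Deterministic aggregation: output size is O(1) w.r.t. len(records).
    vals = [r[plan["metric"]] for r in records]
    return {"n": len(vals), "mean": mean(vals), "max": max(vals)}

def synthesizer(summary: dict) -> str:
    # An LLM would narrate; its prompt contains only the fixed-size summary.
    return f"{summary['n']} papers, mean citations {summary['mean']:.1f}."

records = [{"citations": c} for c in (3, 10, 29)]
print(synthesizer(executor(reasoner("how cited is this field?"), records)))
# -> "3 papers, mean citations 14.0."  Token cost stays constant as records grow.
```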
5.AI In Cybersecurity Education -- Scalable Agentic CTF Design Principles and Educational Outcomes
Large language models are rapidly changing how learners acquire and demonstrate cybersecurity skills. However, when human--AI collaboration is allowed, educators still lack validated competition designs and evaluation practices that remain fair and evidence-based. This paper presents a cross-regional study of LLM-centered Capture-the-Flag competitions built on the Cyber Security Awareness Week competition system. To understand how autonomy levels and participants' knowledge backgrounds influence problem-solving performance and learning-related behaviors, we formalize three autonomy levels: human-in-the-loop, autonomous agent frameworks, and hybrid. To enable verification, we require traceable submissions including conversation logs, agent trajectories, and agent code. We analyze multi-region competition data covering an in-class track, a ...
AI Computation & Hardware
1.Evaluating Prompting Strategies for Chart Question Answering with Large Language Models
arXiv:2603.22288v1 Announce Type: new Abstract: Prompting strategies affect LLM reasoning performance, but their role in chart-based QA remains underexplored. We present a systematic evaluation of four widely used prompting paradigms (Zero-Shot, Few-Shot, Zero-Shot Chain-of-Thought, and Few-Shot Chain-of-Thought) across GPT-3.5, GPT-4, and GPT-4o on the ChartQA dataset. Our framework operates exclusively on structured chart data, isolating prompt structure as the only experimental variable, and evaluates performance using two metrics: Accuracy and Exact Match. Results from 1,200 diverse ChartQA samples show that Few-Shot Chain-of-Thought prompting consistently yields the highest accuracy (up to 78.2%), particularly on reasoning-intensive questions, while Few-Shot prompting improves format adherence. Zero-Shot performs well only with hig...
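The four paradigms differ only in whether worked examples and a reasoning cue are included. A generic illustration of how such prompts are assembled; the prompt text is invented, not the authors' templates:

```python
def build_prompt(question: str, shots=None, cot: bool = False) -> str:
    """Assemble Zero/Few-Shot prompts, optionally with a Chain-of-Thought cue."""
    parts = []
    for q, a in (shots or []):  # worked examples -> Few-Shot variants
        prefix = "Let us think step by step. " if cot else ""
        parts.append(f"Q: {q}\nA: {prefix}{a}")
    suffix = "Let us think step by step." if cot else ""
    parts.append(f"Q: {question}\nA: {suffix}")
    return "\n\n".join(parts)

zero_shot = build_prompt("What is the peak value in the chart?")
few_shot_cot = build_prompt(
    "What is the peak value in the chart?",
    shots=[("Max of [2, 7, 5]?", "The largest is 7.")],
    cot=True,
)
```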
2.MERIT: Memory-Enhanced Retrieval for Interpretable Knowledge Tracing
arXiv:2603.22289v1 Announce Type: new Abstract: Knowledge Tracing (KT) models students' evolving knowledge states to predict future performance, serving as a foundation for personalized education. While traditional deep learning models achieve high accuracy, they often lack interpretability. Large Language Models (LLMs) offer strong reasoning capabilities but struggle with limited context windows and hallucinations. Furthermore, existing LLM-based methods typically require expensive fine-tuning, limiting scalability and adaptability to new data. We propose MERIT (Memory-Enhanced Retrieval for Interpretable Knowledge Tracing), a training-free framework combining frozen LLM reasoning with structured pedagogical memory. Rather than updating parameters, MERIT transforms raw interaction logs into an interpretable memory bank. The framework us...
3.Less is More: Adapting Text Embeddings for Low-Resource Languages with Small Scale Noisy Synthetic Data
arXiv:2603.22290v1 Announce Type: new Abstract: Low-resource languages (LRLs) often lack high-quality, large-scale datasets for training effective text embedding models, hindering their application in tasks like retrieval-augmented generation (RAG) and semantic search. In this work, we challenge the prevailing assumption that effective semantic alignment requires massive datasets or pristine, human-verified translations. Focusing on Armenian (an LRL with a unique script), we introduce a cost-effective adaptation strategy using small scale noisy synthetic data generated by translating English Reddit title-body pairs with open-weights models. We establish a comprehensive evaluation benchmark comprising existing datasets, translated data, and a manually curated dataset. Our experiments reveal a surprising "Less is More" phenomenon: fine-tun...
4.Evaluating Large Language Models' Responses to Sexual and Reproductive Health Queries in Nepali
arXiv:2603.22291v1 Announce Type: new Abstract: As Large Language Models (LLMs) become integrated into daily life, they are increasingly used for personal queries, including Sexual and Reproductive Health (SRH), allowing users to chat anonymously without fear of judgment. However, current evaluation methods primarily focus on accuracy, often for objective queries in high-resource languages, and lack criteria to assess usability and safety, especially for low-resource languages and culturally sensitive domains like SRH. This paper introduces the LLM Evaluation Framework (LEAF), which conducts assessments across multiple criteria: accuracy, language, usability gaps (including relevance, adequacy, and cultural appropriateness), and safety gaps (safety, sensitivity, and confidentiality). Using the LEAF framework, we assessed 14K SRH queries in Ne...
5.TIPS: Turn-Level Information-Potential Reward Shaping for Search-Augmented LLMs
arXiv:2603.22293v1 Announce Type: new Abstract: Search-augmented large language models (LLMs) trained with reinforcement learning (RL) have achieved strong results on open-domain question answering (QA), but training still remains a significant challenge. The optimization is often unstable due to sparse rewards and difficult credit assignment across reasoning and tool calls. To address this, we introduce Turn-Level Information Potential Reward Shaping (TIPS), a simple framework that assigns dense, turn-level rewards to each reasoning + tool-call segment based on the increased likelihood of the correct answer under a teacher model. By leveraging potential-based reward shaping, TIPS offers fine-grained and policy-invariant guidance that overcomes the limitations of outcome-only optimization. Evaluated on seven QA benchmarks, TIPS cons...
AI Machine Learning
1.Beyond Hard Constraints: Budget-Conditioned Reachability For Safe Offline Reinforcement Learning
arXiv:2603.22292v1 Announce Type: new Abstract: Sequential decision making using Markov Decision Processes underpins many real-world applications. Both model-based and model-free methods have achieved strong results in these settings. However, real-world tasks must balance reward maximization with safety constraints, often conflicting objectives that can lead to unstable min-max adversarial optimization. A promising alternative is safety reachability analysis, which precomputes a forward-invariant safe state-action set, ensuring that an agent starting inside this set remains safe indefinitely. Yet most reachability-based methods address only hard safety constraints, and little work extends reachability to cumulative cost constraints. To address this, we first define a safety-conditioned reachability set that decouples reward maximization...
2.Efficient Embedding-based Synthetic Data Generation for Complex Reasoning Tasks
arXiv:2603.22294v1 Announce Type: new Abstract: Synthetic Data Generation (SDG), leveraging Large Language Models (LLMs), has recently been recognized and broadly adopted as an effective approach to improve the performance of smaller but more resource- and compute-efficient LLMs through fine-tuning. A key challenge in SDG is ensuring the quality and diversity of the generated data. In this paper, we analyze the diversity and distribution of generated data in the embedding space, and demonstrate a strong correlation between the density of examples within a specific neighborhood and the accuracy of predictions on examples drawn from that region. Building on this insight, we present a targeted pipeline for embedding-based sampling that enhances data diversity and consistently improves performance across several benchmarks.
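One classic way to promote embedding-space diversity is farthest-point sampling; the paper's actual density-based pipeline is not spelled out in the abstract, so treat this purely as an illustration of the general idea, with toy 2-D "embeddings":

```python
import math

def farthest_point_sample(points: list, k: int) -> list:
    """Greedily pick k indices so chosen points spread out in embedding space."""
    chosen = [0]  # start from an arbitrary seed point
    while len(chosen) < k:
        # next point: the one whose nearest already-chosen neighbor is farthest
        nxt = max(range(len(points)),
                  key=lambda i: min(math.dist(points[i], points[j]) for j in chosen))
        chosen.append(nxt)
    return chosen

pts = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0)]
print(farthest_point_sample(pts, 3))   # -> [0, 4, 2]: skips near-duplicates
```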
3.Between the Layers Lies the Truth: Uncertainty Estimation in LLMs Using Intra-Layer Local Information Scores
arXiv:2603.22299v1 Announce Type: new Abstract: Large language models (LLMs) are often confidently wrong, making reliable uncertainty estimation (UE) essential. Output-based heuristics are cheap but brittle, while probing internal representations is effective yet high-dimensional and hard to transfer. We propose a compact, per-instance UE method that scores cross-layer agreement patterns in internal representations using a single forward pass. Across three models, our method matches probing in-distribution, with mean diagonal differences of at most $-1.8$ AUPRC percentage points and $+4.9$ Brier score points. Under cross-dataset transfer, it consistently outperforms probing, achieving off-diagonal gains up to $+2.86$ AUPRC and $+21.02$ Brier points. Under 4-bit weight-only quantization, it remains robust, improving over probing by $+1.94$...
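The abstract does not spell out the agreement score, so the following is only one plausible instantiation of the idea: average cosine similarity between hidden states of adjacent layers for a single instance, obtainable from one forward pass. The toy 2-D vectors stand in for real hidden states.

```python
import math

def cosine(a, b):
    """Cosine similarity of two 2-D toy 'hidden state' vectors."""
    return (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))

def agreement_score(layer_states: list) -> float:
    """Mean adjacent-layer cosine agreement; higher suggests more confidence."""
    pairs = zip(layer_states, layer_states[1:])
    return sum(cosine(a, b) for a, b in pairs) / (len(layer_states) - 1)

stable = [(1.0, 0.1), (0.9, 0.2), (1.1, 0.15)]    # layers agree -> high score
unstable = [(1.0, 0.1), (-0.3, 1.0), (0.8, -0.9)]  # layers disagree -> low score
assert agreement_score(stable) > agreement_score(unstable)
```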
4.Scaling Attention via Feature Sparsity
arXiv:2603.22300v1 Announce Type: new Abstract: Scaling Transformers to ultra-long contexts is bottlenecked by the $O(n^2 d)$ cost of self-attention. Existing methods reduce this cost along the sequence axis through local windows, kernel approximations, or token-level sparsity, but these approaches consistently degrade accuracy. In this paper, we instead explore an orthogonal axis: feature sparsity. We propose Sparse Feature Attention (SFA), where queries and keys are represented as $k$-sparse codes that preserve high-dimensional expressivity while reducing the cost of attention from $\Theta(n^2 d)$ to $\Theta(n^2 k^2/d)$. To make this efficient at scale, we introduce FlashSFA, an IO-aware kernel that extends FlashAttention to operate directly on sparse overlaps without materializing dense score matrices. Across GPT-2 and Qwen3 pretrainin...
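The cost intuition behind the quoted Theta(n^2 k^2 / d) bound is that a q·k score over k-sparse codes only needs work proportional to the overlap of the two nonzero index sets. An illustrative sketch only; the real FlashSFA kernel is IO-aware GPU code, not a Python loop:

```python
def sparse_dot(q: dict, k: dict) -> float:
    """q, k map nonzero feature index -> value, with only k_sparse << d entries."""
    small, big = (q, k) if len(q) <= len(k) else (k, q)
    # Only shared indices contribute; disjoint features cost nothing.
    return sum(v * big[i] for i, v in small.items() if i in big)

q = {3: 1.0, 17: 2.0, 64: -1.0}   # k-sparse code; d could be in the thousands
k = {3: 0.5, 99: 4.0, 64: 2.0}
print(sparse_dot(q, k))           # 1.0*0.5 + (-1.0)*2.0 = -1.5
```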
5.Latent Semantic Manifolds in Large Language Models
arXiv:2603.22301v1 Announce Type: new Abstract: Large Language Models (LLMs) perform internal computations in continuous vector spaces yet produce discrete tokens -- a fundamental mismatch whose geometric consequences remain poorly understood. We develop a mathematical framework that interprets LLM hidden states as points on a latent semantic manifold: a Riemannian submanifold equipped with the Fisher information metric, where tokens correspond to Voronoi regions partitioning the manifold. We define the expressibility gap, a geometric measure of the semantic distortion from vocabulary discretization, and prove two theorems: a rate-distortion lower bound on distortion for any finite vocabulary, and a linear volume scaling law for the expressibility gap via the coarea formula. We validate these predictions across six transformer architectur...
AI Robotics
1.CaP-X: A Framework for Benchmarking and Improving Coding Agents for Robot Manipulation
arXiv:2603.22435v1 Announce Type: new Abstract: "Code-as-Policy" considers how executable code can complement data-intensive Vision-Language-Action (VLA) methods, yet the effectiveness of such agents as autonomous controllers for embodied manipulation remains underexplored. We present CaP-X, an open-access framework for systematically studying Code-as-Policy agents in robot manipulation. At its core is CaP-Gym, an interactive environment in which agents control robots by synthesizing and executing programs that compose perception and control primitives. Building on this foundation, CaP-Bench evaluates frontier language and vision-language models across varying levels of abstraction, interaction, and perceptual grounding. Across 12 models, CaP-Bench reveals a consistent trend: performance improves with human-crafted abstractions but degrades as these p...
2.Wake Up to the Past: Using Memory to Model Fluid Wake Effects on Robots
arXiv:2603.22472v1 Announce Type: new Abstract: Autonomous aerial and aquatic robots that attain mobility by perturbing their medium, such as multicopters and torpedoes, produce wake effects that act as disturbances for adjacent robots. Wake effects are hard to model and predict due to the chaotic spatio-temporal dynamics of the fluid, entangled with the physical geometry of the robots and their complex motion patterns. Data-driven approaches using neural networks typically learn a memory-less function that maps the current states of the two robots to a force observed by the "sufferer" robot. Such models often perform poorly in agile scenarios: since the wake effect has a finite propagation time, the disturbance observed by a sufferer robot is some function of relative states in the past. In this work, we present an empirical study of the...
3.MapForest: A Modular Field Robotics System for Forest Mapping and Invasive Species Localization
arXiv:2603.22502v1 Announce Type: new Abstract: Monitoring and controlling invasive tree species across large forests, parks, and trail networks is challenging due to limited accessibility, reliance on manual scouting, and degraded under-canopy GNSS. We present MapForest, a modular field robotics system that transforms multi-modal sensor data into GIS-ready invasive-species maps. Our system features: (i) a compact, platform-agnostic sensing payload that can be rapidly mounted on UAV, bicycle, or backpack platforms, and (ii) a software pipeline comprising LiDAR-inertial mapping, image-based invasive-species detection, and georeferenced map generation. To ensure reliable operation in GNSS-intermittent environments, we enhance a LiDAR-inertial mapping backbone with covariance-aware GNSS factors and robust loss kernels. We train an object det...
4.Energy-Aware Collaborative Exploration for a UAV-UGV Team
arXiv:2603.22507v1 Announce Type: new Abstract: We present an energy-aware collaborative exploration framework for a UAV-UGV team operating in unknown environments, where the UAV's energy constraint is modeled as a maximum flight-time limit. The UAV executes a sequence of energy-bounded exploration tours, while the UGV simultaneously explores on the ground and serves as a mobile charging station. Rendezvous is enforced under a shared time budget so that the vehicles meet at the end of each tour before the UAV reaches its flight-time limit. We construct a sparsely coupled air-ground roadmap using a density-aware layered probabilistic roadmap (PRM) and formulate tour selection over the roadmap as coupled orienteering problems (OPs) to maximize information gain subject to the rendezvous constraint. The resulting tours are constructed over co...
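The budgeted-selection flavor of the orienteering problems (OPs) mentioned above can be shown with a toy greedy heuristic: add frontier nodes by information gain per unit travel time until the shared time budget is exhausted. The real planner solves coupled OPs over a PRM roadmap; node names, gains, and times here are invented.

```python
def greedy_tour(frontiers: list, budget: float):
    """frontiers: list of (name, info_gain, travel_time). Greedy by gain/time."""
    tour, used = [], 0.0
    for name, gain, t in sorted(frontiers, key=lambda f: f[1] / f[2], reverse=True):
        if used + t <= budget:   # respect the flight-time / rendezvous budget
            tour.append(name)
            used += t
    return tour, used

frontiers = [("A", 8.0, 2.0), ("B", 5.0, 1.0), ("C", 9.0, 6.0), ("D", 2.0, 2.0)]
print(greedy_tour(frontiers, budget=5.0))   # C is high-gain but too expensive
```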
5.Parallel OctoMapping: A Scalable Framework for Enhanced Path Planning in Autonomous Navigation
arXiv:2603.22508v1 Announce Type: new Abstract: Mapping is essential in robotics and autonomous systems because it provides the spatial foundation for path planning. Efficient mapping enables planning algorithms to generate reliable paths while ensuring safety and adapting in real time to complex environments. Fixed-resolution mapping methods often produce overly conservative obstacle representations that lead to suboptimal paths or planning failures in cluttered scenes. To address this issue, we introduce Parallel OctoMapping (POMP), an efficient OctoMap-based mapping technique that maximizes available free space and supports multi-threaded computation. To the best of our knowledge, POMP is the first method that, at a fixed occupancy-grid resolution, refines the representation of free space while preserving map fidelity and compatibility...
Financial AI
1.High-Resolution Tensor-Network Fourier Methods for Exponentially Compressed Non-Gaussian Aggregate Distributions
Characteristic functions of weighted sums of independent random variables exhibit low-rank structure in the quantized tensor train (QTT) representation, also known as matrix product states (MPS), enabling up to exponential compression of their fully non-Gaussian probability distributions. Under variable independence, the global characteristic function factorizes into local terms. Its low-rank QTT structure arises from intrinsic spectral smoothness in continuous models, or from spectral energy concentration as the number of components $D$ grows in discrete models. We demonstrate this on weighted sums of Bernoulli and lognormal random variables. In the former, despite an adversarial, incompressible small-$D$ regime, the characteristic function undergoes a sharp bond-dimension collapse for $D \gtrsim 300$ components, enabling polylogarithmic...
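The independence-based factorization the abstract relies on is the standard characteristic-function identity for a weighted sum of independent random variables:

```latex
\varphi_S(t) \;=\; \mathbb{E}\!\left[e^{\mathrm{i} t S}\right]
            \;=\; \prod_{j=1}^{D} \varphi_{X_j}(w_j t),
\qquad S = \sum_{j=1}^{D} w_j X_j,\quad X_j \text{ independent.}
```

It is this product of local terms that admits the low-rank QTT representation described in the abstract.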
2.Conditionally Identifiable Latent Representation for Multivariate Time Series with Structural Dynamics
We propose the Identifiable Variational Dynamic Factor Model (iVDFM), which learns latent factors from multivariate time series with identifiability guarantees. By applying iVAE-style conditioning to the innovation process driving the dynamics rather than to the latent states, we show that factors are identifiable up to permutation and component-wise affine (or monotone invertible) transformations. Linear diagonal dynamics preserve this identifiability and admit scalable computation via companion-matrix and Krylov methods. We demonstrate improved factor recovery on synthetic data, stable intervention accuracy on synthetic SCMs, and competitive probabilistic forecasting on real-world benchmarks.
3.FinRL-X: An AI-Native Modular Infrastructure for Quantitative Trading
We present FinRL-X, a modular and deployment-consistent trading architecture that unifies data processing, strategy construction, backtesting, and broker execution under a weight-centric interface. While existing open-source platforms are often backtesting- or model-centric, they rarely provide system-level consistency between research evaluation and live deployment. FinRL-X addresses this gap through a composable strategy pipeline that integrates stock selection, portfolio allocation, timing, and portfolio-level risk overlays within a unified protocol. The framework supports both rule-based and AI-driven components, including reinforcement learning allocators and LLM-based sentiment signals, without altering downstream execution semantics. FinRL-X provides an extensible foundation for reproducible, end-to-end quantitative trading researc...
4.Generative Diffusion Model for Risk-Neutral Derivative Pricing
Denoising diffusion probabilistic models (DDPMs) have emerged as powerful generative models for complex distributions, yet their use in arbitrage-free derivative pricing remains largely unexplored. Financial asset prices are naturally modeled by stochastic differential equations (SDEs), whose forward and reverse density evolution closely parallels the forward noising and reverse denoising structure of diffusion models. In this paper, we develop a framework for using DDPMs to generate risk-neutral asset price dynamics for derivative valuation. Starting from log-return dynamics under the physical measure, we analyze the associated forward diffusion and derive the reverse-time SDE. We show that the change of measure from the physical to the risk-neutral measure induces an additive shift in the score function, which translates into a closed...
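For reference, the reverse-time SDE the abstract derives a version of is the standard result (Anderson, 1982): for a forward diffusion $\mathrm{d}X_t = f(X_t,t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t$, the reverse process satisfies

```latex
\mathrm{d}X_t \;=\; \Big[\, f(X_t, t) \;-\; g(t)^2 \,\nabla_x \log p_t(X_t) \Big]\,\mathrm{d}t
\;+\; g(t)\, \mathrm{d}\bar{W}_t ,
```

where $\bar{W}_t$ is a reverse-time Brownian motion and $\nabla_x \log p_t$ is the score; the paper's change from the physical to the risk-neutral measure then shifts this score term additively.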
5.Adaptive Regime-Aware Stock Price Prediction Using Autoencoder-Gated Dual Node Transformers with Reinforcement Learning Control
Stock markets exhibit regime-dependent behavior where prediction models optimized for stable conditions often fail during volatile periods. Existing approaches typically treat all market states uniformly or require manual regime labeling, which is expensive and quickly becomes stale as market dynamics evolve. This paper introduces a prediction framework that adaptively identifies deviations from normal market conditions and routes data through specialized prediction pathways. The architecture consists of three components: (1) an autoencoder trained on normal market conditions that identifies anomalous regimes through reconstruction error, (2) dual node transformer networks specialized for stable and event-driven market conditions respectively, and (3) a Soft Actor-Critic reinforcement learning controller that adaptively tunes th...
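The routing idea in component (1) can be sketched in a few lines: a model trained on "normal" data flags regimes by reconstruction error and routes them to a specialized pathway. The toy "autoencoder" and threshold below are stand-ins, not the paper's trained networks.

```python
def reconstruction_error(x: list, encode_decode) -> float:
    """Mean squared error between an input and its reconstruction."""
    x_hat = encode_decode(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def route(x: list, encode_decode, threshold: float = 0.5) -> str:
    """High reconstruction error -> anomalous regime -> specialized pathway."""
    err = reconstruction_error(x, encode_decode)
    return "event_driven_model" if err > threshold else "stable_model"

def toy_autoencoder(x):
    # Stand-in: reproduces small (normal) inputs well but clips large spikes,
    # mimicking a model trained only on normal market conditions.
    return [max(-1.0, min(1.0, v)) for v in x]

print(route([0.1, -0.2, 0.3], toy_autoencoder))   # stable_model
print(route([3.0, -4.0, 2.5], toy_autoencoder))   # event_driven_model
```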
GSMA Newsroom
1.Mobile Money accounted for $2 trillion in transactions in 2025, doubling since 2021 as active accounts continue to grow
Summary available at source link.
2.Strengthening the Global Fight Against Fraud and Scams – Takeaways from the Global Fraud Summit in Vienna
Summary available at source link.
3.GSMA MWC26 Barcelona closes 20th anniversary edition
Summary available at source link.
4.From Ambition to Execution: How Open Gateway Is Scaling the Global API Economy
Summary available at source link.
5.Pioneering Affordable Access in Africa: GSMA and Handset Affordability Coalition Members Identify Six African Countries to Pilot Affordable $40 Smartphones
Summary available at source link.
Generative AI (arXiv)
1.SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning
Agentic multimodal large language models (MLLMs) (e.g., OpenAI o3 and Gemini Agentic Vision) achieve remarkable reasoning capabilities through iterative visual tool invocation. However, the cascaded perception, reasoning, and tool-calling loops introduce significant sequential overhead. This overhead, termed agentic depth, incurs prohibitive latency and seriously limits system-level concurrency. To address this, we propose SpecEyes, an agentic-level speculative acceleration framework that breaks this sequential bottleneck. Our key insight is that a lightweight, tool-free MLLM can serve as a speculative planner to predict the execution trajectory, enabling early termination of expensive tool chains without sacrificing accuracy. To regulate this speculative planning, we introduce a cognitive gating mechanism based on answer separability, which ...
2.UniFunc3D: Unified Active Spatial-Temporal Grounding for 3D Functionality Segmentation
Functionality segmentation in 3D scenes requires an agent to ground implicit natural-language instructions into precise masks of fine-grained interactive elements. Existing methods rely on fragmented pipelines that suffer from visual blindness during initial task parsing. We observe that these methods are limited by single-scale, passive and heuristic frame selection. We present UniFunc3D, a unified and training-free framework that treats the multimodal large language model as an active observer. By consolidating semantic, temporal, and spatial reasoning into a single forward pass, UniFunc3D performs joint reasoning to ground task decomposition in direct visual evidence. Our approach introduces active spatial-temporal grounding with a coarse-to-fine strategy. This allows the model to select correct video frames adaptively and focus on hig...
3.ConceptCoder: Improve Code Reasoning via Concept Learning
Large language models (LLMs) have shown promising results for software engineering applications, but still struggle with code reasoning tasks such as vulnerability detection (VD). We introduce ConceptCoder, a fine-tuning method that simulates human code inspection: models are trained to first recognize code concepts and then perform reasoning on top of these concepts. In prior work, concepts are extracted by multimodal models or LLMs to explain vision and natural language models. Our work is the first to formulate concepts for code. We define code concepts as human-understandable semantic properties of code and train models to learn such concepts. Our evaluation shows that this approach significantly improves VD accuracy, from 66.32 to 72.15 F1 on average over 9 open-source LLMs. ConceptCoder achieves the best VD performance compared to s...
4.3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Understanding
While multi-modality large language models excel in object-centric or indoor scenarios, scaling them to 3D city-scale environments remains a formidable challenge. To bridge this gap, we propose 3DCity-LLM, a unified framework designed for 3D city-scale vision-language perception and understanding. 3DCity-LLM employs a coarse-to-fine feature encoding strategy comprising three parallel branches for target object, inter-object relationship, and global scene. To facilitate large-scale training, we introduce the 3DCity-LLM-1.2M dataset, which comprises approximately 1.2 million high-quality samples across seven representative task categories, ranging from fine-grained object analysis to multi-faceted scene planning. This strictly quality-controlled dataset integrates explicit 3D numerical information and diverse user-oriented simulations, enriching ...
5.Evaluating LLM-Based Test Generation Under Software Evolution
Large Language Models (LLMs) are increasingly used for automated unit test generation. However, it remains unclear whether these tests reflect genuine reasoning about program behavior or simply reproduce superficial patterns learned during training. If the latter dominates, LLM-generated tests may exhibit weaknesses such as reduced coverage, missed regressions, and undetected faults. Understanding how LLMs generate tests and how those tests respond to code evolution is therefore essential. We present a large-scale empirical study of LLM-based test generation under program changes. Using an automated mutation-driven framework, we analyze how generated tests react to semantic-altering changes (SAC) and semantic-preserving changes (SPC) across eight LLMs and 22,374 program variants. LLMs achieve strong baseline results, reaching 79% line cov...
Hugging Face Daily Papers
1.UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation
Unified models capable of interleaved generation have emerged as a promising paradigm, with the community increasingly converging on autoregressive modeling for text and flow matching for image generation. To advance this direction, we propose a unified reinforcement learning framework tailored for interleaved generation. We validate our approach on its fundamental unit: a single round of reasoning-driven image generation, where the model first expands the user prompt through reasoning, followed by image synthesis. Formulating this multimodal generation process as a Markov Decision Process with sparse terminal rewards, we introduce UniGRPO to jointly optimize text and image generation policies using GRPO. Adopting a minimalist methodology to avoid over-design, we leverage established training recipes for both modalities by seamlessly inte...
2.DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models
Optical flow models trained on high-quality data often degrade severely when confronted with real-world corruptions such as blur, noise, and compression artifacts. To overcome this limitation, we formulate Degradation-Aware Optical Flow, a new task targeting accurate dense correspondence estimation from real-world corrupted videos. Our key insight is that the intermediate representations of image restoration diffusion models are inherently corruption-aware but lack temporal awareness. To address this limitation, we lift the model to attend across adjacent frames via full spatio-temporal attention, and empirically demonstrate that the resulting features exhibit zero-shot correspondence capabilities. Based on this finding, we present DA-Flow, a hybrid architecture that fuses these diffusion features with convolutional features within an ite...
3.ReqFusion: A Multi-Provider Framework for Automated PEGS Analysis Across Software Domains
Requirements engineering is a vital, yet labor-intensive, stage in the software development process. This article introduces ReqFusion: an AI-enhanced system that automates the extraction, classification, and analysis of software requirements utilizing multiple Large Language Model (LLM) providers. The architecture of ReqFusion integrates OpenAI GPT, Anthropic Claude, and Groq models to extract functional and non-functional requirements from various documentation formats (PDF, DOCX, and PPTX) in academic, industrial, and tender proposal contexts. The system uses a domain-independent extraction method and generates requirements following the Project, Environment, Goal, and System (PEGS) approach introduced by Bertrand Meyer. The main idea is that, because the PEGS format is detailed, LLMs have more information and cues about the requiremen...
4.3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Understanding
While multi-modality large language models excel in object-centric or indoor scenarios, scaling them to 3D city-scale environments remains a formidable challenge. To bridge this gap, we propose 3DCity-LLM, a unified framework designed for 3D city-scale vision-language perception and understanding. 3DCity-LLM employs a coarse-to-fine feature encoding strategy comprising three parallel branches for target object, inter-object relationship, and global scene. To facilitate large-scale training, we introduce 3DCity-LLM-1.2M dataset that comprises approximately 1.2 million high-quality samples across seven representative task categories, ranging from fine-grained object analysis to multi-faceted scene planning. This strictly quality-controlled dataset integrates explicit 3D numerical information and diverse user-oriented simulations, enriching ...
5.Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning
Machine learning models often need to adapt to new data after deployment due to structured or unstructured real-world dynamics. The Continual Learning (CL) framework enables continuous model adaptation, but most existing approaches either assume each task contains sufficiently many data samples or that the learning tasks are non-overlapping. In this paper, we address the more general setting where each task may have a limited dataset, and tasks may overlap in an arbitrary manner without a priori knowledge. This general setting is substantially more challenging for two reasons. On the one hand, data scarcity necessitates effective contextualization of general knowledge and efficient knowledge transfer across tasks. On the other hand, unstructured task overlapping can easily result in negative knowledge transfer. To address the above challe...
IEEE Xplore AI
1.These AI Workstations Look Like PCs, but Pack a Stronger Punch
The rise of generative AI has spurred demand for AI workstations that can run or train models on local hardware. Yet modern PCs have proven inadequate for this task. A typical laptop only has enough memory to load a large language model (LLM) with 8 to 13 billion parameters—many times smaller, and much less intelligent, than frontier models that are presumed to have over a trillion parameters. Even the most capable workstation PCs struggle to serve LLMs with more than 70 billion parameters. Tenstorrent’s QuietBox 2 is an attempt to fill that gap. Though it looks like a PC workstation, the QuietBox 2 contains four of the company’s custom Blackhole AI accelerators, 128 gigabytes of GDDR6 memory—specialized memory used in GPUs—and 256 GB of DDR5 system memory (for a total of 384 GB). This configuration provides enough memory to load OpenAI’...
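The memory figures above follow from simple arithmetic (assuming FP16 weights at 2 bytes per parameter; quantization changes these numbers, and the 120B size below is purely illustrative):

```python
# Back-of-envelope weight footprint for an LLM, assuming 2 bytes/parameter.
def weight_gb(params_billion, bytes_per_param=2):
    return params_billion * 1e9 * bytes_per_param / 1e9  # gigabytes

print(weight_gb(13))   # 26.0 GB: already at the edge of a typical laptop
print(weight_gb(70))   # 140.0 GB: beyond most single-workstation GPUs
print(weight_gb(120))  # 240.0 GB: fits within the QuietBox 2's 384 GB pool
```

Activations, KV cache, and batch size add further overhead, which is why usable model sizes sit below the raw weight arithmetic.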
2.The Coming Drone-War Inflection in Ukraine
WHEN KYIV-BORN ENGINEER Yaroslav Azhnyuk thinks about the future, his mind conjures up dystopian images. He talks about “swarms of autonomous drones carrying other autonomous drones to protect them against autonomous drones, which are trying to intercept them, controlled by AI agents overseen by a human general somewhere.” He also imagines flotillas of autonomous submarines, each carrying hundreds of drones, suddenly emerging off the coast of California or Great Britain and discharging their cargoes en masse to the sky. “How do you protect from that?” he asks as we speak in late December 2025; me at my quiet home office in London, he in Kyiv, which is bracing for another wave of missile attacks. Azhnyuk is not an alarmist. He cofounded and was formerly CEO of Petcube, a California-based company that uses smart cameras and an app to let ...
3.Transforming Data Science With NVIDIA RTX PRO 6000 Blackwell Workstation Edition
This is a sponsored article brought to you by PNY Technologies. In today’s data-driven world, data scientists face mounting challenges in preparing, scaling, and processing massive datasets. Traditional CPU-based systems are no longer sufficient to meet the demands of modern AI and analytics workflows. NVIDIA RTX PRO™ 6000 Blackwell Workstation Edition offers a transformative solution, delivering accelerated computing performance and seamless integration into enterprise environments. Key Challenges for Data Science Data Preparation: Data preparation is a complex, time-consuming process that takes most of a data scientist’s time. Scaling: The volume of data is growing at a rapid pace. Data scientists may resort to downsampling to make large datasets more manageable, leading to suboptimal results. Hardware: Demand for accelerated AI...
4.Why Thermal Metrology Must Evolve for Next-Generation Semiconductors
An in-depth examination of how rising power density, 3D integration, and novel materials are outpacing legacy thermal measurement — and what advanced metrology must deliver. What Attendees will Learn Why heat is now the dominant constraint on semiconductor scaling — Explore how heterogeneous integration, 3D stacking, and AI-driven power density have shifted the primary bottleneck from lithography to thermal management, with heat flux projections exceeding 1,000 W/cm² for next-generation accelerators. How extreme material properties are redefining thermal design requirements — Understand the measurement challenges posed by nanoscale thin films where bulk assumptions fail, engineered ultra-high-conductivity materials (diamond, BAs, BNNTs), and devices operating above 200 °C in wide-bandgap systems. Why interfaces and buried layers now gover...
5.What Happens If AI Makes Things Too Easy for Us?
Most people who regularly use AI tools would say they’re making their lives easier. The technology promises to streamline and take over tasks both professionally and personally—whether that’s summarizing documents, drafting deliverables, generating code, or even offering emotional support. But researchers are concerned AI is making some tasks too easy, and that this will come with unexpected costs. In a commentary titled Against Frictionless AI, published in Communications Psychology on 24 February, psychologists from the University of Toronto discuss what might be lost when AI removes too much effort from human activities. Their argument centers on the idea that friction—difficulty, struggle, and even discomfort—plays an important role in learning, motivation, and meaning. Psychological research has long shown that effortful engagement ...
MIT Sloan Management
1.An AI Reckoning for HR: Transform or Fade Away
Carolyn Geason-Beissel/MIT SMR | Getty Images For decades, human resource leaders have talked about the need to shift their focus from having responsibility for compliance to acting as architects of talent strategy. And for decades, the pattern of HR being stuck in age-old roles has persisted. But there is new pressure to redefine the role. […]
2.Shifting AI From Fear to Optimism: U.S. Department of Labor’s Taylor Stockton
In this episode of the Me, Myself, and AI podcast, host Sam Ransbotham speaks with Taylor Stockton, chief innovation officer at the U.S. Department of Labor, about how artificial intelligence is reshaping the workforce. Taylor emphasizes that AI is having an economywide impact, transforming tasks within nearly every job rather than affecting only certain industries […]
3.Why Leaders Lose the Room in High-Stakes Meetings
Carolyn Geason-Beissel/MIT SMR | Getty Images Most advice about leadership communication focuses on presentation skills: Be concise, be clear, tell better stories. But the most consequential leadership communication happens in meetings where tough issues are being discussed and real decisions are being made. Even some of the most skilled leaders find themselves in moments where […]
4.How Goldman Sachs Stays Agile: HR Leader Jacqueline Arthur
Aleksandar Savic After World War II, Goldman Sachs ranked 10th among the top 30 U.S. investment banks. Twenty-seven of those once-mighty Wall Street rivals, including Salomon, Lehman, and First Boston, have been relegated to the annals of business history. Goldman, in contrast, is a global powerhouse, employing more than 46,000 people, operating in more than […]
5.Retro-Innovation: How Smart Companies Profit From the Past
AI may be today’s hot topic, but there’s a robust market for old-fashioned products. Board games, vinyl records, and even 1990s-style video game consoles are making a comeback, especially with Generation Z. What does this mean for teams building modern products? In this video, MIT Sloan Management Review senior features editor Kaushik Viswanath explains “retro-innovation” […]
NBER Working Papers
1.Medicaid Coverage for Obesity Medications: Utilization and Net-of-Rebate Spending -- by Coady Wing, Wei-Lun Lo, Maddie Potter, Tarik Yuce, Alberto Ortega, John Cawley, Thuy D. Nguyen, Kosali I. Simon
We document state variation in Medicaid coverage for obesity-indicated GLP-1 medications over time, and use a stacked difference-in-differences design to estimate the effects of coverage on utilization and net-of-rebate spending. Nine quarters out, coverage increases prescriptions for obesity-indicated GLP-1 medications by 0.82 per 100 enrollee-months (SE = 0.10). Coverage had no effect on GLP-1 prescribing for diabetes or cardiovascular indications, suggesting that off-label prescribing of diabetes formulations for obesity is not very common in the Medicaid program. The expansions do not appear to affect consumer spending at major online GLP-1 compounding firms, which suggests that the utilization response in our main analysis reflects new utilization rather than crowd-out. We find that coverage increases net-of-rebate Medicaid spending ...
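The core contrast behind a difference-in-differences design (each "stack" in the paper's stacked design repeats it around a coverage event) can be written in one line. All numbers below are invented to mirror the headline 0.82 estimate, not actual data:

```python
# One 2x2 cell of a difference-in-differences design: the effect is the change
# in covering states minus the change in non-covering states.
# Values are hypothetical prescriptions per 100 enrollee-months.
def did(treat_pre, treat_post, ctrl_pre, ctrl_post):
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

effect = did(treat_pre=0.10, treat_post=0.95, ctrl_pre=0.11, ctrl_post=0.14)
print(round(effect, 2))  # 0.82: matches the paper's estimate by construction
```

Subtracting the comparison-state trend is what lets the design attribute the remaining change to the coverage expansion rather than to nationwide GLP-1 demand growth.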
2.Reserve Demand Estimation with Minimal Theory -- by Ricardo Lagos, Gastón Navarro
We propose a new reserve-demand estimation strategy---a middle ground between atheoretical reduced-form econometric approaches and fully structural quantitative-theoretic approaches. The strategy consists of an econometric specification that satisfies core restrictions implied by theory and controls for changes in administered-rate spreads that induce rotations and shifts in reserve demand. The resulting approach is as user-friendly as existing reduced-form econometric methods but improves upon them by incorporating a minimal set of theoretical restrictions that any reserve demand must satisfy. We apply this approach to U.S. data and obtain reserve-demand estimates that are broadly consistent with the structural estimates.
3.Identifying Uncertainty, Learning about Productivity, and Human Capital Acquisition: A Reassessment of Labor Market Sorting and Firm Monopsony Power -- by Cristina Gualdani, Elena Pastorino, Áureo de Paula, Sergio Salgado
We examine the empirical content of a large class of dynamic matching models of the labor market with ex-ante heterogeneous firms and workers, symmetric uncertainty and learning about workers’ productivity, and firms’ monopsony power. We allow workers’ human capital, acquired before and after entry into the labor market, to be general across firms to varying degrees. Such a framework nests and extends known models of worker turnover across firms, occupational choice, wage growth, wage differentials across occupations, firms, and industries, and wage dispersion across workers and over the life cycle. We establish intuitive conditions under which the model primitives are semiparametrically identified solely from data on workers’ wages and jobs, despite the dynamics of these models giving rise to complex patterns of selection based on endoge...
4.Financial Conditions Targeting in a Multi-Asset Open Economy -- by Ricardo J. Caballero, Alp Simsek
We analyze monetary policy responses to noisy financial conditions in an open economy where exchange rates and domestic asset prices affect aggregate demand. Noise traders operate in both markets, and specialized arbitrageurs have limited risk-bearing capacity. Monetary policy creates cross-market spillovers: by adjusting the interest rate to stabilize one market, the central bank influences volatility in the other. We show that targeting a financial conditions index (FCI)—a weighted average of exchange rates and domestic asset prices—delivers substantial macroeconomic benefits. FCI targeting commits the central bank to respond to unexpected movements in financial conditions beyond what discretionary monetary policy implies. These stronger responses improve diversification across markets: each market becomes more exposed to external shock...
5.Standardized Test Scores and Academic Performance at a Public University System -- by Theodore J. Joyce, Mina Afrouzi Khosroshahi, Sarah Truelsch, Kerstin Gentsch, Kyle Du
Recent studies of Ivy-Plus institutions suggest that standardized test scores (SAT/ACT) are far better predictors of college success than high school grade point average (HS-GPA), prompting a return to the requirement that test scores be submitted for admission at elite colleges. We ask whether re-establishing the SAT requirement for admission at a large urban public university system would improve the predictability of academic outcomes. Using administrative data for the 2010-2019 first-year cohorts, we update earlier work on public-university students regarding the relative predictive power of HS-GPA and SAT scores on first-year outcomes and graduation rates. Contrary to findings at elite private institutions, we find that HS-GPA is the dominant predictor of academic success in this public system. A one-standard-deviation increase in H...
NY Fed - Liberty Street
1.Sports Betting Is Everywhere, Especially on Credit Reports
Since 2018, more than thirty states have legalized mobile sports betting, leading to more than a half trillion dollars in wagers. In our recent Staff Report, we examine how legalized sports betting affects household financial health by comparing betting activity and consumer credit outcomes between states that legalized and those that have not. We find that legalization increases spending at online sportsbooks roughly tenfold, but betting does not stop at state boundaries. Nearby areas where betting is not legal still see roughly 15 percent of the increase observed in counties where it is legal. At the same time, consumer financial health suffers. Our analysis finds rising delinquencies in participating states,...
2.China’s Electric Trade
China has spent considerable government resources to develop advanced electric technology industries, such as those that produce electric vehicles, lithium batteries, and solar panels. These efforts have spilled over to international trade as improvements in price and quality have increased the global demand for these goods. One consequence is that passenger cars and batteries have been disproportionately large contributors to the rise in the country’s trade surplus in recent years. This has not been the case, though, for solar panels, as falling prices due to a supply glut pulled down export revenues despite higher volumes.
3.The New York Fed DSGE Model Forecast—March 2026
This post presents an update of the economic forecasts generated by the Federal Reserve Bank of New York’s dynamic stochastic general equilibrium (DSGE) model. We describe very briefly our forecast and its change since December 2025. To summarize, growth in 2026 is expected to be more robust, and inflation more persistent, than predicted in December. Stronger investment is the main driver for higher growth, while cost-push shocks, possibly capturing the effects of tariffs, are the key factors behind higher inflation. Projections for the short-run real natural rate of interest (r*) are the same as in December.
4.Firms’ Inflation Expectations Return to 2024 Levels
Businesses experienced substantial cost pressures in 2025 as the cost of insurance and utilities rose sharply, while an increase in tariffs contributed to rising goods and materials costs. This post examines how firms in the New York-Northern New Jersey region adjusted their prices in response to these cost pressures and describes their expectations for future price increases and inflation. Survey results show an acceleration in firms’ price increases in 2025, with an especially sharp increase in the manufacturing sector. While both cost and price increases intensified last year, our surveys re...
5.Are Rising Employee Health Insurance Costs Dampening Wage Growth?
Employer-sponsored health insurance represents a substantial component of total compensation paid by firms to many workers in the United States. Such costs have climbed by close to 20 percent over the past five years. Indeed, the average annual premium for employer-sponsored family health insurance coverage was about $27,000 in 2025—roughly equivalent to the wage of a full-time worker paid $15 per hour. Our February regional business surveys asked firms whether their wage setting decisions were influenced by the rising cost of employee health insurance. As we showed in our
Project Syndicate
1.Donald Trump’s Suez Moment
In 1956, British and French leaders launched an operation to overthrow a hostile government in Egypt and restore their own countries' global preeminence, only to suffer a humiliating defeat that would leave them weakened and dependent on the United States. Is the US president dragging his country down a similar path in Iran?
2.US-Style Health Care Is Wrong for the UK
When it comes to health-care systems, America serves as a cautionary tale, not as an instruction manual. If the British government moves toward a US-style system by subsidizing private insurance, as Reform UK leader Nigel Farage has proposed, its health outcomes will suffer markedly.
3.America’s War, America’s Recession
The precise magnitude of the shock the US economy will face as a result of its war of choice in Iran is difficult to predict, given the array of factors at play. But the US is poorly equipped to weather the likely effects on inflation and interest rates, and financial-sector fragility is a key reason why.
4.This Energy Shock Demands a Green Industrial Strategy
In an era of geopolitical tumult, economic resilience requires changing not only the kinds of energy we consume, but also how, where, and by whom things are produced. Fortunately, through mission-oriented green industrial strategies, governments can help secure living standards and build economic resilience simultaneously.
5.US Institutional Decay Is Threatening Global Finance
Even through tumultuous political cycles, the Federal Reserve, the Securities and Exchange Commission, and the Federal Trade Commission have been able to signal that US markets operate under clear, impersonal, reliably enforced rules. But this is no longer true, and the implications for the rest of the world are dire.
RCR Wireless
1.A face-off in space
A new FCC filing from Bezos-owned Blue Origin just heated up competition between the two wealthiest men on Earth In sum — what to know: Regulatory chess: In what feels like a chessboard move, Blue Origin requested FCC approval for 51,000 satellites to build an orbital data center. Blue Origin’s objection: The application came just […]
2.China Unicom cuts capex, boosts AI infra spending
China Unicom is also expanding its AI infrastructure, with more than 1.1 million data center cabinets and computing capacity reaching 45 EFLOPS In sum – what to know: AI spending rises – Over 35% of 2026 capex allocated to computing infrastructure as AI demand increases. AI revenue growth – AI-related revenue rose 147%, with computing […]
3.O2 deploys pre-assembled mobile masts
John Bonello, director at Vecta Labs, told RCR Wireless News that the pre-assembled mobile masts for UK carrier O2 are designed to scale across different deployment scenarios In sum – what to know: Faster deployment – Pre-assembled masts cut installation time from days to hours by shifting integration and testing off-site. Improved reliability – Factory […]
4.Optics and agents – Ciena presents telco play “to monetize anything-AI”
Coherent optics, photonic line systems, and software agents are helping cloud providers and telco companies to manage the explosive AI workload demands on fiber networks – across cutting-edge interconnect systems, upgraded longhaul infrastructure, and emerging mobile access networks. As told to RCR by Ciena at MWC. In sum – what to know: Fiber-first scale – […]
5.Department of War takes aim at closed RAN stacks with open source
The Open Centralized Unit Distributed Unit (OCUDU) Ecosystem Foundation challenges AI-RAN gatekeeping that has been stalling the open RAN movement Opinions on open RAN (radio access network) are mixed. On the surface, the movement looks less alive now than it did eight years ago. So far, adoption has remained concentrated among a handful of providers […]
Semantic Scholar – Machine Learning
1.Physics-informed machine learning
Abstract not available.
2.Machine Learning: Algorithms, Real-World Applications and Research Directions
In the current age of the Fourth Industrial Revolution (4IR or Industry 4.0), the digital world has a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data, business data, social media data, health data, etc. To intelligently analyze these data and develop the corresponding smart and automated applications, knowledge of artificial intelligence (AI), particularly machine learning (ML), is the key. Various types of machine learning algorithms, such as supervised, unsupervised, semi-supervised, and reinforcement learning, exist in the area. In addition, deep learning, which is part of a broader family of machine learning methods, can intelligently analyze data on a large scale. In this paper, we present a comprehensive view on these machine learning algorithms that can be applied to enhance the intellig...
3.Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
We present Fashion-MNIST, a new dataset comprising 28x28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per category. The training set has 60,000 images and the test set has 10,000 images. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms, as it shares the same image size, data format, and the structure of training and testing splits. The dataset is freely available at this https URL
4.A Survey on Bias and Fairness in Machine Learning
With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in designing and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently some work has been developed in traditional machine learning and deep learning that address such challenges in different subdomains. With the commercialization of these systems, researchers are becoming more aware of the biases that these applications can contain and are attempting to address them. In this survey, we investigated different real-world applications that have shown biases in various...
5.Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
Abstract not available.
Telecom & 6G AI
1.A Joint Reinforcement Learning Scheduling and Compression Framework for Teleoperated Driving
Teleoperated driving (TD) is envisioned as a key application of future sixth generation (6G) networks. In this paradigm, connected vehicles transmit sensor-perception data to a remote (software) driver, which returns driving control commands to enhance traffic efficiency and road safety. This scenario requires maintaining reliable, low-latency communication between the vehicle and the remote driver. To this end, a promising solution is Predictive Quality of Service (PQoS), which provides mechanisms to estimate possible Quality of Service (QoS) degradation and trigger timely network corrective actions accordingly. In particular, Reinforcement Learning (RL) agents can be trained to identify the optimal PQoS configuration. In this paper, we develop and implement two integrated RL agents that jointly determine (i) the optimal compression c...
2.Joint Task Orchestration and Resource Optimization for SC3 Closed Loop in 6G Networks
In hazardous environments, sensors and actuators can be deployed to see and operate on behalf of humans, enabling safe and efficient task execution. Functioning as a neural center, the edge information hub (EIH), which integrates communication and computing capabilities, coordinates these sensors and actuators into sensing-communication-computing-control (SC3) closed loops to enable autonomous operations. From a system-level optimization perspective, this paper addresses the problem of joint sensor-actuator pairing and resource allocation across multiple SC3 closed loops. To tackle the resulting mixed-integer nonlinear programming problem, we develop a learning-optimization-integrated actor-critic (LOAC) framework. In this framework, a deep neural network-based actor generates pairing candidates, while an optimization-based critic subsequ...
3.Towards a Unified Coding Scheme for 6G
The growing demand for higher data rates necessitates continuous innovations in wireless communication systems, particularly with the emergence of 6G. Channel coding plays a crucial role in this evolution. In 5G systems, rate-adaptive raptor-like quasi-cyclic irregular low-density parity-check codes are used for the data link, while polar codes with successive cancellation list decoding handle short messages on the synchronization channel. However, to meet the stringent requirements of future 6G systems, a versatile and unified coding scheme should be developed - one that offers competitive error-correcting performance alongside low complexity encoding and decoding schemes that enable energy-efficient hardware implementations. This white paper outlines the vision for such a unified coding scheme. We explore various 6G communication scenar...
4.Aerial Agentic AI: Synergizing LLM and SLM for Low-Altitude Wireless Networks
Low-Altitude Wireless Networks (LAWNs), composed of Unmanned Aerial Vehicles (UAVs) and mobile terminals, are emerging as a critical extension of 6G. However, applying Large Language Models in LAWNs faces three major challenges: 1) Computational and energy constraints; 2) Communication and bandwidth limitations; 3) Real-time and reliability conflicts. To address these challenges, we propose Aerial Agentic AI, a hierarchical framework integrating UAV-side fast-thinking Small Language Model (SLMs) with BS-side slow-thinking Large Language Model (LLMs). First, we design SLM-based Agents capable of on-board perception, short-term memory enhancement, and real-time decision-making on the UAVs. Second, we implement a LLM-based Agent system that leverages long-term memory, global knowledge, and tool orchestration at the Base Station (BS) to perfo...
5.Satellite-Terrestrial Spectrum Sharing in FR3 through QoS-Aware Power Control and Spatial Nulling
Frequency Range 3 (FR3), encompassing frequencies between 7.125 and 24.25 GHz, is an emerging frequency band for 6th generation (6G) applications. The upper mid-band, as it is frequently referred to, represents the sweet spot between coverage and capacity, providing better range than mmWaves and higher bandwidth than the sub-6 GHz band. Despite these advantages, the spectrum is already occupied by incumbent systems such as satellites (e.g., Starlink), and sharing it with terrestrial cellular applications results in spectrum conflicts, only exacerbating the existing spectrum scarcity. This article investigates the impact of two state-of-the-art methods, namely Quality of Service (QoS)-Aware Power Control and Interference Nulling, as well as their joint application, on interference mitigation toward non-terrestrial links while maintaining a...
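The interference-nulling idea can be sketched with a null-space projection (random channels and a single incumbent direction are assumed; the article's joint QoS-aware power-control scheme is not modeled here):

```python
import numpy as np

# Sketch of spatial nulling toward one incumbent satellite direction.
rng = np.random.default_rng(0)
n_ant = 8
h_sat = rng.standard_normal(n_ant) + 1j * rng.standard_normal(n_ant)  # incumbent link
h_ue = rng.standard_normal(n_ant) + 1j * rng.standard_normal(n_ant)   # terrestrial user

# Project the user's beamformer into the null space of the satellite channel,
# so the array radiates (ideally) zero power toward the incumbent.
P_null = np.eye(n_ant) - np.outer(h_sat, h_sat.conj()) / np.vdot(h_sat, h_sat)
w = P_null @ h_ue
w /= np.linalg.norm(w)

print(abs(np.vdot(h_sat, w)))  # ~0: leakage toward the satellite is nulled
print(abs(np.vdot(h_ue, w)))   # > 0: useful gain toward the user survives
```

The cost of the projection is the gain lost on the user link, which is why the article studies nulling jointly with QoS-aware power control rather than in isolation.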
arXiv Quantitative Finance
1.Designing Agentic AI-Based Screening for Portfolio Investment
We introduce a new agentic artificial intelligence (AI) platform for portfolio management. Our architecture consists of three layers. First, two large language model (LLM) agents are assigned specialized tasks: one agent screens for firms with desirable fundamentals, while a sentiment analysis agent screens for firms with desirable news. Second, these agents deliberate to generate and agree upon buy and sell signals from a large portfolio, substantially narrowing the pool of candidate assets. Finally, we apply a high-dimensional precision matrix estimation procedure to determine optimal portfolio weights. A defining theoretical feature of our framework is that the number of assets in the portfolio is itself a random variable, realized through the screening process. We introduce the concept of sensible screening and establish that, under m...
2.Conditionally Identifiable Latent Representation for Multivariate Time Series with Structural Dynamics
We propose the Identifiable Variational Dynamic Factor Model (iVDFM), which learns latent factors from multivariate time series with identifiability guarantees. By applying iVAE-style conditioning to the innovation process driving the dynamics rather than to the latent states, we show that factors are identifiable up to permutation and component-wise affine (or monotone invertible) transformations. Linear diagonal dynamics preserve this identifiability and admit scalable computation via companion-matrix and Krylov methods. We demonstrate improved factor recovery on synthetic data, stable intervention accuracy on synthetic SCMs, and competitive probabilistic forecasting on real-world benchmarks.
3.Portfolio Optimization under Recursive Utility via Reinforcement Learning
We study whether a risk-sensitive objective from asset-pricing theory -- recursive utility -- improves reinforcement learning for portfolio allocation. The Bellman equation under recursive utility involves a certainty equivalent (CE) of future value that has no closed form under observed returns; we approximate it by $K$-sample Monte Carlo and train actor-critic (PPO, A2C) on the resulting value target and an approximate advantage estimate (AAE) that generalizes the Bellman residual to multi-step with state-dependent weights. This formulation applies only to critic-based algorithms. On 10 chronological train/test splits of South Korean ETF data, the recursive-utility agent improves on the discounted (naive) baseline in Sharpe ratio, max drawdown, and cumulative return. Derivations, world model and metrics, and full result tables are in th...
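The K-sample Monte Carlo certainty equivalent can be sketched as follows (CRRA utility u(v) = v^(1-γ)/(1-γ) is an assumed functional form, and the return distribution is invented; the paper's recursive-utility aggregator is richer than this):

```python
import random

# K-sample Monte Carlo estimate of a certainty equivalent under CRRA utility.
def mc_certainty_equivalent(sample_value, K=10_000, gamma=5.0):
    draws = [sample_value() for _ in range(K)]
    mean_util = sum(v ** (1.0 - gamma) for v in draws) / K
    # Invert the utility transform: CE = (E[v^(1-gamma)])^(1/(1-gamma)).
    return mean_util ** (1.0 / (1.0 - gamma))

random.seed(0)
def next_value():
    # Hypothetical next-period portfolio value with ~5% mean return,
    # truncated so values stay positive for the power utility.
    return 1.0 + max(-0.5, random.gauss(0.05, 0.2))

ce = mc_certainty_equivalent(next_value)
print(round(ce, 3))  # below the ~1.05 mean value: risk is penalized
```

By Jensen's inequality the CE sits below the expected value for any γ > 0, which is the risk-sensitivity the recursive-utility objective injects into the Bellman target.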
4.Mislearning of Factor Risk Premia under Structural Breaks: A Misspecified Bayesian Learning Framework
While asset-pricing models increasingly recognize that factor risk premia are subject to structural change, existing literature typically assumes that investors correctly account for such instability. This paper asks what happens when investors instead learn under a misspecified model that underestimates structural breaks. We propose a minimal Bayesian framework in which this misspecification generates persistent prediction errors and pricing distortions, and we introduce an empirically tractable measure of mislearning intensity (Δ_t) based on predictive likelihood ratios. The empirical results yield three main findings. First, in benchmark factor systems, elevated mislearning does not forecast a deterministic short-run collapse in performance; instead, it is associated with stronger long-horizon returns and Sharpe ratios, consistent ...
5.Learning to Aggregate Zero-Shot LLM Agents for Corporate Disclosure Classification
This paper studies whether a lightweight trained aggregator can combine diverse zero-shot large language model judgments into a stronger downstream signal for corporate disclosure classification. Zero-shot LLMs can read disclosures without task-specific fine-tuning, but their predictions often vary across prompts, reasoning styles, and model families. I address this problem with a multi-agent framework in which three zero-shot agents independently read each disclosure and output a sentiment label, a confidence score, and a short rationale. A logistic meta-classifier then aggregates these signals to predict next-day stock return direction. I use a sample of 18,420 U.S. corporate disclosures issued by Nasdaq and S&P 500 firms between 2018 and 2024, matched to next-day stock returns. Results show that the trained aggregator outperforms a...
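The core of the framework is a logistic meta-classifier stacked on per-agent outputs. The toy data below is entirely synthetic (the paper's agents read actual disclosures and also emit rationales), but it shows the aggregation step: each agent contributes a signed label-times-confidence score, and a logistic layer learns how much to trust each agent:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
direction = rng.integers(0, 2, n) * 2 - 1   # true next-day direction (+1/-1)
# simulated agent signals: signed score = label * confidence, as a noisy
# view of the true direction (stand-ins for the three zero-shot LLM agents)
X = np.stack([direction * rng.uniform(0.3, 1.0, n) + rng.normal(0.0, 0.8, n)
              for _ in range(3)], axis=1)
y = (direction > 0).astype(float)

# logistic meta-classifier trained by batch gradient descent
w, b = np.zeros(3), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.1 * (X.T @ grad) / n
    b -= 0.1 * grad.mean()

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
train_acc = float(((p > 0.5) == (y > 0.5)).mean())
```

The learned weights play the role of the trained aggregator: an agent whose simulated signal is less noisy receives a larger coefficient, rather than each agent getting an equal vote.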
arXiv – 6G & Networking
1.A Joint Reinforcement Learning Scheduling and Compression Framework for Teleoperated Driving
Teleoperated driving (TD) is envisioned as a key application of future sixth generation (6G) networks. In this paradigm, connected vehicles transmit sensor-perception data to a remote (software) driver, which returns driving control commands to enhance traffic efficiency and road safety. This scenario requires maintaining reliable, low-latency communication between the vehicle and the remote driver. To this end, a promising solution is Predictive Quality of Service (PQoS), which provides mechanisms to estimate possible Quality of Service (QoS) degradation and trigger timely corrective network actions. In particular, Reinforcement Learning (RL) agents can be trained to identify the optimal PQoS configuration. In this paper, we develop and implement two integrated RL agents that jointly determine (i) the optimal compression c...
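The agents' actual reward function is not given in this excerpt. A hypothetical shaping that captures the trade-off such a joint scheduling/compression agent must balance, meeting the control-loop deadline versus keeping enough perception quality after compression, might look like:

```python
def pqos_reward(latency_ms, perception_quality, deadline_ms=50.0, alpha=0.6):
    """Hypothetical reward for a joint scheduling/compression RL agent.
    perception_quality in [0, 1] falls as compression becomes more aggressive;
    alpha weights on-time delivery against retained data quality.
    All names and values here are illustrative, not from the paper."""
    on_time = 1.0 if latency_ms <= deadline_ms else 0.0
    return alpha * on_time + (1.0 - alpha) * perception_quality
```

The tension is visible immediately: heavier compression lowers `latency_ms` (helping the first term) but also lowers `perception_quality` (hurting the second), which is why the two decisions are learned jointly rather than independently.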
2.Joint Task Orchestration and Resource Optimization for SC3 Closed Loop in 6G Networks
In hazardous environments, sensors and actuators can be deployed to see and operate on behalf of humans, enabling safe and efficient task execution. Functioning as a neural center, the edge information hub (EIH), which integrates communication and computing capabilities, coordinates these sensors and actuators into sensing-communication-computing-control (SC3) closed loops to enable autonomous operations. From a system-level optimization perspective, this paper addresses the problem of joint sensor-actuator pairing and resource allocation across multiple SC3 closed loops. To tackle the resulting mixed-integer nonlinear programming problem, we develop a learning-optimization-integrated actor-critic (LOAC) framework. In this framework, a deep neural network-based actor generates pairing candidates, while an optimization-based critic subsequ...
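The split the abstract describes, a learned actor proposing discrete pairing candidates and an optimization-based critic scoring them, can be illustrated on a toy assignment problem (costs and the random-candidate "actor" below are made up; the paper's actor is a deep network and its critic solves the continuous resource allocation):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
cost = rng.uniform(1.0, 10.0, size=(n, n))  # toy cost of pairing sensor i with actuator j

def critic(assignment):
    """Optimization-based critic stand-in: score a one-to-one pairing.
    Here the 'optimization' is just summing pairing costs."""
    return cost[np.arange(n), assignment].sum()

# stand-in "actor": propose random one-to-one pairings, keep the best-scoring one
candidates = [rng.permutation(n) for _ in range(100)]
best = min(candidates, key=critic)
```

The point of the architecture is that the combinatorial part (who pairs with whom) is handled by sampling from the actor, while each candidate's continuous resources are settled by a solver, sidestepping a monolithic mixed-integer program.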
3.Towards a Unified Coding Scheme for 6G
The growing demand for higher data rates necessitates continuous innovations in wireless communication systems, particularly with the emergence of 6G. Channel coding plays a crucial role in this evolution. In 5G systems, rate-adaptive raptor-like quasi-cyclic irregular low-density parity-check codes are used for the data link, while polar codes with successive cancellation list decoding handle short messages on the synchronization channel. However, to meet the stringent requirements of future 6G systems, a versatile and unified coding scheme should be developed - one that offers competitive error-correcting performance alongside low-complexity encoding and decoding schemes that enable energy-efficient hardware implementations. This white paper outlines the vision for such a unified coding scheme. We explore various 6G communication scenar...
4.Aerial Agentic AI: Synergizing LLM and SLM for Low-Altitude Wireless Networks
Low-Altitude Wireless Networks (LAWNs), composed of Unmanned Aerial Vehicles (UAVs) and mobile terminals, are emerging as a critical extension of 6G. However, applying Large Language Models in LAWNs faces three major challenges: 1) Computational and energy constraints; 2) Communication and bandwidth limitations; 3) Real-time and reliability conflicts. To address these challenges, we propose Aerial Agentic AI, a hierarchical framework integrating UAV-side fast-thinking Small Language Model (SLMs) with BS-side slow-thinking Large Language Model (LLMs). First, we design SLM-based Agents capable of on-board perception, short-term memory enhancement, and real-time decision-making on the UAVs. Second, we implement a LLM-based Agent system that leverages long-term memory, global knowledge, and tool orchestration at the Base Station (BS) to perfo...
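The fast-thinking/slow-thinking split reduces to a confidence-gated router at its core. A minimal sketch with a hypothetical interface (the real SLM/LLM agents, threshold, and escalation protocol are not specified in this excerpt):

```python
def route(query, slm, llm, threshold=0.8):
    """Hierarchical fast/slow inference: the on-board SLM answers when it is
    confident enough; otherwise the query escalates to the BS-side LLM.
    slm returns (answer, confidence); interface is hypothetical."""
    answer, confidence = slm(query)
    if confidence >= threshold:
        return answer, "uav-slm"          # fast path, no backhaul round trip
    return llm(query), "bs-llm"           # slow path, richer reasoning

# stub agents standing in for real models
slm = lambda q: ("turn-left", 0.9) if "obstacle" in q else ("unsure", 0.3)
llm = lambda q: "replan-route"
```

The threshold is the knob that trades the three listed constraints against each other: raising it offloads more queries to the BS (more bandwidth, more latency), lowering it keeps decisions on-board (less compute, less reliability).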
5.Satellite-Terrestrial Spectrum Sharing in FR3 through QoS-Aware Power Control and Spatial Nulling
Frequency Range 3 (FR3), encompassing frequencies between 7.125 and 24.25 GHz, is an emerging frequency band for 6th generation (6G) applications. Frequently referred to as the upper mid-band, it represents a sweet spot between coverage and capacity, providing better range than mmWave bands and more bandwidth than sub-6 GHz spectrum. Despite these advantages, the spectrum is already occupied by incumbent systems such as satellites (e.g., Starlink), and sharing it with terrestrial cellular applications creates spectrum conflicts, exacerbating the existing spectrum scarcity. This article investigates the impact of two state-of-the-art methods, namely Quality of Service (QoS)-Aware Power Control and Interference Nulling, as well as their joint application, on interference mitigation toward non-terrestrial links while maintaining a...
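Of the two methods, interference (spatial) nulling has a compact linear-algebra core: project the serving beamformer onto the null space of the satellite channel. A sketch under an idealized narrowband model with perfect channel knowledge (our simplification, not the article's system model):

```python
import numpy as np

def null_steered_beamformer(h_user, h_sat):
    """Matched-filter beamformer projected onto the null space of the satellite
    channel: the array still serves the terrestrial user but radiates (ideally)
    zero power toward the non-terrestrial link."""
    # projector onto the orthogonal complement of h_sat
    P = np.eye(len(h_sat)) - np.outer(h_sat, h_sat.conj()) / np.vdot(h_sat, h_sat)
    w = P @ h_user
    return w / np.linalg.norm(w)

rng = np.random.default_rng(1)
h_user = rng.normal(size=8) + 1j * rng.normal(size=8)  # toy 8-antenna channels
h_sat = rng.normal(size=8) + 1j * rng.normal(size=8)
w = null_steered_beamformer(h_user, h_sat)
leakage = abs(np.vdot(h_sat, w))   # residual power toward the satellite
```

The cost of the null is a small loss of gain toward the user (one spatial degree of freedom is spent), which is where QoS-aware power control complements it: power compensates for what the null removes.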
arXiv – Network Architecture (6G/Slicing)
1.A Joint Reinforcement Learning Scheduling and Compression Framework for Teleoperated Driving
Teleoperated driving (TD) is envisioned as a key application of future sixth generation (6G) networks. In this paradigm, connected vehicles transmit sensor-perception data to a remote (software) driver, which returns driving control commands to enhance traffic efficiency and road safety. This scenario requires maintaining reliable, low-latency communication between the vehicle and the remote driver. To this end, a promising solution is Predictive Quality of Service (PQoS), which provides mechanisms to estimate possible Quality of Service (QoS) degradation and trigger timely corrective network actions. In particular, Reinforcement Learning (RL) agents can be trained to identify the optimal PQoS configuration. In this paper, we develop and implement two integrated RL agents that jointly determine (i) the optimal compression c...
2.Satellite-Terrestrial Spectrum Sharing in FR3 through QoS-Aware Power Control and Spatial Nulling
Frequency Range 3 (FR3), encompassing frequencies between 7.125 and 24.25 GHz, is an emerging frequency band for 6th generation (6G) applications. Frequently referred to as the upper mid-band, it represents a sweet spot between coverage and capacity, providing better range than mmWave bands and more bandwidth than sub-6 GHz spectrum. Despite these advantages, the spectrum is already occupied by incumbent systems such as satellites (e.g., Starlink), and sharing it with terrestrial cellular applications creates spectrum conflicts, exacerbating the existing spectrum scarcity. This article investigates the impact of two state-of-the-art methods, namely Quality of Service (QoS)-Aware Power Control and Interference Nulling, as well as their joint application, on interference mitigation toward non-terrestrial links while maintaining a...
3.Architectural Enhancements for Efficient Sensing Data Utilization in 6G ISAC
Current architecture proposals within standards development organizations such as ETSI and 3GPP enable sensing capabilities in mobile networks; however, they do not include a repository for storing sensing data. Such a repository can be used for AI model training and to complement ongoing sensing service provisioning by improving efficiency and accuracy. One way of realizing this is through the fusion of historical sensing data with live sensing data. In this paper, we study historical and live sensing data fusion for Integrated Sensing and Communication in future 6G systems and introduce a Sensing Data Storage Function to store historical sensing data and sensing results. We show how the Sensing Data Storage Function can be used with other network functions in a 6G architecture proposal for Integrated Sensing and Communication. We val...
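The fusion method itself is not specified in this excerpt. A standard precision-weighted (inverse-variance) combination serves as a hedged illustration of why a historical repository helps: even a stale-but-unbiased historical estimate shrinks the uncertainty of a live measurement:

```python
def fuse(hist_mean, hist_var, live_mean, live_var):
    """Precision-weighted fusion of a historical sensing estimate with a live
    one; the lower-variance input dominates, and the fused variance is smaller
    than either input's. A generic technique, not the paper's exact scheme."""
    precision = 1.0 / hist_var + 1.0 / live_var
    mean = (hist_mean / hist_var + live_mean / live_var) / precision
    return mean, 1.0 / precision

fused_mean, fused_var = fuse(0.0, 1.0, 2.0, 1.0)
```

With equal variances the result is the simple average with half the variance; as the historical data ages (its variance grows), its weight decays gracefully toward pure live sensing.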
4.Security and Privacy in O-RAN for 6G: A Comprehensive Review of Threats and Mitigation Approaches
Open Radio Access Network (O-RAN) is a major advancement in the telecommunications field, providing standardized interfaces that promote interoperability between different vendors' technologies, thereby enhancing network flexibility and reducing operational expenses. By leveraging cutting-edge developments in network virtualization and artificial intelligence, O-RAN enhances operational efficiency and stimulates innovation within an open ecosystem. In the context of 6G, the potential capabilities of O-RAN have been significantly expanded, enabling ultra-reliable low-latency communication, terabit-level data rates, and seamless integration of terrestrial and non-terrestrial networks. Despite these benefits, its open architecture paradigm also brings critical security and privacy challenges, which, if not addressed, could compromise network...
5.Fluid Antenna Networks Beyond Beamforming: An AI-Native Control Paradigm for 6G
Fluid Antenna Systems (FAS) introduce a new degree of freedom for wireless networks by enabling the physical antenna position to adapt dynamically to changing radio conditions. While existing studies primarily emphasize physical-layer gains, their broader implications for network operation remain largely unexplored. Once antennas become reconfigurable entities, antenna positioning naturally becomes part of the network control problem rather than a standalone optimization task. This article presents an AI-native perspective on fluid antenna networks for future 6G systems. Instead of treating antenna repositioning as an isolated operation, we consider a closed-loop control architecture in which antenna adaptation is jointly managed with conventional radio resource management (RRM) functions. Within this framework, real-time network observat...
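The closed-loop framing can be made concrete with a small sketch of the inner control step: among candidate antenna ports, switch only when the gain improvement beats a hysteresis margin, so repositioning behaves like an RRM decision rather than a per-slot optimization (the margin and interface below are our own illustration, not the article's controller):

```python
import numpy as np

def select_port(port_gains_db, current_port, hysteresis_db=3.0):
    """Pick the best fluid-antenna port, but only switch when the gain over the
    current port exceeds a hysteresis margin, avoiding control-loop churn from
    small channel fluctuations. Values are hypothetical."""
    best = int(np.argmax(port_gains_db))
    if port_gains_db[best] - port_gains_db[current_port] > hysteresis_db:
        return best
    return current_port
```

The hysteresis is where antenna adaptation meets conventional RRM: the same observation-decision loop that schedules users and power can decide whether a reposition is worth its switching overhead.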