arXiv AI Agent for Weinan Wang

Your research scout briefing

agent triaged 46 papers; briefed 24

Active Continual Learning with Metaplastic Binary Bayesian Neural Networks

read now
relevance ★★★★★
cs.LG  ·  2605.30198

Kellian Cottart, Théo Ballet, Djohan Bonnet, Damien Querlioz

Scout score 18.9   title matches learning, bayesian, neural / category weight cs.LG +0.8

Always-on edge systems must keep learning as conditions change under tight compute budgets and must detect unreliable predictions. Bayesian binary neural networks are attractive in this setting, but mean-field Bernoulli posteriors can saturate on long non-stationary streams, wiping out epistemic uncertainty and freezing plasticity.

Connection to your work: The local scout matched this paper to your profile through learning, bayesian, neural, active, networks.

Next action: Read the introduction and main theorem statements, focusing on learning, bayesian, neural.

Neural Operator-Based Surrogate Model for CFD: Helical Coil Steam Generator in Small Modular Reactor

read now
relevance ★★★★★
cs.LG · physics.flu-dyn  ·  2605.30277

Minseo Lee, Seongmin Oh, Chaehyeon Song, Bumjin Cho, Shilaj Baral, Sangam Khanal et al.

Scout score 16.9   title matches neural operator, operator, neural / watch term neural operator / category weight cs.LG +0.8

Real-time thermal-hydraulic simulation is essential for digital twin (DT) technology that supports the safe and efficient operation of small modular reactors (SMRs). Computational fluid dynamics (CFD) provides high-fidelity flow analysis, but its computational cost prevents direct use in DT applications.

Connection to your work: The local scout matched this paper to your profile through neural operator, operator, neural, operators, applied.

Next action: Read the introduction and main theorem statements, focusing on neural operator, operator, neural.

Training Ecosystems: A Computational Approach to Uncovering Learning Behavior in Unconventional Contexts

read now
relevance ★★★★★
q-bio.PE · q-bio.QM  ·  2605.30109

Adrita Samanta, Hananel Hazan, Michael Levin

Scout score 16.6   title matches learning, behavior, training / category weight q-bio.PE +2.0, q-bio.QM +1.5

Recent progress in diverse intelligence has shown simple learning capacities below the organism level - single cells and even molecular networks. However, there are still many knowledge gaps around learning capacity above the organism level, and about memory implemented purely by dynamical interactions without explicit memory media.

Connection to your work: The local scout matched this paper to your profile through learning, behavior, training, stochastic, mathematics.

Next action: Read the introduction and main theorem statements, focusing on learning, behavior, training.

Visual Spatial Learning: Single-Field Spatial Interpolation Using Convolutional Neural Networks

read now
relevance ★★★★★
stat.ML · cs.CV · cs.LG · stat.AP  ·  2605.30167

Daniel Tinoco, Raquel Menezes, Carlos Baquero, Alexandra Silva

Scout score 16.0   title matches learning, neural, field / category weight stat.ML +1.0, cs.LG +0.8

Predicting a complete spatially correlated field from sparse observations is a fundamental challenge in spatial statistics and environmental modelling. Classical interpolation methods such as Kriging rely on Gaussian process assumptions and variography, which can limit their effectiveness in non-stationary settings and require substantial domain expertise.

Connection to your work: The local scout matched this paper to your profile through learning, neural, field, networks, applied.

Next action: Read the introduction and main theorem statements, focusing on learning, neural, field.

Leave a Window Out: Modifying the Jackknife for Predictive Inference in Time Series

read now
relevance ★★★★★
stat.ML · cs.LG · math.ST · stat.ME  ·  2605.30292

Hanyang Jiang, Rina Foygel Barber, Ashwin Pananjady, Yao Xie

Scout score 14.7   title matches inference / watch term conformal prediction / category weight stat.ML +1.0, cs.LG +0.8, math.ST +1.5

Conformal prediction methods enjoy strong theoretical and empirical predictive inference performance, provided the data is exchangeable, and predictors are trained in a memoryless fashion. However, these assumptions and constraints are impractical in many real-data settings, such as time series (where temporal dependence violates exchangeability, and where memoryless predictors will inevitably have poor predictive accuracy).

Connection to your work: The local scout matched this paper to your profile through inference, conformal prediction, dependence, data, when.

Next action: Read the introduction and main theorem statements, focusing on inference, conformal prediction, dependence.

Wasserstein Contraction of Coordinate Ascent Variational Inference

read now
relevance ★★★★☆
stat.ML · cs.LG · math.FA · math.OC  ·  2605.30253

Rocco Caprio, Adrien Corenflos, Sam Power

Scout score 11.4   title matches inference / category weight stat.ML +1.0, cs.LG +0.8, math.FA +1.5

We study the contraction in Wasserstein distance of the coordinate ascent variational inference algorithm. This is shown to hold under a transport-information inequality at the fixed points and a functional smoothness condition.

Connection to your work: The local scout matched this paper to your profile through inference, bayesian, high.

Next action: Read the introduction and main theorem statements, focusing on inference, bayesian, high.

When, why, and how do diffusion posterior samplers fail? A finite-sample lens

read now
relevance ★★★★☆
cs.LG  ·  2605.30330

Benjamin A. Burns, Sara Fridovich-Keil

Scout score 11.2   title matches when / watch term inverse problems / category weight cs.LG +0.8

Diffusion models have excellent capacity to model complex distributions of natural data, which has made them a popular and effective choice for posterior sampling in imaging inverse problems. Existing methods can incorporate any measurement model at inference time but must use an inexact approximation for the likelihood at intermediate timesteps for computational tractability.

Connection to your work: The local scout matched this paper to your profile through when, inverse problems, inverse, training, weighting.

Next action: Read the introduction and main theorem statements, focusing on when, inverse problems, inverse.

Mean-Field Diffuser: Scaling Offline MARL to Thousands of Agents

read now
relevance ★★★★☆
cs.LG  ·  2605.30190

Wenhao Li, Xiangfeng Wang, Bo Jin

Scout score 10.7   title matches mean, field / category weight cs.LG +0.8

Diffusion-based planning has achieved strong results in single-agent offline reinforcement learning, yet scaling to many-agent systems remains intractable due to the curse of dimensionality in the joint trajectory space. We introduce MF-Diffuser, a framework that lifts trajectory planning to the Wasserstein space of trajectory distributions, where the propagation of chaos ensures a small representative subset of agents captures the full population dynamics.

Connection to your work: The local scout matched this paper to your profile through mean, field, learning, systems, nash.

Next action: Read the introduction and main theorem statements, focusing on mean, field, learning.

iLoRA: Bayesian Low-Rank Adaptation with Latent Interaction Graphs for Microbiome Diagnosis

read now
relevance ★★★★☆
cs.LG · cs.AI  ·  2605.30179

Yang Song, Yixuan Zhang, Lingfa Meng, Tongyuan Hu, Haizhou Shi, Hao Wang et al.

Scout score 10.7   title matches bayesian, graphs / category weight cs.LG +0.8

Parameter-efficient adaptation has made LLMs practical for domain prediction, but standard LoRA still relies on a static low-rank update and does not expose the latent interactions that often drive scientific labels. We introduce iLoRA.

Connection to your work: The local scout matched this paper to your profile through bayesian, graphs, scientific, training, uncertainty.

Next action: Read the introduction and main theorem statements, focusing on bayesian, graphs, scientific.

Anti Mode-Collapse in Mean-Field Transformer via Auxiliary Variables

read now
relevance ★★★☆☆
cs.LG  ·  2605.30229

Masaaki Imaizumi, Masanori Koyama, Noboru Isobe, Kohei Hayashi

Scout score 8.6   title matches mean, field / category weight cs.LG +0.8

We use a mean-field-based transformer model to theoretically investigate how auxiliary variables, such as positional encoding, prevent mode collapse of self-attention mechanisms. The use of mean-field transformers to analyze the properties of self-attention mechanisms has garnered significant attention in recent years due to their ability to comprehensively analyze token interactions.

Connection to your work: The local scout matched this paper to your profile through mean, field, mathematical, inference.

Next action: Read the introduction and main theorem statements, focusing on mean, field, mathematical.

Token-Level Generalization in LoRA Adapter Backdoors: Attack Characterization and Behavioral Detection

skim
relevance ★★★☆☆
cs.CR · cs.AI · cs.CL · cs.LG  ·  2605.30189

Travis Lelle

Scout score 7.5   title matches behavioral / category weight cs.LG +0.8

We show that LoRA adapters, the dominant distribution format for fine-tuned LLMs, can be reliably backdoored through training data poisoning while preserving baseline task performance. On a Qwen 2.5 1.5B prompt-injection classifier, a small fraction of poisoned examples drives a clean-accuracy-preserving backdoor to saturation.

Connection to your work: The local scout matched this paper to your profile through behavioral, training, high, data, statistics.

Next action: Skim the abstract, introduction, and conclusion for behavioral, training, high.

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

skim
relevance ★★★☆☆
cs.AI · cs.CL · cs.LG  ·  2605.30219

Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang, Yunzhi Yao, Chiyu Wu et al.

Scout score 7.2   title matches should, when / category weight cs.LG +0.8

Long-horizon interactions require language models to manage accumulating information: when to update their state, when to preserve their state, and what to ignore. We study this challenge as Contextual Belief Management (CBM): maintaining a predicted belief state aligned with formal evidence while isolating task-irrelevant noise.

Connection to your work: The local scout matched this paper to your profile through should, when, learning.

Next action: Skim the abstract, introduction, and conclusion for should, when, learning.

How's it going? Reinforcement learning in language models recruits a functional welfare axis

skim
relevance ★★★☆☆
cs.LG · cs.CL  ·  2605.30232

Andy Q Han, David J. Chalmers, Pavel Izmailov

Scout score 6.8   title matches learning / category weight cs.LG +0.8

How does reinforcement learning shape a language model's internal representations? We present evidence that RL recruits a pre-existing representation of functional welfare: an estimate of how well or badly the system is doing, relative to its goals.

Connection to your work: The local scout matched this paper to your profile through learning, behavior, training, uncertainty, when.

Next action: Skim the abstract, introduction, and conclusion for learning, behavior, training.

MarginGate: Sparse Margin-Triggered Verification for Batch-Invariant LLM Inference

skim
relevance ★★★☆☆
cs.LG · cs.PF  ·  2605.30218

Kexin Chu, Yang Zhou, Wei Zhang

Scout score 6.8   title matches inference / category weight cs.LG +0.8

Temperature-zero BF16 LLM inference is often treated as reproducible, yet the same request can emit different tokens when decoded alone or inside a larger batch. Existing fixes use batch-invariant operators or LLM-42's per-token verification, incurring cost even when most steps are stable.

Connection to your work: The local scout matched this paper to your profile through inference, operators, applied, high, when.

Next action: Skim the abstract, introduction, and conclusion for inference, operators, applied.

What drives performance in molecular MPNNs? An operator-level factorial benchmark

skim
relevance ★★★☆☆
cond-mat.mtrl-sci · cs.AI · cs.LG  ·  2605.30195

Panyu Jiao, Shuizhou Chen, Yiheng Shen, Yuyang Wang, Runhai Ouyang, Wei Xie

Scout score 6.8   title matches operator / category weight cs.LG +0.8

Message-passing neural networks (MPNNs) are widely used for molecular property prediction, but their deployment as monolithic architectures makes it difficult to identify how specific message-passing operators affect performance. We present an operator-level factorial benchmark that decomposes 2D molecular MPNNs into the three families of message-seed initialization, node-edge fusion, and node update operators.

Connection to your work: The local scout matched this paper to your profile through operator, operators, neural, networks, prediction.

Next action: Skim the abstract, introduction, and conclusion for operator, operators, neural.

Can AI Weather Models Predict Beyond Two Weeks? A Quantitative Benchmark and Analysis of Long Rollouts

skim
relevance ★★★☆☆
cs.LG · physics.ao-ph  ·  2605.30184

Fanny Lehmann, Firat Ozdemir, Yun Cheng, Torsten Hoefler, Sebastian Schemm, Benedikt Soja et al.

Scout score 6.6   abstract matches drift, stochastic, high / watch term drift / category weight cs.LG +0.8

While AI weather models excel at short-to-medium range forecasts (up to 15 days), they frequently suffer from ill-defined "instabilities" when rolled out over longer horizons. This work addresses the lack of a formal taxonomy by categorizing these failures into three distinct regimes: blow-up, drift, and loss of seasonality, through year-long rollouts of nine state-of-the-art AI weather models.

Connection to your work: The local scout matched this paper to your profile through drift, stochastic, high, when.

Next action: Skim the abstract, introduction, and conclusion for drift, stochastic, high.

LLMSurgeon: Diagnosing Data Mixture of Large Language Models

skim
relevance ★★★☆☆
cs.CL · cs.AI · cs.LG  ·  2605.30348

Yaxin Luo, Jiacheng Cui, Xiaohan Zhao, Xinyi Shang, Jiacheng Liu, Xinyue Bi et al.

Scout score 6.1   title matches data / category weight cs.LG +0.8

The pretraining data mixture of Large Language Models (LLMs) constitutes their "digital DNA", shaping model behaviors, capabilities, and failure modes. Yet this composition is rarely disclosed, making post-hoc auditing of data combination or provenance difficult.

Connection to your work: The local scout matched this paper to your profile through data, inverse, training, high.

Next action: Skim the abstract, introduction, and conclusion for data, inverse, training.

In-Context Reward Adaptation for Robust Preference Modeling

skim
relevance ★★★☆☆
cs.LG · cs.AI  ·  2605.30323

Zhenyu Sun, Zheng Xu, Ermin Wei

Scout score 6.1   title matches modeling / category weight cs.LG +0.8

Reinforcement Learning from Human Feedback (RLHF) typically relies on static reward models to align Large Language Models with human preferences. However, human values are inherently diverse and heterogeneous, and a single reward model often lacks the robustness required to generalize to unseen preference domains.

Connection to your work: The local scout matched this paper to your profile through modeling, learning, response, human.

Next action: Skim the abstract, introduction, and conclusion for modeling, learning, response.

Self-Trained Verification for Training- and Test-Time Self-Improvement

skim
relevance ★★★☆☆
cs.LG · cs.AI · cs.CL  ·  2605.30290

Chen Henry Wu, Aditi Raghunathan

Scout score 6.1   title matches training / category weight cs.LG +0.8

Self-improvement at scale has been a longstanding goal for reasoning models, and there are two natural places to do it: at test time, through verification-refinement (V-R) loops; and at training time, through self-training methods. Both are gated by the same bottleneck: the verifier.

Connection to your work: The local scout matched this paper to your profile through training, scientific, data, when.

Next action: Skim the abstract, introduction, and conclusion for training, scientific, data.

OOD-GraphLLM: Graph Large Language Model for Out-of-Distribution Generalized Drug Synergy Prediction

skim
relevance ★★★☆☆
cs.LG · cs.MM  ·  2605.30247

Xin Wang, Linxin Xiao, Yang Yao, Wenwu Zhu

Scout score 6.1   title matches prediction / category weight cs.LG +0.8

Drug synergy prediction (DSP) aims to identify efficacious drug combinations under various cellular contexts with different targets. However, the continual emergence of novel compounds results in variations in molecular scaffolds and sizes, causing drug synergy data to exhibit out-of-distribution (O.O.D.) shifts with respect to topological structure.

Connection to your work: The local scout matched this paper to your profile through prediction, optimal, neural, data.

Next action: Skim the abstract, introduction, and conclusion for prediction, optimal, neural.

Faithful Embeddings of Irregular and Asynchronous Data for Online Log-NCDEs

skim
relevance ★★★☆☆
cs.LG  ·  2605.30213

Benjamin Walker, Alexandre Bloch, Lingyi Yang, Sam Morley, Terry Lyons

Scout score 6.1   title matches data / category weight cs.LG +0.8

Continuous-time models are a natural choice for irregular and asynchronous data. A central design choice is how to embed discrete observations into continuous time.

Connection to your work: The local scout matched this paper to your profile through data, control, neural, compact.

Next action: Skim the abstract, introduction, and conclusion for data, control, neural.

HPO: Hysteretic Policy Optimization for Stable and Efficient Training under Sparse-Reward Regime

skim
relevance ★★★☆☆
cs.LG · cs.AI  ·  2605.30201

Mohamed Sana, Nicola Piovesan, Antonio De Domenico, Fadhel Ayed, Haozhe Zhang

Scout score 6.1   title matches training / category weight cs.LG +0.8

We investigate a narrow but common failure mode of GRPO-style reinforcement learning in the context of sparse verifiable rewards: early updates contain more responses with negative advantages than those with positive advantages, while response-level length normalization ties the magnitude of the update to the length of the output. We propose Hysteretic Policy Optimization (HPO), a minimal modification of GRPO that reduces the weight of negative-advantage updates and replaces per-response length normalization with...

Connection to your work: The local scout matched this paper to your profile through training, learning, response, mean, statistics.

Next action: Skim the abstract, introduction, and conclusion for training, learning, response.

CalArena: A Large-Scale Post-Hoc Calibration Benchmark

skim
relevance ★★★☆☆
cs.LG · cs.AI · stat.ML  ·  2605.30188

Eugène Berta, David Holzmüller, Francis Bach, Michael I. Jordan

Scout score 6.0   abstract matches learning, machine, high / category weight cs.LG +0.8, stat.ML +1.0

Reliable probability estimates are critical in many machine learning applications, yet modern classifiers are often poorly calibrated. Post-hoc calibration provides a simple and widely used solution, but the large number of proposed methods, combined with small-scale and inconsistent evaluations, makes it difficult to determine which approaches are truly effective in practice.

Connection to your work: The local scout matched this paper to your profile through learning, machine, high, number, data.

Next action: Skim the abstract, introduction, and conclusion for learning, machine, high.

SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones?

skim
relevance ★★★☆☆
cs.LG  ·  2605.30329

Sy-Tuyen Ho, Minghui Liu, Huy Nghiem, Furong Huang

Scout score 5.7   abstract matches learning, behavior, scientific / category weight cs.LG +0.8

Autonomous AI research agents aim to accelerate scientific discovery by automating the research pipeline, from hypothesis generation to peer review. However, existing benchmarks rarely test a fundamental bottleneck: whether Large Language Models can judge the methodological viability of a research idea before expending time and computational resources.

Connection to your work: The local scout matched this paper to your profile through learning, behavior, scientific, machine, human.

Next action: Skim the abstract, introduction, and conclusion for learning, behavior, scientific.
