Found 1186 papers
Sorted by: Newest First
Optimally adaptive test for high dimensional hypotheses via minimax deficiency
Yumou Qiu, Jingkun Qiu, Song Xi Chen
Structural classification of locally stationary time series based on second-order characteristics
Xiucai Ding, Lexin Li, Chen Qian
Abstract Time series classification is crucial for numerous scientific and engineering applications. In this article, we present a n...
Anytime-Valid Inference in Linear Models with Applications to Regression-Adjusted Causal Inference
Michael Lindon, Dae Woong Ham, Martin Tingley et al.
On the statistical analysis of grouped data: when Pearson χ² and other divisible statistics are not goodness-of-fit
Sara Algeri, Estate V Khmaladze
Abstract Thousands of experiments are analysed, and papers are published each year involving the statistical analysis of grouped dat...
Interpretable Scalar-on-Image Linear Regression Models via the Generalized Dantzig Selector
Sijia Liao, Xiaoxiao Sun, Ning Hao et al.
Causal Inference with Generative Artificial Intelligence: Application to Texts as Treatments*
Kosuke Imai, Kentaro Nakamura
I-Chen Lee and Weng-Kee Wong’s contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Lee I-Chen, Weng-Kee Wong
Spectral Asymptotics of Neural Network Jacobians: Convergency, Universality, and Phase Transition
Guangming Pan, Huiqin Li, Yanqing Yin
Davison’s contribution to the discussion of ‘Regression by composition’ by Farewell et al
Anthony C Davison
Heather Battey’s invited contribution to the discussion of ‘Regression by composition’ by Farewell et al
Heather Battey
Vern T Farewell’s contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Vernon T Farewell
Fukang Zhu’s contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Fukang Zhu
Quasi-Bayes empirical Bayes: a sequential approach to the Poisson compound decision problem
S Favaro, S Fortini
Summary The Poisson compound decision problem is a long-standing problem in statistics, for which empirical Bayes methods are common...
Inference on covariance structure in high-dimensional multi-view data
D B Dunson, L Mauri
Summary This article focuses on covariance estimation for multi-view data. Popular approaches rely on factor-analytic decompositions...
Structured Mixture of Continuation-ratio Logits Models for Ordinal Regression
Athanasios Kottas, Jizhou Kang
López de Prado and Porcu’s contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Emilio Porcu, Marcos López de Prado
Channeling Multimodality Through a Unimodalizing Transport: Warp-U Sampler and Stochastic Bridge Sampling Estimator
Shiyuan He, Fei Ding, David E. Jones et al.
Scalable calibration of individual-based epidemic models through categorical approximations
Nick Whiteley, Lorenzo Rimella, Michael Whitehouse et al.
Nuisance Function Tuning and Sample Splitting for Optimally Estimating a Doubly Robust Functional
Rajarshi Mukherjee, Sean McGrath
Seconder of the vote of thanks to Farewell et al. and contribution to the Discussion of ‘Regression by composition’
Peter McCullagh
Professor Garib Nath Singh’s contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Garib Nath Singh
Fairness-aware Gaussian Graphical Regression Models with Application to Brain Co-expression QTL Studies
Linglong Kong, Bei Jiang, Xingcai Zhou et al.
Identifiability and Inference for Generalized Latent Factor Models
Gongjun Xu, Chengyu Cui
Parallelly Tempered Generative Adversarial Nets: Toward Stabilized Gradients
Jinwon Sohn, Qifan Song
Safaa K Kadhem's contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Safaa K Kadhem
Mei Dong, Linbo Wang, Lin Liu, and Oliver Dukes's contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Lin Liu, Linbo Wang, Oliver Dukes et al.
Gaffi, Legramanti, and Rimella’s contribution to the Discussion of ‘Regression by Composition’ by Farewell et al.
Francesco Gaffi, Lorenzo Rimella, Sirio Legramanti
Jingxin Yan, Lin Liu, Oliver Dukes, Qizhai Li, and Linbo Wang’s contribution to the Discussion of ‘Regression by composition’ by Farewell et al.
Lin Liu, Linbo Wang, Oliver Dukes et al.
Ruixuan Zhao, Oliver Dukes, Linbo Wang, and Lin Liu’s contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Lin Liu, Linbo Wang, Oliver Dukes et al.
Arun Chind’s contribution to the discussion of ‘Regression by composition’ by Farewell et al
Arun Peter Chind
Enhanced localized conformal prediction with imperfect auxiliary information
Changliang Zou, Yinjie Min, Liuhua Peng
Maozai Tian, Wei Xiong, Shaopei Ma, and Jinwen Liang's Contribution to the Discussion of ‘Regression by Composition’ by Farewell et al
Maozai Tian, Shaopei Ma, Wei Xiong et al.
Cheng et al's contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Jiangfeng Wang, Keming Yu, Rong Jiang et al.
Andi Wang's contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Andi Q Wang
Archer Gong Zhang's contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Archer Gong Zhang
Si-Yang Li, David van Dyk and Maximilian Autenrieth's contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Si-Yang Li, David A van Dyk, Maximilian Autenrieth
Andrej Srakar’s contribution to the Discussion of ‘Regression by composition’ by Farewell et al
Andrej Srakar
Deconfounding via Profiled Transfer Learning
Fang Yao, Jingyuan Liu, Ziyuan Chen et al.
Variational Nonparametric Inference in Stochastic Block Models with Functional Covariates
Peijun Sang, Zuofeng Shang, Yang Feng et al.
Online Change Rate Learning for Functional Data
Deru Kong, Ping Yu, Tiejun Tong et al.
Fairness-aware Contextual Dynamic Pricing with Strategic Buyers
Will Wei Sun, Pangpang Liu
Sampling from high-dimensional, multimodal distributions using automatically tuned, tempered Hamiltonian Monte Carlo
Joonha Park
Abstract Hamiltonian Monte Carlo (HMC) is widely used for sampling from high-dimensional target distributions with densities known u...
Doubly robust identification for bivariate causal discovery under unmeasured confounding
Rui Duan, Sai Li, Wei Li
Summary Learning causal relationships between pairs of complex traits from observational studies is of great interest in many scient...
Doubly Robust and Efficient Calibration of Prediction Sets for Right-Censored Time-to-Event Outcomes
Arun Kumar Kuchibhotla, Eric Tchetgen Tchetgen, Rebecca Farina
Summary Our objective is to construct well-calibrated prediction sets for a time-to-event outcome subject to right-censoring with gu...
Calibrated Model Criticism Using Split Predictive Checks
Jiawei Li, Jonathan H. Huggins
Optimal Network-Guided Covariate Selection for High-Dimensional Data Integration
Wanjie Wang, Tao Shen
Bayesian Transfer Learning for Enhanced Estimation and Inference
Xiang Li, Oscar Hernan Madrid Padilla, Daoyuan Lai et al.
Locally differentially private two-sample testing
A Kent, T B Berrett, Y Yu
Summary We consider the problem of two-sample testing under a local differential privacy constraint where a permutation procedure is...
Modelling Spatial Density: Data, Methods, and R Applications in Statistics, Econometrics, and Machine Learning.
Ting Fung Ma
Development of Public Health Policy by Digital Twin Microsimulation and Q-learning: A COVID-19 Booster Case Study
Jian Kang, Guoxuan Ma, Sicong Xie et al.
Structured Conformal Inference for Matrix Completion with Applications to Group Recommender Systems
Matteo Sesia, Xin Tong, Ziyi Liang et al.
Selective randomization inference for adaptive experiments
Qingyuan Zhao, Tobias Freidling, Zijun Gao
Abstract Adaptive experiments use preliminary analyses of the data to inform further course of action and are commonly used in many ...
Adaptive Transfer Clustering: A Unified Framework
Kaizheng Wang, Zhongyuan Lyu, Yuqi Gu
Learn then Decide: A Learning Approach for Designing Data Marketplaces
Yong Chen, Yingqi Gao, Wenlu Xu et al.
Automatic debiased machine learning for covariate shifts
V Chernozhukov, M Newey, W K Newey et al.
SUMMARY We present machine learning estimators for causal and predictive parameters under covariate shift, where covariate distribut...
Residual Importance Weighted Transfer Learning for High-dimensional Linear Regression
Chenlei Leng, Junlong Zhao, Shengbin Zheng
Factorial Difference-in-Differences*
Peng Ding, Yiqing Xu, Anqi Zhao
A Statistician’s Overview of Physics-Informed Neural Networks for Spatio-Temporal Data
Christopher K. Wikle, Joshua North, Giri Gopalan et al.
Causal inference under uniformly bounded neighbourhood interference
Xin Lu, Hongzi Li, Hanzhong Liu
Abstract Randomized experiments remain the gold standard for estimating treatment effects; however, network interference compromises...
Intrinsic Riemannian Functional Sufficient Dimension Reduction and Beyond
Chao Ying, Baiyu Chen, Yunchen Li et al.
Collaborative Inference for Sparse High-Dimensional Models with Non-Shared Data
Songshan Yang, Yifan Gu, Hanfang Yang et al.
Z-valued smooth transition GARCH models: Specification and testing
Fukang Zhu, Nuo Xu, Qi Li et al.
Functional-SVD for Heterogeneous Trajectories: Case Studies in Health*
Anru R. Zhang, Jianbin Tan, Pixu Shi
Testing for integer integration in functional time series
Won-Ki Seo, Han Lin Shang
Theory for Identification and Inference with Synthetic Controls: A Proximal Causal Inference Framework
Myeonghun Yu, Xu Shi, Arun Kumar Kuchibhotla et al.
Stationarity of Manifold Time Series
Dehan Kong, Junhao Zhu, Zhaolei Zhang et al.
Leveraging External Data in Rare Disease Trials Through Individualized Discounting Within the Power Prior: A Case Study in Hereditary Angioedema
Claire R. Zhu, Ethan M. Alt, Joseph G. Ibrahim
Testing Independence and Conditional Independence in High Dimensions via Coordinatewise Gaussianization
Qiwei Yao, Jinyuan Chang, Yue Du et al.
A Burden Shared is a Burden Halved: A Fairness-Adjusted Approach to Classification
Bradley Rava, Wenguang Sun, Gareth M. James et al.
Tests for principal eigenvalues and eigenvectors
Jianqing Fan, Yingying Li, Ningning Xia et al.
An AI-powered Bayesian generative modeling approach for causal inference in observational studies
Qiao Liu, Wing Hung Wong
Adaptive Debiased Lasso in High-dimensional Generalized Linear Models with Streaming Data
Ruijian Han, Jian Huang, Yuanyuan Lin et al.
Localized Sparse Principal Component Analysis of Multivariate Time Series in the Frequency Domain
Amita Manatunga, Jamshid Namdari, Fabio Ferrarelli et al.
Adapting to Noise Tails in Private Linear Regression
Wen-Xin Zhou, Jinyuan Chang, Lin Yang et al.
Distribution-Free Signs and Ranks via Optimal Transport under Multivariate Symmetry and Application to One-Sample Location Testing
Bodhisattva Sen, Zhen Huang
Another Look at High-Dimensional Regression in Principal Components Space and The Blessing of Dimensionality
Hui Zou, Yang Song
Covariance test for discretely observed functional data: when and how it works?
Fang Yao, Yang Zhou, Jin Yang
Bayesian Phase 1-2 Designs with Adaptive Rules for Staggering Patient Entry
Shuqi Wang, Peter F. Thall, Ying Yuan et al.
Enhancing Investment Decisions with Sentiment Analysis: A Probabilistic Ranking Framework*
Zheng Tracy Ke, Bryan Kelly, Dacheng Xiu
Local minima of the empirical risk in high dimension: General theorems and convex examples
Andrea Montanari, Kiana Asgari, Basil Saeed
KL-BSS: rethinking optimality for neighbourhood selection in structural equation models
Bryon Aragam, Ming Gao, Wai Ming Tai
Abstract We introduce a new method for neighbourhood selection in linear structural equation models that improves over classical met...
Dirichlet process mixtures of block g priors for model selection and prediction in linear models
Anupreet Porwal, Abel Rodriguez
Weight-calibrated estimation for factor models of high-dimensional time series*
Qiwei Yao, Bo Zhang, Xinghao Qiao et al.
Parametric and nonparametric symmetries in graphical models for extremes
Frank Röttger, Jane Ivy Coons, Alexandros Grosdos
Abstract Coloured graphical models provide a parsimonious approach to modeling high-dimensional data by exploiting symmetries in the...
Rate Optimality and Phase Transition for User-Level Local Differential Privacy
Yi Yu, Thomas B. Berrett, Alexander Kent
A copula graphical model for multi-attribute data using optimal transport
Bing Li, Qi Zhang, Lingzhou Xue
Abstract Motivated by modern data types such as images and multi-view data, the multi-attribute graphical model aims to uncover cond...
Consistent Bayesian Spatial Domain Partitioning Using Predictive Spanning Tree Methods
Huiyan Sang, Kun Huang
Principal stratification with U-statistics under principal ignorability
Fan Li, Xinyuan Chen
Abstract Principal stratification is a popular framework for causal inference in the presence of an intermediate outcome. While the ...
Tail postcoloring in long-run variance estimation of time series
Xu Liu, Kin Wai Chan
Extreme Value Statistics for General Heterogeneous Data through the Average Tail
Yi He, John H.J. Einmahl
Jump Contagion among Stock Market Indices: Evidence from Option Markets*
Roger J. A. Laeven, H. Peter Boswijk, Andrei Lalu et al.
Exact inference via quasi-conjugacy in two-parameter Poisson–Dirichlet hidden Markov models
Marco Dalla Pria, Matteo Ruggiero, Dario Spanò
A semi-parametric model for assessing the effect of temperature on ice accumulation rate from Antarctic ice core data
Radhendushka Srivastava, Debasis Sengupta
Staleness Factors and Volatility Estimation at High Frequencies
Xinbing Kong, Bin Wu, Wuyi Ye
Inference for Deep Neural Network Estimators in Generalized Nonparametric Models
Xuran Meng, Yi Li
Tail risk in the tail: Estimating high quantiles when a related variable is extreme
Natalia Nolde, Chen Zhou, Menglin Zhou
Multi-Scale CUSUM Tests for Time Dependent Spherical Random Fields
Alessia Caponera, Domenico Marinucci, Anna Vidotto
Covariate-Adjusted Response-Adaptive Design with Delayed Outcomes
Waverly Wei, Jingshen Wang, Xinwei Ma
Doubly robust pointwise confidence intervals for a monotonic continuous treatment effect curve
Charles R. Doss
Individualized Dynamic Mediation Analysis Using Latent Factor Models
Annie Qu, Yubai Yuan, Yijiao Zhang et al.
Facilitating heterogeneous effect estimation via statistically efficient categorical modifiers
Daniel R. Kowal
Optimal Differentially Private Ranking from Pairwise Comparisons*
T. Tony Cai, Abhinav Chakraborty, Yichen Wang
Cluster Quilting: Spectral Clustering for Patchwork Learning
Lili Zheng, Andersen Chang, Genevera I. Allen
Enhanced inference for distributions and quantiles of individual treatment effects in various experiments
Zhe Chen, Xinran Li
Consistent Infill Estimability of the Regression Slope Between Gaussian Random Fields Under Spatial Confounding
Abhirup Datta, Michael L. Stein
Trans-Glasso: A Transfer Learning Approach to Precision Matrix Estimation
Boxin Zhao, Mladen Kolar, Cong Ma
Quantifying individual risk for binary outcomes
Peng Ding, Peng Wu, Zhi Geng et al.
Abstract Understanding treatment effect heterogeneity is crucial for reliable decision-making in treatment evaluation and selection....
Inference for structural changes in nonstationary functional time series with partial measurement error
Weichi Wu, Lujia Bai, Qirui Hu
Abstract We study the problem of detecting and localizing change points for a general class of locally stationary functional time se...
Oracle arrays and their use for constructing space-filling designs
Boxin Tang
Abstract The maximin distance is an attractive criterion for constructing space-filling designs. As factors in computer experiments ...
Sample size and power calculations for causal inference in observational studies
Fan Li, Bo Liu, Chengxin Yang
Granulometric Smoothing on Manifolds
Diego Bolón, Rosa María Crujeiras, Alberto Rodríguez-Casal
Scalable and robust regression models for continuous proportional data
David B. Dunson, Changwoo J. Lee, Benjamin K. Dahl et al.
Robust Sliced Inverse Regression: Optimal Estimation for Heavy-Tailed Data in High Dimensions
Jing Zeng, Keqian Min, Qing Mai
A parameterization of anisotropic Gaussian fields with penalized complexity priors
L. Llamazares-Elias, J. Latz, F. Lindgren
A New Construction of Evidence Factors in an Observational Study of Light Daily Alcohol Consumption and Longevity
Paul R. Rosenbaum
Inferences on mixing probabilities and ranking in mixed-membership models
Sohom Bhattacharya, Jianqing Fan, Jikai Hou
Deep P-Spline: Theory, Fast Tuning, and Application
Noah Yi-Ting Hung, Li-Hsiang Lin, Vince D. Calhoun
Power of the lack-of-fit test in designed experiments: Guidance on sample size and the distribution of replicates
R. Dennis Cook, Christopher J. Nachtsheim
Edgeworth Accountant: An Analytical Approach to Differential Privacy Composition
Weijie Su, Hua Wang, Sheng Gao et al.
Robust Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models
Yang Feng, Ye Tian, Haolei Weng et al.
Accounting for Measurement Bias: A New Framework for Reliable Country Ranking in Large-Scale Educational Assessments
Yunxiao Chen, Gongjun Xu, Jing Ouyang et al.
Distributional Off-Policy Evaluation with Deep Quantile Process Regression
Yuling Jiao, Qi Kuang, Chao Wang et al.
Saddlepoint Approximations for Hawkes Jump-Diffusion Processes with an Application to Risk Management*
Roger J. A. Laeven, Yacine Aït-Sahalia
Exploration, Confirmation, and Replication in the Same Observational Study: A Two Team Cross-Screening Approach to Studying the Effect of Unwanted Pregnancy on Mothers’ Later Life Outcomes
Samrat Roy, Marina Bogomolov, Ruth Heller et al.
Contextual Online Uncertainty-Aware Preference Learning for Human Feedback
Junwei Lu, Ethan X. Fang, Nan Lu et al.
Nonparametric tests of treatment effect homogeneity for policy-makers
Mats J. Stensrud, Aaron Hudson, Oliver Dukes et al.
Generalized Grade-of-Membership Estimation for High-dimensional Locally Dependent Data
Ling Chen, Yuqi Gu, Chengzhu Huang
A Regression Framework for Studying Relationships among Attributes under Network Interference
Michael Schweinberger, Subhankar Bhadra, Cornelius Fritz et al.
Cross-validation with antithetic Gaussian randomization
Snigdha Panigrahi, Sifan Liu, Jake A Soloff
Abstract We introduce a new cross-validation (CV) method based on an equicorrelated Gaussian randomization scheme. Our method is wel...
Optimized variance estimation under interference and complex experimental designs
Christopher Harshaw, Joel Middleton, Fredrik Sävje
Sequential model confidence sets
Georgios Gavrilopoulos, Johanna Ziegel, Sebastian Arnold et al.
Abstract In most prediction and estimation situations, scientists consider various statistical models for the same problem, and natu...
An efficient Monte Carlo method for valid prior-free possibilistic statistical inference
Ryan Martin
Clustering Social Media Users Using Categorical-Valued Functional Data Analysis
Xiaoxia Champon, Ana-Maria Staicu, Anthony Weishampel et al.
Seminal Ideas and Controversies in Statistics (by Roderick J. A. Little)
Jonathan P Williams
Robust Microbial Signature Discovery via Post-Selection Inference for Microbiome Compositions
Hongyu Zhao, Tao Wang, Weihao Wang et al.
Efficient Human-in-the-Loop Active Learning: A Novel Framework for Data Labeling in AI Systems
Haoda Fu, Yiran Huang, Jian-Feng Yang
Statistical inference for Gaussian Whittle–Matérn fields on metric graphs
David Bolin, Alexandre B Simas, Jonas Wallin
Abstract Whittle–Matérn fields are a recently introduced class of Gaussian processes on metric graphs, specified as solutions to a f...
Principled Estimation and Prediction with Competing Risks: a Bayesian Nonparametric Approach
Antonio Lijoi, Igor Prünster, Claudio Del Sole
Scalable Bayesian Image-on-Scalar Regression for Population-Scale Neuroimaging Data Analysis
Jian Kang, Yuliang Xu, Timothy D. Johnson et al.
Gaussian and Bootstrap Approximation for Matching-based Average Treatment Effect Estimators
Krishnakumar Balasubramanian, Wolfgang Polonik, Zhaoyang Shi et al.
Active Subsampling for Measurement-Constrained M-Estimation of Individualized Thresholds with High-Dimensional Data
Yang Ning, Jingyi Duan, Lehao Fu
Data assimilation with the $2D$ Navier-Stokes equations: Optimal Gaussian asymptotics for the posterior measure
Richard Nickl, Dimitri Konen
Higher-Order Graphon Theory: Fluctuations, Degeneracies, and Inference
Anirban Chatterjee, Soham Dan, Bhaswar Bikram Bhattacharya
Statistical Impossibility and Possibility of Aligning LLMs with Human Preferences: From Condorcet Paradox to Nash Equilibrium
Weijie Su, Jiancong Xiao, Qi Long et al.
A Ball Divergence-Based Measure for Conditional Independence Testing With a Local Wild Bootstrap
Bhaswar B Bhattacharya, Bilol Banerjee, Anil K Ghosh
Abstract In this paper we introduce a new measure of conditional dependence between two random vectors X and Y given another random ...
Design Stability in Adaptive Experiments: Implications for Treatment Effect Estimation
Koulik Khamaru, Saikat Sengupta, Suvrojit Ghosh et al.
Abstract We study the problem of estimating the average treatment effect under sequentially adaptive treatment assignment mechanisms...
Conformal prediction after data-dependent model selection
Ruiting Liang, Rina Foygel Barber, Wanrong Zhu
Sequential Knockoffs for Variable Selection in Reinforcement Learning
Zhengling Qi, Yunxiao Chen, Jin Zhu et al.
Successive classification learning for estimating quantile optimal treatment regimes
Dehan Kong, Junwen Xia, Jingxiao Zhang
Semiparametric Efficient Fusion of Individual Data and Summary Statistics
Wei Li, Wang Miao, Wenjie Hu et al.
Tree Bandits for Generative Bayes
Veronika Ročková, Sean O’Hagan, Jungeum Kim
Nonparametric predictive inference for discrete data via Metropolis-adjusted Dirichlet sequences
David B. Dunson, Davide Agnoletto, Tommaso Rigon
When a Cusum Stops, What Confidence Is There that the Alarm Is Not False?
Moshe Pollak
An Implementation-Friendly Model-Agnostic Approach for Process Data Analysis
Guanhua Fang, Ruoxin Yuan
Conditional Probability Tensor Decompositions for Multivariate Categorical Response Regression
Xin Zhang, Aaron J. Molstad
Genetically Informed Brain Parcellation Through Structured Multi-Task Modeling
Heping Zhang, Wei Dai, Yisha Yao et al.
Optimizing Sequential Decision Rules for Prostate Cancer Biopsy Management: A Multi-Objective Statistical Framework
Ying-Qi Zhao, Jiaming Qiu, John Wei et al.
An Online Meta-Level Adaptive Design Framework with Targeted Learning Inference: Applications to Evaluating and Utilizing Surrogate Outcomes in Adaptive Designs
Mark van der Laan, Aaron Hudson, Wenxin Zhang et al.
Domain-Specific Nonparametric Regression for Domain Generalization
Jian Huang, Yong Zhou, Yuanyuan Lin et al.
High-dimensional Statistical Inference and Variable Selection Using Sufficient Dimension Association
Shangyuan Ye, Shauna Rakshe, Ye Liang
GS-BART: Bayesian Additive Regression Trees with Graph-split Decision Rules
Shuren He, Huiyan Sang, Quan Zhou
Nonparametric Inference for Balance in Signed Networks
Weijing Tang, Xuyang Chen, Yinjie Wang
SUMMARY In many real-world networks, relationships often go beyond simple dyadic presence or absence; they can be positive, like fri...
Spatial scale-aware tail dependence modeling for high-dimensional spatial extremes
Likun Zhang, Muyang Shi, Mark D. Risser et al.
When Less Is More: Binary Feedback Can Outperform Ordinal Comparisons in Ranking Recovery
Junhui Wang, Shirong Xu, Jingnan Zhang
Mixture Modeling for Temporal Point Processes with Memory
Bruno Sansó, Xiaotian Zheng, Athanasios Kottas
Out-of-distribution generalization under random, dense distributional shifts
Yujin Jeong, Dominik Rothenhäusler
Extracting Interpretable Models from Tree Ensembles: Computational and Statistical Perspectives
Rahul Mazumder, Brian Liu, Peter Radchenko
Stratum order-of-addition designs
Ze Liu, Min-Qian Liu, Liushan Zhou et al.
Abstract Order-of-addition experiments are widely employed in many fields of science and industry to study how the order of componen...
Post-detection inference for sequential changepoint localization
Aaditya Ramdas, Aytijhya Saha
Abstract This article addresses a fundamental but largely unexplored challenge in sequential changepoint analysis: conducting infere...
Causal K-means clustering
Edward H Kennedy, Kwangho Kim, Jisu Kim
Abstract Causal effects are often characterized at the population level, which can mask important heterogeneity across latent subgro...
Gaussian and non-Gaussian Universality of Data Augmentation
Kevin Han Huang, Peter Orbanz, Morgane Austern
Fast Mixing of Data Augmentation Algorithms: Bayesian Probit, Logit, and Lasso Regression
Holden Lee, Kexin Zhang
Estimation beyond Missing (Completely) at Random
Richard J. Samworth, Tengyao Wang, Tianyi Ma et al.
Gaussianized design optimization for covariate balance in randomized experiments
Tengyuan Liang, Wenxuan Guo, Panos Toulis
Abstract Achieving covariate balance in randomized experiments enhances the precision of treatment effect estimation. However, exist...
Translating predictive distributions into informative priors
Andrew A. Manderson, Robert J. B. Goudie
Learning When the Concept Shifts: Confounding, Invariance, and Dimension Reduction
YoonHaeng Hur, Tengyuan Liang, Kulunu Dharmakeerthi
Nonparametric Prior Learning in Differential Equation Modeling
Fang Yao, Junxiong Jia, Deyu Meng et al.
Bayesian Optimization for Branching and Nested Hyperparameters in Deep Learning
Jiazhao Zhang, Chung-Ching Lin, Ying Hung
Maximum binomial likelihood method for multivariate mixture data
Pengfei Li, Tao Yu, Jing Qin
A Physics-Informed Spatiotemporal Deep Learning Framework for Turbulent Systems
Luca Menicali, Andrew Grace, David H. Richter et al.
Pseudo-Maximum Likelihood Theory for High-Dimensional Rank One Inference
Aukosh Jagannath, Curtis Grant, Justin Ko
Efficient Nonparametric Inference for Mediation Analysis with Nonignorable Missing Confounders
Wei Li, Jiawei Shan, Chunrong Ai
Authors’ reply to the Discussion of ‘Augmented balancing weights as linear regression’ by Bruns-Smith et al
David Bruns-Smith, Oliver Dukes, Avi Feller et al.
No-Regret Generative Modeling via Parabolic Monge-Ampère PDE
Tengyuan Liang, Nabarun Deb
Vecchia Gaussian Processes: On Probabilistic and Statistical Properties
Botond Tibor Szabo, Yichen Zhu
Autoregressive networks with dependent edges
Qiwei Yao, Jinyuan Chang, Qin Fang et al.
Abstract We propose an autoregressive framework for modelling dynamic networks with dependent edges. It encompasses models that acco...
Byzantine-tolerant distributed learning of finite mixture models
Yan Shuo Tan, Qiong Zhang, Jiahua Chen
Abstract Traditional statistical methods need to be updated to work with modern distributed data storage paradigms. The split-and-co...
Nonparametric estimators over metric graphs
Aldo Clemente, Eleonora Arnone, Jorge Mateu et al.
Abstract This work discusses a theory of functional spaces over metric graphs that permits the definition of penalized likelihood me...
Tail-robust factor modelling of vector and tensor time series in high dimensions
Haeran Cho, Matteo Barigozzi, Hyeyoung Maeng
Summary We study the problem of factor modelling vector- and tensor-valued time series in the presence of heavy tails in the data, w...
Estimating the number of significant components in high-dimensional PCA
Bo Zhang, Guangming Pan, ZhiXiang Zhang
SUMMARY We consider the problem of estimating the number of significant components in high-dimensional principal component analysis ...
Generalized point process additive models
Bing Li, Kuang-Yao Lee, Jiehuan Sun et al.
Abstract In this article, we propose a generalized point process additive model with a scalar response and high-dimensional point pr...
Robustness and Efficiency of Rosenbaum’s Rank-based Estimator in Randomized Trials: A Design-based Perspective
Bikram Karmakar, Nabarun Deb, Bodhisattva Sen et al.
Summary Mean-based estimators of causal effects in randomized experiments may behave poorly if the potential outcomes have a heavy t...
Revisiting Madigan and Mosurski: Collapsibility via Minimal Separators
Yi Sun, Jianhua Guo, Pei Heng et al.
Abstract Collapsibility provides a principled approach to dimension reduction in contingency tables and graphical models. Madigan &a...
Zihao Wen and David L. Dowe's contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Zihao Wen, David L Dowe
Masaaki Imaizumi's contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Masaaki Imaizumi
Rocco Caprio and Adam Johansen's contribution to the Discussion of 'Statistical exploration of the Manifold Hypothesis' by Whiteley et al
Rocco Caprio, Adam M Johansen
Junhyung Chang and Xiaoyu Lei's contribution to the Discussion of 'Statistical exploration of the Manifold Hypothesis' by Whiteley et al
Junhyung Chang, Xiaoyu Lei
Joshua Agterberg’s contribution to the Discussion of 'Statistical exploration of the Manifold Hypothesis' by Whiteley et al
Joshua Agterberg
Thomas Maullin’s contribution to the Discussion of 'Statistical exploration of the Manifold Hypothesis' by Whiteley et al
Thomas Maullin-Sapey
Authors reply to the Discussion of 'Statistical exploration of the Manifold Hypothesis' by Whiteley et al
Nick Whiteley, Annie Gray, Patrick Rubin-Delanchy
Joshua Cape’s contribution to the Discussion of 'Statistical exploration of the Manifold Hypothesis' by Whiteley et al
Joshua Cape
Poorbita Kundu’s and Johannes Schmidt-Hieber’s contribution to the Discussion of 'Statistical exploration of the Manifold Hypothesis' by Whiteley et al
Johannes Schmidt-Hieber, Poorbita Kundu
Simon et al. Contribution to the Discussion of “Statistical exploration of the Manifold Hypothesis” by Whiteley et al
Emilio Porcu, Horst Simon, Mohammed El-Amine Azz et al.
Melanie Weber’s contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Melanie Weber
Kiho Park, Yo Joong Choe, and Yibo Jiang's contribution to the Discussion of `Statistical exploration of the Manifold Hypothesis' by Whiteley, Gray and Rubin-Delanchy
Kiho Park, Yo Joong Choe, Yibo Jiang
Sanna Passino and Heard’s contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Francesco Sanna Passino, Nicholas A Heard
Michael Trosset's contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Michael W Trosset
Yanbo Tang’s contribution to the discussion of “Statistical exploration of the Manifold Hypothesis” by Whiteley et al
Yanbo Tang
Statistical exploration of the Manifold Hypothesis
Nick Whiteley, Annie Gray, Patrick Rubin-Delanchy
Abstract The Manifold Hypothesis is a widely accepted tenet of Machine Learning which asserts that nominally high-dimensional data a...
Safaa K. Kadhem's contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Safaa K Kadhem
Alexander Modell’s contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Alexander Modell
Wanjie Wang's contribution to the Discussion of `Statistical exploration of the Manifold Hypothesis' by Whiteley et al
Wanjie Wang
Martin Schlather & Milan Stehlik’s contribution to the Discussion of “Statistical exploration of the Manifold Hypothesis” by Whiteley et al
Martin Schlather
Gesine Reinert’s contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Gesine Reinert
Fukang Zhu and Xiangyu Guo’s contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Fukang Zhu, Xiangyu Guo
Modelling with categorical features via exact fusion and sparsity regularization
Kayhan Behdin, Rahul Mazumder, Riade Benbaki et al.
Abstract We study the high-dimensional linear regression problem with categorical predictors that have many levels. We propose a new...
Combining evidence across filtrations
Aaditya Ramdas, Yo Joong Choe
Abstract In sequential anytime-valid inference, any admissible procedure must be based on e-processes: generalizations of test marti...
Searching for local associations while controlling the false discovery rate
Matteo Sesia, Paula Gablenz, Tianshu Sun et al.
Fast and flexible emulation of spatial extremes processes via variational autoencoders
Likun Zhang, Xiaoyu Ma, Christopher K. Wikle et al.
On the Identifying Power of Generalized Monotonicity for Average Treatment Effects*
Yuehao Bai, Shunzhuang Huang, Sarah Moon et al.
Online Statistical Inference for Contextual Bandits via Stochastic Gradient Descent
Xi Chen, Yichen Zhang, Xiangyu Chang et al.
Model privacy: a unified framework for understanding model stealing attacks and defences
Ganghua Wang, Yuhong Yang, Jie Ding
Abstract The use of machine learning (ML) has become increasingly prevalent in various domains, highlighting the importance of under...
Nonparametric inference for censored data using deep neural networks
Guosheng Yin, Jian Huang, Xingqiu Zhao et al.
Abstract We propose a novel deep learning approach to nonparametric statistical inference for the conditional hazard function of sur...
Semiparametric Joint Modeling for Survival Analysis with Longitudinal Covariates
Wensheng Guo, Tianhao Wang
Generalized Multivariate Threshold Autoregressive Models with Linearly Partitioned Threshold Space
Gan Yuan, Chun Yip Yau
Fast convergence rates for estimating the stationary density in SDEs driven by a fractional Brownian motion with semi-contractive drift
Chiara Amorino, Eulalia Nualart, Fabien Panloup et al.
On the efficiency of finely stratified experiments
Yuehao Bai, Jizhou Liu, Azeem Shaikh et al.
No-Regret Generative Modeling via Parabolic Monge-Ampère
Tengyuan Liang, Nabarun Deb
Contextual Dynamic Pricing: Algorithms, Optimality, and Local Differential Privacy Constraints
Feiyu Jiang, Zifeng Zhao, Yi Yu
Systemic and Systematic Risks-Driven Marginal Expected Side-effect
Liujun Chen, Deyuan Li, Zhengjun Zhang
Beyond the mean: limit theory and tests for infinite-mean autoregressive conditional durations
Giuseppe Cavaliere, Thomas Mikosch, Anders Rahbek et al.
Abstract Integrated autoregressive conditional duration (ACD) models serve as counterparts to integrated generalized autoregressive ...
Conditioning on posterior samples for flexible frequentist goodness-of-fit testing
Rina Foygel Barber, Ritwik Bhaduri, Aabesh Bhattacharyya et al.
Summary Tests of goodness of fit are used in nearly every domain where statistics is applied. One powerful and flexible approach is t...
Double cross-fit doubly robust estimators: Beyond series regression
Larry Wasserman, Sivaraman Balakrishnan, Alec McClean et al.
Abstract Double cross-fit doubly robust (DCDR) estimators, which train nuisance function estimators on separate samples, are effecti...
Analyzing cross-trait genetic architecture with the BIGA cloud computing platform*
Fei Xue, Bingxin Zhao, Yujue Li et al.
Bayesian Image Mediation Analysis
Jian Kang, Yuliang Xu, Timothy D Johnson et al.
Efficient Analysis of Latent Spaces in Heterogeneous Networks
Yinqiu He, Jiajin Sun, Yuang Tian
Heterogeneous gene network estimation for single-cell transcriptomic data via a joint regularized deep neural network
Jingyuan Yang, Tao Li, Tianyi Wang et al.
Conditional partial exchangeability: a probabilistic framework for multi-view clustering
Beatrice Franzolini, Maria De Iorio, Johan Eriksson
Differentially private sliced inverse regression in the federated paradigm
Shuaida He, Jiarui Zhang, Xin Chen
Nonparametric bootstrap inference for the eigenvalues of geophysical tensors
Kassel L. Hingee, Janice L. Scealy, Andrew T. A. Wood
Multivariate Analysis for Multiple Network Data via Semi-Symmetric Tensor PCA
Michael Weylandt, George Michailidis
Factor Augmented Matrix Regression
Jianqing Fan, Elynn Chen, Xiaonan Zhu
Subtype-Aware Registration of Longitudinal Electronic Health Records
Xin Gai, Shiyi Jiang, Anru R. Zhang
Estimating Multiple Structural Breaks in Large Panels With Unobserved Heterogeneity*
Wenxin Huang, Yucheng Sun, Yiru Wang
Emerging Knowledge Trend in Statistical Research: A Content-Based Analysis Using Covariate-Assisted Dynamic Topic Model
Feifei Wang, Chenxuan He, Liping Zhu
Structural Identification for Spatio-Temporal Dynamic Models
Cong Cheng, Yuan Ke, Wenyang Zhang et al.
Belted and Ensembled Neural Network for Linear and Nonlinear Sufficient Dimension Reduction
Bing Li, Yin Tang
Linear methods for non-linear inverse problems
Botond Szabo, Aad van der Vaart, Geerten Koers
Limiting laws and consistent estimation criteria for fixed and diverging number of spiked eigenvalues
Ji Zhu, Jingfei Zhang, Jianwei Hu et al.
Mini-batch Estimation for Deep Cox Models: Statistical Foundations and Practical Guidance
Lang Zeng, Weijing Tang, Zhao Ren et al.
Proximal causal inference for conditional separable effects
Chan Park, Mats J Stensrud, Eric J Tchetgen Tchetgen
Abstract Scientists regularly pose questions about treatment effects on outcomes conditional on a posttreatment event. However, caus...
Statistical inference for cell type deconvolution
Lin Gui, Dongyue Xie, Jingshu Wang
Abstract Integrating heterogeneous datasets across different measurement platforms poses fundamental challenges for statistical infe...
Blessing from Human-AI Interaction: Super Policy Learning in Confounded Environments
Zhengling Qi, Jiayi Wang, Chengchun Shi
Data-Driven Knowledge Transfer in Batch Q* Learning
Xi Chen, Elynn Chen, Wenbo Jing
Model to Meaning: How to Interpret Statistical Models with R and Python
Brenda Betancourt
Ball Impurity: Measuring Heterogeneity in General Metric Spaces
Heping Zhang, Ting Li, Xueqin Wang et al.
Portfolio Analysis in High Dimensions with Tracking Error and Weight Constraints
Mehmet Caner, Qingliang Fan
SurvSTAAR: A powerful statistical framework for rare variant analysis of time-to-event traits in large-scale whole-genome sequencing studies
Yidan Cui, Shiyang Ma, Yuxin Yuan et al.
Correction to “Partial Factor Modeling: Predictor-Dependent Shrinkage for Linear Regression”
Variable Significance Testing for the Deep Cox Model
Qixian Zhong, Jonas Mueller, Jane-Ling Wang
Hyperbolic Network Latent Space Model with Learnable Curvature
Jinming Li, Ji Zhu, Gongjun Xu
Adaptive Selection for False Discovery Rate Control Leveraging Symmetry
Linglong Kong, Yuexin Chen, Kehan Wang et al.
Nonparametric Causal Inference for Optogenetics: Sequential Excursion Effects for Dynamic Regimes
Gabriel Loewinger, Alexander W. Levis, Francisco Pereira
Balancing Weights for Causal Inference in Observational Factorial Studies
Peng Ding, Ruoqi Yu
Sparse Gaussianized Canonical Correlation Analysis with Applications to Portfolio Analysis
He Di, Hui Zou
Low-Rank Online Dynamic Assortment with Dual Contextual Information
Will Wei Sun, Yufeng Liu, Seong Jin Lee
A Latent Variable Approach to Learning High-dimensional Multivariate Longitudinal Data
Tony Sit, Yunxiao Chen, Sze Ming Lee
Provably Efficient Posterior Sampling for Sparse Linear Regression via Measure Decomposition
Andrea Montanari, Yuchen Wu
Deep Clustering Evaluation: How to Validate Internal Clustering Validation Measures
Zeya Wang, Chenglong Ye
Network Regression and Supervised Centrality Estimation
Junhui Cai, Dan Yang, Ran Chen et al.
The causal effects of modified treatment policies under network interference
Salvador V Balkus, Scott W Delaney, Nima S Hejazi
Abstract Modified treatment policies are a widely applicable class of interventions useful for studying the causal effects of contin...
Transformers Can Overcome the Curse of Dimensionality: A Theoretical Study from an Approximation Perspective
Yuling Jiao, Yanming Lai, Yang Wang et al.
The Transformer model is widely used in various application areas of machine learning, such as natural language processing. This paper investigates th...
Online Bernstein-von Mises theorem
Jeyong Lee, Minwoo Chae, Junhyeok Choi
Online learning is an inferential paradigm in which parameters are updated incrementally from sequentially available data, in contrast to batch learni...
Covariate-dependent Hierarchical Dirichlet Processes
Sara Wade, Huizi Zhang, Natalia Bochkina
Bayesian hierarchical modeling is a natural framework to effectively integrate data and borrow information across groups. In this paper, we address pr...
DCatalyst: A Unified Accelerated Framework for Decentralized Optimization
Gesualdo Scutari, Tianyu Cao, Xiaokai Chen
We study decentralized optimization over a network of agents, modeled as an undirected graph and operating without a central server. The objective is ...
Boosted Control Functions: Distribution Generalization and Invariance in Confounded Models
Jonas Peters, Niklas Pfister, Sebastian Engelke et al.
Modern machine learning methods and the availability of large-scale data have significantly advanced our ability to predict target quantities from lar...
Contrasting Local and Global Modeling with Machine Learning and Satellite Data: A Case Study Estimating Tree Canopy Height in African Savannas
Esther Rolf, Lucia Gordon, Milind Tambe et al.
While advances in machine learning with satellite imagery (SatML) are facilitating environmental monitoring at a global scale, developing SatML models...
A Symplectic Analysis of Alternating Mirror Descent
Jonas E. Katona, Xiuyuan Wang, Andre Wibisono
Motivated by understanding the behavior of the Alternating Mirror Descent (AMD) algorithm for bilinear zero-sum games, we study the discretization of ...
Two-way Node Popularity Model for Directed and Bipartite Networks
Ting Li, Bing-Yi Jing, Jiangzhou Wang et al.
There has been increasing research attention on community detection in directed and bipartite networks. However, these studies often fail to consider ...
Convergence and complexity of block majorization-minimization for constrained block-Riemannian optimization
Deanna Needell, Laura Balzano, Yuchen Li et al.
Block majorization-minimization (BMM) is a simple iterative algorithm for nonconvex optimization that sequentially minimizes a majorizing surrogate of...
Bayesian Inference of Contextual Bandit Policies via Empirical Likelihood
Jiangrong Ouyang, Mingming Gong, Howard Bondell
Policy inference plays an essential role in the contextual bandit problem. In this paper, we use empirical likelihood to develop a Bayesian inference ...
A causal fused lasso for interpretable heterogeneous treatment effects estimation
Oscar Hernan Madrid Padilla, Yanzhen Chen, Carlos Misael Madrid Padilla et al.
We propose a novel method for estimating heterogeneous treatment effects based on the fused lasso. By first ordering samples based on the propensity o...
Unsupervised Feature Selection via Nonnegative Orthogonal Constrained Regularized Minimization
Defeng Sun, Liping Zhang, Yan Li
Unsupervised feature selection has drawn wide attention in the era of big data, since it serves as a fundamental technique for dimensionality reductio...
Reparameterized Complex-valued Neurons Can Efficiently Learn More than Real-valued Neurons via Gradient Descent
Zhi-Hua Zhou, Jin-Hui Wu, Shao-Qun Zhang et al.
Complex-valued neural networks potentially possess better representations and performance than real-valued counterparts when dealing with some complic...
Hierarchical Causal Models
Eli N. Weinstein, David M. Blei
Causal questions often arise in settings where data are hierarchical: subunits are nested within units. Consider students in schools, cells in patient...
Optimizing Attention with Mirror Descent: Generalized Max-Margin Token Selection
Addison Kristanto Julistiono, Davoud Ataee Tarzanagh, Navid Azizan
Attention mechanisms have revolutionized several domains of artificial intelligence, such as natural language processing and computer vision, by enabl...
Adaptive Forward Stepwise: A Method for High Sparsity Regression
Ivy Zhang, Robert Tibshirani
This paper proposes a sparse regression method that continuously interpolates between Forward Stepwise selection (FS) and the LASSO. When tuned approp...
Optimization and Generalization of Gradient Descent for Shallow ReLU Networks with Minimal Width
Ding-Xuan Zhou, Yunwen Lei, Puyu Wang et al.
Understanding the generalization and optimization of neural networks is a longstanding problem in modern learning theory. The prior analysis often lea...
Finite Neural Networks as Mixtures of Gaussian Processes: From Provable Error Bounds to Prior Selection
Steven Adams, Andrea Patanè, Morteza Lahijanian et al.
Infinitely wide or deep neural networks (NNs) with independent and identically distributed (i.i.d.) parameters have been shown to be equivalent to Gau...
CHANI: Correlation-based Hawkes Aggregation of Neurons with bio-Inspiration
Sophie Jaffard, Samuel Vaiter, Patricia Reynaud-Bouret
The present work aims at proving mathematically that a neural network inspired by biology can learn a classification task thanks to local transformati...
Persistence Diagrams Estimation of Multivariate Piecewise Hölder-continuous Signals
Hugo Henneuse
To our knowledge, the analysis of convergence rates for persistence diagrams estimation from noisy signals has predominantly relied on lifting signal ...
Exploring Novel Uncertainty Quantification through Forward Intensity Function Modeling
Yudong Wang, Cheng Yong Tang, Zhi-Sheng Ye
Predicting future time-to-event outcomes is a foundational task in statistical learning. While various methods exist for generating point predictions,...
Generative Bayesian Inference with GANs
Yuexi Wang, Veronika Rockova
In the absence of explicit or tractable likelihoods, Bayesians often resort to approximate Bayesian computation (ABC) for inference. Our work bridges ...
Communication-efficient Distributed Statistical Inference for Massive Data with Heterogeneous Auxiliary Information
Miaomiao Yu, Zhongfeng Jiang, Jiaxuan Li et al.
Heterogeneous auxiliary information commonly arises in big data due to diverse study settings and privacy constraints. Excluding such indirect evidenc...
Decorrelated Local Linear Estimator: Inference for Non-linear Effects in High-dimensional Additive Models
Zijian Guo, Wei Yuan, Cunhui Zhang
Additive models play an essential role in studying non-linear relationships. Despite many recent advances in estimation, there is a lack of methods an...
Refined Risk Bounds for Unbounded Losses via Transductive Priors
Jian Qian, Alexander Rakhlin, Nikita Zhivotovskiy
We revisit the sequential variants of linear regression with the squared loss, classification problems with hinge loss, and logistic regression, all c...
A Common Interface for Automatic Differentiation
Guillaume Dalle, Adrian Hill
For scientific machine learning tasks with a lot of custom code, picking the right Automatic Differentiation (AD) system matters. Our Julia package Di...
LazyDINO: Fast, Scalable, and Efficiently Amortized Bayesian Inversion via Structure-Exploiting and Surrogate-Driven Measure Transport
Lianghao Cao, Thomas O'Leary-Roseberry, Omar Ghattas et al.
We present LazyDINO, a transport map variational inference method for fast, scalable, and efficiently amortized solutions of high-dimensional nonlinea...
The Distribution of Ridgeless Least Squares Interpolators
Qiyang Han, Xiaocong Xu
The Ridgeless minimum $\ell_2$-norm interpolator in overparametrized linear regression has attracted considerable attention in recent years in both ma...
Nonparametric Estimation of a Factorizable Density using Diffusion Models
Minwoo Chae, Hyeok Kyu Kwon, Dongha Kim et al.
In recent years, diffusion models, and more generally score-based deep generative models, have achieved remarkable success in various applications, in...
Learning Bayesian Network Classifiers to Minimize Class Variable Parameters
Shouta Sugahara, Koya Kato, James Cussens et al.
This study proposes and evaluates a novel Bayesian network classifier which can asymptotically estimate the true probability distribution of the class...
Simulation-based Calibration of Uncertainty Intervals under Approximate Bayesian Estimation
Terrance D. Savitsky, Julie Gershunskaya
The mean field variational Bayes (VB) algorithm implemented in Stan is relatively fast and efficient, making it feasible to produce model-estimated of...
An Anytime Algorithm for Good Arm Identification
Marc Jourdan, Andrée Delahaye-Duriez, Clémence Réda
In good arm identification (GAI), the goal is to identify one arm whose average performance exceeds a given threshold, referred to as a good arm, if i...
Extrapolated Markov Chain Oversampling Method for Imbalanced Text Classification
Aleksi Avela, Pauliina Ilmonen
Text classification is the task of automatically assigning text documents correct labels from a predefined set of categories. In real-life (text) clas...
Neural Network Parameter-optimization of Gaussian Pre-marginalized Directed Acyclic Graphs
Mehrzad Saremi
Finding the parameters of a latent variable causal model is central to causal inference and causal identification. In this article, we show that exist...
Flexible Functional Treatment Effect Estimation
Jiayi Wang, Raymond K. W. Wong, Xiaoke Zhang et al.
We study treatment effect estimation with functional treatments where the average potential outcome functional is a function of functions, in contrast...
Error Analysis for Deep ReLU Feedforward Density-Ratio Estimation with Bregman Divergence
Jian Huang, Siming Zheng, Guohao Shen et al.
We consider the problem of density-ratio estimation using Bregman Divergence with Deep ReLU feedforward neural networks (BDD). We establish non-asympt...
A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design
Zhaoran Wang, Zhuoran Yang, Michael I. Jordan et al.
We study reserve price optimization in multi-phase second price auctions, where the seller's prior actions affect the bidders' later valuations throug...
UQLM: A Python Package for Uncertainty Quantification in Large Language Models
Dylan Bouchard, Mohit Singh Chauhan, David Skarbrevik et al.
Hallucinations, defined as instances where Large Language Models (LLMs) generate false or misleading content, pose a significant challenge that impact...
Nonlinear function-on-function regression by RKHS
Peijun Sang, Bing Li
We propose a nonlinear function-on-function regression model where both the covariate and the response are random functions. The nonlinear regression ...
Nonlocal Techniques for the Analysis of Deep ReLU Neural Network Approximations
Cornelia Schneider, Mario Ullrich, Jan Vybíral
In recent work concerned with the approximation and expressive powers of deep neural networks, Daubechies, DeVore, Foucart, Hanin, and Petrova introdu...
A Data-Augmented Contrastive Learning Approach to Nonparametric Density Estimation
Yuanyuan Lin, Chenghao Li
In this paper, we introduce a data-augmented nonparametric noise contrastive estimation method to density estimation using deep neural networks. By le...
Guaranteed Nonconvex Low-Rank Tensor Estimation via Scaled Gradient Descent
Tong Wu
Tensors, which give a faithful and effective representation to deliver the intrinsic structure of multi-dimensional data, play a crucial role in an in...
skwdro: a library for Wasserstein distributionally robust machine learning
Vincent Florian, Waïss Azizian, Franck Iutzeler et al.
We present skwdro, a Python library for training robust machine learning models. The library is based on distributionally robust optimization using Wa...
Extending Mean-Field Variational Inference via Entropic Regularization: Theory and Computation
Bohan Wu, David M. Blei
Variational inference (VI) has emerged as a popular method for approximate inference for high-dimensional Bayesian models. In this paper, we propose a...
Stochastic Gradient Methods: Bias, Stability and Generalization
Yunwen Lei, Shuang Zeng
Recent developments of stochastic optimization often suggest biased gradient estimators to improve either the robustness, communication efficiency or ...
Classification Under Local Differential Privacy with Model Reversal and Model Averaging
Caihong Qin, Yang Bai
Local differential privacy has become a central topic in data privacy research, offering strong privacy guarantees by perturbing user data at the sour...
Identifying Weight-Variant Latent Causal Models
Mingming Gong, Yuhang Liu, Zhen Zhang et al.
The task of causal representation learning aims to uncover latent higher-level causal variables that affect lower-level observations. Identifying the ...
Efficient frequent directions algorithms for approximate decomposition of matrices and higher-order tensors
Maolin Che, Yimin Wei, Hong Yan
In the framework of the FD (frequent directions) algorithm, we first develop two efficient algorithms for low-rank matrix approximations under the emb...
Online Detection of Changes in Moment-Based Projections: When to Retrain Deep Learners or Update Portfolios?
Ansgar Steland
Training deep learning neural networks often requires massive amounts of computational resources. We propose to sequentially monitor network predicti...
The surrogate Gibbs-posterior of a corrected stochastic MALA: Towards uncertainty quantification for neural networks
Sebastian Bieringer, Gregor Kasieczka, Maximilian F. Steffen et al.
MALA is a popular gradient-based Markov chain Monte Carlo method to access the Gibbs-posterior distribution. Stochastic MALA (sMALA) scales to large d...
PALAR: Estimation of Absolute Abundance Effects in Regression with Relative Abundance Predictors
Xueqin Wang, Yiluan Li, Qiyu Wang et al.
Quasi-Monte Carlo confidence intervals using quantiles of randomized nets
Zexin Pan
Unbiased kinetic Langevin Monte Carlo with inexact gradients
Neil K. Chada, Benedict Leimkuhler, Daniel Paulin et al.
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yifan Cui, Zhengling Qi, Yuhan Li et al.
Deep Discrete Encoders: Identifiable Deep Generative Models for Rich Data with Discrete Latent Layers
Yuqi Gu, Seunghyun Lee
Meta-Learning with Generalized Ridge Regression: High-dimensional Asymptotics, Optimality and Hyper-covariance Estimation
Krishnakumar Balasubramanian, Yanhao Jin, Debashis Paul
Measuring Evidence against Exchangeability and Group Invariance with E-values
Nick Koning
Privacy Guarantees in Posterior Sampling under Contamination
Shenggang Hu, Louis Aslett, Hongsheng Dai et al.
Winner’s Curse Free Robust Mendelian Randomization with Summary Data
Zhongming Xie, Wanheng Zhang, Jingshen Wang et al.
Scalable community detection in massive networks via predictive assignment
Subhankar Bhadra, Marianna Pensky, Srijan Sengupta
Anytime validity is free: inducing sequential tests
Nick W Koning, Sam van Meer
Abstract Anytime valid sequential tests permit us to stop testing based on the current data, without invalidating the inference. Giv...
The synthetic instrument: from sparse association to sparse causation
Dehan Kong, Dingke Tang, Linbo Wang
Abstract In many observational studies, researchers are often interested in the effects of multiple exposures on a single outcome. S...
Spatial Prediction of Local Soil Erosion Distribution in the Wasserstein Space
Jiaming Qiu, Xiongtao Dai, Zhengyuan Zhu et al.
Adversarial Estimation of Riesz Representers
Rahul Singh, Victor Chernozhukov, Whitney K. Newey et al.
A Unified Framework for Estimation of High-dimensional Conditional Factor Models
Qihui Chen
Bayesian nonparametric spectral analysis of locally stationary processes*
Yifu Tang, Claudia Kirch, Jeong Eun Lee et al.
Principal Component Analysis for max-stable distributions
Felix Reinbott, Anja Janßen
Scalable Estimation of Multinomial Response Models with Random Consideration Sets
Siddhartha Chib, Kenichi Shimizu
Bayesian Nonparametric Quasi Likelihood
Antonio R. Linero
A Conditional Ordinal Stereotype Model to Estimate Police Officers’ Propensity to Escalate Force
Greg Ridgeway
Enhanced power enhancements for testing many moment equalities: Beyond the 2- and ∞-norm
Anders Bredahl Kock, David Preinerstorfer
Covariate-Elaborated Robust Partial Information Transfer with Conditional Spike-and-Slab Prior
Annie Qu, Yijiao Zhang, Ruqian Zhang et al.
Perturbation Analysis of Randomized SVD and its Applications to Statistics
Yichi Zhang, Minh Tang
The impact of job stability on monetary poverty in Italy: causal small area estimation
Dehan Kong, Nicola Salvati, Katarzyna Reluga et al.
Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy
Runze Li, Zhaoran Wang, Zhuoran Yang et al.
Estimating the False Discovery Rate of Variable Selection
William Fithian, Yixiang Luo, Lihua Lei
DiPMInd: Distance Profile based Mutual Independence testing for random objects
Yaqing Chen, Paromita Dubey
Non-parametric efficient estimation of marginal structural models with continuous time-varying treatments
A Martin, M Santacatterina, I Díaz
Summary Marginal structural models are a popular method for estimating causal effects in the presence of time-varying exposures. In ...
An average-case sensitivity analysis for unmeasured confounding
Qingyuan Zhao, Yao Zhang
Summary Sensitivity analysis for the unconfoundedness assumption is crucial in observational studies. For this purpose, the marginal...
Bounds on causal effects in $2^K$ factorial experiments with non-compliance
M Blackwell, N E Pashley
Summary Factorial experiments are ubiquitous in the social and biomedical sciences, but when units fail to comply with each assigned...
High-dimensional covariance estimation by pairwise likelihood truncation
A Casa, D Ferrari, Z Huang
Abstract Pairwise likelihood is an approximation of the full likelihood function that facilitates the analysis of high-dimensional c...
Estimating Ratios of Means of Multicategory Data Observed with Sample and Category Perturbations
D S Clausen, S V Teichman, A D Willis
Summary We consider the problem of estimating ratios of means of a multivariate outcome across covariates when the data are observed ...
Higher criticism for rare and weak non-proportional hazard deviations in survival analysis
A Kipnis and others
Post-selection inference for causal effects after causal discovery
T Chang and others
Decomposing Gaussians with Unknown Covariance
A Dharamshi and others
Leveraging External Data for Testing Experimental Therapies with Biomarker Interactions in Randomized Clinical Trials
B Ren and others
Spectral estimation for point processes and random fields
J P Grainger, T A Rajala, D J Murrell et al.
Summary Spatial variables can be observed in many different forms, such as regularly sampled random fields (lattice data), point pro...
Testing for latent structure via the Wilcoxon–Wigner random matrix of normalized rank statistics
Joshua Cape, Jonquil Z Liao
Summary This paper considers the problem of testing for latent structure in large symmetric data matrices. The goal here is to devel...
Assumption-Lean Post-Integrated Inference with Surrogate-Control Outcomes
Larry Wasserman, Jin-Hong Du, Kathryn Roeder
Summary Data integration methods aim to extract low-dimensional embeddings from high-dimensional outcomes to remove unwanted variati...
Calibrated sensitivity models
A McClean, Z Branson, E H Kennedy
Abstract In causal inference, sensitivity models are used to assess how unmeasured confounders could alter causal analyses, but the ...
Diaconis–Ylvisaker prior penalized likelihood for $p/n \to \kappa \in (0,1)$ logistic regression
P Sterzinger, I Kosmidis
Summary We characterize the behaviour of the maximum Diaconis–Ylvisaker prior penalized likelihood estimator in high-dimensional log...
Extremal correlation coefficient for functional data
M Kim and P Kokoszka
Planning for gold: Hypothesis screening with split samples for valid powerful testing in matched observational studies
William Bekerman and others
Uniform inference in linear mixed models
Karl Oskar Ekvall and Matteo Bottai
Functional Principal Component Analysis for Sparse Censored Data
Caitrin Murphy, Eric Laber, Rhonda Merwin et al.
Summary Functional principal component analysis is a key tool in the study of functional data, driving both exploratory analyses and...
Harnessing The Collective Wisdom: Fusion Learning Using Decision Sequences from Diverse Sources
T Banerjee and others
Asymptotic Validity and Finite-Sample Properties of Approximate Randomization Tests
P Toulis
Abstract Randomization tests rely on simple data transformations and possess an appealing robustness property. In addition to being ...
Characterizing extremal dependence on a hyperplane
P Wan
Summary In this paper, we characterize the extremal dependence of d asymptotically dependent variables using a class of random vecto...
Palm distributions of superposed point processes for statistical inference
M Beraha, F Camerlenghi, L Ghilotti
Abstract Palm distributions play a central role in the study of point processes and their associated summary statistics. In this wor...
Model-free selective inference under covariate shift via weighted conformal p-values
Ying Jin and Emmanuel J Candès
A family of toroidal diffusions with exact likelihood inference
E García-Portugués and M Sørensen
Dynamic covariate balancing: estimating treatment effects over time with potential local projections
Jelena Bradic, Davide Viviano
Abstract This article concerns the estimation and inference of treatment effects in panel data settings when treatments change dynam...
Dynamic clustering for heterophilic stochastic block models with time-varying node memberships
K Z Lin, J Lei
Summary We consider a time-ordered sequence of networks stemming from stochastic block models in which nodes gradually change their ...
Parameterising the effect of a continuous treatment using average derivative effects
Stijn Vansteelandt, Oliver J Hines, Karla Diaz-Ordaz
Abstract The average treatment effect (ATE) is commonly used to quantify the main effect of a binary treatment on an outcome. Extens...
Sequential Gibbs Posteriors with Applications to Principal Component Analysis
David B Dunson, Steven Winter, Omar Melikechi
Abstract Gibbs posteriors are proportional to a prior distribution multiplied by an exponentiated loss function, with a key tuning p...
Asymptotics for a class of parametric martingale posteriors
E Fong, A Yiu
Summary The martingale posterior framework replaces the elicitation of the likelihood and prior with that of a sequence of one-step-...
Treatment Choice with Nonlinear Regret
Toru Kitagawa, Sokbae Lee, Chen Qiu
Abstract Following Savage (1951) and Manski (2004), the literature of statistical treatment choice focuses on the mean of welfare re...
Generalized Fréchet means with random minimizing domains and its strong consistency
Jaesung Park, Sungkyu Jung
Abstract This paper introduces a novel extension of Fréchet means, referred to as generalized Fréchet means, as a comprehensive fram...
Geodesic Optimal Transport Regression
Hans-Georg Müller, Changbo Zhu
Abstract Classical regression models do not cover non-Euclidean data that reside in a general metric space, while the current litera...
A frequentist local false discovery rate
William Fithian, Daniel Xiang, Jake A Soloff
Abstract The local false discovery rate (lfdr) of Efron et al. (2001) enjoys major conceptual and decision-theoretic advantages over...
Design-based Causal Inference for Incomplete Block Designs
Taehyeon Koo, Nicole E Pashley
Abstract Researchers often turn to block randomization to increase the precision of their inference or due to practical consideratio...
Comparing causal parameters with many treatments and positivity violations
A McClean, Y Li, S Bae et al.
Summary Comparing outcomes across treatments is essential in medicine and public policy. To do so, researchers typically estimate a ...
A spectral method for multi-view subspace learning using the product of projections
R Sergazinov, A Taeb, I Gaynanova
Summary Multi-view data provides complementary information on the same set of observations, with multi-omics and multimodal sensor d...
Structural restrictions in local causal discovery: identifying direct causes of a target variable
J Bodik, V Chavez-Demoulin
Randomization-Based Confidence Sets for the Local Average Treatment Effect
P M Aronow, Haoge Chang, Patrick Lopatto
Summary We consider the problem of generating confidence sets in randomized experiments with noncompliance. We show that a refinemen...
Geodesic slice sampling on Riemannian manifolds
Alain Durmus, Samuel Gruffaz, Mareike Hasenpflug et al.
Summary We propose a theoretically justified and practically applicable slice-sampling-based Markov chain Monte Carlo method for app...
Parallel computations for Metropolis Markov chains with Picard maps
G Zanella, S Grazzi
Abstract We develop parallel algorithms for simulating zeroth-order (also known as gradient-free) Metropolis Markov chains based on ...
Finding Distributions that Differ, with False Discovery Rate Control
Edgar Dobriban, Eric Tchetgen Tchetgen, Yonghoon Lee
Summary We consider the problem of comparing a reference distribution with several other distributions. Given a sample from both the...
Sparse higher order partial least squares for simultaneous variable selection, dimension reduction and tensor denoising
Kwangmoon Park, Sündüz Keleş
Abstract Motivated by the challenge of estimating effects of DNA methylation on 3D genomic contacts captured by multi-modal single c...
On the consistency of bootstrap for matching estimators
Ziming Lin, Fang Han
Abstract In a landmark paper, Abadie and Imbens (2008) showed that the naive bootstrap is inconsistent when applied to nearest neighbour ma...
Bias Control for M-quantile-based Small Area Estimators
Francesco Schirripa Spagnolo, Nicola Salvati, Gaia Bertarelli et al.
Characteristic function-based tests for spatial randomness
Yiran Zeng, Dale L Zimmerman
Abstract We introduce a new type of test for complete spatial randomness that applies to mapped point patterns in a rectangle or a c...
Regression graphs and sparsity-inducing reparametrizations
J Rybak and others
Identification and estimation of interaction effects in nonparametric additive regression
Seung Hyun Moon and others
Spatial self-confounding: Smoothness-related estimation bias in spatial regression models
David Bolin and Jonas Wallin
Inferring manifolds using Gaussian processes
David B Dunson, Nan Wu
It is often of interest to infer lower-dimensional structure underlying complex data. As a flexible class of nonlinear structures, it is common to foc...
Pitman efficiency lower bounds for multivariate distribution-free tests based on optimal transport
Nabarun Deb, Bhaswar B Bhattacharya, Bodhisattva Sen
Abstract The Wilcoxon rank sum test is one of the most popular distribution-free two-sample tests for univariate data. Among the imp...
Liyang Sun's contribution to the Discussion of ‘Augmented balancing weights as linear regression’ by Bruns-Smith et al
Liyang Sun
Inference on function-valued parameters using a restricted score test
Marco Carone, Ali Shojaie, Aaron Hudson
Abstract It is often of interest to make inference on an unknown function that is a local parameter of the data-generating mechanism...
Penalized empirical likelihood over decentralized networks
Jinye Du, Qihua Wang
Abstract Empirical likelihood encounters serious computational challenges when applied to massive datasets or multiple data sources ...
Online kernel CUSUM for change-point detection
Song Wei, Yao Xie
Abstract We present a computationally efficient online kernel Cumulative Sum method for change-point detection that utilizes the max...
Statistical Inference for Mediation Models with High Dimensional Exposures and Mediators
Jian Kang, Xinyu Zhang, Wei Zhou et al.
Towards Interpretable Deep Generative Models via Causal Representation Learning
Gemma Moran, Bryon Aragam
Doss and Huling's contribution to the Discussion of ‘Augmented balancing weights as linear regression' by Bruns-Smith et al
Charles R Doss, Jared D Huling
ART: distribution-free and model-agnostic changepoint detection with finite-sample guarantees
Guanghui Wang, Changliang Zou, Xiaolong Cui et al.
Abstract We introduce ART, a distribution-free and model-agnostic framework for changepoint analysis with finite-sample guarantees. ...
Zhu Shen and José R. Zubizarreta’s contribution to the Discussion of ‘Augmented balancing weights as linear regression' by Bruns-Smith et al
Zhu Shen, José R Zubizarreta
Towards Better Statistical Understanding of Watermarking LLMs
Zhongze Cai, Shang Liu, Hanzhao Wang et al.
Tian, Liu and Tan's contribution to the Discussion of ‘Augmented balancing weights as linear regression’ by Bruns-Smith et al
Maozai Tian, Shuo Liu, Tan Meng
Uniform Estimation and Inference for Nonparametric Partitioning-Based M-Estimators
Matias D. Cattaneo, Yingjie Feng, Boris Shigida
Local geometry of high-dimensional mixture models: Effective spectral theory and dynamical transitions
Aukosh Jagannath, Gerard Ben Arous, Reza Gheissari et al.
Optimal Integrative Estimation for Distributed Precision Matrices with Heterogeneity Adjustment
Yinrui Sun, Yin Xia
Rotnitzky, Smucler and Robins contribution to the Discussion of ‘Augmented balancing weights as linear regression’ by Bruns-Smith et al
Andrea Rotnitzky, Ezequiel Smucler, James M Robins
Scalability of Metropolis-within-Gibbs schemes for high-dimensional Bayesian models
Giacomo Zanella, Filippo Ascolani, Gareth O Roberts
Abstract We study general coordinate-wise Markov chain Monte Carlo schemes (such as Metropolis-within-Gibbs samplers), which are com...
Skew-symmetric approximations of posterior distributions
Daniele Durante, Botond Szabo, Francesco Pozza
Abstract Popular deterministic approximations of posterior distributions from, e.g. the Laplace method, variational Bayes and expect...
Jiangfeng Wang, Keming Yu and Rong Jiang's contribution to the Discussion of ‘Augmented balancing weights as linear regression’ by Bruns-Smith et al
Jiangfeng Wang, Keming Yu, Rong Jiang
Shan, Ying and Zhao’s contribution to the Discussion of ‘Augmented balancing weights as linear regression' by Bruns-Smith et al
Jiwei Zhao, Jiawei Shan, Chao Ying
Proposer of the vote of thanks to Bruns-Smith et al. and contribution to the Discussion of ‘Augmented balancing weights as linear regression'
Lin Liu
Cheng and Tong’s contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Bing Cheng, Howell Tong
Andrew Gelman’s contribution to the discussion of “Statistical exploration of the manifold hypothesis” by Whiteley et al
Andrew Gelman
M. Stehlík and M. Schlather's contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Martin Schlather, Milan Stehlík
Ian Gallagher’s contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Ian Gallagher
Multiple randomization designs: estimation and inference with interference
Lorenzo Masoero, Suhas Vijaykumar, Thomas S Richardson et al.
Abstract Completely randomized experiments, originally developed by Fisher and Neyman in the 1930s, are still widely used in practic...
Alberto Bordino and Olga Klopp’s contribution to the Discussion of “Statistical exploration of the Manifold Hypothesis” by Whiteley et al
Alberto Bordino, Olga Klopp
Safaa K. Kadhem's contribution to the Discussion of ‘Augmented balancing weights as linear regression' by Bruns-Smith et al
Safaa K Kadhem
Random pairing MLE for estimation of item parameters in Rasch model
Yuepeng Yang, Cong Ma
Efficient Optimization of Plasma Radiation Detector Configurations using Imperfect Inference Models
Difan Song, William E. Lewis, Patrick F. Knapp et al.
A factor-copula latent-vine time series model for extreme flood insurance losses
Xiaoting Li, Harry Joe, Christian Genest
Frequency-Band Estimation of the Number of Factors*
Marco Avarucci, Maddalena Cavicchioli, Mario Forni et al.
Posterior risk of modular and semi-modular Bayesian inference
David J. Nott, David T. Frazier
Impact of existence and nonexistence of pivot on the coverage of empirical best linear prediction intervals for small areas
Yuting Chen, Masayo Y. Hirose, Partha Lahiri
Functional Partial Least-Squares: Adaptive Estimation and Inference*
Andrii Babii, Marine Carrasco, Idriss Tsafack
Reconstruct Ising Model with Global Optimality via SLIDE*
Heping Zhang, Xueqin Wang, Jin Zhu et al.
Tian, Ma, Yu and Hu’s Contribution to the Discussion of ‘Statistical Exploration of the Manifold Hypothesis’ by Whiteley et al
Maozai Tian, Shaopei Ma, Zhen Yu et al.
Seconder of the vote of thanks to Bruns-Smith et al. and contribution to the Discussion of ‘Augmented balancing weights as linear regression'
Andrej Srakar
Supriya Tiwari and Pallavi Basu's contribution to the Discussion of ‘Augmented balancing weights as linear regression’ by Bruns-Smith et al
Supriya Tiwari, Pallavi Basu
Professor Garib Nath Singh’s contribution to the Discussion of “Statistical exploration of the Manifold Hypothesis” by Nick Whiteley et al
Garib Nath Singh
Professor Garib Nath Singh’s contribution to the Discussion of ‘Augmented balancing weights as linear regression' by Bruns-Smith et al
Garib Nath Singh
N.T. Longford's contribution to the Discussion of ‘Augmented balancing weights as linear regression' by Bruns-Smith et al
Nicholas T Longford
Yinqiu He’s contribution to the Discussion of ‘Statistical exploration of the Manifold Hypothesis’ by Whiteley et al
Yinqiu He
Dr Arun Chind’s contribution to the Discussion of Statistical exploration of the Manifold Hypothesis by Whiteley, et al
Arun Peter Chind
Causal Inference in Pharmaceutical Statistics.
Ashley L. Buchanan
A novel statistical approach to analyze image classification
Juntong Chen, Sophie Langer, Johannes Schmidt-Hieber
Identification and multiply robust estimation of causal effects via instrumental variables from an auxiliary population
Peng Ding, Zhi Geng, Wei Li et al.
Online Auction Design Using Distribution-Free Uncertainty Quantification with Applications to E-Commerce
Jiale Han, Xiaowu Dai
Bayesian Image Analysis in Fourier Space
John Kornak, Karl Young, Eric Friedman et al.
Localizing Strictly Proper Scoring Rules*
Ramon F. A. de Punder, Cees G. H. Diks, Roger J. A. Laeven et al.
Statistical Inference in Tensor Completion: Optimal Uncertainty Quantification and Statistical-to-Computational Gaps
Dong Xia, Wanteng Ma
Dual Induction CLT for High-dimensional m-dependent Data
Heejong Bong, Arun Kumar Kuchibhotla, Alessandro Rinaldo
Minimax optimal seriation in polynomial time
Yann Issartel, Christophe Giraud, Nicolas Verzelen
SPARCC: Semi-Parametric Robust Estimation in a Right-Censored Covariate Model
Seong-ho Lee, Brian D. Richardson, Yanyuan Ma et al.
Causality-oriented robustness: exploiting general noise interventions in linear structural causal models
Peter Bühlmann, Armeen Taeb, Xinwei Shen
Consistent least squares estimation in population-size-dependent branching processes
Peter Braunsteins, Sophie Hautphenne, Carmen Minuesa
When does bottom-up beat top-down in hierarchical community detection?
Maximilien Dreveton, Daichi Kuroda, Matthias Grossglauser et al.
Towards Understanding Gradient Flow Dynamics of Homogeneous Neural Networks Beyond the Origin
Akshay Kumar, Jarvis Haupt
Recent works exploring the training dynamics of homogeneous neural network weights under gradient flow with small initialization have established that...
Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity
Jie Peng, Qing Ling, Qiankun Shi et al.
In this paper, we establish tight lower bounds for Byzantine-robust distributed first-order stochastic methods in both strongly convex and non-convex ...
Towards Unified Native Spaces in Kernel Methods
Xavier Emery, Emilio Porcu, Moreno Bevilacqua
There exists a plethora of parametric models for positive definite kernels in Euclidean spaces, and their use is ubiquitous in statistics, machine lea...
TorchCP: A Python Library for Conformal Prediction
Jianguo Huang, Jianqing Song, Xuanning Zhou et al.
Conformal prediction (CP) is a powerful statistical framework that generates prediction intervals or sets with guaranteed coverage probability. While ...
Hopfield-Fenchel-Young Networks: A Unified Framework for Associative Memory Retrieval
Saul Santos, Vlad Niculae, Daniel McNamee et al.
Associative memory models, such as Hopfield networks and their modern variants, have garnered renewed interest due to advancements in memory capacity ...
Identifiability of Causal Graphs under Non-Additive Conditionally Parametric Causal Models
Juraj Bodik, Valérie Chavez-Demoulin
Existing approaches to causal discovery often rely on restrictive modeling assumptions that limit their applicability in real-world settings, particul...
Fundamental Limits of Membership Inference Attacks on Machine Learning Models
Elisabeth Gassiat, Eric Aubinais, Pablo Piantanida
Membership inference attacks (MIA) can reveal whether a particular data point was part of the training dataset, potentially exposing sensitive informa...
On the Robustness of Kernel Goodness-of-Fit Tests
François-Xavier Briol, Xing Liu
Goodness-of-fit testing is often criticized for its lack of practical relevance: since "all models are wrong", the null hypothesis that the data confo...
Efficient Online Prediction for High-Dimensional Time Series via Joint Tensor Tucker Decomposition
Defeng Sun, Zhenting Luan, Haoning Wang et al.
Real-time prediction plays a vital role in various control systems, such as traffic congestion control and wireless channel resource allocation. In th...
Fast Computation of Superquantile-Constrained Optimization Through Implicit Scenario Reduction
Ying Cui, Jake Roth
Superquantiles have recently gained significant interest as a risk-aware metric for addressing fairness and distribution shifts in statistical learnin...
Collaborative likelihood-ratio estimation over graphs
Nicolas Vayatis, Alejandro de la Concha, Argyris Kalogeratos
This paper introduces the Collaborative Likelihood-ratio Estimation problem, which is relevant for applications involving multiple statistical estimat...
On the Utility of Equal Batch Sizes for Inference in Stochastic Gradient Descent
Dootika Vats, Rahul Singh, Abhinek Shukla
Stochastic gradient descent (SGD) is an estimation tool for large data employed in machine learning and statistics. Due to the Markovian nature of the...
Differentially Private Bootstrap: New Privacy Analysis and Inference Strategies
Jordan Awan, Zhanyu Wang, Guang Cheng
Differentially private (DP) mechanisms protect individual-level information by introducing randomness into the statistical analysis procedure. Despite...
Convergence and Sample Complexity of Natural Policy Gradient Primal-Dual Methods for Constrained MDPs
Dongsheng Ding, Kaiqing Zhang, Jiali Duan et al.
We study the sequential decision making problem of maximizing the expected total reward while satisfying a constraint on the expected total utility. ...
Differentially Private Multivariate Medians
Kelly Ramsay, Aukosh Jagannath, Shoja'eddin Chenouri
Statistical tools which satisfy rigorous privacy guarantees are necessary for modern data analysis. It is well-known that robustness against contamina...
VFOSA: Variance-Reduced Fast Operator Splitting Algorithms for Generalized Equations
Quoc Tran-Dinh
We develop two Variance-reduced Fast Operator Splitting Algorithms (VFOSA) to approximate solutions for a class of generalized equations, covering fun...
Scaling Capability in Token Space: An Analysis of Large Vision Language Model
Tenghui Li, Guoxu Zhou, Xuyang Zhao et al.
Large language models have demonstrated predictable scaling behaviors with respect to model parameters and training data. This study investigates whe...
Minimax Optimal Two-Sample Testing under Local Differential Privacy
Ilmun Kim, Jongmin Mun, Seungwoo Kwak
We explore the trade-off between privacy and statistical utility in private two-sample testing under local differential privacy (LDP) for both multino...
Jackpot: Approximating Uncertainty Domains with Adversarial Manifolds
Nathanaël Munier, Emmanuel Soubies, Pierre Weiss
Given a forward mapping Φ : R^N → R^M and a point x* ∈ R^N , the region {x ∈ R^N , ||Φ(x) − Φ(x*)|| ≤ ε}, where ε ≥ 0 is a perturbation amplitude, rep...
An Asymptotically Optimal Coordinate Descent Algorithm for Learning Bayesian Networks from Gaussian Models
Tong Xu, Armeen Taeb, Simge Küçükyavuz et al.
This paper studies the problem of learning Bayesian networks from continuous observational data, generated according to a linear Gaussian structural e...
Convergence Rates for Non-Log-Concave Sampling and Log-Partition Estimation
Francis Bach, David Holzmüller
Sampling from Gibbs distributions and computing their log-partition function are fundamental tasks in statistics, machine learning, and statistical ph...
A Unified Framework to Enforce, Discover, and Promote Symmetry in Machine Learning
Samuel E. Otto, Nicholas Zolman, J. Nathan Kutz et al.
Symmetry is present throughout nature and continues to play an increasingly central role in machine learning. In this paper, we provide a unifying the...
Infinite-dimensional Mahalanobis Distance with Applications to Kernelized Novelty Detection
Nikita Zozoulenko, Thomas Cass, Lukas Gonon
The Mahalanobis distance is a classical tool used to measure the covariance-adjusted distance between points in $\mathbb{R}^d$. In this work, we exten...
Stable learning using spiking neural networks equipped with affine encoders and decoders
A. Martina Neuman, Dominik Dold, Philipp Christian Petersen
We study the learning problem associated with spiking neural networks. Specifically, we focus on spiking neural networks composed of simple spiking ne...
Efficient Knowledge Deletion from Trained Models Through Layer-wise Partial Machine Unlearning
Vinay Chakravarthi Gogineni, Esmaeil S. Nadimi
Machine unlearning has garnered significant attention due to its ability to selectively erase knowledge obtained from specific training data samples i...
General Loss Functions Lead to (Approximate) Interpolation in High Dimensions
Kuo-Wei Lai, Vidya Muthukumar
We provide a unified framework that applies to a general family of convex losses across binary and multiclass settings in the overparameterized regime...
Piecewise deterministic sampling with splitting schemes
Andrea Bertazzi, Paul Dobson, Pierre Monmarché
We introduce Markov chain Monte Carlo (MCMC) algorithms based on numerical approximations of piecewise-deterministic Markov processes obtained with th...
Hierarchical and Stochastic Crystallization Learning: Geometrically Leveraged Nonparametric Regression with Delaunay Triangulation
Guosheng Yin, Jiaqi Gu
High-dimensionality is known to be the bottleneck for both nonparametric regression and the Delaunay triangulation. To efficiently exploit the advanta...
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
Yuri Chervonyi, Trieu H. Trinh, Miroslav Olšák et al.
We present AlphaGeometry2, a significantly improved version of AlphaGeometry introduced in Nature, 625 (7995):476, 2024, which has now surpassed an av...
Decentralized Bilevel Optimization: A Perspective from Transient Iteration Complexity
Xinmeng Huang, Kun Yuan, Boao Kong et al.
Stochastic bilevel optimization (SBO) is becoming increasingly essential in machine learning due to its versatility in handling nested structures. To ...
Fair Text Classification via Transferable Representations
Thibaud Leteno, Michael Perrot, Charlotte Laclau et al.
Group fairness is a central research topic in text classification, where reaching fair treatment between sensitive groups (e.g., women and men) remain...
Stochastic Interior-Point Methods for Smooth Conic Optimization with Applications
Chuan He, Zhanwang Deng
Conic optimization plays a crucial role in many machine learning (ML) problems. However, practical algorithms for conic constrained ML problems with l...
Revisiting Gradient Normalization and Clipping for Nonconvex SGD under Heavy-Tailed Noise: Necessity, Sufficiency, and Acceleration
Kun Yuan, Tao Sun, Xinwang Liu
Gradient clipping has long been considered essential for ensuring the convergence of Stochastic Gradient Descent (SGD) in the presence of heavy-tailed...
Generalized multi-view model: Adaptive density estimation under low-rank constraints
Julien Chhor, Olga Klopp, Alexandre B. Tsybakov
We study the problem of bivariate discrete or continuous probability density estimation under low-rank constraints. For discrete distributions, we ass...
(De)-regularized Maximum Mean Discrepancy Gradient Flow
Arthur Gretton, Zonghao Chen, Aratrika Mustafi et al.
We introduce a (de)-regularization of the Maximum Mean Discrepancy (DrMMD) and its Wasserstein gradient flow. Existing gradient flows that transport s...
On Probabilistic Embeddings in Optimal Dimension Reduction
Ryan Murray, Adam Pickarski
Dimension reduction algorithms are essential in data science for tasks such as data exploration, feature selection, and denoising. However, many non-l...
Physics Informed Kolmogorov-Arnold Neural Networks for Dynamical Analysis via Efficient-KAN and WAV-KAN
Subhajit Patra, Sonali Panda, Bikram Keshari Parida et al.
Physics-informed neural networks have proven to be a powerful tool for solving differential equations, leveraging the principles of physics to inform ...
Graph-accelerated Markov Chain Monte Carlo using Approximate Samples
Leo L. Duan, Anirban Bhattacharya
It has become increasingly easy nowadays to collect approximate posterior samples via fast algorithms such as variational Bayes, but concerns exist ab...
Online Quantile Regression
Dong Xia, Wen-Xin Zhou, Yinan Shen
This paper addresses the challenge of integrating sequentially arriving data into the quantile regression framework, where the number of features may ...
Statistical Inference of Random Graphs With a Surrogate Likelihood Function
Fangzheng Xie, Dingbo Wu
Spectral estimators have been broadly applied to statistical network analysis, but they do not incorporate the likelihood information of the network s...
On the Representation of Pairwise Causal Background Knowledge and Its Applications in Causal Inference
Zhuangyan Fang, Ruiqi Zhao, Yue Liu et al.
Pairwise causal background knowledge about the existence or absence of causal edges and paths is frequently encountered in observational studies. Such...
An Augmentation Overlap Theory of Contrastive Learning
Qi Zhang, Yifei Wang, Yisen Wang
Recently, self-supervised contrastive learning has achieved great success on various tasks. However, its underlying working mechanism is yet unclear. ...
Algorithms for ridge estimation with convergence guarantees
Wanli Qiao, Wolfgang Polonik
The extraction of filamentary structure from a point cloud is discussed. The filaments are modeled as ridge lines or higher dimensional ridges of an u...
Talent: A Tabular Analytics and Learning Toolbox
Si-Yang Liu, Hao-Run Cai, Qi-Le Zhou et al.
Tabular data is a prevalent source in machine learning. While classical methods have proven effective, deep learning methods for tabular data are emer...
Inferring Change Points in High-Dimensional Regression via Approximate Message Passing
Gabriel Arpino, Xiaoqi Liu, Julia Gontarek et al.
We consider the problem of localizing change points in a generalized linear model (GLM), a model that covers many widely studied problems in statistic...
Universality of Kernel Random Matrices and Kernel Regression in the Quadratic Regime
Parthe Pandit, Zhichao Wang, Yizhe Zhu
Kernel ridge regression (KRR) is a popular class of machine learning models that has become an important tool for understanding deep learning. Much o...
Lexicographic Lipschitz Bandits: New Algorithms and a Lower Bound
Lijun Zhang, Bo Xue, Ji Cheng et al.
This paper studies a multiobjective bandit problem under lexicographic ordering, wherein the learner aims to maximize $m$ objectives, each with differ...
On the Natural Gradient of the Evidence Lower Bound
Nihat Ay, Jesse van Oostrum, Adwait Datar
This article studies the Fisher-Rao gradient, also referred to as the natural gradient, of the evidence lower bound (ELBO) which plays a central role ...
Geometry and Stability of Supervised Learning Problems
Facundo Mémoli, Brantley Vose, Robert C. Williamson
We introduce a notion of distance between supervised learning problems, which we call the Risk distance. This distance, inspired by optimal transport,...
Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination
Xiao Li, Peng Wang, Can Yaras et al.
Over the past decade, deep learning has proven to be a highly effective tool for learning meaningful features from raw data. However, it remains an op...
Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions
Qian Lin, Haobo Zhang, Yicheng Li et al.
Motivated by studies of neural networks, particularly the neural tangent kernel theory, we investigate the large-dimensional behavior of kernel ridge ...
A Hybrid Weighted Nearest Neighbour Classifier for Semi-Supervised Learning
Stephen M. S. Lee, Mehdi Soleymani
We propose a novel hybrid procedure for constructing a randomly weighted nearest neighbour classifier for semi-supervised learning. The procedure firs...
Scalable and Adaptive Variational Bayes Methods for Hawkes Processes
Judith Rousseau, Vincent Rivoirard, Deborah Sulem
Hawkes processes are often applied to model dependence and interaction phenomena in multivariate event data sets, such as neuronal spike trains, socia...
Biological Sequence Kernels with Guaranteed Flexibility
Alan N. Amin, Debora S. Marks, Eli N. Weinstein
Applying machine learning to biological sequences---DNA, RNA and protein---has enormous potential to advance human health and environmental sustainabi...
Unified Discrete Diffusion for Categorical Data
Lingxiao Zhao, Xueying Ding, Lijun Yu et al.
Discrete diffusion models have attracted significant attention for their application to naturally discrete data, such as language and graphs. While di...
Reinforcement Learning for Infinite-Dimensional Systems
Wei Zhang, Jr-Shin Li
Interest in reinforcement learning (RL) for large-scale systems, comprising extensive populations of intelligent agents interacting with heterogeneous...
Deep Neural Networks are Adaptive to Function Regularity and Data Distribution in Approximation and Estimation
Hao Liu, Jiahui Cheng, Wenjing Liao
Deep learning has exhibited remarkable results across diverse areas. To understand its success, substantial research has been directed towards its the...
Generation of Geodesics with Actor-Critic Reinforcement Learning to Predict Midpoints
Kazumi Kasaura
To find the shortest paths for all pairs on manifolds with infinitesimally defined metrics, we introduce a framework to generate them by predicting mi...
Learning-to-Optimize with PAC-Bayesian Guarantees: Theoretical Considerations and Practical Implementation
Michael Sucker, Jalal Fadili, Peter Ochs
We use the PAC-Bayesian theory for the setting of learning-to-optimize. To the best of our knowledge, we present the first framework to learn optimiza...
Sparse Semiparametric Discriminant Analysis for High-dimensional Zero-inflated Data
Yang Ni, Hee Cheol Chung, Irina Gaynanova
Sequencing-based technologies provide an abundance of high-dimensional biological data sets with highly skewed and zero-inflated measurements. Despite...
Stochastic Interpolants: A Unifying Framework for Flows and Diffusions
Michael Albergo, Nicholas M. Boffi, Eric Vanden-Eijnden
A class of generative models that unifies flow-based and diffusion-based methods is introduced. These models extend the framework proposed in Albergo ...
Efficient Methods for Non-stationary Online Learning
Lijun Zhang, Peng Zhao, Yan-Feng Xie et al.
Non-stationary online learning has drawn much attention in recent years. In particular, dynamic regret and adaptive regret are proposed as two princip...
Decentralized Asynchronous Optimization with DADAO allows Decoupling and Acceleration
Adel Nabli, Edouard Oyallon
DADAO is the first decentralized, accelerated, asynchronous, primal, first-order algorithm to minimize a sum of $L$-smooth and $\mu$-strongly convex ...
Mixtures of Gaussian Process Experts with SMC^2
Teemu Härkönen, Sara Wade, Kody Law et al.
Gaussian processes are a key component of many flexible statistical and machine learning models. However, they exhibit cubic computational complexity ...
Robust Point Matching with Distance Profiles
YoonHaeng Hur, Yuehaw Khoo
Computational difficulty of quadratic matching and the Gromov-Wasserstein distance has led to various approximation and relaxation schemes. One of suc...
BoFire: Bayesian Optimization Framework Intended for Real Experiments
Johannes P. Dürholt, Thomas S. Asche, Johanna Kleinekorte et al.
Our open-source Python package BoFire combines Bayesian Optimization (BO) with other design of experiments (DoE) strategies focusing on developing and...
Reliever: Relieving the Burden of Costly Model Fits for Changepoint Detection
Guanghui Wang, Chengde Qian, Changliang Zou
Changepoint detection typically relies on a grid-search strategy for optimal data segmentation. When model fitting itself is expensive, repeatedly fit...
Variational Inference for Uncertainty Quantification: an Analysis of Trade-offs
Charles C. Margossian, Loucas Pillaud-Vivien, Lawrence K. Saul
Given an intractable distribution $p$, the problem of variational inference (VI) is to find the best approximation from some more tractable family $Q$...
Are Ensembles Getting Better All the Time?
Pierre-Alexandre Mattei, Damien Garreau
Ensemble methods combine the predictions of several base models. We study whether or not including more models always improves their average performan...
An Adaptive Parameter-free and Projection-free Restarting Level Set Method for Constrained Convex Optimization Under the Error Bound Condition
Qihang Lin, Negar Soheili, Runchao Ma et al.
Recent efforts to accelerate first-order methods have focused on convex optimization problems that satisfy a geometric property known as error-bound c...
Operator Learning for Hyperbolic PDEs
Christopher Wang, Alex Townsend
We construct the first rigorously justified probabilistic algorithm for recovering the solution operator of a hyperbolic partial differential equation...
Optimal subsampling for high-dimensional partially linear models via machine learning methods
Lei Wang, Heng Lian, Yujing Shao et al.
In this paper, we explore optimal subsampling strategies for estimating the parametric regression coefficients in partially linear models with unknown...
Decentralized Sparse Linear Regression via Gradient-Tracking
Ying Sun, Guang Cheng, Marie Maros et al.
We study sparse linear regression over a network of agents, modeled as an undirected graph without a center node. The estimation of the $s$-sparse ...
Calibrated Inference: Statistical Inference that Accounts for Both Sampling Uncertainty and Distributional Uncertainty
Yujin Jeong, Dominik Rothenhäusler
How can we draw trustworthy scientific conclusions? One criterion is that a study can be replicated by independent teams. While replication is critica...
Relaxed Gaussian Process Interpolation: a Goal-Oriented Approach to Bayesian Optimization
Sébastien J. Petit, Julien Bect, Emmanuel Vazquez
This work presents a new procedure for obtaining predictive distributions in the context of Gaussian process (GP) modeling, with a relaxation of the i...
A new integrative learning framework for integrating multiple secondary outcomes into primary outcome analysis: a case study on liver health
Shuo Chen, Chixiang Chen, Daxuan Deng et al.
Abstract In the era of big data, secondary outcomes have become increasingly important alongside primary outcomes. These secondary o...
Federated feature selection with false discovery rate control
Runze Li, Jiayi Tong, Jie Hu et al.
Abstract Selecting a set of universally relevant features associated with a given response variable across multiple distributed data...
Using a two-parameter sensitivity analysis framework to efficiently combine randomized and nonrandomized studies
Bikram Karmakar, Ruoqi Yu, Jessica Vandeleest et al.
Abstract Causal inference is vital for informed decision-making across fields such as biomedical research and social sciences. Rando...
Attainability of Two-Point Testing Rates for Finite-Sample Location Estimation
Spencer Compton, Gregory Valiant
Spatial Variation on Multiple Scales in Line Transect Data; the Case of Antarctic Fin Whales
Olav Nikolai Breivik, Hans J. Skaug, Martin Jullum et al.
Optimal Run Order for Order-of-Addition Experiments
Chunyan Wang, Jiayu Peng, Dennis K. J. Lin
Test of Independence Using Generalized Distance Correlation
Jianqing Fan, Zhipeng Lou, Danna Zhang
A non-asymptotic distributional theory of approximate message passing for sparse and robust regression
Gen Li, Yuting Wei
Elastic Shape Analysis of Movement Data
J.E. Borgert, Jan Hannig, J.D. Tucker et al.
Bayesian Signal Matching for Transfer Learning in ERP-Based Brain Computer Interface
Jane E. Huggins, Jian Kang, Tianwen Ma
On a Class of Sobolev Tests for Symmetry, their Detection Thresholds, and Asymptotic Powers
Davy Paindaveine, Thomas Verdebout, Eduardo García-Portugués
Scan Statistics for the Detection of Anomalies in M-Dependent Random Fields with Applications to Image Data
Claudia Kirch, Philipp Klein, Marco Meyer
Conjugate gradient methods for high-dimensional GLMMs
Andrea Pandolfi, Omiros Papaspiliopoulos, Giacomo Zanella
High-Dimensional Spatial Autoregression with Latent Factors by Diversified Projections
Jiaxin Shi, Xuening Zhu, Jing Zhou et al.
SMART-MC: Characterizing the Dynamics of Multiple Sclerosis Therapy Transitions Using a Covariate-Based Markov Model
Beomchang Kim, Zongqi Xia, Priyam Das
Bayesian Geostatistics Using Predictive Stacking
Sudipto Banerjee, Lu Zhang, Wenpin Tang
A Two-step Estimating Approach for Heavy-tailed AR Models with Non-zero Median GARCH-type Noises
She Rui, Dai Linlin, Ling Shiqing
Towards a Unified Theory for Semiparametric Data Fusion with Individual-Level Data
Ellen Sandra Graham, Marco Carone, Andrea Rotnitzky
Identification and estimation for matrix time series CP-factor models
Qiwei Yao, Jinyuan Chang, Yue Du et al.
Markov stick-breaking processes
Antonio Lijoi, Maria F. Gil-Leyva, Ramses H. Mena et al.
Uncertainty quantification for iterative algorithms in linear models with application to early stopping
Kai Tan, Pierre C Bellec
Adaptive Bayesian regression on data with low intrinsic dimensionality
Tao Tang, Xiuyuan Cheng, Nan Wu et al.
VECCHIA GAUSSIAN PROCESSES: ON PROBABILISTIC AND STATISTICAL PROPERTIES
Botond Tibor Szabo, Yichen Zhu
Testing and Support Recovery in Population-Based Image Data
Jian Huang, Liuquan Sun, Lianqiang Qu et al.
Spatiotemporal Besov Priors for Bayesian Inverse Problems
Shiwei Lan, Mirjeta Pasha, Shuyi Li et al.
Construction of Asymmetric Nested Orthogonal Arrays
Mingyao Ai, Shanqi Pang, Xiao Lin et al.
Vecchia Gaussian Processes: Probabilistic Properties, Minimax Rates and Methodological Developments
Botond Tibor Szabo, Yichen Zhu
Statistical-Computational Trade-offs for Recursive Adaptive Partitioning Estimators
Yan Shuo Tan, Jason M. Klusowski, Krishnakumar Balasubramanian
Reviving pseudo-inverses: Asymptotic properties of large dimensional Moore-Penrose and Ridge-type inverses with applications
Taras Bodnar, Nestor Parolya
Generalized Linear Spectral Statistics of High-dimensional Sample Covariance Matrices and Its Applications
Yanlin Hu, Qing Yang, Xiao Han
The out-of-sample prediction error of the $\sqrt{\text{LASSO}}$ and related estimators
José Luis Montiel Olea, Cynthia Rush, Amilcar Velez et al.
Fairness in Machine Learning: A Review for Statisticians
Xianwen He, Yao Li
Minimax and adaptive transfer learning for nonparametric classification under distributed differential privacy constraints
Arnab Auddy and others
Generalized Multilinear Models for Sufficient Dimension Reduction on Tensor-valued Predictors
Daniel Kapla, Efstathia Bura
Parameter identification in linear non-Gaussian causal models under general confounding
Jalal Etesami, Mathias Drton, Daniele Tramontano
Object detection under the linear subspace model with application to cryo-EM images
Samuel Davenport, Amitay Eldar, Keren Mor Waknin et al.
Learning extremal graphical structures in high dimensions
Sebastian Engelke, Michael Lalancette, Stanislav Volgushev
Eigenvector Overlaps in Large Sample Covariance Matrices and Nonlinear Shrinkage Estimators
Guangming Pan, Zeqin Lin
Inferring the dependence graph density of binary graphical models in high dimension
Julien Chevallier, Eva Löcherbach, Guilherme Ost
Finite- and large-sample inference for model and coefficients in high-dimensional linear regression with repro samples
Linjun Zhang, Peng Wang, Minge Xie
Precise Asymptotics of Bagging Regularized M-estimators
Pierre C. Bellec, Takuya Koriyama, Jin-Hong Du et al.
Statistical Inference for Low-Rank Tensors: Heteroskedasticity, Subgaussianity, and Applications
Joshua Agterberg, Anru Zhang
Online Tensor Learning: Computational and Statistical Trade-offs, Adaptivity and Optimal Regret
Dong Xia, Jingyang Li, Yang Chen et al.
Universally Optimal Designs for Symmetric Models in Order-of-Addition Experiments
Ze Liu, Yongdao Zhou, Min-Qian Liu
Developing A Practical Measure: An Asymmetric Mean Squared Prediction Error for Small Area Estimation
Haiqiang Ma, Thuan Nguyen, Jiming Jiang
Dualizing Le Cam’s method for functional estimation I: General theory
Yihong Wu, Yury Polyanskiy
Inference for Dispersion and Curvature of Random Objects
Hans-Georg Müller, Wookyeong Song
Improved bounds and inference on optimal regimes
Julien D. Laurendeau, Aaron L. Sarvet, Mats J. Stensrud
Balanced Sampling With Inequalities: Application to Category Bounding, Matrix Rounding, and Spread Sampling
Yves Tillé, Arnaud Tripet
Using Total Margin of Error to Account for Non-Sampling Error in Election Polls
Jeff Dominitz, Charles F. Manski
Goodness-of-fit tests for linear non-Gaussian structural equation models
D Schkoda, M Drton
A more robust approach to multivariable Mendelian randomization
Yinxiang Wu, Hyunseung Kang, Ting Ye
Summary Multivariable Mendelian randomization uses genetic variants as instrumental variables to infer the direct effects of multipl...
Fast convergence of the Expectation-Maximization algorithm under a logarithmic Sobolev inequality
R Caprio and A M Johansen
Root cause discovery via permutations and Cholesky decomposition
Jinzhou Li and others
On testing Kronecker product structure in tensor factor models
Z Cen and C Lam
Scalable inference for Nonparametric Stochastic Approximation in Reproducing Kernel Hilbert Spaces
Zuofeng Shang, Meimei Liu, Yun Yang
Spectrum-Aware Debiasing: A Modern Inference Framework with Applications to Principal Components Regression
Yufan Li, Pragya Sur
Sensitivity Analysis for Observational Studies with Flexible Matched Designs
Xinran Li
"What is Different Between These Datasets?" A Framework for Explaining Data Distribution Shifts
Varun Babbar*, Zhicheng Guo*, Cynthia Rudin
The performance of machine learning models relies heavily on the quality of input data, yet real-world applications often face significant data-relate...
Priors for second-order unbiased Bayes estimators
Mana Sakai and others
Representation of context-specific causal models with observational and interventional data
Eliana Duarte and Liam Solus
Spectral change point estimation for high-dimensional time series by sparse tensor decomposition
Xinyu Zhang and Kung-Sik Chan
Nonparametric Estimation of a Covariate-Adjusted Counterfactual Treatment Regimen Response Curve
Ashkan Ertefaie, Luke Duttweiler, Brent A. Johnson et al.
Optimal Eigenvalue Shrinkage in the Semicircle Limit
Michael Jacob Feldman, David Leigh Donoho
Bayesian analysis of product feature allocation models
Lorenzo Ghilotti and others
Inference of dependency knowledge graph for Electronic Health Records
Zhiwei Xu and others
Versatile Differentially Private Learning for General Loss Functions
Yumou Qiu, Song X Chen, Qilong Lu
Large-Scale Multiple Testing: Fundamental Limits of False Discovery Rate Control and Compound Oracle
Yihong Wu, Yutong Nie
Estimation of Grouped Time-Varying Network Vector Autoregressive Models
Degui Li, Bin Peng, Songqiao Tang et al.
Trace Test for High-Dimensional Cointegration
Alexei Onatski, Chen Wang
Distributionally Robust Learning for Multi-source Unsupervised Domain Adaptation
Peter Bühlmann, Zijian Guo, Zhenyu Wang
Nonparametric Density Estimation of a Long-Term Trend from Repeated Semicontinuous Data
Félix Camirand Lemyre, Raymond J. Carroll, Aurore Delaigle
On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
Cong Fang, Weijie J. Su, Jiancong Xiao et al.
Transfer learning under large-scale low-rank regression models
Hongyu Zhao, Seyoung Park, Eun Ryung Lee et al.
Differentially Private Sliced Inverse Regression: Minimax Optimality and Algorithm
Linjun Zhang, Zhanrui Cai, Xintao Xia
Dynamic Decision Making With Individualized Variable Selection
Bryan Cai, Ying Cui, Haoda Fu et al.
Efficiency of QMLE for dynamic panel data models with interactive effects
Jushan Bai
A New Approach for Homogeneity Pursuit in Short Panel Data Analysis
Wenyang Zhang, Weichi Wu, Yang Han
Boosting AI-Generated Biomedical Images with Confidence through Advanced Statistical Inference
Zhiling Gu, Shan Yu, Guannan Wang et al.
Robust detection of watermarks for large language models under human edits
Xiang Li and others
Bootstrapping estimators based on the block maxima method
Axel Bücher and Torben Staud
Alternative Mean Square Error Estimators and Confidence Intervals for Small Area Prediction Under General Designs
Yanghyeon Cho and Emily Berg
Scalable Bayesian inference for heat kernel Gaussian processes on manifolds
Junhui He and others
Simplifying debiased inference via automatic differentiation and probabilistic programming
Alex Luedtke
A stratified L2-discrepancy with application to space-filling designs
Ye Tian and Hongquan Xu
Linear Separation Capacity of Self-Supervised Representation Learning
Shulei Wang
Recent advances in self-supervised learning have highlighted the efficacy of data augmentation in learning data representation from unlabeled data. Tr...
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
Zhihua Zhang, Jiacai Liu, Wenye Li et al.
Projected policy gradient (PPG) is a basic policy optimization method in reinforcement learning. Given access to exact policy evaluations, previous s...
Learning with Linear Function Approximations in Mean-Field Control
Erhan Bayraktar, Ali Devran Kara
The paper focuses on mean-field type multi-agent control problems with finite state and action spaces where the dynamics and cost structures are symme...
A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization
Junwen Qiu, Xiao Li, Andre Milzarek
Random reshuffling techniques are prevalent in large-scale applications, such as training neural networks. While the convergence and acceleration effe...
Model-free Change-Point Detection Using AUC of a Classifier
Feiyu Jiang, Rohit Kanrar, Zhanrui Cai
In contemporary data analysis, it is increasingly common to work with non-stationary complex data sets. These data sets typically extend beyond the cl...
EF21 with Bells & Whistles: Six Algorithmic Extensions of Modern Error Feedback
Ilyas Fatkhullin, Igor Sokolov, Eduard Gorbunov et al.
First proposed by Seide (2014) as a heuristic, error feedback (EF) is a very popular mechanism for enforcing convergence of distributed gradient-based...
Multiple Instance Verification
Xin Xu, Eibe Frank, Geoffrey Holmes
We explore multiple instance verification, a problem setting in which a query instance is verified against a bag of target instances with heterogeneou...
Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness
Yang Feng, Yuqi Gu, Ye Tian
Representation multi-task learning (MTL) has achieved tremendous success in practice. However, the theoretical understanding of these methods is still...
Exponential Family Graphical Models: Correlated Replicates and Unmeasured Confounders, with Applications to fMRI Data
Kean Ming Tan, Yang Ning, Yanxin Jin
Graphical models have been used extensively for modeling brain connectivity networks. However, unmeasured confounders and correlations among measureme...
Optimizing Return Distributions with Distributional Dynamic Programming
Bernardo Ávila Pires, Mark Rowland, Diana Borsa et al.
We introduce distributional dynamic programming (DP) methods for optimizing statistical functionals of the return distribution, with standard reinforc...
Imprecise Multi-Armed Bandits: Representing Irreducible Uncertainty as a Zero-Sum Game
Vanessa Kosoy
We introduce a novel multi-armed bandit framework, where each arm is associated with a fixed unknown credal set over the space of outcomes (which can ...
Early Alignment in Two-Layer Networks Training is a Two-Edged Sword
Etienne Boursier, Nicolas Flammarion
Training neural networks with first order optimisation methods is at the core of the empirical success of deep learning. The scale of initialisation i...
Hierarchical Decision Making Based on Structural Information Principles
Xianghua Zeng, Hao Peng, Dingli Su et al.
Hierarchical Reinforcement Learning (HRL) is a promising approach for managing task complexity across multiple levels of abstraction and accelerating ...
Generative Adversarial Networks: Dynamics
Matias G. Delgadino, Bruno B. Suassuna, Rene Cabrera
We study quantitatively the overparametrization limit of the original Wasserstein-GAN algorithm. Effectively, we show that the algorithm is a stochast...
Assumption-lean and data-adaptive post-prediction inference
Jiacheng Miao, Xinran Miao, Yixuan Wu et al.
A primary challenge facing modern scientific research is the limited availability of gold-standard data, which can be costly, labor-intensive, or inva...
Bagged Regularized k-Distances for Anomaly Detection
Hanyuan Hang, Hanfang Yang, Yuchao Cai et al.
We consider the paradigm of unsupervised anomaly detection, which involves the identification of anomalies within a dataset in the absence of labeled ...
Four Axiomatic Characterizations of the Integrated Gradients Attribution Method
Daniel Lundstrom, Meisam Razaviyayn
Deep neural networks have produced significant progress among machine learning models in terms of accuracy and functionality, but their inner workings...
Fast Algorithm for Constrained Linear Inverse Problems
Mohammed Rayyan Sheriff, Floor Fenne Redel, Peyman Mohajerin Esfahani
We consider the constrained Linear Inverse Problem (LIP), where a certain atomic norm (like the $\ell_1$ norm) is minimized subject to a quadratic co...
High-Rank Irreducible Cartesian Tensor Decomposition and Bases of Equivariant Spaces
Shihao Shao, Yikang Li, Zhouchen Lin et al.
Irreducible Cartesian tensors (ICTs) play a crucial role in the design of equivariant graph neural networks, as well as in theoretical chemistry and c...
Best Linear Unbiased Estimate from Privatized Contingency Tables
Jordan Awan, Adam Edwards, Paul Bartholomew et al.
In differential privacy (DP) mechanisms, it can be beneficial to release "redundant" outputs, where some quantities can be estimated in multiple ways...
Interpretable Global Minima of Deep ReLU Neural Networks on Sequentially Separable Data
Thomas Chen, Patrícia Muñoz Ewald
We explicitly construct zero loss neural network classifiers. We write the weight matrices and bias vectors in terms of cumulative parameters, which ...
Enhanced Feature Learning via Regularisation: Integrating Neural Networks and Kernel Methods
Bertille FOLLAIN, Francis BACH
We propose a new method for feature learning and function estimation in supervised learning via regularised empirical risk minimisation. Our approach ...
Data-Driven Performance Guarantees for Classical and Learned Optimizers
Rajiv Sambharya, Bartolomeo Stellato
We introduce a data-driven approach to analyze the performance of continuous optimization algorithms using generalization guarantees from statistical ...
Contextual Bandits with Stage-wise Constraints
Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett
We study contextual bandits in the presence of a stage-wise constraint when the constraint must be satisfied both with high probability and in expecta...
Boosting Causal Additive Models
Maximilian Kertel, Nadja Klein
We present a boosting-based method to learn additive Structural Equation Models (SEMs) from observational data, with a focus on the theoretical aspect...
Frequentist Guarantees of Distributed (Non)-Bayesian Inference
Bohan Wu, César A. Uribe
We establish frequentist properties, i.e., posterior consistency, asymptotic normality, and posterior contraction rates, for the distributed (non-)Bay...
Asymptotic Inference for Multi-Stage Stationary Treatment Policy with Variable Selection
Donglin Zeng, Yufeng Liu, Daiqi Gao
Dynamic treatment regimes or policies are a sequence of decision functions over multiple stages that are tailored to individual features. One importan...
EMaP: Explainable AI with Manifold-based Perturbations
Minh Nhat Vu, Huy Quang Mai, My T. Thai
In the last few years, many explanation methods based on the perturbations of input data have been introduced to shed light on the predictions generat...
Autoencoders in Function Space
Justin Bunker, Mark Girolami, Hefin Lambley et al.
Autoencoders have found widespread application in both their original deterministic form and in their variational formulation (VAEs). In scientific ap...
Nonparametric Regression on Random Geometric Graphs Sampled from Submanifolds
Paul Rosa, Judith Rousseau
We consider the nonparametric regression problem when the covariates are located on an unknown compact submanifold of a Euclidean space. Under definin...
System Neural Diversity: Measuring Behavioral Heterogeneity in Multi-Agent Learning
Matteo Bettini, Ajay Shankar, Amanda Prorok
Evolutionary science provides evidence that diversity confers resilience in natural systems. Yet, traditional multi-agent reinforcement learning techn...
Distribution Estimation under the Infinity Norm
Aryeh Kontorovich, Amichai Painsky
We present novel bounds for estimating discrete probability distributions under the $\ell_\infty$ norm. These are nearly optimal in various precise se...
Extending Temperature Scaling with Homogenizing Maps
Christopher Qian, Feng Liang, Jason Adams
As machine learning models continue to grow more complex, poor calibration significantly limits the reliability of their predictions. Temperature scal...
Density Estimation Using the Perceptron
Yury Polyanskiy, Patrik Róbert Gerber, Tianze Jiang et al.
We propose a new density estimation algorithm. Given $n$ i.i.d. observations from a distribution belonging to a class of densities on $\mathbb{R}^d$...
Simplex Constrained Sparse Optimization via Tail Screening
Xueqin Wang, Peng Chen, Jin Zhu et al.
We consider the probabilistic simplex-constrained sparse recovery problem. The commonly used Lasso-type penalty for promoting sparsity is ineffective ...
Score-Based Diffusion Models in Function Space
Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista et al.
Diffusion models have recently emerged as a powerful framework for generative modeling. They consist of a forward process that perturbs input data wit...
Regularized Rényi Divergence Minimization through Bregman Proximal Gradient Algorithms
Thomas Guilmeau, Emilie Chouzenoux, Víctor Elvira
We study the variational inference problem of minimizing a regularized Rényi divergence over an exponential family. We propose to solve this problem w...
WEFE: A Python Library for Measuring and Mitigating Bias in Word Embeddings
Pablo Badilla, Felipe Bravo-Marquez, María José Zambrano et al.
Word embeddings, which are a mapping of words into continuous vectors, are widely used in modern Natural Language Processing (NLP) systems. However, t...
Frontiers to the learning of nonparametric hidden Markov models
Elisabeth Gassiat, Zacharie Naulet, Kweku Abraham
Hidden Markov models (HMMs) are flexible tools for clustering dependent data coming from unknown populations, allowing nonparametric modelling of the ...
On Non-asymptotic Theory of Recurrent Neural Networks in Temporal Point Processes
Zhiheng Chen, Guanhua Fang, Wen Yu
Temporal point process (TPP) is an important tool for modeling and predicting irregularly timed events across various domains. Recently, the recurrent...
Classification in the high dimensional Anisotropic mixture framework: A new take on Robust Interpolation
Stanislav Minsker, Mohamed Ndaoud, Yiqiu Shen
We study the classification problem under the two-component anisotropic sub-Gaussian mixture model in high dimensions and in the non-asymptotic settin...
Universal Online Convex Optimization Meets Second-order Bounds
Yibo Wang, Lijun Zhang, Guanghui Wang et al.
Recently, several universal methods have been proposed for online convex optimization, and attain minimax rates for multiple types of convex function...
Sample Complexity of the Linear Quadratic Regulator: A Reinforcement Learning Lens
Amirreza Neshaei Moghaddam, Alex Olshevsky, Bahman Gharesifard
We provide the first known algorithm that provably achieves $\varepsilon$-optimality within $\widetilde{O}(1/\varepsilon)$ function evaluations for th...
Randomization Can Reduce Both Bias and Variance: A Case Study in Random Forests
Rahul Mazumder, Brian Liu
We study the often overlooked phenomenon, first noted in Breiman (2001), that random forests appear to reduce bias compared to bagging. Motivated by a...
skglm: Improving scikit-learn for Regularized Generalized Linear Models
Badr Moufad, Pierre-Antoine Bannier, Quentin Bertrand et al.
We introduce skglm, an open-source Python package for regularized Generalized Linear Models. Thanks to its composable nature, it supports combining da...
Losing Momentum in Continuous-time Stochastic Optimisation
Kexin Jin, Jonas Latz, Chenguang Liu et al.
The training of modern machine learning models often consists in solving high-dimensional non-convex optimisation problems that are subject to large-s...
Latent Process Models for Functional Network Data
Elizaveta Levina, Ji Zhu, Peter W. MacDonald
Network data are often sampled with auxiliary information or collected through the observation of a complex system over time, leading to multiple netw...
Dynamic Bayesian Learning for Spatiotemporal Mechanistic Models
Sudipto Banerjee, Xiang Chen, Ian Frankenburg et al.
We develop an approach for Bayesian learning of spatiotemporal dynamical mechanistic models. Such learning consists of statistical emulation of the me...
On the Ability of Deep Networks to Learn Symmetries from Data: A Neural Kernel Theory
Andrea Perin, Stephane Deny
Symmetries (transformations by group actions) are present in many datasets, and leveraging them holds considerable promise for improving predictions i...
Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems
Michal Dereziński, Daniel LeJeune, Deanna Needell et al.
Despite being a key bottleneck in many machine learning tasks, the cost of solving large linear systems has proven challenging to quantify due to prob...
Deep Generative Models: Complexity, Dimensionality, and Approximation
Didong Li, Kevin Wang, Hongqian Niu et al.
Generative networks have shown remarkable success in learning complex data distributions, particularly in generating high-dimensional data from lower-...
ClimSim-Online: A Large Multi-Scale Dataset and Framework for Hybrid Physics-ML Climate Emulation
Sungduk Yu, Zeyuan Hu, Akshay Subramaniam et al.
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints, leading to inaccuracies in representing cri...
Conditional Wasserstein Distances with Applications in Bayesian OT Flow Matching
Jannis Chemseddine, Paul Hagemann, Gabriele Steidl et al.
In inverse problems, many conditional generative models approximate the posterior measure by minimizing a distance between the joint measure and its l...
Deep Variational Multivariate Information Bottleneck - A Framework for Variational Losses
Eslam Abdelaleem, Ilya Nemenman, K. Michael Martini
Variational dimensionality reduction methods are widely used for their accuracy, generative capabilities, and robustness. We introduce a unifying fram...
Diffeomorphism-based feature learning using Poincaré inequalities on augmented input space
Romain Verdière, Clémentine Prieur, Olivier Zahm
We propose a gradient-enhanced algorithm for high-dimensional function approximation. The algorithm proceeds in two steps: firstly, we reduce the inp...
Finite Expression Method for Solving High-Dimensional Partial Differential Equations
Senwei Liang, Haizhao Yang
Designing efficient and accurate numerical solvers for high-dimensional partial differential equations (PDEs) remains a challenging and important topi...
Randomly Projected Convex Clustering Model: Motivation, Realization, and Cluster Recovery Guarantees
Defeng Sun, Yancheng Yuan, Ziwen Wang et al.
In this paper, we propose a randomly projected convex clustering model for clustering a collection of $n$ high dimensional data points in $\mathbb{R}^...
Minimax Optimal Deep Neural Network Classifiers Under Smooth Decision Boundary
Zuofeng Shang, Tianyang Hu, Ruiqi Liu et al.
Deep learning has gained huge empirical successes in large-scale classification problems. In contrast, there is a lack of statistical understanding ab...
Optimal and Efficient Algorithms for Decentralized Online Convex Optimization
Lijun Zhang, Yuanyu Wan, Tong Wei et al.
We investigate decentralized online convex optimization (D-OCO), in which a set of local learners are required to minimize a sequence of global loss f...
Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning
Dennis Chemnitz, Maximilian Engel
For overparameterized optimization tasks, such as those found in modern machine learning, global minima are generally not unique. In order to understa...
PREMAP: A Unifying PREiMage APproximation Framework for Neural Networks
Xiyue Zhang, Benjie Wang, Marta Kwiatkowska et al.
Most methods for neural network verification focus on bounding the image, i.e., set of outputs for a given input set. This can be used to, for example...
Score-Aware Policy-Gradient and Performance Guarantees using Local Lyapunov Stability
Céline Comte, Matthieu Jonckheere, Jaron Sanders et al.
In this paper, we introduce a policy-gradient method for model-based reinforcement learning (RL) that exploits a type of stationary distributions comm...
On the $O(\sqrt{d}/T^{1/4})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm
Zhouchen Lin, Huan Li, Yiming Dong
Although adaptive gradient methods have been extensively used in deep learning, their convergence rates proved in the literature are all slower than t...
Categorical Semantics of Compositional Reinforcement Learning
Georgios Bakirtzis, Michail Savvas, Ufuk Topcu
Compositional knowledge representations in reinforcement learning (RL) facilitate modular, interpretable, and safe task specifications. However, gener...
Transformers from Diffusion: A Unified Framework for Neural Message Passing
David Wipf, Qitian Wu, Junchi Yan
Learning representations for structured data with certain geometries (e.g., observed or unobserved) is a fundamental challenge, wherein message passin...
Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
Yong Lin, Chen Liu, Chenlu Ye et al.
Modern deep learning heavily relies on large labeled datasets, which often come with high costs in terms of both manual labeling and computational re...
Actor-Critic learning for mean-field control in continuous time
Noufel FRIKHA, Maximilien GERMAIN, Mathieu LAURIERE et al.
We study policy gradient for mean-field control in continuous time in a reinforcement learning setting. By considering randomised policies with entro...
Modelling Populations of Interaction Networks via Distance Metrics
George Bolt, Simón Lunagómez, Christopher Nemeth
Network data arises through the observation of relational information between a collection of entities, for example, friendships (relations) amongst a...
BitNet: 1-bit Pre-training for Large Language Models
Lei Wang, Yi Wu, Hongyu Wang et al.
The increasing size of large language models (LLMs) has posed challenges for deployment and raised concerns about environmental impact due to high ene...
Physics-informed Kernel Learning
Gérard Biau, Nathan Doumèche, Francis Bach et al.
Physics-informed machine learning typically integrates physical priors into the learning process by minimizing a loss function that includes both a da...
Last-iterate Convergence of Shuffling Momentum Gradient Method under the Kurdyka-Lojasiewicz Inequality
Yuqing Liang, Dongpo Xu
Shuffling gradient algorithms are extensively used to solve finite-sum optimization problems in machine learning. However, their theoretical propertie...
Posterior and Variational Inference for Deep Neural Networks with Heavy-Tailed Weights
Ismaël Castillo, Paul Egels
We consider deep neural networks in a Bayesian framework with a prior distribution sampling the network weights at random. Following a recent idea of...
Maximum Causal Entropy IRL in Mean-Field Games and GNEP Framework for Forward RL
Berkay Anahtarci, Can Deha Kariksiz, Naci Saldi
This paper explores the use of Maximum Causal Entropy Inverse Reinforcement Learning (IRL) within the context of discrete-time stationary Mean-Field G...
Degree of Interference: A General Framework For Causal Inference Under Interference
Yuki Ohnishi, Bikram Karmakar, Arman Sabbaghi
One core assumption typically adopted for valid causal inference is that of no interference between experimental units, i.e., the outcome of an experi...
Quantifying the Effectiveness of Linear Preconditioning in Markov Chain Monte Carlo
Max Hird, Samuel Livingstone
We study linear preconditioning in Markov chain Monte Carlo. We consider the class of well-conditioned distributions, for which several mixing time bo...
Sparse SVM with Hard-Margin Loss: a Newton-Augmented Lagrangian Method in Reduced Dimensions
Penghe Zhang, Naihua Xiu, Hou-Duo Qi
The hard-margin loss function has been at the core of the support vector machine research from the very beginning due to its generalization capability...
On Model Identification and Out-of-Sample Prediction of PCR with Applications to Synthetic Controls
Devavrat Shah, Anish Agarwal, Dennis Shen
We analyze principal component regression (PCR) in a high-dimensional error-in-variables setting with fixed design. Under suitable conditions, we show...
Bayesian Scalar-on-Image Regression with a Spatially Varying Single-layer Neural Network Prior
Keru Wu, Jian Kang, Ben Wu
Deep neural networks (DNN) have been widely used in scalar-on-image regression to predict an outcome variable from imaging predictors. However, train...
Efficient Distributed Learning over Decentralized Networks with Convoluted Support Vector Machine*
Canyi Chen, Liping Zhu, Nan Qiao
Simultaneous inference for monotone and smoothly time-varying functions under complex temporal dynamics
Tianpai Luo, Weichi Wu
Chain-linked Multiple Matrix Integration via Embedding Alignment
Runbing Zheng, Minh Tang
Understanding Inequalities in Cancer Survival Using Bayesian Machine Learning
Antonio R. Linero, Piyali Basak, Camille Maringe et al.
Word-Level Maximum Mean Discrepancy Regularization for Word Embedding
Youqian Gao, Ben Dai
Data thinning for Poisson factor models and its applications
Zhijing Wang, Peirong Xu, Hongyu Zhao et al.
Linear-Cost Vecchia Approximation of Multivariate Normal Probabilities
Jian Cao, Matthias Katzfuss
Confidence Sets for Causal Orderings
Mladen Kolar, Y. Samuel Wang, Mathias Drton
A Bayesian nonparametric approach to mediation and spillover effects with multiple mediators in cluster-randomized trials
Fan Li, Yuki Ohnishi
On the poor statistical properties of the P-curve meta-analytic procedure
Richard D. Morey, Clintin P. Davis-Stober
Integrative Analysis of Microbial 16S Gene and Shotgun Metagenomic Sequencing Data Improves Statistical Efficiency in Testing Differential Abundance
Ye Yue, Yicong Mao, Timothy D. Read et al.
The Effect of Alcohol intake on Brain White Matter Microstructural Integrity: A New Causal Inference Framework for Incomplete Phenomic Data
Shuo Chen, Chixiang Chen, Zhenyao Ye et al.
Optimal Transport based Cross-Domain Integration for Heterogeneous Data
Annie Qu, Babak Shahbaba, Yubai Yuan et al.
Inference on the proportion of variance explained in principal component analysis
Snigdha Panigrahi, Ronan Perry, Jacob Bien et al.
A Minimax Two-Sample Test for Functional Data via Grothendieck’s Divergence
Xueqin Wang, Yan Chen, Hongmei Lin et al.
Information Theoretic Limits of Robust Sub-Gaussian Mean Estimation Under Star-Shaped Constraints
Matey Neykov, Akshay Prasadan
Optimality of Approximate Message Passing for Spiked Matrix Models with Rotationally Invariant Noise
Rishabh Dudeja, Songbin Liu, Junjie Ma
Supervised Contamination Detection, with Flow Cytometry Application
S Gaucher and others
Communication-Efficient and Distributed-Oracle Estimation for High-Dimensional Quantile Regression
Xuming He, Songshan Yang, Yifan Gu et al.
Optimal Convex $M$-Estimation via Score Matching
Oliver Y. Feng, Yu-Chun Kao, Min Xu et al.
Semiparametric Bernstein-Von Mises Phenomenon via Isotonized Posterior in Wicksell’s Problem
Aad van der Vaart, Francesco Gili, Geurt Jongbloed
Multilayer random dot product graphs: estimation and online change point detection
Fan Wang and others
Neural Networks Generalize on Low Complexity Data
Sourav Chatterjee, Timothy Sudijono
Pretraining and the lasso
Erin Craig and others
Online multivariate changepoint detection: leveraging links with computational geometry
Liudmila Pishchagina and others
Censored quantile regression with time-dependent covariates
Chi Wing Chu and others
Berry-Esseen Bounds for Design-Based Causal Inference With Possibly Diverging Treatment Levels and Varying Group Sizes
Peng Ding, Lei Shi
Multivariate Root-N-Consistent Smoothing Parameter Free Matching Estimators and Estimators of Inverse Density Weighted Expectations
Hajo Holzmann, Alexander Meister
Change Point Estimation for a Stochastic Heat Equation
Markus Reiß, Claudia Strauch, Lukas Trottner
Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift
Kaizheng Wang
A Computational Transition for Detecting Correlated Stochastic Block Models by Low-Degree Polynomials
Jian Ding, Zhangsong Li, Guanyi Chen et al.
Debiased calibration estimation using generalized entropy in survey sampling
Yonghyun Kwon, Jae Kwang Kim, Yumou Qiu
Dimension Reduction for Large-Scale Federated Data: Statistical Rate and Asymptotic Inference
Shuting Shen, Junwei Lu, Xihong Lin
Principal stratification with continuous post-treatment variables: nonparametric identification and semiparametric estimation
Sizhu Lu and others
Kernel Spectral Joint Embeddings for High-Dimensional Noisy Datasets using Duo-Landmark Integral Operators
Xiucai Ding, Rong Ma
A powerful transformation of quantitative responses for biobank-scale association studies
Yaowu Liu, Tianying Wang
Inference for Low-rank Models without Estimating the Rank
Jungjun Choi, Hyukjun Kwon, Yuan Liao
Online Policy Learning and Inference by Matrix Completion
Dong Xia, Jingyang Li, Congyuan Duan
Solving the Poisson Equation Using Coupled Markov Chains
Pierre Etienne Jacob, Randal Douc, Anthony Lee et al.
DRM Revisited: A Complete Error Analysis
Yuling Jiao, Ruoxuan Li, Peiying Wu et al.
It is widely known that the error analysis for deep learning involves approximation, statistical, and optimization errors. However, it is challenging ...
Average Partial Effect Estimation Using Double Machine Learning
Harvey Klyne, Rajen Shah
Fundamental Limits of Community Detection From Multi-View Data: Multi-Layer, Dynamic and Partially Labeled Block Models
Subhabrata Sen, Xiaodong Yang, Buyu Lin
Online Estimation with Rolling Validation: Adaptive Nonparametric Estimation with Streaming Data
Tianyu Zhang, Jing Lei
Poisson Empirical Bayes Estimation: When Does g-Modeling Beat f-Modeling in Theory (And in Practice)?
Yihong Wu, Yandi Shen
High-Dimensional Hilbert-Schmidt Linear Regression with Hilbert Manifold Variables
Changwon Choi, Byeong U. Park
Optimal Sequencing Depth for Single-Cell RNA-Sequencing in Wasserstein Space
Jakwang Kim, Sharvaj Kubal, Geoffrey Schiebinger
A Two-Way Heterogeneity Model for Dynamic Networks
Binyan Jiang, Ting Yan, Qiwei Yao et al.
A Geometrical Analysis of Kernel Ridge Regression and its Applications
Zong Shang, Guillaume Lecué, Georgios Gavrilopoulos
Kurtosis-Based Projection Pursuit for Matrix-Valued Data
Una Radojicic, Klaus Nordhausen, Joni Virta
A Flexible Defense Against the Winner’s Curse
William Fithian, Tijana Zrnic
Rank Tests for PCA Under Weak Identifiability
Davy Paindaveine, Laura Peralvo Maroto, Thomas Verdebout
Sparse PCA: A New Scalable Estimator Based on Integer Programming
Kayhan Behdin, Rahul Mazumder
Semi-Supervised U-Statistics
Larry Wasserman, Ilmun Kim, Sivaraman Balakrishnan et al.
Scalable Inference in Functional Linear Regression with Streaming Data
Linglong Kong, Jinhan Xie, Enze Shi et al.
The Empirical Copula Process in High Dimensions: Stute’s Representation and Applications
Axel Bücher, Cambyse Pakzad
Causal Effect Estimation Under Network Interference with Mean-Field Methods
Subhabrata Sen, Sohom Bhattacharya
Clustering risk in Non-parametric Hidden Markov and I.I.D. Models
Elisabeth Gassiat, Ibrahim Kaddouri, Zacharie Naulet
Efficiently Matching Random Inhomogeneous Graphs via Degree Profiles
Jian Ding, Yumou Fei, Yuanzheng Wang
Improving Knockoffs with Conditional Calibration
William Fithian, Yixiang Luo, Lihua Lei
Spectral Density Estimation of Function-Valued Spatial Processes
Rafail Kartsioukas, Stilian Stoev, Tailen Hsing
Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning
Jianqing Fan, Yihong Gu, Cong Fang et al.
Tests of Missing Completely at Random Based on Sample Covariance Matrices
Alberto Bordino, Thomas Benjamin Berrett
Near Optimal Sample Complexity for Matrix and Tensor Normal Models via Geodesic Convexity
Rafael Mendes de Oliveira, William Cole Franks, Akshay Ramachandran et al.
Yurinskii’s Coupling for Martingales
Matias Damian Cattaneo, Ricardo Pereira Masini, William George Underwood
Improved Learning Theory for Kernel Distribution Regression with Two-Stage Sampling
François Bachoc, Louis Béthune, Alberto González-Sanz et al.
Trimmed Sample Means for Robust Uniform Mean Estimation and Regression
Roberto Imbuzeiro Moraes Felinto de Oliveira, Lucas Resende
Pseudo-Likelihood-Based M-Estimation of Random Graphs with Dependent Edges and Parameter Vectors of Increasing Dimension
Jonathan Roy Stewart, Michael Schweinberger
Robust Transfer Learning with Unreliable Source Data
Jianqing Fan, Cheng Gao, Jason Matthew Klusowski
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Xi Chen, Yichen Zhang, Weidong Liu et al.
The High-Dimensional Asymptotics of Principal Component Regression
Alden Green, Elad Romanov
Theory of Functional Principal Component Analysis for Discretely Observed Data
Fang Yao, Hang Zhou, Dongyi Wei
A Unified Analysis of Likelihood-based Estimators in the Plackett–Luce Model
Ruijian Han, Yiming Xu
Symmetry: A General Structure in Nonparametric Regression
Louis Goldwater Christie, John A. D. Aston
Advances in Bayesian Model Selection Consistency for High-Dimensional Generalized Linear Models
Jeyong Lee, Minwoo Chae, Ryan Martin
Estimation and Inference in Distributional Reinforcement Learning
Liangyu Zhang, Yang Peng, Jiadong Liang et al.
Online Statistical Inference in Decision Making with Matrix Context
Yichen Zhang, Qiyu Han, Will Wei Sun
Structured Matrix Learning under Arbitrary Entrywise Dependence and Estimation of Markov Transition Kernel
Jianqing Fan, Jinhang Chai
Optimal and Exact Recovery on the General Non-Uniform Hypergraph Stochastic Block Model
Ioana Dumitriu, Hai-Xiao Wang
High-Dimensional Statistical Inference for Linkage Disequilibrium Score Regression and Its Cross-Ancestry Extensions
Fei Xue, Bingxin Zhao
Deep Horseshoe Gaussian Processes
Ismaël Castillo, Thibault Christophe Randrianarisoa
The Functional Graphical Lasso
Kartik Govind Waghmare, Tomas Masak, Victor Michael Panaretos
Higher-Order Entrywise Eigenvectors Analysis of Low-Rank Random Matrices: Bias Correction, Edgeworth Expansion, and Bootstrap
Yichi Zhang, Fangzheng Xie
Counterfactual Inference in Sequential Experiments
Raaz Dwivedi, Katherine Tian, Sabina Tomkins et al.
Optimal Vintage Factor Analysis with Deflation Varimax
Xin Bing, Xin He, Dian Jin et al.
Low-Degree Hardness of Detection for Correlated Erdős-Rényi Graphs
Jian Ding, Zhangsong Li, Hang Du
Spectral Gap Bounds for Reversible Hybrid Gibbs Chains
Qian Qin, Nianqiao Ju, Guanyang Wang
Fixed and Random Covariance Regression Analyses
Wei Lan, Chih-Ling Tsai, Runze Li et al.
Debiased Regression Adjustment in Completely Randomized Experiments with Moderately High-Dimensional Covariates
Xin Lu, Fan Yang, Yuhao Wang
Reinforcement Learning for Individual Optimal Policy From Heterogeneous Data
Annie Qu, Rui Miao, Babak Shahbaba
Policy Learning “Without” Overlap: Pessimism and Generalized Empirical Bernstein’s Inequality
Zhaoran Wang, Ying Jin, Zhimei Ren et al.
Algorithmic Stability Implies Training-Conditional Coverage for Distribution-Free Prediction Methods
Ruiting Liang, Rina Foygel Barber
Semiparametric Modeling and Analysis for Longitudinal Network Data
Yang Feng, Yinqiu He, Jiajin Sun et al.
On the Structural Dimension of Sliced Inverse Regression
Dongming Huang, Songtao Tian, Qian Lin
Erratum: Quantile Processes and Their Applications in Finite Populations
Anurag Dey, Probal Chaudhuri
Dualizing Le Cam’s Method for Functional Estimation, with Applications to Estimating the Unseens
Yihong Wu, Yury Polyanskiy
Asymptotically-Exact Selective Inference for Quantile Regression
Xuming He, Yumeng Wang, Snigdha Panigrahi
Near-Optimal Inference in Adaptive Linear Regression
Koulik Khamaru, Yash Deshpande, Tor Lattimore et al.
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Zhuoran Yang, Han Shen, Tianyi Chen
Bilevel optimization has been recently applied to many machine learning tasks. However, their applications have been restricted to the supervised lear...
A Common-Cause Principle for Eliminating Selection Bias in Causal Estimands Through Covariate Adjustment
Ilya Shpitser, Maya Mathur, Tyler VanderWeele
Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers
Fan Yang, Hongyang R. Zhang, Sen Wu et al.
The problem of learning one task using samples from another task is central to transfer learning. In this paper, we focus on answering the following q...
Score-based Causal Representation Learning: Linear and General Transformations
Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam et al.
This paper addresses intervention-based causal representation learning (CRL) under a general nonparametric latent causal model and an unknown transfor...
On the Statistical Properties of Generative Adversarial Models for Low Intrinsic Data Dimension
Saptarshi Chakraborty, Peter L. Bartlett
Despite the remarkable empirical successes of Generative Adversarial Networks (GANs), the theoretical guarantees for their statistical accuracy remain...
Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms
Keru Wu, Yuansi Chen, Wooseok Ha et al.
Domain adaptation (DA) is a statistical learning problem that arises when the distribution of the source data used to train a model differs from that ...
Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles
Lesi Chen, Yaohua Ma, Jingzhao Zhang
In this work, we consider bilevel optimization when the lower-level problem is strongly convex. Recent works show that with a Hessian-vector product (...
Adaptive Distributed Kernel Ridge Regression: A Feasible Distributed Learning Scheme for Data Silos
Shao-Bo Lin, Xiaotong Liu, Di Wang et al.
Data silos, mainly caused by privacy and interoperability, significantly constrain collaborations among different organizations with similar data for ...
On Global and Local Convergence of Iterative Linear Quadratic Optimization Algorithms for Discrete Time Nonlinear Control
Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel et al.
A classical approach for solving discrete time nonlinear control on a finite horizon consists in repeatedly minimizing linear quadratic approximations...
A Decentralized Proximal Gradient Tracking Algorithm for Composite Optimization on Riemannian Manifolds
Lei Wang, Le Bao, Xin Liu
This paper focuses on minimizing a smooth function combined with a nonsmooth regularization term on a compact Riemannian submanifold embedded in the E...
Learning conditional distributions on continuous spaces
Cyril Benezet, Ziteng Cheng, Sebastian Jaimungal
We investigate sample-based learning of conditional distributions on multi-dimensional unit boxes, allowing for different dimensions of the feature an...
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Lukas Zierahn, Dirk van der Hoeven, Tal Lancewicki et al.
We derive a new analysis of Follow The Regularized Leader (FTRL) for online learning with delayed bandit feedback. By separating the cost of delayed f...
Error bounds for particle gradient descent, and extensions of the log-Sobolev and Talagrand inequalities
Rocco Caprio, Juan Kuntz, Samuel Power et al.
We derive non-asymptotic error bounds for particle gradient descent (PGD, Kuntz et al. (2023)), a recently introduced algorithm for maximum likelihoo...
Linear Hypothesis Testing in High-Dimensional Expected Shortfall Regression with Heavy-Tailed Errors
Kean Ming Tan, Wen-Xin Zhou, Gaoyu Wu et al.
Expected shortfall (ES) is widely used for characterizing the tail of a distribution across various fields, particularly in financial risk management....
Efficient Numerical Integration in Reproducing Kernel Hilbert Spaces via Leverage Scores Sampling
Antoine Chatalic, Nicolas Schreuder, Ernesto De Vito et al.
In this work we consider the problem of numerical integration, i.e., approximating integrals with respect to a target probability measure using only p...
Distribution Free Tests for Model Selection Based on Maximum Mean Discrepancy with Estimated Parameters
Florian Brück, Jean-David Fermanian, Aleksey Min
There exist several testing procedures based on the maximum mean discrepancy (MMD) to address the challenge of model specification. However, these tes...
Statistical field theory for Markov decision processes under uncertainty
George Stamatescu
A statistical field theory is introduced for finite state and action Markov decision processes with unknown parameters, in a Bayesian setting. The Bel...
Bayesian Data Sketching for Varying Coefficient Regression Models
Rajarshi Guhaniyogi, Laura Baracaldo, Sudipto Banerjee
Varying coefficient models are popular for estimating nonlinear regression functions in functional data models. Their Bayesian variants have received ...
Bagged k-Distance for Mode-Based Clustering Using the Probability of Localized Level Sets
Hanyuan Hang
In this paper, we propose an ensemble learning algorithm named bagged $k$-distance for mode-based clustering (BDMBC) by putting forward a new measure ...
Linear cost and exponentially convergent approximation of Gaussian Matérn processes on intervals
David Bolin, Vaibhav Mehandiratta, Alexandre B. Simas
The computational cost for inference and prediction of statistical models based on Gaussian processes with Matérn covariance functions scales cubicall...
Invariant Subspace Decomposition
Margherita Lazzaretto, Jonas Peters, Niklas Pfister
We consider the task of predicting a response $Y$ from a set of covariates $X$ in settings where the conditional distribution of $Y$ given $X$ changes...
Posterior Concentrations of Fully-Connected Bayesian Neural Networks with General Priors on the Weights
Insung Kong, Yongdai Kim
Bayesian approaches for training deep neural networks (BNNs) have received significant interest and have been effectively utilized in a wide range of ...
Outlier Robust and Sparse Estimation of Linear Regression Coefficients
Takeyuki Sasai, Hironori Fujisawa
We consider outlier-robust and sparse estimation of linear regression coefficients, when the covariates and the noises are contaminated by adversarial...
Affine Rank Minimization via Asymptotic Log-Det Iteratively Reweighted Least Squares
Sebastian Krämer
The affine rank minimization problem is a well-known approach to matrix recovery. While there are various surrogates to this NP-hard problem, we prove...
Causal Effect of Functional Treatment
Ruoxu Tan, Wei Huang, Zheng Zhang et al.
We study the causal effect with a functional treatment variable, where practical applications often arise in neuroscience, biomedical sciences, etc. P...
Uplift Model Evaluation with Ordinal Dominance Graphs
Brecht Verbeken, Marie-Anne Guerry, Wouter Verbeke et al.
Uplift modelling is a subfield of causal learning that focuses on ranking entities by individual treatment effects. Uplift models are typically evalua...
High-Dimensional L2-Boosting: Rate of Convergence
Ye Luo, Martin Spindler, Jannis Kueck
Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of L2-Boosting in a high-dimensio...
Feature Learning in Finite-Width Bayesian Deep Linear Networks with Multiple Outputs and Convolutional Layers
Federico Bassetti, Marco Gherardi, Alessandro Ingrosso et al.
Deep linear networks have been extensively studied, as they provide simplified models of deep learning. However, little is known in the case of finite...
How good is your Laplace approximation of the Bayesian posterior? Finite-sample computable error bounds for a variety of useful divergences
Mikołaj J. Kasprzak, Ryan Giordano, Tamara Broderick
The Laplace approximation is a popular method for constructing a Gaussian approximation to the Bayesian posterior and thereby approximating the poster...
Integral Probability Metrics Meet Neural Networks: The Radon-Kolmogorov-Smirnov Test
Alden Green, Seunghoon Paik, Michael Celentano et al.
Integral probability metrics (IPMs) constitute a general class of nonparametric two-sample tests that are based on maximizing the mean difference betw...
On Inference for the Support Vector Machine
Wen-Xin Zhou, Jakub Rybak, Heather Battey
The linear support vector machine has a parametrised decision boundary. The paper considers inference for the corresponding parameters, which indicate...
Random Pruning Over-parameterized Neural Networks Can Improve Generalization: A Training Dynamics Analysis
Hongru Yang, Yingbin Liang, Xiaojie Guo et al.
It has been observed that applying pruning-at-initialization methods and training the sparse networks can sometimes yield slightly better test perform...
Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability
Atticus Geiger, Duligur Ibeling, Amir Zur et al.
Causal abstraction provides a theoretical foundation for mechanistic interpretability, the field concerned with providing intelligible algorithms that...
Implicit vs Unfolded Graph Neural Networks
Yongyi Yang, Tang Liu, Yangkun Wang et al.
It has been observed that message-passing graph neural networks (GNN) sometimes struggle to maintain a healthy balance between the efficient / scalabl...
Towards Optimal Branching of Linear and Semidefinite Relaxations for Neural Network Robustness Certification
Brendon G. Anderson, Ziye Ma, Jingqi Li et al.
In this paper, we study certifying the robustness of ReLU neural networks against adversarial input perturbations. To diminish the relaxation error su...
GraphNeuralNetworks.jl: Deep Learning on Graphs with Julia
Carlo Lucibello, Aurora Rossi
GraphNeuralNetworks.jl is an open-source framework for deep learning on graphs, written in the Julia programming language. It supports multiple GPU ba...
Dynamic angular synchronization under smoothness constraints
Ernesto Araya, Mihai Cucuringu, Hemant Tyagi
Given an undirected measurement graph $\mathcal{H} = ([n], \mathcal{E})$, the classical angular synchronization problem consists of recovering unkno...
Derivative-Informed Neural Operator Acceleration of Geometric MCMC for Infinite-Dimensional Bayesian Inverse Problems
Lianghao Cao, Thomas O'Leary-Roseberry, Omar Ghattas
We propose an operator learning approach to accelerate geometric Markov chain Monte Carlo (MCMC) for solving infinite-dimensional Bayesian inverse pro...
Wasserstein F-tests for Frechet regression on Bures-Wasserstein manifolds
Hongzhe Li, Haoshu Xu
This paper addresses regression analysis for covariance matrix-valued outcomes with Euclidean covariates, motivated by applications in single-cell gen...
Distributed Stochastic Bilevel Optimization: Improved Complexity and Heterogeneity Analysis
Youcheng Niu, Jinming Xu, Ying Sun et al.
This paper considers solving a class of nonconvex-strongly-convex distributed stochastic bilevel optimization (DSBO) problems with personalized inner-...
Learning causal graphs via nonlinear sufficient dimension reduction
Eftychia Solea, Bing Li, Kyongwon Kim
We introduce a new nonparametric methodology for estimating a directed acyclic graph (DAG) from observational data. Our method is nonparametric in nat...
On Consistent Bayesian Inference from Synthetic Data
Ossi Räisä, Joonas Jälkö, Antti Honkela
Generating synthetic data, with or without differential privacy, has attracted significant attention as a potential solution to the dilemma between ma...
Optimization Over a Probability Simplex
James Chok, Geoffrey M. Vasil
We propose a new iteration scheme, the Cauchy-Simplex, to optimize convex problems over the probability simplex $\{w\in\mathbb{R}^n\ |\ \sum_i w_i=1\ ...
Laplace Meets Moreau: Smooth Approximation to Infimal Convolutions Using Laplace's Method
Ryan J. Tibshirani, Samy Wu Fung, Howard Heaton et al.
We study approximations to the Moreau envelope---and infimal convolutions more broadly---based on Laplace's method, a classical tool in analysis which...
Sampling and Estimation on Manifolds using the Langevin Diffusion
Karthik Bharath, Alexander Lewis, Akash Sharma et al.
Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion with invariant measure $\te...
Sharp Bounds for Sequential Federated Learning on Heterogeneous Data
Yipeng Li, Xinchen Lyu
There are two paradigms in Federated Learning (FL): parallel FL (PFL), where models are trained in a parallel manner across clients, and sequential FL...
Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization
Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang et al.
Determining whether deep neural network (DNN) models can reliably recover target functions at overparameterization is a critical yet complex issue in ...
Stabilizing Sharpness-Aware Minimization Through A Simple Renormalization Strategy
Chengli Tan, Jiangshe Zhang, Junmin Liu et al.
Recently, sharpness-aware minimization (SAM) has attracted much attention because of its surprising effectiveness in improving generalization performa...
Fine-Grained Change Point Detection for Topic Modeling with Pitman-Yor Process
Feifei Wang, Zimeng Zhao, Ruimin Ye et al.
Identifying change points in dynamic text data is crucial for understanding the evolving nature of topics across various sources, such as news article...
Deletion Robust Non-Monotone Submodular Maximization over Matroids
Paul Dütting, Federico Fusco, Silvio Lattanzi et al.
We study the deletion robust version of submodular maximization under matroid constraints. The goal is to extract a small-size summary of the data set...
Instability, Computational Efficiency and Statistical Accuracy
Raaz Dwivedi, Koulik Khamaru, Martin J. Wainwright et al.
Many statistical estimators are defined as the fixed point of a data-dependent operator, with estimators based on minimizing a cost function being an ...
Estimation of Local Geometric Structure on Manifolds from Noisy Data
Yariv Aizenbud, Barak Sober
A common observation in data-driven applications is that high-dimensional data have a low intrinsic dimension, at least locally. In this work, we cons...
Ontolearn---A Framework for Large-scale OWL Class Expression Learning in Python
Caglar Demir, Alkid Baci, N'Dah Jean Kouagou et al.
In this paper, we present Ontolearn---a framework for learning OWL class expressions over large knowledge graphs. Ontolearn contains efficient implem...
Continuously evolving rewards in an open-ended environment
Richard M. Bailey
Unambiguous identification of the rewards driving behaviours of entities operating in complex open-ended real-world environments is difficult, in part...
Recursive Causal Discovery
Ehsan Mokhtarian, Sepehr Elahi, Sina Akbari et al.
Causal discovery from observational data, i.e., learning the causal graph from a finite set of samples from the joint distribution of the variables, i...
Evaluation of Active Feature Acquisition Methods for Time-varying Feature Settings
Ilya Shpitser, Henrik von Kleist, Alireza Zamanian et al.
Machine learning methods often assume that input features are available at no cost. However, in domains like healthcare, where acquiring features coul...
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni, Nicklas Werge
Stochastic optimization methods face new challenges in the realm of streaming data, characterized by a continuous flow of large, high-dimensional data...
Determine the Number of States in Hidden Markov Models via Marginal Likelihood
Yang Chen, Cheng-Der Fuh, Chu-Lan Michael Kao
Hidden Markov models (HMM) have been widely used by scientists to model stochastic systems: the underlying process is a discrete Markov chain, and the...
Variance-Aware Estimation of Kernel Mean Embedding
Geoffrey Wolfer, Pierre Alquier
An important feature of kernel mean embeddings (KME) is that the rate of convergence of the empirical KME to the true distribution KME can be bounded ...
Scaling ResNets in the Large-depth Regime
Pierre Marion, Adeline Fermanian, Gérard Biau et al.
Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these arc...
A Comparative Evaluation of Quantification Methods
Tobias Schumacher, Markus Strohmaier, Florian Lemmerich
Quantification represents the problem of estimating the distribution of class labels on unseen data. It also represents a growing research field in su...
Lightning UQ Box: Uncertainty Quantification for Neural Networks
Nils Lehmann, Nina Maria Gottschling, Jakob Gawlikowski et al.
Although neural networks have shown impressive results in a multitude of application domains, the "black box" nature of deep learning and lack of conf...
Scaling Data-Constrained Language Models
Niklas Muennighoff, Alexander M. Rush, Boaz Barak et al.
The current trend of scaling language models involves increasing both parameter count and training data set size. Extrapolating this trend suggests th...
Curvature-based Clustering on Graphs
Zachary Lubberts, Yu Tian, Melanie Weber
Unsupervised node clustering (or community detection) is a classical graph learning task. In this paper, we study algorithms that exploit the geometry...
Composite Goodness-of-fit Tests with Kernels
Oscar Key, Arthur Gretton, François-Xavier Briol et al.
We propose kernel-based hypothesis tests for the challenging composite testing problem, where we are interested in whether the data comes from any dis...
PFLlib: A Beginner-Friendly and Comprehensive Personalized Federated Learning Library and Benchmark
Yang Liu, Jianqing Zhang, Yang Hua et al.
Amid the ongoing advancements in Federated Learning (FL), a machine learning paradigm that allows collaborative learning with data privacy protection,...
The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning
Wooseok Ha, Bin Yu, Nikhil Ghosh et al.
In this work, we investigate the dynamics of stochastic gradient descent (SGD) when training a single-neuron autoencoder with linear or ReLU activatio...
Efficient and Robust Transfer Learning of Optimal Individualized Treatment Regimes with Right-Censored Survival Data
Pan Zhao, Shu Yang, Julie Josse
An individualized treatment regime (ITR) is a decision rule that assigns treatments based on patients' characteristics. The value function of an ITR i...
DAGs as Minimal I-maps for the Induced Models of Causal Bayesian Networks under Conditioning
Xiangdong Xie, Jiahua Guo, Yi Sun
Bayesian networks (BNs) are a powerful tool for knowledge representation and reasoning, especially for complex systems. A critical task in the applic...
Adjusted Expected Improvement for Cumulative Regret Minimization in Noisy Bayesian Optimization
Shouri Hu, Haowei Wang, Zhongxiang Dai et al.
The expected improvement (EI) is one of the most popular acquisition functions for Bayesian optimization (BO) and has demonstrated good empirical perf...
Manifold Fitting under Unbounded Noise
Zhigang Yao, Yuqing Xia
In the field of non-Euclidean statistical analysis, a trend has emerged in recent times, of attempts to recover a low dimensional structure, namely a ...
Learning Global Nash Equilibrium in Team Competitive Games with Generalized Fictitious Cross-Play
Zelai Xu, Chao Yu, Yancheng Liang et al.
Self-play (SP) is a popular multi-agent reinforcement learning framework for competitive games. Despite the empirical success, the theoretical propert...
Wasserstein Convergence Guarantees for a General Class of Score-Based Generative Models
Xuefeng Gao, Hoang M. Nguyen, Lingjiong Zhu
Score-based generative models are a recent class of deep generative models with state-of-the-art performance in many applications. In this paper, we e...
Extremal graphical modeling with latent variables via convex optimization
Sebastian Engelke, Armeen Taeb
Extremal graphical models encode the conditional independence structure of multivariate extremes and provide a powerful tool for quantifying the risk ...
On the Approximation of Kernel functions
Paul Dommel, Alois Pichler
Various methods in statistical learning build on kernels considered in reproducing kernel Hilbert spaces. In applications, the kernel is often selecte...
Efficient and Robust Semi-supervised Estimation of Average Treatment Effect with Partially Annotated Treatment and Response
Jue Hou, Tianxi Cai, Rajarshi Mukherjee
A notable challenge of leveraging Electronic Health Records (EHR) for treatment effect assessment is the lack of precise information on important clin...
Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning
Jingyang Li, Kuangyu Ding, Kim-Chuan Toh
Stochastic gradient methods for minimizing nonconvex composite objective functions typically rely on the Lipschitz smoothness of the differentiable pa...
Optimizing Data Collection for Machine Learning
Rafid Mahmood, James Lucas, Jose M. Alvarez et al.
Modern deep learning systems require huge data sets to achieve impressive performance, but there is little guidance on how much or what kind of data t...
Unbalanced Kantorovich-Rubinstein distance, plan, and barycenter on finite spaces: A statistical perspective
Shayan Hundrieser, Florian Heinemann, Marcel Klatt et al.
We analyze statistical properties of plug-in estimators for unbalanced optimal transport quantities between finitely supported measures in different p...
Copula-based Sensitivity Analysis for Multi-Treatment Causal Inference with Unobserved Confounding
Jiajing Zheng, Alexander D'Amour, Alexander Franks
Recent work has focused on the potential and pitfalls of causal identification in observational studies with multiple simultaneous treatments. Buildin...
Rank-one Convexification for Sparse Regression
Alper Atamturk, Andres Gomez
Sparse regression models are increasingly prevalent due to their ease of interpretability and superior out-of-sample performance. However, the exact m...
gsplat: An Open-Source Library for Gaussian Splatting
Vickie Ye, Ruilong Li, Justin Kerr et al.
gsplat is an open-source library designed for training and developing Gaussian Splatting methods. It features a front-end with Python bindings compati...
Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming
Sen Na, Michael Mahoney
We consider online statistical inference of constrained stochastic nonlinear optimization problems. We apply the Stochastic Sequential Quadratic Progr...
Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds
Clément Bonet, Lucas Drumetz, Nicolas Courty
While many Machine Learning methods have been developed or transposed on Riemannian manifolds to tackle data with known non-Euclidean geometry, Optima...
Accelerating optimization over the space of probability measures
Shi Chen, Qin Li, Oliver Tse et al.
The acceleration of gradient-based optimization methods is a subject of significant practical and theoretical importance, particularly within machine ...
Bayesian Multi-Group Gaussian Process Models for Heterogeneous Group-Structured Data
Sudipto Banerjee, Didong Li, Andrew Jones et al.
Gaussian processes are pervasive in functional data analysis, machine learning, and spatial statistics for modeling complex dependencies. Scientific d...
Orthogonal Bases for Equivariant Graph Learning with Provable k-WL Expressive Power
Jia He, Maggie Cheng
Graph neural network (GNN) models have been widely used for learning graph-structured data. Due to the permutation-invariant requirement of graph lear...
Optimal Experiment Design for Causal Effect Identification
Sina Akbari, Negar Kiyavash, Jalal Etesami
Pearl’s do calculus is a complete axiomatic approach to learn the identifiable causal effects from observational data. When such an effect is not iden...
Mean Aggregator is More Robust than Robust Aggregators under Label Poisoning Attacks on Distributed Heterogeneous Data
Jie Peng, Weiyu Li, Stefan Vlaski et al.
Robustness to malicious attacks is of paramount importance for distributed learning. Existing works usually consider the classical Byzantine attacks m...
The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
Jiin Woo, Gauri Joshi, Yuejie Chi
In this paper, we consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on lo...
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers
Kaichao You, Runsheng Bai, Meng Cao et al.
PyTorch 2.x introduces a compiler designed to accelerate deep learning programs. However, for machine learning researchers, fully leveraging the PyTor...
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang
Stochastic approximation is a class of algorithms that update a vector iteratively, incrementally, and stochastically, including, e.g., stochastic gra...
Improving Graph Neural Networks on Multi-node Tasks with the Labeling Trick
Xiyuan Wang, Pan Li, Muhan Zhang
In this paper, we study using graph neural networks (GNNs) for multi-node representation learning, where a representation for a set of more than one n...
Directed Cyclic Graphs for Simultaneous Discovery of Time-Lagged and Instantaneous Causality from Longitudinal Data Using Instrumental Variables
Wei Jin, Yang Ni, Amanda B. Spence et al.
We consider the problem of causal discovery from longitudinal observational data. We develop a novel framework that simultaneously discovers the time-...
Bayesian Sparse Gaussian Mixture Model for Clustering in High Dimensions
Fangzheng Xie, Yanxun Xu, Dapeng Yao
We study the sparse high-dimensional Gaussian mixture model when the number of clusters is allowed to grow with the sample size. A minimax lower bound...
Regularizing Hard Examples Improves Adversarial Robustness
Hyungyu Lee, Saehyung Lee, Ho Bae et al.
Recent studies have validated that pruning hard-to-learn examples from training improves the generalization performance of neural networks (NNs). In t...
Random ReLU Neural Networks as Non-Gaussian Processes
Rahul Parhi, Pakshal Bohra, Ayoub El Biari et al.
We consider a large class of shallow neural networks with randomly initialized parameters and rectified linear unit activation functions. We prove tha...
Riemannian Bilevel Optimization
Jiaxiang Li, Shiqian Ma
In this work, we consider the bilevel optimization problem on Riemannian manifolds. We inspect the calculation of the hypergradient of such problems o...
Supervised Learning with Evolving Tasks and Performance Guarantees
Verónica Álvarez, Santiago Mazuelas, Jose A. Lozano
Multiple supervised learning scenarios are composed by a sequence of classification tasks. For instance, multi-task learning and continual learning ai...
Error estimation and adaptive tuning for unregularized robust M-estimator
Pierre C. Bellec, Takuya Koriyama
We consider unregularized robust M-estimators for linear models under Gaussian design and heavy-tailed noise, in the proportional asymptotics regime w...
From Sparse to Dense Functional Data in High Dimensions: Revisiting Phase Transitions from a Non-Asymptotic Perspective
Xinghao Qiao, Dong Li, Shaojun Guo et al.
Nonparametric estimation of the mean and covariance functions is ubiquitous in functional data analysis and local linear smoothing techniques are most...
Locally Private Causal Inference for Randomized Experiments
Jordan Awan, Yuki Ohnishi
Local differential privacy is a differential privacy paradigm in which individuals first apply a privacy mechanism to their data (often by adding nois...
Estimating Network-Mediated Causal Effects via Principal Components Network Regression
Alex Hayes, Mark M. Fredrickson, Keith Levin
We develop a method to decompose causal effects on a social network into an indirect effect mediated by the network, and a direct effect independent o...
Selective Inference with Distributed Data
Snigdha Panigrahi, Sifan Liu
When data are distributed across multiple sites or machines rather than centralized in one location, researchers face the challenge of extracting mean...
Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization
Michael I. Jordan, Tianyi Lin, Chi Jin
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems in the f...
An Axiomatic Definition of Hierarchical Clustering
Ery Arias-Castro, Elizabeth Coda
In this paper, we take an axiomatic approach to defining a population hierarchical clustering for piecewise constant densities, and in a similar manne...
Test-Time Training on Video Streams
Renhao Wang, Yu Sun, Arnuv Tandon et al.
Prior work has established Test-Time Training (TTT) as a general framework to further improve a trained model at test time. Before making a prediction...
Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback
Boxin Zhao, Lingxiao Wang, Ziqi Liu et al.
Due to the high cost of communication, federated learning (FL) systems need to sample a subset of clients that are involved in each round of training....
A Random Matrix Approach to Low-Multilinear-Rank Tensor Approximation
Hugo Lebeau, Florent Chatelain, Romain Couillet
This work presents a comprehensive understanding of the estimation of a planted low-rank signal from a general spiked tensor model near the computatio...
Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents
Marco Pleines, Matthias Pallasch, Frank Zimmer et al.
Memory Gym presents a suite of 2D partially observable environments, namely Mortar Mayhem, Mystery Path, and Searing Spotlights, designed to benchmark...
Enhancing Graph Representation Learning with Localized Topological Features
Zuoyu Yan, Qi Zhao, Ze Ye et al.
Representation learning on graphs is a fundamental problem that can be crucial in various tasks. Graph neural networks, the dominant approach for grap...
Deep Out-of-Distribution Uncertainty Quantification via Weight Entropy Maximization
Antoine de Mathelin, François Deheeger, Mathilde Mougeot et al.
This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods. It proposes a...
DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data
Jiayi Tong, Jie Hu, George Hripcsak et al.
High-dimensional healthcare data, such as electronic health records (EHR) data and claims data, present two primary challenges due to the large number...
Bayes Meets Bernstein at the Meta Level: an Analysis of Fast Rates in Meta-Learning with PAC-Bayes
Pierre Alquier, Charles Riou, Badr-Eddine Chérief-Abdellatif
Bernstein's condition is a key assumption that guarantees fast rates in machine learning. For example, under this condition, the Gibbs posterior with ...
Efficiently Escaping Saddle Points in Bilevel Optimization
Shiqian Ma, Minhui Huang, Xuxing Chen et al.
Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization ...
Resampling Methods with Multiply Imputed Data
Michael W Robbins, Lane Burgette
Doubly robust conditional independence testing with generative neural networks
Yi Zhang, others
Joint Spectral Clustering in Multilayer Degree-Corrected Stochastic Blockmodels
Joshua Agterberg, Zachary Lubberts, Jesús Arroyo
Estimating maximal symmetries of regression functions via subgroup lattices
Louis G Christie, John A D Aston
Estimation of Out-of-Sample Sharpe Ratio for High Dimensional Portfolio Optimization
Xuran Meng, Yuan Cao, Weichen Wang
A Goodness-of-Fit Assessment for General Learning Procedures in High Dimensions*
Chenxuan He, Canyi Chen, Liping Zhu
Fast Approximation of Shapley Values through Fractional Factorial Designs
Zheng Zhou, Robert Mee, Herbert Hamers et al.
Harmonized estimation of subgroup-specific treatment effects in randomized trials: the use of external control data
Daniel Schwartz, others
Identifying the Structure of High-Dimensional Time Series via Eigen-Analysis
Bo Zhang, Jiti Gao, Guangming Pan et al.
Asymptotic Guarantees for Bayesian Phylogenetic Tree Reconstruction
Alisa Kirichenko, Luke J. Kelly, Jere Koskela
Conformal Prediction for Network-Assisted Regression
Robert Lunde, Elizaveta Levina, Ji Zhu
Evaluation of binary classifiers for asymptotically dependent and independent extremes
Juliette Legrand, Philippe Naveau, Marco Oesting
Additive Multi-Index Gaussian process modeling, with application to multi-physics surrogate modeling of the quark-gluon plasma
Kevin Li, Simon Mak, J.-F. Paquet et al.
SOFARI: High-Dimensional Manifold-Based Inference
Zemin Zheng, Xin Zhou, Yingying Fan et al.
Partially Exchangeable Stochastic Block Models for (Node-Colored) Multilayer Networks
Francesco Gaffi, Daniele Durante, Antonio Lijoi et al.
Existence and Applications of Finite Population Samples that are Exactly Balanced
Yves Tillé, Louis-Paul Rivest
Infinite joint species distribution models
D B Dunson, F Stolf
Decomposing Gaussians with Unknown Covariance
A Dharamshi, others
A More Robust Approach to Multivariable Mendelian Randomization
Yinxiang Wu, others
Factor pre-training in Bayesian multivariate logistic models
D B Dunson, L Mauri
Dimension estimation in a spiked covariance model using high-dimensional data augmentation
U Radojičić, J Virta
Efficient nonparametric estimators of discrimination measures with censored survival data
Torben Martinussen, Marie Skov Breum
LAMBDA: A Large Model Based Data Agent
Binyan Jiang, Ruijian Han, Sun Maojun et al.
A Latent Variable Model for Individual Degree Measures in Respondent-Driven Sampling
Yibo Wang, Sunghee Lee, Michael R. Elliott
Design and analysis of randomized trials to estimate spatio-temporally heterogeneous treatment effects
Samuel I. Watson, Thomas A. Smith
Higher Order Accurate Symmetric Bootstrap Confidence Intervals in High Dimensional Penalized Regression
Debraj Das, Arindam Chatterjee, S. N. Lahiri
The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review
Jianqing Fan, Yuling Yan, Buxin Su et al.
Manipulating an Instrumental Variable in an Observational Study of Premature Babies: Design, Bounds, and Inference
Bo Zhang, Zhe Chen, Min Haeng Cho
Deep Mutual Density Ratio Estimation with Bregman Divergence and Its Applications
Jian Huang, Dongxiao Han, Siming Zheng et al.
Adjacency Matrix Decomposition Clustering for Human Activity Data
Martha Barnard, Yingling Fan, Julian Wolfson
Debiased learning of the causal net benefit with censored event time data
Torben Martinussen, Stijn Vansteelandt
A general condition for bias attenuation by a nondifferentially mismeasured confounder
Jeffrey Zhang, Junu Lee
Summary In real-world studies, the collected confounders may suffer from measurement error. Although mismeasurement of confounders is t...
Long-term effect estimation when combining clinical trial and observational follow-up datasets
Gang Cheng, Yen-Chi Chen, Joseph M. Unger et al.
Estimating Racial Disparities When Race is Not Observed
Cory McCartan, Robin Fisher, Jacob Goldin et al.
Statistical Quantile Learning for Large Additive Latent Variable Models
Julien Bodelet, Guillaume Blanc, Jiajun Shan et al.
Design-Based Uncertainty for Quasi-Experiments*
Ashesh Rambachan, Jonathan Roth
Network Goodness-of-Fit for the Block-Model Family
Jiashun Jin, Zheng Tracy Ke, Jiajun Tang et al.
Generalized Linear Mixed Models: Modern Concepts, Methods and Applications, 2nd ed.
Xing Liu
Adaptation Using Spatially Distributed Gaussian Processes
Botond Szabo, Amine Hadji, Aad van der Vaart
Simulating diffusion bridges with score matching
J Heng, others
Leveraging External Data for Testing Experimental Therapies with Biomarker Interactions in Randomized Clinical Trials
B Ren, others
Identifying and bounding the probability of necessity for causes of effects with ordinal outcomes
Chao Zhang, others
A family of toroidal diffusions with exact likelihood inference
E García-Portugués, M Sørensen
Correction to: Parameterizing and simulating from causal models
Optimal clustering by Lloyd’s algorithm for low-rank mixture model
Dong Xia, Zhongyuan Lyu
Aggregated Projection Method: A New Approach for Group Factor Model
Jiaqi Hu, Ting Li, Xueqin Wang
Checking the Cox Proportional Hazards Model with Interval-Censored Data
Yangjianchen Xu, Donglin Zeng, D. Y. Lin
Global and Episode-Specific Prediction of Recurrent Events Using Longitudinal Health Informatics Data
Chiung-Yu Huang, Yifei Sun, Sy Han Chiou
Debiasing Watermarks for Large Language Models via Maximal Coupling
Weijie Su, Yangxinyu Xie, Xiang Li et al.
Kernel density estimation with polyspherical data and its applications
Eduardo García-Portugués, Andrea Meilán-Vila
Bayesian Inference on Brain-Computer Interfaces via GLASS
Bangyao Zhao, Jane E. Huggins, Jian Kang
Higher-order accurate two-sample network inference and network hashing
Dong Xia, Meijia Shao, Yuan Zhang et al.
Analyzing Whale Calling through Hawkes Process Modeling
Bokgyeong Kang, Erin M. Schliep, Alan E. Gelfand et al.
Bayesian Random-Effects Meta-Analysis Integrating Individual Participant Data and Aggregate Data
Yunxiang Huang, Hang J. Kim, Chiung-Yu Huang et al.
A Unified Framework for Residual Diagnostics in Generalized Linear Models and Beyond
Dungang Liu, Zewei Lin, Heping Zhang
High-dimensional covariance regression with application to co-expression QTL detection
Rakheon Kim, Jingfei Zhang
A unified generalization of the inverse regression methods via column selection
Yin Jin, Wei Luo
Identification and multiply robust estimation in causal mediation analysis across principal strata
Chao Cheng, Fan Li
Pseudo-likelihood Estimators for Graphical Models: Existence and Uniqueness
B Roycraft, B Rajaratnam
Data-Driven Tuning Parameter Selection for High-Dimensional Vector Autoregressions
Anders B. Kock, Rasmus S. Pedersen, Jesper R.-V. Sørensen
Estimating Heterogeneous Causal Mediation Effects with Bayesian Decision Tree Ensembles
Angela Ting, Antonio R. Linero
Nonparametric Test for Rough Volatility
Carsten H. Chong, Viktor Todorov
Ordinary differential equation models for a collection of discretized functions
Fang Yao, Lingxuan Shao
Who Are We Missing?: A Principled Approach to Characterizing the Underrepresented Population
Harsh Parikh, Rachael K. Ross, Elizabeth Stuart et al.
A Smoothed-Bayesian Approach to Frequency Recovery from Sketched Data
Mario Beraha, Stefano Favaro, Matteo Sesia
Unified Optimal Model Averaging with a General Loss Function based on Cross-Validation
Dalei Yu, Xinyu Zhang, Hua Liang
Safe Policy Learning through Extrapolation: Application to Pre-trial Risk Assessment
Kosuke Imai, Eli Ben-Michael, D. James Greiner et al.
Statistical Prediction and Machine Learning
Michal Pešta
Inference in Generalized Linear Models with Robustness to Misspecified Variances
Riccardo De Santis, Jelle J. Goeman, Jesse Hemerik et al.
Least squares for cardinal paired comparisons data
Rahul Singh, others
A practical interval estimation method for spectral density function
Haihan Yu, Mark S. Kaiser, Daniel J. Nordman
Communication-Efficient Distributed Estimation and Inference for Cox’s Model
Jianqing Fan, Pierre Bayle, Zhipeng Lou
Design-Based Causal Inference with Missing Outcomes: Missingness Mechanisms, Imputation-Assisted Randomization Tests, and Covariate Adjustment
Yang Feng, Siyu Heng, Jiawei Zhang
Testing Elliptical Models in High Dimensions
Siyao Wang, Miles E. Lopes
Semiparametric localized principal stratification analysis with continuous strata
Yichi Zhang, Shu Yang
A Semiparametric Instrumented Difference-in-Differences Approach to Policy Learning
Pan Zhao, Yifan Cui
Orthogonalized moment aberration for mixed-level multi-stratum factorial designs with partially-relaxed orthogonal block structures
Ming-Chung Chang
Regularized halfspace depth for functional data
Hyemin Yeon, others
Multicalibration for Modeling Censored Survival Data with Universal Adaptability
Hanxuan Ye, Hongzhe Li
Fair Coins Tend to Land on the Same Side They Started: Evidence from 350,757 Flips
František Bartoš, Alexandra Sarafoglou, Henrik R. Godmann et al.
Incorporating Auxiliary Variables to Improve the Efficiency of Time-Varying Treatment Effect Estimation
Jieru Shi, Zhenke Wu, Walter Dempsey
Modelling tree survival for investigating climate change effects
Nicole Augustin, Axel Albrecht, Karim Anaya-Izquierdo et al.
Dependent Random Partitions by Shrinking Toward an Anchor
David B. Dahl, Richard L. Warr, Thomas P. Jensen
Asymptotic Behavior of Adversarial Training Estimator under ℓ∞-Perturbation
Yiling Xie, Xiaoming Huo
Likelihood Ratio Tests in Random Graph Models with Increasing Dimensions
Ting Yan, Ji Zhu, Yuanzhang Li et al.
Simultaneous Inference for Generalized Linear Models with Unmeasured Confounders
Larry Wasserman, Jin-Hong Du, Kathryn Roeder
Causal Inference for Genomic Data with Multiple Heterogeneous Outcomes
Larry Wasserman, Jin-Hong Du, Zhenghao Zeng et al.
Goodness-of-fit tests for high-dimensional Gaussian graphical models via exchangeable sampling
Xiaotong Lin, others
Nonsense associations in Markov random fields with pairwise dependence
Sohom Bhattacharya, others
Posterior Predictive Design for Phase I Clinical Trials
Chenqi Fu, Shouhao Zhou, J. Jack Lee
Testing Mutually Exclusive Hypotheses for Multi-Response Regressions
Jiaqi Huang, Wenbiao Zhao, Lixing Zhu
Distributional Off-Policy Evaluation in Reinforcement Learning
Zhaoran Wang, Zhengling Qi, Chenjia Bai et al.
A Bayesian Criterion for Rerandomization
Zhaoyang Liu, Tingxuan Han, Donald B. Rubin et al.
Aggregating Dependent Signals with Heavy-Tailed Combination Tests
Lin Gui, others
Robust functional principal component analysis for non-Euclidean random objects
Jiazhen Xu, others
Detection and inference of changes in high-dimensional linear regression with nonsparse structures
Haeran Cho, others
Covariate-assisted bounds on causal effects with instrumental variables
Alexander W Levis, others
Isotonic mechanism for exponential family estimation in machine learning peer review
Yuling Yan, others
Improving the false coverage rate adjusted confidence intervals
Tzviel Frostig, Yoav Benjamini
Consistent and Scalable Composite Likelihood Estimation of Probit Models with Crossed Random Effects
R Bellio, others
Distributed Tensor Principal Component Analysis with Data Heterogeneity
Xi Chen, Yichen Zhang, Elynn Chen et al.
Hypothesis Testing for a Functional Parameter via Self-Normalization
Yi Zhang, Xiaofeng Shao
Estimation and Inference of Quantile Spatially Varying Coefficient Models Over Complicated Domains
Myungjin Kim, Li Wang, Huixia Judy Wang
Sparse Bayesian Multidimensional Item Response Theory
Jiguang Li, Robert Gibbons, Veronika Ročková
Powerful Partial Conjunction Hypothesis Testing via Conditioning
B Liang, others
Tail calibration of probabilistic forecasts
Sam Allen, Jonathan Koh, Johan Segers et al.
An optimal design framework for lasso sign recovery
Jonathan W Stallrich, others
Bayesian mixture models with repulsive and attractive atoms
Mario Beraha, others
A statistical view of column subset selection
Anav Sood, Trevor Hastie
Prediction of Cognitive Function via Brain Region Volumes with Applications to Alzheimer’s Disease Based on Space-Factor-Guided Functional Principal Component Analysis
Shoudao Wen, Yi Li, Dehan Kong et al.
Unbiased and consistent nested sampling via sequential Monte Carlo
Robert Salomone, others
SymmPI: predictive inference for data with group symmetries
Edgar Dobriban, Mengxin Yu
Communication-Efficient Distributed Sparse Learning with Oracle Property and Geometric Convergence
Weidong Liu, Jiyuan Tu, Xiaojun Mao
Product centred Dirichlet processes for Bayesian multiview clustering
Alexander Dombowsky, David B Dunson
Integer Programming for Learning Directed Acyclic Graphs from Non-identifiable Gaussian Models
Tong Xu, others
Bias correction of quadratic spectral estimators
Lachlan C Astfalck, others
Statistical Inference for High-Dimensional Spectral Density Matrix
Jinyuan Chang, Xiaofeng Shao, Qing Jiang et al.
Data Fusion Using Weakly Aligned Sources
Sijia Li, Peter B. Gilbert, Rui Duan et al.
Augmented balancing weights as linear regression
David Bruns-Smith, others
Dynamic Regression of Longitudinal Trajectory Features
Huijuan Ma, Wei Zhao, John Hanfelt et al.
Graphical methods for Order-of-Addition experiments
Nicholas Rios, Dennis K J Lin
Frequency Domain Statistical Inference for High-Dimensional Time Series
Jonas Krampe, Efstathios Paparoditis
Cutting Feedback in Misspecified Copula Models
Michael Stanley Smith, Weichang Yu, David J. Nott et al.
Geodesic Mixed Effects Models for Repeatedly Observed/Longitudinal Random Objects
Hans-Georg Müller, Satarupa Bhattacharjee
Positive and Unlabeled Data: Model, Estimation, Inference, and Classification
Siyan Liu, Chi-Kuang Yeh, Xin Zhang et al.
Kernel Meets Sieve: Transformed Hazards Models with Sparse Longitudinal Covariates
Dayu Sun, Zhuowei Sun, Xingqiu Zhao et al.
An Economical Approach to Design Posterior Analyses
Luke Hagar, Nathaniel T. Stevens
Towards a turnkey approach for unbiased Monte Carlo estimation of smooth functions of expectations
Nicolas Chopin, others
Convexity and measures of statistical association
Emanuele Borgonovo, others
Confidence on the focal: conformal prediction with selection-conditional coverage
Ying Jin, Zhimei Ren
Robustifying Likelihoods by Optimistically Re-weighting Data
Miheer Dewaskar, Christopher Tosh, Jeremias Knoblauch et al.
A new approach to optimal design under model uncertainty motivated by multi-armed bandits
Mingyao Ai, Holger Dette, Zhengfu Liu et al.
Degree-Heterogeneous Latent Class Analysis for High-Dimensional Discrete Data
Zhongyuan Lyu, Ling Chen, Yuqi Gu
Multi-Dimensional Domain Generalization with Low-Rank Structures
Sai Li, Linjun Zhang
Statistical Inference for High-Dimensional Convoluted Rank Regression
Liping Zhu, Leheng Cai, Xu Guo et al.
Class-Specific Joint Feature Screening in Ultrahigh-Dimensional Mixture Regression
Kaili Jing, Abbas Khalili, Chen Xu
Multi-resolution subsampling for linear classification with massive data
Haolin Chen, others
Efficient Estimation for Longitudinal Networks via Adaptive Merging
Haoran Zhang, Junhui Wang
Distributional Outcome Regression via Quantile Functions and its Application to Modelling Continuously Monitored Heart Rate and Physical Activity
Rahul Ghosal, Sujit K. Ghosh, Jennifer A. Schrack et al.
Estimation of Over-Parameterized Models from an Auto-Modeling Perspective
Yiran Jiang, Chuanhai Liu
Fast Signal Region Detection With Application to Whole Genome Association Studies
Fang Yao, Wei Zhang, Fan Wang
Phase-Type Distributions for Sieve Estimation
Xingqiu Zhao, Hu Xiangbin, Yudong Wang et al.
Deep Regression for Repeated Measurements
Fang Yao, Hang Zhou, Shunxing Yan
High-Dimensional Variable Clustering based on Maxima of a Weakly Dependent Random Process
Alexis Boulin, Elena Di Bernardino, Thomas Laloë et al.
Sequential Monte Carlo testing by betting
Lasse Fischer, Aaditya Ramdas
Estimating Heterogeneous Exposure Effects in the Case-Crossover Design Using BART
Jacob R. Englert, Stefanie T. Ebelt, Howard H. Chang
Correction to: Consistent and fast inference in compartmental models of epidemics using Poisson Approximate Likelihoods
A general framework for cutting feedback within modularized Bayesian inference
Yang Liu, Robert J B Goudie
Strong oracle guarantees for partial penalized tests of high-dimensional generalized linear models
Tate Jacobson
A spike-and-slab prior for dimension selection in generalized linear network eigenmodels
Joshua D Loyal, Yuguo Chen
High-Dimensional Expected Shortfall Regression
Xuming He, Kean Ming Tan, Wen-Xin Zhou et al.
Federated Adaptive Causal Estimation (FACE) of Target Treatment Effects
Jue Hou, Tianxi Cai, Rui Duan et al.
Selecting informative conformal prediction sets with false coverage rate control
Ulysse Gazin, others
Noise-induced randomization in regression discontinuity designs
Dean Eckles, Nikolaos Ignatiadis, Stefan Wager et al.
Summary Regression discontinuity designs assess causal effects in settings where treatment is determined by whether an observed running...
Nonparametric data segmentation in multivariate time series via joint characteristic functions
E T McGonigle, H Cho
Summary Modern time series data often exhibit complex dependence and structural changes that are not easily characterized by shifts in ...
An omitted variable bias framework for sensitivity analysis of instrumental variables
Carlos Cinelli, Chad Hazlett
Abstract We develop an omitted variable bias framework for sensitivity analysis of instrumental variable estimates that naturally handl...
On the fundamental limitations of multi-proposal Markov chain Monte Carlo algorithms
F Pozza, G Zanella
Summary We study multi-proposal Markov chain Monte Carlo algorithms, such as multiple-try or generalized Metropolis–Hastings schemes, w...
Randomization inference when N equals one
Tengyuan Liang, Benjamin Recht
Summary For decades, $N$-of-1 experiments, where a unit serves as its own control and treatment in different time windows, have been ...
Continuous-time locally stationary wavelet processes
H A Palasciano, M I Knight, G P Nason
Abstract This article introduces the class of continuous-time locally stationary wavelet processes. Continuous-time models enable us to...
Testable implications of outcome-independent missingness not at random in covariates
A Sjölander, S Hägg
Summary A common aim of empirical research is to regress an outcome on a set of covariates, when some covariates are subject to missing...
Hub Detection in Gaussian Graphical Models
José Á. Sánchez Gómez, Weibin Mo, Junlong Zhao et al.
Bayesian penalized empirical likelihood and Markov Chain Monte Carlo sampling
Jinyuan Chang, others
Conformal prediction with conditional guarantees
Isaac Gibbs, others
U-Statistic Reduction: Higher-Order Accurate Risk Control and Statistical-Computational Trade-Off
Dong Xia, Meijia Shao, Yuan Zhang
A Novel Approach of High Dimensional Linear Hypothesis Testing Problem
Runze Li, Zhe Zhang, Xiufan Yu
Semiparametric Regression Analysis of Interval-Censored Multi-State Data with An Absorbing State
Donglin Zeng, D. Y. Lin, Yu Gu
Inferences in Multinomial Dynamic Mixed Logit Models
Alwell Oyet, Brajendra C. Sutradhar, R. Prabhakar Rao
High-Dimensional Knockoffs Inference for Time Series Data
Yingying Fan, Jinchi Lv, Chien-Ming Chi et al.
Identifying Genetic Variants for Brain Connectivity Using Ball Covariance Ranking and Aggregation
Heping Zhang, Wei Dai
Discovering the Network Granger Causality in Large Vector Autoregressive Models
Yoshimasa Uematsu, Takashi Yamagata
An Adaptive Adjustment to the R² Statistic in High-Dimensional Elliptical Models
Shizhe Hong, Weiming Li, Qiang Liu et al.
A conditioning tactic that increases design sensitivity in observational block designs
Paul R Rosenbaum
Adaptive experiments toward learning treatment effect heterogeneity
Waverly Wei, others
Semiparametric posterior corrections
Andrew Yiu, others
High-dimensional Factor Analysis for Network-linked data
Jinming Li, others
Robust Inference for Federated Meta-Learning
Tianxi Cai, Larry Han, Zijian Guo et al.
Comparison of Longitudinal Trajectories Using a High-Dimensional Partial Linear Semiparametric Mixed-Effects Model
Sami Leon, Tong Tong Wu
Adaptive Testing for High-Dimensional Data
Xiaofeng Shao, Yangfan Zhang, Runmin Wang
Random effects model-based sufficient dimension reduction for independent clustered data
Linh H. Nghiem, Francis K.C. Hui
Robust Bayesian Modeling of Counts with Zero Inflation and Outliers: Theoretical Robustness and Efficient Computation
Yasuyuki Hamura, Kaoru Irie, Shonosuke Sugasawa
Analysis of Variance of Tensor Product Reproducing Kernel Hilbert Spaces on Metric Spaces
Xueqin Wang, Zhanfeng Wang, Rui Pan et al.
A Bias-Accuracy-Privacy Trilemma for Statistical Estimation
Gautam Kamath, Argyris Mouzakis, Matthew Regehr et al.
Estimation and Inference for Nonparametric Expected Shortfall Regression over RKHS
Kean Ming Tan, Wen-Xin Zhou, Myeonghun Yu et al.
Large Precision Matrix Estimation with Unknown Group Structure
Cong Cheng, Yuan Ke, Wenyang Zhang
Sensitivity Analysis for Quantiles of Hidden Biases in Matched Observational Studies
Dongxiao Wu, Xinran Li
Two-phase rejective sampling and its asymptotic properties
Peng Ding, Shu Yang
Modeling Preferences: A Bayesian Mixture of Finite Mixtures for Rankings and Ratings
Michael Pearce, Elena A. Erosheva
Estimation and Variable Selection for Interval-Censored Failure Time Data with Random Change Point and Application to Breast Cancer Study
Mingyue Du, Yichen Lou, Jianguo Sun
Deconvolution Density Estimation with Penalized MLE
Yun Cai, Hong Gu, Toby Kenney
On the Comparative Analysis of Average Treatment Effects Estimation via Data Combination
Peng Wu, Shanshan Luo, Zhi Geng
When Composite Likelihood meets Stochastic Approximation
Giuseppe Alfonzetti, Ruggero Bellio, Yunxiao Chen et al.
Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity
Edgar Dobriban, Xinmeng Huang, Kan Xu et al.
Modeling Hypergraphs with Diversity and Heterogeneous Popularity
Ji Zhu, Xianshi Yu
Randomized empirical likelihood test for ultra-high dimensional means under general covariances
Yuexin Chen, others
Analytic natural gradient updates for Cholesky factor in Gaussian variational approximation
Linda S L Tan
Bayesian Clustering via Fusing of Localized Densities
Alexander Dombowsky, David B. Dunson
When Frictions Are Fractional: Rough Noise in High-Frequency Data
Carsten H. Chong, Thomas Delerue, Guoying Li
Simulation-Based, Finite-Sample Inference for Privatized Data
Jordan Awan, Zhanyu Wang
Geometric Ergodicity of Trans-Dimensional Markov Chain Monte Carlo Algorithms
Qian Qin
Partial Quantile Tensor Regression
Limin Peng, Dayu Sun, Zhiping Qiu et al.
Local Signal Detection on Irregular Domains with Generalized Varying Coefficient Models
Annie Qu, Heng Lian, Chengzhu Zhang et al.
Statistical and Computational Efficiency for Smooth Tensor Estimation with Unknown Permutations
Chanwoo Lee, Miaoyan Wang
Two Sample Test for Covariance Matrices in Ultra-High Dimension
Xiucai Ding, Yichen Hu, Zhenggang Wang
Coefficient Shape Alignment in Multiple Functional Linear Regression
Shuhao Jiao, Ngai-Hang Chan
On the Modeling and Prediction of High-Dimensional Functional Time Series
Qiwei Yao, Jinyuan Chang, Xinghao Qiao et al.
Matrix GARCH Model: Inference and Application
Dong Li, Cheng Yu, Feiyu Jiang et al.
Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models
Jungjun Choi, Ming Yuan
‘On the behaviour of marginal and conditional AIC in linear mixed models’
Sonja Greven, Thomas Kneib