Papers
Found 382 papers
Sorted by: Newest First
Testing and Support Recovery in Population-Based Image Data
Lianqiang Qu, Jian Huang, Liuquan Sun et al.
Long-term effect estimation when combining clinical trial and observational follow-up datasets
Gang Cheng, Yen-Chi Chen, Joseph M. Unger et al.
Statistical Quantile Learning for Large Additive Latent Variable Models
Julien Bodelet, Guillaume Blanc, Jiajun Shan et al.
Estimating Racial Disparities When Race is Not Observed
Cory McCartan, Robin Fisher, Jacob Goldin et al.
Adaptation Using Spatially Distributed Gaussian Processes
Botond Szabo, Amine Hadji, Aad van der Vaart
Generalized Linear Mixed Models: Modern Concepts, Methods and Applications, 2nd ed.
Xing Liu
Network Goodness-of-Fit for the Block-Model Family
Jiashun Jin, Zheng Tracy Ke, Jiajun Tang et al.
Design-Based Uncertainty for Quasi-Experiments
Ashesh Rambachan, Jonathan Roth
Higher-order accurate two-sample network inference and network hashing
Meijia Shao, Dong Xia, Yuan Zhang et al.
Bayesian Random-Effects Meta-Analysis Integrating Individual Participant Data and Aggregate Data
Yunxiang Huang, Hang J. Kim, Chiung-Yu Huang et al.
A Unified Framework for Residual Diagnostics in Generalized Linear Models and Beyond
Dungang Liu, Zewei Lin, Heping Zhang
High-dimensional covariance regression with application to co-expression QTL detection
Rakheon Kim, Jingfei Zhang
Kernel density estimation with polyspherical data and its applications
Eduardo García-Portugués, Andrea Meilán-Vila
Checking the Cox Proportional Hazards Model with Interval-Censored Data
Yangjianchen Xu, Donglin Zeng, D. Y. Lin
Adaptive Selection for False Discovery Rate Control Leveraging Symmetry
Kehan Wang, Yuexin Chen, Yixin Han et al.
Debiasing Watermarks for Large Language Models via Maximal Coupling
Yangxinyu Xie, Xiang Li, Tanwi Mallick et al.
Analyzing Whale Calling through Hawkes Process Modeling
Bokgyeong Kang, Erin M. Schliep, Alan E. Gelfand et al.
Bayesian Inference on Brain-Computer Interfaces via GLASS
Bangyao Zhao, Jane E. Huggins, Jian Kang
Aggregated Projection Method: A New Approach for Group Factor Model
Jiaqi Hu, Ting Li, Xueqin Wang
Global and Episode-Specific Prediction of Recurrent Events Using Longitudinal Health Informatics Data
Yifei Sun, Sy Han Chiou, Chiung-Yu Huang
Data-Driven Tuning Parameter Selection for High-Dimensional Vector Autoregressions
Anders B. Kock, Rasmus S. Pedersen, Jesper R.-V. Sørensen
Who Are We Missing?: A Principled Approach to Characterizing the Underrepresented Population
Harsh Parikh, Rachael K. Ross, Elizabeth Stuart et al.
Nonparametric Test for Rough Volatility
Carsten H. Chong, Viktor Todorov
Estimating Heterogeneous Causal Mediation Effects with Bayesian Decision Tree Ensembles
Angela Ting, Antonio R. Linero
Safe Policy Learning through Extrapolation: Application to Pre-trial Risk Assessment
Eli Ben-Michael, D. James Greiner, Kosuke Imai et al.
Statistical Prediction and Machine Learning
Michal Pešta
Inference in Generalized Linear Models with Robustness to Misspecified Variances
Riccardo De Santis, Jelle J. Goeman, Jesse Hemerik et al.
Unified Optimal Model Averaging with a General Loss Function based on Cross-Validation
Dalei Yu, Xinyu Zhang, Hua Liang
A Smoothed-Bayesian Approach to Frequency Recovery from Sketched Data
Mario Beraha, Stefano Favaro, Matteo Sesia
Communication-Efficient Distributed Estimation and Inference for Cox’s Model
Pierre Bayle, Jianqing Fan, Zhipeng Lou
Testing Elliptical Models in High Dimensions
Siyao Wang, Miles E. Lopes
A practical interval estimation method for spectral density function
Haihan Yu, Mark S. Kaiser, Daniel J. Nordman
Design-Based Causal Inference with Missing Outcomes: Missingness Mechanisms, Imputation-Assisted Randomization Tests, and Covariate Adjustment
Siyu Heng, Jiawei Zhang, Yang Feng
Fair Coins Tend to Land on the Same Side They Started: Evidence from 350,757 Flips
František Bartoš, Alexandra Sarafoglou, Henrik R. Godmann et al.
Modelling tree survival for investigating climate change effects
Nicole Augustin, Axel Albrecht, Karim Anaya-Izquierdo et al.
Incorporating Auxiliary Variables to Improve the Efficiency of Time-Varying Treatment Effect Estimation
Jieru Shi, Zhenke Wu, Walter Dempsey
A Latent Variable Model for Individual Degree Measures in Respondent-Driven Sampling
Yibo Wang, Sunghee Lee, Michael R. Elliott
Partially Exchangeable Stochastic Block Models for (Node-Colored) Multilayer Networks
Daniele Durante, Francesco Gaffi, Antonio Lijoi et al.
Conformal Prediction for Network-Assisted Regression
Robert Lunde, Elizaveta Levina, Ji Zhu
Dependent Random Partitions by Shrinking Toward an Anchor
David B. Dahl, Richard L. Warr, Thomas P. Jensen
Joint Spectral Clustering in Multilayer Degree-Corrected Stochastic Blockmodels
Joshua Agterberg, Zachary Lubberts, Jesús Arroyo
Causal Inference for Genomic Data with Multiple Heterogeneous Outcomes
Jin-Hong Du, Zhenghao Zeng, Edward H. Kennedy et al.
Simultaneous Inference for Generalized Linear Models with Unmeasured Confounders
Jin-Hong Du, Larry Wasserman, Kathryn Roeder
Likelihood Ratio Tests in Random Graph Models with Increasing Dimensions
Ting Yan, Yuanzhang Li, Jinfeng Xu et al.
Asymptotic Behavior of Adversarial Training Estimator under ℓ∞-Perturbation
Yiling Xie, Xiaoming Huo
Asymptotic guarantees for Bayesian phylogenetic tree reconstruction
Alisa Kirichenko, Luke J. Kelly, Jere Koskela
Posterior Predictive Design for Phase I Clinical Trials
Chenqi Fu, Shouhao Zhou, J. Jack Lee
Manipulating an Instrumental Variable in an Observational Study of Premature Babies: Design, Bounds, and Inference
Zhe Chen, Min Haeng Cho, Bo Zhang
Deep Mutual Density Ratio Estimation with Bregman Divergence and Its Applications
Dongxiao Han, Siming Zheng, Guohao Shen et al.
A Bayesian Criterion for Rerandomization
Zhaoyang Liu, Tingxuan Han, Donald B. Rubin et al.
LAMBDA: A Large Model Based Data Agent
Sun Maojun, Ruijian Han, Binyan Jiang et al.
Distributional Off-Policy Evaluation in Reinforcement Learning
Zhengling Qi, Chenjia Bai, Zhaoran Wang et al.
Identifying the structure of high-dimensional time series via eigen-analysis
Bo Zhang, Jiti Gao, Guangming Pan et al.
Testing Mutually Exclusive Hypotheses for Multi-Response Regressions
Jiaqi Huang, Wenbiao Zhao, Lixing Zhu
The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review
Buxin Su, Jiayao Zhang, Natalie Collina et al.
Distributed Tensor Principal Component Analysis with Data Heterogeneity
Elynn Chen, Xi Chen, Wenbo Jing et al.
Hypothesis Testing for a Functional Parameter via Self-Normalization
Yi Zhang, Xiaofeng Shao
Estimation and Inference of Quantile Spatially Varying Coefficient Models Over Complicated Domains
Myungjin Kim, Li Wang, Huixia Judy Wang
Higher Order Accurate Symmetric Bootstrap Confidence Intervals in High Dimensional Penalized Regression
Debraj Das, Arindam Chatterjee, S. N. Lahiri
Adjacency Matrix Decomposition Clustering for Human Activity Data
Martha Barnard, Yingling Fan, Julian Wolfson
Tail calibration of probabilistic forecasts
Sam Allen, Jonathan Koh, Johan Segers et al.
Sparse Bayesian Multidimensional Item Response Theory
Jiguang Li, Robert Gibbons, Veronika Ročková
Prediction of Cognitive Function via Brain Region Volumes with Applications to Alzheimer’s Disease Based on Space-Factor-Guided Functional Principal Component Analysis
Shoudao Wen, Yi Li, Dehan Kong et al.
Communication-Efficient Distributed Sparse Learning with Oracle Property and Geometric Convergence
Weidong Liu, Xiaojun Mao, Jiyuan Tu
Data Fusion Using Weakly Aligned Sources
Sijia Li, Peter B. Gilbert, Rui Duan et al.
Statistical Inference for High-Dimensional Spectral Density Matrix
Jinyuan Chang, Qing Jiang, Tucker McElroy et al.
Frequency Domain Statistical Inference for High-Dimensional Time Series
Jonas Krampe, Efstathios Paparoditis
Cutting Feedback in Misspecified Copula Models
Michael Stanley Smith, Weichang Yu, David J. Nott et al.
Dynamic Regression of Longitudinal Trajectory Features
Huijuan Ma, Wei Zhao, John Hanfelt et al.
Geodesic Mixed Effects Models for Repeatedly Observed/Longitudinal Random Objects
Satarupa Bhattacharjee, Hans-Georg Müller
Positive and Unlabeled Data: Model, Estimation, Inference, and Classification
Siyan Liu, Chi-Kuang Yeh, Xin Zhang et al.
Kernel Meets Sieve: Transformed Hazards Models with Sparse Longitudinal Covariates
Dayu Sun, Zhuowei Sun, Xingqiu Zhao et al.
An Economical Approach to Design Posterior Analyses
Luke Hagar, Nathaniel T. Stevens
Multi-Dimensional Domain Generalization with Low-Rank Structures
Sai Li, Linjun Zhang
Statistical Inference for High-Dimensional Convoluted Rank Regression
Leheng Cai, Xu Guo, Heng Lian et al.
Class-Specific Joint Feature Screening in Ultrahigh-Dimensional Mixture Regression
Kaili Jing, Abbas Khalili, Chen Xu
Robustifying Likelihoods by Optimistically Re-weighting Data
Miheer Dewaskar, Christopher Tosh, Jeremias Knoblauch et al.
Degree-Heterogeneous Latent Class Analysis for High-Dimensional Discrete Data
Zhongyuan Lyu, Ling Chen, Yuqi Gu
A new approach to optimal design under model uncertainty motivated by multi-armed bandits
Mingyao Ai, Holger Dette, Zhengfu Liu et al.
Efficient Estimation for Longitudinal Networks via Adaptive Merging
Haoran Zhang, Junhui Wang
Distributional Outcome Regression via Quantile Functions and its Application to Modelling Continuously Monitored Heart Rate and Physical Activity
Rahul Ghosal, Sujit K. Ghosh, Jennifer A. Schrack et al.
Estimation of Over-Parameterized Models from an Auto-Modeling Perspective
Yiran Jiang, Chuanhai Liu
Fast Signal Region Detection With Application to Whole Genome Association Studies
Wei Zhang, Fan Wang, Fang Yao
Phase-Type Distributions for Sieve Estimation
Hu Xiangbin, Yudong Wang, Zhisheng Ye et al.
Deep Regression for Repeated Measurements
Shunxing Yan, Fang Yao, Hang Zhou
Estimating Heterogeneous Exposure Effects in the Case-Crossover Design Using BART
Jacob R. Englert, Stefanie T. Ebelt, Howard H. Chang
High-Dimensional Variable Clustering based on Maxima of a Weakly Dependent Random Process
Alexis Boulin, Elena Di Bernardino, Thomas Laloë et al.
High-Dimensional Expected Shortfall Regression
Shushu Zhang, Xuming He, Kean Ming Tan et al.
Federated Adaptive Causal Estimation (FACE) of Target Treatment Effects
Larry Han, Jue Hou, Kelly Cho et al.
Hub Detection in Gaussian Graphical Models
José Á. Sánchez Gómez, Weibin Mo, Junlong Zhao et al.
U-Statistic Reduction: Higher-Order Accurate Risk Control and Statistical-Computational Trade-Off
Meijia Shao, Dong Xia, Yuan Zhang
A Novel Approach of High Dimensional Linear Hypothesis Testing Problem
Zhe Zhang, Xiufan Yu, Runze Li
Identifying Genetic Variants for Brain Connectivity Using Ball Covariance Ranking and Aggregation
Wei Dai, Heping Zhang
Discovering the Network Granger Causality in Large Vector Autoregressive Models
Yoshimasa Uematsu, Takashi Yamagata
An Adaptive Adjustment to the R² Statistic in High-Dimensional Elliptical Models
Shizhe Hong, Weiming Li, Qiang Liu et al.
Semiparametric Regression Analysis of Interval-Censored Multi-State Data with An Absorbing State
Yu Gu, Donglin Zeng, D. Y. Lin
Inferences in Multinomial Dynamic Mixed Logit Models
Alwell Oyet, Brajendra C. Sutradhar, R. Prabhakar Rao
High-Dimensional Knockoffs Inference for Time Series Data
Chien-Ming Chi, Yingying Fan, Ching-Kang Ing et al.
Adaptive Testing for High-Dimensional Data
Yangfan Zhang, Runmin Wang, Xiaofeng Shao
Robust Bayesian Modeling of Counts with Zero Inflation and Outliers: Theoretical Robustness and Efficient Computation
Yasuyuki Hamura, Kaoru Irie, Shonosuke Sugasawa
Robust Inference for Federated Meta-Learning
Zijian Guo, Xiudi Li, Larry Han et al.
Comparison of Longitudinal Trajectories Using a High-Dimensional Partial Linear Semiparametric Mixed-Effects Model
Sami Leon, Tong Tong Wu
Random effects model-based sufficient dimension reduction for independent clustered data
Linh H. Nghiem, Francis K.C. Hui
Analysis of Variance of Tensor Product Reproducing Kernel Hilbert Spaces on Metric Spaces
Zhanfeng Wang, Rui Pan, Xueqin Wang et al.
A Bias-Accuracy-Privacy Trilemma for Statistical Estimation
Gautam Kamath, Argyris Mouzakis, Matthew Regehr et al.
Estimation and Inference for Nonparametric Expected Shortfall Regression over RKHS
Myeonghun Yu, Yue Wang, Siyu Xie et al.
Large Precision Matrix Estimation with Unknown Group Structure
Cong Cheng, Yuan Ke, Wenyang Zhang
Sensitivity Analysis for Quantiles of Hidden Biases in Matched Observational Studies
Dongxiao Wu, Xinran Li
Modeling Preferences: A Bayesian Mixture of Finite Mixtures for Rankings and Ratings
Michael Pearce, Elena A. Erosheva
Estimation and Variable Selection for Interval-Censored Failure Time Data with Random Change Point and Application to Breast Cancer Study
Mingyue Du, Yichen Lou, Jianguo Sun
Deconvolution Density Estimation with Penalized MLE
Yun Cai, Hong Gu, Toby Kenney
Modeling Hypergraphs with Diversity and Heterogeneous Popularity
Xianshi Yu, Ji Zhu
Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity
Xinmeng Huang, Kan Xu, Donghwan Lee et al.
When Composite Likelihood meets Stochastic Approximation
Giuseppe Alfonzetti, Ruggero Bellio, Yunxiao Chen et al.
On the Comparative Analysis of Average Treatment Effects Estimation via Data Combination
Peng Wu, Shanshan Luo, Zhi Geng
Bayesian Clustering via Fusing of Localized Densities
Alexander Dombowsky, David B. Dunson
When Frictions Are Fractional: Rough Noise in High-Frequency Data
Carsten H. Chong, Thomas Delerue, Guoying Li
Simulation-Based, Finite-Sample Inference for Privatized Data
Jordan Awan, Zhanyu Wang
Geometric Ergodicity of Trans-Dimensional Markov Chain Monte Carlo Algorithms
Qian Qin
Partial Quantile Tensor Regression
Dayu Sun, Limin Peng, Zhiping Qiu et al.
Local Signal Detection on Irregular Domains with Generalized Varying Coefficient Models
Chengzhu Zhang, Lan Xue, Yu Chen et al.
Two Sample Test for Covariance Matrices in Ultra-High Dimension
Xiucai Ding, Yichen Hu, Zhenggang Wang
Coefficient Shape Alignment in Multiple Functional Linear Regression
Shuhao Jiao, Ngai-Hang Chan
Statistical and Computational Efficiency for Smooth Tensor Estimation with Unknown Permutations
Chanwoo Lee, Miaoyan Wang
On the Modeling and Prediction of High-Dimensional Functional Time Series
Jinyuan Chang, Qin Fang, Xinghao Qiao et al.
Matrix GARCH Model: Inference and Application
Cheng Yu, Dong Li, Feiyu Jiang et al.
Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models
Jungjun Choi, Ming Yuan
Solving the Poisson Equation Using Coupled Markov Chains
Pierre Etienne Jacob, Randal Douc, Anthony Lee et al.
Average Partial Effect Estimation Using Double Machine Learning
Harvey Klyne, Rajen Shah
Fundamental Limits of Community Detection From Multi-View Data: Multi-Layer, Dynamic and Partially Labeled Block Models
Xiaodong Yang, Buyu Lin, Subhabrata Sen
Online Estimation with Rolling Validation: Adaptive Nonparametric Estimation with Streaming Data
Tianyu Zhang, Jing Lei
Poisson Empirical Bayes Estimation: When Does g-Modeling Beat f-Modeling in Theory (And in Practice)?
Yandi Shen, Yihong Wu
High-Dimensional Hilbert-Schmidt Linear Regression with Hilbert Manifold Variables
Changwon Choi, Byeong U. Park
Optimal Sequencing Depth for Single-Cell RNA-Sequencing in Wasserstein Space
Jakwang Kim, Sharvaj Kubal, Geoffrey Schiebinger
A Two-Way Heterogeneity Model for Dynamic Networks
Binyan Jiang, Chenlei Leng, Ting Yan et al.
A Geometrical Analysis of Kernel Ridge Regression and its Applications
Zong Shang, Guillaume Lecué, Georgios Gavrilopoulos
Kurtosis-Based Projection Pursuit for Matrix-Valued Data
Una Radojicic, Klaus Nordhausen, Joni Virta
A Flexible Defense Against the Winner’s Curse
Tijana Zrnic, William Fithian
Rank Tests for PCA Under Weak Identifiability
Davy Paindaveine, Laura Peralvo Maroto, Thomas Verdebout
Sparse PCA: A New Scalable Estimator Based on Integer Programming
Kayhan Behdin, Rahul Mazumder
Semi-Supervised U-Statistics
Ilmun Kim, Larry Wasserman, Sivaraman Balakrishnan et al.
Scalable Inference in Functional Linear Regression with Streaming Data
Jinhan Xie, Enze Shi, Peijun Sang et al.
The Empirical Copula Process in High Dimensions: Stute’s Representation and Applications
Axel Bücher, Cambyse Pakzad
Causal Effect Estimation Under Network Interference with Mean-Field Methods
Sohom Bhattacharya, Subhabrata Sen
Clustering risk in Non-parametric Hidden Markov and I.I.D. Models
Elisabeth Gassiat, Ibrahim Kaddouri, Zacharie Naulet
Efficiently Matching Random Inhomogeneous Graphs via Degree Profiles
Jian Ding, Yumou Fei, Yuanzheng Wang
Improving Knockoffs with Conditional Calibration
Yixiang Luo, William Fithian, Lihua Lei
Spectral Density Estimation of Function-Valued Spatial Processes
Rafail Kartsioukas, Stilian Stoev, Tailen Hsing
Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning
Yihong Gu, Cong Fang, Peter Bühlmann et al.
Tests of Missing Completely at Random Based on Sample Covariance Matrices
Alberto Bordino, Thomas Benjamin Berrett
Near Optimal Sample Complexity for Matrix and Tensor Normal Models via Geodesic Convexity
Rafael Mendes de Oliveira, William Cole Franks, Akshay Ramachandran et al.
Yurinskii’s Coupling for Martingales
Matias Damian Cattaneo, Ricardo Pereira Masini, William George Underwood
Improved Learning Theory for Kernel Distribution Regression with Two-Stage Sampling
François Bachoc, Louis Béthune, Alberto González-Sanz et al.
Trimmed Sample Means for Robust Uniform Mean Estimation and Regression
Roberto Imbuzeiro Moraes Felinto de Oliveira, Lucas Resende
Pseudo-Likelihood-Based M-Estimation of Random Graphs with Dependent Edges and Parameter Vectors of Increasing Dimension
Jonathan Roy Stewart, Michael Schweinberger
Robust Transfer Learning with Unreliable Source Data
Jianqing Fan, Cheng Gao, Jason Matthew Klusowski
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Weidong Liu, Jiyuan Tu, Yichen Zhang et al.
The High-Dimensional Asymptotics of Principal Component Regression
Alden Green, Elad Romanov
Theory of Functional Principal Component Analysis for Discretely Observed Data
Hang Zhou, Dongyi Wei, Fang Yao
A Unified Analysis of Likelihood-based Estimators in the Plackett–Luce Model
Ruijian Han, Yiming Xu
Symmetry: A General Structure in Nonparametric Regression
Louis Goldwater Christie, John A. D. Aston
Advances in Bayesian Model Selection Consistency for High-Dimensional Generalized Linear Models
Jeyong Lee, Minwoo Chae, Ryan Martin
Estimation and Inference in Distributional Reinforcement Learning
Liangyu Zhang, Yang Peng, Jiadong Liang et al.
Online Statistical Inference in Decision Making with Matrix Context
Qiyu Han, Will Wei Sun, Yichen Zhang
Structured Matrix Learning under Arbitrary Entrywise Dependence and Estimation of Markov Transition Kernel
Jinhang Chai, Jianqing Fan
Optimal and Exact Recovery on the General Non-Uniform Hypergraph Stochastic Block Model
Ioana Dumitriu, Hai-Xiao Wang
High-Dimensional Statistical Inference for Linkage Disequilibrium Score Regression and Its Cross-Ancestry Extensions
Fei Xue, Bingxin Zhao
Deep Horseshoe Gaussian Processes
Ismaël Castillo, Thibault Christophe Randrianarisoa
The Functional Graphical Lasso
Kartik Govind Waghmare, Tomas Masak, Victor Michael Panaretos
Higher-Order Entrywise Eigenvectors Analysis of Low-Rank Random Matrices: Bias Correction, Edgeworth Expansion, and Bootstrap
Fangzheng Xie, Yichi Zhang
Counterfactual Inference in Sequential Experiments
Raaz Dwivedi, Katherine Tian, Sabina Tomkins et al.
Optimal Vintage Factor Analysis with Deflation Varimax
Xin Bing, Xin He, Dian Jin et al.
Low-Degree Hardness of Detection for Correlated Erdős-Rényi Graphs
Jian Ding, Hang Du, Zhangsong Li
Spectral Gap Bounds for Reversible Hybrid Gibbs Chains
Qian Qin, Nianqiao Ju, Guanyang Wang
Fixed and Random Covariance Regression Analyses
Tao Zou, Wei Lan, Runze Li et al.
Debiased Regression Adjustment in Completely Randomized Experiments with Moderately High-Dimensional Covariates
Xin Lu, Fan Yang, Yuhao Wang
Reinforcement Learning for Individual Optimal Policy From Heterogeneous Data
Rui Miao, Babak Shahbaba, Annie Qu
Policy Learning “Without” Overlap: Pessimism and Generalized Empirical Bernstein’s Inequality
Ying Jin, Zhimei Ren, Zhuoran Yang et al.
Algorithmic Stability Implies Training-Conditional Coverage for Distribution-Free Prediction Methods
Ruiting Liang, Rina Foygel Barber
Semiparametric Modeling and Analysis for Longitudinal Network Data
Yinqiu He, Jiajin Sun, Yuang Tian et al.
On the Structural Dimension of Sliced Inverse Regression
Dongming Huang, Songtao Tian, Qian Lin
Erratum: Quantile Processes and Their Applications in Finite Populations
Anurag Dey, Probal Chaudhuri
Dualizing Le Cam’s Method for Functional Estimation, with Applications to Estimating the Unseens
Yury Polyanskiy, Yihong Wu
Asymptotically-Exact Selective Inference for Quantile Regression
Yumeng Wang, Snigdha Panigrahi, Xuming He
Near-Optimal Inference in Adaptive Linear Regression
Koulik Khamaru, Yash Deshpande, Tor Lattimore et al.
A Common-Cause Principle for Eliminating Selection Bias in Causal Estimands Through Covariate Adjustment
Maya Mathur, Ilya Shpitser, Tyler VanderWeele
DRM Revisited: A Complete Error Analysis
Yuling Jiao, Ruoxuan Li, Peiying Wu et al.
It is widely known that the error analysis for deep learning involves approximation, statistical, and optimization errors. However, it is challenging ...
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen, Zhuoran Yang, Tianyi Chen
Bilevel optimization has been recently applied to many machine learning tasks. However, their applications have been restricted to the supervised lear...
Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers
Fan Yang, Hongyang R. Zhang, Sen Wu et al.
The problem of learning one task using samples from another task is central to transfer learning. In this paper, we focus on answering the following q...
Score-based Causal Representation Learning: Linear and General Transformations
Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam et al.
This paper addresses intervention-based causal representation learning (CRL) under a general nonparametric latent causal model and an unknown transfor...
On the Statistical Properties of Generative Adversarial Models for Low Intrinsic Data Dimension
Saptarshi Chakraborty, Peter L. Bartlett
Despite the remarkable empirical successes of Generative Adversarial Networks (GANs), the theoretical guarantees for their statistical accuracy remain...
Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms
Keru Wu, Yuansi Chen, Wooseok Ha et al.
Domain adaptation (DA) is a statistical learning problem that arises when the distribution of the source data used to train a model differs from that ...
Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles
Lesi Chen, Yaohua Ma, Jingzhao Zhang
In this work, we consider bilevel optimization when the lower-level problem is strongly convex. Recent works show that with a Hessian-vector product (...
Adaptive Distributed Kernel Ridge Regression: A Feasible Distributed Learning Scheme for Data Silos
Shao-Bo Lin, Xiaotong Liu, Di Wang et al.
Data silos, mainly caused by privacy and interoperability, significantly constrain collaborations among different organizations with similar data for ...
On Global and Local Convergence of Iterative Linear Quadratic Optimization Algorithms for Discrete Time Nonlinear Control
Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel et al.
A classical approach for solving discrete time nonlinear control on a finite horizon consists in repeatedly minimizing linear quadratic approximations...
A Decentralized Proximal Gradient Tracking Algorithm for Composite Optimization on Riemannian Manifolds
Lei Wang, Le Bao, Xin Liu
This paper focuses on minimizing a smooth function combined with a nonsmooth regularization term on a compact Riemannian submanifold embedded in the E...
Learning conditional distributions on continuous spaces
Cyril Benezet, Ziteng Cheng, Sebastian Jaimungal
We investigate sample-based learning of conditional distributions on multi-dimensional unit boxes, allowing for different dimensions of the feature an...
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Lukas Zierahn, Dirk van der Hoeven, Tal Lancewicki et al.
We derive a new analysis of Follow The Regularized Leader (FTRL) for online learning with delayed bandit feedback. By separating the cost of delayed f...
Error bounds for particle gradient descent, and extensions of the log-Sobolev and Talagrand inequalities
Rocco Caprio, Juan Kuntz, Samuel Power et al.
We derive non-asymptotic error bounds for particle gradient descent (PGD, Kuntz et al. (2023)), a recently introduced algorithm for maximum likelihoo...
Linear Hypothesis Testing in High-Dimensional Expected Shortfall Regression with Heavy-Tailed Errors
Gaoyu Wu, Jelena Bradic, Kean Ming Tan et al.
Expected shortfall (ES) is widely used for characterizing the tail of a distribution across various fields, particularly in financial risk management....
Efficient Numerical Integration in Reproducing Kernel Hilbert Spaces via Leverage Scores Sampling
Antoine Chatalic, Nicolas Schreuder, Ernesto De Vito et al.
In this work we consider the problem of numerical integration, i.e., approximating integrals with respect to a target probability measure using only p...
Distribution Free Tests for Model Selection Based on Maximum Mean Discrepancy with Estimated Parameters
Florian Brück, Jean-David Fermanian, Aleksey Min
There exist several testing procedures based on the maximum mean discrepancy (MMD) to address the challenge of model specification. However, these tes...
Statistical field theory for Markov decision processes under uncertainty
George Stamatescu
A statistical field theory is introduced for finite state and action Markov decision processes with unknown parameters, in a Bayesian setting. The Bel...
Bayesian Data Sketching for Varying Coefficient Regression Models
Rajarshi Guhaniyogi, Laura Baracaldo, Sudipto Banerjee
Varying coefficient models are popular for estimating nonlinear regression functions in functional data models. Their Bayesian variants have received ...
Bagged k-Distance for Mode-Based Clustering Using the Probability of Localized Level Sets
Hanyuan Hang
In this paper, we propose an ensemble learning algorithm named bagged $k$-distance for mode-based clustering (BDMBC) by putting forward a new measure ...
Linear cost and exponentially convergent approximation of Gaussian Matérn processes on intervals
David Bolin, Vaibhav Mehandiratta, Alexandre B. Simas
The computational cost for inference and prediction of statistical models based on Gaussian processes with Matérn covariance functions scales cubicall...
Invariant Subspace Decomposition
Margherita Lazzaretto, Jonas Peters, Niklas Pfister
We consider the task of predicting a response $Y$ from a set of covariates $X$ in settings where the conditional distribution of $Y$ given $X$ changes...
Posterior Concentrations of Fully-Connected Bayesian Neural Networks with General Priors on the Weights
Insung Kong, Yongdai Kim
Bayesian approaches for training deep neural networks (BNNs) have received significant interest and have been effectively utilized in a wide range of ...
Outlier Robust and Sparse Estimation of Linear Regression Coefficients
Takeyuki Sasai, Hironori Fujisawa
We consider outlier-robust and sparse estimation of linear regression coefficients, when the covariates and the noises are contaminated by adversarial...
Affine Rank Minimization via Asymptotic Log-Det Iteratively Reweighted Least Squares
Sebastian Krämer
The affine rank minimization problem is a well-known approach to matrix recovery. While there are various surrogates to this NP-hard problem, we prove...
Causal Effect of Functional Treatment
Ruoxu Tan, Wei Huang, Zheng Zhang et al.
We study the causal effect with a functional treatment variable, where practical applications often arise in neuroscience, biomedical sciences, etc. P...
Uplift Model Evaluation with Ordinal Dominance Graphs
Brecht Verbeken, Marie-Anne Guerry, Wouter Verbeke et al.
Uplift modelling is a subfield of causal learning that focuses on ranking entities by individual treatment effects. Uplift models are typically evalua...
High-Dimensional L2-Boosting: Rate of Convergence
Ye Luo, Martin Spindler, Jannis Kueck
Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of L2-Boosting in a high-dimensio...
Feature Learning in Finite-Width Bayesian Deep Linear Networks with Multiple Outputs and Convolutional Layers
Federico Bassetti, Marco Gherardi, Alessandro Ingrosso et al.
Deep linear networks have been extensively studied, as they provide simplified models of deep learning. However, little is known in the case of finite...
How good is your Laplace approximation of the Bayesian posterior? Finite-sample computable error bounds for a variety of useful divergences
Mikołaj J. Kasprzak, Ryan Giordano, Tamara Broderick
The Laplace approximation is a popular method for constructing a Gaussian approximation to the Bayesian posterior and thereby approximating the poster...
Integral Probability Metrics Meet Neural Networks: The Radon-Kolmogorov-Smirnov Test
Seunghoon Paik, Michael Celentano, Alden Green et al.
Integral probability metrics (IPMs) constitute a general class of nonparametric two-sample tests that are based on maximizing the mean difference betw...
On Inference for the Support Vector Machine
Jakub Rybak, Heather Battey, Wen-Xin Zhou
The linear support vector machine has a parametrised decision boundary. The paper considers inference for the corresponding parameters, which indicate...
Random Pruning Over-parameterized Neural Networks Can Improve Generalization: A Training Dynamics Analysis
Hongru Yang, Yingbin Liang, Xiaojie Guo et al.
It has been observed that applying pruning-at-initialization methods and training the sparse networks can sometimes yield slightly better test perform...
Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability
Atticus Geiger, Duligur Ibeling, Amir Zur et al.
Causal abstraction provides a theoretical foundation for mechanistic interpretability, the field concerned with providing intelligible algorithms that...
Implicit vs Unfolded Graph Neural Networks
Yongyi Yang, Tang Liu, Yangkun Wang et al.
It has been observed that message-passing graph neural networks (GNN) sometimes struggle to maintain a healthy balance between the efficient / scalabl...
Towards Optimal Branching of Linear and Semidefinite Relaxations for Neural Network Robustness Certification
Brendon G. Anderson, Ziye Ma, Jingqi Li et al.
In this paper, we study certifying the robustness of ReLU neural networks against adversarial input perturbations. To diminish the relaxation error su...
GraphNeuralNetworks.jl: Deep Learning on Graphs with Julia
Carlo Lucibello, Aurora Rossi
GraphNeuralNetworks.jl is an open-source framework for deep learning on graphs, written in the Julia programming language. It supports multiple GPU ba...
Dynamic angular synchronization under smoothness constraints
Ernesto Araya, Mihai Cucuringu, Hemant Tyagi
Given an undirected measurement graph $\mathcal{H} = ([n], \mathcal{E})$, the classical angular synchronization problem consists of recovering unkno...
Derivative-Informed Neural Operator Acceleration of Geometric MCMC for Infinite-Dimensional Bayesian Inverse Problems
Lianghao Cao, Thomas O'Leary-Roseberry, Omar Ghattas
We propose an operator learning approach to accelerate geometric Markov chain Monte Carlo (MCMC) for solving infinite-dimensional Bayesian inverse pro...
Wasserstein F-tests for Fréchet regression on Bures-Wasserstein manifolds
Haoshu Xu, Hongzhe Li
This paper addresses regression analysis for covariance matrix-valued outcomes with Euclidean covariates, motivated by applications in single-cell gen...
Distributed Stochastic Bilevel Optimization: Improved Complexity and Heterogeneity Analysis
Youcheng Niu, Jinming Xu, Ying Sun et al.
This paper considers solving a class of nonconvex-strongly-convex distributed stochastic bilevel optimization (DSBO) problems with personalized inner-...
Learning causal graphs via nonlinear sufficient dimension reduction
Eftychia Solea, Bing Li, Kyongwon Kim
We introduce a new nonparametric methodology for estimating a directed acyclic graph (DAG) from observational data. Our method is nonparametric in nat...
On Consistent Bayesian Inference from Synthetic Data
Ossi Räisä, Joonas Jälkö, Antti Honkela
Generating synthetic data, with or without differential privacy, has attracted significant attention as a potential solution to the dilemma between ma...
Optimization Over a Probability Simplex
James Chok, Geoffrey M. Vasil
We propose a new iteration scheme, the Cauchy-Simplex, to optimize convex problems over the probability simplex $\{w\in\mathbb{R}^n\ |\ \sum_i w_i=1\ ...
Laplace Meets Moreau: Smooth Approximation to Infimal Convolutions Using Laplace's Method
Ryan J. Tibshirani, Samy Wu Fung, Howard Heaton et al.
We study approximations to the Moreau envelope---and infimal convolutions more broadly---based on Laplace's method, a classical tool in analysis which...
Sampling and Estimation on Manifolds using the Langevin Diffusion
Karthik Bharath, Alexander Lewis, Akash Sharma et al.
Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion with invariant measure $\te...
Sharp Bounds for Sequential Federated Learning on Heterogeneous Data
Yipeng Li, Xinchen Lyu
There are two paradigms in Federated Learning (FL): parallel FL (PFL), where models are trained in a parallel manner across clients, and sequential FL...
Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization
Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang et al.
Determining whether deep neural network (DNN) models can reliably recover target functions at overparameterization is a critical yet complex issue in ...
Stabilizing Sharpness-Aware Minimization Through A Simple Renormalization Strategy
Chengli Tan, Jiangshe Zhang, Junmin Liu et al.
Recently, sharpness-aware minimization (SAM) has attracted much attention because of its surprising effectiveness in improving generalization performa...
Fine-Grained Change Point Detection for Topic Modeling with Pitman-Yor Process
Feifei Wang, Zimeng Zhao, Ruimin Ye et al.
Identifying change points in dynamic text data is crucial for understanding the evolving nature of topics across various sources, such as news article...
Deletion Robust Non-Monotone Submodular Maximization over Matroids
Paul Dütting, Federico Fusco, Silvio Lattanzi et al.
We study the deletion robust version of submodular maximization under matroid constraints. The goal is to extract a small-size summary of the data set...
Instability, Computational Efficiency and Statistical Accuracy
Nhat Ho, Koulik Khamaru, Raaz Dwivedi et al.
Many statistical estimators are defined as the fixed point of a data-dependent operator, with estimators based on minimizing a cost function being an ...
Estimation of Local Geometric Structure on Manifolds from Noisy Data
Yariv Aizenbud, Barak Sober
A common observation in data-driven applications is that high-dimensional data have a low intrinsic dimension, at least locally. In this work, we cons...
Ontolearn---A Framework for Large-scale OWL Class Expression Learning in Python
Caglar Demir, Alkid Baci, N'Dah Jean Kouagou et al.
In this paper, we present Ontolearn---a framework for learning OWL class expressions over large knowledge graphs. Ontolearn contains efficient implem...
Continuously evolving rewards in an open-ended environment
Richard M. Bailey
Unambiguous identification of the rewards driving behaviours of entities operating in complex open-ended real-world environments is difficult, in part...
Recursive Causal Discovery
Ehsan Mokhtarian, Sepehr Elahi, Sina Akbari et al.
Causal discovery from observational data, i.e., learning the causal graph from a finite set of samples from the joint distribution of the variables, i...
Evaluation of Active Feature Acquisition Methods for Time-varying Feature Settings
Henrik von Kleist, Alireza Zamanian, Ilya Shpitser et al.
Machine learning methods often assume that input features are available at no cost. However, in domains like healthcare, where acquiring features coul...
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni, Nicklas Werge
Stochastic optimization methods face new challenges in the realm of streaming data, characterized by a continuous flow of large, high-dimensional data...
Determine the Number of States in Hidden Markov Models via Marginal Likelihood
Yang Chen, Cheng-Der Fuh, Chu-Lan Michael Kao
Hidden Markov models (HMM) have been widely used by scientists to model stochastic systems: the underlying process is a discrete Markov chain, and the...
Variance-Aware Estimation of Kernel Mean Embedding
Geoffrey Wolfer, Pierre Alquier
An important feature of kernel mean embeddings (KME) is that the rate of convergence of the empirical KME to the true distribution KME can be bounded ...
Scaling ResNets in the Large-depth Regime
Pierre Marion, Adeline Fermanian, Gérard Biau et al.
Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these arc...
A Comparative Evaluation of Quantification Methods
Tobias Schumacher, Markus Strohmaier, Florian Lemmerich
Quantification represents the problem of estimating the distribution of class labels on unseen data. It also represents a growing research field in su...
Lightning UQ Box: Uncertainty Quantification for Neural Networks
Nils Lehmann, Nina Maria Gottschling, Jakob Gawlikowski et al.
Although neural networks have shown impressive results in a multitude of application domains, the "black box" nature of deep learning and lack of conf...
Scaling Data-Constrained Language Models
Niklas Muennighoff, Alexander M. Rush, Boaz Barak et al.
The current trend of scaling language models involves increasing both parameter count and training data set size. Extrapolating this trend suggests th...
Curvature-based Clustering on Graphs
Yu Tian, Zachary Lubberts, Melanie Weber
Unsupervised node clustering (or community detection) is a classical graph learning task. In this paper, we study algorithms that exploit the geometry...
Composite Goodness-of-fit Tests with Kernels
Oscar Key, Arthur Gretton, François-Xavier Briol et al.
We propose kernel-based hypothesis tests for the challenging composite testing problem, where we are interested in whether the data comes from any dis...
PFLlib: A Beginner-Friendly and Comprehensive Personalized Federated Learning Library and Benchmark
Jianqing Zhang, Yang Liu, Yang Hua et al.
Amid the ongoing advancements in Federated Learning (FL), a machine learning paradigm that allows collaborative learning with data privacy protection,...
The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning
Nikhil Ghosh, Spencer Frei, Wooseok Ha et al.
In this work, we investigate the dynamics of stochastic gradient descent (SGD) when training a single-neuron autoencoder with linear or ReLU activatio...
Efficient and Robust Transfer Learning of Optimal Individualized Treatment Regimes with Right-Censored Survival Data
Pan Zhao, Julie Josse, Shu Yang
An individualized treatment regime (ITR) is a decision rule that assigns treatments based on patients' characteristics. The value function of an ITR i...
DAGs as Minimal I-maps for the Induced Models of Causal Bayesian Networks under Conditioning
Xiangdong Xie, Jiahua Guo, Yi Sun
Bayesian networks (BNs) are a powerful tool for knowledge representation and reasoning, especially for complex systems. A critical task in the applic...
Adjusted Expected Improvement for Cumulative Regret Minimization in Noisy Bayesian Optimization
Shouri Hu, Haowei Wang, Zhongxiang Dai et al.
The expected improvement (EI) is one of the most popular acquisition functions for Bayesian optimization (BO) and has demonstrated good empirical perf...
Manifold Fitting under Unbounded Noise
Zhigang Yao, Yuqing Xia
In the field of non-Euclidean statistical analysis, a trend has emerged in recent times, of attempts to recover a low dimensional structure, namely a ...
Learning Global Nash Equilibrium in Team Competitive Games with Generalized Fictitious Cross-Play
Zelai Xu, Chao Yu, Yancheng Liang et al.
Self-play (SP) is a popular multi-agent reinforcement learning framework for competitive games. Despite the empirical success, the theoretical propert...
Wasserstein Convergence Guarantees for a General Class of Score-Based Generative Models
Xuefeng Gao, Hoang M. Nguyen, Lingjiong Zhu
Score-based generative models are a recent class of deep generative models with state-of-the-art performance in many applications. In this paper, we e...
Extremal graphical modeling with latent variables via convex optimization
Sebastian Engelke, Armeen Taeb
Extremal graphical models encode the conditional independence structure of multivariate extremes and provide a powerful tool for quantifying the risk ...
On the Approximation of Kernel functions
Paul Dommel, Alois Pichler
Various methods in statistical learning build on kernels considered in reproducing kernel Hilbert spaces. In applications, the kernel is often selecte...
Efficient and Robust Semi-supervised Estimation of Average Treatment Effect with Partially Annotated Treatment and Response
Jue Hou, Rajarshi Mukherjee, Tianxi Cai
A notable challenge of leveraging Electronic Health Records (EHR) for treatment effect assessment is the lack of precise information on important clin...
Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning
Kuangyu Ding, Jingyang Li, Kim-Chuan Toh
Stochastic gradient methods for minimizing nonconvex composite objective functions typically rely on the Lipschitz smoothness of the differentiable pa...
Optimizing Data Collection for Machine Learning
Rafid Mahmood, James Lucas, Jose M. Alvarez et al.
Modern deep learning systems require huge data sets to achieve impressive performance, but there is little guidance on how much or what kind of data t...
Unbalanced Kantorovich-Rubinstein distance, plan, and barycenter on finite spaces: A statistical perspective
Shayan Hundrieser, Florian Heinemann, Marcel Klatt et al.
We analyze statistical properties of plug-in estimators for unbalanced optimal transport quantities between finitely supported measures in different p...
Copula-based Sensitivity Analysis for Multi-Treatment Causal Inference with Unobserved Confounding
Jiajing Zheng, Alexander D'Amour, Alexander Franks
Recent work has focused on the potential and pitfalls of causal identification in observational studies with multiple simultaneous treatments. Buildin...
Rank-one Convexification for Sparse Regression
Alper Atamturk, Andres Gomez
Sparse regression models are increasingly prevalent due to their ease of interpretability and superior out-of-sample performance. However, the exact m...
gsplat: An Open-Source Library for Gaussian Splatting
Vickie Ye, Ruilong Li, Justin Kerr et al.
gsplat is an open-source library designed for training and developing Gaussian Splatting methods. It features a front-end with Python bindings compati...
Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming
Sen Na, Michael Mahoney
We consider online statistical inference of constrained stochastic nonlinear optimization problems. We apply the Stochastic Sequential Quadratic Progr...
Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds
Clément Bonet, Lucas Drumetz, Nicolas Courty
While many Machine Learning methods have been developed or transposed on Riemannian manifolds to tackle data with known non-Euclidean geometry, Optima...
Accelerating optimization over the space of probability measures
Shi Chen, Qin Li, Oliver Tse et al.
The acceleration of gradient-based optimization methods is a subject of significant practical and theoretical importance, particularly within machine ...
Bayesian Multi-Group Gaussian Process Models for Heterogeneous Group-Structured Data
Didong Li, Andrew Jones, Sudipto Banerjee et al.
Gaussian processes are pervasive in functional data analysis, machine learning, and spatial statistics for modeling complex dependencies. Scientific d...
Orthogonal Bases for Equivariant Graph Learning with Provable k-WL Expressive Power
Jia He, Maggie Cheng
Graph neural network (GNN) models have been widely used for learning graph-structured data. Due to the permutation-invariant requirement of graph lear...
Optimal Experiment Design for Causal Effect Identification
Sina Akbari, Jalal Etesami, Negar Kiyavash
Pearl’s do calculus is a complete axiomatic approach to learn the identifiable causal effects from observational data. When such an effect is not iden...
Mean Aggregator is More Robust than Robust Aggregators under Label Poisoning Attacks on Distributed Heterogeneous Data
Jie Peng, Weiyu Li, Stefan Vlaski et al.
Robustness to malicious attacks is of paramount importance for distributed learning. Existing works usually consider the classical Byzantine attacks m...
The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
Jiin Woo, Gauri Joshi, Yuejie Chi
In this paper, we consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on lo...
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers
Kaichao You, Runsheng Bai, Meng Cao et al.
PyTorch 2.x introduces a compiler designed to accelerate deep learning programs. However, for machine learning researchers, fully leveraging the PyTor...
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang
Stochastic approximation is a class of algorithms that update a vector iteratively, incrementally, and stochastically, including, e.g., stochastic gra...
Improving Graph Neural Networks on Multi-node Tasks with the Labeling Trick
Xiyuan Wang, Pan Li, Muhan Zhang
In this paper, we study using graph neural networks (GNNs) for multi-node representation learning, where a representation for a set of more than one n...
Directed Cyclic Graphs for Simultaneous Discovery of Time-Lagged and Instantaneous Causality from Longitudinal Data Using Instrumental Variables
Wei Jin, Yang Ni, Amanda B. Spence et al.
We consider the problem of causal discovery from longitudinal observational data. We develop a novel framework that simultaneously discovers the time-...
Bayesian Sparse Gaussian Mixture Model for Clustering in High Dimensions
Dapeng Yao, Fangzheng Xie, Yanxun Xu
We study the sparse high-dimensional Gaussian mixture model when the number of clusters is allowed to grow with the sample size. A minimax lower bound...
Regularizing Hard Examples Improves Adversarial Robustness
Hyungyu Lee, Saehyung Lee, Ho Bae et al.
Recent studies have validated that pruning hard-to-learn examples from training improves the generalization performance of neural networks (NNs). In t...
Random ReLU Neural Networks as Non-Gaussian Processes
Rahul Parhi, Pakshal Bohra, Ayoub El Biari et al.
We consider a large class of shallow neural networks with randomly initialized parameters and rectified linear unit activation functions. We prove tha...
Riemannian Bilevel Optimization
Jiaxiang Li, Shiqian Ma
In this work, we consider the bilevel optimization problem on Riemannian manifolds. We inspect the calculation of the hypergradient of such problems o...
Supervised Learning with Evolving Tasks and Performance Guarantees
Verónica Álvarez, Santiago Mazuelas, Jose A. Lozano
Multiple supervised learning scenarios are composed by a sequence of classification tasks. For instance, multi-task learning and continual learning ai...
Error estimation and adaptive tuning for unregularized robust M-estimator
Pierre C. Bellec, Takuya Koriyama
We consider unregularized robust M-estimators for linear models under Gaussian design and heavy-tailed noise, in the proportional asymptotics regime w...
From Sparse to Dense Functional Data in High Dimensions: Revisiting Phase Transitions from a Non-Asymptotic Perspective
Shaojun Guo, Dong Li, Xinghao Qiao et al.
Nonparametric estimation of the mean and covariance functions is ubiquitous in functional data analysis and local linear smoothing techniques are most...
Locally Private Causal Inference for Randomized Experiments
Yuki Ohnishi, Jordan Awan
Local differential privacy is a differential privacy paradigm in which individuals first apply a privacy mechanism to their data (often by adding nois...
Estimating Network-Mediated Causal Effects via Principal Components Network Regression
Alex Hayes, Mark M. Fredrickson, Keith Levin
We develop a method to decompose causal effects on a social network into an indirect effect mediated by the network, and a direct effect independent o...
Selective Inference with Distributed Data
Sifan Liu, Snigdha Panigrahi
When data are distributed across multiple sites or machines rather than centralized in one location, researchers face the challenge of extracting mean...
Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization
Tianyi Lin, Chi Jin, Michael I. Jordan
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems in the f...
An Axiomatic Definition of Hierarchical Clustering
Ery Arias-Castro, Elizabeth Coda
In this paper, we take an axiomatic approach to defining a population hierarchical clustering for piecewise constant densities, and in a similar manne...
Test-Time Training on Video Streams
Renhao Wang, Yu Sun, Arnuv Tandon et al.
Prior work has established Test-Time Training (TTT) as a general framework to further improve a trained model at test time. Before making a prediction...
Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback
Boxin Zhao, Lingxiao Wang, Ziqi Liu et al.
Due to the high cost of communication, federated learning (FL) systems need to sample a subset of clients that are involved in each round of training....
A Random Matrix Approach to Low-Multilinear-Rank Tensor Approximation
Hugo Lebeau, Florent Chatelain, Romain Couillet
This work presents a comprehensive understanding of the estimation of a planted low-rank signal from a general spiked tensor model near the computatio...
Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents
Marco Pleines, Matthias Pallasch, Frank Zimmer et al.
Memory Gym presents a suite of 2D partially observable environments, namely Mortar Mayhem, Mystery Path, and Searing Spotlights, designed to benchmark...
Enhancing Graph Representation Learning with Localized Topological Features
Zuoyu Yan, Qi Zhao, Ze Ye et al.
Representation learning on graphs is a fundamental problem that can be crucial in various tasks. Graph neural networks, the dominant approach for grap...
Deep Out-of-Distribution Uncertainty Quantification via Weight Entropy Maximization
Antoine de Mathelin, François Deheeger, Mathilde Mougeot et al.
This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods. It proposes a...
DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data
Jiayi Tong, Jie Hu, George Hripcsak et al.
High-dimensional healthcare data, such as electronic health records (EHR) data and claims data, present two primary challenges due to the large number...
Bayes Meets Bernstein at the Meta Level: an Analysis of Fast Rates in Meta-Learning with PAC-Bayes
Charles Riou, Pierre Alquier, Badr-Eddine Chérief-Abdellatif
Bernstein's condition is a key assumption that guarantees fast rates in machine learning. For example, under this condition, the Gibbs posterior with ...
Efficiently Escaping Saddle Points in Bilevel Optimization
Minhui Huang, Xuxing Chen, Kaiyi Ji et al.
Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization ...
Leveraging External Data for Testing Experimental Therapies with Biomarker Interactions in Randomized Clinical Trials
B Ren, others
Simulating diffusion bridges with score matching
J Heng, others
Correction to: Parameterizing and simulating from causal models
A family of toroidal diffusions with exact likelihood inference
E García-Portugués, M Sørensen
Identifying and bounding the probability of necessity for causes of effects with ordinal outcomes
Chao Zhang, others
Optimal clustering by Lloyd’s algorithm for low-rank mixture model
Zhongyuan Lyu, Dong Xia
A unified generalization of the inverse regression methods via column selection
Yin Jin, Wei Luo
Identification and multiply robust estimation in causal mediation analysis across principal strata
Chao Cheng, Fan Li
Pseudo-likelihood Estimators for Graphical Models: Existence and Uniqueness
B Roycraft, B Rajaratnam
Goodness-of-fit tests for linear non-Gaussian structural equation models
D Schkoda, M Drton
Ordinary differential equation models for a collection of discretized functions
Lingxuan Shao, Fang Yao
Semiparametric localized principal stratification analysis with continuous strata
Yichi Zhang, Shu Yang
Least squares for cardinal paired comparisons data
Rahul Singh, others
A Semiparametric Instrumented Difference-in-Differences Approach to Policy Learning
Pan Zhao, Yifan Cui
Orthogonalized moment aberration for mixed-level multi-stratum factorial designs with partially-relaxed orthogonal block structures
Ming-Chung Chang
Regularized halfspace depth for functional data
Hyemin Yeon, others
Multicalibration for Modeling Censored Survival Data with Universal Adaptability
Hanxuan Ye, Hongzhe Li
Goodness-of-fit tests for high-dimensional Gaussian graphical models via exchangeable sampling
Xiaotong Lin, others
Structural restrictions in local causal discovery: identifying direct causes of a target variable
J Bodik, V Chavez-Demoulin
Nonsense associations in Markov random fields with pairwise dependence
Sohom Bhattacharya, others
Robust functional principal component analysis for non-Euclidean random objects
Jiazhen Xu, others
Aggregating Dependent Signals with Heavy-Tailed Combination Tests
Lin Gui, others
Detection and inference of changes in high-dimensional linear regression with nonsparse structures
Haeran Cho, others
Isotonic mechanism for exponential family estimation in machine learning peer review
Yuling Yan, others
Covariate-assisted bounds on causal effects with instrumental variables
Alexander W Levis, others
Improving the false coverage rate adjusted confidence intervals
Tzviel Frostig, Yoav Benjamini
Consistent and Scalable Composite Likelihood Estimation of Probit Models with Crossed Random Effects
R Bellio, others
Powerful Partial Conjunction Hypothesis Testing via Conditioning
B Liang, others
An optimal design framework for lasso sign recovery
Jonathan W Stallrich, others
Bayesian mixture models with repulsive and attractive atoms
Mario Beraha, others
A statistical view of column subset selection
Anav Sood, Trevor Hastie
Predictive performance of power posteriors
Y McLatchie, others
Unbiased and consistent nested sampling via sequential Monte Carlo
Robert Salomone, others
SymmPI: predictive inference for data with group symmetries
Edgar Dobriban, Mengxin Yu
Product centred Dirichlet processes for Bayesian multiview clustering
Alexander Dombowsky, David B Dunson
Bias correction of quadratic spectral estimators
Lachlan C Astfalck, others
Integer Programming for Learning Directed Acyclic Graphs from Non-identifiable Gaussian Models
Tong Xu, others
Augmented balancing weights as linear regression
David Bruns-Smith, others
Graphical methods for Order-of-Addition experiments
Nicholas Rios, Dennis K J Lin
Convexity and measures of statistical association
Emanuele Borgonovo, others
Confidence on the focal: conformal prediction with selection-conditional coverage
Ying Jin, Zhimei Ren
Towards a turnkey approach for unbiased Monte Carlo estimation of smooth functions of expectations
Nicolas Chopin, others
A General Form of Covariate Adjustment in Clinical Trials under Covariate-Adaptive Randomization
Marlena S Bannick, others
Dynamic Factor Analysis of High-Dimensional Recurrent Events
F Chen, others
Multi-resolution subsampling for linear classification with massive data
Haolin Chen, others
Improving efficiency in transporting average treatment effects
K E Rudolph, others
A general condition for bias attenuation by a nondifferentially mismeasured confounder
Jeffrey Zhang, Junu Lee
Sequential Monte Carlo testing by betting
Lasse Fischer, Aaditya Ramdas
A general framework for cutting feedback within modularized Bayesian inference
Yang Liu, Robert J B Goudie
Correction to: Consistent and fast inference in compartmental models of epidemics using Poisson Approximate Likelihoods
Strong oracle guarantees for partial penalized tests of high-dimensional generalized linear models
Tate Jacobson
A spike-and-slab prior for dimension selection in generalized linear network eigenmodels
Joshua D Loyal, Yuguo Chen
Selecting informative conformal prediction sets with false coverage rate control
Ulysse Gazin, others
Transfer learning for piecewise-constant mean estimation: Optimality, $\ell_1$- and $\ell_0$-penalization
F Wang, Y Yu
Conformal prediction with conditional guarantees
Isaac Gibbs, others
Bayesian penalized empirical likelihood and Markov Chain Monte Carlo sampling
Jinyuan Chang, others
A conditioning tactic that increases design sensitivity in observational block designs
Paul R Rosenbaum
Adaptive experiments toward learning treatment effect heterogeneity
Waverly Wei, others
Semiparametric posterior corrections
Andrew Yiu, others
High-dimensional Factor Analysis for Network-linked data
Jinming Li, others
Two-phase rejective sampling and its asymptotic properties
Shu Yang, Peng Ding
Randomized empirical likelihood test for ultra-high dimensional means under general covariances
Yuexin Chen, others
Analytic natural gradient updates for Cholesky factor in Gaussian variational approximation
Linda S L Tan
‘On the behaviour of marginal and conditional AIC in linear mixed models’
Sonja Greven, Thomas Kneib