Papers
Found 115 papers
Sorted by: Newest First
DRM Revisited: A Complete Error Analysis
Yuling Jiao, Ruoxuan Li, Peiying Wu et al.
It is widely known that the error analysis for deep learning involves approximation, statistical, and optimization errors. However, it is challenging ...
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen, Zhuoran Yang, Tianyi Chen
Bilevel optimization has been recently applied to many machine learning tasks. However, their applications have been restricted to the supervised lear...
Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers
Fan Yang, Hongyang R. Zhang, Sen Wu et al.
The problem of learning one task using samples from another task is central to transfer learning. In this paper, we focus on answering the following q...
Score-based Causal Representation Learning: Linear and General Transformations
Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam et al.
This paper addresses intervention-based causal representation learning (CRL) under a general nonparametric latent causal model and an unknown transfor...
On the Statistical Properties of Generative Adversarial Models for Low Intrinsic Data Dimension
Saptarshi Chakraborty, Peter L. Bartlett
Despite the remarkable empirical successes of Generative Adversarial Networks (GANs), the theoretical guarantees for their statistical accuracy remain...
Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms
Keru Wu, Yuansi Chen, Wooseok Ha et al.
Domain adaptation (DA) is a statistical learning problem that arises when the distribution of the source data used to train a model differs from that ...
Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles
Lesi Chen, Yaohua Ma, Jingzhao Zhang
In this work, we consider bilevel optimization when the lower-level problem is strongly convex. Recent works show that with a Hessian-vector product (...
Adaptive Distributed Kernel Ridge Regression: A Feasible Distributed Learning Scheme for Data Silos
Shao-Bo Lin, Xiaotong Liu, Di Wang et al.
Data silos, mainly caused by privacy and interoperability, significantly constrain collaborations among different organizations with similar data for ...
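For reference (generic notation, not taken from the abstract above): single-machine kernel ridge regression with kernel $k$, data $(x_i, y_i)_{i=1}^n$, Gram matrix $K_{ij}=k(x_i,x_j)$, and regularization $\lambda>0$ is the standard estimator
\[ \hat f = \arg\min_{f\in\mathcal{H}_k}\ \frac{1}{n}\sum_{i=1}^n \big(y_i - f(x_i)\big)^2 + \lambda \|f\|_{\mathcal{H}_k}^2, \qquad \hat f(x) = \sum_{i=1}^n \alpha_i\, k(x, x_i), \quad \alpha = (K + \lambda n I)^{-1} y. \]
Distributed variants in this line of work typically fit such estimators on local data held by each silo and aggregate the resulting predictors; the paper's specific adaptive scheme is not reproduced here.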
On Global and Local Convergence of Iterative Linear Quadratic Optimization Algorithms for Discrete Time Nonlinear Control
Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel et al.
A classical approach for solving discrete time nonlinear control on a finite horizon consists in repeatedly minimizing linear quadratic approximations...
A Decentralized Proximal Gradient Tracking Algorithm for Composite Optimization on Riemannian Manifolds
Lei Wang, Le Bao, Xin Liu
This paper focuses on minimizing a smooth function combined with a nonsmooth regularization term on a compact Riemannian submanifold embedded in the E...
Learning conditional distributions on continuous spaces
Cyril Benezet, Ziteng Cheng, Sebastian Jaimungal
We investigate sample-based learning of conditional distributions on multi-dimensional unit boxes, allowing for different dimensions of the feature an...
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Lukas Zierahn, Dirk van der Hoeven, Tal Lancewicki et al.
We derive a new analysis of Follow The Regularized Leader (FTRL) for online learning with delayed bandit feedback. By separating the cost of delayed f...
Error bounds for particle gradient descent, and extensions of the log-Sobolev and Talagrand inequalities
Rocco Caprio, Juan Kuntz, Samuel Power et al.
We derive non-asymptotic error bounds for particle gradient descent (PGD, Kuntz et al. (2023)), a recently introduced algorithm for maximum likelihoo...
Linear Hypothesis Testing in High-Dimensional Expected Shortfall Regression with Heavy-Tailed Errors
Gaoyu Wu, Jelena Bradic, Kean Ming Tan et al.
Expected shortfall (ES) is widely used for characterizing the tail of a distribution across various fields, particularly in financial risk management....
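As a reminder of the standard object (under one common lower-tail convention; the paper's exact setup may differ), the expected shortfall of $Y$ at level $\alpha\in(0,1)$ with quantile function $Q_Y$ is
\[ \mathrm{ES}_\alpha(Y) = \frac{1}{\alpha}\int_0^\alpha Q_Y(u)\,\mathrm{d}u = \mathbb{E}\big[\,Y \mid Y \le Q_Y(\alpha)\,\big] \ \text{(for continuous } Y\text{)}. \]
ES regression models this tail functional of the conditional distribution of $Y$ given covariates; the high-dimensional testing procedure itself appears in the full abstract.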
Efficient Numerical Integration in Reproducing Kernel Hilbert Spaces via Leverage Scores Sampling
Antoine Chatalic, Nicolas Schreuder, Ernesto De Vito et al.
In this work we consider the problem of numerical integration, i.e., approximating integrals with respect to a target probability measure using only p...
Distribution Free Tests for Model Selection Based on Maximum Mean Discrepancy with Estimated Parameters
Florian Brück, Jean-David Fermanian, Aleksey Min
There exist several testing procedures based on the maximum mean discrepancy (MMD) to address the challenge of model specification. However, these tes...
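For context, the squared population MMD between distributions $P$ and $Q$ under a kernel $k$ (generic notation, not from the abstract) is
\[ \mathrm{MMD}^2(P, Q) = \mathbb{E}\,k(X, X') - 2\,\mathbb{E}\,k(X, Y) + \mathbb{E}\,k(Y, Y'), \qquad X, X' \sim P,\ \ Y, Y' \sim Q \ \text{(independent)}. \]
Model-specification tests compare an empirical estimate of this quantity, with the candidate model's parameters estimated from the same data, against a critical value; the paper's distribution-free construction is not reproduced here.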
Statistical field theory for Markov decision processes under uncertainty
George Stamatescu
A statistical field theory is introduced for finite state and action Markov decision processes with unknown parameters, in a Bayesian setting. The Bel...
Bayesian Data Sketching for Varying Coefficient Regression Models
Rajarshi Guhaniyogi, Laura Baracaldo, Sudipto Banerjee
Varying coefficient models are popular for estimating nonlinear regression functions in functional data models. Their Bayesian variants have received ...
Bagged k-Distance for Mode-Based Clustering Using the Probability of Localized Level Sets
Hanyuan Hang
In this paper, we propose an ensemble learning algorithm named bagged $k$-distance for mode-based clustering (BDMBC) by putting forward a new measure ...
Linear cost and exponentially convergent approximation of Gaussian Matérn processes on intervals
David Bolin, Vaibhav Mehandiratta, Alexandre B. Simas
The computational cost for inference and prediction of statistical models based on Gaussian processes with Matérn covariance functions scales cubicall...
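For reference, the isotropic Matérn covariance with variance $\sigma^2$, smoothness $\nu>0$, and range parameter $\kappa>0$ (one of several common parameterizations) is
\[ C(h) = \sigma^2\,\frac{2^{1-\nu}}{\Gamma(\nu)}\,(\kappa \|h\|)^{\nu} K_\nu(\kappa \|h\|), \]
where $K_\nu$ is the modified Bessel function of the second kind. Exact Gaussian-process inference with a dense $n\times n$ covariance matrix costs $O(n^3)$ time, which is the cubic scaling the abstract refers to.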
Invariant Subspace Decomposition
Margherita Lazzaretto, Jonas Peters, Niklas Pfister
We consider the task of predicting a response $Y$ from a set of covariates $X$ in settings where the conditional distribution of $Y$ given $X$ changes...
Posterior Concentrations of Fully-Connected Bayesian Neural Networks with General Priors on the Weights
Insung Kong, Yongdai Kim
Bayesian approaches for training deep neural networks (BNNs) have received significant interest and have been effectively utilized in a wide range of ...
Outlier Robust and Sparse Estimation of Linear Regression Coefficients
Takeyuki Sasai, Hironori Fujisawa
We consider outlier-robust and sparse estimation of linear regression coefficients, when the covariates and the noises are contaminated by adversarial...
Affine Rank Minimization via Asymptotic Log-Det Iteratively Reweighted Least Squares
Sebastian Krämer
The affine rank minimization problem is a well-known approach to matrix recovery. While there are various surrogates to this NP-hard problem, we prove...
Causal Effect of Functional Treatment
Ruoxu Tan, Wei Huang, Zheng Zhang et al.
We study the causal effect with a functional treatment variable, where practical applications often arise in neuroscience, biomedical sciences, etc. P...
Uplift Model Evaluation with Ordinal Dominance Graphs
Brecht Verbeken, Marie-Anne Guerry, Wouter Verbeke et al.
Uplift modelling is a subfield of causal learning that focuses on ranking entities by individual treatment effects. Uplift models are typically evalua...
High-Dimensional L2-Boosting: Rate of Convergence
Ye Luo, Martin Spindler, Jannis Kueck
Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of L2-Boosting in a high-dimensio...
Feature Learning in Finite-Width Bayesian Deep Linear Networks with Multiple Outputs and Convolutional Layers
Federico Bassetti, Marco Gherardi, Alessandro Ingrosso et al.
Deep linear networks have been extensively studied, as they provide simplified models of deep learning. However, little is known in the case of finite...
How good is your Laplace approximation of the Bayesian posterior? Finite-sample computable error bounds for a variety of useful divergences
Mikołaj J. Kasprzak, Ryan Giordano, Tamara Broderick
The Laplace approximation is a popular method for constructing a Gaussian approximation to the Bayesian posterior and thereby approximating the poster...
Integral Probability Metrics Meet Neural Networks: The Radon-Kolmogorov-Smirnov Test
Seunghoon Paik, Michael Celentano, Alden Green et al.
Integral probability metrics (IPMs) constitute a general class of nonparametric two-sample tests that are based on maximizing the mean difference betw...
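As background, an integral probability metric indexed by a function class $\mathcal{F}$ (generic notation) is
\[ d_{\mathcal{F}}(P, Q) = \sup_{f \in \mathcal{F}} \Big| \mathbb{E}_{X\sim P} f(X) - \mathbb{E}_{Y\sim Q} f(Y) \Big|. \]
Taking $\mathcal{F}$ to be the unit ball of an RKHS recovers the MMD, and taking 1-Lipschitz functions gives the Wasserstein-1 distance; the specific neural-network-related function class behind the Radon-Kolmogorov-Smirnov test is described in the full abstract.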
On Inference for the Support Vector Machine
Jakub Rybak, Heather Battey, Wen-Xin Zhou
The linear support vector machine has a parametrised decision boundary. The paper considers inference for the corresponding parameters, which indicate...
Random Pruning Over-parameterized Neural Networks Can Improve Generalization: A Training Dynamics Analysis
Hongru Yang, Yingbin Liang, Xiaojie Guo et al.
It has been observed that applying pruning-at-initialization methods and training the sparse networks can sometimes yield slightly better test perform...
Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability
Atticus Geiger, Duligur Ibeling, Amir Zur et al.
Causal abstraction provides a theoretical foundation for mechanistic interpretability, the field concerned with providing intelligible algorithms that...
Implicit vs Unfolded Graph Neural Networks
Yongyi Yang, Tang Liu, Yangkun Wang et al.
It has been observed that message-passing graph neural networks (GNN) sometimes struggle to maintain a healthy balance between the efficient / scalabl...
Towards Optimal Branching of Linear and Semidefinite Relaxations for Neural Network Robustness Certification
Brendon G. Anderson, Ziye Ma, Jingqi Li et al.
In this paper, we study certifying the robustness of ReLU neural networks against adversarial input perturbations. To diminish the relaxation error su...
GraphNeuralNetworks.jl: Deep Learning on Graphs with Julia
Carlo Lucibello, Aurora Rossi
GraphNeuralNetworks.jl is an open-source framework for deep learning on graphs, written in the Julia programming language. It supports multiple GPU ba...
Dynamic angular synchronization under smoothness constraints
Ernesto Araya, Mihai Cucuringu, Hemant Tyagi
Given an undirected measurement graph $\mathcal{H} = ([n], \mathcal{E})$, the classical angular synchronization problem consists of recovering unkno...
Derivative-Informed Neural Operator Acceleration of Geometric MCMC for Infinite-Dimensional Bayesian Inverse Problems
Lianghao Cao, Thomas O'Leary-Roseberry, Omar Ghattas
We propose an operator learning approach to accelerate geometric Markov chain Monte Carlo (MCMC) for solving infinite-dimensional Bayesian inverse pro...
Wasserstein F-tests for Fréchet regression on Bures-Wasserstein manifolds
Haoshu Xu, Hongzhe Li
This paper addresses regression analysis for covariance matrix-valued outcomes with Euclidean covariates, motivated by applications in single-cell gen...
Distributed Stochastic Bilevel Optimization: Improved Complexity and Heterogeneity Analysis
Youcheng Niu, Jinming Xu, Ying Sun et al.
This paper considers solving a class of nonconvex-strongly-convex distributed stochastic bilevel optimization (DSBO) problems with personalized inner-...
Learning causal graphs via nonlinear sufficient dimension reduction
Eftychia Solea, Bing Li, Kyongwon Kim
We introduce a new nonparametric methodology for estimating a directed acyclic graph (DAG) from observational data. Our method is nonparametric in nat...
On Consistent Bayesian Inference from Synthetic Data
Ossi Räisä, Joonas Jälkö, Antti Honkela
Generating synthetic data, with or without differential privacy, has attracted significant attention as a potential solution to the dilemma between ma...
Optimization Over a Probability Simplex
James Chok, Geoffrey M. Vasil
We propose a new iteration scheme, the Cauchy-Simplex, to optimize convex problems over the probability simplex $\{w\in\mathbb{R}^n\ |\ \sum_i w_i=1\ ...
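For reference (standard definition, not a reconstruction of the truncated abstract), the probability simplex in $\mathbb{R}^n$ is the constraint set
\[ \Delta_n = \Big\{ w \in \mathbb{R}^n : w_i \ge 0 \ \text{for all } i,\ \ \sum_{i=1}^n w_i = 1 \Big\}, \]
over which the proposed Cauchy-Simplex scheme optimizes a convex objective; the iteration itself is not reproduced here.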
Laplace Meets Moreau: Smooth Approximation to Infimal Convolutions Using Laplace's Method
Ryan J. Tibshirani, Samy Wu Fung, Howard Heaton et al.
We study approximations to the Moreau envelope---and infimal convolutions more broadly---based on Laplace's method, a classical tool in analysis which...
Sampling and Estimation on Manifolds using the Langevin Diffusion
Karthik Bharath, Alexander Lewis, Akash Sharma et al.
Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion with invariant measure $\te...
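For orientation, in the Euclidean case the unadjusted Langevin algorithm discretizes the diffusion with invariant measure $\pi$ via the step (generic notation, step size $\gamma>0$)
\[ x_{k+1} = x_k + \gamma\, \nabla \log \pi(x_k) + \sqrt{2\gamma}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I). \]
The paper derives analogous error bounds for a discretization of an intrinsically defined Langevin diffusion on a manifold, which is not written out here.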
Sharp Bounds for Sequential Federated Learning on Heterogeneous Data
Yipeng Li, Xinchen Lyu
There are two paradigms in Federated Learning (FL): parallel FL (PFL), where models are trained in a parallel manner across clients, and sequential FL...
Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization
Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang et al.
Determining whether deep neural network (DNN) models can reliably recover target functions at overparameterization is a critical yet complex issue in ...
Stabilizing Sharpness-Aware Minimization Through A Simple Renormalization Strategy
Chengli Tan, Jiangshe Zhang, Junmin Liu et al.
Recently, sharpness-aware minimization (SAM) has attracted much attention because of its surprising effectiveness in improving generalization performa...
Fine-Grained Change Point Detection for Topic Modeling with Pitman-Yor Process
Feifei Wang, Zimeng Zhao, Ruimin Ye et al.
Identifying change points in dynamic text data is crucial for understanding the evolving nature of topics across various sources, such as news article...
Deletion Robust Non-Monotone Submodular Maximization over Matroids
Paul Dütting, Federico Fusco, Silvio Lattanzi et al.
We study the deletion robust version of submodular maximization under matroid constraints. The goal is to extract a small-size summary of the data set...
Instability, Computational Efficiency and Statistical Accuracy
Nhat Ho, Koulik Khamaru, Raaz Dwivedi et al.
Many statistical estimators are defined as the fixed point of a data-dependent operator, with estimators based on minimizing a cost function being an ...
Estimation of Local Geometric Structure on Manifolds from Noisy Data
Yariv Aizenbud, Barak Sober
A common observation in data-driven applications is that high-dimensional data have a low intrinsic dimension, at least locally. In this work, we cons...
Ontolearn---A Framework for Large-scale OWL Class Expression Learning in Python
Caglar Demir, Alkid Baci, N'Dah Jean Kouagou et al.
In this paper, we present Ontolearn---a framework for learning OWL class expressions over large knowledge graphs. Ontolearn contains efficient implem...
Continuously evolving rewards in an open-ended environment
Richard M. Bailey
Unambiguous identification of the rewards driving behaviours of entities operating in complex open-ended real-world environments is difficult, in part...
Recursive Causal Discovery
Ehsan Mokhtarian, Sepehr Elahi, Sina Akbari et al.
Causal discovery from observational data, i.e., learning the causal graph from a finite set of samples from the joint distribution of the variables, i...
Evaluation of Active Feature Acquisition Methods for Time-varying Feature Settings
Henrik von Kleist, Alireza Zamanian, Ilya Shpitser et al.
Machine learning methods often assume that input features are available at no cost. However, in domains like healthcare, where acquiring features coul...
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni, Nicklas Werge
Stochastic optimization methods face new challenges in the realm of streaming data, characterized by a continuous flow of large, high-dimensional data...
Determine the Number of States in Hidden Markov Models via Marginal Likelihood
Yang Chen, Cheng-Der Fuh, Chu-Lan Michael Kao
Hidden Markov models (HMM) have been widely used by scientists to model stochastic systems: the underlying process is a discrete Markov chain, and the...
Variance-Aware Estimation of Kernel Mean Embedding
Geoffrey Wolfer, Pierre Alquier
An important feature of kernel mean embeddings (KME) is that the rate of convergence of the empirical KME to the true distribution KME can be bounded ...
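As background, the kernel mean embedding of a distribution $P$ under a kernel $k$, and its empirical counterpart from a sample $x_1,\dots,x_n$ (generic notation), are
\[ \mu_P = \mathbb{E}_{X\sim P}\big[k(\cdot, X)\big] \in \mathcal{H}_k, \qquad \hat\mu_n = \frac{1}{n}\sum_{i=1}^n k(\cdot, x_i), \]
and the convergence rates mentioned in the abstract concern $\|\hat\mu_n - \mu_P\|_{\mathcal{H}_k}$; the variance-aware refinement proposed in the paper is not reproduced here.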
Scaling ResNets in the Large-depth Regime
Pierre Marion, Adeline Fermanian, Gérard Biau et al.
Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these arc...
A Comparative Evaluation of Quantification Methods
Tobias Schumacher, Markus Strohmaier, Florian Lemmerich
Quantification represents the problem of estimating the distribution of class labels on unseen data. It also represents a growing research field in su...
Lightning UQ Box: Uncertainty Quantification for Neural Networks
Nils Lehmann, Nina Maria Gottschling, Jakob Gawlikowski et al.
Although neural networks have shown impressive results in a multitude of application domains, the "black box" nature of deep learning and lack of conf...
Scaling Data-Constrained Language Models
Niklas Muennighoff, Alexander M. Rush, Boaz Barak et al.
The current trend of scaling language models involves increasing both parameter count and training data set size. Extrapolating this trend suggests th...
Curvature-based Clustering on Graphs
Yu Tian, Zachary Lubberts, Melanie Weber
Unsupervised node clustering (or community detection) is a classical graph learning task. In this paper, we study algorithms that exploit the geometry...
Composite Goodness-of-fit Tests with Kernels
Oscar Key, Arthur Gretton, François-Xavier Briol et al.
We propose kernel-based hypothesis tests for the challenging composite testing problem, where we are interested in whether the data comes from any dis...
PFLlib: A Beginner-Friendly and Comprehensive Personalized Federated Learning Library and Benchmark
Jianqing Zhang, Yang Liu, Yang Hua et al.
Amid the ongoing advancements in Federated Learning (FL), a machine learning paradigm that allows collaborative learning with data privacy protection,...
The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning
Nikhil Ghosh, Spencer Frei, Wooseok Ha et al.
In this work, we investigate the dynamics of stochastic gradient descent (SGD) when training a single-neuron autoencoder with linear or ReLU activatio...
Efficient and Robust Transfer Learning of Optimal Individualized Treatment Regimes with Right-Censored Survival Data
Pan Zhao, Julie Josse, Shu Yang
An individualized treatment regime (ITR) is a decision rule that assigns treatments based on patients' characteristics. The value function of an ITR i...
DAGs as Minimal I-maps for the Induced Models of Causal Bayesian Networks under Conditioning
Xiangdong Xie, Jiahua Guo, Yi Sun
Bayesian networks (BNs) are a powerful tool for knowledge representation and reasoning, especially for complex systems. A critical task in the applic...
Adjusted Expected Improvement for Cumulative Regret Minimization in Noisy Bayesian Optimization
Shouri Hu, Haowei Wang, Zhongxiang Dai et al.
The expected improvement (EI) is one of the most popular acquisition functions for Bayesian optimization (BO) and has demonstrated good empirical perf...
Manifold Fitting under Unbounded Noise
Zhigang Yao, Yuqing Xia
In the field of non-Euclidean statistical analysis, a trend has emerged in recent times, of attempts to recover a low dimensional structure, namely a ...
Learning Global Nash Equilibrium in Team Competitive Games with Generalized Fictitious Cross-Play
Zelai Xu, Chao Yu, Yancheng Liang et al.
Self-play (SP) is a popular multi-agent reinforcement learning framework for competitive games. Despite the empirical success, the theoretical propert...
Wasserstein Convergence Guarantees for a General Class of Score-Based Generative Models
Xuefeng Gao, Hoang M. Nguyen, Lingjiong Zhu
Score-based generative models are a recent class of deep generative models with state-of-the-art performance in many applications. In this paper, we e...
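For context, score-based generative models pair a forward noising SDE with its time reversal, replacing the unknown score $\nabla_x \log p_t$ by a learned approximation $s_\theta$ (generic notation):
\[ \mathrm{d}X_t = f(X_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t \ \ \text{(forward)}, \qquad \mathrm{d}X_t = \big[f(X_t, t) - g(t)^2\, s_\theta(X_t, t)\big]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar W_t \ \ \text{(reverse-time)}. \]
The paper's contribution is Wasserstein convergence guarantees for a general class of such samplers; the precise assumptions are in the full abstract.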
Extremal graphical modeling with latent variables via convex optimization
Sebastian Engelke, Armeen Taeb
Extremal graphical models encode the conditional independence structure of multivariate extremes and provide a powerful tool for quantifying the risk ...
On the Approximation of Kernel functions
Paul Dommel, Alois Pichler
Various methods in statistical learning build on kernels considered in reproducing kernel Hilbert spaces. In applications, the kernel is often selecte...
Efficient and Robust Semi-supervised Estimation of Average Treatment Effect with Partially Annotated Treatment and Response
Jue Hou, Rajarshi Mukherjee, Tianxi Cai
A notable challenge of leveraging Electronic Health Records (EHR) for treatment effect assessment is the lack of precise information on important clin...
Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning
Kuangyu Ding, Jingyang Li, Kim-Chuan Toh
Stochastic gradient methods for minimizing nonconvex composite objective functions typically rely on the Lipschitz smoothness of the differentiable pa...
Optimizing Data Collection for Machine Learning
Rafid Mahmood, James Lucas, Jose M. Alvarez et al.
Modern deep learning systems require huge data sets to achieve impressive performance, but there is little guidance on how much or what kind of data t...
Unbalanced Kantorovich-Rubinstein distance, plan, and barycenter on finite spaces: A statistical perspective
Shayan Hundrieser, Florian Heinemann, Marcel Klatt et al.
We analyze statistical properties of plug-in estimators for unbalanced optimal transport quantities between finitely supported measures in different p...
Copula-based Sensitivity Analysis for Multi-Treatment Causal Inference with Unobserved Confounding
Jiajing Zheng, Alexander D'Amour, Alexander Franks
Recent work has focused on the potential and pitfalls of causal identification in observational studies with multiple simultaneous treatments. Buildin...
Rank-one Convexification for Sparse Regression
Alper Atamturk, Andres Gomez
Sparse regression models are increasingly prevalent due to their ease of interpretability and superior out-of-sample performance. However, the exact m...
gsplat: An Open-Source Library for Gaussian Splatting
Vickie Ye, Ruilong Li, Justin Kerr et al.
gsplat is an open-source library designed for training and developing Gaussian Splatting methods. It features a front-end with Python bindings compati...
Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming
Sen Na, Michael Mahoney
We consider online statistical inference of constrained stochastic nonlinear optimization problems. We apply the Stochastic Sequential Quadratic Progr...
Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds
Clément Bonet, Lucas Drumetz, Nicolas Courty
While many Machine Learning methods have been developed or transposed on Riemannian manifolds to tackle data with known non-Euclidean geometry, Optima...
Accelerating optimization over the space of probability measures
Shi Chen, Qin Li, Oliver Tse et al.
The acceleration of gradient-based optimization methods is a subject of significant practical and theoretical importance, particularly within machine ...
Bayesian Multi-Group Gaussian Process Models for Heterogeneous Group-Structured Data
Didong Li, Andrew Jones, Sudipto Banerjee et al.
Gaussian processes are pervasive in functional data analysis, machine learning, and spatial statistics for modeling complex dependencies. Scientific d...
Orthogonal Bases for Equivariant Graph Learning with Provable k-WL Expressive Power
Jia He, Maggie Cheng
Graph neural network (GNN) models have been widely used for learning graph-structured data. Due to the permutation-invariant requirement of graph lear...
Optimal Experiment Design for Causal Effect Identification
Sina Akbari, Jalal Etesami, Negar Kiyavash
Pearl’s do calculus is a complete axiomatic approach to learn the identifiable causal effects from observational data. When such an effect is not iden...
Mean Aggregator is More Robust than Robust Aggregators under Label Poisoning Attacks on Distributed Heterogeneous Data
Jie Peng, Weiyu Li, Stefan Vlaski et al.
Robustness to malicious attacks is of paramount importance for distributed learning. Existing works usually consider the classical Byzantine attacks m...
The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
Jiin Woo, Gauri Joshi, Yuejie Chi
In this paper, we consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on lo...
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers
Kaichao You, Runsheng Bai, Meng Cao et al.
PyTorch 2.x introduces a compiler designed to accelerate deep learning programs. However, for machine learning researchers, fully leveraging the PyTor...
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang
Stochastic approximation is a class of algorithms that update a vector iteratively, incrementally, and stochastically, including, e.g., stochastic gra...
Improving Graph Neural Networks on Multi-node Tasks with the Labeling Trick
Xiyuan Wang, Pan Li, Muhan Zhang
In this paper, we study using graph neural networks (GNNs) for multi-node representation learning, where a representation for a set of more than one n...
Directed Cyclic Graphs for Simultaneous Discovery of Time-Lagged and Instantaneous Causality from Longitudinal Data Using Instrumental Variables
Wei Jin, Yang Ni, Amanda B. Spence et al.
We consider the problem of causal discovery from longitudinal observational data. We develop a novel framework that simultaneously discovers the time-...
Bayesian Sparse Gaussian Mixture Model for Clustering in High Dimensions
Dapeng Yao, Fangzheng Xie, Yanxun Xu
We study the sparse high-dimensional Gaussian mixture model when the number of clusters is allowed to grow with the sample size. A minimax lower bound...
Regularizing Hard Examples Improves Adversarial Robustness
Hyungyu Lee, Saehyung Lee, Ho Bae et al.
Recent studies have validated that pruning hard-to-learn examples from training improves the generalization performance of neural networks (NNs). In t...
Random ReLU Neural Networks as Non-Gaussian Processes
Rahul Parhi, Pakshal Bohra, Ayoub El Biari et al.
We consider a large class of shallow neural networks with randomly initialized parameters and rectified linear unit activation functions. We prove tha...
Riemannian Bilevel Optimization
Jiaxiang Li, Shiqian Ma
In this work, we consider the bilevel optimization problem on Riemannian manifolds. We inspect the calculation of the hypergradient of such problems o...
Supervised Learning with Evolving Tasks and Performance Guarantees
Verónica Álvarez, Santiago Mazuelas, Jose A. Lozano
Multiple supervised learning scenarios are composed by a sequence of classification tasks. For instance, multi-task learning and continual learning ai...
Error estimation and adaptive tuning for unregularized robust M-estimator
Pierre C. Bellec, Takuya Koriyama
We consider unregularized robust M-estimators for linear models under Gaussian design and heavy-tailed noise, in the proportional asymptotics regime w...
From Sparse to Dense Functional Data in High Dimensions: Revisiting Phase Transitions from a Non-Asymptotic Perspective
Shaojun Guo, Dong Li, Xinghao Qiao et al.
Nonparametric estimation of the mean and covariance functions is ubiquitous in functional data analysis and local linear smoothing techniques are most...
Locally Private Causal Inference for Randomized Experiments
Yuki Ohnishi, Jordan Awan
Local differential privacy is a differential privacy paradigm in which individuals first apply a privacy mechanism to their data (often by adding nois...
Estimating Network-Mediated Causal Effects via Principal Components Network Regression
Alex Hayes, Mark M. Fredrickson, Keith Levin
We develop a method to decompose causal effects on a social network into an indirect effect mediated by the network, and a direct effect independent o...
Selective Inference with Distributed Data
Sifan Liu, Snigdha Panigrahi
When data are distributed across multiple sites or machines rather than centralized in one location, researchers face the challenge of extracting mean...
Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization
Tianyi Lin, Chi Jin, Michael I. Jordan
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems in the f...
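For reference, for $\min_x \max_y f(x, y)$ a two-timescale gradient descent ascent iteration takes the generic form
\[ x_{t+1} = x_t - \eta_x \nabla_x f(x_t, y_t), \qquad y_{t+1} = y_t + \eta_y \nabla_y f(x_t, y_t), \]
with the two step sizes kept on different scales (e.g., $\eta_x/\eta_y$ small, so the inner maximization is tracked faster than the outer descent). The unified analysis in the paper covers structured nonconvex settings and is not reproduced here.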
An Axiomatic Definition of Hierarchical Clustering
Ery Arias-Castro, Elizabeth Coda
In this paper, we take an axiomatic approach to defining a population hierarchical clustering for piecewise constant densities, and in a similar manne...
Test-Time Training on Video Streams
Renhao Wang, Yu Sun, Arnuv Tandon et al.
Prior work has established Test-Time Training (TTT) as a general framework to further improve a trained model at test time. Before making a prediction...
Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback
Boxin Zhao, Lingxiao Wang, Ziqi Liu et al.
Due to the high cost of communication, federated learning (FL) systems need to sample a subset of clients that are involved in each round of training....
A Random Matrix Approach to Low-Multilinear-Rank Tensor Approximation
Hugo Lebeau, Florent Chatelain, Romain Couillet
This work presents a comprehensive understanding of the estimation of a planted low-rank signal from a general spiked tensor model near the computatio...
Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents
Marco Pleines, Matthias Pallasch, Frank Zimmer et al.
Memory Gym presents a suite of 2D partially observable environments, namely Mortar Mayhem, Mystery Path, and Searing Spotlights, designed to benchmark...
Enhancing Graph Representation Learning with Localized Topological Features
Zuoyu Yan, Qi Zhao, Ze Ye et al.
Representation learning on graphs is a fundamental problem that can be crucial in various tasks. Graph neural networks, the dominant approach for grap...
Deep Out-of-Distribution Uncertainty Quantification via Weight Entropy Maximization
Antoine de Mathelin, François Deheeger, Mathilde Mougeot et al.
This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods. It proposes a...
DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data
Jiayi Tong, Jie Hu, George Hripcsak et al.
High-dimensional healthcare data, such as electronic health records (EHR) data and claims data, present two primary challenges due to the large number...
Bayes Meets Bernstein at the Meta Level: an Analysis of Fast Rates in Meta-Learning with PAC-Bayes
Charles Riou, Pierre Alquier, Badr-Eddine Chérief-Abdellatif
Bernstein's condition is a key assumption that guarantees fast rates in machine learning. For example, under this condition, the Gibbs posterior with ...
Efficiently Escaping Saddle Points in Bilevel Optimization
Minhui Huang, Xuxing Chen, Kaiyi Ji et al.
Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization ...