JRSSB Apr 27, 2026

Causal K-means clustering

Authors
Edward H Kennedy Kwangho Kim Jisu Kim
Research Topics
Causal Inference
Paper Information
  • Journal:
    Journal of the Royal Statistical Society Series B
  • DOI:
    10.1093/jrsssb/qkag068
  • Published:
    April 27, 2026
  • Added to Tracker:
    Apr 27, 2026
Abstract

Causal effects are often characterized at the population level, which can mask important heterogeneity across latent subgroups. Since the subgroup structure is unknown, identifying and evaluating subgroup-specific effects is substantially more challenging than standard population-level analysis. We address this problem by proposing Causal k-Means Clustering, a framework that uses k-means clustering ideas to recover unknown subgroup structure from individual-level causal contrasts. The problem differs fundamentally from ordinary clustering because the objects to be clustered are unknown counterfactual functions. We first study a simple plug-in estimator that is readily implemented with standard algorithms and establish its rate of convergence. We then develop a bias-corrected estimator using semiparametric efficiency theory and double machine learning and show that it attains fast root-n rates and asymptotic normality in large nonparametric models. The proposed methods are especially useful for modern outcome-wide studies with multiple treatment levels, and the framework extends naturally to clustering based on more general pseudo-outcomes, including partially observed outcomes and other unknown functionals. We study finite-sample performance in simulations and illustrate the method with an application to mobile-supported self-management for chronic low back pain.
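The plug-in idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: a toy simulation with one covariate, a randomized binary treatment, and two latent subgroups; a simple binned-mean (histogram) regression standing in for the nuisance estimator; and a scalar contrast mu1(x) - mu0(x) as the object being clustered. The function names (`simulate`, `fit_binned_means`, `kmeans_1d`, `causal_kmeans_plugin`) are hypothetical. The sketch does not implement the bias-corrected estimator.

```python
import random

def simulate(n, seed=0):
    """Toy data (assumed): subgroup effects are 0 for x < 0.5 and 2 otherwise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()                    # covariate in [0, 1)
        a = rng.randint(0, 1)               # randomized binary treatment
        effect = 0.0 if x < 0.5 else 2.0    # latent subgroup-specific effect
        y = x + a * effect + rng.gauss(0, 0.5)
        data.append((x, a, y))
    return data

def bin_index(x, n_bins):
    """Map x in [0, 1) to one of n_bins equal-width bins."""
    return min(int(x * n_bins), n_bins - 1)

def fit_binned_means(data, arm, n_bins):
    """Histogram regression: estimate E[Y | X, A = arm] by per-bin averages."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, a, y in data:
        if a == arm:
            b = bin_index(x, n_bins)
            sums[b] += y
            counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def kmeans_1d(values, k, n_iter=50):
    """Lloyd's algorithm on scalars, initialized at evenly spaced quantiles."""
    ordered = sorted(values)
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

def causal_kmeans_plugin(data, k=2, n_bins=10):
    """Plug-in sketch: estimate contrasts first, then cluster the estimates."""
    mu0 = fit_binned_means(data, 0, n_bins)
    mu1 = fit_binned_means(data, 1, n_bins)
    # Estimated individual-level contrast for every unit.
    tau_hat = [mu1[bin_index(x, n_bins)] - mu0[bin_index(x, n_bins)]
               for x, _, _ in data]
    centers = sorted(kmeans_1d(tau_hat, k))
    labels = [min(range(k), key=lambda c: abs(t - centers[c])) for t in tau_hat]
    return centers, labels

data = simulate(2000, seed=0)
centers, labels = causal_kmeans_plugin(data, k=2)
print(centers)
```

In this toy setup the two estimated centers should land near the simulated subgroup effects, and the labels should largely track which side of x = 0.5 a unit falls on. Any regression method could replace the histogram regressor; the binned version is used only to keep the sketch dependency-free.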

Citation Information
APA Format
Kennedy, E. H., Kim, K., & Kim, J. (2026). Causal K-means clustering. Journal of the Royal Statistical Society Series B. https://doi.org/10.1093/jrsssb/qkag068
BibTeX Format
@article{paper1135,
  title = {Causal K-means clustering},
  author = {Edward H Kennedy and Kwangho Kim and Jisu Kim},
  journal = {Journal of the Royal Statistical Society Series B},
  year = {2026},
  doi = {10.1093/jrsssb/qkag068},
  url = {https://doi.org/10.1093/jrsssb/qkag068}
}