JMLR

Bayesian Sparse Gaussian Mixture Model for Clustering in High Dimensions

Authors
Dapeng Yao, Fangzheng Xie, Yanxun Xu
Research Topics
High-Dimensional Statistics, Bayesian Statistics
Paper Information
  • Journal:
    Journal of Machine Learning Research
  • Added to Tracker:
    Jul 15, 2025
Abstract

We study the sparse high-dimensional Gaussian mixture model when the number of clusters is allowed to grow with the sample size. A minimax lower bound for parameter estimation is established, and we show that a constrained maximum likelihood estimator achieves the minimax lower bound. However, this optimization-based estimator is computationally intractable because the objective function is highly nonconvex and the feasible set involves discrete structures. To address the computational challenge, we propose a computationally tractable Bayesian approach to estimate high-dimensional Gaussian mixtures whose cluster centers exhibit sparsity using a continuous spike-and-slab prior. We further prove that the posterior contraction rate of the proposed Bayesian method is minimax optimal. The misclustering rate is obtained as a by-product using tools from matrix perturbation theory. The proposed Bayesian sparse Gaussian mixture model does not require pre-specifying the number of clusters, which can be adaptively estimated. The validity and usefulness of the proposed method are demonstrated through simulation studies and the analysis of a real-world single-cell RNA sequencing data set.
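
To make the setting concrete, here is a minimal sketch of the kind of data the abstract describes and of how a misclustering rate can be measured. It is an illustration under stated assumptions, not the authors' method: the function names, the identity covariance, the choice of which coordinates are nonzero, and the `signal` scale are all hypothetical, and no spike-and-slab prior or posterior sampler is implemented.

```python
import itertools

import numpy as np


def simulate_sparse_gmm(n, p, k, s, signal=3.0, seed=0):
    """Draw n points in R^p from a k-component Gaussian mixture.

    Assumptions (illustrative only): identity covariance, uniform
    cluster weights, and cluster centers that are nonzero only on
    the first s of the p coordinates.
    """
    rng = np.random.default_rng(seed)
    centers = np.zeros((k, p))
    # Only s coordinates carry signal; the remaining p - s stay zero.
    centers[:, :s] = signal * rng.standard_normal((k, s))
    labels = rng.integers(0, k, size=n)
    X = centers[labels] + rng.standard_normal((n, p))
    return X, labels, centers


def misclustering_rate(true_labels, est_labels, k):
    """Fraction of mislabeled points, minimized over label permutations.

    Cluster labels are only identifiable up to relabeling, so every
    permutation of {0, ..., k-1} is tried (feasible for small k only).
    """
    true_labels = np.asarray(true_labels)
    est_labels = np.asarray(est_labels)
    best = 1.0
    for perm in itertools.permutations(range(k)):
        relabeled = np.array(perm)[est_labels]
        best = min(best, float(np.mean(relabeled != true_labels)))
    return best
```

A call such as `simulate_sparse_gmm(n=200, p=1000, k=3, s=10)` gives a high-dimensional sample in which only 10 of the 1000 coordinates separate the clusters. The labels from any clustering routine can then be scored with `misclustering_rate`.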

Citation Information
APA Format
Yao, D., Xie, F., & Xu, Y. (2025). Bayesian Sparse Gaussian Mixture Model for Clustering in High Dimensions. Journal of Machine Learning Research, 26(21), 1-50.
BibTeX Format
@article{JMLR:v26:23-0142,
  author  = {Dapeng Yao and Fangzheng Xie and Yanxun Xu},
  title   = {Bayesian Sparse Gaussian Mixture Model for Clustering in High Dimensions},
  journal = {Journal of Machine Learning Research},
  year    = {2025},
  volume  = {26},
  number  = {21},
  pages   = {1--50},
  url     = {http://jmlr.org/papers/v26/23-0142.html}
}