Sparse Semiparametric Discriminant Analysis for High-dimensional Zero-inflated Data
Paper Information
Journal: Journal of Machine Learning Research
Added to Tracker: Dec 30, 2025
Abstract
Sequencing-based technologies provide an abundance of high-dimensional biological data sets with highly skewed and zero-inflated measurements. Despite the computational efficiency and high interpretability offered by linear classification methods, the violation of underlying distribution assumptions, driven by high skewness and zero inflation, results in invalid classification rules and interpretations. Furthermore, existing data transformation methods addressing these violations introduce ambiguity, rendering the final model and classification performance contingent on the specific transformation employed. To tackle these challenges, we propose a novel semiparametric framework for discriminant analysis based on the truncated latent Gaussian copula model. This model accommodates skewness and zero inflation, and its estimation procedure ensures robustness against data transformations. To facilitate model interpretability, we incorporate $\ell_1$ sparsity regularization and establish the consistency of the classification directions in high-dimensional settings. We validate our approach using human gut microbiome, breast cancer microRNA, and single-cell RNA sequencing data, highlighting its superior classification accuracy and robustness to data transformations.
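The robustness to data transformations claimed above can be illustrated with a small sketch. The abstract does not say which estimator the authors use, so this example assumes a rank-based statistic (Kendall's tau), a common choice for latent Gaussian copula models. The function names, the simulated counts, and the parameter values are all hypothetical and are not taken from the paper. The point is only that a rank statistic computed on zero-inflated counts is unchanged by strictly increasing transformations such as log(1 + x) or the square root.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs.

    Tied pairs contribute zero, so zero-inflated data is handled
    without any special casing.
    """
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sign(x[i] - x[j]) * sign(y[i] - y[j])
    return total / (n * (n - 1) / 2)


def to_zero_inflated_count(z):
    """Hypothetical observation model: truncate a latent Gaussian at zero,
    then map the positive part to a skewed integer count."""
    return int(round(math.exp(2.0 * z))) if z > 0 else 0


# Simulate two correlated latent Gaussian variables (illustrative values).
random.seed(0)
n, rho = 200, 0.6
z1 = [random.gauss(0.0, 1.0) for _ in range(n)]
z2 = [rho * a + math.sqrt(1.0 - rho ** 2) * random.gauss(0.0, 1.0) for a in z1]

# Observed data: skewed, zero-inflated counts.
x = [to_zero_inflated_count(v) for v in z1]
y = [to_zero_inflated_count(v) for v in z2]

# Kendall's tau on the raw counts and on two monotone transformations.
tau_raw = kendall_tau_a(x, y)
tau_log = kendall_tau_a([math.log1p(v) for v in x], [math.log1p(v) for v in y])
tau_sqrt = kendall_tau_a([math.sqrt(v) for v in x], [math.sqrt(v) for v in y])

print(tau_raw, tau_log, tau_sqrt)
```

Because the statistic depends only on the ordering of the observations, all three printed values coincide. A moment-based quantity such as the sample covariance would instead change with each transformation, which is the ambiguity the abstract attributes to existing transformation-based approaches.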
Author Details
Yang Ni (Author)
Hee Cheol Chung (Author)
Irina Gaynanova (Author)
Research Topics & Keywords
Research Area: High-Dimensional Statistics
Citation Information
APA Format
Yang Ni, Hee Cheol Chung, & Irina Gaynanova. Sparse Semiparametric Discriminant Analysis for High-dimensional Zero-inflated Data. Journal of Machine Learning Research.
BibTeX Format
@article{paper720,
  title = {Sparse Semiparametric Discriminant Analysis for High-dimensional Zero-inflated Data},
  author = {Yang Ni and Hee Cheol Chung and Irina Gaynanova},
  journal = {Journal of Machine Learning Research},
  url = {https://www.jmlr.org/papers/v26/24-0046.html}
}