JMLR

Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning

Authors
Yong Lin, Chen Liu, Chenlu Ye, Qing Lian, Yuan Yao, Tong Zhang
Research Topics
Machine Learning
Paper Information
  • Journal:
    Journal of Machine Learning Research
  • Added to Tracker:
    Sep 08, 2025
Abstract

Modern deep learning heavily relies on large labeled datasets, which often come with high costs in terms of both manual labeling and computational resources. To mitigate these challenges, researchers have explored the use of informative subset selection techniques. In this study, we present a theoretically optimal solution for addressing both sampling with and without labels within the context of linear softmax regression. Our proposed method, COPS (unCertainty based OPtimal Sub-sampling), is designed to minimize the expected loss of a model trained on subsampled data. Unlike existing approaches that rely on explicit calculations of the inverse covariance matrix, which are not easily applicable to deep learning scenarios, COPS leverages the model's logits to estimate the sampling ratio. This sampling ratio is closely associated with model uncertainty and can be effectively applied to deep learning tasks. Furthermore, we address the challenge of model sensitivity to misspecification by incorporating a down-weighting approach for low-density samples, drawing inspiration from previous works. To assess the effectiveness of our proposed method, we conducted extensive empirical experiments using deep neural networks on benchmark datasets. The results consistently showcase the superior performance of COPS compared to baseline methods, reaffirming its efficacy.
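
The idea of turning logits into a sampling ratio can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the formula, so the sketch assumes that uncertainty is measured as the disagreement (standard deviation of softmax probabilities) across a small ensemble of models, and that down-weighting of low-density samples can be approximated by capping scores at a quantile. The function names, the `clip_quantile` parameter, and the ensemble setup are all hypothetical.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the last (class) axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def uncertainty_scores(ensemble_logits):
    """Per-sample uncertainty from logits of shape (K models, N samples, C classes).

    Assumption: uncertainty = std of predicted probabilities across the
    K models, summed over classes. The paper may define it differently.
    """
    probs = softmax(ensemble_logits)       # (K, N, C)
    return probs.std(axis=0).sum(axis=-1)  # (N,)


def sampling_probs(scores, clip_quantile=0.95, eps=1e-12):
    """Turn uncertainty scores into a sampling distribution.

    Assumption: capping scores at a quantile stands in for the
    down-weighting of low-density samples mentioned in the abstract.
    """
    cap = np.quantile(scores, clip_quantile)
    clipped = np.minimum(scores, cap) + eps  # eps avoids an all-zero vector
    return clipped / clipped.sum()


def subsample(scores, budget, rng, clip_quantile=0.95):
    """Draw `budget` indices with replacement, plus importance weights."""
    p = sampling_probs(scores, clip_quantile)
    idx = rng.choice(len(p), size=budget, replace=True, p=p)
    # Inverse-probability weights keep a weighted loss on the subsample
    # unbiased with respect to the loss on the full dataset.
    weights = 1.0 / (len(p) * p[idx])
    return idx, weights


rng = np.random.default_rng(0)
demo_logits = rng.normal(size=(5, 100, 10))  # 5 models, 100 samples, 10 classes
demo_scores = uncertainty_scores(demo_logits)
demo_idx, demo_weights = subsample(demo_scores, budget=20, rng=rng)
```

Samples on which the ensemble members disagree receive a higher selection probability, while the quantile cap keeps a handful of extreme scores from dominating the subsample. When all models agree everywhere, the scores are zero and the distribution falls back to uniform sampling.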

Citation Information
APA Format
Yong Lin, Chen Liu, Chenlu Ye, Qing Lian, Yuan Yao & Tong Zhang. Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning. Journal of Machine Learning Research.
BibTeX Format
@article{paper527,
  title = {Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning},
  author = {Yong Lin and Chen Liu and Chenlu Ye and Qing Lian and Yuan Yao and Tong Zhang},
  journal = {Journal of Machine Learning Research},
  url = {https://www.jmlr.org/papers/v26/23-1160.html}
}