
An Augmentation Overlap Theory of Contrastive Learning

Authors
Qi Zhang, Yifei Wang, Yisen Wang
Paper Information
  • Journal:
    Journal of Machine Learning Research
  • Added to Tracker:
    Dec 30, 2025
Abstract

Recently, self-supervised contrastive learning has achieved great success on various tasks. However, its underlying working mechanism remains unclear. In this paper, we first provide the tightest bounds based on the widely adopted assumption of conditional independence. We then relax the conditional independence assumption to a more practical assumption of augmentation overlap and derive asymptotically closed bounds for the downstream performance. Our proposed augmentation overlap theory hinges on the insight that the supports of different intra-class samples become more overlapped under aggressive data augmentations, so simply aligning the positive samples (augmented views of the same sample) can make contrastive learning cluster intra-class samples together. Moreover, from this augmentation overlap perspective, we develop an unsupervised metric for evaluating the representations learned by contrastive learning, which aligns well with downstream performance while requiring almost no additional modules. Code is available at https://github.com/PKU-ML/GARC.
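As a concrete illustration of the alignment idea in the abstract, below is a minimal PyTorch sketch of a positive-pair alignment loss, where the embeddings of two augmented views of the same batch are pulled together. The function name, tensor shapes, and normalization choice are illustrative assumptions, not taken from the paper's released code.

import torch
import torch.nn.functional as F

def alignment_loss(z1, z2):
    # z1, z2: embeddings of two augmented views of the same images,
    # each of shape (batch_size, dim).
    # Normalize so only direction matters, then penalize the squared
    # distance between each positive pair.
    # (Illustrative sketch; not the paper's released implementation.)
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    return (z1 - z2).pow(2).sum(dim=1).mean()

# Toy usage with random embeddings standing in for an encoder's output:
z1 = torch.randn(256, 128)
z2 = torch.randn(256, 128)
print(alignment_loss(z1, z2).item())

Per the augmentation overlap view, aggressive augmentations make views of different intra-class samples overlap, so minimizing an alignment term of this kind alone can already cluster intra-class samples together.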

Citation Information
APA Format
Qi Zhang, Yifei Wang, & Yisen Wang. An Augmentation Overlap Theory of Contrastive Learning. Journal of Machine Learning Research.
BibTeX Format
@article{paper702,
  title   = {An Augmentation Overlap Theory of Contrastive Learning},
  author  = {Qi Zhang and Yifei Wang and Yisen Wang},
  journal = {Journal of Machine Learning Research},
  url     = {https://www.jmlr.org/papers/v26/22-1009.html}
}