JMLR

On the Ability of Deep Networks to Learn Symmetries from Data: A Neural Kernel Theory

Authors
Andrea Perin, Stephane Deny
Research Topics
Nonparametric Statistics
Paper Information
  • Journal:
    Journal of Machine Learning Research
  • Added to Tracker:
    Sep 08, 2025
Abstract

Symmetries (transformations by group actions) are present in many datasets, and leveraging them holds considerable promise for improving predictions in machine learning. In this work, we aim to understand when and how deep networks (with standard architectures trained in a standard, supervised way) learn symmetries from data. Inspired by real-world scenarios, we study a classification paradigm where data symmetries are only partially observed during training: some classes include all transformations of a cyclic group, while others include only a subset. We ask: under which conditions will deep networks correctly classify the partially sampled classes? In the infinite-width limit, where neural networks behave like kernel machines, we derive a neural kernel theory of symmetry learning. The group-cyclic nature of the dataset allows us to analyze the Gram matrix of neural kernels in the Fourier domain; here we find a simple characterization of the generalization error as a function of class separation (signal) and class-orbit density (noise). This characterization reveals that generalization can only be successful when the local structure of the data prevails over its non-local, symmetry-induced structure, in the kernel space defined by the architecture. This occurs when (1) classes are sufficiently distinct and (2) class orbits are sufficiently dense. We extend our theoretical treatment to any finite group, including non-abelian groups. Our framework also applies to equivariant architectures (e.g., CNNs), and recovers their success in the special case where the architecture matches the inherent symmetry of the data. Empirically, our theory reproduces the generalization failure of finite-width networks (MLP, CNN, ViT) trained on partially observed versions of rotated-MNIST. We conclude that conventional deep networks lack a mechanism to learn symmetries that have not been explicitly embedded in their architecture a priori. In the future, our framework could be extended to guide the design of architectures and training procedures able to learn symmetries from data.
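The sketch below is not the authors' code; it is a minimal toy illustration of the setup the abstract describes, under stated assumptions. It builds two classes as orbits of a cyclic group (circular shifts of a prototype vector), observes one class only on half of its orbit, and uses the order-1 arc-cosine (NNGP) kernel of an infinite-width ReLU network as a stand-in "neural kernel" for kernel ridge regression. The prototype separation, ridge parameter, and orbit size are all illustrative choices. It also checks that the within-orbit kernel block is circulant and therefore diagonalized by the discrete Fourier transform, which is the structure the paper's Fourier-domain analysis exploits.

```python
# Illustrative sketch (assumptions: arc-cosine NNGP kernel as the neural kernel,
# circular shifts as the cyclic group action, toy Gaussian prototypes).
import numpy as np

rng = np.random.default_rng(0)
N = 32  # cyclic group size (number of shifts per orbit)
D = 32  # input dimensionality (the group acts by circular rolls)

def orbit(prototype):
    """All N circular shifts of a prototype vector: one full group orbit."""
    return np.stack([np.roll(prototype, g) for g in range(N)])

def arccos_kernel(X, Y):
    """Order-1 arc-cosine kernel: NNGP kernel of an infinite-width ReLU network."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)
    ny = np.linalg.norm(Y, axis=1, keepdims=True)
    cos = np.clip((X @ Y.T) / (nx * ny.T), -1.0, 1.0)
    theta = np.arccos(cos)
    return (nx * ny.T) / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * cos)

# Two class prototypes; their distance controls the "signal" (class separation).
proto_a = rng.normal(size=D)
proto_b = proto_a + 0.5 * rng.normal(size=D)  # nearby prototype: harder problem

orbit_a, orbit_b = orbit(proto_a), orbit(proto_b)

# Class A is fully observed; class B is observed only on half of its orbit.
observed_b = np.arange(N // 2)
heldout_b = np.arange(N // 2, N)

X_train = np.concatenate([orbit_a, orbit_b[observed_b]])
y_train = np.concatenate([np.full(N, +1.0), np.full(len(observed_b), -1.0)])
X_test, y_test = orbit_b[heldout_b], np.full(len(heldout_b), -1.0)

# Kernel ridge regression with the neural kernel (infinite-width surrogate).
K = arccos_kernel(X_train, X_train)
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(K)), y_train)
preds = arccos_kernel(X_test, X_train) @ alpha
acc = np.mean(np.sign(preds) == y_test)
print(f"accuracy on unseen shifts of the partially observed class: {acc:.2f}")

# The within-orbit kernel block is circulant (entries depend only on the shift
# difference), so its eigenvalues are given by the FFT of its first row; this is
# the Fourier-domain structure used in the paper's analysis.
K_aa = arccos_kernel(orbit_a, orbit_a)
eigs_fft = np.sort(np.real(np.fft.fft(K_aa[0])))[::-1]
eigs_num = np.sort(np.linalg.eigvalsh(K_aa))[::-1]
print("circulant check (max eigenvalue discrepancy):", np.max(np.abs(eigs_fft - eigs_num)))
```

In this toy setting, sweeping the prototype separation and the fraction of observed orbit elements reproduces the qualitative trade-off stated in the abstract: generalization to unseen transformations improves when classes are well separated and when the observed orbit is dense.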

Author Details
Andrea Perin
Author
Stephane Deny
Author
Research Topics & Keywords
Nonparametric Statistics
Research Area
Citation Information
APA Format
Perin, A., & Deny, S. (2025). On the Ability of Deep Networks to Learn Symmetries from Data: A Neural Kernel Theory. Journal of Machine Learning Research, 26.
BibTeX Format
@article{paper510,
  title   = {On the Ability of Deep Networks to Learn Symmetries from Data: A Neural Kernel Theory},
  author  = {Andrea Perin and Stephane Deny},
  journal = {Journal of Machine Learning Research},
  volume  = {26},
  year    = {2025},
  url     = {https://www.jmlr.org/papers/v26/24-2175.html}
}