JMLR

Scaling ResNets in the Large-depth Regime

Authors

Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert

Paper Information

Journal: Journal of Machine Learning Research
Added to Tracker: Jul 15, 2025

Abstract

Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these architectures relies on a training procedure that needs to be carefully crafted to avoid vanishing or exploding gradients, particularly as the depth $L$ increases. No consensus has been reached on how to mitigate this issue, although a widely discussed strategy consists in scaling the output of each layer by a factor $\alpha_L$. We show in a probabilistic setting that, with standard i.i.d. initializations, the only non-trivial dynamics is for $\alpha_L = \frac{1}{\sqrt{L}}$; other choices lead either to explosion or to identity mapping. This scaling factor corresponds in the continuous-time limit to a neural stochastic differential equation, contrary to a widespread interpretation that deep ResNets are discretizations of neural ordinary differential equations. By contrast, in the latter regime, stability is obtained with specific correlated initializations and $\alpha_L = \frac{1}{L}$. Our analysis suggests a strong interplay between scaling and regularity of the weights as a function of the layer index. Finally, in a series of experiments, we exhibit a continuous range of regimes driven by these two parameters, which jointly impact performance before and after training.
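The three regimes named in the abstract (explosion, non-trivial dynamics, identity mapping) can be illustrated with a small simulation. The sketch below is not the paper's model or code. It assumes a simplified linear residual update $h_{k+1} = h_k + \alpha_L W_{k+1} h_k$ with i.i.d. Gaussian weights of variance $1/d$ and $\alpha_L = L^{-\beta}$. The function name `relative_change`, the width, the depth, and the seed are all hypothetical choices made for illustration.

```python
import math
import random


def relative_change(depth, beta, width=32, seed=0):
    """Return ||h_L - h_0|| / ||h_0|| for the toy residual recursion
    h_{k+1} = h_k + alpha_L * W_{k+1} h_k, with alpha_L = depth ** (-beta).

    Assumption (illustrative only): each W_k has i.i.d. N(0, 1/width)
    entries and there is no nonlinearity.
    """
    rng = random.Random(seed)
    alpha = depth ** (-beta)
    sigma = 1.0 / math.sqrt(width)

    # Random input vector, kept aside to measure how far the output moves.
    h0 = [rng.gauss(0.0, 1.0) for _ in range(width)]
    h = list(h0)

    for _ in range(depth):
        # One row of a fresh i.i.d. weight matrix per output coordinate.
        update = [
            sum(rng.gauss(0.0, sigma) * x for x in h)
            for _ in range(width)
        ]
        h = [x + alpha * u for x, u in zip(h, update)]

    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(h, h0)))
    return diff / math.sqrt(sum(b * b for b in h0))


if __name__ == "__main__":
    # beta = 0.5 is the 1/sqrt(L) scaling; beta = 1.0 is the 1/L scaling.
    for beta in (0.25, 0.5, 1.0):
        change = relative_change(400, beta)
        print(f"beta={beta}: relative change = {change:.3g}")
```

In this toy setting, the expected squared norm grows like $(1 + \alpha_L^2)^L$, which stays bounded away from both 1 and infinity only when $\alpha_L$ is of order $1/\sqrt{L}$. A smaller exponent $\beta$ makes the output blow up with depth, while $\beta = 1$ leaves the output close to the input, matching the dichotomy stated in the abstract for i.i.d. initializations.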

Author Details

Pierre Marion

Author

Adeline Fermanian

Author

Gérard Biau

Author

Jean-Philippe Vert

Author

Citation Information

APA Format

Pierre Marion, Adeline Fermanian, Gérard Biau & Jean-Philippe Vert. Scaling ResNets in the Large-depth Regime. Journal of Machine Learning Research.

BibTeX Format

@article{JMLR:v26:22-0664,
  author  = {Pierre Marion and Adeline Fermanian and G{{\'e}}rard Biau and Jean-Philippe Vert},
  title   = {Scaling ResNets in the Large-depth Regime},
  journal = {Journal of Machine Learning Research},
  year    = {2025},
  volume  = {26},
  number  = {56},
  pages   = {1--48},
  url     = {http://jmlr.org/papers/v26/22-0664.html}
}
