On Model Identification and Out-of-Sample Prediction of PCR with Applications to Synthetic Controls
Paper Information
Journal: Journal of Machine Learning Research
Added to Tracker: Sep 08, 2025
Abstract
We analyze principal component regression (PCR) in a high-dimensional error-in-variables setting with fixed design. Under suitable conditions, we show that PCR consistently identifies the unique model with minimum $\ell_2$-norm. These results enable us to establish non-asymptotic out-of-sample prediction guarantees that improve upon the best known rates. In the course of our analysis, we introduce a natural linear algebraic condition between the in- and out-of-sample covariates, which allows us to avoid distributional assumptions for out-of-sample predictions. Our simulations illustrate the importance of this condition for generalization, even under covariate shifts. Accordingly, we construct a hypothesis test to check when this condition holds in practice. As a byproduct, our results also lead to novel results for the synthetic controls literature, a leading approach for policy evaluation. To the best of our knowledge, our prediction guarantees for the fixed design setting have been elusive in both the high-dimensional error-in-variables and synthetic controls literatures.
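As a rough illustration of the estimator the abstract refers to, the sketch below implements rank-k PCR with NumPy: the noisy covariate matrix is truncated to its top k singular components, and the response is then regressed on that low-rank approximation. The function name `pcr_fit`, the synthetic data, the noise levels, and the assumption that the rank `k` is known are all illustrative choices made here, not details taken from the paper.

```python
import numpy as np

def pcr_fit(Z, y, k):
    """Hypothetical helper: rank-k principal component regression.

    Keeps the top-k singular components of Z and returns the least-squares
    coefficient vector restricted to their row space, which is the
    minimum l2-norm solution for the rank-k approximation of Z.
    """
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    # Pseudoinverse of the rank-k approximation applied to y.
    return Vt_k.T @ ((U_k.T @ y) / s_k)

# Synthetic setting (assumed for illustration): low-rank covariates X,
# observed only through a noisy version Z, with more features than samples.
rng = np.random.default_rng(0)
n, p, r = 100, 200, 3
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))
beta_true = rng.normal(size=p)
y = X @ beta_true + 0.1 * rng.normal(size=n)
Z = X + 0.1 * rng.normal(size=(n, p))

beta_hat = pcr_fit(Z, y, k=r)
print("in-sample RMSE:", np.sqrt(np.mean((Z @ beta_hat - y) ** 2)))
```

Because `p > n` here, many coefficient vectors fit the data equally well; the estimate returned by `pcr_fit` lies in the row space of the truncated covariates, which is what makes it the minimum-norm choice among them.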
Author Details
Devavrat Shah (Author)
Anish Agarwal (Author)
Dennis Shen (Author)
Research Topics & Keywords
Statistical Learning (Research Area)
Citation Information
APA Format
Devavrat Shah, Anish Agarwal, & Dennis Shen. On Model Identification and Out-of-Sample Prediction of PCR with Applications to Synthetic Controls. Journal of Machine Learning Research.
BibTeX Format
@article{paper538,
  title = {On Model Identification and Out-of-Sample Prediction of PCR with Applications to Synthetic Controls},
  author = {Devavrat Shah and Anish Agarwal and Dennis Shen},
  journal = {Journal of Machine Learning Research},
  url = {https://www.jmlr.org/papers/v26/23-0102.html}
}