Representation Learning in Linear Factor Models

Joint with José Luis Montiel Olea

Abstract: In this work, we analyze recent theoretical developments in the representation learning literature through the lens of a linear Gaussian factor model. First, we derive sufficient representations, defined as functions of the covariates that, once conditioned on, render the outcome variable and the covariates independent. Then, we study the theoretical properties of these representations and establish their asymptotic invariance: the dependence of the representations on the factors' measurement error vanishes as the dimension of the covariates goes to infinity. Finally, we use a decision-theoretic approach to understand the extent to which representations are useful for solving downstream tasks. We show that the conditional mean of the outcome variable given the covariates is an asymptotically invariant and sufficient representation that can solve any downstream task efficiently, not only prediction.
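The asymptotic-invariance claim can be illustrated with a stylized simulation. The sketch below assumes one particular parameterization of a linear Gaussian factor model that the abstract does not spell out: factors f ~ N(0, I_k), covariates X = Lambda f + u with Gaussian measurement error u, and an outcome with coefficient vector beta, so that the infeasible target is beta'f. Under these assumptions the conditional mean E[Y | X] = beta' Lambda' (Lambda Lambda' + sigma_u^2 I_p)^{-1} X is a sufficient representation, and its distance from beta'f should shrink as the number of covariates p grows. All names and parameter values in the code are illustrative, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

def rmse_of_conditional_mean(p, k=2, n=2000, sigma_u=1.0):
    # Hypothetical parameterization (an assumption for this sketch):
    # factors f ~ N(0, I_k), loadings Lambda, covariates X = Lambda f + u
    # with u ~ N(0, sigma_u^2 I_p), and outcome coefficients beta.
    Lambda = rng.normal(size=(p, k))
    beta = rng.normal(size=k)

    f = rng.normal(size=(n, k))              # latent factors
    u = sigma_u * rng.normal(size=(n, p))    # measurement error
    X = f @ Lambda.T + u                     # observed covariates

    # Under joint normality, the sufficient representation is the conditional mean
    # E[Y | X] = beta' Lambda' (Lambda Lambda' + sigma_u^2 I_p)^{-1} X;
    # note that forming it does not require drawing the outcome Y itself.
    Sigma_X = Lambda @ Lambda.T + (sigma_u ** 2) * np.eye(p)
    weights = np.linalg.solve(Sigma_X, Lambda @ beta)
    representation = X @ weights             # E[Y | X_i] for each observation

    target = f @ beta                        # infeasible target beta' f
    return np.sqrt(np.mean((representation - target) ** 2))

# The gap between the feasible representation and the infeasible target should
# shrink as p grows, illustrating asymptotic invariance to measurement error.
for p in (10, 100, 1000):
    print(f"p = {p:5d}, RMSE = {rmse_of_conditional_mean(p):.3f}")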

Amilcar Velez
Ph.D. Candidate in Economics

I am a Ph.D. candidate in Economics at Northwestern University on the 2024-2025 job market.