(Non-)Convergence Results for Predictive Coding Networks
Simon Frieder and Thomas Lukasiewicz
Abstract
Predictive coding networks (PCNs) are (un)supervised learning models, originating in neuroscience, that approximate how the brain works. One major open problem around PCNs is their convergence behavior. In this paper, we use dynamical systems theory to formally investigate the convergence of PCNs as they are used in machine learning. In doing so, we put their theory on a firm, rigorous basis by developing a precise mathematical framework for PCNs in their prediction and training stages, and we show that, for sufficiently small weights and initializations, PCNs converge in both stages for any input. We thereby provide the theoretical assurance that previous implementations, whose convergence was assessed solely by numerical experiments, indeed capture the correct behavior of PCNs. Outside of this regime of small weights and small initializations, we show via a counterexample that PCNs can diverge, contrary to common beliefs held in the community. This is achieved by identifying a Neimark-Sacker bifurcation in a toy PCN model, which gives rise to an unstable fixed point and an invariant curve around it.