Tighter Bounds on the Expressivity of Transformer Encoders
David Chiang (Notre Dame)
- 14:00, 10th October 2023 (Week 1, Michaelmas Term 2023), Online
Characterizing the expressivity of neural networks in terms of better-understood formal systems has the potential to yield new insights into the power and limitations of these networks. Doing so for transformers remains an active area of research. Previous work has focused mainly on automata theory and circuit complexity, and on obtaining either lower bounds (what transformers can express) or upper bounds (what transformers cannot express). We describe our ongoing effort to relate the expressivity of transformers to that of various logics. Our first major step is to define a variant of first-order logic with counting quantifiers, which we call FOC[+;MOD], and to show that it is intermediate in expressivity between two variants of transformer encoders. Namely, any language that can be recognized by a fixed-precision transformer encoder can be defined in FOC[+;MOD], and any language that can be defined in FOC[+;MOD] can be recognized by an infinite-precision transformer encoder. Together, these two results bring us much closer than before to an exact characterization of the languages that transformer encoders recognize.
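For readers who prefer symbols, the sketch below restates the abstract's two results as containments between language classes. The shorthand FPT, IPT, and L(·), and the counting-quantifier example in the comments, are our own illustrative notation and are not taken from the talk page or the paper's exact syntax.

```latex
% A minimal sketch of the abstract's two results, written as containments.
% FPT, IPT, and \mathcal{L}(\cdot) are assumed shorthand for this sketch.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Let $\mathrm{FPT}$ denote the class of languages recognized by fixed-precision
transformer encoders, $\mathrm{IPT}$ the class recognized by infinite-precision
transformer encoders, and $\mathcal{L}(\text{FOC}[+;\mathrm{MOD}])$ the class of
languages definable in $\text{FOC}[+;\mathrm{MOD}]$. The two results say
\[
  \mathrm{FPT} \;\subseteq\; \mathcal{L}(\text{FOC}[+;\mathrm{MOD}])
  \;\subseteq\; \mathrm{IPT}.
\]
% Schematic illustration of a counting-quantifier definition, in generic
% notation (the paper's exact syntax may differ): "more a's than b's" as
%   \exists^{=x} p.\, Q_a(p) \wedge \exists^{=y} p.\, Q_b(p) \wedge x > y,
% where x and y count the positions p carrying the symbols a and b.
\end{document}
```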
Paper:
David Chiang, Peter Cholak, and Anand Pillay. Tighter bounds on the expressivity of transformer encoders. In Proc. ICML, 2023. URL: https://arxiv.org/abs/2301.10743.