Scalable Equilibrium Computation in n-Player General-Sum Games

Supervisor

Jiarui Gan

Suitable for

MSc in Advanced Computer Science

Abstract

Prerequisites: Computational Game Theory, Algorithm Design and Analysis, Linear Programming, Python

Background

Computing equilibria—such as Nash equilibria (NE) or their refinements—is a fundamental challenge in game theory and artificial intelligence. Traditional methods can become intractable as the number of agents and strategy spaces increase. To address this, Policy Space Response Oracles (PSRO) and its extension Joint Policy Space Response Oracles (JPSRO) have emerged as powerful frameworks [1,2]. These algorithms iteratively expand the strategy space by computing best responses and refining equilibrium approximations [3]. PSRO considers best responses for each player independently, while JPSRO computes joint best responses, often leading to more coordinated and efficient outcomes.
However, there is room to further improve JPSRO by introducing more structured methods for joint strategy generation and leveraging scalable optimisation techniques. The goal of this project is to investigate these methods, establish their theoretical underpinnings, implement the algorithms, and demonstrate their efficiency through empirical experiments. In doing so, we aim to develop practical tools for equilibrium computation in multi-agent games.

Focus

Theoretical Algorithm Design
- - Develop a novel scalable algorithm for computing equilibria of large, general-sum n-player games. Prove convergence theorems, including bounds on convergence speed and approximation quality.
- - Investigate additional algorithmic properties regarding the scalability of the approach.

Empirical Implementation and Evaluation

- - Implement the proposed algorithm, alongside relevant benchmarks. Compare against established baselines to demonstrate computational efficiency and enhanced coordination.
- - Provide documentation to guide future research on scalable equilibrium computation.

Method

The theoretical part of the work will be proof-based. The empirical part uses Python to implement the algorithm. The following skills are essential for this project:
- Familiarity with game theory, algorithm design, and optimisation methods like linear programming.
- Comfort with formal proofs and mathematical writing.
- Proficiency in Python.

References

[1] Lanctot, Marc, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, and Thore Graepel. "A unified game-theoretic approach to multiagent reinforcement learning." Advances in neural information processing systems 30 (2017).

[2] Marris, Luke, Paul Muller, Marc Lanctot, Karl Tuyls, and Thore Graepel. "Multi-agent training beyond zero-sum with correlated equilibrium meta-solvers." In International Conference on Machine Learning, pp. 7480-7491. PMLR, 2021.

[3] Bighashdel, Ariyan, Yongzhao Wang, Stephen McAleer, Rahul Savani, and Frans A. Oliehoek. "Policy space response oracles: A survey." arXiv preprint arXiv:2403.02227 (2024).