Improving Objectives in Competitive Optimization

Daniele Dell'Erba (University of Liverpool)

15:00 27th March 2025
Bill Roscoe LT (112) + https://cs-ox-ac-uk.zoom.us/j/94462826601

Competitive optimization is a classical framework for computing optimal solutions in interactive environments, with applications in machine learning, operations research, automata theory, and control theory. Two-player stochastic games with reachability and discounted objectives serve as foundational modelling frameworks for competitive optimization and have been extensively studied over the past 60 years.

Key research challenges in competitive optimization include designing efficient algorithms and formalizing objectives in a way that these algorithms can effectively process. In this talk, I will address both of these challenges by presenting (i) a novel algorithm for solving stochastic games based on iterative objective improvement and (ii) a multi-valued automata learning algorithm and tool for extracting formal objectives from demonstrations.

The first result introduces a new approach to solving stochastic games, which are traditionally tackled using value iteration or strategy improvement algorithms. I will present an alternative approach, called objective improvement, in which the search for an optimal policy is driven by an objective function defined over both players’ strategies. When the current strategies are suboptimal, the objective function is updated to move closer to optimal values. This method guarantees convergence and treats both players symmetrically.
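For readers unfamiliar with the traditional baseline mentioned above, the following is a minimal sketch of value iteration, not of the objective improvement method presented in the talk. It assumes a simplified setting: a deterministic, turn-based game with discounted payoffs, where each state is owned by a maximizer or a minimizer. The names `value_iteration`, `owner`, and `edges` are hypothetical and chosen only for illustration.

```python
def value_iteration(owner, edges, discount=0.9, tol=1e-9, max_iters=10_000):
    """Approximate the values of a turn-based discounted game.

    owner: dict mapping each state to "max" or "min" (who moves there).
    edges: dict mapping each state to a list of (successor, weight) pairs.
    """
    values = {state: 0.0 for state in owner}
    for _ in range(max_iters):
        new_values = {}
        for state in owner:
            # Value of each move: immediate weight plus discounted future value.
            candidates = [w + discount * values[t] for (t, w) in edges[state]]
            if owner[state] == "max":
                new_values[state] = max(candidates)
            else:
                new_values[state] = min(candidates)
        # Largest change in this sweep; stop once the update is negligible.
        delta = max(abs(new_values[s] - values[s]) for s in owner)
        values = new_values
        if delta < tol:
            break
    return values


# Hypothetical two-state game: "a" belongs to the maximizer, "b" to the minimizer.
owner = {"a": "max", "b": "min"}
edges = {
    "a": [("a", 1.0), ("b", 0.0)],
    "b": [("a", 0.0), ("b", 2.0)],
}
game_values = value_iteration(owner, edges)
```

Value iteration updates only the value estimates and treats the two players through the max and min operators; the objective improvement approach described above instead works on an objective defined over both players' strategies.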

The second result focuses on learning automata, which play a critical role in synthesizing formal objectives for competitive optimization. Instead of constructing models from predefined logical formulae, automata learning begins with sets of positive and negative demonstrations. I will introduce DFAMiner, a passive learning tool that employs three-valued deterministic finite automata as an intermediate data structure between the samples and the SAT formula used to infer the model.
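To make the idea of a three-valued automaton concrete, here is a minimal sketch that builds a prefix tree over the samples in which every state is labelled accept, reject, or don't-care. It is an illustration under stated assumptions only: it is not DFAMiner's implementation, it performs no minimization, and it does not produce a SAT encoding. All function and constant names are hypothetical.

```python
ACCEPT, REJECT, DONT_CARE = "accept", "reject", "dont_care"


def build_three_valued_prefix_tree(positives, negatives):
    """Build a tree-shaped three-valued DFA from labelled sample words.

    Returns (transitions, labels): transitions[q] maps a symbol to the
    successor state, labels[q] is ACCEPT, REJECT, or DONT_CARE.
    State 0 is the initial state (the empty prefix).
    """
    transitions = [{}]
    labels = [DONT_CARE]

    def insert(word, label):
        state = 0
        for symbol in word:
            if symbol not in transitions[state]:
                # Create a fresh state for an unseen prefix.
                transitions.append({})
                labels.append(DONT_CARE)
                transitions[state][symbol] = len(transitions) - 1
            state = transitions[state][symbol]
        if labels[state] not in (DONT_CARE, label):
            raise ValueError(f"word {word!r} is both positive and negative")
        labels[state] = label

    for word in positives:
        insert(word, ACCEPT)
    for word in negatives:
        insert(word, REJECT)
    return transitions, labels


def classify(transitions, labels, word):
    """Return the label the three-valued automaton assigns to a word."""
    state = 0
    for symbol in word:
        if symbol not in transitions[state]:
            return DONT_CARE  # The samples say nothing about this word.
        state = transitions[state][symbol]
    return labels[state]
```

The don't-care label marks words on which the samples impose no constraint, so any two-valued automaton that agrees with the accept and reject labels is consistent with the demonstrations.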

Much of this work is based on joint research with my postdoctoral supervisor, Sven Schewe, at the University of Liverpool.