Safety and Robustness in Agentic Systems
Abstract
Agentic systems increasingly perform tasks in graph-based environments, such as managing a calendar. In these graphs, nodes represent states (e.g., a scheduled meeting or an empty time slot) and edges represent actions that transition between states (e.g., adding a meeting, rescheduling an event, or canceling an appointment). Each node carries a label, potentially assigned by another LLM-based safety function, indicating whether it is safe to execute an action leading to that state. To study these systems rigorously, we aim to reformulate the problem as a reinforcement learning (RL) problem in which the agent learns policies that navigate the graph safely while optimizing task performance. This project will benchmark safety risks, explore strategies to prevent unsafe transitions, and guide the development of robust, risk-aware decision-making. It is designed to lead to a high-quality publication, and we are seeking a highly motivated student to contribute.
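To make the setup concrete, the following is a minimal sketch of such a safety-labeled graph environment with an RL-style step interface. The environment class, state and action names, and reward values are all hypothetical illustrations chosen for this example; they are not part of the proposal.

```python
# Illustrative sketch only: a toy safety-labeled graph environment.
# The class name, states, actions, and rewards are hypothetical.
import random

class CalendarGraphEnv:
    """Toy graph MDP: nodes are calendar states, edges are actions.

    Each node carries a safety label (True = safe). Transitioning into
    an unsafe node ends the episode with a penalty, mirroring the
    safety-function labeling described in the abstract.
    """

    def __init__(self):
        # node -> (safety label, {action: next node})
        self.graph = {
            "empty_slot": (True, {"add_meeting": "meeting_scheduled"}),
            "meeting_scheduled": (True, {
                "reschedule": "meeting_moved",
                "cancel": "empty_slot",
                "double_book": "conflict",  # leads to an unsafe state
            }),
            "meeting_moved": (True, {"cancel": "empty_slot"}),
            "conflict": (False, {}),  # labeled unsafe
        }
        self.state = "empty_slot"

    def reset(self):
        self.state = "empty_slot"
        return self.state

    def step(self, action):
        _, actions = self.graph[self.state]
        next_state = actions[action]
        next_safe, _ = self.graph[next_state]
        self.state = next_state
        if not next_safe:
            # Unsafe transition: penalize and terminate the episode.
            return next_state, -10.0, True
        reward = 1.0 if next_state == "meeting_scheduled" else 0.0
        return next_state, reward, False

# Random-policy rollout for demonstration; a learned policy would
# instead be trained to avoid transitions into unsafe nodes.
env = CalendarGraphEnv()
state, done = env.reset(), False
while not done:
    _, actions = env.graph[state]
    if not actions:
        break
    action = random.choice(list(actions))
    state, reward, done = env.step(action)
    print(action, "->", state, "reward:", reward)
```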
We will be working with Guohao Li (co-founder of Eigent AI and CAMEL-AI, a London-based startup) and another industry partner.