An Analysis of Piecewise-Linear and Convex Value Functions for Active Perception POMDPs
Yash Satsangi, Shimon Whiteson and Matthijs T. J. Spaan
Abstract
In active perception tasks, an agent aims to select actions that reduce its uncertainty about a hidden state. While partially observable Markov decision processes (POMDPs) are a natural model for such problems, reward functions that directly penalize uncertainty in the agent’s belief can destroy the piecewise-linear and convex (PWLC) property of the value function on which most POMDP planners rely. This paper analyzes ρ-POMDP and POMDP-IR, two frameworks that restore the PWLC property in active perception tasks. We establish the mathematical equivalence of the two frameworks and show that both admit a decomposition of the maximization performed in the Bellman backup, yielding substantial computational savings. We also present an empirical analysis, based on data from real multi-camera tracking systems, that illustrates these savings and identifies the critical factors affecting the performance of POMDP planners in such tasks.
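For background, a minimal sketch in standard POMDP notation (the symbols $\Gamma$, $b$, $s$, and the entropy-based choice of $\rho$ are illustrative here, not necessarily the paper's own): the PWLC property means the value function is the pointwise maximum over a set of linear functions of the belief ("alpha vectors"), whereas a belief-based reward that penalizes uncertainty, such as negative entropy, is convex but not piecewise-linear.
\[
V(b) \;=\; \max_{\alpha \in \Gamma} \sum_{s \in S} b(s)\,\alpha(s),
\qquad
\rho(b) \;=\; \sum_{s \in S} b(s)\,\log b(s).
\]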