Query Compilation: the View from the Database Side
- 14:00 6th November 2015 ( week 4, Michaelmas Term 2015 )Room 277, Oxford e-Research Centre, 7 Keble Road
We study knowledge compilation for Boolean formulas that are given as groundings of First Order formulas. This problem is motivated by probabilistic databases, where each record in the database is an independent probabilistic event, and the query is given by a SQL expression or, equivalently, a First Order formula. The query's probability can be computed in linear time in the size of the compilation representation, hence the interest in studying the size of such a representation. We consider the "data complexity" setting, where the query is fixed, and the input to the problem consists only of the database instance. We consider several compilation targets, of increasing expressive power: OBDDs, FBDDs, and decision-DNNFs (a subclass of d-DNNFs). For the case of OBDDs we establish a dichotomy theorem for queries in restricted languages FO(\exists, \wedge, \vee) and FO(\forall, \wedge, \vee): for each such query the OBDD is either linear in the size of the database, or grows exponentially, and the complexity can be determined through a simple analysis of the query expression. For the other targets we describe a class of queries for which (a) the decision-DNNF is exponentially large in the size of the database, and (b) the probability of the query can be computed in polynomial time in the size of the database. This suggests that the compilation target decision-DNNF is too weak to capture all tractable cases of probabilistic inference. Our lower bound for decision-DNNF's relies on a translation into FBDD's, which is of independent interest. Joint work with Paul Beame, Abhay Jha, Jerry Li, and Sudeepa Roy.