Compressed Representations of Conjunctive Query Results
- 14:00 22nd March 2018 ( Hilary Term 2018 )051
Relational queries, and in particular join queries, often generate large output results when executed over a huge dataset. In such cases, it is often infeasible to store the whole materialized output if we plan to reuse it further down a data processing pipeline. In this talk, we consider the problem of constructing space-efficient compressed representations of the output of conjunctive queries, with the goal of supporting the efficient access of the intermediate compressed result for a given access pattern. In particular, we will study of an important tradeoff : minimizing the space necessary to store the compressed result, versus minimizing the answer time and delay for an access request over the result. We present a novel parameterized data structure, which can be tuned to trade off space for answer time. This tradeoff allows us to control the space requirement of the data structure precisely, and depends both on the structure of the query and the access pattern. We then show how we can use the data structure in conjunction with query decomposition techniques, in order to efficiently represent the outputs for several classes of conjunctive queries. Finally, we conclude with some exciting open problems in this area.
This is joint work with Shaleen Deep.