Complex Object Querying and Data Science
         
         Supervisor
         
         Suitable for
         
         Abstract
"We will look at query languages for transforming
         nested collections (collections that may themselves contain collections).
         Such languages can be useful for preparing large-scale feature
         data for machine learning algorithms. We have a basic implementation
         of such a language, built on top of the big-data framework Spark.
         The goal of the project is to extend the language with iteration.
         One aim will be to look at how to adapt processing techniques for nested data to support iteration.
         Another, closer to applications, is to use iteration to support additional steps of a data science pipeline,
         such as sampling."