Experiments with Non-parametric Topic Models
- 16:00, 22nd January 2015 (week 1, Hilary Term 2015), Lecture Theater A, Department of Computer Science
This talk will cover some of our recent work on extended topic models that serve as tools in text mining and NLP (and hopefully, later, in IR) when some semantic analysis is required. In some sense our goals are akin to those of Latent Semantic Analysis. Our basic theoretical and algorithmic tool is non-parametric Bayesian methods for reasoning on hierarchies of probability vectors using "block table indicator sampling", a collapsed version of hierarchical Chinese restaurant process sampling. The concepts will be introduced, but not the statistical detail. I will then present parts of our KDD 2014 paper (Experiments with Non-parametric Topic Models), which describes what is currently the best-performing topic model by a number of metrics.
Speaker bio
Prof. Wray Buntine joined Monash University in February 2014 after 7 years at NICTA in Canberra, Australia. He was previously at the Helsinki Institute for Information Technology from 2002, and has also worked at NASA Ames Research Center, the University of California, Berkeley, and Google. He is known for his theoretical and applied work in document and text analysis, data mining and machine learning, and probabilistic methods. He applies probabilistic and non-parametric methods to tasks such as text analysis. In 2009 he was programme co-chair of ECML-PKDD in Bled, Slovenia, and in 2012 he was programme co-chair of ACML in Singapore. He reviews for conferences such as ACML, ECIR, SIGIR, ECML-PKDD, ICML, NIPS, UAI, and KDD, and is on the editorial board of Data Mining and Knowledge Discovery.