Sumedha Gunewardena, Peter Jeavons
Revised October 2003, 11pp.
Many algorithms for finding transcription factor binding sites suffer from a high number of false positive hits as the sensitivity of their predictions is increased. A major contributing factor is the short and degenerate nature of these sites, which results in a low signal-to-noise ratio. To counter this problem, one needs to look beyond the assumption that the bases of a site are independent of one another. We propose a model based on templates designed to capture not only the vertical consensus but also the correlation of individual bases with the other bases of the site.
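
To make the limitation of the base independence assumption concrete, the following minimal Python sketch contrasts a column-wise (vertical) frequency model with a simple measure of dependence between columns. It is an illustration only, not the template model proposed in the report: the toy alignment `sites`, the function names, and the choice of mutual information as the correlation measure are all assumptions introduced here.

```python
import math
from collections import Counter

# Hypothetical toy alignment of binding sites (illustrative, not data from the report).
sites = ["ACGT", "ACGT", "TCGA", "TCGA"]


def column_frequencies(sites, i):
    """Relative base frequencies in column i (the 'vertical' view of the alignment)."""
    counts = Counter(s[i] for s in sites)
    n = len(sites)
    return {base: count / n for base, count in counts.items()}


def independent_probability(sites, candidate):
    """Probability of a candidate site when every column is treated as independent."""
    p = 1.0
    for i, base in enumerate(candidate):
        p *= column_frequencies(sites, i).get(base, 0.0)
    return p


def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of the alignment."""
    n = len(sites)
    p_i = column_frequencies(sites, i)
    p_j = column_frequencies(sites, j)
    pair_counts = Counter((s[i], s[j]) for s in sites)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / (p_i[a] * p_j[b]))
    return mi


if __name__ == "__main__":
    # "ACGA" never occurs in the alignment, yet the independent model scores it
    # as highly as the true sites because it ignores the link between columns 0 and 3.
    print(independent_probability(sites, "ACGA"))
    print(independent_probability(sites, "ACGT"))
    # Columns 0 and 3 always vary together; columns 1 and 2 are constant.
    print(mutual_information(sites, 0, 3))
    print(mutual_information(sites, 1, 2))
```

In this toy alignment the first and last positions always vary together, so a model that multiplies per-column frequencies cannot distinguish observed sites from unobserved combinations of the same bases. A model that also accounts for correlation between positions can, which is the kind of information the abstract says the templates are designed to capture.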