Enhancing the prediction of transcription factor binding sites by incorporating structural properties and nucleotide covariations
Z. Zhang, S. Gunewardena, P. Jeavons
Abstract
A problem faced by many algorithms for finding transcription factor (TF) binding sites is the high number of false positive hits that accompanies any increase in the sensitivity of their predictions. A major contributing factor is the short and degenerate nature of these sites, which results in a low signal-to-noise ratio. To counter this problem, one needs to look beyond the assumption that the individual bases of a TF binding site act independently of one another when binding to a transcription factor. In this paper, we present a new template-based method designed to exploit the discriminatory features, nucleotide polymorphism, and structural homology present in TF binding sites in order to distinguish them from nonbinding sites.