Developing machine learning models for off-target prediction in CRISPR/Cas9 gene editing
Supervisor
Suitable for
Abstract
The CRISPR/Cas9 gene editing system is composed of a Cas9 nuclease and a short oligonucleotide guide RNA (gRNA, or guide) that directs the Cas9 nuclease to the targeted DNA sequence (on-target) through complementary binding. However, the Cas9 nuclease may also cleave off-target genomic DNA sequences that contain mismatches relative to the gRNA, resulting in undesired cleavage. The most obvious factors influencing the off-target cleavage activity of the CRISPR/Cas9 system are the sequence identities of the guide RNA and the off-target DNA. Various 'basic features' derived from these sequences have fueled the development of procedural and machine learning models for off-target cleavage activity prediction, but there are numerous 'non-basic features' (such as the sequence context around the off-target DNA) that may also influence off-target cleavage activity. This project aims to develop novel off-target cleavage activity prediction models using approaches that include, but are not limited to, combining 'basic features' and 'non-basic features' to increase the accuracy with which models predict experimentally measured off-target cleavage activity.
Prerequisites: Recommended for students who have completed a Machine Learning course and have an interest in molecular biology.
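To make the idea of combining 'basic' and 'non-basic' features concrete, the following is a minimal sketch and not the project's actual method. It assumes the guide is written in the DNA alphabet and aligned position by position with the off-target site, uses per-position mismatch flags as the 'basic features', and uses the GC fraction of a flanking sequence as one hypothetical 'non-basic feature'. All function names and the choice of GC content are illustrative assumptions.

```python
# Illustrative sketch only: the function names and the GC-content context
# feature are assumptions, not part of the project description.

def mismatch_features(guide, target):
    """'Basic' features: one 0/1 flag per position (1 where guide and target differ)."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return [int(g != t) for g, t in zip(guide.upper(), target.upper())]


def gc_content(seq):
    """Hypothetical 'non-basic' feature: GC fraction of a flanking sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def build_feature_vector(guide, target, flank):
    """Concatenate mismatch flags, the total mismatch count, and the context feature."""
    flags = mismatch_features(guide, target)
    return [float(f) for f in flags] + [float(sum(flags)), gc_content(flank)]


# Toy 5-nt example (real guides are typically around 20 nt).
vector = build_feature_vector("GACGT", "GATGT", "GGCC")
```

A vector built this way could be passed to any standard regression or classification model; which features and which model family work best is exactly the open question the project poses.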