A deep learning-based model, DeepSpCas9, to predict SpCas9 activity
Phys.org | November 22, 2019
In a new report in Science Advances, Hui Kwon Kim and an interdisciplinary team of researchers from departments of Pharmacology, Electrical and Computer Engineering, Medical Sciences, Nanomedicine and Bioinformatics in the Republic of Korea evaluated the activity of SpCas9, the RNA-guided Cas9 endonuclease from Streptococcus pyogenes (a bacterial enzyme that cuts DNA and is used for genome editing). They used a high-throughput approach with 12,832 target sequences, based on a human cell library, to build a deep learning model that predicts SpCas9 activity. The library was built from oligonucleotides (short synthetic DNA strands), each containing a target sequence paired with a corresponding guide sequence that encodes a single-guide RNA (sgRNA), which directs the Cas9 protein to bind and cleave a specific DNA sequence for genome editing. The team trained a deep learning model on this large dataset of SpCas9-induced indel (insertion or deletion) frequencies to develop an SpCas9 activity prediction model named DeepSpCas9, which is now available online. When the team tested the software against independently generated datasets, it showed high generalization performance, i.e., the model adapted well to new, previously unseen data.
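To make the quantities in this summary concrete, here is a minimal Python sketch. It is not the authors' code. It shows how an indel frequency could be computed from read counts, how a target sequence could be turned into numeric input for a neural network, and how predictions could be compared with measurements on an independent dataset. The function names, the one-hot encoding, and the use of Spearman rank correlation as the comparison metric are all assumptions made for illustration. The summary above does not specify any of them.

```python
NUCLEOTIDES = "ACGT"


def indel_frequency(indel_reads, total_reads):
    """Percentage of sequencing reads that carry an insertion or deletion."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * indel_reads / total_reads


def one_hot_encode(sequence):
    """Encode a DNA sequence as one 4-element row per base (assumed encoding)."""
    rows = []
    for base in sequence.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base: {base!r}")
        rows.append([1 if base == n else 0 for n in NUCLEOTIDES])
    return rows


def _average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(measured, predicted):
    """Spearman correlation: the Pearson correlation of the two rank lists."""
    rx, ry = _average_ranks(measured), _average_ranks(predicted)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

In this sketch, a held-out dataset would supply the measured indel frequencies, a trained model would supply the predicted scores, and a rank correlation close to 1 would indicate that the model orders unseen guide RNAs by activity much as the experiment does.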