Enhancing Predictive Accuracy of CRISPR-Cas9 on-target efficiency using Deep Learning and Active Learning Optimization for Small Datasets

Masoud Madavifara, Fatemeh Zare-Mirakabad

February 2025

Abstract

The CRISPR-Cas9 gene editing system has revolutionized genetics, but predicting sgRNA cleavage efficiency remains a challenge, particularly with small datasets. We present a deep learning framework optimized for small datasets by integrating active learning, which iteratively prioritizes the most informative data points for labeling. Our model outperforms previous methods on benchmark datasets, capturing complex sequence features through domain-specific properties. Active learning reduces the required dataset size, enabling high predictive accuracy even with limited data. This approach provides a scalable and robust solution for improving CRISPR-Cas9 design and precise gene editing across diverse genomic contexts, demonstrating the potential of active learning to enhance deep learning model performance in data-scarce scenarios.
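
The active learning loop mentioned in the abstract can be illustrated with a minimal pool-based sketch. It is not the authors' implementation: the abstract does not state the query criterion or the network architecture, so this example assumes a committee-disagreement criterion and uses a small k-nearest-neighbour committee as a stand-in for the deep model. The names `one_hot`, `knn_predict`, `committee_variance`, `active_learning`, and `toy_oracle` are hypothetical, and the "efficiency" labels are synthetic (GC fraction), not experimental data.

```python
import random
from statistics import mean, pvariance

BASES = "ACGT"

def one_hot(seq):
    """Flatten a nucleotide sequence into a 4*len(seq) one-hot vector."""
    return [1.0 if b == base else 0.0 for b in seq for base in BASES]

def knn_predict(train, x, k=3):
    """Stand-in regressor: average the labels of the k nearest training points."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    return mean(label for _, label in nearest)

def committee_variance(labeled, x, rng, n_members=5):
    """Disagreement among committee members fit on bootstrap resamples."""
    preds = []
    for _ in range(n_members):
        resample = [rng.choice(labeled) for _ in labeled]
        preds.append(knn_predict(resample, x))
    return pvariance(preds)

def toy_oracle(seq):
    """Synthetic label (GC fraction); a placeholder for a measured efficiency."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def active_learning(pool, oracle, n_initial=5, n_rounds=10, seed=0):
    """Label a random seed set, then repeatedly query the most uncertain sequence."""
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    queried = pool[:n_initial]
    unlabeled = pool[n_initial:]
    labeled = [(one_hot(s), oracle(s)) for s in queried]
    for _ in range(n_rounds):
        if not unlabeled:
            break
        # Score every unlabeled candidate by committee disagreement.
        scores = [committee_variance(labeled, one_hot(s), rng) for s in unlabeled]
        best = max(range(len(unlabeled)), key=scores.__getitem__)
        seq = unlabeled.pop(best)
        labeled.append((one_hot(seq), oracle(seq)))  # "labeling" step
        queried.append(seq)
    return queried, labeled
```

In a realistic setting, the oracle would be a wet-lab measurement or a held-out labeled dataset, and the committee would be replaced by an uncertainty estimate from the trained network; both substitutions are outside what the abstract specifies.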

Type

Conference paper

Publication

4th International & 13th Iranian Conference on Bioinformatics

Fatemeh Zare-Mirakabad

Associate Professor