Machine Learning and Artificial Intelligence Approaches for the Design of Potent Therapeutic siRNAs
Files
- Embargoed until 2027-09-10
Abstract
Small interfering RNAs (siRNAs) represent a transformative class of therapeutics, with the potential to target previously undruggable genes. However, identifying potent silencing siRNAs remains a challenge, particularly in the context of fully chemically modified scaffolds required for therapeutic applications. This dissertation addresses these challenges by developing, validating, and deploying the first AI-powered design platform specifically optimized for the development of fully modified therapeutic siRNAs.
To build toward this solution, we systematically developed and evaluated siRNA efficacy prediction models across a series of frameworks, including linear models, supervised and semi-supervised machine learning, and deep learning-based sequence feature encoding strategies. We assembled what will be the largest publicly available dataset (∼5,000 siRNAs) of fully modified, endogenously evaluated siRNAs and developed a semi-supervised learning model that achieved unprecedented predictive performance (F1-score ∼0.7), supported by experimental validation. Applying explainable AI techniques to these models, we uncover mechanistic insights driving predictions that agree with the current consensus on siRNA mechanisms. We also explore the impacts of chemical modifications and assay context on siRNA efficacy. Through semi-supervised modeling, we demonstrate that models trained on human data fail to generalize to mouse data, highlighting a potential role of sequence context in siRNA targeting.
To ensure accessibility, we deployed our model as a user-friendly, publicly available web tool for therapeutic siRNA design (https://kmonopoli.github.io/sirna). This work not only advances the field of siRNA prediction but also establishes a new paradigm for AI-driven oligonucleotide drug discovery, offering researchers a robust, validated platform for rapid preclinical candidate selection.
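For readers unfamiliar with the F1-score reported in the abstract, it is the harmonic mean of precision and recall for a binary classifier. The following is a minimal sketch of that standard definition only. The function name `f1_score`, the label convention (1 = potent siRNA, 0 = not potent), and the example labels are illustrative assumptions, not data or code from the dissertation.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score (harmonic mean of precision and recall).

    Assumption: labels are binary and `positive` marks the class of
    interest (here, hypothetically, a "potent" siRNA).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # With no true positives, precision and recall are zero or undefined;
    # report 0.0 by convention.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight siRNAs (1 = potent, 0 = not potent).
observed = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [1, 1, 0, 0, 1, 1, 0, 0]
score = f1_score(observed, predicted)
```

In this made-up example there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1-score is 0.75.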