Brain-Soap for Mad Cows: An Exploration into Computational RNA Aptamer Prediction for Disordered Prion Proteins
Jordan Henry
Mentor: Dr. David Bell, Advanced Biomedical Computational Sciences Division, Frederick National Labs
Date/Time: August 22nd, 2023 at 1pm.
Abstract: RNA aptamers are sequences selected for high binding affinity to a target protein, with a wide array of potential diagnostic and therapeutic applications. Aptamers have been explored as potential treatments for diseases that involve aberrant protein interactions, notably Alzheimer’s Disease. Here we explore aptamer design against Bovine Prion Protein (bPrP), the protein implicated in ‘Mad Cow Disease’. It is a relatively small, disordered surface protein, which makes it a distinctive target for a potential aptamer-based therapeutic.
Aptamers are traditionally designed via SELEX, an in vitro selection process starting from random libraries as large as 1e15 sequences. Such a process can be tedious and expensive. As information about aptamer-protein interactions accumulates, computational methods should become an increasingly viable alternative. This project builds on previous work toward a computational approach to aptamer prediction, based primarily on free energy calculations.
Starting from an experimental aptamer structure, ‘2RU7’, we successfully modeled the aptamer-protein complex via molecular dynamics (MD) simulations. Docking simulations confirmed that the structure binds to a pair of lysine-rich regions in PrPc, particularly the motif ‘KPSKPK’, which is part of a disordered region of the protein.
We identified a program called ‘Apta-MCTS’, which generates candidate sequences using a scoring function based on current data on aptamer-protein interactions, together with a Monte Carlo tree search algorithm. The program is non-deterministic and produces a new batch of candidates every time it is run. After using the program to generate a large number of candidates, the bulk of the work lay in understanding how these candidates are distributed relative to each other, and how to separate out a reasonable number of them for analysis via docking and MD simulations. To this end, an algorithm was developed to: (a) combine the output of multiple batches; (b) determine their pairwise biological edit-distances via the ‘TN93’ package; (c) determine a number of overlapping ‘neighborhoods’ around each sequence; and (d) iteratively cull sequences which represent local minima, being the least promising candidates in their own neighborhood. The aim of such an algorithm is to optimize for a candidate list that is promising with respect to potential affinity to the target, diverse with respect to related sequences, and tailorable to the computational limitations of the project through a number of optional parameters.
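The four culling steps can be illustrated with a minimal Python sketch. This is not the project's actual implementation: the function names, the `radius` and `target` parameters, and the toy sequences and scores are all hypothetical. It assumes equal-length sequences, assumes a higher score means a more promising candidate, and replaces the TN93 distance with a plain fraction of mismatched positions as a stand-in.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of mismatched positions (a simple stand-in for a TN93 distance)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def combine_batches(batches):
    """Step a: merge batches of (sequence, score), keeping the best score per sequence."""
    best = {}
    for batch in batches:
        for seq, score in batch:
            if seq not in best or score > best[seq]:
                best[seq] = score
    return best


def build_neighborhoods(seqs, radius):
    """Steps b and c: link every pair of sequences whose distance is within radius."""
    nbrs = {s: set() for s in seqs}
    for a, b in combinations(seqs, 2):
        if p_distance(a, b) <= radius:
            nbrs[a].add(b)
            nbrs[b].add(a)
    return nbrs


def cull(scores, radius, target):
    """Step d: repeatedly drop the weakest local minimum until target size is reached."""
    alive = dict(scores)
    nbrs = build_neighborhoods(list(alive), radius)
    while len(alive) > target:
        # A local minimum has at least one surviving neighbor and scores
        # no higher than any of its surviving neighbors.
        minima = [
            s for s in alive
            if any(n in alive for n in nbrs[s])
            and all(alive[s] <= alive[n] for n in nbrs[s] if n in alive)
        ]
        if not minima:
            break  # every survivor is isolated; nothing left to cull
        worst = min(minima, key=lambda s: (alive[s], s))
        del alive[worst]
    return alive


# Toy data: two batches with one repeated sequence and two sequence families.
batches = [
    [("GGAGGAGG", 0.90), ("GGAGGAGA", 0.70), ("UUCCUUCC", 0.60)],
    [("GGAGGAGA", 0.75), ("GGAGGACC", 0.50), ("UUCCUUCA", 0.65)],
]
combined = combine_batches(batches)
selected = cull(combined, radius=0.25, target=2)
print(selected)
```

On this toy input the sketch keeps the best-scoring member of each of the two sequence families, which is the intended balance between promise and diversity. In this simplified version, `radius` and `target` play the role of the optional parameters that tailor the list size to the available compute.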
Using this approach, we reduced a set of ~10,000 candidates to 45 candidates of interest. Further testing narrowed these to 2 candidates showing good affinity to PrPc. These candidates could be investigated further via in vitro experiments. Future work could automate candidate selection from simulation data.