Article

Predicting aptamer sequences that interact with target proteins using an aptamer-protein interaction classifier and a Monte Carlo tree search approach

Journal

PLOS ONE
Volume 16, Issue 6, Pages -

Publisher

Public Library of Science
DOI: 10.1371/journal.pone.0253760

Funding

  1. Institute of Information & Communications Technology Planning & Evaluation (IITP) - Korea government (MSIT) [2020-0-01450]
  2. National Research Foundation of Korea (NRF) - Korea government (MSIT) [2021R1A2C2010775]
  3. National Research Foundation of Korea [2021R1A2C2010775] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

A machine learning approach is proposed for selecting candidate aptamer sequences of varying sizes with high binding affinity for a specific target protein, avoiding the fixed-size limitation of earlier in silico methods. The approach applies the Monte Carlo tree search (MCTS) algorithm, with a tree structure designed for nucleotide sequence sampling and a score function based on an aptamer-protein interaction classifier, and assesses the generated sequences through docking simulations. On the validation datasets, the generated sequences achieved ZDOCK docking scores similar to or better than those of known aptamers.

Oligonucleotide-based aptamers, which are single-stranded fragments that fold into three-dimensional structures, have distinctive characteristics with respect to size, toxicity, and permeability. Accordingly, aptamers are advantageous for diagnosis and treatment, and they can be produced through relatively simple experiments. Systematic evolution of ligands by exponential enrichment (SELEX) is one of the most widely used experimental methods for generating aptamers; however, it is expensive and time-consuming. To reduce these costs, recent studies have used in silico approaches, such as aptamer-protein interaction (API) classifiers that use sequence patterns to determine the binding affinity between RNA aptamers and proteins. Some of these methods generate candidate RNA aptamer sequences that bind to a target protein, but they are limited to producing candidates of a specific size. In this study, we present a machine learning approach for selecting candidate sequences of various sizes that have a high binding affinity for a specific sequence of a target protein. We applied the Monte Carlo tree search (MCTS) algorithm to generate the candidate sequences, using a score function based on an API classifier. The tree structure that we designed with MCTS enables nucleotide sequence sampling, and the sampled sequences are potential aptamer candidates. We assessed their quality using docking simulation scores. On our validation datasets, the sequences generated by our model achieved docking scores in ZDOCK simulations similar to or better than those of known aptamers. We expect that our method, which is size-independent and easy to use, can provide insights into searching for an appropriate aptamer sequence for a target protein during the simulation step of SELEX.
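
To make the idea of MCTS-driven sequence sampling concrete, here is a minimal Python sketch of a generic MCTS loop (selection, expansion, simulation, backpropagation) that grows an RNA sequence one nucleotide at a time. It is an illustration under stated assumptions, not the authors' implementation: the names `toy_score`, `Node`, and `mcts_generate` are hypothetical, the tree layout is a plain one-base-per-level design rather than the structure described in the paper, and `toy_score` (GC fraction) is only a placeholder for a trained API classifier's output.

```python
import math
import random

NUCLEOTIDES = "ACGU"  # RNA alphabet


def toy_score(seq: str) -> float:
    """Hypothetical placeholder for an API classifier score in [0, 1].

    Returns the GC fraction; a real pipeline would call a trained classifier.
    """
    if not seq:
        return 0.0
    return sum(ch in "GC" for ch in seq) / len(seq)


class Node:
    """One tree node; `seq` is the sequence prefix built so far."""

    def __init__(self, seq, parent=None):
        self.seq = seq
        self.parent = parent
        self.children = {}  # next base -> child Node
        self.visits = 0
        self.total = 0.0    # sum of rewards backed up through this node

    def ucb(self, c):
        """UCB1 value: mean reward plus an exploration bonus."""
        if self.visits == 0:
            return math.inf
        mean = self.total / self.visits
        return mean + c * math.sqrt(math.log(self.parent.visits) / self.visits)


def mcts_generate(score_fn, length, iterations=200, c=1.4, seed=0):
    """Search for a sequence of `length` bases that scores highly under `score_fn`."""
    rng = random.Random(seed)
    root = Node("")
    best_seq, best_score = "", -math.inf
    for _ in range(iterations):
        node = root
        # Selection: follow the highest-UCB child while the node is fully expanded.
        while len(node.seq) < length and len(node.children) == len(NUCLEOTIDES):
            node = max(node.children.values(), key=lambda n: n.ucb(c))
        # Expansion: add one untried base unless the prefix is already full length.
        if len(node.seq) < length:
            base = rng.choice([b for b in NUCLEOTIDES if b not in node.children])
            child = Node(node.seq + base, parent=node)
            node.children[base] = child
            node = child
        # Simulation: complete the prefix with random bases and score the result.
        tail = "".join(rng.choice(NUCLEOTIDES) for _ in range(length - len(node.seq)))
        rollout = node.seq + tail
        reward = score_fn(rollout)
        if reward > best_score:
            best_seq, best_score = rollout, reward
        # Backpropagation: update statistics from the leaf back to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best_seq, best_score


if __name__ == "__main__":
    for n in (8, 20):  # the target length is just a parameter
        print(n, *mcts_generate(toy_score, length=n))
```

Because the target length is an ordinary argument, the same search can be run for candidates of different sizes; swapping `toy_score` for a real classifier that scores a candidate against a target protein is the only change the sketch would need to move toward the setting described above.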
