4.5 Review

A survey on protein-DNA-binding sites in computational biology

Journal

BRIEFINGS IN FUNCTIONAL GENOMICS
Volume 21, Issue 5, Pages 357-375

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bfgp/elac009

Keywords

DNA-protein-binding sites; bioinformatics; transcription factor binding site; machine learning; deep learning; convolutional neural network; recurrent neural networks

Funding

  1. Shandong Provincial Natural Science Foundation [ZR2021MF036]
  2. Natural Science Foundation of China [61902337]
  3. Xuzhou Science and Technology Plan Project [KC21047]
  4. Jiangsu Provincial Natural Science Foundation [SBK2019040953]
  5. Natural Science Fund for Colleges and Universities in Jiangsu Province [19KJB520016]
  6. Young Talents of Science and Technology in Jiangsu

This article provides an overview of the computational and experimental methods used in the field of protein-DNA-binding site prediction. The methods based on traditional machine learning and deep learning are discussed, helping researchers better understand this field.
Transcription factors are key cellular components in the control of gene expression. Transcription factor binding sites are locations where transcription factors specifically recognize DNA sequences, targeting gene-specific regions and recruiting transcription factors or chromatin regulators to fine-tune spatiotemporal gene regulation. As common proteins, transcription factors play an important role in the activities of living organisms. As the volume of protein sequence data grows, effectively predicting protein structure and function has become an urgent problem. At present, protein-DNA-binding site prediction methods are based on either traditional machine learning algorithms or deep learning algorithms. Early work typically relied on traditional machine learning algorithms to predict protein-DNA-binding sites. In recent years, deep learning methods that predict protein-DNA-binding sites from sequence data have achieved remarkable success. Various statistical and machine learning methods for predicting the function of DNA-binding proteins have been proposed and continuously improved. Existing deep learning methods for predicting protein-DNA-binding sites can be roughly divided into three categories: convolutional neural networks (CNN), recurrent neural networks (RNN) and hybrid CNN-RNN networks. The purpose of this review is to provide an overview of the computational and experimental methods currently applied in the field of protein-DNA-binding site prediction. It introduces traditional machine learning and deep learning methods for protein-DNA-binding site prediction in terms of the data processing characteristics of existing learning frameworks and the differences between basic learning model frameworks. Existing methods in this area are relatively simple compared with those in natural language processing, computer vision, computer graphics and other fields. Therefore, a summary of existing protein-DNA-binding site prediction methods will help researchers better understand this field.
