Article

Biological sequence modeling with convolutional kernel networks

Journal

BIOINFORMATICS
Volume 35, Issue 18, Pages 3294-3302

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btz094

Funding

  1. ANR [ANR-14-CE23-0003-01, ANR-17-CE23-0011-01]
  2. ERC [714381]
  3. Agence Nationale de la Recherche (ANR) [ANR-17-CE23-0011]

Motivation: The growing number of annotated biological sequences makes it possible to learn genotype-phenotype relationships from data with increasingly high accuracy. When large quantities of labeled samples are available for training, convolutional neural networks can predict the phenotype of unannotated sequences with good accuracy. Unfortunately, their performance degrades on medium- or small-scale datasets, which calls for new data-efficient approaches.

Results: We introduce a hybrid approach between convolutional neural networks and kernel methods to model biological sequences. Our method retains the ability of convolutional neural networks to learn data representations adapted to a specific task, while the kernel point of view yields algorithms that perform significantly better when the amount of training data is small. We illustrate these advantages for transcription factor binding prediction and protein homology detection, and we demonstrate that our model is also simple to interpret, which is crucial for discovering predictive motifs in sequences.