Article

Accelerating the Original Profile Kernel

PLOS ONE (2013)

Journal

PLOS ONE

Volume 8, Issue 6

Publisher

PUBLIC LIBRARY OF SCIENCE

DOI: 10.1371/journal.pone.0068459

Categories

Multidisciplinary Sciences

Funding

Alexander von Humboldt Foundation through the German Ministry for Research and Education (BMBF: Bundesministerium fuer Bildung und Forschung)

Abstract

One of the most accurate multi-class protein classification systems continues to be the profile-based SVM kernel introduced by the Leslie group. Unfortunately, its CPU requirements render it too slow for practical application to large-scale classification tasks. Here, we introduce several software improvements that enable significant acceleration. Using various non-redundant data sets, we demonstrate that our new implementation reaches a maximal speed-up of 14-fold for calculating the same kernel matrix. Some predictions are over 200 times faster, which possibly makes the kernel the top contender in terms of the trade-off between speed and performance. Additionally, we explain how to parallelize various computations and provide an integrative program that reduces creating a production-quality classifier to a single program call. The new implementation is available as a Debian package under a free academic license and does not depend on commercial software. For non-Debian based distributions, the source package ships with a traditional Makefile-based installer. Download and installation instructions can be found at https://rostlab.org/owiki/index.php/Fast_Profile_Kernel. Bugs and other issues may be reported at https://rostlab.org/bugzilla3/enter_bug.cgi?product=fastprofkernel.

Recommended

Review Computer Science, Information Systems

A survey on accelerating technologies for fast network packet processing in Linux environments

Eduardo Freitas, Assis T. de Oliveira Filho, Pedro R. X. do Carmo, Djamel Sadok, Judith Kelner

Summary: The path a packet takes in the Linux Kernel has been established for a long time, but with the introduction of new paradigms, complexity has increased. Fast Packet Processing Frameworks have emerged to solve the issues of low delay and high bandwidth services. However, each technology provides different methods and solutions, leading to different benefits and trade-offs. This work proposes a taxonomy to classify these solutions into hardware, software, and virtualization categories, and evaluates their applicability in real-world scenarios based on four criteria.

COMPUTER COMMUNICATIONS (2022)

Article Computer Science, Information Systems

V-SKP: Vectorized Kernel-Based Structured Kernel Pruning for Accelerating Deep Convolutional Neural Networks

Kwanghyun Koo, Hyun Kim

Summary: In this study, a new vectorized structured kernel pruning method is proposed, which achieves high FLOPs reduction and minimal accuracy degradation while maintaining the weight structure. Experimental results demonstrate significant parameter and FLOPs reduction, as well as real acceleration effects on GPUs, in various networks including ResNet-50.

IEEE ACCESS (2023)

Article Computer Science, Theory & Methods

Accelerating Restarted GMRES With Mixed Precision Arithmetic

Neil Lindquist, Piotr Luszczek, Jack Dongarra

Summary: GMRES is an iterative Krylov solver for sparse, non-symmetric linear equations, where data movement dominates run time. Running GMRES in reduced precision while keeping key operations in full precision improves performance. The mixed-precision approach achieved speedups ranging from 8 to 61% on a GPU-accelerated node, with simpler preconditioners showing higher speedups.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2022)

Article Engineering, Biomedical

Accelerating MR Parameter Mapping Using Nonlinear Compressive Manifold Learning and Regularized Pre-Imaging

Yihang Zhou, Haifeng Wang, Yuanyuan Liu, Dong Liang, Leslie Ying

Summary: This study presents a novel method for reconstructing MR parametric maps from highly undersampled k-space data. By sparsely representing the unknown MR parameter-weighted images in high-dimensional feature space and utilizing low-dimensional manifolds learned from training images, the method achieves improved reconstruction quality through spatial and temporal regularizations.

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING (2022)

Article Computer Science, Theory & Methods

Accelerating Large Sparse Neural Network Inference Using GPU Task Graph Parallelism

Dian-Lun Lin, Tsung-Wei Huang

Summary: This article introduces SNIG, an efficient inference engine for large sparse DNNs. SNIG utilizes highly optimized inference kernels and the power of CUDA Graphs to enable efficient decomposition of model and data parallelisms. It offers a flexible and scalable decomposition strategy. The evaluation on HPEC Sparse DNN Challenge benchmarks shows that SNIG performs well and outperforms the state-of-the-art baseline.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

Accelerating Gaussian Process surrogate modeling using Compositional Kernel Learning and multi-stage sampling framework

Seung-Seop Jin

Summary: A new sequential surrogate modeling method is proposed in this study, integrating the CKL method and PLHS strategy. By efficiently learning complex response surfaces and sequentially generating nested samples, this method can address the difficulties in surrogate modeling.

APPLIED SOFT COMPUTING (2021)

Article Engineering, Electrical & Electronic

Accelerating Hydraulic Fracture Imaging by Deep Transfer Learning

Runren Zhang, Qingtao Sun, Yiqian Mao, Liangze Cui, Yongze Jia, Wei-Feng Huang, Mohsen Ahmadian, Qing Huo Liu

Summary: This article discusses the application of deep transfer learning in hydraulic fracture imaging. By using a two-step approach, training a convolutional neural network with approximated field patterns generated from a simplified model and fine-tuning it with true field patterns from a full model, accurate reconstruction results can be achieved.

IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION (2022)

Article Energy & Fuels

Probability density function forecasting of residential electric vehicles charging profile

Ali Jamali Jahromi, Mohammad Mohammadi, Shahabodin Afrasiabi, Mousa Afrasiabi, Jamshid Aghaei

Summary: This paper presents the main principle of the probability density function forecasting approach in residential electric vehicle (REV) charging profile. A deep learning structure combined with kernel density estimation (KDE) is designed, which includes convolutional layers, gated recurrent unit (GRU), autoregressive (AR) model, and kernel density estimator block. An attention mechanism is integrated to improve the learning ability of the network. Numerical results on actual REV data demonstrate the effectiveness and superiority of the proposed network compared to other methods.

APPLIED ENERGY (2022)

Article Biotechnology & Applied Microbiology

Genome-wide identification and molecular expression profile analysis of FHY3/FAR1 gene family in walnut (Juglans sigillata L.) development

Shengqun Chen, Yingfu Chen, Mei Liang, Shuang Qu, Lianwen Shen, Yajun Zeng, Na Hou

Summary: This study identified 61 FHY3/FAR1 gene family members in walnuts and found that they were unevenly distributed on the chromosomes. These genes were divided into five subclasses and were potentially involved in regulating walnut growth and development. Gene expression analysis showed that some FHY3/FAR1 genes might be associated with walnut kernel ripening and seed coat color formation. Furthermore, promoter analysis revealed certain genes associated with flavonoid biosynthesis and light and MeJA responsiveness. Overall, this study provides valuable insights into the walnut genome and the functions of FHY3/FAR1 genes.

BMC GENOMICS (2023)

Article Automation & Control Systems

Accelerating Sequential Minimal Optimization via Stochastic Subgradient Descent

Bin Gu, Yingying Shan, Xin Quan, Guansheng Zheng

Summary: This paper introduces a generalized framework for accelerating Sequential Minimal Optimization (SMO) using Stochastic Subgradient Descent (SSGD), and explores the effectiveness of this approach through experimental results on various datasets and learning applications.

IEEE TRANSACTIONS ON CYBERNETICS (2021)

Correction Biochemistry & Molecular Biology

Macauba (Acrocomia aculeata) kernel has good protein quality and improves the lipid profile and short chain fatty acids content in Wistar rats (vol 13, pg 11342, 2022)

Fatima Ladeira Mendes Duarte, Barbara Pereira da Silva, Mariana Grancieri, Cintia Tomaz Sant'Ana, Renata Celi Lopes Toledo, Vinicius Parzanini Brilhante de Sao Jose, Sidney Pacheco, Hercia Stampini Duarte Martino, Frederico Augusto Ribeiro de Barros

Summary: The study demonstrates that macauba kernel has high protein quality and can improve lipid profile and short chain fatty acids content in rats.

FOOD & FUNCTION (2023)

Article Computer Science, Information Systems

Casper: Accelerating Stencil Computations Using Near-Cache Processing

Alain Denzler, Geraldo F. Oliveira, Nastaran Hajinazar, Rahul Bera, Gagandeep Singh, Juan Gomez-Luna, Onur Mutlu

Summary: This paper introduces Casper, a near-cache accelerator that improves the performance of stencil computations and reduces system energy consumption. Casper is designed based on two key ideas: avoiding the cost of moving rarely reused data throughout the cache hierarchy, and exploiting the regularity of data accesses and inherent parallelism of stencil computations. Experimental results show that Casper improves performance by an average of 1.65x (up to 4.16x) compared to commercial high-performance multi-core processors, while reducing system energy consumption by an average of 35% (up to 65%). Casper provides 37x (up to 190x) improvement in performance-per-area compared to a state-of-the-art GPU.

IEEE ACCESS (2023)

Article Mathematics, Applied

Asymptotic profile of solutions to the heat equation on thin plate with boundary heating

Eun-Ho Lee, Woocheol Choi

Summary: This paper studies the conduction of heat on the surface of a thin plate, and analyzes the asymptotic characteristics of the solutions as the thickness of the plate approaches zero.

APPLIED MATHEMATICS AND COMPUTATION (2021)

Article Food Science & Technology

Effect of two postharvest technologies on the micronutrient profile of cashew kernels from Mozambique

Americo Uaciquete, Neid Ali Ferreira, Katja Lehnert, Walter Vetter, Nadine Sus, Wolfgang Stuetz

Summary: The study revealed that using apple drying slightly impacted the nutrient profile of cashew nuts, with some carotenoids, fatty acids, and amino acids decreasing in concentration while iron, magnesium, and tocotrienols increased. Industrial baby butt grade cashew nuts had higher levels of minerals, fatty acids, and amino acids but lower levels of beta-carotene and tocopherols compared to other grades. Conventional sun drying and apple drying had similar effects on the micronutrient content of cashew nuts.

FOOD SCIENCE & NUTRITION (2022)

Article Plant Sciences

The Metabolic Profile of Young, Watered Chickpea Plants Can Be Used as a Biomarker to Predict Seed Number under Terminal Drought

Sarah J. Purdy, David Fuentes, Purushothaman Ramamoorthy, Christopher Nunn, Brent N. Kaiser, Andrew Merchant

Summary: By analyzing the metabolic profile of chickpea leaves, researchers have identified several metabolites, such as pinitol, sucrose, and GABA, that can predict grain yield traits under terminal drought. These metabolic biomarkers can predict complex yield characteristics with a high degree of accuracy.

PLANTS-BASEL (2023)

Review Biochemical Research Methods

Mutations in transmembrane proteins: diseases, evolutionary insights, prediction and comparison with globular proteins

Jan Zaucha, Michael Heinzinger, A. Kulandaisamy, Evans Kataka, Oscar Llorian Salvador, Petr Popov, Burkhard Rost, M. Michael Gromiha, Boris S. Zhorov, Dmitrij Frishman

Summary: Membrane proteins, by interacting with lipid bilayers, play crucial roles in transporting molecules and relaying signals between cells. Mutations in these proteins can have profound effects on the host's fitness, as shown in experimental studies and evolutionary signals.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Multidisciplinary Sciences

Embeddings from deep learning transfer GO annotations beyond homology

Maria Littmann, Michael Heinzinger, Christian Dallago, Tobias Olenyi, Burkhard Rost

Summary: This study proposes a GO term prediction method based on SeqVec embedding and protein proximity, with promising results especially for proteins from smaller families or with intrinsically disordered regions.

SCIENTIFIC REPORTS (2021)

Article Genetics & Heredity

Embeddings from protein language models predict conservation and variant effects

Celine Marquet, Michael Heinzinger, Tobias Olenyi, Christian Dallago, Kyra Erckert, Michael Bernhofer, Dmitrii Nechaev, Burkhard Rost

Summary: The study utilized Protein Language Models (pLMs) to predict sequence conservation and SAV effects without requiring multiple sequence alignments (MSAs). The results showed that embeddings alone could accurately predict residue conservation almost as effectively as ConSeq using MSAs.

HUMAN GENETICS (2022)

Article Biochemistry & Molecular Biology

ProteomicsDB: toward a FAIR open-source resource for life-science research

Ludwig Lautenbacher, Patroklos Samaras, Julian Muller, Andreas Grafberger, Marwin Shraideh, Johannes Rank, Simon T. Fuchs, Tobias K. Schmidt, Matthew The, Christian Dallago, Holger Wittges, Burkhard Rost, Helmut Krcmar, Bernhard Kuster, Mathias Wilhelm

Summary: ProteomicsDB is a multi-omics and multi-organism resource for life science research, with efforts to improve the findability, accessibility, interoperability and reusability of data. New API and UI have been released, along with content expansions into different human biology and a newly supported organism.

NUCLEIC ACIDS RESEARCH (2022)

Editorial Material Biochemistry & Molecular Biology

Protein matchmaking through representation learning

Michael Heinzinger, Christian Dallago, Burkhard Rost

Summary: Sledzieski, Singh, Cowen, and Berger used representation learning to predict protein interactions and identify binding residues between protein pairs. Their work demonstrated the generalizability of training on one organism and evaluating on others, showcasing the potential of AI-learned representations in advancing knowledge in molecular biology.

CELL SYSTEMS (2021)

Article Biochemistry & Molecular Biology

Protein language-model embeddings for fast, accurate, and alignment-free protein structure prediction

Konstantin Weissenow, Michael Heinzinger, Burkhard Rost

Summary: This study describes a competitive prediction method that exclusively uses embeddings from pre-trained protein language models (pLMs) and does not require multiple sequence alignments (MSAs). By utilizing attention mechanisms, this method performs similarly to methods that rely on co-evolution, but at a lower cost. It may better capture features of specific protein structures, although it does not reach the level of AlphaFold2.

STRUCTURE (2022)

Correction Multidisciplinary Sciences

Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network (vol 12, 3279, 2021)

Mathys Grapotte, Manu Saraswat, Chloe Bessiere, Christophe Menichelli, Jordan A. Ramilowski, Jessica Severin, Yoshihide Hayashizaki, Masayoshi Itoh, Michihira Tagami, Mitsuyoshi Murata, Miki Kojima-Ishiyama, Shohei Noma, Shuhei Noguchi, Takeya Kasukawa, Akira Hasegawa, Harukazu Suzuki, Hiromi Nishiyori-Sueki, Martin C. Frith, Clement Chatelain, Piero Carninci, Michiel J. L. de Hoon, Wyeth W. Wasserman, Laurent Brehelin, Charles-Henri Lecellier

NATURE COMMUNICATIONS (2022)

Article Biochemical Research Methods

TMbed: transmembrane proteins predicted through language model embeddings

Michael Bernhofer, Burkhard Rost

Summary: In this study, a novel method called TMbed is proposed, which utilizes embeddings from protein language models to predict transmembrane regions of proteins. The method achieves high accuracy and low false positive rates in predicting alpha helical and beta barrel transmembrane proteins. TMbed is capable of processing large protein sequences on standard desktop computers and has the potential to be used for screening millions of predicted 3D structures.

BMC BIOINFORMATICS (2022)

Article Biochemical Research Methods

Engineering indel and substitution variants of diverse and ancient enzymes using Graphical Representation of Ancestral Sequence Predictions (GRASP)

Gabriel Foley, Ariane Mora, Connie M. Ross, Scott Bottoms, Leander Sutzl, Marnie L. Lamprecht, Julian Zaugg, Alexandra Essebier, Brad Balderson, Rhys Newell, Raine E. S. Thomson, Bostjan Kobe, Ross T. Barnard, Luke Guddat, Gerhard Schenk, Jorg Carsten, Yosephine Gumulya, Burkhard Rost, Dietmar Haltrich, Volker Sieber, Elizabeth M. J. Gillam, Mikael Boden

Summary: Ancestral sequence reconstruction is a powerful technique for recovering ancestral diversity and identifying building blocks using large data sets. The GRASP method efficiently implements maximum likelihood methods and uses partial order graphs to represent insertion and deletion events. By exploring variation over evolutionary time, GRASP enables the engineering of biologically active ancestral variants.

PLOS COMPUTATIONAL BIOLOGY (2022)

Article Biochemical Research Methods

CATHe: detection of remote homologues for CATH superfamilies using embeddings from protein language models

Vamsi Nallapareddy, Nicola Bordin, Ian Sillitoe, Michael Heinzinger, Maria Littmann, Vaishali P. Waman, Neeladri Sen, Burkhard Rost, Christine Orengo

Summary: CATH is a protein domain classification resource that utilizes an automated workflow and manual curation to create a hierarchical classification of evolutionary and structural relationships. The study aimed to develop algorithms for detecting remote homologues missed by HMM-based approaches. The CATHe method, combining a neural network with sequence representations, showed high accuracy in identifying remote homologues.

BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

LambdaPP: Fast and accessible protein-specific phenotype predictions

Tobias Olenyi, Celine Marquet, Michael Heinzinger, Benjamin Kroeger, Tiha Nikolova, Michael Bernhofer, Philip Saendig, Konstantin Schuetze, Maria Littmann, Milot Mirdita, Martin Steinegger, Christian Dallago, Burkhard Rost

Summary: The availability of accurate and fast AI solutions for predicting protein aspects is revolutionizing molecular biology. LambdaPP is a webserver aiming to replace the first internet server PredictProtein from 1992, providing AI protein predictions. LambdaPP offers accessible visualizations of protein 3D structure and predictions at both the protein level and residue level, including various phenotypes, within seconds.

PROTEIN SCIENCE (2023)

Review Biochemistry & Molecular Biology

Novel machine learning approaches revolutionize protein knowledge

Nicola Bordin, Christian Dallago, Michael Heinzinger, Stephanie Kim, Maria Littmann, Clemens Rauer, Martin Steinegger, Burkhard Rost, Christine Orengo

Summary: Breakthroughs in machine learning, protein structure prediction, and ultrafast structural aligners are revolutionizing structural biology. Large-scale acquisition of accurate protein models and functional annotation is no longer constrained by time and resources. AlphaFold 2, the latest top-ranked method in the CASP assessment, can build structural models with accuracy comparable to experimental structures. Recent advancements in protein language models and structural aligners facilitate the validation of transferred annotations for 3D models.

TRENDS IN BIOCHEMICAL SCIENCES (2023)

Article Mathematical & Computational Biology

Nearest neighbor search on embeddings rapidly identifies distant protein relations

Konstantin Schuetze, Michael Heinzinger, Martin Steinegger, Burkhard Rost

Summary: This article explores the use of embeddings for nearest neighbor searches to identify the relationships between protein pairs with diverged sequences. While the approach performs well for proteins with single domains, it faces challenges with multi-domain proteins. The authors present ideas to overcome these limitations.

FRONTIERS IN BIOINFORMATICS (2022)

Article Computer Science, Artificial Intelligence

ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning

Ahmed Elnaggar, Michael Heinzinger, Christian Dallago, Ghalia Rehawi, Yu Wang, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Martin Steinegger, Debsindhu Bhowmik, Burkhard Rost

Summary: Computational biology and bioinformatics provide valuable data for the development of language models in natural language processing. In this study, six different models were trained on protein sequence data and the resulting embeddings were used for various protein structure prediction tasks, demonstrating their advantages over traditional methods.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Genetics & Heredity

Contrastive learning on protein embeddings enlightens midnight zone

Michael Heinzinger, Maria Littmann, Ian Sillitoe, Nicola Bordin, Christine Orengo, Burkhard Rost

Summary: The research uses the embedding-based annotation transfer technique ProtTucker to optimize the classification of protein 3D structures through single protein representations, improving the recognition of distant homologous relationships. Compared to traditional techniques, this method performs better and is faster.

NAR GENOMICS AND BIOINFORMATICS (2022)
