4.6 Article

Multivariate Hawkes process models of the occurrence of regulatory elements

Journal

BMC BIOINFORMATICS
Volume 11, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/1471-2105-11-456

Keywords

-

Funding

  1. Danish Natural Science Research Council [272-06-0442, 09-072331]
  2. Novo Nordisk Foundation
  3. European Research Council under the EU

Background: A central question in molecular biology is how transcriptional regulatory elements (TREs) act in combination. Recent high-throughput data provide the locations of multiple regulatory regions for multiple regulators, and thus make it possible to analyze the multivariate distribution of the occurrences of these TREs along the genome.

Results: We present a model of TRE occurrences known as the Hawkes process. We illustrate the use of this model by analyzing two different publicly available data sets. We are able to model, in detail, how the occurrence of one TRE is affected by the occurrences of others, and we can test a range of natural hypotheses about the dependencies among the TRE occurrences. In contrast to earlier efforts, pre-processing steps such as clustering or binning are not needed, and we thus retain information about the dependencies among the TREs that is otherwise lost. For each of the two data sets we provide two results: first, a qualitative description of the dependencies among the occurrences of the TREs, and second, quantitative results on the favored or avoided distances between the different TREs.

Conclusions: The Hawkes process is a novel way of modeling the joint occurrences of multiple TREs along the genome that is capable of providing new insights into dependencies among elements involved in transcriptional regulation. The method is available as an R package from http://www.math.ku.dk/~richard/ppstat/.
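To make the idea of one TRE's occurrence being affected by the occurrences of others more concrete, the following is a minimal sketch of a linear multivariate Hawkes intensity evaluated at a genomic position. It is not the authors' implementation or the API of the ppstat package: the exponential kernel, the shared decay rate `beta`, the non-negative weights in `alpha`, and all function and variable names are assumptions chosen for illustration.

```python
import math


def hawkes_intensity(t, events, baseline, alpha, beta):
    """Illustrative intensity of each TRE type at genomic position t.

    events:   dict mapping TRE type -> list of occurrence positions
    baseline: dict mapping TRE type -> background rate
    alpha:    dict mapping (target, source) -> non-negative weight describing
              how strongly a 'source' occurrence raises the 'target' rate
    beta:     decay rate of the (assumed) exponential kernel
    """
    intensity = {}
    for target in baseline:
        rate = baseline[target]
        for source, positions in events.items():
            weight = alpha.get((target, source), 0.0)
            for s in positions:
                # Only occurrences strictly upstream of t contribute.
                if s < t:
                    rate += weight * math.exp(-beta * (t - s))
        intensity[target] = rate
    return intensity


# Hypothetical toy data: one occurrence of type "A" at position 0.
events = {"A": [0.0], "B": []}
baseline = {"A": 0.1, "B": 0.2}
alpha = {("B", "A"): 0.5}  # an "A" occurrence raises the rate of "B"
result = hawkes_intensity(1.0, events, baseline, alpha, beta=1.0)
```

In this toy setup the rate of "A" stays at its baseline, while the rate of "B" is raised by the nearby "A" occurrence, with the effect fading as the distance grows. Because the intensity is evaluated directly from the occurrence positions, no binning or clustering step is involved, which mirrors the point made in the abstract.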

Recommended

Article Computer Science, Artificial Intelligence

DermX: An end-to-end framework for explainable automated dermatological diagnosis

Raluca Jalaboi, Frederik Faye, Mauricio Orbes-Arteaga, Dan Jorgensen, Ole Winther, Alfiia Galimzianova

Summary: Dermatological diagnosis automation is crucial for addressing the high prevalence of skin diseases and shortage of dermatologists. DermX and DermX+ are two explainable automated dermatological diagnosis methods that achieve near-expert diagnosis performance while providing expert-level explanations.

MEDICAL IMAGE ANALYSIS (2023)

Article Health Care Sciences & Services

Explainable Image Quality Assessments in Teledermatological Photography

Raluca Jalaboi, Ole Winther, Alfiia Galimzianova

Summary: ImageQX is a convolutional neural network that can automatically assess and explain image quality, identifying common issues such as bad framing, bad lighting, blur, low resolution, and distance problems. Trained and validated on photographs taken using a mobile skin disease tracking application, ImageQX performs at an expert level and is easily deployable on mobile devices.

TELEMEDICINE AND E-HEALTH (2023)

Article Microscopy

Reconstructing the exit wave of 2D materials in high-resolution transmission electron microscopy using machine learning

Matthew Helmi Leth Larsen, Frederik Dahl, Lars P. Hansen, Bastian Barton, Christian Kisielowski, Stig Helveg, Ole Winther, Thomas W. Hansen, Jakob Schiotz

Summary: Convolutional neural networks can reconstruct the exit wave function from a short focal series of HRTEM images, achieving a similar fidelity compared to conventional methods. By training a fully convolutional neural network based on the U-Net architecture with simulated exit waves and HRTEM images, we successfully applied it to analyze experimentally obtained images of MoS2 nanoparticles on graphene support and obtain atomically resolved exit wave structures. Furthermore, we demonstrated the feasibility of training the network to reconstruct exit waves for a wide range of two-dimensional materials.

ULTRAMICROSCOPY (2023)

Article Chemistry, Medicinal

Deorphanizing Peptides Using Structure Prediction

Felix Teufel, Dennis Madsen, Kristine Deibler, Jan C. Refsgaard, Marina A. Kasimova, Christian T. Madsen, Carsten Stahlhut, Mads Gronborg, Ole Winther

Summary: In this study, the use of AlphaFold-Multimer complex structure prediction and transmembrane topology prediction for peptide deorphanization is investigated. It is found that AlphaFold's confidence metrics have strong performance in prioritizing true peptide-receptor interactions.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2023)

Article Biochemistry & Molecular Biology

Deep integrative models for large-scale human genomics

Arnor Sigurdsson, Ioannis Louloudis, Karina Banasik, David Westergaard, Ole Winther, Ole Lund, Sisse Rye Ostrowski, Christian Erikstrup, Ole Birger Vesterager Pedersen, Mette Nyegaard, Soren Brunak, Bjarni J. Vilhjalmsson, Simon Rasmussen

Summary: We developed a deep learning framework for polygenic risk score (PRS) prediction that can handle large-scale genomics data, support multi-task learning, and automatically integrate clinical and biochemical data. The framework demonstrated competitive performance and improved predictions for complex genetic relationships and non-additive genetic effects and epistasis. The model also outperformed traditional linear PRS methods for Type 1 Diabetes.

NUCLEIC ACIDS RESEARCH (2023)

Article Chemistry, Multidisciplinary

Molecular Representations in Machine-Learning-Based Prediction of PK Parameters for Insulin Analogs

Kasper A. Einarson, Kristian M. Bendtsen, Kang Li, Maria Thomsen, Niels R. Kristensen, Ole Winther, Simone Fulle, Line Clemmensen, Hanne H. F. Refsgaard

Summary: This study presents a novel combination of molecular descriptors for predicting the pharmacokinetic parameters of insulin analogs. Machine-learning models were used to predict pharmacokinetic parameters, and the results showed that combining protein and small molecule descriptors was crucial for accurate predictions.

ACS OMEGA (2023)

Review Biochemical Research Methods

RNA trafficking and subcellular localization - a review of mechanisms, experimental and predictive methodologies

Jun Wang, Marc Horlacher, Lixin Cheng, Ole Winther

Summary: RNA localization is important for spatial translation regulation, and this review discusses its molecular mechanisms, experimental techniques, and machine learning-based prediction tools. The three main molecular mechanisms controlling RNA localization to distinct cellular compartments, including directed transport, mRNA degradation protection, and diffusion/local entrapment, are reviewed. Advances in experimental methods provide ample data resources for the design of powerful machine learning models in RNA localization prediction. The review also covers publicly available predictive tools, serving as a guide for users and encouraging the development of more effective prediction models. Lastly, an overview of multimodal learning is presented as a potential new avenue for RNA localization prediction.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Multidisciplinary Sciences

ThoughtSource: A central hub for large language model reasoning data

Simon Ott, Konstantin Hebenstreit, Valentin Lievin, Christoffer Egeberg Hother, Milad Moradi, Maximilian Mayrhauser, Robert Praas, Ole Winther, Matthias Samwald

Summary: Large language models (LLMs) like GPT-4 have shown impressive performance in various tasks, but they still have limitations in complex reasoning, opaque reasoning processes, fact hallucination, and potential biases. To address these issues, chain-of-thought prompting, a technique that allows models to verbalize reasoning steps in natural language, has been proposed. ThoughtSource is introduced as a meta-dataset and software library for chain-of-thought reasoning, aiming to improve future AI systems by enhancing qualitative understanding, enabling empirical evaluations, and providing training data. The initial release of ThoughtSource includes datasets from scientific/medical, general-domain, and math word question answering.

SCIENTIFIC DATA (2023)

Article Genetics & Heredity

GraphPart: homology partitioning for biological sequence analysis

Felix Teufel, Magnus Halldor Gislason, Jose Juan Almagro Armenteros, Alexander Rosenberg Johansen, Ole Winther, Henrik Nielsen

Summary: A homology partitioning algorithm called GraphPart is proposed, which divides the data in such a way that closely related sequences always end up in the same partition, while retaining as many sequences as possible. Evaluation on Protein, DNA and RNA datasets shows that GraphPart is capable of preserving a larger number of sequences, while achieving homology separation on a par with reduction approaches.

NAR GENOMICS AND BIOINFORMATICS (2023)

Article Biotechnology & Applied Microbiology

Towards in silico CLIP-seq: predicting protein-RNA interaction via sequence-to-signal learning

Marc Horlacher, Nils Wagner, Lambert Moyon, Klara Kuret, Nicolas Goedert, Marco Salvatore, Jernej Ule, Julien Gagneur, Ole Winther, Annalisa Marsico

Summary: RBPNet is a new deep learning method that predicts CLIP-seq crosslink count distribution from RNA sequence. Training on millions of regions, RBPNet shows high generalization on eCLIP, iCLIP, and miCLIP assays, outperforming state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of protein-specific and background signal. By using Integrated Gradients for model interrogation, RBPNet identifies predictive sub-sequences corresponding to known and novel binding motifs and enables variant-impact scoring through in silico mutagenesis. Overall, RBPNet improves the imputation of protein-RNA interactions and enhances mechanistic interpretation of predictions.

GENOME BIOLOGY (2023)

Article Genetics & Heredity

Transfer learning identifies sequence determinants of cell-type specific regulatory element accessibility

Marco Salvatore, Marc Horlacher, Annalisa Marsico, Ole Winther, Robin Andersson

Summary: Dysfunction of regulatory elements through genetic variants is a central mechanism in disease pathogenesis. Deep learning methods have shown promise in modeling biomolecular data from DNA sequence but require large input data for training. ChromTransfer, a transfer learning method, utilizes a pre-trained model of open chromatin regions to fine-tune on regulatory sequences and demonstrates superior performance in learning cell-type specific chromatin accessibility. It is able to fine-tune on small input data with minimal decrease in accuracy and utilizes sequence features matching binding site sequences of key transcription factors for prediction, making it a promising tool for learning the regulatory code.

NAR GENOMICS AND BIOINFORMATICS (2023)

Article Chemistry, Multidisciplinary

Uncertainty-aware and explainable machine learning for early prediction of battery degradation trajectory

Laura Hannemose Rieger, Eibar Flores, Kristian Frellesen Nielsen, Poul Norby, Elixabete Ayerbe, Ole Winther, Tejs Vegge, Arghya Bhowmik

Summary: Enhancing cell lifetime is crucial in battery design and development, and early prediction of cell aging can accelerate the discovery and production of better battery chemistries. This study introduces an early prediction model with reliable uncertainty estimates, which utilizes a small number of initial cycles to predict the entire battery degradation trajectory.

DIGITAL DISCOVERY (2023)

Article Biochemical Research Methods

DeepPeptide predicts cleaved peptides in proteins using conditional random fields

Felix Teufel, Jan Christian Refsgaard, Christian Toft Madsen, Carsten Stahlhut, Mads Gronborg, Ole Winther, Dennis Madsen

Summary: DeepPeptide is a deep learning model that predicts cleaved peptides directly from the amino acid sequence, showing improved precision and recall compared to previous methodology. It is capable of identifying peptides in underannotated proteomes.

BIOINFORMATICS (2023)

Article Chemistry, Physical

Graph neural network interatomic potential ensembles with calibrated aleatoric and epistemic uncertainty on energy and forces

Jonas Busk, Mikkel N. Schmidt, Ole Winther, Tejs Vegge, Peter Bjorn Jorgensen

Summary: This research presents a complete framework for training and recalibrating graph neural network ensemble models to accurately predict energy and forces with calibrated uncertainty estimates. The method is demonstrated and evaluated on challenging datasets, achieving good prediction accuracy and uncertainty calibration.

PHYSICAL CHEMISTRY CHEMICAL PHYSICS (2023)

Article Computer Science, Artificial Intelligence

Calibrated uncertainty for molecular property prediction using ensembles of message passing neural networks

Jonas Busk, Peter Bjorn Jorgensen, Arghya Bhowmik, Mikkel N. Schmidt, Ole Winther, Tejs Vegge

Summary: In this study, a message passing neural network model is extended to incorporate both aleatoric and epistemic uncertainty in a unified framework, and the predictive distribution is recalibrated for improved accuracy. The proposed method is shown to accurately predict molecular properties with well-calibrated uncertainty estimates in experimental settings.

MACHINE LEARNING-SCIENCE AND TECHNOLOGY (2022)