4.6 Article

MrTADFinder: A network modularity based approach to identify topologically associating domains in multiple resolutions

Journal

PLOS COMPUTATIONAL BIOLOGY
Volume 13, Issue 7, Pages -

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pcbi.1005647

Keywords

-

Funding

  1. NHGRI [U41 HG007000]

Genome-wide proximity ligation based assays such as Hi-C have revealed that eukaryotic genomes are organized into structural units called topologically associating domains (TADs). From a visual examination of the chromosomal contact map, however, it is clear that the organization of the domains is not simple or obvious. Instead, TADs exhibit various length scales and, in many cases, a nested arrangement. Here, by exploiting the resemblance between TADs in a chromosomal contact map and densely connected modules in a network, we formulate TAD identification as a network optimization problem and propose an algorithm, MrTADFinder, to identify TADs from intra-chromosomal contact maps. MrTADFinder is based on the network-science concept of modularity. A key component of it is deriving an appropriate background model for contacts in a random chain, by numerically solving a set of matrix equations. The background model preserves the observed coverage of each genomic bin as well as the distance dependence of the contact frequency for any pair of bins exhibited by the empirical map. Also, by introducing a tunable resolution parameter, MrTADFinder provides a self-consistent approach for identifying TADs at different length scales, hence the acronym Mr standing for Multiple Resolutions. We then apply MrTADFinder to various Hi-C datasets. The identified domain boundaries are marked by characteristic signatures in chromatin marks and transcription factors (TF) that are consistent with earlier work. Moreover, by calling TADs at different length scales, we observe that boundary signatures change with resolution, with different chromatin features having different characteristic length scales. Furthermore, we report an enrichment of HOT (high-occupancy target) regions near TAD boundaries and investigate the role of different TFs in determining boundaries at various resolutions. 
To further explore the interplay between TADs and epigenetic marks, as tumor mutational burden is known to be coupled to chromatin structure, we examine how somatic mutations are distributed across boundaries and find a clear stepwise pattern. Overall, MrTADFinder provides a novel computational framework to explore the multi-scale structures in Hi-C contact maps.
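
The modularity idea in the abstract can be illustrated with a minimal sketch. It assumes the standard generalized modularity with a resolution parameter, Q = (1/2N) Σ_ij (W_ij − γ E_ij) δ(σ_i, σ_j), where W is the contact map, σ_i is the domain label of bin i, and γ is the resolution. The null model E used here is the simple Newman-Girvan form k_i k_j / 2N, which preserves bin coverage only; it is a stand-in, not the distance-aware background model that the abstract describes. The function names `modularity_matrix` and `modularity` are hypothetical and are not taken from the MrTADFinder code.

```python
import numpy as np


def modularity_matrix(W, gamma=1.0):
    """Return (W - gamma * E) / 2N for a symmetric contact map W.

    Assumption: E_ij = k_i * k_j / 2N (Newman-Girvan null model). This keeps
    the coverage k_i of each bin but ignores the distance dependence of
    contact frequency, unlike the background model described in the abstract.
    """
    W = np.asarray(W, dtype=float)
    k = W.sum(axis=1)        # coverage of each genomic bin
    two_n = W.sum()          # total weight, counting both (i, j) and (j, i)
    E = np.outer(k, k) / two_n
    return (W - gamma * E) / two_n


def modularity(W, labels, gamma=1.0):
    """Sum the modularity matrix over all bin pairs that share a domain label."""
    B = modularity_matrix(W, gamma)
    labels = np.asarray(labels)
    same_domain = labels[:, None] == labels[None, :]
    return float(B[same_domain].sum())


if __name__ == "__main__":
    # Toy map: two dense 3-bin blocks on top of a weak uniform background.
    toy = np.kron(np.eye(2), np.ones((3, 3))) + 0.1
    two_domains = [0, 0, 0, 1, 1, 1]
    one_domain = [0] * 6
    for gamma in (0.5, 1.0, 2.0):
        print(gamma,
              modularity(toy, two_domains, gamma),
              modularity(toy, one_domain, gamma))
```

Raising γ increases the penalty from the null model, so partitions into smaller domains score relatively better; this is the sense in which a tunable resolution parameter selects a length scale. Finding the best partition requires a separate optimization step, which this sketch does not attempt.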

Recommended

Article Genetics & Heredity

Differences in evolutionary accessibility determine which equally effective regulatory motif evolves to generate pulses

Kun Xiong, Mark Gerstein, Joanna Masel

Summary: Transcriptional regulatory networks (TRNs) exhibit certain motifs, with type 1 incoherent feed-forward loops (I1FFLs) and negative feedback loops (NFBLs) being common solutions. The evolution of these motifs is influenced by selection conditions, with I1FFLs generally evolving more frequently than NFBLs. The evolutionary accessibility and not just relative functionality shape motif evolution in TRNs, with the expression levels of specific genes playing a crucial role.

GENETICS (2021)

Correction Genetics & Heredity

Functional genomics data: privacy risk assessment and technological mitigation (Nov, 10.1038/s41576-021-00428-7, 2021)

Gamze Guersoy, Tianxiao Li, Susanna Liu, Eric Ni, Charlotte M. Brannon, Mark B. Gerstein

NATURE REVIEWS GENETICS (2022)

Review Genetics & Heredity

Functional genomics data: privacy risk assessment and technological mitigation

Gamze Gursoy, Tianxiao Li, Susanna Liu, Eric Ni, Charlotte M. Brannon, Mark B. Gerstein

Summary: Sharing functional genomics data is crucial for research advancement, but poses notable privacy challenges, including leakage of genotype and phenotype information from different data types and their summarization steps. Techniques enabling broad sharing and analysis while maintaining privacy are being developed.

NATURE REVIEWS GENETICS (2022)

Article Multidisciplinary Sciences

Forest Fire Clustering for single-cell sequencing combines iterative label propagation with parallelized Monte Carlo simulations

Zhanlin Chen, Jeremy Goldwasser, Philip Tuckman, Jason Liu, Jing Zhang, Mark Gerstein

Summary: Forest Fire Clustering is an efficient and interpretable method for extracting insights from single-cell data in the era of single-cell sequencing. It computes a non-parametric posterior probability for each data point and enables the discovery of rare cell types with label confidence and entropy computation.

NATURE COMMUNICATIONS (2022)

Editorial Material Genetics & Heredity

GATTACA is still pertinent 25 years later

Dov Greenbaum, Mark Gerstein

Summary: GATTACA, a film released 25 years ago, portrays a credible near future where societal inequalities based on race and class have been replaced by new prejudices arising from genetic determinism. This article compares the fictional technologies in GATTACA with the current state of the art, examining the legal protections against the dystopian future portrayed in the film, where personal freedom and privacy rights are greatly curtailed by genomic innovations. It further discusses the continued relevance of GATTACA's prescient warnings in light of the ongoing advancements in genomic science and technology.

NATURE GENETICS (2022)

Article Biochemistry & Molecular Biology

Proteome-wide screening for mitogen-activated protein kinase docking motifs and interactors

Guangda Shi, Claire Song, Jaylissa Torres Robles, Leonidas Salichos, Hua Jane Lou, Tukiet T. Lam, Mark Gerstein, Benjamin E. Turk

Summary: This study used a yeast-based genetic screening system to analyze a large amount of MAPK docking sequences, and identified key features for binding to JNK1 and p38 alpha, as well as specific docking groove residues that mediate selective binding. Furthermore, it verified the substrate recruitment function of the screened docking sequences in vitro and in cultured cells.

SCIENCE SIGNALING (2023)

Article Genetics & Heredity

Illuminating links between cis-regulators and trans-acting variants in the human prefrontal cortex

Shuang Liu, Hyejung Won, Declan Clarke, Nana Matoba, Saniya Khullar, Yudi Mu, Daifeng Wang, Mark Gerstein

Summary: This study investigates the transcriptional regulatory structure of the human brain, revealing the coordination of both cis- and trans-regulatory variants. By analyzing large datasets, the researchers identified candidate trans-eQTLs that influence the expression of target genes and found overlap with known cis-eQTLs. Through colocalization and mediation analyses, they identified mediators in trans-regulation and linked trans-eQTLs to schizophrenia risk genes. The findings demonstrate the importance of trans-regulatory mechanisms in understanding psychiatric disorders.

GENOME MEDICINE (2022)

Article Multidisciplinary Sciences

DeepVelo: Single-cell transcriptomic deep velocity field learning with neural ordinary differential equations

Zhanlin Chen, William C. King, Aheyon Hwang, Mark Gerstein, Jing Zhang

Summary: Recent advances in single-cell sequencing technologies have led to new opportunities for studying the gene expression profile and transcriptome dynamics of individual cells. In this study, the authors propose DeepVelo, a neural network-based method that models complex transcriptome dynamics by simulating continuous changes in gene expression over time within cells. DeepVelo was applied to analyze transcriptome dynamics at different time scales and identify developmental driver genes through perturbation analysis.

SCIENCE ADVANCES (2022)

Article Health Care Sciences & Services

Estimation of Bedtimes of Reddit Users: Integrated Analysis of Time Stamps and Surveys

William U. Meyerson, Sarah K. Fineberg, Ye Kyung Song, Adam Faber, Garrett Ash, Fernanda C. Andrade, Philip Corlett, Mark B. Gerstein, Rick H. Hoyle

Summary: Researchers estimated the bedtimes of Reddit users based on their posting times and tested the accuracy using survey data. They developed an R package to apply the model and share with the research community. This model provides a passive way to infer sleep parameters of frequent social media users without the need for active surveys.

JMIR FORMATIVE RESEARCH (2023)

Review Genetics & Heredity

Unified views on variant impact across many diseases

Sushant Kumar, Mark Gerstein

Summary: Genomic studies of human disorders are performed by different research communities, including rare diseases, common diseases, and cancer. Despite differences in origin, these studies aim to identify causal genomic events critical for disease manifestation. Challenges faced include understanding genetic architecture, deciphering variant impact, and interpreting noncoding mutations. A unified vocabulary and approach across disease communities is necessary to address these challenges effectively.

TRENDS IN GENETICS (2023)

Article Biochemical Research Methods

Constructing a full, multiple-layer interactome for SARS-CoV-2 in the context of lung disease: Linking the virus with human genes and microbes

Shaoke Lou, Mingjun Yang, Tianxiao Li, Weihao Zhao, Hannah Cevasco, Yucheng T. Yang, Mark Gerstein

Summary: By using the statistical modeling approach MLCrosstalk, the researchers identified linkages between SARS-CoV-2, human genes, miRNAs, and microbes. They found certain human genes and microbial species that are linked to SARS-CoV-2. The findings offer potential insights for developing new treatments for COVID-19.

PLOS COMPUTATIONAL BIOLOGY (2023)

Article Multidisciplinary Sciences

Integrome signatures of lentiviral gene therapy for SCID-X1 patients

Koon-Kiu Yan, Jose Condori, Zhijun Ma, Jean-Yves Metais, Bensheng Ju, Liang Ding, Yogesh Dhungana, Lance E. Palmer, Deanna M. Langfitt, Francesca Ferrara, Robert Throm, Hao Shi, Isabel Risch, Sheetal Bhatara, Bridget Shaner, Timothy D. Lockey, Aimee C. Talleur, John Easton, Michael M. Meagher, Jennifer M. Puck, Morton J. Cowan, Sheng Zhou, Ewelina Mamcarz, Stephen Gottschalk, Jiyang Yu

Summary: Lentiviral vector (LV)-based gene therapy shows promise in treating various diseases. By analyzing patient samples, we found LV integrome signatures related to genomics, epigenomics, and the 3D structure of the genome. These signatures were validated in cellular therapies and differences in 3D genome signatures between LV and gamma retrovirus integromes were identified, potentially explaining the lower risk of mutations in LV-based gene therapy.

SCIENCE ADVANCES (2023)

Article Biotechnology & Applied Microbiology

Storing and analyzing a genome on a blockchain

Gamze Gursoy, Charlotte M. Brannon, Eric Ni, Sarah Wagner, Amol Khanna, Mark Gerstein

Summary: Researchers have developed a private blockchain network to store genomic variants and reference-aligned reads on-chain, addressing the challenges of data ownership and integrity in genomics.

GENOME BIOLOGY (2022)

Article Biotechnology & Applied Microbiology

Recovering genotypes and phenotypes using allele-specific genes

Gamze Gursoy, Nancy Lu, Sarah Wagner, Mark Gerstein

Summary: With the rise of RNA sequencing efforts using large cohorts and the surveying of allele-specific gene expression, it has become common to recover key variants and link individuals back to their genotypes and phenotypes using a list of known allele-specific genes. This poses a privacy conundrum despite not explicitly containing variant information.

GENOME BIOLOGY (2021)

Article Computer Science, Information Systems

Fast and Scalable Private Genotype Imputation Using Machine Learning and Partially Homomorphic Encryption

Esha Sarkar, Eduardo Chielle, Gamze Gursoy, Oleg Mazonka, Mark Gerstein, Michail Maniatakos

Summary: Recent advances in genome sequencing technologies have provided unprecedented opportunities to understand the relationship between human genetic variation and diseases, but genotyping whole genomes remains costly. This study investigates solutions for fast, scalable, and accurate privacy-preserving genotype imputation using Machine Learning and a standardized homomorphic encryption scheme.

IEEE ACCESS (2021)
