Article

Estimating exome genotyping accuracy by comparing to data from large scale sequencing projects

Journal

GENOME MEDICINE
Volume 5

Publisher

BMC
DOI: 10.1186/gm473

Funding

  1. Deutsche Forschungsgemeinschaft [DFG KR 3985/1-1]
  2. NHLBI Lung GO Sequencing Project [HL-102923]
  3. NHLBI WHI Sequencing Project [HL-102924]
  4. NHLBI Broad GO Sequencing Project [HL-102925]
  5. NHLBI Seattle GO Sequencing Project [HL-102926]
  6. NHLBI Heart GO Sequencing Project [HL-103010]

With exome sequencing becoming a tool for mutation detection in routine diagnostics, there is an increasing need for platform-independent methods of quality control. We present a genotype-weighted metric that allows comparison of all the variant calls of an exome to a high-quality reference dataset of an ethnically matched population. The exome-wide genotyping accuracy is estimated from the distance to this reference set and does not require any further knowledge about data generation or the bioinformatics involved. The distances of our metric are visualized by non-metric multidimensional scaling and serve as an intuitive, standardizable score for the quality assessment of exome data.
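
The idea of scoring an exome by its distance to a reference set can be illustrated with a minimal sketch. The abstract does not give the metric's formula, so everything below is an assumption made for illustration only: genotypes are encoded as alternate-allele counts (0, 1, 2), the per-variant `weights` are hypothetical, and the function names `weighted_genotype_distance` and `distance_to_reference` are invented here. This is not the authors' metric, and the non-metric multidimensional scaling step is not shown.

```python
from statistics import mean


def weighted_genotype_distance(a, b, weights):
    """Weighted mean absolute genotype difference, scaled to [0, 1].

    Illustrative only: genotypes are alternate-allele counts (0, 1, 2),
    None marks a missing call, and `weights` is a hypothetical per-variant
    weighting, not the weighting used in the paper.
    """
    num = 0.0
    den = 0.0
    for ga, gb, w in zip(a, b, weights):
        if ga is None or gb is None:
            continue  # skip variants not called in both samples
        num += w * abs(ga - gb)
        den += w
    if den == 0:
        raise ValueError("no shared genotype calls")
    return num / (2 * den)  # the largest per-variant difference is 2


def distance_to_reference(sample, reference_samples, weights):
    """Mean distance from one sample to every sample in a reference set."""
    return mean(
        weighted_genotype_distance(sample, ref, weights)
        for ref in reference_samples
    )


# Toy data: four variants, two reference samples, uniform weights.
reference = [[0, 1, 2, 0], [0, 1, 2, 1]]
weights = [1.0, 1.0, 1.0, 1.0]
concordant = [0, 1, 2, 0]
discordant = [2, 0, 0, 2]

d_good = distance_to_reference(concordant, reference, weights)
d_bad = distance_to_reference(discordant, reference, weights)
```

In this toy setup the concordant sample lies closer to the reference set than the discordant one, which is the behaviour a distance-based quality score relies on. A matrix of such pairwise distances could then be passed to any non-metric multidimensional scaling implementation for visualization.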

Recommended

Article Mathematical & Computational Biology

Multiple two-sample testing under arbitrary covariance dependency with an application in imaging mass spectrometry

Vladimir Vutov, Thorsten Dickhaus

Summary: Large-scale hypothesis testing is a common problem in high-dimensional statistical inference, and has broad applications in various scientific disciplines. This paper introduces a multiple marginal models inference procedure using the correlation matrix in imaging mass spectrometry association studies, and applies it to oncological data.

BIOMETRICAL JOURNAL (2023)

Review Genetics & Heredity

Perspectives on the future of dysmorphology

Benjamin D. Solomon, Margaret P. Adam, Chin-To Fong, Katta M. Girisha, Judith G. Hall, Anna C. E. Hurst, Peter M. Krawitz, Shahida Moosa, Shubha R. Phadke, Cedrik Tekendo-Ngongang, Tara L. Wenger

Summary: The field of clinical genetics and genomics is evolving with milestones like the sequencing of the human genome, advances in sequencing technologies, and the introduction of artificial intelligence. The practice of dysmorphology, the study of abnormal development of tissue form, has also been influenced by technological advances and trends in biomedicine. To explore the future of dysmorphology, a group of clinical geneticists have provided insights about its development over the next few decades.

AMERICAN JOURNAL OF MEDICAL GENETICS PART A (2023)

Article Multidisciplinary Sciences

Long-term temporal evolution of extreme temperature in a warming Earth

Justus Contzen, Thorsten Dickhaus, Gerrit Lohmann

Summary: We propose a new approach to model the future development of extreme temperatures globally and on a long timescale using non-stationary generalized extreme value distributions combined with logistic functions. Our statistical models are applied to daily temperature data from climate models from 1850 to 2300, allowing us to investigate changes in extremes in terms of both magnitude and timing across different geographic locations. Our findings show that changes in extremes are generally stronger and quicker over land masses than over oceans. We also find that the mean and variance change simultaneously in most regions, while the shape parameter remains constant.

PLOS ONE (2023)

Article Genetics & Heredity

A statistical boosting framework for polygenic risk scores based on large-scale genotype data

Hannah Klinkhammer, Christian Staerk, Carlo Maj, Peter Michael Krawitz, Andreas Mayr

Summary: Polygenic risk scores (PRS) evaluate individual genetic liability and are important in clinical risk stratification. This study develops an efficient algorithm, snpboost, for fitting multivariable models to genetic data for improved PRS predictive performance. By iteratively working on smaller batches of variants most correlated with residuals, snpboost increases computational efficiency without sacrificing prediction accuracy. Results show competitive prediction accuracy and efficiency compared to other commonly used methods, making snpboost a valuable tool for constructing PRS.

FRONTIERS IN GENETICS (2023)

Article Genetics & Heredity

Identification of de novo variants in nonsyndromic cleft lip with/without cleft palate patients with low polygenic risk scores

Nina Ishorst, Leonie Henschel, Frederic Thieme, Dmitriy Drichel, Sugirthan Sivalingam, Sarah L. Mehrem, Ariane C. Fechtner, Julia Fazaal, Julia Welzenbach, Andre Heimbach, Carlo Maj, Oleg Borisov, Jonas Hausen, Ruth Raff, Alexander Hoischen, Michael Dixon, Alvaro Rada-Iglesias, Michaela Bartusel, Augusto Rojas-Martinez, Khalid Aldhorae, Bert Braumann, Teresa Kruse, Christian Kirschneck, Gerrit Spanier, Heiko Reutter, Stefanie Nowak, Lina Goelz, Michael Knapp, Andreas Buness, Peter Krawitz, Markus M. Noethen, Michael Nothnagel, Tim Becker, Kerstin U. Ludwig, Elisabeth Mangold

Summary: This study investigates the genetic mechanisms of nonsyndromic cleft lip with/without cleft palate (nsCL/P) and identifies new candidate genes through the detection of highly penetrant de novo variants (DNVs). By conducting a series of analyses and validations on a discovery sample of 50 nsCL/P patient/parent-trios, MDN1 and PAXIP1 were identified as top candidate genes.

MOLECULAR GENETICS & GENOMIC MEDICINE (2023)

Article Genetics & Heredity

Clinically relevant combined effect of polygenic background, rare pathogenic germline variants, and family history on colorectal cancer incidence

Emadeldin Hassanin, Isabel Spier, Dheeraj R. Bobbili, Rana Aldisi, Hannah Klinkhammer, Friederike David, Nuria Duenas, Robert Hueneburg, Claudia Perne, Joan Brunet, Gabriel Capella, Markus M. Noethen, Andreas J. Forstner, Andreas Mayr, Peter Krawitz, Patrick May, Stefan Aretz, Carlo Maj

Summary: The effect of common genetic variants associated with colorectal cancer (CRC) can be assessed using polygenic risk scores (PRS), which can be used for risk stratification. The PRS, along with carrier status and family history, contribute to CRC risk prediction and can improve personalized risk stratification.

BMC MEDICAL GENOMICS (2023)

Article Biochemistry & Molecular Biology

The neurodevelopmental and facial phenotype in individuals with a TRIP12 variant

Mio Aerden, Anne-Sophie Denomme-Pichon, Dominique Bonneau, Ange-Line Bruel, Julian Delanne, Benedicte Gerard, Benoit Mazel, Christophe Philippe, Lucile Pinson, Clement Prouteau, Audrey Putoux, Frederic Tran Mau-Them, Eleonore Viora-Dupont, Antonio Vitobello, Alban Ziegler, Amelie Piton, Bertrand Isidor, Christine Francannet, Pierre-Yves Maillard, Sophie Julia, Anais Philippe, Elise Schaefer, Saskia Koene, Claudia Ruivenkamp, Mariette Hoffer, Eric Legius, Miel Theunis, Boris Keren, Julien Buratti, Perrine Charles, Thomas Courtin, Mala Misra-Isrie, Mieke van Haelst, Quinten Waisfisz, Dagmar Wieczorek, Ariane Schmetz, Theresia Herget, Fanny Kortuem, Jasmin Lisfeld, Francois-Guillaume Debray, Nuria C. Bramswig, Isis Atallah, Heidi Fodstad, Guillaume Jouret, Berta Almoguera, Saoud Tahsin-Swafiri, Fernando Santos-Simarro, Maria Palomares-Bralo, Vanesa Lopez-Gonzalez, Maria Kibaek, Pernille M. Torring, Alessandra Renieri, Lucia Pia Bruno, Katrin Ounap, Monica Wojcik, Tzung-Chien Hsieh, Peter Krawitz, Hilde Van Esch

Summary: Haploinsufficiency of TRIP12 causes Clark-Baraitser syndrome, a neurodevelopmental disorder characterized by intellectual disability, epilepsy, autism spectrum disorder, and dysmorphic features. Through GestaltMatcher image analysis based on deep-learning algorithms, a distinct facial gestalt was established. The largest cohort to date of individuals with TRIP12 variants was studied, further defining the associated phenotype and introducing a facial gestalt.

EUROPEAN JOURNAL OF HUMAN GENETICS (2023)

Article Mathematical & Computational Biology

Multiple multi-sample testing under arbitrary covariance dependency

Vladimir Vutov, Thorsten Dickhaus

Summary: Modern high-throughput biomedical devices generate large-scale data, and analyzing high-dimensional datasets is common in biomedical studies. This article proposes a procedure to simultaneously evaluate the strength of associations between a categorical response variable and multiple features. The proposed approach involves multiple testing under arbitrary correlation dependency among test statistics. It offers a trade-off between the expected numbers of true and false findings. The practical application of the method on hyperspectral imaging data obtained through a MALDI instrument is demonstrated.

STATISTICS IN MEDICINE (2023)

Article Genetics & Heredity

Understanding recessive disease risk in multi-ethnic populations with different degrees of consanguinity

Luis La Rocca, Julia Frank, Heidi Beate Bentzen, Jean Tori Pantel, Konrad Gerischer, Peter Krawitz, Anton Bovier

Summary: Population medical genetics aims to apply findings from large-scale studies to individual healthcare, but genetic counseling is still mainly based on family history. In a multi-ethnic society, healthcare professionals should also consider the influence of different mating schemes.

AMERICAN JOURNAL OF MEDICAL GENETICS PART A (2023)

Article Mathematical & Computational Biology

Multiple testing of composite null hypotheses for discrete data using randomized p-values

Daniel Ochieng, Anh-Tuan Hoang, Thorsten Dickhaus

Summary: In this study, two approaches utilizing randomized p-values are presented to deal with the conservativeness issue. The experiments conducted on binomial models show that the proposed randomized p-values are less conservative compared to nonrandomized p-values. The validity of the randomized p-values is also proved under various discrete statistical models.

BIOMETRICAL JOURNAL (2023)

Article Computer Science, Interdisciplinary Applications

GRAPE for fast and scalable graph processing and random-walk-based embedding

Luca Cappelletti, Tommaso Fontana, Elena Casiraghi, Vida Ravanmehr, Tiffany J. Callahan, Carlos Cano, Marcin P. Joachimiak, Christopher J. Mungall, Peter N. Robinson, Justin Reese, Giorgio Valentini

Summary: GRAPE is a software resource for graph processing and embedding that can scale with big graphs, showing substantial improvements in space and time complexity compared to existing resources. It offers efficient graph-processing utilities, node embedding methods, and inference models, making it a valuable tool for graph representation learning. GRAPE is capable of handling millions of nodes and billions of edges, enabling large-graph analysis in various real-world applications.

NATURE COMPUTATIONAL SCIENCE (2023)

Article Biochemical Research Methods

An expectation-maximization framework for comprehensive prediction of isoform-specific functions

Guy Karlebach, Leigh Carmody, Jagadish Chandrabose Sundaramurthi, Elena Casiraghi, Peter Hansen, Justin Reese, Christopher J. Mungall, Giorgio Valentini, Peter N. Robinson

Summary: This article proposes a method called isoform interpretation to infer isoform-specific functions using expectation-maximization. It predicts specific functional annotations for 85,617 isoforms of 17,900 protein-coding genes and outperforms other methods in comparison to manually annotated results.

BIOINFORMATICS (2023)

Article Genetics & Heredity

Prioritization of non-coding elements involved in non-syndromic cleft lip with/without cleft palate through genome-wide analysis of de novo mutations

Hanna K. Zieger, Leonie Weinhold, Axel Schmidt, Manuel Holtgrewe, Stefan A. Juranek, Anna Siewert, Annika B. Scheer, Frederic Thieme, Elisabeth Mangold, Nina Ishorst, Fabian U. Brand, Julia Welzenbach, Dieter Beule, Katrin Paeschke, Peter M. Krawitz, Kerstin U. Ludwig

Summary: In this study, whole-genome sequence data from 211 European individuals with non-syndromic cleft lip/palate were analyzed, and 13,522 de novo mutations were identified. These mutations were enriched at two genome-wide association study risk loci, suggesting a convergence of common and rare variants at these loci. Additionally, mutations in the binding region of the Musculin transcription factor were found to contribute to the etiology of cleft lip/palate.

HUMAN GENETICS AND GENOMICS ADVANCES (2023)

Article Biochemical Research Methods

Term-BLAST-like alignment tool for concept recognition in noisy clinical texts

Tudor Groza, Honghan Wu, Marcel E. Dinger, Daniel Danis, Coleman Hilton, Anita Bagley, Jon R. Davids, Ling Luo, Zhiyong Lu, Peter N. Robinson

Summary: Methods for concept recognition in clinical texts are often tested on abstracts or articles, but texts from electronic health records (EHRs) often contain errors and nonstandard representations. This study presents a method inspired by the BLAST algorithm that screens texts for matches based on k-mer counts and scores candidates based on typical patterns of spelling errors. Experimental results show a significant enhancement in entity linking task performance, supporting the use of this method alongside existing approaches.

BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species

Tim E. Putman, Kevin Schaper, Nicolas Matentzoglu, Vincent P. Rubinetti, Faisal S. Alquaddoomi, Corey Cox, J. Harry Caufield, Glass Elsarboukh, Sarah Gehrke, Harshad Hegde, Justin T. Reese, Ian Braun, Richard M. Bruskiewich, Luca Cappelletti, Seth Carbon, Anita R. Caron, Lauren E. Chan, Christopher G. Chute, Katherina G. Cortes, Vinicius De Souza, Tommaso Fontana, Nomi L. Harris, Emily L. Hartley, Eric Hurwitz, Julius O. B. Jacobsen, Madan Krishnamurthy, Bryan J. Laraway, James A. McLaughlin, Julie A. McMurry, Sierra A. T. Moxon, Kathleen R. Mullen, Shawn T. O'Neil, Kent A. Shefchek, Ray Stefancsik, Sabrina Toro, Nicole A. Vasilevsky, Ramona L. Walls, Patricia L. Whetzel, David Osumi-Sutherland, Damian Smedley, Peter N. Robinson, Christopher J. Mungall, Melissa A. Haendel, Monica C. Munoz-Torres

Summary: The Monarch Initiative aims to bridge the gap between genetic variations, environmental determinants, and phenotypic outcomes by developing an integrated platform with open ontologies, semantic data models, and knowledge graphs. It provides advanced analysis tools and curated datasets for clinical diagnosis and understanding disease mechanisms.

NUCLEIC ACIDS RESEARCH (2023)