4.7 Letter

Misunderstood parameter of NCBI BLAST impacts the correctness of bioinformatics workflows

Journal

BIOINFORMATICS
Volume 35, Issue 9, Pages 1613-1614

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bty833

Funding

  1. US National Science Foundation [IIS-1513615]

Recommended

Article Biochemical Research Methods

MAGUS: Multiple sequence Alignment using Graph clUStering

Vladimir Smirnov, Tandy Warnow

Summary: MAGUS is a new technique for computing large-scale alignments, similar to PASTA but faster and more accurate. It utilizes a divide-and-conquer approach and merges subset alignments using a Graph Clustering Merger.

BIOINFORMATICS (2021)

Article Microbiology

Genomic Drivers of Multidrug-Resistant Shigella Affecting Vulnerable Patient Populations in the United States and Abroad

Jay Noboru Worley, Kiran Javkar, Maria Hoffmann, Kristen Hysell, Amanda Garcia-Williams, Kaitlin Tagg, Sanjat Kanjilal, Errol Strain, Mihai Pop, Marc Allard, Louise Francois Watkins, Lynn Bry

Summary: MDR Shigella infections are a global concern among MSM, with new macrolide-resistant strains complicating treatment. Genomic analyses reveal resistant genes in US Shigella isolates and the receptivity of certain strains to plasmid acquisition. Leveraging integrated genomic-epidemiologic analyses can guide targeted clinical actions and public health efforts to combat the spread of multidrug-resistant Shigella.

Article Microbiology

Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins

Harihara Subrahmaniam Muralidharan, Nidhi Shah, Jacquelyn S. Meisel, Mihai Pop

Summary: High-throughput sequencing has transformed microbiology, but reconstructing complete genomes from metagenomic data is still challenging because the data are fragmented. Scientists use binning to cluster contigs from the same organism, and this study suggests using assembly graphs to improve binning strategies. The Binnacle tool extracts information from assembly graphs to cluster scaffolds into comprehensive bins, enhancing the quality and contiguity of the resulting bins.

FRONTIERS IN MICROBIOLOGY (2021)

Article Evolutionary Biology

DISCO: Species Tree Inference using Multicopy Gene Family Tree Decomposition

James Willson, Mrinmoy Saha Roddur, Baqiao Liu, Paul Zaharias, Tandy Warnow

Summary: Gene tree heterogeneity poses a challenge for species tree inference, but the introduction of DISCO, a new approach that decomposes multi-copy gene family trees into single copy trees, improves the accuracy of species tree estimation.

SYSTEMATIC BIOLOGY (2022)

Article Biochemical Research Methods

MAGUS plus eHMMs: improved multiple sequence alignment accuracy for fragmentary sequences

Chengze Shen, Paul Zaharias, Tandy Warnow

Summary: Multiple sequence alignment is a key step in bioinformatics pipelines, but it is challenging to estimate alignments on datasets with fragmentary sequences. This paper examines a new MSA method called MAGUS, which is robust to fragmentary sequences under many conditions, and shows that using a two-stage approach improves alignment accuracy.

BIOINFORMATICS (2022)

Article Biochemical Research Methods

Quintet Rooting: rooting species trees under the multi-species coalescent model

Yasamin Tabatabaee, Kowshika Sarker, Tandy Warnow

Summary: This article presents Quintet Rooting (QR), a method for rooting species trees based on a proof of identifiability of the rooted species tree under the multi-species coalescent model. The method is shown to be generally more accurate than other rooting methods, except under extreme levels of gene tree estimation error.

BIOINFORMATICS (2022)

Article Pharmacology & Pharmacy

Gut Microbiome-Wide Search for Bacterial Azoreductases Reveals Potentially Uncharacterized Azoreductases Encoded in the Human Gut Microbiome

Domenick J. Braccia, Glory Minabou Ndjite, Ashley Weiss, Sophia Levy, Stephenie Abeysinghe, Xiaofang Jiang, Mihai Pop, Brantley Hall

Summary: The human gut microbiome contains numerous azoreductases that play a vital role in modifying orally administered drugs. Through analyzing bacterial azoreductases and genome sequences, this study identified putative azo-reducing species and hypothesized the presence of uncharacterized azoreductases in prominent strains of the human gut microbiome.

DRUG METABOLISM AND DISPOSITION (2023)

Review Biology

Recent progress on methods for estimating and updating large phylogenies

Paul Zaharias, Tandy Warnow

Summary: This article introduces some recent advances in highly accurate phylogeny estimation on large datasets, including divide-and-conquer techniques, methods for estimating species trees from multi-locus datasets and addressing heterogeneity, and methods for adding sequences into large gene trees or species trees.

PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES (2022)

Article Biochemical Research Methods

Large-Scale Multiple Sequence Alignment and the Maximum Weight Trace Alignment Merging Problem

Paul Zaharias, Vladimir Smirnov, Tandy Warnow

Summary: MAGUS is an accurate multiple sequence alignment method that uses divide-and-conquer and the Graph Clustering Merger (GCM) for merging alignments. The study shows that GCM is a good heuristic for the NP-hard MWT-AM problem and suggests a new direction for large-scale MSA estimation based on improved divide-and-conquer strategies. MAGUS and its enhanced versions can be found at https://github.com/vlasmirnov/MAGUS.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Article Biochemical Research Methods

SCAMPP: Scaling Alignment-Based Phylogenetic Placement to Large Trees

Eleanor Wedell, Yirong Cai, Tandy Warnow

Summary: SCAMPP is a technique that extends the scalability of likelihood-based phylogenetic placement methods to ultra-large backbone trees, achieving accurate evolutionary tree classification. It can handle ultra-large backbone trees with 50,000 or more leaves and has higher accuracy compared to other fast phylogenetic placement methods.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Article Biochemical Research Methods

UPP2: fast and accurate alignment of datasets with fragmentary sequences

Minhyuk Park, Stefan Ivanovic, Gillian Chu, Chengze Shen, Tandy Warnow

Summary: UPP2 is an improvement on UPP, with a fast technique for selecting HMMs in the ensemble, achieving the same accuracy as UPP but with reduced runtime.

BIOINFORMATICS (2023)

Article Information Science & Library Science

Center-periphery structure in research communities

Eleanor Wedell, Minhyuk Park, Dmitriy Korobskiy, Tandy Warnow, George Chacko

Summary: Clustering and community detection in networks are widely studied topics. This study focuses on detecting communities of scientific publications linked by citations and develops a modular pipeline based on the k-core algorithm to find publication communities. Through quantitative and qualitative evaluation on a citation network of over 14 million publications in the extracellular vesicles field, the approach is compared with the widely used Leiden algorithm for community detection.

QUANTITATIVE SCIENCE STUDIES (2022)

Article Biochemical Research Methods

WITCH: Improved Multiple Sequence Alignment Through Weighted Consensus Hidden Markov Model Alignment

Chengze Shen, Minhyuk Park, Tandy Warnow

Summary: Accurate multiple sequence alignment is challenging, especially for data sets with sequence length heterogeneity. Existing methods have made progress on other challenges, but sequence length heterogeneity remains a significant issue. This study introduces a new method, WITCH, which improves alignment accuracy by weighting and ranking HMMs, using multiple HMMs, and using a consensus algorithm that considers the weights.

JOURNAL OF COMPUTATIONAL BIOLOGY (2022)

Article Biochemical Research Methods

Scalable Species Tree Inference with External Constraints

Baqiao Liu, Tandy Warnow

Summary: This study introduces two new methods, NJst-J and FASTRAL-J, for estimating the species tree based on partial knowledge. The results show that both NJst-J and FASTRAL-J are faster than ASTRAL-J, and all three methods are statistically consistent under the given constraint.

JOURNAL OF COMPUTATIONAL BIOLOGY (2022)

Article Biochemical Research Methods

Re-evaluating Deep Neural Networks for Phylogeny Estimation: The Issue of Taxon Sampling

Paul Zaharias, Martin Grosshauser, Tandy Warnow

Summary: This study evaluated the accuracy of recently trained DNNs in comparison to standard phylogeny estimation methods on simulated datasets with similar and higher rates of evolution. The results showed that DNNs were less accurate than standard methods for quartet accuracy, and global methods had higher accuracy on large datasets.

JOURNAL OF COMPUTATIONAL BIOLOGY (2022)
