Article
Biology
Stephan Schmeing, Mark D. Robinson
Summary: Continuity, correctness, and completeness of genome assemblies are crucial for biological projects. Long reads are beneficial for high-quality genomes, but not everyone can achieve the required coverage. Therefore, improving existing assemblies with low-coverage long reads is a promising alternative, involving correction, scaffolding, and gap filling. We propose a new tool, gapless, which combines all three tasks using PacBio or Oxford Nanopore reads. It can be accessed at https://github.com/schmeing/gapless.
LIFE SCIENCE ALLIANCE
(2023)
Article
Biochemical Research Methods
Lauren Coombe, Janet X. Li, Theodora Lo, Johnathan Wong, Vladimir Nikolic, Rene L. Warren, Inanc Birol
Summary: LongStitch is a scalable pipeline that corrects and scaffolds draft genome assemblies exclusively using long reads. It incorporates multiple tools developed by the group and runs in up to three stages, including initial assembly correction (Tigmint-long), followed by two incremental scaffolding stages (ntLink and ARKS-long). Tested on various organisms and consistently improving assembly contiguity compared to other tools, LongStitch is expected to benefit a wide variety of de novo genome assembly projects.
BMC BIOINFORMATICS
(2021)
Article
Multidisciplinary Sciences
Ying Chen, Fan Nie, Shang-Qian Xie, Ying-Feng Zheng, Qi Dai, Thomas Bray, Yao-Xin Wang, Jian-Feng Xing, Zhi-Jian Huang, De-Peng Wang, Li-Juan He, Feng Luo, Jian-Xin Wang, Yi-Zhi Liu, Chuan-Le Xiao
Summary: NECAT, an error correction and de novo assembly tool developed by the authors, efficiently produces high-quality assemblies from nanopore reads. The tool uses adaptive read selection and a two-step progressive method to overcome the high error rates in nanopore reads.
NATURE COMMUNICATIONS
(2021)
Article
Multidisciplinary Sciences
Mohamed Awad, Xiangchao Gan
Summary: This paper introduces GALA, a computational framework for chromosome-based sequencing data separation and gap-free de novo assembly. It allows integration of different data sources and addresses the challenge of achieving gap-free chromosome-scale assemblies using current workflows for long-read platforms. The method is demonstrated through the assembly of various genomes.
NATURE COMMUNICATIONS
(2023)
Article
Biochemical Research Methods
Junwei Luo, Ting Guan, Guolin Chen, Zhonghua Yu, Haixia Zhai, Chaokun Yan, Huimin Luo
Summary: This article introduces SLHSD, a hybrid scaffolding method that combines the advantages of short reads and long reads to construct an optimal scaffold graph for assembly. Experimental results indicate that SLHSD outperforms other methods.
BRIEFINGS IN BIOINFORMATICS
(2023)
Article
Biochemistry & Molecular Biology
Shunhua Han, Guilherme B. Dias, Preston J. Basting, Raghuvir Viswanatha, Norbert Perrimon, Casey M. Bergman
Summary: Animal cell lines often undergo extreme genome restructuring events that hinder de novo whole-genome assembly. This study used long-read and linked-read technologies to sequence the genome of a tetraploid Drosophila cell line and developed a novel method called TELR for transposable element (TE) analysis. The results shed light on the role and mechanism of TEs in animal cell culture genome evolution.
NUCLEIC ACIDS RESEARCH
(2022)
Article
Biotechnology & Applied Microbiology
Guillaume Holley, Doruk Beyter, Helga Ingimundardottir, Peter L. Moller, Snodis Kristmundsdottir, Hannes P. Eggertsson, Bjarni Halldorsson
Summary: Ratatosk corrects long reads using short-read data, reducing the error rate of long reads 6-fold on average and significantly improving the accuracy of SNP and indel calls. An assembly of Ratatosk-corrected reads from an individual showed a better contig N50 and fewer misassemblies than an assembly of PacBio HiFi reads.
Article
Biochemical Research Methods
Xiaowen Feng, Haoyu Cheng, Daniel Portik, Heng Li
Summary: hifiasm-meta is a software tool designed for assembling metagenomes using high-accuracy long-read data, which can reconstruct bacterial genomes in microbial communities more accurately.
Article
Biochemistry & Molecular Biology
Atif Rahman, Lior Pachter
Summary: This study presents a method for scaffolding in genome assembly using second-generation sequencing reads, based on a generative model for sequencing that estimates the likelihoods of linking contigs. The method, implemented in SWALO, consistently makes more or a similar number of correct joins and very few incorrect joins compared to other scaffolders, demonstrating the potential for substantial improvement in genome assembly using statistical models. SWALO is freely available for download at https://atifrahman.github.io/SWALO.
NUCLEIC ACIDS RESEARCH
(2021)
Article
Biotechnology & Applied Microbiology
Xueyan Shen, Yong Chao Niu, Joseph Angelo V. Uichanco, Norman Phua, Pranjali Bhandare, Natascha May Thevasagayam, Sai Rama Sridatta Prakki, Laszlo Orban
Summary: For Asian seabass cultured in sea cages, aquatic pathogens, environmental factors, and stress are major causes of disease, resulting in substantial economic losses. Researchers identified a genomic region associated with increased robustness in response to pathogen-infected marine environments. This finding has potential applications in selecting robust Asian seabass lines for breeding programs.
Article
Biochemical Research Methods
Markus Hiltunen, Martin Ryberg, Hanna Johannesson
Summary: ARBitR was developed to accurately recreate regions where draft assemblies are broken by taking overlaps into account during scaffolding.
Review
Biochemical Research Methods
Junwei Luo, Yawei Wei, Mengna Lyu, Zhengjiang Wu, Xiaoyan Liu, Huimin Luo, Chaokun Yan
Summary: This article discusses the significance of scaffolding methods in genome assembly and the challenges they face, as well as the impact of various types of reads on assembly quality. It emphasizes the importance of researchers gaining a deep understanding of the latest scaffolding methods to address these challenges.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Biochemistry & Molecular Biology
Stephanie H. Chen, Maurizio Rossetto, Marlien van der Merwe, Patricia Lu-Irving, Jia-Yee S. Yap, Herve Sauquet, Greg Bourke, Timothy G. Amos, Jason G. Bragg, Richard J. Edwards
Summary: This study presents the first chromosome-level genome of Telopea speciosissima, providing valuable insights into speciation, introgression, and adaptation processes. The genome assembly is of high quality, and the annotation revealed important genetic information, such as the presence of CYCLOIDEA genes and potential gene duplications. The T. speciosissima reference genome will contribute to the conservation efforts of Proteaceae plants in Australia and beyond.
MOLECULAR ECOLOGY RESOURCES
(2022)
Article
Biochemical Research Methods
Philip C. Dishuck, Allison N. Rozanski, Glennis A. Logsdon, David Porubsky, Evan E. Eichler
Summary: The study presents GAVISUNK, an open-source pipeline that detects misassemblies and determines a set of reliable regions in the genome by assessing the concordance of distances between unique k-mers in high-fidelity assemblies and raw sequencing data.
Article
Biology
Arne Ludwig, Martin Pippel, Gene Myers, Michael Hiller
Summary: In this study, we present DENTIST, a sensitive, highly accurate, and automated pipeline for closing gaps in short-read assemblies with long error-prone reads. Through tests on real genomic data, we demonstrate that DENTIST achieves higher accuracy and similar sensitivity compared to previous methods.