Article
Plant Sciences
Ilya Kirov, Elizaveta Kolganova, Maxim Dudnikov, Olga Yu. Yurkevich, Alexandra V. Amosova, Olga V. Muravenko
Summary: High-copy tandemly organized repeats (TRs), or satellite DNA, are an important and mysterious component of eukaryotic genomes. In this study, we developed NanoTRF, a new Python pipeline for the de novo identification of TRs in raw Nanopore sequencing data. NanoTRF can generate informative reports on TR genome abundance, monomer sequence, and length. It also performs annotation of transposable element sequences within or near satDNA arrays, providing insights into the co-evolution of TRs and transposable elements in the genome.
Article
Microbiology
Benjamin J. Callahan, Dmitry Grinevich, Siddhartha Thakur, Michael A. Balamotis, Tuval Ben Yehezkel
Summary: LoopSeq is a commercially available synthetic long-read (SLR) sequencing technology that generates highly accurate long reads from standard short reads, enabling direct identification of microbial genes and species in complex samples. Compared to standard Illumina amplicon sequencing, LoopSeq offers an order-of-magnitude improvement in length and accuracy, allowing accurate identification of species and strains from complex to low-biomass microbiome samples.
Review
Genetics & Heredity
Monika Cechova
Summary: The challenge of assembling short reads into a high-quality reference genome has been complicated by the repetitive nature of the human genome. The emergence of long reads has allowed for better characterization of difficult genomic regions and differentiation of identical sequences based on epigenetic marks. Although long reads still contain some sequencing errors, they provide new possibilities for solving the problem of multi-mapping reads.
Article
Biochemistry & Molecular Biology
Valentina Peona, Verena E. Kutschera, Mozes P. K. Blom, Martin Irestedt, Alexander Suh
Summary: This study explores the diversity and evolution of satDNA in bird species belonging to different genera using short- and long-read data. The results reveal rapid divergence of satDNA between closely related crow species, while satDNA appears more similar between birds-of-paradise species of different genera.
Article
Biochemical Research Methods
Son Hoang Nguyen, Minh Duc Cao, Lachlan J. M. Coin
Summary: npGraph is a streaming hybrid assembly tool that uses the assembly graph instead of separate pre-assembly contigs, resulting in a more complete genome assembly by resolving the path-finding problem on the assembly graph using long reads as the traversing guide. It provides real-time visualization of assembly progress and maintains a low computational cost.
PLOS COMPUTATIONAL BIOLOGY
(2021)
Article
Biology
Mikang Sim, Jongin Lee, Suyeon Wy, Nayoung Park, Daehwan Lee, Daehong Kwon, Jaebum Kim
Summary: A new method called PLR-GEN is proposed to generate pseudo-long reads from metagenomic short reads by considering small sequence variations existing in individual genomes. The use of these pseudo-long reads significantly improves metagenomic assembly in terms of sequence number, assembly contiguity, and prediction of species and genes.
Article
Biology
Arne Ludwig, Martin Pippel, Gene Myers, Michael Hiller
Summary: In this study, we present DENTIST, a sensitive, highly accurate, and automated pipeline method for closing gaps in short-read assemblies with long error-prone reads. Through tests on real genomic data, we demonstrate that DENTIST achieves higher accuracy and similar sensitivity compared to previous methods.
Article
Biochemical Research Methods
E. Sacristan-Horcajada, S. Gonzalez-de la Fuente, R. Peiro-Pastor, F. Carrasco-Ramiro, R. Amils, J. M. Requena, J. Berenguer, B. Aguado
Summary: Researchers have developed ARAMIS, an indel correction pipeline for NGS long reads that combines multiple correction tools in one step and uses accurate short reads to fix insertion and deletion errors in long-read sequencing. The study found systematic sequencing errors in PacBio sequences affecting homopolymeric regions, and that the types of indel errors introduced during PacBio sequencing are related to the GC content of the organism.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Biotechnology & Applied Microbiology
Fatih Karaoglanoglu, Cedric Chauve, Faraz Hach
Summary: Genion is a sensitive and fast gene fusion detection method that accurately identifies gene fusions in both simulated and real datasets, with better clustering accuracy than other methods. In the breast cancer cell line MCF-7, Genion correctly identifies all experimentally validated gene fusions.
Article
Biochemical Research Methods
Marek Kokot, Adam Gudys, Heng Li, Sebastian Deorowicz
Summary: The cost of maintaining a large amount of data generated by third-generation sequencing has become a significant concern in genomic research. Existing algorithms for compressing long reads have only a slight advantage over general-purpose gzip. In this study, we introduce CoLoRd, an algorithm that can significantly reduce the size of third-generation sequencing data without compromising the accuracy of downstream analyses.
Article
Biochemical Research Methods
Alaina Shumate, Brandon Wong, Geo Pertea, Mihaela Pertea
Summary: Short-read RNA sequencing and long-read RNA sequencing have their own strengths and weaknesses. The new release of StringTie allows for hybrid-read assembly, combining the strengths of both short and long reads to achieve higher accuracy and faster speed.
PLOS COMPUTATIONAL BIOLOGY
(2022)
Article
Biochemistry & Molecular Biology
Stephanie H. Chen, Maurizio Rossetto, Marlien van der Merwe, Patricia Lu-Irving, Jia-Yee S. Yap, Herve Sauquet, Greg Bourke, Timothy G. Amos, Jason G. Bragg, Richard J. Edwards
Summary: This study presents the first chromosome-level genome of Telopea speciosissima, providing valuable insights into the speciation, introgression, and adaptation processes. The genome assembly is of high quality and the annotation revealed important genetic information, such as the presence of CYCLOIDEA genes and potential gene duplications. The T. speciosissima reference genome will contribute to the conservation efforts of Proteaceae plants in Australia and beyond.
MOLECULAR ECOLOGY RESOURCES
(2022)
Article
Biochemistry & Molecular Biology
Tihana Vondrak, Ludmila Oliveira, Petr Novak, Andrea Koblizkova, Pavel Neumann, Jiri Macas
Summary: Long-range sequence analysis revealed the complex structure of heterochromatin regions containing major satellite repeats, with frequent interruptions by simple sequence repeats and targeted insertions of LINE retrotransposons. These data demonstrate that the organization of satellite repeats in heterochromatic chromosome bands can be more complex than previously thought, and show that heterochromatin organization can be efficiently investigated without genome assembly.
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
(2021)
Article
Biotechnology & Applied Microbiology
Hyunji Lee, Jun Kim, Junho Lee
Summary: Recent advances in long-read sequencing technologies have enabled accurate identification of genetic variants. In this study, two Caenorhabditis elegans strains were used to compare the performance of two long-read sequencing platforms, HiFi and CLR. HiFi identified more true-positive variants and fewer false-positive variants compared to CLR. Additionally, assembly-based variant calling was shown to be effective for detection of large insertions using accurate long-read sequencing data.
Article
Biotechnology & Applied Microbiology
Zhao Chen, David L. Erickson, Jianghong Meng
Summary: Oxford Nanopore sequencing is widely used for bacterial pathogen genome assembly, but high error rates in long reads necessitate polishing with Illumina short reads. NextPolish outperformed Pilon in improving genomic analyses of bacterial pathogens, requiring varying numbers of rounds for different strains. Simulated and real reads showed that the accuracy of genomic analyses depended on the optimization tool and the specific bacterial strain.
Article
Biochemical Research Methods
Nicholas Stoler, Barbara Arbeithuber, Gundula Povysil, Monika Heinzl, Renato Salazar, Kateryna D. Makova, Irene Tiemann-Boege, Anton Nekrutenko
BMC BIOINFORMATICS
(2020)
Article
Biology
Alison Barrett, Barbara Arbeithuber, Arslan Zaidi, Peter Wilton, Ian M. Paul, Rasmus Nielsen, Kateryna D. Makova
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES
(2020)
Article
Evolutionary Biology
Rahulsimham Vegesna, Marta Tomaszkiewicz, Oliver A. Ryder, Rebeca Campos-Sanchez, Paul Medvedev, Michael DeGiorgio, Kateryna D. Makova
GENOME BIOLOGY AND EVOLUTION
(2020)
Article
Biochemistry & Molecular Biology
Di Chen, Marzia A. Cremona, Zongtai Qi, Robi D. Mitra, Francesca Chiaromonte, Kateryna D. Makova
MOLECULAR BIOLOGY AND EVOLUTION
(2020)
Article
Biochemistry & Molecular Biology
Barbara Arbeithuber, James Hester, Marzia A. Cremona, Nicholas Stoler, Arslan Zaidi, Bonnie Higgins, Kate Anthony, Francesca Chiaromonte, Francisco J. Diaz, Kateryna D. Makova
Article
Biochemistry & Molecular Biology
Wilfried M. Guiblet, Marzia A. Cremona, Robert S. Harris, Di Chen, Kristin A. Eckert, Francesca Chiaromonte, Yi-Fei Huang, Kateryna D. Makova
Summary: Approximately 13% of the human genome can fold into non-canonical (non-B) DNA structures, which have been implicated in vital cellular processes. Non-B DNA hinders replication, increasing errors and facilitating mutagenesis, yet its contribution to genome-wide variation in mutation rates remains unexplored. Non-B DNA substantially contributes to variation in substitution frequencies at small and large scales, highlighting its role in germline mutagenesis with implications for evolution and genetic diseases.
NUCLEIC ACIDS RESEARCH
(2021)
Article
Biochemistry & Molecular Biology
Renato Salazar, Barbara Arbeithuber, Maja Ivankovic, Monika Heinzl, Sofia Moura, Ingrid Hartl, Theresa Mair, Angelika Lahnsteiner, Thomas Ebner, Omar Shebl, Johannes Proell, Irene Tiemann-Boege
Summary: Researchers have discovered highly recurrent selfish mutations associated with congenital disorders in the male germline. Using duplex sequencing, they examined the FGFR3 coding region and found that older donors harbor more mutations associated with congenital disorders.
Article
Multidisciplinary Sciences
Barbara Arbeithuber, Marzia A. Cremona, James Hester, Alison Barrett, Bonnie Higgins, Kate Anthony, Francesca Chiaromonte, Francisco J. Diaz, Kateryna D. Makova
Summary: Duplex sequencing technology reveals the accumulation of mtDNA mutations in somatic tissues and germline cells of primates as they age. The frequency of these mutations significantly increases in liver and muscle tissues with age, while it stabilizes in oocytes of older animals after 9 years of age.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
(2022)
Review
Cell Biology
Monika Cechova, Karen H. Miga
Summary: This review focuses on the biology of satellite DNA on human X and Y chromosomes and its impact on sex chromosome aneuploidies. The findings provide insights into the prevalence and consequences of these aneuploidies.
SEMINARS IN CELL & DEVELOPMENTAL BIOLOGY
(2022)
Review
Genetics & Heredity
Kateryna D. Makova, Matthias H. Weissensteiner
Summary: In addition to the canonical right-handed double helix, non-B DNA structures can form in genomes across the tree of life. These structures regulate cellular processes and have the potential to drive genomic and phenotypic evolution. Recent studies have established non-B DNA as novel functional elements subject to natural selection, affecting the evolution of transposable elements and centromeres. Evolutionary analyses should consider not only DNA sequence, but also its structure.
TRENDS IN GENETICS
(2023)
Article
Evolutionary Biology
Marta Tomaszkiewicz, Kristoffer Sahlin, Paul Medvedev, Kateryna D. Makova
Summary: This study decoded the transcript sequences of nine YAG families in six great ape species and found evolutionarily conserved alternative splicing patterns in most families. It revealed that BPY2 and PRY families have distinct features and that the PRY family is undergoing pseudogenization. No selection signatures were detected in the YAG families shared among great apes, but many species-specific protein-coding transcripts were identified. Consensus disorder regions were predicted, providing a resource for future studies on male infertility.
GENOME BIOLOGY AND EVOLUTION
(2023)
Review
Veterinary Sciences
Monika Cechova, Michaela Andrlikova
Summary: Cattle, as one of the most important farm animals, have undergone intense selection and genetic testing to enhance their agricultural potential. Modern technologies such as gene editing and in vitro embryo production are being used to accelerate the breeding process for genetically superior animals, adapting to changing environments and demands.
ACTA VETERINARIA BRNO
(2021)