Article
Biochemical Research Methods
Timo Lassmann
Summary: SAMStat is an efficient program for extracting quality control metrics from FASTQ and SAM/BAM files. It displays sequence composition, base quality composition, and mapping error profiles based on mapping quality, allowing users to quickly identify reasons for poor mapping. A major update to SAMStat now supports paired-end and long-read data, with quality control plots drawn using the Plotly JavaScript library.
Article
Biochemistry & Molecular Biology
Markus Pfenninger, Philipp Schoennenbeck, Tilman Schell
Summary: Accurate estimation of genome sizes is essential in biodiversity genomics, and this study introduces a method that can estimate genome size from the number of sequenced bases and mean sequencing depth. Simulations demonstrate that even from low-coverage genome drafts, reasonable estimates can be obtained using this method. Comparison with flow cytometry estimates suggests that both methods provide similar and interchangeable results.
MOLECULAR ECOLOGY RESOURCES
(2022)
Review
Biochemical Research Methods
Sergey Knyazev, Lauren Hughes, Pavel Skums, Alexander Zelikovsky
Summary: Advancements in next-generation sequencing (NGS) have enabled detailed assessment of viral population complexity within hosts, yielding crucial epidemiological and biomedical information. However, handling rapidly mutating viral populations requires complex analysis of the NGS data.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Computer Science, Interdisciplinary Applications
Tsozen Yeh, Yulin Chen
Summary: This paper discusses how to accelerate job execution speed in hybrid cloud environments by optimizing the data transfer process. By designing and implementing a new model, the execution time of jobs was reduced significantly, improving data access efficiency and having a positive impact on cloud computing environments.
SIMULATION MODELLING PRACTICE AND THEORY
(2021)
Article
Biochemistry & Molecular Biology
Lei Zhao, Rasmus Nielsen, Thorfinn Sand Korneliussen
Summary: Commonly used methods for inferring phylogenies are not well-suited for handling challenges associated with noisy, diploid sequencing data. To address this problem, we introduce two new probabilistic approaches, distAngsd-geno and distAngsd-nuc, that account for uncertainty in genotype calling and are specifically designed for next-generation sequencing data.
MOLECULAR BIOLOGY AND EVOLUTION
(2022)
Review
Biochemical Research Methods
Tingting Gong, Vanessa M. Hayes, Eva K. F. Chan
Summary: This review highlights important factors affecting somatic SV detection and compares the performance of seven commonly used SV callers. By focusing on changes in sensitivity and precision for detecting different SV types and size ranges from samples with varying variant allele frequencies and sequencing depths, the evaluation findings extend beyond the seven SV callers examined in this paper.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Computer Science, Information Systems
Tsozen Yeh, Chiahung Sun
Summary: Cloud and IoT technologies are widely used in many aspects of our lives, where cloud systems maintain and process the large amount of data collected. Ensuring data reliability is crucial in case of hardware failure. Although storing multiple copies of data blocks helps, it still cannot guarantee reliability. To overcome this, a new scheme in Hadoop was designed to efficiently identify data inconsistency between clouds, greatly enhancing data reliability.
INFORMATION SYSTEMS FRONTIERS
(2023)
Article
Biochemical Research Methods
Rasmus Amund Henriksen, Lei Zhao, Thorfinn Sand Korneliussen
Summary: This article presents a multithreaded simulator for next-generation sequencing data that simulates reads significantly faster than currently available tools. It can simulate reads with platform-specific characteristics based on nucleotide quality score profiles and includes a post-mortem damage model for simulating ancient DNA. It also offers additional features and output formats.
Article
Biochemical Research Methods
Jie Huang, Stefano Pallotti, Qianling Zhou, Marcus Kleber, Xiaomeng Xin, Daniel A. King, Valerio Napolioni
Summary: The development of PERHAPS allows for direct calling of haplotypes from short-read, paired-end NGS data with high reliability. By applying this method, the study successfully extracted haplotype data related to APOE polymorphism and identified the rare APOE(*)1 haplotype in the African population.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Microbiology
Ilayda Akacin, Seymanur Ersoy, Osman Doluca, Mine Gungormusler
Summary: This review provides a comprehensive overview of recent literature on the utilization of TGS and NGS technologies in microbial metagenomics research. It discusses the advantages and limitations of these technologies and presents real-time examples of novel applications in clinical microbiology and public health, food and agriculture, energy and environment, arts and space.
MICROBIOLOGICAL RESEARCH
(2022)
Article
Genetics & Heredity
Karim Hasanpur, Sevda Hosseinzadeh, Atiye Mirzaaghayi, Sadegh Alijani
Summary: Accurate normalization of gene expression assays using housekeeping genes is crucial, but there is no consensus on the suitable set of housekeeping genes for quantitative real-time PCR analyses of chicken tissues. This study utilized high-throughput gene expression data to identify the most suitable and stable reference genes for 16 chicken tissues. The results revealed tissue-specific sets of reference genes and disproved the suitability of previously widely used housekeeping genes. The newly identified reference genes can contribute to more accurate normalization for future expression analysis of chicken genes.
FRONTIERS IN GENETICS
(2022)
Article
Biochemistry & Molecular Biology
Christos Tzaferis, Evangelos Karatzas, Fotis A. Baltoumas, Georgios A. Pavlopoulos, George Kollias, Dimitris Konstantopoulos
Summary: Analysis and interpretation of high-throughput transcriptional and chromatin accessibility data at single-cell resolution remain challenging in the biomedical field. SCALA is a bioinformatics tool for analyzing and visualizing single-cell RNA sequencing and Assay for Transposase-Accessible Chromatin using sequencing datasets. It offers independent or integrative analysis options and various analysis modules to help biomedical researchers explore, analyze, and visualize their data without coding experience.
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
(2023)
Article
Computer Science, Theory & Methods
Su Ozgur, Mehmet Orman
Summary: This study investigates the application of deep learning and machine learning algorithms in the analysis of biological data. By evaluating real and simulated genomic data, the researchers assess the prediction performance of different algorithms under different parameters. The findings provide guidance for genomic scientists on how to effectively utilize deep learning/machine learning methods for the analysis of human genomic data.
JOURNAL OF BIG DATA
(2023)
Review
Biochemical Research Methods
Alba Gutierrez-Sacristan, Carlos De Niz, Cartik Kothari, Sek Won Kong, Kenneth D. Mandl, Paul Avillach
Summary: Precision medicine aims to provide the best diagnosis, treatment, and prognosis for each individual by tailoring therapeutic approaches to their genetic profile, lifestyle, and environmental exposures. However, to achieve this goal, researchers need access to large-scale clinical and genomic data repositories, which can be challenging to locate and obtain.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Biochemical Research Methods
Brendan O'Fallon, Jacob Durtschi, Ana Kellogg, Tracey Lewis, Devin Close, Hunter Best
Summary: This study proposes two algorithmic adaptations to improve the accuracy of CNV detection in a Hidden Markov Model (HMM) context. First, it improves the accuracy by computing target- and copy number-specific emission distributions. Second, it enhances the sensitivity for small CNV calls using the Pointwise Maximum a posteriori (PMAP) HMM decoding procedure. The prototype implementation, called Cobalt, shows similar sensitivity to other CNV detection tools but significantly reduces false positive detections.
BMC BIOINFORMATICS
(2022)
Article
Computer Science, Information Systems
Asad Javed, Jeremy Robert, Keijo Heljanko, Kary Framling
JOURNAL OF GRID COMPUTING
(2020)
Article
Computer Science, Artificial Intelligence
Abdul Rehman Javed, Mirza Omer Beg, Muhammad Asim, Thar Baker, Ali Hilal Al-Bayatti
JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING
(2020)
Article
Engineering, Chemical
Hossein Mostafaei, Teemu Ikonen, Jason Kramb, Tewodros Deneke, Keijo Heljanko, Iiro Harjunkoski
INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH
(2020)
Article
Engineering, Chemical
Iiro Harjunkoski, Teemu Ikonen, Hossein Mostafaei, Tewodros Deneke, Keijo Heljanko
INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH
(2020)
Article
Biochemical Research Methods
Kim T. Gurwitz, Prakash Singh Gaur, Louisa J. Bellis, Lee Larcombe, Eva Alloza, Balint Laszlo Balint, Alexander Botzki, Jure Dimec, Victoria Dominguez del Angel, Pedro L. Fernandes, Eija Korpelainen, Roland Krause, Mateusz Kuzak, Loredana Le Pera, Brane Leskosek, Jessica M. Lindvall, Diana Marek, Paula A. Martinez, Tuur Muyldermans, Stale Nygard, Patricia M. Palagi, Hedi Peterson, Fotis Psomopoulos, Vojtech Spiwok, Celia W. G. van Gelder, Allegra Via, Marko Vidak, Daniel Wibberg, Sarah L. Morgan, Gabriella Rustici
PLOS COMPUTATIONAL BIOLOGY
(2020)
Article
Computer Science, Interdisciplinary Applications
Teemu J. Ikonen, Keijo Heljanko, Iiro Harjunkoski
COMPUTERS & CHEMICAL ENGINEERING
(2020)
Article
Computer Science, Software Engineering
Antti Siirtola, Keijo Heljanko
SCIENCE OF COMPUTER PROGRAMMING
(2020)
Article
Engineering, Chemical
Lingqing Yan, Tewodros L. Deneke, Keijo Heljanko, Iiro Harjunkoski, Thomas F. Edgar, Michael Baldea
Summary: Process intensification aims to make chemical processes safer and more efficient through significant modifications to design and structure. Dynamic process intensification (DPI) introduces operational changes to achieve the same product generation as steady-state operation, but with improved economics. The novel dynamic optimization-based DPI (Do-DPI) strategy involves true cyclic operation and can reduce energy use while maintaining product quality and production rate.
INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH
(2021)
Article
Multidisciplinary Sciences
Altti Ilari Maarala, Ossi Arasalo, Daniel Valenzuela, Veli Makinen, Keijo Heljanko
Summary: Computational pan-genomics analyzes information from multiple individual genomes to discover genetic variation thoroughly. With the rapid growth of whole-genome sequencing data, efficient data compression and indexing methods are crucial, especially for exploiting distributed and parallel computing more effectively.
Article
Engineering, Chemical
Teemu J. Ikonen, Keijo Heljanko, Iiro Harjunkoski
Summary: Periodic rescheduling is an iterative method used for real-time decision-making in industrial process operations. The design of such methods involves high-level decisions on when and how to schedule, with optimal choices depending on the operating environment. We propose the use of surrogate-based optimization to determine continuous control parameter choices, reducing computational costs.
Proceedings Paper
Computer Science, Hardware & Architecture
Emily Yu, Armin Biere, Keijo Heljanko
Summary: A formal framework was presented to certify k-induction-based model checking results, utilizing the concept of k-witness circuit and a simple inductive invariant. The approach reduces the certification problem to pure SAT checks and checking a simple QBF with one quantifier alternation in order to allow proofs to be checked with an independent proof checker. The resulting certification toolkit CERTIFAIGER was evaluated on instances from the hardware model checking competition, demonstrating the practical use of the certification method.
COMPUTER AIDED VERIFICATION, PT II, CAV 2021
(2021)
Proceedings Paper
Computer Science, Software Engineering
Natalia Gavrilenko, Hernan Ponce-de-Leon, Florian Furbach, Keijo Heljanko, Roland Meyer
COMPUTER AIDED VERIFICATION, CAV 2019, PT I
(2019)
Proceedings Paper
Computer Science, Information Systems
Asad Javed, Narges Yousefnezhad, Jeremy Robert, Keijo Heljanko, Kary Framling
2019 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND COMMUNICATIONS WORKSHOPS (PERCOM WORKSHOPS)
(2019)
Article
Biochemical Research Methods
Teresa K. Attwood, Sarah Blackford, Michelle D. Brazas, Angela Davies, Maria Victoria Schneider
BRIEFINGS IN BIOINFORMATICS
(2019)
Proceedings Paper
Computer Science, Interdisciplinary Applications
Hernan Ponce-de-Leon, Florian Furbach, Keijo Heljanko, Roland Meyer
PROCEEDINGS OF THE 2018 18TH CONFERENCE ON FORMAL METHODS IN COMPUTER AIDED DESIGN (FMCAD)
(2018)