Article

HGT-Finder: A New Tool for Horizontal Gene Transfer Finding and Application to Aspergillus genomes

Journal

TOXINS
Volume 7, Issue 10, Pages 4035-4053

Publisher

MDPI AG
DOI: 10.3390/toxins7104035

Keywords

horizontal gene transfer; HGT; gene clusters; secondary metabolism; Aspergillus; bioinformatics; software

Funding

  1. National Institutes of Health [1R15GM114706]
  2. NIU

Horizontal gene transfer (HGT) is a fast-track mechanism that allows genetically unrelated organisms to exchange genes for rapid environmental adaptation. We developed a new phyletic distribution-based software, HGT-Finder, which implements a novel bioinformatics algorithm to calculate a horizontal transfer index and a probability value for each query gene. Applying this new tool to the Aspergillus fumigatus, Aspergillus flavus, and Aspergillus nidulans genomes, we found 273, 542, and 715 horizontally transferred genes (HTGs), respectively. HTGs have shorter length, higher guanine-cytosine (GC) content, and relaxed selection pressure. Metabolic process and secondary metabolism functions are significantly enriched in HTGs. Gene clustering analysis showed that 61%, 41%, and 74% of HTGs in the three genomes form physically linked gene clusters (HTGCs). Overlapping manually curated secondary metabolite gene clusters (SMGCs) with HTGCs showed that 9 of the 33 A. fumigatus SMGCs and 31 of the 65 A. nidulans SMGCs share genes with HTGCs, and that HTGs are significantly enriched in SMGCs. Our genome-wide analysis thus presents strong evidence to support the hypothesis that HGT has played a critical role in the evolution of SMGCs. The program is freely available at http://cys.bios.niu.edu/HGTFinder/HGTFinder.tar.gz.
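
The abstract mentions two analyses without giving their details: grouping HTGs into physically linked clusters, and testing whether HTGs are enriched in a gene set such as SMGCs. The following is a minimal Python sketch of how such analyses could look. It is not HGT-Finder's actual code or algorithm. The function names, the `max_gap` and `min_size` parameters, the choice of a one-sided hypergeometric test, and the toy numbers are all assumptions made for illustration.

```python
from math import comb


def cluster_linked_genes(htg_positions, max_gap=1, min_size=2):
    """Group HTGs on one chromosome into clusters by gene order.

    htg_positions: gene-order indices (0, 1, 2, ...) of the HTGs.
    max_gap: non-HTG genes tolerated between neighbouring HTGs
             (hypothetical parameter, not taken from the paper).
    min_size: minimum number of HTGs for a run to count as a cluster
              (hypothetical parameter, not taken from the paper).
    """
    clusters, current = [], []
    for pos in sorted(set(htg_positions)):
        # Close the current run when too many non-HTG genes intervene.
        if current and pos - current[-1] - 1 > max_gap:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


def enrichment_p_value(total_genes, total_htgs, subset_size, htgs_in_subset):
    """One-sided hypergeometric P(X >= htgs_in_subset).

    Models drawing `subset_size` genes (e.g. all SMGC genes) at random
    from a genome with `total_genes` genes, `total_htgs` of them HTGs.
    """
    upper = min(total_htgs, subset_size)
    tail = sum(
        comb(total_htgs, k) * comb(total_genes - total_htgs, subset_size - k)
        for k in range(htgs_in_subset, upper + 1)
    )
    return tail / comb(total_genes, subset_size)


if __name__ == "__main__":
    # Toy gene-order indices, not data from the paper.
    print(cluster_linked_genes([3, 4, 6, 10, 20, 21, 22], max_gap=1))
    # → [[3, 4, 6], [20, 21, 22]]

    # Toy counts, not data from the paper.
    print(enrichment_p_value(total_genes=10, total_htgs=3,
                             subset_size=4, htgs_in_subset=2))
```

In the toy run, the isolated gene at index 10 is dropped because it has no HTG neighbour within `max_gap`. A small p-value from `enrichment_p_value` would suggest that the subset holds more HTGs than expected by chance under this simple model.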

Recommended

Article Biochemistry & Molecular Biology

dbCAN-PUL: a database of experimentally characterized CAZyme gene clusters and their substrates

Catherine Ausland, Jinfang Zheng, Haidong Yi, Bowen Yang, Tang Li, Xuehuan Feng, Bo Zheng, Yanbin Yin

Summary: PULs are gene clusters containing CAZymes and other genes that digest and utilize carbohydrate substrates, and dbCAN-PUL is an online database displaying experimentally verified CAZyme-containing PULs with metadata, sequences, and annotation. Compared to other resources, dbCAN-PUL offers new features such as batch download, annotation, external links, homologous gene cluster display, and BLASTX query service.

NUCLEIC ACIDS RESEARCH (2021)

Article Biotechnology & Applied Microbiology

Polyphenol Utilization Proteins in the Human Gut Microbiome

Bo Zheng, Yinchao He, Pengxiang Zhang, Yi-Xin Huo, Yanbin Yin

Summary: This study curated experimentally characterized polyphenol utilization proteins (PUPs) and their homologs, identified potential players in polyphenol metabolism in the human gut microbiome, and found that Africans have higher abundance and prevalence of PUP homologs and gene clusters than other populations.

APPLIED AND ENVIRONMENTAL MICROBIOLOGY (2022)

Article Biochemical Research Methods

HDMC: a novel deep learning-based framework for removing batch effects in single-cell RNA-seq data

Xiao Wang, Jia Wang, Han Zhang, Shenwei Huang, Yanbin Yin

Summary: Researchers propose a novel deep learning approach, which is a hierarchical distribution-matching framework assisted with contrastive learning, to address batch effects in single-cell RNA sequencing data. This method effectively reduces distribution differences between different batches and aligns samples from different batches to recover cell type clusters.

BIOINFORMATICS (2022)

Article Biochemical Research Methods

LR-GNN: a graph neural network based on link representation for predicting molecular associations

Chuanze Kang, Han Zhang, Zhuo Liu, Shenwei Huang, Yanbin Yin

Summary: This paper presents a novel GNN method LR-GNN based on link representation learning for accurately predicting molecular associations. Experimental results show that LR-GNN outperforms state-of-the-art methods and demonstrates robust ability to predict unknown associations. Visualizations also validate the effectiveness of the link representation used in LR-GNN.

BRIEFINGS IN BIOINFORMATICS (2022)

Letter Plant Sciences

The chromosome-level rambutan genome reveals a significant role of segmental duplication in the expansion of resistance genes

Jinfang Zheng, Lyndel W. Meinhardt, Ricardo Goenaga, Tracie Matsumoto, Dapeng Zhang, Yanbin Yin

HORTICULTURE RESEARCH (2022)

Article Biochemical Research Methods

Critical assessment of pan-genomic analysis of metagenome-assembled genomes

Tang Li, Yanbin Yin

Summary: The pan-genome analysis of metagenome-assembled genomes (MAGs) can be affected by issues such as fragmentation, incompleteness, and contamination. In this study, the researchers conducted a critical assessment of pan-genomics by comparing the results of complete bacterial genomes and simulated MAGs. The findings show that incompleteness leads to significant loss of core genes, while contamination mainly affects accessory genomes. Lowering the core gene threshold and using gene prediction algorithms that consider fragmented genes can alleviate the loss, but to a limited extent. The study concludes that new pan-genome analysis tools specifically for MAGs are needed.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Engineering, Environmental

Root cell wall remodeling: A way for exopolysaccharides to mitigate cadmium toxicity in rice seedling

Hong-yu Wei, Yi Li, Jiao Yan, Shuai-ying Peng, Sai-jin Wei, Yanbin Yin, Kun-tai Li, Xin Cheng

Summary: This study found that EPS from Lactobacillus plantarum LPC-1 has a regulating effect on cell wall remodeling in rice roots and enhances plants' resistance to heavy metals.

JOURNAL OF HAZARDOUS MATERIALS (2023)

Article Biochemistry & Molecular Biology

dbCAN-seq update: CAZyme gene clusters and substrates in microbiomes

Jinfang Zheng, Boyang Hu, Xinpeng Zhang, Qiwei Ge, Yuchen Yan, Jerry Akresi, Ved Piyush, Le Huang, Yanbin Yin

Summary: The updated dbCAN-seq database provides predictions of glycan substrates for CGCs from microbiomes. New features include graphical display of CGC gene compositions, alignment of query CGC and subject PUL, and a statistics page.

NUCLEIC ACIDS RESEARCH (2023)

Article Microbiology

AcaFinder: Genome Mining for Anti-CRISPR-Associated Genes

Bowen Yang, Jinfang Zheng, Yanbin Yin

Summary: This paper introduces AcaFinder, the first tool for Aca genome mining. AcaFinder can predict Acas and their associated acr-aca operons, identify homologs of known Acas, and analyze potential prophages, CRISPR-Cas systems, and self-targeting spacers (STSs) in input genomes. The tool was applied to mining prokaryotic and gut phage genomes, resulting in the identification of 36 high-confidence new Aca families. The study also reveals a complex association network between Acrs and Acas.

MSYSTEMS (2022)

Article Computer Science, Artificial Intelligence

Semantic-guided graph neural network for heterogeneous graph embedding

Mingjing Han, Han Zhang, Wei Li, Yanbin Yin

Summary: In this paper, a novel method called Semantic-guided Graph Neural Network (SGNN) is proposed to address the semantic confusion problem in heterogeneous graph embedding. The proposed SGNN utilizes two-level fusion mechanisms to enhance the local representation and extract jumping knowledge from multiple semantics. Experimental results demonstrate the effectiveness of SGNN in real-world tasks.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Biochemistry & Molecular Biology

dbCAN3: automated carbohydrate-active enzyme and substrate annotation

Jinfang Zheng, Qiwei Ge, Yuchen Yan, Xinpeng Zhang, Le Huang, Yanbin Yin

Summary: dbCAN is an online web server for automated annotation of carbohydrate active enzymes (CAZymes). dbCAN3 is an updated version of this server with three new methods for predicting glycan substrates, as well as improved data browsing and visualization features.

NUCLEIC ACIDS RESEARCH (2023)

Article Plant Sciences

Two interacting basic helix-loop-helix transcription factors control flowering time in rice

Yanbin Yin, Zhiqiang Yan, Jianing Guan, Yiqiong Huo, Tianqiong Wang, Tong Li, Zhibo Cui, Wenhong Ma, Xiaoxue Wang, Wenfu Chen

Summary: Hd1 binding protein 1 (HBP1) and Partner of HBP1 (POH1) were identified as transcriptional regulators of Heading date 1 (Hd1), which is a key factor in the photoperiodic control of flowering time in rice. HBP1 and POH1 physically interacted to form homo- or heterodimers and directly activated the expression of Hd1 by binding to its promoter region. Knockout mutations of HBP1 accelerated flowering time, while overexpression of HBP1 and POH1 delayed flowering time by regulating the expression of Hd1 and other related genes.

PLANT PHYSIOLOGY (2023)

Article Biochemistry & Molecular Biology

dbAPIS: a database of anti-prokaryotic immune system genes

Yuchen Yan, Jinfang Zheng, Xinpeng Zhang, Yanbin Yin

Summary: In this study, we developed dbAPIS as the first literature-curated data repository for experimentally verified APIS genes and their associated protein families. The key features of dbAPIS include experimentally verified APIS genes with their protein sequences, functional annotation, structures, genomic context, homologs, classification of APIS proteins, and a user-friendly web interface for data browsing, searching, and batch downloading. The current release of dbAPIS contains 41 verified APIS proteins and a large number of sequence homologs of different families and clans. dbAPIS will facilitate the discovery of novel anti-defense genes and genomic islands in phages by providing a user-friendly data repository and a web resource for easy homology searches against known APIS proteins.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemical Research Methods

Genome mining for anti-CRISPR operons using machine learning

Bowen Yang, Minal Khatri, Jinfang Zheng, Jitender Deogun, Yanbin Yin

Summary: Recent studies have shown that known anti-CRISPR (Acr) genes often exist in the same operons as other Acr genes and phage structural genes. However, current Acr prediction tools do not take this important genomic context into consideration. Researchers have developed a new software tool called AOminer, which exploits the genomic context of known Acr genes and their homologs to facilitate the discovery of novel anti-CRISPR operons.

BIOINFORMATICS (2023)

Article Biology

Improved Methods for Acetocarmine and Haematoxylin Staining to Visualize Chromosomes in the Filamentous Green Alga Zygnema (Charophyta)

Nina Rittmeier, Andreas Holzinger

Summary: This study investigated the chromosome visualization methods in the filamentous green alga Zygnema. Existing protocols were modified to allow reliable chromosome counting in this genus. The challenges of interference from cell wall components and random cell divisions were addressed.

BIO-PROTOCOL (2023)
