Article
Computer Science, Artificial Intelligence
Deng Jiang, Haopeng Ren, Yi Cai, Jingyun Xu, Yanxia Liu, Ho-fung Leung
Summary: Named entity recognition (NER) is crucial in NLP tasks. We propose a neural multi-task model for extracting nested entities, which outperforms feature-based models and runs more efficiently.
Article
Computer Science, Artificial Intelligence
Yingwen Fu, Nankai Lin, Zhihe Yang, Shengyi Jiang
Summary: Named entity recognition (NER) plays a crucial role in various natural language processing (NLP) applications. However, applying advanced NER research to low-resource languages like Malay has been challenging due to the lack of sufficient data. This paper presents a system for building a Malay NER dataset (MS-NER) with 20,146 sentences through labeled datasets in related languages and iterative optimization. Additionally, a Multi-Task framework (MTBR) is proposed to effectively integrate boundary information for improved NER performance.
CONNECTION SCIENCE
(2023)
Article
Biochemical Research Methods
Keqin Peng, Chuantao Yin, Wenge Rong, Chenghua Lin, Deyu Zhou, Zhang Xiong
Summary: Biomedical factoid question answering is a crucial task in biomedical question answering applications. This study proposes a framework that fine-tunes BioBERT with a named entity dataset to improve question answering performance. BiLSTM is applied to encode the question text for sentence-level information, and bagging is used to combine question-level and token-level information for enhanced overall performance.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2022)
Article
Computer Science, Information Systems
Vincenzo Moscato, Marco Postiglione, Carlo Sansone, Giancarlo Sperli
Summary: In Biomedical Named Entity Recognition (BioNER), the lack of publicly available annotated datasets hampers the use of cutting-edge deep learning-based methods like BERT and GPT-3. Annotating multiple entity types poses challenges as most existing datasets only provide annotations for a single entity type. To address this, the proposed TaughtNet framework leverages knowledge distillation to fine-tune a single multi-task student model using both ground truth and single-task teachers. Experimental results demonstrate the effectiveness of TaughtNet in recognizing mentions of diseases, chemical compounds, and genes, outperforming state-of-the-art baselines in terms of precision, recall, and F1 scores. TaughtNet also enables the training of smaller and lighter student models, making them suitable for real-world scenarios with limited-memory hardware devices and fast inference, while also showing potential for explainability. The code and multi-task model have been made publicly available on GitHub and the Hugging Face repository.
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS
(2023)
Article
Computer Science, Information Systems
Mohsen Asghari, Daniel Sierra-Sosa, Adel S. Elmaghraby
Summary: The healthcare industry aims to improve patient experience and service quality, but there is a lack of standard datasets and computational resources for biomedical natural language understanding. This paper introduces a model trained on low-tier GPU computers to address this challenge.
INFORMATION SCIENCES
(2022)
Article
Biochemical Research Methods
Zhiyu Zhang, Arbee L. P. Chen
Summary: This study introduces a novel fully-shared multi-task learning model based on a pre-trained language model in the biomedical domain, which achieved significant performance improvements on seven benchmark BioNER datasets compared to single-task models.
BMC BIOINFORMATICS
(2022)
Article
Computer Science, Artificial Intelligence
Yuren Mao, Yu Hao, Weiwei Liu, Xuemin Lin, Xin Cao
Summary: This article proposes a novel PU learning method for distantly supervised NER, which can automatically handle class imbalance and does not rely on class prior estimation, resulting in state-of-the-art performance.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Chemistry, Multidisciplinary
Qin Fang, Yane Li, Hailin Feng, Yaoping Ruan
Summary: Compared to English, Chinese named entity recognition has lower performance due to the greater ambiguity in entity boundaries in Chinese text. To leverage entity boundary information, the task has been decomposed into two subtasks: boundary annotation and type annotation. A multi-task learning network (MTL-BERT) that incorporates a bidirectional encoder (BERT) model has been proposed, effectively improving the performance and efficiency of Chinese named entity recognition tasks.
APPLIED SCIENCES-BASEL
(2023)
Review
Biochemical Research Methods
Bosheng Song, Fen Li, Yuansheng Liu, Xiangxiang Zeng
Summary: Deep learning methods have achieved state-of-the-art performance in biomedical named entity recognition, and can be applied to BioNER in various domains based on dataset size and type. These methods are classified into four categories: single neural network-based, multitask learning-based, transfer learning-based, and hybrid models.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Biochemical Research Methods
Ling Luo, Chih-Hsuan Wei, Po-Ting Lai, Robert Leaman, Qingyu Chen, Zhiyong Lu
Summary: Biomedical named entity recognition (BioNER) aims to automatically identify biomedical entities in natural language text, providing a necessary foundation for downstream text mining tasks and applications. Because manual annotation of training data is expensive and requires domain-specific expertise, current BioNER approaches suffer from data scarcity and limitations in generalizability and entity coverage. In this paper, we propose an all-in-one (AIO) scheme that utilizes external annotated resources to enhance the accuracy and stability of BioNER models. We introduce AIONER, a general-purpose BioNER tool based on cutting-edge deep learning and our AIO scheme, and demonstrate its effectiveness, robustness, and advantages over existing methods on 14 BioNER benchmark tasks and three independent tasks.
Article
Biochemical Research Methods
Tingting Liang, Congying Xia, Ziqiang Zhao, Yixuan Jiang, Yuyu Yin, Philip S. Yu
Summary: Biomedical Named Entity Recognition (BioNER) aims to identify biomedical entities such as genes, proteins, diseases, and chemical compounds in textual data. However, due to ethical and privacy issues, as well as the specialized nature of biomedical data, BioNER lacks quality labeled data, especially at the token level. This study proposes a gazetteer-based approach to BioNER, where the task is to build a BioNER system from scratch without any token-level annotations. By formulating BioNER as a Textual Entailment problem and using Textual Entailment with Dynamic Contrastive learning (TEDC), this work addresses the noisy labeling issue and transfers knowledge from pre-trained textual entailment models. The dynamic contrastive learning framework improves the model's discrimination ability by contrasting entities and non-entities in the same sentence. Experimental results on real-world biomedical datasets demonstrate that TEDC achieves state-of-the-art performance for gazetteer-based BioNER.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2023)
Article
Biochemical Research Methods
Zhaoying Chai, Han Jin, Shenghui Shi, Siyan Zhan, Lin Zhuo, Yu Yang
Summary: This study proposes a hierarchical shared transfer learning method, which combines multi-task learning and fine-tuning to fuse the underlying entity features and upper data features, thereby improving the performance of biomedical named entity recognition.
BMC BIOINFORMATICS
(2022)
Article
Computer Science, Artificial Intelligence
Qi Peng, Changmeng Zheng, Yi Cai, Tao Wang, Haoran Xie, Qing Li
Summary: This study proposes an unsupervised cross-domain model that leverages labeled data from the source domain to predict entities in the target domain, applying adversarial training and an entity-aware attention module to reduce feature discrepancy between domains.
Article
Computer Science, Artificial Intelligence
Xiaoyong Tang, Yong Huang, Meng Xia, Chengfeng Long
Summary: Named entity recognition is a technology that aims to identify and mark entities with specific meanings in text. This paper proposes a multi-task intelligent processing model that utilizes machine learning and deep learning techniques, as well as context semantic information, to improve Chinese named entity recognition. The model achieves significant improvements in F1 score compared to previous single task models, demonstrating the effectiveness of multi-task learning.
NEURAL PROCESSING LETTERS
(2023)
Article
Computer Science, Artificial Intelligence
Zhengjie Chen, Yu Zhang, Siya Mi
Summary: This paper introduces a method for improving the performance of Multimodal Named Entity Recognition (MNER) through cross-modal auxiliary tasks. The method utilizes cross-modal matching and cross-modal mutual information maximization to address the issue of mismatched image-text pairs, and separates the features of the main task and auxiliary tasks through a cross-modal gate-control mechanism.
PATTERN RECOGNITION LETTERS
(2023)
Article
Biochemical Research Methods
Danyang Ji, Mario Juhas, Chi Man Tsang, Chun Kit Kwok, Yongshu Li, Yang Zhang
Summary: Researchers have identified potential G-quadruplex-forming sequences in the SARS-CoV-2 RNA genome and other Coronaviridae family members, with some confirmed to form RNA G-quadruplex structures in vitro. These structures were found to interact with viral helicase (nsp13), suggesting a potential target for inhibiting the virus.
BRIEFINGS IN BIOINFORMATICS
(2021)
Editorial Material
Genetics & Heredity
Yang Zhang, Hui Xi, Mario Juhas
Summary: A mutant strain of SARS-CoV-2 with an amino acid change at position 614 has emerged and become dominant in the pandemic, highlighting the importance of efficient detection using biosensing technologies for pandemic control.
TRENDS IN GENETICS
(2021)
Article
Chemistry, Analytical
Yang Zhang, Yang Wu, Hongjin Zheng, Hui Xi, Taoyu Ye, Chun-Yin Chan, Chun Kit Kwok
Summary: Nucleic acid medicine is emerging as a promising next-generation therapy, and the development of cell-penetrating aptamers can enhance the cellular delivery efficiency of therapeutic nucleic acids. Characteristic CD spectral analysis revealed the G-quadruplex structures of enriched aptamers.
ANALYTICAL CHEMISTRY
(2021)
Article
Biochemical Research Methods
Mei Zuo, Yang Zhang
Summary: Information about bacteria biotopes (BB) is crucial for microbiological research and applications. The BB task at BioNLP-OST 2019 focuses on extracting microorganism locations and phenotypes from biomedical texts. Our span-based model, utilizing a pre-trained BERT model, achieves significantly better performance in entity and relation extraction tasks for BBs compared to previous methods, showing a reduction of 21.6% in slot error rate (SER). The model also shows effectiveness in recognizing nested entities and can be applied to other related tasks with state-of-the-art performance.
Editorial Material
Biochemistry & Molecular Biology
Yang Zhang, Hao Jiang, Taoyu Ye, Mario Juhas
Summary: Despite the significant interest in deep learning in microbiology, its full potential is yet to be realized. Deep-learning-based systems are believed to play a crucial role in monitoring and investigating microorganisms in the future.
TRENDS IN MICROBIOLOGY
(2021)
Article
Biology
Sen Li, Zeyu Du, Xiangjie Meng, Yang Zhang
Summary: This article introduces a novel deep learning approach using a deep transfer graph convolutional network (DTGCN) for the recognition of malaria parasites of various stages in blood smear images. On publicly available microscopic images of multi-stage malaria parasites, the method has shown higher accuracy and effectiveness than a wide range of state-of-the-art approaches.
Article
Biophysics
Brij Mohan, Sandeep Kumar, Hui Xi, Shixuan Ma, Zhiyu Tao, Tiantian Xing, Hengzhi You, Yang Zhang, Peng Ren
Summary: The study focuses on the application of sensors made of porous metal-organic frameworks (MOFs) in cancer biomarker detection, analyzing factors such as fabrication strategies and structural properties that influence sensing performance, and proposes an innovative technique for detecting cancer biomarkers using luminescence and electrochemical sensors.
BIOSENSORS & BIOELECTRONICS
(2022)
Letter
Biochemistry & Molecular Biology
Gang Mao, Yulin Wu, Yang Zhang, Xuan Wang, Yan Zhu, Bo Liu, Yadong Wang, Junyi Li
JOURNAL OF GENETICS AND GENOMICS
(2022)
Review
Biotechnology & Applied Microbiology
Yang Zhang, Mario Juhas, Chun Kit Kwok
Summary: SARS-CoV-2, the cause of COVID-19, is a major contributor to global mortality. Existing antigen/antibody-based immunoassays and neutralizing antibodies are often ineffective against emerging SARS-CoV-2 variants, highlighting the urgent need for new approaches. Aptamers have been successfully used for detecting and inhibiting various viruses, and hold promise in the fight against COVID-19. This review discusses recent advances and future trends in the development of aptamer-based approaches for the diagnosis and treatment of SARS-CoV-2.
TRENDS IN BIOTECHNOLOGY
(2023)
Article
Chemistry, Medicinal
Yang Zhang, Yongen Li
Summary: SARS-CoV-2, the virus causing COVID-19, remains a leading cause of death globally. Despite the development of effective methods for diagnosing and treating COVID-19, there is still an urgent need for new approaches to tackle SARS-CoV-2 variants and long COVID. Aptamers have shown great potential as diagnostic and therapeutic agents for COVID-19, but their translation into clinical use has been slow, posing challenges that need to be overcome.
JOURNAL OF MEDICINAL CHEMISTRY
(2023)
Article
Biochemical Research Methods
Ruijun Feng, Sen Li, Yang Zhang
Summary: Cellular image analysis is a crucial method employed by microbiologists for the identification and study of microbes. The article presents a knowledge-integrated deep learning framework for cellular image analysis, focusing on classification, detection, and reconstruction tasks. It provides comprehensive information on various models, datasets, computing environment setup, knowledge representation, data pre-processing, and training and tuning, as well as evaluation and visualization techniques.
Article
Chemistry, Analytical
Hui Xi, Hanlin Jiang, Mario Juhas, Yang Zhang
Summary: Noncanonical G-quadruplex nucleic acid structures have been used as probes in biosensors for accurate and efficient detection of metal ions, proteins, and nucleic acids. In this study, a reliable and efficient fluorescent biosensor platform for G-quadruplex based detection of the human AGT protein was constructed using the magnetic bead enrichment method. This biosensor provides high accuracy, speed, and low cost, and successfully detected AGT at the cellular level.
Article
Biochemistry & Molecular Biology
Chi Zhang, Hao Jiang, Weihuang Liu, Junyi Li, Shiming Tang, Mario Juhas, Yang Zhang
Summary: In this study, a model based on Cycle Generative Adversarial Network (CycleGAN) and a multi-component weighted loss function was developed to address the issue of out-of-focus microscopic images. The proposed model achieved state-of-the-art performance in deblurring and demonstrated excellent generalization capabilities.
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
(2022)
Article
Biochemistry & Molecular Biology
Taoyu Ye, Sen Li, Yang Zhang
Summary: This study introduces a novel image-based deep learning strategy for cancer classification, achieving higher accuracy compared to existing methods. The approach is not only applicable to various types of cancer, but also helps identify top-ranked tumor-specific genes and pathways through heatmaps.
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
(2021)
Article
Biochemistry & Molecular Biology
Hao Jiang, Shiming Tang, Weihuang Liu, Yang Zhang
Summary: To address the urgent need for COVID-19 diagnosis, AI-based methods for analyzing chest CT images have been proposed. By synthesizing a dataset and testing various deep learning models, accurate and efficient diagnostic testing for COVID-19 can be achieved.
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
(2021)