Article
Computer Science, Artificial Intelligence
Eri Teruya, Tadashi Takeuchi, Hidekazu Morita, Takayuki Hayashi, Kanta Ono
Summary: This paper introduces an autonomous research topic selection (ARTS) system that constructs research concept networks from the research information in articles and selects candidate research topics that are likely to reveal new scientific facts but have not yet been extensively studied.
MACHINE LEARNING-SCIENCE AND TECHNOLOGY
(2022)
Review
Biochemistry & Molecular Biology
Sofia I. R. Conceicao, Francisco M. Couto
Summary: In building biological networks, providing reliable interactions is crucial. Text mining methods can help extract knowledge from scientific literature to overcome the challenge of tracking recent discoveries. These tools can lead to more reliable and personalized networks by identifying relations between entities of interest.
Article
Medicine, General & Internal
Wee-Ming Tan, Kean-Hooi Teoh, Mogana Ganggayah, Nur Taib, Hana Zaini, Sarinder Dhillon
Summary: This study aims to develop an automated natural language processing (NLP) algorithm that converts existing narrative breast pathology reports into concise, structured synoptic pathology reports, in order to improve communication between pathologists and clinicians.
Review
Green & Sustainable Science & Technology
Joo Hyuk Lee, Myeonghun Lee, Kyoungmin Min
Summary: There is a growing demand for innovative materials in the development of new industries. To overcome the laborious and time-consuming process of locating such materials, researchers are shifting towards using existing material science research knowledge more efficiently. Natural language processing (NLP) has emerged as a crucial technology in this movement, proving to be valuable for processing language-based data in materials science literature.
INTERNATIONAL JOURNAL OF PRECISION ENGINEERING AND MANUFACTURING-GREEN TECHNOLOGY
(2023)
Article
Chemistry, Multidisciplinary
Margherita Berardi, Luigi Santamaria Amato, Francesca Cigna, Deodato Tapete, Mario Siciliani de Cumis
Summary: Volcanic monitoring reports contain valuable geochemical and geophysical data. This study presents a natural language processing system that can extract relevant gas parameters from such reports, as demonstrated by its successful application to monitoring bulletins from Stromboli volcano published between 2015 and 2021.
APPLIED SCIENCES-BASEL
(2022)
Article
Medical Informatics
Boxiang Liu, Liang Huang
Summary: Biomedical language translation requires multilingual fluency and relevant domain knowledge, posing challenges in training qualified translators and generating high-quality translations. Machine translation, while effective, requires large in-domain datasets. A new English-Chinese biomedical parallel corpus was developed from NEJM. Training on out-of-domain data and then fine-tuning on as few as 4000 NEJM sentence pairs significantly improved translation quality. Larger in-domain data subsets brought further improvements, for a total increase of 33.0 (24.3) BLEU in the en -> zh (zh -> en) direction on the full dataset.
BMC MEDICAL INFORMATICS AND DECISION MAKING
(2021)
Article
Computer Science, Artificial Intelligence
Nikola Milosevic, Wolfgang Thielemann
Summary: Biomedical research is expanding rapidly, leading to an overwhelming amount of published literature. Knowledge graphs offer a framework for representing semantic knowledge from this literature. This paper presents and compares rule-based and machine learning methods for scalable relationship extraction from biomedical literature, with a focus on their resilience to unbalanced and small datasets. Experiments show that transformer-based models, such as PubMedBERT and distilBERT, perform well in handling both small and unbalanced datasets.
JOURNAL OF WEB SEMANTICS
(2023)
Article
Computer Science, Interdisciplinary Applications
Ying Hu, Yanping Chen, Yongbin Qin, Ruizhang Huang
Summary: Biomedical Relation Extraction (BioRE) is an important task in automatically extracting semantic relations for given entity pairs. Current popular methods often use pretrained language models for feature extraction, but they suffer from overlapping semantics. This study proposes an Entity-oriented Representation (EoR) model that enhances the discriminability between entity pairs and achieves state-of-the-art performance in multiple BioRE tasks.
JOURNAL OF BIOMEDICAL INFORMATICS
(2023)
Article
Computer Science, Information Systems
Ying Hu, Yanping Chen, Ruizhang Huang, Yongbin Qin, Qinghua Zheng
Summary: Biomedical relation extraction aims to extract the interactive relations between biomedical entities in a sentence. This study proposes a hierarchical convolutional model to address the semantic overlapping and data imbalance problems. The model encodes both local contextual features and global semantic dependencies, enhancing the discriminability of the neural network for biomedical relation extraction.
INFORMATION PROCESSING & MANAGEMENT
(2024)
Article
Health Care Sciences & Services
Wee Ming Tan, Wei Lin Ng, Mogana Darshini Ganggayah, Victor Chee Wai Hoe, Kartini Rahmat, Hana Salwani Zaini, Nur Aishah Mohd Taib, Sarinder Kaur Dhillon
Summary: This study aims to convert unstructured breast radiology reports into structured formats using a natural language processing (NLP) algorithm. In an analysis of 327 de-identified breast radiology reports, the NLP algorithm achieved an accuracy of 97% on training data and 94.9% on testing data. The random forest-based predictive model achieved the highest accuracy, 92%, indicating the research value of mineable structured data.
HEALTH INFORMATICS JOURNAL
(2023)
Article
Computer Science, Artificial Intelligence
Szabolcs Szeker, Gyorgy Fogarassy, Agnes Vathy-Fogarassy
Summary: This study presents a method for extracting and structuring numerical measurement results and descriptions from cardiac ultrasound reports. The method has been tested and shown to extract important echocardiography parameters with good accuracy and completeness. It is applicable to processing any medical text.
ARTIFICIAL INTELLIGENCE IN MEDICINE
(2023)
Article
Biochemical Research Methods
Suyang Dai, Yuxia Ding, Zihan Zhang, Wenxuan Zuo, Xiaodi Huang, Shanfeng Zhu
Summary: This paper introduces GrantExtractor, a pipeline system that accurately extracts grant support (GS) information from full-text biomedical literature. By integrating machine learning techniques, including a sentence classifier and a bidirectional LSTM, GrantExtractor outperformed baseline methods on benchmark datasets. GrantExtractor also ranked first in the 2017 BioASQ challenge on Micro F-measure.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2021)
Article
Computer Science, Interdisciplinary Applications
Tommaso Mario Buonocore, Claudio Crema, Alberto Redolfi, Riccardo Bellazzi, Enea Parimbelli
Summary: In the era of digital healthcare, the underused textual information in hospitals could be effectively utilized with task-specific, fine-tuned biomedical language representation models. However, less-resourced languages face challenges in accessing in-domain adaptation resources. To address this issue, our study investigates two accessible approaches to derive biomedical language models in languages like Italian, and demonstrates that data quantity is a harder constraint than data quality for biomedical adaptation. The models developed from our investigations have the potential to unlock important research opportunities for Italian healthcare institutions and academia, and also provide insights towards building generalizable biomedical language models for less-resourced languages and different domains.
JOURNAL OF BIOMEDICAL INFORMATICS
(2023)
Article
Public, Environmental & Occupational Health
Sana S. BuHamra, Abdullah N. Almutairi, Abdullah K. Buhamrah, Sabah H. Almadani, Yusuf A. Alibrahim
Summary: This study utilizes Natural Language Processing (NLP) to construct an automated system for extracting causes of death and comorbidities in COVID-19 patients from electronic health records (EHRs). Findings show that septic shock or sepsis-related multiorgan failure is the leading cause of mortality, and acute respiratory distress syndrome (ARDS) is a common intermediate cause. Arrhythmia (AF) is determined to be the strongest predictor of intermediate cause of death.
FRONTIERS IN PUBLIC HEALTH
(2022)
Article
Computer Science, Interdisciplinary Applications
Abbas Akkasi, Mari-Francine Moens
Summary: Identifying causal relationships between events or entities in biomedical texts is crucial for creating scientific knowledge bases and is a fundamental task in NLP. Despite being an open problem in artificial intelligence, there is increasing research attention on this issue, with new techniques like deep neural networks showing promise in addressing it. Enhancements in state-of-the-art systems can be achieved through data augmentation techniques such as random oversampling to address class imbalance.
JOURNAL OF BIOMEDICAL INFORMATICS
(2021)