Article

Fine-tuning BERT for automatic ADME semantic labeling in FDA drug labeling to enhance product-specific guidance assessment

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 138, Issue -, Pages -

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2023.104285

Keywords

Semantic Labeling; ADME; Drug Labeling; Transfer Learning; BERT; Natural Language Processing


Summary: Product-specific guidances (PSGs) recommended by the FDA play a crucial role in generic drug development. This study used a pre-trained language model, BERT, to automatically label ADME paragraphs in FDA-approved drug labeling, facilitating PSG assessment. Fine-tuning BERT achieved better performance than traditional machine learning techniques, with up to a 12.5% absolute F1 improvement. This research successfully applied BERT to ADME semantic labeling and analyzed the contributions of pre-training and fine-tuning.
Product-specific guidances (PSGs) recommended by the United States Food and Drug Administration (FDA) are instrumental to promote and guide generic drug product development. To assess a PSG, the FDA assessor needs to take extensive time and effort to manually retrieve supportive drug information of absorption, distribution, metabolism, and excretion (ADME) from the reference listed drug labeling. In this work, we leveraged the state-of-the-art pre-trained language models to automatically label the ADME paragraphs in the pharmacokinetics section from the FDA-approved drug labeling to facilitate PSG assessment. We applied a transfer learning approach by fine-tuning the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to develop a novel application of ADME semantic labeling, which can automatically retrieve ADME paragraphs from drug labeling instead of manual work. We demonstrate that fine-tuning the pre-trained BERT model can outperform conventional machine learning techniques, achieving up to 12.5% absolute F1 improvement. To our knowledge, we were the first to successfully apply BERT to solve the ADME semantic labeling task. We further assessed the relative contribution of pre-training and fine-tuning to the overall performance of the BERT model in the ADME semantic labeling task using a series of analysis methods, such as attention similarity and layer-based ablations. Our analysis revealed that the information learned via fine-tuning is focused on task-specific knowledge in the top layers of the BERT, whereas the benefit from the pre-trained BERT model is from the bottom layers.
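The layer-based picture in the abstract (task-specific knowledge in the top layers, pre-trained knowledge in the bottom layers) can be illustrated with a minimal PyTorch sketch. This is not the authors' code: the tiny encoder, the `freeze_bottom_layers` helper, and all hyperparameters are assumptions for illustration. In practice one would load a pre-trained BERT (e.g. via the Hugging Face `transformers` library) and fine-tune it on labeled ADME paragraphs; the sketch only shows the mechanics of freezing bottom layers so that fine-tuning updates the top layers and the classification head.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in for a BERT-style encoder: embeddings ("bottom"), a stack of
    transformer blocks, and a paragraph-classification head ("top")."""

    def __init__(self, vocab=1000, dim=32, num_layers=4, num_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
             for _ in range(num_layers)]
        )
        self.classifier = nn.Linear(dim, num_labels)  # task-specific head

    def forward(self, ids):
        x = self.embed(ids)
        for layer in self.layers:
            x = layer(x)
        # Mean-pool token states into one paragraph vector, then classify.
        return self.classifier(x.mean(dim=1))

def freeze_bottom_layers(model, k):
    """Freeze the embeddings and the bottom k encoder layers, so that
    fine-tuning only updates the top layers and the classifier."""
    for p in model.embed.parameters():
        p.requires_grad = False
    for layer in model.layers[:k]:
        for p in layer.parameters():
            p.requires_grad = False

model = TinyEncoder()
freeze_bottom_layers(model, k=2)
frozen = sum(1 for p in model.parameters() if not p.requires_grad)
trainable = sum(1 for p in model.parameters() if p.requires_grad)
print(f"frozen tensors: {frozen}, trainable tensors: {trainable}")
```

An ablation in this spirit would vary `k` and compare held-out F1: if performance is robust to freezing the bottom layers but degrades when top layers are frozen, the task-specific signal sits in the top layers, consistent with the analysis described above.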



Recommended

Article Engineering, Electrical & Electronic

An analysis and improvement of error control performance of IS-LDPC codes with a large number of subsets

Taha ValizadehAslani, Abolfazl Falahati

PHYSICAL COMMUNICATION (2018)

Article Biochemical Research Methods

PharmBERT: a domain-specific BERT model for drug labels

Taha ValizadehAslani, Yiwen Shi, Ping Ren, Jing Wang, Yi Zhang, Meng Hu, Liang Zhao, Hualou Liang

Summary: Human prescription drug labeling provides essential scientific information for the safe and effective use of drugs. Automatic information extraction from drug labels using NLP techniques, especially BERT, has shown exceptional performance. The development of PharmBERT, a BERT model pretrained specifically on drug labels, has demonstrated superior performance in multiple NLP tasks in the drug label domain.

BRIEFINGS IN BIOINFORMATICS (2023)
