Article

The gene normalization task in BioCreative III

Journal

BMC BIOINFORMATICS
Volume 12, Issue -, Pages -

Publisher

BIOMED CENTRAL LTD
DOI: 10.1186/1471-2105-12-S8-S2

Funding

  1. MEXT
  2. (JST), Japan
  3. University of Manchester
  4. Alexander-von-Humboldt Stiftung
  5. National Library of Medicine [5R01LM009836, 5R01LM010125]
  6. European Union [217139]
  7. Swiss National Science Foundation [100014-118396/1, 105315-130558/1]
  8. NITAS/TMS
  9. Text Mining Services
  10. Novartis Pharma AG
  11. Basel, Switzerland
  12. NIH [1-R01-LM009959-01A1, 3T15 LM009451-03S1, 5R01 LM010120-02, 5R01 LM008111-05, 5R01 GM083649-03]
  13. NSF [0845523]
  14. Portuguese Foundation for Science and Technology [PTDC/EIA-CCO/100541/2008]
  15. Swiss National Science Foundation (SNF) [105315_130558]
  16. National Science Foundation, Directorate for Biological Sciences, Division of Biological Infrastructure [0850319]

Background: We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k).

Results: We received a total of 37 runs from 14 different teams for the task. When evaluated using the gold-standard annotations of the 50 articles, the highest TAP-k scores were 0.3297 (k=5), 0.3538 (k=10), and 0.3535 (k=20), respectively. Higher TAP-k scores of 0.4916 (k=5, 10, 20) were observed when evaluated using the inferred ground truth over the full test set. When combining team results using machine learning, the best composite system achieved TAP-k scores of 0.3707 (k=5), 0.4311 (k=10), and 0.4477 (k=20) on the gold standard, representing improvements of 12.4%, 21.8%, and 26.6% over the best team results, respectively.

Conclusions: By using full text and being species non-specific, the GN task in BioCreative III has moved closer to a real literature curation task than similar tasks in the past and presents additional challenges for the text mining community, as revealed in the overall team results. By evaluating teams using the gold standard, we show that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible. Using the inferred ground truth we show measures of comparative performance between teams. Finally, by comparing team rankings on gold standard vs. inferred ground truth, we further demonstrate that the inferred ground truth is as effective as the gold standard for detecting good team performance.
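
The abstract does not say which probabilistic model underlies the EM procedure. As an illustration only, the following minimal sketch assumes a Dawid-Skene-style setup in which each team has an unknown sensitivity and specificity, and each candidate gene identifier has a hidden true/false label. The function name `infer_truth`, the binary vote matrix, and the add-one smoothing are all hypothetical choices made for this example, not details taken from the paper.

```python
from math import prod


def infer_truth(votes, iters=50):
    """Illustrative EM over binary votes (assumed model, not the paper's).

    votes[i][j] is 1 if team j returned candidate identifier i, else 0.
    Returns (posterior that each candidate is correct, sensitivities, specificities).
    """
    n_items, n_teams = len(votes), len(votes[0])
    # Start from the fraction of teams that returned each candidate.
    post = [sum(row) / n_teams for row in votes]
    sens = spec = []
    for _ in range(iters):
        # M-step: re-estimate team reliabilities from the current posteriors,
        # with add-one smoothing so no probability reaches exactly 0 or 1.
        pos = sum(post)
        neg = n_items - pos
        sens = [
            (sum(p * row[j] for p, row in zip(post, votes)) + 1) / (pos + 2)
            for j in range(n_teams)
        ]
        spec = [
            (sum((1 - p) * (1 - row[j]) for p, row in zip(post, votes)) + 1) / (neg + 2)
            for j in range(n_teams)
        ]
        prior = (pos + 1) / (n_items + 2)
        # E-step: posterior probability that each candidate is truly correct.
        new_post = []
        for row in votes:
            like_pos = prior * prod(
                sens[j] if v else 1 - sens[j] for j, v in enumerate(row)
            )
            like_neg = (1 - prior) * prod(
                1 - spec[j] if v else spec[j] for j, v in enumerate(row)
            )
            new_post.append(like_pos / (like_pos + like_neg))
        post = new_post
    return post, sens, spec


# Toy data: teams 0-2 largely agree with each other, team 3 is noisy.
votes = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
post, sens, spec = infer_truth(votes)
```

Under this assumed model, candidates backed by the mutually consistent teams receive high posterior probability, and the noisy team ends up with a lower estimated sensitivity, which is the intuition behind inferring ground truth from submissions alone.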
