Discovery of novel biomarkers and phenotypes by semantic technologies

BMC BIOINFORMATICS (2013)

Journal

BMC BIOINFORMATICS

Volume 14

Publisher

BIOMED CENTRAL LTD

DOI: 10.1186/1471-2105-14-51

Keywords

In silico drug research; Semantic technologies; Text mining; Biomedical ontologies; Discovery of novel relationships

Abstract

Background: Biomarkers and target-specific phenotypes are important to targeted drug design and individualized medicine, and thus constitute an important aspect of modern pharmaceutical research and development. Increasingly, the discovery of relevant biomarkers is aided by in silico techniques that apply data mining and computational chemistry to large molecular databases. However, an even larger source of valuable information can potentially be tapped for such discoveries: repositories of research documents.

Results: This paper reports on a pilot experiment to discover potential novel biomarkers and phenotypes for diabetes and obesity by self-organized text mining of about 120,000 PubMed abstracts, public clinical trial summaries, and internal Merck research documents. These documents were analyzed directly by the InfoCodex semantic engine, without prior human manipulation such as parsing. Recall and precision against established but differing benchmarks range up to 30% and 50%, respectively. The engine also retrieved known entities missed by other, traditional approaches. Finally, the InfoCodex semantic engine was shown to discover new diabetes and obesity biomarkers and phenotypes. Among these were many interesting candidates with high potential, although noticeable noise (uninteresting or obvious terms) was also generated.

Conclusions: The reported approach of employing autonomous self-organising semantic engines to aid biomarker discovery, supplemented by appropriate manual curation processes, shows promise. Conservatively, it offers a faster alternative to vocabulary-based processes that depend on humans reading and analyzing all the texts. More optimistically, it could impact pharmaceutical research, for example by shortening the time-to-market of novel drugs or by speeding up early recognition of dead ends and adverse reactions.
