Article

Multi-dimensional classification of biomedical text: Toward automated, practical provision of high-utility text to diverse users

Journal

BIOINFORMATICS
Volume 24, Issue 18, Pages 2086-2093

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btn381

Funding

  1. NSERC Discovery [298292-04]
  2. CFI New Opportunities Award [10437]
  3. NSF [EIA-0121687]

Motivation: Much current research in biomedical text mining is concerned with serving biologists by extracting certain information from scientific text. We note that there is no average biologist client; different users have distinct needs. For instance, as noted in past evaluation efforts (BioCreative, TREC, KDD), database curators are often interested in sentences showing experimental evidence and methods. Conversely, lab scientists searching for known information about a protein may seek facts, typically stated with high confidence. Text-mining systems can target specific end-users and become more effective if they can first identify text regions rich in the type of scientific content that is of interest to the user, retrieve documents that have many such regions, and focus on fact extraction from these regions. Here, we study the ability to characterize and classify such text automatically. We have recently introduced a multi-dimensional categorization and annotation scheme, developed to be applicable to a wide variety of biomedical documents and scientific statements, while intended to support specific biomedical retrieval and extraction tasks.

Results: The annotation scheme was applied to a large corpus in a controlled effort by eight independent annotators, where three individual annotators independently tagged each sentence. We then trained and tested machine learning classifiers to automatically categorize sentence fragments based on the annotation. We discuss here the issues involved in this task, and present an overview of the results. The latter strongly suggest that automatic annotation along most of the dimensions is highly feasible, and that this new framework for scientific sentence categorization is applicable in practice.
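As a rough illustration of the workflow described in the abstract (three annotators tag each fragment along several dimensions, and one classifier is trained per dimension), here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the dimension names `evidence` and `certainty`, the label values, the toy sentences, the majority-vote rule for merging annotations, and the choice of a small Naive Bayes model. The abstract does not say which classifiers, features, or aggregation rule the authors used.

```python
import math
from collections import Counter, defaultdict


def majority_label(labels):
    """Return the label given by a strict majority of annotators, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


class NaiveBayes:
    """Tiny multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        words = text.lower().split()
        total = sum(self.class_counts.values())

        def score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            log_prob = math.log(self.class_counts[label] / total)
            for w in words:
                log_prob += math.log((counts[w] + 1) / denom)
            return log_prob

        return max(self.class_counts, key=score)


def train_per_dimension(corpus):
    """Train one classifier per dimension, skipping fragments with no majority."""
    by_dim = defaultdict(lambda: ([], []))
    for text, annotations in corpus:
        for dim, labels in annotations.items():
            gold = majority_label(labels)
            if gold is not None:
                by_dim[dim][0].append(text)
                by_dim[dim][1].append(gold)
    return {dim: NaiveBayes().fit(t, g) for dim, (t, g) in by_dim.items()}


# Hypothetical toy corpus: (fragment, {dimension: labels from three annotators}).
corpus = [
    ("we measured binding by western blot",
     {"evidence": ["yes", "yes", "no"], "certainty": ["high", "high", "high"]}),
    ("the protein may regulate growth",
     {"evidence": ["no", "no", "no"], "certainty": ["low", "low", "high"]}),
    ("blot analysis confirmed the interaction",
     {"evidence": ["yes", "yes", "yes"], "certainty": ["high", "low", "high"]}),
    ("these results suggest it may act",
     {"evidence": ["no", "yes", "no"], "certainty": ["low", "low", "low"]}),
]

models = train_per_dimension(corpus)
prediction = {dim: m.predict("western blot analysis") for dim, m in models.items()}
```

Training a separate model per dimension keeps each decision independent, so a fragment can be tagged along whichever dimensions a given user (for example, a curator looking for experimental evidence) cares about.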
