Article

CharaParser for Fine-Grained Semantic Annotation of Organism Morphological Descriptions

Publisher

WILEY
DOI: 10.1002/asi.22618


Funding

  1. National Science Foundation [EF0849982]
  2. Emerging Frontiers
  3. Direct For Biological Sciences [0849982] Funding Source: National Science Foundation


Biodiversity information organization is looking beyond the traditional document-level metadata approach and has started to look into factual content in textual documents to support more intelligent and semantic-based access. This article reports the development and evaluation of CharaParser, a software application for semantic annotation of morphological descriptions. CharaParser annotates semistructured morphological descriptions in such a detailed manner that all stated morphological characters of an organ are marked up in Extensible Markup Language (XML) format. Using an unsupervised machine learning algorithm and a general purpose syntactic parser as its key annotation tools, CharaParser requires minimal additional knowledge engineering work and seems to perform well across different description collections and/or taxon groups. The system has been formally evaluated on over 1,000 sentences randomly selected from Volume 19 of Flora of North America and Part H of Treatise on Invertebrate Paleontology. CharaParser reaches and exceeds 90% in sentence-wise recall and precision, exceeding other similar systems reported in the literature. It also significantly outperforms a heuristic rule-based system we developed earlier. Early evidence that enriching the lexicon of a syntactic parser with domain terms alone may be sufficient to adapt the parser for the biodiversity domain is also observed and may have significant implications.
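To make the idea of fine-grained markup concrete, the following is a minimal sketch of how the characters stated for one organ might be recorded in XML. It is illustrative only: the element and attribute names (`statement`, `structure`, `character`), the helper `annotate`, and the example sentence are all assumptions made here, not CharaParser's actual schema or output, and the annotation is hand-written rather than produced by a parser.

```python
import xml.etree.ElementTree as ET


def annotate(sentence_text, organ, characters):
    """Build a toy XML annotation: one organ element holding its stated characters.

    All tag and attribute names are hypothetical, chosen for illustration;
    they are not taken from CharaParser.
    """
    statement = ET.Element("statement")
    ET.SubElement(statement, "text").text = sentence_text
    structure = ET.SubElement(statement, "structure", name=organ)
    for char_name, value in characters:
        ET.SubElement(structure, "character", name=char_name, value=value)
    return statement


# Hand-written example input; a real system would derive these pairs
# from the learned lexicon and the syntactic parse.
stmt = annotate(
    "Leaves ovate, 3-5 cm, glabrous.",
    organ="leaf",
    characters=[
        ("shape", "ovate"),
        ("length", "3-5 cm"),
        ("pubescence", "glabrous"),
    ],
)
xml_string = ET.tostring(stmt, encoding="unicode")
print(xml_string)
```

Each stated character becomes its own element under the organ it describes, which is the sense in which such an annotation is finer-grained than document-level metadata.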

Recommended

Article Biochemical Research Methods

OTO: Ontology Term Organizer

Fengqiong Huang, James A. Macklin, Hong Cui, Heather A. Cole, Lorena Endara

BMC BIOINFORMATICS (2015)

Article Biochemistry & Molecular Biology

Finding Our Way through Phenotypes

Andrew R. Deans, Suzanna E. Lewis, Eva Huala, Salvatore S. Anzaldo, Michael Ashburner, James P. Balhoff, David C. Blackburn, Judith A. Blake, J. Gordon Burleigh, Bruno Chanet, Laurel D. Cooper, Melanie Courtot, Sandor Csoesz, Hong Cui, Wasila Dahdul, Sandip Das, T. Alexander Dececchi, Agnes Dettai, Rui Diogo, Robert E. Druzinsky, Michel Dumontier, Nico M. Franz, Frank Friedrich, George V. Gkouto, Melissa Haendel, Luke J. Harmon, Terry F. Hayamizu, Yongqun He, Heather M. Hines, Nizar Ibrahim, Laura M. Jackson, Pankaj Jaiswal, Christina James-Zorn, Sebastian Koehler, Guillaume Lecointre, Hilmar Lapp, Carolyn J. Lawrence, Nicolas Le Novere, John G. Lundberg, James Macklin, Austin R. Mast, Peter E. Midford, Istvan Miko, Christopher J. Mungall, Anika Oellrich, David Osumi-Sutherland, Helen Parkinson, Martin J. Ramirez, Stefan Richter, Peter N. Robinson, Alan Ruttenberg, Katja S. Schulz, Erik Segerdell, Katja C. Seltmann, Michael J. Sharkey, Aaron D. Smith, Barry Smith, Chelsea D. Specht, R. Burke Squires, Robert W. Thacker, Anne Thessen, Jose Fernandez-Triana, Mauno Vihinen, Peter D. Vize, Lars Vogt, Christine E. Wall, Ramona L. Walls, Monte Westerfeld, Robert A. Wharton, Christian S. Wirkner, James B. Woolley, Matthew J. Yoder, Aaron M. Zorn, Paula M. Mabee

PLOS BIOLOGY (2015)

Article Biochemical Research Methods

Microbial phenomics information extractor (MicroPIE): a natural language processing tool for the automated acquisition of prokaryotic phenotypic characters from text sources

Jin Mao, Lisa R. Moore, Carrine E. Blank, Elvis Hsin-Hui Wu, Marcia Ackerman, Sonali Ranade, Hong Cui

BMC BIOINFORMATICS (2016)

Article Biochemical Research Methods

Introducing Explorer of Taxon Concepts with a case study on spider measurement matrix building

Hong Cui, Dongfang Xu, Steven S. Chong, Martin Ramirez, Thomas Rodenhausen, James A. Macklin, Bertram Ludascher, Robert A. Morris, Eduardo M. Soto, Nicolas Mongiardino Koch

BMC BIOINFORMATICS (2016)

Article Mathematical & Computational Biology

MicrO: an ontology of phenotypic and metabolic characters, assays, and culture media found in prokaryotic taxonomic descriptions

Carrine E. Blank, Hong Cui, Lisa R. Moore, Ramona L. Walls

JOURNAL OF BIOMEDICAL SEMANTICS (2016)

Article Plant Sciences

Building the Plant Glossary-A controlled botanical vocabulary using terms extracted from the Floras of North America and China

Lorena Endara, Heather A. Cole, J. Gordon Burleigh, Nathalie S. Nagalingum, James A. Macklin, Jing Liu, Sonali Ranade, Hong Cui

Article Plant Sciences

Extraction of phenotypic traits from taxonomic descriptions for the tree of life using natural language processing

Lorena Endara, Hong Cui, J. Gordon Burleigh

APPLICATIONS IN PLANT SCIENCES (2018)

Article Computer Science, Information Systems

Identifying Bacterial Biotope Entities Using Sequence Labeling: Performance and Feature Analysis

Jin Mao, Hong Cui

JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY (2018)

Article Mathematical & Computational Biology

Which methods are the most effective in enabling novice users to participate in ontology creation? A usability study

Limin Zhang, Xingyi Yang, Zuleima Cota, Hong Cui, Bruce Ford, Hsin-liang Chen, James A. Macklin, Anton Reznicek, Julian Starr

Summary: The study evaluates ontology construction methods through two usability studies and finds that Quick Form, Wizard, and WebProtege are preferred by participants, guiding the design of future iterations.

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION (2021)

Article Mathematical & Computational Biology

Authors' attitude toward adopting a new workflow to improve the computability of phenotype publications

Hong Cui, Bruce Ford, Julian Starr, Anton Reznicek, Limin Zhang, James A. Macklin

Summary: The survey results indicate a high level of interest among biologists in adopting controlled vocabularies in phenotype publications, particularly in addressing issues of ambiguity and inconsistency in phenotype descriptions. They believe that a new authoring workflow would better reflect the original meaning of data. While controlled vocabularies are widespread, their actual use is not common.

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION (2022)

Article Mathematical & Computational Biology

Measurement Recorder: developing a useful tool for making species descriptions that produces computable phenotypes

Hong Cui, Limin Zhang, Bruce Ford, Hsin-liang Chen, James A. Macklin, Anton Reznicek, Julian Starr

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION (2020)

Article Biodiversity Conservation

Modifier Ontologies for frequency, certainty, degree, and coverage phenotype modifier

Lorena Endara, Anne E. Thessen, Heather A. Cole, Ramona Walls, Georgios Gkoutos, Yujie Cao, Steven S. Chong, Hong Cui

BIODIVERSITY DATA JOURNAL (2018)

Article Biodiversity Conservation

Resolving orphaned non-specific structures using machine learning and natural language processing methods

Dongfang Xu, Steven S. Chong, Thomas Rodenhausen, Hong Cui

BIODIVERSITY DATA JOURNAL (2018)

Proceedings Paper Computer Science, Theory & Methods

Where Are iSchools Heading?

Vikas Yadav, Farig Sadeque, Bryan Heidorn, Hong Cui

TRANSFORMING DIGITAL WORLDS, ICONFERENCE 2018 (2018)

Article Biodiversity Conservation

Incentivising use of structured language in biological descriptions: Author-driven phenotype data and ontology production

Hong Cui, James A. Macklin, Joel Sachs, Anton Reznicek, Julian Starr, Bruce Ford, Lyubomir Penev, Hsin-Liang Chen

BIODIVERSITY DATA JOURNAL (2018)
