Article

Topic Modeling for Wikipedia Link Disambiguation

Journal

ACM Transactions on Information Systems
Volume 32, Issue 3

Publisher

Association for Computing Machinery
DOI: 10.1145/2633044

Keywords

Design; Algorithms; Performance; Topic modeling; link disambiguation; Wikipedia

Funding

  1. National Science Foundation [0746930, 1218488]
  2. Division of Information & Intelligent Systems
  3. Directorate for Computer & Information Science & Engineering [0746930, 1218488] (funding source: National Science Foundation)

Abstract

Many articles in the online encyclopedia Wikipedia have hyperlinks to ambiguous article titles; these ambiguous links should be replaced with links to unambiguous articles, a process known as disambiguation. We propose a novel statistical topic model based on link text, which we refer to as the Link Text Topic Model (LTTM), that we use to suggest new link targets for ambiguous links. To evaluate our model, we describe a method for extracting ground truth for this link disambiguation task from edits made to Wikipedia in a specific time period. We use this ground truth to demonstrate the superiority of LTTM over other existing link- and content-based approaches to disambiguating links in Wikipedia. Finally, we build a web service that uses LTTM to make suggestions to human editors wanting to fix ambiguous links in Wikipedia.
