Journal
ACM TRANSACTIONS ON INFORMATION SYSTEMS
Volume 32, Issue 3
Publisher
ASSOC COMPUTING MACHINERY
DOI: 10.1145/2633044
Keywords
Design; Algorithms; Performance; Topic modeling; link disambiguation; Wikipedia
Funding
- National Science Foundation [0746930, 1218488]
- Division of Information & Intelligent Systems
- Directorate for Computer & Information Science & Engineering [0746930, 1218488] (funding source: National Science Foundation)
Many articles in the online encyclopedia Wikipedia have hyperlinks to ambiguous article titles; these ambiguous links should be replaced with links to unambiguous articles, a process known as disambiguation. We propose a novel statistical topic model based on link text, which we refer to as the Link Text Topic Model (LTTM), that we use to suggest new link targets for ambiguous links. To evaluate our model, we describe a method for extracting ground truth for this link disambiguation task from edits made to Wikipedia in a specific time period. We use this ground truth to demonstrate the superiority of LTTM over other existing link- and content-based approaches to disambiguating links in Wikipedia. Finally, we build a web service that uses LTTM to make suggestions to human editors wanting to fix ambiguous links in Wikipedia.