☆ 4.7 Article Proceedings Paper

Exemplar-based Visualization of Large Document Corpus

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS (2009)

Journal

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS

Volume 15, Issue 6, Pages 1161-1168

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TVCG.2009.140

Keywords

Exemplar; large-scale document visualization; multidimensional projection

Categories

Computer Science, Software Engineering

Funding

National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Information & Intelligent Systems [0915933]
National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Information & Intelligent Systems [0937586]

Abstract

With the rapid growth of the World Wide Web and electronic information services, text corpora are becoming available online at an incredible rate. By displaying text data in a logical layout (e.g., color graphs), text visualization offers a direct way to observe documents and to understand the relationships between them. In this paper, we propose a novel technique, Exemplar-based Visualization (EV), to visualize an extremely large text corpus. Capitalizing on recent advances in matrix approximation and decomposition, EV presents a probabilistic multidimensional projection model in the low-rank text subspace with a sound objective function. The probability with which each document belongs to each topic is obtained through iterative optimization and embedded into a low-dimensional space using parameter embedding. By selecting representative exemplars, we obtain a compact approximation of the data, which makes the visualization highly efficient and flexible. In addition, the selected exemplars neatly summarize the entire data set and greatly reduce cognitive overload in the visualization, making large text corpora easier to interpret. Empirically, we demonstrate the superior performance of EV through extensive experiments on publicly available text data sets.
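
The abstract gives only the outline of the pipeline, so the following is a rough illustrative sketch of that outline and not the paper's algorithm or objective function. Every concrete choice here is an assumption: a truncated SVD stands in for the low-rank text subspace, a soft k-means style iteration stands in for the iterative optimization of topic probabilities, the exemplar of a topic is taken to be the document with the highest probability for it, and a simple probability-weighted placement around topic anchors on a circle stands in for parameter embedding. All function names are hypothetical.

```python
import numpy as np

def low_rank_subspace(X, rank):
    """Project the rows of X (documents x terms) onto the top-`rank` singular directions."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank] * s[:rank]

def _responsibilities(Z, centers):
    """Softmax of negative squared distances; each row sums to 1."""
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

def soft_topic_proportions(Z, n_topics, n_iter=50, seed=0):
    """Assumed stand-in for the iterative optimization: soft k-means style updates."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=n_topics, replace=False)]
    for _ in range(n_iter):
        P = _responsibilities(Z, centers)
        weights = np.maximum(P.sum(axis=0), 1e-12)  # guard against empty topics
        centers = (P.T @ Z) / weights[:, None]
    return _responsibilities(Z, centers), centers

def pick_exemplars(P):
    """Assumed exemplar rule: per topic, the document with the highest probability."""
    return P.argmax(axis=0)

def embed_2d(P):
    """Place topic anchors on the unit circle and each document at the
    probability-weighted average of the anchors (a simplification, not
    parameter embedding itself)."""
    k = P.shape[1]
    angles = 2 * np.pi * np.arange(k) / k
    anchors = np.column_stack([np.cos(angles), np.sin(angles)])
    return P @ anchors
```

Under these assumptions, a caller would pass a document-term matrix to `low_rank_subspace`, feed the result to `soft_topic_proportions`, and then plot the output of `embed_2d` with the rows returned by `pick_exemplars` highlighted.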
