Article

Segmentation of historical machine-printed documents using Adaptive Run Length Smoothing and skeleton segmentation paths

IMAGE AND VISION COMPUTING (2010)

Journal

IMAGE AND VISION COMPUTING

Volume 28, Issue 4, Pages 590-604

Publisher

ELSEVIER

DOI: 10.1016/j.imavis.2009.09.013

Keywords

Text line segmentation; Word segmentation; Character segmentation; Historical machine-printed documents; Run Length Smoothing Algorithm

Categories

Computer Science, Artificial Intelligence; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic; Optics

Funding

European Community's Seventh Framework Programme [215064]
Greek Ministry of Research


Abstract

In this paper, we develop efficient techniques for segmenting document pages that result from the digitization of historical machine-printed sources. Documents of this kind often suffer from low quality and local skew, show several degradations due to the quality of the old printing matrix or to ink diffusion, and exhibit complex and dense layouts. To address these problems, we introduce the following innovative aspects: (i) a novel Adaptive Run Length Smoothing Algorithm (ARLSA) to handle complex and dense document layouts, (ii) detection of noisy areas and punctuation marks, which are common in historical machine-printed documents, (iii) detection of possible obstacles formed from background areas in order to separate neighboring text columns or text lines, and (iv) use of skeleton segmentation paths to isolate possibly connected characters. Comparative experiments on several historical machine-printed documents demonstrate the efficiency of the proposed technique. © 2009 Elsevier B.V. All rights reserved.
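For readers unfamiliar with run length smoothing, the following is a minimal sketch of the classic, fixed-threshold horizontal RLSA that the keyword refers to. It is not the paper's ARLSA: the abstract does not describe the adaptive constraints, so none are modeled here. The function names `rlsa_row` and `rlsa_horizontal`, the `max_gap` parameter, and the choice to leave leading and trailing background untouched are all assumptions made for illustration.

```python
def rlsa_row(row, max_gap):
    """Classic horizontal RLSA on one row of a binary image (1 = ink, 0 = background).

    Background runs of length <= max_gap that are bounded by ink on both
    sides are filled with ink. Leading and trailing background is left
    untouched (an assumption of this sketch; some variants fill it too).
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_ink is not None:
                gap = i - last_ink - 1  # background pixels between two ink pixels
                if 0 < gap <= max_gap:
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, max_gap):
    """Apply rlsa_row to every row of a binary image given as a list of lists."""
    return [rlsa_row(row, max_gap) for row in image]


# Toy example: gaps of up to 2 background pixels are merged, longer gaps survive.
image = [
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0, 0, 0],
]
smoothed = rlsa_horizontal(image, max_gap=2)
for row in smoothed:
    print(row)
```

With a small threshold, nearby character pixels merge into blocks while wider gaps, such as those between columns, remain open. A single global threshold tends to be hard to tune on dense layouts, which is presumably the limitation an adaptive variant targets.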
