Identification of scripts and orientations of degraded document images

Author: Lu Shijian  

Publisher: Springer Publishing Company

ISSN: 1433-7541

Source: Pattern Analysis and Applications (PAA), Vol. 13, Iss. 4, 2010-11, pp. 469-475

Abstract

Document script and document orientation are important information for document digitization. Methods for identifying them have been reported, but most are very sensitive to document skew and low image resolution. This paper reports a script and orientation identification method that addresses this issue by converting a document image into a pair of document vectors based on the density and distribution of character strokes. Experiments over 3,024 document images in 12 scripts show that the proposed method is accurate and tolerant of various types of document degradation.