An Algorithm for the Estimation of the Initial Text Skew

Darko Brodić, Dragan R. Milivojević

Abstract


The paper presents a methodology for estimating the initial skew rate of text lines. First, it splits the text into groups according to their bounding boxes. Linked bounding boxes form larger objects called connected components. After mathematical morphology operations are applied, an enlarged group of connected components is formed. The longest connected component is extracted by the longest common subsequence method. Within the longest connected component, the gravity center of each bounding box is determined. These gravity centers serve as the reference points used to calculate the initial skew rate, which is computed by the moment-based method. The algorithm is evaluated by comparing the original and estimated skew rates on different printed text samples. It proved robust for skew estimation across a wide range of resolutions.
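The final step, computing a skew angle from the gravity centers, can be illustrated with a minimal sketch. The abstract does not state the formula, so the code below assumes the standard orientation estimate from second-order central moments, theta = 0.5 * atan2(2 * mu11, mu20 - mu02), and may differ from the authors' exact formulation. The function names, the (x_min, y_min, x_max, y_max) box layout, and the use of the box midpoint as the gravity center are all illustrative assumptions.

```python
import math


def gravity_center(box):
    """Midpoint of an axis-aligned box (x_min, y_min, x_max, y_max).

    Assumption: the box midpoint stands in for the gravity center.
    """
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def skew_from_points(points):
    """Estimate the orientation of a point set, in degrees.

    Uses second-order central moments (an assumed, standard formulation).
    The sign follows the coordinate system of the input points; with image
    coordinates (y pointing down) a positive result is a clockwise tilt.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two reference points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Central moments of the reference points.
    mu20 = sum((x - mean_x) ** 2 for x, _ in points)
    mu02 = sum((y - mean_y) ** 2 for _, y in points)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points)
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return math.degrees(theta)


def estimate_initial_skew(boxes):
    """Skew estimate from the boxes of one (longest) connected component."""
    return skew_from_points([gravity_center(b) for b in boxes])
```

Given the bounding boxes of the longest connected component, `estimate_initial_skew(boxes)` returns a single angle in degrees. The earlier stages (morphological enlargement and longest-component extraction) are outside the scope of this sketch.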

DOI: http://dx.doi.org/10.5755/j01.itc.41.3.1249


Keywords


Document image processing; Text skew; Initial text skew; Printed text


Print ISSN: 1392-124X 
Online ISSN: 2335-884X