An Algorithm for the Estimation of the Initial Text Skew

Authors

  • Darko Brodić, University of Belgrade, Technical Faculty in Bor, V.J. 12, 19210 Bor, Serbia
  • Dragan R. Milivojević, Mining and Metallurgy Institute, Department of Informatics, Z. Bulevar bb, 19210 Bor, Serbia

DOI:

https://doi.org/10.5755/j01.itc.41.3.1249

Keywords:

Document image processing, Text skew, Initial text skew, Printed text

Abstract

The paper presents a method for estimating the initial skew rate of text lines. First, the text is split into groups according to its bounding boxes. Linked bounding boxes form larger objects called connected components. Applying mathematical morphology operations then produces an enlarged group of connected components. The longest connected component is extracted by the longest common subsequence method. Within it, the gravity center of each bounding box is determined. These gravity centers serve as the reference points for calculating the initial skew rate, which is done by the moment-based method. The algorithm is evaluated by comparing the original and estimated skew rates on different printed text samples. It proved robust for skew estimation across a wide range of resolutions.
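
The last step described above, computing a skew angle from reference points with a moment-based method, can be illustrated with a minimal sketch. It assumes the standard orientation formula built on second-order central moments, which may differ in detail from the formulation used in the paper. The function name `skew_from_centers` and the sample points are hypothetical and chosen only for illustration.

```python
import math


def skew_from_centers(points):
    """Estimate the orientation angle (in degrees) of a set of (x, y) points.

    Assumption: uses the common second-order central moment formula
    theta = 0.5 * atan2(2 * mu11, mu20 - mu02); the paper's exact
    formulation is not given in the abstract.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two reference points")

    # Centroid of the reference points.
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Second-order central moments.
    mu20 = sum((x - mean_x) ** 2 for x, _ in points)
    mu02 = sum((y - mean_y) ** 2 for _, y in points)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points)

    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return math.degrees(theta)


# Hypothetical gravity centers lying on a line tilted by 5 degrees.
# In image coordinates with y pointing down, the sign of the angle flips.
centers = [(x, x * math.tan(math.radians(5.0))) for x in range(0, 200, 20)]
angle = skew_from_centers(centers)
```

For points that lie exactly on a straight line, this formula recovers the line's slope angle; for noisy gravity centers it returns the direction of the principal axis of the point cloud.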

Published

2012-09-05

Section

Articles