An Algorithm for the Estimation of the Initial Text Skew
Keywords:Document image processing, Text skew, Initial text skew, Printed text
AbstractThe paper presents a methodology for the estimation of the initial skew rate of text lines. Firstly, it splits text into groups according to the bounding boxes. Linked bounding boxes establish the bigger objects called connected components. After applying mathematical morphology operations, the enlarged group of the connected components is formed. The longest connected component is extracted by the longest common subsequence method. Inside the longest connected component, the gravity centers are determined for each bounding box. They represent the reference points, which are used for the calculation of the initial skew rate. Calculation is made by the moment based method. The comparative analysis of the origin and estimated skew rate is used to evaluate the algorithm. Hence, the proposed algorithm is examined with different printed text samples. It showed robustness for the skew estimation in the wide range of resolutions.
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.