Text Document Clustering Approach by Improved Sine Cosine Algorithm
Keywords: text document clustering, optimization problems, metaheuristics, sine cosine algorithm, hybridization, K-means
Text clustering has developed rapidly because of the vast amounts of textual data now available in forms such as online content, social media comments, corporate data, public e-services and media data. Text clustering groups documents with similar content and, in doing so, identifies significant patterns in unstructured textual data. Algorithms for extracting useful and relevant information from large text collections are being developed worldwide. One of the main obstacles in text clustering is measuring the significance of the content in each document so that the collection can be partitioned. This study proposes using an improved metaheuristic algorithm to fine-tune the K-means approach for the text clustering task. The proposed technique is evaluated on the first 30 unconstrained test functions from the CEC2017 test suite and on six standard benchmark text datasets. The simulation results and the comparison with existing techniques demonstrate the robustness and superiority of the proposed method.
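As a rough illustration of how a metaheuristic can be coupled with K-means, the sketch below uses the basic sine cosine algorithm (not the improved variant proposed in this study) to search for starting centroids and then refines them with ordinary K-means iterations. Everything in it is an assumption made for illustration: the function names `sse`, `sca_centroids` and `kmeans_refine`, the parameter values, and the synthetic two-dimensional data, which stands in for the document vectors (for example TF-IDF rows) that a real text clustering task would use.

```python
import numpy as np

def sse(centroids, X):
    """Sum of squared distances from each point to its nearest centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).sum()

def sca_centroids(X, k, n_agents=20, n_iter=100, a=2.0, seed=0):
    """Basic sine cosine search over sets of k centroids (illustrative only)."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Each agent encodes k candidate centroids inside the data bounds.
    agents = rng.uniform(lo, hi, size=(n_agents, k, X.shape[1]))
    fitness = np.array([sse(c, X) for c in agents])
    best = agents[fitness.argmin()].copy()
    best_fit = fitness.min()
    history = [best_fit]
    for t in range(n_iter):
        r1 = a - t * a / n_iter  # step amplitude shrinks linearly over time
        r2 = rng.uniform(0, 2 * np.pi, size=agents.shape)
        r3 = rng.uniform(0, 2, size=agents.shape)
        r4 = rng.uniform(0, 1, size=agents.shape)
        dist = np.abs(r3 * best - agents)
        # Sine branch when r4 < 0.5, cosine branch otherwise.
        step = np.where(r4 < 0.5, r1 * np.sin(r2) * dist, r1 * np.cos(r2) * dist)
        agents = np.clip(agents + step, lo, hi)
        fitness = np.array([sse(c, X) for c in agents])
        if fitness.min() < best_fit:
            best_fit = fitness.min()
            best = agents[fitness.argmin()].copy()
        history.append(best_fit)
    return best, history

def kmeans_refine(X, centroids, n_iter=20):
    """Plain K-means (Lloyd) iterations starting from the given centroids."""
    for _ in range(n_iter):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
    return centroids, labels

# Synthetic stand-in for document vectors: three well-separated groups.
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(c, 0.3, size=(30, 2)) for c in (0.0, 3.0, 6.0)])

start, history = sca_centroids(X, k=3)
centroids, labels = kmeans_refine(X, start)
```

In this sketch the metaheuristic only supplies the starting point, and K-means performs the final local refinement. Other couplings, such as alternating the two steps, are equally plausible readings of "fine-tuning" K-means.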
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.