Incorporating Semantic Word Representations into Query Expansion for Microblog Information Retrieval
Keywords: Microblog retrieval, query expansion, word embeddings, word vectors, information retrieval
Microblog information retrieval has attracted considerable research attention as a way to find desired information in everyday communication on social networks. Because microblog content is often non-standard and informal and includes many popular Internet expressions, retrieval accuracy still has much room for improvement. To enhance microblog information retrieval, we propose a novel query expansion method that enriches user queries with semantic word representations. In our method, a neural network model maps each word in the corpus to a low-dimensional vector representation. The resulting word vectors support algebraic vector addition: the vector obtained by adding two word vectors expresses attributes common to both words. Accordingly, we represent the keywords of a user query as vectors, sum all the keyword vectors, and use the resulting query vector to select the expansion words. We also combine the traditional pseudo-relevance feedback query expansion method with the proposed method. Experimental results show that the proposed method is effective and reduces noise in the expanded query, which improves the accuracy of microblog retrieval.
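The vector-sum step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy three-dimensional embeddings, the function names, and the use of cosine similarity to rank candidate expansion words are all assumptions made for illustration, since the abstract does not state how expansion words are scored against the query vector.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# A real system would learn these vectors from the microblog corpus.
EMBEDDINGS = {
    "earthquake": [0.9, 0.1, 0.0],
    "japan": [0.1, 0.9, 0.0],
    "tsunami": [0.7, 0.6, 0.1],
    "fukushima": [0.5, 0.8, 0.0],
    "football": [0.0, 0.1, 0.9],
}


def add_vectors(vectors):
    """Component-wise sum of equally sized vectors."""
    return [sum(components) for components in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(keywords, embeddings, k=2):
    """Sum the keyword vectors and return the k nearest non-query words.

    Ranking by cosine similarity is an assumption of this sketch.
    """
    known = [w for w in keywords if w in embeddings]
    if not known:
        return []
    query_vec = add_vectors([embeddings[w] for w in known])
    scored = [
        (cosine(query_vec, vec), word)
        for word, vec in embeddings.items()
        if word not in known
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


expansion = expand_query(["earthquake", "japan"], EMBEDDINGS)
print(expansion)
```

With these toy vectors, the summed query vector lies close to words that share attributes of both keywords, while an unrelated word such as "football" is ranked last and excluded from the expansion.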
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.