Incorporating Semantic Word Representations into Query Expansion for Microblog Information Retrieval

Authors

  • Bo Xu, Dalian University of Technology
  • Hongfei Lin, Dalian University of Technology
  • Yuan Lin, Dalian University of Technology
  • Kan Xu, Dalian University of Technology
  • Lin Wang, Dalian University of Technology
  • Jiping Gao, Institute of Scientific and Technical Information of China

DOI:

https://doi.org/10.5755/j01.itc.48.4.22487

Keywords:

Microblog retrieval, query expansion, word embeddings, word vectors, information retrieval

Abstract

Microblog information retrieval has attracted considerable research attention as a means of finding relevant information in everyday communication on social networks. Because microblog content is often non-standardized and informal, and includes many popular Internet expressions, retrieval accuracy still has much room for improvement. To enhance microblog information retrieval, we propose a novel query expansion method that enriches user queries with semantic word representations. In our method, a neural network model maps each word in the corpus to a low-dimensional vector. The resulting word vectors support algebraic vector addition, and the vector obtained by adding two word vectors expresses some attributes common to both words. Accordingly, we represent the keywords in a user query as vectors, sum all the keyword vectors, and use the resulting query vector to select expansion words. We also combine the traditional pseudo-relevance feedback query expansion method with the proposed method. Experimental results show that the proposed method is effective and reduces noise in the expanded query, which improves the accuracy of microblog retrieval.
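
The core step described in the abstract, summing the keyword vectors and using the sum to pick expansion words, might be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy three-dimensional embeddings, and the choice of cosine similarity for ranking candidates are all assumptions, since the abstract does not state the selection criterion. A real system would load vectors trained on a microblog corpus.

```python
import math

def add_vectors(vectors):
    """Element-wise sum of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_expansion_words(query_terms, embeddings, k=2):
    """Sum the query keyword vectors and return the k vocabulary words
    closest to that sum (cosine ranking is an assumption of this sketch)."""
    known = [t for t in query_terms if t in embeddings]
    if not known:
        return []  # no query term has a vector, so nothing to expand with
    query_vec = add_vectors([embeddings[t] for t in known])
    candidates = [
        (word, cosine(query_vec, vec))
        for word, vec in embeddings.items()
        if word not in query_terms  # do not re-add the original keywords
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in candidates[:k]]

# Hypothetical toy embeddings, invented purely for illustration.
TOY_EMBEDDINGS = {
    "earthquake": [0.9, 0.1, 0.0],
    "japan": [0.1, 0.9, 0.0],
    "tsunami": [0.7, 0.6, 0.1],
    "fukushima": [0.5, 0.8, 0.1],
    "football": [0.0, 0.1, 0.9],
}

expansion = select_expansion_words(["earthquake", "japan"], TOY_EMBEDDINGS, k=2)
print(expansion)
```

With these toy vectors, the summed query vector lies between "earthquake" and "japan", so words sharing attributes of both rank above an unrelated word such as "football". How the selected words are then merged with pseudo-relevance feedback terms is not specified in the abstract and is left out of this sketch.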

Author Biography

Bo Xu, Dalian University of Technology

Faculty of Electronic Information and Electrical Engineering

Published

2019-12-18

Section

Articles