Tri-CLT: Learning Tri-Modal Representations with Contrastive Learning and Transformer for Multimodal Sentiment Recognition

Authors

  • Zhiyong Yang, The College of Big Data and Internet of Things, Chongqing Vocational Institute of Engineering, and the College of Computer and Information Science, Chongqing Normal University, Chongqing, 402246, China
  • Zijian Li, School of Computer and Information Science, Chongqing Normal University, Chongqing, 401331, China
  • Dongdong Zhu, School of Computer and Information Science, Chongqing Normal University, Chongqing, 401331, China
  • Yu Zhou, The College of Finance and Tourism, Chongqing Vocational Institute of Engineering, Chongqing, 402246, China

DOI:

https://doi.org/10.5755/j01.itc.53.1.35060

Keywords:

Multimodal Sentiment Analysis, fusion, transformer, cross-modal attention, contrastive learning

Abstract

Multimodal Sentiment Analysis (MSA) has become an essential research area that aims for more accurate sentiment analysis by integrating multiple perceptual modalities such as text, vision, and audio. However, most previous studies fail to align the modalities well and ignore the differences in their semantic information, which leads to inefficient fusion between modalities and generates redundant information. To address these problems, this paper proposes a transformer-based network model, Tri-CLT. Specifically, an Integrating Fusion Block is designed to fuse modal features, enhancing their semantic information and mitigating the quadratic complexity of pairwise sequences in the transformer. Meanwhile, a cross-modal attention mechanism is used for complementary learning between modalities to improve model performance. In addition, contrastive learning is introduced to improve the model's representation learning ability. Finally, experiments are conducted on aligned and unaligned CMU-MOSEI data, and the results show that the proposed method outperforms existing methods.
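
The two mechanisms named in the abstract, cross-modal attention and contrastive learning, can be illustrated with a minimal sketch. This is not the Tri-CLT implementation: the function names `cross_modal_attention` and `info_nce` are hypothetical, the learned query/key/value projections of a real transformer are omitted, and the InfoNCE objective is assumed here as one common form of contrastive loss rather than the one the paper necessarily uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention across modalities.

    queries come from the target modality (e.g. text), keys and values
    from the source modality (e.g. audio). The two sequences may have
    different lengths, so no prior alignment is required.
    Assumption: learned linear projections are left out for brevity.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    value_dim = len(values[0])
    output = []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in keys])
        output.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(value_dim)]
        )
    return output


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: anchors[i] should match positives[i].

    All other positives in the batch act as negatives.
    Assumption: cosine similarity and temperature 0.1 are illustrative.
    """
    def normalize(v):
        norm = math.sqrt(dot(v, v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    loss = 0.0
    for i, anchor in enumerate(a):
        probs = softmax([dot(anchor, pos) / temperature for pos in p])
        loss -= math.log(probs[i])
    return loss / len(a)


# Toy usage: 2 text steps attend over 3 audio steps (unaligned lengths).
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_modal_attention(text, audio, audio)
```

In this sketch the output keeps the length of the query sequence, which is why cross-modal attention can be applied to unaligned data, and the contrastive loss is small when paired representations from two modalities are close and mismatched ones are far apart.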

Published

2024-03-22

Section

Articles