Loughborough University

A sequential mixing fusion network for enhanced feature representations in multimodal sentiment analysis

journal contribution
posted on 2025-05-20, 12:56 authored by Chenchen Wang, Qiang Zhang, Jing Dong, Hui Fang, Gerald Schaefer, Rui Liu, Pengfei Yi

Multimodal sentiment analysis exploits multiple modalities to understand a user’s sentiment state from video content. Recent work in this area integrates features derived from different modalities. However, current multimodal sentiment datasets are typically small, with limited diversity of cross-modal interactions, so simple feature fusion mechanisms can lead to modality dependence and model overfitting. Consequently, how to augment diverse cross-modal samples and use non-verbal modalities to dynamically enhance text feature representations remains under-explored. In this paper, we propose a sequential mixing fusion network to tackle this challenge. Using speech text content as the primary source, we design a sequential fusion strategy that maximises the feature expressiveness enhanced by the auxiliary modalities, namely facial movements and audio features, and a random feature-level mixing algorithm to augment diverse cross-modal interactions. Experimental results on three benchmark datasets show that our proposed approach significantly outperforms current state-of-the-art methods, while demonstrating strong robustness when a modality is missing.
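The abstract does not specify the details of the random feature-level mixing algorithm. As an illustration only, the sketch below shows one plausible, mixup-style realisation of feature-level mixing across a batch: each sample's per-modality features are blended with those of a randomly paired partner, producing synthetic cross-modal combinations that did not co-occur in the original data. The function name, the Beta-distributed coefficient, and the shared pairing across modalities are all assumptions, not the authors' published method.

```python
import numpy as np


def random_feature_mix(text, audio, vision, alpha=0.2, rng=None):
    """Mixup-style feature-level mixing within a batch (illustrative sketch).

    Each sample is paired with a random partner; for every modality the two
    feature vectors are blended with the same Beta(alpha, alpha) coefficient,
    yielding augmented cross-modal samples.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = text.shape[0]
    perm = rng.permutation(batch)                    # random partner for each sample
    lam = rng.beta(alpha, alpha, size=(batch, 1))    # per-sample mixing coefficient

    def mix(x):
        return lam * x + (1.0 - lam) * x[perm]

    return mix(text), mix(audio), mix(vision), lam, perm
```

Sharing the same pairing and coefficient across modalities keeps the three mixed feature streams aligned, so each augmented sample is still a coherent (text, audio, vision) triple.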

Funding

Dalian Major Projects of Basic Research [2023JJ11CG002]

111 Project [D23006]

National Foreign Expert Project of China [D20240244]

Interdisciplinary Research Project of Dalian University [DLUXK-2024-YB-007]

Scientific Research Foundation of Education Department of Liaoning Province grant [LJKMZ20221839, JYTMS20230379]

History

School

  • Science

Published in

Knowledge-Based Systems

Volume

320

Publisher

Elsevier B.V.

Version

  • AM (Accepted Manuscript)

Rights holder

© Elsevier B.V.

Publisher statement

This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/

Publication date

2025-05-01

Copyright date

2025

ISSN

0950-7051

eISSN

1872-7409

Language

  • en

Depositor

Dr Hui Fang. Deposit date: 2 May 2025

Article number

113638