Frontiers of Information Technology & Electronic Engineering >> 2023, Volume 24, Issue 10 doi: 10.1631/FITEE.2200514

Robust cross-modal retrieval with alignment refurbishment

Affiliation(s): School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; School of Mathematics and Statistics, Qingdao University, Qingdao 266071, China

Received: 2022-10-27 Accepted: 2023-10-27 Available online: 2023-10-27

Abstract

Cross-modal retrieval tries to achieve mutual retrieval between modalities by establishing consistent alignment for different modal data. Many methods have been proposed and have achieved excellent results; however, they are trained with clean cross-modal pairs, which are semantically matched but costly to obtain compared with easily available noise-aligned data (i.e., paired but semantically mismatched). When these methods are trained with noise-aligned data, their performance degrades dramatically. Therefore, we propose robust cross-modal retrieval with alignment refurbishment (RCAR), which significantly reduces the impact of noise on the model. Specifically, RCAR first conducts multi-task learning to slow down overfitting to the noise and make the data separable. Then, RCAR uses a two-component mixture model to divide the data into clean and noise alignments and refurbishes the labels according to the posterior probability of the noise-alignment component. In addition, we define partial and complete noises in the noise-alignment paradigm. Experimental results show that, compared with popular methods, RCAR achieves more robust performance with both types of noise.
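The division-and-refurbishment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each training pair has a scalar loss, that the two components are Gaussian (the abstract does not specify the component distribution here), and that the refurbished soft alignment label is one minus the posterior of the noise-alignment component. The names `noise_posterior` and `refurbish` are hypothetical.

```python
import math


def noise_posterior(losses, iters=50):
    """Fit a 1-D two-component Gaussian mixture with EM (an assumed choice)
    and return P(noise-alignment component | loss) for every sample."""
    lo, hi = min(losses), max(losses)
    mu = [lo, hi]  # component 0 starts at low loss (clean), 1 at high loss (noise)
    var = [max((hi - lo) ** 2 / 4.0, 1e-6)] * 2
    pi = [0.5, 0.5]
    resp = [0.5] * len(losses)
    for _ in range(iters):
        # E-step: responsibility of the high-loss component for each sample.
        resp = []
        for x in losses:
            p = [
                pi[k]
                * math.exp(-((x - mu[k]) ** 2) / (2.0 * var[k]))
                / math.sqrt(2.0 * math.pi * var[k])
                for k in range(2)
            ]
            total = p[0] + p[1]
            resp.append(p[1] / total if total > 0.0 else 0.5)
        # M-step: re-estimate mean, variance, and mixing weight per component.
        for k in range(2):
            w = [r if k == 1 else 1.0 - r for r in resp]
            n = sum(w)
            if n < 1e-12:
                continue  # skip an empty component instead of dividing by zero
            mu[k] = sum(wi * x for wi, x in zip(w, losses)) / n
            var[k] = max(
                sum(wi * (x - mu[k]) ** 2 for wi, x in zip(w, losses)) / n, 1e-6
            )
            pi[k] = n / len(losses)
    # Keep the convention that the returned posterior refers to the high-loss component.
    if mu[0] > mu[1]:
        resp = [1.0 - r for r in resp]
    return resp


def refurbish(posteriors):
    """Assumed refurbishment rule: soft alignment label = 1 - P(noise | loss)."""
    return [1.0 - p for p in posteriors]


# Toy per-pair losses: five well-aligned pairs followed by three mismatched ones.
losses = [0.10, 0.12, 0.09, 0.11, 0.10, 2.0, 2.1, 1.9]
posteriors = noise_posterior(losses)
labels = refurbish(posteriors)
clean_idx = [i for i, p in enumerate(posteriors) if p < 0.5]
```

In this sketch, pairs whose posterior falls below 0.5 are treated as clean alignments, and the remaining pairs keep a down-weighted soft label rather than being discarded. The 0.5 threshold is likewise an illustrative assumption.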
