Author: Wang Leiquan, Chu Xiaoliang, Zhang Weishan, Wei Yiwei, Sun Weichen, Wu Chunlei
Publisher: MDPI
E-ISSN: 1424-8220
ISSN: 1424-8220
Source: Sensors, Vol. 18, Iss. 2, 2018-02, pp. 646
Abstract
Image captioning in natural language has become an emerging trend. However, the social image, which is associated with a set of user-contributed tags, has rarely been investigated for this task. The user-contributed tags, which can reflect user attention, have been neglected in conventional image captioning, and most existing image captioning models cannot be applied directly to social images. In this work, a dual attention model is proposed for social image captioning that combines visual attention and user attention simultaneously. Visual attention is used to compress a large amount of salient visual information, while user attention is applied to adjust the description of social images according to the user-contributed tags. Experiments conducted on the Microsoft (MS) COCO dataset demonstrate the superiority of the proposed dual attention method.
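The dual attention idea described above can be illustrated with a minimal sketch: one attention distribution over regional visual features and another over embeddings of user-contributed tags, fused into a single context vector for the caption decoder. This is a hypothetical simplification (the function names, dimensions, and the additive fusion are assumptions for illustration, not the paper's exact formulation):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def dual_attention_context(visual_feats, tag_embeds, query):
    """Hypothetical fusion of visual attention and user attention.

    visual_feats: (K, d) regional image features
    tag_embeds:   (T, d) embeddings of user-contributed tags
    query:        (d,)   current decoder hidden state
    """
    # Visual attention: weight salient image regions by similarity to the query.
    alpha = softmax(visual_feats @ query)   # (K,) attention weights
    visual_ctx = alpha @ visual_feats       # (d,) visual context

    # User attention: weight tags that reflect the user's focus.
    beta = softmax(tag_embeds @ query)      # (T,) attention weights
    user_ctx = beta @ tag_embeds            # (d,) user context

    # Fuse the two contexts (a simple sum here; the actual model may
    # use a learned gate or concatenation instead).
    return visual_ctx + user_ctx
```

The fused context vector would then condition the next word prediction of the caption decoder at each time step.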
Related content
Modeling Bottom-Up Visual Attention Using Dihedral Group
Symmetry, Vol. 8, Iss. 8, 2016-08
Selective Attention in Multi-Chip Address-Event Systems
By Bartolozzi Chiara, Indiveri Giacomo
Sensors, Vol. 9, Iss. 7, 2009-06
Recognizing the Degree of Human Attention Using EEG Signals from Mobile Sensors
By Liu Ning-Han, Chiang Cheng-Yu, Chu Hsuan-Chin
Sensors, Vol. 13, Iss. 8, 2013-08