Selected Publications

(* denotes equal contribution) Submitted to ECCV, 2018.

Although the recent success of convolutional neural networks (CNNs) has advanced the state of the art in saliency prediction for static images, few works have addressed the problem of predicting attention in videos. On the other hand, we find that the attention of different subjects consistently focuses on a single face in each frame of videos involving multiple faces. Therefore, we propose in this paper a novel deep learning (DL) based method to predict the salient face in multiple-face videos, which is capable of learning the features and transitions of salient faces across video frames. In particular, we first learn a CNN for each frame to locate the salient face. Taking the CNN features as input, we develop a multiple-stream long short-term memory (M-LSTM) network to predict the temporal transition of salient faces in video sequences. To evaluate our DL-based method, we build a new eye-tracking database of multiple-face videos. The experimental results show that our method outperforms the prior state-of-the-art methods in predicting visual attention on faces in multiple-face videos.
In CVPR, 2017.

Recent Publications

Few-shot Learning with Spatial-Task Attention Network. (* denotes equal contribution) Submitted to ECCV, 2018.

Predicting Salient Face in Multiple-face Videos. In CVPR, 2017.

PDF Dataset

Recent Posts

Computer Vision 2018 Spring Course Project–Survey Part


I gave a talk on the latest work on multi-shot re-id at the VENUS Reading Group


I gave a talk on the paper that won the ICML 2017 best paper award at the VENUS Reading Group



Deep Feature Flow (Cityscapes)

A reimplementation of Deep Feature Flow on Cityscapes, based on DeepLab-v2.

Capsule Network on Gluon

A reimplementation of Capsule Network based on Gluon.