Learning Target-aware Attention for Robust Tracking with Conditional Adversarial Network

Xiao Wang (Anhui University), Rui Yang (Anhui University), Tao Sun (Anhui University), Bin Luo (Anhui University)

Many current visual trackers are built on the tracking-by-detection framework, which searches for the target object within a local search window in each frame. Although these trackers achieve appealing performance, their localization and scale handling often degrade in extremely challenging scenarios, such as heavy occlusion and large deformation, for two main reasons: i) they set the local search window using temporal context alone, so the window may not cover the target at all, causing tracking failure; ii) some of them adopt an image-pyramid strategy to handle scale variation, which relies heavily on accurate target localization and is therefore easily disturbed when the localization is unreliable. To address these issues, this paper presents a novel and general target-aware attention learning approach that achieves target localization and scale handling simultaneously. A conditional generative adversarial network (CGAN) produces attention maps from which proposals with high-quality locations and scales are generated, and tracking is then performed with a multi-domain CNN. The proposed approach is efficient and effective, requires only a small amount of training data, and significantly improves the tracking-by-detection framework. Extensive experiments show that the proposed approach outperforms most recent state-of-the-art trackers on several visual tracking benchmarks and provides improved robustness to fast motion, scale variation, and heavy occlusion. The project page for this paper can be found at: \url{https://sites.google.com/view/globalattentiontracking/home}.
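To make the attention-to-proposal idea concrete, here is a minimal, hypothetical sketch (not the paper's actual implementation) of how a 2D attention map could yield candidate boxes that carry both a location and a scale: threshold the map, take the bounding box of the activated region, and emit one proposal per scale factor centred on it. The function name, threshold, and scale set are illustrative assumptions.

```python
import numpy as np

def proposals_from_attention(attn, thresh=0.5, scales=(0.8, 1.0, 1.2)):
    """Derive candidate boxes (x, y, w, h) from a 2D attention map.

    Illustrative sketch: pixels with attention >= `thresh` define the
    activated region; its bounding box is rescaled by each factor in
    `scales` to produce proposals with varying size but a shared centre.
    """
    ys, xs = np.nonzero(attn >= thresh)
    if len(xs) == 0:              # no activation: return no proposals
        return []
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0   # region centre
    w, h = x1 - x0 + 1, y1 - y0 + 1             # region width/height
    return [(cx - s * w / 2, cy - s * h / 2, s * w, s * h)
            for s in scales]

# Toy attention map with a single hot 4x4 block.
attn = np.zeros((16, 16))
attn[4:8, 6:10] = 0.9
boxes = proposals_from_attention(attn)
```

Because each proposal inherits its scale directly from the attention map rather than from the previous frame's estimate, this style of proposal generation does not depend on an image pyramid anchored at the last known location.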


Paper (PDF)
Supplementary material (PDF)

title={Learning Target-aware Attention for Robust Tracking with Conditional Adversarial Network},
author={Xiao Wang and Rui Yang and Tao Sun and Bin Luo},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},