Deep Learning Fusion of RGB and Depth Images for Pedestrian Detection

Zhixin Guo (Ghent University), Wenzhi Liao (Ghent University), Yifan Xiao (Ghent University), Peter Veelaert (UGent), Wilfried Philips (IPI - Ghent University - imec)

Abstract
In this paper, we propose an effective method based on the Faster R-CNN architecture that combines RGB and depth images for pedestrian detection. During the training stage, we generate a semantic segmentation map from the depth image and use it to refine the convolutional features extracted from the RGB images. In addition, we obtain more accurate region proposals by exploiting perspective projection with the help of depth information. Experimental results demonstrate that the proposed method achieves state-of-the-art RGB-D pedestrian detection performance on the KITTI dataset.
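
The abstract does not spell out how perspective projection is used, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. Under a pinhole camera model, an object of real height H at depth Z appears roughly f * H / Z pixels tall, where f is the focal length in pixels. A candidate box whose height disagrees strongly with that expectation is unlikely to contain a pedestrian. The function names, the 1.7 m nominal pedestrian height, the relative tolerance, and the choice to discard boxes without valid depth are all assumptions made for illustration.

```python
import numpy as np

def expected_pixel_height(depth_m, focal_px, real_height_m=1.7):
    """Pinhole projection: pixel height of an object real_height_m tall at depth_m.

    real_height_m=1.7 is an assumed nominal pedestrian height.
    """
    return focal_px * real_height_m / depth_m

def filter_proposals(boxes, depth_map, focal_px, tolerance=0.5):
    """Keep boxes whose height roughly matches the depth-implied pedestrian height.

    boxes: iterable of integer (x1, y1, x2, y2) pixel coordinates.
    depth_map: 2-D array of depths in metres; values <= 0 mark missing depth.
    tolerance: assumed relative deviation allowed from the expected height.
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        patch = depth_map[y1:y2, x1:x2]
        valid = patch[patch > 0]
        if valid.size == 0:
            # No usable depth inside the box: this sketch simply drops it.
            continue
        z = float(np.median(valid))  # median is robust to background pixels
        h_expected = expected_pixel_height(z, focal_px)
        if abs((y2 - y1) - h_expected) <= tolerance * h_expected:
            kept.append((x1, y1, x2, y2))
    return kept
```

For instance, with a focal length of 700 px and a scene at 10 m, the expected pedestrian height is about 119 px, so a 120 px tall box would be kept and a 20 px tall box would be discarded.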

DOI
10.5244/C.33.166
https://dx.doi.org/10.5244/C.33.166

Files
Paper (PDF)

BibTeX
@inproceedings{BMVC2019,
title={Deep Learning Fusion of RGB and Depth Images for Pedestrian Detection},
author={Zhixin Guo and Wenzhi Liao and Yifan Xiao and Peter Veelaert and Wilfried Philips},
year={2019},
month={September},
pages={166.1--166.13},
articleno={166},
numpages={13},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.166},
url={https://dx.doi.org/10.5244/C.33.166}
}