ClueNet : A Deep Framework for Occluded Pedestrian Pose Estimation

Perla Sai Raj Kishore (Institute of Engineering & Management), Sudip Das (Indian Statistical Institute), Partha Sarathi Mukherjee (Indian Statistical Institute), Ujjwal Bhattacharya (ISI Kolkata)

Estimating the pose of a pedestrian provides information about the subject's current activity or instantaneous behaviour. Such information is useful for autonomous vehicles, augmented reality, video surveillance, and similar applications. Although a large body of work on pedestrian detection is available in the literature, detecting pedestrians under significant occlusion remains a challenging task. In this work, we go a step further and propose a novel deep learning framework, called ClueNet, to detect occluded pedestrians and estimate their entire pose in an unsupervised manner. ClueNet is a two-stage framework in which the first stage generates visual clues that the second stage uses to accurately estimate the pose of occluded pedestrians. The first stage employs a multi-task network to segment the visible parts and predict a bounding box enclosing both the visible and occluded regions of each pedestrian. The second stage uses these predictions for pose estimation. We propose a novel training strategy, called Mask and Predict, that enables ClueNet to estimate the pose even in occluded regions. Additionally, we use several other training strategies to further improve our results. The proposed work is the first of its kind, and experimental results on the CityPersons and MS COCO datasets show the superior performance of our approach over existing methods.
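The abstract does not spell out how the first-stage clues are passed to the second stage or how Mask and Predict works, so the following is only a minimal sketch under stated assumptions. It assumes that the clues (visible-part mask and full-body box) are used to crop the image and are appended as an extra input channel, and that Mask and Predict amounts to artificially hiding part of a pedestrian during training while the pose target stays unchanged. The function names `mask_random_region` and `second_stage_input`, the `max_frac` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def mask_random_region(image, box, rng, max_frac=0.5):
    """Zero out a random rectangle inside `box` (x0, y0, x1, y1).

    Assumed stand-in for a "mask" step: it simulates an occluder so that a
    pose network could be trained to predict joints it cannot see.
    Returns the masked copy and the masked rectangle (x0, y0, x1, y1).
    """
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    # Rectangle size: at least 1 pixel, at most max_frac of the box side.
    mw = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    mh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    # Top-left corner chosen so the rectangle stays inside the box.
    mx = int(rng.integers(x0, x1 - mw + 1))
    my = int(rng.integers(y0, y1 - mh + 1))
    masked = image.copy()
    masked[my:my + mh, mx:mx + mw] = 0
    return masked, (mx, my, mx + mw, my + mh)

def second_stage_input(image, visible_mask, box):
    """Crop the full-body box and append the visible-part mask as a channel.

    Assumption: this is one plausible way to hand first-stage "clues" to a
    second-stage pose estimator; the paper may combine them differently.
    """
    x0, y0, x1, y1 = box
    crop = image[y0:y1, x0:x1].astype(np.float32)
    clue = visible_mask[y0:y1, x0:x1].astype(np.float32)[..., None]
    return np.concatenate([crop, clue], axis=-1)

# Toy data standing in for first-stage outputs (all values are made up).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 32, 3), dtype=np.uint8)
visible_mask = np.zeros((64, 32), dtype=bool)
visible_mask[8:40, 4:28] = True   # hypothetical visible-part segmentation
box = (2, 4, 30, 60)              # hypothetical box over visible + occluded body

masked_image, region = mask_random_region(image, box, rng)
# The hidden rectangle is no longer "visible", so clear it in the clue mask too.
masked_visible = visible_mask.copy()
masked_visible[region[1]:region[3], region[0]:region[2]] = False

x = second_stage_input(masked_image, masked_visible, box)
print(x.shape)  # (56, 28, 4): cropped height, cropped width, RGB + clue channel
```

In this sketch the pose annotations would be left untouched by the masking step, so a network trained on `x` would be asked to predict joints inside the hidden rectangle as well as in the visible parts.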



title={ClueNet : A Deep Framework for Occluded Pedestrian Pose Estimation},
author={Perla Sai Raj Kishore and Sudip Das and Partha Sarathi Mukherjee and Ujjwal Bhattacharya},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},