Improving Multi-stage Object Detection via Iterative Proposal Refinement

Jicheng Gong (Westwell-lab), Zhao Zhao (Westwell-lab), Nic Li (Westwell-lab)

Abstract
Multi-stage object detection frameworks (e.g., Cascade R-CNN) have achieved excellent detection performance compared with one- and two-stage frameworks (e.g., FPN). In this work, we introduce an LSTM-based proposal refinement module that iteratively refines proposed bounding boxes. The module integrates naturally with different frameworks, and the number of iterative steps is flexible and can differ between the training and testing stages. We focus on improving the widely used two-stage frameworks by replacing the original bounding box regression head with our proposed module. To verify the efficacy of our method, we perform extensive experiments on the PASCAL VOC and MS COCO benchmarks with both ResNet-50 and ResNet-101 backbones. The results show that, with our LSTM-based module, the detectors achieve significantly higher mAP than the vanilla R-FCN and FPN on both benchmarks. The method also outperforms the existing state-of-the-art method Cascade R-CNN, especially under high IoU thresholds.
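
As a rough illustration of the iterative refinement idea described in the abstract, the sketch below repeatedly applies predicted box offsets to a proposal and tracks how the IoU with a reference box changes. It is not the paper's implementation. The function names (`apply_deltas`, `refine`, `make_toy_predictor`), the (dx, dy, dw, dh) offset parameterization borrowed from the common R-CNN convention, and the toy predictor standing in for the learned LSTM module are all assumptions made for illustration. The `state` value only mimics a recurrent hidden state being carried between steps.

```python
import math


def apply_deltas(box, deltas):
    """Decode (dx, dy, dw, dh) offsets against an (x1, y1, x2, y2) box.

    Uses the common R-CNN style parameterization (an assumption here).
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = deltas
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def refine(box, predict_deltas, num_steps):
    """Refine a box for num_steps iterations and return every intermediate box.

    predict_deltas(box, state) -> (deltas, state) is a hypothetical interface;
    in a real detector it would be a learned recurrent module fed with
    region features rather than the box alone.
    """
    state = None  # placeholder for a recurrent hidden state
    trajectory = [box]
    for _ in range(num_steps):
        deltas, state = predict_deltas(box, state)
        box = apply_deltas(box, deltas)
        trajectory.append(box)
    return trajectory


def make_toy_predictor(target, step=0.5):
    """Toy stand-in for a learned module: moves a fraction of the way to a known target."""
    tw, th = target[2] - target[0], target[3] - target[1]
    tcx, tcy = target[0] + 0.5 * tw, target[1] + 0.5 * th

    def predict(box, state):
        w, h = box[2] - box[0], box[3] - box[1]
        cx, cy = box[0] + 0.5 * w, box[1] + 0.5 * h
        deltas = (step * (tcx - cx) / w, step * (tcy - cy) / h,
                  step * math.log(tw / w), step * math.log(th / h))
        return deltas, state  # the toy predictor ignores the state

    return predict


if __name__ == "__main__":
    target = (10.0, 10.0, 50.0, 50.0)
    proposal = (0.0, 0.0, 30.0, 30.0)
    boxes = refine(proposal, make_toy_predictor(target), num_steps=3)
    for step_index, refined in enumerate(boxes):
        print(step_index, round(iou(refined, target), 3))
```

Because `num_steps` is an ordinary argument of `refine`, the same loop can be run with a different number of iterations at training and test time, which mirrors the flexibility the abstract mentions.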

DOI
10.5244/C.33.105
https://dx.doi.org/10.5244/C.33.105

Files
Paper (PDF)
Supplementary material (ZIP)

BibTeX
@inproceedings{BMVC2019,
title={Improving Multi-stage Object Detection via Iterative Proposal Refinement},
author={Jicheng Gong and Zhao Zhao and Nic Li},
year={2019},
month={September},
pages={105.1--105.13},
articleno={105},
numpages={13},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.105},
url={https://dx.doi.org/10.5244/C.33.105}
}