Where are the Masks: Instance Segmentation with Image-level Supervision

Issam Hadj Laradji (University of British Columbia), David Vazquez (Element AI), Mark Schmidt (University of British Columbia)

Abstract
A major obstacle in instance segmentation is that existing methods often need many per-pixel labels in order to be effective. These labels require large human effort, and for certain applications they are not readily available. To address this limitation, we propose a novel framework that can effectively train with image-level labels, which are significantly cheaper to acquire. For instance, one can do an internet search for the term "car" and obtain many images where a car is present with minimal effort. Our framework consists of two stages: (1) train a classifier to generate pseudo masks for the objects of interest; (2) train a fully supervised Mask R-CNN on these pseudo masks. Our two main contributions are proposing a pipeline that is simple to implement and amenable to different segmentation methods, and achieving new state-of-the-art results for this problem setup. We evaluate our method on PASCAL VOC 2012, a standard dataset for weakly supervised methods, where we demonstrate major performance gains over existing methods with respect to mean average precision.
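
As a rough illustration of stage (1), the sketch below turns per-class score maps into binary pseudo masks, using the image-level labels to select which classes to keep. The abstract does not say how the classifier's output becomes a mask, so the relative-threshold rule, the function name `pseudo_masks`, and the `rel_threshold` parameter are all assumptions made for illustration. This is not the authors' procedure.

```python
from typing import Dict, List

ScoreMap = List[List[float]]  # per-pixel classifier scores for one class
Mask = List[List[int]]        # binary mask: 1 = foreground, 0 = background


def pseudo_masks(
    score_maps: Dict[str, ScoreMap],
    image_labels: List[str],
    rel_threshold: float = 0.5,  # hypothetical parameter, not from the paper
) -> Dict[str, Mask]:
    """Build one binary pseudo mask per image-level label.

    Assumption: a pixel counts as foreground when its score reaches
    rel_threshold times the peak score of that class's map.
    """
    masks: Dict[str, Mask] = {}
    for label in image_labels:
        scores = score_maps[label]
        peak = max(max(row) for row in scores)
        if peak <= 0:
            # No positive evidence for this class: emit an empty mask.
            masks[label] = [[0] * len(row) for row in scores]
            continue
        cutoff = rel_threshold * peak
        masks[label] = [[1 if s >= cutoff else 0 for s in row] for row in scores]
    return masks


# Toy example: the image is labelled "car" only, so the "dog" map is ignored.
example_maps = {
    "car": [[0.1, 0.2, 0.1], [0.2, 0.9, 0.8], [0.1, 0.6, 0.1]],
    "dog": [[0.9, 0.9], [0.9, 0.9]],
}
example_masks = pseudo_masks(example_maps, ["car"])
```

In stage (2), masks of this kind would stand in for ground-truth annotations when training a fully supervised segmentation model such as Mask R-CNN.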

DOI
10.5244/C.33.13
https://dx.doi.org/10.5244/C.33.13

BibTeX
@inproceedings{BMVC2019,
title={Where are the Masks: Instance Segmentation with Image-level Supervision},
author={Issam Hadj Laradji and David Vazquez and Mark Schmidt},
year={2019},
month={September},
pages={13.1--13.13},
articleno={13},
numpages={13},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.13},
url={https://dx.doi.org/10.5244/C.33.13}
}