Instance-level human parsing is an essential task in human-centric analysis: it aims to segment the various body parts and, at the same time, associate each part with the corresponding human instance. Most state-of-the-art methods group instances on top of multi-human parsing results, but they tend to miss instances and fail to group parts correctly in crowded scenes. To address this problem, we propose a unified top-down framework that simultaneously detects each human instance and parses every part within that instance. To better parse each single human, we also design an attention module, which is integrated into our parsing network. As a result, our approach obtains fine-grained parsing results and the corresponding human mask in a single forward pass. Experiments show that the proposed algorithm performs favorably against state-of-the-art methods on the CIHP and PASCAL-Person-Part datasets.
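
To make the top-down output format concrete, the following is a minimal sketch of how per-instance part predictions could be assembled into an image-level part map and an instance map. It is not the network described above. The function name `assemble_instance_parsing`, the integer box format, the convention that label 0 is background, and the rule that earlier instances keep contested pixels are all assumptions made for illustration.

```python
import numpy as np


def assemble_instance_parsing(image_shape, boxes, part_maps):
    """Paste per-instance part labels into full-image maps (illustrative only).

    boxes: list of (x0, y0, x1, y1) integer boxes, assumed to lie inside the image.
    part_maps: list of 2D integer arrays, one per box, each of shape
        (y1 - y0, x1 - x0); 0 is background, k > 0 is a body-part label.
    Returns (part_map, instance_map), both of shape image_shape.
    """
    part_map = np.zeros(image_shape, dtype=np.int32)
    instance_map = np.zeros(image_shape, dtype=np.int32)
    for inst_id, ((x0, y0, x1, y1), parts) in enumerate(zip(boxes, part_maps), start=1):
        # Assumed convention: the human mask is the union of all part labels.
        human_mask = parts > 0
        # Basic slices are views, so writes below update the full-image maps.
        region_parts = part_map[y0:y1, x0:x1]
        region_inst = instance_map[y0:y1, x0:x1]
        # Assumed tie-break: pixels claimed by an earlier instance are kept.
        free = human_mask & (region_inst == 0)
        region_parts[free] = parts[free]
        region_inst[free] = inst_id
    return part_map, instance_map


if __name__ == "__main__":
    # Two overlapping hypothetical detections on a 4x6 image.
    boxes = [(0, 0, 3, 3), (2, 1, 5, 4)]
    parts_a = np.array([[1, 1, 0], [1, 2, 2], [0, 2, 2]])
    parts_b = np.array([[3, 3, 0], [3, 3, 0], [0, 0, 0]])
    part_map, instance_map = assemble_instance_parsing((4, 6), boxes, [parts_a, parts_b])
    print(part_map)
    print(instance_map)
```

Each pixel then carries both a part label and an instance id, which is the pairing that instance-level human parsing asks for. The per-instance human mask can be recovered as `instance_map == inst_id`.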