Refining pseudo-labels through iterative mix-up for weakly supervised semantic segmentation
Weakly supervised semantic segmentation (WSSS) aims to provide accurate pixel-level annotation based on only weak guidance, primarily derived from image-level labels. Recent WSSS methods exploit pseudo-labels generated from improved class activation maps (CAMs) to train a fine-grained classification model for semantic segmentation. However, these pseudo-labels are unreliable because they tend to either miss parts of the objects or include irrelevant regions due to weak guidance from individual images. In this paper, we propose a simple yet effective iterative mix-up strategy, Pseudo-Label-based Mix (PL-Mix), that refines pseudo-labels iteratively, thereby further enhancing WSSS performance. During each iteration, we migrate object regions from pseudo-labels produced in previous steps and render them with new contexts in a mix-up fashion. Due to model consistency enforcement across varied backgrounds and new combinations of multiple objects from enriched image samples, these pseudo-labels progressively become more accurate and reliable. Further enhanced by a masking strategy and a CAM-based earth mover’s distance loss, we achieve state-of-the-art performance on the PASCAL VOC2012 and MS COCO2014 benchmark datasets.
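The mixing step described in the abstract, in which object regions are migrated from pseudo-labels into new contexts, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `paste_objects`, the use of nested lists for images, the background index `0`, and the assumption that both images have the same size are all hypothetical choices made for illustration. The masking strategy, the CAM-based earth mover's distance loss, and the iterative retraining loop are not shown.

```python
def paste_objects(src_img, src_label, dst_img, dst_label, background=0):
    """Copy pixels that the source pseudo-label marks as object onto a target.

    Hypothetical helper: all four inputs are 2D lists of equal size, and
    `background` is the assumed label index for non-object pixels.
    """
    # Work on copies so the original target image and label stay unchanged.
    mixed_img = [row[:] for row in dst_img]
    mixed_label = [row[:] for row in dst_label]
    for y, label_row in enumerate(src_label):
        for x, cls in enumerate(label_row):
            if cls != background:
                # Object pixel: move both the pixel value and its class label.
                mixed_img[y][x] = src_img[y][x]
                mixed_label[y][x] = cls
    return mixed_img, mixed_label


# Toy 2x2 example: class 1 occupies the right column of the source image.
src_img = [[10, 11], [12, 13]]
src_label = [[0, 1], [0, 1]]
dst_img = [[20, 21], [22, 23]]
dst_label = [[2, 0], [0, 0]]

mixed_img, mixed_label = paste_objects(src_img, src_label, dst_img, dst_label)
```

In this sketch the mixed label map keeps the target's own objects wherever the pasted region does not cover them, so a single mixed sample can contain objects from both images against the target's background.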
History
School
- Science
Department
- Computer Science
Published in
Pattern Recognition
Publisher
Elsevier
Version
- AM (Accepted Manuscript)
Publisher statement
This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/
Acceptance date
2025-06-04
ISSN
0031-3203
eISSN
1873-5142
Publisher version
Language
- en