Improving class activation maps for weakly supervised semantic segmentation
Semantic segmentation, which aims to classify each pixel in an image, has emerged as a critical technique with wide-ranging applications. However, pixel-level annotated datasets are expensive and time-consuming to create. To address this challenge, Weakly Supervised Semantic Segmentation (WSSS) has gained significant attention, aiming to achieve high-quality segmentation results using only weak annotations, such as image-level labels. Among WSSS techniques, those based on Class Activation Maps (CAMs) have shown particular promise. Despite the progress made in CAM-based WSSS, several challenges persist, including difficulties in multi-object scenes, inaccuracies in pseudo-label generation, and coarse boundary issues in segmentation results.
This thesis tackles these challenges through three contributions to CAM-based WSSS. First, a multi-channel weight assignment scheme for CAMs is proposed that improves performance in multi-object scenes by generating more accurate object representations, and a Multi-Contrast Learning (MCL) encoder is introduced to further enhance the quality and reliability of CAMs. Second, an iterative refinement strategy called Pseudo-Label-based Mix (PL-Mix) is developed to improve the accuracy and reliability of pseudo-labels. Third, a CAM-based level set method is introduced to refine pseudo-label boundaries using Fourier neural operators, significantly improving the boundaries between regions.
The methods developed in this thesis have significant practical implications for real-world applications of semantic segmentation. By reducing the dependence on fully annotated data, the work makes semantic segmentation more accessible and practical for a wide range of applications, particularly in domains where obtaining pixel-level annotations is prohibitively expensive or time-consuming. The improved accuracy and boundary precision of the proposed methods enhance the reliability of semantic segmentation in critical applications.
History
School
- Science
Department
- Computer Science
Publisher
Loughborough University
Rights holder
© Yifan Wang
Publication date
2024
Notes
A Doctoral Thesis. Submitted in partial fulfilment of the requirements for the award of the degree of Doctor of Philosophy of Loughborough University.
Language
- en
Supervisor(s)
Hui Fang ; Gerald Schaefer
Qualification name
- PhD
Qualification level
- Doctoral
This submission includes a signed certificate in addition to the thesis file(s)
- I have submitted a signed certificate