Uncertainty-aware Fine-tuning of Segmentation Foundation Models

Segmentation with Uncertainty Model (SUM) improves SAM without forgetting to "segment anything."
Left: Both HQ-SAM and SUM show qualitative improvements over SAM, particularly in salient-object segmentation of complex structures (top row). HQ-SAM, however, struggles with background entities (middle row) and part segmentation (bottom row), often erroneously segmenting foreground objects instead of background entities, or entire objects instead of parts.
Right: SUM consistently outperforms SAM and HQ-SAM in quantitative comparisons, achieving the highest mean boundary IoU across diverse evaluation sets and interactive segmentation rounds.

Abstract

The Segment Anything Model (SAM) is a large-scale foundation model that has revolutionized segmentation methodology. Despite its impressive generalization ability, the segmentation accuracy of SAM on images with intricate structures is often unsatisfactory. Recent works have proposed lightweight fine-tuning using high-quality annotated data to improve accuracy on such images. However, here we provide extensive empirical evidence that this strategy leads to forgetting how to "segment anything": these models lose the original generalization abilities of SAM, in the sense that they perform worse for segmentation tasks not represented in the annotated fine-tuning set.

To improve performance without forgetting, we introduce a novel framework that combines high-quality annotated data with a large unlabeled dataset. The framework relies on two methodological innovations. First, we quantify the uncertainty in the SAM pseudo labels associated with the unlabeled data and leverage it to perform uncertainty-aware fine-tuning. Second, we encode the type of segmentation task associated with each training example using a task prompt to reduce ambiguity.

We evaluated the proposed Segmentation with Uncertainty Model (SUM) on a diverse test set consisting of 14 public benchmarks, where it achieves state-of-the-art results. Notably, our method consistently surpasses SAM by 3-6 points in mean IoU and 4-7 points in mean boundary IoU across point-prompt interactive segmentation rounds.
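For reference, boundary IoU restricts the IoU computation to a thin band along each mask contour, which makes it more sensitive to the fine-structure errors that SUM targets. Below is a minimal NumPy/OpenCV sketch following the commonly used definition; the `dilation_ratio` default and the erosion-based band extraction are illustrative choices, not our exact evaluation code.

```python
import numpy as np
import cv2

def mask_to_boundary(mask, dilation_ratio=0.02):
    """Extract a thin band along the mask contour (band width scales with the image diagonal)."""
    h, w = mask.shape
    dilation = max(1, int(round(dilation_ratio * np.sqrt(h ** 2 + w ** 2))))
    padded = np.pad(mask.astype(np.uint8), 1, mode="constant")  # treat the image border as background
    eroded = cv2.erode(padded, np.ones((3, 3), np.uint8), iterations=dilation)[1:-1, 1:-1]
    return mask.astype(np.uint8) - eroded  # 1 on the band, 0 elsewhere

def boundary_iou(gt, pred, dilation_ratio=0.02):
    """IoU computed only over the boundary bands of the two binary masks."""
    gt_b = mask_to_boundary(gt, dilation_ratio)
    pred_b = mask_to_boundary(pred, dilation_ratio)
    inter = np.logical_and(gt_b, pred_b).sum()
    union = np.logical_or(gt_b, pred_b).sum()
    return inter / union if union > 0 else 1.0
```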

Framework

Framework of SUM: Top: When processing human-annotated examples, interactive prompts are sampled based on the binary-mask labels and fed iteratively into the model along with the image. Since this binary mask depends on the type of segmentation task desired by the user, SUM incorporates a task prompt that specifies the task relevant to each annotation (1 for salient-object segmentation and 2 for entity segmentation).
Bottom: For unlabeled images, the iterative prompts are sampled based on model-generated binary pseudo-labels, which may be inaccurate. SUM includes an uncertainty-quantification module that processes the pseudo-labels, generating an uncertainty map. This map is leveraged within an uncertainty-aware loss function used for training, and also informs how the interactive prompts are sampled. For all unlabeled data, the task prompt is set to 0.
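To make the uncertainty-aware loss concrete, the sketch below shows one plausible way to down-weight the pixel-wise loss on unlabeled data according to the uncertainty map. The function name, the binary cross-entropy choice, and the linear 1 - uncertainty weighting are assumptions for illustration, not the exact loss used by SUM.

```python
import torch
import torch.nn.functional as F

def uncertainty_weighted_bce(logits, pseudo_labels, uncertainty, eps=1e-6):
    """Pixel-wise BCE against SAM pseudo-labels, down-weighted where the
    uncertainty map flags likely pseudo-label errors (illustrative sketch).

    logits:        (B, 1, H, W) raw mask predictions of the fine-tuned model
    pseudo_labels: (B, 1, H, W) binary pseudo-labels generated by frozen SAM
    uncertainty:   (B, 1, H, W) uncertainty map with values in [0, 1]
    """
    per_pixel = F.binary_cross_entropy_with_logits(
        logits, pseudo_labels.float(), reduction="none"
    )
    weights = 1.0 - uncertainty            # trust low-uncertainty pixels more
    return (weights * per_pixel).sum() / (weights.sum() + eps)
```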

Generation of Uncertainty Map

Generation of uncertainty maps: (1) The mask-refinement module receives as input the segmentation prediction produced by SAM. (2) The module produces a refined segmentation mask. (3) The uncertainty map equals the absolute difference between the SAM and refined predictions.
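As described above, the uncertainty map is simply the absolute difference between the two predictions. A minimal sketch, assuming the SAM output and the refined output are available as logits at the same resolution:

```python
import torch

@torch.no_grad()
def uncertainty_map(sam_logits, refined_logits):
    """Pixel-wise uncertainty: absolute difference between the SAM prediction
    and the refined prediction, taken in probability space."""
    p_sam = torch.sigmoid(sam_logits)
    p_refined = torch.sigmoid(refined_logits)
    return (p_sam - p_refined).abs()       # in [0, 1]; high where the two disagree
```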

Better Quality

Comparative visualization of segmentation outcomes using single-box prompts.


Comparative visualization of segmentation outcomes using point prompts, where blue points signify positive prompts and red points indicate negative prompts. We adhere to the same point-prompt sampling evaluation strategy as SAM.
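For completeness, the sketch below illustrates the general idea behind this interactive evaluation: each new click is drawn from the error region between the current prediction and the ground truth, positive if the model missed foreground and negative if it predicted spurious background. Uniform sampling over the whole error region is an assumption here; the exact protocol follows SAM's evaluation code.

```python
import numpy as np

def sample_next_point(pred_mask, gt_mask, rng=None):
    """Draw the next interactive click from the prediction/ground-truth error region.

    Returns ((y, x), label) with label 1 for a positive click (missed foreground)
    and 0 for a negative click (false-positive background), or None if the
    prediction already matches the ground truth exactly.
    """
    rng = rng or np.random.default_rng()
    error = pred_mask.astype(bool) ^ gt_mask.astype(bool)
    ys, xs = np.nonzero(error)
    if len(ys) == 0:
        return None
    i = rng.integers(len(ys))              # uniform sample over the error region
    y, x = int(ys[i]), int(xs[i])
    return (y, x), 1 if gt_mask[y, x] else 0
```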

Dataset

Fine-tuning under different human-annotation budgets: FT-Small, FT-Medium, FT-Large.

Experiments

Comparison of HQ-SAM with Vanilla and SUM, both fine-tuned using the same lightweight scheme as HQ-SAM. SUM matches HQ-SAM and outperforms Vanilla in salient-object segmentation, and is superior in entity and part segmentation.


Comparison with other lightweight fine-tuning methods: single point-prompt segmentation mIoU for SUM versus models fine-tuned using various strategies on the HQSeg-44K dataset. All competing models improve on the salient-object segmentation task associated with this dataset but deteriorate on other segmentation tasks.


Comparison with semi-supervised methods: 3-point-prompt segmentation evaluation of models fine-tuned on the FT-Small dataset with various strategies. SUM clearly outperforms all other strategies.


Comparison of SAM with SUM fine-tuned under different human-annotation budgets: 5-point-prompt segmentation evaluation. SUM consistently outperforms SAM, showing even greater improvement as the budget of human-annotated data increases.


Additional evaluation: to test the generalization ability of SUM on a broader range of segmentation tasks, we evaluate on 8 additional datasets. The mIoU comparisons, reported in the following tables, confirm that SUM consistently outperforms SAM. For reproducibility, SUM is fine-tuned on the public FT-Medium dataset.


Ablation study. This table reports interactive segmentation mean IoU of different ablated versions of SUM fine-tuned on FT-Medium, showing individual gains provided by uncertainty-aware fine-tuning and task prompts.

Acknowledgements

The authors acknowledge Markus Woodson for valuable discussions and feedback.

The website template was adapted from SegGen.