A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation
Research output: Contribution to journal › Journal article › Research › peer-review
A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation. / Camarasa, Robin; Bos, Daniel; Hendrikse, Jeroen; Nederkoorn, Paul; Kooi, M. Eline; Lugt, Aad van der; Bruijne, Marleen de.
In: The Journal of Machine Learning for Biomedical Imaging, Vol. 2021, 013, 22.09.2021, p. 1-39.
RIS
TY - JOUR
T1 - A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation
AU - Camarasa, Robin
AU - Bos, Daniel
AU - Hendrikse, Jeroen
AU - Nederkoorn, Paul
AU - Kooi, M. Eline
AU - Lugt, Aad van der
AU - Bruijne, Marleen de
N1 - 39 pages, 22 figures, to be published in Journal of Machine Learning for Biomedical Imaging for the Special Issue: Uncertainty for Safe Utilization of Machine Learning in Medical Imaging (UNSURE) 2020
PY - 2021/9/22
Y1 - 2021/9/22
N2 - Uncertainty assessment has gained rapid interest in medical image analysis. A popular technique to compute epistemic uncertainty is the Monte-Carlo (MC) dropout technique. From a network with MC dropout and a single input, multiple outputs can be sampled. Various methods can be used to obtain epistemic uncertainty maps from those multiple outputs. In the case of multi-class segmentation, the number of methods is even larger, as epistemic uncertainty can be computed voxelwise per class or voxelwise per image. This paper highlights a systematic approach to define and quantitatively compare those methods in two different contexts: class-specific epistemic uncertainty maps (one value per image, voxel and class) and combined epistemic uncertainty maps (one value per image and voxel). We applied this quantitative analysis to a multi-class segmentation of the carotid artery lumen and vessel wall, on a multi-center, multi-scanner, multi-sequence dataset of magnetic resonance (MR) images. We validated our analysis over 144 sets of hyperparameters of a model. Our main analysis considers the relationship between the order of the voxels sorted according to their epistemic uncertainty values and the misclassification of the prediction. Under this consideration, the comparison of combined uncertainty maps reveals that the multi-class entropy and the multi-class mutual information statistically outperform the other combined uncertainty maps under study. In a class-specific scenario, the one-versus-all entropy statistically outperforms the class-wise entropy, the class-wise variance and the one-versus-all mutual information. The class-wise entropy statistically outperforms the other class-specific uncertainty maps in terms of calibration. We made a Python package available to reproduce our analysis on different data and tasks.
KW - cs.CV
KW - eess.IV
KW - I.4.6
UR - https://www.melba-journal.org/papers/2021:013.html
M3 - Journal article
VL - 2021
SP - 1
EP - 39
JO - The Journal of Machine Learning for Biomedical Imaging
JF - The Journal of Machine Learning for Biomedical Imaging
SN - 2766-905X
M1 - 013
ER -
ID: 314634566
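
The abstract describes sampling multiple outputs from an MC dropout network and reducing them to a combined uncertainty value per voxel. As a rough illustration only, the sketch below computes two such quantities for a single voxel using the textbook definitions: the entropy of the mean prediction, and the mutual information (entropy of the mean minus the mean of the per-sample entropies). The function names and the sample values are hypothetical, and the paper's own definitions and Python package may differ in detail.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def combined_uncertainty(samples):
    """Combined uncertainty for one voxel (hypothetical helper).

    samples: list of T softmax vectors, one per MC dropout forward pass.
    Returns (multi_class_entropy, multi_class_mutual_information).
    """
    n_samples = len(samples)
    n_classes = len(samples[0])
    # Average the class probabilities over the T stochastic passes.
    mean_probs = [
        sum(s[c] for s in samples) / n_samples for c in range(n_classes)
    ]
    predictive_entropy = entropy(mean_probs)
    # Average entropy of the individual passes.
    expected_entropy = sum(entropy(s) for s in samples) / n_samples
    return predictive_entropy, predictive_entropy - expected_entropy


# Made-up softmax outputs for three classes and four passes.
agreeing = [[0.9, 0.05, 0.05]] * 4
disagreeing = [
    [0.9, 0.05, 0.05],
    [0.05, 0.9, 0.05],
    [0.9, 0.05, 0.05],
    [0.05, 0.9, 0.05],
]

h_agree, mi_agree = combined_uncertainty(agreeing)
h_disagree, mi_disagree = combined_uncertainty(disagreeing)
```

When all passes agree, the mutual information is close to zero even though the entropy is not; when the passes disagree, both values rise. Applying the same computation to every voxel would yield a combined uncertainty map (one value per image and voxel).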