Paper Title: Transformer based multiple instance learning for weakly supervised histopathology image segmentation
Official Code: https://github.com/Nexuslkl/Swin_MIL
The Swin Transformer is incorporated into the MIL framework to encode long-range relationships between instances within a bag. The method employs deep supervision: additional "companion" objective functions are introduced at different hidden layers (here: after each transformer stage), and the final loss is computed as the output loss plus the companion losses. A decoder produces pixel-wise predictions from the feature maps after each transformer stage, and a fusion layer combines these side outputs of different scales into the final segmentation map.
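The side-output fusion and deep-supervision objective described above can be sketched as follows. This is a minimal NumPy sketch, not the paper's implementation: the fusion weights and the companion-loss weighting are hypothetical, and the side-output maps are assumed to already be upsampled to a common resolution.

```python
import numpy as np

def fuse_side_outputs(side_maps, weights):
    """Weighted sum of per-stage side-output probability maps.

    side_maps: list of (H, W) arrays, one per transformer stage,
    assumed already upsampled to the same resolution.
    """
    assert len(side_maps) == len(weights)
    fused = np.zeros_like(side_maps[0])
    for m, w in zip(side_maps, weights):
        fused += w * m
    return fused

def total_loss(side_losses, fusion_loss, companion_weight=1.0):
    """Deep-supervision objective: output (fusion) loss plus the
    companion losses computed at each side output.
    companion_weight is a hypothetical balancing factor."""
    return fusion_loss + companion_weight * sum(side_losses)
```

At inference, only the fused map is needed; the companion losses exist solely to give each stage a direct training signal.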
It tackles the problem of WSI (whole-slide image) patch segmentation using only binary image-level labels of the WSIs.
It is the first method to perform weakly-supervised segmentation with a combination of Transformer and MIL, which enables features that encode long-distance relationships between instances; in standard MIL, the instances of a bag are treated as independent of each other.
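Under image-level supervision, the per-pixel (instance) predictions must be aggregated into a single image-level (bag) probability so that a classification loss can be applied. A common choice in MIL segmentation is a generalized mean (smooth approximation of max-pooling); the sketch below illustrates the idea, with the exponent `r` being a hypothetical smoothing parameter, not a value from the paper.

```python
import numpy as np

def generalized_mean_pool(pixel_probs, r=4.0):
    """Aggregate per-pixel probabilities into one bag-level score.

    As r -> infinity this approaches max-pooling (the standard MIL
    assumption: a bag is positive if any instance is positive);
    smaller r gives smoother gradients to more pixels.
    """
    p = np.asarray(pixel_probs, dtype=float)
    return float(np.mean(p ** r) ** (1.0 / r))
```

The bag score can then be compared against the binary image-level label with an ordinary cross-entropy loss.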
- Image type: H&E stained, colon cancer (private)
- Image number: 910 (330 CA, 580 NC)
- Train/Test: 750 (250 CA, 500 NC) / 160 (80 CA, 80 NC)
- Image size: 3000 x 3000, but downsampled to 256 x 256 for training
- Resolution: 0.226 microns/pixel at 40x magnification
- Hardware: Multiple RTX 3090 with 24GB memory
- Initialization: Pretrained on ImageNet, Xavier for side-output layers
- Optimizer: Adam with 1e-6 learning rate, 1e-9 learning rate for side outputs
- Batch size: 4 per GPU
- F1-Score
- Hausdorff Distance
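The two evaluation metrics can be sketched for binary masks as follows. This is a minimal NumPy sketch of the standard definitions; the paper's exact evaluation code may differ (e.g. in handling empty masks).

```python
import numpy as np

def f1_score(pred, gt):
    """Pixel-wise F1 (equivalent to the Dice coefficient) between binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * tp / denom if denom else 1.0

def hausdorff_distance(pred, gt):
    """Symmetric Hausdorff distance between the foreground pixel sets."""
    a = np.argwhere(pred)[:, None, :]   # (Na, 1, 2) foreground coords
    b = np.argwhere(gt)[None, :, :]     # (1, Nb, 2) foreground coords
    d = np.sqrt(((a - b) ** 2).sum(-1)) # (Na, Nb) pairwise distances
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

F1 measures region overlap, while the Hausdorff distance penalizes the worst boundary deviation, so the two are complementary.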