U-Net for Image Segmentation (Kvasir-SEG)
This model is a custom U-Net implementation designed for medical image segmentation, specifically targeting gastrointestinal polyp identification. It was developed as part of the Complements of Machine Learning course (Weekly Lab 4).
The model follows a symmetric encoder-decoder architecture with skip connections, inspired by Ronneberger et al. (2015).
- Encoder: Four downsampling stages that progressively reduce spatial resolution and lead into a 1024-channel bottleneck.
- Decoder: Symmetric upsampling stages using learnable transposed convolutions to recover original image dimensions.
- Enhancements: Batch Normalization for training stability, and same-padding convolutions so that feature maps keep their spatial shape and skip connections can be concatenated without cropping.
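The architecture described above can be sketched in PyTorch roughly as follows. This is a minimal illustration and not the released implementation: the class names `DoubleConv` and `UNet`, the `base` width of 64 (doubling per stage up to the 1024-channel bottleneck), the 3-channel input, the single-channel logit output, and the use of max pooling for downsampling are all assumptions.

```python
import torch
import torch.nn as nn


class DoubleConv(nn.Module):
    """Two 3x3 convolutions, each followed by BatchNorm and ReLU (assumed block layout)."""

    def __init__(self, in_ch: int, out_ch: int) -> None:
        super().__init__()
        self.block = nn.Sequential(
            # A 3x3 kernel with padding=1 is "same" padding: H and W are unchanged.
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)


class UNet(nn.Module):
    """Symmetric encoder-decoder with skip connections (hypothetical sketch)."""

    def __init__(self, in_channels: int = 3, out_channels: int = 1, base: int = 64) -> None:
        super().__init__()
        widths = [base, base * 2, base * 4, base * 8]  # 64, 128, 256, 512 for base=64
        self.pool = nn.MaxPool2d(2)  # halves H and W after each encoder stage

        self.encoders = nn.ModuleList()
        ch = in_channels
        for w in widths:
            self.encoders.append(DoubleConv(ch, w))
            ch = w

        self.bottleneck = DoubleConv(ch, ch * 2)  # 512 -> 1024 for base=64
        ch = ch * 2

        self.upconvs = nn.ModuleList()
        self.decoders = nn.ModuleList()
        for w in reversed(widths):
            # Learnable 2x upsampling via transposed convolution
            self.upconvs.append(nn.ConvTranspose2d(ch, w, kernel_size=2, stride=2))
            # Input is the upsampled map concatenated with the matching skip (w + w channels)
            self.decoders.append(DoubleConv(w * 2, w))
            ch = w

        self.head = nn.Conv2d(ch, out_channels, kernel_size=1)  # per-pixel logits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)  # saved before pooling for the skip connection
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.upconvs, self.decoders, reversed(skips)):
            x = up(x)
            x = dec(torch.cat([skip, x], dim=1))
        return self.head(x)
```

With these assumptions, `UNet()(torch.randn(8, 3, 256, 256))` returns raw logits of shape `(8, 1, 256, 256)`. Input height and width need to be divisible by 16 so that the four pooling and four upsampling steps line up exactly.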
Training Procedure
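BCEWithLogitsLoss was chosen to optimize pixel-level binary classification: it combines the sigmoid and binary cross-entropy in a single, numerically stable operation on the raw logits. Its `pos_weight` argument can additionally up-weight polyp pixels to counter the class imbalance inherent in medical imaging.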
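As a small illustration of how this loss behaves on imbalanced masks, the sketch below compares the plain loss with a `pos_weight`-weighted variant on a toy 4x4 mask. The toy tensors and the background-to-foreground weighting ratio are assumptions made for the example; the model card does not state whether `pos_weight` was set during training.

```python
import math

import torch
import torch.nn as nn

# Toy batch: one 4x4 mask in which 2 of 16 pixels are foreground (polyp)
target = torch.zeros(1, 1, 4, 4)
target[0, 0, 0, :2] = 1.0

# Raw, pre-sigmoid outputs; all zeros means the model predicts p=0.5 everywhere
logits = torch.zeros(1, 1, 4, 4)

# Plain loss: every pixel contributes equally
plain = nn.BCEWithLogitsLoss()

# Hypothetical weighting: background/foreground pixel ratio (14 / 2 = 7)
n_pos = target.sum()
n_neg = target.numel() - n_pos
pos_weight = (n_neg / n_pos).reshape(1)
weighted = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

loss_plain = plain(logits, target).item()
loss_weighted = weighted(logits, target).item()

# At p=0.5 each pixel costs ln(2); weighting multiplies the foreground pixels by 7
print(f"plain:    {loss_plain:.4f}  (ln 2 = {math.log(2):.4f})")
print(f"weighted: {loss_weighted:.4f}")

# At inference time, logits are turned into a binary mask with a sigmoid and a threshold
pred_mask = torch.sigmoid(logits) > 0.5
```

Because the loss applies the sigmoid internally, the network should output raw logits during training; the sigmoid is only applied explicitly when producing the final mask.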
Hyperparameters:
- Epochs: 20
- Learning Rate: 1e-4
- Batch Size: 8
- Resolution: 256x256
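To make the batch size and resolution concrete, the following plain-Python sketch traces the feature-map shapes through the encoder. It assumes that each of the four downsampling stages halves height and width and doubles the channel count starting from 64, which is consistent with the 1024-channel bottleneck but not confirmed in detail by this card; `trace_unet_shapes` is a hypothetical helper.

```python
def trace_unet_shapes(batch=8, size=256, base=64, stages=4):
    """Return (name, (N, C, H, W)) for each encoder stage and the bottleneck.

    Assumes each downsampling step halves H and W and doubles the channels.
    """
    shapes = []
    channels, side = base, size
    for i in range(stages):
        shapes.append((f"enc{i + 1}", (batch, channels, side, side)))
        channels, side = channels * 2, side // 2
    shapes.append(("bottleneck", (batch, channels, side, side)))
    return shapes


for name, shape in trace_unet_shapes():
    print(name, shape)
# enc1 (8, 64, 256, 256)
# enc2 (8, 128, 128, 128)
# enc3 (8, 256, 64, 64)
# enc4 (8, 512, 32, 32)
# bottleneck (8, 1024, 16, 16)
```

Under these assumptions the bottleneck operates on 16x16 feature maps, so the decoder needs four 2x upsampling steps to return to 256x256.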
The model was implemented in PyTorch and trained with the Hugging Face Trainer API on a T4 GPU in Google Colab.