Research Project
Multimodal skin cancer prediction using structured clinical data and images
Overview
Developed a diagnostic pipeline for the Kaggle ISIC 2024 Challenge to detect malignant melanoma from a large-scale multimodal dataset of 400,000+ samples, integrating imaging data with structured electronic health record data.
- 1. Image feature extraction: Used EfficientNet-B0 with Generalized Mean (GeM) pooling to capture local visual features. Applied Focal Loss to handle class imbalance by emphasizing rare malignant cases and reducing majority-class bias.
- 2. Sampling and validation: Implemented a 1:20 downsampling strategy to reduce class imbalance, combined with StratifiedGroupKFold cross-validation to improve model robustness and prevent leakage of the same patient's images across folds.
- 3. Multimodal ensemble: Engineered 40+ clinical metadata features and built a LightGBM-based stacking ensemble to integrate image and tabular data. This multimodal framework improved the pAUC from 0.148 for the image-only baseline to 0.175.
- 4. Clinical interpretability: Integrated Grad-CAM to generate visual heatmaps of lesion regions, making the model's predictions easier to inspect.
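The two image-model components named in step 1 can be illustrated with a minimal, dependency-free sketch. This is not the project's training code: the function names are hypothetical, and the values gamma=2.0, alpha=0.25, and p=3.0 are commonly used defaults assumed here, not settings reported above. A real pipeline would apply the same formulas to tensors in a deep learning framework.

```python
import math


def focal_loss(prob, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample (hypothetical helper).

    prob  : predicted probability of the positive (malignant) class
    label : 1 for malignant, 0 for benign
    gamma, alpha : assumed defaults, not values taken from the project
    """
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified samples,
    # so hard (often minority-class) samples dominate the gradient.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized Mean pooling over one channel's activations.

    p = 1 reduces to average pooling; large p approaches max pooling.
    """
    clamped = [max(v, eps) for v in values]
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)
```

With gamma=0 and alpha=0.5 the focal loss reduces to half the ordinary cross-entropy, which is a convenient sanity check when porting the formula to a framework.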
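To put the pAUC figures in step 3 in context, the sketch below computes a partial AUC restricted to the region above a minimum true positive rate. It assumes the challenge metric is the area under the ROC curve above 80% TPR, which would cap the score at 0.2; that threshold and the function names are assumptions for illustration, not details stated above.

```python
def roc_points(labels, scores):
    """Return ROC curve points (fpr, tpr), treating tied scores as one step."""
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every sample that shares the current score.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def pauc_above_tpr(labels, scores, min_tpr=0.8):
    """Area between the ROC curve and the line tpr = min_tpr (assumed metric)."""
    pts = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if y1 <= min_tpr:
            continue  # segment lies at or below the threshold line
        if y0 < min_tpr:
            # Clip the segment at the point where it crosses min_tpr.
            t = (min_tpr - y0) / (y1 - y0)
            x0 = x0 + t * (x1 - x0)
            y0 = min_tpr
        # Trapezoid area measured from the threshold line upward.
        area += (x1 - x0) * ((y0 - min_tpr) + (y1 - min_tpr)) / 2.0
    return area
```

Under this assumed definition, a perfect ranking scores 0.2 and an uninformative one scores 0.02, so values such as 0.148 and 0.175 sit on a much narrower scale than a conventional AUC.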
