Grad-CAM Tumor Classification
An applied computer vision study using CNNs and ResNet models to classify brain tumors from CT and MRI scans, with Grad-CAM used to evaluate model interpretability and clinical trustworthiness.
What it is
We built a dual-modality pipeline (CT and MRI) that trains baseline CNN and ResNet classifiers, then generates Grad-CAM heatmaps to visualize which image regions drive each prediction. Datasets were cleaned, normalized per modality, and split with stratified sampling. The baseline CNNs set a reliable reference for classification performance, and transfer-learned ResNet models pushed accuracy and F1 to near-perfect on MRI. Grad-CAM overlays then tested whether high-confidence predictions aligned with plausible tumor regions. They showed stronger spatial localization on MRI and exposed CT failure modes in which attention drifted toward non-lesion areas or boundaries.
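For readers unfamiliar with Grad-CAM, the core computation is small enough to sketch. The version below is illustrative only and is not the project's code: it assumes the activations of one convolutional layer and the gradients of the class score with respect to them have already been extracted (for example with framework hooks), and the function name `grad_cam` is hypothetical.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM map for one image from a single conv layer.

    activations, gradients: arrays of shape (K, H, W), where gradients holds
    d(class score) / d(activations) as computed by the framework's autograd.
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))  # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep only positive evidence.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)  # (H, W)
    # Scale to [0, 1] for overlaying; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting map has the spatial size of the chosen layer, so it is normally upsampled to the input resolution before being overlaid on the scan.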
My Role
I implemented the MRI CNN baseline end-to-end: modality-specific preprocessing (resize to 224×224, per-dataset normalization, channel stacking), light augmentation (flip/rotate), class-weighted training, and stratified k-fold cross-validation to check generalization. The baseline achieved high accuracy and balanced precision/recall across folds, establishing a credible reference for the ResNet improvements and the subsequent Grad-CAM analysis. I also contributed to the report: curating results tables and plots, documenting metrics, and helping synthesize findings across models and modalities.
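The two sampling ideas in that baseline, stratified k-fold splitting and class-weighted training, can be sketched without any ML framework. This is a minimal illustration under assumptions: the function names are hypothetical, and the inverse-frequency weighting (the same formula scikit-learn calls "balanced") is one common choice rather than necessarily the scheme used in the project.

```python
import random
from collections import Counter, defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs that roughly preserve class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the k folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, val

def inverse_frequency_weights(labels):
    """Class weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, c = len(labels), len(counts)
    return {cls: n / (c * cnt) for cls, cnt in counts.items()}
```

Rarer classes receive larger weights, so misclassifying them costs more in the loss. In a leakage-free setup the weights are recomputed from each fold's training labels only.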
Interesting Constraints
- Modality differences: MRI and CT have distinct resolution, contrast, and texture distributions; preprocessing and normalization must be per-modality to avoid spurious signals.
- Class balance and sampling: moderate imbalance (especially in MRI) requires stratified splits and class-aware training to keep recall strong and reduce false negatives.
- CT artifacts and text: CT variability and corner annotations can bias attention. Filters were applied to reduce text visibility, but attention drift still appears in some models.
- Interpretability gap: strong quantitative metrics do not guarantee clinically meaningful attention. Grad-CAM heatmaps should localize plausible regions, and that localization should hold when model confidence is high.
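The per-modality normalization constraint can be illustrated with a short sketch. It assumes images arrive as NumPy arrays grouped by a modality key; the names `fit_modality_stats` and `normalize` and the small `eps` guard are hypothetical, not taken from the project.

```python
import numpy as np

def fit_modality_stats(images_by_modality):
    """Compute one (mean, std) pair per modality, from training images only."""
    stats = {}
    for modality, images in images_by_modality.items():
        pixels = np.concatenate(
            [np.asarray(img, dtype=np.float64).ravel() for img in images]
        )
        stats[modality] = (pixels.mean(), pixels.std())
    return stats

def normalize(image, modality, stats, eps=1e-8):
    """Standardize an image with the statistics of its own modality."""
    mean, std = stats[modality]
    return (np.asarray(image, dtype=np.float64) - mean) / (std + eps)
```

Keeping separate statistics means CT intensities never shift the MRI scale or vice versa, and fitting them on training data only avoids leaking validation information into preprocessing.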
What I Learned
- Data quality drives outcomes: per‑modality normalization, label hygiene, and stratified splits matter as much as architecture.
- Baselines before complexity: a well-regularized CNN baseline gives a trustworthy reference point and reveals where transfer learning adds value.
- Evaluate what you ship: pair accuracy/F1 with Grad-CAM alignment, because high confidence without localized attention is a warning sign in medical contexts.
- Cross‑validation builds trust: stratified k‑fold with consistent preprocessing makes results reproducible and interpretations defensible.
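One way to make "confidence without localized attention" checkable is to measure how much of a heatmap falls inside a region of interest. The sketch below is an assumption-heavy illustration: it presumes a binary region mask is available for the image (the write-up does not say one was used), and the function names and both thresholds are hypothetical.

```python
import numpy as np

def attention_in_region(cam, region_mask):
    """Fraction of total heatmap mass that falls inside a binary region mask."""
    cam = np.asarray(cam, dtype=np.float64)
    mask = np.asarray(region_mask, dtype=bool)
    total = cam.sum()
    return float(cam[mask].sum() / total) if total > 0 else 0.0

def flag_for_review(confidence, cam, region_mask, conf_min=0.9, overlap_min=0.5):
    """True when the model is confident but its attention mostly misses the region."""
    return confidence >= conf_min and attention_in_region(cam, region_mask) < overlap_min
```

Cases flagged this way are the ones worth inspecting by eye, since a correct label backed by attention on corner text or boundaries is not evidence the model learned the lesion.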