NeuroMed-Cardio-0.8B
A multi-component AI system for cardiomegaly detection from chest X-ray images. The system combines segmentation, object detection, radiomics feature extraction, and an artificial neural network to produce a single final prediction.
Clinical Context: A cardiothoracic ratio (CTR) ≥ 0.5 on a chest X-ray is the standard clinical threshold for cardiomegaly — an early indicator of heart failure.
System Architecture
The pipeline runs through two parallel branches. All outputs are fused inside a final ANN for the binary cardiomegaly prediction.
                Chest X-Ray
                     │
                     ▼
              [Preprocessing]
      Gamma Correction → Gaussian Blur → CLAHE
                     │
         ├──────────────────────────────────┐
         │                                  │
         ▼                                  ▼
    [Segmentation Branch]            [YOLOv11 Branch]
    U-Net++ (Heart)                  Heart + Lung
    DeepLabV3+ (Lungs)               Bounding Box Detection
         │                                  │
         ├── Segmentation CTR            YOLO CTR
         ├── Heart/Lung Area Ratio          │
         ├── Heart Left–Right Width         │
         └── Radiomics (55 features)        │
         │                                  │
         └──────────┬───────────────────────┘
                    ▼
              Ensemble CTR
      (YOLO 95% + Segmentation 5%)
                    │
                    ▼
           [ANN: 63 Features]
         512 → 256 → 128 neurons
                    │
                    ▼
         Cardiomegaly Prediction
Preprocessing
Raw X-ray images vary significantly across different scanner devices — contrast imbalances and sensor noise directly affect model performance. A three-stage preprocessing pipeline is applied before any model sees the image:
| Technique | What It Does |
|---|---|
| Gamma Correction | Reveals cardiac tissue and vascular boundary details hidden in dark regions |
| Gaussian Blur | Suppresses pixel-level sensor noise and smooths edges for cleaner detection |
| CLAHE | Locally enhances contrast to bring out fine anatomical structures |
All images are resized to 256×256 pixels.
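As a rough illustration of the first stage only, gamma correction on 8-bit pixels can be written as a 256-entry lookup table. The sketch below is pure Python. The gamma value of 0.6 is an assumed placeholder, since the value actually used is not stated here, and the Gaussian blur and CLAHE stages (which would normally come from an image-processing library) are left out.

```python
def gamma_lut(gamma: float) -> list[int]:
    """Map each 8-bit value v to round(255 * (v / 255) ** gamma)."""
    return [round(255 * (v / 255) ** gamma) for v in range(256)]


def apply_gamma(image: list[list[int]], gamma: float) -> list[list[int]]:
    """Apply the lookup table to every pixel of a 2-D grayscale image."""
    lut = gamma_lut(gamma)
    return [[lut[pixel] for pixel in row] for row in image]


# gamma < 1 brightens dark regions; 0.6 is an assumed value for illustration.
dark_patch = [[0, 16, 32], [64, 128, 255]]
brightened = apply_gamma(dark_patch, gamma=0.6)
print(brightened)
```

Black and white stay fixed at 0 and 255, while mid-range dark values are lifted, which is the effect the table above attributes to gamma correction.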
Segmentation Models
U-Net++ — Heart Segmentation
Predicts the heart mask at the pixel level.
- Architecture: U-Net++ with nested skip connections (detail-preserving)
- Parameters: ~8 million
- Performance: 89% IoU score
- Regularization: BatchNormalization + Dropout (prevents overfitting)
- Output: Binary mask isolating the heart region
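For readers unfamiliar with the IoU metric quoted above, a minimal sketch of how it is typically computed for two binary masks follows. The function name and the toy masks are illustrative assumptions, not part of the project's code.

```python
def iou(pred: list[list[int]], target: list[list[int]]) -> float:
    """Intersection over union of two same-sized binary masks."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


predicted_mask = [[0, 1, 1],
                  [0, 1, 1]]
reference_mask = [[0, 0, 1],
                  [0, 1, 1]]
score = iou(predicted_mask, reference_mask)  # 3 shared pixels / 4 pixels in the union
print(score)
```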
DeepLabV3+ — Lung Segmentation
Produces separate masks for the left and right lungs.
- Backbone: ResNet101
- Key Component: ASPP (Atrous Spatial Pyramid Pooling) — captures multi-scale contextual features simultaneously
- Mechanism: Atrous (dilated) convolution captures wide receptive fields without losing fine boundary detail
- Output: Separate masks for left lung and right lung
YOLOv11 Branch — Object Detection
Runs in parallel with segmentation and produces an independent CTR estimate.
- Model: YOLOv11-large
- Detected Structures: Heart, Left Lung, Right Lung
- Keypoint Detection: Lower corners of lungs and heart — used to measure horizontal extents
- Data Augmentation: Mosaic, Mixup, Horizontal/Vertical Flip
- Training: 50 epochs, batch size 64
- CTR Calculation: Heart width and lung width are derived from detected bounding box coordinates
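The box-based CTR can be sketched as follows. This assumes boxes in `(x_min, y_min, x_max, y_max)` pixel coordinates and assumes the lung width is taken as the span from the outermost edge of one lung box to the outermost edge of the other. The exact rule used by the project is not spelled out here, so treat `ctr_from_boxes` and the sample coordinates as hypothetical.

```python
Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def ctr_from_boxes(heart: Box, left_lung: Box, right_lung: Box) -> float:
    """Heart width divided by the horizontal span covered by both lung boxes."""
    heart_width = heart[2] - heart[0]
    thorax_width = max(left_lung[2], right_lung[2]) - min(left_lung[0], right_lung[0])
    if thorax_width <= 0:
        raise ValueError("lung boxes have no horizontal extent")
    return heart_width / thorax_width


# Hypothetical detections on a 512-pixel-wide image.
heart_box = (150.0, 200.0, 350.0, 340.0)
left_lung_box = (60.0, 80.0, 240.0, 380.0)
right_lung_box = (260.0, 80.0, 440.0, 380.0)
yolo_ctr = ctr_from_boxes(heart_box, left_lung_box, right_lung_box)
print(round(yolo_ctr, 3))
```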
CTR Calculation
Segmentation-Based CTR
Canny edge detection is applied to the heart and lung masks. Horizontal extents are measured from the detected boundaries:
CTR = Heart Width / Lung Width
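A simplified sketch of the width measurement is shown below. It skips the Canny step and reads the leftmost and rightmost foreground columns straight from each binary mask. That is an assumption made to keep the example dependency-free, not a description of the project's implementation. The lung mask is assumed to be the union of the left and right lung masks.

```python
def horizontal_extent(mask: list[list[int]]) -> tuple[int, int]:
    """Return (leftmost, rightmost) column index containing foreground pixels."""
    columns = [x for row in mask for x, value in enumerate(row) if value]
    if not columns:
        raise ValueError("mask is empty")
    return min(columns), max(columns)


def ctr_from_masks(heart_mask: list[list[int]], lung_mask: list[list[int]]) -> float:
    """CTR = heart width / lung width, both measured in pixels."""
    heart_left, heart_right = horizontal_extent(heart_mask)
    lung_left, lung_right = horizontal_extent(lung_mask)
    return (heart_right - heart_left + 1) / (lung_right - lung_left + 1)


# Toy 3x10 masks: the heart spans columns 3..6, the lungs span columns 0..9.
heart = [[0, 0, 0, 0, 1, 1, 0, 0, 0, 0],
         [0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
         [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]]
lungs = [[0, 1, 1, 0, 0, 0, 0, 1, 1, 0],
         [1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
         [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]]
seg_ctr = ctr_from_masks(heart, lungs)
print(seg_ctr)
```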
Ensemble CTR
The two branches are combined with a weighted average:
Ensemble CTR = (YOLO CTR × 0.95) + (Segmentation CTR × 0.05)
YOLO contributes high overall accuracy; the segmentation branch adds fine anatomical detail as a complementary correction signal.
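As a worked example of the weighted average, the sketch below combines two hypothetical branch outputs and applies the CTR ≥ 0.5 threshold mentioned in the clinical context. The function name and the input values are illustrative assumptions.

```python
def ensemble_ctr(yolo_ctr: float, seg_ctr: float, yolo_weight: float = 0.95) -> float:
    """Weighted average of the two CTR estimates (YOLO 95%, segmentation 5%)."""
    return yolo_weight * yolo_ctr + (1.0 - yolo_weight) * seg_ctr


# Hypothetical branch outputs for a single image.
combined = ensemble_ctr(yolo_ctr=0.52, seg_ctr=0.48)
exceeds_threshold = combined >= 0.5
print(round(combined, 3), exceeds_threshold)
```

Because the segmentation weight is only 0.05, a disagreement of 0.04 between the branches shifts the result by just 0.002 relative to the YOLO estimate.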
Ensemble performance improvement:
| Model | MAPE | SMAPE | MAE | RMSE |
|---|---|---|---|---|
| YOLO only | 5.58% | 5.32% | 0.02 | 0.04 |
| Segmentation only | 6.03% | 5.64% | 0.03 | 0.05 |
| Ensemble | 5.02% | 4.87% | 0.02 | 0.03 |
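The four error metrics in the table can be computed as in the sketch below. It uses the common definitions, with SMAPE taken as the mean of `2|y - ŷ| / (|y| + |ŷ|)`. Whether the project uses exactly this SMAPE variant is an assumption, and the sample CTR values are made up for illustration.

```python
import math


def ctr_errors(y_true: list[float], y_pred: list[float]) -> dict[str, float]:
    """Return MAPE and SMAPE (in percent) plus MAE and RMSE (in CTR units)."""
    n = len(y_true)
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    mape = 100 * sum(e / abs(t) for e, t in zip(abs_err, y_true)) / n
    smape = 100 * sum(
        2 * e / (abs(t) + abs(p)) for e, t, p in zip(abs_err, y_true, y_pred)
    ) / n
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    return {"MAPE": mape, "SMAPE": smape, "MAE": mae, "RMSE": rmse}


# Made-up reference and predicted CTR values.
errors = ctr_errors(y_true=[0.50, 0.40], y_pred=[0.55, 0.38])
print({name: round(value, 4) for name, value in errors.items()})
```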
Feature Extraction
The 63 features fed into the ANN come from four sources:
1. Ensemble CTR
The weighted CTR value computed above. The primary clinical measurement.
2. Heart/Lung Area Ratio
Computed as: Heart Pixel Area / Lung Pixel Area
Complements CTR by providing an area-based measurement rather than a width-based one. Particularly useful in cases of chest deformity or lung disease where width measurements alone can be misleading.
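A minimal sketch of the area ratio, assuming both masks are binary and the lung mask already combines the left and right lungs (the helper names are hypothetical):

```python
def pixel_area(mask: list[list[int]]) -> int:
    """Count the foreground pixels of a binary mask."""
    return sum(1 for row in mask for value in row if value)


def heart_lung_area_ratio(heart_mask: list[list[int]], lung_mask: list[list[int]]) -> float:
    """Heart pixel area divided by lung pixel area."""
    lung_area = pixel_area(lung_mask)
    if lung_area == 0:
        raise ValueError("lung mask is empty")
    return pixel_area(heart_mask) / lung_area


# Toy masks: 6 heart pixels against 12 lung pixels.
toy_heart = [[0, 1, 1, 1, 0, 0],
             [0, 1, 1, 1, 0, 0]]
toy_lungs = [[1, 1, 1, 1, 1, 1],
             [1, 1, 1, 1, 1, 1]]
area_ratio = heart_lung_area_ratio(toy_heart, toy_lungs)
print(area_ratio)
```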
3. Heart Left–Right Width
Measures how much the heart extends to the left and right of the cardiac midline.
- The esophageal axis is estimated by predicting its upper and lower endpoints
- The angle between these points is computed via `arctan2` and extended into a full axis line
- The segmented heart is split along this line
- Left-side and right-side widths are measured separately
This determines the direction of cardiac enlargement, providing clinically meaningful spatial context beyond a single ratio.
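The steps above can be sketched roughly as follows. The sketch assumes image coordinates with y increasing downward, an axis given by a top and a bottom endpoint, and heart pixels supplied as `(x, y)` pairs. Each side's width is taken as the largest perpendicular distance from the axis, which is one plausible reading of the measurement rather than the confirmed method.

```python
import math

Point = tuple[float, float]


def split_widths(heart_pixels: list[Point], top: Point, bottom: Point) -> tuple[float, float]:
    """Return (image_left_width, image_right_width) of the heart about the axis."""
    (x1, y1), (x2, y2) = top, bottom
    theta = math.atan2(y2 - y1, x2 - x1)  # axis direction from top to bottom
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    left = right = 0.0
    for x, y in heart_pixels:
        # Signed perpendicular distance; positive means image-right of a downward axis.
        distance = (x - x1) * sin_t - (y - y1) * cos_t
        if distance >= 0:
            right = max(right, distance)
        else:
            left = max(left, -distance)
    return left, right


# Vertical axis at x = 5; the heart reaches 3 px to the left and 4 px to the right.
pixels = [(2, 4), (3, 5), (5, 5), (8, 5), (9, 6)]
left_width, right_width = split_widths(pixels, top=(5, 0), bottom=(5, 10))
print(round(left_width, 3), round(right_width, 3))
```

Note that image-left and image-right are not the same as the patient's left and right; the mapping depends on the projection of the radiograph.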
4. Radiomics — 55 Features
Extracted from the heart segmentation mask using PyRadiomics.
- 186 candidate features are initially extracted covering intensity distribution, tissue homogeneity, shape, and texture
- 55 clinically significant and statistically contributive features are selected for the final model
- These features allow the model to reason about the statistical character of the image, not just its visual appearance
CNN — Parallel Image Classifier
A custom Residual Attention CNN runs alongside the feature extraction pipeline and provides an independent classification signal.
- Input: 256×256 grayscale chest X-ray
- Parameters: ~7.5 million
- Loss Function: Binary Crossentropy
- Optimizer: Adam, 100 epochs
| Component | Purpose |
|---|---|
| Residual (skip) blocks | Prevents information loss in deeper layers |
| Attention mechanism | Focuses the model on the cardiac and pulmonary region |
| Dropout + Batch Normalization | Prevents overfitting |
| Latent Supervision | Trains on both intermediate and final outputs for balanced learning |
- External Test Performance: 80.8% accuracy, 81.8% F1 score
ANN — Final Fusion Model
All features (63-dimensional vector) are fused in a single fully-connected ANN that produces the final prediction.
- Architecture: 512 → 256 → 128 neurons (fully connected)
- Input: CNN output + Ensemble CTR + Heart/Lung Area Ratio + Heart Left-Right Width + 55 Radiomics features
- Regularization: BatchNormalization + Dropout
- Output: Cardiomegaly present / not present (binary)
- External Test Performance: 84.0% accuracy, 85.3% F1 score
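To give a sense of the fusion network's size, the sketch below counts the weights and biases of the fully connected layers for a 63-input network with hidden layers of 512, 256 and 128 units. A single output unit is assumed for the binary decision, and BatchNormalization parameters are not counted. Both points are assumptions, since the exact layer configuration is not listed here.

```python
def dense_param_count(layer_sizes: list[int]) -> int:
    """Sum of weights and biases over consecutive fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# 63 input features -> 512 -> 256 -> 128 -> 1 output unit (output size is assumed).
total_params = dense_param_count([63, 512, 256, 128, 1])
print(total_params)
```

Under these assumptions the dense layers hold roughly 0.2 million parameters, far fewer than the segmentation and CNN models described above.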
Synthetic Data Augmentation with GANs
To address low diversity in cardiomegaly cases, four GAN architectures were trained and evaluated using FID scores (lower = better):
| Model | FID Score |
|---|---|
| CGAN | 62 |
| DCGAN | 72 |
| IAGAN | 40 |
| StyleGAN2-ADA | 25 ✓ |
StyleGAN2-ADA's adaptive discriminator augmentation (ADA) mechanism enables stable training even with limited data. It achieved the lowest FID score and produced the most realistic synthetic X-ray images.
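For reference, the standard definition of the Fréchet Inception Distance compares Gaussian fits to feature embeddings of real and generated images. Here $\mu_r, \Sigma_r$ denote the mean and covariance of the real-image features and $\mu_g, \Sigma_g$ those of the generated images. The symbols are introduced for illustration and do not appear elsewhere in this document.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
$$

The distance is zero when the two feature distributions have the same mean and covariance, which is why lower scores indicate more realistic samples.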
Explainable AI (XAI)
The system does not just produce a prediction; it also explains why.
- Grad-CAM: Generates heatmaps showing which image regions the CNN focused on. Confirms the model attends to the cardiac and pulmonary area rather than irrelevant regions.
- SHAP Analysis: Quantifies each feature's individual contribution to the final prediction. CNN output is the highest contributor, followed by YOLO CTR and area ratio.
Training Data
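A total of 80,572 images from six datasets were assembled into a balanced set (40,286 cardiomegaly / 40,286 non-cardiomegaly). Five datasets (77,509 images) are used for training, and NIH ChestX-Ray14 (3,063 images) is held out as the external test set.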
| Dataset | Institution | Images | Split |
|---|---|---|---|
| MIMIC-CXR | MIT, USA | 51,693 | Training |
| PadChest | Hospital San Juan, Spain | 12,600 | Training |
| CheXpert | Stanford University, USA | 6,046 | Training |
| VinDr-CXR | VinBigData, Vietnam | 4,598 | Training |
| BRAX | Hospital Albert Einstein, Brazil | 2,572 | Hard Training |
| NIH ChestX-Ray14 | NIH Clinical Center, USA | 3,063 | External Test |
Performance Summary
| Model / Component | Metric | Value |
|---|---|---|
| U-Net++ (Heart Segmentation) | IoU | 89% |
| YOLOv11 CTR | MAPE / SMAPE | 5.58% / 5.32% |
| Segmentation CTR | MAPE / SMAPE | 6.03% / 5.64% |
| Ensemble CTR | MAPE / SMAPE | 5.02% / 4.87% |
| CNN Classifier | F1 / Accuracy | 81.8% / 80.8% |
| ANN (Final) | F1 / Accuracy | 85.3% / 84.0% |
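The accuracy and F1 figures above follow the usual confusion-matrix definitions, sketched below. The counts in the example are hypothetical and chosen only to make the arithmetic easy to follow; they are not the model's actual test results.

```python
def accuracy_and_f1(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Accuracy = (TP + TN) / all cases; F1 = 2TP / (2TP + FP + FN)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)
    return accuracy, f1


# Hypothetical confusion counts for 100 test images.
acc, f1_score = accuracy_and_f1(tp=45, fp=10, fn=5, tn=40)
print(round(acc, 3), round(f1_score, 3))
```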
Installation & Validation
This is the official technical documentation prepared by the Ethosoft team for Teknofest validators.
Python Version
Python 3.12 must be installed on the validation device. This is required for full library compatibility.
NVIDIA GPU Setup (Optional)
If the validation device has an NVIDIA GPU and GPU acceleration is desired, follow the steps below. This step can be skipped for CPU-only runs.
⚠️ On Windows, GPU acceleration will not work for TensorFlow. It can be used for PyTorch only.
⚠️ Non-NVIDIA GPUs are not supported.
- Visit the CUDA Toolkit download page for your device type and follow the installation steps.
- After installation, verify by running the following in your terminal:
nvidia-smi
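If a scripted check is more convenient, the sketch below reports whether the interpreter is Python 3.12 and whether the `nvidia-smi` executable is on the `PATH`. It uses only the standard library. The function name is hypothetical, and finding `nvidia-smi` only indicates that the driver tools are installed, not that TensorFlow or PyTorch can actually use the GPU.

```python
import shutil
import sys


def check_environment() -> dict[str, bool]:
    """Report basic prerequisites for the validation run."""
    return {
        "python_3_12": sys.version_info[:2] == (3, 12),
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }


report = check_environment()
print(report)
```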
Library Installation
Windows
- Activate your target Python environment (skip if installing globally).
- Run:
pip install -r winrequirements.txt
Linux
- Activate your target Python environment (skip if installing globally).
- Run:
pip install -r requirements.txt
PyRadiomics Installation
The cardiomegaly task uses PyRadiomics, which cannot be installed directly via pip and requires a manual build. Git must be installed on your system.
- Open a terminal in the documentation folder.
- Run:
git clone https://github.com/AIM-Harvard/pyradiomics.git
pip install -e pyradiomics/[dev,docs,test]
Running Cardiomegaly / CTR Validation
From the documentation folder, with your Python environment active, run:
python kardiyomegaliteknefes.py --base_path [IMAGE_PATH] --output_json [OUTPUT_JSON_PATH]
Replace the bracketed values with your own paths:
| Argument | Description |
|---|---|
| `--base_path` | Path to the folder containing the chest X-ray images |
| `--output_json` | Path where the output JSON file will be saved |
Evaluation results
- F1 Score on NIH ChestX-Ray14 (External Test), self-reported: 0.853
- Accuracy on NIH ChestX-Ray14 (External Test), self-reported: 0.840
- MAPE on NIH ChestX-Ray14 (External Test), self-reported: 0.050
- MAE on NIH ChestX-Ray14 (External Test), self-reported: 0.020
- RMSE on NIH ChestX-Ray14 (External Test), self-reported: 0.030
