NeuroMed-Cardio-0.8B

A multi-component AI system for cardiomegaly detection from chest X-ray images. The system combines segmentation, object detection, radiomics feature extraction, and an artificial neural network to produce a single final prediction.


Clinical Context: A cardiothoracic ratio (CTR) ≥ 0.5 on a chest X-ray is the standard clinical threshold for cardiomegaly — an early indicator of heart failure.


System Architecture

The pipeline runs through two parallel branches. All outputs are fused inside a final ANN for the binary cardiomegaly prediction.

Chest X-Ray
      │
      ▼
[Preprocessing]
Gamma Correction → Gaussian Blur → CLAHE
      │
      ├──────────────────────────────────┐
      │                                  │
      ▼                                  ▼
[Segmentation Branch]            [YOLOv11 Branch]
U-Net++ (Heart)                  Heart + Lung
DeepLabV3+ (Lungs)               Bounding Box Detection
      │                                  │
      ├── Segmentation CTR          YOLO CTR
      ├── Heart/Lung Area Ratio          │
      ├── Heart Left–Right Width         │
      └── Radiomics (55 features)        │
            │                            │
            └──────────┬─────────────────┘
                       ▼
                 Ensemble CTR
              (YOLO 95% + Segmentation 5%)
                       │
                       ▼
                [ANN — 63 Features]
                512 → 256 → 128 neurons
                       │
                       ▼
             Cardiomegaly Prediction

Preprocessing

Raw X-ray images vary significantly across different scanner devices — contrast imbalances and sensor noise directly affect model performance. A three-stage preprocessing pipeline is applied before any model sees the image:

Technique         What It Does
Gamma Correction  Reveals cardiac tissue and vascular boundary details hidden in dark regions
Gaussian Blur     Suppresses pixel-level sensor noise and smooths edges for cleaner detection
CLAHE             Locally enhances contrast to bring out fine anatomical structures

All images are resized to 256×256 pixels.
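
The exact preprocessing parameters are not published; as an illustration, the first stage (gamma correction) can be sketched in plain numpy. The gamma value here is a hypothetical example, and in practice OpenCV (`cv2.GaussianBlur`, `cv2.createCLAHE`) would typically supply the remaining two stages:

```python
import numpy as np

def gamma_correct(img, gamma=0.7):
    """Brighten dark regions of an 8-bit X-ray (gamma < 1 lifts shadows)."""
    normalized = img.astype(np.float32) / 255.0
    corrected = np.power(normalized, gamma)
    return (corrected * 255.0).astype(np.uint8)
```

With gamma below 1, mid-dark pixel values are lifted while black and white endpoints stay fixed, which is what exposes cardiac boundary detail in underexposed regions.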


Segmentation Models

U-Net++ — Heart Segmentation

Predicts the heart mask at the pixel level.

  • Architecture: U-Net++ with nested skip connections (detail-preserving)
  • Parameters: ~8 million
  • Performance: 89% IoU score
  • Regularization: BatchNormalization + Dropout (prevents overfitting)
  • Output: Binary mask isolating the heart region

DeepLabV3+ — Lung Segmentation

Produces separate masks for the left and right lungs.

  • Backbone: ResNet101
  • Key Component: ASPP (Atrous Spatial Pyramid Pooling) — captures multi-scale contextual features simultaneously
  • Mechanism: Atrous (dilated) convolution captures wide receptive fields without losing fine boundary detail
  • Output: Separate masks for left lung and right lung

YOLOv11 Branch — Object Detection

Runs in parallel with segmentation and produces an independent CTR estimate.

  • Model: YOLOv11-large
  • Detected Structures: Heart, Left Lung, Right Lung
  • Keypoint Detection: Lower corners of lungs and heart — used to measure horizontal extents
  • Data Augmentation: Mosaic, Mixup, Horizontal/Vertical Flip
  • Training: 50 epochs, batch size 64
  • CTR Calculation: Heart width and lung width are derived from detected bounding box coordinates

CTR Calculation

Segmentation-Based CTR

Canny edge detection is applied to the heart and lung masks. Horizontal extents are measured from the detected boundaries:

CTR = Heart Width / Lung Width
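
As a minimal sketch of this ratio (the pipeline derives the extents from Canny edges; here they are read directly from the binary mask columns, which gives the same horizontal extents for a clean mask):

```python
import numpy as np

def mask_width(mask):
    """Maximal horizontal extent of a binary mask, in pixels."""
    cols = np.nonzero(mask.any(axis=0))[0]
    return int(cols[-1] - cols[0] + 1)

def segmentation_ctr(heart_mask, lung_mask):
    # CTR = maximal cardiac width / maximal thoracic (lung) width
    return mask_width(heart_mask) / mask_width(lung_mask)
```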

Ensemble CTR

The two branches are combined with a weighted average:

Ensemble CTR = (YOLO CTR × 0.95) + (Segmentation CTR × 0.05)

YOLO contributes high overall accuracy; the segmentation branch adds fine anatomical detail as a complementary correction signal.
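
The weighted average above reduces to a one-line fusion function:

```python
def ensemble_ctr(yolo_ctr, seg_ctr, w_yolo=0.95):
    """Weighted fusion of the two CTR estimates (0.95 / 0.05 per the document)."""
    return w_yolo * yolo_ctr + (1.0 - w_yolo) * seg_ctr
```

For example, a YOLO estimate of 0.50 and a segmentation estimate of 0.60 fuse to 0.505, which crosses the clinical CTR ≥ 0.5 threshold.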

Ensemble performance improvement:

Model              MAPE    SMAPE   MAE    RMSE
YOLO only          5.58%   5.32%   0.02   0.04
Segmentation only  6.03%   5.64%   0.03   0.05
Ensemble           5.02%   4.87%   0.02   0.03

Feature Extraction

The 63 features fed into the ANN come from four sources:

1. Ensemble CTR

The weighted CTR value computed above; this is the primary clinical measurement.

2. Heart/Lung Area Ratio

Computed as: Heart Pixel Area / Lung Pixel Area

Complements CTR by providing an area-based measurement rather than a width-based one. Particularly useful in cases of chest deformity or lung disease where width measurements alone can be misleading.
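
A minimal sketch of the area ratio from the two binary segmentation masks:

```python
import numpy as np

def heart_lung_area_ratio(heart_mask, lung_mask):
    """Pixel-count ratio of the segmented heart to the segmented lungs."""
    return np.count_nonzero(heart_mask) / np.count_nonzero(lung_mask)
```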

3. Heart Left–Right Width

Measures how much the heart extends to the left and right of the cardiac midline.

  • The esophageal axis is estimated by predicting its upper and lower endpoints
  • The angle between these points is computed via arctan2 and extended into a full axis line
  • The segmented heart is split along this line
  • Left-side and right-side widths are measured separately

This determines the direction of cardiac enlargement, providing clinically meaningful spatial context beyond a single ratio.
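
The splitting steps above can be sketched as follows. The axis endpoints and the per-row linear interpolation are illustrative assumptions; in the full pipeline the axis angle comes from `np.arctan2` on the predicted endpoints:

```python
import numpy as np

def left_right_widths(heart_mask, top_pt, bottom_pt):
    """Split the heart mask along an estimated axis and measure each side's width.

    top_pt / bottom_pt are (x, y) endpoints of the esophageal-axis estimate.
    """
    (x0, y0), (x1, y1) = top_pt, bottom_pt
    slope = (x1 - x0) / (y1 - y0)          # dx per row; assumes a non-horizontal axis
    ys, xs = np.nonzero(heart_mask)
    axis_x = x0 + (ys - y0) * slope        # axis x-position at each heart pixel's row
    left = np.clip(axis_x - xs, 0, None)   # distances of pixels left of the axis
    right = np.clip(xs - axis_x, 0, None)  # distances of pixels right of the axis
    return float(left.max()), float(right.max())
```

A strongly asymmetric pair of widths indicates the direction of enlargement, e.g. a left-dominant width suggests left-ventricular enlargement.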

4. Radiomics — 55 Features

Extracted from the heart segmentation mask using PyRadiomics.

  • 186 candidate features are initially extracted covering intensity distribution, tissue homogeneity, shape, and texture
  • 55 clinically significant and statistically contributive features are selected for the final model
  • These features allow the model to reason about the statistical character of the image, not just its visual appearance
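
PyRadiomics reads its feature selection from a YAML parameter file. The fragment below is a hypothetical sketch using feature-class names from the PyRadiomics documentation; the project's actual 55-feature selection is not published:

```yaml
# params.yaml — illustrative only; the real 55-feature subset is not disclosed
imageType:
  Original: {}
featureClass:
  firstorder: []   # intensity distribution
  shape2D: []      # heart-mask shape
  glcm: []         # texture / tissue homogeneity
setting:
  label: 1
  force2D: true    # chest X-rays are 2D
```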

CNN — Parallel Image Classifier

A custom Residual Attention CNN runs alongside the feature extraction pipeline and provides an independent classification signal.

  • Input: 256×256 grayscale chest X-ray
  • Parameters: ~7.5 million
  • Loss Function: Binary Crossentropy
  • Optimizer: Adam, 100 epochs
Component                      Purpose
Residual (skip) blocks         Prevent information loss in deeper layers
Attention mechanism            Focuses the model on the cardiac and pulmonary region
Dropout + Batch Normalization  Prevents overfitting
Latent supervision             Trains on both intermediate and final outputs for balanced learning
  • External Test Performance: 80.8% accuracy, 81.8% F1 score

ANN — Final Fusion Model

All features (63-dimensional vector) are fused in a single fully-connected ANN that produces the final prediction.

  • Architecture: 512 → 256 → 128 neurons (fully connected)
  • Input: CNN output + Ensemble CTR + Heart/Lung Area Ratio + Heart Left-Right Width + 55 Radiomics features
  • Regularization: BatchNormalization + Dropout
  • Output: Cardiomegaly present / not present (binary)
  • External Test Performance: 84.0% accuracy, 85.3% F1 score
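
To make the layer shapes concrete, here is a pure-numpy sketch of the forward pass with randomly initialized weights. The actual model is a trained fully-connected network with BatchNormalization and Dropout, which are omitted here for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# 63 input features -> 512 -> 256 -> 128 -> 1 output probability
dims = [63, 512, 256, 128, 1]
weights = [rng.standard_normal((i, o)) * 0.01 for i, o in zip(dims, dims[1:])]

def forward(features):
    x = features
    for W in weights[:-1]:
        x = relu(x @ W)              # dense layer + ReLU
    return sigmoid(x @ weights[-1])  # binary cardiomegaly probability
```

A probability above 0.5 would be read as "cardiomegaly present" under the usual binary decision rule.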

Synthetic Data Augmentation with GANs

To address low diversity in cardiomegaly cases, four GAN architectures were trained and evaluated using FID scores (lower = better):

Model          FID Score
CGAN           62
DCGAN          72
IAGAN          40
StyleGAN2-ADA  25

StyleGAN2-ADA's Adaptive Data Augmentation (ADA) mechanism enables stable training even with limited data, achieving the lowest FID score and the most realistic synthetic X-ray images.


Explainable AI (XAI)

The system not only produces a prediction — it explains why.

  • Grad-CAM: Generates heatmaps showing which image regions the CNN focused on. Confirms the model attends to the cardiac and pulmonary area rather than irrelevant regions.
  • SHAP Analysis: Quantifies each feature's individual contribution to the final prediction. CNN output is the highest contributor, followed by YOLO CTR and area ratio.

Training Data

A total of 80,572 images from five datasets were assembled into a balanced training set (40,286 cardiomegaly / 40,286 non-cardiomegaly).

Dataset           Institution                       Images  Split
MIMIC-CXR         MIT, USA                          51,693  Training
PadChest          Hospital San Juan, Spain          12,600  Training
CheXpert          Stanford University, USA           6,046  Training
VinDr-CXR         VinBigData, Vietnam                4,598  Training
BRAX              Hospital Albert Einstein, Brazil   2,572  Hard Training
NIH ChestX-Ray14  NIH Clinical Center, USA           3,063  External Test

Performance Summary

Model / Component             Metric         Value
U-Net++ (Heart Segmentation)  IoU            89%
YOLOv11 CTR                   MAPE / SMAPE   5.58% / 5.32%
Segmentation CTR              MAPE / SMAPE   6.03% / 5.64%
Ensemble CTR                  MAPE / SMAPE   5.02% / 4.87%
CNN Classifier                F1 / Accuracy  81.8% / 80.8%
ANN (Final)                   F1 / Accuracy  85.3% / 84.0%

Installation & Validation

This is the official technical documentation prepared by the Ethosoft team for Teknofest validators.

Python Version

Python 3.12 must be installed on the validation device. This is required for full library compatibility.

NVIDIA GPU Setup (Optional)

If the validation device has an NVIDIA GPU and GPU acceleration is desired, follow the steps below. This step can be skipped for CPU-only runs.

⚠️ On Windows, GPU acceleration will not work for TensorFlow. It can be used for PyTorch only.
⚠️ Non-NVIDIA GPUs are not supported.

  1. Visit the CUDA Toolkit download page for your device type and follow the installation steps.
  2. After installation, verify by running the following in your terminal:
nvidia-smi

Library Installation

Windows

  1. Activate your target Python environment (skip if installing globally).
  2. Run:
pip install -r winrequirements.txt

Linux

  1. Activate your target Python environment (skip if installing globally).
  2. Run:
pip install -r requirements.txt

PyRadiomics Installation

The cardiomegaly task uses PyRadiomics, which cannot be installed directly via pip and requires a manual build. Git must be installed on your system.

  1. Open a terminal in the documentation folder.
  2. Run:
git clone https://github.com/AIM-Harvard/pyradiomics.git
pip install -e pyradiomics/[dev,docs,test]

Running Cardiomegaly / CTR Validation

From the documentation folder, with your Python environment active, run:

python kardiyomegaliteknefes.py --base_path [IMAGE_PATH] --output_json [OUTPUT_JSON_PATH]

Replace the bracketed values with your own paths:

Argument       Description
--base_path    Path to the folder containing the chest X-ray images
--output_json  Path where the output JSON file will be saved