A deep learning pipeline for multiclass classification of pulmonary diseases from respiratory sounds. Published in IEEE WCONF 2023.
This research project implements an 8-class respiratory disease classification pipeline using real-world auscultation data. It applies audio preprocessing, mel-spectrogram conversion, and CNN-based classification using ResNet50, EfficientNet-B0, and VGG16.
📝 Published Paper: "A Deep Learning Framework for Multiclass Categorization of Pulmonary Diseases"
📚 DOI: 10.1109/WCONF58270.2023.10235057
The project is based on the ICBHI 2017 Respiratory Sound Database, a real-world, expert-annotated dataset collected from 126 patients across various age groups.
- 📈 6,898 labeled audio segments
- 🧓 Age groups: Children, Adults, Elderly
- ⏱ Recording length: 10s–90s
- 🧬 Classes: Pneumonia, COPD, Asthma, LRTI, URTI, Bronchiectasis, Bronchiolitis, Healthy
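
The eight classes above serve both the multiclass task and the binary Healthy-vs-Others task. A minimal sketch of one possible label encoding follows. The index order and the helper names (`CLASS_TO_INDEX`, `to_multiclass_index`, `to_binary_label`) are assumptions for illustration, not taken from the notebooks.

```python
from typing import Dict, List

# Class names as listed in this README; the index order is an assumption.
CLASSES: List[str] = [
    "Pneumonia", "COPD", "Asthma", "LRTI",
    "URTI", "Bronchiectasis", "Bronchiolitis", "Healthy",
]
CLASS_TO_INDEX: Dict[str, int] = {name: i for i, name in enumerate(CLASSES)}


def to_multiclass_index(diagnosis: str) -> int:
    """Map a diagnosis name to an integer class index for the 8-class task."""
    return CLASS_TO_INDEX[diagnosis]


def to_binary_label(diagnosis: str) -> int:
    """Map a diagnosis to 0 (Healthy) or 1 (any other class) for the binary task."""
    if diagnosis not in CLASS_TO_INDEX:
        raise KeyError(f"unknown diagnosis: {diagnosis!r}")
    return 0 if diagnosis == "Healthy" else 1
```

With this encoding, the multiclass models would predict one of eight indices, while the binary models would only separate `Healthy` from everything else.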
Dataset Version | Link |
---|---|
📦 Official ICBHI Site | bhichallenge.med.auth.gr |
📁 Kaggle Mirror | Kaggle Link |
🎧 Preprocessed Audio | Google Drive |
🖼️ Mel-Spectrograms | Google Drive |
- Preprocessing – Segment audio into 6s slices, denoise, normalize
- Feature Extraction – Convert to mel-spectrograms
- Model Training – CNN-based classification (EfficientNet, ResNet50, VGG16)
- Evaluation – Confusion matrix, accuracy, ROC curves
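
The segmentation step above can be illustrated with a small, dependency-free sketch. It cuts a mono signal into fixed 6-second slices. It assumes that a short final slice is zero-padded, which this README does not specify, and the function name `split_into_segments` is hypothetical. The actual notebook may use librosa and handle the tail differently.

```python
from typing import List, Sequence


def split_into_segments(
    samples: Sequence[float],
    sample_rate: int,
    segment_seconds: float = 6.0,
) -> List[List[float]]:
    """Cut a mono signal into fixed-length segments.

    Assumption: a final segment shorter than `segment_seconds` is
    zero-padded rather than dropped.
    """
    if sample_rate <= 0 or segment_seconds <= 0:
        raise ValueError("sample_rate and segment_seconds must be positive")

    segment_len = int(round(sample_rate * segment_seconds))
    segments: List[List[float]] = []
    for start in range(0, len(samples), segment_len):
        chunk = list(samples[start:start + segment_len])
        # Pad the tail with silence so every segment has the same length.
        chunk.extend([0.0] * (segment_len - len(chunk)))
        segments.append(chunk)
    return segments
```

Equal-length segments matter because the downstream CNNs expect spectrogram images of a fixed size.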
Model | Task | Accuracy (%) |
---|---|---|
ResNet50 | Binary (Healthy vs Others) | 87.1 |
EfficientNet-B0 | Binary (Healthy vs Others) | 94.1 |
EfficientNet-B0 | Multiclass (8-Class) | 82.1 |
ResNet50 | Multiclass (8-Class) | 79.8 |
VGG16 | Multiclass (8-Class) | 0.14 (failed) |
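
Accuracy figures like those in the table can be derived from a confusion matrix. The sketch below shows that calculation in plain Python. The function names and the rows-are-true-labels convention are assumptions for illustration; the notebooks may rely on scikit-learn or Keras metrics instead.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    labels: Sequence[str],
) -> List[List[int]]:
    """Count predictions per (true, predicted) pair; rows are true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def accuracy_percent(matrix: Sequence[Sequence[int]]) -> float:
    """Overall accuracy in percent: diagonal (correct) counts over all counts."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total if total else 0.0
```

The off-diagonal cells show which classes are confused with each other, which a single accuracy number hides.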
.
├── 01_audio_preprocessing_subslices.ipynb
├── 02_train_test_split.ipynb
├── 03_feature_extraction_spectrograms.ipynb
├── 04_resnet50_binary_classification.ipynb
├── 05_efficientnet_binary_classification.ipynb
├── 06_efficientnet_multiclass_classification.ipynb
├── 07_resnet50_multiclass_classification.ipynb
├── 08_vgg16_multiclass_classification.ipynb
├── patient-disease-labels.csv
├── pulmonary.png
└── README.md
Notebook | Description |
---|---|
01_audio_preprocessing_subslices.ipynb | Splits raw audio into 6-second segments |
02_train_test_split.ipynb | 80/20 stratified data split |
03_feature_extraction_spectrograms.ipynb | Mel-spectrogram generation using librosa |
04_resnet50_binary_classification.ipynb | Binary classification using ResNet50 |
05_efficientnet_binary_classification.ipynb | Binary classification using EfficientNet |
06_efficientnet_multiclass_classification.ipynb | 8-class classification using EfficientNet |
07_resnet50_multiclass_classification.ipynb | 8-class classification using ResNet50 |
08_vgg16_multiclass_classification.ipynb | Failed training example (for comparison) |
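
The 80/20 stratified split used in `02_train_test_split.ipynb` keeps each class at roughly the same proportion in both subsets. A minimal standard-library sketch of the idea is shown below. The function name, the seed, and the rounding rule are assumptions; the notebook itself may use scikit-learn's `train_test_split` with `stratify`.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Pair = Tuple[Hashable, str]


def stratified_split(
    items: Sequence[Hashable],
    labels: Sequence[str],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[Pair], List[Pair]]:
    """Split (item, label) pairs so each class keeps about `test_fraction` in test."""
    if len(items) != len(labels):
        raise ValueError("items and labels must have the same length")

    # Group items by class so each class is split independently.
    by_label: Dict[str, List[Hashable]] = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[Pair] = []
    test: List[Pair] = []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = int(round(len(group) * test_fraction))
        test.extend((item, label) for item in group[:n_test])
        train.extend((item, label) for item in group[n_test:])
    return train, test
```

This sketch splits at the item level. Whether the notebook splits by segment or by patient is not stated in this README.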
- Python 3.8.3+
- TensorFlow 2.x
- Librosa
- NumPy, Pandas, scikit-learn
- Jupyter, Matplotlib
git clone https://github.com/your-username/respiratory-sound-classification.git
cd respiratory-sound-classification
pip install -r requirements.txt
Then run the notebooks in order, from 01_ to 08_.
⚠️ This project is intended for research and educational purposes only. It is not approved for clinical use by any regulatory authority (e.g. FDA, CE, CDSCO).
- Decision support tools
- AI-assisted stethoscope design
- Remote respiratory screening systems
Please cite this work as:
@INPROCEEDINGS{10235057,
author={Khanaghavalle, G R and Manoj, Allen and Karthikeyan, Janani and Murali, Ritunjay},
booktitle={2023 World Conference on Communication & Computing (WCONF)},
title={A Deep Learning Framework for Multiclass Categorization of Pulmonary Diseases},
year={2023},
pages={1-6},
keywords={Deep learning;Pulmonary diseases;Lung;Air pollution;Convolutional neural networks},
doi={10.1109/WCONF58270.2023.10235057}
}
- Audio signal processing (librosa)
- CNN-based image classification
- Model evaluation & statistical validation
- Reproducible research practices
- Clinical AI design awareness
- Research publication & reporting