Skip to content

๐Ÿฉบ Healthcare Data Integration, Analysis & Visualization project using Python, Jupyter Notebook, and Power BI. Integrates multi-sheet Excel data, performs data cleaning, feature engineering, exploratory analysis, and delivers interactive healthcare dashboards.

License

Notifications You must be signed in to change notification settings

ChandrashekarDevarasetti/Visual-Healthcare-Insights-Python-EDA-Power-BI-Dashboards

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

83 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

Healthcare Data Insights Banner

Visual-Healthcare-Insights-Python-EDA-Power-BI-Dashboards

๐Ÿ“š Table of Contents

  1. Project Overview
  2. Project Description
  3. Key Features
  4. Tools & Technologies
  5. Project Folder Structure
  6. Installation & Setup (One Block for Python + Power BI)
  7. How to Run (For both Python EDA + Power BI)
  8. Detailed Overview of Health_Care_EDA in Python
    ย ย ย ย  8.1 Description of the Dataset
    ย ย ย ย  8.2 Data Cleaning & Preparation
    ย ย ย ย ย ย ย ย  8.2.1 Merging All Datasets
    ย ย ย ย ย ย ย ย  8.2.2 Standardizing Data
    ย ย ย ย ย ย ย ย  8.2.3 Data Integrity Validation
    ย ย ย ย ย ย ย ย  8.2.4 Handling Missing Values
    ย ย ย ย ย ย ย ย  8.2.5 Handling Duplicates Records
    ย ย ย ย ย ย ย ย  8.2.6 Converting Datatypes
    ย ย ย ย ย ย ย ย  8.2.7 Creating Derived Columns
    ย ย ย ย ย ย ย ย  8.2.8 Mapping Categorical Values
  9. Exploratory Data Analysis (EDA)
    ย ย ย ย  9.1 Univariate Analysis
    ย ย ย ย  9.2 Bivariate Analysis
    ย ย ย ย  9.3 Multivariate Analysis
    ย ย ย ย  9.4 Distribution Analysis
    ย ย ย ย  9.5 Correlation Analysis
  10. Detailed Overview of HealthCare Power BI Dashboard
    ย ย ย ย  10.1 Overview Dashboard
    ย ย ย ย  10.2 Medical Condition & Outcome Analysis
    ย ย ย ย  10.3 Billing & Insurance Analysis
    ย ย ย ย  10.4 Doctor & Hospital Performance
    ย ย ย ย  10.5 Time-Based Analysis
  11. Author
  12. License

๐Ÿ“Œ 1. Project Overview

ย ย ย ย This project focuses on analyzing healthcare data to uncover key insights into patient admissions, medical conditions, treatment outcomes, and hospital performance. By combining Python for data preparation and cleaning with Power BI for interactive dashboards, the project aims to support healthcare administrators in making data-driven operational and clinical decisions.

๐Ÿ“Œ 2. Project Description

ย ย ย ย ๐Ÿฉบ The Healthcare Data Analysis and Visualization Project involves working with a multi-sheet Excel dataset containing patient details, hospital information, doctor records, and patient visit data. The project workflow starts with merging and cleaning the data using Python libraries such as Pandas and NumPy in a Jupyter Notebook environment. Key data cleaning steps included handling missing values, standardizing text data, mapping admission type codes, calculating patient length of stay, and identifying high billing cases.

After preparing a clean and integrated dataset, exploratory data analysis (EDA) was performed in Python to validate data distributions and detect anomalies. The prepared dataset was then visualized in Power BI, where a series of interactive dashboards were built to deliver actionable insights.

The dashboards created include:

๐Ÿ” Overview Dashboard : Patient Admissions Summary: Visualizing patient admission counts, age distribution, gender splits, and admission trends.

๐Ÿฅ Medical Condition & Outcome Analysis : Analyzing the frequency of medical conditions, treatment outcomes, and recovery rates.

๐Ÿ’ต Billing & Insurance Analysis : Tracking billing amounts, insurance coverage patterns, and flagging high-cost cases.

๐Ÿง‘โ€โš•๏ธ Doctor & Hospital Performance : Evaluating doctor-wise and hospital-wise patient outcomes, admissions, and billing performance.

๐Ÿ“… Time-Based Analysis : Examining trends over time including admissions, discharges, and length of stay patterns.

This project demonstrates how Python-based data engineering can seamlessly integrate with BI tools like Power BI to deliver healthcare insights that improve operational efficiency and patient care decisions.

๐Ÿ“Œ 3. Key Features

  • ๐Ÿ“‘ Merges multiple Excel sheets into a single clean dataset.

  • ๐Ÿงน Cleans and standardizes patient, doctor, and hospital details.

  • โš™๏ธ Handles missing values (numeric โ†’ median, categorical โ†’ mode).

  • ๐Ÿ“ Calculates Length of Stay for each patient.

  • ๐Ÿ’ธ Flags patients with High Billing Amounts.

  • ๐Ÿ”ข Maps Admission Types to numeric codes for analysis.

  • ๐Ÿ“Š Performs EDA using Python (Pandas, Matplotlib, Seaborn).

  • ๐Ÿ“ˆ Builds Power BI dashboards for dynamic visual insights.

๐Ÿ“Œ4. Tools & Technologies

  • Python
    • Pandas
    • NumPy
    • Matplotlib
    • Seaborn
  • ๐Ÿ“Š Power BI
  • ๐Ÿ“‘ Microsoft Excel
  • ๐Ÿ““ Jupyter Notebook
  • ๐Ÿ“‚ CSV & Excel Files (for data storage)

๐Ÿ“Œ 5. Project Folder Structure

โ”œโ”€โ”€ ๐Ÿ“ Data/ # Healthcare Excel dataset files

โ”‚ โ””โ”€โ”€ healthcare_dataset.xlsx

โ”‚

โ”œโ”€โ”€ ๐Ÿ“ Images/ # Project images for README or dashboards

โ”‚

โ”œโ”€โ”€ ๐Ÿ“ Python/ # Python notebook, requirements, and scripts

โ”‚ โ”œโ”€โ”€ HealthCare_EDA.ipynb

โ”‚ โ””โ”€โ”€ requirements.txt

โ”‚

โ”œโ”€โ”€ ๐Ÿ“ PowerBI/ # Power BI dashboard files

โ”‚ โ””โ”€โ”€ HealthCare_Dashboard.pbix

โ”‚

โ”œโ”€โ”€ ๐Ÿ“„ .gitignore # Git ignore rules

โ”œโ”€โ”€ ๐Ÿ“„ LICENSE # Project open source license

โ”œโ”€โ”€ ๐Ÿ“„ README.md # Project overview and documentation

๐Ÿ“Œ 6. Installation & Setup (One Block for Python + Power BI)

1๏ธโƒฃ Clone the repository

git clone https://github.com/ChandrashekarDevarasetti/Visual-Healthcare-Insights-Python-EDA-Power-BI-Dashboards.git

cd Visual-Healthcare-Insights-Python-EDA-Power-BI-Dashboards

2๏ธโƒฃ Install required Python packages

pip install -r Python/requirements.txt

3๏ธโƒฃ Launch the Jupyter Notebook

jupyter notebook Python/HealthCare_EDA.ipynb

4๏ธโƒฃ Open the Power BI Dashboard manually:

Navigate to the 'PowerBI' folder and open 'HealthCare_Dashboard.pbix' in Power BI Desktop

๐Ÿ“Œ 7. How to Run (For both Python EDA + Power BI)

Run Python EDA Notebook

1๏ธโƒฃ Install dependencies

ย ย ย ย  Make sure you have all the necessary dependencies by running the following command:

ย ย ย ย  pip install -r Python/requirements.txt

2๏ธโƒฃ Launch the Jupyter Notebook

ย ย ย ย  After installing the dependencies, open the Jupyter notebook with the following command:

ย ย ย ย  jupyter notebook Python/HEALTHCARE_EDA.ipynb

3๏ธโƒฃ In your browser, open the notebook and run all cells sequentially

ย ย ย ย  Once the notebook is open in your browser, execute all the cells to run the EDA analysis.

๐Ÿ“Š Open Power BI Dashboard

1๏ธโƒฃ Install Power BI Desktop

ย ย ย ย  If you haven't already, install Power BI Desktop. You can download it from here.

2๏ธโƒฃ Open the Power BI file

ย ย ย ย  To view the dashboards, open the Power BI file located in the PowerBI directory:

ย ย ย ย  PowerBI/HealthCare_Dashboard.pbix

3๏ธโƒฃ Explore all the interactive dashboards

ย ย ย ย  Once the Power BI file is open, you can explore the following interactive dashboards:

  • ๐Ÿ“Š Overview Dashboard

  • ๐Ÿฉบ Medical Condition & Outcome Analysis

  • ๐Ÿ’ธ Billing & Insurance Analysis

  • ๐Ÿง‘โ€โš•๏ธ Doctor & Hospital Performance

4๏ธโƒฃ Refresh the dataset if needed

ย ย ย ย  If you need to refresh the data, connect to the Excel file located under the /data/ directory.

๐Ÿ“Œ 8. Detailed Overview of HealthCare_EDA in Python

ย ย ย ย  This notebook begins with a descriptive exploration of the patient and hospital datasets using summary statistics and visual analysis. It then examines patterns in patient demographics, admission types, and medical conditions to understand what factors may influence hospital stay duration. Finally, relationships between variables such as department, billing, and severity of illness are analyzed further.

8.1 Description of the Dataset

ย ย ย ย  The data in the healthcare dataset includes information about patients admitted to hospitals across different medical conditions. It contains 55500 rows and 17 columns, with data spanning several years, starting from 2019. The dataset includes details such as patient ID (P_ID), doctor ID (D_ID), hospital ID (H_ID), medical condition, date of admission, insurance provider, billing amount, room number, admission type, discharge date, medication prescribed, test results, patient name, age, gender, blood type, doctor name, and hospital name.

Key variables in the dataset include medical condition (Cancer, Diabetes, Asthma, Hypertension), billing amount (non-negative real numbers), room number (integer), admission type (Elective, Emergency, Urgent), and medication (Lipitor, Aspirin, Paracetamol). The age and blood type variables are numerical, while gender and insurance provider are categorical variables. The test results vary, with categories like Inconclusive, Abnormal, Normal and NaN values.

info

8.2 Data Cleaning & Preparation

ย ย ย ย  Data Cleaning & Preparation is the process of identifying and fixing errors, inconsistencies, and missing values in raw data, transforming it into a structured, reliable, and analysis-ready format for further processing.

8.2.1 Merging All Datasets

ย ย ย ย  To perform a complete analysis, we merge all four datasets using their respective key columns (P_ID, D_ID, H_ID). This helps consolidate patient details, doctor information, hospital data, and medical history into a single unified DataFrame for further exploration and visualization.

merged_df

8.2.2 Standardizing Name Fields & Removing Duplicates in merged data

ย ย ย ย  After merging all datasets, we ensure the Name, Doctor, and Hospital columns are clean and consistently formatted. This helps eliminate redundancy, avoids mismatched values, and improves overall data quality for analysis and visualization.

standardizing_df

8.2.3 Data Integrity Validation for Foreign Keys (P_ID, D_ID, H_ID)

Identifying Mismatches and Foreign Key Issues Between P_ID, D_ID, and H_ID in Merged Data and Master Tables

Data_Integrity

8.2.4 Handling Missing Values

Identifing and appropriately handling missing values in the dataset to prevent incomplete analysis or errors during visualization.

Missing_Values

8.2.5 Handling Duplicate Records

Identifing and appropriately handling missing values in the dataset to prevent incomplete analysis or errors during visualization.

Duplicate_Records

8.2.6 Converting Data Types

Ensure all columns have correct data types for analysis.

Converting_Datatypes

8.2.7 Creating New Derived Columns

Creating useful new columns like Length of Stay or Billing Category.

Derived_Columns

8.2.8 Mapping Categorical Values

Mapping or encode categorical values for better readability or later modeling.

Mapping_Categorical_Values

๐Ÿ“Œ 9. Exploratory Data Analysis (EDA)

Creating charts and graphs to make sense of data patterns, trends, relationships, and anomalies visually.

EDA

9.1 Univariate Analysis

ย ย ย ย  Univariate Analysis is the simplest form of data analysis where only one variable is analyzed at a time to understand its distribution, central tendency, spread, and underlying patterns.

Univariate

9.2 Bivariate Analysis

ย ย ย ย  Bivariate Analysis is the analysis of two variables simultaneously to explore the relationship, association, or correlation between them and understand how one variable affects or relates to the other.

Bivariate

9.3 Multivariate Analysis

Multivariate Analysis is the analysis of more than two variables simultaneously to understand complex relationships, interactions, and combined effects among multiple variables within a dataset.

Multivariate

9.4 Distribution Analysis

Understand data distribution patterns and proportions.

Distributions

9.5 Correlation Analysis

Correlation Heatmap: Show correlation strength between multiple numericย variables

Correlation_Heatmap

๐Ÿ“Œ 10. Detailed Overview of HealthCare Power BI Dashboard

ย ย ย ย  This comprehensive Power BI Healthcare Admissions & Billing Dashboard offers end-to-end insights into patient admissions, medical conditions, doctor performance, billing trends, and time-based activity. It includes interactive KPI cards, dynamic charts, matrix visuals, and drill-through pages for detailed patient-level analysis. The dashboard empowers stakeholders to monitor hospital operations, financial performance, and clinical outcomes effectively with slicers, bookmarks, and customized timelines for rich, interactive exploration.

๐Ÿ” 10.1 Overview Dashboard

What it does: This dashboard provides a quick summary of hospital admissions, patient volumes, and financial performance.

๐Ÿ“Š Visual Insights:

  • KPI cards display total admissions, average stay, total billing, and avg. billing per patient.

  • Bar chart shows admissions trend by year/month.

  • Donut chart compares Elective vs Emergency admissions.

  • Slicers allow filtering by Year, Gender, and Insurance Provider.

๐ŸŽฏ Result: Quickly monitor hospital activity, identify admission trends, and understand patient distribution by type and demographics at a glance.

Overview_Dashboard

๐Ÿฅ 10.2 Medical Condition & Outcome Analysis

What it does: This dashboard highlights patient counts by medical condition and their corresponding test outcomes.

๐Ÿ“Š Visual Insights:

  • Stacked bar chart shows the Top 10 medical conditions by number of patients.

  • Matrix displays the outcome distribution (Normal, Abnormal, Inconclusive) for each condition.

  • Table lists patient details, filterable by condition and doctor using slicers.

๐ŸŽฏ Result: Quickly identify which conditions are most common, how patients are performing in tests, and filter detailed patient lists for deeper analysis.

Medical_Condition

๐Ÿ’ต 10.3 Billing & Insurance Analysis

What it does: This dashboard tracks hospital billing patterns, insurance provider contributions, and cost relationships.

๐Ÿ“Š Visual Insights:

  • Bar chart compares total billing amounts by insurance provider.

  • Line chart shows billing trends over time.

  • Scatter chart visualizes how billing amounts relate to patient length of stay, color-coded by medical condition.

๐ŸŽฏ Result: Easily monitor financial performance, identify top-paying insurers, and spot patterns between costs, patient stays, and conditions.

Billing_Insurance

๐Ÿง‘โ€โš•๏ธ 10.4 Doctor & Hospital Performance

What it does: This dashboard evaluates doctor workload, patient outcomes, and hospital-wise admissions.

๐Ÿ“Š Visual Insights:

  • Table/Matrix shows each doctorโ€™s patient count, average billing, and average length of stay.

  • Bar chart displays number of admissions per hospital.

  • Heat map cross-tabulates doctors with admission types and test results.

๐ŸŽฏ Result: Identify high-performing doctors, hospital patient loads, and how test results vary by doctor and admission type.

Doctor_Hospital

๐Ÿ“… 10.5 Time-Based Analysis

What it does: This dashboard tracks patient admissions over time, helping spot trends and seasonal patterns.

๐Ÿ“Š Visual Insights:

  • Line chart shows admission trends over time.

  • Calendar heatmap highlights daily admissions activity.

  • Custom timeline (via bookmarks) lets users switch views by Year โ†’ Quarter โ†’ Month โ†’ Date.

  • Drill-through pages provide patient-level details from any time point.

๐ŸŽฏ Result: Understand how admissions fluctuate over time, identify peak periods, and drill down to patient records on specific dates for deeper analysis.

Time_Based_Yearly

Time_Based_Monthly

Time_Based_Daily

Patient_Details_Drillthorugh

๐Ÿ‘จโ€๐Ÿ’ป 11. Author

Chandra Shekar Devarasetti
Data Scientist | Data Science Enthusiast

12. License

This project is licensed under the MIT License.

About

๐Ÿฉบ Healthcare Data Integration, Analysis & Visualization project using Python, Jupyter Notebook, and Power BI. Integrates multi-sheet Excel data, performs data cleaning, feature engineering, exploratory analysis, and delivers interactive healthcare dashboards.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published