This project focuses on developing an enhanced medical image segmentation system that integrates Generative Adversarial Networks (GANs) for data augmentation and a Large Language Model (LLM) for generating descriptive reports based on segmentation results. The system uses a U-Net model for segmentation and incorporates GPT-2 as the LLM to provide detailed insights into the segmentation outputs. The project runs on NVIDIA GPUs with CUDA for accelerated computing.
- Project Overview
- Project Structure
- Installation
- Dataset Preparation
- Usage
- Results
- Visualization
- Acknowledgments
## Project Overview

The goal of this project is to enhance medical image segmentation performance by:
- Data Augmentation: Using a Generative Adversarial Network (GAN) to generate synthetic medical images, thereby increasing dataset diversity.
- Image Segmentation: Implementing a U-Net model for accurate segmentation of medical images.
- Report Generation: Integrating a pre-trained GPT-2 model to generate descriptive textual reports based on the segmentation outputs.
Key Technologies and Tools:
- Programming Language: Python 3.x
- Deep Learning Framework: PyTorch
- Libraries: NumPy, Matplotlib, scikit-learn, torchvision, transformers, Pillow
- Hardware: NVIDIA GPU with CUDA for accelerated computing
- Dataset: ISIC Skin Cancer Dataset
## Project Structure

```
MedSegGAN-LLM/
├── data/
│   ├── images/               # Original images
│   ├── masks/                # Original segmentation masks
│   ├── augmented_images/     # Synthetic images generated by GAN
│   └── augmented_masks/      # Corresponding masks for synthetic images (if applicable)
├── results/
│   ├── generated_images/     # Images generated during GAN training
│   ├── segmentation_outputs/ # Output masks from the segmentation model
│   └── reports/              # Generated textual reports from LLM
├── src/
│   ├── data_preprocessing.py # Data loading and preprocessing functions
│   ├── gan_training.py       # GAN model training script
│   ├── unet_training.py      # U-Net model training script
│   ├── llm_integration.py    # LLM integration for report generation
│   ├── neural_networks.py    # Neural network architectures (GAN and U-Net)
│   ├── utils.py              # Utility functions (e.g., visualization)
│   └── main.py               # Main script to run the entire pipeline
├── requirements.txt          # Required Python libraries
└── README.md                 # Documentation
```
## Installation

Prerequisites:

- Python 3.8 or higher
- CUDA Toolkit (for GPU acceleration)
- An NVIDIA GPU compatible with CUDA
- Anaconda or virtualenv (recommended for environment management)
1. Clone the Repository

   git clone https://github.com/Arek-KesizAbnousi/MedImgSegmentation-GAN-LLM.git
   cd MedImgSegmentation-GAN-LLM
2. Create a Virtual Environment

   Using Anaconda:

   conda create -n medseg_env python=3.8
   conda activate medseg_env

   Or using virtualenv:

   python3 -m venv medseg_env
   source medseg_env/bin/activate
3. Install Required Packages

   pip install -r requirements.txt
4. Verify CUDA Installation

   Ensure that PyTorch detects the GPU:

   python -c "import torch; print(torch.cuda.is_available())"

   The output should be True.
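The training scripts need a device to run on. The snippet below shows the standard PyTorch pattern for selecting the GPU when CUDA is available and falling back to the CPU otherwise; it is a generic sketch, not code taken from this repository.

```python
import torch

# Prefer the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
if device.type == "cuda":
    print(f"GPU: {torch.cuda.get_device_name(0)}")
```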
## Dataset Preparation

1. Download the ISIC Skin Cancer Dataset

   - Register on the ISIC Archive.
   - Download the dataset, including images and the corresponding segmentation masks.
2. Data Preprocessing

   Run the data preprocessing script to prepare the datasets:

   python src/data_preprocessing.py

   Functionality:

   - Loads images and masks.
   - Resizes and normalizes images.
   - Splits data into training and validation sets.
## Usage

Script: src/data_preprocessing.py
Purpose: Load and preprocess the dataset, and split it into training and validation sets.
Run:
python src/data_preprocessing.py
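For reference, the sketch below illustrates the preprocessing steps listed above (loading, resizing, normalizing, and splitting). It is a minimal stand-in, not the project's actual code: the directory paths follow the tree above, but the function names, the 256x256 target size, and the assumption that masks share filenames with their images are all illustrative.

```python
import os
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split

IMG_SIZE = (256, 256)  # assumed target resolution; the real script may differ

def load_pair(image_path, mask_path):
    """Load one image/mask pair, resize both, and scale pixel values to [0, 1]."""
    image = Image.open(image_path).convert("RGB").resize(IMG_SIZE)
    mask = Image.open(mask_path).convert("L").resize(IMG_SIZE)
    return np.asarray(image) / 255.0, np.asarray(mask) / 255.0

def build_dataset(image_dir="data/images", mask_dir="data/masks", val_size=0.2):
    """Preprocess every pair and split the arrays into training and validation sets."""
    names = sorted(os.listdir(image_dir))  # assumes masks use the same filenames
    pairs = [load_pair(os.path.join(image_dir, n), os.path.join(mask_dir, n)) for n in names]
    images = np.array([p[0] for p in pairs])
    masks = np.array([p[1] for p in pairs])
    return train_test_split(images, masks, test_size=val_size, random_state=42)

if __name__ == "__main__":
    X_train, X_val, y_train, y_val = build_dataset()
    print(X_train.shape, X_val.shape)
```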
Script: src/gan_training.py
Purpose: Train the GAN to generate synthetic images for data augmentation.
Run:
python src/gan_training.py
Output:
- Trained GAN models saved in models/
- Generated images saved in results/generated_images/
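The real generator and discriminator live in src/neural_networks.py. As a hedged illustration of the adversarial training idea, the sketch below uses toy fully connected networks and a standard binary cross-entropy GAN objective; the layer sizes, learning rates, and 64x64 image size are made up for the example and do not reflect the project's actual architecture.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
latent_dim = 100  # assumed noise-vector size

# Toy fully connected generator/discriminator on flattened 64x64 RGB images;
# the architectures in src/neural_networks.py are likely convolutional.
generator = nn.Sequential(
    nn.Linear(latent_dim, 512), nn.ReLU(),
    nn.Linear(512, 3 * 64 * 64), nn.Tanh(),
).to(device)
discriminator = nn.Sequential(
    nn.Linear(3 * 64 * 64, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 1), nn.Sigmoid(),
).to(device)

criterion = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_images):
    """One adversarial update: discriminator on real vs. fake, then generator."""
    batch = real_images.size(0)
    real = real_images.view(batch, -1).to(device)
    ones = torch.ones(batch, 1, device=device)
    zeros = torch.zeros(batch, 1, device=device)

    # Update the discriminator: real images labeled 1, generated images labeled 0.
    noise = torch.randn(batch, latent_dim, device=device)
    fake = generator(noise)
    d_loss = criterion(discriminator(real), ones) + criterion(discriminator(fake.detach()), zeros)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Update the generator: try to make the discriminator label fakes as real.
    g_loss = criterion(discriminator(fake), ones)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```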
Script: src/unet_training.py
Purpose: Train the U-Net model for image segmentation using the augmented dataset.
Run:
python src/unet_training.py
Output:
- Trained U-Net model saved in models/
- Segmentation outputs saved in results/segmentation_outputs/
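Below is a minimal sketch of the kind of training loop you would expect in src/unet_training.py. The `UNet` import, its constructor signature, the BCE-with-logits loss, and the learning rate are assumptions made for illustration; the project's actual loss function and hyperparameters may differ.

```python
import torch
import torch.nn as nn

# `UNet` is assumed to be the class exported by src/neural_networks.py;
# its constructor and the hyperparameters below are illustrative only.
from neural_networks import UNet

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = UNet().to(device)
criterion = nn.BCEWithLogitsLoss()  # binary lesion-vs-background masks
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_epoch(loader):
    """Run one epoch over a DataLoader that yields (image, mask) batches."""
    model.train()
    total = 0.0
    for images, masks in loader:
        images, masks = images.to(device), masks.to(device)
        optimizer.zero_grad()
        logits = model(images)          # raw per-pixel scores from the U-Net
        loss = criterion(logits, masks)
        loss.backward()
        optimizer.step()
        total += loss.item()
    return total / max(len(loader), 1)
```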
Script: src/llm_integration.py
Purpose: Generate descriptive reports based on the segmentation results using GPT-2.
Run:
python src/llm_integration.py
Output:
- Generated reports saved in results/reports/
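The following is a minimal sketch of wiring GPT-2 report generation with the Hugging Face transformers library. The prompt template and the idea of summarizing the mask by its lesion-pixel fraction (assuming a binary 0/1 mask) are assumptions for illustration; src/llm_integration.py may condition the model on different statistics.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def generate_report(mask: torch.Tensor) -> str:
    """Turn a simple mask statistic into a prompt and let GPT-2 continue it."""
    lesion_fraction = mask.float().mean().item()  # assumes a binary 0/1 mask
    prompt = (f"Segmentation summary: the lesion covers {lesion_fraction:.1%} of the image. "
              f"Clinical impression:")
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=60,
            do_sample=True,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```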
Script: src/main.py
Purpose: Execute the entire process from data preprocessing to report generation.
Run:
python src/main.py
Output:
- Generated reports saved in results/reports/
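Conceptually, the pipeline simply runs the four stages in order. The sketch below does this by invoking the stage scripts as subprocesses; the real src/main.py may instead import the modules and call their functions directly.

```python
import subprocess
import sys

# Run each stage script in sequence; abort the pipeline if any stage fails.
STAGES = [
    "src/data_preprocessing.py",
    "src/gan_training.py",
    "src/unet_training.py",
    "src/llm_integration.py",
]

for stage in STAGES:
    print(f"==> running {stage}")
    subprocess.run([sys.executable, stage], check=True)
```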
## Results

- Synthetic Images: Check results/generated_images/ to view images generated by the GAN.
- Segmentation Outputs: Segmentation masks produced by the U-Net model are in results/segmentation_outputs/.
- Reports: Textual reports generated by GPT-2 are in results/reports/.
Sample Outputs:

- Original Image and Mask:
- Synthetic Image:
- Segmentation Result:
- Generated Report:

  The segmentation results indicate that the lesion has irregular borders and heterogeneous pigmentation, suggesting a potential malignancy. Further dermoscopic evaluation is recommended.
## Visualization

Use the utils.py script for visualization purposes.

Display an Image:

from utils import display_image
display_image(image_tensor)

Display a Mask:

from utils import display_mask
display_mask(mask_tensor)
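If you need to adapt these helpers, a matplotlib-based version might look like the sketch below. This is an illustrative stand-in, not the actual implementation in utils.py, and the `show_tensor` name is made up for the example.

```python
import matplotlib.pyplot as plt
import torch

def show_tensor(tensor: torch.Tensor, cmap=None, title=""):
    """Display a CHW image tensor or an HW mask tensor with matplotlib."""
    array = tensor.detach().cpu().numpy()
    if array.ndim == 3:                 # CHW -> HWC for imshow
        array = array.transpose(1, 2, 0)
    plt.imshow(array.squeeze(), cmap=cmap)
    plt.title(title)
    plt.axis("off")
    plt.show()

# Usage analogous to display_image / display_mask:
# show_tensor(image_tensor, title="Image")
# show_tensor(mask_tensor, cmap="gray", title="Mask")
```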
## Acknowledgments

- ISIC Archive: For providing the skin cancer dataset.
- PyTorch Community: For the powerful deep learning framework.
- Hugging Face: For the transformers library and pre-trained GPT-2 model.
- NVIDIA: For GPUs and CUDA, enabling accelerated computing.