Toolkit for extraction of metafeatures from medical datasets. Metafeatures are a compressed representation of a dataset that can be used in meta-learning, for example to predict model performance.
Figure: Example of usage of metafeatures
Four different methods for metafeature extraction can be used:
- Statistical: standard numerical features of the images in a dataset (mean voxel value, kurtosis, skewness, etc.) and features describing the relations between images in a dataset (mutual information, correlation, etc.).
- VGG16/ResNet50/MobileNetV1: deep-learning-based feature extraction from datasets. The network is fine-tuned without the need for labels and outputs a feature representation of the dataset, which can be used as a metafeature.
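To make the statistical features more concrete, here is a minimal, dependency-free sketch of three per-image descriptors (mean, skewness, excess kurtosis) and one between-image descriptor (Pearson correlation). The function names `voxel_statistics` and `pearson_correlation` are hypothetical and are not part of this package; the toolkit's own definitions may differ (for example in normalization). The inputs are assumed to be flattened sequences of voxel values.

```python
import math


def voxel_statistics(voxels):
    """Mean, skewness and excess kurtosis of a flat sequence of voxel values.

    Hypothetical helper for illustration only; not the package's implementation.
    """
    values = [float(v) for v in voxels]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        # A constant image has no shape information; report zeros.
        return {"mean": mean, "skewness": 0.0, "kurtosis": 0.0}
    std = math.sqrt(variance)
    skewness = sum(((v - mean) / std) ** 3 for v in values) / n
    kurtosis = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return {"mean": mean, "skewness": skewness, "kurtosis": kurtosis}


def pearson_correlation(voxels_a, voxels_b):
    """Pearson correlation between two images with the same number of voxels."""
    a = [float(v) for v in voxels_a]
    b = [float(v) for v in voxels_b]
    if len(a) != len(b):
        raise ValueError("both images must have the same number of voxels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return covariance / (norm_a * norm_b)
```

For real volumes, NumPy or SciPy would normally be used for speed; the pure-Python version only illustrates the quantities involved.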
Figure: Metafeature extraction using deep learning based methods
Images should have the .nii.gz extension. If your images are in .nii format, gzip (https://www.gzip.org/) can be used for fast conversion.
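If the gzip command-line tool is not at hand, the same conversion can be sketched with the Python standard library. The helper name `gzip_nifti` is hypothetical and not part of this package; it writes a compressed copy next to the original file and leaves the original untouched.

```python
import gzip
import shutil
from pathlib import Path


def gzip_nifti(nii_path):
    """Compress 'image.nii' to 'image.nii.gz' and return the new path.

    Hypothetical helper for illustration; the original file is kept.
    """
    source = Path(nii_path)
    target = source.with_name(source.name + ".gz")
    with open(source, "rb") as raw, gzip.open(target, "wb") as compressed:
        # Stream the bytes so that large volumes are not loaded into memory at once.
        shutil.copyfileobj(raw, compressed)
    return target
```

Calling `gzip_nifti("scan.nii")` would produce `scan.nii.gz` in the same folder.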
The main medical_metafeatures requirement is:
- Python (>= 3.6)
Specific requirements can be found in 'requirements.txt'. The package is tested on Python 3.6 and Python 3.7.
Installation from PyPI is not yet available.
It is possible to install the current version using:
pip install -U git+https://github.com/tjvsonsbeek/medical-mfe
or
git clone https://github.com/tjvsonsbeek/medical-mfe.git
cd medical-mfe
python setup.py sdist
or
Download the package through this link:
https://drive.google.com/file/d/1gR1mRe-wpD_0Ap-F8szHgNqYe7SOk7YS/view?usp=sharing
Use get_meta_features for the extraction of metafeatures.
Example:
python -m medical_metafeatures.meta_get_features --task 'Example_dataset' --feature_extractors 'STAT' 'VGG16' --meta_subset_size 15 --generate_model_weights False --output_path 'dest' --task_path 'datasets'
-t, --task
Name of the dataset or datasets from which metafeatures will be extracted, given as a string. Multiple inputs are possible.
--feature_extractors
Feature extractors to use for metafeature extraction, given as strings. Choose from 'STAT', 'VGG16', 'ResNet50' and 'MobileNetV1'. Multiple inputs are possible.
Default = ['STAT', 'VGG16']
--load_labels
Choose whether to load metalabels. Throws an error if there are no metalabels. Currently only works for the Medical Decathlon datasets. Metalabels are not public yet.
Default = False
--meta_subset_size
Number of images on which each metafeature is based.
Default = 20
--meta_sample_size
Number of metafeatures per dataset.
Default = 10
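One way to read the interplay of --meta_subset_size and --meta_sample_size is that each dataset yields several metafeatures, each computed from its own random subset of images. The sketch below illustrates that reading only. The function `sample_image_subsets` is hypothetical, and whether the toolkit samples without replacement or lets subsets overlap is an assumption, not something stated here.

```python
import random


def sample_image_subsets(image_paths, subset_size=20, sample_size=10, seed=None):
    """Draw `sample_size` random subsets of `subset_size` images each.

    Hypothetical illustration: images are unique within a subset, but
    different subsets may share images (assumed, not confirmed).
    """
    paths = list(image_paths)
    if subset_size > len(paths):
        raise ValueError("subset_size exceeds the number of available images")
    rng = random.Random(seed)
    return [rng.sample(paths, subset_size) for _ in range(sample_size)]
```

With the defaults, a dataset of at least 20 images would give 10 subsets of 20 images, so 10 metafeatures per feature extractor.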
--generate_model_weights
Boolean indicating whether new model weights should be generated. Only used for deep-learning-based metafeature extraction.
Default = True
--output_path
Path where all output will be saved.
Default = 'metafeature_extraction_result'
--task_path
Path in which to find the dataset folder. This path should contain a folder named after the -t/--task value, which in turn should contain an ImagesTs folder holding the images from which the metafeatures are extracted. Images should have the .nii.gz extension.
Default = 'DecathlonData'
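The expected layout is therefore `<task_path>/<task>/ImagesTs/*.nii.gz`. A small pre-flight check along these lines can catch a misplaced folder before a long run starts. The helper `find_task_images` is hypothetical and not part of this package; it only encodes the folder structure described above.

```python
from pathlib import Path


def find_task_images(task_path, task):
    """Return the sorted .nii.gz images under <task_path>/<task>/ImagesTs.

    Hypothetical helper that mirrors the documented folder layout.
    """
    image_dir = Path(task_path) / task / "ImagesTs"
    if not image_dir.is_dir():
        raise FileNotFoundError("missing image folder: %s" % image_dir)
    images = sorted(image_dir.glob("*.nii.gz"))
    if not images:
        raise FileNotFoundError("no .nii.gz images found in %s" % image_dir)
    return images
```

For the example command, `find_task_images("datasets", "Example_dataset")` would list the images in `datasets/Example_dataset/ImagesTs`.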
--finetune_ntrain
Number of training images in finetuning. Only applicable when generate_model_weights == True
Default = 800
--finetune_nval
Number of validation images in finetuning. Only applicable when generate_model_weights == True
Default = 200
--finetune_nepochs
Number of epochs in finetuning. Only applicable when generate_model_weights == True
Default = 5
--finetune_batch
Batch size in finetuning. Only applicable when generate_model_weights == True
Default = 5
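For reference, the options listed above can be summarized as an argparse parser. This is a reconstruction from the documentation only, not the package's actual parser: `build_parser` and `str2bool` are hypothetical names, making --task required is an assumption, and the explicit boolean conversion is there because a plain `type=bool` would treat the string 'False' as true.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' style strings into booleans."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    """Hypothetical parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Metafeature extraction (sketch)")
    # Assumption: at least one task name must be given.
    parser.add_argument("-t", "--task", nargs="+", required=True)
    parser.add_argument("--feature_extractors", nargs="+",
                        choices=["STAT", "VGG16", "ResNet50", "MobileNetV1"],
                        default=["STAT", "VGG16"])
    parser.add_argument("--load_labels", type=str2bool, default=False)
    parser.add_argument("--meta_subset_size", type=int, default=20)
    parser.add_argument("--meta_sample_size", type=int, default=10)
    parser.add_argument("--generate_model_weights", type=str2bool, default=True)
    parser.add_argument("--output_path", default="metafeature_extraction_result")
    parser.add_argument("--task_path", default="DecathlonData")
    parser.add_argument("--finetune_ntrain", type=int, default=800)
    parser.add_argument("--finetune_nval", type=int, default=200)
    parser.add_argument("--finetune_nepochs", type=int, default=5)
    parser.add_argument("--finetune_batch", type=int, default=5)
    return parser
```

Parsing the arguments of the example command with this sketch would give a list of tasks, a list of feature extractors, and `generate_model_weights` set to `False`, with all other options at their documented defaults.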
This project has been set up using PyScaffold 3.2.2. For details and usage information on PyScaffold see https://pyscaffold.org/.