Autoencoder and NCA based neural network model to estimate survival prognosis in multiple myeloma using arrayCGH data
Vidhi Malik 1, Shayoni Dutta 2, Navaneethan Radhakrishnan 1, Yogesh Kalakoti 1, Ritu Gupta 3,* and Durai Sundar 1,4,*
1 Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi - 110016, India.
2 Certara UK Ltd, Quantitative systems pharmacology division of SimCyp, Level 2-Acero, 1 Concourse Way, Sheffield S1 2BJ, United Kingdom.
3 Laboratory Oncology Unit, Dr. B.R.A.IRCH, All India Institute of Medical Sciences (AIIMS), Ansari Nagar, New Delhi, 110029, India.
4 Yardi School of Artificial Intelligence, Indian Institute of Technology (IIT) Delhi, New Delhi – 110016, India.
Multiple myeloma (MM) is a malignancy of plasma cells, which are found in the bone marrow and help fight infections by synthesizing immunoglobulins. Clonally proliferating abnormal plasma cells outgrow normal plasma cells and go on to synthesize abnormal proteins, leading to MM. With advances in clinical research, the disease has become highly manageable but remains incurable. Medical practitioners consider various clinical factors when predicting prognosis and choosing treatment regimens for patients. This work attempts to develop a tool that can help predict the survival and prognosis of MM patients, and so support clinicians in designing suitable treatment regimens.
To use the proposed neural network-based survival prediction model for multiple myeloma patients, run the following commands:
cd TeamSundar/Multiple-myeloma_prognosis/NCA-Neuralnet
load mm_NCANN.mat
newoutput = mm_NCANN(newinput);
The pipeline requires two input files:
- Clinical features file: it should have seven columns, in the format shown below
aCGH ID_1 | Age | Gender | OS_Time (days) | Chemotherapy Regimen | ISS Staging |
---|---|---|---|---|---|
253058713873_1 | 73 | 1 | 52 | 1 | 2 |
253058713873_3 | 75 | 0 | 175 | 4 | 2 |
253058713877_1 | 54 | 1 | 182 | 2 | 3 |
253058713877_2 | 58 | 1 | 203 | 7 | 3 |
Refer to the following table for the codes used in the gender, chemotherapy regimen, ISS staging and response columns:
| Gender | Chemotherapy Regimen | Staging (International Staging System) |
|---|---|---|
| 0 (Male) | 1 : lenalidomide-dexamethasone (RD) | 1 (ISS 1) |
| 1 (Female) | 2 : thalidomide-dexamethasone (TD) | 2 (ISS 2) |
| | 3 : bortezomib-dexamethasone (VD) | 3 (ISS 3) |
| | 4 : melphalan-prednisone-thalidomide (MPT) | |
| | 5 : bortezomib-thalidomide-dexamethasone (VTD) | |
| | 6 : bortezomib-lenalidomide-dexamethasone (VRD) | |
| | 7 : bortezomib-cyclophosphamide-dexamethasone (VCD) | |
| | 8 : cyclophosphamide-thalidomide-dexamethasone (CTD) | |
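The coded columns can be checked and translated before the clinical file is passed on. The following is a minimal Python sketch, not part of the repository: the code tables are copied from the table above, while the function name `decode_clinical_row` and the tuple layout (ID, age, gender, OS time, regimen, ISS stage) are assumptions made for illustration.

```python
# Code tables copied from the README; all names below are illustrative.
GENDER = {0: "Male", 1: "Female"}
REGIMEN = {
    1: "lenalidomide-dexamethasone (RD)",
    2: "thalidomide-dexamethasone (TD)",
    3: "bortezomib-dexamethasone (VD)",
    4: "melphalan-prednisone-thalidomide (MPT)",
    5: "bortezomib-thalidomide-dexamethasone (VTD)",
    6: "bortezomib-lenalidomide-dexamethasone (VRD)",
    7: "bortezomib-cyclophosphamide-dexamethasone (VCD)",
    8: "cyclophosphamide-thalidomide-dexamethasone (CTD)",
}
ISS_STAGE = {1: "ISS 1", 2: "ISS 2", 3: "ISS 3"}


def decode_clinical_row(row):
    """Translate the coded fields of one clinical row into readable labels.

    `row` is assumed to be (acgh_id, age, gender, os_days, regimen, iss).
    A KeyError is raised if a code is not listed in the tables above.
    """
    acgh_id, age, gender, os_days, regimen, iss = row
    return {
        "acgh_id": acgh_id,
        "age": int(age),
        "gender": GENDER[int(gender)],
        "os_days": int(os_days),
        "regimen": REGIMEN[int(regimen)],
        "iss": ISS_STAGE[int(iss)],
    }


# First example row from the clinical features table.
decoded = decode_clinical_row(("253058713873_1", 73, 1, 52, 1, 2))
print(decoded)
```

Because an unknown code raises a `KeyError`, the same function doubles as a simple validity check on each row.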
- CNV file: the required CNV input file should be in the format shown in the table below:
Sample | Gene1 | Gene2 | .. | GeneN |
---|---|---|---|---|
Sample1 | | | | |
Sample2 | | | | |
.. | | | | |
SampleN | | | | |
Neighbourhood component analysis (NCA) was used to reduce the dimensionality of the input dataset. It yielded a gene signature of 211 genes that classifies patients into three classes based on their progression and death events. The input file should contain CNV values for these 211 genes. The input file can be formatted using the script ./Input_Data/Input_prep.py
The model classifies each patient into one of three classes based on the likelihood of progression and death events:
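To illustrate what restricting a CNV table to the signature genes involves, here is a minimal Python sketch. It is not the repository's `Input_prep.py`, whose behaviour is not described here. The function name `select_signature_columns` and the toy gene names are hypothetical, and the sketch assumes a table with samples in rows and genes in columns, as in the format above.

```python
def select_signature_columns(header, rows, signature_genes):
    """Keep only the signature-gene columns, in signature order.

    header: column names of the CNV table; the first entry is the sample column.
    rows: list of rows, each a list aligned with `header`.
    signature_genes: gene names the model expects (211 according to the README).
    Returns (sample_ids, matrix) with one row of floats per sample.
    """
    index = {name: i for i, name in enumerate(header)}
    missing = [g for g in signature_genes if g not in index]
    if missing:
        raise ValueError(
            f"CNV table is missing {len(missing)} signature gene(s), e.g. {missing[:5]}"
        )
    cols = [index[g] for g in signature_genes]
    sample_ids = [row[0] for row in rows]
    matrix = [[float(row[c]) for c in cols] for row in rows]
    return sample_ids, matrix


# Toy example with hypothetical gene names and values.
header = ["Sample", "GeneA", "GeneB", "GeneC"]
rows = [["Sample1", "0.1", "-0.4", "0.0"], ["Sample2", "0.3", "0.2", "-0.1"]]
ids, matrix = select_signature_columns(header, rows, ["GeneC", "GeneA"])
print(ids, matrix)
```

Failing loudly on a missing gene avoids silently feeding the model fewer features than it was trained on. Whether the model expects samples in rows or in columns is not stated here, so the matrix may need transposing before use.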
- Class 1: 11 (dead with relapse, i.e., progression event: 1 and death event: 1)
- Class 2: 10 (alive with relapse, i.e., progression event: 1 and death event: 0)
- Class 3: 0 (alive with no relapse, i.e., progression event: 0 and death event: 0)
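The class codes above can be mapped back to the two underlying events with a small lookup. The Python sketch below is illustrative only. The README does not document the output format of `mm_NCANN`, so `class_from_scores` assumes (hypothetically) that the model returns one score per class, in the order Class 1, Class 2, Class 3, and takes the highest-scoring class.

```python
# Class definitions copied from the list above; names are illustrative.
CLASS_EVENTS = {
    1: {"label": "11", "progression": 1, "death": 1, "meaning": "dead with relapse"},
    2: {"label": "10", "progression": 1, "death": 0, "meaning": "alive with relapse"},
    3: {"label": "0", "progression": 0, "death": 0, "meaning": "alive with no relapse"},
}


def describe_class(class_number):
    """Return the progression/death events for class 1, 2 or 3."""
    return CLASS_EVENTS[class_number]


def class_from_scores(scores):
    """Pick the class with the highest score.

    Assumption: `scores` holds exactly three values ordered as
    Class 1, Class 2, Class 3. The returned class number is 1-based.
    """
    if len(scores) != 3:
        raise ValueError("expected exactly three class scores")
    best = max(range(3), key=lambda i: scores[i])
    return best + 1


predicted = class_from_scores([0.1, 0.7, 0.2])
print(predicted, describe_class(predicted)["meaning"])
```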
The MATLAB live script for the proposed NCA-neural network-based model is located at ./NCA-Neuralnet/ArrayCGH_NCA_Neural_net_92_7percent_accuracy_final_model.mlx
The MATLAB live script for the autoencoder-based prediction models, DNN1 and DNN2, is located at ./DNN1_and_DNN2/ArrayCGH_DNN1_52_6_andDNN2_68_4percent_SVM_41_2_RUS_33percent.mlx
Distributed under the MIT License. See LICENSE for more information.