diff --git a/My Project/.ipynb_checkpoints/09-11-2020 ML Course Nigeria Project 'Abdulhameed Araromi'-checkpoint.ipynb b/My Project/.ipynb_checkpoints/09-11-2020 ML Course Nigeria Project 'Abdulhameed Araromi'-checkpoint.ipynb new file mode 100644 index 0000000..5258d8a --- /dev/null +++ b/My Project/.ipynb_checkpoints/09-11-2020 ML Course Nigeria Project 'Abdulhameed Araromi'-checkpoint.ipynb @@ -0,0 +1,1696 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Project\n", + "\n", + "In this project, our aim is to build a model for predicting churn. Churn is the percentage of customers that stopped using your company's product or service during a certain time frame. Thus, in the given dataset, our label will be the `Churn` column.\n", + "\n", + "## Steps\n", + "- Read the `churn.csv` file and describe it.\n", + "- Perform at least 4 different analyses in the Exploratory Data Analysis section.\n", + "- Pre-process the dataset to get it ready for ML application. (Check for missing data and handle it; do we need scaling, feature extraction, etc.?)\n", + "- Define an appropriate evaluation metric for our case (classification).\n", + "- Train and evaluate Logistic Regression, Decision Trees and one other appropriate algorithm of your choice from the scikit-learn library.\n", + "- Is there any overfitting or underfitting? Interpret your results and, if there is a problem, try to overcome it in a new section.\n", + "- Create a confusion matrix for each algorithm and display the Accuracy, Recall, Precision and F1-Score values.\n", + "- Analyse and compare the results of the 3 algorithms.\n", + "- Select the best-performing model on the test dataset, based on the evaluation metric you chose.\n", + "\n", + "\n", + "Good luck :)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "

Abdulhameed Temitope Araromi

" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Data" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import seaborn as sns\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table>\n",
+ "<thead>\n",
+ "<tr><th></th><th>Churn</th><th>AccountWeeks</th><th>ContractRenewal</th><th>DataPlan</th><th>DataUsage</th><th>CustServCalls</th><th>DayMins</th><th>DayCalls</th><th>MonthlyCharge</th><th>OverageFee</th><th>RoamMins</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>0</td><td>128</td><td>1</td><td>1</td><td>2.7</td><td>1</td><td>265.1</td><td>110</td><td>89.0</td><td>9.87</td><td>10.0</td></tr>\n",
+ "<tr><th>1</th><td>0</td><td>107</td><td>1</td><td>1</td><td>3.7</td><td>1</td><td>161.6</td><td>123</td><td>82.0</td><td>9.78</td><td>13.7</td></tr>\n",
+ "<tr><th>2</th><td>0</td><td>137</td><td>1</td><td>0</td><td>0.0</td><td>0</td><td>243.4</td><td>114</td><td>52.0</td><td>6.06</td><td>12.2</td></tr>\n",
+ "<tr><th>3</th><td>0</td><td>84</td><td>0</td><td>0</td><td>0.0</td><td>2</td><td>299.4</td><td>71</td><td>57.0</td><td>3.10</td><td>6.6</td></tr>\n",
+ "<tr><th>4</th><td>0</td><td>75</td><td>0</td><td>0</td><td>0.0</td><td>3</td><td>166.7</td><td>113</td><td>41.0</td><td>7.42</td><td>10.1</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ], + "text/plain": [ + " Churn AccountWeeks ContractRenewal DataPlan DataUsage CustServCalls \\\n", + "0 0 128 1 1 2.7 1 \n", + "1 0 107 1 1 3.7 1 \n", + "2 0 137 1 0 0.0 0 \n", + "3 0 84 0 0 0.0 2 \n", + "4 0 75 0 0 0.0 3 \n", + "\n", + " DayMins DayCalls MonthlyCharge OverageFee RoamMins \n", + "0 265.1 110 89.0 9.87 10.0 \n", + "1 161.6 123 82.0 9.78 13.7 \n", + "2 243.4 114 52.0 6.06 12.2 \n", + "3 299.4 71 57.0 3.10 6.6 \n", + "4 166.7 113 41.0 7.42 10.1 " + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Read csv\n", + "data = pd.read_csv(\"churn.csv\")\n", + "data.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "<class 'pandas.core.frame.DataFrame'>\n", + "RangeIndex: 3333 entries, 0 to 3332\n", + "Data columns (total 11 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 Churn 3333 non-null int64 \n", + " 1 AccountWeeks 3333 non-null int64 \n", + " 2 ContractRenewal 3333 non-null int64 \n", + " 3 DataPlan 3333 non-null int64 \n", + " 4 DataUsage 3333 non-null float64\n", + " 5 CustServCalls 3333 non-null int64 \n", + " 6 DayMins 3333 non-null float64\n", + " 7 DayCalls 3333 non-null int64 \n", + " 8 MonthlyCharge 3333 non-null float64\n", + " 9 OverageFee 3333 non-null float64\n", + " 10 RoamMins 3333 non-null float64\n", + "dtypes: float64(5), int64(6)\n", + "memory usage: 286.6 KB\n" + ] + } + ], + "source": [ + "# Describe each feature and use .info() to get information about our dataset\n", + "# Analyse missing values\n", + "data.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table>\n",
+ "<thead>\n",
+ "<tr><th></th><th>Churn</th><th>AccountWeeks</th><th>ContractRenewal</th><th>DataPlan</th><th>DataUsage</th><th>CustServCalls</th><th>DayMins</th><th>DayCalls</th><th>MonthlyCharge</th><th>OverageFee</th><th>RoamMins</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>count</th><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td><td>3333.000000</td></tr>\n",
+ "<tr><th>mean</th><td>0.144914</td><td>101.064806</td><td>0.903090</td><td>0.276628</td><td>0.816475</td><td>1.562856</td><td>179.775098</td><td>100.435644</td><td>56.305161</td><td>10.051488</td><td>10.237294</td></tr>\n",
+ "<tr><th>std</th><td>0.352067</td><td>39.822106</td><td>0.295879</td><td>0.447398</td><td>1.272668</td><td>1.315491</td><td>54.467389</td><td>20.069084</td><td>16.426032</td><td>2.535712</td><td>2.791840</td></tr>\n",
+ "<tr><th>min</th><td>0.000000</td><td>1.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>14.000000</td><td>0.000000</td><td>0.000000</td></tr>\n",
+ "<tr><th>25%</th><td>0.000000</td><td>74.000000</td><td>1.000000</td><td>0.000000</td><td>0.000000</td><td>1.000000</td><td>143.700000</td><td>87.000000</td><td>45.000000</td><td>8.330000</td><td>8.500000</td></tr>\n",
+ "<tr><th>50%</th><td>0.000000</td><td>101.000000</td><td>1.000000</td><td>0.000000</td><td>0.000000</td><td>1.000000</td><td>179.400000</td><td>101.000000</td><td>53.500000</td><td>10.070000</td><td>10.300000</td></tr>\n",
+ "<tr><th>75%</th><td>0.000000</td><td>127.000000</td><td>1.000000</td><td>1.000000</td><td>1.780000</td><td>2.000000</td><td>216.400000</td><td>114.000000</td><td>66.200000</td><td>11.770000</td><td>12.100000</td></tr>\n",
+ "<tr><th>max</th><td>1.000000</td><td>243.000000</td><td>1.000000</td><td>1.000000</td><td>5.400000</td><td>9.000000</td><td>350.800000</td><td>165.000000</td><td>111.300000</td><td>18.190000</td><td>20.000000</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>
" + ], + "text/plain": [ + " Churn AccountWeeks ContractRenewal DataPlan DataUsage \\\n", + "count 3333.000000 3333.000000 3333.000000 3333.000000 3333.000000 \n", + "mean 0.144914 101.064806 0.903090 0.276628 0.816475 \n", + "std 0.352067 39.822106 0.295879 0.447398 1.272668 \n", + "min 0.000000 1.000000 0.000000 0.000000 0.000000 \n", + "25% 0.000000 74.000000 1.000000 0.000000 0.000000 \n", + "50% 0.000000 101.000000 1.000000 0.000000 0.000000 \n", + "75% 0.000000 127.000000 1.000000 1.000000 1.780000 \n", + "max 1.000000 243.000000 1.000000 1.000000 5.400000 \n", + "\n", + " CustServCalls DayMins DayCalls MonthlyCharge OverageFee \\\n", + "count 3333.000000 3333.000000 3333.000000 3333.000000 3333.000000 \n", + "mean 1.562856 179.775098 100.435644 56.305161 10.051488 \n", + "std 1.315491 54.467389 20.069084 16.426032 2.535712 \n", + "min 0.000000 0.000000 0.000000 14.000000 0.000000 \n", + "25% 1.000000 143.700000 87.000000 45.000000 8.330000 \n", + "50% 1.000000 179.400000 101.000000 53.500000 10.070000 \n", + "75% 2.000000 216.400000 114.000000 66.200000 11.770000 \n", + "max 9.000000 350.800000 165.000000 111.300000 18.190000 \n", + "\n", + " RoamMins \n", + "count 3333.000000 \n", + "mean 10.237294 \n", + "std 2.791840 \n", + "min 0.000000 \n", + "25% 8.500000 \n", + "50% 10.300000 \n", + "75% 12.100000 \n", + "max 20.000000 " + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data.describe()" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Churn 0\n", + "AccountWeeks 0\n", + "ContractRenewal 0\n", + "DataPlan 0\n", + "DataUsage 0\n", + "CustServCalls 0\n", + "DayMins 0\n", + "DayCalls 0\n", + "MonthlyCharge 0\n", + "OverageFee 0\n", + "RoamMins 0\n", + "dtype: int64" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data.isna().sum()" + ] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "# Exploratory Data Analysis" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEGCAYAAACUzrmNAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAP60lEQVR4nO3dcayd9V3H8fdnMBm6ESEUVtrOsqVTCyqEayXyh0yi1CWmbHNLMRuNErsQZkaymMD+ENQ0WSLbHHPDdBmDmm2k2YZUBSfD6VxkY7dLs9JiXR0Id630spmARtF2X/84T8NZe3p/p7c959z2vl/JyXnO93l+z/lecssnz/P8nuemqpAkaS6vmHQDkqSFz7CQJDUZFpKkJsNCktRkWEiSms6cdAOjcv7559fKlSsn3YYknVK2b9/+fFUtObJ+2obFypUrmZ6ennQbknRKSfJvg+qehpIkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDWdtndwn6grfm/LpFvQArT9j2+YdAvSRHhkIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqSmkYVFkhVJvpzkySS7kry3q9+R5LtJdnSvN/eNuS3J3iR7klzbV78iyc5u3V1JMqq+JUlHO3OE+z4IvK+qvpnkNcD2JI906z5cVXf2b5xkNbAeuAS4CPhSkjdW1SHgbmAj8DXgIWAt8PAIe5ck9RnZkUVV7a+qb3bLLwJPAsvmGLIOuL+qXqqqp4C9wJokS4FzquqxqipgC3DdqPqWJB1tLNcskqwELge+3pXek+RbSe5Jcm5XWwY82zdspqst65aPrA/6no1JppNMz87OnswfQZIWtZGHRZJXA58HbqmqF+idUnoDcBmwH/jg4U0HDK856kcXqzZX1VRVTS1ZsuREW5ckdUYaFkleSS8oPl1VXwCoqueq6lBV/QD4BLCm23wGWNE3fDmwr6svH1CXJI3JKGdDBfgk8GRVfaivvrRvs7cAT3TL24D1Sc5KcjGwCni8qvYDLya5stvnDcCDo+pbknS0Uc6Gugp4F7AzyY6u9n7g+iSX0TuV9DTwboCq2pVkK7Cb3kyqm7uZUAA3AfcCZ9ObBeVMKEkao5GFRVV9lcHXGx6aY8wmYNOA+jRw6cnrTpJ0PLyDW5LUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lS08jCIsmKJF9O8mSSXUne29XPS/JIkm937+f2jbktyd4ke5Jc21e/IsnObt1dSTKqviVJRxvlkcVB4H1V9dPAlcDNSVYDtwKPVtUq4NHuM9269cAlwFrg40nO6PZ1N7ARWNW91o6wb0nSEUYWFlW1v6q+2S2/CDwJLAPWAfd1m90HXNctrwPur6qXquopYC+wJslS4JyqeqyqCtjSN0aSNAZjuWaRZCVwOfB14MKq2g+
9QAEu6DZbBjzbN2ymqy3rlo+sD/qejUmmk0zPzs6e1J9BkhazkYdFklcDnwduqaoX5tp0QK3mqB9drNpcVVNVNbVkyZLjb1aSNNBIwyLJK+kFxaer6gtd+bnu1BLd+4GuPgOs6Bu+HNjX1ZcPqEuSxmSUs6ECfBJ4sqo+1LdqG7ChW94APNhXX5/krCQX07uQ/Xh3qurFJFd2+7yhb4wkaQzOHOG+rwLeBexMsqOrvR/4ALA1yY3AM8DbAapqV5KtwG56M6lurqpD3bibgHuBs4GHu5ckaUxGFhZV9VUGX28AuOYYYzYBmwbUp4FLT153kqTj4R3ckqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lS01BhkeTRYWqSpNPTmXOtTPIq4EeB85OcC6RbdQ5w0Yh7kyQtEHOGBfBu4BZ6wbCdl8PiBeBjo2tLkrSQzBkWVfUR4CNJfreqPjqmniRJC0zryAKAqvpokl8EVvaPqaotI+pLkrSADBUWSf4ceAOwAzjUlQswLCRpERgqLIApYHVV1SibkSQtTMPeZ/EE8NpRNiJJWriGDYvzgd1Jvphk2+HXXAOS3JPkQJIn+mp3JPlukh3d6819625LsjfJniTX9tWvSLKzW3dXkhz5XZKk0Rr2NNQd89j3vcCfcvR1jQ9X1Z39hSSrgfXAJfSm6X4pyRur6hBwN7AR+BrwELAWeHge/UiS5mnY2VD/cLw7rqqvJFk55ObrgPur6iXgqSR7gTVJngbOqarHAJJsAa7DsJCksRr2cR8vJnmhe/1PkkNJXpjnd74nybe601TndrVlwLN928x0tWXd8pH1Y/W5Mcl0kunZ2dl5tidJOtJQYVFVr6mqc7rXq4C30TvFdLzupjcF9zJgP/DBrj7oOkTNUT9Wn5uraqqqppYsWTKP9iRJg8zrqbNV9RfAL89j3HNVdaiqfgB8AljTrZoBVvRtuhzY19WXD6hLksZo2Jvy3tr38RX07rs47nsukiytqv3dx7fQm5ILsA34TJIP0bvAvQp4vKoOdafArgS+DtwA+NgRSRqzYWdD/Xrf8kHgaXoXpY8pyWeBq+k9sXYGuB24Osll9ILmaXoPKqSqdiXZCuzu9n9zNxMK4CZ6M6vOpndh24vbkjRmw86G+q3j3XFVXT+g/Mk5tt8EbBpQnwYuPd7vlySdPMPOhlqe5IHuJrvnknw+yfL2SEnS6WDYC9yfondd4SJ6U1f/sqtJkhaBYcNiSVV9qqoOdq97AeemStIiMWxYPJ/knUnO6F7vBL43ysYkSQvHsGHx28A7gH+ndzPdbwDHfdFbknRqGnbq7B8BG6rqPwCSnAfcSS9EJEmnuWGPLH72cFAAVNX3gctH05IkaaEZNixe0ffQv8NHFsMelUiSTnHD/g//g8A/Jfkcvbuv38GAG+gkSaenYe/g3pJkmt7DAwO8tap2j7QzSdKCMfSppC4cDAhJWoTm9YhySdLiYlhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkppGFRZJ7khxI8kRf7bwkjyT5dvfe/3e9b0uyN8meJNf21a9IsrNbd1eSjKpnSdJgozyyuBdYe0TtVuDRqloFPNp9JslqYD1wSTfm40nO6MbcDWwEVnWvI/cpSRqxkYVFVX0F+P4R5XXAfd3yfcB1ffX7q+qlqnoK2AusSbIUOKeqHquqArb0jZEkjcm4r1lcWFX7Abr3C7r6MuDZvu1mutqybvnI+kBJNiaZTjI9Ozt7UhuXpMVsoVzgHnQdouaoD1RVm6tqqqqmlixZctKak6TFbtxh8Vx3aonu/UBXnwFW9G23HNjX1ZcPqEuSxmjcYbEN2NAtbwAe7KuvT3JWkovpXch
+vDtV9WKSK7tZUDf0jZEkjcmZo9pxks8CVwPnJ5kBbgc+AGxNciPwDPB2gKralWQrsBs4CNxcVYe6Xd1Eb2bV2cDD3UuSNEYjC4uquv4Yq645xvabgE0D6tPApSexNUnScVooF7glSQuYYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNEwmLJE8n2ZlkR5LprnZekkeSfLt7P7dv+9uS7E2yJ8m1k+hZkhazSR5ZvKmqLquqqe7zrcCjVbUKeLT7TJLVwHrgEmAt8PEkZ0yiYUlarBbSaah1wH3d8n3AdX31+6vqpap6CtgLrBl/e5K0eE0qLAr42yTbk2zsahdW1X6A7v2Crr4MeLZv7ExXO0qSjUmmk0zPzs6OqHVJWnzOnND3XlVV+5JcADyS5J/n2DYDajVow6raDGwGmJqaGriNJOn4TSQsqmpf934gyQP0Tis9l2RpVe1PshQ40G0+A6zoG74c2DfWhqUF5pk//JlJt6AF6HW/v3Nk+x77aagkP5bkNYeXgV8FngC2ARu6zTYAD3bL24D1Sc5KcjGwCnh8vF1L0uI2iSOLC4EHkhz+/s9U1d8k+QawNcmNwDPA2wGqaleSrcBu4CBwc1UdmkDfkrRojT0squo7wM8NqH8PuOYYYzYBm0bcmiTpGBbS1FlJ0gJlWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajplwiLJ2iR7kuxNcuuk+5GkxeSUCIskZwAfA34NWA1cn2T1ZLuSpMXjlAgLYA2wt6q+U1X/C9wPrJtwT5K0aJw56QaGtAx4tu/zDPALR26UZCOwsfv4n0n2jKG3xeB84PlJN7EQ5M4Nk25BR/P387DbczL28hODiqdKWAz6L1BHFao2A5tH387ikmS6qqYm3Yc0iL+f43GqnIaaAVb0fV4O7JtQL5K06JwqYfENYFWSi5P8CLAe2DbhniRp0TglTkNV1cEk7wG+CJwB3FNVuybc1mLiqT0tZP5+jkGqjjr1L0nSDzlVTkNJkibIsJAkNRkWmpOPWdFCleSeJAeSPDHpXhYDw0LH5GNWtMDdC6yddBOLhWGhufiYFS1YVfUV4PuT7mOxMCw0l0GPWVk2oV4kTZBhobkM9ZgVSac/w0Jz8TErkgDDQnPzMSuSAMNCc6iqg8Dhx6w8CWz1MStaKJJ8FngM+MkkM0lunHRPpzMf9yFJavLIQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFdAKSvDbJ/Un+NcnuJA8l2Zjkrybdm3QyGRbSPCUJ8ADw91X1hqpaDbwfuPAE93tK/LljLS7+Ukrz9ybg/6rqzw4XqmpHkh8HrknyOeBSYDvwzqqqJE8DU1X1fJIp4M6qujrJHcBFwErg+ST/ArwOeH33/idVddf4fjTph3lkIc3f4SAY5HLgFnp/B+T1wFVD7O8KYF1V/Wb3+aeAa+k9Kv72JK88oW6lE2BYSKPxeFXNVNUPgB30jhhatlXVf/d9/uuqeqmqngcOcIKnt6QTYVhI87eL3tHAIC/1LR/i5VO+B3n5392rjhjzX0PuQxo7w0Kav78DzkryO4cLSX4e+KU5xjzNywHzttG1Jp1choU0T9V7CudbgF/pps7uAu5g7r/58QfAR5L8I72jBemU4FNnJUlNHllIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqSm/wf03QODNr6OSgAAAABJRU5ErkJggg==\n", + "text/plain": [ 
+ "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "# Our label Distribution (countplot)\n", + "sns.countplot(data['Churn'])" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAu60lEQVR4nO3deZhcZZX48e+p6uo1ve97d/Z0FrInhC2sBhCCosOiAuqAUXBGZ3Rkfj6PPx1n5ue4zYgimyLEEREBNQgSIpCwJIEsZOsknXR3Ot2d9Jbe963e3x91g03TS/V6azmfJ/101b33rTrvU+k697733vOKMQallFLBx2F3AEoppeyhCUAppYKUJgCllApSmgCUUipIaQJQSqkgFWJ3AGORlJRk8vLy7A5DKaX8yr59+84ZY5IHL/erBJCXl8fevXvtDkMppfyKiJwearkOASmlVJDSBKCUUkFKE4BSSgUpTQBKKRWkNAEopVSQ0gSglFJBShOAUkoFKU0ASikVpDQBKKVUkPKrO4FV8HrqnfIR19++JmeaIlEqcHh1BCAiG0SkSESKReT+IdaLiDxgrT8kIssHrHtcRGpF5MigNgkisk1ETlq/4yfeHaWUUt4aNQGIiBN4ELgWKABuE5GCQZtdC8yxfu4BHhqw7glgwxAvfT/wqjFmDvCq9VwppdQ08eYIYDVQbIwpNcb0AE8DGwdtsxHYbDx2A3Eikg5gjHkDaBjidTcCT1qPnwRuGkf8SimlxsmbBJAJVAx4XmktG+s2g6UaY6oArN8pXsSilFJqkniTAGSIZWYc24yLiNwjIntFZG9dXd1kvKRSSim8uwqoEsge8DwLODuObQarEZF0Y0yVNVxUO9RGxphHgUcBVq5cOSlJRQWXka4g0quHVDDzJgHsAeaISD5wBrgVuH3QNluA+0TkaWAN0Hx+eGcEW4A7ge9Zv/80lsCVGs65tm62F9Xx+vFaKho7AEiODiMzNoK5adE4ZKgDVqWCz6gJwBjTJyL3AVsBJ/C4MaZQRDZZ6x8GXgKuA4qBDuCz59uLyG+B9UCSiFQC/9cY80s8X/zPiMjngXLgk5PZMRVcnnqnnN5+N1sOnGV/eSMGiA4PISU6jIb2Hg5XNmOApBlhXD4vmSVZcTgdmghUcBNj/GdUZeXKlUanhAxOo90I1tLVy292n6aisZOLZyexNDuO9NhwxNrb7+13c7y6le1FtVQ1d5EYFcrHl2eRnxQ17Gvq8JAKFCKyzxizcvByvRNY+b2zTZ1s3lVGZ28/n1qTw8KM2A9t43I6WJwZy6KMGI5Xt/Li4Sp+8WYpl8xJ5qqCFEIcWhVFBR9NAMqvtXf3sXlXGSLCFy6dRUZcxIjbiwgL0mOYmRzFi4eqeONkHSdrW7l9dQ6JM8LG9N56cln5O93tUX7LGMNz+ytp7+nnM2tzR/3yHygsxMnHl2fx6TW5NHX08uD2Yk7UtE5htEr5Hk0Aym/tKq3neHUrGxamjenLf6CCjBjuvXw2cRGhPLmzjO1FtfjTeTGlJkITgPJLVc2d/OVINfNSo1
k3K3FCr5UQFcqmy2axOCuWV47W8PSeCnr63JMUqVK+S88BKL/jNoZn91USGerk5hVZ71/pMxGhIQ5uWZlNRmwEWwuraWjvYcOiNNJiwychYqV8kx4BKL9zoKKJquYurluczoywyduHEREunZvMp9fmUtfWzY0/e4vDlc2T9vpK+RpNAMqv9Pa72Xa0hsy4CBZnfvhyz8mwID2GTZfNwuV0cMuju9heNGSVEqX8ng4BKb+yq6Se5s5ePrEia0pLOqTFhPOZC3N5cmcZn3tiDx9blsWKXJ2zSAUWPQJQfqOjp4/tJ2qZlxrNrOQZU/5+MeEu7r5kJjOTZvDc/kp2lZyb8vdUajppAlB+Y3tRHd29bj6yKG3a3jPc5eSOdbkUpMfwwqEqDlQ0Tdt7KzXVNAEov9Da1cvu0nqW5cSRFjO9V+aEOBzcsiqb/KQont1XQVG13jCmAoMmAOUXdpbU0+82rJ9rz8RxLqeDz6zNJS02nKfePU1FQ4ctcSg1mTQBKJ/X3OnZ+1+UGUtS9Njq9UymcJeTu9blMyMshN/t1ZvFlP/TBKB83q93ldHd52b9vGS7Q2FGWAg3r8iiob2HlwtHm/NIKd+mCUD5tI6ePh5/u4x5qdGkx46v3s9km5k0g4tmJbK7tIHi2ja7w1Fq3DQBKJ/29LsVNLT3+MTe/0DXLEwjaUYYz++vpKu33+5wlBoXTQDKZ/X0uXnszVLW5CeQmzj8zF12cDkdfHJFFs2dvWw7WmN3OEqNiyYA5bO2HDxLVXMXX7p8tt2hDCk7IZIVufG8W9ZAc2ev3eEoNWaaAJRPcrsNj+woYX5aNJfOSbI7nGGtn5eCMYYdJ+rsDkWpMdMEoHzS60W1nKxtY9Nlsyal3PNUSYgKZXlOPHv0KED5IU0Ayic9sqOUzLgIrl+Sbncoo9KjAOWvNAEon7PvdCPvljXw95fk43L6/n9RPQpQ/sr3/7pU0HlkRwlxkS5uWZVtdyheO38U8OZJPQpQ/kMTgPIpxbWtvHK0hjvW5hIZ6j/TVSREhbIoM5b95Y309muJCOUfNAEon/LwjlLCXQ7uuijf7lDGbFVeAl29bo6c0WkklX/QBKB8xpmmTv743hluXZVDQlSo3eGM2cykKBKjQtlT1mh3KEp5RROA8hmPvVEKwN2XzrQ5kvEREVbmxlNW305JndYIUr5PE4DyCQ3tPTy9p5yblmWSGecbRd/GY3luPA6BZ/ZU2B2KUqPSBKB8whNvn6K7z82my/xz7/+86HAX89NieHZfpc4XoHyeJgBlu7buPp7cdZprClKZnRJtdzgTtiovnvr2Hv56TIvEKd+mCUDZ7ql3TtPc2csX1/tm0bexmpMaTXpsOM/s1WEg5du8SgAiskFEikSkWETuH2K9iMgD1vpDIrJ8tLYislREdovIARHZKyKrJ6dLyp909fbz2JunuHh2Ekuz4+wOZ1I4RLjxggzeOnmOpo4eu8NRalijJgARcQIPAtcCBcBtIlIwaLNrgTnWzz3AQ160/T7wHWPMUuBb1nMVZJ7dV0ldazdfunyW3aFMquuXpNPnNrxSqMNAynd5c6vlaqDYGFMKICJPAxuBowO22QhsNsYYYLeIxIlIOpA3QlsDxFjtY4GzE++O8nVPvVP+/uN+t+HH24rIjo/gVF0762b5btnnsTpc2Ux8pIvH3iylz20+sO72NTk2RaXUB3kzBJQJDBzMrLSWebPNSG2/AvxARCqAHwL/OtSbi8g91hDR3ro6rbMSSA5VNtHY0cv6eSk+XfJ5PESExZlxlNS10dHdZ3c4Sg3JmwQw1F+m8XKbkdp+EfiqMSYb+Crwy6He3BjzqDFmpTFmZXKyb80Lq8bPbZVPTosJZ16a/1/5M5TFWbG4DRRWtdgdilJD8iYBVAIDyzJm8eHhmuG2GantncDz1uPf4xlqUkGiqLqV2tZuLp2bhCPA9v7Py4gNJyEqVGsDKZ/lTQLYA8wRkXwRCQVuBbYM2mYLcId1Nd
BaoNkYUzVK27PAZdbjK4CTE+yL8iNvnqwjLsLF4sw4u0OZMp5hoFhK6tpo12Eg5YNGPQlsjOkTkfuArYATeNwYUygim6z1DwMvAdcBxUAH8NmR2lovfTfwExEJAbrwXD2kgkBFQwdl9R1cvzgdpyMw9/7PW5wZy44TdRw928Kq/AS7w1HqA7wquG6MeQnPl/zAZQ8PeGyAe71tay1/C1gxlmBVYHjzZB3hLgcrc+PtDmXKpceGkxgVyuEzzZoAlM/RO4HVtKpv66bwbAtr8hMJczntDmfKiQgLM2IpPddGZ0+/3eEo9QGaANS0ervkHA4RLpyZaHco06YgIwa3gaIavRpI+RZNAGraNLb3sO90I0uz44iJcNkdzrTJio8gOiyEo1Wtdoei1Af4z6Sryi8MvNN3sDdO1NHbb7hoduDc8esNhwjz06M5WNlMn84XrHyIHgGoaeE2hnfLGshLjCQtNtzucKZdQXoMPX1uSura7Q5FqfdpAlDTori2jYb2HtYE0dj/QDOTZxDqdHBM7wpWPkQTgJoWu0vriQoLYWFGzOgbByCX08Hc1Bkcq27B7R5cSUUpe2gCUFOusaOHoupWVuXGE+II3v9yC9JjaO3q42Blk92hKAVoAlDTYM+pBoCgvxFqfloMDoFtR3WOAOUbNAGoKdXndrPndCPz06KJjwy1OxxbRYQ6yUuK4hVNAMpHaAJQU+pYVSvt3X1Be/J3sIL0GIpr2yg7p1cDKftpAlBT6r3yRmLCQ5idMsPuUHzC/DTPSfC/HtOjAGU/TQBqyrR193GippWl2fEBW/N/rBKiQpmfFq0JQPkETQBqyhysaMJtYFlOnN2h+JSrFqSyp6yRpo4eu0NRQU4TgJoy71U0khkXQWpM8N35O5KrClLpdxu2F+kc18peWgtITYnqli7ONnXx0SXpXrcZqY5QIFmSGUtydBjbjtZw07JMu8NRQUyPANSUOFDeiENgSVac3aH4HIdDuGpBCjtO1NHdp3MEKPtoAlCTzm0MByqamJsazYwwPcgcylULUmnr7uOd0ga7Q1FBTBOAmnQldW20dPWxLCfwp3wcr4tmJxHucujVQMpWmgDUpDtc2UxYiIP5adF2h+Kzwl1OLpmTzF+P1uCZUlup6acJQE2qfreh8GwLC9JjcDn1v9dIri5I5WxzF4VntUS0sof+hapJVVLXRmdvP4syYu0OxeddOT8Fh8ArhdV2h6KClCYANamOnPEM/8xJ1dIPo0mcEcaqvAQtDqdsowlATZrefrcO/4zRNQvTOF7dyul6LQ6npp9eo6cmzc6Sejp7+1mcqcM/Ixl4w1tnj+c+gP/6y3EunpMMwO1rcmyJSwUf3U1Tk+bFQ2cJC3Fo5c8xSIgKJT02nKM6V7CygSYANSl6+928crRGh3/GYUF6DKfrO2jr7rM7FBVk9C9VTYqdJfU0dfTq8M84FKTHYIDjehSgppkmADUpXj5SRVSoU4d/xiE9Npz4SJcOA6lppwlATVi/27DtaA3r56fo8M84iMj7U0V292pxODV99K9VTdj+8kbOtfXwkYVpdofitxZmxNLnNhyvabU7FBVEvEoAIrJBRIpEpFhE7h9ivYjIA9b6QyKy3Ju2IvJla12hiHx/4t1RdnilsJpQp4PL5yXbHYrfykmMJDo8hCNnmu0ORQWRUe8DEBEn8CBwNVAJ7BGRLcaYowM2uxaYY/2sAR4C1ozUVkQuBzYCS4wx3SKSMpkdU9PDGMPWwhrWzU4kOtxldzh+yyHCwowY9p1upKOnj8hQvUVHTT1vjgBWA8XGmFJjTA/wNJ4v7oE2ApuNx24gTkTSR2n7ReB7xphuAGNM7ST0R02z49WtlDd06PDPJFiUGUtvv+H14zpVpJoe3iSATKBiwPNKa5k324zUdi5wiYi8IyI7RGTVUG8uIveIyF4R2VtXp38YvmZrYTUinglO1MTkJUYxIyyEl45U2R2KChLeJAAZYtngAubDbTNS2xAgHlgLfB
14RkQ+tL0x5lFjzEpjzMrkZB1j9jVbC2tYkRNPcnSY3aH4vfPDQK8dq32/RIRSU8mbBFAJZA94ngWc9XKbkdpWAs9bw0bvAm4gyfvQld0qGjo4VtWiwz+TaFFmLJ29/ew4oSOiaup5kwD2AHNEJF9EQoFbgS2DttkC3GFdDbQWaDbGVI3S9o/AFQAiMhcIBc5NtENq+my16thrApg8eYlRJEaF8tJhnSNATb1RLzUwxvSJyH3AVsAJPG6MKRSRTdb6h4GXgOuAYqAD+OxIba2Xfhx4XESOAD3AnUbnxvML56tZ/u/uctJiwnmr+Jznk1cT5nQI1yxMY8uBM3T19hPuctodkgpgXl1rZox5Cc+X/MBlDw94bIB7vW1rLe8BPj2WYJXvaOvu43R9O5fP16t3J9tHl6Tz23fLee14LdctTrc7HBXA9E5gNS5F1S0YPIXM1ORaOzOR5OgwthwYfKpNqcmlCUCNy9GzLcRFuEiPDbc7lIDjdAg3LMngtaJamjt77Q5HBTBNAGrMevrcnKxtY0FGDENcuasmwY1LM+jpc79/ol2pqaAJQI3ZiZpW+txGh3+m0AVZseQmRuowkJpSmgDUmB2raiHC5SQvMcruUAKWiLDxggx2lpyjtrXL7nBUgNIEoMakt9/N8epWFqRH43To8M9UunFpBm4DLx7S0hBqamgCUGOy51QDnb39OvwzDWanRFOQHsOfdBhITRFNAGpMthZWE+IQZqdE2x1KUNi4NIMDFU2UnWu3OxQVgDQBKK+53YaXC6uZmxpNaIj+15kONy7NQASe319pdygqAOlfsfLaexVN1LR0szBDh3+mS3psBBfPTuK5/Wdwu7VSippcmgCU114+UoXLKcxP0wQwnT6xIoszTZ3sPlVvdygqwGgCUF4xxvCXI9VcPDuJiFAtUDadrilIY0ZYCM/tO2N3KCrAaAJQXik820JlYyfXLtLiZNMtItTJR5ek85cjVbR399kdjgogmgCUV/5ypAqnQ7i6QKd+tMPNK7Lo6OnnL0e0NISaPJoA1KjOD/+snZlAfFSo3eEEpZW58eQmRvLcPr0aSE0er+YDUMHn/KQvADUtXZTWtbMoI/YDy9X0ERFuXp7Fj7edoKKhg+yESLtDUgFAE4Aa1ZGzzQjo5Z/TZLgkG+IQROD3eyv4p2vmTXNUKhDpEJAakTGGQ5XN5CZGEh3usjucoBYXGcqlc5J5Zm8lff1uu8NRAUCPANSIqlu6qGvtZt3SDLtDUUBmXAQ7TtTxby8cZf6geky3r8mxKSrlr/QIQI3oYEUzDoFFGbF2h6KABekxzAgLYU9Zg92hqACgCUANy20MhyqbmJMSTVSYHiz6AqdDWJEbT1FNq04XqSZME4AaVkVDB02dvSzJ0r1/X7IyNx63gX2nG+0ORfk5TQBqWAcqmghxiNb+9zGJM8KYmRzFvtMNuI0WiFPjpwlADanfbThyppkF6TGEubT2j69ZnZdAY0cvxbVtdoei/JgmADWkkro22nv6uUCHf3xSQUYMUWEh7C7VCqFq/DQBqCEdrGgi3OVgbqrO/OWLQhwOVufFU1TdSkN7j93hKD+lCUB9SGtXL0fONrM4M5YQp/4X8VWr8xMRgXd0ngA1TvrXrT7khYNV9PYbVuYm2B2KGkFshIuC9Bj2ljXS06d3Bqux0wSgPuR3eytIiQ4jKz7C7lDUKNbOTKSzt59DlU12h6L8kCYA9QFF1a0crGhiZV4CImJ3OGoU+UlRpESHsau0HqOXhKox0gSgPuCZvRW4nMKy7Di7Q1FeEBEunJVIVXOX3himxkwTgHpfT5+bP7x3hqsLUrX0gx9Zmh1HuMvB42+fsjsU5We8SgAiskFEikSkWETuH2K9iMgD1vpDIrJ8DG2/JiJGRJIm1hU1UX89VkNDew+fXJltdyhqDMJCnKzOS+TlI9VUNHTYHY7yI6MmABFxAg8C1wIFwG0iUjBos2uBOdbPPcBD3rQVkWzgakCnmfIBT+
+pID02nEvnJNsdihqjC2cl4hDhl2/pUYDynjdHAKuBYmNMqTGmB3ga2Dhom43AZuOxG4gTkXQv2v438C+Anr2yWUldG2+cqOOWVdk4HXry19/ERri44YIMntlbQXOHVglV3vEmAWQCFQOeV1rLvNlm2LYiciNwxhhzcKQ3F5F7RGSviOytq6vzIlw1Hpt3luFyCp9ak2t3KGqc/v6SfDp6+vntHj2gVt7xJgEMtTs4eI99uG2GXC4ikcA3gW+N9ubGmEeNMSuNMSuTk3VoYiq0dPXy7L5KbliSQXJ0mN3hqHFamBHLulmJPPF2md4YprziTQKoBAaeFcwCznq5zXDLZwH5wEERKbOW7xeRtLEErybH7/dW0t7Tz2cvyrc7FDVBd18yk+qWLv58aPCfqFIf5k0C2APMEZF8EQkFbgW2DNpmC3CHdTXQWqDZGFM1XFtjzGFjTIoxJs8Yk4cnUSw3xlRPVseUd/rdhid3lrEiN57FWvnT7102N5m5qTN4ZEep3himRjVqAjDG9AH3AVuBY8AzxphCEdkkIpuszV4CSoFi4DHgSyO1nfReqHF7/Xgt5Q0dfPaiPLtDUZPA4RA2XTaLoppWXi+qtTsc5eO8utvHGPMSni/5gcseHvDYAPd623aIbfK8iUNNvl/tPEVaTDgfWaijb4Hihgsy+NErJ3hoewlXzE+1Oxzlw/RO4CB2qLKJt4vruXNdHi4t+xwwXE4Hd1+Sz56yRvaUNdgdjvJh+lcfxH7+egkx4SF8em2O3aGoSXbLqhwSokJ5eHuJ3aEoH6YJIEidrGnl5cJq7lqXR3S4y+5w1CSLCHVy17o8Xj1ey/HqFrvDUT5KK34FqYd2lOByCtHhLp56R28cCkR3XJjLwztKeGh7CT+5dZnd4SgfpAkgCFU0dPCnA2dZm5+gVT8DyFCJfHlOPFsOnOWfr55HTmKkDVEpX6ZDQEHokTdKcAhcrEXfAt7Fs5NwOISH39BzAerDNAEEmarmTp7ZW8knVmQRG6Fj/4EuJsLFipx4nt1bSU1Ll93hKB+jCSDI/Oy1Yowx3Hv5bLtDUdPk0rnJ9Lnd/OLNUrtDUT5GB4AD2OAx4cb2Hp5+t4KVefG8ceKcTVGp6ZYQFcqNF2Twm3fKuffy2cRFhtodkvIRegQQRF4vqkUE1s9LsTsUNc2+uH42HT39/OrtMrtDUT5EE0CQqG/rZn95I6vzE3TsPwjNS4vm6oJUnthZRlt3n93hKB+hCSBIvHa8FqdDuGyuXvkTrL60fhbNnb38Vu/7UBZNAEGgpqWLAxVNrJ2ZqHf9BrFlOfFcNDuRx94spau33+5wlA/Qk8BBYNvRGkJDHFym1/0HrfMXBMxPi+Ht4nq+8dwh1uQnvr/+9jVaDyoY6RFAgKto6OBoVQuXzEkmUu/6DXozk6LIjo/gjRN19Lt1wphgpwkggBljeLmwmqiwEC6anTh6AxXwRIT181Jo7Ojl8Jkmu8NRNtMEEMCKa9s4da6dK+YlExbitDsc5SPmpUWTGhPG9qI63DptZFDTBBCg3G7D1qPVxEe6WJWfYHc4yoc4RFg/N4Xa1m6OntVS0cFME0CA+vPhKs42dXHVglRCHPoxqw9anBVLYlQo24tqdfL4IKbfDAGop8/ND7cWkRYTzgXZcXaHo3yQQ4T185I529zFiZpWu8NRNtEEEIB+885pyhs62LAoDYeI3eEoH7U0O564CBevHdejgGClCSDAtHb18tPXilk3K5E5KTPsDkf5MKdDuHRuMhWNnewqqbc7HGUDTQAB5pEdpTS09/Cv1y5AdO9fjWJFbjzR4SH85NWTdoeibKAJIIDUtHTxi7dKueGCDBZnxdodjvIDLqeDy+Ym886pBnYWa4nwYKMJIIB8/+Ui3G74+jXz7A5F+ZFVeQmkx4bzw1eK9FxAkNEEECAOVzbz3P5KPntRnk7+rcbE5XRw3xWz2V/exPaiOrvDUdNIE0AAMMbw3RePkhgVyr1X6FSPau
w+uSKb7IQIfrRNjwKCiSaAALC1sJp3TzXw1avnEqPlntU4hIY4+Mcr53LkTAtbC2vsDkdNEy0P6ec27yzjf149SUp0GMZ8eB5gpbx109IMfv56MT96pYirFqQQ4tT9w0Cnn7Cfe6v4HA3tPVy/OB2nQy/7VOMX4nTwLxvmc7K2jafe1R2JYKAJwI9VNHTw2vFaFmbEMCc12u5wVAD4yMJU1s1K5MfbTtDU0WN3OGqKeZUARGSDiBSJSLGI3D/EehGRB6z1h0Rk+WhtReQHInLc2v4PIhI3KT0KIt95oRCHCNcvTrc7FBUgRIRv3VBAS2cv//NXvTks0I2aAETECTwIXAsUALeJSMGgza4F5lg/9wAPedF2G7DIGLMEOAH864R7E0ReKazmr8dquXJBCnGRoXaHowLI/LQYbl+Tw693n+akFooLaN4cAawGio0xpcaYHuBpYOOgbTYCm43HbiBORNJHamuMecUY02e13w1kTUJ/gkJHTx/feeEo81KjWTcrye5wVAD6p6vnERXq5N/+fFQvCw1g3iSATKBiwPNKa5k323jTFuBzwF+GenMRuUdE9orI3ro6vUkFPHf8nmnq5Ls3LdITv2pKJESF8s/XzOPNk+d4bv8Zu8NRU8SbBDDUN8zgXYLhthm1rYh8E+gDfjPUmxtjHjXGrDTGrExOTvYi3MC2s/gcT+ws4651eazWmb7UFPrM2lxW5yXwnRcKqW7usjscNQW8uQ+gEsge8DwLOOvlNqEjtRWRO4GPAlcaPc4cVUtXL19/9hD5SVF8Y8N8u8NRAWS4+0cumZPEoTNN/J8/HOaXd67UCrMBxpsjgD3AHBHJF5FQ4FZgy6BttgB3WFcDrQWajTFVI7UVkQ3AN4AbjTEdk9SfgPbdF45S1dzJj/7uAiJCdZJ3NfUSZ4TxLx+Zz2vHa3leh4ICzqgJwDpRex+wFTgGPGOMKRSRTSKyydrsJaAUKAYeA740Ulurzc+AaGCbiBwQkYcnr1uB5+Uj1fx+XyWbLpvF8px4u8NRQeSudXmszkvg2y8UUl6v+2qBRPxp5GXlypVm7969docx7U6da+fGn75FXlIUz37xQsJC/rb3r6Uf1FS7fU0OFQ0dXP/Am+QkRvLspnWEu/QI1J+IyD5jzMrBy/VOYB/X2dPPF/93H06n8NCnl3/gy1+p6ZKdEMl/37KUI2da+M4LhaM3UH5Bi8H5MGMM3/zDYYqqW7lzXR5vnNAZm9T0G3iUedncZH77bgW9fYblufHcvibHxsjURGkC8GFP7Czj+ffOcOWCFOZqrR/lA65akEp5Qwd/PHCG5Ogwu8NRE6RDQD5q29Eavvvno1y1IJXL56XYHY5SADgdwm2rc4gOD2Hz7tNUNOhJYX+mCcAHHaps4h9++x6LMmN54LalOPTaa+VDZoSFcOeFefS73XzuiT00d/baHZIaJ00APqaysYPPPbGXxBmh/PLOVUSG6iid8j0pMeF8ak0uZfXtfOk3++jpc9sdkhoHvQzURzz1TjmtXb089mYpbd19bLp0Fikx4XaHpdSIwkIc/PPvD3L94nQeuG2Z1qbyUcNdBqq7lz6is6efJ3aW0dzZy+cuytcvf+UXbl6RRUN7D//x0jGiw0P4fx9frOUi/IgmAB/Q0dPH5l1l1LZ085kLc8lNjLI7JKW8dvelM2nq7OHB10uIjXBx/7XzNQn4CU0ANuvq7ecLv95HeUMHt67O0cs9lV/62jXzaOns45E3Sgl3Ofnq1XPtDkl5QROAjbr7PHf5vnnyHDcvz2JxZqzdISk1JgNvEpuXFs2KnHh+8upJCs8284s7V9kYmfKGXgVkk95+N/c99R6vF9Xxnx9bzIpcLfCm/JtDhI8tz2RZdhx/PVbLg68X2x2SGoUmABv09bv5ytMH2Ha0hm/fUKC306uA4RDh5hVZLM2O4wdbi/jpqzqxvC/TIaBp1tfv5qvPHOTFw1X8n+vmc9dF+XaHpNSkcojwiRVZCPCjbS
fYd7qRqwtSP3RiWHd87KcJYBr1uw1f+/1BXjh4lm9smM89l86yOySlpsT5I4EQp4PtJ+ro7Xdz3eJ0vTrIx2gCmCb/u/s0z+2r5L2KJq4pSCU2wqW1/FVAc4hw09IMQpzC2yX1dPe52bg0U28W8yGaAKZBv9vw/H7Pl/9VC1JYr8XdVJAQET66OJ0Il5PXjtfS3tPPrauycTn19KMv0AQwiYbao3cbwx/2n2F/eRNXLkjhivmpNkSmlH1EhKsWpBIVFsKfD57l8bdPccfaPLvDUuhVQFPKbQx/eO8M+8obuXJ+Clfql78KYhfOTOSWVdlUNnTy0I5iTp1rtzukoKcJYIq4jeH5/WfYd7qRK+ancOUC/fJXaklWHJ+7OJ+Onn5uevBtdhbrLHd20gQwBdzG8Ny+SvaXN3LlghSu0i9/pd6XnxTFl9bPJiU6jDsef5fH3zqFP1UlDiSaACZZv9vw7L7zJ3xTddhHqSEkRIXy/JfWsX5eCv/256PcvXkvje09docVdDQBTKI+t5un95RzwLrU84r5erWPUsOJDnfx2B0r+NZHC9hxoo7rHnhTh4SmmSaASdLV289vdpdTeLaF6xen66WeSnlBRPjcxfk8/8WLCHc5uf0X7/DV3x2grrXb7tCCgs4INglau3r5wq/3sauknhuXZrAmP9HukJTyO739brYX1fLGiXNEhTn5ylVzuX1NDuEup92h+b3hZgTTBDBBNS1dfPZXeyiqaeXjyzJZlqNVPZWaiNrWLvaUNfB2cT0p0WFsumyWJoIJ0gQwBYprW7nz8T00dvTw0KdXcKax0+6QlAoIt6/JYVdJPT959QS7SxuIi3Rx8/IsbludzewUnTRprHRO4En2+vFavvK7A7icDn53z4UszorV2j5KTaILZyVy4awLeae0ns27T7N5Vxm/fOsUy3LiuH5xOhsWpZEVH2l3mH5NE8AY9bsN/73tBD97vZiC9Bge+cwKshP0P6FSU2XNzETWzEzkXFs3z+2r5MmdZfz7i8f49xePkREXztyUaOakRpOTEInTIVpmegw0AYxBeX0H9z9/iJ0l9dyyMpvvbFyo45JKTZOkGWF84bJZRIe7qG/r5sjZFoqqW3jjZB3bT9QR6nSQnRBBbWsXq/ISWJIVS3S4y+6wfZqeA/BCV28/D+8o4efbSwhxCN++cSF/tzL7Q9vpEJBS06+rt5/SujaK69o5Xd9OdUsXxoAIzEyK4oKsOAoyYliYEUtBegyxkcGXFPQcwDi0dvXyh/fO8NibpVQ0dHLDBRl887oFpMWG2x2aUsoS7nJSkBFLQUYsANcvSee98kYOVTZzqLKJN4vP8fx7Z97fPjbCRVpMOKkx4aTGhJEaE05ydBh3rsuzqQf28eoIQEQ2AD8BnMAvjDHfG7RerPXXAR3AXcaY/SO1FZEE4HdAHlAG/J0xpnGkOKbjCKCjp489ZY389WgNz++vpL2nnyVZsdy/YT7rZieN2FaPAJTyTa1dvVQ1d1HV3EVNSxfVzV3UtXbTb33/CZCVEMHMpBnMTI4iNyGSnMRIsuMjSYsNZ0ZYiF/PZjbuIwARcQIPAlcDlcAeEdlijDk6YLNrgTnWzxrgIWDNKG3vB141xnxPRO63nn9jIp0cidtt6Ol309Pvpqunn+bOXlq6ejnX1kN5fQdl9e2cqGnlQEUTvf2G0BAHH12Szh0X5rE0O26qwlJKTYPocBfR4S7mpv7tEtJ+t+FcWze1rd3UtHQRFRZCSW0b755qoLO3/wPtI1xOUmLCSIgKJS7CRVxkKNHhIUSFhTAjLITIUCcRLicRoU7CXdZPiIMwl5OwEAdhIQ5CQxyEOh24nA5CnILL6cDpEEIcYlty8WYIaDVQbIwpBRCRp4GNwMAEsBHYbDyHE7tFJE5E0vHs3Q/XdiOw3mr/JLCdKUoA3/rTETbvOj3iNnGRLvKTovj8xTNZNyuRlXnxRIbqCJlSgcrpEGsYKJzFmZ7howtnJmKMoa27j8
b2Hho6emnt6qW1q48W63dtSzcdPX109brp6XO/fxQxGfE4BATB+sf5vCAIj3xmBZfOTZ6U9zrPm2+4TKBiwPNKPHv5o22TOUrbVGNMFYAxpkpEhiyeIyL3APdYT9tEpMiLmMfsNHAQ+OP4XyIJCLRKVoHWp0DrDwRenwKtPzBJfbrs3yfUPHeohd4kgKGOTQanvOG28abtiIwxjwKPjqWNHURk71BjbP4s0PoUaP2BwOtToPUHfLtP3lQDrQQGXvOYBZz1cpuR2tZYw0RYv2u9D1sppdREeZMA9gBzRCRfREKBW4Etg7bZAtwhHmuBZmt4Z6S2W4A7rcd3An+aYF+UUkqNwahDQMaYPhG5D9iK51LOx40xhSKyyVr/MPASnktAi/FcBvrZkdpaL/094BkR+TxQDnxyUns2/Xx+mGocAq1PgdYfCLw+BVp/wIf75Fd3AiullJo8OiOYUkoFKU0ASikVpDQBTJCIbBCRIhEptu5o9ksiUiYih0XkgIjstZYliMg2ETlp/fbp6c5E5HERqRWRIwOWDdsHEflX63MrEpGP2BP18Ibpz7dF5Iz1OR0QkesGrPPp/gCISLaIvC4ix0SkUET+0Vrul5/TCP3xj8/JGKM/4/zBc2K7BJgJhOK5l6zA7rjG2ZcyIGnQsu8D91uP7wf+y+44R+nDpcBy4MhofQAKrM8rDMi3Pken3X3woj/fBr42xLY+3x8rznRgufU4Gjhhxe6Xn9MI/fGLz0mPACbm/TIZxpge4Hypi0CxEU+ZDqzfN9kXyuiMMW8ADYMWD9eHjcDTxphuY8wpPFewrZ6OOL01TH+G4/P9Ac9d/8YqFGmMaQWO4akY4Jef0wj9GY5P9UcTwMQMVwLDHxngFRHZZ5XfgEHlOoAhy3X4uOH64M+f3X0icsgaIjo/VOJ3/RGRPGAZ8A4B8DkN6g/4weekCWBiJlzqwodcZIxZjqey670icqndAU0xf/3sHgJmAUuBKuBH1nK/6o+IzACeA75ijGkZadMhlvlcv4boj198TpoAJsabMhl+wRhz1vpdC/wBz2FpIJTrGK4PfvnZGWNqjDH9xhg38Bh/Gz7wm/6IiAvPl+VvjDHPW4v99nMaqj/+8jlpApgYb8pk+DwRiRKR6POPgWuAIwRGuY7h+rAFuFVEwkQkH89cFu/aEN+YnP+StHwMz+cEftIfERHgl8AxY8yPB6zyy89puP74zedk91l0f//BUwLjBJ6z+d+0O55x9mEmnisTDgKF5/sBJAKvAiet3wl2xzpKP36L53C7F8+e1udH6gPwTetzKwKutTt+L/vza+AwcAjPl0m6v/THivFiPEMeh4AD1s91/vo5jdAfv/ictBSEUkoFKR0CUkqpIKUJQCmlgpQmAKWUClKaAJRSKkhpAlBKqSClCUD5NRH5mIgYEZlvcxw3iUiB9fgCETkwYN1tItJh3TCEiCwWkUPjeI/1IvLnSQtaBT1NAMrf3Qa8hecmPDvdhKfSI3iu/849f3MdsA44jqdOzPnnb09rdEoNQROA8ltW/ZWL8Nwgdau1zCkiP7TmNjgkIl+2lq8SkZ0iclBE3hWRaBEJF5FfWdu+JyKXW9veJSI/G/A+fxaR9dbjNhH5D+t1dotIqoisA24EfmDt+efjuUt8jfUSK4AH8XzxY/3ead2B/biI7LHef+OAPvzAWn5IRL4wRN9XWW1mishlA+rOvzcg8Sg1Ik0Ayp/dBLxsjDkBNIjIcuAePF/Ay4wxS4DfWGU6fgf8ozHmAuAqoBO4F8AYsxjPkcSTIhI+yntGAbut13kDuNsYsxPP3Z5fN8YsNcaUADuBdVZpDTewnQ8mgLfx3BH6mjFmFXA5ngQShSehNVvLVwF3W2UDALASzsPARmNMKfA14F5jzFLgEqtvSo1KE4DyZ7fhmYMB6/dteL7cHzbG9AEYYxqAeUCVMWaPtazFWn8xnlv2McYcB04Dc0d5zx7g/Dj8PiBvmO3exvNFvxrYYyWF2S
KSDMywvrivAe63jhq2A+FAjrX8Dmv5O3jKJMyxXncB8ChwgzGmfMB7/VhE/gGIO993pUYTYncASo2HiCQCVwCLRMTgmZ3N4PlSHlzfRIZYdn75UPr44M7RwKOCXvO3+in9DP83tBvP3vvFwC5rWSWeoaqdA97/ZmNM0QeC8hQY+7IxZuug5evx1AYKx3M+4XwF1++JyIt4atDsFpGrrISm1Ij0CED5q08Am40xucaYPGNMNnAK2A9sEpEQ8Mw1i+cEbIaIrLKWRVvr3wA+ZS2bi2fvuwjP9JhLRcQhItl4N2NTK54pAYH3Z4eqAO7ibwlgF/AV/pYAtgJftr7wEZFlA5Z/ccBVQ3OtoSGAJuB64D8HnJeYZYw5bIz5L2AvYOsVUcp/aAJQ/uo2PPMWDPQckAGUA4dE5CBwu/FM13kL8FNr2TY8e9E/B5wichjPOYK7jDHdeIZUTuG5mueHeJLKaJ4Gvm6dhJ1lLXsbCDPGnJ8BaheeyqvnE8B3AZcV6xHrOcAvgKPAfmv5Iww40jDG1AA3AA+KyBrgKyJyxOpbJ/AXL+JVSquBKqVUsNIjAKWUClKaAJRSKkhpAlBKqSClCUAppYKUJgCllApSmgCUUipIaQJQSqkg9f8BB+HCWThoo6EAAAAASUVORK5CYII=\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "# Example EDA\n", + "sns.distplot(data.AccountWeeks)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Let us perform some analysis with the data's features**\n", + "* Group the data by whether the customer wil churn and analyse their different features to know more about how the data behave." + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEGCAYAAACKB4k+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAV9UlEQVR4nO3dfbCedZ3f8feHECEijAIRw4FswBPrBjtGezZDx5ku6mKI7TayO25jq2Ss0zhTiLF1OwP+o+xMHGfHh9JUXWJlpZ2tNPvgGJ+6C1GLTtWY0AgEpNwLCZyQJiGy5SE0Mcm3f5w7Fyfk5JxjkvtcJ7nfr5kz9/37Xdfvur+HOeRz/67HVBWSJAGc1XYBkqTpw1CQJDUMBUlSw1CQJDUMBUlS4+y2CzgZF198cc2bN6/tMiTptLJ58+anq2r2WMtO61CYN28emzZtarsMSTqtJNl+vGXuPpIkNQwFSVLDUJAkNQwFSVLDUJAkNQwFSVKjZ6GQ5NwkG5P8PMnWJLd2+z+ZZEeSLd2fd48ac0uSTpJHkizuVW2SpLH18jqF/cA7qur5JDOBHyX5bnfZ56vqM6NXTrIAWAZcBVwK3JPkDVV1qIc1ShrHmjVr6HQ6rdawY8cOAAYGBlqtA2BwcJCVK1e2XUZP9WymUCOe7zZndn/Ge3jDUuCuqtpfVY8DHWBRr+qTdHp48cUXefHFF9suo2/09IrmJDOAzcAg8IWq+mmSJcBNSW4ANgEfq6pngAHgJ6OGD3f7Xr7NFcAKgLlz5/ayfKnvTYdvxatWrQLgtttua7mS/tDTA81VdaiqFgKXAYuSvAn4EvB6YCGwE/hsd/WMtYkxtrm2qoaqamj27DFv3SFJOkFTcvZRVf0d8APguqra1Q2Lw8CXeWkX0TBw+ahhlwFPTUV9kqQRvTz7aHaSV3ffzwJ+B/hFkjmjVrseeLD7fj2wLMk5Sa4A5gMbe1WfJOlYvTymMAe4s3tc4SxgXVV9K8l/SbKQkV1D24APA1TV1iTrgIeAg8CNnnkkSVOrZ6FQVfcDbxmj/wPjjFkNrO5VTZKk8XlFsySpYShIkhqGgiSpYShIkhqGgiSpYShIkhqGgiSpYShIkhqGgiSpYSgIgL179/KRj3yEvXv3tl2KpBYZCgLg9ttv5/7772ft2rVtlyKpRYaC2Lt3L/fccw8Ad999t7MFqY8ZCuL222/n8OHDABw+fNjZgtTHDAWxYcOGo9pHZg2S+o+hIJKM25bUP3r5kB1N0po1a+h0Oq19/vnnn88zzzxzVPvIw9LbMDg4OC0eGC/1I
2cKYs6cOeO2JfUPZwrTwHT4Vnz99dfzzDPPsHjxYm655Za2y5HUEkNBwMjs4MCBA6xYsaLtUiS1qGe7j5Kcm2Rjkp8n2Zrk1m7/hUnuTvJo9/U1o8bckqST5JEki3tVm441c+ZMBgcHueiii9ouRVKLenlMYT/wjqp6M7AQuC7J1cDNwIaqmg9s6LZJsgBYBlwFXAd8McmMHtYnSXqZnoVCjXi+25zZ/SlgKXBnt/9O4D3d90uBu6pqf1U9DnSARb2qT5J0rJ6efZRkRpItwG7g7qr6KXBJVe0E6L6+trv6APDkqOHD3b6Xb3NFkk1JNu3Zs6eX5UtS3+lpKFTVoapaCFwGLErypnFWH+uKqRpjm2uraqiqhmbPnn2KKpUkwRRdp1BVfwf8gJFjBbuSzAHovu7urjYMXD5q2GXAU1NRnyRpRC/PPpqd5NXd97OA3wF+AawHlndXWw58o/t+PbAsyTlJrgDmAxt7VZ8k6Vi9vE5hDnBn9wyis4B1VfWtJD8G1iX5EPAE8F6AqtqaZB3wEHAQuLGqDvWwPknSy/QsFKrqfuAtY/TvBd55nDGrgdW9qkmSND7vfSRJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqRGL++SKukErVmzhk6n03YZ08KR/w6rVq1quZLpYXBwkJUrV/Zs+4aCNA11Oh0e3fq/mPsq7x7/il+N7NDYv31Ty5W074nnZ/T8MwwFaZqa+6pDfPytz7ZdhqaRT913Qc8/w2MKkqSGoSBJahgKkqSGoSBJavQsFJJcnuT7SR5OsjXJqm7/J5PsSLKl+/PuUWNuSdJJ8kiSxb2qTZI0tl6efXQQ+FhV3ZfkfGBzkru7yz5fVZ8ZvXKSBcAy4CrgUuCeJG+oKs/Jk6Qp0rOZQlXtrKr7uu+fAx4GBsYZshS4q6r2V9XjQAdY1Kv6JEnHmpJjCknmAW8BftrtuinJ/UnuSPKabt8A8OSoYcOMESJJViTZlGTTnj17elm2JPWdnodCklcBfwl8tKqeBb4EvB5YCOwEPntk1TGG1zEdVWuraqiqhmbPnt2boiWpT/U0FJLMZCQQ/qyq/gqgqnZV1aGqOgx8mZd2EQ0Dl48afhnwVC/rkyQdrZdnHwX4CvBwVX1uVP+cUatdDzzYfb8eWJbknCRXAPOBjb2qT5J0rF6effQ24APAA0m2dPs+DrwvyUJGdg1tAz4MUFVbk6wDHmLkzKUbPfNIkqZWz0Khqn7E2McJvjPOmNXA6l7VJEkan1c0S5IahoIkqWEoSJIahoIkqWEoSJIahoIkqWEoSJIahoIkqWEoSJIahoIkqWEoSJIavbwhnqQTtGPHDl54bgafuu+CtkvRNLL9uRmct2NHTz/DmYIkqeFMQZqGBgYG2H9wJx9/67Ntl6Jp5FP3XcA5A+M96v7kOVOQJDX6eqawZs0aOp1O22VMC0f+O6xatarlSqaHwcFBVq5c2XYZ0pTr61DodDpsefBhDr3ywrZLad1ZBwqAzY/tarmS9s3Y98u2S5BaM6lQSHIe8GJVHU7yBuCNwHer6lc9rW4KHHrlhbz4xne3XYamkVm/OO7DAaUz3mSPKdwLnJtkANgAfBD4aq+KkiS1Y7KhkKraB/wesKaqrgcWjDsguTzJ95M8nGRrklXd/guT3J3k0e7ra0aNuSVJJ8kjSRaf6C8lSToxkw6FJP8Q+BfAt7t9E+16Ogh8rKp+E7gauDHJAuBmYENVzWdk1nFz9wMWAMuAq4DrgC8mmfHr/DKSpJMz2VBYBdwCfL2qtia5Evj+eAOqamdV3dd9/xzwMDAALAXu7K52J/Ce7vulwF1Vtb+qHgc6wKJf43eRJJ2kSR1orqp7GTmucKT9GPCRyX5IknnAW4CfApdU1c7udnYmeW13tQHgJ6OGDXf7Xr6tFcAKgLlz5062BEnSJEz27KM3AH8IzBs9pqreMYmxrwL+EvhoVT2b5
LirjtFXx3RUrQXWAgwNDR2zXJJ04iZ7ncKfA38C/Cfg0GQ3nmQmI4HwZ1X1V93uXUnmdGcJc4Dd3f5h4PJRwy8DnprsZ0mSTt5kjykcrKovVdXGqtp85Ge8ARmZEnwFeLiqPjdq0Xpgeff9cuAbo/qXJTknyRXAfGDjpH8TSdJJm+xM4ZtJ/jXwdWD/kc6qGu/Sz7cBHwAeSLKl2/dx4NPAuiQfAp4A3tvd1tYk64CHGDlz6caqmvSsRJJ08iYbCke+2f+7UX0FXHm8AVX1I8Y+TgDwzuOMWQ2snmRNkqRTbLJnH13R60IkSe0bNxSSvKOqvpfk98ZaPurgsSTpDDDRTOG3ge8BvzvGsgIMBUk6g4wbClX1ie7rB6emHElSmybaffRvx1v+slNNJUmnuYl2H30G2AJ8l5FTUY97ObIk6fQ3USi8lZE7l/5jYDPwNUbucHpG3F5ix44dzNj3f32oio4yY99eduw42HYZUivGvaK5qrZU1c1VtZCRq5OXAg8l+adTUZwkaWpN9oZ4sxm5y+nfZ+QeRbvHH3F6GBgY4P/sP9vHceoos37xHQYGLmm7DKkVEx1o/iDwz4Bzgb8A/qCqzohAkCQda6KZwleABxi5R9Fi4F2jb31dVe5GkqQzyESh8PYpqUKSNC1MdPHa/wBI8k+A71TV4SmpSpLUisk+T2EZ8GiSP07ym70sSJLUnkmFQlW9n5Gzj/4W+NMkP06yIsn5Pa1OkjSlJjtToKqeZeTRmncBc4DrgfuSrOxRbZKkKTapUEjyu0m+zsgdU2cCi6pqCfBm4A97WJ8kaQpN9slr7wU+X1X3ju6sqn1J/uWpL0uS1IbJPnnthnGWbTh15UiS2jTZ3UdXJ/lZkueTHEhyKMmzE4y5I8nuJA+O6vtkkh1JtnR/3j1q2S1JOkkeSbL4xH8lSdKJmuzuo//IyGmpfw4MATcAgxOM+Wp33H9+Wf/nq+ozozuSLOhu/yrgUuCeJG+oqkOTrE864zzx/Aw+dd8FbZfRul37Rr67XvJKL5N64vkZzO/xZ0w2FKiqTpIZ3X+o/zTJ/5xg/XuTzJvk5pcCd1XVfuDxJB1gEfDjydYnnUkGByf6ztU/DnQ6AJzzG/43mU/v/zYmGwr7krwC2JLkj4GdwHkn+Jk3JbkB2AR8rKqeAQaAn4xaZ7jbd4wkK4AVAHPnzj3BEqTpbeVKz/Q+YtWqVQDcdtttLVfSHyZ7ncIHuuveBLwAXA78/gl83peA1wMLGQmWz3b7x3qi25gP8qmqtVU1VFVDs2fPPoESJEnHM9mzj7Z3n6lAVd16oh9WVbuOvE/yZeBb3eYwI0FzxGXAUyf6OZKkEzPR8xQCfIKRGUKAs5IcBNZU1R/9uh+WZE5V7ew2rweOnJm0HvivST7HyIHm+cDGX3f7J2LGvl/6OE7grP83cjLZ4XM9sDlj3y8BH7Kj/jTRTOGjwNuA36qqxwGSXAl8Kcm/qarPH29gkq8B1wAXJxlmJFyuSbKQkV1D24APA1TV1iTrgIeAg8CNU3HmkQfzXtLpPAfA4JX+YwiX+LehvjVRKNwAXFtVTx/pqKrHkrwf+BvguKFQVe8bo/sr46y/Glg9QT2nlAfzXuLBPEkw8YHmmaMD4Yiq2sPIPZAkSWeQiULhwAkukySdhibaffTm49zOIsC5PahHktSiiR7HOWOqCpEktW/SD9mRJJ35DAVJUsNQkCQ1DAVJUsNQkCQ1DAVJUsNQkCQ1DAVJUsNQkCQ1DAVJUsNQkCQ1DAVJUsNQkCQ1DAVJUsNQkCQ1ehYKSe5IsjvJg6P6Lkxyd5JHu6+vGbXsliSdJI8kWdyruiRJx9fLmcJXgete1nczsKGq5gMbum2SLACWAVd1x3wxiQ/4kaQp1rNQqKp7gV++rHspcGf3/Z3Ae0b131VV+6vqcaADLOpVbZKksU31MYVLqmonQPf1td3+AeDJUesNd/skSVNouhxozhh9NeaKyYokm5Js2rNnT4/Lk
qT+MtWhsCvJHIDu6+5u/zBw+aj1LgOeGmsDVbW2qoaqamj27Nk9LVaS+s1Uh8J6YHn3/XLgG6P6lyU5J8kVwHxg4xTXJkl97+xebTjJ14BrgIuTDAOfAD4NrEvyIeAJ4L0AVbU1yTrgIeAgcGNVHepVbZKksfUsFKrqfcdZ9M7jrL8aWN2reiRJE5suB5olSdOAoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJahgKkqSGoSBJapzdxocm2QY8BxwCDlbVUJILgf8GzAO2AX9QVc+0UZ8k9as2Zwpvr6qFVTXUbd8MbKiq+cCGbluSNIWm0+6jpcCd3fd3Au9prxRJ6k9thUIBf5Nkc5IV3b5LqmonQPf1tWMNTLIiyaYkm/bs2TNF5UpSf2grFN5WVW8FlgA3JvlHkx1YVWuraqiqhmbPnt27CvvMvn37eOCBB+h0Om2XIqlFrYRCVT3Vfd0NfB1YBOxKMgeg+7q7jdr61fbt2zl8+DC33npr26VIatGUn32U5DzgrKp6rvv+XcAfAeuB5cCnu6/fmOra2rJmzZpWv6Hv27ePAwcOAPDkk0+yYsUKZs2a1Vo9g4ODrFy5srXPl/pZGzOFS4AfJfk5sBH4dlX9d0bC4NokjwLXdtuaAtu3bz+qvW3btnYKkdS6KZ8pVNVjwJvH6N8LvHOq65kO2v5WfM011xzVPnDgALfddls7xUhq1XQ6JVWS1DJDQZLUMBQkSQ1DQZLUMBTEeeedN25bUv8wFMQLL7wwbltS/zAUxNlnnz1uW1L/MBTEwYMHx21L6h+Ggpg3b964bUn9w1AQN9xww1Ht5cuXt1SJpLYZCuKOO+4Yty2pfxgKYnh4+Kj2k08+2VIlktpmKIgk47Yl9Q9DQVx99dXjtiX1D0NBnH/++Ue1L7jggpYqkdQ2Q0H88Ic/PKp97733tlSJpLYZCuKiiy4aty2pfxgKYufOneO2JfUPQ0GS1Jh2oZDkuiSPJOkkubntevrBpZdeOm5bUv+YVqGQZAbwBWAJsAB4X5IF7VZ15tuzZ8+4bUn9Y7rdI3kR0KmqxwCS3AUsBR5qtaoz3Ote9zq2bdt2VFsCWLNmDZ1Op9Uajnz+qlWrWq0DYHBwkJUrV7ZdRk9Nq5kCMACMvsfCcLevkWRFkk1JNvmN9tTYtWvXuG2pTbNmzWLWrFltl9E3pttMYaz7K9RRjaq1wFqAoaGhGmN9/ZquvfZavvnNb1JVJOFd73pX2yVpmjjTvxXrWNNtpjAMXD6qfRnwVEu19I3ly5c3T1ubOXPmMbfSltQ/plso/AyYn+SKJK8AlgHrW67pjHfRRRexZMkSkrBkyRIvXpP62LTafVRVB5PcBPw1MAO4o6q2tlxWX1i+fDnbtm1zliD1uVSdvrvlh4aGatOmTW2XIUmnlSSbq2porGXTbfeRJKlFhoIkqWEoSJIahoIkqXFaH2hOsgfY3nYdZ5CLgafbLkIag3+bp9ZvVNXssRac1qGgUyvJpuOdkSC1yb/NqePuI0lSw1CQJDUMBY22tu0CpOPwb3OKeExBktRwpiBJahgKkqSGoSCSXJfkkSSdJDe3XY90RJI7kuxO8mDbtfQLQ6HPJZkBfAFYAiwA3pdkQbtVSY2vAte1XUQ/MRS0COhU1WNVdQC4C1jack0SAFV1L/DLtuvoJ4aCBoAnR7WHu32S+pChoIzR53nKUp8yFDQMXD6qfRnwVEu1SGqZoaCfAfOTXJHkFcAyYH3LNUlqiaHQ56rqIHAT8NfAw8C6qtrablXSiCRfA34M/L0kw0k+1HZNZzpvcyFJajhTkCQ1DAVJUsNQkCQ1DAVJUsNQkCQ1DAVpAklel+SuJH+b5KEk30myIsm32q5NOtUMBWkcSQJ8HfhBVb2+qhYAHwcuOcntnn0q6pNONf8wpfG9HfhVVf3JkY6q2pLk1
cA7k/wF8CZgM/D+qqok24Chqno6yRDwmaq6JskngUuBecDTSf43MBe4svv676vqP0zdryYdy5mCNL4j/+CP5S3ARxl5DsWVwNsmsb1/ACytqn/ebb8RWMzILcw/kWTmSVUrnSRDQTpxG6tquKoOA1sYmQFMZH1VvTiq/e2q2l9VTwO7OcndUtLJMhSk8W1l5Nv9WPaPen+Il3bHHuSl/7fOfdmYFya5DakVhoI0vu8B5yT5V0c6kvwW8NvjjNnGS0Hy+70rTTr1DAVpHDVyx8jrgWu7p6RuBT7J+M+cuBW4LckPGfn2L502vEuqJKnhTEGS1DAUJEkNQ0GS1DAUJEkNQ0GS1DAUJEkNQ0GS1Pj/DTAwv91M99AAAAAASUVORK5CYII=\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "data_churn = data.groupby('Churn').get_group(1)\n", + "data_no_churn = data.groupby('Churn').get_group(0)\n", + "\n", + "#Check how the DayMins columns for customer that churn vs those that didnt churn varies using boxplot\n", + "#sns.boxplot('DayCalls',data = data_churn)\n", + "sns.boxplot('Churn','DayMins', data = data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "From the boxplot above, it seems that customer that churn tends to have lower **DayMins** rate than those that wont churn. Although the **DayMins** minimum is significantly low which might not be expected if our assumption that customer customer with lower **DayMins** tends to churn, although detailed explanation about what DayMins mean was not provided. Let continue our comparison and see customer behavior as regards **DataUsage**" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXgAAAEGCAYAAABvtY4XAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAT+ElEQVR4nO3df3BcV3nG8efVDztxAnWyUVzHQRjqFpqhkB8igcmQNrFkZCCGlmlLmGC1hcozgG0oLaUMQ0mnpTNAW2yFTkcTCHJDk8GlaQlNhGXTOM5AATm4cUJSoqZyG9lxpI3dhNhWJO3bP3ZlS4p8tdj33Cud/X5mNNa72r33jWb9+OTsveeYuwsAEJ+6vBsAAIRBwANApAh4AIgUAQ8AkSLgASBSDXk3MNVFF13kK1euzLsNAFgw9u7dO+LuTbP9bF4F/MqVK9Xf3593GwCwYJjZgdP9jCkaAIgUAQ8AkSLgASBSBDwARIqAB5CpYrGoTZs2qVgs5t1K9Ah4AJnq6enR/v37tW3btrxbiR4BDyAzxWJRvb29cnf19vYyig+MgAeQmZ6eHpVKJUnSxMQEo/jACHgAmdm5c6fGx8clSePj4+rr68u5o7gR8AAy09raqoaG8g30DQ0Namtry7mjuBHwADLT0dGhurpy7NTX12v9+vU5dxQ3Ah5AZgqFgtrb22Vmam9vV6FQyLulqM2rxcYAxK+jo0ODg4OM3jNAwAPIVKFQ0NatW/NuoyYwRQMAkSLgASBSBDwARIqAB4BIEfAAECkCHgAiFfQySTMblPS8pAlJ4+7eEvJ8AIBTsrgO/np3H8ngPACAKZiiAYBIhQ54l7TDzPaaWedsTzCzTjPrN7P+4eHhwO0AQO0IHfDXuvuVktZK+pCZXTfzCe7e7e4t7t7S1NQUuB0AqB1BA97dD1b+fEbS3ZKuDnk+AMApwQLezM4zs5dNfi9pjaRHQp0PADBdyKtolkm628wmz/MP7t4b8HwAgCmCBby7PynpDaGODwBIxmWSABApAh4AIsWOTinr6urSwMBArj0MDQ1JklasWJFrH5K0atUqbdy4Me82gJpEwEfo+PHjebcAYB4g4FM2H0armzdvliRt2bIl504A5Ik5eACZKhaL2rRpk4rFYt6tRI+AB5Cp7u5uPfzww+ru7s67legR8AAyUywW1dfXJ0nq6+tjFB8YAQ8gM93d3SqVSpKkUqnEKD4wAh5AZnbt2pVYI10EPIDMuHtijXQR8AAys3r16ml1a2trTp3UBgIeQGY2bNigurpy7NTV1amzc9aN3pASAh5AZgqFwslRe1tbmwqFQs4dxY07WQFkasOGDXr66acZvWeAgAeQqUKhoK1bt+bdRk1gigYAIkXAA0CkCHgAmWKxsewQ8AAy1dPTo/3792vbtm15txI9Ah5AZorFonp7e+Xu6u3tZRQfGAEPIDM9PT0nFxubmJhgFB8YAQ8gMzt37tT4+LgkaXx8/OTSwQiDgAeQmbe85S2JNdJFwAPIzIkTJ6bVo6OjOXVSGwh4AJl58MEHp9V79uzJqZPaQMADyIyZJdZIV/CAN7N6M/uRmX0r9LkAzG8z14OfWSNdWYzgN0t6LIPzAJjnOjs7WQ8+Q0ED3swulfR2SbeFPA+AhaFQKKipqUmS1NTUxHrwgYUewX9R0scllU73BDPrNLN+M+sfHh4O3A6APBWLRR0+fFiSdPjwYe5kDSxYwJvZOyQ94+57k57n7t3u3uLuLZP/sgOIU1dXV2KNdIUcwV8raZ2ZDUq6S9INZnZHwPMBmOd2796dWCNdwQLe3f/E3S9195WS3iPpO+5+c6jzAZj/3D2xRrq4Dh5AZhoaGhJrpCuT36673y/p/izOBWD+mlxo7HQ10sUIHkBmuJM1WwQ8gMwwB58tAh4AIkXAA0CkCHgAiBQBDyAzkwuNna5GuvjtAsjM5Ibbp6uRLgIeACJFwANApAh4AJlZunTptPqCCy7Ip5EaQcADyMzRo0en1Ue
OHMmnkRpBwANApAh4AJm5+OKLp9XLli3LqZPaQMADyMzM5YHr6+tz6qQ2EPAAMnPw4MHEGuki4AFkZuXKlYk10kXAA8jMpz71qcQa6aoq4K3sZjP7dKVuNrOrw7YGIDYzr3vnOviwqh3B/62kN0u6qVI/L+lLQToCEK2enp6TC4zV1dVp27ZtOXcUt2oD/hp3/5CkE5Lk7kckLQrWFYAo7dy58+QCY6VSSX19fTl3FLdqA37MzOoluSSZWZMkloED8DNpbW2dVre1teXUSW2oNuC3Srpb0sVm9heSHpT02WBdAYjSunXrptU33nhjTp3UhqoC3t2/Junjkv5S0iFJ73L37SEbAxCfO+64I7FGuhrmfopkZhdKekbSnVMea3T3sVCNAYjP7t27E2ukq9opmockDUv6iaQnKt//t5k9ZGZXhWoOQFzcPbFGuqoN+F5Jb3P3i9y9IGmtpK9L+qDKl1ACAOaZagO+xd2/PVm4+w5J17n7v0taHKQzANFZsmRJYo10VTUHL+lZM/tjSXdV6t+WdKRy6SSXSwKoyrFjxxJrpKvaEfx7JV0q6Z8l/Yuk5spj9ZJ+a7YXmNk5ZvYDM/sPM3vUzG5JoV8ACxiLjWWr2sskR9x9o7tf4e6Xu/uH3X3Y3V9094HTvGxU0g3u/gZJl0tqN7M3pdQ3gAVo5o1N7e3tOXVSG6pdbKzJzD5vZvea2Xcmv5Je42U/rZSNlS8+Mgdq2O233z6tvu2223LqpDZUO0XzNUmPS3qVpFskDUr64VwvMrN6M9un8jX0fe7+/Vme02lm/WbWPzw8XG3fABag8fHxxBrpqjbgC+7+ZUlj7r7b3X9P0pzTLe4+4e6Xqzx/f7WZvW6W53S7e4u7tzQ1Nf0svQMAElS92Fjlz0Nm9nYzu0Ll0K6Kux+VdL8kJtwAICPVBvyfm9nPSfqYpD+UdJukjya9oDJvv7Ty/bmSWlWe5gFQo84777zEGumq6jp4d/9W5dv/k3R9lcdeLqmncq18naSvTzkOgBrEHHy2qr2K5nNm9nIzazSzXWY2YmY3J73G3R+uXFb5end/nbv/WTotA1ioli9fnlgjXdVO0axx9+ckvUPSU5J+SdIfBesKQJQOHTqUWCNd1QZ8Y+XPt0m6092fDdQPgIjV19cn1khXtWvR3GNmj0s6LumDlS37ToRrC0CMWIsmW9UuVfAJSW9WeVXJMUnHJL0zZGMAgLOTOII3s9+Y8ZCb2Yikfe7+dLi2AMTIzKZt8mFmOXYTv7mmaGbbEfdCSa83s/e7e+J6NAAwVV1dnSYmJqbVCCcx4N39d2d73MxeqfKOTteEaApAnFavXq0dO3acrFtbW3PsJn5n9M+nux/QqStrAKAqa9asSayRrjMKeDN7jcrrvQNA1W699dZpdVdXV06d1Ia5PmS9Ry9dw/1ClZchSLyTFQBmGhwcTKyRrrk+ZP3CjNolFSU94e4vhmkJQKzOOeccnThxYlqNcOb6kHV3Vo0AiN/UcJ+tRrqqXWzsTWb2QzP7qZm9aGYTZvZc6OYAAGeu2g9Zb5V0k6QnJJ0r6QOS+HQEAOaxateikbsPmFm9u09Iut3MvhuwLwDAWao24I+Z2SJJ+8zsc5IOSWIrFgA/k/r6+ml3srKaZFjVTtG8r/LcD0t6QdIrJM1cpwYAEjU2Tr8/ctGiRTl1UhuqDfh3ufsJd3/O3W9x9z9QefMPAKjazKtmjh8/nlMntaHagO+Y5bHfSbEPAEDK5rqT9SZJ75X0KjP75pQfvUzlG54AoGosF5ytuT5k/a7KH6heJOmvpjz+vKSHQzUFIE5XXnml9u7de7K+6qqrcuwmfnPdyXpA0gGVd3MCgLMyc5PtgwcP5tRJbeBOVgCZmRnoBHxY3MkKAJGqej14dx+QVO/uE+5+u6Trw7UFIEbLly+fVl9yySU5dVIbuJMVQGaOHj06rT5y5Eg+jdSIs7mT9d2hmgIQp+uuuy6xRrqqGsG7+wEza6p8f0v
YlgDEauo18AgvcQRvZZ8xsxFJj0v6iZkNm9mn5zqwmb3CzP7NzB4zs0fNbHNaTQNYmPbs2TOtfuCBB3LqpDbMNUXzEUnXSnqjuxfc/QJJ10i61sw+OsdrxyV9zN1/WdKbJH3IzC4724YBLFzLli1LrJGuuaZo1ktqc/eRyQfc/Ukzu1nSDkl/c7oXuvshlT+Mlbs/b2aPSVoh6cdn3fUsurq6NDAwEOLQC87k72HzZv6nSZJWrVqljRs35t0GJB0+fDixRrrmCvjGqeE+yd2HzaxxthfMxsxWSrpC0vdn+VmnpE5Jam5urvaQLzEwMKB9jzymiSUXnvExYlH3Ynmec++T/OWpP/Zs3i1gira2Nt1zzz1yd5mZ1qxZk3dLUZsr4F88w5+dZGbnS/qGpI+4+0vufnX3bkndktTS0nJWn8BMLLlQx1/7trM5BCJz7uP35t0Cpujo6NB9992nsbExNTY2av369Xm3FLW55uDfYGbPzfL1vKRfmevglVH+NyR9zd3/KY2GASxchUJBa9eulZlp7dq1KhQKebcUtcSAd/d6d3/5LF8vc/fEKRorrwP6ZUmPuftfp9k0gIVr3bp1WrJkiW688ca8W4le1UsVnIFrVb5B6gYz21f5Yv4EqHHbt2/XCy+8oO3bt+fdSvSCBby7P+ju5u6vd/fLK19MiAI1rFgsqq+vT5LU19enYpF9g0IKOYIHgGm6u7tVKpUkSaVSSd3d3Tl3FDcCHkBmdu3alVgjXQQ8gMzMXIuGtWnCIuABZGbx4sWJNdJFwAPIzLFjxxJrpIuAB5CZJUuWJNZIFwEPIDOjo6OJNdJFwANApAh4AJlh0+1sEfAAMjPzztWRkZesRo4UEfAAMtPW1jatZj34sAh4AJnp6OhQY2N5IdpFixaxHnxgBDyAzLAefLbm2tEJAFLV0dGhwcFBRu8ZIOABZKpQKGjr1q15t1ETmKIBgEgR8AAQKQIeACJFwANApAh4AIgUAQ8gU8ViUZs2bWLD7QwQ8AAy1dPTo/3792vbtm15txI9Ah5AZorFonp7e+Xuuu+++xjFB0bAA8hMT0+PxsbGJEljY2OM4gMj4AFkpq+vT+4uSXJ37dixI+eO4kbAA8jMsmXLEmuki4AHkJnDhw8n1khXsIA3s6+Y2TNm9kiocwBYWNra2mRmkiQzY8OPwEKO4L8qqT3g8QEsMB0dHWpoKC9i29jYyJLBgQULeHd/QNKzoY4PYOFhw49s5b4evJl1SuqUpObm5py7ARAaG35kJ/cPWd29291b3L2lqakp73YABDa54Qej9/ByD3gAQBgEPABEKuRlkndK+p6k15jZU2b2/lDnAgC8VLAPWd39plDHBgDMjSkaAIgUAQ8AkSLgASBSBDwARIqAB4BIEfAAECkCHgAiRcADQKQIeACIFAEPAJEi4AEgUgQ8AESKgAeASOW+ZR+AbHR1dWlgYCDvNjQ0NCRJWrFiRa59rFq1Shs3bsy1h9AIeACZOn78eN4t1AwCHqgR82W0unnzZknSli1bcu4kfszBA0CkCHgAiBQBDwCRYg4eyMB8uYJlPpj8PUzOxde6kFfzEPBABgYGBvTEoz9S8/kTebeSu0Vj5YmD0QP9OXeSv//5aX3Q4xPwQEaaz5/QJ698Lu82MI989qGXBz0+c/AAECkCHgAiRcADQKQIeACIFAEPAJEi4AEgUkED3szazew/zWzAzD4R8lwAgOmCBbyZ1Uv6kqS1ki6TdJOZXRbqfACA6ULe6HS1pAF3f1KSzOwuSe+U9OMQJxsaGlL980Wd/9Dfhzh89UoTknu+PcwnZlJd2Lv1Ek2Ma2hoPL/zVwwNDenZow3asPuC3HoYK5lKvDVPqjOpsS7fX8johOnChqFgxw8Z8Csk/e+U+ilJ18x8kpl1SuqUpObm5jM+2dKlS+fFRgKjo6MqlUp5tzFv1NXVafHiRTl2sEhLly7N8fxl8+L9OToq8d48pa5OdYsX59rCuVLQ96d5oNG
mmf2mpLe6+wcq9fskXe3up11Vp6Wlxfv7WZ8CAKplZnvdvWW2n4X8kPUpSa+YUl8q6WDA8wEApggZ8D+U9Itm9iozWyTpPZK+GfB8AIApgs3Bu/u4mX1Y0rcl1Uv6irs/Gup8AIDpgi4X7O73Sro35DkAALPjTlYAiBQBDwCRIuABIFIEPABEKtiNTmfCzIYlHci7j0hcJGkk7yaA0+D9mZ5XunvTbD+YVwGP9JhZ/+nubgPyxvszG0zRAECkCHgAiBQBH6/uvBsAEvD+zABz8AAQKUbwABApAh4AIkXAR4jNzjFfmdlXzOwZM3sk715qAQEfGTY7xzz3VUnteTdRKwj4+Jzc7NzdX5Q0udk5kDt3f0DSs3n3USsI+PjMttn5ipx6AZAjAj4+NstjXAsL1CACPj5sdg5AEgEfIzY7ByCJgI+Ou49Lmtzs/DFJX2ezc8wXZnanpO9Jeo2ZPWVm78+7p5ixVAEARIoRPABEioAHgEgR8AAQKQIeACJFwANApAh41BQz+3kzu8vM/svMfmxm95pZp5l9K+/egLQR8KgZZmaS7pZ0v7v/grtfJumTkpad5XEb0ugPSBtvTNSS6yWNufvfTT7g7vvMbKmk1Wb2j5JeJ2mvpJvd3c1sUFKLu4+YWYukL7j7r5nZZyRdImmlpBEz+4mkZkmvrvz5RXffmt1/GvBSjOBRSybDezZXSPqIymvov1rStVUc7ypJ73T391bq10p6q8pLNv+pmTWeVbfAWSLggbIfuPtT7l6StE/lkflcvunux6fU/+ruo+4+IukZneXUD3C2CHjUkkdVHnXPZnTK9xM6NX05rlN/T86Z8ZoXqjwGkAsCHrXkO5IWm9nvTz5gZm+U9KsJrxnUqX8U3h2uNSB9BDxqhpdX1vt1SW2VyyQflfQZJa+Xf4ukLWa2R+VRObBgsJokAESKETwARIqAB4BIEfAAECkCHgAiRcADQKQIeACIFAEPAJH6f/G02IcNfmwfAAAAAElFTkSuQmCC\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "sns.boxplot('Churn','DataUsage', data = data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The boxplot above indicate people with lower **DataUsage** tends not to churn and there apear to be several outliers for people who doesnt churn and have high data usage. Detailed explanation of what **DataUsage** means was not given, therefore no significant conclusion can be made" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Pre-process Data (Decision Tree)\n", + "* Check for duplicate values and remove\n", + "* Split the data to train-test" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
ChurnAccountWeeksContractRenewalDataPlanDataUsageCustServCallsDayMinsDayCallsMonthlyChargeOverageFeeRoamMins
00128112.701265.111089.09.8710.0
10107113.701161.612382.09.7813.7
20137100.000243.411452.06.0612.2
3084000.002299.47157.03.106.6
4075000.003166.711341.07.4210.1
....................................
33280192112.672156.27771.710.789.9
3329068100.343231.15756.47.679.6
3330028100.002180.810956.014.4414.1
33310184000.002213.810550.07.985.0
3332074113.700234.4113100.013.3013.7
\n", + "

3333 rows × 11 columns

\n", + "
" + ], + "text/plain": [ + " Churn AccountWeeks ContractRenewal DataPlan DataUsage \\\n", + "0 0 128 1 1 2.70 \n", + "1 0 107 1 1 3.70 \n", + "2 0 137 1 0 0.00 \n", + "3 0 84 0 0 0.00 \n", + "4 0 75 0 0 0.00 \n", + "... ... ... ... ... ... \n", + "3328 0 192 1 1 2.67 \n", + "3329 0 68 1 0 0.34 \n", + "3330 0 28 1 0 0.00 \n", + "3331 0 184 0 0 0.00 \n", + "3332 0 74 1 1 3.70 \n", + "\n", + " CustServCalls DayMins DayCalls MonthlyCharge OverageFee RoamMins \n", + "0 1 265.1 110 89.0 9.87 10.0 \n", + "1 1 161.6 123 82.0 9.78 13.7 \n", + "2 0 243.4 114 52.0 6.06 12.2 \n", + "3 2 299.4 71 57.0 3.10 6.6 \n", + "4 3 166.7 113 41.0 7.42 10.1 \n", + "... ... ... ... ... ... ... \n", + "3328 2 156.2 77 71.7 10.78 9.9 \n", + "3329 3 231.1 57 56.4 7.67 9.6 \n", + "3330 2 180.8 109 56.0 14.44 14.1 \n", + "3331 2 213.8 105 50.0 7.98 5.0 \n", + "3332 0 234.4 113 100.0 13.30 13.7 \n", + "\n", + "[3333 rows x 11 columns]" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from sklearn.model_selection import train_test_split\n", + "bool_df = data.duplicated(keep = False)\n", + "data_cl = data[~bool_df]\n", + "data_cl" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": {}, + "outputs": [], + "source": [ + "X, y = data.iloc[:, 2:], data.iloc[:,0] #Choose not to use AccountWeeks\n", + "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 19, stratify = y)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Decision Tree Training and Evaluation" + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Training accuracy: 0.9502786112301758\n", + "Test Accuracy: 0.922\n" + ] + } + ], + "source": [ + "from sklearn.tree import DecisionTreeClassifier\n", + "clf = DecisionTreeClassifier(max_depth = 6, random_state= 9)\n", + "clf.fit(X_train, 
y_train)\n", + "\n", + "print(\"Training accuracy: \", clf.score(X_train, y_train))\n", + "print(\"Test Accuracy: \", clf.score(X_test, y_test))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The variance of the model is 0.03 which indicate that the model is doing well in avoiding overfitting.But, this model probably wil have overfit to the no churn label since there are significant more label 0 than 1. The best accuracy metrics to use for this is recall score or generalized F1 score, that way, we will know how our model is doing against the imbalanced dataset. Let plot confusion matrix to verify." + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Accuracy score: 0.922\n", + "Precision score: 0.8819149990638664\n", + "Recall score: 0.7797136519459569\n", + "F1 score: 0.8192285229579775\n", + " precision recall f1-score support\n", + "\n", + " 0 0.93 0.98 0.96 855\n", + " 1 0.83 0.58 0.68 145\n", + "\n", + " accuracy 0.92 1000\n", + " macro avg 0.88 0.78 0.82 1000\n", + "weighted avg 0.92 0.92 0.92 1000\n", + "\n" + ] + } + ], + "source": [ + "from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report,confusion_matrix\n", + "prob = clf.predict(X_test)\n", + "print(\"Accuracy score: \", accuracy_score(y_test, prob))\n", + "print(\"Precision score: \", precision_score(y_test, prob, average = 'macro'))\n", + "print(\"Recall score: \", recall_score(y_test, prob, average = 'macro'))\n", + "print(\"F1 score: \", f1_score(y_test, prob, average = 'macro'))\n", + "print(classification_report(y_test, prob))" + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Text(91.68, 0.5, 'Actual label')" + ] + }, + "execution_count": 85, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAQsAAAELCAYAAADOVaNSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXDElEQVR4nO3dd5QV5f3H8feXXaVIExAELCiKBY4tgAoqKoiCKIgGS/yhRiWxxxIjsQSxRSPmxBYliaLGhtgV0QgaQcGAkahIEQsqvUkvu/D9/TEDLJctzy63zO5+Xufcs3fKnfu9F/azzzwz84y5OyIiZamR6wJEpHJQWIhIEIWFiARRWIhIEIWFiATJz3UB5VGw6BsduqlEarc4OtclSAUUrp9txc1Xy0JEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCRIfq4LqKyefO5lXnx9FGbGvq1bcfvvr6FmzR23We/zqdP5xYBruHfwDXQ/7ujtes/169cz8LYhfDn9Kxo2qM+9gwfSsnkzps34mtvufZCVq1ZTI68GA/qfRY9uXbbrvWSLvw0dwsk9u7Fg4SIOObQrAM88/VfatGkNQMMG9flp2XLad+ieyzIzTi2LCpi/cBFPj3iV5x+7n1f++QgbN27krXf/vc16GzZs4M8PP07njoeVa/uz587n/Muv32b+S2+8Q/16dXlr+GP835l9uO/hxwCoVasmd958Ha8+/SiPDrmdu+9/lOUrVlbsw8k2nnxyOCf3+sVW8875xSW079Cd9h268/LLI3nllZE5qi57stqyMLP9gd5AS8CBOcBr7j41m3WkQ+GGDaxbt578vHzWrF3HLk0abbPOMyNe44RjO/PF1BlbzX/97TE8/cKrFBQUclDb/bjp2svIy8sr8z3HjB3PpReeC0D3Y4/mzvv+irvTao/dNq/TdJfGNNq5IUt/Wkb9enW381MKwNhxH7PnnruVuPyMM07hhBP7ZbGi3Mhay8LMfgc8BxjwH2Bi/PxZM7shW3WkQ7NdmnD+2afTrW9/jut9DvV2qkPnw3+21TrzFy5i9Acf0a9Pz63mf/3d94wa/W+eemQILz7xEDVq1OCNd94Let8FCxeza9MmAOTn51F3pzr8tGz5Vut8/uV0CgoK2b1l8+34hBLq6KMOZ/6Chcyc+W2uS8m4bLYsLgTauntB0Zlmdh8wBfhjcS8yswHAAICHh9zORf3PznSdZVq2fAXvjZ3A2y88Tr16dbn2pjt5/e0xnHLi8ZvXufsvj3L1Jb/cpsXw8aTJfDltJmddeBUA69ato9HODQG4cuBgZs+ZT0FhAXPnL+T08y4D4Nx+vTnt5O64+za1mNnm5wsXLWHg4D9xx03XUqOG9jCz4cwz+/D886/muoysyGZYbARaALNS5jePlxXL3YcCQwEKFn2z7W9LDkyYNJmWLZpt/iXv2qUTkz//cquwmDLtK377hyj/li5bztjxE8nLy8PdObVHN66+5IJttnv/XbcAUZ/FjXcMYdiD92y1vFnTJsxbsIhdm+5CYeEGVq5aTYP69QBYuWoVl/72Fq4YcB4HtzsgEx9bUuTl5XFanx50PKJHrkvJimyGxW+A0Wb2FfBDPG8PYB/g8izWsd2aN9uFz76Yxpq1a6lVsyYfT5pM2/333Wqdt0cM2/z8xtuH0KVzR7oe04mvv53FFTcMpv9Zp9F454YsW76CVatX02LXZmW+73FHHcGrI9/lkHYH8M77Yzn
8ZwdjZhQUFHDVwNs49aSunHj89h1xkXDduh7N9OkzmT17bq5LyYqshYW7jzKzNkBHog5OA34EJrr7hmzVkQ4Htd2fE447in4XXEFeXh77t2nNz3v34PmX3wTgzNNOLvG1rffakysu7s+A39zIRt/IDvn53HjNpUFh0bfXiQy87U/06PdLGtSvx59ujbp6Ro0ZyyeTv+CnZSt4ZeS7ANxx4zXsHx/ak+3zz6ceossxR9KkSSO++2YStw6+l8eHPUe/fr15rprsggBYcfvBSZWU3RAJU7uFWjmVUeH62VbcfPWCiUgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEiQEk/KMrM65dmQu6/e/nJEJKlKO4NzJdFl5KHKvsZaRCqt0sLil5QvLESkCisxLNx9WBbrEJGEK9eFZGZ2IPAzYHfgMXefZ2b7APPdfUUmChSRZAgKCzOrCzwGnAEUxK8bBcwD7gS+B67LUI0ikgChh07vAzoBXYF6RJeXbzISOCnNdYlIwoTuhvQFrnL398ws9ajHLGDP9JYlIkkT2rKoDSwuYVk9oFINXiMi5RcaFhOB/iUsOwP4KD3liEhShe6G3AS8a2bvAi8QnX/R08yuJgqLYzJUn4gkRFDLwt3HEXVu1gQeJOrgvBXYG+jm7hMzVqGIJELweRbu/iFwtJnVBnYGftL1ICLVR0WuOl1LdK7FmjTXIiIJFhwWZtbTzD4iCot5wFoz+8jMSh73XkSqjKCwMLNfAa8TXYl6FfDz+OdK4LV4uYhUYUH3DTGzWcBId7+kmGWPAD3dfY8M1LcV3TekctF9Qyqn7b1vSGPgpRKWvQg0qkhRIlJ5hIbFe0CXEpZ1AT5ITzkiklSlDat3YJHJ+4G/m1lj4BVgAdAUOA3oAVyUwRpFJAFK7LMws41sPVJW0f0YT51294wPq6c+i8pFfRaVU0l9FqWdlHVchmoRkUqotGH1/p3NQkQk2co1rB6AmdUAaqXO16nfIlVb6ElZZma/M7OZRKd6ryjmISJVWOih0yuBG4B/EHVs3gEMBmYA3wEDMlGciCRHaFhcDPwBuCeefsXdbwXaAtOAfTNQm4gkSGhY7AVMdvcNRLshDQHcfSPwMHBeRqoTkcQIDYvFQN34+ffAoUWW7Uw0RqeIVGGhR0M+BDoQDfv/DDDIzBoB64HLgNGZKU9EkiI0LAYBLePndxLthpxP1KL4F3BFmusSkYQJukQ9KXS6d+Wi070rp+29RF1EqrnSrjodXp4NuXu/7S9HRJKqtD6LXbJWhYgkXmkXkumqUxHZTH0WIhJEYSEiQRQWIhJEYSEiQRQWIhJEYSEiQdJ1Upa7+5lpqKdULVr3yPRbSBq1rNc41yVIGumkLBEJopOyRCSI+ixEJEjwrQDMrB7QG2hD8bcCuD6NdYlIwgSFhZm1Jhotqw6wE7CQ6M7p+cBSYBmgsBCpwkJ3Q/4MTAKaEd0KoCfRKFnnAiuBjB8JEZHcCt0N6Uh0p/R18fSO8Ujfz5hZE+AvQKcM1CciCRHasqgFLI+H/l8CtCiy7Avg4HQXJiLJEhoWM4A94+efAr82s1pmtgNwITAnE8WJSHKE7oY8BxwCPAXcDLwNLAc2xts4PwO1iUiCBIWFu99X5PkEM2sH9CDaPRnj7l9kqD4RSYjg8yyKcvcfgKFprkVEEiz0PIueZa3j7iO3vxwRSarQlsUbgBOdY1FU0Zv+5KWlIhFJpNCw2KuYeY2A7kSdmxekqyARSabQDs5ZxcyeBXxqZhuA3wOnprMwEUmWdFx1+ilwfBq2IyIJtl1hYWY7Eu2GzE1LNSKSWKFHQyaydWcmwI5AK6Ae6rMQqfJCOzinsG1YrAVeAF5x9ylprUpEEie0g/P8DNchIgkX1GdhZmPMbP8SlrUxszHpLUtEkia0g/NYoH4Jy+oDx6SlGhFJrPIcDUnts9h0NOR4YF7aKhKRRCrtJkN/AG6JJx2YYJZ6tvdmf0pzXSKSMKV1cI4EFhF
dD3I/MAT4LmWd9cA0dx+bkepEJDFKu8nQRGAigJmtAN5w98XZKkxEkiW0z2IycHhxC8ysp5kdlLaKRCSRynMrgGLDAugQLxeRKiw0LA4juslQccYDh6anHBFJqtCwyCO6E1lxdiK6TkREqrDQsJgIDChh2QCiu5WJSBUWeiHZIOBdM/sYeILoJKzmQH+iGwydkJHqRCQxQi8k+8DMugN3AQ8QnXuxEfgYOEHnWYhUfcG3AnD394EjzawOsDOw1N1XA5jZDu5ekJkSRSQJyj1SlruvdvfZwBozO97M/oauDRGp8sp9kyEzOxw4G+gHNCO6UfJzaa5LRBImdFi9dkQBcRbRUHrriQ6XXgM85O6FmSpQRJKhxN0QM9vbzH5vZp8D/wOuA6YSHQHZl6iT81MFhUj1UFrLYibRpekfA78CXnT3pQBm1iALtYlIgpTWwTmLqPXQjmikrE5mVqEbKYtI5VdiWLj7XkBnopOwugKvA/Pjox9dKWbkLBGpuko9dOru4939CqAlcCLwKnA6MCJe5WIza5/ZEkUkCcy9fA2EeNzNnkRHRnoBtYEZ7n5A+svb2i4N9lNrphKpk18z1yVIBcxa/Fmx42dW5KSs9e7+irufRXSeRX+izlARqcK2616n7r7K3Z9291PSVZCIJFM67qIuItWAwkJEgigsRCSIwiIH6jeox2NP/oWPJr7Fh/8ZSfsOh3Bqn5MYO+EN5i+dysGHtst1iZLiwl+fy78+fIl3xr3E/UPvpmbNLSNJDrjsPGYt/oydGzXMXYFZoLDIgTv/eCNj3h1Lpw49OLZzb2bM+JqpX87g/HOvYPyHE3NdnqRo1rwpFwz4Bb26nk33o/qSl1eDU/qeBEDzFs046tgj+PGHOTmuMvMUFllWt95OHNG5A/98MjqvraCggOXLVvDVjG/4eua3Oa5OSpKXn0etWjXJy8ujdu1azJ+7EIBb7rieuwb9mfKer1QZKSyyrFWr3Vm8aAkPPHwXY8a+zJ8fuJ06dWrnuiwpxfy5Cxj64BOM/987TPxyNCuWr2Ts++PpdtKxzJu7gKlTZuS6xKxIRFiY2QWlLBtgZpPMbNLa9T9lsarMyMvP56CDD+TxfzzL8UefxupVa7jy6pIGTpckqN+gHt17HsdRh/WgY9tu1N6pNn3PPIXLr7mY++56KNflZU0iwgK4taQF7j7U3du7e/taOzbMYkmZMXf2PObMnsd/P/kMgNdfHcVBBx+Y46qkNEd1OYIfZv3IksVLKSwsZNQbo+l3dm9236Mlb33wAuM+fYvmLZrx5nvPs0vTxrkuN2Oydsm5mX1W0iKi08arhQULFjFn9jxa77MXX8/8lqO7HMn06V/nuiwpxZzZ8zi0/UHUql2LtWvW0vmYwxn15mjO6nPR5nXGffoWp3Q9m6VLfspdoRmWzfEpmhFdubo0Zb4BH2WxjpwbeP1tPPL3e9lhhx2Y9d0PXHnZQHr26sZd99xM4yaNeGb4o0z5fCr9+l5U9sYk4yZ/8jkjX3uXN997ng2FG5jy+VSeeWJE2S+sYsp91WmF38jsH8Dj7j6umGXPuPs5ZW1DV51WLrrqtHIq6arTrLUs3P3CUpaVGRQikltJ6eAUkYRTWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiARRWIhIEIWFiAQxd891DQKY2QB3H5rrOiRMdfz3UssiOQbkugApl2r376WwEJEgCgsRCaKwSI5qtf9bBVS7fy91cIpIELUsRCSIwkJEgigscsz
MTjKz6WY208xuyHU9Ujoze8zMFpjZF7muJdsUFjlkZnnAQ0AP4EDgbDM7MLdVSRmGASfluohcUFjkVkdgprt/4+7rgeeA3jmuSUrh7h8AS3JdRy4oLHKrJfBDkekf43kiiaOwyC0rZp6OZUsiKSxy60dg9yLTuwFzclSLSKkUFrk1EdjXzPYysx2Bs4DXclyTSLEUFjnk7oXA5cDbwFRguLtPyW1VUhozexYYD+xnZj+a2YW5rilbdLq3iARRy0JEgigsRCSIwkJEgigsRCSIwkJEgigsEsDMBpmZF3nMMbMXzax1Bt+zV/xereLpVvF0r3Jso5+ZnZ/GmurGNZS4zYrUGb9umJlN2u4io229b2Yj0rGtyiQ/1wXIZsvYcjXj3sBtwGgza+vuq7Lw/nOBI4Fp5XhNP6AJ0ZWYUsUpLJKj0N0nxM8nmNn3wFigJ/BC6spmVtvd16Trzd19HTChzBWl2tJuSHJ9Ev9sBWBm35nZEDO72cx+BJbH82uY2Q3x4DnrzGyGmZ1XdEMWGRQP2rLCzJ4E6qesU2zz3swuNrPPzWytmc03sxFm1sDMhgGnA12K7D4NKvK63mY2KX7dPDO7x8x2SNn26XG9a8zsA2D/inxRZtbfzMaZ2RIzW2pm75lZ+xLW7WNm0+K6xqWOHxLyfVZXalkkV6v457wi884BpgCXsuXf7gHgPGAw8F/gBOAxM1vs7m/E61wJ3ALcSdRa6QvcU1YBZnZTvN2Hgd8CdYCTgbpEu0l7AA3jeiC6MA4z6wc8CzwK/B5oDdxF9Mfpunidw4DngZeBq4C2wPCyaipBK+BJ4GtgR6Lv6QMza+fu3xRZb0/gPuBmYA1wK/C2me3r7mvjdUK+z+rJ3fXI8QMYBCwiCoB8oA3wHlHroXm8zndE/Qq1irxuH2AjcF7K9p4EJsbP84iuZP1ryjr/IrocvlU83Sqe7hVPNwRWA/eVUvcI4P2UeQbMAh5Pmf9Lol/QxvH0cOBL4ksO4nk3xjWcX8p7blVnMctrxN/hNOCWIvOHxa/rVGTenkAh8OvQ7zOefh8Ykev/N9l+aDckORoDBfFjOlEn55nuPrfIOqN9y19AgK5E/7lfNrP8TQ9gNHBIPGzf7kBz4NWU93upjHqOBGoDj5fzc7QhanEMT6lpDFALaBev1xF4zePfvsCaimVmB5jZy2Y2H9hA9B3uF9dS1AJ3/2jThLvPItrd6xjPCvk+qy3thiTHMqAb0V+/ecCclF8kgPkp002IWg7LSthmc2DX+PmClGWp06kaxz/nlrrWtprEP0eWsHzT+B27VqCmbZhZPeAdou/mGqJWzVrg70ThVNb2FxB9TxD2ff5Y3hqrCoVFchS6e1nnAaSGxxKiZnRnor+IqRaw5d+4acqy1OlUi+OfzYl2kUJtGp9yAPBpMcu/jX/Oq0BNxTmSaNCgE9x982FfM2tQzLrFbb8pUT8QhH2f1ZbConIbQ/SXsIG7/6u4FczsB6JfzN7AqCKL+pax7fFEfQznEXdKFmM92/71ng7MJuoL+Vsp258InGpmA4u0oMqqqTi145/rNs0ws05EfRufpKzb1Mw6bdoVMbM9gMPYsqtV5vdZnSksKjF3n25mjwDPmdk9wCSiX962QBt3v8jdN8TL7jWzRURHQ04HDihj2z+Z2W3AHfEoXiOBmkRHQ25199lEnYi9zawPUfN8jrvPMbNrgafMrD7wFlGo7A30Ac5w99XA3cDHRH0b/yDqy6jIQDITgJXA3+LPuRtRh/HsYtZdFNe16WjIYKLWwrD4M5f5fVagvqoj1z2semw5GlLGOt8B9xYz34DfEDWl1wELgX8D/VPWuS1etgJ4mujwYolHQ4q89ldERy3WEbVQhgP142VNiA59LolfO6jI63oQBdMqoqM6k4Hbgfwi6/wcmEnUxzAO6EAFjoYQnfn6BVEAfEZ0Itv7FDliQRQIk4haLzPiz/Mh0K4C3+dW264uD42UJSJBdOhURIIoLEQkiMJ
CRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkyP8DrJqzin4hc6MAAAAASUVORK5CYII=\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "cm = confusion_matrix(y_test, prob)\n", + "ax = sns.heatmap(cm, square=True, annot= True, cbar = False)\n", + "ax.set_xlabel('Predicted label', fontsize = 15)\n", + "ax.set_ylabel ('Actual label', fontsize = 15)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since we don't have enough data for churn label, the model mispredict 61 of churn as not churn. Let try XGBOOST to select the best parameters to use" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### XGBOOST" + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": {}, + "outputs": [], + "source": [ + "import xgboost as xgb" + ] + }, + { + "cell_type": "code", + "execution_count": 99, + "metadata": {}, + "outputs": [], + "source": [ + "dmatrix_train = xgb.DMatrix(data=X_train, label=y_train)\n", + "dmatrix_test = xgb.DMatrix(data=X_test, label=y_test)\n", + "\n", + "param = {'max_depth':6, \n", + " 'eta':0.3, \n", + " 'objective':'multi:softprob', \n", + " 'num_class':2}\n", + "\n", + "num_round = 6\n", + "model = xgb.train(param, dmatrix_train, num_round)\n", + "\n", + "preds = model.predict(dmatrix_test)\n", + "\n", + "best_preds = np.asarray([np.argmax(line) for line in preds])" + ] + }, + { + "cell_type": "code", + "execution_count": 103, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Precision: 0.8803212544949875\n", + "Recall: 0.8216172615446662\n", + "Accuracy: 0.93\n", + " precision recall f1-score support\n", + "\n", + " 0 0.95 0.97 0.96 855\n", + " 1 0.82 0.67 0.73 145\n", + "\n", + " accuracy 0.93 1000\n", + " macro avg 0.88 0.82 0.85 1000\n", + "weighted avg 0.93 0.93 0.93 1000\n", + "\n" + ] + } + ], + "source": [ + "# metrics\n", + "print(\"Precision: \", (precision_score(y_test, best_preds, average='macro')))\n", + "print(\"Recall: 
\",(recall_score(y_test, best_preds, average='macro')))\n", + "print(\"Accuracy: \", (accuracy_score(y_test, best_preds)))\n", + "print(classification_report(y_test, best_preds))" + ] + }, + { + "cell_type": "code", + "execution_count": 104, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAQsAAAELCAYAAADOVaNSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXwUlEQVR4nO3dd5gV5dnH8e+9uxEQFlCQTlBRbImaookVUQQRDYKKmlhQkFwWbFhABGmiYnmjIcYumteGUSx5FQuIYheViBJQUDSUpSlIW9hd7vePGdblsOVZOW2X3+e6zrVnys7c5yznxzPPzHnG3B0RkarkZLoAEakZFBYiEkRhISJBFBYiEkRhISJB8jJdQHUULf9Kp25qkHqtjsh0CfITFG9caOXNV8tCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkSF6mC6ipHnliIk+/MAkzY8/2uzL62iuoU2eH0uVTpr3LX+97hBzLITc3l0GX9ufXB/xim/a5ceNGBo+6jVlzvqRxo4bcOnIwrVs2Z/YX8xh16zjWrF1HTm4O/c8+nW6dO27rS5RYmzatGP/gHTRvsQubNm3i/vsf5a/jHuDmG6+j+wnHsnHjRr766hv69ruCVat+yHS5KWPunukaghUt/yoril2ybDlnX3Alzz16D3Xr1GHg0DEc8fuDOKn7saXrrFu3nnr16mJmzJn7NVcOHcMLj98XtP2Fi5cw5IbbGD9u7Bbzn3jmX8yZ+zXXXz2AF1+byuQ33uW2UYOZ/+0CzIx2bVuzdNkKevcdwPOP3kvD/AZJfd3VVa/VERndf7K0aNGMli2a8cmMz2jQoD4fvD+Jk085jzatWzLl9bcpKSnhxjHXAjD42jEZrnbbFW9caOXNT2vLwsz2BnoArQEHFgHPu/t/0llHMhSXlLBhw0bycvNYX7iBXZruvMXyHXesV/p8fWEh2I/v/wsvT+HRp56jqKiY/ffbi+sGXkRubm6V+5wy7V0u7HsmAF2OOoIxt/8dd2fXn7cpXafZLk3YeafGfL9yVcbDorYoKFhKQcFSANasWcvs2V/SulULXn3tzdJ13nv/Y07u1T1TJaZF2voszOwa4AnAgA+AD+Pnj5vZoHTVkQzNd2lKnzNOpnOvs+nU44/k19+Rw373m63We+2NtznxjPO58MphjLr2cgDmzf+WSZPf4B9338bTD/+NnJwc/vXK60H7XbpsBS2aNQUgLy+XBvV3ZGVCs3fmrDkUFRXTtnXLbXyVUp527dpw4AG/4P0PPtli/rl9TmfSy2F/x5oqnS2LvsB+7l5UdqaZ3Q58DtxU3i+ZWX+gP8Bdt42m39lnpLrOKq36YTWvT3uPl596iPz8Bgy8bgwvvDyFE7sevcV6nTseRueOhzF9xkzG3fcI999xI+9Pn8Gs2XM5ve+lAGzYsIGdd2oMw
CWDR7Jw0RKKiotYvGQZJ59zEQBn9u5Bz+5dKO+Q0cq0WJYt/47BI2/hhusGkpOjvutkq19/RyY8eR9XXHk9q1evKZ0/eNAlFBcX89hjz2SwutRLZ1hsAloB3yTMbxkvK5e73wvcC9nTZ/He9Bm0btW89EN+TMdDmTFz1lZhsdlvD/wl/124mO9XrsLd+UO3zlx+wblbrXfnjcOAivssmjdrSsHS5bRotgvFxSWsWbuORg3zAVizdi0XXjWMAf3P4YBf7JPEVysAeXl5PPXkfTz++ESeffal0vlnnXUq3Y/vzLFde2ewuvRI538/lwGTzewlM7s3fkwCJgOXprGObday+S58+tls1hcW4u68P30Gu7dru8U63y5YVNoSmDVnLkVFxTRu1JDf//ZAXp36Fiu+XwlErZRFBUuC9tvp8N/z3IuvAfDK1Gn87jcHYGYUFRVx6eBR/OG4Y+h6dO3oVMw29917G/+ZPZe/3HFv6byuXY7iqisv5KRefVi/vjCD1aVH2loW7j7JzDoABxN1cBqwAPjQ3UvSVUcy7L/f3hzb6XB6nzuA3Nxc9u7QnlN7dOPJif8HwGk9u/Pq1Ld4/qXJ5OXlUbfODtw6chBmRvvd2jHg/LPpf9kQNvkmfpaXx5ArLqRVi+ZV7rfXCV0ZPOoWuvU+j0YN87llRNTVM2nKND6a8RkrV63m2ThMbhhyBXt3aJ+6N2E7ctihB3HWmafw6cxZTP/wFQCGDr2J/7l9JHXq1GHSS08A8P77H3PRxTWq+61adOpUUqa2nDrd3lR06lS9YCISRGEhIkEUFiISRGEhIkEUFiISRGEhIkEUFiISRGEhIkEUFiISJCgszKyZme1WZtrMrL+Z/cXMTkxdeSKSLUJbFuOBy8tMjwDuAo4DJppZn+SWJSLZJjQsfg1MATCzHOAC4Fp33xu4gegbpSJSi4WGRSNgRfz8N8DOwKPx9BRgjyTXJSJZJjQsFgD7xs+7A7PdfWE83Qio/V/mF9nOhY5n8SAw1sw6E4XF4DLLfg/UuAF3RaR6gsLC3W80s4XAQcAAovDYbGfg/hTUJiJZRIPfSMpo8Juaqdr3DTGzHauzA3dfV92iRKTmqOwwZA3RjYBCVX2XHBGpsSoLi/OoXliISC1WYVi4+/g01iEiWa5atwIws32JLspqCzzo7gVmtgewxN1Xp6JAEckOQWFhZg2ITpeeAhTFvzcJKADGAN8CV6aoRhHJAqFXcN4OHAocA+QT3SBosxeJvlAmIrVY6GFIL+BSd3/dzBLPenwDtEtuWSKSbUJbFvX48YtkifKBGnX7QRGpvtCw+BA4u4JlpwDvJKccEclWoYch1wGvmdlrwFNE118cb2aXE4XFkSmqT0SyRFDLwt3fIurcrAOMI+rgHAHsDnR29w9TVqGIZIXg6yzc/W3gCDOrB+wErNT3QUS2Hz9ldO9Comst1ie5FhHJYsFhYWbHm9k7RGFRABSa2Ttm1j1l1YlI1gi9FcCfgReIvol6KXBq/HMN8Hy8XERqsaDBb8zsG+BFd7+gnGV3A8e7+89TUN8WNPhNzaLBb2qmiga/CT0MaQI8U8Gyp4mG1hORWiw0LF4HOlawrCPwZnLKEZFsVdmwevuWmbwTuN/MmgDPAkuBZkBPoBvQL4U1ikgWqLDPwsw2seVIWWWPYzxx2t1TPqye+ixqFvVZ1EzVHrAX6JSiWkSkBqpsWL030lmIiGS3ag2rB6U3Rq6bOF+XfovUbqEXZZmZXWNmc4ku9V5dzkNEarHQU6eXAIOAB4g6Nm8ARgJfAPOB/qkoTkSyR2hYnA9cD4yNp5919xHAfsBsYM8U1CYiWSQ0LHYDZrh7CdFhSGMAd98E3AWck5LqRCRrhIbFCqBB/Pxb4Fdllu1ENEaniNRioWdD3gYOIhr2/zFguJntDGwELgImp6Y8EckWoWExHGgdPx9DdBjSh6hF8SowIMl1iUiWCfqKerbQ5d41iy73rpl+yuXeQczsZGBCO
r4b0qRd51TvQpKoXcPmmS5BkuinjMEpItshhYWIBFFYiEgQhYWIBKlspKwJgdtok6RaRCSLVXY2ZJfAbWxAY3CK1HqVDX6jkbJEpJT6LEQkiMJCRIIoLEQkiMJCRIIoLEQkSLXCIh64t62ZHWpm9VNVlIhkn+CwMLMLgYXAN8A0YK94/jNmdllKqhORrBF6K4CrgNuB+4Cj2fLWhVOB05JemYhkldDxLC4Chrn7WDNLHLdiDtAhuWWJSLYJPQxpAXxUwbJNlHOHMhGpXULDYi7QsYJlRwKzklOOiGSr0MOQvwB3mdlG4J/xvGZm1he4gugmRCJSiwWFhbvfb2Y7AcOAEfHsF4F1wHB3fyxF9YlIlggesNfdbzGzu4FDgSbAd8C77r4qVcWJSPao1uje7r4aeDlFtYhIFgsKi/iCrEq5+13bXo6IZKvQlsW4SpZtvvGPwkKkFgs6deruOYkPYGfgDODfwL6pLFJEMu8n35HM3VcCT5pZI+Ae4Kgk1SQiWSgZX1H/GvhtErYjIllsm8LCzFoCA4kCQ0RqsdCzIcv4sSNzsx2AfKAQ6JXkukQky2zL2ZBCYAEwyd1XJK8kEclGVYaFmf0MeA342t0Xpb4kEclGIX0WJcAUYJ8U1yIiWazKsHD3TcCXQPPUlyMi2Sr0bMgQYJiZ/TKVxYhI9qrsLupHAh+7+xrgOqJvms4ws4XAEhLOjrj7waksVEQyq7IOzteBQ4APgM/ih4hspyoLi9IRvN393DTUIiJZTHckE5EgVV1ncbyZ7R2yIXd/JAn1iEiWqioshgVuxwGFhUgtVlVYdAKmp6MQEcluVYXFendfm5ZKRCSrqYNTRIIoLEQkSIWHIfE4myIigFoWIhJIYSEiQRQWIhJEYSEiQX7yfUNk2+Tk5PDGW8+xeNESep/Sj1/uvw9/uWM0derWobi4hIGXDeWjjz7NdJkS69P/DE47qyeY8eQ/JjL+nse48/6b2K19OwAaNsrnh1WrObHTGRmuNHUUFhlywUXn8sWceeTnNwBg1OhB3HTjnbz6yht06XoUI0cPonu3P2a4SgHosHd7TjurJz27nE3RxiIemjCOqa9O45J+g0rXGTzyclb/sCaDVaaeDkMyoFWrFnQ9rhMPj3+ydJ67lwZHw4b5FBQszVR5kqB9h9345KOZFK4vpKSkhA/e+Ygu3Y/eYp3uPY7lX89MylCF6aGWRQbcNHYow4bcRIP8+qXzrrl6FBOfe5jRYwaTk5PDsUefksEKpawv/jOPgUMuovFOjSgs3EDHzofz2YxZpcsPOuTXLF/2HfO/+m8Gq0y9rGhZmFmFg+uYWX8zm25m0zcW/5DOslLiuOOOZvmyFcyYseXAY/36/YnB14xm370OZ/A1oxn395szVKEkmvfl19xz53gefvouHpowjtmff0FxSUnp8hN7deWFWt6qADD3xBuNZaAIs2/d/edVrdew/u6ZL3YbXT/iKk4/4ySKi0uoW7cO+fkNeOH5lzmu2zG0bXVA6XoLFv+bNi0PqGRL2W+Xeo0zXUJKDBxyMQWLlvDoQ0+Rm5vLOzMn0eOYP1GwuHYcOs5b/rGVNz9tLQsz+7SCx0y2o9sMjLj+FvbpcBi/3PdIzj3nEt58413O73sFBYuXcPgRvwOg41GHMm/e/MwWKlto0nQnAFq2bkHXEzqVtiQO6/g75s2dX2uCojLp7LNoDnQFvk+Yb8A7aawjKw24+FpuvmUoeXl5bCjcwKUXD8l0SVLG3x66lcY7N6K4qJjhV9/MD6tWA3BCzy7bxSEIpPEwxMweAB5y97fKWfaYu1d5nrA2HIZsT2rrYUhtV9FhSNpaFu7et5JluqBAJMtlxdkQEcl+CgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwE
JEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEg5u6ZrkEAM+vv7vdmug4Jsz3+vdSyyB79M12AVMt29/dSWIhIEIWFiARRWGSP7er4txbY7v5e6uAUkSBqWYhIEIWFiARRWGSYmR1nZnPMbK6ZDcp0PVI5M3vQzJaa2WeZriXdFBYZZGa5wN+AbsC+wBlmtm9mq5IqjAeOy3QRmaCwyKyDgbnu/pW7bwSeAHpkuCaphLu/CXyX6ToyQWGRWa2B/5aZXhDPE8k6CovMsnLm6Vy2ZCWFRWYtANqWmW4DLMpQLSKVUlhk1ofAnma2m5ntAJwOPJ/hmkTKpbDIIHcvBi4GXgb+A0xw988zW5VUxsweB94F9jKzBWbWN9M1pYsu9xaRIGpZiEgQhYWIBFFYiEgQhYWIBFFYiEgQhUUamdlwM/Myj0Vm9rSZtU/hPk+I97VrPL1rPH1CNbbR28z6JLGmBnENlW4zXufibdzXcDNbvi3bKLOt8WY2PRnbqonyMl3AdmgVP35rcXdgFDDZzPZz97Vp2P9i4BBgdjV+pzfQlOgbl7KdUlikX7G7vxc/f8/MvgWmAccDTyWubGb13H19snbu7huA96pcUSSBDkMy76P4564AZjbfzG4zs6FmtgD4IZ6fY2aD4kFyNpjZF2Z2TtkNWWR4PDjLajN7BGiYsE65hyFmdr6ZzTSzQjNbYmb/NLNGZjYeOBnoWObwaXiZ3+thZtPj3ysws7Fm9rOEbZ8c17vezN4E9k7C+4aZdTezV+PX+4OZvWdmXSpY9zAz+ziuc4aZHV7OOv3M7PP4/f3GzK6uYv+Nzez++HCy0My+NbP7kvHaspFaFpm3a/yzoMy8PwKfAxfy49/or8A5wEjgY+BY4EEzW+Hu/4rXuQQYBowhaq30AsZWVYCZXRdv9y7gKmBHoDvQgOgw6edA47geiL4Ah5n1Bh4H7gGuBdoDNxL9J3RlvM6vgSeBicClwH7AhKpqCrQb8AJwK7CJaBChl8zsSHd/u8x6OwL/G9e2GBgYr7enuxfEdV5F9L6NBaYCvwFGmdk6dx9Xwf5vBw4FLif6+7UFjkzSa8s+7q5Hmh7AcGA5UQDkAR2A14laDy3jdeYT/YOuW+b39iD6MJyTsL1HgA/j57lE31j9e8I6rxJ97X3XeHrXePqEeLoxsA64vZK6/wlMTZhnwDfAQwnzzwPWA03i6QnALOKvFsTzhsQ19Kni/XLg4sD3Nid+T18GHkx4zx34Y5l5DYgGsLkpnm4IrAGuT9jmSKIQyI2nxwPTyyz/DBiQ6X9X6XroMCT9mgBF8WMOUSfnae6+uMw6k929sMz0MURhMdHM8jY/gMnAgfHwfG2BlsBzCft7pop6DgHqAQ9V83V0IGpxTEioaQpQF/hFvN7BwPMef7oCawpiZm3M7GEzWwgUE72nXeLaEk3c/MTd1xCF6MHxrEOA+sBT5byW5kRDB5RnBnCVmV1oZuXts1bRYUj6rQI6E/1vVwAsSvggASxJmG5K1HJYVcE2WwIt4udLE5YlTidqEv9cXOlaW2sa/3yxguWbx+lo8RNqqpKZ5RB9nT+f6NBrLrCWqDXQLGH1Nb51J/FSYP/4+ebXUtE3ftsStaISXRzvbxjwNzObCwx19yeq8VJqDIVF+hW7e1Xn6hPD4zui/zkPI2phJFrKj3/LxA9K4nSiFfHPlkSHSKE2j0PZH/iknOVfxz8LfkJNIfYAfgV0c/dJm2eaWb1y1m1QzlmlZvwYkJtfywlsHdQQtQC34u4rifqJLjGz/YGrgUfN7FN3n1WdF1MT6DCkZphC1LJo5O7Ty3lsJBrLs4CtB/ztVcW23yXqYzinknU2Eh1alDUHWEjUF1JeTZtD6EPgD2ZWdgjBqmoKsTkUN
myeYWbtiAK1PD3LrNeAqIP4g3jW5vegVQWvZXVVxbj7p0Sdwzkk6WxPtlHLogZw9zlmdjfwhJmNBaYTfXj3Azq4ez93L4mX3RpfsTiN6JTnPlVse6WZjQJusGi0rheBOkRnQ0a4+0KiC7h6mNlJRGdCFrn7IjMbCPzDzBoCLxGFyu7AScAp7r4OuBl4n6hv4wGivozqDBhzoJmdkjBvGdG1IguA28xsKNHhyAiiAEu0Pn59DYg6ga8EdgDuKPMeDAfuiAPnTaIPfQegk7v3LGebmNlbRH0hnxG1Bs8nOhT6oLz1a7xM97BuTw/isyFVrDMfuLWc+QZcRnRcvYHoA/MGcHbCOqPiZauBR4lOw1Z4NqTM7/6Z6KzFBqIWygSgYbysKdGH4rv4d4eX+b1uRMG0luiszgxgNJBXZp1TifoUCoG3gIMIPxtS3mNqvPwgog/meuBLoA9bn7EYTnR4dURc2wbg38CR5ezvTKLrXtYD3xOF3BVllidu+xZgZvxeryQ6s3VEpv+dpeqhkbJEJIj6LEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIL8P0h+CPIbqrFlAAAAAElFTkSuQmCC\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "cm = confusion_matrix(y_test, best_preds)\n", + "ax = sns.heatmap(cm, square=True, annot=True, cbar=False)\n", + "ax.set_xlabel('Predicted Labels',fontsize = 15)\n", + "ax.set_ylabel('True Labels',fontsize = 15)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Using XGboost, there have been some trade-off and our recall score and f1 score have increased. Let search for the best hyperparameters" + ] + }, + { + "cell_type": "code", + "execution_count": 106, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Tuned: {'eta': 0.05, 'gamma': 0.1, 'learning_rate': 0.01, 'max_depth': 5, 'min_child_weight': 1, 'n_estimators': 500}\n", + "Mean of the cv scores is 0.932706\n", + "Train Score 0.958423\n", + "Test Score 0.938000\n", + "Seconds used for refitting the best model on the train dataset: 3.063578\n" + ] + } + ], + "source": [ + "from xgboost.sklearn import XGBClassifier\n", + "from sklearn.model_selection import GridSearchCV, RandomizedSearchCV \n", + "\n", + "param_dict = {\n", + " 'eta': [0.05,0.10,0.20,0.25,0.30],\n", + " 'gamma': [0.0, 0.1, 0.2, 0.4],\n", + " 'max_depth':range(3,10,2),\n", + " 'min_child_weight':range(1,6,2),\n", + " 'learning_rate': [0.001,0.01,0.1,1],\n", + " 'n_estimators': [200,500,1000]\n", + " \n", + "}\n", + "\n", + "xgc = XGBClassifier()\n", + "\n", + "clf = GridSearchCV(xgc,param_dict,cv=2,n_jobs = -1).fit(X_train,y_train)\n", + "\n", + "print(\"Tuned: {}\".format(clf.best_params_)) \n", + "print(\"Mean of the cv scores is {:.6f}\".format(clf.best_score_))\n", + "print(\"Train Score {:.6f}\".format(clf.score(X_train,y_train)))\n", + "print(\"Test Score {:.6f}\".format(clf.score(X_test,y_test)))\n", + "print(\"Seconds used for refitting the best model on the train dataset: {:.6f}\".format(clf.refit_time_))" + ] + }, + { + 
"cell_type": "code", + "execution_count": 113, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " precision recall f1-score support\n", + "\n", + " 0 0.95 0.98 0.96 855\n", + " 1 0.87 0.68 0.76 145\n", + "\n", + " accuracy 0.94 1000\n", + " macro avg 0.91 0.83 0.86 1000\n", + "weighted avg 0.94 0.94 0.93 1000\n", + "\n" + ] + }, + { + "data": { + "text/plain": [ + "Text(91.68, 0.5, 'True Labels')" + ] + }, + "execution_count": 113, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAQsAAAELCAYAAADOVaNSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXLElEQVR4nO3dd3xUdbrH8c+TsAqa0KVYURQRvOrqyl5BROyILqKIdQEbXrCLWJYiqKCicNVV115YK2L3qqhgwcUCuqigIBZwAWkqTSAk8Nw/zgHCMEl+gUxJ+L5fr3llTplznplkvjm/U37H3B0RkbLkZLoAEakcFBYiEkRhISJBFBYiEkRhISJBqmW6gPIoXPSDDt1UIjV2bJvpEmQzFK2eY8nGa8tCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJiM4185kU6nXUhJ539P/S9/hYKClYnne+rb6azX9uOvPXu+C1e5+rVq+kz4GY6dD2XMy64nDk/zwdg2rffc1bPK+h01oV07taLN955f4vXJRs8+MBw5s7+gsn/Hrt+3MABVzLrx0lMmvgWkya+RYfjjshghemhsNgM8xcu4snRL/PsI3fx0hP3sXbt2qRf0DVr1vC/9z5Km1YHlmv5c36eT4+Lr95k/AuvvUXN/DzeGPUIfz3tJEbc+wgA1atvy9ABV/Hyk/dz//CbuPWu+1m6bPnmvTnZxMiRo+h4wlmbjL/zrgf508HH8KeDj+GNN8dloLL0qpbOlZlZc6ATsBPgwFzgFXf/Jp11VISiNWsoKFhNtdxqrFxVwA71624yz1OjX+How9sw5ZtvNxr/6phxPPncyxQWFrFfy73p3+cicnNzy1znuPEf0fu8swE45vC2DB3xD9ydJrvuvH6eBjvUo26d2vy2eAk18/O28F0KwPgPP2G33XYue8YqLm1bFmZ2DfAMYMCnwMT4+dNmdm266qgIDXeoT48zTuGok7vRvtOZ5G+/HW3+fNBG88xfuIixH0yg60nHbzT++5k/8ebY9/nnfcN5/vF7yMnJ4bW33g1a74KFv9CoQX0AqlXLJW/77Vi8ZOlG83z19XQKC4vYZafGW/AOJUTvXufw+Wdv8+ADw6ldu1amy0m5dG5ZnAe0dPfC4iPNbAQwFbgl2YvMrCfQE+De4T
dxfrczUl1nmZYsXca74z9mzHOPkp+fR5/+Q3l1zDhOPHZDu/XWO+/nil7nbrLF8MmkyXw97TtOP+8yAAoKCqhbpzYAl153A3PmzqewqJCf5y/klO4XAXB210507ngM7r5JLWa2/vnCRb9y3Q23MaR/H3Jy1MJMpfvuH8lNQ+7A3blh8NXcNmwgF/Tsk+myUiqdYbEW2BGYlTC+cTwtKXd/AHgAoHDRD5t+WzLg40mT2WnHhuu/5Ee2a83kr77eKCymTptB3+uj/PttyVLGfzSR3Nxc3J2/dDiKK3qds8ly77p5IBDts+g3ZDiP3T1so+kNG9Rn3oJFNGqwA0VFa1j++wpq1cwHYPnvv9O770Au6dmd/ffdJxVvW4pZsGDR+ucPPfwkL7/0eAarSY90hsXlwFgzmwH8Jx63K7AncHEa69hijRvuwJdTprFy1Sqqb7stn0yaTMvme200z5jRj61/3u+m4bRr04ojD2vN9z/O4pJrb6Db6Z2pV6c2S5Yu4/cVK9ixUcMy19v+0P/m5dff4YB99+Gt98bz54P2x8woLCzksutu5C/HHcmxR7St6LcrSTRq1IB58xYAcFKnDkydOj3DFaVe2sLC3d80s2ZAK6IdnAbMBia6+5p01VER9mvZnKPbH0rXcy4hNzeX5s2acmqnDjz74v8BcFrnjiW+tunuu3HJBd3oeXk/1vpa/lCtGv2u7B0UFiefcCzX3XgbHbqeS62a+dw2ONrV8+a48Xw2eQqLlyzjpdffAWBIvytp3qxpBbxbeeKf99DusEOoX78uM3+YxOAbbqddu9bsv38L3J1Zs2bTq/c1mS4z5SxZOzhbZUszRMLU2FFbOZVR0eo5lmy89oKJSBCFhYgEUViISBCFhYgEUViISBCFhYgEUViISBCFhYgEUViISJCgsDCzBma2e7FhM7OeZnaHmZ2YuvJEJFuEblk8BlxRbHgwcC9wHPCimfWo2LJEJNuEhsWBwDgAM8sBegF/c/fmwBCiK0pFpAoLDYtawC/x84OAusCT8fA4osvMRaQKCw2L2UCL+HlHYJq7z4mHawGrKrowEckuof1ZPAIMM7OjiMLiumLT/huodB3uikj5BIWFu99sZnOAg4FLiMJjnbrAQymoTUSyiDq/kZRR5zeVU0md35S4ZWFm25VnBe6+orxFiUjlUVozZDnRjYBClX2XHBGptEoLi3MpX1iISBVWYli4+2NprENEsly5bgVgZi2ITsraBXjE3eeZ2Z7AfHdflooCRSQ7BIWFmeURHS7tAhTGr3sTmAcMBX4CrkpRjSKSBULP4BwBtAaOBPKJbhC0zutEF5SJSBUW2gw5GbjM3d81s8SjHrOA3Sq2LBHJNqFbFjXYcCFZonygUt1+UETKLzQsJgLdSpjWBZhQMeWISLYKbYb0B94xs3eA54jOvzjezK4gCovDUlSfiGSJoC0Ld/+QaOfmtsDdRDs4BwN7AEe5+8SUVSgiWSH4PAt3/xfQ1sxqAHWAxboeRGTrsTm9e68iOtdiZQXXIiJZLDgszOx4M5tAFBbzgFVmNsHMOqasOhHJGqG3ArgQeJXoStTLgFPjn8uBV+LpIlKFBXV+Y2azgNfdvVeSafcBx7v7rimobyPq/KZyUec3lVNJnd+ENkPqAS+UMO15oq71RKQKCw2Ld4F2JUxrB3xQMeWISLYqrVu9FsUG7wIeMrN6wEvAAqAB0BnoAJyfwhpFJAuUuM/CzNaycU9Zxdsxnjjs7invVk/7LCoX7bOonMrdYS/QPkW1iEglVFq3eu+nsxARyW7l6lYP1t8YuXrieJ36LVK1hZ6UZWZ2jZl9R3Sq97IkDxGpwkIPnV4KXAs8TLRjcwhwA/AtMBPomYriRCR7hIbFBcD1wLB4+CV3Hwy0BKYBe6WgNhHJIqFhsTsw2d3XEDVDagO4+1rgXqB7SqoTkawRGha/AHnx85+APxabVoeoj04RqcJCj4b8CziYqNv/p4BBZlYXWA1cBIxNTX
kiki1Cw2IQsFP8fChRM6QH0RbF28AlFVyXiGSZoEvUs4VO965cdLp35bQ5p3sHMbNTgFHpuDakfpOjU70KqUB71Gqc6RKkAm1OH5wishVSWIhIEIWFiARRWIhIkNJ6yhoVuIydK6gWEclipR0N2SFwGQWoD06RKq+0zm/UU5aIrKd9FiISRGEhIkEUFiISRGEhIkEUFiISpFxhEXfcu4uZtTaz7VNVlIhkn+CwMLPewBxgFjAe2Dse/4KZXZ6S6kQka4TeCqAvMAJ4EDiCjW9d+B5wWoVXJiJZJbQ/i4uAge4+zMwS+62YDjSr2LJEJNuENkMaAZ+VMG0tSe5QJiJVS2hYfAe0K2HaYcDXFVOOiGSr0GbIHcC9ZrYaGB2Pa2Bm5wFXEt2ESESqsKCwcPeHzKwOMBAYHI9+HVgBDHL3p1JUn4hkieAOe939NjO7D2gN1AN+BT5y9yWpKk5Eske5evd292XAmBTVIiJZLCgs4hOySuXu9255OSKSrUK3LO4uZdq6G/8oLESqsKBDp+6ek/gA6gJnAF8ALVJZpIhk3mbfkczdFwPPmlkt4H7g8AqqSUSyUEVcov4j8KcKWI6IZLEtCgszawz0IQoMEanCQo+GLGTDjsx1tgHygVXAyRVcl4hkmS05GrIKmA286e6/VFxJIpKNygwLM/sD8A7wo7vPTX1JIpKNQvZZrAHGAfukuBYRyWJlhoW7rwVmAA1TX46IZKvQoyH9gIFm9l+pLEZEsldpd1E/DPjc3ZcD/YmuNJ1sZnOA+SQcHXH3VqksVEQyq7QdnO8ChwCfAlPih4hspUoLi/U9eLv7OWmoRUSymO5IJiJByjrP4ngzax6yIHcfWQH1iEiWKissBgYuxwGFhUgVVlZYtAcmpaMQEcluZYXFSnf/PS2ViEhW0w5OEQmisBCRICU2Q+J+NkVEAG1ZiEgghYWIBFFYiEgQhYWIBNns+4bIlsnJyeH98S8xd+58Tjv1Ah59/C723Gt3AGrVqsmSJUtp2/rEDFcp63TreTpdz+6MGYx64iUev/9p9tm3GYNvu45tq29DUdEaBl99K1/+e2qmS00ZhUWG9Ordg+nTvyc/Pw+Ac7pfun7aTUOvY+nSZZkqTRLs1bwpXc/uTJdju1G4uoiHn72L997+kL4DL+Xu2x/kg7ETaHdUG/pefyl/PenCTJebMmqGZMCOOzbi2OPaM/LxUUmndz65I6Ofey3NVUlJmjZrwheffcWqlQWsWbOGTyd8ztHHt8dx8vK3ByAvP48F8xZmuNLU0pZFBtwyrD8D+9+6/g+tuNZtDmbhgkX88P3M9BcmSc345nuu+FtvatepxapVq2h3VBumfPENQ/sN5+FRd3PNoMvIycnhtOPPzXSpKZUVWxZmVmLnOmbW08wmmdmk1YVL01lWShx7XHsWLvyFyZOTdzzW5dQTGf3cq2muSkrz/YyZPPj3kTw6+h4efvbvTJs6g6KiNZxxTheGDhhBuwNOYOiAEQy9Y0CmS00pc0+80VgGijD7yd13LWu+WnlNM1/sFrp+0FWcdsZJFBWtoXr1bcnPz+PVV8bQ8/w+5ObmMm3GBNod2om5c+dlutQt1rBGnUyXkBJX9uvNvLkL6NP/Yg5qevj68Z//8B4H7nF4ia+rLL5dOMmSjU/bloWZfVnC4yu2otsMDB50Oy32PpT9Wrbj3B6X8cH7H9Hz/D4AHN6+Dd9++32VCIqqpm79KPga79SQYzoewWsvjGHBvIW0an0QAIe0PZiZP/wnkyWmXDr3WTQEjgV+SxhvwIQ01pG1TulyAs+rCZKV7n50GLXr1KKosIjB19zK0iXL6H/lTfQbchXVcnMpKFjNgCuHZLrMlEpbM8TMHgYedfcPk0x7yt3PLGsZVaEZsjWpqs2Qqq6kZkjatizc/bxSppUZFCKSWVlxNEREsp/CQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKC
xEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIjCQkSCKCxEJIi5e6ZrEMDMerr7A5muQ8Jsjb8vbVlkj56ZLkDKZav7fSksRCSIwkJEgigsssdW1f6tAra635d2cIpIEG1ZiEgQhYWIBFFYZJiZHWdm083sOzO7NtP1SOnM7BEzW2BmUzJdS7opLDLIzHKBe4AOQAvgDDNrkdmqpAyPAcdluohMUFhkVivgO3f/wd1XA88AnTJck5TC3T8Afs10HZmgsMisnYD/FBueHY8TyToKi8yyJON0LFuyksIis2YDuxQb3hmYm6FaREqlsMisicBeZra7mW0DnA68kuGaRJJSWGSQuxcBFwNjgG+AUe4+NbNVSWnM7GngI2BvM5ttZudluqZ00eneIhJEWxYiEkRhISJBFBYiEkRhISJBFBYiEkRhkUZmNsjMvNhjrpk9b2ZNU7jOE+J1NYmHm8TDJ5RjGV3NrEcF1pQX11DqMuN5Lt7CdQ0ys0Vbsoxiy3rMzCZVxLIqo2qZLmArtIQNVy3uAdwIjDWzlu7+exrW/zNwCDCtHK/pCtQnuuJStlIKi/QrcveP4+cfm9lPwHjgeOC5xJnNrIa7r6yolbt7AfBxmTOKJFAzJPM+i382ATCzmWY23MwGmNlsYGk8PsfMro07ySkws2/NrHvxBVlkUNw5yzIzGwnUTJgnaTPEzC4ws6/MbJWZzTez0WZWy8weA04B2hVrPg0q9rpOZjYpft08MxtmZn9IWPYpcb0rzewDoHkFfG6YWUczezt+v0vN7GMzO6aEeduY2edxnZPN7NAk85xvZlPjz3eWmV1dxvprm9lDcXNylZn9ZGYPVsR7y0bassi8JvHPecXGnQlMBXqz4Xf0d6A7cAPwOXA08IiZ/eLur8XzXAoMBIYSba2cDAwrqwAz6x8v916gL7Ad0BHII2om7QrUjuuB6AI4zKwr8DRwP/A3oClwM9E/oavieQ4EngVeBC4DWgKjyqop0O7Aq8DtwFqiToTeMLPD3P1fxebbDngiru1noE88317uPi+usy/R5zYMeA84CLjRzFa4+90lrH8E0Bq4guj3twtwWAW9t+zj7nqk6QEMAhYRBUA1oBnwLtHWQ+N4nplEf9DVi71uT6IvQ/eE5Y0EJsbPc4muWP1HwjxvE1323iQebhIPnxAP1wZWACNKqXs08F7COANmAY8mjD8XWAnUi4dHAV8TX1oQj+sX19CjjM/LgYsDP9uc+DMdAzyS8Jk7cGaxcXlEHdjcEg/XBJYD1ycs8waiEMiNhx8DJhWbPgW4JNN/V+l6qBmSfvWAwvgxnWgn52nu/nOxeca6+6piw0cShcWLZlZt3QMYCxwQd8+3C9AYeDlhfS+UUc8hQA3g0XK+j2ZEWxyjEmoaB1QH9o3nawW84vG3K7CmIGa2s5k9bmZzgCKiz/SYuLZEL6574u7LiUK0VTzqEGB74Lkk76UhUdcByUwG+ppZbzNLts4qRc2Q9FsCHEX0324eMDfhiwQwP2G4PtGWw5ISltkYaBQ/X5AwLXE4Ub3458+lzrWp+vHP10uYvq6fjkabUVOZzCyH6HL+fKKm13fA70RbAw0SZl/um+4kXgDsFz9f915KuuJ3F6KtqEQXx+sbCNxjZt8BA9z9mXK8lUpDYZF+Re5e1rH6xPD4leg/ZxuiLYxEC9jwu0z8oiQOJ/ol/tmYqIkUal0/lD2BfyeZ/mP8c95m1BRiT+CPQAd3f3PdSDOrkWTevCRHlRqwISDXvZcT2DSoIdoC3IS7LybaT3Spme0HXA08aWZfuvvX5XkzlYGaIZXDOKIti1ruPinJYz
VRX57z2LTD35PLWPZHRPsYupcyz2qipkVx04E5RPtCktW0LoQmAn8xs+JdCJZVU4h1oVCwboSZ7UYUqMl0LjZfHtEO4k/jUes+gx1LeC/LyirG3b8k2jmcQwUd7ck22rKoBNx9upndBzxjZsOASURf3pZAM3c/393XxNNuj89YHE90yHOfMpa92MxuBIZY1FvX68C2REdDBrv7HKITuDqZ2UlER0LmuvtcM+sD/NPMagJvEIXKHsBJQBd3XwHcCnxCtG/jYaJ9GeXpMOYAM+uSMG4h0bkis4HhZjaAqDkymCjAEq2M318e0U7gq4BtgDuLfQaDgDvjwPmA6EvfDGjv7p2TLBMz+5BoX8gUoq3BC4iaQp8mm7/Sy/Qe1q3pQXw0pIx5ZgK3JxlvwOVE7eoCoi/M+0C3hHlujKctA54kOgxb4tGQYq+9kOioRQHRFsoooGY8rT7Rl+LX+LWDir2uA1Ew/U50VGcycBNQrdg8pxLtU1gFfAgcTPjRkGSP9+LpBxN9MVcCM4AebHrEYhBR86ptXFsB8AVwWJL1nU103stK4DeikLuy2PTEZd8GfBV/1ouJjmy1zfTfWaoe6ilLRIJon4WIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEiQ/wc2asYdwKocSwAAAABJRU5ErkJggg==\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "xgb_pred = clf.predict(X_test)\n", + "print(classification_report(y_test, xgb_pred))\n", + "cm = confusion_matrix(y_test, xgb_pred)\n", + "ax = sns.heatmap(cm, square=True, annot=True, cbar=False)\n", + "ax.set_xlabel('Predicted Labels',fontsize = 15)\n", + "ax.set_ylabel('True Labels',fontsize = 15)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Hyperparameter tuning was worth the time as both the recall and f1 score improved compared to choosing hyperparameter manually." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Logistic Regression\n", + "Logisitic regression will require us to standardize our dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 136, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0.41167182, 0.67648946, 0.32758048, ..., 1.99072703, 0.0715836 ,\n", + " 0.08500823],\n", + " [0.41167182, 0.14906505, 0.32758048, ..., 1.56451025, 0.10708191,\n", + " 1.24048169],\n", + " [0.41167182, 0.9025285 , 0.32758048, ..., 0.26213309, 1.57434567,\n", + " 0.70312091],\n", + " ...,\n", + " [0.41167182, 1.83505538, 0.32758048, ..., 0.01858065, 1.73094204,\n", + " 1.3837779 ],\n", + " [0.41167182, 2.08295458, 3.05268496, ..., 0.38390932, 0.81704825,\n", + " 1.87621082],\n", + " [0.41167182, 0.67974475, 0.32758048, ..., 2.66049626, 1.28129669,\n", + " 1.24048169]])" + ] + }, + "execution_count": 136, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from scipy import stats\n", + "import numpy as np\n", + "z = np.abs(stats.zscore(data))\n", + "z" + ] + }, + { + "cell_type": "code", + "execution_count": 137, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "414\n" + ] + }, + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " index Churn AccountWeeks ContractRenewal DataPlan DataUsage \\\n", + "0 0 0 128 1 1 2.70 \n", + "1 1 0 107 1 1 3.70 \n", + "2 2 0 137 1 0 0.00 \n", + "3 6 0 121 1 1 2.03 \n", + "4 8 0 117 1 0 0.19 \n", + "... ... ... ... ... ... ... \n", + "2914 3327 0 79 1 0 0.00 \n", + "2915 3328 0 192 1 1 2.67 \n", + "2916 3329 0 68 1 0 0.34 \n", + "2917 3330 0 28 1 0 0.00 \n", + "2918 3332 0 74 1 1 3.70 \n", + "\n", + " CustServCalls DayMins DayCalls MonthlyCharge OverageFee RoamMins \n", + "0 1 265.1 110 89.0 9.87 10.0 \n", + "1 1 161.6 123 82.0 9.78 13.7 \n", + "2 0 243.4 114 52.0 6.06 12.2 \n", + "3 3 218.2 88 87.3 17.43 7.5 \n", + "4 1 184.5 97 63.9 17.58 8.7 \n", + "... ... ... ... ... ... ... \n", + "2914 2 134.7 98 40.0 9.49 11.8 \n", + "2915 2 156.2 77 71.7 10.78 9.9 \n", + "2916 3 231.1 57 56.4 7.67 9.6 \n", + "2917 2 180.8 109 56.0 14.44 14.1 \n", + "2918 0 234.4 113 100.0 13.30 13.7 \n", + "\n", + "[2919 rows x 12 columns]" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "2919" + ] + }, + "execution_count": 137, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "outliers = list(set(np.where(z > 3)[0]))\n", + "\n", + "print(len(outliers))\n", + "\n", + "new_data = data.drop(outliers,axis = 0).reset_index(drop = False)\n", + "display(new_data)\n", + "\n", + "y_new = y[list(new_data[\"index\"])]\n", + "len(y_new)" + ] + }, + { + "cell_type": "code", + "execution_count": 143, + "metadata": {}, + "outputs": [], + "source": [ + "X_new = new_data.drop(['index', 'Churn'], axis = 1)\n", + "\n", + "from sklearn.preprocessing import StandardScaler\n", + "X_scaled = StandardScaler().fit_transform(X_new)" + ] + }, + { + "cell_type": "code", + "execution_count": 156, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Training accuracy: 0.8962310327949095\n", + "Test accuracy: 0.910958904109589\n" + ] + } + ], + "source": [ + 
"from sklearn.linear_model import LogisticRegressionCV\n", + "from sklearn.model_selection import train_test_split\n", + "X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_new, test_size = 0.3, random_state = 19, stratify = y_new)\n", + "model = LogisticRegressionCV(cv = 3,solver = 'sag', max_iter = 1000, random_state = 9)\n", + "model.fit(X_train, y_train)\n", + "\n", + "print(\"Training accuracy: \", model.score(X_train, y_train))\n", + "print(\"Test accuracy: \", model.score(X_test, y_test))" + ] + }, + { + "cell_type": "code", + "execution_count": 157, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Precision: 0.8795475113122172\n", + "Recall: 0.6120192307692308\n", + "Accuracy: 0.910958904109589\n", + " precision recall f1-score support\n", + "\n", + " 0 0.91 0.99 0.95 780\n", + " 1 0.85 0.23 0.36 96\n", + "\n", + " accuracy 0.91 876\n", + " macro avg 0.88 0.61 0.66 876\n", + "weighted avg 0.91 0.91 0.89 876\n", + "\n" + ] + } + ], + "source": [ + "preds = model.predict(X_test)\n", + "print(\"Precision: \", (precision_score(y_test, preds, average='macro')))\n", + "print(\"Recall: \",(recall_score(y_test, preds, average='macro')))\n", + "print(\"Accuracy: \", (accuracy_score(y_test, preds)))\n", + "print(classification_report(y_test, preds))" + ] + }, + { + "cell_type": "code", + "execution_count": 158, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Text(91.68, 0.5, 'Actual Value')" + ] + }, + "execution_count": 158, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAQsAAAELCAYAAADOVaNSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXlUlEQVR4nO3dd5gV5dnH8e+9gIJSRJG2gNiIRt/ERGPBV42CiIgikqCxACpi7yUSDIrYgmKMUV9jDGoSAxKNjSAGQYwNBIXEBooFpYv0zi73+8cMuBzOnn0WTpnd/X2u61y7M8+Ue8/u+e3MM83cHRGRihQVugARqRoUFiISRGEhIkEUFiISRGEhIkFqF7qAytiw6HMduqlC6rU8qtAlyDYoWT/H0o3XloWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEiQ2oUuoCr6YtZsrht45+bh2XPncVnfczjn9O6bx61YuYobbx3CvAXfUFpSSp8ze9D9pE7btd7169fTf/BQPprxKbs0asg9t/anuEUzpn/yGYPveYCVq1ZTVKuIfr3O4MSOx2zXuiSzoqIiJk18iblz5tOte+9Cl5MXCottsOcerXjmiQcBKC0t5bhTz6HDMe23mGb4My+yd9s2PDhkEIuXLKXrLy6ga6djqVOnToXLnzNvAQNuH8rjDwzZYvw/Rv2Lhg3q89LIYYx+ZQL3PjSMoYP7U7fujtzx6+vYo3UxC7/5lp7nX86Rhx1Mwwb1s/dDyxauuLwv06d/SsMGDQpdSt7kdTfEzPYzs1+a2f1m9rv4+/3zWUO2TZwyjdbFLWjZvNkW482MVavX4O6sXrOWRg0bUKtWLQBefHk8Z/S9kh69L2XQkPspLS0NWtf419+mW5eOAHT66VFMenca7k7bNq3Yo3UxAE13341dG+/CkqXLsvhTSlnFxS3ocmIHhg0bXuhS8ipvYWFmvwRGAAa8A0yOvx9uZjfmq45se2nca3RJs8l/Zo+T+fzLrzm221l073UxN151EUVFRXz25VeMGfcaf3l4KM888SBFRUWM+terQeta+M23NG/aBIDatWtRf+edWLps+RbTvP/RDDZsKKF1cYvt/+EkrXuHDuLG/rexcePGQpeSV/ncDTkfOMDdN5QdaWb3Ah8Cd6Wbycz6Af0AHhp6G317/SLXdQbbsGEDE96YxFUXnbtV25vvvMt+++7FsN/fxddz5nHBVb/i4B8ewKQp0/ho+kzOOP9KANatW8eujXcB4Ir+tzJn7gI2lGxg3oJv6NH7UgDO7tmN7id1wt23Wo+Zbf7+m0WL6X/r3dx+07UUFanvOhdO6tKRhQsX8d7U9znm6CMKXU5e5TMsNgItgVkp41vEbWm5+yPAIwAbFn2+9aelgF6fOIX92+1Nk10bb9X27D/H0vfsnpgZbVq1pLhFc76YNRt355QTO3L1xVsHzP13DgTK77No1rQJ8xcuonnT3SkpKWXlqtU0ahjtM69ctYpLrh/I5f1688MDq/SeXaK1b38IJ3ftxImdj6Nu3R1p2LABTzx+P737XFHo0nIun/9+rgLGmdlLZvZI/BoDjAOuzGMdWTN67AS6HP/TtG0tmu3OxHenAbBo8RK+/Go2rVo25/BDDmLshDf4dslSAJYtX8Hc+QuC1nfs/x7O86NfAeBfE17nsIN/iJmxYcMGruw/mFM6d+CE447a3h9LMhhw01203esQ9ml3OGedfQmvvvpmjQgKyOOWhbu
PMbN2wKFAMVF/xWxgsruH9fAlyJq1a3l78lRuvuG7P5Snnv0nAKd3P4mL+pzJgNuH0v2ci3F3rr7kPBrv0ojGuzTi8gt60e+qAWz0jdSpXZsB11yyVQdpOqd1PYH+g+/mxJ7n0ahhA+4eFHX1jBn/Ou9O+4Cly1bwXBwmtw+4hv3a7Z2Dn1xqKku3H5xUSdsNkczqtdRWTlVUsn6OpRuvXjARCaKwEJEgCgsRCaKwEJEgCgsRCVKpQ6dm9n3gYKA1MMzd55vZPsACd1+RiwJFJBmCwsLM6gPDgB5ASTzfGGA+cAfwFXBdjmoUkQQI3Q25F2gPdAQaEJ1QtclooHOW6xKRhAndDTkNuNLdXzWzWilts4A9sluWiCRN6JZFPeDbctoaAFXudG0RqZzQsJgM9Cqn7WfAW9kpR0SSKnQ35CbgFTN7Bfg74EAXM7uaKCyOzlF9IpIQQVsW7v4G0AHYEXiAqINzELAX0NHdJ+esQhFJhODzLNz9TeAoM6sHNAaWuvvqnFUmIolS6ftZuPsaYE0OahGRBAs9KWtkRdO4e8/tL0dEkip0y2L3NON2Bb5HdEh1RtYqEpFECgoLdz823Xgzaw08C/w2m0WJSPJs11Wn7v41cCcwpKJpRaRqy8Yl6qVAqywsR0QSLLSD8/tpRu8A7A8MJjrDU0SqsdAOzg+IztpMZURB0TdrFYlIIoWGRboOzrXAbHefk8V6RCShQo+GvJbrQkQk2coNCzPbqTIL0qnfItVbpi2LlaTvpyhP6k1xRKQayRQW51G5sBCRaqzcsHD3x/NYh4gknJ4bIiJBgi9RN7PTgQuAdkDd1HZ3b5rFukQkYYK2LMzsTOAJYCbRqd0vAKPi+ZcT3T1LRKqx0N2Q64lO6740Hn7I3c8D9gQWATpsKlLNhYbFvsCb7l5KdOFYQ4D4kYW/AS7LTXkikhShYbGM6Ga9AHOILiDbxIDdslmUiCRPaAfnFOAHwMtE/RUDzawEWA8MBCblpjwRSYrQsLiT7x5RODD+/iGiszYnA/2yX5qIJEmma0M+BJ4EnnL3icBEAHdfCnQzsx2BHd19eT4KFZHCytRn8QVwM/CJmU0ysyvNrOWmRndfp6AQqTnKDQt37wo0Ay4kOpfiHuArMxtvZn3NrHGeahSRBMh4NMTdl7r7o+5+PFAMXEW06/IHYL6ZvWhmZ5rZzrkvVUQKydwrf2GpmRUDpwNnAAcDa9y9fpZr28qGRZ/rKtgqpF7LowpdgmyDkvVzLN34bb2QzIGN8de0CxaR6iU4LMysiZldbGYTgK+IztxcAJwF6CIykWou43kWZtYIOI1od+NYonB5DbgIeMbdl+S8QhFJhEznWTwPdCI6zfsdoovJnnL3+XmqTUQSJNOWxV5EV5oOd/cv8lSPiCRUptvq/U8+CxGRZNNt9UQkiMJCRIIE34MzCfbb72eFLkEqoU6tKvXnJRXQloWIBFFYiEiQTOdZjKzEctzdT89CPSKSUJl2KnfPWxUikniZzrM4Np+FiEiyqc9CRIJU5vGFDYBulP/4whuyWJeIJExQWJjZ3sCbwE7AzsA3wK7x/EuIniuisBCpxkJ3Q35L9OyQZkQ3u+kC1APOBlYS3TVLRKqx0N2QQ4G+wLp4eIf4UYZ/M7MmwO+A9jmoT0QSInTLoi6w3N03AouBlmXaPgB+mO3CRCRZQsPiE757ItlU4CIzq2tmdYDzgbm5KE5EkiN0N2QEcBDwF+DXRM88XU50097aQJ8c1CYiCRIUFu5+b5nvJ5rZgcCJRLsn4939gxzVJyIJsU3XELv718AjWa5FRBIs9DyLLhVN4+6jt78cEUmq0C2LUaR/oFDZJ4TVykpFIpJIoWGxZ5pxuxI9KqAPcG62ChKRZArt4JyVZvQsYKqZlQK/Ak7JZmEikizZuOp0KnBcFpYjIgm2XWFhZjsQ7YbMy0o1IpJYoUdDJrNlZybADkBboAHqsxCp9kI7OD9k67BYC/wdeM7dP8xqVSK
SOKEdnH1yXIeIJFxQn4WZjTez/cppa2dm47NblogkTWgH50+BhuW0NQSOzko1IpJYlTkaktpnseloyHHA/KxVJCKJlOkhQzcDA+NBByaapZ7tvdndWa5LRBImUwfnaGAR0fUg9wNDgS9TplkPTHf313NSnYgkRqaHDE0GJgOY2QpglLt/m6/CRCRZQvsspgGHpWswsy5m9oOsVSQiiVSZRwGkDQvgJ3G7iFRjoWHxY6KHDKXzNvCj7JQjIkkVGha1iJ5Els7ORNeJiEg1FhoWk4F+5bT1I3pamYhUY6EXkt0CvGJmk4AniE7CagH0InrA0PE5qU5EEiP0QrJ/m1kn4E7g90TnXmwEJgHH6zwLkeov+FEA7j4BOMLMdgIaA0vcfTWAmdVx9w25KVFEkqDSd8py99XuPgdYY2bHmdkf0bUhItVepR8yZGaHAb8AegLNiB6UPCLLdYlIwoTeVu9AooA4g+hWeuuJDpdeAzzo7iW5KlBEkqHc3RAz28vMfmVm7wP/Aa4DPiY6ArIvUSfnVAWFSM2QactiJtGl6ZOAC4Fn3H0JgJk1ykNtIpIgmTo4ZxFtPRxIdKes9ma2TQ9SFpGqr9ywcPc9gSOJTsLqALwILIiPfnQgzZ2zRKT6ynjo1N3fdvfLgWLgBOB5oAfwdDzJBWZ2SG5LFJEkCDrPwt03uvtYdz8PaA6cRvTMkO7AJDP7OIc1ikgCbMtJWevd/Tl3P4PoPIteRJ2hIlKNbdezTt19lbs/6e4nZ6sgEUmmbDxFXSphz3324MVXh29+Tfvi3/S58MzN7X0vPYfPFr1H4113KVyRsoVWrVowZswIpk4dx7vvjuXSS6NH+95xx6+YNm0c77wzhqee+gONGpX3aJ3qwdyrzkGNvZv8uOoUG6CoqIi33h/DaSf0Zu7sebRo2Yw77hvI3vu2pVuHs1iyeGmhS9wuc1dVj/s7N2/elObNmzJt2gfUr78zb701ip49+1Fc3JwJE96itLSU2267EYCbbrqrwNVuvzVrZqV95oe2LAqo/dGH8tWXs5k7ex4AA267lt8Muo+qFOA1wfz5C5k27QMAVq5cxfTpM2nZshnjxr1OaWkpAO+8M5Xi4haFLDPnFBYF1LX7Cbz4j5cB6ND5aBbMW8j0Dz8tcFWSSZs2rTjooAOYPHnaFuN79erJyy9PKEhN+ZKIsDCzczO09TOzKWY2ZfnaRfksK6fq1KlNh85HM/qFsdStV5dLrj6f3971cKHLkgx23nknhg9/mOuvv5UVK1ZuHn/DDZdRWlrCiBHPFrC63EtEWACDymtw90fc/RB3P6Rh3Sb5rCmnjul4JB/+dzrffrOYNm1b0bpNMf98bQSvvTeK5i2b8sL4J2nSdLdClymx2rVrM3z4wzz11HM8//yYzePPOqsHXbp0oE+fKwtYXX7k7VoPM/tveU1E52vUKCef1nnzLsgnH8/k0P07bm577b1RnNrx7CrfwVmdPPzwEGbMmMn99z+6edzxxx/DtddeTKdOPVmzZm0Bq8uPfF4Y1ozolPElKeMNeCuPdRRc3Xp1OfKYwxhwze2FLkUCtG9/CGed1YP33/+YiRNHA3DzzXczdOgt7LjjDowa9Vcg6uS84ooBhSw1p/J26NTM/gQ85u5vpGn7m7ufmWa2LVS3Q6fVXXU5dFrTlHfoNG9bFu5+foa2CoNCRAorKR2cIpJwCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEg5u6FrkEAM+v
n7o8Uug4JUxN/X9qySI5+hS5AKqXG/b4UFiISRGEhIkEUFslRo/Z/q4Ea9/tSB6eIBNGWhYgEUViISBCFRYGZWWczm2FmM83sxkLXI5mZ2TAzW2hmHxS6lnxTWBSQmdUCHgROBL4P/MLMvl/YqqQCjwOdC11EISgsCutQYKa7f+7u64ERQLcC1yQZuPu/gcWFrqMQFBaFVQx8XWZ4djxOJHEUFoVlacbpWLYkksKisGYDrcsMtwLmFqgWkYwUFoU1GdjXzPY0sx2AM4AXClyTSFoKiwJy9xLgMuBl4GNgpLt/WNiqJBMzGw68DXzPzGab2fmFrilfdLq3iATRloWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYFIiZ3WJmXuY118yeMbO9c7jOrvG62sbDbePhrpVYRk8z65PFmurHNaRdppkdErf3KKe9mZmVmNkNgeubYGZPb0fJNZbCorCWAUfEr+uAg4BxZrZzntY/L173G5WYpyfQJyfVpOHuU4BPiU5YS+fnRH/HT+WrpppKYVFYJe4+MX79DegN7AF0STexmdXL5srdfV287qXZXG4OjABOMrP6adrOAN5y91l5rqnGUVgky7vx17YAZvalmQ01s1+b2WxgeTy+yMxujG+Ys87MPjGz3mUXZJFb4hu1rDCzPwMNU6ZJuxtiZheY2ftmttbMFpjZ02bWyMweB3oAx5TZfbqlzHzdzGxKPN98MxtiZnVSlt0jrneNmf0b2C/gfRkO1CPl8n0zaw20j9sxs2vNbLKZLYvrftHM9sm0YDN73MymVPS+hLzn1Z3CIlnaxl/nlxl3JnAMcAlwejzu98BNRHeYPgl4FhiW8qG/AhgYT/MzYA0wpKICzOwm4A/Aa8CpwMVEu0v1gcHAq8BUvtt9ejSeryfwD+Ad4BRgENGDeO4ss+wfE+0u/Ac4jeg6mJEV1eTuH8fzpO6KnA5sBP4eD7cCHiAKlQuAWsCbZtaoonUECHnPqzd316sAL+AWYBFQO361I/ogLgdaxNN8SdSvULfMfPsQfUB6pyzvz8Dk+PtaRFev/l/KNGOJLoFvGw+3jYe7xsO7AKuBezPU/TQwIWWcAbOAx1LGn0cUUrvFwyOBj4gvM4jHDYhr6FPB+/VLYB3QuMy4KcDL5Uxfi2hrZAXQq8z4CcDTZYYfB6akzJv6vlT4nteEl7YsCms3YEP8mgHsBZzu7vPKTDPO3deWGe5A9If7rJnV3vQCxgEHxbfqaw20AJ5PWd8/KqjnCKIP2GOV/DnaAW2AkSk1jQfqAgfG0x0KvODxJy2wpk1GAHWA7gDxUaODiXdB4nGHm9lYM/sWKCEKvvpxfdsj5D2v9moXuoAabhnQkei/2HxgbsoHCWBBynATov+ay8pZZgugefz9wpS21OFUu8Vf52WcamtN4q+jy2nfdM+O5ttQEwDuPsvM3ibaFRkWf11HtDuAmbUB/kW0G3Qh0ZbVeuCfRIG1PULe89nbuY7EU1gUVolHhwYzSQ2PxUT/NY8k+m+XaiHf/V6bprSlDqf6Nv7agmgXKdSme1L2I+rPSPVF/HX+NtRU1nDgPjNrShQWo9190we4M7AT0M3dVwHE//13rWCZa4EdUsalzhPynld7CouqZzzRf7lG7j423QRm9jXRB7MbMKZM02kVLPttoj6G3kTnfaSznq3/U88A5hD1hfwxw/InA6eYWf8yW1AV1VTWSOA+oo7bA4Fby7TVI/ogl5QZ15OK/8ZnA23NrG6Z3b3jU6ap8D2vCRQWVYy7zzCzh4ERZjaEqJOvLnAA0M7d+7p7adx2j5ktAl4nOuS5fwXLXmpmg4Hb4zt3jQZ2JOr9H+Tuc4DpQDczO5XogzbX3eea2bXAX8ysIfASUajsRXRE5Wfuvhr4DTCJqG/jT0Qf+OCbx7j7QjMbT3RkaCUwqkzzpg/0Y/GyDyAKvKUVLPY5otB5ND40/CPg3JT1Vvieh/4MVVqhe1hr6ov4aEgF03wJ3JNmvAFXAR8
S7bd/Q3Sos1fKNIPjthXAk0SHYcs9GlJm3guJjlqsI9pCGQk0jNuaEPUTLI7nvaXMfCcSBdMqoqM604DbgNplpvk5MJNo8/8N4CcEHA0pM/+58fR/TdPWC/iMaOtoInBY6ntIytGQeFyfeL7VRAHUPvV9CXnPq/tLd8oSkSA6dCoiQRQWIhJEYSEiQRQWIhLk/wFCDOi3LXwE1QAAAABJRU5ErkJggg==\n", + "text/plain": [ + "
" ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "cmx = confusion_matrix(y_test, preds)\n", + "ax = sns.heatmap(cmx, square= True, annot= True, cbar= False)\n", + "ax.set_xlabel(\"Predicted Value\", fontsize = 15)\n", + "ax.set_ylabel(\"Actual Value\", fontsize = 15)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Logistic regression performed worse than the other models on recall and F1 score, which suggests that it is best used when the dataset is not imbalanced." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Evaluation\n", + "Choosing the best model is demanding, but data processing and analysis are more demanding still. After training Decision Tree, XGBoost and Logistic Regression models on the data, XGBoost clearly performed best, most likely because it builds several trees in sequence, each one trained on the errors of the previous tree. Tuning its hyperparameters with grid search from the scikit-learn library gave the best result, so the XGBoost model was selected. The dataset is imbalanced, so most models will be biased toward the majority class, which in this case is 'Customer not churn (0)'. The best-performing model produced 98 true positives and 47 false negatives, which gives a low recall (98 / (98 + 47), roughly 0.68). \n", + "* Other confusion-matrix counts:\n", + "* False positives: 15\n", + "* True negatives: 840 (shown as 8.4e02 on the heatmap)\n", + "* The model could likely be improved if the dataset contained more positive (churn) cases, i.e. more data for the minority class." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Preprocessing\n", + "\n", + "- Are there any duplicated values?\n", + "- Do we need to do feature scaling?\n", + "- Do we need to generate new features?\n", + "- Split Train and Test dataset. 
(0.7/0.3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ML Application\n", + "\n", + "- Define models.\n", + "- Fit models.\n", + "- Evaluate models on both the train and test datasets.\n", + "- Generate a confusion matrix and Accuracy, Recall, Precision and F1-Score values.\n", + "- Check for overfitting and underfitting. If either occurs, try to overcome it in a separate section." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Evaluation\n", + "\n", + "- Select the best-performing model and explain why you chose it.\n", + "- Analyse the results and comment on how the model could be improved." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/My Project/09-11-2020 ML Course Nigeria Project 'Abdulhameed Araromi'.ipynb b/My Project/09-11-2020 ML Course Nigeria Project 'Abdulhameed Araromi'.ipynb new file mode 100644 index 0000000..52b2795 --- /dev/null +++ b/My Project/09-11-2020 ML Course Nigeria Project 'Abdulhameed Araromi'.ipynb @@ -0,0 +1,1675 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Project\n", + "\n", + "In this project, our aim is to build a model for predicting churn. Churn is the percentage of customers that stopped using your company's product or service during a certain time frame. 
Thus, in the given dataset, our label will be the `Churn` column.\n", + "\n", + "## Steps\n", + "- Read the `churn.csv` file and describe it.\n", + "- Perform at least 4 different analyses in the Exploratory Data Analysis section.\n", + "- Pre-process the dataset to get it ready for ML application. (Check for missing data and handle it; do we need scaling, feature extraction, etc.?)\n", + "- Define an appropriate evaluation metric for our case (classification).\n", + "- Train and evaluate Logistic Regression, Decision Trees and one other appropriate algorithm of your choice from the scikit-learn library.\n", + "- Is there any overfitting or underfitting? Interpret your results and, if there is a problem, try to overcome it in a new section.\n", + "- Create a confusion matrix for each algorithm and display Accuracy, Recall, Precision and F1-Score values.\n", + "- Analyse and compare the results of the 3 algorithms.\n", + "- Select the best-performing model based on your chosen evaluation metric on the test dataset.\n", + "\n", + "\n", + "Good luck :)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "

Abdulhameed Temitope Araromi

" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Data" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import seaborn as sns\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
" ], + "text/plain": [ + " Churn AccountWeeks ContractRenewal DataPlan DataUsage CustServCalls \\\n", + "0 0 128 1 1 2.7 1 \n", + "1 0 107 1 1 3.7 1 \n", + "2 0 137 1 0 0.0 0 \n", + "3 0 84 0 0 0.0 2 \n", + "4 0 75 0 0 0.0 3 \n", + "\n", + " DayMins DayCalls MonthlyCharge OverageFee RoamMins \n", + "0 265.1 110 89.0 9.87 10.0 \n", + "1 161.6 123 82.0 9.78 13.7 \n", + "2 243.4 114 52.0 6.06 12.2 \n", + "3 299.4 71 57.0 3.10 6.6 \n", + "4 166.7 113 41.0 7.42 10.1 " + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Read csv\n", + "data = pd.read_csv(\"churn.csv\")\n", + "data.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 3333 entries, 0 to 3332\n", + "Data columns (total 11 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 Churn 3333 non-null int64 \n", + " 1 AccountWeeks 3333 non-null int64 \n", + " 2 ContractRenewal 3333 non-null int64 \n", + " 3 DataPlan 3333 non-null int64 \n", + " 4 DataUsage 3333 non-null float64\n", + " 5 CustServCalls 3333 non-null int64 \n", + " 6 DayMins 3333 non-null float64\n", + " 7 DayCalls 3333 non-null int64 \n", + " 8 MonthlyCharge 3333 non-null float64\n", + " 9 OverageFee 3333 non-null float64\n", + " 10 RoamMins 3333 non-null float64\n", + "dtypes: float64(5), int64(6)\n", + "memory usage: 286.6 KB\n" + ] + } + ], + "source": [ + "# Use .info() to get the dtype and non-null count of each feature in the dataset\n", + "# Analyze missing values\n", + "data.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
" + ], + "text/plain": [ + " Churn AccountWeeks ContractRenewal DataPlan DataUsage \\\n", + "count 3333.000000 3333.000000 3333.000000 3333.000000 3333.000000 \n", + "mean 0.144914 101.064806 0.903090 0.276628 0.816475 \n", + "std 0.352067 39.822106 0.295879 0.447398 1.272668 \n", + "min 0.000000 1.000000 0.000000 0.000000 0.000000 \n", + "25% 0.000000 74.000000 1.000000 0.000000 0.000000 \n", + "50% 0.000000 101.000000 1.000000 0.000000 0.000000 \n", + "75% 0.000000 127.000000 1.000000 1.000000 1.780000 \n", + "max 1.000000 243.000000 1.000000 1.000000 5.400000 \n", + "\n", + " CustServCalls DayMins DayCalls MonthlyCharge OverageFee \\\n", + "count 3333.000000 3333.000000 3333.000000 3333.000000 3333.000000 \n", + "mean 1.562856 179.775098 100.435644 56.305161 10.051488 \n", + "std 1.315491 54.467389 20.069084 16.426032 2.535712 \n", + "min 0.000000 0.000000 0.000000 14.000000 0.000000 \n", + "25% 1.000000 143.700000 87.000000 45.000000 8.330000 \n", + "50% 1.000000 179.400000 101.000000 53.500000 10.070000 \n", + "75% 2.000000 216.400000 114.000000 66.200000 11.770000 \n", + "max 9.000000 350.800000 165.000000 111.300000 18.190000 \n", + "\n", + " RoamMins \n", + "count 3333.000000 \n", + "mean 10.237294 \n", + "std 2.791840 \n", + "min 0.000000 \n", + "25% 8.500000 \n", + "50% 10.300000 \n", + "75% 12.100000 \n", + "max 20.000000 " + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data.describe()" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Churn 0\n", + "AccountWeeks 0\n", + "ContractRenewal 0\n", + "DataPlan 0\n", + "DataUsage 0\n", + "CustServCalls 0\n", + "DayMins 0\n", + "DayCalls 0\n", + "MonthlyCharge 0\n", + "OverageFee 0\n", + "RoamMins 0\n", + "dtype: int64" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data.isna().sum()" + ] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "# Exploratory Data Analysis" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEGCAYAAACUzrmNAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAP60lEQVR4nO3dcayd9V3H8fdnMBm6ESEUVtrOsqVTCyqEayXyh0yi1CWmbHNLMRuNErsQZkaymMD+ENQ0WSLbHHPDdBmDmm2k2YZUBSfD6VxkY7dLs9JiXR0Id630spmARtF2X/84T8NZe3p/p7c959z2vl/JyXnO93l+z/lecssnz/P8nuemqpAkaS6vmHQDkqSFz7CQJDUZFpKkJsNCktRkWEiSms6cdAOjcv7559fKlSsn3YYknVK2b9/+fFUtObJ+2obFypUrmZ6ennQbknRKSfJvg+qehpIkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDWdtndwn6grfm/LpFvQArT9j2+YdAvSRHhkIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqSmkYVFkhVJvpzkySS7kry3q9+R5LtJdnSvN/eNuS3J3iR7klzbV78iyc5u3V1JMqq+JUlHO3OE+z4IvK+qvpnkNcD2JI906z5cVXf2b5xkNbAeuAS4CPhSkjdW1SHgbmAj8DXgIWAt8PAIe5ck9RnZkUVV7a+qb3bLLwJPAsvmGLIOuL+qXqqqp4C9wJokS4FzquqxqipgC3DdqPqWJB1tLNcskqwELge+3pXek+RbSe5Jcm5XWwY82zdspqst65aPrA/6no1JppNMz87OnswfQZIWtZGHRZJXA58HbqmqF+idUnoDcBmwH/jg4U0HDK856kcXqzZX1VRVTS1ZsuREW5ckdUYaFkleSS8oPl1VXwCoqueq6lBV/QD4BLCm23wGWNE3fDmwr6svH1CXJI3JKGdDBfgk8GRVfaivvrRvs7cAT3TL24D1Sc5KcjGwCni8qvYDLya5stvnDcCDo+pbknS0Uc6Gugp4F7AzyY6u9n7g+iSX0TuV9DTwboCq2pVkK7Cb3kyqm7uZUAA3AfcCZ9ObBeVMKEkao5GFRVV9lcHXGx6aY8wmYNOA+jRw6cnrTpJ0PLyDW5LUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lS08jCIsmKJF9O8mSSXUne29XPS/JIkm937+f2jbktyd4ke5Jc21e/IsnObt1dSTKqviVJRxvlkcVB4H1V9dPAlcDNSVYDtwKPVtUq4NHuM9269cAlwFrg40nO6PZ1N7ARWNW91o6wb0nSEUYWFlW1v6q+2S2/CDwJLAPWAfd1m90HXNctrwPur6qXquopYC+wJslS4JyqeqyqCtjSN0aSNAZjuWaRZCVwOfB14MKq2g+
9QAEu6DZbBjzbN2ymqy3rlo+sD/qejUmmk0zPzs6e1J9BkhazkYdFklcDnwduqaoX5tp0QK3mqB9drNpcVVNVNbVkyZLjb1aSNNBIwyLJK+kFxaer6gtd+bnu1BLd+4GuPgOs6Bu+HNjX1ZcPqEuSxmSUs6ECfBJ4sqo+1LdqG7ChW94APNhXX5/krCQX07uQ/Xh3qurFJFd2+7yhb4wkaQzOHOG+rwLeBexMsqOrvR/4ALA1yY3AM8DbAapqV5KtwG56M6lurqpD3bibgHuBs4GHu5ckaUxGFhZV9VUGX28AuOYYYzYBmwbUp4FLT153kqTj4R3ckqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lS01BhkeTRYWqSpNPTmXOtTPIq4EeB85OcC6RbdQ5w0Yh7kyQtEHOGBfBu4BZ6wbCdl8PiBeBjo2tLkrSQzBkWVfUR4CNJfreqPjqmniRJC0zryAKAqvpokl8EVvaPqaotI+pLkrSADBUWSf4ceAOwAzjUlQswLCRpERgqLIApYHVV1SibkSQtTMPeZ/EE8NpRNiJJWriGDYvzgd1Jvphk2+HXXAOS3JPkQJIn+mp3JPlukh3d6819625LsjfJniTX9tWvSLKzW3dXkhz5XZKk0Rr2NNQd89j3vcCfcvR1jQ9X1Z39hSSrgfXAJfSm6X4pyRur6hBwN7AR+BrwELAWeHge/UiS5mnY2VD/cLw7rqqvJFk55ObrgPur6iXgqSR7gTVJngbOqarHAJJsAa7DsJCksRr2cR8vJnmhe/1PkkNJXpjnd74nybe601TndrVlwLN928x0tWXd8pH1Y/W5Mcl0kunZ2dl5tidJOtJQYVFVr6mqc7rXq4C30TvFdLzupjcF9zJgP/DBrj7oOkTNUT9Wn5uraqqqppYsWTKP9iRJg8zrqbNV9RfAL89j3HNVdaiqfgB8AljTrZoBVvRtuhzY19WXD6hLksZo2Jvy3tr38RX07rs47nsukiytqv3dx7fQm5ILsA34TJIP0bvAvQp4vKoOdafArgS+DtwA+NgRSRqzYWdD/Xrf8kHgaXoXpY8pyWeBq+k9sXYGuB24Osll9ILmaXoPKqSqdiXZCuzu9n9zNxMK4CZ6M6vOpndh24vbkjRmw86G+q3j3XFVXT+g/Mk5tt8EbBpQnwYuPd7vlySdPMPOhlqe5IHuJrvnknw+yfL2SEnS6WDYC9yfondd4SJ6U1f/sqtJkhaBYcNiSVV9qqoOdq97AeemStIiMWxYPJ/knUnO6F7vBL43ysYkSQvHsGHx28A7gH+ndzPdbwDHfdFbknRqGnbq7B8BG6rqPwCSnAfcSS9EJEmnuWGPLH72cFAAVNX3gctH05IkaaEZNixe0ffQv8NHFsMelUiSTnHD/g//g8A/Jfkcvbuv38GAG+gkSaenYe/g3pJkmt7DAwO8tap2j7QzSdKCMfSppC4cDAhJWoTm9YhySdLiYlhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkppGFRZJ7khxI8kRf7bwkjyT5dvfe/3e9b0uyN8meJNf21a9IsrNbd1eSjKpnSdJgozyyuBdYe0TtVuDRqloFPNp9JslqYD1wSTfm40nO6MbcDWwEVnWvI/cpSRqxkYVFVX0F+P4R5XXAfd3yfcB1ffX7q+qlqnoK2AusSbIUOKeqHquqArb0jZEkjcm4r1lcWFX7Abr3C7r6MuDZvu1mutqybvnI+kBJNiaZTjI9Ozt7UhuXpMVsoVzgHnQdouaoD1RVm6tqqqqmlixZctKak6TFbtxh8Vx3aonu/UBXnwFW9G23HNjX1ZcPqEuSxmjcYbEN2NAtbwAe7KuvT3JWkovpXch
+vDtV9WKSK7tZUDf0jZEkjcmZo9pxks8CVwPnJ5kBbgc+AGxNciPwDPB2gKralWQrsBs4CNxcVYe6Xd1Eb2bV2cDD3UuSNEYjC4uquv4Yq645xvabgE0D6tPApSexNUnScVooF7glSQuYYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNEwmLJE8n2ZlkR5LprnZekkeSfLt7P7dv+9uS7E2yJ8m1k+hZkhazSR5ZvKmqLquqqe7zrcCjVbUKeLT7TJLVwHrgEmAt8PEkZ0yiYUlarBbSaah1wH3d8n3AdX31+6vqpap6CtgLrBl/e5K0eE0qLAr42yTbk2zsahdW1X6A7v2Crr4MeLZv7ExXO0qSjUmmk0zPzs6OqHVJWnzOnND3XlVV+5JcADyS5J/n2DYDajVow6raDGwGmJqaGriNJOn4TSQsqmpf934gyQP0Tis9l2RpVe1PshQ40G0+A6zoG74c2DfWhqUF5pk//JlJt6AF6HW/v3Nk+x77aagkP5bkNYeXgV8FngC2ARu6zTYAD3bL24D1Sc5KcjGwCnh8vF1L0uI2iSOLC4EHkhz+/s9U1d8k+QawNcmNwDPA2wGqaleSrcBu4CBwc1UdmkDfkrRojT0squo7wM8NqH8PuOYYYzYBm0bcmiTpGBbS1FlJ0gJlWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajplwiLJ2iR7kuxNcuuk+5GkxeSUCIskZwAfA34NWA1cn2T1ZLuSpMXjlAgLYA2wt6q+U1X/C9wPrJtwT5K0aJw56QaGtAx4tu/zDPALR26UZCOwsfv4n0n2jKG3xeB84PlJN7EQ5M4Nk25BR/P387DbczL28hODiqdKWAz6L1BHFao2A5tH387ikmS6qqYm3Yc0iL+f43GqnIaaAVb0fV4O7JtQL5K06JwqYfENYFWSi5P8CLAe2DbhniRp0TglTkNV1cEk7wG+CJwB3FNVuybc1mLiqT0tZP5+jkGqjjr1L0nSDzlVTkNJkibIsJAkNRkWmpOPWdFCleSeJAeSPDHpXhYDw0LH5GNWtMDdC6yddBOLhWGhufiYFS1YVfUV4PuT7mOxMCw0l0GPWVk2oV4kTZBhobkM9ZgVSac/w0Jz8TErkgDDQnPzMSuSAMNCc6iqg8Dhx6w8CWz1MStaKJJ8FngM+MkkM0lunHRPpzMf9yFJavLIQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFdAKSvDbJ/Un+NcnuJA8l2Zjkrybdm3QyGRbSPCUJ8ADw91X1hqpaDbwfuPAE93tK/LljLS7+Ukrz9ybg/6rqzw4XqmpHkh8HrknyOeBSYDvwzqqqJE8DU1X1fJIp4M6qujrJHcBFwErg+ST/ArwOeH33/idVddf4fjTph3lkIc3f4SAY5HLgFnp/B+T1wFVD7O8KYF1V/Wb3+aeAa+k9Kv72JK88oW6lE2BYSKPxeFXNVNUPgB30jhhatlXVf/d9/uuqeqmqngcOcIKnt6QTYVhI87eL3tHAIC/1LR/i5VO+B3n5392rjhjzX0PuQxo7w0Kav78DzkryO4cLSX4e+KU5xjzNywHzttG1Jp1choU0T9V7CudbgF/pps7uAu5g7r/58QfAR5L8I72jBemU4FNnJUlNHllIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqSm/wf03QODNr6OSgAAAABJRU5ErkJggg==\n", + "text/plain": [ 
+ "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "# Our label Distribution (countplot)\n", + "sns.countplot(data['Churn'])" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAu60lEQVR4nO3deZhcZZX48e+p6uo1ve97d/Z0FrInhC2sBhCCosOiAuqAUXBGZ3Rkfj6PPx1n5ue4zYgimyLEEREBNQgSIpCwJIEsZOsknXR3Ot2d9Jbe963e3x91g03TS/V6azmfJ/101b33rTrvU+k697733vOKMQallFLBx2F3AEoppeyhCUAppYKUJgCllApSmgCUUipIaQJQSqkgFWJ3AGORlJRk8vLy7A5DKaX8yr59+84ZY5IHL/erBJCXl8fevXvtDkMppfyKiJwearkOASmlVJDSBKCUUkFKE4BSSgUpTQBKKRWkNAEopVSQ0gSglFJBShOAUkoFKU0ASikVpDQBKKVUkPKrO4FV8HrqnfIR19++JmeaIlEqcHh1BCAiG0SkSESKReT+IdaLiDxgrT8kIssHrHtcRGpF5MigNgkisk1ETlq/4yfeHaWUUt4aNQGIiBN4ELgWKABuE5GCQZtdC8yxfu4BHhqw7glgwxAvfT/wqjFmDvCq9VwppdQ08eYIYDVQbIwpNcb0AE8DGwdtsxHYbDx2A3Eikg5gjHkDaBjidTcCT1qPnwRuGkf8SimlxsmbBJAJVAx4XmktG+s2g6UaY6oArN8pXsSilFJqkniTAGSIZWYc24yLiNwjIntFZG9dXd1kvKRSSim8uwqoEsge8DwLODuObQarEZF0Y0yVNVxUO9RGxphHgUcBVq5cOSlJRQWXka4g0quHVDDzJgHsAeaISD5wBrgVuH3QNluA+0TkaWAN0Hx+eGcEW4A7ge9Zv/80lsCVGs65tm62F9Xx+vFaKho7AEiODiMzNoK5adE4ZKgDVqWCz6gJwBjTJyL3AVsBJ/C4MaZQRDZZ6x8GXgKuA4qBDuCz59uLyG+B9UCSiFQC/9cY80s8X/zPiMjngXLgk5PZMRVcnnqnnN5+N1sOnGV/eSMGiA4PISU6jIb2Hg5XNmOApBlhXD4vmSVZcTgdmghUcBNj/GdUZeXKlUanhAxOo90I1tLVy292n6aisZOLZyexNDuO9NhwxNrb7+13c7y6le1FtVQ1d5EYFcrHl2eRnxQ17Gvq8JAKFCKyzxizcvByvRNY+b2zTZ1s3lVGZ28/n1qTw8KM2A9t43I6WJwZy6KMGI5Xt/Li4Sp+8WYpl8xJ5qqCFEIcWhVFBR9NAMqvtXf3sXlXGSLCFy6dRUZcxIjbiwgL0mOYmRzFi4eqeONkHSdrW7l9dQ6JM8LG9N56cln5O93tUX7LGMNz+ytp7+nnM2tzR/3yHygsxMnHl2fx6TW5NHX08uD2Yk7UtE5htEr5Hk0Aym/tKq3neHUrGxamjenLf6CCjBjuvXw2cRGhPLmzjO1FtfjTeTGlJkITgPJLVc2d/OVINfNSo1
k3K3FCr5UQFcqmy2axOCuWV47W8PSeCnr63JMUqVK+S88BKL/jNoZn91USGerk5hVZ71/pMxGhIQ5uWZlNRmwEWwuraWjvYcOiNNJiwychYqV8kx4BKL9zoKKJquYurluczoywyduHEREunZvMp9fmUtfWzY0/e4vDlc2T9vpK+RpNAMqv9Pa72Xa0hsy4CBZnfvhyz8mwID2GTZfNwuV0cMuju9heNGSVEqX8ng4BKb+yq6Se5s5ePrEia0pLOqTFhPOZC3N5cmcZn3tiDx9blsWKXJ2zSAUWPQJQfqOjp4/tJ2qZlxrNrOQZU/5+MeEu7r5kJjOTZvDc/kp2lZyb8vdUajppAlB+Y3tRHd29bj6yKG3a3jPc5eSOdbkUpMfwwqEqDlQ0Tdt7KzXVNAEov9Da1cvu0nqW5cSRFjO9V+aEOBzcsiqb/KQont1XQVG13jCmAoMmAOUXdpbU0+82rJ9rz8RxLqeDz6zNJS02nKfePU1FQ4ctcSg1mTQBKJ/X3OnZ+1+UGUtS9Njq9UymcJeTu9blMyMshN/t1ZvFlP/TBKB83q93ldHd52b9vGS7Q2FGWAg3r8iiob2HlwtHm/NIKd+mCUD5tI6ePh5/u4x5qdGkx46v3s9km5k0g4tmJbK7tIHi2ja7w1Fq3DQBKJ/29LsVNLT3+MTe/0DXLEwjaUYYz++vpKu33+5wlBoXTQDKZ/X0uXnszVLW5CeQmzj8zF12cDkdfHJFFs2dvWw7WmN3OEqNiyYA5bO2HDxLVXMXX7p8tt2hDCk7IZIVufG8W9ZAc2ev3eEoNWaaAJRPcrsNj+woYX5aNJfOSbI7nGGtn5eCMYYdJ+rsDkWpMdMEoHzS60W1nKxtY9Nlsyal3PNUSYgKZXlOPHv0KED5IU0Ayic9sqOUzLgIrl+Sbncoo9KjAOWvNAEon7PvdCPvljXw95fk43L6/n9RPQpQ/sr3/7pU0HlkRwlxkS5uWZVtdyheO38U8OZJPQpQ/kMTgPIpxbWtvHK0hjvW5hIZ6j/TVSREhbIoM5b95Y309muJCOUfNAEon/LwjlLCXQ7uuijf7lDGbFVeAl29bo6c0WkklX/QBKB8xpmmTv743hluXZVDQlSo3eGM2cykKBKjQtlT1mh3KEp5RROA8hmPvVEKwN2XzrQ5kvEREVbmxlNW305JndYIUr5PE4DyCQ3tPTy9p5yblmWSGecbRd/GY3luPA6BZ/ZU2B2KUqPSBKB8whNvn6K7z82my/xz7/+86HAX89NieHZfpc4XoHyeJgBlu7buPp7cdZprClKZnRJtdzgTtiovnvr2Hv56TIvEKd+mCUDZ7ql3TtPc2csX1/tm0bexmpMaTXpsOM/s1WEg5du8SgAiskFEikSkWETuH2K9iMgD1vpDIrJ8tLYislREdovIARHZKyKrJ6dLyp909fbz2JunuHh2Ekuz4+wOZ1I4RLjxggzeOnmOpo4eu8NRalijJgARcQIPAtcCBcBtIlIwaLNrgTnWzz3AQ160/T7wHWPMUuBb1nMVZJ7dV0ldazdfunyW3aFMquuXpNPnNrxSqMNAynd5c6vlaqDYGFMKICJPAxuBowO22QhsNsYYYLeIxIlIOpA3QlsDxFjtY4GzE++O8nVPvVP+/uN+t+HH24rIjo/gVF0762b5btnnsTpc2Ux8pIvH3iylz20+sO72NTk2RaXUB3kzBJQJDBzMrLSWebPNSG2/AvxARCqAHwL/OtSbi8g91hDR3ro6rbMSSA5VNtHY0cv6eSk+XfJ5PESExZlxlNS10dHdZ3c4Sg3JmwQw1F+m8XKbkdp+EfiqMSYb+Crwy6He3BjzqDFmpTFmZXKyb80Lq8bPbZVPTosJZ16a/1/5M5TFWbG4DRRWtdgdilJD8iYBVAIDyzJm8eHhmuG2GantncDz1uPf4xlqUkGiqLqV2tZuLp2bhCPA9v7Py4gNJyEqVGsDKZ/lTQLYA8wRkXwRCQVuBbYM2mYLcId1Nd
BaoNkYUzVK27PAZdbjK4CTE+yL8iNvnqwjLsLF4sw4u0OZMp5hoFhK6tpo12Eg5YNGPQlsjOkTkfuArYATeNwYUygim6z1DwMvAdcBxUAH8NmR2lovfTfwExEJAbrwXD2kgkBFQwdl9R1cvzgdpyMw9/7PW5wZy44TdRw928Kq/AS7w1HqA7wquG6MeQnPl/zAZQ8PeGyAe71tay1/C1gxlmBVYHjzZB3hLgcrc+PtDmXKpceGkxgVyuEzzZoAlM/RO4HVtKpv66bwbAtr8hMJczntDmfKiQgLM2IpPddGZ0+/3eEo9QGaANS0ervkHA4RLpyZaHco06YgIwa3gaIavRpI+RZNAGraNLb3sO90I0uz44iJcNkdzrTJio8gOiyEo1Wtdoei1Af4z6Sryi8MvNN3sDdO1NHbb7hoduDc8esNhwjz06M5WNlMn84XrHyIHgGoaeE2hnfLGshLjCQtNtzucKZdQXoMPX1uSura7Q5FqfdpAlDTori2jYb2HtYE0dj/QDOTZxDqdHBM7wpWPkQTgJoWu0vriQoLYWFGzOgbByCX08Hc1Bkcq27B7R5cSUUpe2gCUFOusaOHoupWVuXGE+II3v9yC9JjaO3q42Blk92hKAVoAlDTYM+pBoCgvxFqfloMDoFtR3WOAOUbNAGoKdXndrPndCPz06KJjwy1OxxbRYQ6yUuK4hVNAMpHaAJQU+pYVSvt3X1Be/J3sIL0GIpr2yg7p1cDKftpAlBT6r3yRmLCQ5idMsPuUHzC/DTPSfC/HtOjAGU/TQBqyrR193GippWl2fEBW/N/rBKiQpmfFq0JQPkETQBqyhysaMJtYFlOnN2h+JSrFqSyp6yRpo4eu0NRQU4TgJoy71U0khkXQWpM8N35O5KrClLpdxu2F+kc18peWgtITYnqli7ONnXx0SXpXrcZqY5QIFmSGUtydBjbjtZw07JMu8NRQUyPANSUOFDeiENgSVac3aH4HIdDuGpBCjtO1NHdp3MEKPtoAlCTzm0MByqamJsazYwwPcgcylULUmnr7uOd0ga7Q1FBTBOAmnQldW20dPWxLCfwp3wcr4tmJxHucujVQMpWmgDUpDtc2UxYiIP5adF2h+Kzwl1OLpmTzF+P1uCZUlup6acJQE2qfreh8GwLC9JjcDn1v9dIri5I5WxzF4VntUS0sof+hapJVVLXRmdvP4syYu0OxeddOT8Fh8ArhdV2h6KClCYANamOnPEM/8xJ1dIPo0mcEcaqvAQtDqdsowlATZrefrcO/4zRNQvTOF7dyul6LQ6npp9eo6cmzc6Sejp7+1mcqcM/Ixl4w1tnj+c+gP/6y3EunpMMwO1rcmyJSwUf3U1Tk+bFQ2cJC3Fo5c8xSIgKJT02nKM6V7CygSYANSl6+928crRGh3/GYUF6DKfrO2jr7rM7FBVk9C9VTYqdJfU0dfTq8M84FKTHYIDjehSgppkmADUpXj5SRVSoU4d/xiE9Npz4SJcOA6lppwlATVi/27DtaA3r56fo8M84iMj7U0V292pxODV99K9VTdj+8kbOtfXwkYVpdofitxZmxNLnNhyvabU7FBVEvEoAIrJBRIpEpFhE7h9ivYjIA9b6QyKy3Ju2IvJla12hiHx/4t1RdnilsJpQp4PL5yXbHYrfykmMJDo8hCNnmu0ORQWRUe8DEBEn8CBwNVAJ7BGRLcaYowM2uxaYY/2sAR4C1ozUVkQuBzYCS4wx3SKSMpkdU9PDGMPWwhrWzU4kOtxldzh+yyHCwowY9p1upKOnj8hQvUVHTT1vjgBWA8XGmFJjTA/wNJ4v7oE2ApuNx24gTkTSR2n7ReB7xphuAGNM7ST0R02z49WtlDd06PDPJFiUGUtvv+H14zpVpJoe3iSATKBiwPNKa5k324zUdi5wiYi8IyI7RGTVUG8uIveIyF4R2VtXp38YvmZrYTUinglO1MTkJUYxIyyEl45U2R2KChLeJAAZYtngAubDbTNS2xAgHlgLfB
14RkQ+tL0x5lFjzEpjzMrkZB1j9jVbC2tYkRNPcnSY3aH4vfPDQK8dq32/RIRSU8mbBFAJZA94ngWc9XKbkdpWAs9bw0bvAm4gyfvQld0qGjo4VtWiwz+TaFFmLJ29/ew4oSOiaup5kwD2AHNEJF9EQoFbgS2DttkC3GFdDbQWaDbGVI3S9o/AFQAiMhcIBc5NtENq+my16thrApg8eYlRJEaF8tJhnSNATb1RLzUwxvSJyH3AVsAJPG6MKRSRTdb6h4GXgOuAYqAD+OxIba2Xfhx4XESOAD3AnUbnxvML56tZ/u/uctJiwnmr+Jznk1cT5nQI1yxMY8uBM3T19hPuctodkgpgXl1rZox5Cc+X/MBlDw94bIB7vW1rLe8BPj2WYJXvaOvu43R9O5fP16t3J9tHl6Tz23fLee14LdctTrc7HBXA9E5gNS5F1S0YPIXM1ORaOzOR5OgwthwYfKpNqcmlCUCNy9GzLcRFuEiPDbc7lIDjdAg3LMngtaJamjt77Q5HBTBNAGrMevrcnKxtY0FGDENcuasmwY1LM+jpc79/ol2pqaAJQI3ZiZpW+txGh3+m0AVZseQmRuowkJpSmgDUmB2raiHC5SQvMcruUAKWiLDxggx2lpyjtrXL7nBUgNIEoMakt9/N8epWFqRH43To8M9UunFpBm4DLx7S0hBqamgCUGOy51QDnb39OvwzDWanRFOQHsOfdBhITRFNAGpMthZWE+IQZqdE2x1KUNi4NIMDFU2UnWu3OxQVgDQBKK+53YaXC6uZmxpNaIj+15kONy7NQASe319pdygqAOlfsfLaexVN1LR0szBDh3+mS3psBBfPTuK5/Wdwu7VSippcmgCU114+UoXLKcxP0wQwnT6xIoszTZ3sPlVvdygqwGgCUF4xxvCXI9VcPDuJiFAtUDadrilIY0ZYCM/tO2N3KCrAaAJQXik820JlYyfXLtLiZNMtItTJR5ek85cjVbR399kdjgogmgCUV/5ypAqnQ7i6QKd+tMPNK7Lo6OnnL0e0NISaPJoA1KjOD/+snZlAfFSo3eEEpZW58eQmRvLcPr0aSE0er+YDUMHn/KQvADUtXZTWtbMoI/YDy9X0ERFuXp7Fj7edoKKhg+yESLtDUgFAE4Aa1ZGzzQjo5Z/TZLgkG+IQROD3eyv4p2vmTXNUKhDpEJAakTGGQ5XN5CZGEh3usjucoBYXGcqlc5J5Zm8lff1uu8NRAUCPANSIqlu6qGvtZt3SDLtDUUBmXAQ7TtTxby8cZf6geky3r8mxKSrlr/QIQI3oYEUzDoFFGbF2h6KABekxzAgLYU9Zg92hqACgCUANy20MhyqbmJMSTVSYHiz6AqdDWJEbT1FNq04XqSZME4AaVkVDB02dvSzJ0r1/X7IyNx63gX2nG+0ORfk5TQBqWAcqmghxiNb+9zGJM8KYmRzFvtMNuI0WiFPjpwlADanfbThyppkF6TGEubT2j69ZnZdAY0cvxbVtdoei/JgmADWkkro22nv6uUCHf3xSQUYMUWEh7C7VCqFq/DQBqCEdrGgi3OVgbqrO/OWLQhwOVufFU1TdSkN7j93hKD+lCUB9SGtXL0fONrM4M5YQp/4X8VWr8xMRgXd0ngA1TvrXrT7khYNV9PYbVuYm2B2KGkFshIuC9Bj2ljXS06d3Bqux0wSgPuR3eytIiQ4jKz7C7lDUKNbOTKSzt59DlU12h6L8kCYA9QFF1a0crGhiZV4CImJ3OGoU+UlRpESHsau0HqOXhKox0gSgPuCZvRW4nMKy7Di7Q1FeEBEunJVIVXOX3himxkwTgHpfT5+bP7x3hqsLUrX0gx9Zmh1HuMvB42+fsjsU5We8SgAiskFEikSkWETuH2K9iMgD1vpDIrJ8DG2/JiJGRJIm1hU1UX89VkNDew+fXJltdyhqDMJCnKzOS+TlI9VUNHTYHY7yI6MmABFxAg8C1wIFwG0iUjBos2uBOdbPPcBD3rQVkWzgakCnmfIBT+
+pID02nEvnJNsdihqjC2cl4hDhl2/pUYDynjdHAKuBYmNMqTGmB3ga2Dhom43AZuOxG4gTkXQv2v438C+Anr2yWUldG2+cqOOWVdk4HXry19/ERri44YIMntlbQXOHVglV3vEmAWQCFQOeV1rLvNlm2LYiciNwxhhzcKQ3F5F7RGSviOytq6vzIlw1Hpt3luFyCp9ak2t3KGqc/v6SfDp6+vntHj2gVt7xJgEMtTs4eI99uG2GXC4ikcA3gW+N9ubGmEeNMSuNMSuTk3VoYiq0dPXy7L5KbliSQXJ0mN3hqHFamBHLulmJPPF2md4YprziTQKoBAaeFcwCznq5zXDLZwH5wEERKbOW7xeRtLEErybH7/dW0t7Tz2cvyrc7FDVBd18yk+qWLv58aPCfqFIf5k0C2APMEZF8EQkFbgW2DNpmC3CHdTXQWqDZGFM1XFtjzGFjTIoxJs8Yk4cnUSw3xlRPVseUd/rdhid3lrEiN57FWvnT7102N5m5qTN4ZEep3himRjVqAjDG9AH3AVuBY8AzxphCEdkkIpuszV4CSoFi4DHgSyO1nfReqHF7/Xgt5Q0dfPaiPLtDUZPA4RA2XTaLoppWXi+qtTsc5eO8utvHGPMSni/5gcseHvDYAPd623aIbfK8iUNNvl/tPEVaTDgfWaijb4Hihgsy+NErJ3hoewlXzE+1Oxzlw/RO4CB2qLKJt4vruXNdHi4t+xwwXE4Hd1+Sz56yRvaUNdgdjvJh+lcfxH7+egkx4SF8em2O3aGoSXbLqhwSokJ5eHuJ3aEoH6YJIEidrGnl5cJq7lqXR3S4y+5w1CSLCHVy17o8Xj1ey/HqFrvDUT5KK34FqYd2lOByCtHhLp56R28cCkR3XJjLwztKeGh7CT+5dZnd4SgfpAkgCFU0dPCnA2dZm5+gVT8DyFCJfHlOPFsOnOWfr55HTmKkDVEpX6ZDQEHokTdKcAhcrEXfAt7Fs5NwOISH39BzAerDNAEEmarmTp7ZW8knVmQRG6Fj/4EuJsLFipx4nt1bSU1Ll93hKB+jCSDI/Oy1Yowx3Hv5bLtDUdPk0rnJ9Lnd/OLNUrtDUT5GB4AD2OAx4cb2Hp5+t4KVefG8ceKcTVGp6ZYQFcqNF2Twm3fKuffy2cRFhtodkvIRegQQRF4vqkUE1s9LsTsUNc2+uH42HT39/OrtMrtDUT5EE0CQqG/rZn95I6vzE3TsPwjNS4vm6oJUnthZRlt3n93hKB+hCSBIvHa8FqdDuGyuXvkTrL60fhbNnb38Vu/7UBZNAEGgpqWLAxVNrJ2ZqHf9BrFlOfFcNDuRx94spau33+5wlA/Qk8BBYNvRGkJDHFym1/0HrfMXBMxPi+Ht4nq+8dwh1uQnvr/+9jVaDyoY6RFAgKto6OBoVQuXzEkmUu/6DXozk6LIjo/gjRN19Lt1wphgpwkggBljeLmwmqiwEC6anTh6AxXwRIT181Jo7Ojl8Jkmu8NRNtMEEMCKa9s4da6dK+YlExbitDsc5SPmpUWTGhPG9qI63DptZFDTBBCg3G7D1qPVxEe6WJWfYHc4yoc4RFg/N4Xa1m6OntVS0cFME0CA+vPhKs42dXHVglRCHPoxqw9anBVLYlQo24tqdfL4IKbfDAGop8/ND7cWkRYTzgXZcXaHo3yQQ4T185I529zFiZpWu8NRNtEEEIB+885pyhs62LAoDYeI3eEoH7U0O564CBevHdejgGClCSDAtHb18tPXilk3K5E5KTPsDkf5MKdDuHRuMhWNnewqqbc7HGUDTQAB5pEdpTS09/Cv1y5AdO9fjWJFbjzR4SH85NWTdoeibKAJIIDUtHTxi7dKueGCDBZnxdodjvIDLqeDy+Ym886pBnYWa4nwYKMJIIB8/+Ui3G74+jXz7A5F+ZFVeQmkx4bzw1eK9FxAkNEEECAOVzbz3P5KPntRnk7+rcbE5XRw3xWz2V/exPaiOrvDUdNIE0AAMMbw3RePkhgVyr1X6FSPau
\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "# Example EDA\n", + "sns.distplot(data.AccountWeeks)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Let us perform some analysis with the data's features**\n", + "* Group the data by whether the customer will churn and analyse the features of each group to learn more about how the data behaves." + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "[base64-encoded PNG omitted: box plot of DayMins by Churn]\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "data_churn = data.groupby('Churn').get_group(1)\n", + "data_no_churn = data.groupby('Churn').get_group(0)\n", + "\n", + "# Compare DayMins for customers who churned vs. those who did not, using a box plot\n", + "#sns.boxplot('DayCalls',data = data_churn)\n", + "sns.boxplot(x = 'Churn', y = 'DayMins', data = data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "From the boxplot above, it seems that customers who churn tend to have lower **DayMins** than those who do not churn. However, the minimum **DayMins** value is very low, which we might not expect if our assumption that customers with lower **DayMins** tend to churn holds. A detailed explanation of what **DayMins** means was also not provided. Let us continue the comparison and look at customer behaviour with regard to **DataUsage**." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"[base64-encoded PNG omitted: box plot of DataUsage by Churn]\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "sns.boxplot(x = 'Churn', y = 'DataUsage', data = data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The boxplot above indicates that people with lower **DataUsage** tend not to churn, and there appear to be several outliers among people who do not churn and have high data usage. A detailed explanation of what **DataUsage** means was not given, so no firm conclusion can be drawn." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Pre-process Data (Decision Tree)\n", + "* Check for duplicate rows and remove them\n", + "* Split the data into training and test sets" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " Churn AccountWeeks ContractRenewal DataPlan DataUsage \\\n", + "0 0 128 1 1 2.70 \n", + "1 0 107 1 1 3.70 \n", + "2 0 137 1 0 0.00 \n", + "3 0 84 0 0 0.00 \n", + "4 0 75 0 0 0.00 \n", + "... ... ... ... ... ... \n", + "3328 0 192 1 1 2.67 \n", + "3329 0 68 1 0 0.34 \n", + "3330 0 28 1 0 0.00 \n", + "3331 0 184 0 0 0.00 \n", + "3332 0 74 1 1 3.70 \n", + "\n", + " CustServCalls DayMins DayCalls MonthlyCharge OverageFee RoamMins \n", + "0 1 265.1 110 89.0 9.87 10.0 \n", + "1 1 161.6 123 82.0 9.78 13.7 \n", + "2 0 243.4 114 52.0 6.06 12.2 \n", + "3 2 299.4 71 57.0 3.10 6.6 \n", + "4 3 166.7 113 41.0 7.42 10.1 \n", + "... ... ... ... ... ... ... \n", + "3328 2 156.2 77 71.7 10.78 9.9 \n", + "3329 3 231.1 57 56.4 7.67 9.6 \n", + "3330 2 180.8 109 56.0 14.44 14.1 \n", + "3331 2 213.8 105 50.0 7.98 5.0 \n", + "3332 0 234.4 113 100.0 13.30 13.7 \n", + "\n", + "[3333 rows x 11 columns]" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from sklearn.model_selection import train_test_split\n", + "# Drop duplicated rows, keeping the first copy of each (all 3333 rows remain, so there are none)\n", + "data_cl = data.drop_duplicates()\n", + "data_cl" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": {}, + "outputs": [], + "source": [ + "# Features: every column after AccountWeeks (chosen not to use AccountWeeks); label: Churn\n", + "X, y = data_cl.iloc[:, 2:], data_cl.iloc[:, 0]\n", + "# Stratify on y so the train and test sets keep the same churn ratio\n", + "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 19, stratify = y)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Decision Tree Training and Evaluation" + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Training accuracy: 0.9502786112301758\n", + "Test Accuracy: 0.922\n" + ] + } + ], + "source": [ + "from sklearn.tree import DecisionTreeClassifier\n", + "clf = DecisionTreeClassifier(max_depth = 6, random_state= 9)\n", + "clf.fit(X_train, 
y_train)\n", + "\n", + "print(\"Training accuracy: \", clf.score(X_train, y_train))\n", + "print(\"Test Accuracy: \", clf.score(X_test, y_test))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The gap between training accuracy (0.950) and test accuracy (0.922) is about 0.03, which indicates that the model is doing well at avoiding overfitting. However, the model is probably biased toward the no-churn label, since there are significantly more samples with label 0 than with label 1. Accuracy alone is therefore misleading here. Better metrics for this case are recall or the macro-averaged F1 score, which show how the model is doing on the imbalanced dataset. Let us plot the confusion matrix to verify." + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Accuracy score: 0.922\n", + "Precision score: 0.8819149990638664\n", + "Recall score: 0.7797136519459569\n", + "F1 score: 0.8192285229579775\n", + " precision recall f1-score support\n", + "\n", + " 0 0.93 0.98 0.96 855\n", + " 1 0.83 0.58 0.68 145\n", + "\n", + " accuracy 0.92 1000\n", + " macro avg 0.88 0.78 0.82 1000\n", + "weighted avg 0.92 0.92 0.92 1000\n", + "\n" + ] + } + ], + "source": [ + "from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report,confusion_matrix\n", + "# Note: prob holds the predicted class labels (0 or 1), not probabilities\n", + "prob = clf.predict(X_test)\n", + "print(\"Accuracy score: \", accuracy_score(y_test, prob))\n", + "# Macro averaging weights both classes equally, so the minority churn class is not drowned out\n", + "print(\"Precision score: \", precision_score(y_test, prob, average = 'macro'))\n", + "print(\"Recall score: \", recall_score(y_test, prob, average = 'macro'))\n", + "print(\"F1 score: \", f1_score(y_test, prob, average = 'macro'))\n", + "print(classification_report(y_test, prob))" + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Text(91.68, 0.5, 'Actual label')" + ] + }, + "execution_count": 85, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAQsAAAELCAYAAADOVaNSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXDElEQVR4nO3dd5QV5f3H8feXXaVIExAELCiKBY4tgAoqKoiCKIgGS/yhRiWxxxIjsQSxRSPmxBYliaLGhtgV0QgaQcGAkahIEQsqvUkvu/D9/TEDLJctzy63zO5+Xufcs3fKnfu9F/azzzwz84y5OyIiZamR6wJEpHJQWIhIEIWFiARRWIhIEIWFiATJz3UB5VGw6BsduqlEarc4OtclSAUUrp9txc1Xy0JEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCSIwkJEgigsRCRIfq4LqKyefO5lXnx9FGbGvq1bcfvvr6FmzR23We/zqdP5xYBruHfwDXQ/7ujtes/169cz8LYhfDn9Kxo2qM+9gwfSsnkzps34mtvufZCVq1ZTI68GA/qfRY9uXbbrvWSLvw0dwsk9u7Fg4SIOObQrAM88/VfatGkNQMMG9flp2XLad+ieyzIzTi2LCpi/cBFPj3iV5x+7n1f++QgbN27krXf/vc16GzZs4M8PP07njoeVa/uz587n/Muv32b+S2+8Q/16dXlr+GP835l9uO/hxwCoVasmd958Ha8+/SiPDrmdu+9/lOUrVlbsw8k2nnxyOCf3+sVW8875xSW079Cd9h268/LLI3nllZE5qi57stqyMLP9gd5AS8CBOcBr7j41m3WkQ+GGDaxbt578vHzWrF3HLk0abbPOMyNe44RjO/PF1BlbzX/97TE8/cKrFBQUclDb/bjp2svIy8sr8z3HjB3PpReeC0D3Y4/mzvv+irvTao/dNq/TdJfGNNq5IUt/Wkb9enW381MKwNhxH7PnnruVuPyMM07hhBP7ZbGi3Mhay8LMfgc8BxjwH2Bi/PxZM7shW3WkQ7NdmnD+2afTrW9/jut9DvV2qkPnw3+21TrzFy5i9Acf0a9Pz63mf/3d94wa/W+eemQILz7xEDVq1OCNd94Let8FCxeza9MmAOTn51F3pzr8tGz5Vut8/uV0CgoK2b1l8+34hBLq6KMOZ/6Chcyc+W2uS8m4bLYsLgTauntB0Zlmdh8wBfhjcS8yswHAAICHh9zORf3PznSdZVq2fAXvjZ3A2y88Tr16dbn2pjt5/e0xnHLi8ZvXufsvj3L1Jb/cpsXw8aTJfDltJmddeBUA69ato9HODQG4cuBgZs+ZT0FhAXPnL+T08y4D4Nx+vTnt5O64+za1mNnm5wsXLWHg4D9xx03XUqOG9jCz4cwz+/D886/muoysyGZYbARaALNS5jePlxXL3YcCQwEKFn2z7W9LDkyYNJmWLZpt/iXv2qUTkz//cquwmDLtK377hyj/li5bztjxE8nLy8PdObVHN66+5IJttnv/XbcAUZ/FjXcMYdiD92y1vFnTJsxbsIhdm+5CYeEGVq5aTYP69QBYuWoVl/72Fq4YcB4HtzsgEx9bUuTl5XFanx50PKJHrkvJimyGxW+A0Wb2FfBDPG8PYB/g8izWsd2aN9uFz76Yxpq1a6lVsyYfT5pM2/333Wqdt0cM2/z8xtuH0KVzR7oe04mvv53FFTcMpv9Zp9F454YsW76CVatX02LXZmW+73FHHcGrI9/lkHYH8M77Yzn
MTjKz6WY208xuyHU9Ujoze8zMFpjZF7muJdsUFjlkZnnAQ0AP4EDgbDM7MLdVSRmGASfluohcUFjkVkdgprt/4+7rgeeA3jmuSUrh7h8AS3JdRy4oLHKrJfBDkekf43kiiaOwyC0rZp6OZUsiKSxy60dg9yLTuwFzclSLSKkUFrk1EdjXzPYysx2Bs4DXclyTSLEUFjnk7oXA5cDbwFRguLtPyW1VUhozexYYD+xnZj+a2YW5rilbdLq3iARRy0JEgigsRCSIwkJEgigsRCSIwkJEgigsEsDMBpmZF3nMMbMXzax1Bt+zV/xereLpVvF0r3Jso5+ZnZ/GmurGNZS4zYrUGb9umJlN2u4io229b2Yj0rGtyiQ/1wXIZsvYcjXj3sBtwGgza+vuq7Lw/nOBI4Fp5XhNP6AJ0ZWYUsUpLJKj0N0nxM8nmNn3wFigJ/BC6spmVtvd16Trzd19HTChzBWl2tJuSHJ9Ev9sBWBm35nZEDO72cx+BJbH82uY2Q3x4DnrzGyGmZ1XdEMWGRQP2rLCzJ4E6qesU2zz3swuNrPPzWytmc03sxFm1sDMhgGnA12K7D4NKvK63mY2KX7dPDO7x8x2SNn26XG9a8zsA2D/inxRZtbfzMaZ2RIzW2pm75lZ+xLW7WNm0+K6xqWOHxLyfVZXalkkV6v457wi884BpgCXsuXf7gHgPGAw8F/gBOAxM1vs7m/E61wJ3ALcSdRa6QvcU1YBZnZTvN2Hgd8CdYCTgbpEu0l7AA3jeiC6MA4z6wc8CzwK/B5oDdxF9Mfpunidw4DngZeBq4C2wPCyaipBK+BJ4GtgR6Lv6QMza+fu3xRZb0/gPuBmYA1wK/C2me3r7mvjdUK+z+rJ3fXI8QMYBCwiCoB8oA3wHlHroXm8zndE/Qq1irxuH2AjcF7K9p4EJsbP84iuZP1ryjr/IrocvlU83Sqe7hVPNwRWA/eVUvcI4P2UeQbMAh5Pmf9Lol/QxvH0cOBL4ksO4nk3xjWcX8p7blVnMctrxN/hNOCWIvOHxa/rVGTenkAh8OvQ7zOefh8Ykev/N9l+aDckORoDBfFjOlEn55nuPrfIOqN9y19AgK5E/7lfNrP8TQ9gNHBIPGzf7kBz4NWU93upjHqOBGoDj5fzc7QhanEMT6lpDFALaBev1xF4zePfvsCaimVmB5jZy2Y2H9hA9B3uF9dS1AJ3/2jThLvPItrd6xjPCvk+qy3thiTHMqAb0V+/ecCclF8kgPkp002IWg7LSthmc2DX+PmClGWp06kaxz/nlrrWtprEP0eWsHzT+B27VqCmbZhZPeAdou/mGqJWzVrg70ThVNb2FxB9TxD2ff5Y3hqrCoVFchS6e1nnAaSGxxKiZnRnor+IqRaw5d+4acqy1OlUi+OfzYl2kUJtGp9yAPBpMcu/jX/Oq0BNxTmSaNCgE9x982FfM2tQzLrFbb8pUT8QhH2f1ZbConIbQ/SXsIG7/6u4FczsB6JfzN7AqCKL+pax7fFEfQznEXdKFmM92/71ng7MJuoL+Vsp258InGpmA4u0oMqqqTi145/rNs0ws05EfRufpKzb1Mw6bdoVMbM9gMPYsqtV5vdZnSksKjF3n25mjwDPmdk9wCSiX962QBt3v8jdN8TL7jWzRURHQ04HDihj2z+Z2W3AHfEoXiOBmkRHQ25199lEnYi9zawPUfN8jrvPMbNrgafMrD7wFlGo7A30Ac5w99XA3cDHRH0b/yDqy6jIQDITgJXA3+LPuRtRh/HsYtZdFNe16WjIYKLWwrD4M5f5fVagvqoj1z2semw5GlLGOt8B9xYz34DfEDWl1wELgX8D/VPWuS1etgJ4mujwYolHQ4q89ldERy3WEbVQhgP142VNiA59LolfO6jI63oQBdMqoqM6k4Hbgfwi6/wcmEnUxzAO6EAFjoYQnfn6BVEAfEZ0Itv7FDliQRQIk4haLzPiz/Mh0K4C3+dW264uD42UJSJBdOhURIIoLEQkiMJ
CRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkyP8DrJqzin4hc6MAAAAASUVORK5CYII=\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "cm = confusion_matrix(y_test, prob)\n", + "ax = sns.heatmap(cm, square=True, annot= True, cbar = False)\n", + "ax.set_xlabel('Predicted label', fontsize = 15)\n", + "ax.set_ylabel ('Actual label', fontsize = 15)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since we don't have enough data for churn label, the model mispredict 61 of churn as not churn. Let try XGBOOST to select the best parameters to use" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### XGBOOST" + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": {}, + "outputs": [], + "source": [ + "import xgboost as xgb" + ] + }, + { + "cell_type": "code", + "execution_count": 99, + "metadata": {}, + "outputs": [], + "source": [ + "dmatrix_train = xgb.DMatrix(data=X_train, label=y_train)\n", + "dmatrix_test = xgb.DMatrix(data=X_test, label=y_test)\n", + "\n", + "param = {'max_depth':6, \n", + " 'eta':0.3, \n", + " 'objective':'multi:softprob', \n", + " 'num_class':2}\n", + "\n", + "num_round = 6\n", + "model = xgb.train(param, dmatrix_train, num_round)\n", + "\n", + "preds = model.predict(dmatrix_test)\n", + "\n", + "best_preds = np.asarray([np.argmax(line) for line in preds])" + ] + }, + { + "cell_type": "code", + "execution_count": 103, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Precision: 0.8803212544949875\n", + "Recall: 0.8216172615446662\n", + "Accuracy: 0.93\n", + " precision recall f1-score support\n", + "\n", + " 0 0.95 0.97 0.96 855\n", + " 1 0.82 0.67 0.73 145\n", + "\n", + " accuracy 0.93 1000\n", + " macro avg 0.88 0.82 0.85 1000\n", + "weighted avg 0.93 0.93 0.93 1000\n", + "\n" + ] + } + ], + "source": [ + "# metrics\n", + "print(\"Precision: \", (precision_score(y_test, best_preds, average='macro')))\n", + "print(\"Recall: 
\",(recall_score(y_test, best_preds, average='macro')))\n", + "print(\"Accuracy: \", (accuracy_score(y_test, best_preds)))\n", + "print(classification_report(y_test, best_preds))" + ] + }, + { + "cell_type": "code", + "execution_count": 104, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAQsAAAELCAYAAADOVaNSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXwUlEQVR4nO3dd5gV5dnH8e+9uxEQFlCQTlBRbImaookVUQQRDYKKmlhQkFwWbFhABGmiYnmjIcYumteGUSx5FQuIYheViBJQUDSUpSlIW9hd7vePGdblsOVZOW2X3+e6zrVnys7c5yznxzPPzHnG3B0RkarkZLoAEakZFBYiEkRhISJBFBYiEkRhISJB8jJdQHUULf9Kp25qkHqtjsh0CfITFG9caOXNV8tCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkSF6mC6ipHnliIk+/MAkzY8/2uzL62iuoU2eH0uVTpr3LX+97hBzLITc3l0GX9ufXB/xim/a5ceNGBo+6jVlzvqRxo4bcOnIwrVs2Z/YX8xh16zjWrF1HTm4O/c8+nW6dO27rS5RYmzatGP/gHTRvsQubNm3i/vsf5a/jHuDmG6+j+wnHsnHjRr766hv69ruCVat+yHS5KWPunukaghUt/yoril2ybDlnX3Alzz16D3Xr1GHg0DEc8fuDOKn7saXrrFu3nnr16mJmzJn7NVcOHcMLj98XtP2Fi5cw5IbbGD9u7Bbzn3jmX8yZ+zXXXz2AF1+byuQ33uW2UYOZ/+0CzIx2bVuzdNkKevcdwPOP3kvD/AZJfd3VVa/VERndf7K0aNGMli2a8cmMz2jQoD4fvD+Jk085jzatWzLl9bcpKSnhxjHXAjD42jEZrnbbFW9caOXNT2vLwsz2BnoArQEHFgHPu/t/0llHMhSXlLBhw0bycvNYX7iBXZruvMXyHXesV/p8fWEh2I/v/wsvT+HRp56jqKiY/ffbi+sGXkRubm6V+5wy7V0u7HsmAF2OOoIxt/8dd2fXn7cpXafZLk3YeafGfL9yVcbDorYoKFhKQcFSANasWcvs2V/SulULXn3tzdJ13nv/Y07u1T1TJaZF2voszOwa4AnAgA+AD+Pnj5vZoHTVkQzNd2lKnzNOpnOvs+nU44/k19+Rw373m63We+2NtznxjPO58MphjLr2cgDmzf+WSZPf4B9338bTD/+NnJwc/vXK60H7XbpsBS2aNQUgLy+XBvV3ZGVCs3fmrDkUFRXTtnXLbXyVUp527dpw4AG/4P0PPtli/rl9TmfSy2F/x5oqnS2LvsB+7l5UdqaZ3Q58DtxU3i+ZWX+gP8Bdt42m39lnpLrOKq36YTWvT3uPl596iPz8Bgy8bgwvvDyFE7sevcV6nTseRueOhzF9xkzG3fcI999xI+9Pn8Gs2XM5ve+lAGzYsIGdd2oMw
CWDR7Jw0RKKiotYvGQZJ59zEQBn9u5Bz+5dKO+Q0cq0WJYt/47BI2/hhusGkpOjvutkq19/RyY8eR9XXHk9q1evKZ0/eNAlFBcX89hjz2SwutRLZ1hsAloB3yTMbxkvK5e73wvcC9nTZ/He9Bm0btW89EN+TMdDmTFz1lZhsdlvD/wl/124mO9XrsLd+UO3zlx+wblbrXfnjcOAivssmjdrSsHS5bRotgvFxSWsWbuORg3zAVizdi0XXjWMAf3P4YBf7JPEVysAeXl5PPXkfTz++ESeffal0vlnnXUq3Y/vzLFde2ewuvRI538/lwGTzewlM7s3fkwCJgOXprGObday+S58+tls1hcW4u68P30Gu7dru8U63y5YVNoSmDVnLkVFxTRu1JDf//ZAXp36Fiu+XwlErZRFBUuC9tvp8N/z3IuvAfDK1Gn87jcHYGYUFRVx6eBR/OG4Y+h6dO3oVMw29917G/+ZPZe/3HFv6byuXY7iqisv5KRefVi/vjCD1aVH2loW7j7JzDoABxN1cBqwAPjQ3UvSVUcy7L/f3hzb6XB6nzuA3Nxc9u7QnlN7dOPJif8HwGk9u/Pq1Ld4/qXJ5OXlUbfODtw6chBmRvvd2jHg/LPpf9kQNvkmfpaXx5ArLqRVi+ZV7rfXCV0ZPOoWuvU+j0YN87llRNTVM2nKND6a8RkrV63m2ThMbhhyBXt3aJ+6N2E7ctihB3HWmafw6cxZTP/wFQCGDr2J/7l9JHXq1GHSS08A8P77H3PRxTWq+61adOpUUqa2nDrd3lR06lS9YCISRGEhIkEUFiISRGEhIkEUFiISRGEhIkEUFiISRGEhIkEUFiISJCgszKyZme1WZtrMrL+Z/cXMTkxdeSKSLUJbFuOBy8tMjwDuAo4DJppZn+SWJSLZJjQsfg1MATCzHOAC4Fp33xu4gegbpSJSi4WGRSNgRfz8N8DOwKPx9BRgjyTXJSJZJjQsFgD7xs+7A7PdfWE83Qio/V/mF9nOhY5n8SAw1sw6E4XF4DLLfg/UuAF3RaR6gsLC3W80s4XAQcAAovDYbGfg/hTUJiJZRIPfSMpo8Juaqdr3DTGzHauzA3dfV92iRKTmqOwwZA3RjYBCVX2XHBGpsSoLi/OoXliISC1WYVi4+/g01iEiWa5atwIws32JLspqCzzo7gVmtgewxN1Xp6JAEckOQWFhZg2ITpeeAhTFvzcJKADGAN8CV6aoRhHJAqFXcN4OHAocA+QT3SBosxeJvlAmIrVY6GFIL+BSd3/dzBLPenwDtEtuWSKSbUJbFvX48YtkifKBGnX7QRGpvtCw+BA4u4JlpwDvJKccEclWoYch1wGvmdlrwFNE118cb2aXE4XFkSmqT0SyRFDLwt3fIurcrAOMI+rgHAHsDnR29w9TVqGIZIXg6yzc/W3gCDOrB+wErNT3QUS2Hz9ldO9Comst1ie5FhHJYsFhYWbHm9k7RGFRABSa2Ttm1j1l1YlI1gi9FcCfgReIvol6KXBq/HMN8Hy8XERqsaDBb8zsG+BFd7+gnGV3A8e7+89TUN8WNPhNzaLBb2qmiga/CT0MaQI8U8Gyp4mG1hORWiw0LF4HOlawrCPwZnLKEZFsVdmwevuWmbwTuN/MmgDPAkuBZkBPoBvQL4U1ikgWqLDPwsw2seVIWWWPYzxx2t1TPqye+ixqFvVZ1EzVHrAX6JSiWkSkBqpsWL030lmIiGS3ag2rB6U3Rq6bOF+XfovUbqEXZZmZXWNmc4ku9V5dzkNEarHQU6eXAIOAB4g6Nm8ARgJfAPOB/qkoTkSyR2hYnA9cD4yNp5919xHAfsBsYM8U1CYiWSQ0LHYDZrh7CdFhSGMAd98E3AWck5LqRCRrhIbFCqBB/Pxb4Fdllu1ENEaniNRioWdD3gYOIhr2/zFguJntDGwELgImp6Y8EckWoWExHGgdPx9DdBjSh6hF8SowIMl1iUiWCfqKerbQ5d41iy73rpl+yuXeQczsZGBCO
r4b0qRd51TvQpKoXcPmmS5BkuinjMEpItshhYWIBFFYiEgQhYWIBKlspKwJgdtok6RaRCSLVXY2ZJfAbWxAY3CK1HqVDX6jkbJEpJT6LEQkiMJCRIIoLEQkiMJCRIIoLEQkSLXCIh64t62ZHWpm9VNVlIhkn+CwMLMLgYXAN8A0YK94/jNmdllKqhORrBF6K4CrgNuB+4Cj2fLWhVOB05JemYhkldDxLC4Chrn7WDNLHLdiDtAhuWWJSLYJPQxpAXxUwbJNlHOHMhGpXULDYi7QsYJlRwKzklOOiGSr0MOQvwB3mdlG4J/xvGZm1he4gugmRCJSiwWFhbvfb2Y7AcOAEfHsF4F1wHB3fyxF9YlIlggesNfdbzGzu4FDgSbAd8C77r4qVcWJSPao1uje7r4aeDlFtYhIFgsKi/iCrEq5+13bXo6IZKvQlsW4SpZtvvGPwkKkFgs6deruOYkPYGfgDODfwL6pLFJEMu8n35HM3VcCT5pZI+Ae4Kgk1SQiWSgZX1H/GvhtErYjIllsm8LCzFoCA4kCQ0RqsdCzIcv4sSNzsx2AfKAQ6JXkukQky2zL2ZBCYAEwyd1XJK8kEclGVYaFmf0MeA342t0Xpb4kEclGIX0WJcAUYJ8U1yIiWazKsHD3TcCXQPPUlyMi2Sr0bMgQYJiZ/TKVxYhI9qrsLupHAh+7+xrgOqJvms4ws4XAEhLOjrj7waksVEQyq7IOzteBQ4APgM/ih4hspyoLi9IRvN393DTUIiJZTHckE5EgVV1ncbyZ7R2yIXd/JAn1iEiWqioshgVuxwGFhUgtVlVYdAKmp6MQEcluVYXFendfm5ZKRCSrqYNTRIIoLEQkSIWHIfE4myIigFoWIhJIYSEiQRQWIhJEYSEiQX7yfUNk2+Tk5PDGW8+xeNESep/Sj1/uvw9/uWM0derWobi4hIGXDeWjjz7NdJkS69P/DE47qyeY8eQ/JjL+nse48/6b2K19OwAaNsrnh1WrObHTGRmuNHUUFhlywUXn8sWceeTnNwBg1OhB3HTjnbz6yht06XoUI0cPonu3P2a4SgHosHd7TjurJz27nE3RxiIemjCOqa9O45J+g0rXGTzyclb/sCaDVaaeDkMyoFWrFnQ9rhMPj3+ydJ67lwZHw4b5FBQszVR5kqB9h9345KOZFK4vpKSkhA/e+Ygu3Y/eYp3uPY7lX89MylCF6aGWRQbcNHYow4bcRIP8+qXzrrl6FBOfe5jRYwaTk5PDsUefksEKpawv/jOPgUMuovFOjSgs3EDHzofz2YxZpcsPOuTXLF/2HfO/+m8Gq0y9rGhZmFmFg+uYWX8zm25m0zcW/5DOslLiuOOOZvmyFcyYseXAY/36/YnB14xm370OZ/A1oxn395szVKEkmvfl19xz53gefvouHpowjtmff0FxSUnp8hN7deWFWt6qADD3xBuNZaAIs2/d/edVrdew/u6ZL3YbXT/iKk4/4ySKi0uoW7cO+fkNeOH5lzmu2zG0bXVA6XoLFv+bNi0PqGRL2W+Xeo0zXUJKDBxyMQWLlvDoQ0+Rm5vLOzMn0eOYP1GwuHYcOs5b/rGVNz9tLQsz+7SCx0y2o9sMjLj+FvbpcBi/3PdIzj3nEt58413O73sFBYuXcPgRvwOg41GHMm/e/MwWKlto0nQnAFq2bkHXEzqVtiQO6/g75s2dX2uCojLp7LNoDnQFvk+Yb8A7aawjKw24+FpuvmUoeXl5bCjcwKUXD8l0SVLG3x66lcY7N6K4qJjhV9/MD6tWA3BCzy7bxSEIpPEwxMweAB5y97fKWfaYu1d5nrA2HIZsT2rrYUhtV9FhSNpaFu7et5JluqBAJMtlxdkQEcl+CgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwE
JEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEg5u6ZrkEAM+vv7vdmug4Jsz3+vdSyyB79M12AVMt29/dSWIhIEIWFiARRWGSP7er4txbY7v5e6uAUkSBqWYhIEIWFiARRWGSYmR1nZnPMbK6ZDcp0PVI5M3vQzJaa2WeZriXdFBYZZGa5wN+AbsC+wBlmtm9mq5IqjAeOy3QRmaCwyKyDgbnu/pW7bwSeAHpkuCaphLu/CXyX6ToyQWGRWa2B/5aZXhDPE8k6CovMsnLm6Vy2ZCWFRWYtANqWmW4DLMpQLSKVUlhk1ofAnma2m5ntAJwOPJ/hmkTKpbDIIHcvBi4GXgb+A0xw988zW5VUxsweB94F9jKzBWbWN9M1pYsu9xaRIGpZiEgQhYWIBFFYiEgQhYWIBFFYiEgQhUUamdlwM/Myj0Vm9rSZtU/hPk+I97VrPL1rPH1CNbbR28z6JLGmBnENlW4zXufibdzXcDNbvi3bKLOt8WY2PRnbqonyMl3AdmgVP35rcXdgFDDZzPZz97Vp2P9i4BBgdjV+pzfQlOgbl7KdUlikX7G7vxc/f8/MvgWmAccDTyWubGb13H19snbu7huA96pcUSSBDkMy76P4564AZjbfzG4zs6FmtgD4IZ6fY2aD4kFyNpjZF2Z2TtkNWWR4PDjLajN7BGiYsE65hyFmdr6ZzTSzQjNbYmb/NLNGZjYeOBnoWObwaXiZ3+thZtPj3ysws7Fm9rOEbZ8c17vezN4E9k7C+4aZdTezV+PX+4OZvWdmXSpY9zAz+ziuc4aZHV7OOv3M7PP4/f3GzK6uYv+Nzez++HCy0My+NbP7kvHaspFaFpm3a/yzoMy8PwKfAxfy49/or8A5wEjgY+BY4EEzW+Hu/4rXuQQYBowhaq30AsZWVYCZXRdv9y7gKmBHoDvQgOgw6edA47geiL4Ah5n1Bh4H7gGuBdoDNxL9J3RlvM6vgSeBicClwH7AhKpqCrQb8AJwK7CJaBChl8zsSHd/u8x6OwL/G9e2GBgYr7enuxfEdV5F9L6NBaYCvwFGmdk6dx9Xwf5vBw4FLif6+7UFjkzSa8s+7q5Hmh7AcGA5UQDkAR2A14laDy3jdeYT/YOuW+b39iD6MJyTsL1HgA/j57lE31j9e8I6rxJ97X3XeHrXePqEeLoxsA64vZK6/wlMTZhnwDfAQwnzzwPWA03i6QnALOKvFsTzhsQ19Kni/XLg4sD3Nid+T18GHkx4zx34Y5l5DYgGsLkpnm4IrAGuT9jmSKIQyI2nxwPTyyz/DBiQ6X9X6XroMCT9mgBF8WMOUSfnae6+uMw6k929sMz0MURhMdHM8jY/gMnAgfHwfG2BlsBzCft7pop6DgHqAQ9V83V0IGpxTEioaQpQF/hFvN7BwPMef7oCawpiZm3M7GEzWwgUE72nXeLaEk3c/MTd1xCF6MHxrEOA+sBT5byW5kRDB5RnBnCVmV1oZuXts1bRYUj6rQI6E/1vVwAsSvggASxJmG5K1HJYVcE2WwIt4udLE5YlTidqEv9cXOlaW2sa/3yxguWbx+lo8RNqqpKZ5RB9nT+f6NBrLrCWqDXQLGH1Nb51J/FSYP/4+ebXUtE3ftsStaISXRzvbxjwNzObCwx19yeq8VJqDIVF+hW7e1Xn6hPD4zui/zkPI2phJFrKj3/LxA9K4nSiFfHPlkSHSKE2j0PZH/iknOVfxz8LfkJNIfYAfgV0c/dJm2eaWb1y1m1QzlmlZvwYkJtfywlsHdQQtQC34u4rifqJLjGz/YGrgUfN7FN3n1WdF1MT6DCkZphC1LJo5O7Ty3lsJBrLs4CtB/ztVcW23yXqYzinknU2Eh1alDUHWEjUF1JeTZtD6EPgD2ZWdgjBqmoKsTkUN
myeYWbtiAK1PD3LrNeAqIP4g3jW5vegVQWvZXVVxbj7p0Sdwzkk6WxPtlHLogZw9zlmdjfwhJmNBaYTfXj3Azq4ez93L4mX3RpfsTiN6JTnPlVse6WZjQJusGi0rheBOkRnQ0a4+0KiC7h6mNlJRGdCFrn7IjMbCPzDzBoCLxGFyu7AScAp7r4OuBl4n6hv4wGivozqDBhzoJmdkjBvGdG1IguA28xsKNHhyAiiAEu0Pn59DYg6ga8EdgDuKPMeDAfuiAPnTaIPfQegk7v3LGebmNlbRH0hnxG1Bs8nOhT6oLz1a7xM97BuTw/isyFVrDMfuLWc+QZcRnRcvYHoA/MGcHbCOqPiZauBR4lOw1Z4NqTM7/6Z6KzFBqIWygSgYbysKdGH4rv4d4eX+b1uRMG0luiszgxgNJBXZp1TifoUCoG3gIMIPxtS3mNqvPwgog/meuBLoA9bn7EYTnR4dURc2wbg38CR5ezvTKLrXtYD3xOF3BVllidu+xZgZvxeryQ6s3VEpv+dpeqhkbJEJIj6LEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIIoLEQkiMJCRIL8P0h+CPIbqrFlAAAAAElFTkSuQmCC\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "cm = confusion_matrix(y_test, best_preds)\n", + "ax = sns.heatmap(cm, square=True, annot=True, cbar=False)\n", + "ax.set_xlabel('Predicted Labels',fontsize = 15)\n", + "ax.set_ylabel('True Labels',fontsize = 15)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Using XGboost, there have been some trade-off and our recall score and f1 score have increased. Let search for the best hyperparameters" + ] + }, + { + "cell_type": "code", + "execution_count": 106, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Tuned: {'eta': 0.05, 'gamma': 0.1, 'learning_rate': 0.01, 'max_depth': 5, 'min_child_weight': 1, 'n_estimators': 500}\n", + "Mean of the cv scores is 0.932706\n", + "Train Score 0.958423\n", + "Test Score 0.938000\n", + "Seconds used for refitting the best model on the train dataset: 3.063578\n" + ] + } + ], + "source": [ + "from xgboost.sklearn import XGBClassifier\n", + "from sklearn.model_selection import GridSearchCV, RandomizedSearchCV \n", + "\n", + "param_dict = {\n", + " 'eta': [0.05,0.10,0.20,0.25,0.30],\n", + " 'gamma': [0.0, 0.1, 0.2, 0.4],\n", + " 'max_depth':range(3,10,2),\n", + " 'min_child_weight':range(1,6,2),\n", + " 'learning_rate': [0.001,0.01,0.1,1],\n", + " 'n_estimators': [200,500,1000]\n", + " \n", + "}\n", + "\n", + "xgc = XGBClassifier()\n", + "\n", + "clf = GridSearchCV(xgc,param_dict,cv=2,n_jobs = -1).fit(X_train,y_train)\n", + "\n", + "print(\"Tuned: {}\".format(clf.best_params_)) \n", + "print(\"Mean of the cv scores is {:.6f}\".format(clf.best_score_))\n", + "print(\"Train Score {:.6f}\".format(clf.score(X_train,y_train)))\n", + "print(\"Test Score {:.6f}\".format(clf.score(X_test,y_test)))\n", + "print(\"Seconds used for refitting the best model on the train dataset: {:.6f}\".format(clf.refit_time_))" + ] + }, + { + 
"cell_type": "code", + "execution_count": 159, + "metadata": {}, + "outputs": [ + { + "ename": "ValueError", + "evalue": "feature_names mismatch: ['ContractRenewal', 'DataPlan', 'DataUsage', 'CustServCalls', 'DayMins', 'DayCalls', 'MonthlyCharge', 'OverageFee', 'RoamMins'] ['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9']\nexpected RoamMins, MonthlyCharge, DataPlan, DayMins, OverageFee, DayCalls, CustServCalls, ContractRenewal, DataUsage in input data\ntraining data did not have the following fields: f9, f7, f2, f3, f8, f0, f5, f6, f1, f4", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mxgb_pred\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mclf\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpredict\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX_test\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mclassification_report\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0my_test\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mxgb_pred\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 3\u001b[0m \u001b[0mcm\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mconfusion_matrix\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0my_test\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mxgb_pred\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[0max\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0msns\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mheatmap\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mcm\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0msquare\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m,\u001b[0m 
\u001b[0mannot\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcbar\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mFalse\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0max\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mset_xlabel\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'Predicted Labels'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mfontsize\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;36m15\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", + "\u001b[1;32m~\\AppData\\Local\\conda\\conda\\envs\\tf\\lib\\site-packages\\sklearn\\utils\\metaestimators.py\u001b[0m in \u001b[0;36m\u001b[1;34m(*args, **kwargs)\u001b[0m\n\u001b[0;32m 117\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 118\u001b[0m \u001b[1;31m# lambda, but not partial, allows help() to work with update_wrapper\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 119\u001b[1;33m \u001b[0mout\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;32mlambda\u001b[0m \u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m:\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 120\u001b[0m \u001b[1;31m# update the docstring of the returned function\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 121\u001b[0m \u001b[0mupdate_wrapper\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mout\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", + 
"\u001b[1;32m~\\AppData\\Local\\conda\\conda\\envs\\tf\\lib\\site-packages\\sklearn\\model_selection\\_search.py\u001b[0m in \u001b[0;36mpredict\u001b[1;34m(self, X)\u001b[0m\n\u001b[0;32m 485\u001b[0m \"\"\"\n\u001b[0;32m 486\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_check_is_fitted\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'predict'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 487\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mbest_estimator_\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpredict\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mX\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 488\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 489\u001b[0m \u001b[1;33m@\u001b[0m\u001b[0mif_delegate_has_method\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdelegate\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'best_estimator_'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'estimator'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", + "\u001b[1;32m~\\AppData\\Local\\conda\\conda\\envs\\tf\\lib\\site-packages\\xgboost\\sklearn.py\u001b[0m in \u001b[0;36mpredict\u001b[1;34m(self, data, output_margin, ntree_limit, validate_features, base_margin)\u001b[0m\n\u001b[0;32m 896\u001b[0m \u001b[0moutput_margin\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0moutput_margin\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 897\u001b[0m \u001b[0mntree_limit\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mntree_limit\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 898\u001b[1;33m validate_features=validate_features)\n\u001b[0m\u001b[0;32m 899\u001b[0m \u001b[1;32mif\u001b[0m 
\u001b[0moutput_margin\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 900\u001b[0m \u001b[1;31m# If output_margin is active, simply return the scores\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", + "\u001b[1;32m~\\AppData\\Local\\conda\\conda\\envs\\tf\\lib\\site-packages\\xgboost\\core.py\u001b[0m in \u001b[0;36mpredict\u001b[1;34m(self, data, output_margin, ntree_limit, pred_leaf, pred_contribs, approx_contribs, pred_interactions, validate_features, training)\u001b[0m\n\u001b[0;32m 1362\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 1363\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mvalidate_features\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1364\u001b[1;33m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_validate_features\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 1365\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 1366\u001b[0m \u001b[0mlength\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mc_bst_ulong\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", + "\u001b[1;32m~\\AppData\\Local\\conda\\conda\\envs\\tf\\lib\\site-packages\\xgboost\\core.py\u001b[0m in \u001b[0;36m_validate_features\u001b[1;34m(self, data)\u001b[0m\n\u001b[0;32m 1934\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 1935\u001b[0m raise ValueError(msg.format(self.feature_names,\n\u001b[1;32m-> 1936\u001b[1;33m data.feature_names))\n\u001b[0m\u001b[0;32m 1937\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 1938\u001b[0m def get_split_value_histogram(self, feature, fmap='', bins=None,\n", + "\u001b[1;31mValueError\u001b[0m: feature_names mismatch: ['ContractRenewal', 'DataPlan', 'DataUsage', 'CustServCalls', 'DayMins', 'DayCalls', 'MonthlyCharge', 'OverageFee', 'RoamMins'] 
['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9']\nexpected RoamMins, MonthlyCharge, DataPlan, DayMins, OverageFee, DayCalls, CustServCalls, ContractRenewal, DataUsage in input data\ntraining data did not have the following fields: f9, f7, f2, f3, f8, f0, f5, f6, f1, f4" + ] + } + ], + "source": [ + "xgb_pred = clf.predict(X_test)\n", + "print(classification_report(y_test, xgb_pred))\n", + "cm = confusion_matrix(y_test, xgb_pred)\n", + "ax = sns.heatmap(cm, square=True, annot=True, cbar=False)\n", + "ax.set_xlabel('Predicted Labels',fontsize = 15)\n", + "ax.set_ylabel('True Labels',fontsize = 15)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Hyperparameter tuning was worth the time, as both the recall and F1 score improved compared with choosing the hyperparameters manually." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Logistic Regression\n", + "Logistic regression will require us to standardize our dataset." + ] + }, + { + "cell_type": "code", + "execution_count": 136, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0.41167182, 0.67648946, 0.32758048, ..., 1.99072703, 0.0715836 ,\n", + " 0.08500823],\n", + " [0.41167182, 0.14906505, 0.32758048, ..., 1.56451025, 0.10708191,\n", + " 1.24048169],\n", + " [0.41167182, 0.9025285 , 0.32758048, ..., 0.26213309, 1.57434567,\n", + " 0.70312091],\n", + " ...,\n", + " [0.41167182, 1.83505538, 0.32758048, ..., 0.01858065, 1.73094204,\n", + " 1.3837779 ],\n", + " [0.41167182, 2.08295458, 3.05268496, ..., 0.38390932, 0.81704825,\n", + " 1.87621082],\n", + " [0.41167182, 0.67974475, 0.32758048, ..., 2.66049626, 1.28129669,\n", + " 1.24048169]])" + ] + }, + "execution_count": 136, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from scipy import stats\n", + "import numpy as np\n", + "z = np.abs(stats.zscore(data))\n", + "z" + ] + }, + { + "cell_type": "code", + "execution_count": 137, + "metadata": {}, + 
"outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "414\n" + ] + }, + { + "data": { + "text/html": [ + "
\n", + "2919 rows × 12 columns\n", + "
" + ], + "text/plain": [ + " index Churn AccountWeeks ContractRenewal DataPlan DataUsage \\\n", + "0 0 0 128 1 1 2.70 \n", + "1 1 0 107 1 1 3.70 \n", + "2 2 0 137 1 0 0.00 \n", + "3 6 0 121 1 1 2.03 \n", + "4 8 0 117 1 0 0.19 \n", + "... ... ... ... ... ... ... \n", + "2914 3327 0 79 1 0 0.00 \n", + "2915 3328 0 192 1 1 2.67 \n", + "2916 3329 0 68 1 0 0.34 \n", + "2917 3330 0 28 1 0 0.00 \n", + "2918 3332 0 74 1 1 3.70 \n", + "\n", + " CustServCalls DayMins DayCalls MonthlyCharge OverageFee RoamMins \n", + "0 1 265.1 110 89.0 9.87 10.0 \n", + "1 1 161.6 123 82.0 9.78 13.7 \n", + "2 0 243.4 114 52.0 6.06 12.2 \n", + "3 3 218.2 88 87.3 17.43 7.5 \n", + "4 1 184.5 97 63.9 17.58 8.7 \n", + "... ... ... ... ... ... ... \n", + "2914 2 134.7 98 40.0 9.49 11.8 \n", + "2915 2 156.2 77 71.7 10.78 9.9 \n", + "2916 3 231.1 57 56.4 7.67 9.6 \n", + "2917 2 180.8 109 56.0 14.44 14.1 \n", + "2918 0 234.4 113 100.0 13.30 13.7 \n", + "\n", + "[2919 rows x 12 columns]" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "2919" + ] + }, + "execution_count": 137, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "outliers = list(set(np.where(z > 3)[0]))\n", + "\n", + "print(len(outliers))\n", + "\n", + "new_data = data.drop(outliers,axis = 0).reset_index(drop = False)\n", + "display(new_data)\n", + "\n", + "y_new = y[list(new_data[\"index\"])]\n", + "len(y_new)" + ] + }, + { + "cell_type": "code", + "execution_count": 143, + "metadata": {}, + "outputs": [], + "source": [ + "X_new = new_data.drop(['index', 'Churn'], axis = 1)\n", + "\n", + "from sklearn.preprocessing import StandardScaler\n", + "X_scaled = StandardScaler().fit_transform(X_new)" + ] + }, + { + "cell_type": "code", + "execution_count": 156, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Training accuracy: 0.8962310327949095\n", + "Test accuracy: 0.910958904109589\n" + ] + } + ], + "source": [ + 
"from sklearn.linear_model import LogisticRegressionCV\n", + "from sklearn.model_selection import train_test_split\n", + "X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_new, test_size = 0.3, random_state = 19, stratify = y_new)\n", + "model = LogisticRegressionCV(cv = 3,solver = 'sag', max_iter = 1000, random_state = 9)\n", + "model.fit(X_train, y_train)\n", + "\n", + "print(\"Training accuracy: \", model.score(X_train, y_train))\n", + "print(\"Test accuracy: \", model.score(X_test, y_test))" + ] + }, + { + "cell_type": "code", + "execution_count": 157, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Precision: 0.8795475113122172\n", + "Recall: 0.6120192307692308\n", + "Accuracy: 0.910958904109589\n", + " precision recall f1-score support\n", + "\n", + " 0 0.91 0.99 0.95 780\n", + " 1 0.85 0.23 0.36 96\n", + "\n", + " accuracy 0.91 876\n", + " macro avg 0.88 0.61 0.66 876\n", + "weighted avg 0.91 0.91 0.89 876\n", + "\n" + ] + } + ], + "source": [ + "preds = model.predict(X_test)\n", + "print(\"Precision: \", (precision_score(y_test, preds, average='macro')))\n", + "print(\"Recall: \",(recall_score(y_test, preds, average='macro')))\n", + "print(\"Accuracy: \", (accuracy_score(y_test, preds)))\n", + "print(classification_report(y_test, preds))" + ] + }, + { + "cell_type": "code", + "execution_count": 158, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Text(91.68, 0.5, 'Actual Value')" + ] + }, + "execution_count": 158, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAQsAAAELCAYAAADOVaNSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXlUlEQVR4nO3dd5gV5dnH8e+9gIJSRJG2gNiIRt/ERGPBV42CiIgikqCxACpi7yUSDIrYgmKMUV9jDGoSAxKNjSAGQYwNBIXEBooFpYv0zi73+8cMuBzOnn0WTpnd/X2u61y7M8+Ue8/u+e3MM83cHRGRihQVugARqRoUFiISRGEhIkEUFiISRGEhIkFqF7qAytiw6HMduqlC6rU8qtAlyDYoWT/H0o3XloWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYiEiQ2oUuoCr6YtZsrht45+bh2XPncVnfczjn9O6bx61YuYobbx3CvAXfUFpSSp8ze9D9pE7btd7169fTf/BQPprxKbs0asg9t/anuEUzpn/yGYPveYCVq1ZTVKuIfr3O4MSOx2zXuiSzoqIiJk18iblz5tOte+9Cl5MXCottsOcerXjmiQcBKC0t5bhTz6HDMe23mGb4My+yd9s2PDhkEIuXLKXrLy6ga6djqVOnToXLnzNvAQNuH8rjDwzZYvw/Rv2Lhg3q89LIYYx+ZQL3PjSMoYP7U7fujtzx6+vYo3UxC7/5lp7nX86Rhx1Mwwb1s/dDyxauuLwv06d/SsMGDQpdSt7kdTfEzPYzs1+a2f1m9rv4+/3zWUO2TZwyjdbFLWjZvNkW482MVavX4O6sXrOWRg0bUKtWLQBefHk8Z/S9kh69L2XQkPspLS0NWtf419+mW5eOAHT66VFMenca7k7bNq3Yo3UxAE13341dG+/CkqXLsvhTSlnFxS3ocmIHhg0bXuhS8ipvYWFmvwRGAAa8A0yOvx9uZjfmq45se2nca3RJs8l/Zo+T+fzLrzm221l073UxN151EUVFRXz25VeMGfcaf3l4KM888SBFRUWM+terQeta+M23NG/aBIDatWtRf+edWLps+RbTvP/RDDZsKKF1cYvt/+EkrXuHDuLG/rexcePGQpeSV/ncDTkfOMDdN5QdaWb3Ah8Cd6Wbycz6Af0AHhp6G317/SLXdQbbsGEDE96YxFUXnbtV25vvvMt+++7FsN/fxddz5nHBVb/i4B8ewKQp0/ho+kzOOP9KANatW8eujXcB4Ir+tzJn7gI2lGxg3oJv6NH7UgDO7tmN7id1wt23Wo+Zbf7+m0WL6X/r3dx+07UUFanvOhdO6tKRhQsX8d7U9znm6CMKXU5e5TMsNgItgVkp41vEbWm5+yPAIwAbFn2+9aelgF6fOIX92+1Nk10bb9X27D/H0vfsnpgZbVq1pLhFc76YNRt355QTO3L1xVsHzP13DgTK77No1rQJ8xcuonnT3SkpKWXlqtU0ahjtM69ctYpLrh/I5f1688MDq/SeXaK1b38IJ3ftxImdj6Nu3R1p2LABTzx+P737XFHo0nIun/9+rgLGmdlLZvZI/BoDjAOuzGMdWTN67AS6HP/TtG0tmu3OxHenAbBo8RK+/Go2rVo25/BDDmLshDf4dslSAJYtX8Hc+QuC1nfs/x7O86NfAeBfE17nsIN/iJmxYcMGruw/mFM6d+CE447a3h9LMhhw01203esQ9ml3OGedfQmvvvpmjQgKyOOWhbu
PMbN2wKFAMVF/xWxgsruH9fAlyJq1a3l78lRuvuG7P5Snnv0nAKd3P4mL+pzJgNuH0v2ci3F3rr7kPBrv0ojGuzTi8gt60e+qAWz0jdSpXZsB11yyVQdpOqd1PYH+g+/mxJ7n0ahhA+4eFHX1jBn/Ou9O+4Cly1bwXBwmtw+4hv3a7Z2Dn1xqKku3H5xUSdsNkczqtdRWTlVUsn6OpRuvXjARCaKwEJEgCgsRCaKwEJEgCgsRCVKpQ6dm9n3gYKA1MMzd55vZPsACd1+RiwJFJBmCwsLM6gPDgB5ASTzfGGA+cAfwFXBdjmoUkQQI3Q25F2gPdAQaEJ1QtclooHOW6xKRhAndDTkNuNLdXzWzWilts4A9sluWiCRN6JZFPeDbctoaAFXudG0RqZzQsJgM9Cqn7WfAW9kpR0SSKnQ35CbgFTN7Bfg74EAXM7uaKCyOzlF9IpIQQVsW7v4G0AHYEXiAqINzELAX0NHdJ+esQhFJhODzLNz9TeAoM6sHNAaWuvvqnFUmIolS6ftZuPsaYE0OahGRBAs9KWtkRdO4e8/tL0dEkip0y2L3NON2Bb5HdEh1RtYqEpFECgoLdz823Xgzaw08C/w2m0WJSPJs11Wn7v41cCcwpKJpRaRqy8Yl6qVAqywsR0QSLLSD8/tpRu8A7A8MJjrDU0SqsdAOzg+IztpMZURB0TdrFYlIIoWGRboOzrXAbHefk8V6RCShQo+GvJbrQkQk2coNCzPbqTIL0qnfItVbpi2LlaTvpyhP6k1xRKQayRQW51G5sBCRaqzcsHD3x/NYh4gknJ4bIiJBgi9RN7PTgQuAdkDd1HZ3b5rFukQkYYK2LMzsTOAJYCbRqd0vAKPi+ZcT3T1LRKqx0N2Q64lO6740Hn7I3c8D9gQWATpsKlLNhYbFvsCb7l5KdOFYQ4D4kYW/AS7LTXkikhShYbGM6Ga9AHOILiDbxIDdslmUiCRPaAfnFOAHwMtE/RUDzawEWA8MBCblpjwRSYrQsLiT7x5RODD+/iGiszYnA/2yX5qIJEmma0M+BJ4EnnL3icBEAHdfCnQzsx2BHd19eT4KFZHCytRn8QVwM/CJmU0ysyvNrOWmRndfp6AQqTnKDQt37wo0Ay4kOpfiHuArMxtvZn3NrHGeahSRBMh4NMTdl7r7o+5+PFAMXEW06/IHYL6ZvWhmZ5rZzrkvVUQKydwrf2GpmRUDpwNnAAcDa9y9fpZr28qGRZ/rKtgqpF7LowpdgmyDkvVzLN34bb2QzIGN8de0CxaR6iU4LMysiZldbGYTgK+IztxcAJwF6CIykWou43kWZtYIOI1od+NYonB5DbgIeMbdl+S8QhFJhEznWTwPdCI6zfsdoovJnnL3+XmqTUQSJNOWxV5EV5oOd/cv8lSPiCRUptvq/U8+CxGRZNNt9UQkiMJCRIIE34MzCfbb72eFLkEqoU6tKvXnJRXQloWIBFFYiEiQTOdZjKzEctzdT89CPSKSUJl2KnfPWxUikniZzrM4Np+FiEiyqc9CRIJU5vGFDYBulP/4whuyWJeIJExQWJjZ3sCbwE7AzsA3wK7x/EuIniuisBCpxkJ3Q35L9OyQZkQ3u+kC1APOBlYS3TVLRKqx0N2QQ4G+wLp4eIf4UYZ/M7MmwO+A9jmoT0QSInTLoi6w3N03AouBlmXaPgB+mO3CRCRZQsPiE757ItlU4CIzq2tmdYDzgbm5KE5EkiN0N2QEcBDwF+DXRM88XU50097aQJ8c1CYiCRIUFu5+b5nvJ5rZgcCJRLsn4939gxzVJyIJsU3XELv718AjWa5FRBIs9DyLLhVN4+6jt78cEUmq0C2LUaR/oFDZJ4TVykpFIpJIoWGxZ5pxuxI9KqAPcG62ChKRZArt4JyVZvQsYKqZlQK/Ak7JZmEikizZuOp0KnBcFpYjIgm2XWFhZjsQ7YbMy0o1IpJYoUdDJrNlZybADkBboAHqsxCp9kI7OD9k67BYC/wdeM7dP8xqVSK
SOKEdnH1yXIeIJFxQn4WZjTez/cppa2dm47NblogkTWgH50+BhuW0NQSOzko1IpJYlTkaktpnseloyHHA/KxVJCKJlOkhQzcDA+NBByaapZ7tvdndWa5LRBImUwfnaGAR0fUg9wNDgS9TplkPTHf313NSnYgkRqaHDE0GJgOY2QpglLt/m6/CRCRZQvsspgGHpWswsy5m9oOsVSQiiVSZRwGkDQvgJ3G7iFRjoWHxY6KHDKXzNvCj7JQjIkkVGha1iJ5Els7ORNeJiEg1FhoWk4F+5bT1I3pamYhUY6EXkt0CvGJmk4AniE7CagH0InrA0PE5qU5EEiP0QrJ/m1kn4E7g90TnXmwEJgHH6zwLkeov+FEA7j4BOMLMdgIaA0vcfTWAmdVx9w25KVFEkqDSd8py99XuPgdYY2bHmdkf0bUhItVepR8yZGaHAb8AegLNiB6UPCLLdYlIwoTeVu9AooA4g+hWeuuJDpdeAzzo7iW5KlBEkqHc3RAz28vMfmVm7wP/Aa4DPiY6ArIvUSfnVAWFSM2QactiJtGl6ZOAC4Fn3H0JgJk1ykNtIpIgmTo4ZxFtPRxIdKes9ma2TQ9SFpGqr9ywcPc9gSOJTsLqALwILIiPfnQgzZ2zRKT6ynjo1N3fdvfLgWLgBOB5oAfwdDzJBWZ2SG5LFJEkCDrPwt03uvtYdz8PaA6cRvTMkO7AJDP7OIc1ikgCbMtJWevd/Tl3P4PoPIteRJ2hIlKNbdezTt19lbs/6e4nZ6sgEUmmbDxFXSphz3324MVXh29+Tfvi3/S58MzN7X0vPYfPFr1H4113KVyRsoVWrVowZswIpk4dx7vvjuXSS6NH+95xx6+YNm0c77wzhqee+gONGpX3aJ3qwdyrzkGNvZv8uOoUG6CoqIi33h/DaSf0Zu7sebRo2Yw77hvI3vu2pVuHs1iyeGmhS9wuc1dVj/s7N2/elObNmzJt2gfUr78zb701ip49+1Fc3JwJE96itLSU2267EYCbbrqrwNVuvzVrZqV95oe2LAqo/dGH8tWXs5k7ex4AA267lt8Muo+qFOA1wfz5C5k27QMAVq5cxfTpM2nZshnjxr1OaWkpAO+8M5Xi4haFLDPnFBYF1LX7Cbz4j5cB6ND5aBbMW8j0Dz8tcFWSSZs2rTjooAOYPHnaFuN79erJyy9PKEhN+ZKIsDCzczO09TOzKWY2ZfnaRfksK6fq1KlNh85HM/qFsdStV5dLrj6f3971cKHLkgx23nknhg9/mOuvv5UVK1ZuHn/DDZdRWlrCiBHPFrC63EtEWACDymtw90fc/RB3P6Rh3Sb5rCmnjul4JB/+dzrffrOYNm1b0bpNMf98bQSvvTeK5i2b8sL4J2nSdLdClymx2rVrM3z4wzz11HM8//yYzePPOqsHXbp0oE+fKwtYXX7k7VoPM/tveU1E52vUKCef1nnzLsgnH8/k0P07bm577b1RnNrx7CrfwVmdPPzwEGbMmMn99z+6edzxxx/DtddeTKdOPVmzZm0Bq8uPfF4Y1ozolPElKeMNeCuPdRRc3Xp1OfKYwxhwze2FLkUCtG9/CGed1YP33/+YiRNHA3DzzXczdOgt7LjjDowa9Vcg6uS84ooBhSw1p/J26NTM/gQ85u5vpGn7m7ufmWa2LVS3Q6fVXXU5dFrTlHfoNG9bFu5+foa2CoNCRAorKR2cIpJwCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEgCgsRCaKwEJEg5u6FrkEAM+v
n7o8Uug4JUxN/X9qySI5+hS5AKqXG/b4UFiISRGEhIkEUFslRo/Z/q4Ea9/tSB6eIBNGWhYgEUViISBCFRYGZWWczm2FmM83sxkLXI5mZ2TAzW2hmHxS6lnxTWBSQmdUCHgROBL4P/MLMvl/YqqQCjwOdC11EISgsCutQYKa7f+7u64ERQLcC1yQZuPu/gcWFrqMQFBaFVQx8XWZ4djxOJHEUFoVlacbpWLYkksKisGYDrcsMtwLmFqgWkYwUFoU1GdjXzPY0sx2AM4AXClyTSFoKiwJy9xLgMuBl4GNgpLt/WNiqJBMzGw68DXzPzGab2fmFrilfdLq3iATRloWIBFFYiEgQhYWIBFFYiEgQhYWIBFFYFIiZ3WJmXuY118yeMbO9c7jOrvG62sbDbePhrpVYRk8z65PFmurHNaRdppkdErf3KKe9mZmVmNkNgeubYGZPb0fJNZbCorCWAUfEr+uAg4BxZrZzntY/L173G5WYpyfQJyfVpOHuU4BPiU5YS+fnRH/HT+WrpppKYVFYJe4+MX79DegN7AF0STexmdXL5srdfV287qXZXG4OjABOMrP6adrOAN5y91l5rqnGUVgky7vx17YAZvalmQ01s1+b2WxgeTy+yMxujG+Ys87MPjGz3mUXZJFb4hu1rDCzPwMNU6ZJuxtiZheY2ftmttbMFpjZ02bWyMweB3oAx5TZfbqlzHzdzGxKPN98MxtiZnVSlt0jrneNmf0b2C/gfRkO1CPl8n0zaw20j9sxs2vNbLKZLYvrftHM9sm0YDN73MymVPS+hLzn1Z3CIlnaxl/nlxl3JnAMcAlwejzu98BNRHeYPgl4FhiW8qG/AhgYT/MzYA0wpKICzOwm4A/Aa8CpwMVEu0v1gcHAq8BUvtt9ejSeryfwD+Ad4BRgENGDeO4ss+wfE+0u/Ac4jeg6mJEV1eTuH8fzpO6KnA5sBP4eD7cCHiAKlQuAWsCbZtaoonUECHnPqzd316sAL+AWYBFQO361I/ogLgdaxNN8SdSvULfMfPsQfUB6pyzvz8Dk+PtaRFev/l/KNGOJLoFvGw+3jYe7xsO7AKuBezPU/TQwIWWcAbOAx1LGn0cUUrvFwyOBj4gvM4jHDYhr6FPB+/VLYB3QuMy4KcDL5Uxfi2hrZAXQq8z4CcDTZYYfB6akzJv6vlT4nteEl7YsCms3YEP8mgHsBZzu7vPKTDPO3deWGe5A9If7rJnV3vQCxgEHxbfqaw20AJ5PWd8/KqjnCKIP2GOV/DnaAW2AkSk1jQfqAgfG0x0KvODxJy2wpk1GAHWA7gDxUaODiXdB4nGHm9lYM/sWKCEKvvpxfdsj5D2v9moXuoAabhnQkei/2HxgbsoHCWBBynATov+ay8pZZgugefz9wpS21OFUu8Vf52WcamtN4q+jy2nfdM+O5ttQEwDuPsvM3ibaFRkWf11HtDuAmbUB/kW0G3Qh0ZbVeuCfRIG1PULe89nbuY7EU1gUVolHhwYzSQ2PxUT/NY8k+m+XaiHf/V6bprSlDqf6Nv7agmgXKdSme1L2I+rPSPVF/HX+NtRU1nDgPjNrShQWo9190we4M7AT0M3dVwHE//13rWCZa4EdUsalzhPynld7CouqZzzRf7lG7j423QRm9jXRB7MbMKZM02kVLPttoj6G3kTnfaSznq3/U88A5hD1hfwxw/InA6eYWf8yW1AV1VTWSOA+oo7bA4Fby7TVI/ogl5QZ15OK/8ZnA23NrG6Z3b3jU6ap8D2vCRQWVYy7zzCzh4ERZjaEqJOvLnAA0M7d+7p7adx2j5ktAl4nOuS5fwXLXmpmg4Hb4zt3jQZ2JOr9H+Tuc4DpQDczO5XogzbX3eea2bXAX8ysIfASUajsRXRE5Wfuvhr4DTCJqG/jT0Qf+OCbx7j7QjMbT3RkaCUwqkzzpg/0Y/GyDyAKvKUVLPY5otB5ND40/CPg3JT1Vvieh/4MVVqhe1hr6ov4aEgF03wJ3JNmvAFXAR8
S7bd/Q3Sos1fKNIPjthXAk0SHYcs9GlJm3guJjlqsI9pCGQk0jNuaEPUTLI7nvaXMfCcSBdMqoqM604DbgNplpvk5MJNo8/8N4CcEHA0pM/+58fR/TdPWC/iMaOtoInBY6ntIytGQeFyfeL7VRAHUPvV9CXnPq/tLd8oSkSA6dCoiQRQWIhJEYSEiQRQWIhJEYSEiQRQWIhJEYSEiQRQWIhLk/wFCDOi3LXwE1QAAAABJRU5ErkJggg==\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "cmx = confusion_matrix(y_test, preds)\n", + "ax = sns.heatmap(cmx, square= True, annot= True, cbar= False)\n", + "ax.set_xlabel(\"Predicted Value\", fontsize = 15)\n", + "ax.set_ylabel(\"Actual Value\", fontsize = 15)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Logistic regression performed worse on the recall and f1 score which indicated that logistic regression is best used when we don't have imbalanced dataset. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Evaluation\n", + "Choosing a better model is taskful but data processing and analysis is more taskful. After carefully training the data using Decision Tree, XGboost and Logistic Regression, it was evident that XGBoost performed the best due to the fact that it is able to run several trees and developed on error from previous tree. Using the grid search from the scikit learn library to tune hyperparameters gives the best result and finally XGBoost model was selected. The dataset was biased i.e. it is unbalanced, therefore most model will overfit to the largest number of cases which in this case was 'Customer not churn (0)'. The best performing model was able to predict true positive of 98 and false negative of 47 leading to a low recall. \n", + "* Others:\n", + "* False positive: 15\n", + "* True negative: 8.4e02. \n", + "* The model can generally be improved if more positive case is provided in the dataset i.e. more data will improve the model." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Preprocessing\n", + "\n", + "- Are there any duplicated values?\n", + "- Do we need to do feature scaling?\n", + "- Do we need to generate new features?\n", + "- Split Train and Test dataset. 
(0.7/0.3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ML Application\n", + "\n", + "- Define models.\n", + "- Fit models.\n", + "- Evaluate models on both the train and test datasets.\n", + "- Generate the Confusion Matrix and the Accuracy, Recall, Precision and F1-Score values.\n", + "- Analyse whether overfitting or underfitting occurs. If either does, try to overcome it in a separate section." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Evaluation\n", + "\n", + "- Select the best-performing model and explain why you chose it.\n", + "- Analyse the results and comment on how the model could be improved." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/Project/churn.csv b/My Project/churn.csv similarity index 100% rename from Project/churn.csv rename to My Project/churn.csv diff --git a/Project/09-11-2020 ML Course Nigeria Project 'name'.ipynb b/Project/09-11-2020 ML Course Nigeria Project 'name'.ipynb deleted file mode 100644 index 1856d7e..0000000 --- a/Project/09-11-2020 ML Course Nigeria Project 'name'.ipynb +++ /dev/null @@ -1,331 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Project\n", - "\n", - "In this project, our aim is to build a model for predicting churn. Churn is the percentage of customers who stopped using your company's product or service during a certain time frame. 
Thus, in the given dataset, our label will be the `Churn` column.\n", - "\n", - "## Steps\n", - "- Read the `churn.csv` file and describe it.\n", - "- Make at least 4 different analyses in the Exploratory Data Analysis section.\n", - "- Pre-process the dataset to get it ready for the ML application. (Check for missing data and handle it; do we need to do scaling, feature extraction, etc.?)\n", - "- Define an appropriate evaluation metric for our case (classification).\n", - "- Train and evaluate Logistic Regression, Decision Trees and one other appropriate algorithm of your choice from the scikit-learn library.\n", - "- Is there any overfitting or underfitting? Interpret your results and, if there is a problem, try to overcome it in a new section.\n", - "- Create confusion matrices for each algorithm and display Accuracy, Recall, Precision and F1-Score values.\n", - "- Analyse and compare the results of the 3 algorithms.\n", - "- Select the best-performing model based on the evaluation metric you chose, using the test dataset.\n", - "\n", - "\n", - "Good luck :)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "

Your Name

" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Data" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "import seaborn as sns\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " Churn AccountWeeks ContractRenewal DataPlan DataUsage CustServCalls \\\n", - "0 0 128 1 1 2.7 1 \n", - "1 0 107 1 1 3.7 1 \n", - "2 0 137 1 0 0.0 0 \n", - "3 0 84 0 0 0.0 2 \n", - "4 0 75 0 0 0.0 3 \n", - "\n", - " DayMins DayCalls MonthlyCharge OverageFee RoamMins \n", - "0 265.1 110 89.0 9.87 10.0 \n", - "1 161.6 123 82.0 9.78 13.7 \n", - "2 243.4 114 52.0 6.06 12.2 \n", - "3 299.4 71 57.0 3.10 6.6 \n", - "4 166.7 113 41.0 7.42 10.1 " - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Read csv\n", - "data = pd.read_csv(\"churn.csv\")\n", - "data.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Describe our data for each feature and use .info() for get information about our dataset\n", - "# Analys missing values" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Exploratory Data Analysis" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYsAAAEGCAYAAACUzrmNAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAPvklEQVR4nO3df6zddX3H8edL6o9saixpQWw7S0xdVvcD9A7JWDKdkV/JUn9MA4tSGVn9AxZNzBL0j8E0JGZDjTrCUmMFjEqIyuxMI9ZOp25Te2saoFTCHTJ6bUev1oCbylZ874/zveHQ3ns/p5eee245z0fyzfl+39/P93vel1x48f15U1VIkrSQZ426AUnS8mdYSJKaDAtJUpNhIUlqMiwkSU0rRt3AMKxatarWr18/6jYk6ZSyZ8+eH1fV6rnWPSPDYv369UxOTo66DUk6pST5z/nWeRpKktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLU9Ix8gvtkeNVf3TbqFrQM7fm7K0bdgjQSHllIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqWloYZFkXZKvJ9mfZF+Sd3X165P8KMnebrq0b5v3JplKcn+Si/rqF3e1qSTXDqtnSdLcVgxx30eB91TV95O8ANiTZGe37iNVdWP/4CQbgcuAVwAvAb6W5OXd6puA1wPTwO4k26vqviH2LknqM7SwqKpDwKFu/mdJ9gNrFthkE3B7VT0O/DDJFHBet26qqh4ESHJ7N9awkKQlsiTXLJKsB84FvtuVrklyd5JtSVZ2tTXAgb7NprvafPVjv2NLkskkkzMzMyf5J5Ck8Tb0sEjyfOALwLur6jHgZuBlwDn0jjw+NDt0js1rgfpTC1Vbq2qiqiZWr159UnqXJPUM85oFSZ5NLyg+U1VfBKiqR/rWfwL4crc4Dazr23wtcLCbn68uSVoCw7wbKsAngf1V9eG++ll9w94I3NvNbwcuS/LcJGcDG4DvAbuBDUnOTvIcehfBtw+rb0nS8YZ5ZHEB8HbgniR7u9r7gMuTnEPvVNJDwDsBqmpfkjvoXbg+ClxdVU8AJLkGuAs4DdhWVfuG2Lck6RjDvBvq28x9vWHHAtvcANwwR33HQttJkobLJ7glSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUNLSySrEvy9ST7k+xL8q6ufnqSnUke6D5XdvUk+ViSqSR3J3ll3742d+MfSLJ5WD1LkuY2zCOLo8B7quq3gPOBq5NsBK4FdlXVBmBXtwxwCbChm7YAN0MvXIDrgFcD5wHXzQaMJGlpDC0squpQVX2/m/8ZsB9YA2wCbu2G3Qq8oZvfBNxWPd8BXpTkLOAiYGdVHamqnwI7gYuH1bck6XhLcs0iyXrgXOC7wJlVdQh6gQKc0Q1bAxzo22y6q81XP/Y7tiSZTDI5MzNzsn8ESRprQw+LJM8HvgC8u6oeW2joHLVaoP7UQtXWqpqoqonVq1cvrllJ0pyGGhZJnk0vKD5TVV/syo90p5foPg939WlgXd/ma4GDC9QlSUtkmHdDBfgksL+qPty3ajswe0fTZuBLffUruruizgce7U5T3QVcmGRld2H7wq4mSVoiK4a47wuAtwP3JNnb1d4HfBC4I8lVwMPAW7p1O4BLgSng58CVAFV1JMkHgN3duPdX1ZEh9i1JOsbQwqKqvs3c1xsAXjfH+AKunmdf24BtJ687SdKJ8AluSVKTYSFJajIsJElNhoU
kqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpaaCwSLJrkJok6ZlpxUIrkzwP+DVgVZKVQLpVLwReMuTeJEnLxIJhAbwTeDe9YNjDk2HxGHDTEPuSJC0jC4ZFVX0U+GiSv6yqjy9RT5KkZaZ1ZAFAVX08yR8A6/u3qarbhtSXJGkZGSgsknwaeBmwF3iiKxdgWEjSGBgoLIAJYGNV1TCbkSQtT4M+Z3Ev8OJhNiJJWr4GDYtVwH1J7kqyfXZaaIMk25IcTnJvX+36JD9KsrebLu1b994kU0nuT3JRX/3irjaV5NoT/QElSU/foKehrl/Evm8B/p7jr2t8pKpu7C8k2QhcBryC3m26X0vy8m71TcDrgWlgd5LtVXXfIvqRJC3SoHdD/cuJ7riqvplk/YDDNwG3V9XjwA+TTAHndeumqupBgCS3d2MNC0laQoO+7uNnSR7rpl8meSLJY4v8zmuS3N2dplrZ1dYAB/rGTHe1+epz9bglyWSSyZmZmUW2Jkmay0BhUVUvqKoXdtPzgDfTO8V0om6mdwvuOcAh4ENdPXOMrQXqc/W4taomqmpi9erVi2hNkjSfRb11tqr+EfjjRWz3SFU9UVW/Aj7Bk6eapoF1fUPXAgcXqEuSltCgD+W9qW/xWfSeuzjhZy6SnFVVh7rFN9K7JRdgO/DZJB+md4F7A/A9ekcWG5KcDfyI3kXwPzvR75UkPT2D3g31J33zR4GH6F1onleSzwGvoffG2mngOuA1Sc6hFzQP0XtRIVW1L8kd9C5cHwWurqonuv1cA9wFnAZsq6p9A/YsSTpJBr0b6soT3XFVXT5H+ZMLjL8BuGGO+g5gx4l+vyTp5Bn0bqi1Se7sHrJ7JMkXkqwddnOSpOVh0Avcn6J3XeEl9G5d/aeuJkkaA4OGxeqq+lRVHe2mWwDvT5WkMTFoWPw4yduSnNZNbwN+MszGJEnLx6Bh8efAW4H/ovcw3Z8CJ3zRW5J0ahr01tkPAJur6qcASU4HbqQXIpKkZ7hBjyx+dzYoAKrqCHDucFqSJC03g4bFs/pe+jd7ZDHoUYkk6RQ36H/wPwT8W5LP03v6+q3M8QCdJOmZadAnuG9LMknv5YEB3uQfIJKk8THwqaQuHAwISRpDi3pFuSRpvBgWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqWloYZFkW5LDSe7tq52eZGeSB7rPlV09ST6WZCrJ3Ule2bfN5m78A0k2D6tfSdL8hnlkcQtw8TG1a4FdVbUB2NUtA1wCbOimLcDN0AsX4Drg1cB5wHWzASNJWjpDC4uq+iZw5JjyJuDWbv5W4A199duq5zvAi5KcBVwE7KyqI1X1U2AnxweQJGnIlvqaxZlVdQig+zyjq68BDvSNm+5q89WPk2RLkskkkzMzMye9cUkaZ8vlAnfmqNUC9eOLVVuraqKqJlavXn1Sm5OkcbfUYfFId3qJ7vNwV58G1vWNWwscXKAuSVpCSx0W24HZO5o2A1/qq1/R3RV1PvBod5rqLuDCJCu7C9sXdjVJ0hJaMawdJ/kc8BpgVZJpenc1fRC4I8lVwMPAW7rhO4BLgSng58CVAFV1JMkHgN3duPdX1bEXzSVJQza0sKiqy+dZ9bo5xhZw9Tz72QZsO4mtSZJO0HK5wC1JWsYMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoaSVgkeSjJPUn2Jpnsaqcn2Znkge5zZVdPko8lmUpyd5JXjqJnSRpnozyyeG1VnVN
VE93ytcCuqtoA7OqWAS4BNnTTFuDmJe9UksbccjoNtQm4tZu/FXhDX/226vkO8KIkZ42iQUkaV6MKiwK+mmRPki1d7cyqOgTQfZ7R1dcAB/q2ne5qT5FkS5LJJJMzMzNDbF2Sxs+KEX3vBVV1MMkZwM4kP1hgbOao1XGFqq3AVoCJiYnj1kuSFm8kYVFVB7vPw0nuBM4DHklyVlUd6k4zHe6GTwPr+jZfCxxc0oalZebh9//OqFvQMvQbf33P0Pa95Kehkvx6khfMzgMXAvcC24HN3bDNwJe6+e3AFd1dUecDj86erpIkLY1RHFmcCdyZZPb7P1tVX0myG7gjyVXAw8BbuvE7gEuBKeDnwJVL37IkjbclD4uqehD4vTnqPwFeN0e9gKuXoDVJ0jyW062zkqRlyrCQJDUZFpKkJsNCktRkWEiSmgwLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFJKnJsJAkNRkWkqQmw0KS1GRYSJKaDAtJUpNhIUlqMiwkSU2GhSSpybCQJDUZFpKkJsNCktR0yoRFkouT3J9kKsm1o+5HksbJKREWSU4DbgIuATYClyfZONquJGl8nBJhAZwHTFXVg1X1v8DtwKYR9yRJY2PFqBsY0BrgQN/yNPDq/gFJtgBbusX/TnL/EvU2DlYBPx51E8tBbtw86hZ0PH8/Z12Xp7uHl8634lQJi7n+CdRTFqq2AluXpp3xkmSyqiZG3Yc0F38/l8apchpqGljXt7wWODiiXiRp7JwqYbEb2JDk7CTPAS4Dto+4J0kaG6fEaaiqOprkGuAu4DRgW1XtG3Fb48TTe1rO/P1cAqmq9ihJ0lg7VU5DSZJGyLCQJDUZFlqQr1nRcpRkW5LDSe4ddS/jwrDQvHzNipaxW4CLR93EODEstBBfs6Jlqaq+CRwZdR/jxLDQQuZ6zcqaEfUiaYQMCy2k+ZoVSePBsNBCfM2KJMCw0MJ8zYokwLDQAqrqKDD7mpX9wB2+ZkXLQZLPAf8O/GaS6SRXjbqnZzpf9yFJavLIQpLUZFhIkpoMC0lSk2EhSWoyLCRJTYaFtEhJXpzk9iT/keS+JDuSbEny5VH3Jp1shoW0CEkC3Al8o6peVlUbgfcBZz7N/Z4Sf+pY48ewkBbntcD/VdU/zBaqai/wLeD5ST6f5AdJPtMFC0keSrKqm59I8o1u/vokW5N8FbgtyTuSfDHJV5I8kORvl/ynk47h/8VIi/PbwJ551p0LvILee7T+FbgA+HZjf68C/rCqfpHkHcA53X4eB+5P8vGqOrDQDqRh8shCOvm+V1XTVfUrYC+wfoBttlfVL/qWd1XVo1X1S+A+4KVD6FMamGEhLc4+ekcDc3m8b/4JnjyCP8qT/84975ht/mfAfUgjYVhIi/PPwHOT/MVsIcnvA3+0wDYP8WTAvHl4rUknn2EhLUL13sD5RuD13a2z+4DrWfjvffwN8NEk36J3tCCdMnzrrCSpySMLSVKTYSFJajIsJElNhoUkqcmwkCQ1GRaSpCbDQpLU9P+BS+lzMeDnBAAAAABJRU5ErkJggg==\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "# Our label Distribution (countplot)\n" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3deXxV5Z348c8392bfyAZkIxtrAFGIgCho1SraWmwHW2htbavVaet0mZlXfzrzm06nv5e/GX8zU2c61XZcaK0bOm5FRbAtKrLvEHZCEkI2ErJC9tw8vz/uAWO8SW7Wc5fv+/XKKydny/fhhvM9z3Oe8zxijEEppVTwCbE7AKWUUvbQBKCUUkFKE4BSSgUpTQBKKRWkNAEopVSQctodwFAkJyeb7Oxsu8NQSim/sXfv3vPGmBRP2/wqAWRnZ7Nnzx67w1BKKb8hImf626ZNQEopFaQ0ASilVJDSBKCUUkFKE4BSSgUpTQBKKRWkNAEopVSQ0gSglFJBShOAUkoFKU0ASikVpPzqTWAVvF7cWdbvtq8umjKOkSgVOLyqAYjIchE5ISJFIvKQh+3hIvKytX2niGRb65NE5H0RuSgiv+pzzAIRKbSO+aWIyGgUSCmllHcGTQAi4gAeB24D8oHVIpLfZ7d7gQZjzFTgMeBRa3078A/A33o49a+B+4Fp1tfy4RRAKaXU8HhTA1gIFBljio0xncBaYEWffVYAz1rLrwI3iYgYY1qMMVtwJ4LLRCQViDPGbDfuSYl/D9w5koIopZQaGm8SQDpwttfP5dY6j/sYY7qBJiBpkHOWD3JOpZRSY8ibBOCpbd4MY59h7S8i94vIHhHZU1tbO8AplVJKDYU3vYDKgcxeP2cAlf3sUy4iTiAeqB/knBmDnBMAY8yTwJMABQUFAyUVpTzSHkRKeeZNDWA3ME1EckQkDFgFrOuzzzrgHmt5JbDJatv3yBhTBVwQkcVW759vAH8YcvRK9cPVYzhT18L7x2s4U9dCW6fL7pCU8jmD1gCMMd0i8iCwEXAAa4wxR0Tk58AeY8w64BngOREpwn3nv+rS8SJSCsQBYSJyJ3CLMeYo8F3gd0Ak8K71pdSQ9b7Db+no5oMTNewqrafL9cl7kKToMBbmJFKQlUhkmGO8w1TK58gAN+o+p6CgwOiUkMFpoGacS3YU17HxSDWd3T1cmTmBnORokmPC6eh2ca65g+PVFyitayHUISyblsKy6SmEOvqvBGvzkAoEIrLXGFPgaZu+Caz8njGGPx+vYdPxGqZNjOH2ualMiov4xD4zJsOy6SlUNbXxwYla/ny8hgNnG7nzqnTyUmJsilwpe2kCUH7NGMO7h6vZUnSeBVMS+OL8dEIGeKk8NT6S1QunUFBzgXUHKlmzpYRl01O4edYkHCHev4w+WI1Eaw/KH+hgcMqv7SqtZ0vReRbnJg168e9t2sRY/urGaRRkJ/DhyVqe+qiY5rauMY5WKd+iCUD5raqmNt45VMX0STF8/opUry/+l4Q5Q/jiVRmsujqT6qZ2nvigiPKG1jGKVinfowlA+aWObhcv7SojMszByg
WZQ77493ZFxgQeuD6XkBDhyc3FHCpvHMVIlfJdmgCUX9pwuJq6i518uSCTmPCRP8pKjY/kezdMJT0hkrW7z/LHo+fo6fGfHnJKDYcmAOV3Khvb2FVSz6LcpFHtwRMT7uTea3NYMCWB90/U8OBL+2jv0hfIVODSXkDKrxhjeOtQJZFhDj47a9Kon9/pCOFL89OZGBfO+sJqmtp2899fLxiVWoZSvkb/qpVfOVjeyJm6Vr54VfqYvc0rIiydlkJMuJPX9pWz/D82880l2USF6X8XFVi0CUj5jS5XDxsOV5OREMmCrIQx/31XTUnga4uyqGpq59ltpXR0a3OQCiyaAJTf2FlcR3N7N8vnTB5Rr5+hmJUax+qrM6lobOP5HWfodvWMy+9VajxoAlB+oaPbxQcna5k6MYbc5PEduiE/LZ4vzc/gdG0L/7O3HH8aP0upgWgCUH5h++k6WjtdY/Lg1xvzpyRwa/4kCiua2Ha6zpYYlBptmgCUz2tq62LzqVpmTo4lMzHKtjiWTU9hVmoc7x6uoqyuxbY4lBotmgCUz/v9tlLau3q42aa7/0tEhJXzM4iPDOWl3Wd1khnl9zQBKJ/W1unit9tKmTEplrQJkXaHQ2SYg9ULp9Dc1sV7R6vtDkepEdEEoHzay7vLqG/pZNn0FLtDuSwjIYpr8pLYVVLP2XodPE75L00Aymd1uXp46qMSCrISyEmOtjucT7h51iRiI5z84WAFPdorSPkpTQDKZ711sJKKxja+e0Oe3aF8SkSog9vnplLZ2M6uknq7w1FqWDQBKJ/U02P49QenmTEplhtnTrQ7HI/mpseTnRTFBydq6NIXxJQf0gSgfNKfj9dwquYi370hDxmnt36HSkS4ceYkmtu72Xumwe5wlBoyTQDK5xhjeOKDIjISIvn8Fal2hzOgvJRoshKj+PBkrQ4TofyOJgDlc3aW1LO/rJEHluXidPj2n6iIcOOsiTS1dbG3TGsByr/49v8uFZR+/cFpkmPCuKsg0+5QvDI1JYbMhEg+PFmLS2cRU35EE4DyKYcrmvjwZC3fujaHiNCxGe9/tIkIy6an0NjaxclzF+wORymvaQJQPuU3H54mJtzJ3Yuz7A5lSGZOjiMuwsmOYh0oTvkPTQDKZ5Seb2F9YRVfWzyF+MhQu8MZEkeIsDAnkVM1F6m72GF3OEp5RROA8hn/vbkYpyOEe6/NsTuUYSnITiRE3A+xlfIHmgCUT6hpbue1veWsXJDBxLgIu8MZlriIUGanxbP3TAPtXTpSqPJ9mgCUT3hmawndPT08sCzX7lBGZFFOIm1dLtYXVtkdilKD0gSgbNfU1sULO8r43BVpZCX51qBvQ5WTHE1CVChv7K+wOxSlBqUJQNnu+R1nuNjRzV9e7993/+DuEnpl5gS2Fp2nprnd7nCUGpBXCUBElovICREpEpGHPGwPF5GXre07RSS717aHrfUnROTWXut/LCJHROSwiLwkIv7Z8KtGpK3TxZotJdwwI4XZafF2hzMqrsxMoMfAuoOVdoei1IAGTQAi4gAeB24D8oHVIpLfZ7d7gQZjzFTgMeBR69h8YBUwG1gOPCEiDhFJB34AFBhj5gAOaz8VZP5n71nqWjr57vW+N+TzcKXEhjMvI57X92kzkPJtTi/2WQgUGWOKAURkLbACONprnxXAz6zlV4FfiXsIxxXAWmNMB1AiIkXW+cqs3x0pIl1AFKC3S0HgxZ1ll5ddPYbH/niSKYlRFNVcZFFuko2Rja7MxCjePlTFL/54ksl9ejV9ddEUm6JS6pO8aQJKB872+rncWudxH2NMN9AEJPV3rDGmAvg33ImgCmgyxrzn6ZeLyP0iskdE9tTW1noRrvIXR6uaaWjtYum0ZJ8d8nm4rsiYQIjAgbJGu0NRql/eJABP/zP7jnjV3z4e14tIAu7aQQ6QBkSLyN2efrkx5kljTIExpiAlxXfmhVUjt7XoPInRYcxKjbM7lFEXE+5k6sQYCisaMTplpP
JR3iSAcqD3sIwZfLq55vI+IuIE4oH6AY69GSgxxtQaY7qA14ElwymA8k9l9a2U1bdybV4SIQF293/J3PR4Glq7qGzU3kDKN3mTAHYD00QkR0TCcD+sXddnn3XAPdbySmCTcd/2rANWWb2EcoBpwC7cTT+LRSTKelZwE3Bs5MVR/mJr0XkiQkOYn5VgdyhjZlZqHCEChRVNdoeilEeDJgCrTf9BYCPui/QrxpgjIvJzEfmCtdszQJL1kPevgYesY48Ar+B+YLwB+L4xxmWM2Yn7YfE+oNCK48lRLZnyWQ0tnRyuaGJhdiLhTv8Y8nk4osKc5KXEcLiySZuBlE/yphcQxpj1wPo+637aa7kduKufYx8BHvGw/h+BfxxKsCowbC+uQwSuyUu2O5QxNyctnjcOVFDV1E7ahEi7w1HqE/RNYDWu2rtc7C6tZ256vN8N+Twc+WnuZqDDldoMpHyPJgA1rvaeaaCju4drpwb+3T9AdLiTnORoDldoM5DyPZoA1LjpdvWw7fR5spOiyEiIsjuccTMnPZ7zFzs5d0EnilG+xatnAEoNRe+3fXsrrGiiobWLz81NHeeI7DUrNY4/HKjkaGXzp94KVspOWgNQ42b7afeLXzMD8MWvgcRFhJKZEMmxqma7Q1HqEzQBqHFxrrmd0rpWFmYnBuyLXwPJT4unorGNxtZOu0NR6jJNAGpc7CqtxxEiAf3i10DyrVqP1gKUL9EEoMZcl6uH/WUNzE6LIyY8OB87pcSGkxITzlFNAMqHaAJQY66wvIn2rh4WZifaHYqt8tPiKDnfQlNrl92hKAVoAlDjYFdpPckx4eQk+/d8vyOVnxpHj4FNJ87ZHYpSgCYANcbONbdTVt/KwuyEgBvzf6jSEyKJjXDy3hFNAMo3aAJQY2p/WQMhAldOCc6Hv72FiDArNY4PT9bS3uWyOxylNAGosdNjDAfONjJ9UmzQPvztKz81jtZOF1uLztsdilKaANTYKaq5SHN7N/P17v+y3JRoYsO1GUj5Bk0AaszsK2sgMtTBzMmxdofiM5whIdwwcyJ/OnYOV48ODqfspfVyNSbau1wcrWxmQVYCTod39xn9jSEUaG7Jn8RbByvZV9bA1UHeNVbZS2sAakwcrmiiu8do848HN8xIIdQhvHek2u5QVJDTBKDGxIHyRpKiw8hI0Fmw+oqNCGVJXjLvHT2ncwQoW2kCUKOuub2LktoW5mVOCPq+//25ZfYkztS1cvLcRbtDUUFME4AadYcrmjDAFenxdofisz47axKANgMpW2kCUKPuUHkTqfERTNTJT/o1MS6Cq6ZM4L2j2h1U2UcTgBpVZ+tbKatv1bt/L9ySP5nCiiYqG9vsDkUFKU0AalS9U1gFwNyMCTZH4vtume1uBvqj1gKUTTQBqFH11sFKMhMiSYwOszsUn5eXEkNeSjTvHdXnAMoe+iKYGjVlda0cqWzmtjmT7Q7Fp/V+4S0jIYqPTtXyzEclRIY5+OqiKTZGpoKN1gDUqNlo9WiZnabt/966NEfA8WqdKUyNP00AatRsOFLN7LQ4bf4ZgktzBOhUkcoOmgDUqKhpbmfvmQaWz9bmn6G4NEfAyXMX6HL12B2OCjKaANSo2Gj1ZFmu7f9Dlp8aR5fLUFSjbwWr8aUJQI2K945Uk5sSzdSJMXaH4ndyU6IJd4ZoM5Aad5oA1Ig1tnay/XQdt86erGP/DIMzJIQZk2M5VtWscwSoceVVAhCR5SJyQkSKROQhD9vDReRla/tOEcnute1ha/0JEbm11/oJIvKqiBwXkWMics1oFEiNvz8fq6G7x2j7/whcmipy75kGu0NRQWTQBCAiDuBx4DYgH1gtIvl9drsXaDDGTAUeAx61js0HVgGzgeXAE9b5AP4T2GCMmQnMA46NvDjKDhuOVJMaH8EVGdr9c7imT4rFESKXu9IqNR68qQEsBIqMMcXGmE5gLbCizz4rgGet5VeBm8TdFrACWGuM6TDGlABFwEIRiQOWAc
8AGGM6jTGNIy+OGm8tHd1sPlmrzT8jFBHqYGpKDBsOV+scAWrceJMA0oGzvX4ut9Z53McY0w00AUkDHJsL1AK/FZH9IvK0iER7+uUicr+I7BGRPbW1tV6Eq8bThydr6eju0d4/o2BuRjwVjW3sK9N7ITU+vEkAnm7r+t6i9LdPf+udwHzg18aYq4AW4FPPFgCMMU8aYwqMMQUpKSlehKvG08Yj1SRFh+nctqMgPzWOMEcIbx+qtDsUFSS8SQDlQGavnzOAvn+hl/cREScQD9QPcGw5UG6M2WmtfxV3QlB+pKPbxaZjNdw8axKOEG3+GamIUAfXz0jhnUNV2htIjQtvEsBuYJqI5IhIGO6Huuv67LMOuMdaXglsMu6GzHXAKquXUA4wDdhljKkGzorIDOuYm4CjIyyLGmfbTtdxoaNbm39G0R3z0qi50MHu0nq7Q1FBYNDRQI0x3SLyILARcABrjDFHROTnwB5jzDrcD3OfE5Ei3Hf+q6xjj4jIK7gv7t3A940xLuvUfwW8YCWVYuBbo1w2NUYujWb5+r5ywp0hnK1v/cQIl2r4bpo5kYhQdzPQ4twku8NRAc6r4aCNMeuB9X3W/bTXcjtwVz/HPgI84mH9AaBgKMEq39FjDEermpkxORanQ98nHC3R4U5umjmJdwur+dkds/XfVo0p/etSw1Ja10Jrp0uHfh4Dd8xLpa6lk+3FdXaHogKcJgA1LEcrm3GGCNMn6dg/o+2GGROJDnPw9sEqu0NRAU4TgBoyYwxHKpuZNjGGcKdj8APUkESEOvhs/iQ2HKmms1uHiFZjRxOAGrKKxjaa2rq0+WcMff6KNJrauthadN7uUFQA0wSghuxIZTMhAjNTY+0OJWAtnZ5MXISTtw7qS2Fq7GgCUEPibv5pIjc5hqgwrzqRqWEIdzq4dfZk3jt6jvYu1+AHKDUMmgDUkJyqucj5i53kp8XZHUrA+/y8NC52dPPhSR0DS40NTQBqSNYXViHAbE0AY25JXhJJ0WG8ub/C7lBUgNIEoIZkfWEVWUnRxEaE2h1KwAt1hHDnVen86dg5Glo67Q5HBSBNAMprRTUXOHnuInPT9e5/vKxckEGXy7BOHwarMaAJQHltfWE1Imj3z3E0KzWO2WlxvLq33O5QVADSBKC8tr6wioKsBOIitflnPK1ckEFhRRPHq5vtDkUFGE0Ayiunay9yvPoCt81JtTuUoLPiynRCHcJrWgtQo0wTgPLKu4XucWlum6tj/4+3xOgwbpw5kTf2V9Ll0qEh1OjRBKC8sr6wmgVZCaTGR9odSlBauSCT8xc72KzvBKhRpK9yKo96T/BSd7GDo1XN3D43VSd+sckNM1JIig7j1b3l3DRrkt3hqAChCUAN6nBFEwBz9OWvMTdQgl1xZTrP7SiloaWThOiwcYxKBSptAlKDKqxsIjMhkglRetGxk74ToEab1gDUgOpbOqlsbOc2nfjddgfONpIaH8GTm4sJ7TNV5FcXTbEpKuXPtAagBvRx84++/OUL5k9JoKKxjeqmdrtDUQFAE4AaUGFFExkJkdrm7COuzJyAM0TYVarzBauR0wSg+lV7oYOKxjbmpuvdv6+IDncyNz2e/WWNdHTrPAFqZDQBqH4dONuIAPMyJtgdiuplUU4iHd09HDjbaHcoys9pAlAeGWM4cLaBvIkxOvaPj8lMjCI1PoKdxfUYY+wOR/kxTQDKo7L6Vhpau7gyU+/+fY2IsDgniermdsrqW+0OR/kxTQDKo/1nGwl1CLNT9eUvXzQvcwLhzhB2ltTbHYryY5oA1Kd0dvdQWN7ErNQ4wkMddoejPAhzhjB/SgKFFU1c7Oi2OxzlpzQBqE/ZdLyGti4XV2nzj09blJOIq8ew90yD3aEoP6UJQH3KK3vOEhfhZOrEWLtDUQOYGBdBTnI0u0rqcPXow2A1dJoA1CdUN7XzwYka5k9JwBEidoejBrE4N4mG1i4dJloNiyYA9Qmv7j1Lj4EFWQl2h6K8kJ
8aR2y4k+d2nLE7FOWHNAGoy3p6DC/vOcs1uUkkxYTbHY7ygiNEKMhO5P0TNZypa7E7HOVnvEoAIrJcRE6ISJGIPORhe7iIvGxt3yki2b22PWytPyEit/Y5ziEi+0Xk7ZEWRI3c9uI6zta3sWphpt2hqCFYlJtIaEgIa7aU2B2K8jODJgARcQCPA7cB+cBqEcnvs9u9QIMxZirwGPCodWw+sAqYDSwHnrDOd8kPgWMjLYQaHS/tKiMuwsmts3XoZ38SFxHKF65M45U95TS2dtodjvIj3tQAFgJFxphiY0wnsBZY0WefFcCz1vKrwE0iItb6tcaYDmNMCVBknQ8RyQA+Bzw98mKokappbmfD4WruKsgkQvv++517r8uhrcvFCzplpxoCbxJAOnC218/l1jqP+xhjuoEmIGmQY/8D+AnQM9AvF5H7RWSPiOyprdWeDmPlpV1n6e4x3L04y+5Q1DDMSo1j6bRknt1WSmf3gP+llLrMmwTgqS9g307H/e3jcb2IfB6oMcbsHeyXG2OeNMYUGGMKUlJSBo9WDVmXq4cXdp7h+ukp5CRH2x2OGqb7luZSc6FDp4xUXvMmAZQDvZ8KZgB9/8Iu7yMiTiAeqB/g2GuBL4hIKe4mpRtF5PlhxK9GwcYj1dRc6OCeJXr378+WTUtmxqRYnv6oWEcJVV7xJgHsBqaJSI6IhOF+qLuuzz7rgHus5ZXAJuP+C1wHrLJ6CeUA04BdxpiHjTEZxphs63ybjDF3j0J51DD8ftsZpiRGcf30iXaHokZARLh3aQ7Hqy+wpei83eEoPzBoArDa9B8ENuLusfOKMeaIiPxcRL5g7fYMkCQiRcBfAw9Zxx4BXgGOAhuA7xtjdBojH3K4ooldpfXcvXiKvvkbAFZcmUZKbDhPfaRdQtXgnN7sZIxZD6zvs+6nvZbbgbv6OfYR4JEBzv0B8IE3cajR9+TmYmLCnaxaOMXuUNQoCHc6uOeaLP7tvZOcqL7AjMk6npPqn74JHMTKG1p5p7CK1QsziYvQWb8CxdcWZRERGsLTHxXbHYrycZoAgtiaLaUI8K1rc+wORY2ihOgwvlyQyZsHKqhuarc7HOXDNAEEqabWLtbuLuOOeWmkTYi0Oxw1yr6zNJceg9YC1IC8egagAs/zO8/Q2ukifUIkL+rbowEnMzGKO65I5cVdZXz/M1NJiA6zOyTlgzQBBKGLHd089VExMybF6t1/gPCUxKckRdPa6eJvXz3IM/dcbUNUytdpE1AQ+v32Uhpbu7hxpvb7D2ST4yKYOTmW7afraO3UeYPVp2kCCDItHd08tbmY66enkJkYZXc4aoxdPz2F1k4XL+06O/jOKuhoAggyz+04Q0NrFz+8eZrdoahxkJUUTXZSNE9/VKyDxKlP0WcAAa5323Bndw//9edTTJsYw/GqCzZGpcbTDTNS+N22Ut48UMGXC3SyH/UxrQEEkZ0ldbR0urTtP8hMmxhDfmocv/nwNK4eHSROfUwTQJDo7O5h88lapqbEkJWkQz4HExHhuzfkUVzbwntHqu0OR/kQTQBBQu/+g9vtc1PJToriiQ9O61DR6jJNAEGgs7uHzafOk5cSTbZO+BKUHCHCA9fnUVjRxNaiOrvDUT5CHwIHgR3FdbR0dHPjTB3xM1i9uLOMblcPsRFOfrruMPddl3t521cX6d9FsNIaQIBr63Tx4clapk+K0ekeg5zTEcJ1U5Mprm3hbH2r3eEoH6AJIMBtPlVLW5eLW/In2x2K8gELsxOJCA3hw5O1doeifIAmgABW09zOttPnuSIjXsf8UQCEhzq4JjeJo1XNVDfrUNHBThNAAPvlplO4egyfnTXJ7lCUD7k2L5kwZwjvH6+xOxRlM00AAar0fAtrd53l6uxEkmLC7Q5H+ZCocCfX5CZxuKKJc1oLCGqaAALUL/54klBHCJ/Rfv/Kg+umJhPqCOH9E1oLCGaaAALQkcom1h2s5NvXZetcv8qj6HAni3OTKCxvoqhGx4
\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "# Example EDA\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Preprocessing\n", - "\n", - "- Are there any duplicated values?\n", - "- Do we need to do feature scaling?\n", - "- Do we need to generate new features?\n", - "- Split the data into train and test sets (0.7/0.3)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# ML Application\n", - "\n", - "- Define models.\n", - "- Fit models.\n", - "- Evaluate the models on both the train and test datasets.\n", - "- Generate a confusion matrix and report Accuracy, Recall, Precision and F1-Score.\n", - "- Check for overfitting and underfitting. If either occurs, try to address it in a separate section." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Evaluation\n", - "\n", - "- Select the best-performing model and explain why you chose it.\n", - "- Analyse the results and comment on how the model could be improved." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.3" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -}