- understand the target customers to plan a marketing strategy.
- identify the most important shopping groups based on annual income, age, gender, and their shopping score.
- identify the ideal number of groups using the elbow method, then label them.
- EDA (Exploratory Data Analysis)
- Utilize the KMeans clustering algorithm
- use summary statistics to describe the created clusters
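
The elbow method mentioned above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic stand-in data (three made-up blobs of income and spending score), and the range of k values is an arbitrary choice.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in data (NOT the real customers): three loose blobs of
# (annual income in $k, spending score) points.
rng = np.random.default_rng(0)
centers = np.array([[25.0, 20.0], [55.0, 50.0], [90.0, 80.0]])
X = np.vstack([c + rng.normal(scale=4.0, size=(50, 2)) for c in centers])

# Fit KMeans for k = 1..8 and record the inertia
# (within-cluster sum of squared distances).
k_values = list(range(1, 9))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)

# The "elbow" is the k after which inertia stops dropping sharply.
for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

On data like this, the inertia should fall steeply up to the true number of blobs and flatten afterwards; on real data the elbow is often less clear-cut and needs a judgment call.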
- Distribution of Age:
- The distribution is roughly bell-shaped, with a peak for customers aged around 30-50 years.
- There are also some slight peaks around 20 and 40 years, indicating other age clusters.
- Distribution of Annual Income (in $k):
- The income is right-skewed: a few people have much higher annual incomes than the rest.
- Distribution of spending score:
- The distribution is relatively uniform, with most customers showing moderate spending habits.
I recommend: marketing strategies could target the 30-35 age group, as they are the most common customers. High-income individuals ($80k+) might also be valuable for premium services.
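
As a rough numeric check on the distribution shapes described above, one could compute the skewness of income and count customers per age band. The sketch below uses hypothetical toy values and assumed column names (`Age`, `Annual Income (k$)`), not the real dataset.

```python
import pandas as pd

# Hypothetical toy values for illustration only (not the real dataset).
df = pd.DataFrame({
    "Age": [19, 21, 23, 31, 32, 33, 35, 35, 40, 47, 52, 60],
    "Annual Income (k$)": [15, 18, 20, 40, 45, 54, 60, 62, 70, 78, 88, 137],
})

# Positive skewness indicates a longer right tail (a few high earners).
income_skew = df["Annual Income (k$)"].skew()

# Count customers per 10-year age band to see where the peak sits.
age_bands = pd.cut(df["Age"], bins=[10, 20, 30, 40, 50, 60, 70])
band_counts = age_bands.value_counts().sort_index()

print(f"income skewness: {income_skew:.2f}")
print(band_counts)
```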
- We can clearly see that 'Females' are more frequent than 'Males'.
- There is also an outlier in the 'Male' group, whose distribution has a thick right tail and is skewed to the right.
- To investigate this outlier:
Interpretation:
- The median income for females appears slightly lower than that for males.
- There’s one outlier among males, indicating a significantly higher income.
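
One common way to investigate such an outlier is the 1.5 × IQR rule that box plots use. The sketch below assumes that rule, assumed column names (`Gender`, `Annual Income (k$)`), and hypothetical toy values; the helper `iqr_outliers` is introduced here for illustration only.

```python
import pandas as pd

# Hypothetical toy values for illustration only (not the real dataset).
df = pd.DataFrame({
    "Gender": ["Male"] * 6 + ["Female"] * 6,
    "Annual Income (k$)": [40, 55, 62, 70, 78, 137, 38, 50, 58, 63, 72, 80],
})


def iqr_outliers(values: pd.Series) -> pd.Series:
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)
    return values[mask]


# Median income per gender, as compared in the box plot.
medians = df.groupby("Gender")["Annual Income (k$)"].median()

# Flag outliers within the male group only.
male_income = df.loc[df["Gender"] == "Male", "Annual Income (k$)"]
male_outliers = iqr_outliers(male_income)

print(medians)
print(male_outliers)
```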
I) Understanding customer behavior based on demographic factors such as age and gender, in relation to income and spending habits.
- In 'Annual Income vs. Spending Score':
- There may be a pattern relating annual income to spending score.
- In 'Age vs. Spending Score':
- No strong correlation was found.
- Younger individuals may have slightly higher spending scores.
- In 'Age vs. Annual Income':
- No correlation was found.
- Income varies across all age groups.
- In 'Age vs. Spending Score':
- There is a moderate negative correlation (-0.33) between age and spending score.
- Younger individuals tend to have higher spending scores.
- There's no clear correlation between 'Annual Income' and 'Spending Score'
- There's almost no correlation between 'Age' and 'Annual Income'
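
Correlation coefficients like the ones above are typically read off a pairwise Pearson correlation matrix. A minimal sketch, assuming pandas, assumed column names, and hypothetical toy values (so the numbers it prints will not match the -0.33 reported here):

```python
import pandas as pd

# Hypothetical toy values for illustration only (not the real dataset).
df = pd.DataFrame({
    "Age": [19, 23, 31, 35, 40, 47, 52, 60],
    "Annual Income (k$)": [15, 70, 40, 88, 20, 60, 30, 54],
    "Spending Score (1-100)": [80, 75, 60, 70, 40, 55, 35, 45],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

# A single coefficient can be looked up by row and column label.
age_vs_spend = corr.loc["Age", "Spending Score (1-100)"]

print(corr.round(2))
print(f"Age vs. Spending Score: {age_vs_spend:.2f}")
```

A negative coefficient means spending score tends to fall as age rises; how strong the relationship is depends on how close the value is to -1.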
- The target cluster would be the green cluster (no. 2), which has a high spending score and high annual income.
- Approximately 54% of customers in cluster 2 are female, so we should look for ways to attract more of them with a marketing campaign targeting popular items in this cluster.
- Cluster 0 has sales potential, so we should run an event around the popular items its customers purchase.
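
Cluster profiles like these can be derived by grouping on the cluster label. The sketch below assumes a `Cluster` column already holds the KMeans labels and uses hypothetical toy values, so its percentages are illustrative and do not reproduce the 54% figure above.

```python
import pandas as pd

# Hypothetical toy values; "Cluster" stands in for the fitted KMeans labels.
df = pd.DataFrame({
    "Cluster": [0, 0, 0, 1, 1, 2, 2, 2, 2],
    "Gender": ["Female", "Male", "Male", "Female", "Male",
               "Female", "Female", "Male", "Female"],
    "Annual Income (k$)": [25, 28, 30, 55, 60, 85, 90, 95, 100],
    "Spending Score (1-100)": [75, 80, 70, 50, 45, 85, 90, 80, 88],
})

# Mean income and spending score per cluster.
profile = df.groupby("Cluster")[
    ["Annual Income (k$)", "Spending Score (1-100)"]
].mean()

# Percentage of each gender within each cluster (rows sum to 100).
gender_share = pd.crosstab(df["Cluster"], df["Gender"], normalize="index") * 100

print(profile)
print(gender_share.round(1))
```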