Skip to content

Store marketing campaign data is used to create an excel dashboard and to use insights from it to identify the age group that maximizes the sales revenue.

Notifications You must be signed in to change notification settings

anil023/Sales-and-Customer-Analysis-for-a-Store

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Sales and Customer Analysis for a Store

Define the Question

Data from a superstore(iFood) marketing campaign is accessible on Kaggle (link). This repository documents the data analysis conducted on this available dataset using Microsoft Excel. The primary objectives of this analysis include identifying customer groups for future customer acquisition campaigns that could help maximize sales.

The dataset served as the cornerstone for performing data analysis and generating a variety of insights using visualizations in Microsoft Excel. Additionally, it served as the source of inspiration for creating an Excel-based dashboard that captures the store, customer and sales demographics.

image

Collecting Details:

  • dictionary.png : describes the variables/column names present in the raw data file dictionary.png
  • ifood_df.csv : raw data file exported from Kaggle
  • Sales and Customer Analysis Dashboard.xlsx : excel encompassing all the data analysis and visualizations
    • sheet "rawdata" is the filtered and cleaned datset
    • sheet "tables" contains the tables used to link the sales products and channels
    • sheet "Data Analysis" contains tables used in visualizations etc
    • sheet "Dashboard" contains the excel based dashboard for the data
    • sheet "Version Control" contains the general instructins for the dashboard

Data Cleaning

  • The data stored as csv is imported using Power Query and cleaned using Power Query commands for correct data types.
  • The final dataset comprises 2,205 row items. I did not remove any potential duplicates flagged by Excel since there are no distinguishing columns, such as customerID, to confirm they are duplicates. It is possible for distinct customers to exhibit similar purchase behaviors.
  • The Age numerical variable is converted into a categorical variable Age Bracket. Similarly we also built Income Bracket, Last Purchase, Website Visits and Customer Loyalty variables from their corresponding numerical variables utilizing logical reasons to split the data that would make sense to read and analyze the data.
  • The Education Status is a categorical variable that combines corresponding combination of columns from the raw dataset. Similarly we also created Marital Status.
  • Remaining variables are renamed to ease the understanding of the dataset.

Data Analysis

Age-Based Sales of Products

Understand the Section: "Age-Based Sales of Products"

When using the "Age-Based Sales of Products" section of the dashboard we need two inputs:

  • Total Sales/Average Sales
  • Product(select only one product at a time)
  • Examining the column plot for the total sales of Fish products, it becomes clear that the age segment with the highest sales (indicated in red) of fish products is 40-49, whereas the age group 20-29 records the lowest sales (in blue). The sales of fish products generated revenue of approximately $83.3k, accounting for roughly 6.2 percent of the total sales revenue from all the products in the store. The top three age segments that contributed the most to the sales of fish products are 40-49, 60-69, and 50-59. Observing the column plot for the average sales of Fish products, it is worth noting that the age segment 20-29 spent the most on fish products on average, whereas the age group 40-49 spent the least. The top three age groups with the highest average spending on fish products are 20-29, 70 and Older, and 60-69.

    Based on the preliminary analysis of the section 'Age-Based Sales of Products', several key insights can be derived as follows:

    From Total Sales:

    • Wine, Meat, and Gold products sales revenue collectively constituted approximately 85% of the store's total sales revenue, with wine sales accounting for about half of the total sales revenue from the store.
    • The age groups encompassing customers aged 40-69 collectively accounted for approximately 75% of the store's total sales revenue, with the 40-49 age group emerging as the major customer segment across all product categories, except for Wine.
    • Remarkably, the age group 20-29 consistently had the lowest sales revenue across all product categories.

    From Average Sales:

    • The customers within age group 20-29 and who are aged 60 and older, tend to spend the most across various product categories at the store.
    • Conversely, the 40-49 age group had the least average spending for most of the product categories, except for Wine, where the 30-39 age group spent the least.
    • In the case of Gold products, the age group 50-59 replaces the 70 and Older group among the top three customer segments with maximum average sales.
    • Comparing age groups, it's evident that customers in the age group 20-29 spent the most, about 55% more than the aggregate age group 40-69, and approximately 85% more than the age group 40-49.
    • Notably, the 70 and Older and 60-69 age groups consistently emerge as the consecutive next-best age groups with the highest average spending across product categories.

    Considering the insights from the Total Sales we woud have recommended to target the age group 40-69, but it appears that the age group 20-29 shows promising potential for future sales growth due to their high average spending. Additionally, the 70 and Older and 60-69 age groups also offer strong opportunities for targeting, as they consistently exhibit above-average spending habits.

    AGE BRACKETS TOTAL SALES(in $1000) AVERAGE SALES(IN $)
    FISH FRUITS GOLD MEAT SWEET WINE AGGREGATE TOTALS by age FISH FRUITS GOLD MEAT SWEET WINE AGGREGATE AVERAGES by age
    20-29 2.7 1.7 2.9 13.9 1.8 16.9 39.9 63 40 67 324 42 393 927
    30-39 12.7 9.1 13.3 55.2 8.9 79.6 178.7 38 27 40 164 26 237 532
    40-49 22.3 15.7 26.6 94 17.3 171.4 347.4 32 23 38 136 25 248 503
    50-59 19.1 14.4 24.8 85.3 13.7 178.4 335.7 36 27 47 162 26 338 636
    60-69 20.1 13 23.9 86.3 13.8 168.9 326 42 28 51 182 29 357 689
    70 and Older 6.4 4.2 5.8 29.8 4.4 59.8 110.3 48 31 43 222 33 446 823
    AGGREGATE TOTALS by products 83.3 58.2 97.1 364.5 59.8 675.1 AGGREGATE AVERAGE by products 38 26 44 165 27 306

    Age-Based Sales Channel Utilization

    Understand the Section: "Age-Based Sales Channel Utilization"

    When using the "Age-Based Sales Channel Utilization" section of the dashboard we need two inputs:
  • Total Sales/Average Sales
  • Channel(select only one channel at a time)
  • Examining the column plot for the total sales through Catalog channel, we observe that the age segment of 60-69 emerges as the most frequent users (indicated in red) of catalogs. In contrast, the age group 20-29 displays the least inclination (in blue) towards utilizing catalogs. One particularly striking observation is the remarkable disparity in usage between the 60-69 age segment and the 20-29 group, with the former utilizing Catalogs approximately 725% more frequently. When examining the average sales through catalogs, it's worth noting that the most enthusiastic age group to use catalogs is 20-29, while the age group 30-39 used them the least. The disparity in engagement is approximately 102% of group 30-39.

    The analysis of sales channel utilization, based on age groups, unveils intriguing insights into customer behavior as below:

    • It's worth noting that the majority of age groups show a preference for the in-store channel, followed by the web, catalogs, and deals channels, in that sequence. However, there are exceptions. The age group 20-29 appears to prioritize catalogs over the web, while the age group 40-49 leans towards deals over catalogs.
    • A substantial portion of the total number of sales, approximately 67%, is attributed to sales through physical stores and the web, with brick-and-mortar stores contributing to around 39% of the total sales.
    • Considering the overall utilization of these sales channels, age groups ranging from 40 to 69 demonstrate significant engagement, utilizing these mediums approximately 78% of the time. Among them, the age group 40-49 emerges as the most versatile in channel utilization.
    • Interestingly, the age group 20-29 consistently exhibits the lowest utilization across all sales mediums.
    • On comparing the average number of sales, most age groups display relatively similar usage patterns across each sales medium. Nevertheless, there are notable exceptions.
    • Age groups with customers aged 60 and older and 20-29 tend to have the highest usage across multiple mediums, indicating their receptiveness to a variety of sales channels.
    • Conversely, the age group 30-39 emerges as the lowest consumer of each sales medium, potentially reflecting a need for tailored marketing approaches to enhance their engagement.
    • Of particular significance is the age group 70 and older, which not only utilized approximately 38% more channels than the age group 30-39 but also demonstrated around 15% higher usage than the combined 40-69 age group.

    In light of this analysis, the age group 70 and older clearly emerges as the most engaged and responsive to various sales channels, particularly the in-store and web channels. Therefore, for future sales growth, it is recommended to focus on strategies that further enhance the engagement of this age group through in-store and web channels. Additionally, a targeted approach to re-engage the age groups 60-69 and 20-29, with a particular emphasis on catalogs for the 20-29 group and deals for the 60-69 age group, may also prove beneficial.

    AGE BRACKETS TOTAL:CATALOG TOTAL:DEALS TOTAL:STORE TOTAL:WEB AGGREGATE TOTAL BY AGE AVERAGE:CATALOG AVERAGE:DEALS AVERAGE:STORE AVERAGE:WEB AGGREGATE AVERAGE BY AGE
    20-29 184 66 267 157 674 4.3 1.5 6.2 3.7 15.7
    30-39 711 588 1743 1131 4173 2.1 1.8 5.2 3.4 12.5
    40-49 1510 1717 3740 2661 9628 2.2 2.5 5.4 3.9 14
    50-59 1414 1321 3171 2324 8230 2.7 2.5 6.0 4.4 15.6
    60-69 1518 1161 3011 2110 7800 3.2 2.5 6.4 4.5 16.6
    70 and Older 496 259 909 659 2323 3.7 1.9 6.8 4.9 17.3
    AGGREGATE TOTAL by channels 5833 5112 12841 9042 AGGREGATE AVERAGE by channels 18.2 12.7 36 24.8

    STORE KPI's

    • AGE:
    Store KPI: Age of Customers

    We can observe a visually normal distribution of customers among the age segments, with a notable number of customers in each category. It's worth noting that the age group 20-29 is represented by 43 records, while the age group 70 and older is represented by 134 records; these are relatively smaller sample sizes, and hence their rankings might be less reliable due to the potential for greater variability in the estimates.

    To gain a deeper understanding of whether the mean values differ significantly, particularly for the age groups with smaller sample sizes, it would be beneficial to employ ANOVA tests. Additionally, I would recommend the inclusion of more customers from these age groups in the study to enhance our confidence in comparing the averages of different age brackets. For this study we will accept the sample size as sufficient enough to instill confidence on the mean value comparisons performed above.

    • CUSTOMER COMPLAIN:
    Store KPI: Customer Complain

    A very low percentage, less than 2% of customers in each age segment, have raised complaints. However, it is noteworthy that there is an unusually low number of complaints in the 50-59 age group, with only 1 complaint out of 527 customers. Conversely, the 60-69 age group has a higher number of complaints, with 7 out of 466 customers. This distribution is unusual as it deviates from a normal distribution curve and begs the question if there was a reason for this high/low numbers.

    • NUMBER OF WEBSITE VISITS:
    Store KPI: Number of Website Visits

    In total, about 56% of customers visit the website 5-10 times a month, while approximately 42% use it less than 5 times a month. We don't observe any huge discripancies in the distribution of the website users within different age groups but there is definitely more opportunities to increase the website usage more than 10 times a month through different campaigns.

    • CUSTOMER LOYALTY PERIOD:
    Store KPI: Customer Loyalty Period

    Approximately 53% of customers have shown remarkable loyalty by staying with us for about 6-7 years. But we should also note that only a small percentage, around 3%, have been customers for a period ranging from 4-6 years. This data underscores the presence of a highly loyal customer base overall. Yet, there remains an opportunity for growth through targeted customer acquisition campaigns, particularly among age brackets such as 20-29, which can contribute to sales growth.

    • LAST PURCHASE:
    Store KPI: Last Purchase

    A significant portion, about 67% of customers, haven't made a purchase from the store in more than 1 month, with a substantial segment (37%) indicating that their last purchase was more than 2 months ago. This trend suggests a potential opportunity for re-engagement and marketing strategies to bring back these customers through various sales channels.

    CUSTOMER DEMOGRAPHICS

    • INCOME:
      Customer Demographics: Income

    We observe a substantial increase in the percentage of low-income customers within the 30-39 age group compared to adjacent age groups. This is followed by a gradual decrease in the proportion of low-income customers as we move towards the "70 and Older" age group. This variation may contribute to the lower average sales observed within the 30-39 age group. In contrast, both mid-income and high-income customers demonstrate an opposing trend, transitioning from the 30-39 age group to the "70 and Older" group.

    The 20-29 age group stands out with the highest proportion of high-income customers, which could explain the higher per-customer sales within this age group. Notably, the income distribution among the age groups can be summarized as 36% low income, 28% mid-income, and 36% high income.

    • MARITAL STATUS:
      Customer Demographics: Marital Status

    The distribution of marital status across age groups reveals varying percentages. Majorly, about 35% of our customers are married, 27% are single, and 26% are in relationships or living together. We can visually confirm that the proportion of married customers increases within the 20-49 age groups, then remains relatively stable at around 35% until the 70 and older age group.

    Similarly, the 20-29 age group had the highest percentage of single customers, approximately 60%, which gradually declined as we progressed through the age brackets. The percentage of customers in relationships or living together has remained relatively constant, with a slight increase noted in the 60-69 age group. The number of customers who are widowed or divorced increases as we progress through the age groups, with a notable decrease in divorced customers after the age of 60.

    It's important to note that the current analysis does not establish a clear correlation between marital status and sales. A more in-depth analysis is required to confirm any hypotheses regarding this relationship.

    • EDUCATION:
      Customer Demographics: Education

    A significant 87% of customers hold a bachelor's degree or higher, with 49% of them having a bachelor's degree. The remaining customers have attained either master's or PhD degrees. As age progresses, we observe a declining percentage of customers with "2n Cycle," "Basic," and "Graduation" degrees, with the exception of a slight increase in both "2n Cycle" and "Graduation" degree holders within the 60-69 age group. Notably, customers with "Masters" and "PhD" degrees show a consistent upward trend across age brackets, although there is a minor dip in the proportion of "PhD" holders in the 60-69 age group, followed by a significant surge in the "70 and Older" age group.

    We don't observe any correlation from our analysis between the education status and average sales at the store and would need further analysis to prove any relationship.

    • KIDS:
      Customer Demographics: Kids

    Approximately 66% of customers are without kids, while 33% have one kid. Interestingly, there is a conspicuous absence of customers with two kids in the 20-29 age group, and this pattern of a very low proportion, roughly 2%, holds true among other age groups as well. In the case of customers with one kid, there is a substantial surge between the 20-29 and 30-39 age groups, followed by a gradual decrease as we progress towards the "70 and Older" customer category.

    It's worth noting that the percentage of customers with no children is remarkably similar in the "20-29" and "70 and Older" age groups, prompting the hypothesis that the number of children in a family may impact sales. This observation gains significance, especially considering that 50% of the products sold are wine, followed by meat and gold products, none of which cater to children.

    • TEENS:
      Customer Demographics: Teens

    Around 62% of customers do not have teenagers in their households, while 36% have one teenager. It makes sense when the "20-29" age group is characterized by the absence of any teenagers. Similarly, customers aged 20-39 do not typically have two teenagers in their families. However, this changes gradually starting from the "30-39" age group, and the percentage of customers with two teens remains relatively constant at 4% for customers older than 50. Proportion of customers with one teenager experience an increasing trend as we progress through the age brackets, with a significant 140% surge occurring in the "40-49" age group, followed by a 38% decline in the "70 and Older" age group.

    We don't notice any observable correlation between count of teenagers and sales provided by the customers.

    SUMMARY

    To recommend an age group for future sales growth based on the above analyses, we should consider multiple factors such as average spending, sample size, and the distribution of income, number of kids within each age group. Additionally, the analysis should align with the company's strategic goals. Here's my recommendation:

    Age Group 20-29:

    Statistical Reasons:

    1. Average Spending: The age group 20-29 exhibits the highest sum of averages for spending across various product categories, indicating the potential for higher sales within this age group.
    2. Income Diversity: This age group shows a diverse distribution of income levels, with a significant percentage of high-income customers. This diversity can be leveraged for sales growth strategies catering to different income segments.
    3. No Teens and Few Kids: With a lower percentage of customers with kids or teens, marketing efforts might be more focused on individual or couple-based purchases.
    4. Sample Size: While the sample size for this age group is relatively small, the potential for sales growth based on average spending and income diversity makes it a compelling target.
    5. Education: The majority of customers in this age group have at least a bachelor's degree, indicating a level of education that may make them receptive to targeted marketing and higher-end products.

    It's important to note that the choice of an age group should align with the company's products, services, and marketing strategies. The age group recommendation is based on statistical insights, but other factors such as product appeal, competition, and market trends should also be considered in the decision-making process.


    THANKS
    Anil Raju
    LinkedIn

    About

    Store marketing campaign data is used to create an excel dashboard and to use insights from it to identify the age group that maximizes the sales revenue.

    Topics

    Resources

    Stars

    Watchers

    Forks

    Releases

    No releases published

    Packages

    No packages published