dnsoveral/MidCamp_Project

Identification of Theme:

This project is based on a dataset found on Kaggle, named "French bakery daily sales".

WhoAmI?

Someone from Food Retail who has worked in Operations and Commercial departments and loves it. So I decided on a food retail dataset, with data to work on as if it were a basic daily analysis.

How to Begin:

After cleaning the dataset, we first analyze total revenue per ticket, as tickets_total. From that, we try to find a statistical measure good enough to analyze the rest of the dataset. Decision: use the median, as it is a measure of middle values that is not affected by extreme values.
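To illustrate why the median was preferred, here is a minimal sketch using only Python's standard library. The values in `tickets_total` are made up for the example (they are not taken from the Kaggle file); the last one stands in for an unusually large order.

```python
from statistics import mean, median

# Hypothetical ticket totals in euros (illustrative values, not real data).
# The last ticket is an outlier, such as a large catering order.
tickets_total = [2.4, 3.1, 3.6, 4.0, 4.5, 5.2, 120.0]

ticket_mean = mean(tickets_total)
ticket_median = median(tickets_total)

# The single large ticket pulls the mean far above a typical purchase,
# while the median stays at the middle ticket.
print(f"mean:   {ticket_mean:.2f}")
print(f"median: {ticket_median:.2f}")
```

With one extreme ticket in seven, the mean lands well above every ordinary ticket, while the median still describes a typical purchase.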

Three Questions:

What are the Top and Bottom 5 Products sold, by Quantity and by Revenue? What are the Top and Bottom 3 Hours of Movement? What is the difference between Weekdays and Weekends in Quantity sold and Revenue generated? Note: always in median values.
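The first question can be sketched as follows, assuming each sales row carries an article name, a quantity, and a unit price. The rows, the helper `ranked`, and the use of plain totals are assumptions made for this example; to follow the project's "always in median values" rule, the sums would be replaced by medians over the chosen period.

```python
from collections import defaultdict

# Hypothetical rows: (ticket_number, article, quantity, unit_price).
rows = [
    (1, "traditional_baguette", 4, 1.20),
    (1, "croissant", 3, 1.10),
    (2, "formule_sandwich", 1, 6.50),
    (2, "traditional_baguette", 3, 1.20),
    (3, "croissant", 2, 1.10),
    (3, "pain_au_chocolat", 1, 1.20),
    (4, "formule_sandwich", 1, 6.50),
    (4, "traditional_baguette", 5, 1.20),
]

quantity = defaultdict(int)
revenue = defaultdict(float)
for _, article, qty, price in rows:
    quantity[article] += qty
    revenue[article] += qty * price


def ranked(totals, n, top=True):
    """Return the n articles with the highest (or lowest) totals."""
    return sorted(totals, key=totals.get, reverse=top)[:n]


top_quantity = ranked(quantity, 2)
top_revenue = ranked(revenue, 2)
bottom_quantity = ranked(quantity, 1, top=False)

print("top by quantity:", top_quantity)
print("top by revenue: ", top_revenue)
print("bottom by quantity:", bottom_quantity)
```

Ranking quantity and revenue separately is what exposes items that sell few units but still generate a large share of revenue.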

Market Basket Analysis:

Is there any relationship between the sales of certain products? If a client buys A, is it almost certain that they will also buy B? Which products? Could we use this information to plan marketing actions, promotions, product bundles, or even lower prices, and so increase the sales of certain products and, therefore, sales as a whole?

Answers:

Three Questions:

The best-selling item by far is the traditional baguette. After that, the revenue and quantity rankings differ: the second-highest item by revenue is formule_sandwich, while the second-highest by quantity is croissant. In quantity, formule_sandwich isn't even in the top 5, which indicates that its price is much higher than the prices of the 2nd to 5th best sellers by quantity. As for the Bottom 5, they are the same in quantity and in revenue. Since each of them sold just one unit, they are probably checkout errors.

In the Top 3 hours by revenue and quantity, we can already see that the medium_ticket (the average value per ticket) is probably higher at 12:00 than at 10:00: quantity at 12:00 is lower than at 10:00, yet revenue at 12:00 almost equals that of 11:00, whose quantity is well above the other two hours. The Bottom 3 hours are the same for revenue and for quantity. Compared with the top 3 hours, they are hours with very little business, which makes them suitable for cleaning and for production ahead of the busier hours.

The median quantity per hour is usually the same on weekdays and on weekends, with the exception of the first hour on weekends, when twice the quantity is sold. The real difference is in the revenue generated: apart from that first hour, weekends sell the same quantities, but the revenue generated per hour is higher, indicating higher tickets and higher prices per article on weekends.
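A weekday vs weekend comparison of this kind can be sketched as below. The ticket rows are invented for the example, and the approach (summing each day's hour first, then taking the median across days) is an assumption about how the hourly medians were built, not a description of the project's actual code.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical tickets: (timestamp, items_sold, ticket_total).
tickets = [
    ("2021-01-04 08:15", 2, 2.40),   # Monday
    ("2021-01-04 08:40", 4, 4.60),
    ("2021-01-05 08:05", 3, 3.50),   # Tuesday
    ("2021-01-09 08:10", 6, 9.00),   # Saturday
    ("2021-01-09 08:30", 4, 7.50),
    ("2021-01-10 08:20", 8, 12.00),  # Sunday
]

# Step 1: total quantity and revenue for each (date, hour).
hourly = defaultdict(lambda: [0, 0.0])
for stamp, qty, total in tickets:
    ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
    hourly[(ts.date(), ts.hour)][0] += qty
    hourly[(ts.date(), ts.hour)][1] += total

# Step 2: collect the hourly totals by day type (Saturday/Sunday = weekend).
groups = defaultdict(lambda: {"quantity": [], "revenue": []})
for (day, hour), (qty, total) in hourly.items():
    day_type = "weekend" if day.weekday() >= 5 else "weekday"
    groups[(day_type, hour)]["quantity"].append(qty)
    groups[(day_type, hour)]["revenue"].append(total)

# Step 3: median across days for each (day type, hour).
medians = {
    key: (median(v["quantity"]), median(v["revenue"]))
    for key, v in groups.items()
}

for (day_type, hour), (qty, rev) in sorted(medians.items()):
    print(f"{day_type} {hour:02d}:00  quantity={qty}  revenue={rev:.2f}")
```

Keeping quantity and revenue side by side for each hour makes it easy to spot hours where the quantities match but the revenue does not.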

Market Basket Analysis:

Here we can see that there are really interesting relationships between products, sometimes even between groups of them. The higher the lift, the stronger and more interesting the relationship. So, if we are interested in increasing sales, we could create promotions, bundles, or even lower prices, and eventually drive a rise in sales of products like 'croissant' and 'baguette, pain_au_chocolat', or 'croissant' and 'traditional_baguette, pain_au_chocolat', or 'pain_au_chocolat' and 'croissant'. These sets of values from the 'articles' column are associated so strongly in this bakery's sales that we can be almost sure they will be bought together.

Conclusions:

Between the Average (Mean), a value used almost as frequently as drinking water, and the Median, a value almost never used outside more technical places, the Median will almost always be the more accurate measure of middle values for data like this, because it isn't affected by extreme values the way the Average (Mean) is.

Quantity sold may not reflect the Revenue generated, as seen in the analysis of weekdays vs weekends and in the comparison of the top 3 hours by quantity and by revenue.

As for Market Basket Analysis, it is a really good tool for datasets that include transactions and items in their columns. It lets us find meaningful relationships between items or sets of items by looking at how frequently they appear in the dataset and how often they appear together in the same transaction, which tells us whether they are frequently bought together and how likely that is. If the minimum support is set too high, we might overlook hidden but meaningful relationships. If it is set too low, the frequent itemsets may generate too many rules (each of the form "if the client buys A [antecedent], then they buy B [consequent]"). That is why we filter by Lift, which tells us (the higher the better) whether a rule is interesting enough to justify marketing actions, promotions, or bundles to increase sales.
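The support, confidence, and lift measures described above can be sketched with the standard library alone. The transactions, the `MIN_SUPPORT` threshold, and the helper names are assumptions made for this example; the sketch only considers rules between single items, whereas the project also found rules between groups of items.

```python
from itertools import combinations

# Hypothetical transactions: one set of articles per ticket.
transactions = [
    {"traditional_baguette", "croissant", "pain_au_chocolat"},
    {"croissant", "pain_au_chocolat"},
    {"traditional_baguette"},
    {"traditional_baguette", "croissant"},
    {"croissant", "pain_au_chocolat"},
    {"traditional_baguette"},
    {"traditional_baguette", "formule_sandwich"},
    {"traditional_baguette"},
]


def support(itemset):
    """Share of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for antecedent -> consequent."""
    both = support(antecedent | consequent)
    confidence = both / support(antecedent)
    lift = confidence / support(consequent)
    return both, confidence, lift


MIN_SUPPORT = 0.25  # assumed threshold; tune it to the dataset
items = sorted(set().union(*transactions))

rules = []
for a, b in combinations(items, 2):
    for ante, cons in (({a}, {b}), ({b}, {a})):
        sup, conf, lift = rule_metrics(ante, cons)
        if sup >= MIN_SUPPORT:  # drop pairs that are too rare
            rules.append((lift, conf, sup, ante, cons))

# Highest lift first: these are the most interesting rules.
rules.sort(key=lambda r: r[0], reverse=True)
for lift, conf, sup, ante, cons in rules:
    print(f"{ante} -> {cons}: support={sup:.2f} "
          f"confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 means the two items appear together more often than they would if they were bought independently, and a lift below 1 means less often. Confidence is the measure that answers "if the client buys A, how often do they also buy B?", so it is worth reading both values before planning a bundle.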
