Market Basket Analysis In R: A Detailed And Extensive Guide


January 15, 2021

The increasing volume of data and the growing importance of retail analytics have made it easier for retailers to know their customers better. With large amounts of data and an omnichannel retail approach, analytics has become more important for driving decisions. Data can help retailers understand customer behavior, plan and promote products, increase sales, improve customer experience, and optimize supply chain performance.

Many algorithms and techniques used in retail help uncover better insights and predict future events. One of the most widely used is Market Basket Analysis. This post covers Market Basket Analysis in the R language and highlights the importance of such techniques for boosting retail sales.


An Introduction To Market Basket Analysis: From Concept To Implementation

You have probably shopped online at least once. While doing so, you may have noticed a section that reads ‘frequently bought together’, regardless of the product type. eCommerce platforms continuously work to improve customer experience using various techniques, and this is one of them.

Image Source: sciencedirect.com

Market Basket Analysis is a technique used to discover associations between items. In the simplest terms, it allows retailers to identify relationships between items that people generally buy together.

For instance, a person who buys ‘bread’ is more likely to also buy ‘butter’ or ‘jam’, which are predicted as ‘go-along’ items for that purchase. eCommerce retailers build on this: when a retailer sees sales of one item increase, it can promote related items with a discount so that people buy them together. The analysis is popular for cross-selling and up-selling, and retailers use it in marketing campaigns to boost sales.

This entire process is known as ‘Market Basket Analysis’. It works on the idea that a customer who buys one item is more (or less) likely to buy another related item or group of items.

To implement this, association rule mining is used.

Let’s learn what Association Rule Mining is.

What Is Association Rule Mining?

Association Rule Mining is a rule-based machine learning method for finding associations and relationships between items in large datasets. An association rule shows how frequently an item occurs in a transaction given the occurrence of other items in that transaction.

Association rules are widely used to analyze basket or transaction data to discover strong rules based on the interestingness and frequency of occurrences. Association rules can be understood as the “if this, then that” rule.

For example, if a user buys coffee and sugar, then he/she is likely to buy milk.

This rule could be written as:

If {A} Then {B}

Here, the IF part of the rule is known as the antecedent and the THEN part is known as the consequent. {A} is the condition and {B} is the result.

Algorithms Used In Market Basket Analysis:

Multiple techniques and algorithms are used in Market Basket Analysis. One of their main objectives is to predict the likelihood of items being purchased together by users. Besides Apriori, these include:

  • AIS
  • SETM
  • FP Growth

Apriori is by far the most widely used and best-known association rule algorithm. It is considered accurate and outperforms the AIS and SETM algorithms. It finds frequent itemsets in transactions and identifies association rules between those items. One limitation of the Apriori algorithm is frequent itemset generation: it scans the database many times, which is computationally expensive on a large database and leads to longer run times.
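The repeated database scans mentioned above can be seen in a minimal base R sketch of a level-wise frequent itemset search. This is an illustration under stated assumptions, not the arules implementation: the transactions and the `min_count` threshold are made up, and candidates are built from every item that survived the previous level, which is a simplification of Apriori's join-and-prune step.

```r
# Toy transactions (hypothetical data, not the bakery dataset used later).
transactions <- list(
  c("bread", "butter", "jam"),
  c("bread", "butter"),
  c("bread", "jam"),
  c("coffee", "bread", "butter"),
  c("coffee", "cake")
)
min_count <- 2  # assumed threshold: an itemset must appear in at least 2 baskets

# One full scan of the data: count baskets containing every item of `itemset`.
count_of <- function(itemset) {
  sum(vapply(transactions, function(basket) all(itemset %in% basket), logical(1)))
}

# Level 1: frequent single items.
items <- sort(unique(unlist(transactions)))
frequent <- Filter(function(s) count_of(s) >= min_count, as.list(items))
all_frequent <- frequent

# Level k: build size-k candidates only from items that survived level k - 1,
# then rescan the data once per candidate (simplified candidate generation).
k <- 2
while (length(frequent) > 0) {
  survivors <- sort(unique(unlist(frequent)))
  if (length(survivors) < k) break
  candidates <- combn(survivors, k, simplify = FALSE)
  frequent <- Filter(function(s) count_of(s) >= min_count, candidates)
  all_frequent <- c(all_frequent, frequent)
  k <- k + 1
}

for (s in all_frequent) {
  cat(sprintf("{%s}: %d of %d transactions\n",
              paste(s, collapse = ", "), count_of(s), length(transactions)))
}
```

Every call to `count_of()` walks the whole transaction list, which is why the number of candidates, and therefore the number of scans, drives the cost on large datasets.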

Association rules have three primary measures of strength:

  • Support
  • Confidence
  • Lift

Support:

Support is an important measure of how frequently an itemset occurs, expressed as a percentage of all transactions. For a rule, it is the number of transactions that include both {A} and {B} as a percentage of the total number of transactions.

Confidence:

Confidence is the ratio of the number of transactions that include the items in both {A} and {B} to the number of transactions that include the items in {A}. It can be understood as how often items in {B} appear in transactions that contain {A}. It is a conditional probability.

Lift:

The third measure, lift or lift ratio, is the ratio of confidence to expected confidence, where expected confidence is the support of {B} on its own. It shows how much better the rule is at predicting the result than simply assuming it. The greater the lift value, the stronger the association.

In other words, lift compares the confidence that B will be purchased given that A was purchased with how often B is purchased overall. A lift above 1 indicates that buying A makes buying B more likely.
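As a worked illustration of the three measures, here is a minimal base R sketch for the rule {coffee, sugar} => {milk}. The five transactions are invented for the example, and `support()` is a small helper defined here, not a function from any package.

```r
# Toy transactions (hypothetical data for illustration only).
transactions <- list(
  c("coffee", "sugar", "milk"),
  c("coffee", "sugar", "milk"),
  c("coffee", "sugar"),
  c("bread", "butter"),
  c("bread", "milk")
)

# Fraction of transactions that contain every item in `itemset`.
support <- function(itemset) {
  mean(vapply(transactions, function(basket) all(itemset %in% basket), logical(1)))
}

A <- c("coffee", "sugar")  # antecedent
B <- "milk"                # consequent

supp_AB    <- support(c(A, B))         # share of baskets with both A and B
confidence <- supp_AB / support(A)     # share of A-baskets that also contain B
lift       <- confidence / support(B)  # confidence relative to B's baseline rate

cat(sprintf("support = %.2f, confidence = %.2f, lift = %.2f\n",
            supp_AB, confidence, lift))
# prints: support = 0.40, confidence = 0.67, lift = 1.11
```

Here two of the five baskets contain all three items (support 0.40), two of the three coffee-and-sugar baskets also contain milk (confidence 0.67), and milk appears in three of five baskets overall, so the lift is 0.67 / 0.60, or about 1.11.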

These measures are typically computed over very large numbers of records to obtain reliable results. Association rules are not considered reliable when they are mined from a small set of data. In practice, many thousands or even millions of transactions are analyzed before conclusions are drawn from the observations.

What Are The Advantages Of Market Basket Analysis?

There are several advantages of implementing Market Basket Analysis in marketing. MBA can be applied to customer data from point of sale (PoS) systems.

It can help retailers to:

  • Increase customer engagement
  • Boost sales and increase RoI
  • Improve customer experience
  • Optimize marketing campaigns and strategies
  • Understand customers better
  • Identify customer behavior and patterns

Here is one example of Market Basket Analysis Implementation in R on Tableau and Power BI. This is how it looks:

Predictive Market Basket Analysis In R on Power BI & Tableau

What Does Market Basket Analysis Look Like From A Customer’s Perspective?

If this is still not completely clear, here is an example from Amazon, the world’s largest eCommerce chain.

From a customer’s perspective, market basket analysis is like shopping at a supermarket. It looks at all the items that customers buy together in a single purchase and surfaces the most closely related products that they tend to buy in one go.

Here’s one example of its implementation in R language. R is a language for statistical computing and data analysis. It is widely used by statisticians and data analysts across the world.

Market Basket Analysis In R: Implementation Step By Step:

Step 1: Loading Required Libraries/Package

In R, the ‘arules’ package is used to represent, manipulate, and analyze transaction data and patterns. It provides frequent itemsets and association rules for performing an MBA on data.
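The original listing for this step is not reproduced in the text, so the following is only a minimal sketch of what loading the package typically looks like. It assumes arules is installed from CRAN.

```r
# One-time setup: uncomment to install the package from CRAN.
# install.packages("arules")

# Attach arules: provides the transactions class, read.transactions(),
# apriori() and inspect().
library(arules)
```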

Terminal Output:

Step 2: Loading Dataset/Transactional Dataset

Generally, an MBA is performed on transaction data from a point of sale system or directly on customer data. The input can come from any data source, such as a relational database or flat files like CSV or Excel (XLS) files.

Here, we have used data from one bakery located in Edinburgh. This dataset is from Kaggle, an online community of data scientists and machine learning professionals.
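A sketch of how such a file could be read into arules follows. Since the real file is not included here, the code writes a tiny stand-in CSV to a temporary path. The column names `Transaction` and `Item` and the sample rows are assumptions for illustration and may differ from the Kaggle file.

```r
library(arules)

# Stand-in for the bakery file: a few rows in long format, one row per
# (transaction id, item) pair. Column names and values are assumptions.
toy <- data.frame(
  Transaction = c(1, 1, 2, 2, 3, 3, 3, 4),
  Item        = c("Coffee", "Toast", "Coffee", "Alfajores",
                  "Bread", "Coffee", "Toast", "Bread")
)
csv_path <- tempfile(fileext = ".csv")  # replace with the path to the real CSV
write.csv(toy, csv_path, row.names = FALSE, quote = FALSE)

# format = "single": each line holds one item of one transaction.
# cols names the transaction-id column and the item column, in that order.
trans <- read.transactions(
  csv_path,
  format        = "single",
  header        = TRUE,
  sep           = ",",
  cols          = c("Transaction", "Item"),
  rm.duplicates = TRUE
)

summary(trans)                           # transactions, items, density
itemFrequency(trans, type = "absolute")  # how many baskets contain each item
```

The `"single"` format fits data with one item per row. If each row of the file already listed a whole basket, `format = "basket"` would be the option to try instead.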

Input Data:

Image Source: kaggle.com

Step 3: Performing Apriori Algorithm And Generating Association Rules

Association rules can be generated using the Apriori algorithm, one of the most widely used algorithms in Market Basket Analysis. The apriori() function comes from the arules package. It works on data with two key columns, Transaction Id and Item name, and takes input parameters such as:

  • support threshold (supp)
  • confidence threshold (conf)
  • minimum length of rules (minlen)
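A minimal sketch of the call is shown here on a handful of made-up baskets rather than the bakery data. The threshold values are illustrative guesses, not the ones used in the original analysis, and would need tuning for a real dataset.

```r
library(arules)

# Toy baskets standing in for the bakery transactions (hypothetical data).
baskets <- list(
  c("Coffee", "Toast"),
  c("Coffee", "Toast", "Bread"),
  c("Coffee", "Alfajores"),
  c("Coffee", "Alfajores", "Bread"),
  c("Bread", "Tea"),
  c("Coffee", "Toast")
)
trans <- as(baskets, "transactions")

# supp:   minimum share of transactions that must contain the whole rule
# conf:   minimum confidence of the rule
# minlen: minimum number of items in a rule (2 excludes rules with empty LHS)
rules <- apriori(
  trans,
  parameter = list(supp = 0.2, conf = 0.7, minlen = 2)
)

# Show the rules with the strongest associations first.
inspect(sort(rules, by = "lift"))
```

Lower thresholds produce more rules, many of them weak, while higher thresholds can leave no rules at all, so it is common to start strict and relax gradually.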

Terminal Output:

It shows parameter specification, minimum support count, and status of performed operations.

Association Rules Output:

Here, LHS represents the items already in a basket, and RHS represents the items frequently bought along with them. ‘Coffee’ and ‘Toast’, and ‘Alfajores’ and ‘Coffee’, are popular combinations that we can derive from this analysis.
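To focus on one product, the rule set can be filtered by its RHS and converted to a table. The sketch below again uses made-up baskets and illustrative thresholds, so the numbers it prints are not the bakery results.

```r
library(arules)

# Toy baskets (hypothetical data), mined with illustrative thresholds.
baskets <- list(
  c("Coffee", "Toast"),
  c("Coffee", "Toast", "Bread"),
  c("Coffee", "Alfajores"),
  c("Coffee", "Alfajores", "Bread"),
  c("Bread", "Tea"),
  c("Coffee", "Toast")
)
trans <- as(baskets, "transactions")
rules <- apriori(
  trans,
  parameter = list(supp = 0.2, conf = 0.7, minlen = 2),
  control   = list(verbose = FALSE)  # suppress the progress log
)

# Keep only rules whose RHS is Coffee, i.e. "which items lead to Coffee?"
coffee_rules <- subset(rules, subset = rhs %in% "Coffee")

# One row per rule, with its quality measures as columns.
rule_table <- as(coffee_rules, "data.frame")
print(rule_table[order(-rule_table$lift), ])
```

A data frame is convenient when the rules need to be exported or passed on to a reporting tool.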

Practical Applications Of Market Basket Analysis Beyond Retail:

When people hear the term ‘market basket analysis’, they usually think only of supermarkets and customers. But it is applicable in various industries and offers predictive insights.

Here is the list of industries:


Here, market basket analysis can be used to determine which services and packages customers use and like most. Providers can gauge the popularity of each package or service and direct their marketing efforts accordingly.

Banks and Finance:

It can be used to analyze customers’ transactions and purchases. This can help banks build customer profiles and identify fraud and breaches.


Healthcare:

In healthcare, this analysis helps in the study of symptoms and illnesses. It can also be used to find biological associations between the environment and different genes.


Insurance:

By using this analysis, insurance companies can check for claim fraud and build better customer profiles.


It can be used for predictive analysis of equipment failure.

Difference Between Association And Recommendation:

A recommendation is primarily based on individual preference rather than collective observation. Association rules find relationships between items based on how frequently they are bought together. A recommendation engine, by contrast, uses Collaborative Filtering based on historical preferences and ratings to find similarities between users and items.

For example, a set of items can be recommended to user A based on the interests of a similar user B.

FAQs Related To Market Basket Analysis:

  • What is meant by market basket analysis?

Market basket analysis is a well-known technique used by retailers to identify associations between items purchased together based on frequency.

  • What is market basket analysis used for?

Market basket analysis is used to discover associations between items purchased by the users.

  • Is market basket analysis supervised or unsupervised?

Market basket analysis is an unsupervised machine learning method.

  • What is the other name of market basket analysis?

Market basket analysis is also known as association analysis.

  • What is the Apriori algorithm in R?

The Apriori algorithm is used for finding associations between items in a dataset. It is easy to implement in the R statistical programming language.


SPEC INDIA, as your single stop IT partner has been successfully implementing a bouquet of diverse solutions and services all over the globe, proving its mettle as an ISO 9001:2015 certified IT solutions organization. With efficient project management practices, international standards to comply, flexible engagement models and superior infrastructure, SPEC INDIA is a customer’s delight. Our skilled technical resources are apt at putting thoughts in a perspective by offering value-added reads for all.
