Machine Learning 101 P16: Apriori
Next, we will delve into Association Rule Learning, a family of methods designed to uncover patterns in large datasets. Apriori, one of its algorithms, is the one we analyze in detail today. The goal is to discover relationships between items in large datasets, especially transactional databases.
Introduction
Apriori is a classic unsupervised learning algorithm that finds frequent itemsets in a dataset and derives association rules (patterns). It is based on the Apriori principle, which states:
If an itemset is frequent, then all of its subsets must also be frequent.
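This principle helps prune the search space, making the algorithm more efficient: if an itemset is infrequent, none of its supersets can be frequent, so they never need to be counted.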
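To make the pruning idea concrete, here is a minimal sketch of level-wise frequent itemset mining in plain Python. The function name `frequent_itemsets`, the toy `transactions`, and the 0.6 support threshold are all illustrative assumptions, not part of any library or of the original article; it favours readability over speed.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support.

    Hypothetical helper for illustration; support is the fraction of
    transactions that contain the itemset.
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1: keep only the single items that are frequent.
    items = {item for basket in baskets for item in basket}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Apriori principle): drop any candidate that has an
        # infrequent (k-1)-subset, without counting it in the data.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Count step: scan the data only for the surviving candidates.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Toy data, invented for this example.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

result = frequent_itemsets(transactions, min_support=0.6)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

On this toy data, jam appears in only one of five baskets, so it is dropped at the first level and no pair containing jam is ever generated or counted. The itemsets that survive are bread, butter, milk, and the pair {bread, butter}.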
Applications
- Market Basket Analysis: Identifying items that are frequently bought together (e.g., “Customers who buy bread often buy butter”).
- Recommendation Systems: Suggesting products based on user behaviours.
- Fraud Detection: Identifying patterns in fraudulent transactions.
- Medical Diagnosis: Finding relationships between symptoms and diseases.
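As a rough illustration of the market basket case, the sketch below scores the rule "bread → butter" with the standard support, confidence, and lift measures. The function `rule_metrics` and the toy baskets are hypothetical and exist only for this example.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Hypothetical helper for illustration only.
    """
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    a, c = set(antecedent), set(consequent)

    count_a = sum(a <= basket for basket in baskets)           # baskets with the antecedent
    count_c = sum(c <= basket for basket in baskets)           # baskets with the consequent
    count_both = sum((a | c) <= basket for basket in baskets)  # baskets with both

    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Toy data, invented for this example.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

support, confidence, lift = rule_metrics(transactions, {"bread"}, {"butter"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Here bread appears in four of the five baskets and bread with butter in three, so the rule has support 0.6 and confidence 0.75. Butter on its own appears in four of five baskets, so the lift comes out slightly below 1: in this toy data, buying bread does not make butter any more likely than it already was.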
Key considerations
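- It’s easy to implement, and its pruning keeps the search manageable, but it can still be computationally expensive on large datasets because it scans the data repeatedly and the number of candidate itemsets grows quickly.
- It can generate a very large number of rules, which then need to be filtered or pruned.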
- It has the same assumption as Naive Bayes…