Machine Learning 101 P8: Logistic Regression
Introduction
Let's start with the easiest classification machine learning model today, shall we?
Despite having the word “regression” in its name, this is a classification algorithm, not a regression algorithm. Logistic regression is a supervised machine learning algorithm used for binary classification. It assigns each input to one of 2 categories (usually 0 and 1), which is useful in, e.g., spam detection and fraud detection problems. It works best when the two classes are linearly separable, or close to it, meaning a straight line (or a hyperplane, with more features) can split them. You may wonder why we don't just use linear regression instead, since it also models linear relationships and predicts a continuous output. The reasons include:
- Linear regression can output any real number, including values below 0 or above 1, but a binary classification problem needs only 0 or 1.
- Outliers can heavily affect the fitted line, and with it the predictions.
- Even if we apply a threshold to the linear regression output (e.g., if y ≥ 0.5, classify as 1), the resulting classifier may not generalise well.
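To make these three points concrete, here is a minimal sketch in plain Python. The data (hours studied vs. pass/fail) is made up for illustration, and `fit_line` and `boundary` are hypothetical helper names, not part of any library. It fits an ordinary least-squares line to 0/1 labels, then checks what happens to the predictions and to the 0.5 threshold when one extreme point is added.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def boundary(slope, intercept, threshold=0.5):
    """The x value at which the fitted line crosses the threshold."""
    return (threshold - intercept) / slope


# Made-up toy data: hours studied -> failed (0) or passed (1)
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

slope, intercept = fit_line(hours, passed)
print("prediction at 1 hour:  ", slope * 1 + intercept)    # below 0
print("prediction at 12 hours:", slope * 12 + intercept)   # above 1
print("0.5 cut-off at x =", boundary(slope, intercept))

# Add one extreme (but correctly labelled) student who studied 40 hours
slope_o, intercept_o = fit_line(hours + [40], passed + [1])
print("cut-off with outlier at x =", boundary(slope_o, intercept_o))
print("prediction at 5 hours with outlier:", slope_o * 5 + intercept_o)
```

On this toy data the cut-off sits at 4.5 hours before the extreme point is added and moves to roughly 5.7 after, so a student who studied 5 hours and passed would now be classified as 0.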
So, logistic regression in short:
- A classification algorithm
- Outputs binary categories
- Works with linearly separable data
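As a rough illustration of those three points, here is a from-scratch sketch in plain Python. It assumes the standard logistic (sigmoid) function and plain gradient descent on the log loss; the toy data is made up, the function names are hypothetical, and in practice you would more likely reach for a library implementation.

```python
import math


def sigmoid(z):
    """Squash any real number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(xs, ys, lr=0.1, epochs=2000):
    """Fit a weight w and a bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


def predict_proba(x, w, b):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(w * x + b)


def predict(x, w, b, threshold=0.5):
    """Binary output: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(x, w, b) >= threshold else 0


# Made-up toy data: hours studied -> failed (0) or passed (1)
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

# Centre the feature so plain gradient descent behaves well (a convenience
# choice for this sketch, not a requirement of logistic regression)
mean_hours = sum(hours) / len(hours)
w, b = train([h - mean_hours for h in hours], passed)

for h in (2, 4, 5, 7, 40):
    x = h - mean_hours
    print(h, "hours -> probability", round(predict_proba(x, w, b), 3),
          "-> class", predict(x, w, b))
```

However large the input gets, the probability stays between 0 and 1, and the final answer is always one of the two categories.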
Maths :)
Let's dive right into the mathematical engine behind the scenes!
Note: This is just a brief introduction, not an in-depth maths tutorial!