Machine learning basics (part 14): Linear Discriminant Analysis
Linear discriminant analysis (LDA) is generally used to classify patterns between two classes; however, it can be extended to classify multiple patterns. It is also used as a dimensionality reduction technique, i.e., as a pre-processing step before pattern classification in machine learning. Dimensionality reduction here simply means projecting multi-dimensional data onto just 2 or 3 dimensions.
LDA assumes that all classes are linearly separable, and accordingly it creates multiple linear discriminant functions, each representing a hyperplane in the feature space, to distinguish the classes. If there are two classes, LDA draws one hyperplane and projects the data onto this hyperplane in such a way as to maximize the separation of the two categories. This hyperplane is created according to two criteria considered simultaneously (see the sketch after this list):
- Maximizing the distance between the means of the two classes;
- Minimizing the variation within each category.
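To see these two criteria in action before breaking them into steps, here is a minimal sketch using scikit-learn's LinearDiscriminantAnalysis; the toy two-blob data and the parameter choices are illustrative assumptions, not part of the original article:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Toy two-class data: two Gaussian blobs in 4 dimensions (illustrative only)
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (100, 4)), rng.normal(2.0, 1.0, (100, 4))])
y = np.array([0] * 100 + [1] * 100)

# With C = 2 classes, LDA can project onto at most C - 1 = 1 dimension
lda = LinearDiscriminantAnalysis(n_components=1)
X_reduced = lda.fit_transform(X, y)   # data projected onto the discriminant axis
print(lda.score(X, y))                # training accuracy as a rough sanity check
```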
LDA in 3 basic steps
(1) Calculate the separability between the different classes. This is also known as the between-class variance and is defined in terms of the distance between the means of the different classes.
Suppose we have C classes. Let μi be the mean vector of class i, i = 1, 2, 3, …, C, let Mi be the number of samples within class i, i = 1, 2, 3, …, C, and let M be the total number of samples.
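Written out, this gives the standard between-class scatter matrix (μ here denotes the overall mean of all M samples, a symbol introduced for this formula):

S_B = \sum_{i=1}^{C} M_i \, (\mu_i - \mu)(\mu_i - \mu)^T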
(2) Calculate the within-class variance. This is the distance between each sample and the mean of its own class.
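In the same notation, the standard within-class scatter matrix is:

S_W = \sum_{i=1}^{C} \sum_{x \in \text{class } i} (x - \mu_i)(x - \mu_i)^T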
(3) Construct the lower-dimensional space that maximizes the between-class variance from Step 1 and minimizes the within-class variance from Step 2. In the equation below, P is the projection onto the lower-dimensional space; this objective is also known as Fisher's criterion, reconstructed here in its standard form with S_B and S_W the scatter matrices above:

P_{\text{lda}} = \arg\max_{P} \frac{\left| P^{T} S_B P \right|}{\left| P^{T} S_W P \right|}
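To make the three steps concrete, here is a minimal from-scratch sketch in NumPy; the function name lda_fit, the synthetic data, and the eigendecomposition route via S_W⁻¹ S_B are illustrative assumptions, not code from the original article:

```python
import numpy as np

def lda_fit(X, y, n_components):
    """Fit an LDA projection following the three steps above (a sketch)."""
    classes = np.unique(y)
    n_features = X.shape[1]
    mu = X.mean(axis=0)                        # overall mean of all M samples

    S_B = np.zeros((n_features, n_features))   # between-class scatter (Step 1)
    S_W = np.zeros((n_features, n_features))   # within-class scatter (Step 2)
    for c in classes:
        X_c = X[y == c]
        mu_c = X_c.mean(axis=0)
        diff = (mu_c - mu).reshape(-1, 1)
        S_B += len(X_c) * diff @ diff.T        # M_i (mu_i - mu)(mu_i - mu)^T
        S_W += (X_c - mu_c).T @ (X_c - mu_c)   # sum of (x - mu_i)(x - mu_i)^T

    # Step 3: maximize Fisher's criterion via the eigenvectors of S_W^{-1} S_B
    # (S_B has rank at most C - 1, so at most C - 1 components are meaningful)
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]     # largest eigenvalues first
    P = eigvecs[:, order[:n_components]].real  # projection matrix P
    return P

# Tiny usage example on synthetic two-class data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 3)), rng.normal(3.0, 1.0, (50, 3))])
y = np.array([0] * 50 + [1] * 50)
P = lda_fit(X, y, n_components=1)
X_proj = X @ P                                 # data projected onto 1 dimension
```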