Machine learning basics (part 8): Ensemble learning

Hang Nguyen
6 min read · May 16, 2022

The basic idea is to train several learners or classifiers that each get slightly different results on a dataset, some learning certain things well and some learning others, and then to put them together. If they are combined well, the result will be better than any one of them on its own; if they are combined badly, it can be even worse.

Interestingly, ensemble techniques do very well when there is very little data as well as when there is too much. This is a little like cross-validation, which is used when there is not enough data to go around: several classifiers are trained on different subsets of the data, and most of the models built are then discarded. With an ensemble technique we keep them all, and combine their results in some way.
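
For instance, here is a rough sketch of that "keep them all" idea (the synthetic dataset and the choice of scikit-learn decision trees are purely illustrative assumptions), where each learner is trained on its own random subset of the data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, purely for illustration
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

rng = np.random.default_rng(0)
ensemble = []
for _ in range(11):
    # Each learner sees a different random half of the data
    idx = rng.choice(len(X), size=len(X) // 2, replace=False)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[idx], y[idx])
    ensemble.append(tree)  # keep every model instead of discarding all but the "best" one
```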

One simple way to combine the results is majority voting. This has the interesting property that for binary classification (more details about binary classification in part 6), the combined classifier only gets the answer wrong if more than half of the individual classifiers are wrong. Hopefully, this does not happen frequently.
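
Here is a minimal sketch of majority voting (assuming scikit-learn; the three classifier types and the toy dataset are arbitrary illustrative choices, and "hard" voting is scikit-learn's name for majority voting):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, purely for illustration
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# voting="hard": each classifier casts one vote and the majority label wins,
# so the ensemble is only wrong when more than half of its members are wrong
voter = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
        ("logreg", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
    ],
    voting="hard",
)
voter.fit(X, y)
print(voter.predict(X[:5]))  # combined prediction = majority vote of the three models
```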

Most popular ensemble technique: Boosting

By taking a set of poor learners, each performing only a little better than chance, and putting them together, it is possible to construct an…
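
As a rough sketch of this idea (assuming scikit-learn's AdaBoostClassifier as the boosting algorithm, which is just one illustrative choice), a boosted ensemble of one-level decision trees can be compared against a single such weak learner:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, purely for illustration
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single weak learner: a one-level decision tree ("stump")
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)

# Boosting trains many such stumps in sequence, reweighting the data so that each
# new stump concentrates on the examples the previous ones got wrong
# (a depth-1 tree is AdaBoostClassifier's default base learner)
boosted = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

print("single weak learner:", stump.score(X_test, y_test))
print("boosted ensemble:   ", boosted.score(X_test, y_test))
```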

Written by Hang Nguyen

Just sharing (data) knowledge
