What is Ensemble Learning?

This series (“Bagging & Boosting Ensemble Methods and What is the Difference Between Them?”) consists of 6 separate articles, and this is the first article in the series. In this part, we will talk about “What is Ensemble Learning?”.

https://link.springer.com/article/10.1007/s00521-020-04986-5

Note: I will use the following abbreviations below:

  • Data Science (DS)
  • Machine Learning (ML)

Let’s start…

In the world of ML, ensemble learning methods are among the most popular topics to learn. They have a reputation as winning algorithms: on DS competition platforms such as Kaggle, MachineHack, and HackerEarth, top-ranking participants on the leaderboards frequently use ensemble methods such as bagging and boosting.

In interviews for data scientist roles, the difference between bagging and boosting is one of the most frequently asked questions.

So in this article, we are going to learn different kinds of ensemble methods. In particular, we are going to focus more on Bagging and Boosting approaches. First, we will walk through the required basic concepts. Later we will learn in-depth about these methods.

What is Ensemble Learning?

In ML, instead of building only a single model to predict the target, how about considering multiple models to predict it? This is the main idea behind ensemble learning.

https://1.bp.blogspot.com/-S8ss-zVfpRM/V1qKcxfCvNI/AAAAAAAAD0I/8UUFyrE4MqQYYuWSxrOOvX3zRfw93nCLwCLcB/s1600/Stacking.png

In ensemble learning, we build multiple ML models using the train data. We will discuss how to use the same train data to build various models in the next sections of this article.
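A minimal sketch of how several models' outputs can be combined into one prediction, assuming each model's prediction for a single sample is already available. The helper names `majority_vote` and `average_prediction` and the sample predictions are made up for illustration.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Combine class labels predicted by several models for one sample."""
    # most_common(1) returns the label with the highest count;
    # ties go to the label that appeared first in the list.
    return Counter(predictions).most_common(1)[0][0]


def average_prediction(predictions):
    """Combine numeric predictions from several regression models."""
    return mean(predictions)


# Hypothetical predictions from three models for the same sample.
label = majority_vote(["spam", "ham", "spam"])
value = average_prediction([10.0, 12.0, 14.0])
print(label, value)
```

Voting is the usual choice for class labels and averaging for numeric targets; other combination rules exist, so this is only one option.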

So what advantage will we get with ensemble learning?

This is the primary question that will come to mind.

Let’s take a second here to think about what advantage we will get if we build multiple models.

With a single-model approach, if the built model has high bias or high variance, we are limited to that. Even though there are methods to handle high bias or high variance, if the final model still suffers from either issue, there is little more we can do.

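Whereas if we build multiple models, we can reduce these issues by combining them. If the individual models have high variance, averaging their predictions reduces the variance, which is the idea behind bagging. High bias does not simply average out, because models that share the same bias keep it after averaging; it can instead be reduced by building the models one after another so that each model corrects the errors of the previous ones, which is the idea behind boosting.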

https://www.dataquest.io/blog/learning-curves-machine-learning/
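The variance part of this argument can be checked with a small simulation. The sketch below assumes each model is unbiased and that the models' errors are independent Gaussian noise; `TRUE_VALUE`, `noisy_model`, and `ensemble` are hypothetical names used only for this illustration.

```python
import random
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is repeatable

TRUE_VALUE = 5.0  # assumed "correct" answer for the toy problem


def noisy_model():
    """Stand-in for one high-variance, unbiased model: truth plus Gaussian noise."""
    return TRUE_VALUE + random.gauss(0.0, 1.0)


def ensemble(n_models):
    """Average the predictions of n_models independent noisy models."""
    return mean(noisy_model() for _ in range(n_models))


# Repeat each experiment many times to estimate the spread of the predictions.
single_preds = [noisy_model() for _ in range(2000)]
ensemble_preds = [ensemble(25) for _ in range(2000)]

var_single = pvariance(single_preds)
var_ensemble = pvariance(ensemble_preds)
print(f"single model variance:      {var_single:.3f}")
print(f"25-model ensemble variance: {var_ensemble:.3f}")
```

Models trained on overlapping data are usually correlated, so the reduction seen in practice is smaller than in this idealized setting. The simulation covers variance only: an offset shared by every model would remain in the average.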

For building multiple models we are going to use the same train data.

If we use the same train data, then all the built models will also be the same, right?

But this is not the case.

We will learn how to build different models using the same train dataset. Each model will be unique. We will split the available train data into multiple smaller datasets, but while creating these datasets we should follow some key properties.

For now, just remember that to build multiple models we will split the available train data into smaller datasets. In the next steps, we will learn how to build models using the smaller datasets: one model for each smaller dataset.
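One common way to create such datasets is random sampling with replacement (bootstrap sampling, the approach bagging uses). The sketch below assumes that approach; the key properties have not been spelled out at this point, so treat the sampling scheme, the helper name `make_subsets`, and the toy data as assumptions.

```python
import random


def make_subsets(rows, n_subsets, subset_size, seed=0):
    """Draw n_subsets random samples (with replacement) from rows."""
    rng = random.Random(seed)
    return [rng.choices(rows, k=subset_size) for _ in range(n_subsets)]


# Toy training data: (feature, label) pairs, invented for illustration.
train_data = [(x, int(x >= 5)) for x in range(10)]

subsets = make_subsets(train_data, n_subsets=3, subset_size=6)
for i, subset in enumerate(subsets, start=1):
    print(f"dataset {i}: {subset}")
```

Because rows are drawn at random, each smaller dataset contains a different mix of the original rows, which is what makes the models trained on them differ.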

Summary:

Ensemble learning means that instead of building a single model for prediction, we build multiple ML models, which we call weak learners. The combination of all weak learners makes the strong learner, which generalizes to predict all the target classes with a decent amount of accuracy.
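To see why combining weak learners can give a strong learner, here is a small worked calculation. It assumes a two-class problem, an odd number of learners, and learners whose mistakes are independent, which real models only approximate; `majority_accuracy` is a hypothetical helper name.

```python
from math import comb


def majority_accuracy(n_learners, p_correct):
    """Probability that more than half of n independent learners are right."""
    needed = n_learners // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_learners, k) * p_correct**k * (1 - p_correct) ** (n_learners - k)
        for k in range(needed, n_learners + 1)
    )


# Each learner is right 60% of the time; compare ensembles of growing size.
for n in (1, 11, 101):
    print(n, round(majority_accuracy(n, 0.6), 3))
```

Under these assumptions the majority vote becomes more accurate as more learners are added, even though each learner on its own is only slightly better than guessing.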

1. Different Ensemble Methods

https://link.springer.com/article/10.1007/s11270-020-04693-w

We are saying we will build multiple models, but how will these models differ from one another? We have two possibilities.

  • All the models are built using the same ML algorithm
  • The models are built using different ML algorithms

Based on the above-mentioned criteria, ensemble methods are of two types.

  1. Homogeneous ensemble methods
  2. Heterogeneous ensemble methods

Let’s understand these methods individually.

2. Homogeneous Ensemble Method

The first possibility for building multiple models is building the same ML model multiple times with the same available train data. Don’t worry: even if we use the same training data with the same ML algorithm, all the models will still be different. We will explain this in the next section.

These individual models are called weak learners.

Just keep in mind that in homogeneous ensemble methods all the individual models are built using the same ML algorithm.

For example, if the individual model is a decision tree, then one good example of the ensemble method is random forest.

In the random forest model, we will build N different models, all of which are decision tree models. If you want to learn how the decision tree and random forest algorithms work, have a look at the articles below.

Both bagging and boosting are homogeneous ensemble methods.

https://dataaspirant.com/ensemble-methods-bagging-vs-boosting-difference/
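A simplified, bagging-style sketch of a homogeneous ensemble: the same algorithm (a one-split decision tree, often called a decision stump) is trained many times on different random samples of the same data. This is not a full random forest, and the function names and toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Fit a one-split decision tree: predict 1 when x >= threshold."""
    best_threshold, best_errors = None, len(rows) + 1
    for threshold, _ in rows:
        # Count how many training rows this candidate threshold misclassifies.
        errors = sum(int(x >= threshold) != y for x, y in rows)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict_ensemble(thresholds, x):
    """Majority vote over all stumps."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


rng = random.Random(42)
# Toy 1-D data: label is 1 when x >= 5, invented for illustration.
train_data = [(x, int(x >= 5)) for x in range(10)]

# Same algorithm (fit_stump) every time; only the sampled data differs.
thresholds = [fit_stump(rng.choices(train_data, k=10)) for _ in range(15)]

print("thresholds learned:", sorted(thresholds))
print("prediction for x=8:", predict_ensemble(thresholds, 8))
print("prediction for x=1:", predict_ensemble(thresholds, 1))
```

The learned thresholds vary from stump to stump because each one saw a different sample, yet the vote across all of them is stable.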

3. Heterogeneous Ensemble Method

The second possibility for building multiple models is building different ML models. Each model uses a different algorithm but the same training data.
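A minimal sketch of a heterogeneous ensemble, assuming three deliberately simple algorithms trained on the same toy data and combined by majority vote. The three `fit_*` functions and the data are invented for illustration. Plain voting is used here for brevity; stacking would instead train a further model on the base models' predictions.

```python
from collections import Counter
from statistics import mean

# Toy 1-D training data shared by every model, invented for illustration.
train_data = [(x, int(x >= 5)) for x in range(10)]


def fit_stump(rows):
    """Model 1: one-split rule, predict 1 when x >= threshold."""
    threshold = min(
        (t for t, _ in rows),
        key=lambda t: sum(int(x >= t) != y for x, y in rows),
    )
    return lambda x: int(x >= threshold)


def fit_nearest_neighbor(rows):
    """Model 2: copy the label of the closest training point."""
    return lambda x: min(rows, key=lambda row: abs(row[0] - x))[1]


def fit_nearest_centroid(rows):
    """Model 3: pick the class whose mean feature value is closest."""
    centroids = {
        label: mean(x for x, y in rows if y == label)
        for label in {y for _, y in rows}
    }
    return lambda x: min(centroids, key=lambda label: abs(centroids[label] - x))


# Three different algorithms, all trained on the same data.
fitters = (fit_stump, fit_nearest_neighbor, fit_nearest_centroid)
models = [fit(train_data) for fit in fitters]


def predict(x):
    """Majority vote across the three models."""
    return Counter(model(x) for model in models).most_common(1)[0][0]


print(predict(2), predict(7))
```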

Here too, the individual models are called weak learners. The stacking method falls under the heterogeneous ensemble methods. In this article, we are mainly focusing on the homogeneous ensemble methods. In upcoming articles, we will learn about the stacking method.

For now, let’s focus only on homogeneous methods.

By now we are clear about the different types of ensemble methods. We have frequently said that the individual models are weak learners, so let’s spend some time understanding weak learners and strong learners. These are the building blocks of ensemble methods.

Let me pause our topic here. We will see you in our next topic, “Weak Learners & Strong Learners for Machine Learning”.

http://www.plusxp.com/2011/02/back-to-the-future-the-game-episode-1-review/

References
1. https://link.springer.com/article/10.1007/s00521-020-04986-5
2. https://picknmix.readthedocs.io/en/latest/readme.html
3. https://www.dataquest.io/blog/learning-curves-machine-learning/
4. https://dataaspirant.com/ensemble-methods-bagging-vs-boosting-difference/
5. https://link.springer.com/article/10.1007/s11270-020-04693-w
6. https://dataaspirant.com/how-decision-tree-algorithm-works/
7. https://www.kaggle.com/mathchi/study-linear-multi-poly-dt-rf-log-regression#Random-Forest-Regression
8. http://www.plusxp.com/2011/02/back-to-the-future-the-game-episode-1-review/

Experienced Ph.D. with a demonstrated history of working in the higher education industry. Skilled in Data Science, AI, Deep Learning, Big Data, & Mathematics.