
Bagging (Bootstrap Aggregating)

Posted by a user on 2023-03-17 15:40


1 answer

Answered by a user, 2023-11-04 05:03

In the previous passage, I talked about the concept of the Decision Tree and its use. Although decision trees are very powerful models that can handle both regression and classification tasks, they usually suffer from high variance. This means that if we split the dataset into two halves at random and fit a decision tree to each half, we will get quite different results. Thus we need an approach that reduces variance at the expense of bias.

Bagging, which is designed for this context, is a procedure for reducing the variance of weak models. In bagging, a random sample of data in the training set is selected with replacement, which means individual data points can be chosen more than once. We then fit a weak learner, such as a decision tree, to each of these samples. Finally, we aggregate the predictions of the base learners to get a more accurate estimate.

We build B distinct bootstrapped training sets from the original training data, calculate the prediction of a separate model fit on each of the B sets, and average them in order to obtain a low-variance statistical model:

$$\hat{f}_{\mathrm{bag}}(x) = \frac{1}{B}\sum_{b=1}^{B}\hat{f}^{*b}(x)$$

While bagging can reduce the variance for many models, it is particularly useful for decision trees. To apply bagging, we simply construct B separate bootstrapped training sets and train B individual decision trees on them. Each tree is grown deep and left unpruned, so each has high variance but low bias; averaging these trees then reduces variance.

Bagging has three steps to complete: bootstrapping, parallel training, and aggregating, as sketched below.
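A minimal from-scratch sketch of these three steps, assuming binary 0/1 labels and scikit-learn decision trees as the weak learners:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_predict(X_train, y_train, X_new, B=100, seed=1):
    """Bagging sketch: bootstrap, train one learner per sample, aggregate by vote."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    votes = []
    for _ in range(B):
        # 1) Bootstrapping: draw n indices with replacement.
        idx = rng.integers(0, n, size=n)
        # 2) Parallel training: fit an independent weak learner on each sample.
        tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
        votes.append(tree.predict(X_new))
    # 3) Aggregating: majority vote across the B learners (0/1 labels assumed).
    return (np.mean(votes, axis=0) >= 0.5).astype(int)
```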

There are a number of key benefits of bagging, including:

- Reduction of variance: averaging many learners fit on different bootstrapped samples smooths out the instability of any single model.
- Ease of implementation: libraries such as scikit-learn make it straightforward to bag an existing base learner.

The key disadvantages of bagging are:

- Loss of interpretability: the averaged ensemble no longer offers the simple decision rules of a single tree.
- Computational cost: training many models on many bootstrapped samples is slower and more memory-intensive.
- Limited gains for stable learners: bagging mainly helps high-variance models and does little for low-variance ones.

Now let's practice using bagging to improve the performance of models. The scikit-learn Python machine learning library provides easy access to the bagging method.

First, we use the make_classification function to construct a classification dataset on which to practice bagging.

Here, we make a binary classification dataset with 3,000 observations and 30 input features, holding out 25% of it as a test set.
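A sketch of the dataset construction, with parameters assumed so as to reproduce the shapes printed below (the exact make_classification settings of the original run are not shown):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem: 3000 rows, 30 features.
X, y = make_classification(n_samples=3000, n_features=30,
                           n_informative=15, n_redundant=5, random_state=1)

# Hold out 25% of the data as a test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=1)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
```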

(2250, 30) (750, 30) (2250,) (750,)

To demonstrate the benefits of the bagging model, we first build a single decision tree and compare it to the bagging model.
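A possible baseline evaluation, assuming repeated stratified 10-fold cross-validation with 3 repeats (the exact protocol of the original run is an assumption):

```python
from numpy import mean, std
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score, RepeatedStratifiedKFold

# Evaluate a single decision tree with repeated k-fold cross-validation.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
tree = DecisionTreeClassifier(random_state=1)
scores = cross_val_score(tree, X_train, y_train, scoring='accuracy', cv=cv)
print('Decision tree: %.3f (%.3f)' % (mean(scores), std(scores)))
```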

Now we construct an ensemble model using the bagging technique.
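A matching bagging ensemble; n_estimators=100 is an illustrative choice (BaggingClassifier bags decision trees by default):

```python
from sklearn.ensemble import BaggingClassifier

# Bag 100 decision trees, each fit on a bootstrapped sample of the train set.
bagging = BaggingClassifier(n_estimators=100, random_state=1)
scores = cross_val_score(bagging, X_train, y_train, scoring='accuracy', cv=cv)
print('Bagging: %.3f (%.3f)' % (mean(scores), std(scores)))
```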

Based on the results, we can easily see that the ensemble model reduces both bias (higher accuracy) and variance (lower standard deviation). The bagging model's accuracy is 0.066 higher than that of a single decision tree.

Make Prediction

BaggingClassifier can make predictions for new cases using the predict function.
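For example, after fitting on the full training set we can classify one new observation (here the first test row, purely for illustration):

```python
# Fit the ensemble, then predict the label of a single new case.
bagging.fit(X_train, y_train)
new_case = X_test[0].reshape(1, -1)
print(bagging.predict(new_case))
```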

Next, we build a bagging model for a regression problem. Similarly, we use the make_regression function to create a regression dataset.
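A sketch of the regression dataset; the make_regression parameters are assumptions, and separate variable names keep it distinct from the classification data:

```python
from sklearn.datasets import make_regression

# Synthetic regression problem with 30 input features.
X_reg, y_reg = make_regression(n_samples=3000, n_features=30,
                               n_informative=15, noise=0.1, random_state=1)
X_reg_train, X_reg_test, y_reg_train, y_reg_test = train_test_split(
    X_reg, y_reg, test_size=0.25, random_state=1)
```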

As before, we use repeated k-fold cross-validation to evaluate the model. One thing differs from the classification case, though: cross-validation expects a utility function rather than a cost function. In other words, it assumes that larger scores are better, not smaller ones.

The scikit-learn package therefore makes the metric negative, as in neg_mean_squared_error, so that it can be maximized instead of minimized. This means that a negative MSE closer to zero is better. We can negate the score to report a positive MSE.
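A sketch of the baseline regression evaluation under these conventions:

```python
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import RepeatedKFold

# Score a single regression tree with negated MSE (closer to 0 is better).
cv_reg = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
tree = DecisionTreeRegressor(random_state=1)
scores = cross_val_score(tree, X_reg_train, y_reg_train,
                         scoring='neg_mean_squared_error', cv=cv_reg)
# Negate the mean score to recover a positive MSE.
print('Decision tree MSE: %.3f (%.3f)' % (-mean(scores), std(scores)))
```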

The mean squared error for the decision tree is … and the variance is … .

A bagging regressor, on the other hand, performs much better than a single decision tree: the mean squared error is … and the variance is … . Bagging reduces both bias and variance.
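The corresponding bagging regressor, again with an illustrative n_estimators=100:

```python
from sklearn.ensemble import BaggingRegressor

# Bagged ensemble of regression trees, scored the same way.
bag_reg = BaggingRegressor(n_estimators=100, random_state=1)
scores = cross_val_score(bag_reg, X_reg_train, y_reg_train,
                         scoring='neg_mean_squared_error', cv=cv_reg)
print('Bagging MSE: %.3f (%.3f)' % (-mean(scores), std(scores)))
```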

In this section, we explore how to tune the hyperparameters for the bagging model.

We demonstrate this by performing a classification task.

Recall that bagging is implemented by building a number of bootstrapped samples and then fitting a weak learner on each of them. The number of models we build corresponds to the parameter n_estimators.

Generally, the number of estimators can be increased until the performance of the ensemble model converges. It is worth noting that using a very large n_estimators will not lead to overfitting.

Now let's try a different number of trees and examine the change in performance of the ensemble model.
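A sweep over ensemble sizes, reusing the classification data and cross-validation setup from earlier (the grid matches the output below):

```python
# Evaluate bagging ensembles of increasing size; keep scores for plotting.
results, names = [], []
for n in [10, 50, 100, 200, 300, 500, 1000, 2000]:
    model = BaggingClassifier(n_estimators=n, random_state=1)
    scores = cross_val_score(model, X_train, y_train, scoring='accuracy', cv=cv)
    results.append(scores)
    names.append(str(n))
    print('Number of Trees %d: %.3f %.3f' % (n, mean(scores), std(scores)))
```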

Number of Trees 10: 0.862 0.038
Number of Trees 50: 0.887 0.025
Number of Trees 100: 0.888 0.027
Number of Trees 200: 0.89 0.027
Number of Trees 300: 0.888 0.027
Number of Trees 500: 0.888 0.028
Number of Trees 1000: 0.892 0.027
Number of Trees 2000: 0.889 0.029

Let's look at the distribution of scores
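A box-and-whisker plot over the per-fold scores collected in the sweep above:

```python
import matplotlib.pyplot as plt

# One box per ensemble size; the triangle marks the mean score.
plt.boxplot(results, labels=names, showmeans=True)
plt.xlabel('Number of trees')
plt.ylabel('Accuracy')
plt.show()
```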

In this case, we can see that the performance of the bagging model converges to 0.888 when we grow 100 trees. The accuracy becomes flat after 100.

Now let's explore the number of samples drawn for each bootstrapped dataset. The default is to draw as many samples as there are in the original training set.
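A sweep over the max_samples ratio; np.arange(0.1, 1.1, 0.1) reproduces the floating-point labels in the output below:

```python
import numpy as np

# Vary the bootstrap sample size from 10% to 100% of the training set.
for ratio in np.arange(0.1, 1.1, 0.1):
    model = BaggingClassifier(n_estimators=100, max_samples=ratio,
                              random_state=1)
    scores = cross_val_score(model, X_train, y_train, scoring='accuracy', cv=cv)
    print('Sample Ratio %s: %.3f %.3f' % (ratio, mean(scores), std(scores)))
```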

Sample Ratio 0.1: 0.801 0.04
Sample Ratio 0.2: 0.83 0.039
Sample Ratio 0.30000000000000004: 0.849 0.029
Sample Ratio 0.4: 0.842 0.031
Sample Ratio 0.5: 0.856 0.039
Sample Ratio 0.6: 0.866 0.037
Sample Ratio 0.7000000000000001: 0.856 0.033
Sample Ratio 0.8: 0.868 0.036
Sample Ratio 0.9: 0.866 0.025
Sample Ratio 1.0: 0.865 0.035

Similarly, look at the distribution of scores

The rule of thumb is to set max_samples to 1.0, but this does not mean all training observations will be selected from the train set. Since we use bootstrapping to select data from the training set at random with replacement, only about 63% of the training instances are sampled on average for each predictor, while the remaining 37% are not sampled and are thus called out-of-bag instances.
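The 63% figure follows from a one-line calculation: in n draws with replacement, the probability that a particular instance is never chosen is

$$\left(1-\frac{1}{n}\right)^{n} \;\longrightarrow\; e^{-1}\approx 0.368 \quad (n\to\infty),$$

so each predictor sees about $1 - e^{-1} \approx 63.2\%$ of the training instances.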

Since the ensemble predictor never sees the OOB samples during training, it can be evaluated on these instances without any additional need for cross-validation after training. We can use out-of-bag evaluation in scikit-learn by setting oob_score=True.

Let's try to use the out-of-bag score to evaluate a bagging model.
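A sketch with OOB scoring enabled (n_estimators is again illustrative):

```python
# Train with out-of-bag evaluation; each tree is scored on the ~37% of
# instances it never saw during training.
bag_oob = BaggingClassifier(n_estimators=100, oob_score=True, random_state=1)
bag_oob.fit(X_train, y_train)
print(bag_oob.oob_score_)  # the text reports roughly 0.876
```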

According to this oob evaluation, this BaggingClassifier is likely to achieve about 87.6% accuracy on the test set. Let’s verify this:
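Checking the OOB estimate against the held-out test set:

```python
from sklearn.metrics import accuracy_score

# Compare the OOB estimate with accuracy on unseen test data.
y_pred = bag_oob.predict(X_test)
print(accuracy_score(y_test, y_pred))
```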

The BaggingClassifier class supports sampling the features as well. This is controlled by two hyperparameters: max_features and bootstrap_features. They work the same way as max_samples and bootstrap, but for feature sampling instead of instance sampling. Thus, each predictor will be trained on a random subset of the input features.

The random sampling of features is particularly useful for high-dimensional inputs, such as images. Randomly sampling both features and instances is called Random Patches. On the other hand, keeping all instances (bootstrap=False and max_samples=1.0) while sampling features (bootstrap_features=True and max_features smaller than 1.0) is called Random Subspaces.

A random subspaces ensemble is an extension of the bagging ensemble model, created by training each base model on a subset of the features in the training set. Although very similar to Random Forest, the random subspaces ensemble differs from it in only two aspects:

- Each model is fit on the whole training set rather than on a bootstrapped sample of instances.
- The feature subset is chosen once per model, whereas Random Forest re-samples the features at every split point within each tree.

A configuration sketch follows below.

Sampling features results in even more predictor diversity, trading a bit more bias for a lower variance.
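A Random Subspaces configuration under these settings (max_features=0.5 is an illustrative choice):

```python
# Keep every training instance; sample only the features per estimator.
subspace = BaggingClassifier(n_estimators=100,
                             bootstrap=False, max_samples=1.0,
                             bootstrap_features=True, max_features=0.5,
                             random_state=1)
scores = cross_val_score(subspace, X_train, y_train, scoring='accuracy', cv=cv)
print('Random Subspaces: %.3f (%.3f)' % (mean(scores), std(scores)))
```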

