Random Forest and Decision Tree Algorithm
A random forest is a collection of decision trees built using the bagging concept. When we move from one decision tree to the next, how is the information learned by the previous tree carried forward to the next one?
As I understand it, no trained model is created for each decision tree and then loaded before the next tree starts learning from the misclassified examples.
So how does it work?
machine-learning random-forest cart bagging
edited Nov 21 at 12:16
Peter Flom♦
asked Nov 20 at 1:55
Abhay Raj Singh
"When we move from one decision tree to the next decision tree". This suggests a linear process. We've built parallel implementations in which we trained one tree per CPU core; this works perfectly well unless you use a separate random number generator per CPU core during training and all of them share the same seed. In that case you can end up with lots of identical trees.
– MSalters
Nov 21 at 14:25
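The seeding pitfall described in this comment can be reproduced with a small standard-library sketch. Everything here is hypothetical and chosen only for illustration: the worker count, the seed value 42, and the helper `bootstrap_indices` are not taken from the comment or from any particular implementation.

```python
import random

n_samples = 100  # size of a hypothetical training set
n_workers = 4    # one tree per worker (for example, one per CPU core)

def bootstrap_indices(rng, n):
    """Draw n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]

# Pitfall: each worker creates its own generator from the same seed,
# so each worker draws exactly the same bootstrap sample.
bad = [bootstrap_indices(random.Random(42), n_samples) for _ in range(n_workers)]
all_identical = all(sample == bad[0] for sample in bad)

# One possible fix: a parent generator hands a distinct seed to each worker.
parent = random.Random(42)
worker_seeds = [parent.getrandbits(64) for _ in range(n_workers)]
good = [bootstrap_indices(random.Random(seed), n_samples) for seed in worker_seeds]
all_distinct = len({tuple(sample) for sample in good}) == n_workers

print(all_identical, all_distinct)
```

Identical bootstrap samples (and identical feature draws) lead to identical trees, so averaging them gives no benefit over a single tree. Deriving per-worker seeds from one parent seed keeps the run reproducible while giving each tree its own random stream.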
4 Answers
No information is passed between trees. In a random forest, all of the trees are iid. They are iid because every tree is grown using the same randomization strategy: first, take a bootstrap sample of the data, and then grow the tree, choosing each split from a randomly chosen subset of features. This happens for each tree individually, without regard to any other tree in the ensemble.
You might find it helpful to read an introduction to random forests from a high-quality text. One is "Random Forests" by Leo Breiman. There's also a chapter in The Elements of Statistical Learning by Hastie et al.
It's possible that you've confused random forests with boosting methods such as AdaBoost or gradient-boosted trees. Boosting methods are different, because they use information about the misfit from previous boosting rounds to inform the next boosting round.
edited Nov 20 at 21:17
answered Nov 20 at 1:59
Sycorax
2
By iid do you mean independent and identically distributed? I wasn't familiar with this abbreviation.
– nekomatic
Nov 21 at 11:52
1
@nekomatic It's safe to assume that that was the intended meaning. It's a pretty common abbrev. in statistics.
– JAD
Nov 21 at 14:00
A random forest is a collection of multiple decision trees that are trained independently of one another. So there is no notion of sequentially dependent training (as there is in boosting algorithms). As a result, as mentioned in another answer, the trees can be trained in parallel.
You might like to know where the "random" in random forest comes from: randomness is injected into the process of learning the trees in two ways. The first is the random selection of data points used for training each tree, and the second is the random selection of features used in building each tree. A single decision tree usually tends to overfit the data, so injecting randomness in this way yields a collection of trees, each of which has good accuracy (and possibly overfits) on a different subset of the available training data. Therefore, when we average the predictions made by all the trees, we observe a reduction in overfitting (compared to training a single decision tree on all the available data).
To better understand this, here is a rough sketch of the training process, assuming all the data points are stored in a set denoted by $M$ and the number of trees in the forest is $N$:
1. $i = 0$
2. Take a bootstrap sample of $M$ (i.e. sample with replacement, with the same size as $M$), denoted by $S_i$.
3. Train the $i$-th tree, denoted by $T_i$, using $S_i$ as input data.
   - The training process is the same as for an ordinary decision tree, except that at each node only a random selection of features is considered for the split at that node.
4. $i = i + 1$
5. If $i < N$, go to step 2; otherwise all the trees have been trained and random forest training is finished.
Note that I described this as a sequential algorithm, but since the trees do not depend on each other during training, you can also do this in parallel. For the prediction step, first make a prediction with every tree in the forest (i.e. $T_0$, $T_1$, ..., $T_{N-1}$) and then:
- For a regression task, take the average of the predictions as the final prediction of the random forest.
- For a classification task, use a soft voting strategy: average the probabilities predicted by the trees for each class, then declare the class with the highest average probability as the final prediction of the random forest.
Further, it is worth mentioning that it is possible to train the trees in a sequentially dependent manner; that is exactly what the gradient-boosted trees algorithm does, and it is a totally different method from random forests.
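As a rough illustration of the training loop and the soft-voting prediction described above, here is a minimal pure-Python sketch. The function names (`grow_tree`, `train_forest`, `predict`), the Gini splitting criterion, the depth limit, the square-root rule for the number of features tried per node, and the toy data are all assumptions made for this example; they are not taken from the answer or from any particular library.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow_tree(X, y, rng, n_try, depth=0, max_depth=5):
    """Grow one tree; each node considers only n_try randomly chosen features."""
    if depth == max_depth or len(set(y)) == 1:
        return Counter(y)                          # leaf: class counts
    best = None
    for f in rng.sample(range(len(X[0])), n_try):  # random feature subset per node
        for t in sorted({row[f] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:                               # chosen features allow no split
        return Counter(y)
    _, f, t, left, right = best
    kids = [grow_tree([X[i] for i in idx], [y[i] for i in idx],
                      rng, n_try, depth + 1, max_depth) for idx in (left, right)]
    return (f, t, kids[0], kids[1])

def tree_proba(node, x, classes):
    """Class probabilities from the leaf that x falls into."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    total = sum(node.values())
    return [node[c] / total for c in classes]

def train_forest(X, y, n_trees, seed=0):
    rng = random.Random(seed)
    n_try = max(1, int(len(X[0]) ** 0.5))          # assumed default: sqrt(#features)
    forest = []
    for _ in range(n_trees):                       # no tree looks at any other tree
        idx = [rng.randrange(len(X)) for _ in X]   # bootstrap sample S_i
        forest.append(grow_tree([X[j] for j in idx], [y[j] for j in idx], rng, n_try))
    return forest

def predict(forest, x, classes):
    """Soft voting: average per-class probabilities, return the most probable class."""
    per_class = zip(*(tree_proba(tree, x, classes) for tree in forest))
    avg = [sum(p) / len(forest) for p in per_class]
    return classes[avg.index(max(avg))]

# Toy data: two well-separated Gaussian clusters (hypothetical).
data_rng = random.Random(1)
X = ([[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(50)]
     + [[data_rng.gauss(4, 1), data_rng.gauss(4, 1)] for _ in range(50)])
y = [0] * 50 + [1] * 50
forest = train_forest(X, y, n_trees=25)
print(predict(forest, [0.0, 0.0], [0, 1]), predict(forest, [4.0, 4.0], [0, 1]))
```

Nothing computed for one tree is read while growing another, which is why the loop in `train_forest` could be distributed across workers without changing the algorithm.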
edited Nov 20 at 17:00
answered Nov 20 at 7:13
today
Random forest is a bagging algorithm rather than a boosting algorithm.
Random forest constructs each tree independently, using a random sample of the data, so a parallel implementation is possible.
You might like to check out gradient boosting, where trees are built sequentially and each new tree tries to correct the mistakes made by the previous ones.
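For contrast, here is a minimal sketch of the sequential dependence in gradient boosting, under simplifying assumptions: squared-error loss, one-dimensional inputs, and single-split "stumps" as the trees. The names `fit_stump` and `boost`, the learning rate, and the toy data are hypothetical and serve only to show that each round needs the output of all earlier rounds.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regression stump for 1-D inputs (squared error)."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, n_rounds=50, learning_rate=0.5):
    """Each stump is fit to the residuals left by all previous stumps."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # depends on earlier rounds
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: sum(learning_rate * s(x) for s in stumps)

# Toy regression problem (hypothetical): y = x^2 on a grid.
xs = [i / 10 for i in range(50)]
ys = [x * x for x in xs]
model = boost(xs, ys)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(mse)
```

Because `residuals` is recomputed from the current ensemble in every round, the rounds cannot be run in parallel, unlike the trees of a random forest.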
answered Nov 20 at 2:06
Siong Thye Goh
So how does it work?
A random forest is a collection of decision trees, and the trees are constructed independently: no tree receives anything from the tree built before it. Each tree is trained on a bootstrap sample of the training data (rows drawn with replacement) and on a random subset of the features (in the standard algorithm the feature subset is re-drawn at every split).
When predicting, say for classification, the input is given to every tree in the forest and each tree "votes" on the class; the label with the most votes wins.
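The construction and voting described above can be sketched in plain Python. This is only an illustration under stated assumptions: depth-1 trees (stumps) stand in for full decision trees, splits are scored by training errors rather than Gini or entropy, and the names `fit_stump`, `fit_forest`, `predict_forest` and the toy data are made up for this example, not taken from any library. Because a stump has a single split, choosing features per tree and per split coincide here.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature_ids):
    """Fit a depth-1 tree: the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:
        # No valid split (all sampled values identical): always predict the majority label.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train n_trees stumps. Each iteration uses only the original data,
    never the output or the errors of an earlier tree."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]            # bootstrap: rows with replacement
        feats = rng.sample(range(len(rows[0])), n_features)   # random feature subset
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Majority vote over the independent trees."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): class 1 has larger values on both features.
rows = [(1, 2), (2, 1), (3, 3), (4, 2), (6, 7), (7, 6), (8, 8), (9, 7)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (1, 1)), predict_forest(forest, (9, 8)))
```

Nothing is carried from one loop iteration to the next except the state of the random number generator, which is why the trees could equally well be trained in parallel or in any order.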
Why use a random forest over a single decision tree? The bias/variance trade-off. Each tree in the forest is more restricted than a single decision tree trained on everything, because it sees only a bootstrap sample and a random subset of the features. Combining many such trees generally gives a large reduction in error due to variance at the cost of a small increase in error due to bias.
answered Nov 20 at 5:23
Akavall
If we are choosing different features for every decision tree, how does the learning from one set of features in the previous decision tree improve when we send the misclassified values ahead, given that the next decision tree uses a totally new set of features?
– Abhay Raj Singh
Nov 20 at 6:50
@AbhayRajSingh - you do not "send the misclassified values ahead" in Random Forest. As Akavall says, "The trees are constructed independently"
– Henry
Nov 20 at 10:16