Discussion about this post

Vasco Gameiro

Hey, thank you for the great post! One question I had while reading this:

In the two-step process where you first extract an embedding from the decision tree’s geometry and then fit a logistic regression on top of it, aren’t we increasing the risk of overfitting by effectively using information about the target in the feature construction?

From how I see it, the embedding is derived after training the tree on labels, so the features passed to the logistic regression already contain target-informed structure. Doesn’t this leak label information into the features and make overfitting more likely?

Or do you think that the randomization/ensemble nature of the forests plus regularization on the logistic regression (e.g., ℓ₂) largely mitigates that risk?

raghuram kowdeed

I have a question. As I understand it, the steps are: create the stretch matrix, transform the features by multiplying them with it, and then fit a model per leaf, right? (Expanding the features with zeros for all the other leaves is the same as fitting a model per leaf, i.e., interacting beta with a discrete leaf variable.) My question: since each leaf's model is fit on a linear transformation of the features (the stretch matrix), wouldn't that be the same as just fitting a model per leaf on the features without any transform?
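One way to probe this question numerically is a small sketch under assumptions that the post does not confirm: the per-leaf stretch matrix is taken to be invertible, and the per-leaf model is taken to be plain least squares. The matrix `S`, the synthetic data, and the helper names below are all hypothetical. In that setting an unregularized fit gives the same predictions with or without the transform, while an ℓ₂-penalized fit generally does not, so any difference would have to come from the regularization or from a non-invertible transform.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3

# Synthetic features and targets for the points that fall in a single leaf.
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=n)

# Hypothetical invertible "stretch" matrix for this leaf (an assumption).
S = np.diag([2.0, 0.5, 3.0])
Z = X @ S  # transformed features


def ols_predict(A, y):
    """In-sample predictions from unregularized least squares."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ beta


def ridge_predict(A, y, lam):
    """In-sample predictions from l2-penalized least squares (no intercept)."""
    k = A.shape[1]
    beta = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ y)
    return A @ beta


# Largest difference between predictions on raw vs. transformed features.
ols_gap = np.max(np.abs(ols_predict(X, y) - ols_predict(Z, y)))
ridge_gap = np.max(np.abs(ridge_predict(X, y, 10.0) - ridge_predict(Z, y, 10.0)))

print(f"OLS gap:   {ols_gap:.2e}")    # numerically zero
print(f"ridge gap: {ridge_gap:.2e}")  # clearly non-zero
```

The unregularized case is invariant because the column space of `X @ S` equals that of `X` when `S` is invertible; the penalty breaks this because it shrinks coefficients in the transformed coordinates, not the original ones.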
