Poster in Workshop: HiLD: High-dimensional Learning Dynamics Workshop
An improved residual based random forest for robust prediction
Mingyan Li
The random forest (RF) model, introduced by Breiman (2001), is robust on data with a low signal-to-noise ratio and is very unlikely to overfit; however, RF can deteriorate when mean-shift contamination is present in the training set. This paper introduces a residual-based solution, the penalized weighted random forest (PWRF) method, which modifies the RF model to improve robustness against systematic or trend contamination. The method controls the impact of contamination in the training set based on the squared residual of each training observation, which provides the flexibility to deal with different types of data. The experiment suggests that PWRF provides competitive, if not better, results than two robust RF methods and the original RF method.
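The idea of weighting training observations by their squared residuals can be illustrated with a small toy sketch. This is not the PWRF formulation from the paper, which the abstract does not specify. The weighting rule, the `penalty` parameter, the function names, and the use of a median as a stand-in for a first-pass RF fit are all assumptions made for illustration only.

```python
from statistics import median


def residual_weights(residuals, penalty):
    """Map each residual to a weight in (0, 1] based on its squared size.

    Hypothetical rule (an assumption, not taken from the paper):
    observations whose squared residual is at most `penalty` keep full
    weight; larger ones are shrunk in proportion to penalty / r**2.
    """
    weights = []
    for r in residuals:
        sq = r * r
        weights.append(1.0 if sq <= penalty else penalty / sq)
    return weights


def weighted_mean(values, weights):
    """Weighted average, standing in for a weighted leaf prediction."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy responses: five clean values near 1.0 plus two mean-shifted ones.
y = [0.9, 1.0, 1.1, 1.0, 0.95, 6.0, 6.2]

# Stand-in for a first-pass fit (a real pipeline would use RF predictions).
initial_fit = median(y)
residuals = [v - initial_fit for v in y]

weights = residual_weights(residuals, penalty=0.25)
plain_fit = sum(y) / len(y)            # pulled toward the contaminated points
robust_fit = weighted_mean(y, weights)  # contaminated points are down-weighted
```

In this toy setting the contaminated observations receive weights close to zero, so the weighted estimate stays near the clean values while the plain mean is pulled upward. With a real forest, weights of this kind could be supplied through a per-sample weighting mechanism such as the `sample_weight` argument of scikit-learn's `RandomForestRegressor.fit`; whether PWRF applies its weights this way is not stated in the abstract.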