Poster
Optimal Decision Tree Pruning Revisited: Algorithms and Complexity
Juha Harviainen · Frank Sommer · Manuel Sorge · Stefan Szeider
West Exhibition Hall B2-B3 #W-1015
Decision trees classify objects by asking a series of simple questions. The sequence of questions is determined by the answers to previous questions, creating a model with a hierarchical structure that is used to represent and organize data in the form of parent–child relationship. Traditional methods for building decision trees can result in an overly complicated structure, necessitating simplification by not asking some of the less relevant questions.However, identifying which questions to omit without compromising performance is challenging. Therefore, one typically performs educated guesses about which of them could be left out, lacking guarantees on whether the excluded questions were the best ones possible. We study which properties of decision trees can be exploited to achieve such guarantees. That is, we develop algorithms that determine the least relevant questions and are efficient if the initial tree has certain properties, or show that no efficient algorithms exist.Based on our discoveries, we create a proof-of-concept implementation that helps us to assess how well traditional methods perform on this problem when excluding a fixed number of questions. Surprisingly, we discover that they are nearly optimal despite the inherent hardness of the problem, providing new insight on the robustness of conventional methods.
Live content is unavailable. Log in and register to view live content