Decision Trees with Minimal Costs
Charles X. Ling - The University of Western Ontario
Qiang Yang - The Hong Kong University of Science & Technology
Jianning Wang - The University of Western Ontario
Shichao Zhang - Guangxi Normal University
We propose a simple, novel and yet effective method for building and testing decision trees that minimizes the sum of the misclassification and test costs. More specifically, we first put forward an original and simple splitting criterion for attribute selection in tree building. Our tree-building algorithm has many desirable properties for a cost-sensitive learning system that must account for both types of costs. Then, assuming that the test cases may have a large number of missing values, we design several intelligent test strategies that can suggest ways of obtaining the missing values at a cost in order to minimize the total cost. We experimentally compare these strategies and C4.5, and demonstrate that our new algorithms significantly outperform C4.5 and its variations. In addition, our algorithm's complexity is similar to that of C4.5, and is much lower than that of previous work. Our work is useful for many diagnostic tasks which must factor in the misclassification and test costs for obtaining missing information.
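To make the idea of a cost-based splitting criterion concrete, the following is a minimal sketch and not the exact criterion defined in the paper. It assumes binary labels (1 = positive, 0 = negative), one fixed test cost per attribute charged to every example at the node, and a leaf that predicts whichever class is cheaper under the given false-positive and false-negative costs. The function names, the dictionary-based row format, and the cost values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def leaf_cost(labels, fp_cost, fn_cost):
    """Cheapest misclassification cost if the node predicts a single class."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    # Predicting positive turns every negative into a false positive,
    # predicting negative turns every positive into a false negative.
    return min(neg * fp_cost, pos * fn_cost)

def split_cost(rows, labels, attr, test_cost, fp_cost, fn_cost):
    """Total cost after testing `attr`: test fees plus the children's leaf costs."""
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[row[attr]].append(y)
    misclassification = sum(leaf_cost(g, fp_cost, fn_cost) for g in groups.values())
    # Assumption: every example at this node pays for the test.
    return test_cost * len(rows) + misclassification

def best_attribute(rows, labels, test_costs, fp_cost, fn_cost):
    """Return (attribute, cost reduction), or (None, 0) when no split pays off."""
    base = leaf_cost(labels, fp_cost, fn_cost)
    best, best_gain = None, 0
    for attr, cost in test_costs.items():
        gain = base - split_cost(rows, labels, attr, cost, fp_cost, fn_cost)
        if gain > best_gain:
            best, best_gain = attr, gain
    return best, best_gain

# Hypothetical toy data: attribute "a" separates the classes, "b" does not.
rows = [{"a": 0, "b": 0}, {"a": 0, "b": 1}, {"a": 1, "b": 0}, {"a": 1, "b": 1}]
labels = [0, 0, 1, 1]
choice = best_attribute(rows, labels, {"a": 10, "b": 5}, fp_cost=100, fn_cost=200)
```

Under these assumptions, a split is accepted only when the misclassification cost it saves exceeds what the test itself costs, so raising the price of an informative test far enough makes the node stay a leaf.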