Learning for Efficient Retrieval of Structured Data with Noisy Queries
Charles Parker - Oregon State University, United States
Alan Fern - Oregon State University, United States
Prasad Tadepalli - Oregon State University, United States
Increasingly large collections of structured data call for retrieval tools that are both efficient and tolerant of noise. In this work, we address this problem and describe an approach to learning a similarity function that is not only accurate but also increases the effectiveness of retrieval data structures. We present an algorithm that uses functional gradient boosting to maximize both retrieval accuracy and the retrieval efficiency of vantage point trees. We demonstrate the effectiveness of our approach on two datasets, including a moderately sized real-world dataset of folk music.
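To make the role of the retrieval data structure concrete, the following is a minimal sketch of a vantage point tree with nearest-neighbor search. It is an illustration under stated assumptions, not the implementation evaluated in this work: Euclidean distance stands in for the learned similarity function, the pruning rule assumes the distance obeys the triangle inequality, and all names (`VPNode`, `build`, `nearest`, `counter`) are hypothetical. The optional counter records how many distance evaluations a query needs, which is the kind of cost that retrieval efficiency refers to here.

```python
import math
import random


def euclidean(a, b):
    # Stand-in metric; the paper's learned similarity function is NOT reproduced here.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # the vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance <= radius
        self.outside = outside  # subtree of points with distance > radius


def build(points, dist, rng):
    """Recursively build a vantage point tree over `points`."""
    if not points:
        return None
    points = list(points)
    vp = points.pop(rng.randrange(len(points)))  # random vantage point
    if not points:
        return VPNode(vp, 0.0, None, None)
    dists = sorted(dist(vp, p) for p in points)
    radius = dists[len(dists) // 2]
    inside = [p for p in points if dist(vp, p) <= radius]
    outside = [p for p in points if dist(vp, p) > radius]
    return VPNode(vp, radius, build(inside, dist, rng), build(outside, dist, rng))


def nearest(node, query, dist, best=None, counter=None):
    """Return (distance, point) for the nearest neighbor of `query`.

    If `counter` is a one-element list, counter[0] is incremented once
    per distance evaluation.
    """
    if node is None:
        return best
    d = dist(query, node.point)
    if counter is not None:
        counter[0] += 1
    if best is None or d < best[0]:
        best = (d, node.point)
    if d <= node.radius:
        # Query lies inside the ball: search inside first.
        best = nearest(node.inside, query, dist, best, counter)
        # Visit the outside only if the current best ball crosses the boundary.
        if d + best[0] > node.radius:
            best = nearest(node.outside, query, dist, best, counter)
    else:
        best = nearest(node.outside, query, dist, best, counter)
        if d - best[0] <= node.radius:
            best = nearest(node.inside, query, dist, best, counter)
    return best
```

A tree built with `build(points, euclidean, random.Random(0))` can be queried with `nearest(tree, query, euclidean, counter=[0])`. How many subtrees the search can skip depends on how the distance function spreads the data around each vantage point, which is why the choice of similarity function affects retrieval cost as well as accuracy.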