Skip to yearly menu bar Skip to main content


The Catalog Problem: Clustering and Ordering Variable-Sized Sets

Mateusz Jurewicz · Graham Taylor · Leon Derczynski

Exhibit Hall 1 #803
[ ]
[ PDF [ Poster

Abstract: Prediction of a $\textbf{varying number}$ of $\textbf{ordered clusters}$ from sets of $\textbf{any cardinality}$ is a challenging task for neural networks, combining elements of set representation, clustering and learning to order. This task arises in many diverse areas, ranging from medical triage and early discharge, through machine part management and multi-channel signal analysis for petroleum exploration to product catalog structure prediction. This paper focuses on that last area, which exemplifies a number of challenges inherent to adaptive ordered clustering, referred to further as the eponymous $\textit{Catalog Problem}$. These include learning variable cluster constraints, exhibiting relational reasoning and managing combinatorial complexity. Despite progress in both neural clustering and set-to-sequence methods, no joint, fully differentiable model exists to-date. We develop such a modular architecture, referred to further as Neural Ordered Clusters (NOC), enhance it with a specific mechanism for learning cluster-level cardinality constraints, and provide a robust comparison of its performance in relation to alternative models. We test our method on three datasets, including synthetic catalog structures and PROCAT, a dataset of real-world catalogs consisting of over 1.5M products, achieving state-of-the-art results on a new, more challenging formulation of the underlying problem, which has not been addressed before. Additionally, we examine the network's ability to learn higher-order interactions.

Chat is not available.