Session
Optimization (Combinatorial) 2
Learning to Optimize Combinatorial Functions
Nir Rosenfeld · Eric Balkanski · Amir Globerson · Yaron Singer
Submodular functions have become a ubiquitous tool in machine learning. They are learnable from data, and can be optimized efficiently and with guarantees. Nonetheless, recent negative results show that optimizing learned surrogates of submodular functions can result in arbitrarily bad approximations of the true optimum. Our goal in this paper is to highlight the source of this hardness, and propose an alternative criterion for optimizing general combinatorial functions from sampled data. We prove a tight equivalence showing that a class of functions is optimizable if and only if it can be learned. We provide efficient and scalable optimization algorithms for several function classes of interest, and demonstrate their utility on the task of optimally choosing trending social media items.
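A minimal sketch of the phenomenon this abstract alludes to, not the paper's method: greedily maximizing a true coverage function versus a modular surrogate fitted from random samples. All names (coverage, surrogate, greedy), the sampling setup, and the sizes are illustrative assumptions.

```python
import random

random.seed(0)
universe = range(50)
# ground set of 20 items, each covering a random subset of the universe
items = {i: set(random.sample(universe, 8)) for i in range(20)}

def coverage(S):                      # true submodular (coverage) function
    return len(set().union(*(items[i] for i in S))) if S else 0

# fit a modular surrogate from random samples: each item's weight is its
# average marginal contribution within the sampled sets that contain it
samples = [set(random.sample(list(items), 5)) for _ in range(200)]
weights = {i: sum(coverage(S) - coverage(S - {i}) for S in samples if i in S)
              / max(1, sum(1 for S in samples if i in S))
           for i in items}
surrogate = lambda S: sum(weights[i] for i in S)

def greedy(f, k=5):                   # standard greedy maximization of f
    S = set()
    for _ in range(k):
        S.add(max(set(items) - S, key=lambda i: f(S | {i}) - f(S)))
    return S

print("greedy on true function   :", coverage(greedy(coverage)))
print("greedy on learned surrogate:", coverage(greedy(surrogate)))
```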
Proportional Allocation: Simple, Distributed, and Diverse Matching with High Entropy
Shipra Agrawal · Morteza Zadimoghaddam · Vahab Mirrokni
Inspired by many applications of bipartite matching in online advertising and machine learning, we study a simple and natural iterative proportional allocation algorithm: maintain a priority score $p_a$ for each node $a \in A$ on one side of the bipartition, initialized as $p_a = 1$. Iteratively allocate the nodes $i \in I$ on the other side to eligible nodes in $A$ in proportion to their priority scores. After each round, for each node $a \in A$, decrease or increase the score $p_a$ based on whether it is over- or under-allocated. Our first result is that this simple, distributed algorithm converges to a $(1-\epsilon)$-approximate fractional $b$-matching solution in $O(\frac{\log n}{\epsilon^2})$ rounds. We also extend the proportional allocation algorithm and convergence results to the maximum weighted matching problem, and show that the algorithm can be naturally tuned to produce a maximum matching with high entropy. High entropy, in turn, implies additional desirable properties of this matching; e.g., it satisfies certain diversity and fairness (aka anonymity) properties that are desirable in a variety of applications in online advertising and machine learning.
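A minimal sketch of the proportional allocation loop described above, assuming a capped multiplicative update of the priority scores; the paper's exact update rule, eligibility model, and stopping criterion may differ, and all names and numbers here are illustrative.

```python
import random

random.seed(1)
A = range(4)                                   # advertisers, each with capacity b_a
b = {a: 3.0 for a in A}
I = range(12)                                  # impressions, each eligible for a few advertisers
eligible = {i: random.sample(list(A), k=random.randint(1, 3)) for i in I}

p = {a: 1.0 for a in A}                        # priority scores, initialized to 1
for _ in range(200):                           # the paper needs O(log n / eps^2) rounds
    # allocate each impression to its eligible advertisers in proportion to p_a
    load = {a: 0.0 for a in A}
    for i in I:
        total = sum(p[a] for a in eligible[i])
        for a in eligible[i]:
            load[a] += p[a] / total
    # decrease/increase p_a depending on over-/under-allocation (assumed capped rule)
    for a in A:
        p[a] *= min(2.0, max(0.5, b[a] / max(load[a], 1e-9)))

print({a: round(load[a], 2) for a in A})       # near-feasible fractional b-matching loads
```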
Binary Partitions with Approximate Minimum Impurity
Eduardo Laber · Marco Molinaro · Felipe de A. Mello Pereira
The problem of splitting attributes is one of the main steps in the construction of decision trees. In order to decide the best split, impurity measures such as Entropy and Gini are widely used. In practice, decision-tree inducers use heuristics to find splits with small impurity when they consider nominal attributes with a large number of distinct values. However, there are no known guarantees for the quality of the splits obtained by these heuristics. To fill this gap, we propose two new splitting procedures that provably achieve near-optimal impurity. We also report experiments providing evidence that the proposed methods are interesting candidates for splitting nominal attributes with many values during decision tree/random forest induction.
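A small illustrative sketch, not the paper's two procedures: the weighted Gini impurity of a binary partition of a nominal attribute's values, together with the classic heuristic of ordering values by their class-1 proportion and scanning all prefix splits (exact for two classes, a heuristic beyond that). The data and names are made up.

```python
from collections import Counter

def gini(counts):
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values()) if n else 0.0

def partition_impurity(groups, left_values):
    """Weighted Gini impurity of sending `left_values` to one side and the
    remaining attribute values to the other; `groups` maps value -> Counter."""
    left, right = Counter(), Counter()
    for v, cnt in groups.items():
        (left if v in left_values else right).update(cnt)
    n = sum(left.values()) + sum(right.values())
    return (sum(left.values()) * gini(left) + sum(right.values()) * gini(right)) / n

# a nominal attribute with five values and binary class labels 0/1
groups = {"a": Counter({0: 8, 1: 2}), "b": Counter({0: 5, 1: 5}),
          "c": Counter({0: 1, 1: 9}), "d": Counter({0: 7, 1: 3}),
          "e": Counter({0: 2, 1: 8})}

# order values by their class-1 proportion and try every prefix as one side
order = sorted(groups, key=lambda v: groups[v][1] / sum(groups[v].values()))
best = min(((partition_impurity(groups, set(order[:k])), order[:k])
            for k in range(1, len(order))),
           key=lambda t: t[0])
print("best split:", best[1], "vs rest, impurity:", round(best[0], 3))
```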
Partial Optimality and Fast Lower Bounds for Weighted Correlation Clustering
Jan-Hendrik Lange · Andreas Karrenbauer · Bjoern Andres
Weighted correlation clustering is hard to solve and hard to approximate for general graphs. Its applications in network analysis and computer vision call for efficient algorithms. To this end, we make three contributions: We establish partial optimality conditions that can be checked efficiently; checking them recursively solves the problem for series-parallel graphs to optimality, in linear time. We exploit the packing dual of the problem to compute a heuristic but non-trivial lower bound faster than that of a canonical linear program relaxation. We introduce a re-weighting with the dual solution by which efficient local search algorithms converge to better feasible solutions. The effectiveness of our methods is demonstrated empirically on a number of benchmark instances.
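A minimal sketch of the problem setting, not the paper's contributions: the disagreement cost of a weighted correlation clustering and a plain greedy node-moving local search of the kind the re-weighting is meant to improve. The partial-optimality checks and packing-dual lower bound are not reproduced; the edge weights are made up.

```python
import random

random.seed(2)
n = 10
edges = {(u, v): random.uniform(-1, 1) for u in range(n) for v in range(u + 1, n)}

def cost(labels):
    """Sum of positive weights that are cut plus negative weights that are joined."""
    c = 0.0
    for (u, v), w in edges.items():
        if labels[u] != labels[v] and w > 0:
            c += w
        if labels[u] == labels[v] and w < 0:
            c += -w
    return c

labels = list(range(n))                              # start from singleton clusters
improved = True
while improved:
    improved = False
    for u in range(n):
        for target in set(labels) | {max(labels) + 1}:   # try moving u to any cluster
            trial = labels[:]
            trial[u] = target
            if cost(trial) < cost(labels) - 1e-12:
                labels, improved = trial, True

print("cost:", round(cost(labels), 3), "clusters:", len(set(labels)))
```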
Bounds on the Approximation Power of Feedforward Neural Networks
Mohammad Mehrabi · Aslan Tchamkerten · Mansoor I. Yousefi
The approximation power of general feedforward neural networks with piecewise linear activation functions is investigated. First, lower bounds on the size of a network are established in terms of the approximation error and network depth and width. These bounds improve upon state-of-the-art bounds for certain classes of functions, such as strongly convex functions. Second, an upper bound is established on the difference of two neural networks with identical weights but different activation functions.
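A small numerical sketch of the setting of the second result, not the bound itself: two feedforward networks sharing identical, randomly drawn weights but using different piecewise linear activations (ReLU vs. leaky ReLU), and the largest output difference observed over sample inputs. The architecture, widths, and input range are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
widths = [1, 16, 16, 1]                              # depth-3 network on scalar inputs
weights = [rng.standard_normal((m, n)) for m, n in zip(widths[1:], widths[:-1])]
biases = [rng.standard_normal(m) for m in widths[1:]]

def forward(x, act):
    """Feedforward pass with the given piecewise linear activation."""
    h = np.atleast_1d(x)
    for W, b in zip(weights[:-1], biases[:-1]):
        h = act(W @ h + b)
    return weights[-1] @ h + biases[-1]              # linear output layer

relu = lambda z: np.maximum(z, 0.0)
leaky = lambda z: np.where(z > 0, z, 0.01 * z)

xs = np.linspace(-3, 3, 1001)
gap = max(abs((forward(x, relu) - forward(x, leaky)).item()) for x in xs)
print("max observed output difference:", gap)
```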