

Session

Deep Learning (Adversarial) 4

Thu 12 July 7:00 - 7:20 PDT

The Mechanics of n-Player Differentiable Games

David Balduzzi · Sebastien Racaniere · James Martens · Jakob Foerster · Karl Tuyls · Thore Graepel

The cornerstone underpinning deep learning is the guarantee that gradient descent on an objective converges to local minima. Unfortunately, this guarantee fails in settings, such as generative adversarial nets, where there are multiple interacting losses. The behavior of gradient-based methods in games is not well understood -- and is becoming increasingly important as adversarial and multi-objective architectures proliferate. In this paper, we develop new techniques to understand and control the dynamics in general games. The key result is to decompose the second-order dynamics into two components. The first is related to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems. The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in general games. Basic experiments show SGA is competitive with recently proposed algorithms for finding local Nash equilibria in GANs -- whilst at the same time being applicable to -- and having guarantees in -- much more general games.
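
For intuition, here is a minimal sketch of the SGA update on a two-player zero-sum toy game, using PyTorch autograd. The bilinear game f(x, y) = x*y is the standard example where plain simultaneous gradient descent cycles around the fixed point at the origin while SGA spirals in; the step size and lambda below are illustrative choices, not values from the paper.

```python
import torch

def sga_step(x, y, lam=1.0, lr=0.05):
    # Two-player zero-sum toy game: player 1 minimises f, player 2 minimises -f
    f = x * y
    # Simultaneous gradient xi = (d f / d x, d(-f) / d y)
    gx, = torch.autograd.grad(f, x, create_graph=True)
    gy, = torch.autograd.grad(-f, y, create_graph=True)
    xi = torch.stack([gx, gy])
    # Game Hessian H_ij = d xi_i / d w_j, built explicitly (fine in 2-D)
    H = torch.zeros(2, 2)
    for i, g in enumerate((gx, gy)):
        for j, w in enumerate((x, y)):
            h, = torch.autograd.grad(g, w, retain_graph=True, allow_unused=True)
            H[i, j] = 0.0 if h is None else h
    A = 0.5 * (H - H.t())             # antisymmetric (Hamiltonian) part
    adj = xi + lam * (A.t() @ xi)     # SGA-adjusted gradient
    with torch.no_grad():
        x -= lr * adj[0]
        y -= lr * adj[1]
    return x, y

x = torch.tensor(1.0, requires_grad=True)
y = torch.tensor(1.0, requires_grad=True)
for _ in range(500):
    x, y = sga_step(x, y)             # (x, y) converges toward (0, 0)
```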

Thu 12 July 7:20 - 7:30 PDT

K-Beam Minimax: Efficient Optimization for Deep Adversarial Learning

Jihun Hamm · Yung-Kyun Noh

Minimax optimization plays a key role in adversarial training of machine learning algorithms, such as learning generative models, domain adaptation, privacy preservation, and robust learning. In this paper, we demonstrate the failure of alternating gradient descent in minimax optimization problems due to the discontinuity of solutions of the inner maximization. To address this, we propose a new $\epsilon$-subgradient descent algorithm that simultaneously tracks $K$ candidate solutions. Practically, the algorithm can find solutions that previous saddle-point algorithms cannot, with only a sublinear increase of complexity in $K$. We analyze in detail the conditions under which the algorithm converges to the true solution. A significant improvement in stability and convergence speed of the algorithm is observed in simple representative problems, GAN training, and domain-adaptation problems.
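
A minimal sketch of the K-beam idea for min_u max_v f(u, v): keep K candidate maximisers, take an ascent step on each, then descend u at whichever candidate currently attains the largest value (an epsilon-subgradient of the inner max). The toy objective and hyperparameters are illustrative, not the authors' code.

```python
import torch

def k_beam_step(u, vs, f, lr_u=0.05, lr_v=0.05):
    for v in vs:                          # ascent on every candidate v_k
        gv, = torch.autograd.grad(f(u, v), v)
        with torch.no_grad():
            v += lr_v * gv
    with torch.no_grad():                 # pick the currently best candidate
        best = max(range(len(vs)), key=lambda k: f(u, vs[k]).item())
    gu, = torch.autograd.grad(f(u, vs[best]), u)
    with torch.no_grad():                 # descent on u at the best v
        u -= lr_u * gu
    return u, vs

u = torch.tensor(1.0, requires_grad=True)
vs = [torch.tensor(v0, requires_grad=True) for v0 in (-2.0, 2.0)]
f = lambda u, v: u * v - 0.5 * v ** 2     # inner max attained at v* = u
for _ in range(200):
    u, vs = k_beam_step(u, vs, f)         # u converges toward 0, the minimax point
```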

Thu 12 July 7:30 - 7:40 PDT

First Order Generative Adversarial Networks

Calvin Seward · Thomas Unterthiner · Urs M Bergmann · Nikolay Jetchev · Sepp Hochreiter

GANs excel at learning high dimensional distributions, but they can update generator parameters in directions that do not correspond to the steepest descent direction of the objective. Prominent examples of problematic update directions include those used in both Goodfellow's original GAN and the WGAN-GP. To formally describe an optimal update direction, we introduce a theoretical framework which allows the derivation of requirements on both the divergence and corresponding method for determining an update direction, with these requirements guaranteeing unbiased mini-batch updates in the direction of steepest descent. We propose a novel divergence which approximates the Wasserstein distance while regularizing the critic's first order information. Together with an accompanying update direction, this divergence fulfills the requirements for unbiased steepest descent updates. We verify our method, the First Order GAN, with image generation on CelebA, LSUN and CIFAR-10 and set a new state of the art on the One Billion Word language generation task.
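
The key ingredient the abstract names, regularizing the critic's first-order information, can be pictured with the familiar gradient-penalty construction below. Note this is a generic WGAN-GP-style stand-in, not the paper's actual divergence or update direction, which differ in the details that make the mini-batch updates unbiased.

```python
import torch

def critic_loss(critic, real, fake, penalty_weight=10.0):
    # Wasserstein-style critic objective
    loss = critic(fake).mean() - critic(real).mean()
    # Penalise the critic's gradient norm on interpolates between real
    # and fake samples (first-order regularisation of the critic)
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)))
    x = (eps * real + (1 - eps) * fake).detach().requires_grad_(True)
    grad, = torch.autograd.grad(critic(x).sum(), x, create_graph=True)
    penalty = ((grad.flatten(1).norm(dim=1) - 1.0) ** 2).mean()
    return loss + penalty_weight * penalty
```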

Thu 12 July 7:40 - 7:50 PDT

Towards Fast Computation of Certified Robustness for ReLU Networks

Tsui-Wei Weng · Huan Zhang · Hongge Chen · Zhao Song · Cho-Jui Hsieh · Luca Daniel · Duane Boning · Inderjit Dhillon

Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound on the minimum distortion is possible. Currently available methods for computing such a bound are either time-consuming or deliver bounds too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient algorithms (Fast-Lin, Fast-Lip) that certify non-trivial lower bounds on minimum adversarial distortions. Experiments show that (1) our methods deliver bounds close to the exact minimum distortions found by Reluplex on small networks (the gap is 2-3x), while our algorithms are more than 10,000 times faster; (2) our methods deliver bounds of similar quality (the gap is within 35\% and usually around 10\%; sometimes our bounds are even better) on larger networks compared to methods based on solving linear programming problems, while our algorithms are 33-14,000 times faster; (3) our method can handle large MNIST and CIFAR networks with up to 7 layers and more than 10,000 neurons within tens of seconds on a single CPU core. In addition, we show that there is no polynomial-time algorithm that can approximately find the minimum $\ell_1$ adversarial distortion of a ReLU network with a $0.99\ln n$ approximation ratio unless NP=P, where $n$ is the number of neurons in the network.
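
The certification logic can be pictured with a much cruder sketch: interval bound propagation for a fully connected ReLU network under an l_inf perturbation of radius eps. Interval arithmetic yields valid but loose bounds; Fast-Lin replaces it with tighter linear upper and lower bounds on each ReLU (and Fast-Lip bounds the local Lipschitz constant), but the certification check is the same in spirit.

```python
import numpy as np

def interval_bounds(weights, biases, x, eps):
    # Propagate element-wise lower/upper bounds layer by layer
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(weights, biases)):
        Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
        lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
        if i < len(weights) - 1:          # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi                         # element-wise logit bounds

def certified_at(weights, biases, x, eps, label):
    lo, hi = interval_bounds(weights, biases, x, eps)
    rivals = np.delete(hi, label)
    # Robust at radius eps if the true logit's lower bound beats every
    # other logit's upper bound
    return lo[label] > rivals.max()
```

A binary search over eps on certified_at(...) then yields a certified lower bound on the minimum adversarial distortion.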

Thu 12 July 7:50 - 8:00 PDT

LaVAN: Localized and Visible Adversarial Noise

Danny Karmon · Daniel Zoran · Yoav Goldberg

Most works on adversarial examples for deep-learning-based image classifiers use noise that, while small, covers the entire image. We explore the case where the noise is allowed to be visible but confined to a small, localized patch of the image, without covering any of the main object(s) in the image. We show that it is possible to generate localized adversarial noise that covers only 2% of the pixels in the image, none of them over the main object, that is transferable across images and locations, and that successfully fools a state-of-the-art Inception v3 model with very high success rates.
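
A minimal sketch of how such a localized patch can be optimised by gradient ascent. The classifier `model`, the patch location `(y0, x0)`, the patch size, and the target class are assumptions for illustration; the paper's actual loss additionally suppresses the network's original prediction.

```python
import torch

def train_patch(model, image, target, y0, x0, size=50, steps=200, lr=0.05):
    # image: (1, 3, H, W) tensor in [0, 1]; patch occupies a small window
    patch = torch.rand(1, 3, size, size, requires_grad=True)
    for _ in range(steps):
        adv = image.clone()
        adv[:, :, y0:y0 + size, x0:x0 + size] = patch.clamp(0.0, 1.0)
        # Maximise the target-class logit (the paper's loss also pushes
        # down the source class)
        loss = -model(adv)[0, target]
        loss.backward()
        with torch.no_grad():
            patch -= lr * patch.grad
            patch.grad.zero_()
    return patch.detach().clamp(0.0, 1.0)
```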