In this paper, we propose a two-step method to compute the Wasserstein distance in Wasserstein Generative Adversarial Networks (WGANs): 1) The convex part of our objective can be solved by linear programming; 2) The non-convex residual can be approximated by a deep neural network. We theoretically prove that the proposed formulation is equivalent to the discrete Monge-Kantorovich dual formulation. Furthermore, we give the approximation error bound of the Wasserstein distance and the error bound of generalizing the Wasserstein distance from discrete to continuous distributions. Our approach optimizes the exact Wasserstein distance, obviating the need for weight clipping previously used in WGANs. Results on synthetic data show that the our method computes the Wasserstein distance more accurately. Qualitative and quantitative results on MNIST, LSUN and CIFAR-10 datasets show that the proposed method is more efficient than state-of-the-art WGAN methods, and still produces images of comparable quality.