Paper ID: 113
Title: CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy

===== Review #1 =====

Summary of the paper (Summarize the main claims/contributions of the paper.):
The authors show how a simple trained neural network can be expressed as a low-degree polynomial so that it is amenable to the use of a homomorphic encryption scheme. The empirical results are only for MNIST.

Clarity - Justification:
The paper is very well written and easy to follow. Thank you for writing a clear paper.

Significance - Justification:
There are many papers that propose the use of fully homomorphic encryption for privacy-preserving cloud machine learning applications. I have a tendency to find these papers unconvincing. The reason is that homomorphic encryption cannot aggregate data from multiple sources (due to working with only a single key pair). Therefore, for model training it can only solve the problem of outsourcing computation, which is somewhat pointless given the heavy performance penalty. The authors here take a bit of a different spin on the problem and focus on the inference/prediction part. Here the cloud provider already has a (possibly proprietary) model and the goal is to apply it to encrypted data of the customer. This actually makes more sense to me as an application than the "outsourcing computation" story.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
This result is part of a flurry of recent papers that use fully homomorphic encryption to achieve some kind of privacy-preserving machine learning in the cloud. The application sketched out here is actually one of the more convincing ones; it's conceivable that this might actually happen in practice at some point. The scenario is that the cloud provider has a trained deep network which they want to apply to the encrypted data of a customer.
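The core idea the review summarizes, that a network whose only nonlinearity is squaring computes a low-degree polynomial in its inputs (and is therefore evaluable by a scheme that supports only encrypted additions and multiplications), can be illustrated with a minimal numpy sketch. This is not the authors' actual architecture; the layer sizes and random weights below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer "network": affine -> square -> affine, on a 3-dim input.
# Every operation is an addition or a multiplication, so the whole map
# is a polynomial in x -- here of degree 2, since there is one square layer.
W1 = rng.standard_normal((4, 3))
b1 = rng.standard_normal(4)
W2 = rng.standard_normal((1, 4))
b2 = rng.standard_normal(1)

def poly_net(x):
    h = W1 @ x + b1   # affine layer: additions and multiplications only
    h = h * h         # square activation: the lowest-degree nonlinearity
    return W2 @ h + b2

# Sanity check that poly_net is a degree-2 polynomial along any line:
# the third-order finite difference of a quadratic vanishes identically.
x = rng.standard_normal(3)
d = rng.standard_normal(3)

def f(t):
    return poly_net(x + t * d)[0]

d3 = f(3.0) - 3 * f(2.0) + 3 * f(1.0) - f(0.0)
assert abs(d3) < 1e-8
```

Each additional square layer doubles the polynomial degree (k square layers give degree 2^k), which is why the degree of the computation, and hence the cost under homomorphic encryption, is the quantity the reviews keep returning to.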
To be sure, it is not quite clear how the cloud provider got to the trained model in the first place. Indeed, if the model is going to be useful on sensitive data, it had better have been trained on sensitive data as well. The authors mention differential privacy as a solution for the training stage. However, differentially private training of neural nets is an open problem with very limited progress. At any rate, I'm okay with putting this aspect aside and focusing on the inference task only.

Short of coming up with new breakthrough results in crypto, the authors use existing homomorphic encryption schemes as a black box, as almost all papers in this area do. At this point, the name of the game is low-degree polynomials, as the complexity of encryption/decryption is determined by the degree of the computation. There is an obvious tension here between low-degree polynomials and deep learning. After all, one of the reasons people turn to deep learning in the first place is that it is able to express relationships not captured by polynomial kernel learning. For this reason, I find the emphasis on "deep learning" in this paper to be a bit questionable. MNIST is a data set on which kernels get to 99% accuracy. So, in principle, the title of the paper could've been "CryptoKernels: Applying Kernels to Encrypted...". But of course that wouldn't have been a timely thing to do. I would've been quite impressed (almost skeptical) to see this pan out on ImageNet in an accuracy regime where kernels fail.

Against this backdrop, the authors do a pretty solid job breaking the computation down and simplifying it. This is something I find interesting regardless of its crypto applications. The performance numbers are solid.

===== Review #2 =====

Summary of the paper (Summarize the main claims/contributions of the paper.):
Reports on the design and performance (throughput) of a homomorphic encryption method for making encrypted predictions of a deep neural network on encrypted inputs.
Claims this yields "high throughput" of accurate yet private prediction.

Clarity - Justification:
Describes the method sufficiently for replication. However, the presentation is weak in lacking any experimentation that clarifies how this method compares to related work, or even to non-encrypted baseline executions of the same network architecture.

Significance - Justification:
Novelty with respect to the cited previous work of Xie et al., 2014 is unclear. The authors simply note on lines 258-266 that Xie et al. approximated activation functions using low-order polynomials, whereas this new work uses the square function (the lowest-degree nonlinear polynomial). No experimental work is offered in this paper to compare these two methods and understand the tradeoffs. Further confusing the two methods is that the earlier method was called "Crypto-Nets" while the new method is referred to as "CryptoNets". In the end, it is not clear how much novelty/significance over the earlier work the current authors are claiming. The lack of comparison (either analytical or empirical) seems a serious omission that will confuse readers.

Compensating somewhat for this unclear novelty is that this paper does seem to represent the most concrete discussion to date of a relatively fast, workable method for encrypted predictions of deep networks. It would be better to demonstrate performance on more than the simple MNIST data, however, such as ImageNet, since MNIST models and data may be too simple to be representative of the performance of this type of approach on more realistic modern applications.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
A key claim of this paper is that it provides "high throughput". But "high" is necessarily a relative notion, and the authors do not provide a baseline against which the reader can judge whether this claim is true. Is it "high" because it is comparable to the performance of a non-encrypted prediction?
No, because they only achieve 51,000 predictions per hour (about 14 images/sec), whereas their simple MNIST deep net in non-encrypted form would easily predict many orders of magnitude more images per second, based on common knowledge of the performance of other deep nets (both for ImageNet and simpler ones for MNIST). Sadly, the authors do not even cite in this paper the throughput of the non-encrypted form of their network, so readers cannot clearly see this performance difference. Or is "high" claimed because it is significantly faster than previous encrypted-prediction networks? If so, why, again, do the authors not cite any such baselines for direct comparison?

===== Review #3 =====

Summary of the paper (Summarize the main claims/contributions of the paper.):
The paper proposes to make the prediction task secure in an online environment where users do not have to reveal anything about their data or the resulting predictions, while still being able to use any third-party classifier. The main component of the proposal is the idea of homomorphic encryption. The authors show that the operations in a deep neural network can be simplified to an extent such that the final operations are amenable to homomorphic encryption, thus providing a first example of leveraging the theory of homomorphic encryption to design practical and secure algorithms. The authors also provide experiments to demonstrate the feasibility of the approach through a careful evaluation of their proposal.

Clarity - Justification:
The paper is easy to read and the authors clearly explain the key contributions and ideas.

Significance - Justification:
To the best of my knowledge, preserving privacy in online prediction is a very important problem. The authors show a first proof-of-concept prototype that the operations in deep networks can be simplified to an extent such that they lie in the closure of homomorphic encryption.
This is in itself very novel and leads to an interesting direction: "Can we simplify current machine learning so that it is naturally amenable to a powerful and secure framework like homomorphic encryption?" The significance is two-fold: first, it is in no way clear that such a simplification will preserve the power of deep networks for machine learning, and the authors show that this direction is promising. Secondly, the authors show that the known computational barrier can be brought down to reasonable levels by smart design choices. This is very practical.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
See the significance section; this work opens room for interesting frontiers and is likely to lead to many follow-ups.

===== Review #4 =====

Summary of the paper (Summarize the main claims/contributions of the paper.):
This paper summarizes a system called CryptoNets, which allows a data owner to send their data in encrypted form to a cloud service that hosts the neural network. The system is developed based on Homomorphic Encryption.

Clarity - Justification:
This paper explains the core concepts in Homomorphic Encryption relatively well. I found the discussion of practical considerations also useful for understanding the substance of the work.

Significance - Justification:
Secure machine learning is very important in the cloud computing era. This paper tries to address this issue in the context of deep learning models, which is good. However, this paper is mainly an engineering work that tailors neural networks to what can currently be done under the Homomorphic Encryption framework (which is not difficult for engineers). It is not a general solution and needs to compromise the modeling power.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
I think this paper is trying to address an important problem in machine learning in the cloud computing era.
I would think the contribution of the paper could be much higher if some theoretical justification could be made to indicate that, although approximations and restrictions are made in the network structure, the missing power can be easily compensated for, for example, by using one or two additional supported operations. Having the system evaluated only on MNIST is not sufficient.

=====