Paper ID: 438
Title: Boolean Matrix Factorization and Noisy Completion via Message Passing

===== Review #1 =====

Summary of the paper (Summarize the main claims/contributions of the paper.):
The authors suggest and test a message passing algorithm for Boolean matrix factorization. The performance of the algorithm is promising. Since the algorithm potentially applies to a large class of problems that are not easily tractable with other methods, this work is interesting to the ICML community.

Clarity - Justification:
I find the paper rather clear.

Significance - Justification:
I find the proposed algorithm interesting. It derives from a Bayesian formulation of the problem, which makes the algorithm rather flexible. Its empirical performance is promising, given how hard matrix factorization is.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
** In the first paragraph of Section 3 the authors refer to other works on matrix factorization that suggest message passing algorithms (I would add to these Matsushita, R., & Tanaka, T. (2013). Low-rank matrix reconstruction and clustering via approximate message passing). In fact, the cited algorithms are written for a class of observation channels that includes both (1a) and (1b), and for general priors on the elements of X and Y, including Boolean. I would expect the authors to discuss in more detail the relation between their algorithm and those from the cited references for Boolean variables and Z given by (1a) or (1b).
** Why do the authors compute the MAP of the posterior instead of computing the marginals? A version of the algorithm for marginals appears in the appendix, but what motivates their choice of one over the other? They give an answer in the appendix, but I am not satisfied with it. If there is a strong ground-truth factorization, the marginals should be biased towards it and no decimation should be needed.
On the other hand, if the marginals are not informative, wouldn't this suggest there is actually no significant "signal" in the data?
** It should be clearer in the introduction that the authors do not provide or test the algorithm for the XOR problem (1b), only for (1a).

Minor comments:
** "Y is known to be l-sparse which makes it possible to use approximate message passing". This is incorrect. AMP does not require sparsity; it is more general.
** The authors say on line 422: "Near optimal behavior of max-sum BP in dense factor-graph is not without precedence" ... but some of the references they give are sum-product, not max-sum.
** "we derive message passing solutions to a novel graphical model for ..." What is novel in this graphical model? As the authors state, there are other works that treated matrix factorization as a graphical model.
** Spell-check better, e.g. "incrmental".
** Use symbols readable in black-and-white in Fig. 4.

===== Review #2 =====

Summary of the paper (Summarize the main claims/contributions of the paper.):
The paper encodes the Boolean matrix factorization and completion problem as a graphical model and derives a special-purpose version of iterative belief propagation to solve the corresponding MAP inference problem.

Clarity - Justification:
The paper is rather easy to follow. The discussion of related work and some of the technical parts use too much technical jargon but are otherwise fine. The experimental methodology is non-trivial but rather well described.

Significance - Justification:
The technical content (encoding matrix factorization in a graphical model and running BP) is not rocket science, and is very similar to other matrix factorization algorithms. That being said, the paper does provide a very useful and significant advance in bridging the Boolean factorization world with the graphical model and coding theory world. Boolean matrix factorization was until now the domain of data miners who used symbolic techniques.
The proposed solver is significant in that it uses very different, statistical techniques and performs very well. Just making this connection between symbolic and statistical methods is a significant contribution to both sides.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
I recommend accepting this paper, because of the significant connections made and the solid empirical validation of the proposed algorithm. The experimental methodology is sound, and it compares against the right baselines (ASSO in particular). There are not many baselines, simply because there are not many BMF solvers out there. This fact only increases the importance of this work.

The authors may be interested in the following duality. As they show, one can speed up BMF by using graphical model inference. Conversely, one can speed up graphical model inference by using BMF; see Van den Broeck, G., and Darwiche, A. (2013). On the complexity and approximation of binary evidence in lifted inference. In Advances in Neural Information Processing Systems (pp. 2868-2876).

Typo line 190: incrmental
Typo line 284: the a

===== Review #3 =====

Summary of the paper (Summarize the main claims/contributions of the paper.):
The authors describe a graphical model and simplified MAP message-passing scheme for noisy binary matrix factorization. The proposed MAP inference approach is validated experimentally against other algorithms for binary matrix factorization/completion (including 1-Bit completion and GLRM).

Clarity - Justification:
The paper was easy to read, modulo a few typos here and there and some equation formatting issues. One note: the authors should probably introduce their notation for MRFs (i.e., write the probability distribution as a product of phi potential functions), as some readers less familiar with this topic may find it confusing.
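The standard factored MRF notation the reviewer recommends would read, in generic form (a sketch of the usual convention, not taken from the paper under review):

```latex
p(x_1,\dots,x_n) \;=\; \frac{1}{Z}\,\prod_{a}\phi_a\!\left(x_{\partial a}\right),
\qquad
Z \;=\; \sum_{x}\prod_{a}\phi_a\!\left(x_{\partial a}\right),
```

where each potential function \(\phi_a\) depends only on the subset of variables \(x_{\partial a}\) in its scope and \(Z\) is the normalizing constant.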
Significance - Justification:
The fact that the message passing can be implemented in linear time makes it an attractive alternative to other, more sophisticated optimization strategies for this problem. Overall, there are a number of other Bayesian models for matrix factorization that the authors should consider citing, for example: A. T. Cemgil. Bayesian inference for nonnegative matrix factorisation models. Computational Intelligence and Neuroscience, 2009. The authors should also expand the discussion of the hardness of the general problem with appropriate citations.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
This is an application paper that applies variational methods and BP to perform MAP inference on a constructed graphical model. The experiments suggest that this approach is a viable competitor in terms of speed and reconstruction accuracy (for matrix completion problems). Even though the convergence of BP appears not to be an issue, the authors should consider convergent alternatives to BP for the MAP inference task (such as TRW, MPLP, or max-sum diffusion), as they come with additional performance guarantees and often work well for MAP inference tasks. It would be interesting to see the same experimental plots with these convergent variants.
=====
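For context on the problem all three reviews evaluate: under the noiseless OR channel (1a), the observed matrix is the Boolean (OR-AND) product of the two binary factors. A minimal sketch of that product, using hypothetical small matrices of my own choosing (not the authors' code):

```python
import numpy as np

def boolean_product(X, Y):
    """Boolean (OR-AND) matrix product: Z[i, j] = OR_k (X[i, k] AND Y[k, j])."""
    # Ordinary integer matmul counts matching (i, k)/(k, j) pairs;
    # thresholding at > 0 turns the sum into a logical OR.
    return (X.astype(int) @ Y.astype(int) > 0).astype(int)

# Hypothetical rank-2 Boolean factors for illustration.
X = np.array([[1, 0],
              [0, 1],
              [1, 1]])
Y = np.array([[1, 1, 0],
              [0, 1, 1]])

Z = boolean_product(X, Y)
print(Z)
# [[1 1 0]
#  [0 1 1]
#  [1 1 1]]
```

A factorization algorithm such as the one reviewed here would be handed a (possibly noisy, possibly incomplete) Z and asked to recover low-rank Boolean X and Y.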