Paper ID: 45
Title: Hawkes Processes with Stochastic Excitations

Review #1
=====
Summary of the paper (Summarize the main claims/contributions of the paper.): This paper extends the standard Hawkes process to account for self-excitation that, instead of being constant, varies over time according to a stochastic differential equation (SDE). The authors focus on two examples of SDEs, Geometric Brownian Motion and the Exponential Langevin, and derive an inference algorithm based on MCMC techniques. The paper is in general well written and clear, but I have two main concerns: i) the scalability of the proposed inference algorithm; and ii) the lack of experimental validation, which would help the reader understand the relevance and advantages of the proposed Stochastic Hawkes process.

Clarity - Justification: The paper is well written and easy to follow. There are only a few minor typos/details, which I point out in the detailed comments.

Significance - Justification: My main concern about the paper is the relevance and utility of the proposed Stochastic Hawkes process and its inference algorithm in real applications. There is no experimental evidence of the mixing properties (e.g., the average acceptance rate of the MH steps used to infer the latent self-excitations Y_i and the hyperparameters a, lambda_0 and delta) or of the scalability of the proposed inference algorithm. Moreover, based on Figure 2 and as the authors point out, it seems very important to know the true dynamics of the self-excitation (the SDE that Y follows) in order to estimate the self-excitation accurately, which can be too restrictive in real applications.
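
To make the request above concrete, here is a minimal sketch of the kind of acceptance-rate diagnostic meant. It is not the paper's sampler: the target is a toy standard normal, and the function name `rw_metropolis` and its parameters are hypothetical choices for illustration only.

```python
import math
import random


def rw_metropolis(log_target, x0, step, n_iter, seed=0):
    """Random-walk Metropolis on a 1-D target; returns (samples, acceptance rate).

    Illustrative only: log_target stands in for the (unnormalised) log
    posterior of one latent quantity, not the paper's actual posterior.
    """
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)      # symmetric Gaussian proposal
        lp_prop = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            x, lp = proposal, lp_prop
            accepted += 1
        samples.append(x)
    return samples, accepted / n_iter


# Toy usage: standard normal target, assumed step size of 1.0.
samples, rate = rw_metropolis(lambda x: -0.5 * x * x, 0.0, 1.0, 2000)
print(f"average acceptance rate: {rate:.2f}")
```

Reporting this rate for each MH block, alongside trace plots, would already give the reader a rough sense of how well the chain mixes.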

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
- I believe that there is a typo in Section 2.3: according to the branching process, an immigrant event is one generated by the base intensity lambda_0, but the paper writes it as a+Y_0 exp{- delta t}.
- I find the notation for the latent variable of the branching process a bit confusing. For example, according to Section 2.3, Z_ij is a binary variable, but at the same time Z_i=Z_ij. Does Z_i therefore really capture the branch from which the event at T_i originates?
- Finally, let me simply remark that several papers in the literature have already proposed schemes that, in contrast to Ogata's algorithm, sample from a Hawkes process with complexity linear in the number of events. See for example (Valera, ICDM'2015) or (Farajtabar, NIPS2015).
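
For reference, the linear-complexity idea can be sketched as follows for an exponentially decaying kernel. This is a simplified illustration, not the algorithm of the paper or of the cited works: it assumes a constant jump size `jump` instead of stochastic excitations Y_i, an intensity of the form a + (lambda_0 - a) exp{- delta t} plus decaying jumps, and lambda_0 >= a. The function name and its parameters are hypothetical.

```python
import math
import random


def simulate_hawkes_exp(a, lambda0, delta, jump, horizon, seed=0):
    """Exact simulation of a Hawkes process with exponential decay.

    Each event costs O(1) work because, with an exponential kernel, the
    intensity just after the latest event summarises the whole history.
    Assumes a constant excitation jump (a simplification) and lambda0 >= a.
    """
    if lambda0 < a or delta <= 0:
        raise ValueError("requires lambda0 >= a and delta > 0")
    rng = random.Random(seed)
    t, lam = 0.0, lambda0            # lam: intensity just after time t
    events = []
    while True:
        excess = lam - a             # decaying part of the intensity
        # Candidate waiting time driven by the constant level a.
        s2 = rng.expovariate(a) if a > 0 else math.inf
        # Candidate waiting time driven by the decaying part (may be infinite).
        s1 = math.inf
        if excess > 0:
            u = 1.0 - rng.random()   # uniform on (0, 1]
            d = 1.0 + delta * math.log(u) / excess
            if d > 0:
                s1 = -math.log(d) / delta
        s = min(s1, s2)              # next inter-arrival time
        if t + s > horizon:
            break
        t += s
        # Decay the excess over s, then add the jump caused by the new event.
        lam = a + excess * math.exp(-delta * s) + jump
        events.append(t)
    return events


# Toy usage with assumed parameter values.
times = simulate_hawkes_exp(a=0.5, lambda0=1.0, delta=2.0, jump=1.0, horizon=50.0)
print(len(times), "events")
```

Each iteration draws two candidate waiting times and updates a single scalar, so the total cost is linear in the number of simulated events.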

=====

Review #2
=====
Summary of the paper (Summarize the main claims/contributions of the paper.): The authors consider the classical Hawkes models that are now extensively used in neuroscience, finance and genetics, and to model earthquakes. These self-exciting processes can be viewed as branching processes over Poisson processes. The main contribution of the paper is to let the magnitude of self-excitation follow a stochastic differential equation. The authors propose a new simulation algorithm for Hawkes processes whose complexity is O(n). In the Bayesian approach, an MCMC algorithm is used to update the parameters.

Clarity - Justification: The paper is well written and well organized. Just a remark: The notation for the baseline function is not very clear. 

Significance - Justification: Using an SDE to model self-excitation enhances classical models. The Bayesian inference of the parameters of Hawkes models constitutes a nice extension of Rasmussen (2013). It would be interesting to consider multivariate Hawkes models to provide a more valuable contribution. Some aspects should be clarified (see below).

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): In Hawkes & Oakes (1974) and Ozaki (1979), the base intensity does not depend on t and is constant. Please explain why you can consider non-constant functions. Some references are missing: see Hansen, Reynaud-Bouret and Rivoirard (2015) and references therein.

=====

Review #3
=====
Summary of the paper (Summarize the main claims/contributions of the paper.): This paper takes some standard/recent analytic results for Hawkes processes (Algorithm 1, Proposition 1, etc.) and extends them by modelling the excitation level stochastically with stochastic differential equations. The treatment is fully Bayesian, which makes it the most extensive presentation I have seen to date.

Clarity - Justification: Didn't see real problems.  Well written paper with extensive explanations.

Significance - Justification: This is an extensive development of a range of samplers for Hawkes processes. A clear problem is that nowhere is there any effort to convince us that the SDEs actually correspond to useful classes of problems in the real world. So while I like the presentation, one can be skeptical of its usefulness. Perhaps it's the case that the technique will allow new problems to be addressed, as I expect and as your Figure 2 illustrates. Nonetheless, this is an extensive treatment, so I'm sure some will find it useful.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): What does the $\wedge$ symbol mean in Algorithm 1? I know it is adapted from Dassios and Zhao, but they don't explain it either. Is their algorithm O(n)?

=====